Separate PK in Jxn Tbl?

Sylvain Lafontaine

Sorry I didn't respond earlier, but I wasn't sure I understood what you
had written. I share your opinion that a composite key can not only
unequivocally identify any item in a particular database but also remain
constant (i.e., never change from database state to database state) in many
systems.

But as I said, this is not true for all systems, and there are occasions
where the value will change from state to state. For instance, if instead
of an inventory system you have a sports league system, with a table making
a junction between a list of players and a list of teams, it's easy to see
that in such a table the composite key may change its value from state to
state; for example, when a player is traded between two teams.
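The sports-league scenario can be sketched as follows (SQLite via Python's sqlite3; the schema and names are hypothetical). Giving the junction table its own surrogate key means a trade can be an in-place update, and any table referencing the membership row is untouched; whether a trade should instead close one row and open a new one is itself a design choice the thread alludes to.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")

# Junction table with its own surrogate key; the (player, team) pair
# is still kept unique, but any row referencing a membership (stats,
# jersey numbers, ...) would point at MembershipID, which survives a trade.
con.executescript("""
CREATE TABLE Players (PlayerID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Teams   (TeamID   INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Memberships (
    MembershipID INTEGER PRIMARY KEY,
    PlayerID INTEGER NOT NULL REFERENCES Players,
    TeamID   INTEGER NOT NULL REFERENCES Teams,
    UNIQUE (PlayerID, TeamID)
);
""")
con.execute("INSERT INTO Players VALUES (1, 'Roy')")
con.execute("INSERT INTO Teams VALUES (10, 'Canadiens'), (20, 'Avalanche')")
con.execute("INSERT INTO Memberships VALUES (100, 1, 10)")

# The trade updates the row in place; MembershipID 100 never changes,
# so nothing that references it has to be rewritten.
con.execute("UPDATE Memberships SET TeamID = 20 WHERE MembershipID = 100")
```

With a composite primary key (PlayerID, TeamID) used as the foreign key elsewhere, the same trade would change the key value itself and ripple through every referencing table.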

I don't want to enter into a discussion of the full range of possibilities
(for example, do you want the database to remember the previous
relationship? What about the player number - if a player changes his or her
number, do you want to keep the older information? The individual
statistics? Etc.), but it doesn't take long to see that in such a
situation, using a composite key to express the relationships between
tables rapidly becomes hellish. As someone else said: « been there, done
that », and personally, it's not my intention to go back there. Everyone
knows that when it's time to make a decision, one personal experience
carries more weight than a thousand opinions, so for me, the first reaction
to using a composite primary key is a no-go.

But there are other occasions, like your inventory system, where the
possibility that a composite primary key can change its value doesn't
exist. On these occasions, would it be overkill to use a separate primary
key? Personally, I don't mind using a separate primary key even on these
occasions, but I fully understand that other people might feel
uncomfortable doing the same. Everyone knows the devil likes to hide in
the details; besides the details, composite keys are probably also one of
his favorite hiding places.

--
Sylvain Lafontaine, ing.
MVP - Technologies Virtual-PC
E-mail: sylvain aei ca (fill the blanks, no spam please)


Brian Selzer said:
Sylvain Lafontaine said:
To that, I would add that the increased simplicity of using a surrogate
(or artificial or autonumber) key as the primary key in place of a
composite key is only half of its advantage.

The biggest problem that I have with composite keys is that they share
the same fundamental problem as natural keys: using them as the primary
key accepts that a primary key can change its value over time. IMHO, a
primary key should never be allowed to change its value once it has been
created; an assumption which forbids the use of a composite key in many
cases. (Of course, if you don't mind seeing a primary key change its
value after its creation, then this argument doesn't concern you.)

This argument has an inherent fallacy in it. Just because a key is
composed from multiple columns doesn't necessarily mean that its values
can be different in different database states. For example, in an
Inventory table that has the key, {ItemKey, WarehouseKey}, with references
to an Item table and a Warehouse table respectively, the combination
values that comprise each key value can never change from database state
to database state. A particular combination of values identifies a
particular individual in the Universe of Discourse in /every/ database
state in which it appears. It can /never/ identify any other individual.
Therefore, it should be obvious that adding an additional autonumber
primary key in this instance would be superfluous, since each {ItemKey,
WarehouseKey} combination already rigidly designates a distinct individual
in the Universe of Discourse.
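Brian's inventory example can be sketched directly (SQLite via Python's sqlite3; table and column names are illustrative): the composite primary key both identifies the row and enforces that each item/warehouse pair appears exactly once, with no extra autonumber needed.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")

# The pair (ItemKey, WarehouseKey) rigidly designates one stock record:
# a given combination never comes to mean a different item-in-warehouse,
# so the key never needs to change and no surrogate is required.
con.executescript("""
CREATE TABLE Item      (ItemKey      INTEGER PRIMARY KEY, Descr TEXT);
CREATE TABLE Warehouse (WarehouseKey INTEGER PRIMARY KEY, City  TEXT);
CREATE TABLE Inventory (
    ItemKey      INTEGER NOT NULL REFERENCES Item,
    WarehouseKey INTEGER NOT NULL REFERENCES Warehouse,
    Qty          INTEGER NOT NULL,
    PRIMARY KEY (ItemKey, WarehouseKey)
);
""")
con.execute("INSERT INTO Item VALUES (1, 'Widget')")
con.execute("INSERT INTO Warehouse VALUES (1, 'Montreal')")
con.execute("INSERT INTO Inventory VALUES (1, 1, 50)")
```

Attempting to insert the same (ItemKey, WarehouseKey) pair a second time fails the primary-key constraint, which is exactly the uniqueness an added autonumber column would not give you for free.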

The same can be said for many natural keys. For example, suppose you have
a table, Queue, that has an integer key, {Position}. Each value for
Position rigidly designates a distinct individual in the Universe of
Discourse (3 always means "third in line" in any database state in which
there are 3 or more elements), so therefore there is no need for an
additional autonumber primary key.
This is not only a theoretical argument, as many interfaces - like
Access - don't like to see a primary key that could change its value. But
even if you take such interfaces out of the equation, the use of a
surrogate key for all tables reveals itself to be advantageous in many
database problems. For example, if you want to add a log of all changes
to a table, it's much easier to design if the table uses a surrogate key
for its primary key than a natural key or a composite key.
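The change-log point can be illustrated with a minimal sqlite3 sketch (the schema is invented for the example): the log references the immutable surrogate PlayerID, so log rows stay linked to the right player even while the natural attributes they record keep changing.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")
con.executescript("""
CREATE TABLE Player (
    PlayerID INTEGER PRIMARY KEY,   -- immutable surrogate
    Number   INTEGER,               -- natural attribute that may change
    Name     TEXT
);
CREATE TABLE PlayerLog (
    LogID     INTEGER PRIMARY KEY,
    PlayerID  INTEGER NOT NULL REFERENCES Player,  -- stable link to the row
    OldNumber INTEGER,
    NewNumber INTEGER,
    ChangedAt TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE TRIGGER trg_player_number AFTER UPDATE OF Number ON Player
BEGIN
    INSERT INTO PlayerLog (PlayerID, OldNumber, NewNumber)
    VALUES (OLD.PlayerID, OLD.Number, NEW.Number);
END;
""")
con.execute("INSERT INTO Player VALUES (1, 33, 'Roy')")
con.execute("UPDATE Player SET Number = 30 WHERE PlayerID = 1")
```

If PlayerLog instead had to reference a natural or composite key, every change to that key would orphan (or force a cascading rewrite of) the history rows describing it.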

Personally, I stopped using natural keys and composite keys many years
ago, and probably something like half of my database design problems
vanished with them. On the occasions when I was called to work on a
problematic database, the chances were much higher that the problems were
associated with the use of natural and/or composite keys than with the use
of surrogate keys, and the problems were usually much more complicated to
solve in the first case than in the second.

Also, I remember some people who have done as I did and stopped using
natural and composite keys in favor of the exclusive use of surrogate
keys, but I don't remember anyone doing the opposite, i.e., going from
surrogate keys to natural and composite keys.

--
Sylvain Lafontaine, ing.
MVP - Technologies Virtual-PC
E-mail: sylvain aei ca (fill the blanks, no spam please)


On Jan 24, 11:00 pm, "James A. Fortune" <[email protected]>
wrote:

Whenever I have multiple key fields, natural or not, I create an
AutoNumber PK for pragmatic reasons. The main reason is that it makes
it easier to create the joins. The theorists are champions at joining
tables and don't have to be concerned with the complexity of the SQL
they write.

Word to the wise: 'theorists' hate SQL.

Queries involving many-to-many relationships often add
additional tables later and highlight the need to keep joins as simple
as possible.

I think I'm with Bob Badour (gulp!) on the issue of complexity,
though: if you think more columns in the ON clause makes a SQL join
more 'complex' then I think you could be looking at things wrong.
Having more characters to type increases the risk of typos? More
columns mean you may omit one in error? The SQL engine may be twice as
slow in handling two columns rather than one? Is it more 'complex' to
split a post address into 'subatomic' columns (address lines from
postal code/zip etc)?

Surely the issue you allude to (I think) is the one that Access
Relationships (as distinct from Jet foreign keys) were invented to
solve? i.e. you pre-define the join columns and 'join type' (inner
join, left outer join or right outer join) and the join clause gets
written as SQL for you when you drop the tables into the Query Builder
thing. I would have thought the 'theorists' would love the fact that
you also create foreign keys in the same Relationships dialog i.e. you
end up with a natural join (not having to explicitly specify the
columns yourself) because one table references the other.

[I tend to be dismissive of tools that write SQL code for me but I
think I should perhaps review my stance e.g. I still write all my SQL
Server procs by hand whereas I have tasked myself to investigate CRUD
generators. But, for the time being, ...] As a SQL coder myself, I
find it more annoying that I have to create multiple joins to get the
'natural key' values, having to discover what the 'artificial key'
columns are in the first place.
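The complaint about extra joins can be made concrete (a small sqlite3 sketch with invented names): with surrogate foreign keys, every human-readable value costs a join back to its parent table before you can display it.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Item      (ItemKey INTEGER PRIMARY KEY, Sku  TEXT UNIQUE);
CREATE TABLE Warehouse (WhKey   INTEGER PRIMARY KEY, Code TEXT UNIQUE);
CREATE TABLE Inventory (ItemKey INTEGER REFERENCES Item,
                        WhKey   INTEGER REFERENCES Warehouse,
                        Qty     INTEGER,
                        PRIMARY KEY (ItemKey, WhKey));
""")
con.execute("INSERT INTO Item VALUES (1, 'SKU-42')")
con.execute("INSERT INTO Warehouse VALUES (7, 'NYC')")
con.execute("INSERT INTO Inventory VALUES (1, 7, 100)")

# Surrogate foreign keys hold meaningless integers, so showing the
# natural values requires one extra join per referenced table:
rows = con.execute("""
    SELECT i.Sku, w.Code, inv.Qty
    FROM Inventory AS inv
    JOIN Item      AS i ON i.ItemKey = inv.ItemKey
    JOIN Warehouse AS w ON w.WhKey   = inv.WhKey
""").fetchall()
```

Had Inventory carried the natural Sku and Code columns directly as its composite key, the same report would be a plain single-table SELECT; that is the trade-off both sides of this thread are weighing.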

Lately, I've increased the amount of normalization in one of my
databases and the joins got even more complicated, adding about a line
or so in the SQL view in Access for every new query using those tables.

Bad luck: I think you might have got away with "reduced the amount of
denormalization" ;-) In this thread I've already broken my personal
rule (!!) about not mentioning normalization [formulated because the
average 'replier' around here thinks "fully normalized" is BCNF, which
they think is 3NF anyhow, and doesn't pay much attention to anomalies
that normalization doesn't address, unless the 'asker' mentions
storing calculations...]

I keep Jamie's advice in the back of my mind, about how enforcing
constraints at the table level is better than enforcing them through
code

...and best to do it in both places! Bear in mind that it's a rule of
thumb i.e. "strict rules modified in practice." Checking something in
the front end allows you to give timely user feedback and could
save them some keying, not to mention a database roundtrip. Checking
in the database catches anything neglected in the front end by
omission of validation or introduction of bugs. In practice, some
things are better done in one place but not the other: contrast the
validation of the basic pattern of an email address with the
verification that an addressable entity can be contacted at that email
address; I don't think it would be sensible to put the latter test
into a table constraint, even if it were possible.
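Jamie's split between front-end validation and a table-level backstop might look like the sketch below (sqlite3; the regex and CHECK pattern are deliberately simplistic stand-ins, not a serious email validator):

```python
import re
import sqlite3

# Front-end check: cheap, gives timely feedback, saves a round trip.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")  # basic pattern only

def front_end_check(email: str) -> bool:
    """Client-side sanity check on the basic shape of an address."""
    return bool(EMAIL_RE.match(email))

con = sqlite3.connect(":memory:")
# The database repeats the basic rule as a CHECK constraint, catching
# anything a buggy or bypassed front end lets through.
con.execute("""
CREATE TABLE Contact (
    ContactID INTEGER PRIMARY KEY,
    Email TEXT NOT NULL CHECK (Email LIKE '%_@_%._%')
)
""")
```

Verifying that the address can actually receive mail stays in application code, as the text suggests; only the cheap structural rule is duplicated at the table level.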

Jamie.

--

What part of simpler don't you understand :). Only one expression in
the ON is simpler. Needing fewer indexes is simpler. Not having to
look for your multi-key fields is easier, although your point that
Relationships can handle that is valid. If the AutoNumber key has a
one-to-one relationship with the multi-key fields then it's fine to
use it. There's no downside that I can see. I also like to rely on
coding to detect inconsistent data rather than on error trapping, so I
have to check the multi-key values anyway before adding a new record.
I think that your idea about enforcing constraints at both the table
level and in code is an excellent one. The OP wanted to know what
people did and why. I still don't see any reason put forward for me
to change to a multi-field key. Are totals queries easier when
multi-field keys are used? BTW, "reduced the amount of denormalization"
works just as well. Real databases experience denormalizing
influences.

James A. Fortune
(e-mail address removed)
 
Sylvain Lafontaine

OK since you admit you raised the 'budget' issue, what do *you* think
the impact on 'budget' is when choosing random autonumber over
incremental autonumber?

If you strictly look at performance considerations, random autonumbers
don't scale well with indexes. Most (but not all) systems will probably
scale better if you are using a monotonically increasing primary key. A
monotonically increasing sequence is also easier to read than a set of
random numbers when you are debugging a system. The fact that I'm not
interested in the value of any particular primary key doesn't mean that I
don't care whether a collection of primary key values forms a
monotonically increasing (or decreasing, especially in some replication
scenarios) sequence. Like I said in my previous post, my brain is like
the budget of my clients: it's not infinite either.

However, there are of course scenarios where a random autonumber is the
right choice. It is often used in replicated systems, particularly when
there are online/offline clients.
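The contrast can be sketched as follows (sqlite3, with SQLite standing in for whatever engine is in play; a GUID-keyed replication scheme is only approximated here with uuid4):

```python
import sqlite3
import uuid

con = sqlite3.connect(":memory:")

# Monotonic surrogate: new rows always land at the "right edge" of the
# primary-key b-tree, which tends to keep page splits cheap, and the
# values read naturally in insertion order when debugging.
con.execute("CREATE TABLE SeqKeyed (ID INTEGER PRIMARY KEY AUTOINCREMENT, Payload TEXT)")

# Random surrogate (e.g. a GUID, common for offline/replicated clients):
# globally unique without coordination, but inserts scatter across the
# index and the values carry no readable order.
con.execute("CREATE TABLE RandKeyed (ID TEXT PRIMARY KEY, Payload TEXT)")

for n in range(3):
    con.execute("INSERT INTO SeqKeyed (Payload) VALUES (?)", (f"row {n}",))
    con.execute("INSERT INTO RandKeyed VALUES (?, ?)", (str(uuid.uuid4()), f"row {n}"))

ids = [r[0] for r in con.execute("SELECT ID FROM SeqKeyed ORDER BY ID")]
```

The monotonic table yields the predictable 1, 2, 3, ... sequence the post describes; the random table trades that readability and index locality for coordination-free uniqueness across disconnected clients.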
Again, I ask you to consider the posts we see in the Access groups
asking how to reseed an incremental autonumber back to one, or from
posters who are perplexed/outraged that gaps have appeared in their
autonumber sequences. If the exact values of these so-called surrogates
have "zero importance" to these people then why are they making these
requests?

You don't see this only in Access groups. The same question is often asked
on other groups as well: searching Google Groups for IDENTITY_INSERT reveals
6520 hits:

http://groups.google.com/groups/search?ie=UTF-8&oe=UTF-8&q=IDENTITY_INSERT

PS what about the other comments you seemingly aimed at me: codifying,
reengineering, etc? Why no mention of them in your reply?

I aimed no comment at you. (And usually - but not always - none at anyone
else either.) This is a thread with a great number of messages posted by
many people, and my comments are addressed to anyone interested in reading
them. Maybe my writing style is not impersonal enough, but practically all
my posts should be read as impersonal, except for the usual forms of
politeness.

As for why I don't mention everything in my replies, I do so deliberately,
because I try to reply only when I have something new to say. When I write
something and somebody else is not able to understand it - without making
any assumption here about why he or she doesn't understand it - quite
probably he or she won't understand it any better if I repeat myself.
Usually, I don't feel the need to have the last word. A lot of people will
read these messages, and I think they are quite capable of forming their
own opinion even if I don't repeat the same thing over and over or don't
post last. Of course, my mind is not perfect, so I do repeat myself or
make a last post from time to time.
 
Brian Selzer

Sylvain Lafontaine said:
Sorry I didn't respond earlier, but I wasn't sure I understood what
you had written. I share your opinion that a composite key can not only
unequivocally identify any item in a particular database but also
remain constant (i.e., never change from database state to database state)
in many systems.

But as I said, this is not true for all systems, and there are occasions
where the value will change from state to state. For instance, if instead
of an inventory system you have a sports league system, with a table making
a junction between a list of players and a list of teams, it's easy to see
that in such a table the composite key may change its value from state to
state; for example, when a player is traded between two teams.

I was just pointing out that there are times when composite key values and
natural key values permanently identify individuals, and in those instances
the autonumber primary key is just an added complication that serves no
material purpose. One can claim that they speed up queries, when in fact
they slow things down due to the additional indexes that need to be
maintained. And as has been shown in other posts, the number of joins
necessary to answer even simple queries increases with the use of autonumber
primary keys, so in fact they may degrade query performance.
I don't want to enter into a discussion of the full range of possibilities
(for example, do you want the database to remember the previous
relationship? What about the player number - if a player changes his or her
number, do you want to keep the older information? The individual
statistics? Etc.), but it doesn't take long to see that in such a
situation, using a composite key to express the relationships between
tables rapidly becomes hellish. As someone else said: « been there, done
that », and personally, it's not my intention to go back there. Everyone
knows that when it's time to make a decision, one personal experience
carries more weight than a thousand opinions, so for me, the first reaction
to using a composite primary key is a no-go.

I think that if you distill all of the reasons you're citing, they all boil
down to whether or not a key's values are permanent identifiers. If
they're not, and there is a need for them to be, then the addition of a
permanent identifier may be necessary. Whether that's an autonumber or
some other kind of surrogate isn't really that important.
But there are other occasions, like your inventory system, where the
possibility that a composite primary key can change its value doesn't
exist. On these occasions, would it be overkill to use a separate primary
key? Personally, I don't mind using a separate primary key even on these
occasions, but I fully understand that other people might feel
uncomfortable doing the same. Everyone knows the devil likes to hide in
the details; besides the details, composite keys are probably also one of
his favorite hiding places.

[..]
 
Neil

Marshall said:
Amusingly, a lot of people misperceive the word as "mute."

It's a mute point, meaning it can't say anything any longer.
It sorta vaguely works in a metaphoric way.

Of course, on the internet, you can find many examples of
fractured usage. Ultimately it's just a waist of time.<snicker>

Reminds me of a time that I was laughed at by a group of coworkers for
insisting that the phrase "that doesn't jibe" is correct, and "that doesn't
jive" (as they insisted) was incorrect. J-j-j-jive talking....

I also love the usage of "nip it in the butt" instead of "nip it in the
bud." One time I corrected someone for using "butt," and she apologized,
thinking I was offended by her use of the word "butt," and was creating my
own "softer" version with "bud" instead. Hahahaha!
 
Bob Badour

David said:
[..]

I agree with you. However, we should keep in mind that the same arguments
could be made about people learning bad programming habits by building
amateur programs in BASIC, or bad website design habits by using a tool
like Front Page.

Hmmmm... MS Access, MS Basic, MS Front Page... anyone notice a trend?


In general, the tools that require a very short learning
curve encourage the belief that the longer learning curve is of no practical
value.

It doesn't help when vendors, whose own employees know better, encourage
life-long ignorance among their customers.

We've seen that view voiced here (perhaps facetiously) by one of the Access
MVPs. To the extent that he has acquired a lot of credibility with Access
newbies, however acquired, if he gives advice that will become bad advice
when scaled upward, he aggravates the pitfall you warn against.

Elsewhere in the discussion, I opined that Access applications were
generally stored in the same file as the database. I've heard enough
contrary opinions to stand corrected on that score. (I can't find that part
of the discussion anymore.)

However, I still think that hundreds of DIFFERENT application programs
accessing a single database and written by programmers who did not build the
database, is qualitatively different from the design target of the people
who write Access databases and applications.

Years ago, I heard that the median Access application used a single
table with 500 rows. I wonder whether that has changed any.

If they ever get to the point where the complexity of what they are doing
matches the complexity of what practitioners using SQL Server, Oracle, or
DB2 are doing, or the complexity that database theorists are addressing,
they will be forced to either learn or disprove what some of us know, or
think we know.

Access is a good end-user query tool. The problem is some are deluded
into believing it should instead be a crappy application development
tool or a piss-poor data management tool.

I have to admit that, when I'm just playing around, I engage in sloppy
work. I would not go so far as to recommend sloppy habits as good ones in a
newsgroup, however.

This is why, in spite of our many differences of opinion, you never seem
to make my kill file. Your opinions and anecdotes are clearly designated
as such and never passed off as the state of the art.
 
Bob Badour

Salad said:
I prefer KISS.

Well, the second S is right.

Thus I prefer an autonumber.

So, adding a useless third column makes it simpler?!?

Then again, junction tables are rarely needed.

That's a remarkable statement requiring extraordinary evidence. What
evidence do you have to support such an assertion?
 
James A. Fortune

Sylvain said:
Shouldn't a database be designed right from the beginning?

I didn't say overkill doesn't work, did I :)? I think that until we
delineate the true trade-offs between natural keys and artificial keys,
if any, you should design your schemas/schemata as you deem best. If it
turns out that there are situations where each has advantages then those
situations should determine the correctness of the schema.

James A. Fortune
(e-mail address removed)
 
Frank Hamersley

Tony said:
OTOH do you want them using Excel for data management? Single user?

Urrk - where is my garlic smeared wooden stake?

[..]
And, of course, Excel is single user as far as I know.

But for how much longer? Don't get me wrong as I spend nigh on 80% of my
day "in" Excel - but only as a tool in financial markets - not as a
repository!
A97 has the wizard. I don't recall about previous versions.

FWICR 1.0 had nothing, 1.1 you could get at the "object" using AB on a
sleight of hand styled bending of a path+file attribute as I recall.
Then came the wizard but if you were distributing code .mdb's out to
remote sites you still need to AB it so you could make it part of your
own app. Of course the dabbler doesn't contemplate such when making
promises about future glory for all.

Cheers, Frank.
 
James A. Fortune

David said:
database, is qualitatively different from the design target of the people
who write Access databases and applications.

If they ever get to the point where the complexity of what they are doing
matches the complexity of what practitioners using SQL Server, Oracle, or
DB2 are doing, or the complexity that database theorists are addressing,
they will be forced to either learn or disprove what some of us know, or
think we know.

I don't have broad enough experience to dispute your argument. I
understand that people who specialize in SQL and deal with more complex
situations than most develop practices that make use of their more
intimate knowledge of SQL. However, I can't just take their word about
their decisions. I have to understand how those choices apply to what
I'm doing. Without making light of their potential contribution, I
avoid the specious argument that a solution is inherently correct
because a large company or IT department does things a certain way or
spends more money on the problem. Plus, the complexity of the problems
they face often argues against applying their practices in Access. Few
Access developers have the luxury of hiring or supervising a full-time
SQL developer. If using multiple-field natural keys causes problems, a
full-time SQL developer has time to work with the SQL until the problem
is solved. SQL is only part of our job.

It has been nice to see posters in microsoft.public.access such as Jamie
Collins and Ken Sheridan, who seem to have a lot of standard SQL
experience, branch off into other issues that Access programmers face.
I think their understanding of those issues can help us differentiate
between purely SQL issues and Microsoft implementation issues. We all
agree that Microsoft has made questionable design decisions in Access,
but unbound forms and AutoNumber primary keys might not belong on that
list, as some have suggested. I have enough experience to say that using
unbound forms in Access and using artificial keys did not cause any
problems when scaling an Access application up to an ASP solution using
SQL Server. For something more complicated, perhaps there are subtle
issues that merit our attention.

Table level constraints also raise an issue. At the table level, there
is no VBA code run to ensure that any constraints (e.g., on the natural
keys themselves) are enforced. Thus, a set of natural keys is
sufficient to specify the key constraint. Any artificial key becomes
superfluous. The possibility of separate applications using the same
table, brought up by David Fenton, brings up the interesting possibility
that the applications have separate, possibly disparate constraints. In
disparate constraint situations not all of the constraints can be at the
table level, although they can be specified at the query level in each
application rather than via code.
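To make the table-level point concrete, here is a minimal sketch (not from the thread; SQLite stands in for Jet or SQL Server, and the PlayerTeam junction table from earlier in the discussion is a hypothetical example). When the key constraint is declared in the table definition, the engine enforces it for every application that writes to the table, with no VBA code involved:

```python
# Hypothetical sketch: a junction table whose composite natural key lives
# in the table definition, so the engine itself rejects duplicates no
# matter which application does the insert.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE PlayerTeam (
        PlayerID INTEGER NOT NULL,
        TeamID   INTEGER NOT NULL,
        PRIMARY KEY (PlayerID, TeamID)  -- composite natural key, table level
    )
""")
conn.execute("INSERT INTO PlayerTeam VALUES (1, 10)")

# A second "application" using the same table cannot bypass the constraint:
try:
    conn.execute("INSERT INTO PlayerTeam VALUES (1, 10)")
    duplicate_allowed = True
except sqlite3.IntegrityError:
    duplicate_allowed = False

print(duplicate_allowed)  # False: the engine, not application code, said no
```

Application-level checks can still be layered on top of this for usability, but they cannot be circumvented the way query-level or code-level checks can.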

Maybe many Access programmers prefer a single key to limit the number of
fields that get corrupted :). Perhaps an enumeration of sticky
situations encountered by those using multiple field natural keys would
help us understand why some avoid them. Maybe better solutions to those
sticky problems would encourage developers to stay with natural keys
more often.

James A. Fortune
(e-mail address removed)
 
S

Salad

-CELKO- said:
Home Depot's contractor tracking system on ACCESS? Its failures are
famous in Atlanta.
Your object of derision is the application, whereas an outsider like me
would assume management, specs, and skill level created the failure.
AFAIC, the programmers in CDMA are world class folks and obviously were
not part of the project you brought up.

Dilbert management providing ill thought out/band-aid specs while
employing low-level programmers that don't speak English can create a
famous failure. By chance are you on the failure's committee?

Do you have any links to the famous failure you mentioned? It might
make for a fun read of management incompetency.

If you do respond, be sure to cut out all my post so nobody knows who
you are responding to and make it appear you are talking to yourself for
no particular reason.
 
B

Brian Selzer

James A. Fortune said:
I don't have broad enough experience to dispute your argument. I
understand that people who specialize in SQL and deal with more complex
situations than most develop practices that make use of their more
intimate knowledge of SQL. However, I can't just take their word about
their decisions. I have to understand how those choices apply to what I'm
doing. Without making light of their potential contribution, I avoid the
specious argument that because a large company or IT department does
things a certain way, or spends more money on the problem, its solution
is inherently correct. Plus, the complexity of the problems they face
often argues against importing their practices into Access. Few Access
developers have the luxury to hire or supervise a full-time SQL
developer. If using multiple-field natural keys causes problems, a
full-time SQL developer has time to work with the SQL until the problems
are solved. SQL is only part of our job.

It has been nice to see posters in microsoft.public.access such as Jamie
Collins and Ken Sheridan, who seem to have a lot of standard SQL
experience, branch off into other issues that Access programmers face. I
think their understanding of those issues can help us differentiate
between purely SQL issues and Microsoft implementation issues. We all
agree that Microsoft has made questionable design decisions in Access, but
the inclusion of unbound forms or of AutoNumber primary keys might not
belong on that list, as some have suggested. I have
enough experience to say that using unbound forms in Access and using
artificial keys did not cause any problems when scaling an Access
application up to an ASP solution using SQL Server. For something more
complicated perhaps there are subtle issues that arise that merit our
attention.

Table level constraints also raise an issue. At the table level, there is
no VBA code run to ensure that any constraints (e.g., on the natural keys
themselves) are enforced. Thus, a set of natural keys is sufficient to
specify the key constraint. Any artificial key becomes superfluous. The
scenario of separate applications using the same table, raised by David
Fenton, brings up the interesting possibility that the applications
have separate, possibly disparate constraints. In disparate constraint
situations not all of the constraints can be at the table level, although
they can be specified at the query level in each application rather than
via code.

Constraints should always be checked by the DBMS, not by applications. If
you have two separate applications that manipulate the same table, and one
enforces one constraint while another enforces another, then all you need to
do to bypass one constraint is to use the other application! What, then, is
the point of even having the constraint?
Maybe many Access programmers prefer a single key to limit the number of
fields that get corrupted :). Perhaps an enumeration of sticky
situations encountered by those using multiple field natural keys would
help us understand why some avoid them. Maybe better solutions to those
sticky problems would encourage developers to stay with natural keys more
often.

In my opinion, the only sound reason for using artificial keys is when the
values of a natural key do not permanently identify individuals in the
Universe of Discourse--that is, the micro-world that the database is
supposed to be a model of--and when there is a demonstrable requirement for
permanent identification. Any other use adds complexity for complexity's
sake--in other words, for no particularly good reason.
 
S

Sylvain Lafontaine

In my opinion, the only sound reason for using artificial keys is when the
values of a natural key do not permanently identify individuals in the
Universe of Discourse--that is, the micro-world that the database is
supposed to be a model of--and when there is a demonstrable requirement
for permanent identification. Any other use adds complexity for
complexity's sake--in other words, for no particularly good reason.

Even in the very few cases where you think that a natural key will be a
true natural key, i.e., that it will be permanent and totally managed
(meaning no data entry errors of any kind, no change of value because of
identity theft, etc.), privacy issues will forbid its use in many cases.
Many countries now forbid by law asking for something such as a social
security number when you don't need it; and when you do need it, they
forbid using it as a key instead of storing it confidentially (i.e.,
encrypted).

Even MS had problems with this in the past: the first version of the
algorithm for generating a GUID used the network card's MAC address as
part of its input. They had to remove it later, and now the generation
of any GUID is essentially random. The older algorithm is still
available in Windows under another name, but MS warns that anyone using
it might be doing so illegally under the laws of their country.

The same argument applies to things like RFID: how many of you would like
big corporations to know which medications you are bringing home, or your
spouse to know that you have just bought a box of condoms?

Using natural keys is like building a house of cards: it might look
impressive, but it doesn't take too much to bring it down.
 
T

Tony Toews [MVP]

-CELKO- said:
It is a good classic screw up, with blame for everyone!

1) ACCESS programmer builds desktop app on his own that looks good for
his immediate needs.
2) Management sees the app and wants to deploy it all over the
company. Hey, why design anything new when we have it already?
3) ACCESS programmer claims it will deploy and management believes
him.
4) It does not scale, it does not interface with mainframe apps,
external apps, etc. It has no documentation, etc.
5) ACCESS programmer now has a career being the only guy who can keep
the sinking boat afloat. Never mind how many times a week it has to be
rebooted or how much data is lost.
6) Neither the programmer nor management will scream for help and ask
for a budget. Management would look stupid; programmer would lose his
job and power

That sounds quite typical for some Access applications. But consider just where was
IT in all this? D*cking around on their Oracle database that will save the company.
Or this app didn't fit in the IT priorities. We'll create this fabulous system.
Uhh, when? Sometime this century.

It's better than Excel. Although not by much in such incapable hands as you
describe. And that I've seen myself in the past.

Tony
--
Tony Toews, Microsoft Access MVP
Please respond only in the newsgroups so that others can
read the entire thread of messages.
Microsoft Access Links, Hints, Tips & Accounting Systems at
http://www.granite.ab.ca/accsmstr.htm
Tony's Microsoft Access Blog - http://msmvps.com/blogs/access/
 
D

David Cressey

James A. Fortune said:
I didn't say overkill doesn't work, did I :)? I think that until we
delineate the true trade-offs between natural keys and artificial keys,
if any, you should design your schemas/schemata as you deem best. If it
turns out that there are situations where each has advantages then those
situations should determine the correctness of the schema.
I'll go further than that. For most design problems there is more than one
acceptable solution. This is particularly true of schema design. Design
trade-offs will help determine which of two possible designs is better in
any given situation.
 
D

David Cressey

James A. Fortune said:
I don't have broad enough experience to dispute your argument. I
understand that people who specialize in SQL and deal with more complex
situations than most develop practices that make use of their more
intimate knowledge of SQL. However, I can't just take their word about
their decisions. I have to understand how those choices apply to what
I'm doing.

I agree, absolutely. What I was arguing against was the dismissal without
evaluation of what theorists have to offer.


Without making light of their potential contribution, I
avoid the specious argument that because a large company or IT
department does things a certain way, or spends more money on the
problem, its solution is inherently correct. Plus, the complexity of the
problems they face often argues against importing their practices into
Access. Few Access developers have the luxury to hire or supervise a
full-time SQL developer. If using multiple-field natural keys causes
problems, a full-time SQL developer has time to work with the SQL until
the problems are solved. SQL is only part of our job.

SQL isn't the silver bullet, either.
Maybe many Access programmers prefer a single key to limit the number of
fields that get corrupted :).

In that case, I believe they are wrong.
 
B

Brian Selzer

Roy Hann said:
I agree very heartily with the first part of this statement, for the
reasons you give below. I disagree with the second part (as stated).
There is no reason why applications shouldn't also test what constraints
they can. The problem is that they should not have hand-coded
re-implementations of the constraints because those will get out of sync
with the database over time. What would be very nice is if one day it were
possible for applications to download the relevant constraints at
run-time, the way they presently download other metadata.

That's a good point. I should have said instead, "Constraints should always
be /enforced/ by the DBMS, not by applications." It is often a good thing
for an application to do some checking because it can reduce the number of
round-trips, and maybe even some transaction rollbacks.
 
B

Brian Selzer

Sylvain Lafontaine said:
Even in the very few cases where you think that a natural key will be a
true natural key, i.e., that it will be permanent and totally managed
(meaning no data entry errors of any kind, no change of value because of
identity theft, etc.), privacy issues will forbid its use in many cases.
Many countries now forbid by law asking for something such as a social
security number when you don't need it; and when you do need it, they
forbid using it as a key instead of storing it confidentially (i.e.,
encrypted).

So, where does it say that there can only be one natural key on a table?
Even MS had problems with this in the past: the first version of the
algorithm for generating a GUID used the network card's MAC address as
part of its input. They had to remove it later, and now the generation
of any GUID is essentially random. The older algorithm is still
available in Windows under another name, but MS warns that anyone using
it might be doing so illegally under the laws of their country.

The same argument applies to things like RFID: how many of you would like
big corporations to know which medications you are bringing home, or
your spouse to know that you have just bought a box of condoms?

Why should it be a problem for my spouse to know that I just bought a box of
condoms? More often than not she is the one who lets me know that we need
them. Oh! Wait! I get it! That must be something those of you who can't
keep it in your pants worry about.
Using natural keys is like building a house of cards: it might look
impressive, but it doesn't take too much to bring it down.

Even with artificial keys, you still need natural ones. Without them, you
can have multiple key values that refer to exactly the same thing, with
no way to determine which, thereby rendering the key useless.
 
J

Jon Heggland

Quoth Roy Hann:
What would be very nice is if one day it were possible for applications to
download the relevant constraints at run-time, the way they presently
download other metadata.

Dataphor does this.
 
J

Jon Heggland

Quoth Roy Hann:
You don't think I was clever enough to come up with the idea all by myself
do you? :)

Why not? It's a pretty simple idea, really.
I admire what Dataphor set out to do. But having been forced to adopt SQL
against my will a long time ago, I am under no illusion that anything better
than SQL will ever catch on.

Well, Dataphor databases are typically built on top of SQL databases.
Just tell management that Dataphor is really just a presentation layer
on top of SQL. :)
The best hope is that embedded SQL might be less stupid in future.

If that were my best hope, I think I would quit being a database engineer.
 
D

David Cressey

Brian Selzer said:
That's a good point. I should have said instead, "Constraints should always
be /enforced/ by the DBMS, not by applications." It is often a good thing
for an application to do some checking because it can reduce the number of
round-trips, and maybe even some transaction rollbacks.

I think that if an application does some checking, it will also do some
enforcing.

I think you might have been aiming at something like the following:

The DBMS should always enforce the constraints that it can enforce, rather
than relying on applications to refrain from writing data that violates the
constraints.

Note that the above is silent on what applications should or should not do
regarding constraints.
 
