PK - To AutoNumber or Not To AutoNumber - That is the Question! :-

D

dee

Seriously... if I have, for example, employee numbers in an Employees table,
should I use the employee number (will never change) or should I add an
autonumber PK?
 
M

mray29

Dee:
If your employee numbers are unique, there's no reason to add another
autonumer PK. Just make the employee numer your primary key for the table and
you'll be fine.
 
J

Jeff Boyce

The only thing an Access Autonumber is designed to do is provide a unique
record/row identifier. If you already have one, why bother with another?!

Regards

Jeff Boyce
Microsoft Office/Access MVP
 
J

John W. Vinson

Seriously... if I have, for example, employee numbers in an Employees table,
should I use the employee number (will never change) or should I add an
autonumber PK?

A primary key candidate should meet three criteria: it should be unique within
the table; it should (preferably) be stable, not something that will get
edited after entry; and (ideally) it should be reasonably small (8 bytes is
the size of an Autonumber so that's a benchmark).

If your employee number meets those criteria it's a good "natural key" and you
should by all means use it.

John W. Vinson [MVP]
 
B

BruceM

I am in a situation where they changed the format of the EmployeeID number.
Perhaps updating thousands of records in related tables for a hundred
employees is simpler than I imagine by using Cascade Update Related Fields,
but I am not anxious to attempt the project. I am left either with doing
the update (and verifying that it has gone as planned) or using the old
number as the PK (incrementing it by 1 for new employee records) and adding
another field for the new ID number.
 
K

Klatuu

I think BruceM has a good point. Too many times I have heard "This will
never change". You can never bank on anything staying the same. I had a
similar instance where there was a code convention for a data item that was
set up to satisfy or ERP system. One a littler over a year, they changed ERP
systems twice. The coding system had to change both times.
--
Dave Hargis, Microsoft Access MVP


BruceM said:
I am in a situation where they changed the format of the EmployeeID number.
Perhaps updating thousands of records in related tables for a hundred
employees is simpler than I imagine by using Cascade Update Related Fields,
but I am not anxious to attempt the project. I am left either with doing
the update (and verifying that it has gone as planned) or using the old
number as the PK (incrementing it by 1 for new employee records) and adding
another field for the new ID number.
 
J

Jerry Whittle

I once had the 'pleasure' of trying to fix a problem where a company was
bought by another and, guess what, the employee numbers were all changed!
Since they didn't have Referential Integrity enabled (actually couldn't as
there were some orphaned records) couldn't use Cascade Update.

I'm a firm believer in autonumbers for PKs. If it doesn't have a meaning,
you won't be tempted to change it.
--
Jerry Whittle, Microsoft Access MVP
Light. Strong. Cheap. Pick two. Keith Bontrager - Bicycle Builder.


Klatuu said:
I think BruceM has a good point. Too many times I have heard "This will
never change". You can never bank on anything staying the same. I had a
similar instance where there was a code convention for a data item that was
set up to satisfy or ERP system. One a littler over a year, they changed ERP
systems twice. The coding system had to change both times.
 
J

Jeff Boyce

I tend to use Autonumber PKs for most of my tables, much to the dismay of
the "natural key" purists (heh, heh, heh!). And to build on your
suggestion, Jerry, the OP can simply add another field that is the
"CurrentCompanyEmployeeIdentifier", and THAT can get changed for any reason!

Regards

Jeff Boyce
Microsoft Office/Access MVP
 
J

Jamie Collins

I once had the 'pleasure' of trying to fix a problem where a company was
bought by another and, guess what, the employee numbers were all changed!
Since they didn't have Referential Integrity enabled (actually couldn't as
there were some orphaned records) couldn't use Cascade Update.

I'm a firm believer in autonumbers for PKs. If it doesn't have a meaning,
you won't be tempted to change it.

Now consider the scenario where two companies use auto-increment
values for their personnel data then those companies merge...

Jamie.

--
 
J

Jamie Collins

I tend to use Autonumber PKs for most of my tables, much to the dismay of
the "natural key" purists (heh, heh, heh!).

My view: the primary (no pun) consideration is the logical: put a
unique (lowercase) constraint on natural employee number because
failure to do so will result in loss of data integrity, as described
by Jerry Whittle in this thread. Physical considerations are secondary
in my opinion e.g. does my SQL DBMS grant special powers to the
PRIMARY KEY (uppercase) and if so what are they? Does my SQL DBMS
physically repeat the key values for a foreign key and, if so, can I
trust it to cascade the values if they change? Is the performance of
my SQL DBMS so poor that I feel I must restrict myself to a single
eight byte key column? Will my application be successful, hence will
be ported to another platform with different physical considerations?

Jamie.

--
 
J

Jamie Collins

A primary key candidate should meet three criteria: it should be unique within
the table; it should (preferably) be stable, not something that will get
edited after entry; and (ideally) it should be reasonably small (8 bytes is
the size of an Autonumber so that's a benchmark).

Do you mean PRIMARY KEY (SQL keywords in uppercase)? If so you can
add: cannot include the NULL value; what special meaning is implied by
such designation (hint: clustering on compact of file).

If you mean primary key (general meaning in lowercase) then you've
missed out some important ones (by accident or design?): trusted
source (hint: you don't want to get into the situation where your mdb
is the trusted source of a key and you must expose your autonumber
value to database users), familiarity, validation (e.g. a check digit
can reduce keying errors), verifiability (e.g. an ISBN can be verified
by looking on the back cover, on Amazon, etc), simplicity (contrast
'The United Kingdom of Great Britain and Northern Ireland' with ISO
4217 alpah-3 country code = 'GBP'). Also note that 'stable' (not
subject to frequent change) is not the same as 'immutable' (not
subject to change).

Jamie.

--
 
B

BruceM

Then they have a problem that they might have avoided with a more "natural"
EmployeeID used as the key, until they decide that since everybody is part
of one company everybody's ID should be the same format. There is no single
best answer, but having the number format change within a single
organization is probably the more likely scenario.
 
B

BruceM

The employee number used in the example is as meaningless as a random or
sequential number (unless the "E" has some special significance). The
example is unclear as to whether the form number is the form identifier
(such as Form 1040) or a sequence number. I will assume the latter. The
author seems to be saying that the only possible natural key comprises the
employee number of the employee making the purchase; the form number, which
is already unique; and the date. Since the form number is already unique,
it is unclear how much more unique the record would be with the addition of
two additional fields to the key. Even if there is an argument in the
article's labored logic that there should be a compound key, it does not
address the question about whether the employee ID should be autonumber or
another format.
 
J

John W. Vinson

Do you mean PRIMARY KEY (SQL keywords in uppercase)? If so you can
add: cannot include the NULL value; what special meaning is implied by
such designation (hint: clustering on compact of file).

If you mean primary key (general meaning in lowercase) then you've
missed out some important ones (by accident or design?): trusted
source (hint: you don't want to get into the situation where your mdb
is the trusted source of a key and you must expose your autonumber
value to database users), familiarity, validation (e.g. a check digit
can reduce keying errors), verifiability (e.g. an ISBN can be verified
by looking on the back cover, on Amazon, etc), simplicity (contrast
'The United Kingdom of Great Britain and Northern Ireland' with ISO
4217 alpah-3 country code = 'GBP'). Also note that 'stable' (not
subject to frequent change) is not the same as 'immutable' (not
subject to change).


Thanks, Jamie - all excellent points. Hadn't even thought of the trusted
source issue.

John W. Vinson [MVP]
 
J

Jamie Collins

The
author seems to be saying that the only possible natural key comprises the
employee number of the employee making the purchase; the form number, which
is already unique; and the date. Since the form number is already unique,
it is unclear how much more unique the record would be with the addition of
two additional fields to the key.

I think the article makes it very clear (it's a paradigm shift for
you, I recognize your reluctance). "If an auditor checked to see if
reimbursements were duplicated she/he would look at only the non-key
columns because the key column is by definition unique."

Employee E7654 submitted petty cash form 23416 on April 16, 1998 for a
purchase made at Ace Hardware on April 15, 1998 in the amount of $8.43
that is described and justified as "replaced wrench broken this
morning."

Employee E7654 submitted petty cash form 23417 on April 16, 1998 for a
purchase made at Ace Hardware on April 15, 1998 in the amount of $8.43
that is described and justified as "replaced wrench broken this
morning."

The petty cash form number is the meaningless unique key (let's say
sequential) and the domain expert attested that an auditor would
identify the above as duplicate purchases. "The real-world key allows
us to stop requests from being entered twice by mistake. Enforcing
this auditing rule at the time of submission eliminates a mistake that
could be very embarrassing to an employee. If an employee had several
requests and was distracted during entering them, a request could be
entered twice. Explaining this to the boss or a higher manager would
not be a highlight of the employee's day."
[the article] does not
address the question about whether the employee ID should be autonumber or
another format.

The article is not concerned with the generation of employee numbers.
The meaningless unique key under discussion in the article is petty
cash form number, specifically its usage as a key rather than
considerations of generation algorithm (sequential is a reasonable
assumption) or source.

Let me attempt to clarify "whether the employee ID should be
autonumber" for you: employee number should not be an autonumber (note
that autonumber is not a 'format'). To do so would make the mdb the
trusted source of employee numbers for the enterprise, which would be
an absurdity (problems with autonumber going corrupt notwithstanding).
The trusted source of employee numbers should be the enterprise's
personnel officer or other authorized person with the ability to
verify employees in reality. Neither an mdb nor an autonumber can do
this.

The OP is asking whether they should use EITHER employee number OR
employee number plus autonumber. To use the OP's simple scenario,
consider these natural language examples:

ID 1 is auto-generated for Employee E7654.
ID 2 is auto-generated for Employee E7654.
ID 3 is auto-generated for Employee E7654.

Are these duplicates? Yes. Should there be a unique constraint on
employee number to prevent the above? Of course.

Jamie.

--
 
J

Jamie Collins

Then they have a problem that they might have avoided with a more "natural"
EmployeeID used as the key, until they decide that since everybody is part
of one company everybody's ID should be the same format. There is no single
best answer

I agree but I think many people see "autonumber primary key" as always
being the answer, the panacea. I'm not sure whether the blame lies
with the "I tend to use Autonumber PKs for most of my tables" brigade
or whether it's Access's "A table must have a primary key for you to
define a relationship between this table and other tables in the
database [incorrect]. Do you want to create a primary key now?" but,
if proposed designs posted in these groups are anything to go by,
consideration of candidate keys tends to go no further than
"autonumber primary key" :(

Jamie.

--
 
J

Jamie Collins

I tend to use Autonumber PKs for most of my tables, much to the dismay of
the "natural key" purists (heh, heh, heh!).

Dismay? No, I don't think that's it, though I may not be
representative because I'm more respectful of "alternative lifestyle
choices" (e.g. autonumber as surrogate) than the "purist" moniker
implies. My response would depend on whether you'd also put a unique
constraint on the natural key e.g. employee number in the OP's case.
If you'd seemingly given it no serious consideration I'd call you
unprofessional. If you'd considered it and still failed to constrain
the natural key my emotional repose would reflect the contempt you
would have yourself shown for your clients and end users. If you'd
taken steps to put unique constraints on both autonumber and natural
key but had chosen to give the PRIMARY KEY (SQL keywords in uppercase)
designation to the meaningless autonumber column, I'd be slightly
saddened, though remain unsurprised, that an Access MVP was not aware
of the Access/Jet engine's clustered-key compact method that was
introduced in version Jet 3.0 (http://support.microsoft.com/kb/
137039), especially with me banging on about it all this time :)

Jamie.

--
 
B

BruceM

Jamie Collins said:
I think the article makes it very clear (it's a paradigm shift for
you, I recognize your reluctance). "If an auditor checked to see if
reimbursements were duplicated she/he would look at only the non-key
columns because the key column is by definition unique."

Employee E7654 submitted petty cash form 23416 on April 16, 1998 for a
purchase made at Ace Hardware on April 15, 1998 in the amount of $8.43
that is described and justified as "replaced wrench broken this
morning."

Employee E7654 submitted petty cash form 23417 on April 16, 1998 for a
purchase made at Ace Hardware on April 15, 1998 in the amount of $8.43
that is described and justified as "replaced wrench broken this
morning."

The petty cash form number is the meaningless unique key (let's say
sequential) and the domain expert attested that an auditor would
identify the above as duplicate purchases. "The real-world key allows
us to stop requests from being entered twice by mistake. Enforcing
this auditing rule at the time of submission eliminates a mistake that
could be very embarrassing to an employee. If an employee had several
requests and was distracted during entering them, a request could be
entered twice. Explaining this to the boss or a higher manager would
not be a highlight of the employee's day."

The same reimbursement request made on successive days could be a duplicate.
Rather than a wrench, let's say it is an expendable item such as a can of
paint. It is purchased in the morning. Toward the end of the work day it
is discovered that more paint is needed. It is purchased, and the request
is submitted the following day. Same item purchased on the same day, with
the reimbursement request submitted on successive days. Would the auditor
reject the second purchase because it was made on the same day as the first,
or accept it because the requests were submitted on different days? Should
time of day (most receipts these days contain this information) be part of
the request? Should there be a field for supervisor approval? Should there
be a comments field so that "Didn't have enough paint to finish the job" is
included in the record for the second purchase? I think the whole petty
cash example was rather poorly chosen. EmployeeID, FormID, and Date are not
sufficient in combination to verify uniqueness.
[the article] does not
address the question about whether the employee ID should be autonumber
or
another format.

The article is not concerned with the generation of employee numbers.
The meaningless unique key under discussion in the article is petty
cash form number, specifically its usage as a key rather than
considerations of generation algorithm (sequential is a reasonable
assumption) or source.

Let me attempt to clarify "whether the employee ID should be
autonumber" for you: employee number should not be an autonumber (note
that autonumber is not a 'format'). To do so would make the mdb the
trusted source of employee numbers for the enterprise, which would be
an absurdity (problems with autonumber going corrupt notwithstanding).
The trusted source of employee numbers should be the enterprise's
personnel officer or other authorized person with the ability to
verify employees in reality. Neither an mdb nor an autonumber can do
this.

My point about EmployeeID is not that the number used to punch in (or
whatever) should be autonumber, but that if it is the field involved in the
relationships it can be quite inconvenient when the EmployeeID number format
changes. By the way, I mean "format" generically. I realize that
autonumber is a type of long integer. In my company some of the EmployeeID
numbers went up by 12 (or something like that). Employee 8888 is now
Employee 8900. I don't know what became of 8888, or if it is assigned to
somebody else, or any of that. I just know that it is a nuisance. If they
decide to add a letter prefix to the number (which is now long integer),
there is yet another hassle in that the related field also needs to change
to a text field. The EmployeeID number should be unique. This can be
enforced. However, it is another matter to say that because it is unique it
should be the linking field. I think I prefer to keep EmployeeID unique,
but to have another unique (meaningless) field for linking.
The OP is asking whether they should use EITHER employee number OR
employee number plus autonumber. To use the OP's simple scenario,
consider these natural language examples:

ID 1 is auto-generated for Employee E7654.
ID 2 is auto-generated for Employee E7654.
ID 3 is auto-generated for Employee E7654.

Are these duplicates? Yes. Should there be a unique constraint on
employee number to prevent the above? Of course.

Again, I am not arguing against a unique constraint on EmployeeID, but
rather in favor of a separate field for linking.
 
J

Jeff Boyce

It would appear you've "read in" a lot more than I stated (or feel).

Regards

Jeff Boyce
Microsoft Office/Access MVP
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top