PK - To AutoNumber or Not To AutoNumber - That is the Question! :-

D

dee

I seem to have really stirred things up here! :)

It's so interesting to read the articles suggested and all of your responses.

I have come up against something that may just have forced me to use a
question number in a survey database I am creating. I have the questions in
one table and now am creating the answers table. The kicker is that the
answers have been input in Excel and need to be imported into Access. I will
be relating the answers to the question number. I had initially used an
autonumber questionID, but I'm not sure this will be useful at this point
because there is no such thing in the Excel file.

What there IS in the Excel file is a unique question number (combination of
letters and numbers) for each and every question and these will never change.

Opinions or suggestions? Thanks!
--
Thanks!

Dee


Jamie Collins said:
Then they have a problem that they might have avoided with a more "natural"
EmployeeID used as the key, until they decide that since everybody is part
of one company everybody's ID should be the same format. There is no single
best answer

I agree but I think many people see "autonumber primary key" as always
being the answer, the panacea. I'm not sure whether the blame lies
with the "I tend to use Autonumber PKs for most of my tables" brigade
or whether it's Access's "A table must have a primary key for you to
define a relationship between this table and other tables in the
database [incorrect]. Do you want to create a primary key now?" but,
if proposed designs posted in these groups are anything to go by,
consideration of candidate keys tends to go no further than
"autonumber primary key" :(

Jamie.
 
D

dee

On another note, I just read something else interesting.

If the human beings responsible for input of the question number, etc. that
will be the PK make a mistake, it will be difficult to update.

If, however, an autonumber is used for the PK, then the change can be made
and cascading to all related tables.

Any input would be appreciated.
--
Thanks!

Dee


Jamie Collins said:
Then they have a problem that they might have avoided with a more "natural"
EmployeeID used as the key, until they decide that since everybody is part
of one company everybody's ID should be the same format. There is no single
best answer

I agree but I think many people see "autonumber primary key" as always
being the answer, the panacea. I'm not sure whether the blame lies
with the "I tend to use Autonumber PKs for most of my tables" brigade
or whether it's Access's "A table must have a primary key for you to
define a relationship between this table and other tables in the
database [incorrect]. Do you want to create a primary key now?" but,
if proposed designs posted in these groups are anything to go by,
consideration of candidate keys tends to go no further than
"autonumber primary key" :(

Jamie.
 
J

John W. Vinson

What there IS in the Excel file is a unique question number (combination of
letters and numbers) for each and every question and these will never change.

Well, why not use that field as the Primary Key of the questions table? It's
unique; it's stable; it's probably a perfectly good PK.

John W. Vinson [MVP]
 
J

John W. Vinson

If the human beings responsible for input of the question number, etc. that
will be the PK make a mistake, it will be difficult to update.

If, however, an autonumber is used for the PK, then the change can be made
and cascading to all related tables.

You have it backwards.

Cascade Update is completely useless for an Autonumber, since an Autonumber
field cannot be edited. If instead you use a Text field as the PK, it *can* be
edited and cascade updates will take care of the related tables.

John W. Vinson [MVP]
 
D

dee

Hi John,

I value your advice, help and input. I will use the text fields that make
sense. I know that there is a possibility of human error when inputting.
Any suggestions on how to handle that? I guess just the validation rules,
field size, etc.?

Perhaps part of the process will be having the data entry verified after
input.

I believe you had also guided me recently to the survey database by Duane
Hookom. Thanks so much. It has helped to a degree, except that the audience
is really those who wish to just modify the questions and survey titles, etc.

I am trying to create my database from scratch. Quite a challenge. If you
have any other resources or suggestions regarding this type of database, I'd
really appreciate hearing about them!

Thanks again.
--
Thanks!

Dee
 
J

John W. Vinson

Hi John,

I value your advice, help and input. I will use the text fields that make
sense. I know that there is a possibility of human error when inputting.
Any suggestions on how to handle that? I guess just the validation rules,
field size, etc.?

Well, to the extent that you want the user to select from a set of existing
values (question numbers say), use the builtin tools that do so: Combo Boxes
and Listboxes. It's much easier to make an error typing blindly into a blank
textbox than selecting from a list.
Perhaps part of the process will be having the data entry verified after
input.

The Form's BeforeUpdate event is the place to do so. You can check for valid
values and combinations of values on the form, and set Cancel to True with a
warning to the user if there are errors.
I believe you had also guided me recently to the survey database by Duane
Hookom. Thanks so much. It has helped to a degree, except that the audience
is really those who wish to just modify the questions and survey titles, etc.
I am trying to create my database from scratch. Quite a challenge. If you
have any other resources or suggestions regarding this type of database, I'd
really appreciate hearing about them!

I've *adapted* Duane's At Your Survey a couple of times to meet different
needs. It's hard to top.

John W. Vinson [MVP]
 
J

Jamie Collins

EmployeeID, FormID, and Date are not
sufficient in combination to verify uniqueness.

It sounds to me like the author used a real life example where a real
life domain expert said they would be challenged as 'duplicates' by a
real life auditor. But, hey, you know best, eh <g>? Perhaps you could
suspend disbelief and concentrate on the point: would the meaningless
unique number (petty cash form number) satisfy an auditor that no
duplicates had been entered? I can't tell you how to think but would
encourage you to open your mind (perhaps it would help if you imagined
and Access MVP had written the article said:
My point about EmployeeID is [that]
it can be quite inconvenient when the EmployeeID number format
changes.
In my company some of the EmployeeID
numbers went up by 12
I just know that it is a nuisance. If they
decide to add a letter prefix to the number (which is now long integer),
there is yet another hassle in that the related field also needs to change
to a text field.

So a meaningless unique number (such as autonumber) solves a potential
problem that may never happen in reality. Big deal. I could say,
"Autonumbers make the data less readable, necessitating a join to the
referenced tables rather than examining just the referencing table
itself", noting that I look at a table far more often than a client
changes an enterprise key that has been mutually agreed to be
considered stable, and you could say, "Big deal." It's a balance.
I realize that
autonumber is a type of long integer.

Last time I looked autonumber could be a 'Replication ID' which is not
a 'long integer'.

Jamie.

--
 
J

Jamie Collins

It would appear you've "read in" a lot more than I stated (or feel).

You didn't say why you tend to use Autonumber PKs for most of your
tables. I don't know how you are feeling. I assume the same goes for
the OP. I genuinely don't want to stifle discussion. Please tell us
your views.

Jamie.

--
 
B

BruceM

Jamie Collins said:
It sounds to me like the author used a real life example where a real
life domain expert said they would be challenged as 'duplicates' by a
real life auditor. But, hey, you know best, eh <g>? Perhaps you could
suspend disbelief and concentrate on the point: would the meaningless
unique number (petty cash form number) satisfy an auditor that no
duplicates had been entered? I can't tell you how to think but would
encourage you to open your mind (perhaps it would help if you imagined
and Access MVP had written the article <g>?)

The point of the article seems to have been that a combination of fields
would provide adequate demonstration of uniqueness to satisfy the auditor
that the record is not a duplicate. My point is that it would not. The
meaningless number would not satisfy the auditor, nor would any combination
of the fields used in the example.
No, I do not claim to "know best", but a natural key argument based on
flawed logic does not convince me that the author does either. I expect
there is a valid point in the article, but the example gets in the way.
The logic would have been flawed no matter who had written the article. On
the subject of open-mindedness, do you make any particular claims? ;-)
My point about EmployeeID is [that]
it can be quite inconvenient when the EmployeeID number format
changes.
In my company some of the EmployeeID
numbers went up by 12
I just know that it is a nuisance. If they
decide to add a letter prefix to the number (which is now long integer),
there is yet another hassle in that the related field also needs to
change
to a text field.

So a meaningless unique number (such as autonumber) solves a potential
problem that may never happen in reality. Big deal. I could say,
"Autonumbers make the data less readable, necessitating a join to the
referenced tables rather than examining just the referencing table
itself", noting that I look at a table far more often than a client
changes an enterprise key that has been mutually agreed to be
considered stable, and you could say, "Big deal." It's a balance.

I prefer to stay away from situations that would involve updating all of the
records in a related table. Perhaps this is a result of inexperience, and
you would not give such an update a second thought. Companies are bought,
sold, merged, and so forth all the time, so even if somebody today says the
number is stable, that person may be out of work in a month when the company
is bought out and a new system is put in place. I would rather guard
against that not improbably circumstance. You are not daunted by the
prospect of what to me looks like a massive update. As you said, it's a
balance.
Last time I looked autonumber could be a 'Replication ID' which is not
a 'long integer'.

In table design view the Data Type is listed as Autonumber, and the Field
Size as Long Integer. There is probably a semantic reason why the
terminology is inaccurate, but these are the terms Access uses. An
autonumber can be related only to a Long Integer field. I don't know what a
ReplicationID is, or how it differs from other use of an autonumber field.
I was responding to your reminding me that autonumber is not a format.
 
D

dee

Hi again, John,

When you adapted his database, did you do so from scratch or modify what was
there? I have a few areas that I'm not sure of and don't know if you could
provide some guidance:

1. He has all of his answers in the answers table, listed by question
number and response, i.e. 1. first choice answer 1. second choice
answer 1. third choice answer, etc. I have some about 3 questions that
require a choice from about 15 - 25 choices. Should I simply continue his
way of doing this, or create sub-tables?

2. On my forms, I have the respondent code and the interviewer code, plus
the date and time the survey was filled in. I decided to use the question
code as my PK because it is unique - each has its own combination of letters
and numbers - however, these other fields aren't numbered. Should I add a
fictitious number and have the interviewer code, date, etc. as just another
answer, or should I treat these differently?

3. I also have a lot of information about the respondents in a separate
table, but I guess this is just additional information that I would add to
the one in Duane's db.

4. I have two sets of surveys - one for adults and one for children - the
names of the surveys indicate which is which. I suppose there is no need to
separate these at all... Just have a table with questions, a table with
answers, a table with responses.

Sound correct?

Thanks!

--
Thanks!

Dee
 
J

Jamie Collins

I expect
there is a valid point in the article, but the example gets in the way.

Personally, I can never see the valid point of an article once I've
spotted a spelling mistake or see the word 'enormity' used to mean
'big' rather than 'evil' or when another such irrevocable feature has
got in the way. Excuse me while I go for a lie down.
In table design view the Data Type is listed as Autonumber, and the Field
Size as Long Integer. There is probably a semantic reason why the
terminology is inaccurate, but these are the terms Access uses. An
autonumber can be related only to a Long Integer field. I don't know what a
ReplicationID is, or how it differs from other use of an autonumber field.
I was responding to your reminding me that autonumber is not a format.

I'm confident I can show you to be wrong but could you admit it to
yourself, I wonder? Oh well, here goes nothing...

In table design view, click on 'Long Integer' and it will reveal
itself to be a dropdown. The second option on the dropdown after 'Long
Integer' is 'Replication ID' while the 'Data type' still shows
'Autonumber' (if I wasn't open minded, how would I know so much about
autonumber and in the Access user interface too <g>?)

I think describing autonumber as a 'data type' is as wrong as
describing it as a 'format'. I'd suggest 'property'. Here's what it
looks like as Access/Jet SQL DDL (ANSI-92 Query Mode syntax):

CREATE TABLE Employees (
myID GUID DEFAULT GenGUID() NOT NULL UNIQUE,
employee_number CHAR(10) NOT NULL PRIMARY KEY
);

The best article I can come up with is:

How to use GUID fields in Access from Visual C++
http://support.microsoft.com/kb/170117

"Microsoft Access 95 introduced a new GUID (Globally Unique
Identifier) data type used for database replication. You can use GUID
fields (called Replication ID fields in Access) to store any GUID
number in a compact 16-byte binary format that can be indexed and used
as a primary key for a table... Microsoft Access also provides an
"AutoNumber" type GUID field. You can use this type of field for the
primary key of a table and let Access automatically generate each new
GUID."

I hope those C++ examples don't lead you to dismiss the article
outright <g> because, for me at least, it quite clearly states that
GUID fields a.k.a. Replication ID fields are autonumber fields.

Jamie.

--
 
B

BruceM

Jamie Collins said:
Personally, I can never see the valid point of an article once I've
spotted a spelling mistake or see the word 'enormity' used to mean
'big' rather than 'evil' or when another such irrevocable feature has
got in the way. Excuse me while I go for a lie down.

The article claims that a particular combination of fields used as the key
will demonstrate uniqueness to the auditor's satisfaction (and deny
reimbursement to the person in my example who bought a can of paint late in
the day). The problem is not a spelling error or a misused word, but rather
that the author uses faulty logic to make a logical point that is therefore
not valid. If my logic is flawed, please point it out rather than restating
that the only problem with the article is that I am of closed mind as I read
it.
I'm confident I can show you to be wrong but could you admit it to
yourself, I wonder? Oh well, here goes nothing...

I did not claim to be right about anything except my observations of the
table design interface. There is a column labeled Data Type. In that
column, Autonumber may be selected. In the Field Size box, Long Integer or
ReplicationID may be selected. I do know that a relationship must be
between two fields of the same Data Type (Text and Text, etc.). When one of
the fields is Autonumber, the other must be Long Integer. If you believe I
am wrong in referring to these things by the terms by which they are
identified in the design interface, I will try to remember to use your
chosen words when conversing with you ;-). I think "property" would
probably be more accurate, as you suggest, but if I was describing the
interface I would use the term the user will actually find there.
In table design view, click on 'Long Integer' and it will reveal
itself to be a dropdown. The second option on the dropdown after 'Long
Integer' is 'Replication ID' while the 'Data type' still shows
'Autonumber' (if I wasn't open minded, how would I know so much about
autonumber and in the Access user interface too <g>?)

I think describing autonumber as a 'data type' is as wrong as
describing it as a 'format'. I'd suggest 'property'. Here's what it
looks like as Access/Jet SQL DDL (ANSI-92 Query Mode syntax):

Geez. I used "format" to refer to how the number looks to an observer. I
acknowledged that it was inaccurate terminology. Then I used the term
Access uses. Again, this was problematic. It looks like I'm wrong at every
turn. The worst part is, it's not even relevant. You know what I mean.
 
M

Michael Gramelspacher

Hi again, John,

When you adapted his database, did you do so from scratch or modify what was
there? I have a few areas that I'm not sure of and don't know if you could
provide some guidance:

1. He has all of his answers in the answers table, listed by question
number and response, i.e. 1. first choice answer 1. second choice
answer 1. third choice answer, etc. I have some about 3 questions that
require a choice from about 15 - 25 choices. Should I simply continue his
way of doing this, or create sub-tables?

2. On my forms, I have the respondent code and the interviewer code, plus
the date and time the survey was filled in. I decided to use the question
code as my PK because it is unique - each has its own combination of letters
and numbers - however, these other fields aren't numbered. Should I add a
fictitious number and have the interviewer code, date, etc. as just another
answer, or should I treat these differently?

3. I also have a lot of information about the respondents in a separate
table, but I guess this is just additional information that I would add to
the one in Duane's db.

4. I have two sets of surveys - one for adults and one for children - the
names of the surveys indicate which is which. I suppose there is no need to
separate these at all... Just have a table with questions, a table with
answers, a table with responses.

Sound correct?

Thanks!
Here is something I have put together on surveys. It is merely another thing to
take a look at. http://www.psci.net/gramelsp/temp/Survey_Practice_2.zip

I tried to implement a design that was suggested by several other people.
 
J

Jamie Collins

There is a column labeled Data Type. In that
column, Autonumber may be selected. In the Field Size box, Long Integer or
ReplicationID may be selected. I do know that a relationship must be
between two fields of the same Data Type (Text and Text, etc.). When one of
the fields is Autonumber, the other must be Long Integer.

Even if the Autonumber is a 16 byte Replication ID (GUID) field? I
don't think so.

Jamie.

--
 
B

BruceM

OK, you got me. I don't know what a replication ID is, nor how it is used
in relationships.
 
J

Jamie Collins

The worst part is, it's not even relevant. You know what I mean.

I did cut you some slack: you *can* create a relationship between a
Long Integer Autonumber and (for example) a DATETIME column but I
*know* you meant with RI enforced ;-)

Jamie.

--
 
J

John W. Vinson

Hi again, John,

When you adapted his database, did you do so from scratch or modify what was
there? I have a few areas that I'm not sure of and don't know if you could
provide some guidance:

I modified his database. This was about four or five years ago, I no longer
have the database (I was working for someone else at the time), and I'm not
sure I remember all the details.
1. He has all of his answers in the answers table, listed by question
number and response, i.e. 1. first choice answer 1. second choice
answer 1. third choice answer, etc. I have some about 3 questions that
require a choice from about 15 - 25 choices. Should I simply continue his
way of doing this, or create sub-tables?

A table can hold tens of millions of records. You do NOT need or want to
create subtables. Just add more records to the answer table!
2. On my forms, I have the respondent code and the interviewer code, plus
the date and time the survey was filled in. I decided to use the question
code as my PK because it is unique - each has its own combination of letters
and numbers - however, these other fields aren't numbered. Should I add a
fictitious number and have the interviewer code, date, etc. as just another
answer, or should I treat these differently?

So each question will be asked once, and once only, for the entire use of the
database?! Surely not; but that's what a primary key does. And why should
*fields* be numbered? I'm really not at all sure what you're saying here! The
Questionnaire table should have its own primary key; the Question table should
have its own primary key (the question number, which may very well be a Text
field); but when the users are entering answers, that table should NOT use the
Question Number as its primary key!!
3. I also have a lot of information about the respondents in a separate
table, but I guess this is just additional information that I would add to
the one in Duane's db.

Sure, just add fields to the people table.
4. I have two sets of surveys - one for adults and one for children - the
names of the surveys indicate which is which. I suppose there is no need to
separate these at all... Just have a table with questions, a table with
answers, a table with responses.

Exactly. There would be separate Questionnaire records for the two surveys,
each related one-to-many to the Questions table and the Answers table.
Sound correct?

Thanks!

John W. Vinson [MVP]
 
D

dee

HI there,

A nice clean database. I have a question for you... What if the answers are
not limited to set choices.

In my case, I have some questions whose answers are limited to the list of
choices, but I have others where they need to fill in information, such as x
number of times per week or day or month (they have to choose) that something
takes place. Other times, they may simply write their own answer.

How would that be handled?

Thanks!
 
D

dee

Hi John,

Sorry if I wasn't clear - was up half the night working on this thing for
the last few nights and I think I'm getting a bit incoherent! :)

I will have separate tables for:

- Questionnaire names and descriptions and questionnaire PK, an autonumber.

- Questions, using the question number, which, yes, is text as the PK.

- Answers, using a combined question number and answer PK and including only
those two fields.

- Responses table, using an the respondentID combined with the questionID as
PK (?) and including the individual responses.

- Questionnaire Respondents table with respondentID as PK and including
questionnaireID and participantID

- Respondents table, using their ID code as the PK. This will be very
detailed, including DOB, language, address, phone, male-female, etc., etc.

- Interviewer table, using the ID code as the PK. I am getting your idea
about a "people" table that would include both respondents and interviewers,
but there would be many blanks fields for the interviewers because we may
only include their code, first and last names and maybe one other field.
Does this sound correct?

I think that's it! Does it seem sound?

Thank you so much for your time. I greatly appreciate it and your valuable
advice.





--
Thanks!

Dee
 
M

Michael Gramelspacher

HI there,

A nice clean database. I have a question for you... What if the answers are
not limited to set choices.

In my case, I have some questions whose answers are limited to the list of
choices, but I have others where they need to fill in information, such as x
number of times per week or day or month (they have to choose) that something
takes place. Other times, they may simply write their own answer.

How would that be handled?

Thanks!
Too many errors in my database. I uploaded a somewhat better version, but there
are still too many unresolved conceptual problems. I do not think this will
work as it is.

Your question is another unresolved issue.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top