Why is it mportant to maintain data integrity in access?

Kay · Aug 24, 2007

Can anyone give me an answer to the question. I have a general idea but the
question has been asked of me and I would like a back up to what I think it
may be. Also what impact it may have on a company if data integrity was not
maintained

Gina Whipp · Aug 24, 2007

Simple answer: Without data intregrity your data is USELESS. This doesn't
just apply to Access but to any data management system. Think of you bank
account, suppose data intregrity didn't apply to it? No one would ever know
how much money was in your account!

More reading:
http://www.databasedev.co.uk/data_integrity.html

George Nicholson · Aug 24, 2007

Garbage In = Garbage Out

Data integrity is the first line of defense against Garbage In.
If garbage is acceptable then data integrity isn't a requirement.

HTH,

Ken Sheridan · Aug 24, 2007

Lets look at a very simple example:

You have a table which is basically a list of addresses. These could be
customers, suppliers, whatever. In this table you include a City column and
a State column. In one row someone enters Birmingham and Alabama, in another
San Francisco and California. So far so good. I know that in reality you'd
be more likely to use the abbreviated forms of the state names, but for this
example lets assume form the moment that the names are entered in full

Now someone enters a row with San Francisco and Callifornia. A simple typo,
but easily done (I once found three variations of my name in a database of
authors of technical papers!). If you then try to return all rows for
California this row is not going to be returned. So you create a States
table with one row per State and enforce referential integrity between this
and the table of addresses. Now it becomes impossible for a user to enter a
non-valid state name in the table of addresses.

Integrity is still at risk, however, as there is nothing to stop someone
entering an address in Birmingham, California. For all I know there may be a
Birmingham in California as well as the one in Alabama and the original one
30 miles down the road from me here, but if so lets assume that there isn't
and this is a mistake. If you then look for rows in Alabama this one won't
be returned of course. So anomalies are still possible. Imagine what either
of the above would do for a companies sales figures grouped by state or
regionally.

To ensure integrity we need a Cities table with a unique CityID primary key
column and a City text column (city names, like people's, can be duplicated
so are not suitable as a key). This table will also have a State column and
referential integrity will be enforced between the States and Cities tables.
The main table of addresses, however, will NOT have a State column but only a
CityID column referencing the primary key of Cities. As the state is known
from the CityID via the enforced relationships to have a State column in the
main table introduces redundancy and once again leaves the door open to
anomalies. In the jargon its what's known as a transitive functional
dependency, which indicates that a table is not properly normalized.

Really, of course, the process of normalization should be extended further
as you might have the same street entered in two rows in a table but,
incorrectly, with different CityID values, so integrity is still under
threat. In reality most people would not go so far as to cater for this by
decomposing the addresses table further, though here in the UK its now common
for addresses to be recorded by just the post code plus a house number or
similar, as the post code tells us everything else. My post code for
instance covers all properties on one side of my street, so the post code
identifies which street I live in and that this is in Stafford,
Staffordshire, England. All that's needed in addition is my house number to
pin me down precisely. Add-ins are available whereby entering a post code
automatically enters the other details, and drops down a combo box's list of
all house numbers or names in that post code, so the user is forced to select
a valid one. This need not necessarily be the correct one, of course; I
could still select one of my neighbours' house numbers! This illustrates the
difference between integrity and accuracy, the former can be catered for in
the design of the database, but its all but impossible to stop a user
entering an inaccurate yet valid value.

For an exhaustive examination of the theoretical basis of database integrity
issues Chris Date's 'Introduction to Database Systems' has a chapter of some
30 pages on the subject. It is quite abstract, however.

Ken Sheridan
Stafford, England

John W. Vinson · Aug 25, 2007

Can anyone give me an answer to the question. I have a general idea but the
question has been asked of me and I would like a back up to what I think it
may be. Also what impact it may have on a company if data integrity was not
maintained

That's not particularly an Access question, or even a database question.

Ask it this way: Does it matter if an Invoice entered into the system pertains
to a nonexistant customer?

Does it matter if an Order contains a dozen products which your company does
not stock?

Does it matter if an employee is assigned to a nonexistant department?

Does it matter if a payroll check is made out to a nonexistant employee... who
happens to have the same name as the consultant who designed the payroll
database?

If the answer to any of these questions is yes, then you need data integrity.

John W. Vinson [MVP]

Guest · Aug 25, 2007

Because it is more work to fix it up afterwards than it is to
get it right first time.

(david)

John W. Vinson · Aug 25, 2007

Because it is more work to fix it up afterwards than it is to
get it right first time.

<g>

Elegant, eloquent, accurate and damn near complete. Thanks David!

John W. Vinson [MVP]

tina · Aug 25, 2007

Does it matter if a payroll check is made out to a nonexistant employee...
who

happens to have the same name as the consultant who designed the payroll
database?

ROFLOL

Larry Linson · Aug 25, 2007

Kay said:
Can anyone give me an answer to the question. I have a general idea but
the
question has been asked of me and I would like a back up to what I think
it
may be. Also what impact it may have on a company if data integrity was
not
maintained

Could you mean "referential integrity"? I'm sure you know that "data
integrity" is necessary -- invalid data is useless.

Referential integrity, however, has very specific meaning in the relational
database world... it is enforcement of rules to avoid accidentally entering
an order detail for a non-existent order, or deleting a record for an office
which has a related employees list... it uses the database engine itself,
usually, to improve overall data integrity by enforcing definitions,
relationships, and procedures. (But, in some databases which don't have an
engine with this capability, similar rules are enforced by the code.)

Larry Linson
Microsoft Access MVP

Why is it mportant to maintain data integrity in access?

Kay

Gina Whipp

George Nicholson

Ken Sheridan

John W. Vinson

Guest

John W. Vinson

tina

Larry Linson