How would I remove all data from northwinds template.

B

BruceM via AccessMonster.com

You could make a new blank database. Import the objects from Northwinds
(File >> Get External Data >> Import). Navigate to Northwinds. Click the
Options button, and select to import tables as Definition Only. Import the
rest of the objects (queries, forms, etc.) too, if you want to have the same
interface as in Northwinds.
 
K

KenSheridan via AccessMonster.com

However, do note that the Customers table in Northwind is not properly
normalized. The following is a simple summary of basic normalization, and
asa an example includes a brief explanation of the flaws in the Customers
table in Northwind:


"Normalization is the process of eliminating redundancy from a database, and
involves decomposing a table into several related tables. In a relational
database each table represents an entity type, e.g. Contacts, Companies,
Cities, States etc. and each column in a table represents an attribute type
of the entity type, e.g. ContactID, FirstName and LastName might be attribute
types of Contacts and hence columns of a Contacts table. Its important that
each attribute type must be specific to the entity type, so that each 'fact'
is stored once only. In the jargon its said that the attribute type is
'functionally dependent' solely on the whole of the primary key of a table.

To relate tables a 'referencing' table will have a foreign key column which
makes the link to the 'referenced' table, e.g. a Contacts table might have a
CompanyID column as a foreign key, while a Companies table has a CompanyID
column as its primary key. Consequently no data other than the CompanyID
needs to be stored in a row in the Contacts table for all the company
information for that contact to be known; its available via the relationship
and can be returned in a query joining the two tables on the CompanyID
columns.

Similarly the Companies table might have a CityID column and the Cities table
a StateID column. If its an international database the States (or more
generically Regions) table would have a CountryID referencing the primary key
of a Countries table. So via the relationships, simply by entering (in
reality this would be selected from a list of Companies in a combo box, not
typed in) a CompanyID in the Contacts table the location of the contact's
company is also known. Redundancy, and therefore repetitive data entry is
avoided.

To see how a database is made up of related tables take a look at the sample
Northwind database. Its not completely normalized in fact (deliberately so
for the sake of simplicity) but illustrates the main principles of how tables
representing entity types relate to each other. An example of its lack of
proper normalization can be found in its Customers table. You'll see that
this has City, Region and Country columns so we are told numerous times that
São Paulo is in SP region (as is Resende) and that SP region is in Brazil.
Not only does this require repetitive data entry, but more importantly it
opens up the risk of inconsistent data, e.g. it would be perfectly possible
to put São Paulo in California in one row and California in Ireland! Proper
normalization as I described above would prevent this as the fact that São
Paulo is in SP region would be stored only once in the database as would the
fact that SP region is in Brazil and California is in the USA.

An example of what at first sight might seem to be redundancy, but in fact is
not, can also be found in Northwind. The Products table and the OrderDetails
table both have UnitPrice columns. It might be thought that the unit price
of a product could always be looked up from the Products table, so its
unnecessary in Order Details. However, the unit price of a product will
change over time, but each order needs to retain the price in force at the
time the order was created. Consequently a UnitPrice column is needed in
both tables; that in products holds the current price and is used to get the
value for that in Order Details (code in the ProductID control's AfterUpdate
event procedure in the Order Details Subform does this), which then remains
static when the current price (in products) changes. In each case UnitPrice
is functionally dependent on the key of the table, so there is no redundancy"


By all means use Northwind as the basis for your application, but be aware of
these flaws and their possible consequences. It would not be a difficult job
to decompose the Customers table by creating separate Cities, Regions and
Countries tables, but this would be best done before entering any data.
You'd then need to amend as necessary any forms, queries or reports which use
the Customers table.

For more on normalization see:

http://www.datamodel.org/NormalizationRules.html

Ken Sheridan
Stafford, England
 
B

BruceM via AccessMonster.com

The link is not available. There is a notice telling the owner to call a
number. Perhaps domain name registration has expired or some such thing.
 
K

KenSheridan via AccessMonster.com

So I see. It was there yesterday! I never post a link without checking
first. There's always Wiki:

http://en.wikipedia.org/wiki/Database_normalization

or just Google 'normalization'; there's no shortage of sites!

Ken Sheridan
Stafford, England
The link is not available. There is a notice telling the owner to call a
number. Perhaps domain name registration has expired or some such thing.
However, do note that the Customers table in Northwind is not properly
normalized. The following is a simple summary of basic normalization, and
[quoted text clipped - 71 lines]
 
J

John_G via AccessMonster.com

Hi Ken -

I know this is a bit off-topic for the thread, but I disagree with you that
the Northwind Customers table is not properly normalized, at least with
respect to City, Region and Country. All of those are needed in that table
to properly describe an address. You are assuming that there is ony ONE São
Paulo (or any other city name) in the world, in which case you would be
right. But there are probably many São Paulo's, so each São Paulo address
must specify which São Paulo it is. An example closer to home, if someone
says to me "I'm going to London" I have to ask them "London where - Canada or
England?". Similarly, in Canada we can have the same city name in two or
more provinces (Region). Now, I suppose that the combination of City + Region
could be unique, but there is no guarantee of that, and indeed no requirement.


I think we have to leave the Customers table as it is.

Cheers!

John
 
J

John W. Vinson

Now, I suppose that the combination of City + Region
could be unique, but there is no guarantee of that, and indeed no requirement.

There are two places named Los Alamos in New Mexico.
 
K

KenSheridan via AccessMonster.com

John:

Thanks for the comments, but I have to differ I'm afraid. I don't think that
there can be any question that Northwind's Customers table is not adequately
normalized.

Taking a couple of rows at random Great Lakes Food Market is in Eugene which
is in Oregon which is in the USA. Hungry Coyote Import Store is in Elgin
which is in Oregon which is in the USA. So we are told twice that Oregon is
in the USA. That is redundancy and leaves the door wide open to inconsistent
data. Its perfectly possible to update one of these by changing USA to UK,
both incorrectly and inconsistently. Country is determined by Region, which
is a non-key column, so there is a transitive functional dependency and the
table is not normalized to Third Normal Form, the definition (from Date) of
which is:

'Third Normal Form: A relvar is in 3NF if and only if its in 2NF and every
non-key attribute is non-transitively dependent on the primary key.'

Very loosely speaking a table is the equivalent of the term relation, a table
definition is the equivalent of the term relvar (relation variable), a column
(aka field) is the equivalent of the term attribute, and a row (aka record)
is the equivalent of the term tuple in the formal language of the relational
model.

The table needs decomposing into Cities, Regions and Countries. As regards
city names these can be legitimately duplicated, so the table should have a
numeric CityID column referencing a surrogate CityID primary key of Cities.
While 'regions' have distinct values in the USA (states), here (for
addressing purposes the county is used as the regional unit here) and I
assume in Canada, I'm not sure if this is true worldwide, so I'd play safe
and use a surrogate RegionID key, but if the database covers only a single
country a natural key could be used. Country names do have distinct values
so a natural key would be fine.

For ways of handling this type of hierarchical data by means of correlated
combo boxes while maintaining a normalized structure see my demo at:

http://community.netscape.com/n/pfx...yMessages&tsn=1&tid=23626&webtag=ws-msdevapps


which uses the local administrative area of Civil Paris, District and County
in my area.

If one wished to be relationally purist to an absurd level one could of
course use multiple keys, but that would mean the key of a Cities table in
Northwind would be made up of City, Region and Country columns, and Regions
would have a key of Region and Country. This would allow the enforcement of
referential integrity and avoid the potential inconsistencies to which
Northwind is currently wide open, but it would be very cumbersome. In any
event Northwind has no such tables. Anyone using Northwind as a model, who
doesn't wish to decompose the Customers table would nevertheless be well
advised to create and fill such tables, which is a trivial task, and create
enforced relationships.

I actually came across an unusual example of how easily incorrect and
inconsistent data can be entered only a few days ago, not in a database as
such, but in the 1911 census returns. My wife, who's a professional
genealogist, had printed off a copy of the census return for my great-
grandfather's house as an example for her students. He'd entered the
birthplace of my great-grandmother (the only non-Irish member of my family in
that generation) as Cumberland, Scotland, whereas it is in fact in England
(not far from the border, its true, but England all the same). Now if only
he'd been using a database with well normalized tables, he'd have got it
right! At a more prosaic level I once found in one database three versions
of my own name as author of technical papers in my own field of work as a
result of poor normalization.

Ken Sheridan
Stafford, England

John_G said:
Hi Ken -

I know this is a bit off-topic for the thread, but I disagree with you that
the Northwind Customers table is not properly normalized, at least with
respect to City, Region and Country. All of those are needed in that table
to properly describe an address. You are assuming that there is ony ONE São
Paulo (or any other city name) in the world, in which case you would be
right. But there are probably many São Paulo's, so each São Paulo address
must specify which São Paulo it is. An example closer to home, if someone
says to me "I'm going to London" I have to ask them "London where - Canada or
England?". Similarly, in Canada we can have the same city name in two or
more provinces (Region). Now, I suppose that the combination of City + Region
could be unique, but there is no guarantee of that, and indeed no requirement.

I think we have to leave the Customers table as it is.

Cheers!

John
... An example of its lack of
proper normalization can be found in its Customers table. You'll see that
[quoted text clipped - 6 lines]
Paulo is in SP region would be stored only once in the database as would the
fact that SP region is in Brazil and California is in the USA.
 
J

John_G via AccessMonster.com

Hi Ken --

Sorry, I still don't agree. You are making assumptions that are not valid.
Taking a couple of rows at random Great Lakes Food Market is in Eugene which
is in Oregon which is in the USA. Hungry Coyote Import Store is in Elgin
which is in Oregon which is in the USA. So we are told twice that Oregon is
in the USA.

Not true. We only know that THAT Oregon is in the USA. The might be another
"Oregon" region somewhere else.
That is redundancy and leaves the door wide open to inconsistent
data. Its perfectly possible to update one of these by changing USA to UK,
both incorrectly and inconsistently.

Only because you know that there is no Oregon in the UK. What if there were?
It's a perfectly valid option (from a human point of view), but there is no
data rule that prevents it.
Country is determined by Region

No, it's not. On what basis can you make this statement? First, you would
have to know every region (whatever that is) in every country, and second,
there would have to be a world-wide enforced rule (even more than a
convention) that no country could duplicate a region name.
city names these can be legitimately duplicated, so the table should have a
numeric CityID column referencing a surrogate CityID primary key of Cities.

Which for a large database could number in the thousands. Would you really
expect your poor users to deal with that mess?
... He'd entered the
birthplace of my great-grandmother (the only non-Irish member of my family in
that generation) as Cumberland, Scotland, whereas it is in fact in England
(not far from the border, its true, but England all the same). Now if only
he'd been using a database with well normalized tables, he'd have got it
right!

Again, only because you yourself happen to know that. What if there was a
Cumberland in Scotland as well? Could easily be, and no data rules are
violated - from a database theory viewpoint. For that particular application,
yes, but that's not relevant.


Do you see what I'm getting at here? What might be true for a given
application (e.g. only one country) might lend itself to your techniques, but
from a general theoretical viewpoint it is not correct.

John
 
K

KenSheridan via AccessMonster.com

John:

I think we'll have to agree to disagree on this. It seems to me your
understanding of what Functional Dependency means is the problem. You are
treating it as though it’s the individual attribute value which is the
determinant, whereas it is in fact the attribute type. Its possible to
determine that a table is not in 3NF irrespective of the values. This
obviously has to be the case because there is no knowing what values may need
to be inserted in the future.

Its not that 'Oregon' determines 'USA', but that the attribute type Region
determines the attribute type Country, i.e. for every value of Region the
value of Country is known; in Northwind this can't be assumed to be so hence
the redundancy and the resulting risk of bad data where Region is a non-key
column. Whether there is one Oregon worldwide or 6 is immaterial.
Normalization to 3NF does not concern itself with the data per se, only the
attribute types; apart from trivial examples like days of the week or months
of the year the total possible legitimate values of a type are at any time
unknown, but what is possible to say is whether the values must be distinct
or not. This is why the table needs decomposing by projection on City, Region
and Country, and the 'natural' key City needs replacing with a distinctly
valued surrogate key so that, allowing for n cities of the same name, it does
determine RegionID (the result of that projection is not normalized to 3NF
of course, so needs further decomposition by projection on Region and Country
in the same way).

Your argument is that because of the theoretical possibility of n Oregon's
the table is normalized to 3NF. That this argument is based on a flawed
understanding of what Functional Dependency means is easily demonstrated by a
hypothetical update. Lets assume that in the row for the customer in Eugene
is changed to UK. You argue that there may well be a region Oregon in the UK
with a City Eugene, which is theoretically possible, but is not in fact the
case, so the table does now contain inconsistent and incorrect data in the
Region and Country columns in that we are told that there are Oregon regions
in both countries, which is untrue.

Lets assume that there really is an Oregon region in the UK, but it does not
contain a city Eugene. In this scenario the same update results in
inconsistent and incorrect data in that we are told that the Oregon region in
the UK contains a city Eugene, which is also untrue in our hypothetical
scenario.

Also of course there is the possibility of a simple typo such as Orogon. Of
the three versions of my name I mentioned, two were correct variations, one
was incorrect, clearly resulting from a typo.

Decomposing the table not only results in a set of tables all in 3NF (and
beyond in fact) by virtue of the non-key columns being functionally dependent
on 'the key, the whole key and nothing but the key, so help me Codd', but
also, and this is the real object of normalization, removes the possibility
of the inconsistent and incorrect data which can result from redundancy.
Customers references CityID in Cities, which in turn references RegionID in
Regions, which in turn references Country in Countries. By using surrogate
keys the 4 Staffords in the USA can be distinguished from the original one
where I live. But its not possible to select a non-existent Stafford,
Staffordshire, USA, as no row for such exists in Cities. A user could of
course insert a new row of Staffordshire, USA in Regions and a new row for
Stafford, Staffordshire in Cities (with a different CityID to the real
Stafford, Staffordshire) but that would take a really perverse user ( I think
we can assume that the US government is not going to create a new state of
Staffordshire), whereas Stafford, Staffordshire, USA in the non-normalized
Customers table could be very easily entered as a result of a simple typo or
a moment's inattention by the user, or even possibly as a result of a level
of geographical ignorance such as that exhibited by my great-grandfather.

Somehow I don't think you'll be convinced of the error of your ways, and I
certainly won't be. So while these debates are entertaining any continuance
would, I think be further pointless time wasting. The OP and any lurkers can
judge for themselves.

Ken Sheridan
Stafford, England
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top