Aligning beyond "Joining" in query--forcing blank rows to appear

Peter · Apr 9, 2010

I'm using MS Access 2002. I want to design a query that forces a certain key
table's number of rows to be repeated in each query result, even when some
elements in other tables do not report data for certain items on the key
table.

Background: I'm updating a big benchmarking database that I previously
slogged through Excel to clean up. This time I want to design better queries
in Access before exporting the data to Excel for statistical analysis. The
data involves 2,500 data items per electric utillity per year, with 200+
utility companies on the list, going back every year to 1994. If you
multiply those out you end up with 7.5 million data points...even though I
only end up using 5% of the data, but it will still be several hundred
thousand data items even before I start the number crunching in Excel.

The key table is called Row Literal Table. It defines what the row numbers
in the other tables mean.

Example: The Plant in Service table if fully loaded, per all the items for
it in the Row Literal Table, has 104 lines of data, but one utility may have
no nuclear power plants, so it will have blanks where the nuclear plant data
rows are, and then it will show (say) only 80 lines out of the full 104
possible, in its Plant in Service query results, while another utility that
has no transmission lines will have a different subset out of the 104 lines
of data on the same query result.

When I export my data into Excel I want to line up all the Row Literals in
the same rows in Excel, across for each year from 1994 to the present, but,
the rows don't line up, because of the blank items in some utilities make
their lists shorter.

So I want to do a query so that if a utility does not report something, its
query result should have a line with the row literal (e.g. Nuclear plant in
service) and then blanks on that row, rather than missing that row. This
way, in Excel, I can do statisics much more easily. There's too much data
and too much variability to use vlookups or nested IF statements in Excel to
resolve this.

Thanks in advance.

Peter

Marshall Barton · Apr 9, 2010

Peter said:
I'm using MS Access 2002. I want to design a query that forces a certain key
table's number of rows to be repeated in each query result, even when some
elements in other tables do not report data for certain items on the key
table.

Background: I'm updating a big benchmarking database that I previously
slogged through Excel to clean up. This time I want to design better queries
in Access before exporting the data to Excel for statistical analysis. The
data involves 2,500 data items per electric utillity per year, with 200+
utility companies on the list, going back every year to 1994. If you
multiply those out you end up with 7.5 million data points...even though I
only end up using 5% of the data, but it will still be several hundred
thousand data items even before I start the number crunching in Excel.

The key table is called Row Literal Table. It defines what the row numbers
in the other tables mean.

Example: The Plant in Service table if fully loaded, per all the items for
it in the Row Literal Table, has 104 lines of data, but one utility may have
no nuclear power plants, so it will have blanks where the nuclear plant data
rows are, and then it will show (say) only 80 lines out of the full 104
possible, in its Plant in Service query results, while another utility that
has no transmission lines will have a different subset out of the 104 lines
of data on the same query result.

When I export my data into Excel I want to line up all the Row Literals in
the same rows in Excel, across for each year from 1994 to the present, but,
the rows don't line up, because of the blank items in some utilities make
their lists shorter.

So I want to do a query so that if a utility does not report something, its
query result should have a line with the row literal (e.g. Nuclear plant in
service) and then blanks on that row, rather than missing that row. This
way, in Excel, I can do statisics much more easily. There's too much data
and too muchb variability to use vlookups or nested IF statements in Excel to
resolve this.

I think all you need to do is use an outer join from the
literals table to the plants table. Maybe a little like:

SELECT [Row Literal Table].*, Service.*
FROM [Row Literal Table] LEFT JOIN Service
ON [Row Literal Table].pkfield = Service.fkfield

KARL DEWEY · Apr 9, 2010

You need a LEFT JOIN Like this --
FROM [Row Literal Table] LEFT JOIN tblSecondary ON [Row Literal
Table].[RowNumberField] = tblSecondary.[RowNumberField]

Peter · Apr 9, 2010

Thanks... but now the depth of my ignorance will show. Do I paste that into
a SQL query?

KARL DEWEY said:
You need a LEFT JOIN Like this --
FROM [Row Literal Table] LEFT JOIN tblSecondary ON [Row Literal
Table].[RowNumberField] = tblSecondary.[RowNumberField]

--
Build a little, test a little.

Peter said:

I'm using MS Access 2002. I want to design a query that forces a certain key
table's number of rows to be repeated in each query result, even when some
elements in other tables do not report data for certain items on the key
table.

Background: I'm updating a big benchmarking database that I previously
slogged through Excel to clean up. This time I want to design better queries
in Access before exporting the data to Excel for statistical analysis. The
data involves 2,500 data items per electric utillity per year, with 200+
utility companies on the list, going back every year to 1994. If you
multiply those out you end up with 7.5 million data points...even though I
only end up using 5% of the data, but it will still be several hundred
thousand data items even before I start the number crunching in Excel.

The key table is called Row Literal Table. It defines what the row numbers
in the other tables mean.

Example: The Plant in Service table if fully loaded, per all the items for
it in the Row Literal Table, has 104 lines of data, but one utility may have
no nuclear power plants, so it will have blanks where the nuclear plant data
rows are, and then it will show (say) only 80 lines out of the full 104
possible, in its Plant in Service query results, while another utility that
has no transmission lines will have a different subset out of the 104 lines
of data on the same query result.

When I export my data into Excel I want to line up all the Row Literals in
the same rows in Excel, across for each year from 1994 to the present, but,
the rows don't line up, because of the blank items in some utilities make
their lists shorter.

So I want to do a query so that if a utility does not report something, its
query result should have a line with the row literal (e.g. Nuclear plant in
service) and then blanks on that row, rather than missing that row. This
way, in Excel, I can do statisics much more easily. There's too much data
and too much variability to use vlookups or nested IF statements in Excel to
resolve this.

Thanks in advance.

Peter

Click to expand...

KARL DEWEY · Apr 9, 2010

What I posted was a part of a SQL statement. If you create your query, open
in design view, click on VIEW - SQL View it will open another window with the
SQL of the query you built.
Edit the FROM portion to look like what I posted. When you return to design
view you will see the connecting line between the two tables has an arrow
head on the end away from [Row Literal Table].

--
Build a little, test a little.

Peter said:
Thanks... but now the depth of my ignorance will show. Do I paste that into
a SQL query?

KARL DEWEY said:

You need a LEFT JOIN Like this --
FROM [Row Literal Table] LEFT JOIN tblSecondary ON [Row Literal
Table].[RowNumberField] = tblSecondary.[RowNumberField]

--
Build a little, test a little.

Peter said:

I'm using MS Access 2002. I want to design a query that forces a certain key
table's number of rows to be repeated in each query result, even when some
elements in other tables do not report data for certain items on the key
table.

Background: I'm updating a big benchmarking database that I previously
slogged through Excel to clean up. This time I want to design better queries
in Access before exporting the data to Excel for statistical analysis. The
data involves 2,500 data items per electric utillity per year, with 200+
utility companies on the list, going back every year to 1994. If you
multiply those out you end up with 7.5 million data points...even though I
only end up using 5% of the data, but it will still be several hundred
thousand data items even before I start the number crunching in Excel.

The key table is called Row Literal Table. It defines what the row numbers
in the other tables mean.

Example: The Plant in Service table if fully loaded, per all the items for
it in the Row Literal Table, has 104 lines of data, but one utility may have
no nuclear power plants, so it will have blanks where the nuclear plant data
rows are, and then it will show (say) only 80 lines out of the full 104
possible, in its Plant in Service query results, while another utility that
has no transmission lines will have a different subset out of the 104 lines
of data on the same query result.

When I export my data into Excel I want to line up all the Row Literals in
the same rows in Excel, across for each year from 1994 to the present, but,
the rows don't line up, because of the blank items in some utilities make
their lists shorter.

So I want to do a query so that if a utility does not report something, its
query result should have a line with the row literal (e.g. Nuclear plant in
service) and then blanks on that row, rather than missing that row. This
way, in Excel, I can do statisics much more easily. There's too much data
and too much variability to use vlookups or nested IF statements in Excel to
resolve this.

Thanks in advance.

Peter

Click to expand...

Click to expand...

Peter · Apr 9, 2010

Thanks - I have to head out now but will burn some midnight oil this weekend
to use your suggestion!

Marshall Barton said:
Peter said:

I'm using MS Access 2002. I want to design a query that forces a certain key
table's number of rows to be repeated in each query result, even when some
elements in other tables do not report data for certain items on the key
table.

Background: I'm updating a big benchmarking database that I previously
slogged through Excel to clean up. This time I want to design better queries
in Access before exporting the data to Excel for statistical analysis. The
data involves 2,500 data items per electric utillity per year, with 200+
utility companies on the list, going back every year to 1994. If you
multiply those out you end up with 7.5 million data points...even though I
only end up using 5% of the data, but it will still be several hundred
thousand data items even before I start the number crunching in Excel.

The key table is called Row Literal Table. It defines what the row numbers
in the other tables mean.

Example: The Plant in Service table if fully loaded, per all the items for
it in the Row Literal Table, has 104 lines of data, but one utility may have
no nuclear power plants, so it will have blanks where the nuclear plant data
rows are, and then it will show (say) only 80 lines out of the full 104
possible, in its Plant in Service query results, while another utility that
has no transmission lines will have a different subset out of the 104 lines
of data on the same query result.

When I export my data into Excel I want to line up all the Row Literals in
the same rows in Excel, across for each year from 1994 to the present, but,
the rows don't line up, because of the blank items in some utilities make
their lists shorter.

So I want to do a query so that if a utility does not report something, its
query result should have a line with the row literal (e.g. Nuclear plant in
service) and then blanks on that row, rather than missing that row. This
way, in Excel, I can do statisics much more easily. There's too much data
and too muchb variability to use vlookups or nested IF statements in Excel to
resolve this.

Click to expand...

I think all you need to do is use an outer join from the
literals table to the plants table. Maybe a little like:

SELECT [Row Literal Table].*, Service.*
FROM [Row Literal Table] LEFT JOIN Service
ON [Row Literal Table].pkfield = Service.fkfield

Peter · Apr 9, 2010

Thanks Karl - I have to head out now but will burn some midnight oil this
weekend to use your suggestion!

KARL DEWEY said:
What I posted was a part of a SQL statement. If you create your query, open
in design view, click on VIEW - SQL View it will open another window with the
SQL of the query you built.
Edit the FROM portion to look like what I posted. When you return to design
view you will see the connecting line between the two tables has an arrow
head on the end away from [Row Literal Table].

--
Build a little, test a little.

Peter said:

Thanks... but now the depth of my ignorance will show. Do I paste that into
a SQL query?

KARL DEWEY said:

You need a LEFT JOIN Like this --
FROM [Row Literal Table] LEFT JOIN tblSecondary ON [Row Literal
Table].[RowNumberField] = tblSecondary.[RowNumberField]

--
Build a little, test a little.

:

I'm using MS Access 2002. I want to design a query that forces a certain key
table's number of rows to be repeated in each query result, even when some
elements in other tables do not report data for certain items on the key
table.

Background: I'm updating a big benchmarking database that I previously
slogged through Excel to clean up. This time I want to design better queries
in Access before exporting the data to Excel for statistical analysis. The
data involves 2,500 data items per electric utillity per year, with 200+
utility companies on the list, going back every year to 1994. If you
multiply those out you end up with 7.5 million data points...even though I
only end up using 5% of the data, but it will still be several hundred
thousand data items even before I start the number crunching in Excel.

The key table is called Row Literal Table. It defines what the row numbers
in the other tables mean.

Example: The Plant in Service table if fully loaded, per all the items for
it in the Row Literal Table, has 104 lines of data, but one utility may have
no nuclear power plants, so it will have blanks where the nuclear plant data
rows are, and then it will show (say) only 80 lines out of the full 104
possible, in its Plant in Service query results, while another utility that
has no transmission lines will have a different subset out of the 104 lines
of data on the same query result.

When I export my data into Excel I want to line up all the Row Literals in
the same rows in Excel, across for each year from 1994 to the present, but,
the rows don't line up, because of the blank items in some utilities make
their lists shorter.

So I want to do a query so that if a utility does not report something, its
query result should have a line with the row literal (e.g. Nuclear plant in
service) and then blanks on that row, rather than missing that row. This
way, in Excel, I can do statisics much more easily. There's too much data
and too much variability to use vlookups or nested IF statements in Excel to
resolve this.

Thanks in advance.

Peter

Click to expand...

Click to expand...

Peter · Apr 10, 2010

It looks like the left joins I've tried are all doing the same thing--causing
the "missing" rows from the key Row Literal Table to appear just once in the
query results. I need them to appear for each of the 200 utility companies.

I am thinking I'll need to paste Row Literal Table into Excel, cookie-cutter
copies of it for each utility ID Number, and paste a big (40,000 line long)
version of it into Access to then run a query....I wish there were an easier
way to do this....what would a Union query do?

Thanks again.

KARL DEWEY said:
You need a LEFT JOIN Like this --
FROM [Row Literal Table] LEFT JOIN tblSecondary ON [Row Literal
Table].[RowNumberField] = tblSecondary.[RowNumberField]

--
Build a little, test a little.

Peter said:

I'm using MS Access 2002. I want to design a query that forces a certain key
table's number of rows to be repeated in each query result, even when some
elements in other tables do not report data for certain items on the key
table.

Background: I'm updating a big benchmarking database that I previously
slogged through Excel to clean up. This time I want to design better queries
in Access before exporting the data to Excel for statistical analysis. The
data involves 2,500 data items per electric utillity per year, with 200+
utility companies on the list, going back every year to 1994. If you
multiply those out you end up with 7.5 million data points...even though I
only end up using 5% of the data, but it will still be several hundred
thousand data items even before I start the number crunching in Excel.

The key table is called Row Literal Table. It defines what the row numbers
in the other tables mean.

Example: The Plant in Service table if fully loaded, per all the items for
it in the Row Literal Table, has 104 lines of data, but one utility may have
no nuclear power plants, so it will have blanks where the nuclear plant data
rows are, and then it will show (say) only 80 lines out of the full 104
possible, in its Plant in Service query results, while another utility that
has no transmission lines will have a different subset out of the 104 lines
of data on the same query result.

When I export my data into Excel I want to line up all the Row Literals in
the same rows in Excel, across for each year from 1994 to the present, but,
the rows don't line up, because of the blank items in some utilities make
their lists shorter.

So I want to do a query so that if a utility does not report something, its
query result should have a line with the row literal (e.g. Nuclear plant in
service) and then blanks on that row, rather than missing that row. This
way, in Excel, I can do statisics much more easily. There's too much data
and too much variability to use vlookups or nested IF statements in Excel to
resolve this.

Thanks in advance.

Peter

Click to expand...

Peter · Apr 10, 2010

It looks like the left joins I've tried are all doing the same
thing--causing the "missing" rows from the key Row Literal Table to appear
just once in the query results. I need them to appear for each of the 200
utility companies.

I am thinking I'll need to paste Row Literal Table into Excel, cookie-cutter
copies of it for each utility ID Number, and paste a big (40,000 line long)
version of it into Access to then run a query....I wish there were an easier
way to do this....what would a Union query do?

Thanks again.

Marshall Barton said:
Peter said:

I'm using MS Access 2002. I want to design a query that forces a certain key
table's number of rows to be repeated in each query result, even when some
elements in other tables do not report data for certain items on the key
table.

Background: I'm updating a big benchmarking database that I previously
slogged through Excel to clean up. This time I want to design better queries
in Access before exporting the data to Excel for statistical analysis. The
data involves 2,500 data items per electric utillity per year, with 200+
utility companies on the list, going back every year to 1994. If you
multiply those out you end up with 7.5 million data points...even though I
only end up using 5% of the data, but it will still be several hundred
thousand data items even before I start the number crunching in Excel.

The key table is called Row Literal Table. It defines what the row numbers
in the other tables mean.

Example: The Plant in Service table if fully loaded, per all the items for
it in the Row Literal Table, has 104 lines of data, but one utility may have
no nuclear power plants, so it will have blanks where the nuclear plant data
rows are, and then it will show (say) only 80 lines out of the full 104
possible, in its Plant in Service query results, while another utility that
has no transmission lines will have a different subset out of the 104 lines
of data on the same query result.

When I export my data into Excel I want to line up all the Row Literals in
the same rows in Excel, across for each year from 1994 to the present, but,
the rows don't line up, because of the blank items in some utilities make
their lists shorter.

So I want to do a query so that if a utility does not report something, its
query result should have a line with the row literal (e.g. Nuclear plant in
service) and then blanks on that row, rather than missing that row. This
way, in Excel, I can do statisics much more easily. There's too much data
and too muchb variability to use vlookups or nested IF statements in Excel to
resolve this.

Click to expand...

I think all you need to do is use an outer join from the
literals table to the plants table. Maybe a little like:

SELECT [Row Literal Table].*, Service.*
FROM [Row Literal Table] LEFT JOIN Service
ON [Row Literal Table].pkfield = Service.fkfield

Marshall Barton · Apr 10, 2010

No! A union query will not do it. And Excel should not be
needed for any of what you have described so far.

It seems that there is at least one more table (Companies?)
involved. Before you say "of course", please post a more
specific list of the tables, their relevant fields and which
fields are used as primary and foreign keys to link the
tables. Without that info we will have to go back and forth
several times while I guess what you have and you expose one
more tidbit of the information needed to solve your problem.
--
Marsh
MVP [MS Access]

Marshall Barton said:
It looks like the left joins I've tried are all doing the same
thing--causing the "missing" rows from the key Row Literal Table to appear
just once in the query results. I need them to appear for each of the 200
utility companies.

I am thinking I'll need to paste Row Literal Table into Excel, cookie-cutter
copies of it for each utility ID Number, and paste a big (40,000 line long)
version of it into Access to then run a query....I wish there were an easier
way to do this....what would a Union query do?

Marshall Barton said:

Peter said:

I'm using MS Access 2002. I want to design a query that forces a certain key
table's number of rows to be repeated in each query result, even when some
elements in other tables do not report data for certain items on the key
table.

Background: I'm updating a big benchmarking database that I previously
slogged through Excel to clean up. This time I want to design better queries
in Access before exporting the data to Excel for statistical analysis. The
data involves 2,500 data items per electric utillity per year, with 200+
utility companies on the list, going back every year to 1994. If you
multiply those out you end up with 7.5 million data points...even though I
only end up using 5% of the data, but it will still be several hundred
thousand data items even before I start the number crunching in Excel.

The key table is called Row Literal Table. It defines what the row numbers
in the other tables mean.

Example: The Plant in Service table if fully loaded, per all the items for
it in the Row Literal Table, has 104 lines of data, but one utility may have
no nuclear power plants, so it will have blanks where the nuclear plant data
rows are, and then it will show (say) only 80 lines out of the full 104
possible, in its Plant in Service query results, while another utility that
has no transmission lines will have a different subset out of the 104 lines
of data on the same query result.

When I export my data into Excel I want to line up all the Row Literals in
the same rows in Excel, across for each year from 1994 to the present, but,
the rows don't line up, because of the blank items in some utilities make
their lists shorter.

So I want to do a query so that if a utility does not report something, its
query result should have a line with the row literal (e.g. Nuclear plant in
service) and then blanks on that row, rather than missing that row. This
way, in Excel, I can do statisics much more easily. There's too much data
and too muchb variability to use vlookups or nested IF statements in Excel to
resolve this.

Click to expand...

I think all you need to do is use an outer join from the
literals table to the plants table. Maybe a little like:

SELECT [Row Literal Table].*, Service.*
FROM [Row Literal Table] LEFT JOIN Service
ON [Row Literal Table].pkfield = Service.fkfield

Click to expand...

Peter · Apr 10, 2010

Glad to provide more info. Row Literal Table is published every year as part
of the database. This year it has 2,760 rows, from more than 100 tables, but
I typically use only 20 or 30 of the tables. The columns in Row Literal
Table are:

1) Schedule_Table_Name
2) Respondent_Id (these are the numbers 1 to 400+ identifying each utility
that reports to the Federal Energy Regulatory Commission (FERC)
3) Row number (these really are the rows on each page of the "paper" version
that each utility reports--each page on the paper version is a different
Schedule
4) Row Sequence (usually this is the same number on each line, as the row
number)
5) Row Literal (the words on each line of the paper version--example would
be "Distribution System Additions for the Year" in the schedule Plant in
Service, or "Transmission - Maintenance of Structures" in the Schedule
Operations and Maintenance Expenses)
6) Row Status (usually the letter A - not used much)
7) Row Change Year (when this has a year other than 1994, then I have to
redo the rows for every row after the change if the change was an addition of
a new row...and it gets messy--this is one of the other reasons I want to
learn Access better, to deal with the fact that the Row Literal Table changes
every 2 or 3 or 4 years due to changes in the industry (example--added lines
in the Transmission Op and Maintence table because third parties are now
allowed to sell power across a utility's lines).

Another table used in every query is Respondent ID, which puts the utility's
name in place of the Respondent ID number from the other tables.

I'll keep reading your posts and also some library books on Access I just
took out. So for example when I tried a :Left join, e.g. American Electric
Power only reported 30 out of the 104 items on the Plant in Service table,
but instead of getting the desired 104 with blanks for non-used rows, my
query multiplied 30 x 104 for that utility, and repeated values in the data
columns on the right....

Thanks in advance for your help again!

Regards,
Peter

Marshall Barton said:
No! A union query will not do it. And Excel should not be
needed for any of what you have described so far.

It seems that there is at least one more table (Companies?)
involved. Before you say "of course", please post a more
specific list of the tables, their relevant fields and which
fields are used as primary and foreign keys to link the
tables. Without that info we will have to go back and forth
several times while I guess what you have and you expose one
more tidbit of the information needed to solve your problem.
--
Marsh
MVP [MS Access]

It looks like the left joins I've tried are all doing the same
thing--causing the "missing" rows from the key Row Literal Table to appear
just once in the query results. I need them to appear for each of the 200
utility companies.

I am thinking I'll need to paste Row Literal Table into Excel, cookie-cutter
copies of it for each utility ID Number, and paste a big (40,000 line long)
version of it into Access to then run a query....I wish there were an easier
way to do this....what would a Union query do?

Marshall Barton said:

Peter wrote:
I'm using MS Access 2002. I want to design a query that forces a certain key
table's number of rows to be repeated in each query result, even when some
elements in other tables do not report data for certain items on the key
table.

Background: I'm updating a big benchmarking database that I previously
slogged through Excel to clean up. This time I want to design better queries
in Access before exporting the data to Excel for statistical analysis. The
data involves 2,500 data items per electric utillity per year, with 200+
utility companies on the list, going back every year to 1994. If you
multiply those out you end up with 7.5 million data points...even though I
only end up using 5% of the data, but it will still be several hundred
thousand data items even before I start the number crunching in Excel.

The key table is called Row Literal Table. It defines what the row numbers
in the other tables mean.

Example: The Plant in Service table if fully loaded, per all the items for
it in the Row Literal Table, has 104 lines of data, but one utility may have
no nuclear power plants, so it will have blanks where the nuclear plant data
rows are, and then it will show (say) only 80 lines out of the full 104
possible, in its Plant in Service query results, while another utility that
has no transmission lines will have a different subset out of the 104 lines
of data on the same query result.

When I export my data into Excel I want to line up all the Row Literals in
the same rows in Excel, across for each year from 1994 to the present, but,
the rows don't line up, because of the blank items in some utilities make
their lists shorter.

So I want to do a query so that if a utility does not report something, its
query result should have a line with the row literal (e.g. Nuclear plant in
service) and then blanks on that row, rather than missing that row. This
way, in Excel, I can do statisics much more easily. There's too much data
and too muchb variability to use vlookups or nested IF statements in Excel to
resolve this.

I think all you need to do is use an outer join from the
literals table to the plants table. Maybe a little like:

SELECT [Row Literal Table].*, Service.*
FROM [Row Literal Table] LEFT JOIN Service
ON [Row Literal Table].pkfield = Service.fkfield

Click to expand...

Click to expand...

.

Peter · Apr 10, 2010

Also, just for some perspective on this data, three links--first to FERC,
then the other two are two articles I published in a utility industry
magazine about my benchmarking results.

http://www.ferc.gov/docs-filing/forms/form-1/viewer-instruct.asp

http://www.fortnightly.com/result.cfm?i=/4451.cfm

http://www.fortnightly.com/display_pdf.cfm?id=01012006_CuttingCosts.pdf

Marshall Barton said:
No! A union query will not do it. And Excel should not be
needed for any of what you have described so far.

It seems that there is at least one more table (Companies?)
involved. Before you say "of course", please post a more
specific list of the tables, their relevant fields and which
fields are used as primary and foreign keys to link the
tables. Without that info we will have to go back and forth
several times while I guess what you have and you expose one
more tidbit of the information needed to solve your problem.
--
Marsh
MVP [MS Access]

It looks like the left joins I've tried are all doing the same
thing--causing the "missing" rows from the key Row Literal Table to appear
just once in the query results. I need them to appear for each of the 200
utility companies.

I am thinking I'll need to paste Row Literal Table into Excel, cookie-cutter
copies of it for each utility ID Number, and paste a big (40,000 line long)
version of it into Access to then run a query....I wish there were an easier
way to do this....what would a Union query do?

Marshall Barton said:

Peter wrote:
I'm using MS Access 2002. I want to design a query that forces a certain key
table's number of rows to be repeated in each query result, even when some
elements in other tables do not report data for certain items on the key
table.

Background: I'm updating a big benchmarking database that I previously
slogged through Excel to clean up. This time I want to design better queries
in Access before exporting the data to Excel for statistical analysis. The
data involves 2,500 data items per electric utillity per year, with 200+
utility companies on the list, going back every year to 1994. If you
multiply those out you end up with 7.5 million data points...even though I
only end up using 5% of the data, but it will still be several hundred
thousand data items even before I start the number crunching in Excel.

The key table is called Row Literal Table. It defines what the row numbers
in the other tables mean.

Example: The Plant in Service table if fully loaded, per all the items for
it in the Row Literal Table, has 104 lines of data, but one utility may have
no nuclear power plants, so it will have blanks where the nuclear plant data
rows are, and then it will show (say) only 80 lines out of the full 104
possible, in its Plant in Service query results, while another utility that
has no transmission lines will have a different subset out of the 104 lines
of data on the same query result.

When I export my data into Excel I want to line up all the Row Literals in
the same rows in Excel, across for each year from 1994 to the present, but,
the rows don't line up, because of the blank items in some utilities make
their lists shorter.

So I want to do a query so that if a utility does not report something, its
query result should have a line with the row literal (e.g. Nuclear plant in
service) and then blanks on that row, rather than missing that row. This
way, in Excel, I can do statisics much more easily. There's too much data
and too muchb variability to use vlookups or nested IF statements in Excel to
resolve this.

I think all you need to do is use an outer join from the
literals table to the plants table. Maybe a little like:

SELECT [Row Literal Table].*, Service.*
FROM [Row Literal Table] LEFT JOIN Service
ON [Row Literal Table].pkfield = Service.fkfield

Click to expand...

Click to expand...

.

Marshall Barton · Apr 11, 2010

I think I should look at A Copy/Paste of your query's SQL
view too.

AFAICT, your FROM clause would look like:

FROM [Row Literal Table]
LEFT JOIN ([Plant in Service]
INNER JOIN Respondents
ON [Plant in Service].Respondent_Id =
Respondents.[Respondent ID])
ON [Row Literal Table].serviceID = [Plant in
Service].serviceID
--
Marsh
MVP [MS Access]

Marshall Barton said:
Glad to provide more info. Row Literal Table is published every year as part
of the database. This year it has 2,760 rows, from more than 100 tables, but
I typically use only 20 or 30 of the tables. The columns in Row Literal
Table are:

1) Schedule_Table_Name
2) Respondent_Id (these are the numbers 1 to 400+ identifying each utility
that reports to the Federal Energy Regulatory Commission (FERC)
3) Row number (these really are the rows on each page of the "paper" version
that each utility reports--each page on the paper version is a different
Schedule
4) Row Sequence (usually this is the same number on each line, as the row
number)
5) Row Literal (the words on each line of the paper version--example would
be "Distribution System Additions for the Year" in the schedule Plant in
Service, or "Transmission - Maintenance of Structures" in the Schedule
Operations and Maintenance Expenses)
6) Row Status (usually the letter A - not used much)
7) Row Change Year (when this has a year other than 1994, then I have to
redo the rows for every row after the change if the change was an addition of
a new row...and it gets messy--this is one of the other reasons I want to
learn Access better, to deal with the fact that the Row Literal Table changes
every 2 or 3 or 4 years due to changes in the industry (example--added lines
in the Transmission Op and Maintence table because third parties are now
allowed to sell power across a utility's lines).

Another table used in every query is Respondent ID, which puts the utility's
name in place of the Respondent ID number from the other tables.

I'll keep reading your posts and also some library books on Access I just
took out. So for example when I tried a :Left join, e.g. American Electric
Power only reported 30 out of the 104 items on the Plant in Service table,
but instead of getting the desired 104 with blanks for non-used rows, my
query multiplied 30 x 104 for that utility, and repeated values in the data
columns on the right....

Marshall Barton said:

No! A union query will not do it. And Excel should not be
needed for any of what you have described so far.

It seems that there is at least one more table (Companies?)
involved. Before you say "of course", please post a more
specific list of the tables, their relevant fields and which
fields are used as primary and foreign keys to link the
tables. Without that info we will have to go back and forth
several times while I guess what you have and you expose one
more tidbit of the information needed to solve your problem.

It looks like the left joins I've tried are all doing the same
thing--causing the "missing" rows from the key Row Literal Table to appear
just once in the query results. I need them to appear for each of the 200
utility companies.

I am thinking I'll need to paste Row Literal Table into Excel, cookie-cutter
copies of it for each utility ID Number, and paste a big (40,000 line long)
version of it into Access to then run a query....I wish there were an easier
way to do this....what would a Union query do?

:
Peter wrote:
I'm using MS Access 2002. I want to design a query that forces a certain key
table's number of rows to be repeated in each query result, even when some
elements in other tables do not report data for certain items on the key
table.

Background: I'm updating a big benchmarking database that I previously
slogged through Excel to clean up. This time I want to design better queries
in Access before exporting the data to Excel for statistical analysis. The
data involves 2,500 data items per electric utillity per year, with 200+
utility companies on the list, going back every year to 1994. If you
multiply those out you end up with 7.5 million data points...even though I
only end up using 5% of the data, but it will still be several hundred
thousand data items even before I start the number crunching in Excel.

The key table is called Row Literal Table. It defines what the row numbers
in the other tables mean.

Example: The Plant in Service table if fully loaded, per all the items for
it in the Row Literal Table, has 104 lines of data, but one utility may have
no nuclear power plants, so it will have blanks where the nuclear plant data
rows are, and then it will show (say) only 80 lines out of the full 104
possible, in its Plant in Service query results, while another utility that
has no transmission lines will have a different subset out of the 104 lines
of data on the same query result.

When I export my data into Excel I want to line up all the Row Literals in
the same rows in Excel, across for each year from 1994 to the present, but,
the rows don't line up, because of the blank items in some utilities make
their lists shorter.

So I want to do a query so that if a utility does not report something, its
query result should have a line with the row literal (e.g. Nuclear plant in
service) and then blanks on that row, rather than missing that row. This
way, in Excel, I can do statisics much more easily. There's too much data
and too muchb variability to use vlookups or nested IF statements in Excel to
resolve this.

I think all you need to do is use an outer join from the
literals table to the plants table. Maybe a little like:

SELECT [Row Literal Table].*, Service.*
FROM [Row Literal Table] LEFT JOIN Service
ON [Row Literal Table].pkfield = Service.fkfield

Click to expand...

Click to expand...

Peter · Apr 12, 2010

Here's what I tried:

SELECT f1_row_lit_tbl.sched_table_name, f1_plant_in_srvce.respondent_id,
f1_row_lit_tbl.row_number, f1_row_lit_tbl.row_literal,
f1_plant_in_srvce.yr_end_bal
FROM f1_row_lit_tbl LEFT JOIN f1_plant_in_srvce ON f1_row_lit_tbl.row_number
= f1_plant_in_srvce.row_number
WHERE (((f1_row_lit_tbl.sched_table_name)="f1_plant_in_srvce"));

When I pasted yours in after making the table names exact (as shown below) I
get the message:

Invalid SQL Statement: Expect 'Delete' 'Insert' 'Procedure' 'Select' or
'Update'

Then when I put Select in front, it said "Syntax Error in join operation"

SELECT: FROM [f1_row_lit_tbl]
LEFT JOIN (f1_plant_in_srvce]
INNER JOIN Respondents
ON [f1_plant_in_srvce].Respondent_Id =
Respondents.[Respondent_ID])
ON [f1_row_lit_tbl].serviceID = [f1_plant_in_srvce].serviceID

Then I added a bracket to the first soft parenthesis as shown below, and it
said Join Expression not supported.

SELECT: FROM [f1_row_lit_tbl]
LEFT JOIN ([f1_plant_in_srvce]
INNER JOIN Respondents
ON [f1_plant_in_srvce].Respondent_Id =
Respondents.[Respondent_ID])
ON [f1_row_lit_tbl].serviceID = [f1_plant_in_srvce].serviceID

Thanks again for your help. I'll be unreachable until after 5pm EST today.

Marshall Barton said:
I think I should look at A Copy/Paste of your query's SQL
view too.

AFAICT, your FROM clause would look like:

FROM [Row Literal Table]
LEFT JOIN ([Plant in Service]
INNER JOIN Respondents
ON [Plant in Service].Respondent_Id =
Respondents.[Respondent ID])
ON [Row Literal Table].serviceID = [Plant in
Service].serviceID
--
Marsh
MVP [MS Access]

Glad to provide more info. Row Literal Table is published every year as part
of the database. This year it has 2,760 rows, from more than 100 tables, but
I typically use only 20 or 30 of the tables. The columns in Row Literal
Table are:

1) Schedule_Table_Name
2) Respondent_Id (these are the numbers 1 to 400+ identifying each utility
that reports to the Federal Energy Regulatory Commission (FERC)
3) Row number (these really are the rows on each page of the "paper" version
that each utility reports--each page on the paper version is a different
Schedule
4) Row Sequence (usually this is the same number on each line, as the row
number)
5) Row Literal (the words on each line of the paper version--example would
be "Distribution System Additions for the Year" in the schedule Plant in
Service, or "Transmission - Maintenance of Structures" in the Schedule
Operations and Maintenance Expenses)
6) Row Status (usually the letter A - not used much)
7) Row Change Year (when this has a year other than 1994, then I have to
redo the rows for every row after the change if the change was an addition of
a new row...and it gets messy--this is one of the other reasons I want to
learn Access better, to deal with the fact that the Row Literal Table changes
every 2 or 3 or 4 years due to changes in the industry (example--added lines
in the Transmission Op and Maintence table because third parties are now
allowed to sell power across a utility's lines).

Another table used in every query is Respondent ID, which puts the utility's
name in place of the Respondent ID number from the other tables.

I'll keep reading your posts and also some library books on Access I just
took out. So for example when I tried a :Left join, e.g. American Electric
Power only reported 30 out of the 104 items on the Plant in Service table,
but instead of getting the desired 104 with blanks for non-used rows, my
query multiplied 30 x 104 for that utility, and repeated values in the data
columns on the right....

Marshall Barton said:

No! A union query will not do it. And Excel should not be
needed for any of what you have described so far.

It seems that there is at least one more table (Companies?)
involved. Before you say "of course", please post a more
specific list of the tables, their relevant fields and which
fields are used as primary and foreign keys to link the
tables. Without that info we will have to go back and forth
several times while I guess what you have and you expose one
more tidbit of the information needed to solve your problem.

Peter wrote:
It looks like the left joins I've tried are all doing the same
thing--causing the "missing" rows from the key Row Literal Table to appear
just once in the query results. I need them to appear for each of the 200
utility companies.

I am thinking I'll need to paste Row Literal Table into Excel, cookie-cutter
copies of it for each utility ID Number, and paste a big (40,000 line long)
version of it into Access to then run a query....I wish there were an easier
way to do this....what would a Union query do?

:
Peter wrote:
I'm using MS Access 2002. I want to design a query that forces a certain key
table's number of rows to be repeated in each query result, even when some
elements in other tables do not report data for certain items on the key
table.

Background: I'm updating a big benchmarking database that I previously
slogged through Excel to clean up. This time I want to design better queries
in Access before exporting the data to Excel for statistical analysis. The
data involves 2,500 data items per electric utillity per year, with 200+
utility companies on the list, going back every year to 1994. If you
multiply those out you end up with 7.5 million data points...even though I
only end up using 5% of the data, but it will still be several hundred
thousand data items even before I start the number crunching in Excel.

The key table is called Row Literal Table. It defines what the row numbers
in the other tables mean.

Example: The Plant in Service table if fully loaded, per all the items for
it in the Row Literal Table, has 104 lines of data, but one utility may have
no nuclear power plants, so it will have blanks where the nuclear plant data
rows are, and then it will show (say) only 80 lines out of the full 104
possible, in its Plant in Service query results, while another utility that
has no transmission lines will have a different subset out of the 104 lines
of data on the same query result.

When I export my data into Excel I want to line up all the Row Literals in
the same rows in Excel, across for each year from 1994 to the present, but,
the rows don't line up, because of the blank items in some utilities make
their lists shorter.

So I want to do a query so that if a utility does not report something, its
query result should have a line with the row literal (e.g. Nuclear plant in
service) and then blanks on that row, rather than missing that row. This
way, in Excel, I can do statisics much more easily. There's too much data
and too muchb variability to use vlookups or nested IF statements in Excel to
resolve this.

I think all you need to do is use an outer join from the
literals table to the plants table. Maybe a little like:

SELECT [Row Literal Table].*, Service.*
FROM [Row Literal Table] LEFT JOIN Service
ON [Row Literal Table].pkfield = Service.fkfield

Click to expand...

Click to expand...

.

Marshall Barton · Apr 12, 2010

You need your entire SELECT clause.

These long table names make things nearly unreadable to me
so I'm going to alias them and then try changing the FROM
clause:

SELECT L.sched_table_name,
S.respondent_id,
L.row_number,
L.row_literal,
S.yr_end_bal,
R.[intility name field]
FROM f1_row_lit_tbl As L
LEFT JOIN (f1_plant_in_srvce As S
LEFT JOIN Respondents As R
ON S.Respondent_Id = R.Respondent_ID)
ON L.serviceID =S.serviceID
--
Marsh
MVP [MS Access]

Marshall Barton said:
Here's what I tried:

SELECT f1_row_lit_tbl.sched_table_name, f1_plant_in_srvce.respondent_id,
f1_row_lit_tbl.row_number, f1_row_lit_tbl.row_literal,
f1_plant_in_srvce.yr_end_bal
FROM f1_row_lit_tbl LEFT JOIN f1_plant_in_srvce ON f1_row_lit_tbl.row_number
= f1_plant_in_srvce.row_number
WHERE (((f1_row_lit_tbl.sched_table_name)="f1_plant_in_srvce"));

When I pasted yours in after making the table names exact (as shown below) I
get the message:

Invalid SQL Statement: Expect 'Delete' 'Insert' 'Procedure' 'Select' or
'Update'

Then when I put Select in front, it said "Syntax Error in join operation"

SELECT: FROM [f1_row_lit_tbl]
LEFT JOIN (f1_plant_in_srvce]
INNER JOIN Respondents
ON [f1_plant_in_srvce].Respondent_Id =
Respondents.[Respondent_ID])
ON [f1_row_lit_tbl].serviceID = [f1_plant_in_srvce].serviceID

Then I added a bracket to the first soft parenthesis as shown below, and it
said Join Expression not supported.

SELECT: FROM [f1_row_lit_tbl]
LEFT JOIN ([f1_plant_in_srvce]
INNER JOIN Respondents
ON [f1_plant_in_srvce].Respondent_Id =
Respondents.[Respondent_ID])
ON [f1_row_lit_tbl].serviceID = [f1_plant_in_srvce].serviceID

Thanks again for your help. I'll be unreachable until after 5pm EST today.

Marshall Barton said:

I think I should look at A Copy/Paste of your query's SQL
view too.

AFAICT, your FROM clause would look like:

FROM [Row Literal Table]
LEFT JOIN ([Plant in Service]
INNER JOIN Respondents
ON [Plant in Service].Respondent_Id =
Respondents.[Respondent ID])
ON [Row Literal Table].serviceID = [Plant in
Service].serviceID

Glad to provide more info. Row Literal Table is published every year as part
of the database. This year it has 2,760 rows, from more than 100 tables, but
I typically use only 20 or 30 of the tables. The columns in Row Literal
Table are:

1) Schedule_Table_Name
2) Respondent_Id (these are the numbers 1 to 400+ identifying each utility
that reports to the Federal Energy Regulatory Commission (FERC)
3) Row number (these really are the rows on each page of the "paper" version
that each utility reports--each page on the paper version is a different
Schedule
4) Row Sequence (usually this is the same number on each line, as the row
number)
5) Row Literal (the words on each line of the paper version--example would
be "Distribution System Additions for the Year" in the schedule Plant in
Service, or "Transmission - Maintenance of Structures" in the Schedule
Operations and Maintenance Expenses)
6) Row Status (usually the letter A - not used much)
7) Row Change Year (when this has a year other than 1994, then I have to
redo the rows for every row after the change if the change was an addition of
a new row...and it gets messy--this is one of the other reasons I want to
learn Access better, to deal with the fact that the Row Literal Table changes
every 2 or 3 or 4 years due to changes in the industry (example--added lines
in the Transmission Op and Maintence table because third parties are now
allowed to sell power across a utility's lines).

Another table used in every query is Respondent ID, which puts the utility's
name in place of the Respondent ID number from the other tables.

I'll keep reading your posts and also some library books on Access I just
took out. So for example when I tried a :Left join, e.g. American Electric
Power only reported 30 out of the 104 items on the Plant in Service table,
but instead of getting the desired 104 with blanks for non-used rows, my
query multiplied 30 x 104 for that utility, and repeated values in the data
columns on the right....

:
No! A union query will not do it. And Excel should not be
needed for any of what you have described so far.

It seems that there is at least one more table (Companies?)
involved. Before you say "of course", please post a more
specific list of the tables, their relevant fields and which
fields are used as primary and foreign keys to link the
tables. Without that info we will have to go back and forth
several times while I guess what you have and you expose one
more tidbit of the information needed to solve your problem.

Peter wrote:
It looks like the left joins I've tried are all doing the same
thing--causing the "missing" rows from the key Row Literal Table to appear
just once in the query results. I need them to appear for each of the 200
utility companies.

I am thinking I'll need to paste Row Literal Table into Excel, cookie-cutter
copies of it for each utility ID Number, and paste a big (40,000 line long)
version of it into Access to then run a query....I wish there were an easier
way to do this....what would a Union query do?

:
Peter wrote:
I'm using MS Access 2002. I want to design a query that forces a certain key
table's number of rows to be repeated in each query result, even when some
elements in other tables do not report data for certain items on the key
table.

Background: I'm updating a big benchmarking database that I previously
slogged through Excel to clean up. This time I want to design better queries
in Access before exporting the data to Excel for statistical analysis. The
data involves 2,500 data items per electric utillity per year, with 200+
utility companies on the list, going back every year to 1994. If you
multiply those out you end up with 7.5 million data points...even though I
only end up using 5% of the data, but it will still be several hundred
thousand data items even before I start the number crunching in Excel.

The key table is called Row Literal Table. It defines what the row numbers
in the other tables mean.

Example: The Plant in Service table if fully loaded, per all the items for
it in the Row Literal Table, has 104 lines of data, but one utility may have
no nuclear power plants, so it will have blanks where the nuclear plant data
rows are, and then it will show (say) only 80 lines out of the full 104
possible, in its Plant in Service query results, while another utility that
has no transmission lines will have a different subset out of the 104 lines
of data on the same query result.

When I export my data into Excel I want to line up all the Row Literals in
the same rows in Excel, across for each year from 1994 to the present, but,
the rows don't line up, because of the blank items in some utilities make
their lists shorter.

So I want to do a query so that if a utility does not report something, its
query result should have a line with the row literal (e.g. Nuclear plant in
service) and then blanks on that row, rather than missing that row. This
way, in Excel, I can do statisics much more easily. There's too much data
and too muchb variability to use vlookups or nested IF statements in Excel to
resolve this.

I think all you need to do is use an outer join from the
literals table to the plants table. Maybe a little like:

SELECT [Row Literal Table].*, Service.*
FROM [Row Literal Table] LEFT JOIN Service
ON [Row Literal Table].pkfield = Service.fkfield

Click to expand...

.

Click to expand...

Peter · Apr 13, 2010

Thanks so much for all your help Marsh! I'm translating back all the table
names into a new query per your format--this should do the trick!

Appreciated all your efforts--thanks again.

Regards,
Peter

Marshall Barton said:
You need your entire SELECT clause.

These long table names make things nearly unreadable to me
so I'm going to alias them and then try changing the FROM
clause:

SELECT L.sched_table_name,
S.respondent_id,
L.row_number,
L.row_literal,
S.yr_end_bal,
R.[intility name field]
FROM f1_row_lit_tbl As L
LEFT JOIN (f1_plant_in_srvce As S
LEFT JOIN Respondents As R
ON S.Respondent_Id = R.Respondent_ID)
ON L.serviceID =S.serviceID
--
Marsh
MVP [MS Access]

Here's what I tried:

SELECT f1_row_lit_tbl.sched_table_name, f1_plant_in_srvce.respondent_id,
f1_row_lit_tbl.row_number, f1_row_lit_tbl.row_literal,
f1_plant_in_srvce.yr_end_bal
FROM f1_row_lit_tbl LEFT JOIN f1_plant_in_srvce ON f1_row_lit_tbl.row_number
= f1_plant_in_srvce.row_number
WHERE (((f1_row_lit_tbl.sched_table_name)="f1_plant_in_srvce"));

When I pasted yours in after making the table names exact (as shown below) I
get the message:

Invalid SQL Statement: Expect 'Delete' 'Insert' 'Procedure' 'Select' or
'Update'

Then when I put Select in front, it said "Syntax Error in join operation"

SELECT: FROM [f1_row_lit_tbl]
LEFT JOIN (f1_plant_in_srvce]
INNER JOIN Respondents
ON [f1_plant_in_srvce].Respondent_Id =
Respondents.[Respondent_ID])
ON [f1_row_lit_tbl].serviceID = [f1_plant_in_srvce].serviceID

Then I added a bracket to the first soft parenthesis as shown below, and it
said Join Expression not supported.

SELECT: FROM [f1_row_lit_tbl]
LEFT JOIN ([f1_plant_in_srvce]
INNER JOIN Respondents
ON [f1_plant_in_srvce].Respondent_Id =
Respondents.[Respondent_ID])
ON [f1_row_lit_tbl].serviceID = [f1_plant_in_srvce].serviceID

Thanks again for your help. I'll be unreachable until after 5pm EST today.

Marshall Barton said:

I think I should look at A Copy/Paste of your query's SQL
view too.

AFAICT, your FROM clause would look like:

FROM [Row Literal Table]
LEFT JOIN ([Plant in Service]
INNER JOIN Respondents
ON [Plant in Service].Respondent_Id =
Respondents.[Respondent ID])
ON [Row Literal Table].serviceID = [Plant in
Service].serviceID

Peter wrote:
Glad to provide more info. Row Literal Table is published every year as part
of the database. This year it has 2,760 rows, from more than 100 tables, but
I typically use only 20 or 30 of the tables. The columns in Row Literal
Table are:

1) Schedule_Table_Name
2) Respondent_Id (these are the numbers 1 to 400+ identifying each utility
that reports to the Federal Energy Regulatory Commission (FERC)
3) Row number (these really are the rows on each page of the "paper" version
that each utility reports--each page on the paper version is a different
Schedule
4) Row Sequence (usually this is the same number on each line, as the row
number)
5) Row Literal (the words on each line of the paper version--example would
be "Distribution System Additions for the Year" in the schedule Plant in
Service, or "Transmission - Maintenance of Structures" in the Schedule
Operations and Maintenance Expenses)
6) Row Status (usually the letter A - not used much)
7) Row Change Year (when this has a year other than 1994, then I have to
redo the rows for every row after the change if the change was an addition of
a new row...and it gets messy--this is one of the other reasons I want to
learn Access better, to deal with the fact that the Row Literal Table changes
every 2 or 3 or 4 years due to changes in the industry (example--added lines
in the Transmission Op and Maintence table because third parties are now
allowed to sell power across a utility's lines).

Another table used in every query is Respondent ID, which puts the utility's
name in place of the Respondent ID number from the other tables.

I'll keep reading your posts and also some library books on Access I just
took out. So for example when I tried a :Left join, e.g. American Electric
Power only reported 30 out of the 104 items on the Plant in Service table,
but instead of getting the desired 104 with blanks for non-used rows, my
query multiplied 30 x 104 for that utility, and repeated values in the data
columns on the right....

:
No! A union query will not do it. And Excel should not be
needed for any of what you have described so far.

It seems that there is at least one more table (Companies?)
involved. Before you say "of course", please post a more
specific list of the tables, their relevant fields and which
fields are used as primary and foreign keys to link the
tables. Without that info we will have to go back and forth
several times while I guess what you have and you expose one
more tidbit of the information needed to solve your problem.

Peter wrote:
It looks like the left joins I've tried are all doing the same
thing--causing the "missing" rows from the key Row Literal Table to appear
just once in the query results. I need them to appear for each of the 200
utility companies.

I am thinking I'll need to paste Row Literal Table into Excel, cookie-cutter
copies of it for each utility ID Number, and paste a big (40,000 line long)
version of it into Access to then run a query....I wish there were an easier
way to do this....what would a Union query do?

:
Peter wrote:
I'm using MS Access 2002. I want to design a query that forces a certain key
table's number of rows to be repeated in each query result, even when some
elements in other tables do not report data for certain items on the key
table.

Background: I'm updating a big benchmarking database that I previously
slogged through Excel to clean up. This time I want to design better queries
in Access before exporting the data to Excel for statistical analysis. The
data involves 2,500 data items per electric utillity per year, with 200+
utility companies on the list, going back every year to 1994. If you
multiply those out you end up with 7.5 million data points...even though I
only end up using 5% of the data, but it will still be several hundred
thousand data items even before I start the number crunching in Excel.

The key table is called Row Literal Table. It defines what the row numbers
in the other tables mean.

Example: The Plant in Service table if fully loaded, per all the items for
it in the Row Literal Table, has 104 lines of data, but one utility may have
no nuclear power plants, so it will have blanks where the nuclear plant data
rows are, and then it will show (say) only 80 lines out of the full 104
possible, in its Plant in Service query results, while another utility that
has no transmission lines will have a different subset out of the 104 lines
of data on the same query result.

When I export my data into Excel I want to line up all the Row Literals in
the same rows in Excel, across for each year from 1994 to the present, but,
the rows don't line up, because of the blank items in some utilities make
their lists shorter.

So I want to do a query so that if a utility does not report something, its
query result should have a line with the row literal (e.g. Nuclear plant in
service) and then blanks on that row, rather than missing that row. This
way, in Excel, I can do statisics much more easily. There's too much data
and too muchb variability to use vlookups or nested IF statements in Excel to
resolve this.

I think all you need to do is use an outer join from the
literals table to the plants table. Maybe a little like:

SELECT [Row Literal Table].*, Service.*
FROM [Row Literal Table] LEFT JOIN Service
ON [Row Literal Table].pkfield = Service.fkfield
.

Click to expand...

Click to expand...

.