Oversize DB crashed


Petr Danes

I've messed up a DB rather royally, I suspect. It's a long-term conversion
project, where I'm importing a bunch of pretty garbaged manual entries and
trying to integrate them into a functional database. There are literally
hundreds of queries that fiddle with the data as it's read in, some that
clear or drop tables to prepare for the next attempt and VBA code that
executes the queries. They are named and numbered in a systematic fashion,
and I wrote a routine that executes queries by groups.
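Schematically, the routine boils down to something like this (the prefix
scheme shown is simplified for illustration, not my real naming):

Public Sub ExecQueries(ByVal strPrefix As String)
    ' Run every saved action query whose name starts with the given
    ' group prefix, e.g. ExecQueries "q03_" runs q03_010, q03_020...
    ' (Collect and sort the names first if strict numeric order
    ' matters; omitted here for brevity.)
    Dim db As DAO.Database
    Dim qdf As DAO.QueryDef
    Set db = CurrentDb
    For Each qdf In db.QueryDefs
        If qdf.Name Like strPrefix & "*" Then
            db.Execute qdf.Name, dbFailOnError
        End If
    Next qdf
End Sub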

As anyone who has done this sort of project knows, this repeated reading in,
clearing, reading in again, clearing again, over and over, causes a serious
amount of DB bloat. I normally close with a C&R every so often, to reduce
the size back to normal, but yesterday, during a full day's worth of
thrashing this thing, I neglected to do so and around 11pm, it ran over max
to almost 2.1 GB and locked up. Naturally, it's also been several days since
I made a backup, which I normally do pretty religiously.

The tables and queries I managed to drag and drop to a new, empty DB, but
most of the forms and almost all of the code is inaccessible. The VBA editor
won't even open and all attempts to do a C&R cause an error, where Access
advises me that it has detected corruption, that I should make a backup copy
and do a compact and repair.

I've tried all the usual stuff:

Compact and repair manually - error message.
Set the 'Compact on close' option - error message instead of options dialog.
Decompile - error message.
Import from another DB - error message.
Drag and drop into another DB - error message.

Does anyone know any tricks for getting stuff out of a mess like this? I've
been looking at it with a hex editor, there seems to be a lot of ballast
that I could cut away. Maybe if I got it back under 2GB it would recover
enough to let me extract at least something. I've managed to hack Word
documents where the VBA code was password protected with a hex editor. The
document complains and is not fully functional until I take further steps,
but it is functional enough to let me extract what I need.

Can the system tables be edited / erased / overwritten from a good DB? Can I
simply chop off the tail end, where all the temporary junk is? Would there
be checksum problems? Would Windows or Access complain, but generate a new
checksum anyway? Anybody have any experience with this?

The 2GB limit is usually not much of a problem for me, but it's annoying
that there is no warning when it goes over. Word processors as far back as
thirty years ago would warn you if your document had reached the size
limit - it seems pretty Mickey Mouse that an otherwise sophisticated product
like Access does not, but just lets you crash.

Petr
 

Gigamite

Petr said:
As anyone who has done this sort of project knows, this repeated reading
in, clearing, reading in again, clearing again, over and over, causes a
serious amount of DB bloat.

It's a good idea to plan for this by creating a temp database with temp
tables and link to those temp tables from the "real" database. Replace
the temp database with a new copy of the temp database file; fill it up
with data; process the data. Rinse and repeat.
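If you want to automate the swap, it's only a few lines of DAO; a rough
sketch (the paths and names below are placeholders):

Public Sub ResetTempDb()
    ' Overwrite the working temp .mdb with a pristine template copy,
    ' then point the links in the real database back at it.
    Const TEMPLATE As String = "C:\Proj\TempTemplate.mdb" ' placeholder
    Const WORKDB As String = "C:\Proj\TempWork.mdb"       ' placeholder
    Dim db As DAO.Database
    Dim tdf As DAO.TableDef
    FileCopy TEMPLATE, WORKDB ' fails if anyone still has WORKDB open
    Set db = CurrentDb
    For Each tdf In db.TableDefs
        If tdf.Connect Like "*TempWork.mdb*" Then
            tdf.Connect = ";DATABASE=" & WORKDB
            tdf.RefreshLink
        End If
    Next tdf
End Sub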
Does anyone know any tricks for getting stuff out of a mess like this?
I've been looking at it with a hex editor, there seems to be a lot of
ballast that I could cut away.

Don't cut away "ballast" and still expect to open the database with Jet
or ACE. You'll most likely get "Unrecognized database format" when you
try. It's not a bag of jelly beans where if the licorice ones were
removed, one would still have a bagful of jelly beans to enjoy. A
database file is much more complex than that.
Maybe if I got it back under 2GB it would
recover enough to let me extract at least something.

A database file doesn't recover when one cuts out pieces. A database
file is very, very dependent upon file structure. The file needs to be
exactly the right size and each piece must be exactly the right
size, too. Many of these pieces point to addresses where the rest of
the necessary pieces are located within the file. One would have to
find the address pointers and edit all of those for each chunk cut out.
This is a lengthy exercise in futility if done manually.
I've managed to
hack Word documents where the VBA code was password protected with a hex
editor. The document complains and is not fully functional until I take
further steps, but it is functional enough to let me extract what I need.

A Word document is much simpler than a database file. The database
won't open after you hack it the same way you would a Word file.
Can the system tables be edited /erased / overwritten from a good DB?

Perhaps. How do you expect this to save your VBA and forms in the bad file?
Can I simply chop off the tail end, where all the temporary junk is?

No. There's important stuff stored there, too, like security and LVAL
pages.
Would there be checksum problems?

I don't believe Jet and ACE use checksums. They use relative pointers,
so if a chunk is removed from the file, at least some of the pointers
don't point to their intended data anymore because the data is now
stored at "earlier" addresses than they once were.
Would Windows or Access complain, but
generate a new checksum anyway? Anybody have any experience with this?

Windows wouldn't complain unless you removed the EOF marker from the
file or altered file fragment lengths and their addresses while hacking.
Jet or ACE would complain because they'd be missing things necessary
to open the database file or pointers would point to the wrong
addresses, tripping up Jet or ACE when they try to read and organize data.
The 2GB limit is usually not much of a problem for me, but it's annoying
that there is no warning when it goes over.

It only goes over the 2 GB limit if the file is corrupt in a certain way
and then more data is added to the file. Otherwise, Jet tells you it
can't function when one tries to exceed the limit. How Jet tells you it
can't function can be very confusing. For example, a "subscript out of
range" or a "no current record" message when opening a wizard or
inserting records. Makes you go "Huh?"
 

Klatuu

Sorry Petr, but you are pretty much hosed. If you can't open the VBA editor,
you will likely not get your forms back with the code; however, you may be
able to get the form without the code if you can open the form in design
view. Try opening a form in design view and setting the Has Module property
to No and saving it. You might be able to import the forms without the code.
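If the forms will open in design view at all, a loop like this could strip
the modules in bulk (it assumes the damaged file will still run code, which
it may not):

Public Sub StripModules()
    ' Open each form in design view, detach its code module, save.
    Dim ao As AccessObject
    For Each ao In CurrentProject.AllForms
        DoCmd.OpenForm ao.Name, acDesign
        Forms(ao.Name).HasModule = False
        DoCmd.Close acForm, ao.Name, acSaveYes
    Next ao
End Sub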

What I would suggest going forward is that you split your database. The
Front End will not bloat like the data does. It will bloat a bit when you
are modifying objects, but not like the data will when you are doing heavy
manipulation.

Also, if you are storing any graphic objects or documents in your database,
a better way is to store those objects in a specific folder and use
hyperlinks to reference the objects. Pictures and documents take up a lot of
space.
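For example, a plain text field holding the path, plus a button that opens
the file, replaces an OLE Object field entirely (the field and control names
here are just examples):

Private Sub cmdOpenDoc_Click()
    ' DocPath is a Text field storing something like
    ' \\server\docs\report01.pdf
    If Len(Me!DocPath & vbNullString) > 0 Then
        Application.FollowHyperlink Me!DocPath
    End If
End Sub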

Once you have split the database and you have the problem again, all you
lose is your data.

Now, be more religious about doing backups.
 

Gigamite

Petr said:
Word processors as far back
as thirty years ago would warn you if your document had reached the size
limit

Oh, yeah. Those were the days. Staring at the blinking amber cursor on
a 12-inch, pixelated screen for 10 minutes while the program loaded from
a cassette tape (and repeating this visual excitement for 3 to 4 minutes
every time we saved the file we were working on).

COMMAND>print homework.txt lpt1
Can't find file homework.txt.
Can't find file lpt1.

COMMAND>print /usr/smithj/homework.txt lpt1
Can't find file lpt1.
homework.txt is currently being printed.
^C (after waiting 10 minutes at the printer, cycling the power, waiting
another 5 minutes and nothing coming out)

COMMAND>print /usr/smithj/homework.txt lpt2
Can't find file lpt2.
homework.txt is currently being printed.
^C (after waiting 5 minutes at the printer, cycling the power, waiting
another 5 minutes and nothing coming out)

COMMAND>print /usr/smithj/homework.txt /dev/prt/lpt1
Print spool /dev/prt/lpt1 full. Abort, retry, ignore?
COMMAND>i
Ready.

COMMAND>r (after waiting 5 minutes)
Unknown command r.

COMMAND>print /usr/smithj/homework.txt /dev/prt/lpt1
Print spool /dev/prt/lpt1 full. Abort, retry, ignore?

COMMAND>r
Print spool /dev/prt/lpt1 full. Abort, retry, ignore?

COMMAND>a
Unknown error /dev/prt/lpt1. Abort, retry, ignore?

COMMAND>a
Memory full. Abort, retry, ignore?

COMMAND>^C
Ready.

COMMAND>print /usr/smithj/homework.txt /dev/prt/lpt1
Unknown device /dev/prt/lpt1.


Oh, yeah. Those were the days 30 years ago when word processors warned
you. We were much better off. We could read half a novel while waiting
for the printer to not print a 4 KB text file.
 

John Spencer MVP

AND if you are doing a lot of manipulation before finally storing the data,
you might consider using a temporary database to do all the manipulation and
then storing the final result in your backend database (the tables of data).

See Tony Toews' website
http://www.granite.ab.ca/access/temptables.htm
for an example of how to create a temporary table in a temporary database.
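The core of the technique is only a couple of DAO calls. A bare sketch, with
made-up paths and table names:

Public Sub ProcessInScratchDb()
    ' Create a throwaway database, do the bloat-heavy work there,
    ' and pull only the final rows back into the real tables.
    Const SCRATCH As String = "C:\Temp\scratch.mdb" ' made-up path
    Dim db As DAO.Database
    Set db = CurrentDb
    If Dir(SCRATCH) <> "" Then Kill SCRATCH
    DBEngine.CreateDatabase SCRATCH, dbLangGeneral
    ' Heavy intermediate table lands in the scratch file:
    db.Execute "SELECT * INTO RawImport IN '" & SCRATCH & "' " & _
               "FROM SourceData", dbFailOnError
    ' ...process in the scratch file, then bring back only the result:
    db.Execute "INSERT INTO CleanData SELECT * FROM RawImport IN '" & _
               SCRATCH & "'", dbFailOnError
End Sub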

John Spencer
Access MVP 2002-2005, 2007-2009
The Hilltop Institute
University of Maryland Baltimore County
 

Gigamite

Petr said:
The tables and queries I managed to drag and drop to a new, empty DB,
but most of the forms and almost all of the code is inaccessible. The
VBA editor won't even open [...] I've tried all the usual stuff [...]

Does anyone know any tricks for getting stuff out of a mess like this?

You could also try SaveAsText and LoadFromText from a good database on
each form and module in the bad database. See
http://groups.google.com/group/microsoft.public.access.modulesdaovba/msg/197aaee520d29af3?hl=en
for example syntax.
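In short, it comes down to calls like these, run from the Immediate window
(the object names and paths are examples; the bad database has to be able
to execute at least this much):

' In the bad database:
Application.SaveAsText acForm, "frmImport", "C:\Rescue\frmImport.txt"
Application.SaveAsText acModule, "basConvert", "C:\Rescue\basConvert.txt"

' Then in a new, empty database:
Application.LoadFromText acForm, "frmImport", "C:\Rescue\frmImport.txt"
Application.LoadFromText acModule, "basConvert", "C:\Rescue\basConvert.txt"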
 

Petr Danes

Hi Klatuu,

Thanks for the try. No, I can't even open the forms in design mode; the same
error message pops up when I try. This file is probably junk, unless I can
figure out enough of the internal structure to pull some stuff out with a
hex editor. I also tried picking at it with Access 2007, no luck. It acted a
little differently than 2003, but had no better results in actually
retrieving anything.

Splitting the DB is a good idea and I probably should have done it, but
most of my projects are one-machine, one-user (at a time, anyway) databases,
which also tend to get toted around on flash drives, so I prefer the
portability of keeping everything in one file. That approach does have
obvious drawbacks, but it's simple. Also, the development phases usually
bounce around between several machines, and I constantly make new copies and
rename them to include the current date, so I end up leaving behind
multiple archival copies on various machines as an integral part of the
development cycle, in addition to making deliberate backups. This one just
happened to slip through the cracks for a few days, so of course, it's the
one that bombed.

Petr
 

Dirk Goldgar

Petr Danes said:
I've messed up a DB rather royally, I suspect. It's a long-term conversion
project [...] The VBA editor won't even open and all attempts to do a C&R
cause an error [...] Does anyone know any tricks for getting stuff out of
a mess like this?


If it's important enough to you, there are a couple of reputable
database-recovery firms that can probably recover your database (at least
partially) for a fee.

http://www.pksolutions.com/

http://www.everythingaccess.com/
 

Petr Danes

Thank you, Dirk, but the data is not important at all. It is imported from
around a dozen other free-standing databases into this one, which is
eventually supposed to become the central network DB for the entire
department. But the individual DBs have been all operating in their own
little worlds for several years, and their contents are (from a DB
engineer's POV) a disaster. Mismatches, duplicates, missing data, many
different formats - the usual rubbish from manual entry into amateur-built
apps. They are, however, intact.

What I've lost, through my own carelessness, is several days worth of work
building conversion routines and queries to import this chaos into something
structured. That I will likely have to do again, but the source data remains
untouched throughout all this; in fact, some of it is still in use and I
regularly get fresh copies from the users. They have their normal work to
do, and get information from me on problems in their datasets as I build
conversion routines and discover problems, so they are doing some manual
repair of the data while I build routines to do machine conversions
where possible. When I finally get all the conversion routines tuned and the
users have repaired things that are impossible for me, they will stop
working in their copies, I will get one final set of copies and the
conversion will run one last time 'for real', at which point my DB will
become the actual data store.

It's not a complete tragedy, most of the work is in figuring out and
designing the manipulation code. Once I know what needs to be done, it's
just typing and mousing to do it. I remember it pretty well, so it's mostly
just a PIA to re-enter and test the code, and quite often, the second time
around I do a better job anyway.

Petr
 

Dirk Goldgar

Petr Danes said:
Thank you, Dirk, but the data is not important at all. [...]
What I've lost, through my own carelessness, is several days worth of work
building conversion routines and queries to import this chaos into
something structured.

That may be recoverable by these services; I don't know, but you could ask
them. Generally, PKSolutions' policy is that if they can't recover your
database you don't owe them anything. It was just a thought.

Good luck!
 

Fred

Since your tables are imported and expendable, maybe you could look at some
big ones in your hex editor, then look for the same patterns in a copy of
your DB and try deleting those.
 

Larry Linson

Splitting the database is not a "good idea"; it is something that would
have, almost certainly, _prevented your losing several days of work_ (and
that is a "great idea"). Whatever you use to justify unsplit databases once
you have them set up for the users, it just does not apply to the kind of
work you are doing to convert / construct this one...

I learned, a lot of years ago, before Access was even a gleam in its
inventor's thoughts, by similarly bitter experience, the value of _frequent_
and _complete_ backups in heavy development and processing situations as you
describe. When I am in "heavy development mode", as you seem to have been,
I "indulge my paranoia" by taking backups as frequently as I need to prevent
having to re-do whatever amount of work I have determined as my limit (and,
to be very frank, I have a very low tolerance level for unnecessarily
repeating work).

You have my sympathy and empathy for the situation in which you find
yourself, and my best wishes that it'll be a learning experience that will
save you from facing a similar situation again.

Larry Linson
Microsoft Office Access MVP
 

Petr Danes

Well, I got it fixed and luckily, it wasn't even all that much trouble.

As I wrote in my original post, most of the work in this conversion project
is done by queries, not code, and the database, even over its size limit,
operated well enough to allow drag and drop extractions of queries and
tables into a new database. There are several hundred queries, so it was a
nuisance to pick each one individually and pull it across, but lots better
than having to re-write a bunch of them. What it refused to touch was
anything containing code, so modules and forms were stuck, but since I'm
still mostly tuning the import action queries, the modules and forms had
changed very little since the last good backup. Large sigh of relief.

It seemed odd to me, though, that it had grown so quickly past the 2GB
limit - it's less than 100MB even after importing all data, so I did some
poking and experimenting, and discovered what had happened. It's kind of
sneaky, so I hope posting this may save someone else a bit of grief.

Some of my queries are Select Into queries, which create temporary tables
and others are SQL Drop Table commands, which remove them again. Like all
the others, they are named and numbered in a systematic way, which allows my
ExecQueries subroutine to execute them as part of blocks of queries that
have specific sets of tasks, to be done in a specific order.

As I experiment and modify the queries to accommodate the data, I
occasionally open a query that accesses a table which isn't there at that
moment, and that's where Access snuck up and gave me a wedgie. When a query
includes a join and the table doesn't exist, Access (sometimes) quietly
modifies the query to remove the join clause, giving me a Cartesian product
next time the query executes. These tables have tens of thousands of records
and some of the queries join up to five tables, with results which are, in
retrospect, painfully obvious.
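(To put rough numbers on it: lose the join between just two 20,000-row
tables and the query produces 20,000 x 20,000 = 400 million row
combinations, and every further unjoined table multiplies that again. No
wonder the file hit the 2GB ceiling so fast.)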

What brought it to my attention was my first patching together of the
repaired database. As I was putting bits and pieces from backups together
and compiling and testing the code to see where I still needed to fix
something to get back to pre-crash status, it suddenly locked up again, with
the same error as before. A look into the folder showed a size over 2GB
again! And of course, still no backup - I hadn't gotten anywhere close to
finishing the repair, so there wasn't (I thought) much of anything to back
up yet, and I had only tried running it a few times, nowhere near enough to
bloat it over the size limit. So, drag and drop all the queries and tables
to a new database again (live and don't learn, that's me).

This time I pulled across everything I remembered, did a C&R and finally
made a copy, then started trying to execute the import routines again. I
manually stepped the code and at one point it froze. Suspicious now, I
looked into the folder with Windows Explorer and saw the .mdb file growing
by tens of megabytes per second, and almost 100% processor utilization.
Quick stab at Ctrl/Alt/Del and I managed to blow away the MSACCESS process
at around 1.5 GB, before it had grown to unusable size again. Opened it and
looked at the query that was executing when it took off and discovered that
all the joins were gone, resulting in a runaway temporary workspace
allocation.

Soooooo, I'm going to create the temp tables manually, change my queries
from Select Into to Insert Into and from Drop Table to Delete *, so that
they stay in place. That should keep Access from mangling the queries, and I
already put in the following code that tells me the current database size at
the end of every conversion run, to remind me when a C&R might be a good
idea.

Dim fs As Object
Set fs = CreateObject("Scripting.FileSystemObject") ' file system object
Debug.Print fs.GetFile(CurrentDb.Name).Size ' current DB size in bytes
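Schematically, with made-up names, each pair of queries changes from

  qry310: SELECT RawEntries.* INTO tmpStage FROM RawEntries;
  qry390: DROP TABLE tmpStage;

to this, with tmpStage created once by hand and left permanently in place:

  qry310: INSERT INTO tmpStage SELECT RawEntries.* FROM RawEntries;
  qry390: DELETE * FROM tmpStage;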

And of course, I'm being more scrupulous about making backups.

Got a question, though, for the people that advised me to split the DB, if
any of you are still awake after all this. Since it was the allocation of
the temporary workspace that caused the runaway file size, wouldn't that
have blown up my working DB even if it was split? The tables were fine, it
was the workspace that spun out of control and as far as I know, that's
always in the current DB, not the BE of a split DB.

Petr
 

John Spencer MVP

Yes, the split database backend would be subject to the same problem. On the
other hand, you wouldn't trash your front end.

As I said earlier, you might be better off using a temporary database to do
all the processing and just save the final result into your active backend.

It is a bit of trouble to get everything set up, but I use the technique with
some fairly complex reports that require a lot of manipulation of the data
before I can use it in the reports. I had tried using stacked queries for the
reports, but they kept timing out, giving me "query too complex" errors, or failing
in other ways.

John Spencer
Access MVP 2002-2005, 2007-2009
The Hilltop Institute
University of Maryland Baltimore County
 

Petr Danes

Yes, the split database backend would be subject to
the same problem. On the other hand, you wouldn't
trash your front end.

Thank you, John, but I don't think I made my question clear. As I understand
it, when a DB is split, the FE is the running copy and the BE is simply a
data store. Therefore, the temporary workspaces are created in the FE, are
they not? And so my runaway Cartesian queries would have blown up the FE,
where temporary activities take place. The actual tables never got to any
unreasonable size; the DB blew up because it was unable to accommodate the
(temporary) astronomical Cartesian product that I unwittingly requested when
my JOIN clauses disappeared.

If that is not the case, do you know where the temporary activity takes
place and under what conditions? What if I'm joining a local table to a
linked table? In a pass-through query, I would expect the activity to be
done in the backend, but if I'm only using linked tables to unload some of
the data to a remote .mdb file, that file is not actually running and I
would expect all the action to be done locally. Was using pass-through
queries an inherent part of the 'split your DB' advice? If so, sorry, I
didn't pick up on that.

As I said earlier, you might be better off using a
temporary database to do all the processing and
just save the final result into your active
backend.

True, that's a good way of doing it. Actually, the DB I'm working with now
is just such a temporary one. I will have no need for all the manipulation
queries once the data is finally straightened out, and I have another DB
further along the way that will actually be the end application. This one is
the in-between job, and the last step is to export the good data to the final
DB app, which has only the functions needed by the users once the conversion
is all finished and the app is put into service.

I had tried using stacked queries for the reports,
but they kept timing out or giving me too complex
errors or failing in other ways.

That's a good trick for reports, I'll keep it in mind. None of my apps have
yet had such complex reporting requirements, but it's probably only a matter
of time.

Petr
 
