Word Ghosting

CGjess

Version: 2008
Operating System: Mac OS X 10.5 (Leopard)
Processor: Intel

Though Word is not a program I have a lot of experience with, I have created a customizable flyer for a client that contains images, some body copy and a table. I have set document protection so that only a portion of the document can be edited by the user. The file has been opened on a PC and saved down to Word 97-2004. The file size is under 1 MB.

When the user modifies the document (they may customize a headline, contact information and/or add a logo) and then saves the changes, the document size jumps up to 6 MB, and a logo has not even been added at this point. I was told that this is called "Word Ghosting" and that I need to avoid it... obviously, if I knew what it was I would've avoided it. Can anybody clue me in? Help is very much appreciated.
 
CyberTaz

I've never heard the term "Word Ghosting" so I can't say what it means or
whether it applies. What I can tell you is that the file bloat is directly
attributable to resaving the file from one format to another & back again.

The former binary file format (.doc) & the current OXML format (.docx) are
not different in name only. The structures are totally distinct from one
another. Word 2007/2008 provide a number of recognizably new features which
are supported only by the OXML format as well as a number of other features
which -- on the surface -- appear to be the same as in 2004 & prior but are
handled differently "under the hood." When a file is saved back to the old
format it has to generate content compatible with that format for use in the
older versions as well as retain the element data for reopening the file in
the newer versions... Essentially it's being forced to create a 'file within
a file' & that's what causes the escalation in size.

For example, if I create a new document in Word 2008 with no content other
than a properly inserted 1.5MB JPEG image the document's file size is 1.6MB
on disk. If I then take the same file & Save As in .doc the resulting file
is 4.7MB on disk.
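
For anyone who wants to measure the bloat on their own files, here is a
minimal Python sketch that compares the on-disk size of a document before and
after a Save As. The function names are hypothetical and only illustrate the
arithmetic; the one real library call used is os.path.getsize.

```python
import os


def growth_factor(before_bytes, after_bytes):
    """Return how many times larger the file became after resaving."""
    if before_bytes <= 0:
        raise ValueError("before_bytes must be positive")
    return after_bytes / before_bytes


def compare_files(before_path, after_path):
    """Read both sizes from disk and return (before, after, factor)."""
    before = os.path.getsize(before_path)
    after = os.path.getsize(after_path)
    return before, after, growth_factor(before, after)
```

With the sizes quoted above (1.6MB as .docx, 4.7MB as .doc), the factor works
out to a little over 2.9.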

I'm not sure why you're saving in the 97-2004 format, but that's most likely
what's causing the lion's share of the bloat.

HTH |:>)
Bob Jones
[MVP] Office:Mac
 
John McGhie

I seem to remember I did hear the phenomenon called "ghosting" at one stage,
but the terms we're more familiar with are "file bloat" or "stranded RTF".

The actual cause is quite complex, and can be affected very much by how you
handle the document.

Generally there are two things happening: images are being duplicated, and
the internal structure is being converted to RTF.

When you insert an image into a Word document, Word stores the entire
picture file. It then makes a lower-resolution copy in its native format
for display.

If you then move that document cross-platform, Word will convert the image
into a different format, and store another display version. By this time,
the picture store in the document will be twice the size, even if nothing
else happens.

When you convert a file from .docx to .doc, you are converting a compressed
XML format to an uncompressed binary format. The .docx format is plain text
internally, and compresses by 70% on save. The binary is similar to RTF:
not only is it a rather verbose format, but it doesn't compress very well.
Even if a document contains nothing but text, you should expect the old .doc
format to be four times the size of a .docx containing the same text.
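
Because a .docx file is a ZIP package of XML parts, the compression can be
inspected directly. The following is a rough Python sketch using the standard
zipfile module; part_sizes and print_report are hypothetical helper names, and
the path is whatever .docx you supply. It will not work on the old binary .doc
format, which is not a ZIP file.

```python
import zipfile


def part_sizes(docx):
    """Return (name, uncompressed_bytes, compressed_bytes) for each part."""
    with zipfile.ZipFile(docx) as package:
        return [
            (info.filename, info.file_size, info.compress_size)
            for info in package.infolist()
        ]


def print_report(docx):
    """Print how much each part of the package shrinks when stored."""
    for name, raw, packed in part_sizes(docx):
        # Guard against empty parts to avoid dividing by zero.
        saved = 0.0 if raw == 0 else 100.0 * (1 - packed / raw)
        print(f"{name}: {raw} -> {packed} bytes ({saved:.0f}% saved)")
```

Running print_report on a real document shows which parts compress well
(typically the XML text) and which barely compress at all (typically images
that are already compressed, such as JPEGs).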

The only practical way to avoid this is "Don't work cross-platform; or if
you do, don't convert to a different format."

In a small well-trained workgroup that is all on the same network (no
laptops...) you can use external pictures. You can avoid embedding the
picture file in the document, instead adding only a link to the document and
allowing the original picture file to remain outside the document. This
works well and creates extremely small documents: but it blows up as soon as
someone emails the document somewhere.

Sorry: no good answers: tell the IT department to buy some more disk space.

Hope this helps

--

The email below is my business email -- Please do not email me about forum
matters unless I ask you to; or unless you intend to pay!

John McGhie, Microsoft MVP (Word, Mac Word), Consultant Technical Writer,
McGhie Information Engineering Pty Ltd
Sydney, Australia. | Ph: +61 (0)4 1209 1410 | mailto:[email protected]
 
CGjess

Yeah, I didn't think there was going to be some magic "fix-all" button to toggle off...but I was hoping! I'm gonna have to embed the pics, but I'll rebuild on the PC and in the client's software version to avoid the cross-platform/format issues. Hopefully it will keep the file size down enough to make everybody happy. Thanks for the response!
 
