Word 2003 vs Word 2007 and XML

Mariann

I was wondering if anyone is familiar with how Word 2003 handled direct
formatting, compared with Word 2007 and its XML file format.

It is my understanding that if you used direct formatting in Word 2003
instead of using styles, it created problems with corruption because Word
2003 would put a code before and after each and every character selected.
For example, if someone had a document they received in Courier New that was
50 pages long and they selected the entire document and changed the font to
Times New Roman (TNR), every character throughout the document would have a
TNR code before and after.

Does anyone know if this is still a problem now that Word uses XML?

Thanks so much!

Mariann
 
Jay Freedman

Well, I don't think your description of what happens in Word 2003 is
anywhere near right, not least because Word doesn't use codes embedded in
the text stream -- that's WordPerfect's architecture. There might be some
extra stuff in the binary file header created by direct formatting, but
nothing like one code per character.

In any case, Word 2007 certainly doesn't work that way. As an experiment, I
took a one-sentence document, applied TNR directly to it, and saved. After
changing the file extension to .zip and extracting document.xml from the
container, here's the result:

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<w:document
    xmlns:ve="http://schemas.openxmlformats.org/markup-compatibility/2006"
    xmlns:o="urn:schemas-microsoft-com:office:office"
    xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships"
    xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"
    xmlns:v="urn:schemas-microsoft-com:vml"
    xmlns:wp="http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing"
    xmlns:w10="urn:schemas-microsoft-com:office:word"
    xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
    xmlns:wne="http://schemas.microsoft.com/office/word/2006/wordml">
  <w:body>
    <w:p w:rsidR="00C46D2C" w:rsidRPr="000160B7" w:rsidRDefault="00C46D2C">
      <w:pPr>
        <w:rPr>
          <w:rFonts w:ascii="Times New Roman" w:hAnsi="Times New Roman" w:cs="Times New Roman" />
        </w:rPr>
      </w:pPr>
      <w:r w:rsidRPr="000160B7">
        <w:rPr>
          <w:rFonts w:ascii="Times New Roman" w:hAnsi="Times New Roman" w:cs="Times New Roman" />
        </w:rPr>
        <w:t>On the Insert tab, the galleries include items that are designed to coordinate with the overall look of your document.</w:t>
      </w:r>
    </w:p>
    <w:sectPr w:rsidR="00C46D2C" w:rsidRPr="000160B7" w:rsidSect="007936D3">
      <w:pgSz w:w="12240" w:h="15840" />
      <w:pgMar w:top="1440" w:right="1440" w:bottom="1440" w:left="1440"
          w:header="720" w:footer="720" w:gutter="0" />
      <w:cols w:space="720" />
      <w:docGrid w:linePitch="360" />
    </w:sectPr>
  </w:body>
</w:document>

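As you can see, the text (in the <w:t> element) doesn't contain anything
other than the original sentence. The font is recorded once for the whole
run, in the <w:rFonts> element inside the run's <w:rPr>, not once per
character.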
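
For anyone who would rather check this with a script than by eye, here is a minimal Python sketch that walks the runs (<w:r>) of a document.xml string and pairs each run's directly applied font with its text. The function name runs_with_fonts and the shortened SAMPLE string are made up for illustration; only the element, attribute, and namespace names come from the XML above.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace, as declared by xmlns:w in document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W_NS}


def runs_with_fonts(document_xml):
    """Return a (font, text) pair for every run in the document.

    font is the w:ascii value of the run's own w:rFonts element, or None
    when the run carries no direct font formatting.
    """
    root = ET.fromstring(document_xml)
    pairs = []
    for run in root.iterfind(".//w:r", NS):
        fonts = run.find("w:rPr/w:rFonts", NS)
        font = fonts.get(f"{{{W_NS}}}ascii") if fonts is not None else None
        text = "".join(t.text or "" for t in run.iterfind("w:t", NS))
        pairs.append((font, text))
    return pairs


# Hypothetical, shortened stand-in for a real document.xml.
SAMPLE = (
    f'<w:document xmlns:w="{W_NS}"><w:body><w:p><w:r>'
    '<w:rPr><w:rFonts w:ascii="Times New Roman" w:hAnsi="Times New Roman"/></w:rPr>'
    '<w:t>On the Insert tab.</w:t>'
    '</w:r></w:p></w:body></w:document>'
)

if __name__ == "__main__":
    for font, text in runs_with_fonts(SAMPLE):
        print(font, "|", text)
```

A document with one directly formatted sentence should yield a single pair, which is the point of the experiment: one font entry per run, however many characters the run holds.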

--
Regards,
Jay Freedman
Microsoft Word MVP
Email cannot be acknowledged; please post all follow-ups to the newsgroup so
all may benefit.
 
Mariann

Thanks so much, Jay -

I've read all kinds of white papers in the past and have been trained that
2003 works this way - for years! Scary that it never did.

Thanks so much for clarifying and for your hard work verifying. (How do you
extract the xml from the zip file?)

Thanks again -
Mariann
 
Jay Freedman

(How do you extract the xml from the zip file?)

After you change the extension, the file is just an ordinary zip file.
If you have Windows XP or later, you can just
double-click it to view the contents; otherwise, use any unzipping
program to open it. Inside you'll find several folders, one of which
is named "word". Open that folder, and one of the files in it is named
document.xml. (The others hold information about fonts, styles,
settings, and so forth.) Use the usual unzip (or extract or whatever
your program calls it) to make an uncompressed copy of that file.

By default, the program that opens .xml files is Internet Explorer,
although you can use any text editor such as Notepad.
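
The same extraction can be scripted. The following is a rough Python sketch using the standard zipfile module; it assumes the main part lives at word/document.xml and is UTF-8 encoded, as in the file above. The helper names read_part, list_parts, and make_demo_package are hypothetical, and the demo package is only a tiny stand-in for a real .docx.

```python
import io
import zipfile


def read_part(docx, part="word/document.xml"):
    """Return the named part of a .docx package as text.

    docx may be a file path or a binary file-like object. No renaming to
    .zip is needed: zipfile looks at the content, not the extension.
    """
    with zipfile.ZipFile(docx) as package:
        return package.read(part).decode("utf-8")


def list_parts(docx):
    """Return the names of all files stored in the package."""
    with zipfile.ZipFile(docx) as package:
        return package.namelist()


def make_demo_package():
    """Build a tiny in-memory zip standing in for a real .docx (hypothetical)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("word/document.xml", "<w:document/>")
        package.writestr("word/styles.xml", "<w:styles/>")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    print(list_parts(make_demo_package()))
    print(read_part(make_demo_package()))
```

With a real file you would pass its path instead of the demo buffer, for example read_part("mydoc.docx"), where mydoc.docx is a placeholder name.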
 
Yves Dhondt

Note that if the document was not created using Word, the file might have a
totally different name and location within the zip file. In such a case, you
should open the zip file and check the file [Content_Types].xml, which is
always present in the root of the zip. In it, look for

ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"

The accompanying PartName will tell you the exact location of your Word
document within the zip file.
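
That lookup could be scripted along these lines. This is only a sketch: it assumes the main part is declared through an Override element in [Content_Types].xml with exactly the ContentType quoted above, and the names find_main_part and make_demo_package, as well as the custom/main.xml path in the demo, are invented for illustration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by [Content_Types].xml in Open Packaging Conventions files.
CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"
MAIN_TYPE = (
    "application/vnd.openxmlformats-officedocument."
    "wordprocessingml.document.main+xml"
)


def find_main_part(docx):
    """Return the zip member name of the main document part, or None."""
    with zipfile.ZipFile(docx) as package:
        root = ET.fromstring(package.read("[Content_Types].xml"))
    for override in root.iterfind(f"{{{CT_NS}}}Override"):
        if override.get("ContentType") == MAIN_TYPE:
            # PartName starts with "/", zip member names do not.
            return override.get("PartName").lstrip("/")
    return None


def make_demo_package():
    """Build an in-memory package with a non-default main part (hypothetical)."""
    content_types = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        f'<Types xmlns="{CT_NS}">'
        '<Default Extension="xml" ContentType="application/xml"/>'
        f'<Override PartName="/custom/main.xml" ContentType="{MAIN_TYPE}"/>'
        '</Types>'
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("[Content_Types].xml", content_types)
        package.writestr("custom/main.xml", "<w:document/>")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    print(find_main_part(make_demo_package()))
```

The returned name can then be handed to any unzip tool or script to pull out the right file, whatever the producing application chose to call it.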

Yves
 
