Minimalistic XML file for Word

  • Thread starter John... Visio MVP

John... Visio MVP

Does anyone know the least amount of XML needed to make a Word document?

Rather than create a Word document using the Word object model, I would like
to create a text file that contains XML tags that Word can interpret as a
Word document and fill in the missing information.

John... Visio MVP
 

Tony Jollans

You have to make a package - a single text file doesn't cut it. The minimum
is pretty basic - off the top of my head you need a [Content_Types].xml file
and a _rels/.rels file, along with the document itself, which needs little
more than <document><body><p><t>(text)... or something like that.

I can dig out some more detail if you want. What are you looking to put in
the document before Word sees it? And what are you expecting Word to add to
it?
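
The three-part package described above can be sketched in Python with only the standard library. This is a minimal sketch, not a definitive implementation: the function name build_docx is hypothetical, the part is placed at word/document.xml (a common convention, not something this post specifies), and the document body includes a w:r run element around w:t, which the post's off-the-cuff outline leaves out.

```python
import io
import zipfile
from xml.sax.saxutils import escape

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# [Content_Types].xml: maps each part in the package to a content type.
CONTENT_TYPES = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
    '<Default Extension="rels" '
    'ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
    '<Override PartName="/word/document.xml" '
    'ContentType="application/vnd.openxmlformats-officedocument.'
    'wordprocessingml.document.main+xml"/>'
    '</Types>'
)

# _rels/.rels: points the package at its main document part.
RELS = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Relationships '
    'xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
    '<Relationship Id="rId1" '
    'Type="http://schemas.openxmlformats.org/officeDocument/2006/'
    'relationships/officeDocument" Target="word/document.xml"/>'
    '</Relationships>'
)


def build_docx(text):
    """Return the bytes of a .docx package holding one paragraph (hypothetical helper)."""
    document = (
        '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
        f'<w:document xmlns:w="{W_NS}"><w:body><w:p><w:r>'
        f'<w:t>{escape(text)}</w:t>'
        '</w:r></w:p></w:body></w:document>'
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("[Content_Types].xml", CONTENT_TYPES)
        zf.writestr("_rels/.rels", RELS)
        zf.writestr("word/document.xml", document)
    return buf.getvalue()
```

Writing the returned bytes to a file with a .docx extension, for example open("minimal.docx", "wb").write(build_docx("Hello")), should give a file Word can open, with Word supplying defaults for everything not stated.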
 

John... Visio MVP

Thanks I'll give that a try.

I am looking to mine some information from Visio and create a Word document.
It will mainly be headers and paragraphs. Creating a Word document with a
reference to a Word template that would contain the appropriate styles would
be a help.

I did try the route of deleting tags to see what would still load, but it
did not like the tags I left.

John... Visio MVP

 

Peter Jamieson

I don't know for sure, but
a. If you mean ".docx" format, then you have to follow what Tony
Jollans says, as that's actually a couple of chunks of XML inside a zip
file - see e.g.

http://msdn.microsoft.com/en-us/library/bb266220.aspx#office2007wordfileformat_creatingthedocument

b. If you mean "using Word 2007 XML" then I believe you can put the
whole thing in a single .xml file. The following is about as small as I
could get to open (but I haven't tried it for real or tried stripping
out any more of the namespace definitions). What's more,
- I can't tell you whether or not this XML actually conforms to the
ISO OOXML standards
- .xml format files cannot contain everything that .docx files can
contain, or at least, Word does not always save everything needed to
"roundtrip" a file when you save using .xml format


<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<?mso-application progid="Word.Document"?>
<pkg:package
xmlns:pkg="http://schemas.microsoft.com/office/2006/xmlPackage">

<pkg:part pkg:name="/_rels/.rels"
pkg:contentType="application/vnd.openxmlformats-package.relationships+xml"
pkg:padding="512">
<pkg:xmlData>
<Relationships
xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
<Relationship Id="rId1"
Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument"
Target="word/document.xml"/>
</Relationships>
</pkg:xmlData>
</pkg:part>

<pkg:part pkg:name="/word/document.xml"
pkg:contentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml">
<pkg:xmlData>
<w:document
xmlns:ve="http://schemas.openxmlformats.org/markup-compatibility/2006"
xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships"
xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"
xmlns:v="urn:schemas-microsoft-com:vml"
xmlns:wp="http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing"
xmlns:w10="urn:schemas-microsoft-com:office:word"
xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
xmlns:wne="http://schemas.microsoft.com/office/word/2006/wordml">
<w:body><w:p></w:p></w:body>
</w:document>
</pkg:xmlData>
</pkg:part>

</pkg:package>

c. A Word 2003 WordProcessingML format file can be smaller, but
doesn't conform to the same XML schemas
d. you could consider either an empty .txt file or a very small .rtf
file containing the following to be "Word files", but obviously neither
is an "XML format Word file"

{\rtf1\ansi{\par}}

Peter Jamieson

http://tips.pjmsn.me.uk
 

Tony Jollans

Interesting, Peter. I know that's the format the WordOpenXML property in the
OM gives you, but I hadn't quite realised you could flatten the package like
that and put it in a file. It makes what John wants to do a little easier.

You can remove some of the dross, sorry, unused namespaces, from that,
giving this (which, I'm sure, will look a mess when the newsreader has
finished with it):

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>

<pkg:package
xmlns:pkg="http://schemas.microsoft.com/office/2006/xmlPackage">

<pkg:part pkg:name="/_rels/.rels"
pkg:contentType="application/vnd.openxmlformats-package.relationships+xml">
<pkg:xmlData>

<Relationships
xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
<Relationship Id="rId1"
Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument"
Target="document.xml"/>
</Relationships>

</pkg:xmlData>
</pkg:part>

<pkg:part pkg:name="/document.xml"
pkg:contentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml">
<pkg:xmlData>

<w:document
xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
<w:body>
<w:p>
<w:r>
<w:t>
Text generated in, perhaps, Visio. This can be really long
if you wish.
</w:t>
</w:r>
</w:p>
</w:body>
</w:document>

</pkg:xmlData>
</pkg:part>

</pkg:package>

It rather sounds as if John wants to add a little more than just text, and
will need some extra parts (and relationships) for headers, but I'm sure
this is do-able.

IMHO, Word is a little out of order validating, and taking dependent action
upon, the file extension - but it only does this for .docx and .docm, so, if
you save the above to a file and call it something .doc, Word will open it,
and then you can do whatever, and save it as a .docx. Actually, I might be
able to use this myself.
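
A flattened package like the ones above can be produced from a list of paragraph strings with a few lines of Python. This is a rough sketch under stated assumptions: the function name flat_package is hypothetical, the part names follow Peter's version (/word/document.xml), and the mso-application processing instruction is kept from Peter's version on the assumption that it is what lets a file saved as .xml be routed to Word.

```python
from xml.sax.saxutils import escape

PKG_NS = "http://schemas.microsoft.com/office/2006/xmlPackage"
REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
DOC_REL = ("http://schemas.openxmlformats.org/officeDocument/2006/"
           "relationships/officeDocument")
RELS_TYPE = "application/vnd.openxmlformats-package.relationships+xml"
DOC_TYPE = ("application/vnd.openxmlformats-officedocument."
            "wordprocessingml.document.main+xml")


def flat_package(paragraphs):
    """Return a flat-package XML string with one w:p per input string."""
    # escape() protects &, < and > so arbitrary text stays well-formed.
    body = "".join(
        f"<w:p><w:r><w:t>{escape(p)}</w:t></w:r></w:p>" for p in paragraphs
    )
    return (
        '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>\n'
        '<?mso-application progid="Word.Document"?>\n'
        f'<pkg:package xmlns:pkg="{PKG_NS}">'
        # Relationships part: points at the main document part.
        f'<pkg:part pkg:name="/_rels/.rels" pkg:contentType="{RELS_TYPE}">'
        f'<pkg:xmlData><Relationships xmlns="{REL_NS}">'
        f'<Relationship Id="rId1" Type="{DOC_REL}" '
        'Target="word/document.xml"/>'
        '</Relationships></pkg:xmlData></pkg:part>'
        # Main document part: the paragraphs themselves.
        f'<pkg:part pkg:name="/word/document.xml" pkg:contentType="{DOC_TYPE}">'
        f'<pkg:xmlData><w:document xmlns:w="{W_NS}">'
        f'<w:body>{body}</w:body>'
        '</w:document></pkg:xmlData></pkg:part>'
        '</pkg:package>'
    )
```

The result is a plain string, so it can be written out with an ordinary text-file write using UTF-8 encoding. Headers or styles would need further parts and relationships, which this sketch does not attempt.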
 

John... Visio MVP

Yes, I was referring to the Word XML with the .xml extension (your option b).

The Word 2003 format is less verbose, so I am trying to see if that will give
me a better result.

John... Visio MVP
 

John... Visio MVP

I was able to successfully strip it down to

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<?mso-application progid="Word.Document"?>
<w:wordDocument xmlns:aml="http://schemas.microsoft.com/aml/2001/core"
xmlns:dt="uuid:C2F41010-65B3-11d1-A29F-00AA00C14882"
xmlns:ve="http://schemas.openxmlformats.org/markup-compatibility/2006"
xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:w10="urn:schemas-microsoft-com:office:word"
xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml"
xmlns:wsp="http://schemas.microsoft.com/office/word/2003/wordml/sp2"
xmlns:sl="http://schemas.microsoft.com/schemaLibrary/2003/core">
<w:body>
<w:p><w:r><w:t>The first line of text</w:t></w:r></w:p>
<w:p></w:p>
<w:p><w:r><w:t>The second line of text</w:t></w:r></w:p>
</w:body>
</w:wordDocument>

and it still loads into Word 2007.

So thanks Tony and Peter.

Now to start populating the Word document with something more relevant. ;-)
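
Populating a Word 2003 file of the shape shown above could look something like this in Python. It is only a sketch: the function name wordml_2003 is hypothetical, and it declares just the w namespace on the assumption that the other prefixes in the stripped-down file are unused and can go too.

```python
from xml.sax.saxutils import escape

WORDML_NS = "http://schemas.microsoft.com/office/word/2003/wordml"


def wordml_2003(paragraphs):
    """Return a Word 2003 XML string; an empty string yields an empty paragraph."""
    body = "".join(
        f"<w:p><w:r><w:t>{escape(p)}</w:t></w:r></w:p>" if p else "<w:p/>"
        for p in paragraphs
    )
    return (
        '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>\n'
        '<?mso-application progid="Word.Document"?>\n'
        f'<w:wordDocument xmlns:w="{WORDML_NS}">'
        f'<w:body>{body}</w:body>'
        '</w:wordDocument>'
    )


# Mirrors the two lines of text and the blank paragraph between them.
sample = wordml_2003(["The first line of text", "", "The second line of text"])
```

The text mined from Visio would simply replace the sample strings; escape() takes care of any &, < or > characters in it.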

John... Visio MVP
 
