.docx files

Daiya Mitchell

Sorry! I have a different version, I'm afraid. Mine was never Vista, I
don't think.

You have to delete the stuff in caps in my address to email me directly.

Daiya
 
jsafdie

John, thanks for this -- I didn't think I was losing my mind, but
after everything suggested didn't work, I was beginning to wonder.

Care to look at one more (the only other one, thankfully, that was
submitted to me as a .docx)? I can't do anything with it either, and
it would be good to know that it was corrupted as well.

Thanks for your time, and sorry to have bothered everyone with this,
which turned out not to be a technical problem at all!!

Joe
 
John McGhie

Hi Joe:

OK, as I told you directly, I dug around in your document with SimpleText.

It appears that it was produced using the Open Document Format converter
from SourceForge.

Which means that while the document is in XML (it might even be "valid"
XML), Microsoft Word does not have the style sheets to read it with, so to
Word it is a corrupt document.

Joe's documents are NOT "Word documents", which is why the Converter can't
read them. Neither can Word 2007 :)
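
For anyone who wants to check a suspect file without poking through it in a text editor, here is a minimal Python sketch. It assumes only two things: an Office Open XML package is a zip file containing a part named `[Content_Types].xml`, and an OpenDocument package is a zip file containing `mimetype` and `content.xml`. The function names `identify_package` and `make_zip` are made up for this illustration, and the sample files are built in memory rather than taken from Joe's documents.

```python
import io
import zipfile


def identify_package(source):
    """Guess what kind of document container `source` is.

    `source` may be a filesystem path or a seekable binary file object.
    The check looks only at the names of the zip entries, not their content.
    """
    if not zipfile.is_zipfile(source):
        return "not a zip container"
    with zipfile.ZipFile(source) as package:
        names = set(package.namelist())
    if "[Content_Types].xml" in names:
        return "Office Open XML"
    if "mimetype" in names and "content.xml" in names:
        return "OpenDocument"
    return "unknown zip container"


def make_zip(entries):
    """Build an in-memory zip from a {name: text} dict (illustration only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        for name, text in entries.items():
            package.writestr(name, text)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    samples = {
        "word-like": make_zip({"[Content_Types].xml": "<Types/>",
                               "word/document.xml": "<document/>"}),
        "odf-like": make_zip({"mimetype": "application/vnd.oasis.opendocument.text",
                              "content.xml": "<office/>"}),
        "flat file": io.BytesIO(b"<?xml version='1.0'?><doc>flat XML, not zipped</doc>"),
    }
    for label, sample in samples.items():
        print(label, "->", identify_package(sample))
```

To try it on a real file, pass the path instead of an in-memory sample, for example `identify_package("submission.docx")` (a hypothetical file name).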

More information:

Let's do "XML 101" because I will bet this is not the last we see of this
problem in here :) Better yet, let's start with "Computing 101", because
that's where this problem has its roots.

Microsoft is a software "company". The word "company" implies it has some
shareholders, who have invested their pension funds in the corporation and
are hoping that it will make a PROFIT :)

Microsoft has the best word processor on the planet, and it makes a profit
if it can "sell" it. Several other software companies would like to weaken
Microsoft, so they have less competition for things such as databases and
file servers. One way to do that is to wreck Microsoft's cash-cow, which is
Office. These companies have banded together to try to produce an
"alternative" to Microsoft Office, which they are basically willing to give
away for free. This doesn't hurt them, because they do not sell a competing
product anyway, but it does hurt Microsoft, which makes half its money from
Office :)

The competitors don't want to get sued, so their version of Office produces
"Open Document Format". It is very similar to Microsoft's "Open XML" format,
but sufficiently different that they won't get sued.

Microsoft sees this coming, and produces Office without the ability to read
or write Open Doc format. This is a war, and in war, it's generally
considered good practice to wait for the enemy to shoot you, rather than
shooting yourself. So neither Word nor the Mac Converter can read Open Doc.

The open source community has a project under way (with Microsoft
assistance) to produce a translator that can translate a document between
Microsoft Open XML and Open Doc. And that's what I think was used to
produce the documents that Joe can't open.

Now let's do XML 101...

Extensible Markup Language is one of a group of languages that are all based
on Standard Generalized Markup Language. SGML grew out of GML, which was
developed at IBM in the late 1960s. HTML is one application of it, and that's
what we use to produce websites.

HTML has a long list of "tags" that have defined names and defined meanings.
An HTML file can have a style sheet, but it's not so important, because all
the tags have fixed names and the browser can simply assume the names all
mean what they are supposed to mean and display them accordingly.

The problem with that is that if you wish to put something on your web page
that is not in the standard Document Type Definition for HTML, you are outta
luck. The browser will not be able to understand whatever code you use, so
will either ignore it or crash or both.

XML is the answer: You can extend XML to describe anything you like,
provided you include a style sheet to tell the recipient what the names you
have used are, and what they mean.

A .docx file is a zipped container. Inside is a little website, with files
and folders. The standard specifies a structure for these. You do not have
to have all of the items, but if you have any, they must have the correct
structure. You can have extras, but you have to tell the recipient what
they are, and what's in them.
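
To see the "zipped container" for yourself, a short Python sketch like the one below lists the files inside a package. It builds a tiny stand-in package in memory so it can run anywhere; the stand-in is not a complete Word document, and the names `list_package_parts` and `build_sample_package` are invented for the example.

```python
import io
import zipfile


def list_package_parts(source):
    """Return the sorted names of the files stored inside a zipped package.

    `source` may be a filesystem path or a binary file-like object.
    """
    with zipfile.ZipFile(source) as package:
        return sorted(package.namelist())


def build_sample_package():
    """Build a tiny stand-in package in memory (not a complete Word document)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("[Content_Types].xml", "<Types/>")
        package.writestr("_rels/.rels", "<Relationships/>")
        package.writestr("word/document.xml", "<document/>")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for name in list_package_parts(build_sample_package()):
        print(name)
```

Pointing `list_package_parts` at the path of a real .docx should show a similar folder layout, usually with more parts.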

You can use any stylesheet you like, and have as many different tags as you
like, but you must say what each one is and what it means. The stylesheet
must be inside the .docx, or it must be at a particular URL and the computer
must be on the Internet to get it.

Without the stylesheet, the document is totally unintelligible to the
recipient. Complete Swahili -- it has no idea which of the characters are
tags, let alone what the tags mean :)
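
A rough illustration of the "what the tags mean" half of this, in Python. The recipe vocabulary, the `http://example.com/recipe` namespace, and the `KNOWN_TAGS` table are all invented for the example. The point is only that a program can act on the names it has been told about and has to give up on the rest.

```python
import xml.etree.ElementTree as ET

# An invented vocabulary: well-formed XML, but the names mean nothing by themselves.
SAMPLE = """<recipe xmlns="http://example.com/recipe">
  <title>Damper</title>
  <ingredient amount="2 cups">flour</ingredient>
  <bake minutes="30"/>
</recipe>"""

# What this particular reader has been told about the vocabulary (hypothetical).
KNOWN_TAGS = {"title": "heading", "ingredient": "list item"}


def describe(xml_text):
    """Pair every element name with the meaning this reader knows for it."""
    root = ET.fromstring(xml_text)
    results = []
    for element in root.iter():
        # ElementTree reports names as "{namespace}local"; keep the local part.
        local_name = element.tag.rsplit("}", 1)[-1]
        results.append((local_name, KNOWN_TAGS.get(local_name, "unknown")))
    return results


if __name__ == "__main__":
    for name, meaning in describe(SAMPLE):
        print(f"{name}: {meaning}")
```

Here `recipe` and `bake` come back as "unknown": the reader sees them but has no instructions for what to do with them.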

Of course, it is up to the maker of the XML file to ensure not only that his
recipients can get the stylesheet, but also that their computers can process
the commands within it. For example, there's no point in sending commands for
right-to-left text to Mac Word, because it can't display that.

I believe Joe's first problem is that the stylesheet that applies to those
documents is in an untrusted location, so Microsoft software is going to
refuse to download it.

Without the stylesheet, neither Word nor the Converter can even FIND the
"content" of the document, let alone read it.

Both Microsoft and the Open Source community are working on that problem.
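
As a concrete sketch of how a reader might "find the content": in an Office Open XML package, as far as I know, the part named `_rels/.rels` holds a relationship of type `officeDocument` whose target is the main document part. The Python below follows that pointer. The function names and the in-memory sample are assumptions made for illustration, and this is not a description of what Word or the Converter does internally.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
OFFICE_DOCUMENT = (
    "http://schemas.openxmlformats.org/officeDocument/2006/"
    "relationships/officeDocument"
)


def find_main_part(source):
    """Return the path of the main document part, or None if it cannot be found."""
    with zipfile.ZipFile(source) as package:
        if "_rels/.rels" not in package.namelist():
            return None
        root = ET.fromstring(package.read("_rels/.rels"))
    for rel in root.findall(f"{{{REL_NS}}}Relationship"):
        if rel.get("Type") == OFFICE_DOCUMENT:
            # Targets may be written with a leading slash; normalise it away.
            return rel.get("Target", "").lstrip("/")
    return None


def build_sample(with_rels=True):
    """Build a minimal stand-in package in memory (illustration only)."""
    rels = (
        f'<Relationships xmlns="{REL_NS}">'
        f'<Relationship Id="rId1" Type="{OFFICE_DOCUMENT}" '
        'Target="word/document.xml"/>'
        "</Relationships>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("word/document.xml", "<document/>")
        if with_rels:
            package.writestr("_rels/.rels", rels)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    print(find_main_part(build_sample()))
    print(find_main_part(build_sample(with_rels=False)))
```

A package without that pointer gives `None`, which is roughly the situation of a reader that cannot locate the content at all.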

The second problem is that once you can understand the language inside the
document, it does not necessarily follow that you can carry out the commands
it contains. Microsoft and the Open Source community are also working on
that problem.

I expect Microsoft Word will be able to handle ODF documents in the near
future, if the user installs a translator to convert from one to the other.
And when that happens, Microsoft's Macintosh Business Unit will have to
determine whether it's profitable for it to spend time and money on bringing
that ability to the Mac. If they decide to do that, they must then decide
whether there is enough demand for the ODF converter for them to enable the
Converter to handle ODF in earlier versions of Microsoft Office.

Microsoft's shareholders might want to be in that discussion, because if
they decide to do that, it's a bit like handing the keys to the Microsoft
cash register to the Open Source Community.

And WE might want to have some input to that decision too. Why would
Microsoft keep making Word if its competitors were giving away an
equivalent? Makes the real thing a bit difficult to "sell". You recall
what happened to Internet Explorer for Mac?

Oops... That's "Politics 101", isn't it :)

Cheers



--
Don't wait for your answer, click here: http://www.word.mvps.org/

Please reply in the group. Please do NOT email me unless I ask you to.

John McGhie, Consultant Technical Writer
McGhie Information Engineering Pty Ltd
http://jgmcghie.fastmail.com.au/
Sydney, Australia. S33°53'34.20 E151°14'54.50
+61 4 1209 1410
 
Phillip Jones

John said:
You recall what happened to Internet Explorer for Mac?

Oops... That's "Politics 101", isn't it :)


No. That's strictly Bill Gates's response to a hissy fit he had when
Apple decided to include Safari in OS X. Safari is just a patched-up
version of an abandoned web browser from the UNIX/Linux community
(it had been patched so many times that they just abandoned it).

I've had Safari on my Macs since OS X 10.2.3 and I've opened it 5 times in
all that time.

--
------------------------------------------------------------------------
Phillip M. Jones, CET |LIFE MEMBER: VPEA ETA-I, NESDA, ISCET, Sterling
616 Liberty Street |Who's Who. PHONE:276-632-5045, FAX:276-632-0868
Martinsville Va 24112 |[email protected], ICQ11269732, AIM pjonescet
------------------------------------------------------------------------

If it's "fixed", don't "break it"!

<http://www.kimbanet.com/~pjones/default.htm>
<http://www.kimbanet.com/~pjones/90th_Birthday/index.htm>
<http://www.kimbanet.com/~pjones/Fulcher/default.html>
<http://www.kimbanet.com/~pjones/Harris/default.htm>
<http://www.kimbanet.com/~pjones/Jones/default.htm>

<http://www.vpea.org>
 
