Word>SaveAs>HTML not the reverse of MSXML2.DOMDocument40 document>Load


redryderridesagain

I tried the function MSXML2.DOMDocument40 document>Load on a Word
document which was saved as XML, and wound up with a hideously
complex file which would load but was somehow malformed. I then
exported the Word document to HTML instead. The intermediate document
was more readable, but it had the following features which prevented
it from being read in as XML:

- single-quoted attribute values (should be double-quoted)
- attribute values not quoted at all
- quoted node values (should not be quoted)
- unescaped & in node values
- unmatched tags, e.g. <!-- and <br
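
To narrow down which of these features actually break an XML load, each one can be fed to a strict XML parser on its own. The sketch below uses Python's standard xml.etree.ElementTree rather than MSXML, and the fragments are made-up examples, not taken from the real Word output. One thing it shows is that single-quoted attribute values are legal in XML, so that item is probably not what stops the load.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragments, one per feature in the list above.
samples = {
    "single-quoted attribute": "<p class='MsoNormal'>text</p>",
    "unquoted attribute": "<p class=MsoNormal>text</p>",
    "bare ampersand": "<p>Tom & Jerry</p>",
    "unclosed br": "<p>line one<br>line two</p>",
    "well-formed control": '<p class="MsoNormal">Tom &amp; Jerry<br/></p>',
}


def is_well_formed(fragment: str) -> bool:
    """Return True if a strict XML parser accepts the fragment."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


results = {name: is_well_formed(text) for name, text in samples.items()}
for name, ok in results.items():
    print(f"{name}: {'accepted' if ok else 'rejected'}")
```

MSXML is a different parser, but the well-formedness rules it applies come from the same XML specification, so the same fragments should be accepted or rejected there too.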

I gather XML is not the same as HTML, but is there a simple way to
export as simple XML, or should I transform these features out of the
intermediate file?

Thanks
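
For the second option (transforming the features out of the intermediate file), a minimal sketch follows. It assumes Python and its standard library, which the post does not mention, and the names HtmlToXml and to_xml are hypothetical. It reads the HTML with a tolerant parser and rebuilds it as a tree, so attribute values come out double-quoted, & is escaped, and tags such as <br are closed. Comments are simply dropped, and it assumes the file has a single root element such as <html>.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# HTML elements that never take a closing tag.
VOID = {"area", "base", "br", "col", "embed", "hr", "img",
        "input", "link", "meta", "source", "track", "wbr"}


class HtmlToXml(HTMLParser):
    """Hypothetical helper: rebuild tolerant HTML as an XML tree."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.builder = ET.TreeBuilder()
        self.open_tags = []  # elements started but not yet ended

    def handle_starttag(self, tag, attrs):
        # Valueless attributes arrive as None; XML requires a value.
        clean = {k: (v if v is not None else k) for k, v in attrs}
        self.builder.start(tag, clean)
        if tag in VOID:
            self.builder.end(tag)  # close <br>, <img>, ... immediately
        else:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Ignore stray end tags and end tags for void elements.
        if tag in VOID or tag not in self.open_tags:
            return
        # Close any elements left open inside the one being ended.
        while self.open_tags:
            top = self.open_tags.pop()
            self.builder.end(top)
            if top == tag:
                break

    def handle_data(self, data):
        # Text outside the root element cannot be kept in XML.
        if self.open_tags:
            self.builder.data(data)


def to_xml(html: str) -> str:
    parser = HtmlToXml()
    parser.feed(html)
    parser.close()
    while parser.open_tags:  # close whatever the HTML left open
        parser.builder.end(parser.open_tags.pop())
    root = parser.builder.close()
    return ET.tostring(root, encoding="unicode")


sample = ("<html><body><p class=MsoNormal align='left'>"
          "Tom & Jerry<br>next</p></body></html>")
xml_text = to_xml(sample)
print(xml_text)
```

The output of to_xml is meant to be well-formed enough for an XML loader, but this has only been reasoned through on small fragments like the sample, not on real Word HTML, which may need extra handling.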
 
