Replacing text in Word 2003 (and XML?)

Tim Mavers

I am trying to find the best way to programmatically replace text in a
Word document (using Word 2003). Right now we allow users to enter
specific 'content tags' directly in Word by typing things like @TAGNAME@.

Later in the process we (programmatically) run through the document via Word
(using COM), which is very slow and error-prone. We search for all these
content tags and replace them with the real values. Some of these tags are
more complex than this (and support hierarchies), which is why we came up
with content tags.

I know Word has much better XML support these days, but I have not really
used it. I am trying to figure out whether we can save these Word documents as
XML and then parse and transform them using XSLT. Right now our
process is very slow because we have to instantiate Word and use the old Word COM
object model (which is not very friendly with .NET).

I tried a simple export of the Word doc as XML and it contained a lot of
extra XML that I wasn't expecting. More importantly, depending on how
things were entered, sometimes our custom tags (e.g. @TAGNAME@) were split up
over several XML nodes. In other words (I am paraphrasing), in the XML
it would sometimes look like this:

<a:blah>@</a:blah><a:blah2>TAGNAME</a:blah2><a:blah>@</a:blah>

rather than having @TAGNAME@ be contiguous:

<a:blah>@TAGNAME@</a:blah>

I couldn't figure out where the a:blah stuff came from. Any ideas?

Thanks!
 
Cindy M -WordMVP-

Hi Tim,

would you mind very much opening such a document again and copying/pasting the
"blah stuff"? Since that's not a WordML tag, it's a bit difficult to offer any
kind of opinion :)
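
In the meantime, here is a rough sketch of one way to cope with tags that are
split over several nodes. It assumes (and this is only a guess until the real
markup is posted) that the "blah" elements are the usual WordML runs, i.e.
w:t text nodes inside w:r runs inside w:p paragraphs. The names replace_tags,
TAG_RE and the values dictionary are made up for illustration, and the tag
pattern is a guess at the @TAGNAME@ syntax. The idea is to join the text of
all w:t nodes in a paragraph, find the tags in the joined string, and write
each replacement back into the node where the tag starts.

```python
import re
import xml.etree.ElementTree as ET

# Namespace of Word 2003 "Save as XML" documents (WordML).
W_NS = "http://schemas.microsoft.com/office/word/2003/wordml"
P_TAG = "{%s}p" % W_NS   # paragraph
T_TAG = "{%s}t" % W_NS   # text inside a run

# Assumed tag syntax: @NAME@ with letters, digits, underscores and dots.
TAG_RE = re.compile(r"@([A-Za-z0-9_.]+)@")

ET.register_namespace("w", W_NS)  # keep the w: prefix when serializing


def replace_in_paragraph(text_nodes, values):
    """Replace tags in one paragraph, even if a tag spans several w:t nodes."""
    originals = [node.text or "" for node in text_nodes]
    parts = list(originals)
    joined = "".join(originals)

    # Offset of each node's text within the joined paragraph text.
    starts, pos = [], 0
    for chunk in originals:
        starts.append(pos)
        pos += len(chunk)

    # Work right to left so earlier offsets stay valid while we edit.
    for match in reversed(list(TAG_RE.finditer(joined))):
        name = match.group(1)
        if name not in values:
            continue  # unknown tag: leave it untouched
        s, e = match.span()
        for i, chunk in enumerate(originals):
            ns, ne = starts[i], starts[i] + len(chunk)
            if ne <= s or ns >= e:
                continue  # this node holds no part of the tag
            lo, hi = max(s, ns) - ns, min(e, ne) - ns
            # The node where the tag starts receives the replacement value;
            # the other nodes just lose their piece of the tag.
            insert = values[name] if ns <= s else ""
            parts[i] = parts[i][:lo] + insert + parts[i][hi:]

    for node, new_text in zip(text_nodes, parts):
        node.text = new_text


def replace_tags(xml_text, values):
    """Return the document XML with every known @NAME@ tag replaced."""
    root = ET.fromstring(xml_text)
    for para in root.iter(P_TAG):
        replace_in_paragraph(list(para.iter(T_TAG)), values)
    return ET.tostring(root, encoding="unicode")
```

Calling replace_tags(xml_text, {"TAGNAME": "real value"}) on the saved XML
would return the modified XML as a string. The replacement takes on the
formatting of the run in which the tag starts, and this sketch does not
preserve the XML declaration or processing instructions at the top of the
file, so those would need to be written back separately.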

Cindy Meister
INTER-Solutions, Switzerland
http://homepage.swissonline.ch/cindymeister (last update Jun 8 2004)
http://www.word.mvps.org

This reply is posted in the Newsgroup; please post any follow-up question or reply
in the newsgroup and not by e-mail :)
 
