Performance Coding Tips for Word 2003 automation?

John Harris:Maine

We have an application that is written in VB 6.0 and automates Word using the
Word object model (COM based). We recently upgraded to Word 2003 (still VB
6.0). We are experiencing slower performance times with our application and
I was wondering if any information exists about the following:
1. Are there any known performance issues when upgrading an automation
application from Word 97 to Word 2003? Is Word 2003 just slower?
2. What types of things can someone do to write better performing automation
code using the Word object model in Word 2003?
3. We've got a lot of automation code controlling Word. Some items we have
made into a VBA macro so as to benefit from "in-process" speed, but much of
our code is preferred to be outside of Word and calling in using the COM
objects and methods. So I am looking for suggestions other than stating the
fact that VBA macros will run faster.
4. We've got some heavy "looping" logic that is calling into the Word object
model. Any tips for using Word automation within "loop" processing? (See the
next questions regarding what's going on in our loops.)
5. Any truth to the claim that using fully referenced object method calls is
slower than using a temporary variable for the same object? Ex: If I
have WordObject.SubObject1.SubObject2.SubObject3.method() and
SubObject3 is utilized multiple times within the logic, is it better to
create a temporary object of type SubObject3 and utilize that instead? In
other words, is there any processing overhead in Word resolving the hierarchy
each time? We've run some tests but the timings vary so much that we've
been unable to pin this down one way or the other.
6. We recently switched a lot of code from utilizing the "ActiveDocument"
object to utilizing the Documents collection object and indexing it by the
name of the desired document. Should this be any slower? Should we avoid
doing this within loops?
7. One of our heavy processing loops is in processing bookmarks. We loop
through the bookmarks doing string compares on the "type" of bookmark. Could
we have our own subcollections of bookmarks to keep the "types" of bookmarks
we save in separate collections? This would facilitate faster searching for
the bookmarks of a certain type.
8. I was once told by Microsoft on a support call that both Office 10 and 11
perform faster with COM based calls that are late-bound versus early-bound.
They stated that, while this seems contrary to what developers generally hear
with regard to COM applications, the reason for this regarding Word is that its
"late-bound" interfaces actually existed before the COM model reached
maturity, and when they formalized it for COM in later versions what
happened is that there are more "layers" over the late-bound interface to
provide the early-bound interfaces. Any truth to this? It seems like I
continuously find information to the contrary when searching the web.
9. Regardless of the above statement, which is the fastest way to utilize
Word? Late-bound or early-bound? Late-bound has the added benefit of being
more robust in an environment where new versions of Office are rolled out
under the control of another department, and we use it for that benefit.
10. Please point me to resources or documentation with these types of tips.
 
Jonathan West

John Harris:Maine said:
We have an application that is written in VB 6.0 and automates Word using the
Word object model (COM based). We recently upgraded to Word 2003 (still VB
6.0). We are experiencing slower performance times with our application and
I was wondering if any information exists about the following:
1. Are there any known performance issues when upgrading an automation
application from Word 97 to Word 2003? Is Word 2003 just slower?

Yes, but this can be limited to some extent by loading up on memory. Make
sure you have at least 256 MB and preferably 512 MB installed.

2. What types of things can someone do to write better performing automation
code using the Word object model in Word 2003?

Call the Word object model as little as possible.

3. We've got a lot of automation code controlling Word. Some items we have
made into a VBA macro so as to benefit from "in-process" speed, but much of
our code is preferred to be outside of Word and calling in using the COM
objects and methods. So I am looking for suggestions other than stating the
fact that VBA macros will run faster.

1. If you repeatedly read an item from the object model, read it once and
cache the value in a local variable.

2. If you are inserting a string into a document, build the entire string
first and then insert it. Don't make multiple insertions of small strings.
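
As a rough illustration of the second point, here is a minimal VB 6.0 sketch. It assumes a project reference to the Word object library (with late binding, declare doc As Object instead). The procedure name AppendItems and its parameters are made up for the example.

```vb
' Sketch only. AppendItems, doc and items are hypothetical names.
' Assumes a project reference to the Word object library.
Public Sub AppendItems(ByVal doc As Word.Document, items() As String)
    Dim buffer As String

    ' Join assembles the whole string inside the VB process,
    ' so this step makes no calls into Word at all.
    buffer = Join(items, vbCr)

    ' One call into Word instead of one call per item.
    doc.Content.InsertAfter buffer
End Sub
```

The slow variant would call InsertAfter once per array element inside a For loop, paying the out-of-process call cost each time.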

4. We've got some heavy "looping" logic that is calling into the Word object
model. Any tips for using Word automation within "loop" processing? (See the
next questions regarding what's going on in our loops.)
5. Any truth to the claim that using fully referenced object method calls is
slower than using a temporary variable for the same object? Ex: If I
have WordObject.SubObject1.SubObject2.SubObject3.method() and
SubObject3 is utilized multiple times within the logic, is it better to
create a temporary object of type SubObject3 and utilize that instead? In
other words, is there any processing overhead in Word resolving the hierarchy
each time? We've run some tests but the timings vary so much that we've
been unable to pin this down one way or the other.

Use the With keyword or use local object variables. Generally performance
improves every time you can get rid of a dot from your code. (However,
remember that something like ActiveDocument.Hyperlinks(n) is actually
shorthand for ActiveDocument.Hyperlinks.Item(n). Count the dots as if you
are using the Item keyword.)
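
A small sketch of the With approach, assuming early binding to the Word object library. The procedure name and the font settings are only placeholders for the example.

```vb
' Sketch only. FormatFirstParagraph is a hypothetical name.
' Assumes a project reference to the Word object library.
Public Sub FormatFirstParagraph(ByVal doc As Word.Document)
    ' doc.Paragraphs(1).Range.Font is resolved once here.
    ' Written out in full on every line, each dot would be
    ' another call into Word, repeated three times over.
    With doc.Paragraphs(1).Range.Font
        .Name = "Arial"
        .Size = 12
        .Bold = True
    End With
End Sub
```

Declaring a local variable of type Word.Font and setting it once gives the same saving and can be passed to other procedures, which a With block cannot.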

6. We recently switched a lot of code from utilizing the "ActiveDocument"
object to utilizing the Documents collection object and indexing it by the
name of the desired document. Should this be any slower? Should we avoid
doing this within loops?

This will be slower. Instead, set an object variable of type Document to
refer to the document you are changing, and work through that variable.
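
For example, something along these lines, assuming early binding. The function name CountExisting and its parameters are invented for the illustration.

```vb
' Sketch only. CountExisting and its parameters are hypothetical names.
' Assumes a project reference to the Word object library.
Public Function CountExisting(ByVal wordApp As Word.Application, _
                              ByVal docName As String, _
                              bookmarkNames() As String) As Long
    Dim doc As Word.Document
    Dim marks As Word.Bookmarks
    Dim i As Long

    ' The lookup by name happens once, outside the loop.
    Set doc = wordApp.Documents(docName)
    Set marks = doc.Bookmarks

    For i = LBound(bookmarkNames) To UBound(bookmarkNames)
        ' Reuses the cached reference instead of repeating
        ' wordApp.Documents(docName).Bookmarks on every pass.
        If marks.Exists(bookmarkNames(i)) Then
            CountExisting = CountExisting + 1
        End If
    Next i
End Function
```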

7. One of our heavy processing loops is in processing bookmarks. We loop
through the bookmarks doing string compares on the "type" of bookmark. Could
we have our own subcollections of bookmarks to keep the "types" of bookmarks
we save in separate collections? This would facilitate faster searching for
the bookmarks of a certain type.

If you are not modifying the text or location of the bookmarks, then read
all the bookmarks into a local array or collection, and then process that.
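
A possible sketch of that idea. I am assuming here that the "type" of a bookmark is encoded as a prefix of its name, which may not match how your application does it. The function name is made up.

```vb
' Sketch only. NamesWithPrefix is a hypothetical name, and treating the
' bookmark "type" as a name prefix is an assumption for the example.
' Assumes a project reference to the Word object library.
Public Function NamesWithPrefix(ByVal doc As Word.Document, _
                                ByVal prefix As String) As Collection
    Dim result As New Collection
    Dim bm As Word.Bookmark
    Dim bmName As String

    For Each bm In doc.Bookmarks
        bmName = bm.Name    ' one read from Word per bookmark
        ' The comparison itself runs on the local copy of the name.
        If StrComp(Left$(bmName, Len(prefix)), prefix, vbTextCompare) = 0 Then
            result.Add bmName
        End If
    Next bm

    Set NamesWithPrefix = result
End Function
```

Call it once per type after the document is loaded and keep the returned collections. Later searches then run entirely in VB, and the names need re-reading only if bookmarks are added or deleted.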

8. I was once told by Microsoft on a support call that both Office 10 and 11
perform faster with COM based calls that are late-bound versus early-bound.
They stated that, while this seems contrary to what developers generally hear
with regard to COM applications, the reason for this regarding Word is that its
"late-bound" interfaces actually existed before the COM model reached
maturity, and when they formalized it for COM in later versions what
happened is that there are more "layers" over the late-bound interface to
provide the early-bound interfaces. Any truth to this? It seems like I
continuously find information to the contrary when searching the web.

I've never heard this. All the information I have seen is to the effect that
early-bound code is preferable from a performance point of view, and that the
primary benefit of late-bound code is compatibility with multiple versions
of Word.

Nevertheless, I suppose that it is possible that some late-bound code could
be quicker. Whether *your* code is quicker when late-bound can only be
discovered by benchmarking. You have to test it. In most cases, there are a
number of different ways of doing the same thing. Build some test programs
and time the alternatives.
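
A bare-bones timing harness might look like the sketch below. It times the fully qualified path against a cached reference, but the same pattern works for early versus late binding. The procedure name is invented, and it assumes a document is open in the Word instance.

```vb
' Sketch only. CompareDotCounts is a hypothetical name.
' Assumes a project reference to the Word object library and
' that wordApp has an active document.
Public Sub CompareDotCounts(ByVal wordApp As Word.Application, _
                            ByVal passes As Long)
    Dim started As Single
    Dim i As Long
    Dim n As Long
    Dim marks As Word.Bookmarks

    ' Alternative 1: walk the full object path on every pass.
    started = Timer
    For i = 1 To passes
        n = wordApp.ActiveDocument.Bookmarks.Count
    Next i
    Debug.Print "Full path:  "; Timer - started; " s"

    ' Alternative 2: resolve the path once, then reuse the reference.
    Set marks = wordApp.ActiveDocument.Bookmarks
    started = Timer
    For i = 1 To passes
        n = marks.Count
    Next i
    Debug.Print "Cached ref: "; Timer - started; " s"
End Sub
```

Timer counts seconds since midnight with fairly coarse resolution, so use a large number of passes, repeat the run several times, and avoid timing across midnight. That should also help with timings that vary too much to compare.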

9. Regardless of the above statement, which is the fastest way to utilize
Word? Late-bound or early-bound? Late-bound has the added benefit of being
more robust in an environment where new versions of Office are rolled out
under the control of another department, and we use it for that benefit.

I think that if that benefit is important enough for you, you should
continue with late-bound. The difference in performance between early- and
late-bound code is probably much less than the performance difference you can
make by caching important information in local arrays or collections.

10. Please point me to resources or documentation with these types of tips.

This article is excellent at providing ideas for optimization possibilities:

VBA Code Optimisation
By Ken Getz
http://www.microsoft.com/officedev/articles/movs101.htm

 
