MS Word objects list

Piotr Nadolny

Word allows us to access its objects, such as tables
(ole_object.application.activedocument.tables), paragraphs, sentences and
others. But is there an object that keeps all the Word objects (or
references to them) in the order they appear in the document?
For example: having the collection of tables
(ole_object.application.activedocument.tables), I don't know where exactly
those tables are in the document. If the Range or Selection object selects
a whole table, how can I find out that it is a table? What is more
interesting: how can I find out (when the cursor is at the beginning of the
document) what kind of object comes next: a table, a sentence or something
else? How can I read my document from top to toe?

Thanks in advance...
 
Word Heretic

G'day Piotr Nadolny,

The fundamental unit for stepping through a document is either paragraphs
or collections.

How can you POSSIBLY know what is next when the user can stick
ANYTHING ANYWHERE? I'm afraid I really don't understand your perceived
objective.

Steve Hudson - Word Heretic
Want a hyperlinked index? S/W R&D? See WordHeretic.com

steve from wordheretic.com (Email replies require payment)


 
Piotr Nadolny

Ok, maybe in other words...
Suppose that my document begins with a table. How can I find out from the
script that it begins with a table (using ole_object.application.range...
or ole_object.application.selection... or something else)? Is there an array
or a list of Word objects in the document? How can I read a Word document
from top to toe?
 
Pete Bennett

The Word object hierarchy doesn't work in the way that you imagine it does.

All the Word objects are pointers to structures within the document that can
interleave each other more or less at will. Thus, there's no sequential
structure of objects. Tables can contain paragraphs, ranges of text can
contain comments and so on.

If you really want to draw up a map of what's in the document, you're better
off going through each collection you're interested in (say, tables,
comments, footnotes, whatever) and logging the start and end positions of
each of their associated ranges. You can then use that list to construct
your sequence of objects.

Whatever's left in the gaps is of course text...
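
The logging approach described above can be sketched in plain Python. This is
only an illustration under stated assumptions: the (start, end, kind) tuples
and the document length are made-up sample values, and build_sequence is a
hypothetical helper name. In a real script the positions would presumably come
from the Start and End of each object's range.

```python
from typing import List, Tuple

# (start, end, kind); end is exclusive. All values below are made-up samples.
Span = Tuple[int, int, str]


def build_sequence(spans: List[Span], doc_length: int) -> List[Span]:
    """Order logged spans by position and label the uncovered gaps as text."""
    sequence: List[Span] = []
    cursor = 0  # first position not yet covered by any logged span
    # Sort by start; for equal starts put the longer (containing) span first.
    for start, end, kind in sorted(spans, key=lambda s: (s[0], -s[1])):
        if start > cursor:
            sequence.append((cursor, start, "text"))
        sequence.append((start, end, kind))
        cursor = max(cursor, end)
    if cursor < doc_length:
        sequence.append((cursor, doc_length, "text"))
    return sequence


logged = [(200, 230, "footnote"), (0, 120, "table"), (40, 50, "comment")]
for item in build_sequence(logged, doc_length=300):
    print(item)
```

Nested objects (the comment inside the table here) are kept in the list right
after their container rather than being split out, which matches the point
that objects can interleave.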
 
Howard Kaikow

You would have to sequentially navigate through the document, testing along
the way to see what is the current object (and whether that object contains
other objects).
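
One way to picture the "test what the current object is" step, assuming the
start and end positions of the interesting objects are already known: given a
position, list every object that contains it, outermost first. The sample
spans and the objects_at name are hypothetical, and this is a plain-Python
sketch rather than Word automation code.

```python
from typing import List, Tuple

# (start, end, kind); end is exclusive. Sample values are made up.
Span = Tuple[int, int, str]


def objects_at(position: int, spans: List[Span]) -> List[str]:
    """Return the kinds of all spans containing position, outermost first."""
    hits = [s for s in spans if s[0] <= position < s[1]]
    # A containing span starts no later and ends no earlier than its contents.
    hits.sort(key=lambda s: (s[0], -s[1]))
    return [kind for _, _, kind in hits]


sample = [(0, 120, "table"), (40, 50, "comment"), (200, 230, "footnote")]
print(objects_at(45, sample))   # inside the comment, which is inside the table
print(objects_at(150, sample))  # plain text: no enclosing object
```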
 
Jay Freedman

To expand a bit on this...

It would be slow, inefficient, frustrating, and ultimately unproductive to
"read a Word document from top to toe". Instead, determine what kind of
manipulation you need to perform, and then work with the appropriate
objects. To do that, you don't usually need to know where in the document
those objects are, or in what sequence they appear on the screen, or even
how many there are.

Make maximum use of the For Each <object> In <collection> loop construct and
the .Find method (which also does replacement) of a Range object. The Range
object's .Information property can tell you things such as whether the range
is in a table.

Learn about StoryRanges -- a document consists of up to 11 "stories" (more
in Word 2003) such as main text, headers, footers, footnotes, endnotes,...

For the few occasions when you really need to know the absolute index of an
object in the document, see
http://word.mvps.org/FAQs/MacrosVBA/GetIndexNoOfPara.htm.
 
Piotr Nadolny

My RTF files won't be complicated. I'd like to generate a very simple XML
document that will represent my RTF document. I want to load the XML,
generate an RTF file, modify it and then save it as XML again. I'll try to
log the start and end positions of the objects' ranges. Thanks, I'll see
what can be done...
 
Jay Freedman

Hi Piotr,

If you had said at the beginning that you're translating between RTF
and XML, I would have suggested right away: don't use Word at all!
Write a console application in whatever VB- or C-related language
you're comfortable with, and do the conversion directly.

The reason is that both RTF and XML are reasonably linear,
stream-oriented formats, while Word's native format is massively
nonlinear. I haven't looked at this seriously, but for a simple file
structure you might need only to translate the tags.
 
Word Heretic

G'day Piotr Nadolny,

Don't forget that these 'logged' points are going to change
dynamically as you autoedit the document.
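
To illustrate why the logged points go stale, here is a small plain-Python
sketch of how previously logged (start, end, kind) positions would have to be
adjusted after some characters are inserted. The tuples and the
shift_after_insert helper are hypothetical; the alternative is simply to
re-log the positions after each edit.

```python
from typing import List, Tuple

# (start, end, kind); end is exclusive. Sample values are made up.
Span = Tuple[int, int, str]


def shift_after_insert(spans: List[Span], at: int, length: int) -> List[Span]:
    """Adjust logged spans after `length` characters are inserted at `at`."""
    adjusted: List[Span] = []
    for start, end, kind in spans:
        if at <= start:
            # Insertion at or before the span: the whole span moves right.
            adjusted.append((start + length, end + length, kind))
        elif at < end:
            # Insertion inside the span: only its end moves.
            adjusted.append((start, end + length, kind))
        else:
            # Insertion after the span: nothing changes.
            adjusted.append((start, end, kind))
    return adjusted


logged = [(0, 120, "table"), (200, 230, "footnote")]
print(shift_after_insert(logged, at=150, length=10))
```

Working through the logged list from the last object to the first avoids most
of this bookkeeping, since an edit never moves anything that precedes it.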

However, there are several other alternatives here.

Load the XML as a text file directly and find and replace / examine
nodes to your heart's content.

Load the XML as XML and embed an XSLT if one is missing to produce the
transformation required, then specify that transform when reloading
the XML into your alternate data source.

Now, if you are set on the RTF intermediate, you can easily store a
logfile of events to use to guide the subsequent word processing.


 
