How to correctly use the Pages object (Microsoft.Office.Interop.Word)

Oleg

I need to find the sizes and positions of the rectangles that bound every text
fragment (word) and shape (picture) in an MS Word document (page formatting is
set).

I'm trying to do it through the Rectangles property of the Pages object
(_Document.ActiveWindow.ActivePane.Pages):

using System;
using System.IO;
using Word = Microsoft.Office.Interop.Word;

object missing = Type.Missing;
object fileName = @"c:\1.rtf";
object dontConfirmConversions = false;
object openReadWrite = true;   // passed as the third argument of Open (ReadOnly)
object nonVisible = false;     // passed as the Visible argument of Open

Word._Application wapp = new Microsoft.Office.Interop.Word.Application();
Word._Document doc = wapp.Documents.Open(
    ref fileName, ref dontConfirmConversions, ref openReadWrite, ref missing,
    ref missing, ref missing, ref missing, ref missing,
    ref missing, ref missing, ref missing, ref nonVisible,
    ref missing, ref missing, ref missing, ref missing );

Word.Pages pages = doc.ActiveWindow.ActivePane.Pages;

// Write one line per rectangle on every page
using ( StreamWriter sw = new StreamWriter( "TestFile.txt" ) ) {
    foreach ( Word.Page p in pages ) {
        Word.Rectangles rects = p.Rectangles;
        foreach ( Word.Rectangle r in rects ) {
            sw.WriteLine( " Width:" + r.Width + " Height: " + r.Height );
        }
    }
}

The output of this code fragment should be a file, TestFile.txt, with one line
per bounding rectangle giving its size.

But I get only one record in the result (for example, Width:312 Height: 728).

This is a question for a real professional, which I am not.
Please help me.
 
Cindy M.

Hi Oleg,
> But I have only one record in result. (for example, Width:312 Height: 728).

When I test your code in a Windows Forms application it works fine. For
example, I got these lines for four different text regions:
Width:432 Height: 648
Width:119 Height: 38
Width:72 Height: 20
Width:612 Height: 792

Are you sure the document you're testing on contains more than one rectangle?
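
One way to check is to print each rectangle's type and position along with its size. What follows is only a minimal sketch, not a tested solution. It assumes `doc` is the document opened in the code above and that the window has to be in Print Layout view for the Pages collection to reflect the page layout. `RectangleDump` and `DumpRectangles` are illustrative names, not part of the Word object model.

```csharp
using System.IO;
using Word = Microsoft.Office.Interop.Word;

static class RectangleDump
{
    // Hypothetical helper: writes one line per rectangle on every page.
    public static void DumpRectangles(Word._Document doc, string path)
    {
        // Assumption: page layout information is only reliable in Print Layout view.
        doc.ActiveWindow.View.Type = Word.WdViewType.wdPrintView;

        Word.Pages pages = doc.ActiveWindow.ActivePane.Pages;
        using (StreamWriter sw = new StreamWriter(path))
        {
            int pageNo = 0;
            foreach (Word.Page p in pages)
            {
                pageNo++;
                foreach (Word.Rectangle r in p.Rectangles)
                {
                    // RectangleType tells text regions apart from shapes and other regions.
                    sw.WriteLine("Page " + pageNo + " " + r.RectangleType
                        + " Left:" + r.Left + " Top:" + r.Top
                        + " Width:" + r.Width + " Height:" + r.Height);
                }
            }
        }
    }
}
```

Calling `RectangleDump.DumpRectangles(doc, "TestFile.txt")` in place of the original loop should make it easier to see whether the single record is a text region covering the whole page body or something else.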

Cindy Meister
INTER-Solutions, Switzerland
http://homepage.swissonline.ch/cindymeister (last update Jun 17 2005)
http://www.word.mvps.org

This reply is posted in the Newsgroup; please post any follow question or
reply in the newsgroup and not by e-mail :)
 
Oleg

Thank you for your answer, Cindy!

Excuse me, Cindy, I may be mistaken, but I thought that a portion of text is
a word.
It looks like that isn't so...
Please tell me how I can get the bounding rectangle for every word in the text
(if that is possible, of course).

Sincerely,
 
Cindy M.

Hi Oleg,

> Excuse me, Sindy, I must be mistaken, but I thought, that portion of text is
> a word.
> Looks like it isn't so...

No... I think this works basically the way a scanner does, calculating where a
"block" is positioned. As long as everything is text and has the same basic
positioning (left alignment), it's one "Rectangle".

> Please, tell me how can I get bounding rectangle for every word in the text
> (if it is possible, of course).

That's not really possible, at least not directly. You can sort of get this
using the relative horizontal and vertical position parameters of the
Information property, which is available on the Range and Selection objects.
So, basically, you could do something along these lines:

    Dim wd As Word.Range
    Dim rng As Word.Range
    Dim wdLeft As Single, wdTop As Single, wdRight As Single, wdHeight As Single
    For Each wd In ActiveDocument.Words
        wdLeft = wd.Information(wdHorizontalPositionRelativeToPage)
        wdTop = wd.Information(wdVerticalPositionRelativeToPage)
        ' Collapse a copy to the end of the word to read its right edge
        Set rng = wd.Duplicate
        rng.Collapse wdCollapseEnd
        wdRight = rng.Information(wdHorizontalPositionRelativeToPage)
        wdHeight = rng.ParagraphFormat.LineSpacing
    Next

but these measurements may not be very accurate.
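
For C#, the same idea as the loop above could be sketched roughly as follows. This is an untested sketch under a few assumptions: `doc` is an open Word document, the position values come back as boxed numbers that `Convert.ToSingle` can handle, and the paragraph's line spacing is an acceptable stand-in for the height. `WordBoxes` and `PrintWordBoxes` are made-up names for illustration.

```csharp
using System;
using Word = Microsoft.Office.Interop.Word;

static class WordBoxes
{
    // Hypothetical helper: prints an approximate box for every word in the document.
    public static void PrintWordBoxes(Word._Document doc)
    {
        foreach (Word.Range wd in doc.Words)
        {
            // Position of the start of the word, relative to the page.
            float left = Convert.ToSingle(wd.get_Information(
                Word.WdInformation.wdHorizontalPositionRelativeToPage));
            float top = Convert.ToSingle(wd.get_Information(
                Word.WdInformation.wdVerticalPositionRelativeToPage));

            // Collapse a copy of the range to the end of the word to read the right edge.
            Word.Range end = wd.Duplicate;
            object direction = Word.WdCollapseDirection.wdCollapseEnd;
            end.Collapse(ref direction);
            float right = Convert.ToSingle(end.get_Information(
                Word.WdInformation.wdHorizontalPositionRelativeToPage));

            // Line spacing is only a rough stand-in for the height of the word.
            float height = wd.ParagraphFormat.LineSpacing;

            Console.WriteLine(wd.Text.Trim() + ": left=" + left + " top=" + top
                + " width=" + (right - left) + " height=" + height);
        }
    }
}
```

Items in the Words collection usually include trailing spaces, and the end of a word that closes a line may be reported on the next line, so the computed width can be too large or even negative. Treat the numbers as approximations.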

Cindy Meister
 
Oleg

Thank you very much, Cindy.

Your advice might help me (as I see it), but I have a problem again.
For example, when the code below is executed, an exception is thrown.

    foreach (Word.Range wd in doc.Words) {
        int left = (int)wd.get_Information(
            Microsoft.Office.Interop.Word.WdInformation.wdHorizontalPositionRelativeToPage);
        //...
    }

Maybe it is not possible to get the relative position for every Range object.
I'm sorry to bother you.
 
Cindy M.

Hi Oleg,

> Your advice might help me (as I see it), but I have some problems again.
> For example, when code given below is being executed, exception is thrown.
>
> foreach (Word.Range wd in doc.Words) {
> int left = (int)wd.get_Information(
> Microsoft.Office.Interop.Word.WdInformation.wdHorizontalPositionRelativeToPage);
> //...
> }

Check the Help for the WdInformation parameters you're using. These certainly
do not return an int value. More likely a Single (float).

Also, put such things in try-catch blocks so that you get meaningful information
in the error messages. Without this information, troubleshooting is very
difficult, if not impossible. I just happen to know that these certainly don't
return integer values...
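
To illustrate the point without needing Word at all, here is a small stand-alone sketch. It assumes the value arrives as a boxed Single, which is my guess rather than something I have confirmed for every WdInformation constant. `UnboxDemo` and `ToPoints` are hypothetical names.

```csharp
using System;

static class UnboxDemo
{
    // Converts a boxed numeric value (whatever its exact type) to float.
    public static float ToPoints(object value)
    {
        return Convert.ToSingle(value);
    }

    static void Main()
    {
        // Stand-in for the object returned by get_Information (assumed to be a boxed float).
        object boxed = 72.5f;

        try
        {
            // Unboxing a boxed float directly to int is not allowed and throws.
            int left = (int)boxed;
            Console.WriteLine(left);
        }
        catch (InvalidCastException ex)
        {
            // The catch block surfaces the actual reason for the failure.
            Console.WriteLine("Direct cast failed: " + ex.Message);
        }

        // Converting first works whichever numeric type was boxed.
        Console.WriteLine("Converted value: " + ToPoints(boxed));
    }
}
```

If an integer is really needed, convert to float first and round afterwards, for example `(int)Math.Round(UnboxDemo.ToPoints(value))`.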

Cindy Meister
 
