Find page number for each word in document

Gabe Knuth

Hi All,

I have a Word document for which I'm trying to create an index. I've
just started this code, but the idea is to parse through each word and
dump out the word and the page number to a CSV file that I can
manipulate later on (or dump it to a dictionary object or array or
something so I can manipulate it in code). The idea is that the
100 least-used words in the document should make a halfway decent
index.

My problem is that I can't figure out how to get the page number
to show up correctly using the following code:

Function CreateIndex()

    Dim fso, NewsFile

    Set fso = CreateObject("Scripting.FileSystemObject")
    Set File = fso.CreateTextFile("c:\book\sampleindex.csv", True)
    Set objDictionary = CreateObject("Scripting.Dictionary")

    Set colWords = Application.ActiveDocument.Words

    For Each strWord In colWords
        strWord = LCase(strWord)
        strLetter = Left(strWord, 1)
        If Asc(strLetter) < 97 Or Asc(strLetter) > 122 Then
        Else
            ActiveWindow.ScrollIntoView Selection.Range, True
            intPage = Selection.Range.Information(wdActiveEndPageNumber)
            MsgBox (strWord & "," & intPage)
            File.WriteLine strWord & "," & intPage
        End If
    Next

    File.Close

End Function

I'm pretty new to VBA for Word, but I'm not terrible at VBS, so I can
sometimes keep up with you über scripters out there.

The result of this script is a CSV file that has each word in the
document, and the number "1", because wdActiveEndPageNumber always
returns a "1". I tried to add the "ActiveWindow.ScrollIntoView" line
to scroll to the page, but I have a feeling I'm missing something for
that to work.

So, any help you guys can offer would be much appreciated.

Thanks!
Gabe
 
Jay Freedman

Hi Gabe,

The reason you're seeing page 1 for every entry is that the Selection object
represents the location of the cursor (technically the "insertion point") or
whatever text is selected, and your code doesn't do anything that would move
that point away from the start of the document. Given that,
ScrollIntoView accomplishes nothing: it just keeps scrolling to the top
of the document.

The real problem, though, is that you shouldn't be using the Selection at
all. The loop variable strWord is actually a Range object (look in the VBA
help for the Words collection, which will tell you that each item in the
collection is a Range), and every Range object has the same
Information(wdActiveEndPageNumber) property that the Selection.Range has,
but it refers to the end of the Range object instead of the end of the
Selection. Since strWord is the thing that's "moving" through the document,
that's what you need to refer to.

So get rid of the ScrollIntoView line, and change the next line to

intPage = strWord.Information(wdActiveEndPageNumber)
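
As a minimal sketch of that difference (the Sub name ComparePageNumbers is made up for illustration, and it assumes the active document contains at least one word), the following reads the page number from a Range and from the Selection without ever moving the cursor:

```vba
Sub ComparePageNumbers()
    Dim rgeLast As Word.Range

    ' Last item in the Words collection (often the final paragraph mark).
    Set rgeLast = ActiveDocument.Words(ActiveDocument.Words.Count)

    ' Page on which this Range ends; the cursor is not involved.
    Debug.Print "Range page: " & rgeLast.Information(wdActiveEndPageNumber)

    ' Page on which the Selection ends; unchanged unless the cursor moves.
    Debug.Print "Selection page: " & Selection.Information(wdActiveEndPageNumber)
End Sub
```

In a multi-page document with the cursor left at the top, the two lines in the Immediate window should normally differ, which is the effect seen in the CSV file.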

--
Regards,
Jay Freedman
Microsoft Word MVP FAQ: http://word.mvps.org
Email cannot be acknowledged; please post all follow-ups to the newsgroup so
all may benefit.
 
Gabe Knuth

Thanks, Jay!

I tried that, and I'm given an error that says Object Required on the
new line. Do I need to set strWord as something before this will
work? The actual error I'm getting is "Runtime Error '424': Object
Required"

Thanks!
 
Jean-Guy Marcil

This is because you are using "sloppy" code... ;-)
You should use Option Explicit at the top of all your modules. This way,
ambiguous and undeclared variables/objects would be detected.

You have:

For Each strWord In colWords
strWord = LCase(strWord)

and Jay suggested:
intPage = strWord.Information(wdActiveEndPageNumber)

But by the time VBA reaches Jay's suggested line, the following has
taken place:

For Each strWord In colWords
puts a Range object into strWord, which is an undeclared variable and
therefore a Variant (colWords is a collection of ranges, not strings).
Then,
strWord = LCase(strWord)
replaces that Range with a String, because an assignment without Set
stores the lowercased text in strWord!!!

So, when you get to
intPage = strWord.Information(wdActiveEndPageNumber)
VBA is trying to read a property on a String, not on a Range object as
it should, hence the "Object Required" error.
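
To watch this happen, here is a small sketch (the Sub name ShowVariantTypeChange and the variable varWord are invented for illustration; varWord stands in for the undeclared strWord, and the active document is assumed to contain at least one word). It prints the type held by the variable before and after the LCase assignment:

```vba
Sub ShowVariantTypeChange()
    Dim varWord As Variant   ' plays the role of the undeclared strWord

    For Each varWord In ActiveDocument.Words
        Debug.Print TypeName(varWord)   ' the loop hands over a Range
        varWord = LCase(varWord)        ' no Set, so only the text is stored
        Debug.Print TypeName(varWord)   ' the variable now holds a String
        Exit For                        ' one word is enough to see it
    Next
End Sub
```

Once the variable holds a String, any call such as varWord.Information(...) has no object to work on.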

Also, you are using a variable called File, which is also the name of an
object in the Scripting library, so it is better to pick a different name.

Here is a modified version of your code (untested):

' The Scripting.* types need a reference to "Microsoft Scripting Runtime"
' (Tools > References in the VBA editor).
Sub CreateIndex()

    Dim fso As Scripting.FileSystemObject
    Dim fsoFile As Scripting.TextStream
    Dim objDictionary As Scripting.Dictionary
    Dim colWords As Word.Words
    Dim rgeWord As Word.Range
    Dim strWord As String
    Dim strLetter As String
    Dim lngPage As Long

    Set fso = CreateObject("Scripting.FileSystemObject")
    Set fsoFile = fso.CreateTextFile("c:\book\sampleindex.csv", True)
    Set objDictionary = CreateObject("Scripting.Dictionary")

    Set colWords = Application.ActiveDocument.Words

    For Each rgeWord In colWords
        ' Keep the Range in rgeWord and copy its text into a separate String.
        strWord = LCase(rgeWord.Text)
        strLetter = Left(strWord, 1)
        ' Only keep words that start with a letter a-z.
        If Asc(strLetter) >= 97 And Asc(strLetter) <= 122 Then
            ' Ask the word's own Range for its page, not the Selection.
            lngPage = rgeWord.Information(wdActiveEndPageNumber)
            ' Shows one box per word; comment out for long documents.
            MsgBox (strWord & "," & lngPage)
            fsoFile.WriteLine strWord & "," & lngPage
        End If
    Next

    fsoFile.Close

End Sub
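
Note that objDictionary is created above but never used. If you want to go the dictionary route you mentioned, something along these lines might be a starting point. It is only a sketch and just as untested: the Sub name BuildWordPageIndex, the variable names, and the ";" separator are my own choices, and it uses late binding so no extra reference is needed.

```vba
Sub BuildWordPageIndex()
    Dim dictPages As Object      ' late-bound Scripting.Dictionary
    Dim rgeWord As Word.Range
    Dim strWord As String
    Dim lngPage As Long
    Dim varKey As Variant

    Set dictPages = CreateObject("Scripting.Dictionary")

    For Each rgeWord In ActiveDocument.Words
        ' Items in Words usually carry a trailing space, so trim it off.
        strWord = LCase(Trim(rgeWord.Text))
        ' Skips empty strings and anything that does not start with a-z
        ' (assumes the default Option Compare Binary).
        If strWord Like "[a-z]*" Then
            lngPage = rgeWord.Information(wdActiveEndPageNumber)
            If dictPages.Exists(strWord) Then
                ' Seen before: append this page to the word's list.
                dictPages(strWord) = dictPages(strWord) & ";" & lngPage
            Else
                dictPages.Add strWord, CStr(lngPage)
            End If
        End If
    Next

    ' One line per distinct word: word,page;page;page
    For Each varKey In dictPages.Keys
        Debug.Print varKey & "," & dictPages(varKey)
    Next
End Sub
```

The number of pages listed for a word is also the number of times it occurs, so the least-used words are the ones with the shortest lists.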
 
Gabe Knuth

Ahh! Now I get it. I'm not much in the way of envisioning things,
but once I see them I can deconstruct what you all mean. This works
like a champ! Thanks!
 
