Very Slow character counting in Word 2003

E

Edgar E. Cayce

G'day Edgar E. Cayce <[email protected]>,

Post the code Edgar, post the code :)

OK Steve, here is the code for my counting - note that if I comment
out the .font tests it runs very fast, but slows back down to a crawl
if I do those tests.

"CountSet" is a structure used for my counts.

Sub CountChars(ByVal CurrentRange As Range, ByRef Counter As CountSet)
Dim BoldState As Boolean
Dim UnderlineState As Boolean
Dim ItalicState As Boolean
Dim LastFontName As String
Dim LastFontSize As Long
Dim LastFontColor As Long
Dim CharCounter As Long
Dim DocEnd As Long

BoldState = False
UnderlineState = False
ItalicState = False
LastFontName = CurrentRange.Characters(1).Font.Name
LastFontSize = CurrentRange.Characters(1).Font.Size
LastFontColor = CurrentRange.Characters(1).Font.Color
CharCounter = 0
With CurrentRange
DocEnd = .End - 1
.Collapse wdCollapseStart
Do While .Start < DocEnd
.MoveEnd
CharCounter = CharCounter + 1
If CharCounter Mod 1000 = 0 Then
Debug.Print CharCounter ' so I can see how fast it is
DoEvents
End If

If .Text = " " Then
Counter.Spaces = Counter.Spaces + 1
ElseIf .Text = vbTab Then
Counter.Tabs = Counter.Tabs + 1
ElseIf .Text = vbCr Then
Counter.Returns = Counter.Returns + 1
Else ' else it is a char, the only one where we care about
'bold, font, etc.
Counter.Chars = Counter.Chars + 1
If .Case = wdUpperCase Then
Counter.CapitalChars = Counter.CapitalChars + 1
End If
' note for these, we check transition as well as
'presence.
If .Font.Bold Then
Counter.BoldChars = Counter.BoldChars + 1
If BoldState = False Then
Counter.BoldTransitions = _
Counter.BoldTransitions + 1
End If
BoldState = True
Else
If BoldState = True Then
Counter.BoldTransitions = _
Counter.BoldTransitions + 1
End If
BoldState = False
End If

If .Font.Underline Then
Counter.UnderlineChars = _
Counter.UnderlineChars + 1
If UnderlineState = False Then
Counter.UnderlineTransitions = _
Counter.UnderlineTransitions + 1
End If
UnderlineState = True
Else
If UnderlineState = True Then
Counter.UnderlineTransitions = _
Counter.UnderlineTransitions + 1
End If
UnderlineState = False
End If

If .Font.Italic Then
Counter.ItalicChars = Counter.ItalicChars + 1
If ItalicState = False Then
Counter.ItalicTransitions = _
Counter.ItalicTransitions + 1
End If
ItalicState = True
Else
If ItalicState = True Then
Counter.ItalicTransitions = _
Counter.ItalicTransitions + 1
End If
ItalicState = False
End If

If .Font.Name <> LastFontName Then
Counter.FontTransitions = _
Counter.FontTransitions + 1
LastFontName = .Font.Name
End If
If .Font.Size <> LastFontSize Then
Counter.FontTransitions = _
Counter.FontTransitions + 1
LastFontSize = .Font.Size
End If
If .Font.Color <> LastFontColor Then
Counter.FontTransitions = _
Counter.FontTransitions + 1
LastFontColor = .Font.Color
End If
End If
.Collapse wdCollapseEnd
Loop
End With
End Sub
 
A

Andrew Cushen

Edgar-

<< ...they all seem to have a count of 1 more than is
visible - even empty ones such as footnotes (when I have
none) have a
count of one char.
I believe what is causing that one character count is the
paragraph marker that is present in all paragraphs
(paragraphs as Word sees it, not necessarily what YOU
think are paragraphs!)

HTH,

-Andrew
=================================================
-----Original Message-----
Argh, am having problems with my news server - when I post it tells me
it failed, and then it posts twice.

Anyway, please excuse my double posts - Free Agent seems to have some
problems.

I tried my counter using WordHeretic's Method #1, and it works very
quickly - until I enable the code that checks italics, bold, fonts,
etc. Then it slows back down to as fast as my original code, or the
Method #2 code that deletes chars from the beginning of the range.

Note that to get Method #1 to work, I needed to change "DocEnd = .End"
to "DocEnd= .End - 1", or my loop just went forever. This may have to
do with the fact that I am counting each StoryRange in the doc
seperately, and they all seem to have a count of 1 more than is
visible - even empty ones such as footnotes (when I have none) have a
count of one char.

My Bold/Italic/etc. counting code is pretty basic, it's just

if .Text.Bold then
[count it...]
elseif .Text.Italic

Etc.

Any better way to do this?

Ed


Well, this is very weird.

I have not figured out what is making it so slow, but I have shed some
light on why my counts are coming up wrong. It appears that the
.Delete command, when deleting the last char of a word, deletes the
space after it as well. I figured this out by stepping through my
loop with .Visible set for the Word application. Only my double
spaces between sentences were being counted as spaces, all other
spaces were getting deleted along with the last letter of words.

And then, since 2 chars were being deleted for each word, my count of
chars is off so the paragraph marker at the end of the text gets
counted again and again...

Any idea how to keep it from deleting the spaces when deleting the
last char of a word?

Ed

.
 
W

Word Heretic

G'day Edgar E. Cayce <[email protected]>,

The macro is quite long so I've commented it inline below: The
suggested changes will bring about some speed increases for you, but
lets face it mate, you ARE doing a lot of tests :)

Otherwise, you are doing quite well!

Steve Hudson - Word Heretic
Want a hyperlinked index? S/W R&D? See WordHeretic.com

steve from wordheretic.com (Email replies require payment)


Edgar E. Cayce reckoned:
OK Steve, here is the code for my counting - note that if I comment
out the .font tests it runs very fast, but slows back down to a crawl
if I do those tests.

"CountSet" is a structure used for my counts.

Sub CountChars(ByVal CurrentRange As Range, ByRef Counter As CountSet)
Dim BoldState As Boolean
Dim UnderlineState As Boolean
Dim ItalicState As Boolean
Dim LastFontName As String
Dim LastFontSize As Long
Dim LastFontColor As Long
Dim CharCounter As Long
Dim DocEnd As Long

BoldState = False
UnderlineState = False
ItalicState = False
LastFontName = CurrentRange.Characters(1).Font.Name
LastFontSize = CurrentRange.Characters(1).Font.Size
LastFontColor = CurrentRange.Characters(1).Font.Color
CharCounter = 0
With CurrentRange
DocEnd = .End - 1
.Collapse wdCollapseStart
Do While .Start < DocEnd
.MoveEnd
CharCounter = CharCounter + 1
If CharCounter Mod 1000 = 0 Then
Debug.Print CharCounter ' so I can see how fast it is
DoEvents
End If

See how many times you are CALCULATING .text below?
If .Text = " " Then
Counter.Spaces = Counter.Spaces + 1
ElseIf .Text = vbTab Then
Counter.Tabs = Counter.Tabs + 1
ElseIf .Text = vbCr Then
Counter.Returns = Counter.Returns + 1

So, lets not! Also, let's use a select case statment instead of
multiple ElseIf's

Dim Text as String

....

Text=.Text
Select Case Text
Case " "
Counter.Spaces = Counter.Spaces + 1
Case vbtab
Counter.Tabs = Counter.Tabs + 1
Case vbCr
Counter.Returns = Counter.Returns +1
.....

Use this principle over and over below.



Else ' else it is a char, the only one where we care about
'bold, font, etc.
Counter.Chars = Counter.Chars + 1
If .Case = wdUpperCase Then
Counter.CapitalChars = Counter.CapitalChars + 1
End If
' note for these, we check transition as well as
'presence.

From here, we are using .Font a few times. Let's just calculate it
ONCE

With .Font
If .Bold Then
Counter.BoldChars = Counter.BoldChars + 1


Use the same principle here as two lines above
If BoldState = False Then

becomes

If Not BoldState then...
Counter.BoldTransitions = _
Counter.BoldTransitions + 1
End If
BoldState = True
Else
If BoldState = True Then

If BoldState then
Counter.BoldTransitions = _
Counter.BoldTransitions + 1
End If
BoldState = False
End If
If .Underline Then
Counter.UnderlineChars = _
Counter.UnderlineChars + 1
If UnderlineState = False Then

If Not...
Counter.UnderlineTransitions = _
Counter.UnderlineTransitions + 1
End If
UnderlineState = True
Else
If UnderlineState = True Then

If .. then
Counter.UnderlineTransitions = _
Counter.UnderlineTransitions + 1
End If
UnderlineState = False
End If
If .Italic Then
repeat changes from above for If's
Counter.ItalicChars = Counter.ItalicChars + 1
If ItalicState = False Then
Counter.ItalicTransitions = _
Counter.ItalicTransitions + 1
End If
ItalicState = True
Else
If ItalicState = True Then
Counter.ItalicTransitions = _
Counter.ItalicTransitions + 1
End If
ItalicState = False
End If

If .Name said:
Counter.FontTransitions = _
Counter.FontTransitions + 1
LastFontName = .Font.Name
End If
 
W

Word Heretic

G'day Edgar E. Cayce <[email protected]>,

Your surmise is correct.

Steve Hudson - Word Heretic
Want a hyperlinked index? S/W R&D? See WordHeretic.com

steve from wordheretic.com (Email replies require payment)


Edgar E. Cayce reckoned:
Steve,

So I can set a String variable to the text and still test for bold,
italic, fonts, etc? I thought that those are only properties of
Ranges, not Strings.

Is it possible to copy the data referred to by a range to a local
variable and keep those properties?

Ed

G'day Edgar E. Cayce <[email protected]>,

It will be as slow as your old method - but it will stay at that speed
all the way through - the other way gets slower AND slower :)

The main problem is you are constantly calculating .Text

Just do it once

Dim Text as String

...

Text=.Text

and use Text in all your If bits

Steve Hudson - Word Heretic
Want a hyperlinked index? S/W R&D? See WordHeretic.com

steve from wordheretic.com (Email replies require payment)


Edgar E. Cayce reckoned:
Argh, am having problems with my news server - when I post it tells me
it failed, and then it posts twice.

Anyway, please excuse my double posts - Free Agent seems to have some
problems.

I tried my counter using WordHeretic's Method #1, and it works very
quickly - until I enable the code that checks italics, bold, fonts,
etc. Then it slows back down to as fast as my original code, or the
Method #2 code that deletes chars from the beginning of the range.

Note that to get Method #1 to work, I needed to change "DocEnd = .End"
to "DocEnd= .End - 1", or my loop just went forever. This may have to
do with the fact that I am counting each StoryRange in the doc
seperately, and they all seem to have a count of 1 more than is
visible - even empty ones such as footnotes (when I have none) have a
count of one char.

My Bold/Italic/etc. counting code is pretty basic, it's just

if .Text.Bold then
[count it...]
elseif .Text.Italic

Etc.

Any better way to do this?

Ed


On Mon, 24 May 2004 12:37:54 -0700, Edgar E. Cayce

Well, this is very weird.

I have not figured out what is making it so slow, but I have shed some
light on why my counts are coming up wrong. It appears that the
.Delete command, when deleting the last char of a word, deletes the
space after it as well. I figured this out by stepping through my
loop with .Visible set for the Word application. Only my double
spaces between sentences were being counted as spaces, all other
spaces were getting deleted along with the last letter of words.

And then, since 2 chars were being deleted for each word, my count of
chars is off so the paragraph marker at the end of the text gets
counted again and again...

Any idea how to keep it from deleting the spaces when deleting the
last char of a word?

Ed


On Sun, 23 May 2004 13:28:27 +1000, Word Heretic

G'day Edgar E. Cayce <[email protected]>,

Hokay.

The best way to move along with ranges is NOT to use an iterated
collection. For example, do not use .Paragraphs(99). Every time you
access a member of the para, words, char etc collections, it is
DYNAMICALLY CALCULATED on the spot. This is important for BOTH of the
below solutions.

Find and Replace kicks the pooper out of this if you are able to use
such methodology. In a similar way, you collapse the find's range to
the end, then refind on whats left in the document. Always use the
smallest scope possible for your ranges.


Solution 1

This is the general purpose method that moves a range along

'Declare

Dim Scanner as Range
Dim DocEnd as Long


'Init

Set Scanner=ActiveDocument.Content
Scanner.Collapse wdCollapseStart

'Dont calc this every damn time, do it once

DocEnd=ActiveDocument.Content.End


'Sensible With structures enormously speed code

With Scanner


'Main loop

'We test for the end

While .Start < DocEnd

.MoveEnd


'Do your tests with Scanner

.Collapse wdCollapseEnd

wend
end with


Set Scanner=Nothing


Solution 2

I used a variation on this for my report on typefaces used in a doc.
It deletes as it goes, so iterating the start of the collection is
always very quick. It is usually only suitable for char by char
analyses.

You just always

With .Characters(1)

'blah blah

.Delete

end with

until the character count is 1 in that StoryRange. Corrupt documents
can play havoc here but that's not something I am willing to go into
at length here :) Hopefully it won't be neccesary in a handful of
years.



Steve Hudson - Word Heretic
Want a hyperlinked index? S/W R&D? See WordHeretic.com

steve from wordheretic.com (Email replies require payment)


Edgar E. Cayce reckoned:

Helmut, Klaus, DA,

Thank you so much for your help.

I am wondering - I need to count font transitions, whenever the face,
color or size of the font changes. Since I don;t know what these
fonts are in advance, can you think of a way to use Helmut's Find
method to count these transitions?

Ed


On Fri, 21 May 2004 02:56:05 -0700, "Helmut Weber"

Hi Edgar,
in addition to Klaus' advice, you may use
search and replace to get the number of
transitions, too:
Don't forget to reset search options before.
' Dim sTime As Long
' sTime = CLng(Timer)
Dim lItalTrns As Long ' transitions
Dim oRng As Range
Set oRng = ActiveDocument.Range
With oRng.Find
.Text = ""
.Font.Italic = True
While .Execute
lItalTrns = lItalTrns + 1
Wend
End With
' MsgBox CLng(Timer) - sTime
MsgBox lItalTrns * 2 ' !
Whereas an excerpt from your code on counting
transitions needs approximately 1 second for
1 page of my test-doc, the above macro needs
1 second for 100 pages. You might have to add
some lines of code for docs that start or end with
italic characters, depending on accuracy.
Greetings from Bavaria, Germany
Helmut Weber, MVP
"red.sys" & chr(64) & "t-online.de"
Word 2002, Windows 2000
 
E

Edgar E. Cayce

I managed to come up with a pretty quick character counter that does
everything I want - if anyone is interested in the code, email me.

Thanks for everyone's help!

Ed
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top