Speeding up slow code

C

Carlos Nunes-Ueno

I work in an accounting firm and for audits we sometimes get general ledger
documents in Excel that we then process to see if there are interesting
patterns, etc.

The problem is that most of the time, these are basically puked into Excel
in print formats by whatever accounting application the client is using.
This means that instead of each transaction having it's own row with all of
the information, some of the information might be in a simulated "header" or
"footer" and there might be random columns inserted willy-nilly to replicate
how it looks on paper.

To address this, I usually hack out a little custom function that takes the
transactions from the raw data and puts them in neat rows in another
worksheet. My issue is that these tend to be fairly slow and resource
intensive. When they run, Excel immediately jumps to 100% CPU usage and
processing around 3500 rows takes five minutes or sometimes more. I'm much
more comfortable with Access VBA than with Excel and therefore I get the
feeling that there is probably a lot of optimization that could done to the
code. I'd appreciate any ideas that could speed up this code.

Anyway, without further ado, here's an example:

Public Sub CleanUp(intMaxRow As Integer, AcctNumCol As Integer, AcctNameCol
As Integer, ParamArray Cols() As Variant)
'Walk down the RawData sheet and look for one of two patterns,
"Account*" or "??##"

'If "Account*" is found, then copy the Account Name and Number to string
variables
'these will be used later for copying into the CleanData sheet.

'If "??##" this row is a transaction, so start building a discontinuous
range using the
'columns specified in the Cols paramarray.

Const StartRow As Integer = 2

Dim wksRawData As Worksheet
Dim wksCleanData As Worksheet
Dim rngRawData As Range
Dim rngTemp As Range

Dim intRawRow As Integer
Dim intCleanRow As Integer
Dim strAcctNum As String
Dim strAcctName As String
Dim intColCnt As Integer

Application.ScreenUpdating = False

intCleanRow = StartRow

Set wksRawData = Application.Worksheets("RawData")
Set wksCleanData = Application.Worksheets("CleanData")

'Walk through the RawData sheet
For intRawRow = 1 To intMaxRow
If wksRawData.Cells(intRawRow, 1) Like "Account*" Then
strAcctNum = Trim(wksRawData.Cells(intRawRow, AcctNumCol))
strAcctName = Trim(wksRawData.Cells(intRawRow, AcctNameCol))
ElseIf wksRawData.Cells(intRawRow, 1) Like "??##" Then

'This row is a transaction, write in the account number and name
wksCleanData.Cells(intCleanRow, 1) = strAcctNum
wksCleanData.Cells(intCleanRow, 2) = strAcctName

'Initialize the beginning of the info range
Set rngRawData = wksRawData.Cells(intRawRow, Cols(0))

'Build the range with the columns we need
For intColCnt = 1 To UBound(Cols)
Set rngTemp = wksRawData.Cells(intRawRow, Cols(intColCnt))
Set rngRawData = Application.Union(rngRawData, rngTemp)
Next intColCnt

Set rngTemp = wksCleanData.Cells(intCleanRow, 3)

rngRawData.Copy rngTemp

intCleanRow = intCleanRow + 1
End If

Next intRawRow

'Release all objects
Set rngTemp = Nothing
Set rngRawData = Nothing
Set wksCleanData = Nothing
Set wksRawData = Nothing

Application.ScreenUpdating = True

End Sub
 
J

Jim Cone

Try the code below.
I have added the rngRow variable (it eliminates some dots).
You should change the row variable declarations from Integer to Long.
I have no way of testing it so please provide some feedback.
Also, some helpful posting tips here...
http://www.cpearson.com/excel/newposte.htm

Public Sub CleanUp(intMaxRow As Integer, AcctNumCol As Integer, _
AcctNameCol As Integer, ParamArray Cols() As Variant)
Const StartRow As Integer = 2
Dim wksRawData As Worksheet
Dim wksCleanData As Worksheet
Dim rngRawData As Range
Dim rngTemp As Range
Dim rngRow As Range

Dim intRawRow As Integer
Dim intCleanRow As Integer
Dim strAcctNum As String
Dim strAcctName As String
Dim intColCnt As Integer

Application.ScreenUpdating = False
intCleanRow = StartRow

Set wksRawData = Application.Worksheets("RawData")
Set wksCleanData = Application.Worksheets("CleanData")

'Walk through the RawData sheet
For intRawRow = 1 To intMaxRow
Set rngRow = wksRawData.Rows(intRawRow).Cells
If rngRow(1) Like "Account*" Then
strAcctNum = Trim(rngRow(AcctNumCol))
strAcctName = Trim(rngRow(AcctNameCol))
ElseIf rngRow(1)(1) Like "??##" Then

'This row is a transaction, write in the account number and name
wksCleanData.Cells(intCleanRow, 1) = strAcctNum
wksCleanData.Cells(intCleanRow, 2) = strAcctName

'Initialize the beginning of the info range
Set rngRawData = rngRow(Cols(0))
'Build the range with the columns we need
For intColCnt = 1 To UBound(Cols)
Set rngTemp = rngRow(Cols(intColCnt))
Set rngRawData = Application.Union(rngRawData, rngTemp)
Next intColCnt

Set rngTemp = wksCleanData.Cells(intCleanRow, 3)
rngRawData.Copy rngTemp
intCleanRow = intCleanRow + 1
End If
Next intRawRow

'Release all objects
Set rngRow = Nothing
Set rngTemp = Nothing
Set rngRawData = Nothing
Set wksCleanData = Nothing
Set wksRawData = Nothing
Application.ScreenUpdating = True
End Sub
--
Jim Cone
San Francisco, USA
http://www.realezsites.com/bus/primitivesoftware
(Excel Add-ins / Excel Programming)





"Carlos Nunes-Ueno"
wrote in message
I work in an accounting firm and for audits we sometimes get general ledger
documents in Excel that we then process to see if there are interesting
patterns, etc.
The problem is that most of the time, these are basically puked into Excel
in print formats by whatever accounting application the client is using.
This means that instead of each transaction having it's own row with all of
the information, some of the information might be in a simulated "header" or
"footer" and there might be random columns inserted willy-nilly to replicate
how it looks on paper.
To address this, I usually hack out a little custom function that takes the
transactions from the raw data and puts them in neat rows in another
worksheet. My issue is that these tend to be fairly slow and resource
intensive. When they run, Excel immediately jumps to 100% CPU usage and
processing around 3500 rows takes five minutes or sometimes more. I'm much
more comfortable with Access VBA than with Excel and therefore I get the
feeling that there is probably a lot of optimization that could done to the
code. I'd appreciate any ideas that could speed up this code.
Anyway, without further ado, here's an example:

Public Sub CleanUp(intMaxRow As Integer, AcctNumCol As Integer, AcctNameCol
As Integer, ParamArray Cols() As Variant)
'Walk down the RawData sheet and look for one of two patterns,
"Account*" or "??##"

'If "Account*" is found, then copy the Account Name and Number to string
variables
'these will be used later for copying into the CleanData sheet.

'If "??##" this row is a transaction, so start building a discontinuous
range using the
'columns specified in the Cols paramarray.

Const StartRow As Integer = 2

Dim wksRawData As Worksheet
Dim wksCleanData As Worksheet
Dim rngRawData As Range
Dim rngTemp As Range

Dim intRawRow As Integer
Dim intCleanRow As Integer
Dim strAcctNum As String
Dim strAcctName As String
Dim intColCnt As Integer

Application.ScreenUpdating = False

intCleanRow = StartRow

Set wksRawData = Application.Worksheets("RawData")
Set wksCleanData = Application.Worksheets("CleanData")

'Walk through the RawData sheet
For intRawRow = 1 To intMaxRow
If wksRawData.Cells(intRawRow, 1) Like "Account*" Then
strAcctNum = Trim(wksRawData.Cells(intRawRow, AcctNumCol))
strAcctName = Trim(wksRawData.Cells(intRawRow, AcctNameCol))
ElseIf wksRawData.Cells(intRawRow, 1) Like "??##" Then

'This row is a transaction, write in the account number and name
wksCleanData.Cells(intCleanRow, 1) = strAcctNum
wksCleanData.Cells(intCleanRow, 2) = strAcctName

'Initialize the beginning of the info range
Set rngRawData = wksRawData.Cells(intRawRow, Cols(0))

'Build the range with the columns we need
For intColCnt = 1 To UBound(Cols)
Set rngTemp = wksRawData.Cells(intRawRow, Cols(intColCnt))
Set rngRawData = Application.Union(rngRawData, rngTemp)
Next intColCnt

Set rngTemp = wksCleanData.Cells(intCleanRow, 3)

rngRawData.Copy rngTemp

intCleanRow = intCleanRow + 1
End If
Next intRawRow

'Release all objects
Set rngTemp = Nothing
Set rngRawData = Nothing
Set wksCleanData = Nothing
Set wksRawData = Nothing
Application.ScreenUpdating = True
End Sub
 
J

Jim Cone

Correction...

ElseIf rngRow(1)(1) Like "??##" Then

Should read...

ElseIf rngRow(1) Like "??##" Then
'--
Jim Cone
 
C

Carlos Nunes-Ueno

Hi Jim,

Thanks for the reply and the link to the etiquette guide. I've been crazy
busy so I just finally got to check the group again. That and the news
server I had been using refused to show the post for days.

I tried your modified code (including changing the integers to longs) and it
definitely was faster. Running about 3300 rows took about two minutes this
time, and last time it would have been around five or so. I still wouldn't
cherish running 50,000 rows or anything like that but this is definitely an
improvement.

Just one more general question, is it usually faster to use a range object
than the .cells property of the worksheet for grabbing rows like this?

Thanks,

Carlos
 
J

Jim Cone

Carlos,
Eliminating dots speeds up code. That is especially helpful when
using loops as the effect is multiplied by the number of loops used.
If you can get a dot out of a loop do it.
Otherwise (not in a loop), I generally use a Range object or a With statement
if three or more dots can be replaced.

And from a post by Tushar Mehta, MS MVP, about 4 years ago...
'------------------
10,000 loops consisting of 4 statements, each setting the same variable
to a different range:
Time Rank
Set r = Range("$A$1", "$A$1") 0.090113 1
Set r = Range("A1", "A1") 0.1058 2

Set r = Range("$A$1") 0.177712 3
Set r = Range("A1") 0.180887 4
Set r = Cells(1, 1) 0.19815 5

Set r = Cells(1, "A") 0.308837 6
Set r = [A1] 0.621438 7

[The times are the average over 10 cycles and represent the time for
40,000 Set operations and one procedure call, since each test was in a
separate procedure.]
'------------------
Jim Cone
San Francisco, USA
http://www.realezsites.com/bus/primitivesoftware
(Excel Add-ins / Excel Programming)



"Carlos Nunes-Ueno"
wrote in message
Hi Jim,
Thanks for the reply and the link to the etiquette guide. I've been crazy
busy so I just finally got to check the group again. That and the news
server I had been using refused to show the post for days.

I tried your modified code (including changing the integers to longs) and it
definitely was faster. Running about 3300 rows took about two minutes this
time, and last time it would have been around five or so. I still wouldn't
cherish running 50,000 rows or anything like that but this is definitely an
improvement.
Just one more general question, is it usually faster to use a range object
than the .cells property of the worksheet for grabbing rows like this?

Thanks,
Carlos



"Jim Cone"
wrote in message
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top