XL2003: VBE Won't let me break into running code

C

CTB

Hello all,

I'm trying to manipulate huge text files (1 - 4 GB each) with the
TextStream object. Each file contains 6 months of data...I'm trying
to create 6 smaller files with one month of data each. The first step
in the process is reading through the whole file one line at a time so
I can get a total number of lines (any advice to do this faster/
alternate way would be greatly appreciated...I looked, didn't see a
"line count" property of the TextStream or File object so I could
avoid reading whole file). This part of the first file (smallest
file) took 5 - 10 mins to run. I use the total line count for
progress reporting in Excel's progress bar for the next
part...building the 6 monthly files from the main file.

At some point when the code is running (I don't know what point it
is...might change each time code is run), the VBE will no longer let
me break into running code ([Ctrl] + [Pause/Break]) for debuging...it
just stops running code...it no longer offers me the option to stop or
debug.

The time before last, I had a breakpoint set just after getting total
line count and a "Break when value is true" watch set just a few lines
before finishing the first monthly file (roughly 1/6th of the way
through the 2nd full read through of the main file) so I could debug
the code closing the first monthly file, creating the 2nd monthly
file, then continuing to write data. Breakpoint broke into code
without problems (VBE not yet tripped up), but "Break..." watch was
ignored.

This last time running code, it made it through the Total Line Count
run through, but threw an error (Data Type mismatch...if I remember
correctly). Instead of giving me the option to stop or debug...it
just stopped the code...I really need to be able to debug this.

XL/VBE goes into "Not Responding" mode (but code is still running)
usually when Windows Focus changes to another program. Maybe that is
when the VBE trips up and won't let a break-in.

Anyone else have issues with this? Is there anything that can be done
to fix this (other than staring at XL the whole time code is running
hoping a reminder won't pop up to take focus away from XL...hoping XL
won't go into "Not Responding" mode)? Is XL 2007's VBE more stable
that it will be able to handle this better (I do have XL2007 installed
on this machine)?

Thanks for any help anyone can provide,

CTB
 
G

GS

I've never worked with text files as large as GBs but I know that MB
files process fairly quickly using arrays and normal VB I/O (as opposed
to FileSystemObject).

Essentially, I read the entire file into a Variant variable in one shot
and then use the Split() function to dump it into another Variant
variable, resulting in a dynamic array. If the file contains data
records AND the first line contains the data fieldnames, the UBound of
the array is the record count. If the first line doesn't contain
fieldnames then the line count is the UBound+1.

To parse the data into separate months would be no problem by looping
the array checking which month each element belongs to and dumping that
into a String variable. If you set up a String variable for each month
then you should only have to loop the array once, and use a Select Case
construct to build the separate month strings.

Once you've parsed the file into appropriate month strings you can
write them back to individual files in one shot (each).

Keep in mind that handling files of this size will burden resources
some, but if your machine is fit for such tasks it should be no problem
to do.

You can run a progress bar in the array loop, OR report the progress in
the StatusBar for each line as it gets processed.

Summary:
1. Read source file into array in one shot.
2. Parse data into separate month strings using one loop.
3. Write month strings in one shot to separate files.
 
J

Jim Cone

It sounds as if you are running out of memory.
Do not use "ReadAll". It will do that.

Use: "Application.EnableCancelKey = Interrupt" and use DoEvents inside of any loops.

Your posted code did not show up in my newsreader.
The following function will take < 10 minutes to return the number of lines in a text file...
"---
Function NumberOfLines(sFilePath As String) As Long
Dim fs As Object
Dim f As Object
Set fs = CreateObject("Scripting.FileSystemObject")
Set f = fs.OpenTextFile(sFilePath, 8) ' 8 -> Appending
NumberOfLines = f.Line - 1
f.Close
Set f = Nothing
Set fs = Nothing
End Function
'---
Jim Cone
Portland, Oregon USA .
http://www.mediafire.com/PrimitiveSoftware .
(free and commercial excel programs)






"CTB" <[email protected]>
wrote in message
Hello all,

I'm trying to manipulate huge text files (1 - 4 GB each) with the
TextStream object. Each file contains 6 months of data...I'm trying
to create 6 smaller files with one month of data each. The first step
in the process is reading through the whole file one line at a time so
I can get a total number of lines (any advice to do this faster/
alternate way would be greatly appreciated...I looked, didn't see a
"line count" property of the TextStream or File object so I could
avoid reading whole file). This part of the first file (smallest
file) took 5 - 10 mins to run. I use the total line count for
progress reporting in Excel's progress bar for the next
part...building the 6 monthly files from the main file.

At some point when the code is running (I don't know what point it
is...might change each time code is run), the VBE will no longer let
me break into running code ([Ctrl] + [Pause/Break]) for debuging...it
just stops running code...it no longer offers me the option to stop or
debug.

The time before last, I had a breakpoint set just after getting total
line count and a "Break when value is true" watch set just a few lines
before finishing the first monthly file (roughly 1/6th of the way
through the 2nd full read through of the main file) so I could debug
the code closing the first monthly file, creating the 2nd monthly
file, then continuing to write data. Breakpoint broke into code
without problems (VBE not yet tripped up), but "Break..." watch was
ignored.

This last time running code, it made it through the Total Line Count
run through, but threw an error (Data Type mismatch...if I remember
correctly). Instead of giving me the option to stop or debug...it
just stopped the code...I really need to be able to debug this.

XL/VBE goes into "Not Responding" mode (but code is still running)
usually when Windows Focus changes to another program. Maybe that is
when the VBE trips up and won't let a break-in.

Anyone else have issues with this? Is there anything that can be done
to fix this (other than staring at XL the whole time code is running
hoping a reminder won't pop up to take focus away from XL...hoping XL
won't go into "Not Responding" mode)? Is XL 2007's VBE more stable
that it will be able to handle this better (I do have XL2007 installed
on this machine)?

Thanks for any help anyone can provide,

CTB
 
S

spilly39

I'd like to see the code you are using to read the lines
- shouldn't be more than a few lines of CODE (as opposed to DATA that is!)

I've handled large log files, tho' not in the GB range

It's vital you do not read the whole file, but just a block at a time and
within that only one logical line at a time. My issue was that some data
files were exUnix (Lf) and other data ex Windoze (CrLf line terminated) and
VB Line Input statement needs CRLF data.

My solution was to write a class (UnixDosReader) with properties of
filename, size etc, and methods of fOpen, GetNextLine and fClose

I'm looking at my code now (not having used it since I retired)
I see I handled the file unblocking in VBA myself if it was a Unix file,
reading one block of data (of whatever size I chose) at a time.

What I did was to scan the 1st lump of the data at file open time to
determine if it was LF or CRLF data
If CRLF in block 1, then my GetNextLine method reads the file one line at at
time using std VB IO code
Line Input #pFileNo, dataLine

If it's Unix it gets a bit more complicated, with a Private Sub inside the
class:
ReadaBlock(s as String, Size as Integer) 'size is size of block buffer
which internally did
Get #pFile,,s '(read into s String argument)
You have to gwt it right at the end of the file, which doesn't fill your
standard blocksize.

Altogether there's 273 lines of code (of which the 1st 94 are solid
comments)

I could send the code - it's a bit long to post here

BUT I have one little nagging worry...
FileSize is typed as Long, which on my 32 bit machine only gets to 2GB. I
assume you're on a 64 bit box? Just checking..
You'll have to look carefully at any such Typing matters if you want to use
my code

spilly



CTB said:
Hello all,

I'm trying to manipulate huge text files (1 - 4 GB each) with the
TextStream object. Each file contains 6 months of data...I'm trying
to create 6 smaller files with one month of data each. The first step
in the process is reading through the whole file one line at a time so
I can get a total number of lines (any advice to do this faster/
alternate way would be greatly appreciated...I looked, didn't see a
"line count" property of the TextStream or File object so I could
avoid reading whole file). This part of the first file (smallest
file) took 5 - 10 mins to run. I use the total line count for
progress reporting in Excel's progress bar for the next
part...building the 6 monthly files from the main file.

At some point when the code is running (I don't know what point it
is...might change each time code is run), the VBE will no longer let
me break into running code ([Ctrl] + [Pause/Break]) for debuging...it
just stops running code...it no longer offers me the option to stop or
debug.

The time before last, I had a breakpoint set just after getting total
line count and a "Break when value is true" watch set just a few lines
before finishing the first monthly file (roughly 1/6th of the way
through the 2nd full read through of the main file) so I could debug
the code closing the first monthly file, creating the 2nd monthly
file, then continuing to write data. Breakpoint broke into code
without problems (VBE not yet tripped up), but "Break..." watch was
ignored.

This last time running code, it made it through the Total Line Count
run through, but threw an error (Data Type mismatch...if I remember
correctly). Instead of giving me the option to stop or debug...it
just stopped the code...I really need to be able to debug this.

XL/VBE goes into "Not Responding" mode (but code is still running)
usually when Windows Focus changes to another program. Maybe that is
when the VBE trips up and won't let a break-in.

Anyone else have issues with this? Is there anything that can be done
to fix this (other than staring at XL the whole time code is running
hoping a reminder won't pop up to take focus away from XL...hoping XL
won't go into "Not Responding" mode)? Is XL 2007's VBE more stable
that it will be able to handle this better (I do have XL2007 installed
on this machine)?

Thanks for any help anyone can provide,

CTB
 
G

GS

spilly39 expressed precisely :
I'd like to see the code you are using to read the lines
- shouldn't be more than a few lines of CODE (as opposed to DATA that is!)

I've handled large log files, tho' not in the GB range

It's vital you do not read the whole file, but just a block at a time and
within that only one logical line at a time. My issue was that some data
files were exUnix (Lf) and other data ex Windoze (CrLf line terminated) and
VB Line Input statement needs CRLF data.

My solution was to write a class (UnixDosReader) with properties of filename,
size etc, and methods of fOpen, GetNextLine and fClose

I'm looking at my code now (not having used it since I retired)
I see I handled the file unblocking in VBA myself if it was a Unix file,
reading one block of data (of whatever size I chose) at a time.

What I did was to scan the 1st lump of the data at file open time to
determine if it was LF or CRLF data
If CRLF in block 1, then my GetNextLine method reads the file one line at at
time using std VB IO code
Line Input #pFileNo, dataLine

If it's Unix it gets a bit more complicated, with a Private Sub inside the
class:
ReadaBlock(s as String, Size as Integer) 'size is size of block buffer
which internally did
Get #pFile,,s '(read into s String argument)
You have to gwt it right at the end of the file, which doesn't fill your
standard blocksize.

Altogether there's 273 lines of code (of which the 1st 94 are solid comments)

I could send the code - it's a bit long to post here

BUT I have one little nagging worry...
FileSize is typed as Long, which on my 32 bit machine only gets to 2GB. I
assume you're on a 64 bit box? Just checking..
You'll have to look carefully at any such Typing matters if you want to use
my code

spilly


Thanks for stepping in!
I ran into the same issues with reading larger files in all at once.
Like you I've never done GB-size files but have had to read files as
large as 10MB in blocks as you describe. For me, however, this was a
onetime workaround and so I never bothered to refine it beyond that.

Your solution sounds exactly what this OP needs. I'd also be very
interested if you're willing to share.
 
S

spilly39

GS said:
spilly39 expressed precisely :


Thanks for stepping in!
I ran into the same issues with reading larger files in all at once. Like
you I've never done GB-size files but have had to read files as large as
10MB in blocks as you describe. For me, however, this was a onetime
workaround and so I never bothered to refine it beyond that.

Your solution sounds exactly what this OP needs. I'd also be very
interested if you're willing to share.

--
Garry

Free usenet access at http://www.eternal-september.org
ClassicVB Users Regroup! comp.lang.basic.visual.misc

Hi Garry

That was quick!

I see no reason not to share.
In fact I'd be willing to post an attachment here if that's allowed, but I
doubt it is.

I'm not a big NG man, so I'm unsure of the Netiquette;
I'm really only subscribed (today!) becuase of another prob I've got
but when I saw this thread, it was a dead cert for my UnixDosReader class...
Oh sod it! I'll try an attachment anyway - it's got the usual chances:
Success or Failure!
Health warning:
I'm a totally self taught VBA afficionado.
I'm always impressed with my code when I write it
but never impressed when I read it again two years later

emails from me to GS (those are MY initials!!) at somehwere dot net ain't
likely to reach you.
However if you email me on spilly39 buttons (my other prob is to do with
ActiveX buttons), that's a valid but disposable email address, & I should
get your mail.

Listening out
spilly
 
C

Clif McIrvin

spilly39 said:
[ ]

Hi Garry

That was quick!

I see no reason not to share.
In fact I'd be willing to post an attachment here if that's allowed,
but I
doubt it is.

I'm not a big NG man, so I'm unsure of the Netiquette;
I'm really only subscribed (today!) becuase of another prob I've got
but when I saw this thread, it was a dead cert for my UnixDosReader
class...
Oh sod it! I'll try an attachment anyway - it's got the usual chances:
Success or Failure!
Health warning:
I'm a totally self taught VBA afficionado.
I'm always impressed with my code when I write it
but never impressed when I read it again two years later

Boy, that song sure sounds familiar! <grin>

300 lines of code? Just copy / paste into your reply and we'll all get
it just fine!

(as you may know by now, the attachment didn't make it through the
eternal september servers)

Welcome to the NGs! [Good tutors in this classroom!]
 
S

spilly39

Oh! I thought it had. My reader shows the attachment fine

OK, Here's the code anyway (and I've snipped the other thread text)
Enjoy!

===========
'NB THIS IS A .CLS, not a .BAS

'How to use
'Dim UnixFile As UnixDosReader
'Dim Buff as String 'record area
'
' 'Initialisation: Create new instance of Class
' set UnixFile = New UnixDosReader
'
'Open file
'
' UnixFile.fOpen CStr(FilesToOpen(Input_File))
'
'
'
'Process to end of file
'
' Do Until EOF(UnixFile.FileNo)
' UnixFile.GetNextLine Buff
' If Err.Number = 0 Then
'
' end if
' Loop
'Close
' UnixFile.fClose
'
'
Option Explicit
'**************************************************************
' This class reads sequential files written to
' EITHER Unix OR DOS line terminator standards
' It is NOT suitable for files without a CR, LF or CRLF row terminator
' (e.g. where records have a record length rather than a terminator)
'
' It behaves the same as the standard VB 'Line Input' statement
' In particular, the EOF flag is set when the last data line
' is returned.
'
' TO USE:
' =======
' Instantiate UnixDosReader in the normal way (Set obj as New UnixDosReader)
'
' Methods:
' --------
' expression.fOpen(FullFileName) 'Required, String. Holds the fully
qualified filename
' Check the Err object
after each fOpen method
' (see ERROR HANDLING
below)
' expression.GetNextLine(dataLine) 'Required, String. On return holds the
next line of data
' expression.fClose 'No arguments. Closes the file and
frees the unit number
'
' Properties:
' -----------
' FileNo ReadOnly Long. Returns the file
number see FreeFile() function
' To test for EOF:
' If EOF(obj.FileNo) = True then...
' FileSize ReadOnly Long. Returns the filesize
in Bytes
' FullFileName ReadOnly String. Returns the fully
qualified Name (Drive, Path & FileName)
' FileName ReadOnly String. Returns the FileName
and extension
'
' NB: ERROR HANDLING
' An error may occur in the fOpen method, if the file cannot be opened.
' The user must check the Err.Number property after each call to fOpen
'
' obj.fOpen myPath & "\" & myFileName
' if Err.Number > 0 then
' ... your error handling code
' e.g.
' Err.Raise Err.Number
' Err.Clear
' End If
'
' Note that use of the 'On Error' statement is not explicitly required.
' Your error handling code may be as simple as the example shown
' Err.Raise Err.Number
' Err.Clear
' This will cause the standard VB Run-Time Error
' NB You should call Err.Clear at the end of your error handler
'
'
' Written by Graham Spillman, DEJCP, July 2002
' 01732 475304
'***************************************************************

'History
'28/08/02 Fix to set EOF on reading a file of zero bytes length
'22/03/04 Fix to cope with block null padding to fixed size
' The last row returned from the file (of all nulls) can be detected
by
' if len(Record_Buffer) <> 0 then
' If Asc(Left(Record_Buffer, 1)) = 0 Then (probably) Exit Do
' end if
'26/03/04 Moved the code to size the Buffer from fOpen into ReadaBlock
' There is a tiny risk that reading a Unix file of a certain size
' might exit prematurely from GetNextLine2

'exposed properties
Private pFileNo As Long
Private pFileSize As Long
Private pFullFileName As String
Private pFileName As String

'Private variables
'File Control area
Private BytesRead As Long
Private BuffPos As Integer
Private LineDelim As String
Private Buffer As String

Private OneByte As String

Private Const MaxBlock As Integer = 1024 * 4 'defines the block size
for reading

Property Get FileNo() As Long
FileNo = pFileNo
End Property
Property Get FileSize() As Long
FileSize = pFileSize
End Property
Property Get FullFileName() As String
FullFileName = pFullFileName
End Property
Property Get Filename() As String
Filename = pFileName
End Property
Sub fOpen(FullFileName As String)
'On return user must test Err.Number for standard VB file open errors
Dim i As Integer
i = FreeFile
On Error Resume Next
Open FullFileName For Input As #i
If Err.Number <> 0 Then Exit Sub 'I failed to open the file

Close #i
pFileNo = i
Open FullFileName For Binary As #pFileNo
're-initialise the file control area
BytesRead = 0
BuffPos = 0
'26/03/04 2 lines code moved to ReadaBlock. grjs
'Buffer = String(MaxBlock, " ")
'OneByte = String(1, " ")
LineDelim = ""

'setup FullFileName, FileName & FileSize properties
pFullFileName = FullFileName
pFileName = fGetShortName(FullFileName)
pFileSize = LOF(pFileNo)

End Sub
Sub fClose()
Close pFileNo
pFileNo = 0
BytesRead = 0
BuffPos = 0
Buffer = ""
OneByte = ""
LineDelim = ""
End Sub
Sub GetNextLine(dataLine As String)
'GetNextLine(varname)
'Written to handle Unix text files as quickly & conveniently as PC format
files
'May be used as a direct replacement for Line Input #pFileNo, varname e.g.
' obj.GetNextLine(varname) is the exact equivalent of
' Line Input #pFileNo, varname
'In particular, EOF is set correctly, so that the following is valid
'
' Do until obj.EOF(pFileNo)
' obj.GetNextLine(varname)
' ...
' Loop
'GRJS Jul 2002

Select Case pFileNo
Case 0, Is > 511
Err.Raise vbObjectError + 100, "UnixDosRead FATAL Error", _
"GetNextLine called without using fOpen
method to open the file"
End
End Select

'this block examines the file one byte at a time, until the 1st
terminator cahracter is seen
'it is only executed for the first record;
'thereafter BytesRead is > 0, and performance improves
If BytesRead = 0 Then
'first I-O for this File - locate EOL delimiter
dataLine = ""
' Stop
Do Until Right(dataLine, 1) = vbCr _
Or Right(dataLine, 1) = vbLf _
Or EOF(pFileNo) _
Or pFileSize = 0 '28/08/02 allow for zero length
files; added final Or term
ReadaBlock OneByte, 1
dataLine = dataLine & OneByte
Loop

LineDelim = Right(dataLine, 1)
'dataLine = Left(dataLine, Len(dataLine) - 1)
'28/08/02 allow for zero length files
dataLine = Left(dataLine, Len(dataLine) - Len(LineDelim))
'28/08/02 allow for zero length files
Select Case LineDelim
Case vbLf
Case vbCr
ReadaBlock Buffer, MaxBlock
If Left(Buffer, 1) = vbLf Then
LineDelim = "L" 'suitable for
native Line Input
Close pFileNo 'so close the
file & re-open it
Open pFullFileName For Input As #pFileNo 'and handle
entire file in GetNextLine2
Buffer = ""
Call GetNextLine2(dataLine)
Exit Sub
End If
End Select
If Loc(pFileNo) = pFileSize Then
OneByte = Input(1, #pFileNo)
Exit Sub
End If
ReadaBlock Buffer, MaxBlock
Else
'this is the normal code to read all except the first record
Call GetNextLine2(dataLine)
End If

End Sub
Private Sub GetNextLine2(dataLine As String)

Dim eol As Integer 'EndOfLine pointer

'Normal condition executes this block only
If LineDelim = "L" Then 'the file's delimiters are suitable for Line
Input
Line Input #pFileNo, dataLine
Else 'the file's delimiter is vbLf (Unix)
or vbCr (Mac)
'unblock a record from the current block
dataLine = ""
eol = InStr(BuffPos + 1, Buffer, LineDelim)
Do Until eol <> 0 'eol=0 when a record crosses block
boundaries
If BytesRead + Len(Buffer) = pFileSize Then Exit Do
'must get next block for continuation
dataLine = dataLine & Right(Buffer, Len(Buffer) - BuffPos)
ReadaBlock Buffer, MaxBlock
BuffPos = 0
'find next delimiter
eol = InStr(BuffPos + 1, Buffer, LineDelim)
Loop
If eol <> 0 Then
dataLine = dataLine & Mid(Buffer, BuffPos + 1, eol - BuffPos -
1)
Else
dataLine = dataLine & Buffer
eol = Len(Buffer)
End If
BuffPos = eol
If BuffPos = Len(Buffer) And Loc(pFileNo) = pFileSize Then
'set EOF condition
OneByte = Input(1, #pFileNo)
End If
End If
End Sub
Private Sub ReadaBlock(s As String, Size As Integer)

BytesRead = BytesRead + Size
If BytesRead > pFileSize Then
BytesRead = BytesRead - Size 'restore orig value of
BytesRead
s = String(pFileSize - BytesRead, " ") 'adjust size of final block
'26/03/04 code to size Buffer moved here grjs
Else
If Len(s) <> Size Then s = String(Size, " ") '1st time sizes
Buffer
End If

Get #pFileNo, , s 'read the data

End Sub
Private Function fGetShortName(FullFileName As String) As String
Dim i As Integer
i = Len(FullFileName)
Do Until Mid(FullFileName, i - 1, 1) = "\"
i = i - 1
Loop
fGetShortName = Right(FullFileName, Len(FullFileName) - i + 1)
End Function

===========
 
G

GS

Clif McIrvin formulated on Wednesday :
spilly39 said:
[ ]

Hi Garry

That was quick!

I see no reason not to share.
In fact I'd be willing to post an attachment here if that's allowed, but I
doubt it is.

I'm not a big NG man, so I'm unsure of the Netiquette;
I'm really only subscribed (today!) becuase of another prob I've got
but when I saw this thread, it was a dead cert for my UnixDosReader
class...
Oh sod it! I'll try an attachment anyway - it's got the usual chances:
Success or Failure!
Health warning:
I'm a totally self taught VBA afficionado.
I'm always impressed with my code when I write it
but never impressed when I read it again two years later

Boy, that song sure sounds familiar! <grin>

300 lines of code? Just copy / paste into your reply and we'll all get it
just fine!

(as you may know by now, the attachment didn't make it through the eternal
september servers)

Welcome to the NGs! [Good tutors in this classroom!]

Actually, I did get the attachment. As you can see, I'm using MesNews
as my reader and so attachments are supported. It shows up as a link
("File: unixdosreader.cls") in my message window.
 
G

GS

Hi Graham,
Thanks so much! I got your attachment as an attachment in the message
window.

This looks well thought out. The detailed comments are a blessing,
though this will take some time to digest. Can't wait to do a test
drive...<bg>
 
C

Clif McIrvin

spilly39 said:
Oh! I thought it had. My reader shows the attachment fine

OK, Here's the code anyway (and I've snipped the other thread text)
Enjoy!

[ ]

My bad. I'm using OE; and I completely missed the "This message has an
attachment" flag.

When I went back and checked, sure enough ... I can get the original
attachment also.

Thanks much for sharing your work!
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top