VBSript to read multiple CSV files & write out to 1 CSV file

L

LenJr

Not sure if this is the right forum to post this question but here goes.
I have many CSV files that I need to read and pull some data from them and
write out one CSV file. I will need to search through folders set up like
this:
C:\Test_ResourcePlanning\2008\08\06\Acquisition_Project_Team
C:\Test_ResourcePlanning\2008\08\06\Commercial_Team
So basically the CSV files would be in the last "Team" folder listed. So
only CSV files would under the 2008 and I have one for 2007 and will have one
for 2009. In the example above the 08 is the month and the 06 is the first
Monday on the month. The CSV file is comma delimited with headings. I would
like to only write out certain records and fields. For example:
EmpID, Team, PMID, Date, Type
I would like to write out the EmpId, Team, Date, Type fields and only the
records with Type = 0. Once all files are read and written to the 1 CSV
file I would like to zip(WinZip) it. The final CVS file will have about
250,000 records on it and will eventually be imported into an Access
database. In which on open I will run a script to import the zipped csv file
and unzip and load into an Access table. The reason for this is the CSV file
will be on a shared server and I have people using VPN and the response time
using an access database on the server is VERY slow. The application runs
much better with the data on the local drive. With the data in CSV format
and beeing zipped it is only about 600k and can be imported via VPN in a
reasonable time.

Thanks,
Len
 
L

LenJr

Thanks for the input Gordon,
Here is an example of a record:
"EmpID","Team","PMID",Date,Type
Text fields have double "s and numeric fields do not.

My issue is the size of the zip file that will be imported into the
application so I really would like(if possible) to edit the CSV files before
I zip them to the 1 final cvs file. So that is why I would like to exclude
records with Type not = 0. This script will be running after hours on the
server using a scheduler so I am not too worried about time. I think I am Ok
once I get the zipped file. I have code to unzip and code to import the CSV
file to a table in the Access DBs.

Len
 
L

LenJr

Thanks again Gordon,
Having a little trouble though...
getting an error on this line:
Open mySearchPath & "maker.bat" For Output As #fh
"Expected end of statement" 800A0401

Any idea?
 
G

gllincoln

Hi Len,

Umm... a failure to communicate here. Are you running this as an ASP or
Wscript execution? I wrote it as an Access function - VBA, not VBScript.

VBScript does not support the older DOS file open syntax, you have to use
the filescriptingobject. Also the various Dim statements may fail because
VBScript only supports variant type variables.

If you create a code module in Access - paste the function into the module,
use a macro to run it, it will work. If I have the time after I finish what
I am working on, i will try to convert the function to *.vbs code.

Cordially,
Gordon
 
G

gllincoln

HI Len,

Here is an update. I was a little tired when I posted that last version. I left the 'as string in the const declarations. That probably won't work.

Hope this helps,
Gordon


==================================================

Const mySearchPath = "C:\Test_ResourcePlanning\"
Const myTargetFile = "C:\Test_ResourcePlanning\target.txt"
Const ForReading = 1, ForWriting = 2, ForAppending = 8

Dim myHdr 'stores the header
Dim myRow 'holds a row of data
Dim s 'the current file name
Dim myArray(5) 'stores the 5 cols
Dim myLen 'stores the char count of row
Dim ptr 'line marker
Dim x 'loop counter
Dim y 'array index var
Dim myExportLine 'the output row to target.txt
Dim WShell 'Shell object
Dim zf 'count the files processed
Dim zr 'count the records found

Dim fso 'filesystem object
Dim myIn 'import file object
Dim myOut 'export file object
Dim myWork 'work file object

Dim flag 'output this row?

Set fso = CreateObject("Scripting.FileSystemObject")
Set myWork = fso.CreateTextFile(mySearchPath & "maker.bat", True)
myWork.WriteLine("Dir " & mySearchPath & "*.csv /s /b > " & mySearchPath & "filelist.txt")
myWork.Close

Set WShell = WScript.CreateObject("WScript.Shell")
WShell.Run mySearchPath & "maker.bat", 3, True
Set WShell = Nothing

Set myWork = fso_OpenTextFile(mySearchPath & "filelist.txt", ForReading)
Set myOut = fso.CreateTextFile(mySearchPath & "target.txt", True)

myHdr = Chr(34) & "EmpID" & Chr(34) & Chr(44) & Chr(34) & "Team" & Chr(34) & Chr(44) & Chr(34)
myHdr = myHdr & "PMID" & Chr(34) & Chr(44) & "Date" & Chr(44) & "Type"

myOut.WriteLine(myHdr)

'Read our file list
Do While Not myWork.AtEndOfStream
s = myWork.ReadLine
zf = zf + 1
'open a source file
Set myIn = fso_OpenTextFile(s, ForReading)
Do While Not myIn.AtEndOfStream
y = 5
'read a row
myRow = myIn.ReadLine
myLen = Len(myRow)
'skip a header
If Right(Trim(myRow), 4) <> "Type" Then
flag = True
For x = myLen To 1 Step -1
'start of col marker
If Mid(myRow, x, 1) = Chr(44) Then
'the type col
If y = 5 Then
myArray(y) = Trim(Mid(myRow, x + 1))
'store boundary marker -1
ptr = x - 1
'if not type = 0 skip row
If myArray(y) <> 0 Then
flag = False
Exit For
Else
'move index for next col
y = y - 1
End If
'are we there yet daddy?
ElseIf y > 1 Then
myArray(y) = Mid(myRow, x + 1, ptr - x)
'new boundary marker - 1
ptr = x - 1
'next array index
y = y - 1
End If
End If
Next
Else
'hdr row - skip it
flag = False
End If
If flag = True Then
'the rest of the data is the first col
myArray(y) = Left(myRow, ptr)
myExportLine = ""
'build & write the row to target
zr = zr + 1
For x = 1 To 4
myExportLine=myExportLine & myArray(x) & Chr(44)
Next
myExportLine=myExportLine & myArray(5)
myOut.WriteLine(myExportLine)
End If
'get the next row
Loop
myIn.Close
'get the next source file
Loop
myWork.Close
myIn.Close
myOut.Close
MsgBox "We processed " & zf & " files and exported " & zr & " records.", vbInformation, "CSV_Concat Results"
==============================================================
 
L

LenJr

Hi Gordon, Sorry I have not responded...had issues getting the reply to
work...well anyway thanks for the update....I did notice that....no biggy.
Also, I changed things around a bit...I used your code to produce the file
list but then I used the following to process the data. There are many ways
to skin a cat...what do you think?

Const adOpenStatic = 3
Const adLockOptimistic = 3
Const adCmdText = &H0001

Set objConnection = CreateObject("ADODB.Connection")
Set objRecordSet = CreateObject("ADODB.Recordset")
Set objRecordSet1 = CreateObject("ADODB.Recordset")


Do While Not myWork.AtEndOfStream
s = myWork.ReadLine
' MsgBox s
Dim strPathName
Dim strFileName
Dim intFileNameStart
Dim intPathLen
intPathLen = Len(s)
intFileNameStart = InStrRev(s,"\",-1)
strFileName = Right(s,intPathLen - intFileNameStart)
strPathName = Left(s,intFileNameStart - 1)


objConnection.Open "Provider=Microsoft.Jet.OLEDB.4.0;" & _
"Data Source=" & strPathName & ";" & _
"Extended Properties=""text;HDR=YES;FMT=Delimited'"""


objRecordset.Open "SELECT * FROM [" & strFileName & "] WHERE SubNbr =
0", _
objConnection, adOpenStatic, adLockOptimistic, adCmdText

Do Until objRecordset.EOF
myExportLine = Chr(34) & objRecordset.Fields.Item("Team")
myExportLine = myExportLine & Chr(34) & Chr(44) &
objRecordset.Fields.Item("DtStart")
myExportLine = myExportLine & Chr(44) & Chr(34) &
objRecordset.Fields.Item("Desc")
myExportLine = myExportLine & Chr(34) & Chr(44) &
objRecordset.Fields.Item("Week1")
myExportLine = myExportLine & Chr(44) & Chr(34) &
objRecordset.Fields.Item("EmpName")
myExportLine = myExportLine & Chr(34) & Chr(44) & Chr(34) &
objRecordset.Fields.Item("RollUpTeam")
myExportLine = myExportLine & Chr(34) & Chr(44) & Chr(34) &
objRecordset.Fields.Item("Type") & Chr(34)
myOut.WriteLine(myExportLine)
objRecordset.MoveNext
Loop
objRecordset.Close
objConnection.Close
Loop


myWork.Close
myIn.Close
myOut.Close
 
G

gllincoln

Hi Len,

Interesting alternative - I'm curious about the relative speed of the
scripting object vs. the Access ADODB use of the CSV data. Any chance you
could do a run time comparison?

I would think that the file scripting object would be faster for this kind
of task but I could be wrong.

Cordially,
Gordon
 
L

LenJr

Hey Gordon, I would be happy to run a time comparison, but the problem I am
having with your code is I think the sample data I gave you was too simple of
an example....I am having problems working my actual data. So I think I need
more help. Here is the actual data...the first record is the header row and
the second is the corresponding data:

"EmpID","Team","PM1ID","AcctMgrID","DtStart","ItemNbr","SubNbr","Desc","Week1","Week2","Week3","Week4","Comment","EmpName","PM1Name","AcctMgrName","RollUpTeam","Type","Role"

"E999999","Architecture Team","E111111","E222222",2/4/2008
0:00:00,1,0,"Total Planned Hours",30.00,36.00,20.00,10.00,,"David
Smith","Robert Smith","Tom Smith ","Delivery Applications","T","Staff"

The columns that I need in the output are:
Team, DtStart, Desc, Week1, EmpName, RollUpTeam, Type
And this output is based on the SubNbr = 0.

Thanks again and I understand if this is taking up too much of your time....
Len

gllincoln said:
Hi Len,

Interesting alternative - I'm curious about the relative speed of the
scripting object vs. the Access ADODB use of the CSV data. Any chance you
could do a run time comparison?

I would think that the file scripting object would be faster for this kind
of task but I could be wrong.

Cordially,
Gordon


LenJr said:
Hi Gordon, Sorry I have not responded...had issues getting the reply to
work...well anyway thanks for the update....I did notice that....no biggy.
Also, I changed things around a bit...I used your code to produce the file
list but then I used the following to process the data. There are many
ways
to skin a cat...what do you think?

Const adOpenStatic = 3
Const adLockOptimistic = 3
Const adCmdText = &H0001

Set objConnection = CreateObject("ADODB.Connection")
Set objRecordSet = CreateObject("ADODB.Recordset")
Set objRecordSet1 = CreateObject("ADODB.Recordset")


Do While Not myWork.AtEndOfStream
s = myWork.ReadLine
' MsgBox s
Dim strPathName
Dim strFileName
Dim intFileNameStart
Dim intPathLen
intPathLen = Len(s)
intFileNameStart = InStrRev(s,"\",-1)
strFileName = Right(s,intPathLen - intFileNameStart)
strPathName = Left(s,intFileNameStart - 1)


objConnection.Open "Provider=Microsoft.Jet.OLEDB.4.0;" & _
"Data Source=" & strPathName & ";" & _
"Extended Properties=""text;HDR=YES;FMT=Delimited'"""


objRecordset.Open "SELECT * FROM [" & strFileName & "] WHERE SubNbr =
0", _
objConnection, adOpenStatic, adLockOptimistic, adCmdText

Do Until objRecordset.EOF
myExportLine = Chr(34) & objRecordset.Fields.Item("Team")
myExportLine = myExportLine & Chr(34) & Chr(44) &
objRecordset.Fields.Item("DtStart")
myExportLine = myExportLine & Chr(44) & Chr(34) &
objRecordset.Fields.Item("Desc")
myExportLine = myExportLine & Chr(34) & Chr(44) &
objRecordset.Fields.Item("Week1")
myExportLine = myExportLine & Chr(44) & Chr(34) &
objRecordset.Fields.Item("EmpName")
myExportLine = myExportLine & Chr(34) & Chr(44) & Chr(34) &
objRecordset.Fields.Item("RollUpTeam")
myExportLine = myExportLine & Chr(34) & Chr(44) & Chr(34) &
objRecordset.Fields.Item("Type") & Chr(34)
myOut.WriteLine(myExportLine)
objRecordset.MoveNext
Loop
objRecordset.Close
objConnection.Close
Loop


myWork.Close
myIn.Close
myOut.Close
 
G

gllincoln

Hi Len,

I understand why migrating the code was difficult - this new sample data, only exporting a few of the columns, calls for a different approach.

Try this version.

Cordially,
Gordon


Const mySearchPath = "C:\Test_ResourcePlanning\"
Const myTargetFile = "C:\Test_ResourcePlanning\target.txt"

Const ForReading = 1, ForWriting = 2, ForAppending = 8

Dim myHdr 'stores the header
Dim myRow 'holds a row of data
Dim s 'the current file name
Dim myArray(19) 'stores the 19 cols
Dim myOutput(7,1)'stores label, index of output cols
Dim myLen 'stores the char count of row
Dim ptr 'line marker
Dim x 'loop counter
Dim y 'array index var
Dim myExportLine 'the output row to target.txt
Dim WShell 'Shell object
Dim zf 'count the files processed
Dim zr 'count the records found

Dim fso 'filesystem object
Dim myIn 'import file object
Dim myOut 'export file object
Dim myWork 'work file object

Dim NewRow 'boolean marker for new row start
Dim flag 'output this row?

' build our output manager
' 0=header label, 1=index
myOutput(1,0) = "Team"
myOutput(1,1) = 2
myOutput(2,0) = "DStart"
myOutput(2,1) = 5
myOutput(3,0) = "Desc"
myOutput(3,1) = 8
myOutput(4,0) = "Week1"
myOutput(4,1) = 9
myOutput(5,0) = "EmpName"
myOutput(5,1) = 14
myOutput(6,0) = "RollupTeam"
myOutput(6,1) = 17
myOutput(7,0) = "Type"
myOutput(7,1) = 18


Set fso = CreateObject("Scripting.FileSystemObject")
Set myWork = fso.CreateTextFile(mySearchPath & "maker.bat", True)
myWork.WriteLine("Dir " & mySearchPath & "*.csv /s /b > " & mySearchPath & "filelist.txt")
myWork.Close

Set WShell = WScript.CreateObject("WScript.Shell")
WShell.Run mySearchPath & "maker.bat", 3, True
Set WShell = Nothing

Set myWork = fso_OpenTextFile(mySearchPath & "filelist.txt", ForReading)
Set myOut = fso.CreateTextFile(mySearchPath & "target.txt", True)

myHdr = ""

'build our export header
for x = 1 to 7
myHdr = myHdr & chr(34) & myOutput(x,0) & chr(34) & chr(44)
next
myHdr = left(myHdr,len(myHdr)-1)

myOut.WriteLine(myHdr)

'Read our file list
Do While Not myWork.AtEndOfStream
s = myWork.ReadLine
zf = zf + 1
'open a source file
Set myIn = fso_OpenTextFile(s, ForReading)
Do While Not myIn.AtEndOfStream
y = 19
NewRow = True
'read a row
myRow = myIn.ReadLine
myLen = Len(myRow)
'skip a header
If left(Right(Trim(myRow), 5),4) <> "Role" Then
flag = True
For x = myLen To 1 Step -1
'start of col marker
If Mid(myRow, x, 1) = Chr(44) Then

If NewRow = True Then
ptr = x
myArray(y) = Mid(myRow,ptr+1, myLen - ptr)
y = y - 1
NewRow = False
ptr = ptr - 1
Else
myArray(y) = Mid(myRow, x + 1, ptr-x)
ptr = x - 1
y = y - 1

End If
End If
Next
myArray(y) = left(myRow,ptr)
Else
'hdr row - skip it
flag = False
End If

If myArray(7) <> "0" Then flag = False
If flag = True Then
myExportLine = ""
'build & write the row to target
zr = zr + 1
For x = 1 To 7
myExportLine=myExportLine & myArray(myOutput(x,1)) & Chr(44)
Next
myExportLine=left(myExportLine, len(myExportLine)-1)
myOut.WriteLine(myExportLine)
End If
'get the next row
Loop
myIn.Close
'get the next source file
Loop
myWork.Close
myIn.Close
myOut.Close
MsgBox "We processed " & zf & " files and exported " & zr & " records.", vbInformation, "CSV_Concat Results"
 
L

LenJr

Hi Gordon, Thanks again. You were correct...your version runs much quicker.
It ran thru about 56,000 records in about 5 minutes and my version took about
14 minutes. I had to make one change to your version just as an FYI....the
text fields that I am working with have chr(34) qualifiers...so I added the
some code in case a text field had a chr(44) that was not a delimiter.

Thanks for all your help!
Len

'The textField switch is used to identify the start and end of a text field
in case the field
'has a chr(44) in it that is not a delimiter. When set to 1 it identifies
the start of the text field 'then when set to 2 it identifies
the end of the text field and is set back to 0 so it knows that the
'next chr(44) is a delimiter.
textField = 0
For x = myLen To 1 Step -1
'start of col marker
If (Mid(myRow, x, 1) = Chr(44)) and (textField = 0) Then
If NewRow = True Then
ptr = x
myArray(y) = Mid(myRow,ptr+1, myLen - ptr)

y = y - 1
NewRow = False
ptr = ptr - 1
Else
myArray(y) = Mid(myRow, x + 1, ptr-x)
'msgBox myArray(y) & " = " & y
ptr = x - 1
y = y - 1

End If
End If
If (Mid(myRow, x, 1) = Chr(34)) Then
textField = textField + 1
if textField = 2 Then
textField = 0
End If
End If
Next
myArray(y) = left(myRow,ptr)
Else
'hdr row - skip it
flag = False
End If
 
G

gllincoln

Hi Len,

Thank you for the time comparison - it's good to have real world working
examples that validate (or not) one's assumptions.
Cordially,
Gordon
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top