Extract data from csv file

L.Mathe

I was looking through this discussion group and found something close to what I
need, but not being a programmer, I haven't been able to modify it to do what I
am attempting, and I hope someone can help.

The .csv files are split into groups by month (e.g., "c:\Jan\file name.csv").
I need to search within a group of csv files and extract data into an Excel
file. What I would like to do is: if cell A1 in my active workbook matches the data
to the right of the 76th comma in the csv file, extract the 'text' value (it must
be specified as text because this data is a 19-digit number and can't be
truncated) into cell A2. Then, in cell B2, extract the data that is to the
right of the 109th comma. Continue searching the current file, loop through
all remaining files, and extract subsequent matches into the next row below.

Hopefully this is possible and someone can help!

Thanks
 
KC

Interesting exercise.
I am guessing that you have one workbook with one worksheet where
only A1 and B1 are filled, nothing further, and that each csv file is
searched only at the 76th and 109th comma.
How is the matching done, please, given that the following 19 positions are
digits only?
 
JLatham

Or it could be that the .csv file is turning out to be close to a fixed field
length file and he means that there's a comma at the 76th and 109th character
position in a record? Definitely needs clarification.
 
L.Mathe

My apologies for delay in replying, I had the flu and couldn't think
straight.

I looked more carefully at the type of files I need to search for a
particular string, and found they are 'Excel Comma Separated Values'. The
files to be searched average 35,000 lines and have, I believe, 120 columns
of data.

What I am attempting to do is search the 77th column for matching data and,
if there is a match, extract the data in the 47th column (a 19-digit number, so
I need to extract it as text) and also the data in the 110th column. When
I open the file in Notepad, all the data is enclosed in " " and
separated by commas.

The workbook I want to extract the data to will always be basically blank.
I am hoping to have a user put a 'value' in cell A1 and then click a button
to run the macro. It really doesn't matter which columns the data goes to, as long
as the extracted data comes from the same line of the text file. For example, results
in the workbook:
Column A              Column B
6888551119921316789   01/31/2010 15:10
6888551118195432688   02/13/2010 12:45

The code I found was as follows:
1- Their question: To extract data (the first three letters after the 2nd
comma, and the first 35 characters after the 7th comma) from a csv file (over
100,000 rows),
only when the 8th column matches a value in column A of my spreadsheet. The
two extracted data elements need to be stored in my worksheet in columns B
and C.

2- Reply:

Sub Gettext()

    Const ForReading = 1, ForWriting = 2, ForAppending = 3
    Const TristateUseDefault = -2, TristateTrue = -1, TristateFalse = 0

    Dim Data(8)

    'default folder
    Folder = "C:\temp"
    ChDir (Folder)

    Set fsread = CreateObject("Scripting.FileSystemObject")
    FName = Application.GetOpenFilename("CSV (*.csv),*.csv")

    Set fread = fsread.GetFile(FName)
    Set tsread = fread.OpenAsTextStream(ForReading, TristateUseDefault)

    RowCount = 1
    Do While tsread.atendofstream = False

        InputLine = tsread.ReadLine

        For i = 0 To 7
            If InStr(InputLine, ",") > 0 Then
                Data(i) = Left(InputLine, InStr(InputLine, ",") - 1)
                InputLine = Mid(InputLine, InStr(InputLine, ",") + 1)
            Else
                If Len(InputLine) > 0 Then
                    Data(i) = InputLine
                    InputLine = ""
                Else
                    Exit For
                End If
            End If
        Next i
        'check if 8th item is in column A
        Set c = Columns("A:A").Find(what:=Data(7), LookIn:=xlValues, _
            lookat:=xlWhole)
        If Not c Is Nothing Then
            c.Offset(0, 1) = Left(Data(2), 3)
            c.Offset(0, 2) = Left(Data(7), 35)
        End If
    Loop
    tsread.Close
End Sub

Unfortunately, I have not been able to modify this (I can hardly read it)!

Thanks
 
joel

This code will open a folder picker to get the correct folder and then
search every CSV file in the folder using column 77 and getting data in
column 110.


Sub GetData()
    DestSht = "sheet1"
    With ThisWorkbook.Sheets(DestSht)
        SearchData = .Range("A1").Text
        .Columns("A:B").NumberFormat = "@"
    End With

    'Declare a variable as a FileDialog object.
    Dim fd As FileDialog

    'Create a FileDialog object as a Folder Picker dialog box.
    Set fd = Application.FileDialog(msoFileDialogFolderPicker)

    'Declare a variable to contain the path
    'of each selected item. Even though the path is a String,
    'the variable must be a Variant because For Each...Next
    'routines only work with Variants and Objects.
    Dim vrtSelectedItem As Variant

    'Use a With...End With block to reference the FileDialog object.
    With fd

        'Use the Show method to display the Folder Picker dialog box and
        'return the user's action.
        'The user pressed the action button.
        If .Show = -1 Then

            'Step through each string in the FileDialogSelectedItems
            'collection.
            For Each Folder In .SelectedItems
                Call ReadCSV(Folder, SearchData, DestSht)

            Next Folder
        End If
    End With

    'Set the object variable to Nothing.
    Set fd = Nothing
End Sub

Sub ReadCSV(ByVal Folder As Variant, _
    ByVal SearchData As String, _
    ByVal DestSht)

    Dim Data As String

    LastRow = ThisWorkbook.Sheets(DestSht) _
        .Range("A" & Rows.Count).End(xlUp).Row
    NewRow = LastRow + 1
    RowCount = NewRow
    FName = Dir(Folder & "\*.csv")
    Do While FName <> ""

        Workbooks.OpenText Filename:=Folder & "\" & FName, _
            DataType:=xlDelimited, Comma:=True
        Set CSVFile = ActiveWorkbook
        Set CSVSht = CSVFile.Sheets(1)
        'check if data exists in column 77
        Set c = CSVSht.Columns(77).Find(what:=SearchData, _
            LookIn:=xlValues, lookat:=xlWhole)
        If Not c Is Nothing Then
            FirstAddr = c.Address
            Do
                Data = CSVSht.Cells(c.Row, 110)
                With ThisWorkbook.Sheets(DestSht)
                    .Range("A" & RowCount) = FName
                    .Range("B" & RowCount) = RowCount
                    .Range("C" & RowCount) = Data
                    RowCount = RowCount + 1
                End With
                Set c = CSVSht.Columns(77).FindNext(after:=c)
            Loop While Not c Is Nothing And c.Address <> FirstAddr
        End If
        CSVFile.Close savechanges:=False

        FName = Dir()
    Loop
End Sub
 
L.Mathe

Hi Joel,

This code is BRILLIANT! I just need one other piece: I also need to
extract the data in column 47 (note, the data is a 19-digit number, so it must
be defined as 'Text'). I tried to modify what you sent me as follows, but it
didn't work:

Do
    Data1 = CSVSht.Cells(c.Row, 110)
    Data2 = CSVSht.Cells(c.Row, 47)

    With ThisWorkbook.Sheets(DestSht)
        .Range("A" & RowCount) = FName
        .Range("B" & RowCount) = RowCount
        .Range("C" & RowCount) = Data1
        .Range("D" & RowCount) = Data2

        RowCount = RowCount + 1
    End With

I hope you can provide a little further assistance with this. The amount of
manual work this piece of code will save is invaluable.

Thank you
--
Linda


 
joel

I forgot that I had one minor error in the original posting that I didn't
fix, so it gave you the row number in the workbook instead of the row
number in the CSV file.

from
.Range("B" & RowCount) = RowCount
to
.Range("B" & RowCount) = c.Row

There were two things I did to keep the data as text that weren't obvious:

1) Format columns A & B as Text (extend this to column D for the extra data)

from:
.Columns("A:B").NumberFormat = "@"
to:
.Columns("A:D").NumberFormat = "@"

2) Use a variable that was declared as a String

From:
Dim Data As String

To:
Dim Data As String
Dim Data1 As String
Dim Data2 As String
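
One caveat worth checking with the approach in this thread: because Workbooks.OpenText loads the CSV into a worksheet, Excel may already have converted a 19-digit value to a number (which keeps only 15 significant digits) before the String variable ever sees it. A minimal sketch of one way around this is to read each file as plain text instead. It assumes, as described earlier in the thread, that every field is wrapped in double quotes, and additionally that no field contains the sequence "," or a line break. The name ReadCSVAsText and its parameters are hypothetical and are not part of joel's code.

```vba
Option Explicit

' Sketch only: scans one CSV file as plain text so that long digit strings
' are never converted to numbers.
' Assumptions: every field is wrapped in double quotes, no field contains
' the sequence "," or a line break, and lines end with CR/LF.
Sub ReadCSVAsText(ByVal FilePath As String, _
                  ByVal SearchData As String, _
                  ByVal DestSht As Worksheet)

    Dim FileNum As Integer
    Dim InputLine As String
    Dim Fields() As String
    Dim FName As String
    Dim CsvRow As Long
    Dim RowCount As Long

    ' File name without the folder part
    FName = Mid$(FilePath, InStrRev(FilePath, "\") + 1)

    ' Format the output columns as Text before writing to them
    DestSht.Columns("A:D").NumberFormat = "@"
    RowCount = DestSht.Range("A" & DestSht.Rows.Count).End(xlUp).Row + 1

    FileNum = FreeFile
    Open FilePath For Input As #FileNum
    Do While Not EOF(FileNum)
        Line Input #FileNum, InputLine
        CsvRow = CsvRow + 1

        ' Drop the first and last quote, then split on the "," between fields
        If Len(InputLine) >= 2 Then
            InputLine = Mid$(InputLine, 2, Len(InputLine) - 2)
        End If
        Fields = Split(InputLine, """,""")

        ' Split returns a zero-based array: column 77 is index 76,
        ' column 47 is index 46 and column 110 is index 109
        If UBound(Fields) >= 109 Then
            ' Case-insensitive whole-field comparison
            If StrComp(Fields(76), SearchData, vbTextCompare) = 0 Then
                DestSht.Range("A" & RowCount).Value = FName
                DestSht.Range("B" & RowCount).Value = CStr(CsvRow)
                DestSht.Range("C" & RowCount).Value = Fields(109)
                DestSht.Range("D" & RowCount).Value = Fields(46)
                RowCount = RowCount + 1
            End If
        End If
    Loop
    Close #FileNum
End Sub
```

If this fits your files, the Workbooks.OpenText ... CSVFile.Close section inside the Do While FName <> "" loop of ReadCSV could be swapped for a call such as ReadCSVAsText Folder & "\" & FName, SearchData, ThisWorkbook.Sheets(DestSht).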
 
anthony russano

I need to accomplish a very similar task...

I want to work with this code, but what file extension should I save it as? What do I need to compile and run it?

I'm a noob, but I need to read a csv file that contains information about different concerts and events. Here is a snippet of the csv:

"EventID","Event","Venue","City","State","DateTime","TicketsYN","Address","ZipCode","URLLink","ImageURL"
1305111, Bhangra Fusion Eleven,The Fillmore - Detroit,Detroit,Michigan,3/20/2010 18:00,Y,2115 Woodward,48201,http://orders.ticketsintown.com/Res...p://www.indux.com/map/fillmore_detroit_tn.gif

So, I want to search the csv based on "Event" and have it return every event that matches, along with certain data related to the event that can be found in the same row. The two items I need are "City" and "URLLink".

I suppose I need the resulting data in a new csv file that contains only a subset of the columns from the original one and only the rows related to the event name I searched for:

"Event","City","URLLink"
Bhangra Fusion Eleven,Detroit,http://orders.ticketsintown.com/ResultsTicket.aspx?evtid=1305111


So I should basically have a table of "Event", "City" and "URLLink". I will then export it into another script that takes that data, along with input from the user (the domain name he reserved for this artist), and creates a website on an Apache web server for the domain name entered. In addition, it creates a subdomain for each instance of the event, based on the "City" name obtained from the csv file and the domain name input by the user.

The subdomains will then all be redirected to the "URLLink" of the corresponding "City".

Is that clear enough? Any ideas or suggestions would be appreciated. I think I have a better understanding myself after writing that all out, but it still sounds like a lot of work and I don't want to go down a rabbit hole.



 
joel

I will look at this tonight. To run the code you need to use VBA
(Visual Basic for Applications), the programming language that is part of Excel. Open
Excel, right-click the tab at the bottom of the worksheet (it normally says Sheet1),
and select "View Code". Then, from the VBA menu, add a
module using Insert - Module. You can then paste the code from
the posting into the module and run it. The code won't run
properly as is, because the columns in your CSV file are different from the ones the code
in this thread uses. The changes are minor.

The code opens CSV files and puts the results into an Excel XLS workbook. So
you would save the results as a workbook even though your files were
originally CSV files. I could save the results either as a CSV file or
leave them in an XLS workbook.
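
As a rough illustration of how minor those changes could be, here is a sketch that skips the workbook entirely and writes the matching rows straight to a new CSV file. It assumes the column order shown in the sample header (Event is the 2nd field, City the 4th, URLLink the 10th) and that no field contains a comma. The name FilterEvents and its parameters are hypothetical.

```vba
Option Explicit

' Sketch only: copies Event, City and URLLink for matching rows into a
' new CSV file.
' Assumptions: fields are separated by plain commas, no field contains a
' comma, and the first line of the source file is the header row.
Sub FilterEvents(ByVal SourcePath As String, _
                 ByVal DestPath As String, _
                 ByVal EventName As String)

    Dim InNum As Integer
    Dim OutNum As Integer
    Dim InputLine As String
    Dim Fields() As String

    InNum = FreeFile
    Open SourcePath For Input As #InNum
    OutNum = FreeFile
    Open DestPath For Output As #OutNum

    ' Header for the reduced file
    Print #OutNum, """Event"",""City"",""URLLink"""

    ' Skip the header row of the source file
    If Not EOF(InNum) Then Line Input #InNum, InputLine

    Do While Not EOF(InNum)
        Line Input #InNum, InputLine
        Fields = Split(InputLine, ",")

        ' Zero-based indexes: Event = 1, City = 3, URLLink = 9
        If UBound(Fields) >= 9 Then
            ' Trim removes the leading space seen in the sample row
            If StrComp(Trim$(Fields(1)), EventName, vbTextCompare) = 0 Then
                Print #OutNum, Trim$(Fields(1)) & "," & _
                               Trim$(Fields(3)) & "," & _
                               Trim$(Fields(9))
            End If
        End If
    Loop

    Close #InNum
    Close #OutNum
End Sub
```

It could be run from the Immediate window or another macro with something like FilterEvents "C:\temp\events.csv", "C:\temp\filtered.csv", "Bhangra Fusion Eleven", where both paths are placeholders.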
 
