Can't Get data without including Header using ADO

D

Dave B

Excel 2000

Hello,

I can't pull all the data I need because the ADO method looks for a header
and if one isn't available, it pulls a blank.
I know what rows I need so I make those the "SourceRange."

I need to get data from an extremely large Excel file (>40,000 rows and thru
HH in columns) without opening it - I'm trying to speed up the process.
Someone suggested I use ADO and now I'm slowly learning it. I am very good
at spreadsheets and comfortable with slightly advanced vb.

Here's the code:

Private Function ReadDataFromWorkbook(SourceFile As String, SourceRange As
String) As Variant

Dim dbConnection As ADODB.Connection, rs As ADODB.Recordset
Dim dbConnectionString As String

dbConnectionString = "DRIVER={Microsoft Excel Driver
(*.xls)};ReadOnly=1;DBQ=" & SourceFile & ";HDR=false"
Set dbConnection = New ADODB.Connection

dbConnection.Open dbConnectionString ' open the database connection

Set rs = dbConnection.Execute("[" & SourceRange & "]") 'The source
range is say A22000:IV25000

ReadDataFromWorkbook = rs.GetRows ' returns a two dim array with all
records in rs
rs.Close
dbConnection.Close ' close the database connection
Set rs = Nothing
Set dbConnection = Nothing
On Error GoTo 0
Exit Function
InvalidInput:
MsgBox "The source file or source range is invalid!", vbExclamation,
"Get data from closed workbook"
Set rs = Nothing
Set dbConnection = Nothing
End Function
 
T

TroyW

Dave,

ADO assumes the first row of the range is the field names. Try the following
command right after the "Set rs = dbConnection.Execute" line of code.

MsgBox rs.Fields(0).Name

Is this some of the missing data?

I assume you also noticed that the "GetRows" method transposes the data. The
columns become rows and rows become columns. You can use the UBound and
LBound functions to get the dimensions of the array. Depending upon what
version of Excel you have you can use the Application.Transpose function to
switch the data around if you need to.

Hope that helps...

Troy
 
D

Dave B

Troy,

Thanks for the tip. What I've found is that ADO expects the first row to be
header info which I can handle but when it encounters blanks or what it
considers non-headers, it doesn't retreive the data in that column. Is
there a method to get around this? Can I specify the Header row and then
the range of rows in my SourceRange? I couldn't get that to work - having
more than one range specified for the data.

MsgBox rs.Fields(0).Name gave me the data from the intersection of the first
column and the first row.

....I did notice that the data was transposed and took care of that by
placing it in another array.

dave


TroyW said:
Dave,

ADO assumes the first row of the range is the field names. Try the following
command right after the "Set rs = dbConnection.Execute" line of code.

MsgBox rs.Fields(0).Name

Is this some of the missing data?

I assume you also noticed that the "GetRows" method transposes the data. The
columns become rows and rows become columns. You can use the UBound and
LBound functions to get the dimensions of the array. Depending upon what
version of Excel you have you can use the Application.Transpose function to
switch the data around if you need to.

Hope that helps...

Troy


Dave B said:
Excel 2000

Hello,

I can't pull all the data I need because the ADO method looks for a header
and if one isn't available, it pulls a blank.
I know what rows I need so I make those the "SourceRange."

I need to get data from an extremely large Excel file (>40,000 rows and thru
HH in columns) without opening it - I'm trying to speed up the process.
Someone suggested I use ADO and now I'm slowly learning it. I am very good
at spreadsheets and comfortable with slightly advanced vb.

Here's the code:

Private Function ReadDataFromWorkbook(SourceFile As String, SourceRange As
String) As Variant

Dim dbConnection As ADODB.Connection, rs As ADODB.Recordset
Dim dbConnectionString As String

dbConnectionString = "DRIVER={Microsoft Excel Driver
(*.xls)};ReadOnly=1;DBQ=" & SourceFile & ";HDR=false"
Set dbConnection = New ADODB.Connection

dbConnection.Open dbConnectionString ' open the database connection

Set rs = dbConnection.Execute("[" & SourceRange & "]") 'The source
range is say A22000:IV25000

ReadDataFromWorkbook = rs.GetRows ' returns a two dim array with all
records in rs
rs.Close
dbConnection.Close ' close the database connection
Set rs = Nothing
Set dbConnection = Nothing
On Error GoTo 0
Exit Function
InvalidInput:
MsgBox "The source file or source range is invalid!", vbExclamation,
"Get data from closed workbook"
Set rs = Nothing
Set dbConnection = Nothing
End Function
 
T

TroyW

Dave,

Sorry, I was a little slow picking up on the root cause. I was focused on
the column headers and I now realize your difficulties are in the body of
the data.

The problem is being caused by having mixed data types in the columns. Each
column of data (in the source range) must have a SINGLE data type. It can
be all numbers or all text. Having both types mixed together in a column
will cause the results you've observed (blank cells). You can have ColumnA
be numbers, ColumnB text, ColumnC numbers, etc. You just can't have numbers
and text in the same column.

Basically, the driver looks at the first few rows of each column to
determine the data type for the ENTIRE column. Majority sets the data type
for the column. The minority types will then be ignored during the retrieval
and will show up as blank cells.

AFAIK, MS Query, ADO, and DAO all have this limitation. They all use similar
toolsets to query the datafile.

Can you segment your data to avoid mixing data types?

Troy


Dave B said:
Troy,

Thanks for the tip. What I've found is that ADO expects the first row to be
header info which I can handle but when it encounters blanks or what it
considers non-headers, it doesn't retreive the data in that column. Is
there a method to get around this? Can I specify the Header row and then
the range of rows in my SourceRange? I couldn't get that to work - having
more than one range specified for the data.

MsgBox rs.Fields(0).Name gave me the data from the intersection of the first
column and the first row.

...I did notice that the data was transposed and took care of that by
placing it in another array.

dave


TroyW said:
Dave,

ADO assumes the first row of the range is the field names. Try the following
command right after the "Set rs = dbConnection.Execute" line of code.

MsgBox rs.Fields(0).Name

Is this some of the missing data?

I assume you also noticed that the "GetRows" method transposes the data. The
columns become rows and rows become columns. You can use the UBound and
LBound functions to get the dimensions of the array. Depending upon what
version of Excel you have you can use the Application.Transpose function to
switch the data around if you need to.

Hope that helps...

Troy


and
thru
SourceRange
As
String) As Variant

Dim dbConnection As ADODB.Connection, rs As ADODB.Recordset
Dim dbConnectionString As String

dbConnectionString = "DRIVER={Microsoft Excel Driver
(*.xls)};ReadOnly=1;DBQ=" & SourceFile & ";HDR=false"
Set dbConnection = New ADODB.Connection

dbConnection.Open dbConnectionString ' open the database connection

Set rs = dbConnection.Execute("[" & SourceRange & "]") 'The source
range is say A22000:IV25000

ReadDataFromWorkbook = rs.GetRows ' returns a two dim array with all
records in rs
rs.Close
dbConnection.Close ' close the database connection
Set rs = Nothing
Set dbConnection = Nothing
On Error GoTo 0
Exit Function
InvalidInput:
MsgBox "The source file or source range is invalid!", vbExclamation,
"Get data from closed workbook"
Set rs = Nothing
Set dbConnection = Nothing
End Function
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top