Handling very large files (too many rows)

K

Kingston999

I'm trying to analyse a set of data. However, the data is a CSV fil
with almost 1.5 million rows, which somewhat excedes the 65 thousand o
so that excel can handle. However, excel is what I'm used to. If ther
any convenient way to import so much data? Alternately, can anyon
recomend a different means to observe and process that amount of data
 
V

Vaughan

I really think that is way too much for Excel, even if you break it into bits.

What you should use depends on how you want to analyse the data. Access can handle that quantity of data. Maybe, if you are going to filter the data down to a manageable subset or summary you could import it to Access and then query it using MS Query in Excel if using Excel is important to you.
 
D

Dave Peterson

Do you really need all 1.5 million rows?

If not, maybe you could reduce the rows to import before you try to import it.

(your favorite text editor or even a little program that keeps just the records
you want.)
 
K

Kingston999

In reply to the clarifications requested above:

1. It looks like I do need at least a substantial proportion of th
rows. However, being able to import every tenth value might work, i
someone could suggest a way of doing that.

2. The reason I'm trying to use Excel is that I know how to use it.
don't know how to use Access, for example, and I am not sure what els
I could use. If anyone could suggest a different program suitable fo
analysising large amounts of data, I'd be very grateful
 
D

Dave Peterson

Depending what you mean by analysis, you may be able to use Data|Pivottable and
point at the .CSV file.

Or put the data in Access and use excel to extract it.

(I've read posts that this is possible--I don't use Access.)

If you don't get enough help here, you may want to go to one of the Access
newsgroups and tell them your problem. Maybe it'll turn into a simple (haha)
solution.
 
H

hgrove

Vaughanwrote...
As far as I know, there isn't any spreadsheet that will cope with
1.5m rows.
...

I can't either, though Quattro Pro could handle 1.0 million rows
 
H

hgrove

Kingston999 wrote...
...
1. It looks like I do need at least a substantial proportion of the
rows. However, being able to import every tenth value might
work, if someone could suggest a way of doing that.

Every 10th row would still mean approx. 150,000 rows, and that stil
can't fit into a single worksheet, so you'd still be faced with th
same sort of problems (analyzing data spanning multiple worksheets) a
you would if you imported all 1.5 million rows.

What *specifically* are you trying to do?
2. The reason I'm trying to use Excel is that I know how to use
it. I don't know how to use Access, for example, and I am not
sure what else I could use. If anyone could suggest a different
program suitable for analysising large amounts of data, I'd be
very grateful.

If all you know how to use is a hammer, everything looks like nails.

Spreadsheets are only good at analyzing small to moderate amounts o
data. 1.5 million records is way to much for any spreadsheet to handl
at all well.

What you should use depends critically on what you're trying to do. S
what are you trying to do
 
A

AlfD

Hi!

Slightly OT.

The thought occurs that there might be occasions when "paralle
processing" could be used to extend the usable rows of Excel.

What I mean is: if the content of the rows runs to, say, 32 column
(A-AF) then A-IV would accommodate 8 such parallel sets of columns
each set self-contained. That could, effectively, give access to 8
(2^16-1) working rows: say 0.5 million.

I've never tried such a device, but it has to work (in theory), doesn'
it? The practice?
Al
 
D

Dave Peterson

I'm sure you could do something with the data--but you'd lose a lot of the stuff
that you can do in excel easily--just inserting/deleting rows would be a pain.

And I know that when I approach 40k-50k rows (and 50 columns (say)) in a single
worksheet (with lots of vlookup()'s and other formulas), excel can slow down to
almost a halt.
 
K

Kingston999

Specifically, I am looking at recorded movement of a part in a machin
over time. I am trying to find the peak values.

However, I am tending to agree with you that Excel isn't the best too
for the job, which is why I am also interested in what you'd sugges
instead
 
D

Dave Peterson

I don't have a suggestion.

The last time I had to work with that many rows, I brought it to our mainframe
and trimmed it down so that it would fit into excel.
 
B

Bill Martin

Specifically, I am looking at recorded movement of a part in a machine
over time. I am trying to find the peak values.

Do you have no experience with some programming language? Even BASIC would
serve and be easy to learn.

Bill -- (Remove KILLSPAM from my address to use it)
 
Top