Using a Cube to store Histogram information?

R Avery

I am running regressions on data from SQL Server, using VBA for the
calculation. In addition to this, I would like to be able to create
histograms that bucket all the variables. It should be able to create
cross-distributions for 1, 2, or 3 variables, and do so with a simple
.AddRow() method or something like that.

Since there are lots of variables in the regression, I do not want to
create a huge array with a dimension for each variable. I expect most
cross-buckets (sub-cubes) to be empty, so I think that a SparseCube
object could do the trick, if such a thing were to exist.

Also, I am calculating directly off a SQL statement that returns
millions of rows, so whatever solution I take must be able to handle
that and not slow down the regression calculation too much.

If there is an entirely different way to bucket and store the data such
that I can easily aggregate over variables to get a 1, 2, or 3
dimensional grid of data that an Excel histogram can easily display,
however, I would be open to it. Among the approaches that work, I would
prefer whichever is fastest to obtain or build.

Any help would be appreciated.
 
R Avery

Well, I ended up implementing a SparseCube class, which represents an
n-dimensional cube. To project the cube onto one or two of its
dimensions, I then made a SparseCubeProjection class, which is able to
serialize itself to Excel.

It is able to bucket the datarows and create projections of itself
quickly, so it seems to be good enough for my purposes.
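
For anyone reading later, here is a minimal sketch of how a sparse cube
of this kind could work. It is written in Python rather than VBA for
brevity, and it is not the code described in this post: the class layout,
the add_row and project names, and the bucket edges are all assumptions
made for illustration. The idea is to store only non-empty cells in a
dictionary keyed by a tuple of bucket indices, and to build a projection
by summing out the dimensions that are not wanted.

```python
from bisect import bisect_right
from collections import defaultdict


class SparseCube:
    """Sparse n-dimensional histogram: only non-empty cells are stored."""

    def __init__(self, edges):
        # edges[d] is the sorted list of bucket boundaries for dimension d
        # (a hypothetical layout; the post does not describe one).
        self.edges = edges
        self.cells = defaultdict(int)  # bucket-index tuple -> row count

    def add_row(self, row):
        # bisect_right maps each value to its bucket index by binary search.
        key = tuple(bisect_right(e, v) for e, v in zip(self.edges, row))
        self.cells[key] += 1

    def project(self, dims):
        # Sum out every dimension that is not listed in dims.
        out = defaultdict(int)
        for key, count in self.cells.items():
            out[tuple(key[d] for d in dims)] += count
        return dict(out)


# Three variables with made-up bucket edges and four made-up rows.
cube = SparseCube(edges=[[0, 10, 20], [0.5], [100, 200]])
for row in [(5, 0.2, 150), (15, 0.7, 150), (7, 0.9, 250), (5, 0.1, 120)]:
    cube.add_row(row)

by_first = cube.project([0])                # 1-D histogram of variable 0
by_first_and_second = cube.project([0, 1])  # 2-D grid of variables 0 and 1
```

Memory use grows with the number of occupied cells rather than with the
product of the bucket counts, and each row costs one binary search per
variable plus one dictionary update. In VBA, a Scripting.Dictionary keyed
by a delimited string of bucket indices would probably play the same role
as the tuple-keyed dictionary here.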

Thanks.
 
