VBA vs C#: filling a spreadsheet with random numbers - C# 4x slower...

H

Hal Styli

Hello,

I have a VBA macro which fills a sheet with 65536 random integers, it
takes just over a minute to run on my laptop, a T5470 1.60 Dual Core
with 3.5GB Ram, running Vista, Excel 2007 and Visual Studio 2005. I
figured it might run faster in C#, in fact it runs 4 times slower and
then when you open the Spreadsheet, it complains about being in a
'different format than that specified in the file extension.... Verify
that the file is not corrupted..' - taking a long time to load,
perhaps another minute or two. Even given that the VBA is run from
within Excel, the C# version does seem extremely slow.

Is there anything fundamentally wrong with the C#?
Any tips on speeding it up?
Any way of making the format it is written to be 'straight xls',
rather than 'a different format.
A complete listing of the VBA sn the C# is given below.

Thanks for all constructive help given.
Hal
'------------ this VBA takes 1m 10s -------------------------
Option Explicit

Sub main()
Dim t1 As Date, t2 As Date
t1 = Time

Application.ScreenUpdating = False
Debug.Print "Start time: "; t1

Const mr = 2 ^ 16, mc = 2 ^ 8, L = 10 ^ 8, U = 9 * 10 ^ 8
Dim r As Long, c As Long
ReDim a(1 To mr, 1 To mc) As Long

For r = 1 To mr
For c = 1 To mc
a(r, c) = Int(U * Rnd + L)
Next c
Next r
Range(Cells(1, 1), Cells(mr, mc)) = a

Application.ScreenUpdating = True
t2 = Time
Debug.Print " End time: "; t2
Debug.Print "Time taken: "; Format(t2 - t1, "HH:MM:SS")
MsgBox "Time taken: " & Format(t2 - t1, "HH:MM:SS")
End Sub
'----------------------- end of VBA --------------------------
//------------- this C# takes over 4x as long ----------------
// Stage 1 took: 00:00:16.0970140
// Stage 2 took: 00:01:45.4888199
// Stage 3 took: 00:03:00.2385236
// Stage 4 took: 00:00:00.8334127
//
using System;
using System.Collections.Generic;
using System.Text;
using System.Diagnostics; // timing
using System.Reflection;
using System.Runtime.InteropServices;
using Excel = Microsoft.Office.Interop.Excel;

namespace test4
{
class Program
{
static void Main(string[] args)
{
Stopwatch st = new Stopwatch();
// ----------------- Stage 1 ----------------------
st.Start();

// Excel object references.
Excel.Application m_objExcel = null;
Excel.Workbooks m_objBooks = null;
Excel._Workbook m_objBook = null;
Excel.Sheets m_objSheets = null;
Excel._Worksheet m_objSheet = null;
Excel.Range m_objRange = null;

// Frequenty-used variable for optional arguments.
object m = System.Reflection.Missing.Value;

// Paths used by the sample code for accessing and storing
data.
object m_strSampleFolder = "C:\\ExcelData\\";

// Start a new workbook in Excel.
m_objExcel = new Excel.Application();
m_objBooks = (Excel.Workbooks)m_objExcel.Workbooks;
m_objBook = (Excel._Workbook)(m_objBooks.Add(m));
m_objSheets = (Excel.Sheets)m_objBook.Worksheets;
m_objSheet = (Excel._Worksheet)(m_objSheets.get_Item(1));

const int maxrows = 65536; //65536=2^16=1<<16
const int maxcols = 256; //256=2^8=1<<8

object[,] objData = new Object[maxrows, maxcols];
Random rdm = new Random((int)DateTime.Now.Ticks);
//double nOrderAmt;//, nTax;

for (int r = 0; r < maxrows; r++)
{
for (int c = 0; c < maxcols; c++)
{
objData[r,c] = rdm.Next(100000000, 999999999);
}
}
st.Stop();
Console.WriteLine("Stage 1 took: {0}",
st.Elapsed.ToString());
// ----------------- Stage 2 ----------------------
st.Reset();
st.Start();

m_objRange = m_objSheet.get_Range("A1", m);
m_objRange = m_objRange.get_Resize(maxrows, maxcols);
m_objRange.set_Value(m, objData);

st.Stop();
Console.WriteLine("Stage 2 took: {0}",
st.Elapsed.ToString());
// ----------------- Stage 3 ----------------------
st.Reset();
st.Start();

// Save the workbook and quit Excel.
DateTime date = DateTime.Now;
m_objBook.SaveAs(m_strSampleFolder + "BigSheet-" +
date.ToString("yyyyMMdd_HHmmss.fff") + ".xls",

m,m,m,m,m,Excel.XlSaveAsAccessMode.xlNoChange,m,m,m,m,m);
m_objBook.Close(false,m,m);
m_objExcel.Quit();

st.Stop();
Console.WriteLine("Stage 3 took: {0}",
st.Elapsed.ToString());
// ----------------- Stage 4 ----------------------
st.Reset();
st.Start();

//Clean-up
m_objRange = null;
m_objSheet = null;
m_objSheets = null;
m_objBooks = null;
m_objBook = null;
m_objExcel = null;
GC.Collect();

st.Stop();
Console.WriteLine("Stage 4 took: {0}",
st.Elapsed.ToString());
}
}
}
 
H

Hal Styli

I have a VBA macro which fills a sheet with 65536 random integers

In fact, it fills the sheet with 65536*256 = 16,777,216 random
integers.

Hal.
 
M

Mike S

Hello,

I have a VBA macro which fills a sheet with 65536 random integers, it
takes just over a minute to run on my laptop, a T5470 1.60 Dual Core
with 3.5GB Ram, running Vista, Excel 2007 and Visual Studio 2005. I
figured it might run faster in C#, in fact it runs 4 times slower and
then when you open the Spreadsheet, it complains about being in a
'different format than that specified in the file extension.... Verify
that the file is not corrupted..' - taking a long time to load,
perhaps another minute or two. Even given that the VBA is run from
within Excel, the C# version does seem extremely slow.

Is there anything fundamentally wrong with the C#?
Any tips on speeding it up?
Any way of making the format it is written to be 'straight xls',
rather than 'a different format.
A complete listing of the VBA sn the C# is given below.

Thanks for all constructive help given.
Hal

Have you seen this discussion?

http://www.eggheadcafe.com/software/aspnet/35492274/excel-performance-through-com.aspx
-----------

Also would these lines help (did I miss them somewhere)?

m_objRange.Application.ScreenUpdating = false;
m_objRange.Application.EnableEvents = false;

http://stackoverflow.com/questions/...between-live-unsaved-excel-data-and-c-objects
 
S

sunny

Hello,

I have a VBA macro which fills a sheet with 65536 random integers, it
takes just over a minute to run on my laptop, a T5470 1.60 Dual Core
with 3.5GB Ram, running Vista, Excel 2007 and Visual Studio 2005. I
figured it might run faster in C#, in fact it runs 4 times slower and
then when you open the Spreadsheet, it complains about being in a
'different format than that specified in the file extension.... Verify
that the file is not corrupted..' - taking a long time to load,
perhaps another minute or two. Even given that the VBA is run from
within Excel, the C# version does seem extremely slow.

Is there anything fundamentally wrong with the C#?
Any tips on speeding it up?
Any way of making the format it is written to be 'straight xls',
rather than 'a different format.
A complete listing of the VBA sn the C# is given below.

Thanks for all constructive help given.
Hal
'------------ this VBA takes 1m 10s -------------------------
Option Explicit

Sub main()
    Dim t1 As Date, t2 As Date
    t1 = Time

    Application.ScreenUpdating = False
    Debug.Print "Start time: "; t1

    Const mr = 2 ^ 16, mc = 2 ^ 8, L = 10 ^ 8, U = 9 * 10 ^ 8
    Dim r As Long, c As Long
    ReDim a(1 To mr, 1 To mc) As Long

    For r = 1 To mr
       For c = 1 To mc
          a(r, c) = Int(U * Rnd + L)
       Next c
    Next r
    Range(Cells(1, 1), Cells(mr, mc)) = a

    Application.ScreenUpdating = True
    t2 = Time
    Debug.Print "  End time: "; t2
    Debug.Print "Time taken: "; Format(t2 - t1, "HH:MM:SS")
    MsgBox "Time taken: " & Format(t2 - t1, "HH:MM:SS")
End Sub
'----------------------- end of VBA --------------------------
//------------- this C# takes over 4x as long ----------------
//         Stage 1 took: 00:00:16.0970140
//         Stage 2 took: 00:01:45.4888199
//         Stage 3 took: 00:03:00.2385236
//         Stage 4 took: 00:00:00.8334127
//
using System;
using System.Collections.Generic;
using System.Text;
using System.Diagnostics; // timing
using System.Reflection;
using System.Runtime.InteropServices;
using Excel = Microsoft.Office.Interop.Excel;

namespace test4
{
    class Program
    {
        static void Main(string[] args)
        {
            Stopwatch st = new Stopwatch();
            // ----------------- Stage 1 ----------------------
            st.Start();

            // Excel object references.
            Excel.Application m_objExcel = null;
            Excel.Workbooks m_objBooks = null;
            Excel._Workbook m_objBook = null;
            Excel.Sheets m_objSheets = null;
            Excel._Worksheet m_objSheet = null;
            Excel.Range m_objRange = null;

            // Frequenty-used variable for optional arguments..
            object m = System.Reflection.Missing.Value;

            // Paths used by the sample code for accessing and storing
data.
            object m_strSampleFolder = "C:\\ExcelData\\";

            // Start a new workbook in Excel.
            m_objExcel  = new Excel.Application();
            m_objBooks = (Excel.Workbooks)m_objExcel.Workbooks;
            m_objBook = (Excel._Workbook)(m_objBooks.Add(m));
            m_objSheets = (Excel.Sheets)m_objBook.Worksheets;
            m_objSheet = (Excel._Worksheet)(m_objSheets.get_Item(1));

            const int maxrows = 65536; //65536=2^16=1<<16
            const int maxcols = 256;   //256=2^8=1<<8

            object[,] objData = new Object[maxrows, maxcols];
            Random rdm = new Random((int)DateTime.Now.Ticks);
            //double nOrderAmt;//, nTax;

            for (int r = 0; r < maxrows; r++)
            {
                for (int c = 0; c < maxcols; c++)
                {
                    objData[r,c] = rdm.Next(100000000, 999999999);
                }
            }
            st.Stop();
            Console.WriteLine("Stage 1 took: {0}",
st.Elapsed.ToString());
            // ----------------- Stage 2 ----------------------
            st.Reset();
            st.Start();

            m_objRange = m_objSheet.get_Range("A1", m);
            m_objRange = m_objRange.get_Resize(maxrows, maxcols);
            m_objRange.set_Value(m, objData);

            st.Stop();
            Console.WriteLine("Stage 2 took: {0}",
st.Elapsed.ToString());
            // ----------------- Stage 3 ----------------------
            st.Reset();
            st.Start();

            // Save the workbook and quit Excel.
            DateTime date = DateTime.Now;
            m_objBook.SaveAs(m_strSampleFolder + "BigSheet-" +
date.ToString("yyyyMMdd_HHmmss.fff") + ".xls",

m,m,m,m,m,Excel.XlSaveAsAccessMode.xlNoChange,m,m,m,m,m);
            m_objBook.Close(false,m,m);
            m_objExcel.Quit();

            st.Stop();
            Console.WriteLine("Stage 3 took: {0}",
st.Elapsed.ToString());
            // ----------------- Stage 4 ----------------------
            st.Reset();
            st.Start();

            //Clean-up
            m_objRange = null;
            m_objSheet = null;
            m_objSheets = null;
            m_objBooks = null;
            m_objBook = null;
            m_objExcel = null;
            GC.Collect();

            st.Stop();
            Console.WriteLine("Stage 4 took: {0}",
st.Elapsed.ToString());
        }
    }

}
I think,

Application.ScreenUpdating = False

is crucial.

And, in VBA, there is no allocation of new variable like "object[,]
objData = new Object[maxrows, maxcols];" in C# code.
In Stage 3, saving the file is also time-consuming process.

Actually, it is a very unfair race.

Finally, I'm not sure about the corruption complaint but in my
experiece, if you use Excel 2003 or save as xls instead of xlsx, it
could happen.
 
M

Martin Brown

Hello,

I have a VBA macro which fills a sheet with 65536 random integers, it
takes just over a minute to run on my laptop, a T5470 1.60 Dual Core
with 3.5GB Ram, running Vista, Excel 2007 and Visual Studio 2005. I
figured it might run faster in C#, in fact it runs 4 times slower and
then when you open the Spreadsheet, it complains about being in a
'different format than that specified in the file extension.... Verify
that the file is not corrupted..' - taking a long time to load,
perhaps another minute or two. Even given that the VBA is run from
within Excel, the C# version does seem extremely slow.

Is there anything fundamentally wrong with the C#?
Any tips on speeding it up?
Any way of making the format it is written to be 'straight xls',
rather than 'a different format.
A complete listing of the VBA sn the C# is given below.

Thanks for all constructive help given.
Hal
//------------- this C# takes over 4x as long ----------------
// Stage 1 took: 00:00:16.0970140
// Stage 2 took: 00:01:45.4888199
// Stage 3 took: 00:03:00.2385236
// Stage 4 took: 00:00:00.8334127
// ----------------- Stage 2 ----------------------
st.Reset();
st.Start();

m_objRange = m_objSheet.get_Range("A1", m);
m_objRange = m_objRange.get_Resize(maxrows, maxcols);
m_objRange.set_Value(m, objData);

st.Stop();
Console.WriteLine("Stage 2 took: {0}",

Not knowing C# but guessing at the behaviour leading to slowness it
might be worth hardwiring the dimensions of the Range and avoiding the
get_Resize. I suspect Excel is tying itself in knots with this request.

ie use .get_Range("A1:IV65536")

If this is indeed faster then construct the string for the range request
and avoid the Resize.

st.Elapsed.ToString());
// ----------------- Stage 3 ----------------------
st.Reset();
st.Start();

// Save the workbook and quit Excel.
DateTime date = DateTime.Now;
m_objBook.SaveAs(m_strSampleFolder + "BigSheet-" +
date.ToString("yyyyMMdd_HHmmss.fff") + ".xls",

m,m,m,m,m,Excel.XlSaveAsAccessMode.xlNoChange,m,m,m,m,m);
m_objBook.Close(false,m,m);
m_objExcel.Quit();

st.Stop();
Console.WriteLine("Stage 3 took: {0}",
st.Elapsed.ToString());
// ----------------- Stage 4 ----------------------
st.Reset();
st.Start();

It isn't a great advert for C#, but it could well be that the big Range
resize is creating a great deal of thrashing somewhere. The time taken
to initialise the array itself looked fairly respectable.

Regards,
Martin Brown
 
N

norie

Isn't it the opening of the workbook after it's been populated where
the problem is?

What happens when you open it manually from within Excel?

One thing, as well as turning off screen updating and events might be
too set calculation to manual for the Excel application object.

I know the code doesn't involve formulas but it wouldn't do any harm
to try that.
 
Top