-
Feb 24th, 2015, 08:30 AM
#1
Thread Starter
New Member
[RESOLVED] Speed Up Reading in Arrays from Excel
Hello ,
I am designing a program which requires me to read multiple columns (usually 300-1000 values in length) from multiple excel documents into arrays. This process can take a little bit of time and I was looking to speed it up. I am a relatively new vb programmer so I was wondering what the best option would be between backgroundworkers vs. threads vs. tasks.
Happy coding,
- Steve
Last edited by steve08; Feb 24th, 2015 at 08:54 AM.
-
Feb 24th, 2015, 09:36 AM
#2
Re: Speed Up Reading in Arrays from Excel
Why does it need to be sped up? [Excel] spreadsheets are desktop data manipulators, and not designed as data stores. What, exactly, are you doing with the data?
It'll take as long as it takes, while you may make a small impact with threading and background workers, the impact will not be hugely significant.
"Ok, my response to that is pending a Google search" - Bucky Katt.
"There are two types of people in the world: Those who can extrapolate from incomplete data sets." - Unk.
"Before you can 'think outside the box' you need to understand where the box is."
-
Feb 24th, 2015, 09:49 AM
#3
Thread Starter
New Member
Re: Speed Up Reading in Arrays from Excel
I wanted it to speed up because sometimes the GUI freezes for an uncomfortable period of time. I understand that excel spreadsheets are not deigned for data storage. Honestly, I wish the data was stored into a database it would be much more logical, but that is something that is outside my control. Once the data is scanned in I have to sort some, search some for key points, compare points to ideal, and use the data to optimize the DUT.
If you feel that reading in the data on multiple threads will not have a significant impact on performance then I guess I will leave it as is.
Last edited by steve08; Feb 24th, 2015 at 09:54 AM.
-
Feb 24th, 2015, 10:29 AM
#4
Re: Speed Up Reading in Arrays from Excel
I wanted it to speed up because sometimes the GUI freezes for an uncomfortable period of time.
If you feel that reading in the data on multiple threads will not have a significant impact on performance then I guess I will leave it as is.
It sounds to me as though you are reaching the limits of Excel as a data store, or at least a quick data store!!
If you have no control as to this, then you should instead set expectations that it will be slow and only get slower as more data is added.
Please Mark your Thread "Resolved", if the query is solved & Rate those who have helped you
-
Feb 24th, 2015, 11:17 AM
#5
Re: Speed Up Reading in Arrays from Excel
How are you getting your data from Excel in the first place, there are numerous methods some faster than others. I have used ExcelInterop (slowest) and moved onto 2 different engines (Syncfusion and SpreadSheetGear) which are immensely faster.
-
Feb 24th, 2015, 11:20 AM
#6
Thread Starter
New Member
Re: Speed Up Reading in Arrays from Excel
I am currently using ExcelInterop. I have never heard of Syncfusion or SpreadSheetGear, which of those would you recommend?
-
Feb 24th, 2015, 11:34 AM
#7
Re: Speed Up Reading in Arrays from Excel
Depends what you want to do with Excel. If you want to just get data, then you don't need a 3rd party, you can use odbc/oledb. A quick search will find some examples.
-
Feb 24th, 2015, 01:34 PM
#8
Hyperactive Member
Re: Speed Up Reading in Arrays from Excel
If you are just grabbing the data out of Excel and don't care about the formatting of the spreadsheet, then ADO.Net is much faster than using Interop. I'd suggest that you use that to get the data out of the Excel files. You may also consider threading or using a BGW (which is just a class to easily use threading) in order to allow your UI not to freeze up, but if you need that data loaded before the GUI can be used anyway, then it may not really be worth it.
-
Feb 24th, 2015, 03:19 PM
#9
Re: Speed Up Reading in Arrays from Excel
Originally Posted by steve08
I am currently using ExcelInterop. I have never heard of Syncfusion or SpreadSheetGear, which of those would you recommend?
You are probably iterating the Excel range cell by cell and pulling each value. This will be hideously slow. It is quick to pull your entire range into an array.
Assume the "ws" is a reference to an Excel.WorkSheet object.
Code:
Dim data As Object(,) = DirectCast(ws.Range("a1:a1000").Value, Object(,))
When you retrieve the data from Excel in this manner, the result is a two dimensional array of objects. Note that the lower bound of the array will be one (1) and not zero(0) as in standard .Net arrays. Empty cells will be represented as Nothing in the array.
-
Feb 25th, 2015, 08:46 AM
#10
Re: Speed Up Reading in Arrays from Excel
Stop working with the arrays... or at least stop reading from Excel using arrays... instead open it with ADO.NET and suck the data into a DataTable... then close the connection and work with the data in the dt. No interops, no PIATAIAIAIA, or what ever... just an QOLEDB connection, and command and the resulting datatable.
https://support.microsoft.com/kb/316...?wa=wsignin1.0
http://vb.net-informations.com/excel...xcel_oledb.htm
https://www.google.com/webhp?q=selec...do.net&es_th=1
-tg
-
Feb 25th, 2015, 11:05 AM
#11
Thread Starter
New Member
Re: Speed Up Reading in Arrays from Excel
Originally Posted by TnTinMN
You are probably iterating the Excel range cell by cell and pulling each value. This will be hideously slow. It is quick to pull your entire range into an array.
This is exactly what I was doing. Using your code it was extremely fast to pull the entire worksheet into an array.
Originally Posted by TnTinMN
When you retrieve the data from Excel in this manner, the result is a two dimensional array of objects. Note that the lower bound of the array will be one (1) and not zero(0) as in standard .Net arrays. Empty cells will be represented as Nothing in the array.
When I inspected the array my lower bound was 0. [0,0] was the first element of the array. I think I am going to try this vs. ADO.net to see which is faster.
-
Feb 25th, 2015, 11:27 AM
#12
Re: Speed Up Reading in Arrays from Excel
Here is a simple example which places data into a Object array which is then placed into a DataTable. Since you are only interested in the array disregard the DataTable, change the return type from DataTable to Object(,)
Code:
Option Strict On
Imports Excel = Microsoft.Office.Interop.Excel
Imports Microsoft.Office
Imports System.Runtime.InteropServices
Module ExcelDemoIteratingData_2
Public Sub DemoGettingDates()
Dim dt As DataTable = OpenExcelAndIterate(
IO.Path.Combine(
AppDomain.CurrentDomain.BaseDirectory,
"MyExcelFile.xlsx"),
"Sheet1",
"B1",
"B10")
End Sub
Public Function OpenExcelAndIterate(
ByVal FileName As String,
ByVal SheetName As String,
ByVal StartCell As String,
ByVal EndCell As String) As DataTable
Dim dt As New DataTable
If IO.File.Exists(FileName) Then
Dim Proceed As Boolean = False
Dim xlApp As Excel.Application = Nothing
Dim xlWorkBooks As Excel.Workbooks = Nothing
Dim xlWorkBook As Excel.Workbook = Nothing
Dim xlWorkSheet As Excel.Worksheet = Nothing
Dim xlWorkSheets As Excel.Sheets = Nothing
Dim xlCells As Excel.Range = Nothing
xlApp = New Excel.Application
xlApp.DisplayAlerts = False
xlWorkBooks = xlApp.Workbooks
xlWorkBook = xlWorkBooks.Open(FileName)
xlApp.Visible = False
xlWorkSheets = xlWorkBook.Sheets
'
' For/Next finds our sheet
'
For x As Integer = 1 To xlWorkSheets.Count
xlWorkSheet = CType(xlWorkSheets(x), Excel.Worksheet)
If xlWorkSheet.Name = SheetName Then
Proceed = True
Exit For
End If
Runtime.InteropServices.Marshal.FinalReleaseComObject(xlWorkSheet)
xlWorkSheet = Nothing
Next
If Proceed Then
dt.Columns.AddRange(
New DataColumn() _
{
New DataColumn With {.ColumnName = "Identifier", .DataType = GetType(Int32), .AutoIncrement = True, .AutoIncrementSeed = 1},
New DataColumn With {.ColumnName = "SomeDate", .DataType = GetType(Date)}
}
)
Dim xlUsedRange = xlWorkSheet.Range(StartCell, EndCell)
Try
Dim ExcelArray(,) As Object = CType(xlUsedRange.Value(Excel.XlRangeValueDataType.xlRangeValueDefault), Object(,))
If ExcelArray IsNot Nothing Then
' Get bounds of the array.
Dim bound0 As Integer = ExcelArray.GetUpperBound(0)
Dim bound1 As Integer = ExcelArray.GetUpperBound(1)
For j As Integer = 1 To bound0
If (ExcelArray(j, 1) IsNot Nothing) Then
dt.Rows.Add(New Object() {Nothing, ExcelArray(j, 1)})
Else
dt.Rows.Add(New Object() {Nothing, Nothing})
End If
Next
End If
Finally
ReleaseComObject(xlUsedRange)
End Try
Else
MessageBox.Show(SheetName & " not found.")
End If
xlWorkBook.Close()
xlApp.UserControl = True
xlApp.Quit()
ReleaseComObject(xlCells)
ReleaseComObject(xlWorkSheets)
ReleaseComObject(xlWorkSheet)
ReleaseComObject(xlWorkBook)
ReleaseComObject(xlWorkBooks)
ReleaseComObject(xlApp)
Else
MessageBox.Show("'" & FileName & "' not located. Try one of the write examples first.")
End If
Return dt
End Function
Private Sub ReleaseComObject(ByVal sender As Object)
Try
If sender IsNot Nothing Then
System.Runtime.InteropServices.Marshal.ReleaseComObject(sender)
sender = Nothing
End If
Catch ex As Exception
sender = Nothing
End Try
End Sub
End Module
-
Feb 25th, 2015, 11:44 AM
#13
Re: Speed Up Reading in Arrays from Excel
Originally Posted by steve08
When I inspected the array my lower bound was 0. [0,0] was the first element of the array.
Thanks for that info, they must of made a change to that in the newer Excel versions. I'm still using 2007.
-
Feb 25th, 2015, 12:24 PM
#14
Thread Starter
New Member
Re: [RESOLVED] Speed Up Reading in Arrays from Excel
Thanks to everyone for the assistance. This was my first post in these forums and y'all made it a very positive experience.
Posting Permissions
- You may not post new threads
- You may not post replies
- You may not post attachments
- You may not edit your posts
-
Forum Rules
|
Click Here to Expand Forum to Full Width
|