Opening XLS file through OleDb. Header issues.
Hello, there is a problem I'm afraid there is no solution for, but I wish to ask anyway just in case someone knows.
My application downloads a third-party XLS file from the web. I cannot change its format so I have to teach my application to open it as it is.
I'm using OleDb for opening, here's my code:
vb Code:
'
If XLSFile = "" Then Exit Sub
Dim sExcelConn As String = _
String.Format( _
"Provider=Microsoft.Jet.OLEDB.4.0;Data Source={0};Extended Properties=""Excel 8.0;HDR=Yes;IMEX=1"";", XLSFile)
Dim conn As New OleDbConnection(sExcelConn)
Dim ds As New DataSet()
conn.Open()
Dim dtschema As DataTable = conn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, Nothing)
Dim tablename As String = ""
If dtschema.Rows.Count > 0 Then
tablename = dtschema.Rows(0).Item("TABLE_NAME").ToString()
Else
Exit Sub
End If
Dim cmd As New OleDbCommand(String.Format("SELECT * FROM [{0}];", tablename), conn)
Dim dtData As New DataTable
Dim da As New OleDbDataAdapter(cmd)
da.Fill(dtData)
Everything works fine except one thing.
The headers of the table I'm trying to get are on the second row, not on the first one. It looks like this:
Code:
| A | B | C | D | ...
------|---------+---------+---------+---------+
1|This is a main table title which spans the whole table
------|---------+---------+---------+---------+
2| Header1 | Header2 | Header3 | Header4 | ...
------|---------+---------+---------+---------+
3| Data | Data | Data | Data |
------|---------+---------+---------+---------+
...
Now, when I get that table the headers are treated as data while the table title from the first row is considered a header. Is there a workaround or I just have to live with it?
Re: Opening XLS file through OleDb. Header issues.
You can select a range as shown below. Let's say your data begins on row A and spans to D, you have some idea how many hows there are, say between 50 and 80 but want to make sure you get them all so ask for rows up to 100. If there are say 30 rows the row count will be 30, not the amount you asked for.
So in my example below I only have A3:D6 but asked for up to D100, my row count is 4.
Code:
Private ConnectionNoHeader As String = "provider=Microsoft.Jet.OLEDB.4.0; data source='{0}';Extended Properties=""Excel 8.0;IMEX=1; HDR=No;"""
Code:
Private Sub Button2_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button2.Click
Dim FileName As String = "Test.xls"
Using MyConnection As New System.Data.OleDb.OleDbConnection(String.Format(ConnectionNoHeader, FileName))
MyConnection.Open()
Dim cmd As OleDbCommand = New OleDbCommand("select * from [Sheet1$A3:D100]", MyConnection)
Dim dr As System.Data.IDataReader = cmd.ExecuteReader
Dim dt As New DataTable
dt.Load(dr)
Me.Text = dt.Rows.Count.ToString
End Using
End Sub
An alternate is to read as you are now and remove the first few data rows from the returning DataTable so all that remains is data.
Re: Opening XLS file through OleDb. Header issues.
Well, this could help, I'll try it, but still, I need to know the number of rows to query AND I need those headers.
Right now, I used this code which solves that particular problem:
vb Code:
'
If XLSFile = "" Then Exit Sub
Dim sExcelConn As String = String.Format("Provider=Microsoft.Jet.OLEDB.4.0;Data Source={0};Extended Properties=""Excel 8.0;HDR=NO;IMEX=1"";", XLSFile)
Dim conn As New OleDbConnection(sExcelConn)
LicData = New DataTable
Try
conn.Open()
Dim dtschema As DataTable = conn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, Nothing)
Dim tablename As String = ""
If dtschema.Rows.Count > 0 Then
tablename = dtschema.Rows(0).Item("TABLE_NAME").ToString()
End If
Dim cmd As New OleDbCommand(String.Format("SELECT * FROM [{0}];", tablename), conn)
Dim dr As OleDbDataReader = cmd.ExecuteReader
Dim rownumber As Integer = 0
Do While dr.Read()
Select Case rownumber
Case Is > 1
Dim objs(dr.FieldCount - 1) As Object
dr.GetValues(objs)
LicData.Rows.Add(objs)
Case 0 ' Skip
Case 1 ' Headers
For cln As Integer = 0 To dr.FieldCount - 1
LicData.Columns.Add(New DataColumn(dr.Item(cln).ToString()))
Next
End Select
rownumber += 1
Loop
Message(String.Format("Считано {0} записей.", records), False, True)
Message(Environment.NewLine & "Данные успешно считаны.")
Catch ex As Exception
Message("Ошибка: " & Environment.NewLine & ex.Message)
Finally
conn.Close()
End Try
1 Attachment(s)
Re: Opening XLS file through OleDb. Header issues.
If the format is always the same i.e. data begins on a specific row then this (same as before) gives you that plus tells you how many rows will be returned.
Code:
Public Class Form1
Private ConnectionNoHeader As String = "provider=Microsoft.Jet.OLEDB.4.0; data source='{0}';" & _
"Extended Properties=""Excel 8.0;IMEX=1; HDR=No;"""
Private FileName As String = "Test.xls"
Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
Using MyConnection As New System.Data.OleDb.OleDbConnection(String.Format(ConnectionNoHeader, FileName))
MyConnection.Open()
Dim cmd As OleDbCommand = New OleDbCommand("select COUNT(*) from [Sheet3$A3:D100]", MyConnection)
Dim RowCount = CLng(cmd.ExecuteScalar)
lblRowCount.Text = String.Format("Row Count: {0}", RowCount)
cmd = New OleDbCommand("select F1 As Column1, F2 As Whatever, F3 As LastCol from [Sheet3$A3:D100]", MyConnection)
Dim dr As System.Data.IDataReader = cmd.ExecuteReader
Dim dt As New DataTable
dt.Load(dr)
DataGridView1.DataSource = dt
End Using
End Sub
End Class
Re: Opening XLS file through OleDb. Header issues.
Quote:
Originally Posted by
kevininstructor
If the format is always the same i.e. data begins on a specific row then this (same as before) gives you that plus tells you how many rows will be returned.
Code:
Dim cmd As OleDbCommand = New OleDbCommand("select COUNT(*) from [Sheet3$A3:D100]",
Well, as I understand, I should query for A3:H65535, what I mean - there are quite a number of rows (20-30K at least) and it varies. The data always begins at row #3 though.
P.S. Would give you rep, but it appears that I need to distribute it some more. Thanks anyway.
Re: Opening XLS file through OleDb. Header issues.
To make the rows to read dynamic
Code:
Private ConnectionNoHeader As String = "provider=Microsoft.Jet.OLEDB.4.0; data source='{0}';" & _
"Extended Properties=""Excel 8.0;IMEX=1; HDR=No;"""
Private FileName As String = "Test.xls"
Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
Using MyConnection As New System.Data.OleDb.OleDbConnection(String.Format(ConnectionNoHeader, FileName))
MyConnection.Open()
Dim cmd As OleDbCommand = New OleDbCommand("select COUNT(*) from [Sheet3$]", MyConnection)
Dim RowCount = CLng(cmd.ExecuteScalar)
Dim RowsToSkip = 2 ' make adjustment for rows we are skipping i.e. header rows
lblRowCount.Text = String.Format("Row Count: {0}", RowCount)
Dim StartRow As String = 1.ExcelColumnName
Dim StartCol As Integer = 3
Dim EndRow As String = 3.ExcelColumnName
Dim EndCol As Integer = CInt(RowCount) + RowsToSkip
Dim StatementToUse = _
<SQL>
SELECT
F1 As Column1,
F2 As Whatever,
F3 As LastCol
FROM [Sheet3$<%= StartRow %><%= StartCol %>:<%= EndRow %><%= EndCol %>]
</SQL>.Value
cmd = New OleDbCommand(StatementToUse, MyConnection)
Dim dr As System.Data.IDataReader = cmd.ExecuteReader
Dim dt As New DataTable
dt.Load(dr)
DataGridView1.DataSource = dt
End Using
End Sub
Code module
Code:
Module Module1
<System.Runtime.CompilerServices.Extension()> _
Public Function ExcelColumnName(ByVal Index As Integer) As String
Dim chars = New Char() _
{ _
"A"c, "B"c, "C"c, "D"c, "E"c, "F"c, "G"c, "H"c, "I"c, _
"J"c, "K"c, "L"c, "M"c, "N"c, "O"c, "P"c, "Q"c, "R"c, _
"S"c, "T"c, "U"c, "V"c, "W"c, "X"c, "Y"c, "Z"c _
}
Index -= 1
Dim columnName As String
Dim quotient = Index \ 26
If quotient > 0 Then
columnName = ExcelColumnName(quotient) + chars(Index Mod 26)
Else
columnName = chars(Index Mod 26).ToString()
End If
Return columnName
End Function
End Module