Results 1 to 11 of 11

Thread: [RESOLVED] Trouble using 2 different Excel files (.xlsx) to create a single data table

  1. #1

    Thread Starter
    Addicted Member
    Join Date
    Nov 2022
    Location
    Manitoba, Canada
    Posts
    174

    Resolved [RESOLVED] Trouble using 2 different Excel files (.xlsx) to create a single data table

    Using VB.Net and Visual Studio 2022

    I am attempting to create a data table which is derived from a pair of excel (.xlsx) files. Currently I am manually downloading these files by using my web browser and downloading them from a website at which time I then save them into my app.startup directory. What I hope to accomplish with these two separate .xlsx files is to amalgamate them into one single data table (I think).

    What I want to achieve is this: Each of the .xlsx files I have downloaded contain different statistics regarding NHL hockey teams. The first Excel file for example is a boiled down summary of all the 32 NHL hockey teams' individual general stats. Data like total wins, total losses and total season points. The second Excel file contains data for each of the 32 teams' individual statistics regarding their Penalties. Total penalty minutes, types of penalties, ect... I hope to join or somehow combine these two data tables into one so as to display them in my datagridview in such a way that the data from each individual .xlsx file corresponds by Team Name across the grid. I think its important to note that each of these .xlsx files share a common column called “Team”. This column represents the team’s name. I'm a novice and quite new to all of this however, I think what I need to do here is create a relationship between these tables, using the "Team" column as a Primary Key/Foreign Key? Am I on the right path? I've only been trying to do this for 3 days now...

    I can’t seem to come up with an approach to achieve this and that's where I think I need the help...Unless I'm way out in left field. Then in that case, somebody please point me in the right direction. I'll share what I have achieved so far and how I have achieved it. As far as I understand it, I have successfully retrieved that .xlsx files, loaded their data into two different data tables and have defined a data set (I am using the correct terminology? I'm trying).

    I use this code to fill two separate data tables, each with the data from one of the .xlsx files. I actually call this on a Form Load event since the files that will be utilized here will always have the same name and location.
    Code:
        Public Sub GetExcelFiles()
    
            Dim openFileDialog1 As New OpenFileDialog()
            openFileDialog1.FileName = "Summary.xlsx"
    
            Dim TheFileName1 As String = openFileDialog1.FileName
            Dim ConnectionString1 As String = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" & Application.StartupPath() & TheFileName1 & ";Extended Properties='Excel 12.0 Xml;HDR=YES;'"
    
            Using conn As New OleDbConnection(ConnectionString1)
    
                Dim cmd As New OleDbCommand("SELECT * FROM [Summary$]", conn)
                Dim adapter As New OleDbDataAdapter(cmd)
    
                adapter.Fill(dtTeamStats)
    
            End Using
    
    
            Dim openFileDialog2 As New OpenFileDialog()
            openFileDialog2.FileName = "Penalties.xlsx"
    
            Dim TheFileName2 As String = openFileDialog2.FileName
            Dim ConnectionString2 As String = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" & Application.StartupPath() & TheFileName2 & ";Extended Properties='Excel 12.0 Xml;HDR=YES;'"
    
    
            Using conn As New OleDbConnection(ConnectionString2)
    
                Dim cmd As New OleDbCommand("SELECT * FROM [Penalties$]", conn)
                Dim adapter As New OleDbDataAdapter(cmd)
    
                adapter.Fill(dtPenalties)
    
            End Using
    
        End Sub
    Then I go ahead and setup the Data Set with the code below (again, called from the Form Load event). I suspect it is within this area that I must define primary and foreign keys and form some sort of relationship between the two data tables?
    Code:
        Public Sub SetUpDataSet()
    
            dtTeamStats.TableName = "TeamStats"
            dtPenalties.TableName = "Penalties"
            ds.Tables.Add(dtTeamStats)
            ds.Tables.Add(dtPenalties)
    
        End Sub
    This may be irrelevant to my question but I'll show my work anyway. Once the data set is structured, I then go on to define the Binding Source and bind it to my DataGridView like this:
    Code:
        Public Sub SetUpBindings()
    
            Me.bs.DataSource = ds
            Me.bs.DataMember = "TeamStats"
    
            Me.DataGridView1.DataSource = bs
    
        End Sub
    I guess my first question is: I am on the right path? My second question would be, how do I create a relationship between these two data tables using code AND are the TEAM columns the correct columns to use to make that relationship?

    For example, I have written this into the code but I can't quite get it sorted out.... It gets the error "System.ArgumentException: 'DataMember property 'CombinedData' cannot be found on the DataSource.'" at this line: Me.bs.DataMember = "CombinedData"
    Code:
        Public Sub SetUpDataSet()
    
            dtTeamStats.TableName = "TeamStats"
            dtPenalties.TableName = "Penalties"
    
            ds.Tables.Add(dtTeamStats)
            ds.Tables.Add(dtPenalties)
    
            ds.Relations.Add("CombinedData", ds.Tables("TeamStats").Columns("Team"), ds.Tables("Penalties").Columns("Team"), False)
    
        End Sub
    
    
    
        Public Sub SetUpBindings()
    
            Me.bs.DataSource = ds
            Me.bs.DataMember = "CombinedData"
    
            Me.DataGridView1.DataSource = bs
    
        End Sub
    Thank you in advance for any guidance. I'm struggling...

    For those who are interested (if it helps), the .xlsx files in which I am referring to can be exported from the NHL website at the links below:
    Teams Summary
    Penalty Minutes

  2. #2
    PowerPoster Zvoni's Avatar
    Join Date
    Sep 2012
    Location
    To the moon and then left
    Posts
    4,826

    Re: Trouble using 2 different Excel files (.xlsx) to create a single data table

    Why not use SQL directly?
    For the life of me i can never remember the specific syntax to address an Excel-File including path directly in the FROM-Clause, but i know it exists
    Something along the Lines of
    SELECT * FROM [;Data Source=SomePath;SomeOptions]!SomeSheetWithinTheFile

    If that fails, i know two approaches out of my left sleeve:
    1) Use an InMemory-SQLite-Database, import each Excel-File there and throw a simple Select at it
    Code:
    SELECT * FROM Summary AS S INNER JOIN Penalties AS P ON S.Team=P.Team ORDER BY S.Team
    2) Nested loop.
    Outer loop runs through Summary, inner Loop runs through Penalties, compare Team-Name, if found pull your data, exit inner loop
    Can be optimized to one loop, if it is sure, that Team-Names never change between files
    Then you need to load each excel-file sorted by team-name.
    Then you need a Loop with 32 iterations
    Last edited by Zvoni; Tomorrow at 31:69 PM.
    ----------------------------------------------------------------------------------------

    One System to rule them all, One Code to find them,
    One IDE to bring them all, and to the Framework bind them,
    in the Land of Redmond, where the Windows lie
    ---------------------------------------------------------------------------------
    People call me crazy because i'm jumping out of perfectly fine airplanes.
    ---------------------------------------------------------------------------------
    Code is like a joke: If you have to explain it, it's bad

  3. #3

    Thread Starter
    Addicted Member
    Join Date
    Nov 2022
    Location
    Manitoba, Canada
    Posts
    174

    Re: Trouble using 2 different Excel files (.xlsx) to create a single data table

    Quote Originally Posted by Zvoni View Post
    Why not use SQL directly?
    Something along the Lines of
    SELECT * FROM [;Data Source=SomePath;SomeOptions]!SomeSheetWithinTheFile
    Thank you for the input, Zvoni. So what you're saying is that I'm going about it all wrong? Instead of creating a relationship between the two data tables that I am currently building using the data from the Excel files, I should instead create an SQL query (SELECT)? I've used SQL with .db files but it never occurred to me that I could use it with .xlsx files as well. This option interests me the most since I do have some experience with VB.Net and SQL database work...Its limited but I do have some. Previously, I would use SQLite to build my database and then use SQL to query them and what I think you're saying is that I can use SQL to query my Excel files the same way. I think I understand...

    Quote Originally Posted by Zvoni View Post
    If that fails, i know two approaches out of my left sleeve:
    1) Use an InMemory-SQLite-Database, import each Excel-File there and throw a simple Select at it
    Code:
    SELECT * FROM Summary AS S INNER JOIN Penalties AS P ON S.Team=P.Team ORDER BY S.Team
    An InMemory-SQLite-Database isn't something I have any experience with. I will have to do some reading up on and experimenting with this before I am able to get a better understanding of what this is... This looks much like your first suggestion, yet different...

    Quote Originally Posted by Zvoni View Post
    2) Nested loop.
    Outer loop runs through Summary, inner Loop runs through Penalties, compare Team-Name, if found pull your data, exit inner loop
    Can be optimized to one loop, if it is sure, that Team-Names never change between files
    Then you need to load each excel-file sorted by team-name.
    Then you need a Loop with 32 iterations
    I feel like this option is way over my head and would be troublesome for me. Rather, I would be troublesome to the forum by asking a billion questions... It's no secret that my understanding of VB.Net as a language is not very strong and I struggle, even with the basic stuff. This isn't from a lack of effort, I assure you, as I have been sitting at this laptop for an average of about 4-6 hours per day trying to comprehend this stuff. I just don't learn well by reading, never have... I'm a hands on learner. I'll consider this option as a last resort just because your description scares a novice like me...lol

    In any event, thank you Zvoni for your guidance. I'll see what I can come up with tomorrow. It's currently after 2AM here and I have an early morning ahead of me.

    Cheers!

  4. #4
    PowerPoster Zvoni's Avatar
    Join Date
    Sep 2012
    Location
    To the moon and then left
    Posts
    4,826

    Re: Trouble using 2 different Excel files (.xlsx) to create a single data table

    Quote Originally Posted by The_Hobbyist View Post
    I feel like this option is way over my head and would be troublesome for me. Rather, I would be troublesome to the forum by asking a billion questions... It's no secret that my understanding of VB.Net as a language is not very strong and I struggle, even with the basic stuff. This isn't from a lack of effort, I assure you, as I have been sitting at this laptop for an average of about 4-6 hours per day trying to comprehend this stuff. I just don't learn well by reading, never have... I'm a hands on learner. I'll consider this option as a last resort just because your description scares a novice like me...lol

    In any event, thank you Zvoni for your guidance. I'll see what I can come up with tomorrow. It's currently after 2AM here and I have an early morning ahead of me.

    Cheers!
    In your first code-block change/add (Note the ORDER BY-Clause)
    Code:
    SELECT * FROM [Summary$] ORDER BY Team
    
    SELECT * FROM [Penalties$] ORDER BY Team
    and then singlestep through your recordsets/datasets
    Last edited by Zvoni; Tomorrow at 31:69 PM.
    ----------------------------------------------------------------------------------------

    One System to rule them all, One Code to find them,
    One IDE to bring them all, and to the Framework bind them,
    in the Land of Redmond, where the Windows lie
    ---------------------------------------------------------------------------------
    People call me crazy because i'm jumping out of perfectly fine airplanes.
    ---------------------------------------------------------------------------------
    Code is like a joke: If you have to explain it, it's bad

  5. #5

    Thread Starter
    Addicted Member
    Join Date
    Nov 2022
    Location
    Manitoba, Canada
    Posts
    174

    Re: Trouble using 2 different Excel files (.xlsx) to create a single data table

    Quote Originally Posted by Zvoni View Post
    In your first code-block change/add (Note the ORDER BY-Clause)
    Code:
    SELECT * FROM [Summary$] ORDER BY Team
    
    SELECT * FROM [Penalties$] ORDER BY Team
    and then singlestep through your recordsets/datasets
    Alright, I understand the changes that you suggest I make regarding the sorting of the tables alphabetically. It works as intended now as far as sorting goes when I populate the DataGridView. As for the Nested Loop, I'm going to look through some code examples and see if I can't make heads or tails of it. Thanks.
    Last edited by The_Hobbyist; Nov 28th, 2024 at 11:58 AM.

  6. #6
    PowerPoster
    Join Date
    Sep 2005
    Location
    Modesto, Ca.
    Posts
    5,288

    Re: Trouble using 2 different Excel files (.xlsx) to create a single data table

    Can be optimized to one loop, if it is sure, that Team-Names never change between files
    This is a key point. Will the team names ALWAYS be the same, spelled the same?

    Will there be just one record per team or multiple?

    It sounds like you want to take two datatables and combine them to create one table with one record for each team. Is that correct?

    Do you plan on saving this data? If so, how? Excel, Database...

  7. #7

    Thread Starter
    Addicted Member
    Join Date
    Nov 2022
    Location
    Manitoba, Canada
    Posts
    174

    Re: Trouble using 2 different Excel files (.xlsx) to create a single data table

    Quote Originally Posted by wes4dbt View Post
    This is a key point. Will the team names ALWAYS be the same, spelled the same?

    Will there be just one record per team or multiple?

    It sounds like you want to take two datatables and combine them to create one table with one record for each team. Is that correct?

    Do you plan on saving this data? If so, how? Excel, Database...
    The team names will ALWAYS be spelled the same, yes. Yes, there will only be a single record per team. I have no intentions (at this time) of modifying the data taken from these .xlsx files. They are strictly to be used for reading and collecting hockey team stats. Those stats will then be taken and used in some math equations. I suppose if I were to opt to save files down the road, for whatever reason, I'd likely choose to stay with the Excel platform? I don't have a good reason for this but in my mind it makes sense to keep .xlsx data coming in as .xlsx data going out? You asked, so I felt the need to answer...but I don't suspect there will be any need to manipulate the data.

    Yes, I believe what I want to do is take two datatables (each one derived from one .xlsx file) and then combine them to create one data table with one record for each of the teams. My problem at the moment is that the stats for each team are divided across two serperate .xlsx files. With that said, those .xlsx files each have the same number of records (a single record for each of the 32 NHL teams) and they each have a column named "Team" which holds the team name for that record.

    The datatables in question can be viewed from the links in my original post. There is an "Export" file button on the page which will allow you to save the file from the current page.
    Last edited by The_Hobbyist; Nov 28th, 2024 at 06:40 PM. Reason: Grammar & clarification

  8. #8
    PowerPoster
    Join Date
    Sep 2005
    Location
    Modesto, Ca.
    Posts
    5,288

    Re: Trouble using 2 different Excel files (.xlsx) to create a single data table

    It's Thanksgiving and I'm short on time so I'll just freehand an example,

    Code:
    'Add columns to Datatable1 for the data from Datatable2
    
    'Then,
    
    For Each row1 as DataRow in Datatable1
        Dim row2 as Datarow()
        row2 = Datatable2.Select("Team ='" & row.Item("Team").ToString & "'")
        If row2.Count>0 Then
           'add data to row1
          row1.Item("somefield")= row2(0).Item("somefield")
        End If
    End for
    Last edited by wes4dbt; Nov 28th, 2024 at 08:48 PM.

  9. #9

    Thread Starter
    Addicted Member
    Join Date
    Nov 2022
    Location
    Manitoba, Canada
    Posts
    174

    Resolved Re: Trouble using 2 different Excel files (.xlsx) to create a single data table

    Quote Originally Posted by wes4dbt View Post
    It's Thanksgiving and I'm short on time so I'll just freehand an example,

    Code:
    'Add columns to Datatable1 for the data from Datatable2
    
    'Then,
    
    For Each row1 as DataRow in Datatable1
        Dim row2 as Datarow()
        row2 = Datatable2.Select("Team ='" & row.Item("Team").ToString & "'")
        If row2.Count>0 Then
           'add data to row1
          row1.Item("somefield")= row2(0).Item("somefield")
        End If
    End for
    First off, Happy Thanksgiving!

    Secondly, thank you for this. I was able to look at your freehand example and adapt it and implement it into my project. It works like a champ and I am very grateful! I will share my working code and mark this thread resolved. Many thanks to everyone who offered guidance.

    I use this code to open 2 separate .xlsx files and then fill two different datatables (dtTeamStats and dtPenalties). Each of the datatables contains the data from 1 of the .xlsx files. I have chosen to call this on the form load event so the data is loaded into my DataGridView at startup rather than having the user use a file dialog box and navigate to the files in question.
    Code:
        Public Sub GetExcelFiles()
    
            Dim openFileDialog1 As New OpenFileDialog()
            openFileDialog1.FileName = "Summary.xlsx"
    
            Dim TheFileName1 As String = openFileDialog1.FileName
            Dim ConnectionString1 As String = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" & Application.StartupPath() & TheFileName1 & ";Extended Properties='Excel 12.0 Xml;HDR=YES;'"
    
            Using conn As New OleDbConnection(ConnectionString1)
    
                Dim cmd As New OleDbCommand("SELECT * FROM [Summary$] ORDER BY Team", conn)
                Dim adapter As New OleDbDataAdapter(cmd)
    
                adapter.Fill(dtTeamStats)
    
            End Using
    
    
            Dim openFileDialog2 As New OpenFileDialog()
            openFileDialog2.FileName = "Penalties.xlsx"
    
            Dim TheFileName2 As String = openFileDialog2.FileName
            Dim ConnectionString2 As String = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" & Application.StartupPath() & TheFileName2 & ";Extended Properties='Excel 12.0 Xml;HDR=YES;'"
    
    
            Using conn As New OleDbConnection(ConnectionString2)
    
                Dim cmd As New OleDbCommand("SELECT * FROM [Penalties$] ORDER BY Team", conn)
                Dim adapter As New OleDbDataAdapter(cmd)
    
                adapter.Fill(dtPenalties)
    
            End Using
    
        End Sub
    As I understand it, the below code does the following:
    1) It starts off by giving the first datatable a name ("TeamStats"). This will be the datatable used to create the dataset.
    2) It then goes on to add columns to the datatable (dtTeamStats) which correspond to the columns in the second datatable (dtPenalties). It should be noted that these columns represent the data that I want to add from the 2nd datatable (dtPenalties) to the first datatable (dtTeamStats).
    3) Then the code runs a loop, searching through each record or row, looking for a match on the "Team" columns. Essentially it is searching the records across the two datatables and trying to match the rows that share a team's name. When it finds a match, it pulls the data from the record in the 2nd datatable (dtPenalties) and adds it into the record in the first table (dtTeamStats). The data is placed into the columns that were created in step 2 above.

    Code:
        Public Sub SetUpDataSet()
    
            dtTeamStats.TableName = "TeamStats"
    
            dtTeamStats.Columns.Add("PIM", GetType(Double))
            dtTeamStats.Columns.Add("PIM/GP", GetType(String))
            dtTeamStats.Columns.Add("Pen Drawn", GetType(Double))
            dtTeamStats.Columns.Add("Pen Taken", GetType(Double))
            dtTeamStats.Columns.Add("Net Pen", GetType(Double))
            dtTeamStats.Columns.Add("Pen Drawn/60", GetType(String))
            dtTeamStats.Columns.Add("Pen Taken/60", GetType(String))
            dtTeamStats.Columns.Add("Net Pen/60", GetType(Double))
            dtTeamStats.Columns.Add("Bench", GetType(Double))
            dtTeamStats.Columns.Add("Minor", GetType(Double))
            dtTeamStats.Columns.Add("Major", GetType(Double))
            dtTeamStats.Columns.Add("Match", GetType(Double))
            dtTeamStats.Columns.Add("Msct", GetType(Double))
            dtTeamStats.Columns.Add("G Msct", GetType(Double))
    
            For Each row1 As DataRow In dtTeamStats.Rows
    
                Dim row2 As DataRow()
                row2 = dtPenalties.Select("Team ='" & row1.Item("Team").ToString & "'")
    
                If row2.Count > 0 Then
                    row1.Item("PIM") = row2(0).Item("PIM")
                    row1.Item("PIM/GP") = row2(0).Item("PIM/GP")
                    row1.Item("Pen Drawn") = row2(0).Item("Pen Drawn")
                    row1.Item("Pen Taken") = row2(0).Item("Pen Taken")
                    row1.Item("Net Pen") = row2(0).Item("Net Pen")
                    row1.Item("Pen Drawn/60") = row2(0).Item("Pen Drawn/60")
                    row1.Item("Pen Taken/60") = row2(0).Item("Pen Taken/60")
                    row1.Item("Net Pen/60") = row2(0).Item("Net Pen/60")
                    row1.Item("Bench") = row2(0).Item("Bench")
                    row1.Item("Minor") = row2(0).Item("Minor")
                    row1.Item("Major") = row2(0).Item("Major")
                    row1.Item("Match") = row2(0).Item("Match")
                    row1.Item("Msct") = row2(0).Item("Msct")
                    row1.Item("G Msct") = row2(0).Item("G Msct")
                End If
    
            Next
    
            ds.Tables.Add(dtTeamStats)
    
        End Sub

  10. #10
    PowerPoster
    Join Date
    Sep 2005
    Location
    Modesto, Ca.
    Posts
    5,288

    Re: [RESOLVED] Trouble using 2 different Excel files (.xlsx) to create a single data

    Code:
    ds.Tables.Add(dtTeamStats)
    I don't know2 what's going on in the rest of your app but if your only using one Datatable then there is no reason to be using a DataSet.

  11. #11

    Thread Starter
    Addicted Member
    Join Date
    Nov 2022
    Location
    Manitoba, Canada
    Posts
    174

    Re: [RESOLVED] Trouble using 2 different Excel files (.xlsx) to create a single data

    Quote Originally Posted by wes4dbt View Post
    Code:
    ds.Tables.Add(dtTeamStats)
    I don't know2 what's going on in the rest of your app but if your only using one Datatable then there is no reason to be using a DataSet.
    Ah! Good call. Thank you again!

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  



Click Here to Expand Forum to Full Width