Results 1 to 28 of 28

Thread: Suggestions for speeding up my program? (RESOLVED)

  1. #1

    Thread Starter
    Registered User
    Join Date
    Jul 2000
    Posts
    52

    Suggestions for speeding up my program? (RESOLVED)

    I have a program that is running extremely slow. I need some suggestions on how to speed it up. While it will probably run somewhat slow because of the number of records, I fear that I may not be using the fastest possible logic/code. The program basically consists of 2 tables in an Access to Informix database link and two pipe delimited text files. The text files have roughly 23000 lines in them a piece and the tables have roughly 30000 records in them a piece. Here is what the program does:

    1. Reads the first record of table 1 in.
    2. Tries to find a matching record via primary key in table 2.
    3. Tries to locate the records primary key in file 1.
    4. If not found in file 1, tries to find it in file 2.
    5. If it isn't in either file, deletes the record from the database.
    6. Moves onto the next record in table 1.

    I use a few for/next loops and I try to break out of any loop whenever I can. It takes roughly 3 seconds per record, and I currently only look in file 1! At 30000 records, that's 25 hours to complete the process. Doubling that by adding in the search for file 2. That's 50 hours. That's a long time!

    I think that the slow part is finding the records primary key in the files. I am reading it in line by line. I am hoping there is some quicker way to search for a string in a text file and just return whether or not it is there.

    Thanks for any help. If requested, I will post the code if it helps. Thanks again!

    shootsnlad
    Last edited by shootsnlad; Dec 10th, 2002 at 11:12 AM.

  2. #2
    Your Ad Here! Edneeis's Avatar
    Join Date
    Feb 2000
    Location
    Moreno Valley, CA (SoCal)
    Posts
    7,339
    What does your code look like right now?

    Are you reading the files in once and then searching through them? What are you reading them into? What is the relation between the keys?

    You could read the data into a dataset and then use Select methods to filter out certain matching rows.

  3. #3

    Thread Starter
    Registered User
    Join Date
    Jul 2000
    Posts
    52

    the code

    Here is the code:

    Code:
            Dim foundcity, foundcounty As Boolean
            Dim textline,newparcel As String
            Dim rowcount, rowcount2, i, z As Integer
            Dim dsparcel As New DataSet()
            Dim dsparcelowner As New DataSet()
            DaParcel.Fill(dsparcel)
            DaParcelOwner.Fill(dsparcelowner)
            Dim dtparcel As DataTable = dsparcel.Tables(0)
            Dim dtparcelowner As DataTable = dsparcelowner.Tables(0)
            rowcount = dtparcelowner.Rows.Count
            rowcount2 = dtparcel.Rows.Count
            For i = 0 To rowcount - 1
                For z = 0 To rowcount2 - 1
                    If dtparcelowner.Rows(i).Item("parcel_no") = dtparcel.Rows(z).Item("parcel-number") Then
                        FileOpen(1, "cn-ci.txt", OpenMode.Binary, OpenAccess.Read, OpenShare.Shared)
                        FileOpen(2, "cn-co.txt", OpenMode.Binary, OpenAccess.Read, OpenShare.Shared)
                        While Not EOF(1)
                            textline = LineInput(1)
                            Dim fieldarray() As String = textline.Split("|")
                            newparcel = Mid(fieldarray(9), 1, 2) + "." + Mid(fieldarray(9), 3, 2) + "." + Mid(fieldarray(9), 5, 3) + "." + Mid(fieldarray(9), 8, 3)
                            If dtparcelowner.Rows(i).Item("parcel_no") = newparcel Then
                                foundcity = True
                                Exit While
                            End If
                        End While
                        While Not EOF(2)
                            textline = LineInput(2)
                            Dim fieldarray() As String = textline.Split("|")
                            newparcel = Mid(fieldarray(9), 1, 2) + "." + Mid(fieldarray(9), 3, 2) + "." + Mid(fieldarray(9), 5, 3) + "." + Mid(fieldarray(9), 8, 3)
                            If dtparcelowner.Rows(i).Item("parcel_no") = fieldarray(9) Then
                               foundcounty = True
                            End If
                        End While
                        If foundcity = True Or foundcounty = True Then
                        Else
                            dtparcelowner.Rows(i).Delete()
                            dtparcelowner.Rows(z).Delete()
                            foundcity = False
                            foundcounty = False
                        End If
                        FileClose(1)
                        FileClose(2)
                        z = rowcount2 - 1
                    End If
                Next z
            Next i

  4. #4
    Your Ad Here! Edneeis's Avatar
    Join Date
    Feb 2000
    Location
    Moreno Valley, CA (SoCal)
    Posts
    7,339
    Opening and reading the text files is probably what is so slow. Try reading these:

    http://www.c-sharpcorner.com/databas...ctODBCText.asp

    http://msdn.microsoft.com/library/de...DataSource.asp

    http://www.aspalliance.com/ericm/articles/textdb.asp

    http://msdn.microsoft.com/library/de...a_ini_file.asp

    By converting the text to a datasource you can run normal queries on it to find a match and that should speed things up dramatically.

  5. #5
    Your Ad Here! Edneeis's Avatar
    Join Date
    Feb 2000
    Location
    Moreno Valley, CA (SoCal)
    Posts
    7,339
    Well I had read but not tested those links so I decided to test them and I couldn't get it to work. But I figured it out, after I made the DSN I downloaded the .NET ODBC provider and then everything worked peachy. MS has the ODBC provider here. After that the ODBC objects are the same as the SQL and OLEDB counterparts and you can fill a new table in your dataset from the textfiles.

  6. #6
    Member
    Join Date
    Sep 2002
    Location
    California
    Posts
    52
    The reason it's so slow is definitely because of the way you're doing file handling. You need to use Steams.

    Here's a tutorial that should help clear things up;
    http://www.elitevb.com/content/01,0072,01/01.aspx

  7. #7

    Thread Starter
    Registered User
    Join Date
    Jul 2000
    Posts
    52

    follow-up

    It looks like I have two options, connecting through ADO.NET to the text file or streams. I have a couple of follow-up questions/problems.

    1. If I want to use ADO.NET to connect to a textfile, I would have to create a schema.ini file. That would be a lot of work. This file has roughly 100 columns in it that vary in size from line to line, and you need to tell ADO.NET exactly what the file looks like. Probably not possible.

    2. If I add a DSN to my machine, I would have to add it to any other machine that this may run on. (Not a big deal, just wanted to verify).

    3. The other option, streams, is something I don't think I still fully understand. I've read through the tutorial and I'm still confused. I believe I can setup the streamreader for what I need, but how do I use it. Do I use it to continously read in the whole file into a variable? How does it work?

    Thanks for the help! I'm getting closer.

    shootsnlad

  8. #8
    Your Ad Here! Edneeis's Avatar
    Join Date
    Feb 2000
    Location
    Moreno Valley, CA (SoCal)
    Posts
    7,339
    Regarding question 1, when you create the DSN it creates the schema.ini for you. Also there is a 'Guess' button when making the DSN that fills in the column info for you. If there is a decent structure to the file it should work fine after you tell it that you are using the custom delimiter | .

    Question 2, I believe you can make the DSN a file DSN and then just include it with your app (and the schema.ini file which goes in the same folder as the text file(s)).

    Question 3 if you play around a bit with streams you should catch on quickly they are fairly use to pick up. Although if you choose the stream method then you'll still be reading through the file on every line of the first table, whereas if you use it as a datasource you'd read it once and then just filter the data to match.

  9. #9

    Thread Starter
    Registered User
    Join Date
    Jul 2000
    Posts
    52
    Well I definately believe I am making progress and I understand what I need to do. The "Guess" button doesn't work for me, for some reason. I put in that it is custom delimited with a | for a delimeter. I check the "Column name header" box and I hit the "Guess" button and get three columns. The first column is the first row of my file (with all the column names in it, separated by a | symbil) and the second and third column is F1 and F2. Without the guess working, I'm in trouble. Any ideas? I've actually tried numerous things with the file, as well as making it just 2 rows (the column names and 1 row of data).

    Thanks again.

    shootsnlad

  10. #10
    Your Ad Here! Edneeis's Avatar
    Join Date
    Feb 2000
    Location
    Moreno Valley, CA (SoCal)
    Posts
    7,339
    Is the first row of the file the field names? Or do you have an example data text file you can post that I can take a crack at?

    It worked for me but I didn't have fieldnames as the first row. Here is what my data looked like:

    Ed|Marquez|15420 Bello|1|54
    Cookie|Marquez|15420 Bello|2|31
    John|Doe|4100 Latham|3|27
    Jane|Doe|4100 Latham|4|84

    And it filled in the right data types on the right columns and named them F1,F2,F3... which I renamed. I think the data would need to be:
    FirstName|LastName|Address|ID|No
    Ed|Marquez|15420 Bello|1|54
    Cookie|Marquez|15420 Bello|2|31
    John|Doe|4100 Latham|3|27
    Jane|Doe|4100 Latham|4|84

    For the Column name header option to work.

  11. #11

    Thread Starter
    Registered User
    Join Date
    Jul 2000
    Posts
    52
    I did a copy and paste of your test data (the one without the column names) into a text file. I then tried to create it using that test file. The same thing is happening that happened with my file. I have attached a JPG of what my screen looks like after I hit the "Guess" button. Any ideas?

    Thanks again for the prompt replies. I appreciate it.

  12. #12
    Your Ad Here! Edneeis's Avatar
    Join Date
    Feb 2000
    Location
    Moreno Valley, CA (SoCal)
    Posts
    7,339
    I dunno here is a screen of what I get:


    What version of windows are you using? Can you send either the text file or a sample of it so I can try it at my end?

  13. #13

    Thread Starter
    Registered User
    Join Date
    Jul 2000
    Posts
    52

    solved!

    I figured it out. The reason why it was doing that was because the file was located on a network drive rather than my local drive. It works on my local drive, but messes up on the network. I will continue on. Hopefully, by following the links you posted I should be able to wrap it up. I'll be back if I have some more trouble. Thanks again!

    shootsnlad

  14. #14

    Thread Starter
    Registered User
    Join Date
    Jul 2000
    Posts
    52

    no luck

    Well, while the idea of ODBCing to a file seemed like a good idea, I have had no luck with it. I created a file DSN to my pipe delimited file. I was able to get the schema.ini created fine. When I am in VB.NET and I go to make a new connection to it, I chose the Microsoft OLE DB Provider for ODBC Drivers, and I then try to build a string. I then select my DSN file that I created I get a message saying (Unable to connect to database. It is only possible to connect to SQL Server Desktop Engine databases and Microsoft Access databases with this version of Visual Studio)

    Any ideas?


    Thanks...

    shootsnlad

  15. #15
    Your Ad Here! Edneeis's Avatar
    Join Date
    Feb 2000
    Location
    Moreno Valley, CA (SoCal)
    Posts
    7,339
    You can't use the OLEDB data provider you have to download the .NEt ODBC Data Provider (which works the same way as OLEDB). I had the same problem at first, I guess in the Beta you could use the OLEDB drivers to connect to a DSN but now you need the ODBC ones.

    http://msdn.microsoft.com/downloads/...mpositedoc.xml

  16. #16

    Thread Starter
    Registered User
    Join Date
    Jul 2000
    Posts
    52
    I have the ODBC .NET data provider installed. I also added the ODBCConnection and such icons onto my toolbar. Now, what do I need to do? Usually I can generate a connection string to be able to connect to the file, but I can't.

    shootsnlad
    Last edited by shootsnlad; Dec 5th, 2002 at 03:38 PM.

  17. #17
    Your Ad Here! Edneeis's Avatar
    Join Date
    Feb 2000
    Location
    Moreno Valley, CA (SoCal)
    Posts
    7,339
    Its not under the system.data namespace its under the microsoft.data namespace, microsoft.data.odbc to be exact. That threw me off too.

  18. #18

    Thread Starter
    Registered User
    Join Date
    Jul 2000
    Posts
    52
    See above edited post.. (your too quick) :-)

  19. #19
    Your Ad Here! Edneeis's Avatar
    Join Date
    Feb 2000
    Location
    Moreno Valley, CA (SoCal)
    Posts
    7,339
    The connection string is just:
    "DSN=NameOfDSNHere"

    One other thing that is wierd instead of a table name you use the filename, including extension.

    "SELECT * FROM pipes.txt"

  20. #20
    Your Ad Here! Edneeis's Avatar
    Join Date
    Feb 2000
    Location
    Moreno Valley, CA (SoCal)
    Posts
    7,339
    This is the code from my test:
    VB Code:
    1. Dim cnn As New OdbcConnection("DSN=PipeTest")
    2.         Dim adp As New OdbcDataAdapter("SELECT * FROM pipes.txt", cnn)
    3.         Dim ds As New DataSet()
    4.         cnn.Open()
    5.  
    6.         adp.Fill(ds, "Test")
    7.         cnn.Close()
    8.  
    9.         TextBox1.Text = ds.Tables("Test").Rows(0).Item("FirstName")
    10.         Dim dr() As DataRow = ds.Tables("Test").Select(TextBox2.Text)
    11.         DataGrid1.DataSource = dr

    I used TextBox1 just to make sure that everything loaded fine.
    TextBox2 is a Filter string that I enter at runtime the results of which then are shown in the datagrid. An example would be: "FirstName Like e*"

  21. #21

    Thread Starter
    Registered User
    Join Date
    Jul 2000
    Posts
    52
    1 question. Where do you get "test" from? That is referencing a table name. But in a file, there really is no table name. Mine is erroring out on that statement, with a "system error". Thanks!

    shootsnlad

    PS: The adp.fill line is the one I'm referrring to.

  22. #22
    Your Ad Here! Edneeis's Avatar
    Join Date
    Feb 2000
    Location
    Moreno Valley, CA (SoCal)
    Posts
    7,339
    Actually it is more like a key for the table in the dataset's table collection. It doesn't have to be an actual table name.

    What is the error you are getting? Or what is more of the StackTrace?

  23. #23

    Thread Starter
    Registered User
    Join Date
    Jul 2000
    Posts
    52
    Here is the error on I am getting on that line.

    An unhandled exception of type 'Microsoft.Data.Odbc.OdbcException' occurred in microsoft.data.odbc.dll

    Additional information: System error.

  24. #24
    Your Ad Here! Edneeis's Avatar
    Join Date
    Feb 2000
    Location
    Moreno Valley, CA (SoCal)
    Posts
    7,339
    I'm not sure can you post the code? Worked fine for me, I can post the sample project if you'd like, just let me know.

  25. #25

    Thread Starter
    Registered User
    Join Date
    Jul 2000
    Posts
    52
    Here is the code:

    Code:
            Dim citycon As New OdbcConnection("DSN=City")
            Dim cityadp As New OdbcDataAdapter("SELECT * FROM CN-CI.txt", citycon)
            Dim cityds As New DataSet()
            citycon.Open()
            cityadp.Fill(cityds)
            citycon.Close()
    
            Dim foundcity, foundcounty As Boolean
            Dim rowcount, rowcount2, i, z As Integer
            Dim dsparcel As New DataSet()
            Dim dsparcelowner As New DataSet()
            DaParcel.Fill(dsparcel)
            DaParcelOwner.Fill(dsparcelowner)
            Dim dtparcel As DataTable = dsparcel.Tables(0)
            Dim dtparcelowner As DataTable = dsparcelowner.Tables(0)
            rowcount = dtparcelowner.Rows.Count
            rowcount2 = dtparcel.Rows.Count
            For i = 0 To rowcount - 1
                For z = 0 To rowcount2 - 1
                    If dtparcelowner.Rows(i).Item("parcel_no") = dtparcel.Rows(z).Item("parcel-number") Then
                        Dim tablecmd As String
                        Dim cmdresults As Integer
                        Dim cmd As New Microsoft.Data.Odbc.OdbcCommand()
                        tablecmd = "SELECT * FROM cn-ci.txt WHERE PARCEL = " + dtparcelowner.Rows(i).Item("parcel_no")
                        cmd.CommandText = tablecmd
                        citycon.Open()
                        cmdresults = cmd.ExecuteNonQuery()
                        citycon.Close()
                        MessageBox.Show("After creating the table, results = " & cmdresults.ToString)
                        If foundcity = True Or foundcounty = True Then
                        Else
                            dtparcelowner.Rows(i).Delete()
                            dtparcelowner.Rows(z).Delete()
                            foundcity = False
                            foundcounty = False
                        End If
                        z = rowcount2 - 1
                    End If
                Next z
            Next i
    There is some logic I still need to put in there, but the code is there. It errors out on the cityadp.fill line (#5). I have attached a bit of my file to see if you can get it working. Thanks again!

  26. #26
    Your Ad Here! Edneeis's Avatar
    Join Date
    Feb 2000
    Location
    Moreno Valley, CA (SoCal)
    Posts
    7,339
    Ok I found the trouble, its the - in the filename which becomes the table name. For some reason it doesn't like it, but if you take it out then all is well. So just change the name of the file to: CNCI.txt and make that change in either the DSN or the schema file (which is what changing it in the DSN will do) and then everything works fine. Of course I don't have the parcel data so I couldn't test your code but I got the same error until I made those changes and after that I could query the text file.

  27. #27
    Your Ad Here! Edneeis's Avatar
    Join Date
    Feb 2000
    Location
    Moreno Valley, CA (SoCal)
    Posts
    7,339
    Also on the other use of it I think this may be closer to what you are trying to do:

    VB Code:
    1. Dim cmd As New OdbcCommand("SELECT Count(*) FROM cnci.txt WHERE PARCEL = " & TextBox2.Text, cnn)
    2.         Dim result As Integer = cmd.ExecuteScalar
    3.         MsgBox("Record(s) found " & result)
    4.         cnn.Close()

    Although since you already have all the data in a dataset its about 4 times faster to filter the dataset instead of running a new query.

    VB Code:
    1. Dim result() as Datarow=cityds.Tables(0).Select("PARCEL = " & TextBox2.Text, cnn)
    2.         MsgBox("Record(s) found " & result.Length)

    Running a new query took .4 of a second but running a filter on the dataset took .14 of a second. Which I also left the connection open from filling the dataset and normally you'd have to reopen it which will take longer, but that is not needed with the filter.

  28. #28

    Thread Starter
    Registered User
    Join Date
    Jul 2000
    Posts
    52
    Well, everything appears to be running as it should (and A LOT faster I might add). Now it's actually runable. I think that should take care of everything. I'll let you know if I have anymore problems, but everything looks good so far. Thanks again for all your help! I appreciate it.

    shootsnlad

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  



Click Here to Expand Forum to Full Width