Thread: Quickly loading data into a program

  1. #1

    Thread Starter
    PowerPoster Jenner
    Join Date
    Jan 2008
    Location
    Mentor, OH
    Posts
    3,712

    Quickly loading data into a program

    Here's my situation. I have about 12.3 MB of data - at least, that's how large the .xml file is when I write it with DataSet.WriteXml() from another program.
    It takes about 1 second to write that 12.3 MB.
    I can transfer the file across the network in the blink of an eye.

    I want to read it into another program.
    The problem is, it takes about 1 min, 10 sec using DataSet.ReadXml(), which in my opinion is nuts.
    It takes 58 sec just to read it into a string via StreamReader.ReadToEnd().

    There's gotta be a better, faster way to do this. Am I missing something? I'm reading off a network location, but I've also tried File.Copy() to the local Temp directory and reading from there with little success.

    Anyone have any suggestions?
    Thanks!
    My CodeBank Submissions: TETRIS using VB.NET2010 and XNA4.0, Strong Encryption Class, Hardware ID Information Class, Generic .NET Data Provider Class, Lambda Function Example, Lat/Long to UTM Conversion Class, Audio Class using BASS.DLL

    Remember to RATE the people who helped you and mark your forum RESOLVED when you're done!

    "Two things are infinite: the universe and human stupidity; and I'm not sure about the universe. "
    - Albert Einstein

  2. #2
    Super Moderator jmcilhinney
    Join Date
    May 2005
    Location
    Sydney, Australia
    Posts
    110,299

    Re: Quickly loading data into a program

    How many records are we talking about? Most likely you should be using a database. An XML file is all well and good for storing small amounts of data or moving large amounts of data between systems but it was never intended for storing large amounts of data. Remember that reading structured data like that involves lots of validation and that's something that slows it down a lot.

  3. #3

    Thread Starter
    PowerPoster Jenner
    Join Date
    Jan 2008
    Location
    Mentor, OH
    Posts
    3,712

    Re: Quickly loading data into a program

    About 150,000 records, best I can tell from the data file. I know, I should be using a database, and the data is nicely stored in a database, but there are other circumstances involved (as there usually are in the world of programming).

    I have about 30 clients running this program. Some are on the opposite end of a site-to-site VPN from the database (which is for our ERP system, so it's under constant use). When all 30 are banging away on the database every 5-10 minutes looking for the exact same data updates, the performance of the whole ERP starts suffering a little bit. Not enough on its own to justify what I'm trying to do, but enough to be noticeable. The real reason is the VPN. On the database side, pulling those 150,000 records from the database takes like 3 seconds and it's done. On the far end of the VPN, it takes 5 friggin minutes.

    So... I decided to write a Windows Service that periodically pulls the data from the database and stores it on fileshares on both ends of the VPN tunnel, so all the clients can pull the data from the fileshares. I only have one client hitting my database for data now, so my database traffic is reduced. All the clients try for the data files first, and only if the files aren't present or are more than an hour old (meaning my service is knocked out) do they go directly to the database. Plus, with the file sitting on a local share at each end of the VPN tunnel, it should be a fast retrieval from anywhere on the network. All I need to do is hit up AD Sites and Services to find which end of the network the client is on, and therefore which fileshare to look for the data on.
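
    For anyone following along, the "use the file unless it's missing or stale" rule can be sketched roughly like this. It's in Python rather than .NET just to keep it short, and it isn't the actual client code: `is_fresh`, `load_records`, and the `read_file` / `query_database` callables are hypothetical names made up for the illustration. The one-hour cutoff is the only detail taken from the description above.

```python
import os
import time

MAX_AGE_SECONDS = 60 * 60  # "more than an hour old" from the post


def is_fresh(path, max_age=MAX_AGE_SECONDS, now=None):
    """Return True if the file exists and was modified within max_age seconds."""
    now = time.time() if now is None else now
    try:
        mtime = os.path.getmtime(path)
    except OSError:
        # Missing file or unreachable share: treat it as not fresh.
        return False
    return (now - mtime) <= max_age


def load_records(share_path, read_file, query_database):
    """Prefer the cached file on the local share; fall back to the database.

    read_file and query_database are hypothetical callables supplied by the
    caller; they stand in for the real file parser and the real ERP query.
    """
    if is_fresh(share_path):
        try:
            return read_file(share_path)
        except OSError:
            pass  # file vanished or is mid-rewrite; fall through to the database
    return query_database()
```

    Checking the file's modification time (instead of a separate "service is alive" flag) means a dead service shows up automatically as a stale file, with no extra coordination between the service and the clients.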

    Thus my current problem. I went from the A and B sides of the VPN taking 3 seconds and 5 minutes, to 1 min 20 sec on both sides. Hooray for our California office, but now the guys at HQ are complaining that program startup is taking too long.


    Ok, update:
    Using DataTable.BeginLoadData / .EndLoadData got me from 1 min 20 sec down to about 45 sec.
    I'm messing around with a MemoryMappedFile now to do partial data loads. I have a simple "best guess" function that looks at the now-alphabetized data: I split the data file into 5 overlapping regions (first third, second third, last third, and two overlapping chunks in the middle) and, based on what I'm looking for, point my MemoryMappedViewStream at one of them for a potentially faster lookup. I've gotten it down to 9 seconds now. Meanwhile, on a background thread, the whole dataset takes its 45 sec to load in, and once it finishes, it fills a DataSet and disables the MemoryMappedFile guesswork routine. Likewise, if the guess routine can't find matching data (it guessed wrong), it falls back to waiting for the whole chunk of data to load in before checking it. It "should" always find a record. It's scanned in from a system-generated barcode, after all.
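
    A rough sketch of the guess-then-fall-back idea, for illustration only. It's Python over an in-memory sorted list, not the MemoryMappedViewStream code described above. The window boundaries and the "guess from the first letter" heuristic are assumptions (the post doesn't say how the guess is made), and `REGIONS`, `guess_region`, and `lookup` are hypothetical names.

```python
import bisect

# Five overlapping windows, as fractions of the sorted data: the three thirds
# plus two windows straddling the third boundaries. The exact split is assumed.
REGIONS = [(0.0, 1 / 3), (1 / 6, 1 / 2), (1 / 3, 2 / 3), (1 / 2, 5 / 6), (2 / 3, 1.0)]


def guess_region(key):
    """Pick the window whose midpoint is closest to the key's estimated position.

    Assumption: keys are spread roughly evenly over A-Z by first letter.
    """
    first = key[:1].upper()
    if "A" <= first <= "Z":
        pos = (ord(first) - ord("A") + 0.5) / 26
    else:
        pos = 0.0  # anything else is assumed to sort near the front
    return min(REGIONS, key=lambda r: abs((r[0] + r[1]) / 2 - pos))


def lookup(sorted_keys, key):
    """Return (index, where): search the guessed window first, then everything."""
    lo_frac, hi_frac = guess_region(key)
    n = len(sorted_keys)
    lo, hi = int(lo_frac * n), int(hi_frac * n)
    i = bisect.bisect_left(sorted_keys, key, lo, hi)
    if i < hi and sorted_keys[i] == key:
        return i, "window"
    # Guess missed: fall back to the full data set (in the post, this is the
    # point where the client waits for the background load to finish).
    i = bisect.bisect_left(sorted_keys, key)
    if i < n and sorted_keys[i] == key:
        return i, "full"
    return None, "missing"
```

    The overlapping windows are there so a key near a third boundary still lands comfortably inside some window. With badly skewed data the guess simply misses more often and the full-load fallback picks up the slack.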
