Results 1 to 22 of 22

Thread: A faster way to insert a large number of records on a remote SQL Server

  1. #1

    Thread Starter
    Frenzied Member cory_jackson's Avatar
    Join Date
    Dec 2011
    Location
    Fallbrook, California
    Posts
    1,104

    A faster way to insert a large number of records on a remote SQL Server

    I am inserting on the order of 100k rows to a remote SQL server. Each row is a couple dozen columns but I am only sending an small string account number from a text file to one column using SQLCommand only. IE No ORM or DataTables. I created a function to do this locally with ExecuteNonQuery and it was acceptable. But in testing on the remote server it managed only 6.7 rows per second which is unacceptably slow. The user could literally be there all day in some cases. Can you suggest a better method or alternative ideas?

    I was considering BulkInsert as it seems the smartest but at this point I don't know if I will be able to access the remote file system. If I do it will have to be via FTP and even if the current server allows FTP access I'm not sure the next one will. But it's possible I don't fully understand how this works.

    Is there any way to send all of the data to the remote system and parse it there?

    I was looking at ExecuteAsync. I've never done anything like this before so I don't know if it's viable and I will have to invest significant time educating myself before I can make that decision. But if someone can tell me this will solve my problem I'll gladly bone up on it.

    A final note: My SQL command contains and "IF EXISTS" to prevent inserting a duplicate account number. Maybe this is a bad idea? It seemed like a good idea to let the SQL Server make the decision. I could instead get a list of account numbers first and make the decision on my end before sending the command. Under normal circumstances there should not be any rows in the table that match a criteria which defines this set.

  2. #2
    MS SQL Powerposter szlamany's Avatar
    Join Date
    Mar 2004
    Location
    Connecticut
    Posts
    18,263

    Re: A faster way to insert a large number of records on a remote SQL Server

    You should load the simplest table possible.

    Either a temp table or a staging table.

    If you want to go to an extreme that table can have just two columns - an IDENTITY value to keep rows in sequence and a large single field for the text. Or break down this single text field into a series of simple varchar(100) type fields.

    Point is to get the database engine to load the rows with as little work as possible.

    Once you have it all on the server you can work that temp table or staging table into your real tables with just a SINGLE INSERT type of statement.

    *** Read the sticky in the DB forum about how to get your question answered quickly!! ***

    Please remember to rate posts! Rate any post you find helpful - even in old threads! Use the link to the left - "Rate this Post".

    Some Informative Links:
    [ SQL Rules to Live By ] [ Reserved SQL keywords ] [ When to use INDEX HINTS! ] [ Passing Multi-item Parameters to STORED PROCEDURES ]
    [ Solution to non-domain Windows Authentication ] [ Crazy things we do to shrink log files ] [ SQL 2005 Features ] [ Loading Pictures from DB ]

    MS MVP 2006, 2007, 2008

  3. #3

    Thread Starter
    Frenzied Member cory_jackson's Avatar
    Join Date
    Dec 2011
    Location
    Fallbrook, California
    Posts
    1,104

    Re: A faster way to insert a large number of records on a remote SQL Server

    Update: I realized it was simple to test the conditional SQLCommand. I found it was only 5% faster without the condition.

  4. #4
    MS SQL Powerposter szlamany's Avatar
    Join Date
    Mar 2004
    Location
    Connecticut
    Posts
    18,263

    Re: A faster way to insert a large number of records on a remote SQL Server

    Quote Originally Posted by cory_jackson View Post
    Update: I realized it was simple to test the conditional SQLCommand. I found it was only 5% faster without the condition.
    Does this mean you have dealt with your issue? We seem to have cross-posted here...

    *** Read the sticky in the DB forum about how to get your question answered quickly!! ***

    Please remember to rate posts! Rate any post you find helpful - even in old threads! Use the link to the left - "Rate this Post".

    Some Informative Links:
    [ SQL Rules to Live By ] [ Reserved SQL keywords ] [ When to use INDEX HINTS! ] [ Passing Multi-item Parameters to STORED PROCEDURES ]
    [ Solution to non-domain Windows Authentication ] [ Crazy things we do to shrink log files ] [ SQL 2005 Features ] [ Loading Pictures from DB ]

    MS MVP 2006, 2007, 2008

  5. #5

    Thread Starter
    Frenzied Member cory_jackson's Avatar
    Join Date
    Dec 2011
    Location
    Fallbrook, California
    Posts
    1,104

    Re: A faster way to insert a large number of records on a remote SQL Server

    No. Forget I mentioned the conditional SQL command. It makes no difference and I'm still faced with and incredibly long process.

    I don't understand your suggestion szlamany. I'm not loading any tables. I have a text file with something like 100k rows each containing a single account number. I am simply looping thought a List of String and using a parameterized SQLCommand. No ADO tables or anything like that.

    INSERT INTO Accounts (AccNum, CountyID)
    VALUES (@AccNum, @CountyID)

    Each time in the loop I set the value of the AccNum parameter and ExecuteNonQuery.

  6. #6
    MS SQL Powerposter szlamany's Avatar
    Join Date
    Mar 2004
    Location
    Connecticut
    Posts
    18,263

    Re: A faster way to insert a large number of records on a remote SQL Server

    I'm suggesting that if you are loading this into a table with an index or two, for example - that this would be a slower process when compared to loading it into a temp table.

    And then that the speed of running the following Insert should be quite fast.

    Insert into RealTable Select * From #TempTable

    *** Read the sticky in the DB forum about how to get your question answered quickly!! ***

    Please remember to rate posts! Rate any post you find helpful - even in old threads! Use the link to the left - "Rate this Post".

    Some Informative Links:
    [ SQL Rules to Live By ] [ Reserved SQL keywords ] [ When to use INDEX HINTS! ] [ Passing Multi-item Parameters to STORED PROCEDURES ]
    [ Solution to non-domain Windows Authentication ] [ Crazy things we do to shrink log files ] [ SQL 2005 Features ] [ Loading Pictures from DB ]

    MS MVP 2006, 2007, 2008

  7. #7
    Smooth Moperator techgnome's Avatar
    Join Date
    May 2002
    Posts
    34,532

    Re: A faster way to insert a large number of records on a remote SQL Server

    This is how I deal with these situations (oddly, I'm elbow deep in developing just such a mass importer for our system. Kismet)...
    The short of it is SQL Bulk Copy is the best way to go... there's two ways to do it:
    1) Straight up SQL... it's fast, 100% of all the processing can be done on the server, and it goes all in one big sluuuuurp. Down side, the file being imported MUST be on the server, in a location where the server can get to it.
    2) ADO.NET contains a SQLBulkCopy class in the SQLClient namespace. This is what I've been using. I open the file using a reader (the actual type varies depending on the type of file) and then I read the data into a DataTable. Then I open a connection to the server, create an instance of the SBC, give it the name of the destination table, create mappings if needed, and then feed it the DataTable... some of the processing is on the client, mostly the reading of the file and getting it into a datatable, but then once it is passed off, it goes really quick.

    Now, this is where the advise of the others come in... when using SQLBulkCopy, you should stuff it into a staging table. From there you can scrub the data and validate it... once it's cleaned THEN you move it to the production tables. At that point you can do what ever you want with the staged data. We hold ours for historical auditing. The user does have the option to clear it out from time to time if they wish.

    So to recap, the steps are typically:
    1) use BCP (Bulk Copy Process) to suck the data out of the file and into a staging table of some kind. This can be done via pure SQL, or through ADO.NET
    2) validate the data in the staging table
    3) move the valid data from the staging table into it's final location in production tables

    -tg
    * I don't respond to private (PM) requests for help. It's not conducive to the general learning of others.*
    * I also don't respond to friend requests. Save a few bits and don't bother. I'll just end up rejecting anyways.*
    * How to get EFFECTIVE help: The Hitchhiker's Guide to Getting Help at VBF - Removing eels from your hovercraft *
    * How to Use Parameters * Create Disconnected ADO Recordset Clones * Set your VB6 ActiveX Compatibility * Get rid of those pesky VB Line Numbers * I swear I saved my data, where'd it run off to??? *

  8. #8
    MS SQL Powerposter szlamany's Avatar
    Join Date
    Mar 2004
    Location
    Connecticut
    Posts
    18,263

    Re: A faster way to insert a large number of records on a remote SQL Server

    I've always used Bulk Insert myself when presented with a large file - OP said that might not be possible.

    I've never used SQLBulkCopy - how does it compare to a Bulk Insert?

    The one time I can recall doing "row-by-row" inserts from a text file were back in VB6 days and that was slow - just lived with it since it was a batch schedule type task.

    Quote Originally Posted by cory_jackson View Post
    A final note: My SQL command contains and "IF EXISTS" to prevent inserting a duplicate account number. Maybe this is a bad idea? It seemed like a good idea to let the SQL Server make the decision. I could instead get a list of account numbers first and make the decision on my end before sending the command. Under normal circumstances there should not be any rows in the table that match a criteria which defines this set.
    If you use a STAGING table then after you load that table you would first DELETE any rows with EXISTING account #'s.

    That would make sure the data set is clean.

    Then a quick one line SQL to insert this into production.
    Last edited by szlamany; Feb 24th, 2016 at 04:04 PM.

    *** Read the sticky in the DB forum about how to get your question answered quickly!! ***

    Please remember to rate posts! Rate any post you find helpful - even in old threads! Use the link to the left - "Rate this Post".

    Some Informative Links:
    [ SQL Rules to Live By ] [ Reserved SQL keywords ] [ When to use INDEX HINTS! ] [ Passing Multi-item Parameters to STORED PROCEDURES ]
    [ Solution to non-domain Windows Authentication ] [ Crazy things we do to shrink log files ] [ SQL 2005 Features ] [ Loading Pictures from DB ]

    MS MVP 2006, 2007, 2008

  9. #9
    Smooth Moperator techgnome's Avatar
    Join Date
    May 2002
    Posts
    34,532

    Re: A faster way to insert a large number of records on a remote SQL Server

    Quote Originally Posted by szlamany View Post
    I've always used Bulk Insert myself when presented with a large file - OP said that might not be possible.

    I've never used SQLBulkCopy - how does it compare to a Bulk Insert?
    Obviously since there is some processing on the client, and the data has to be sent to the server... it's not as quick as doing a pure BCP ... But it is definitely far superior to doing row by row inserts. Any time I do file imports now, I always reach for the SQLBulkCopy ... I don't even consider anything else. The more I work with it, the more about its capabilies I learn - for instance the mapping... until I learned about that a couple months ago, I thought the client datatable had to mirror EXACTLY what the destination table looked like... turns out that's not the case and you can setup mapping so that you can skip cols if you need to, or as in the case specifically I had to setup, the source table had two cols with the same header name, but their stating table had different names... so with the mapping, it was possible to map "Col1" to "Column1" and the other "Col1" to "Column2" ... it also gives flexibility in the naming of the fields to comply with out platform SDK rules, while still retaining the flexibility of hte format for the files (which often come to our clients from other vendors who charge every time you request a change to their output.

    I've been processing some sample files with approx 1000 records in them, 12 fields... it imports in about 3 seconds, that includes opening the file, building the temp datatable, sucking the data in, minor processing and then sending it off to the SQLBulkCopy. We have other bottlenecks in the system unrelated to the BCP so I don't usually pay attention to it much... I'm going to be running some tests later tonight, I can give some numbers on the data... I don't have a ready made BCP I can compare it to, so I'm not sure it'll be a whole lot useful, but now I'm curious about the performance of it, so I'll crank out some ridiculously large files for it to process and see what happens.

    -tg
    * I don't respond to private (PM) requests for help. It's not conducive to the general learning of others.*
    * I also don't respond to friend requests. Save a few bits and don't bother. I'll just end up rejecting anyways.*
    * How to get EFFECTIVE help: The Hitchhiker's Guide to Getting Help at VBF - Removing eels from your hovercraft *
    * How to Use Parameters * Create Disconnected ADO Recordset Clones * Set your VB6 ActiveX Compatibility * Get rid of those pesky VB Line Numbers * I swear I saved my data, where'd it run off to??? *

  10. #10
    MS SQL Powerposter szlamany's Avatar
    Join Date
    Mar 2004
    Location
    Connecticut
    Posts
    18,263

    Re: A faster way to insert a large number of records on a remote SQL Server

    Thanks for the info and whatever you might find out - much appreciated.

    *** Read the sticky in the DB forum about how to get your question answered quickly!! ***

    Please remember to rate posts! Rate any post you find helpful - even in old threads! Use the link to the left - "Rate this Post".

    Some Informative Links:
    [ SQL Rules to Live By ] [ Reserved SQL keywords ] [ When to use INDEX HINTS! ] [ Passing Multi-item Parameters to STORED PROCEDURES ]
    [ Solution to non-domain Windows Authentication ] [ Crazy things we do to shrink log files ] [ SQL 2005 Features ] [ Loading Pictures from DB ]

    MS MVP 2006, 2007, 2008

  11. #11

    Thread Starter
    Frenzied Member cory_jackson's Avatar
    Join Date
    Dec 2011
    Location
    Fallbrook, California
    Posts
    1,104

    Re: A faster way to insert a large number of records on a remote SQL Server

    Very interesting I'm going to get right on this. I'm stoked because I really didn't want to get into Async. I think I've seen the SQLBulkCopy but thought it was the same as SQLBulkInsert. Also interesting about using a temp table. All good advice. I'm on it. But I might have some questions.

  12. #12
    MS SQL Powerposter szlamany's Avatar
    Join Date
    Mar 2004
    Location
    Connecticut
    Posts
    18,263

    Re: A faster way to insert a large number of records on a remote SQL Server

    btw - if you were thinking of using ASYNC to send lots of inserts at the same time you would have discovered that was not going to change anything.

    The bottleneck of your method is the server being able to accept single insert's - and many, many of them.

    *** Read the sticky in the DB forum about how to get your question answered quickly!! ***

    Please remember to rate posts! Rate any post you find helpful - even in old threads! Use the link to the left - "Rate this Post".

    Some Informative Links:
    [ SQL Rules to Live By ] [ Reserved SQL keywords ] [ When to use INDEX HINTS! ] [ Passing Multi-item Parameters to STORED PROCEDURES ]
    [ Solution to non-domain Windows Authentication ] [ Crazy things we do to shrink log files ] [ SQL 2005 Features ] [ Loading Pictures from DB ]

    MS MVP 2006, 2007, 2008

  13. #13

    Thread Starter
    Frenzied Member cory_jackson's Avatar
    Join Date
    Dec 2011
    Location
    Fallbrook, California
    Posts
    1,104

    Re: A faster way to insert a large number of records on a remote SQL Server

    Do you have a SQL command handy that executes from my TempTable to the RealTable conditionally? IE when it doesn't already exist to prevent duplicates? I'm sure I can work it out on my own but I'm slow at SQL command and it might take me a while.

  14. #14

    Thread Starter
    Frenzied Member cory_jackson's Avatar
    Join Date
    Dec 2011
    Location
    Fallbrook, California
    Posts
    1,104

    Re: A faster way to insert a large number of records on a remote SQL Server

    Quote Originally Posted by szlamany View Post
    btw - if you were thinking of using ASYNC to send lots of inserts at the same time you would have discovered that was not going to change anything.

    The bottleneck of your method is the server being able to accept single insert's - and many, many of them.
    I was wondering about that. Thank you very much. Like I said in my post I wanted to know if it was worth learning.

  15. #15
    MS SQL Powerposter szlamany's Avatar
    Join Date
    Mar 2004
    Location
    Connecticut
    Posts
    18,263

    Re: A faster way to insert a large number of records on a remote SQL Server

    Quote Originally Posted by cory_jackson View Post
    Do you have a SQL command handy that executes from my TempTable to the RealTable conditionally? IE when it doesn't already exist to prevent duplicates? I'm sure I can work it out on my own but I'm slow at SQL command and it might take me a while.
    I'm not familiar with MySQL enough to write that query.

    Regardless I would prefer to FIRST delete the rows from the TEMP TABLE that DO EXIST in the production table (since that is the smallest population of rows reaching out to an index in a production table to see if they need to be deleted).

    Quote Originally Posted by cory_jackson View Post
    I was wondering about that. Thank you very much. Like I said in my post I wanted to know if it was worth learning.
    I actually had a situation where I was receiving ASYNC tasks to load things into MS SQL. I was seeing first hand the bottleneck - solution was to create a FIRST IN-FIRST OUT type of queue running ASYNC "single tasks" to process the SQL. Basically making it a single push of SQL from all those parent tasks.

    *** Read the sticky in the DB forum about how to get your question answered quickly!! ***

    Please remember to rate posts! Rate any post you find helpful - even in old threads! Use the link to the left - "Rate this Post".

    Some Informative Links:
    [ SQL Rules to Live By ] [ Reserved SQL keywords ] [ When to use INDEX HINTS! ] [ Passing Multi-item Parameters to STORED PROCEDURES ]
    [ Solution to non-domain Windows Authentication ] [ Crazy things we do to shrink log files ] [ SQL 2005 Features ] [ Loading Pictures from DB ]

    MS MVP 2006, 2007, 2008

  16. #16

    Thread Starter
    Frenzied Member cory_jackson's Avatar
    Join Date
    Dec 2011
    Location
    Fallbrook, California
    Posts
    1,104

    Re: A faster way to insert a large number of records on a remote SQL Server

    Thanks. FTR I'm using MS SQL, not MySQL. THough I'm guessing it would be similar.

    I'm also seriously considering not using a condition. I'm reviewing the requirements and finding that I didn't have a full understanding of how the data was used after the initial batch import. As I see it now it should be permissible to delete them all. In actuality these will be handled in sets defined by a foreign key ID. So instead of checking ot see if there are any that match the FK and the account number I can just delete everything matching the foreign key.

  17. #17
    Smooth Moperator techgnome's Avatar
    Join Date
    May 2002
    Posts
    34,532

    Re: A faster way to insert a large number of records on a remote SQL Server

    Quote Originally Posted by cory_jackson View Post
    Do you have a SQL command handy that executes from my TempTable to the RealTable conditionally? IE when it doesn't already exist to prevent duplicates? I'm sure I can work it out on my own but I'm slow at SQL command and it might take me a while.
    That's super easy... it's an update:
    Code:
    UPDATE P
    set Field1 = S.Field1, Field2 = S.Field2 ...
    from ProdTable P
    inner join StagingTable S on P.KeyField = S.KeyField
    where S.Status = 1 -- 1 being good to go and 0 being not (this is set by the validation routine that scrubbed the data)
    Inserting new records is a left join:
    Code:
    Insert into ProdTable (KeyField, Field1, Field2, ...)
    Select KeyField, Field1, Field2 ...
    from StagingTable S
    left join ProdTable P on S.KeyField = P.KeyField
    where S.Stauts = 1 and P.KeyField is null
    And if you want to get real fancy, you can build an upsert (using the MERGE keyword) that does it in one go... sadly I'm not equipped at the moment mentally to do the gymnastics on that command, it's not something I do often.

    I usually do these moves within a stored proc, and more often than not, they're much more complicated than that.. but you get the idea.

    -tg
    * I don't respond to private (PM) requests for help. It's not conducive to the general learning of others.*
    * I also don't respond to friend requests. Save a few bits and don't bother. I'll just end up rejecting anyways.*
    * How to get EFFECTIVE help: The Hitchhiker's Guide to Getting Help at VBF - Removing eels from your hovercraft *
    * How to Use Parameters * Create Disconnected ADO Recordset Clones * Set your VB6 ActiveX Compatibility * Get rid of those pesky VB Line Numbers * I swear I saved my data, where'd it run off to??? *

  18. #18

    Thread Starter
    Frenzied Member cory_jackson's Avatar
    Join Date
    Dec 2011
    Location
    Fallbrook, California
    Posts
    1,104

    Re: A faster way to insert a large number of records on a remote SQL Server

    So the ADO BulkCopy will essentially copy the entire source table A in one transaction to a new table B at the destination? IE as a single object. I'm just trying to get a concept.

  19. #19

    Thread Starter
    Frenzied Member cory_jackson's Avatar
    Join Date
    Dec 2011
    Location
    Fallbrook, California
    Posts
    1,104

    Re: A faster way to insert a large number of records on a remote SQL Server

    Wow. I really need to learn more SQL.

  20. #20
    Smooth Moperator techgnome's Avatar
    Join Date
    May 2002
    Posts
    34,532

    Re: A faster way to insert a large number of records on a remote SQL Server

    welll... it's not a transaction perse... at least not a transaction in the traditional database sense of all or nothing... what it does is a highspeed insert of the data bypassing the log files... normally when you do an insert, or even an update, it doesn't actually go into the table... it gets swept through the log files first... or a record of that transaction gets put into the log files... it's what allows the database to help recover itself after a fault.

    BCP bypasses that... and inserts right into the table... I think it also will bypass triggers, indexes and other elements on the table - which is one reason to only BCP into a staging table - the staging table should be index-less, trigger-less and so on... I don't even put a PKey on my staging tables. Anyways, it's that bypassing that makes the BCP run so likety split.

    -tg
    * I don't respond to private (PM) requests for help. It's not conducive to the general learning of others.*
    * I also don't respond to friend requests. Save a few bits and don't bother. I'll just end up rejecting anyways.*
    * How to get EFFECTIVE help: The Hitchhiker's Guide to Getting Help at VBF - Removing eels from your hovercraft *
    * How to Use Parameters * Create Disconnected ADO Recordset Clones * Set your VB6 ActiveX Compatibility * Get rid of those pesky VB Line Numbers * I swear I saved my data, where'd it run off to??? *

  21. #21

    Thread Starter
    Frenzied Member cory_jackson's Avatar
    Join Date
    Dec 2011
    Location
    Fallbrook, California
    Posts
    1,104

    Re: A faster way to insert a large number of records on a remote SQL Server

    That makes perfect sense. Thank you.

  22. #22
    MS SQL Powerposter szlamany's Avatar
    Join Date
    Mar 2004
    Location
    Connecticut
    Posts
    18,263

    Re: A faster way to insert a large number of records on a remote SQL Server

    Quote Originally Posted by cory_jackson View Post
    A final note: My SQL command contains and "IF EXISTS"
    I read "My SQL" to mean "MySQL"

    Oops.

    But TG gave it to you.

    Still I would delete the rows from the staging table first

    Code:
    Delete from Staging
        From Staging ST
        Left Join ProdTable PT on PT....
        Where PT.KeyField is not null
    This has to be in the same TRANSACTION that is about to do the INSERT into the ProdTable to truly be multiuser safe.

    *** Read the sticky in the DB forum about how to get your question answered quickly!! ***

    Please remember to rate posts! Rate any post you find helpful - even in old threads! Use the link to the left - "Rate this Post".

    Some Informative Links:
    [ SQL Rules to Live By ] [ Reserved SQL keywords ] [ When to use INDEX HINTS! ] [ Passing Multi-item Parameters to STORED PROCEDURES ]
    [ Solution to non-domain Windows Authentication ] [ Crazy things we do to shrink log files ] [ SQL 2005 Features ] [ Loading Pictures from DB ]

    MS MVP 2006, 2007, 2008

Tags for this Thread

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  



Click Here to Expand Forum to Full Width