Results 1 to 27 of 27

Thread: 600 Million records - Help please

  1. #1

    Thread Starter
    Hyperactive Member
    Join Date
    May 2013
    Posts
    295

    Question 600 Million records - Help please

    Hello All,

    I have a need to deploy a standalone program that will interrogate a multi-spanning Disk array and catalog all the files and attributes along with De-duplication abilities. I need to store 600 million records. SQL CE has a 4GB limit. I need something along the lines or 10GB+ and I'll span multiple databases or maybe just a single 100GB Database. Is this possible? The destination is not a SQL shop, they use Access of all things and that is just not going to work.

    Is there a way to distribute a MySQL Database just like a SQL CE Database? I believe MySQL has the storage capacity I need.

    Regards,

    -NJ
    Lo And Behold
    This is my first Microsoft App Submission
    Your opinion of it would be appreciated

  2. #2
    Smooth Moperator techgnome's Avatar
    Join Date
    May 2002
    Posts
    34,531

    Re: 600 Million records - Help please

    I would consider MySQL, SQL Server Express, or even Oracle Express (I believe there's a such thing).

    Access is definitely right out. But any of the above should be able to handle it. However, I'd strongly talk to them about their data and hte future and it may be worth investing NOW to get a decent server system with a proper SQL Server (or Oracle or MySQL Enterprise) ...


    something else to consider - archiving old data and spinning it off into its own database or storage.

    -tg
    * I don't respond to private (PM) requests for help. It's not conducive to the general learning of others.*
    * I also don't respond to friend requests. Save a few bits and don't bother. I'll just end up rejecting anyways.*
    * How to get EFFECTIVE help: The Hitchhiker's Guide to Getting Help at VBF - Removing eels from your hovercraft *
    * How to Use Parameters * Create Disconnected ADO Recordset Clones * Set your VB6 ActiveX Compatibility * Get rid of those pesky VB Line Numbers * I swear I saved my data, where'd it run off to??? *

  3. #3
    Software Carpenter dee-u's Avatar
    Join Date
    Feb 2005
    Location
    Pinas
    Posts
    11,123

    Re: 600 Million records - Help please

    Quote Originally Posted by techgnome View Post
    I would consider MySQL, SQL Server Express, or even Oracle Express (I believe there's a such thing).

    Access is definitely right out. But any of the above should be able to handle it. However, I'd strongly talk to them about their data and hte future and it may be worth investing NOW to get a decent server system with a proper SQL Server (or Oracle or MySQL Enterprise) ...


    something else to consider - archiving old data and spinning it off into its own database or storage.

    -tg
    Doesn't SQL Server Express have a 10GB limit?
    Regards,


    As a gesture of gratitude please consider rating helpful posts. c",)

    Some stuffs: Mouse Hotkey | Compress file using SQL Server! | WPF - Rounded Combobox | WPF - Notify Icon and Balloon | NetVerser - a WPF chatting system

  4. #4
    Super Moderator jmcilhinney's Avatar
    Join Date
    May 2005
    Location
    Sydney, Australia
    Posts
    110,297

    Re: 600 Million records - Help please

    You're talking about the difference between a file-based database and a server-based database. SQL Server CE is file-based while MySQL is server-based, so you cannot simply deploy a MySQL data file and expect it to do anything useful. You need a MySQL server installed. Microsoft no longer develop SQL Server CE and recommend SQLite as a file-based database now but I don't think that anything file-based is going to be very good for data in that volume.

    If you must go to a server-based system then MySQL will give you the capacity you need at no additional cost, assuming that you can do the installation and configuration yourself. SQL Server Express would be easier but will limit you to 10 GB, as dee-u says, so you may have to span multiple database in that case. If you go to full SQL Server then you're up for a license fee.

  5. #5

    Thread Starter
    Hyperactive Member
    Join Date
    May 2013
    Posts
    295

    Re: 600 Million records - Help please

    Quote Originally Posted by jmcilhinney View Post
    You're talking about the difference between a file-based database and a server-based database. SQL Server CE is file-based while MySQL is server-based, so you cannot simply deploy a MySQL data file and expect it to do anything useful. You need a MySQL server installed. Microsoft no longer develop SQL Server CE and recommend SQLite as a file-based database now but I don't think that anything file-based is going to be very good for data in that volume.

    If you must go to a server-based system then MySQL will give you the capacity you need at no additional cost, assuming that you can do the installation and configuration yourself. SQL Server Express would be easier but will limit you to 10 GB, as dee-u says, so you may have to span multiple database in that case. If you go to full SQL Server then you're up for a license fee.
    Thank you JM!
    Lo And Behold
    This is my first Microsoft App Submission
    Your opinion of it would be appreciated

  6. #6

    Thread Starter
    Hyperactive Member
    Join Date
    May 2013
    Posts
    295

    Re: 600 Million records - Help please

    Quote Originally Posted by jmcilhinney View Post
    You're talking about the difference between a file-based database and a server-based database. SQL Server CE is file-based while MySQL is server-based, so you cannot simply deploy a MySQL data file and expect it to do anything useful. You need a MySQL server installed. Microsoft no longer develop SQL Server CE and recommend SQLite as a file-based database now but I don't think that anything file-based is going to be very good for data in that volume.

    If you must go to a server-based system then MySQL will give you the capacity you need at no additional cost, assuming that you can do the installation and configuration yourself. SQL Server Express would be easier but will limit you to 10 GB, as dee-u says, so you may have to span multiple database in that case. If you go to full SQL Server then you're up for a license fee.
    Question...

    I'm not a SQL pro by any means.
    If I have the site load a MySQL Server. Can I connected to the server and create a data base all from within my app? Or do I need them to setup the database and supply the connection string and table data?

    Thanks,

    -NJ
    Lo And Behold
    This is my first Microsoft App Submission
    Your opinion of it would be appreciated

  7. #7
    A SQL Server fool GaryMazzone's Avatar
    Join Date
    Aug 2005
    Location
    Dover,NH
    Posts
    7,495

    Re: 600 Million records - Help please

    The server needs to be installed on the network it will be used on. Please ensure that proper security procedures are followed for the user running the SQL Server services (it should be a domain user). Please only install the components that are required (in your case only the SQL Server engine and the SQL Server agent should be installed, probably the SSMS on the server also). Create a user for the application. Create a database for the application on the database server instance. Grant the application user the proper rights to the database. Now on first time use you can have the application create the required objects in the database. The connection string should be supplied as the server (or server\instance) and then the initial catalog as the database to use, the username for the application user and the password for that user.

    The accounts that run the SQL Server engine and the SQL Server Agent should be different domain accounts and neither account should be Admin on the server that the instance is running on.
    Sometimes the Programmer
    Sometimes the DBA

    Mazz1

  8. #8

    Thread Starter
    Hyperactive Member
    Join Date
    May 2013
    Posts
    295

    Re: 600 Million records - Help please

    Quote Originally Posted by GaryMazzone View Post
    The server needs to be installed on the network it will be used on. Please ensure that proper security procedures are followed for the user running the SQL Server services (it should be a domain user). Please only install the components that are required (in your case only the SQL Server engine and the SQL Server agent should be installed, probably the SSMS on the server also). Create a user for the application. Create a database for the application on the database server instance. Grant the application user the proper rights to the database. Now on first time use you can have the application create the required objects in the database. The connection string should be supplied as the server (or server\instance) and then the initial catalog as the database to use, the username for the application user and the password for that user.

    The accounts that run the SQL Server engine and the SQL Server Agent should be different domain accounts and neither account should be Admin on the server that the instance is running on.
    You are awesome!

    I just cut/pasted that and gave you the credit to the site.

    Thank you!

    -NJ
    Lo And Behold
    This is my first Microsoft App Submission
    Your opinion of it would be appreciated

  9. #9
    A SQL Server fool GaryMazzone's Avatar
    Join Date
    Aug 2005
    Location
    Dover,NH
    Posts
    7,495

    Re: 600 Million records - Help please

    There would be more things I could suggest on a best practices basis but I am not sure that is what you are looking for. I spend my time as a consultant fixing the issues that people who just click the next button on SQL Server deployments create. That includes over committing memory, default location of data files, extra privileges given to users, sa account being left open, assigning rights to the guest account, not having the proper setup of TempDB (not enough files and in the wrong place).
    Sometimes the Programmer
    Sometimes the DBA

    Mazz1

  10. #10
    PowerPoster JuggaloBrotha's Avatar
    Join Date
    Sep 2005
    Location
    Lansing, MI; USA
    Posts
    4,286

    Re: 600 Million records - Help please

    For what it's worth the company I work for does large data analytics and we regularly work with sets ranging from 300 million records to 1.5 billion records and we use Sql Server 2014, you'll want to make sure you have plenty of memory & processing power as well as a very large hard drive to allow Sql Server to work with data that large, our log files are commonly around 14gb each (data files come in at around 700gb each) and we shrink them as needed regularly when importing more data (usually in 20 to 30 million record increments). I highly recommend having your company just purchase a Sql Server license, I can't imagine working with 600 million records on the Express edition. Plus by using Sql Server we're easily able to us SSIS for importing/exporting as SS Management Studio has the menu integration for making packages from a wizard.
    Currently using VS 2015 Enterprise on Win10 Enterprise x64.

    CodeBank: All ThreadsColors ComboBoxFading & Gradient FormMoveItemListBox/MoveItemListViewMultilineListBoxMenuButtonToolStripCheckBoxStart with Windows

  11. #11

    Thread Starter
    Hyperactive Member
    Join Date
    May 2013
    Posts
    295

    Re: 600 Million records - Help please

    Quote Originally Posted by GaryMazzone View Post
    There would be more things I could suggest on a best practices basis but I am not sure that is what you are looking for. I spend my time as a consultant fixing the issues that people who just click the next button on SQL Server deployments create. That includes over committing memory, default location of data files, extra privileges given to users, sa account being left open, assigning rights to the guest account, not having the proper setup of TempDB (not enough files and in the wrong place).
    Click the next button! LOL
    Lo And Behold
    This is my first Microsoft App Submission
    Your opinion of it would be appreciated

  12. #12

    Thread Starter
    Hyperactive Member
    Join Date
    May 2013
    Posts
    295

    Re: 600 Million records - Help please

    Quote Originally Posted by JuggaloBrotha View Post
    For what it's worth the company I work for does large data analytics and we regularly work with sets ranging from 300 million records to 1.5 billion records and we use Sql Server 2014, you'll want to make sure you have plenty of memory & processing power as well as a very large hard drive to allow Sql Server to work with data that large, our log files are commonly around 14gb each (data files come in at around 700gb each) and we shrink them as needed regularly when importing more data (usually in 20 to 30 million record increments). I highly recommend having your company just purchase a Sql Server license, I can't imagine working with 600 million records on the Express edition. Plus by using Sql Server we're easily able to us SSIS for importing/exporting as SS Management Studio has the menu integration for making packages from a wizard.
    The 600 million record was a culmination of files on a spanned disk array. We are going to work with individual volumes doing all the heavy lifting piecemeal.
    In a nutshell, we're going to run the analytical logic on individual drives. File path, creation date, last write, last user stuff like that and then generate an MD5 hash. Should be about 10 to 15 million per drive. Each drive will get its own DB.

    The 600 million records come into play when we bring all the MD5 hash strings together along with a unique DB/File ID. In essence the largest record file may end up being the smallest in size due the footprint the actual data will take up.
    The ultimate goal is to de-dupe all the servers and maintain a pristine database of just MD5 hashes. Any new data will have to be hashed and bounced off the DB to be accepted into the data storage.
    Lo And Behold
    This is my first Microsoft App Submission
    Your opinion of it would be appreciated

  13. #13

    Thread Starter
    Hyperactive Member
    Join Date
    May 2013
    Posts
    295

    Re: 600 Million records - Help please

    Quote Originally Posted by GaryMazzone View Post
    There would be more things I could suggest on a best practices basis but I am not sure that is what you are looking for. I spend my time as a consultant fixing the issues that people who just click the next button on SQL Server deployments create. That includes over committing memory, default location of data files, extra privileges given to users, sa account being left open, assigning rights to the guest account, not having the proper setup of TempDB (not enough files and in the wrong place).
    Hi Gary,

    I have another question, if you would not mind.
    I have setup a MySQL Server on my development machine. Never having used MySQL, is there a way to Connect to the server without connecting to a pre-existing database and create a database on the fly within my App?
    I have googled but I'm not finding any examples of such. I could be googling wrong though.

    -NJ
    Lo And Behold
    This is my first Microsoft App Submission
    Your opinion of it would be appreciated

  14. #14
    A SQL Server fool GaryMazzone's Avatar
    Join Date
    Aug 2005
    Location
    Dover,NH
    Posts
    7,495

    Re: 600 Million records - Help please

    Yes you can call Create Database if needed, you connect to the master database. The user you use will need db_CreateDB rights (or sa). I would check the system meta data to see if the database exists then call create if needed like this:

    sql Code:
    1. IF NOT EXISTS (SELECT name FROM sys.sysdatabases WHERE name = 'SomeDBNameHere')
    2.     BEGIN
    3.         CREATE DATABASE SomeDBNameHere;    
    4.     END
    5. GO
    6. --Go to the database
    7. USE SomeDBNameHere;
    8. IF NOT EXISTS (SELECT name FROM sys.database_principals WHERE name = 'YourUserName')
    9. BEGIN
    10.     CREATE USER [YourUserName] FOR LOGIN [YourUserName] WITH DEFAULT_SCHEMA=[dbo];
    11.     ALTER ROLE [db_owner] ADD MEMBER [YourUserName];
    12. END
    13. GO

    There are other things you can add to the create database statement i.e. filenames and locations, initial size, max size, growth rate. If you don't supply these things it will use the default location of files that is set at the database instance level and the sizing is based on the model database
    Sometimes the Programmer
    Sometimes the DBA

    Mazz1

  15. #15
    Super Moderator jmcilhinney's Avatar
    Join Date
    May 2005
    Location
    Sydney, Australia
    Posts
    110,297

    Re: 600 Million records - Help please

    If MySQL is anything like SQL Server in this regard, you simply omit the Database attribute from the connection string.

  16. #16
    Super Moderator jmcilhinney's Avatar
    Join Date
    May 2005
    Location
    Sydney, Australia
    Posts
    110,297

    Re: 600 Million records - Help please

    Quote Originally Posted by GaryMazzone View Post
    Yes you can call Create Database if needed, you connect to the master database. The user you use will need db_CreateDB rights (or sa). I would check the system meta data to see if the database exists then call create if needed like this:

    sql Code:
    1. IF NOT EXISTS (SELECT name FROM sys.sysdatabases WHERE name = 'SomeDBNameHere')
    2.     BEGIN
    3.         CREATE DATABASE SomeDBNameHere;    
    4.     END
    5. GO
    6. --Go to the database
    7. USE SomeDBNameHere;
    8. IF NOT EXISTS (SELECT name FROM sys.database_principals WHERE name = 'YourUserName')
    9. BEGIN
    10.     CREATE USER [YourUserName] FOR LOGIN [YourUserName] WITH DEFAULT_SCHEMA=[dbo];
    11.     ALTER ROLE [db_owner] ADD MEMBER [YourUserName];
    12. END
    13. GO

    There are other things you can add to the create database statement i.e. filenames and locations, initial size, max size, growth rate. If you don't supply these things it will use the default location of files that is set at the database instance level and the sizing is based on the model database
    Gary, I think that you may have missed that this question is for MySQL and not SQL Server.

  17. #17

    Thread Starter
    Hyperactive Member
    Join Date
    May 2013
    Posts
    295

    Re: 600 Million records - Help please

    Quote Originally Posted by jmcilhinney View Post
    Gary, I think that you may have missed that this question is for MySQL and not SQL Server.
    Is the code he supplied viable? I would like to give this a shot when I get home tonight.

    -NJ
    Lo And Behold
    This is my first Microsoft App Submission
    Your opinion of it would be appreciated

  18. #18
    A SQL Server fool GaryMazzone's Avatar
    Join Date
    Aug 2005
    Location
    Dover,NH
    Posts
    7,495

    Re: 600 Million records - Help please

    For SQL Server yes it is. This is what I do if needed. I also have a section that checks to see if the user is added to the system logins before adding the database and assigning the user to the new database
    Sometimes the Programmer
    Sometimes the DBA

    Mazz1

  19. #19

    Thread Starter
    Hyperactive Member
    Join Date
    May 2013
    Posts
    295

    Re: 600 Million records - Help please

    Quote Originally Posted by GaryMazzone View Post
    For SQL Server yes it is. This is what I do if needed. I also have a section that checks to see if the user is added to the system logins before adding the database and assigning the user to the new database
    Thank you!

    I will give this a shot when I get home tonight. If my 5 year old lets me that is!

    -NJ
    Lo And Behold
    This is my first Microsoft App Submission
    Your opinion of it would be appreciated

  20. #20
    Smooth Moperator techgnome's Avatar
    Join Date
    May 2002
    Posts
    34,531

    Re: 600 Million records - Help please

    REMINDER - the code that Gary Posted is for SQL SERVER! it is NOT for MySQL ... I repeat NOT MySQL.

    -tg
    * I don't respond to private (PM) requests for help. It's not conducive to the general learning of others.*
    * I also don't respond to friend requests. Save a few bits and don't bother. I'll just end up rejecting anyways.*
    * How to get EFFECTIVE help: The Hitchhiker's Guide to Getting Help at VBF - Removing eels from your hovercraft *
    * How to Use Parameters * Create Disconnected ADO Recordset Clones * Set your VB6 ActiveX Compatibility * Get rid of those pesky VB Line Numbers * I swear I saved my data, where'd it run off to??? *

  21. #21
    Junior Member
    Join Date
    Sep 2004
    Location
    Buffalo, NY
    Posts
    22

    Re: 600 Million records - Help please

    Might be worth your time reading up on PHPMyAdmin. I used it in school. Once you get it setup, it is nice.

    https://www.phpmyadmin.net/
    If somebody rubs you the wrong way, walk a mile in their shoes. Then you will be a mile away from them, and you will have their shoes...

  22. #22
    PowerPoster ChrisE's Avatar
    Join Date
    Jun 2017
    Location
    Frankfurt
    Posts
    3,040

    Re: 600 Million records - Help please

    Hi,

    any Company that has achieved 600Mio records and is still doing buisness should
    treat themselves to SQL_Server, especially if the want to add another 600Mio.

    regards
    Chris
    to hunt a species to extinction is not logical !
    since 2010 the number of Tigers are rising again in 2016 - 3900 were counted. with Baby Callas it's 3901, my wife and I had 2-3 months the privilege of raising a Baby Tiger.

  23. #23
    PowerPoster JuggaloBrotha's Avatar
    Join Date
    Sep 2005
    Location
    Lansing, MI; USA
    Posts
    4,286

    Re: 600 Million records - Help please

    Quote Originally Posted by ChrisE View Post
    Hi,

    any Company that has achieved 600Mio records and is still doing buisness should
    treat themselves to SQL_Server, especially if the want to add another 600Mio.

    regards
    Chris
    Or cloud processing like Amazon Redshift.
    Redshift is able to have us process things like 900 million records in 10 minutes instead of the 1.5 hours it takes Sql Server and Oracle to do the same.
    Currently using VS 2015 Enterprise on Win10 Enterprise x64.

    CodeBank: All ThreadsColors ComboBoxFading & Gradient FormMoveItemListBox/MoveItemListViewMultilineListBoxMenuButtonToolStripCheckBoxStart with Windows

  24. #24
    Super Moderator jmcilhinney's Avatar
    Join Date
    May 2005
    Location
    Sydney, Australia
    Posts
    110,297

    Re: 600 Million records - Help please

    Quote Originally Posted by JuggaloBrotha View Post
    Or cloud processing like Amazon Redshift.
    Redshift is able to have us process things like 900 million records in 10 minutes instead of the 1.5 hours it takes Sql Server and Oracle to do the same.
    I would imagine Microsoft offers something similar through SQL Server on Azure.

  25. #25
    Super Moderator jmcilhinney's Avatar
    Join Date
    May 2005
    Location
    Sydney, Australia
    Posts
    110,297

    Re: 600 Million records - Help please

    Quote Originally Posted by ChrisE View Post
    Hi,

    any Company that has achieved 600Mio records and is still doing buisness should
    treat themselves to SQL_Server, especially if the want to add another 600Mio.

    regards
    Chris
    Quite a lot of large systems run on MySQL.

  26. #26
    Fanatic Member kpmc's Avatar
    Join Date
    Sep 2017
    Posts
    1,012

    Re: 600 Million records - Help please

    Lets see... get MySQL for free or spend a jillion dollars on MSSQL....hmmm
    What kind of treat are you talking about? Reporting services?

  27. #27
    Smooth Moperator techgnome's Avatar
    Join Date
    May 2002
    Posts
    34,531

    Re: 600 Million records - Help please

    Quote Originally Posted by rekless View Post
    Might be worth your time reading up on PHPMyAdmin. I used it in school. Once you get it setup, it is nice.

    https://www.phpmyadmin.net/
    Only if you have a web server running PHP ... otherwise get Workbench... runs right on the server and there are Windows and Linux versions... no webserver needed. Much more flexible in my opinion.

    -tg
    * I don't respond to private (PM) requests for help. It's not conducive to the general learning of others.*
    * I also don't respond to friend requests. Save a few bits and don't bother. I'll just end up rejecting anyways.*
    * How to get EFFECTIVE help: The Hitchhiker's Guide to Getting Help at VBF - Removing eels from your hovercraft *
    * How to Use Parameters * Create Disconnected ADO Recordset Clones * Set your VB6 ActiveX Compatibility * Get rid of those pesky VB Line Numbers * I swear I saved my data, where'd it run off to??? *

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  



Click Here to Expand Forum to Full Width