
Thread: [RESOLVED] Multiple backing-source data

  1. #1

    Thread Starter
    PowerPoster
    Join Date
    Jul 2002
    Location
    Dublin, Ireland
    Posts
    2,148

    [RESOLVED] Multiple backing-source data

    So I'm thinking of having a mixture of database and flat file data in my latest application - any pointers to how this can be done while still retaining the unit of work happiness of something like Entity Framework?

  2. #2
    Angel of Code Niya's Avatar
    Join Date
    Nov 2011
    Posts
    9,017

    Re: Multiple backing-source data

    I usually do this when I have settings or other application-related data to store. All the customer and product-related stuff goes in the database and all the app-related stuff can go into an XML file or My.Settings. Is this the type of thing you're talking about?
    Treeview with NodeAdded/NodesRemoved events | BlinkLabel control | Calculate Permutations | Object Enums | ComboBox with centered items | .Net Internals article(not mine) | Wizard Control | Understanding Multi-Threading | Simple file compression | Demon Arena

    Copy/move files using Windows Shell | I'm not wanted

    C++ programmers will dismiss you as a cretinous simpleton for your inability to keep track of pointers chained 6 levels deep and Java programmers will pillory you for buying into the evils of Microsoft. Meanwhile C# programmers will get paid just a little bit more than you for writing exactly the same code and VB6 programmers will continue to whitter on about "footprints". - FunkyDexter

    There's just no reason to use garbage like InputBox. - jmcilhinney

    The threads I start are Niya and Olaf free zones. No arguing about the benefits of VB6 over .NET here please. Happiness must reign. - yereverluvinuncleber

  3. #3

    Thread Starter
    PowerPoster
    Join Date
    Jul 2002
    Location
    Dublin, Ireland
    Posts
    2,148

    Re: Multiple backing-source data

    Yes - pretty much. We have massive numbers of stock prices, at the moment held in one truly huge table. The idea is that they instead get written to one blob per stock per date, and an "end of day" price is the only thing written to the database. If you only want an end-of-day price I go to the database, but if you want intra-day prices I find the right file and stream it. The aim is to massively increase the parallelism of the system.
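
    A minimal sketch of that read routing, not the actual system. It assumes SQLite as a stand-in for the real database, one little-endian float64 per intra-day price, and the [stock id].[date].dat naming mentioned later in the thread. PriceStore, eod_price and the sample values are hypothetical names for illustration.

```python
import sqlite3
import struct
import tempfile
from datetime import date
from pathlib import Path

PRICE = struct.Struct("<d")  # assumed record layout: one float64 per intra-day price


class PriceStore:
    """Routes reads: end-of-day prices from the database, intra-day prices from files."""

    def __init__(self, conn, blob_dir):
        self.conn = conn
        self.blob_dir = Path(blob_dir)

    def _blob_path(self, stock_id, day):
        return self.blob_dir / f"{stock_id}.{day.isoformat()}.dat"

    def write_day(self, stock_id, day, prices):
        # Intra-day prices go to the file; only the last price goes to the database.
        blob = b"".join(PRICE.pack(p) for p in prices)
        self._blob_path(stock_id, day).write_bytes(blob)
        with self.conn:  # commits on success, rolls back on error
            self.conn.execute(
                "INSERT INTO eod_price (stock_id, day, price) VALUES (?, ?, ?)",
                (stock_id, day.isoformat(), prices[-1]),
            )

    def end_of_day(self, stock_id, day):
        row = self.conn.execute(
            "SELECT price FROM eod_price WHERE stock_id = ? AND day = ?",
            (stock_id, day.isoformat()),
        ).fetchone()
        return None if row is None else row[0]

    def intraday(self, stock_id, day):
        # Stream the file one record at a time rather than loading it whole.
        with open(self._blob_path(stock_id, day), "rb") as f:
            while chunk := f.read(PRICE.size):
                yield PRICE.unpack(chunk)[0]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE eod_price (stock_id TEXT, day TEXT, price REAL, "
    "PRIMARY KEY (stock_id, day))"
)
store = PriceStore(conn, tempfile.mkdtemp())
store.write_day("GM", date(2014, 1, 2), [40.5, 40.75, 41.0])
print(store.end_of_day("GM", date(2014, 1, 2)))
print(list(store.intraday("GM", date(2014, 1, 2))))
```

    Note that in write_day the file write and the database insert are not covered by a single transaction, so a failure between the two leaves them out of step. That is the unit of work question in a nutshell.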

  4. #4
    Angel of Code Niya's Avatar
    Join Date
    Nov 2011
    Posts
    9,017

    Re: Multiple backing-source data

    Hmmm...I dunno. To me the whole thing can stay in the database. The data is all related and separating it defeats the purpose of having a database in the first place. Assuming you're using SQL Server, Access or Oracle, you have to realize that these are called relational databases. They're meant to store data that is related. The end of day prices and intra day prices are related to the stock. These pieces of data belong together. Why are you considering separating the data?

  5. #5
    Super Moderator FunkyDexter's Avatar
    Join Date
    Apr 2005
    Location
    An obscure body in the SK system. The inhabitants call it Earth
    Posts
    7,957

    Re: Multiple backing-source data

    These pieces of data belong together
    I agree! If you really want to separate the data you can do so by having the intra day and end of day prices in separate tables but I see no good argument for separating one off into files. You're certainly not going to find it quicker to retrieve a price from a file than a table, quite the opposite in fact. And the problems you're going to introduce around transaction management (I assume that's what you mean by "Unit of Work") are going to be horrendous.

    Is the goal to dedicate hardware resources to an occasional expensive operation (reading intra day prices) so it doesn't impact a frequent cheap operation (reading an end of day price)? If so, you can always configure your database to put the intra day table in a file on one disk and the end of day table in a file on another disk. Or if you wanted to keep a unified view of all prices while still splitting the workload, you could use horizontal partitioning.
    The best argument against democracy is a five minute conversation with the average voter - Winston Churchill

    Hadoop actually sounds more like the way they greet each other in Yorkshire - Inferrd

  6. #6

    Thread Starter
    PowerPoster
    Join Date
    Jul 2002
    Location
    Dublin, Ireland
    Posts
    2,148

    Re: Multiple backing-source data

    The problem with a "Prices" table is that in fact the things in it are largely uncorrelated - for example the price of wheat and the price of General Motors at 12:15 would be consecutive records but in fact they are in no way the same.

    Fortunately the thing about prices is that records only go on the end of the table - once a price is recorded it cannot be changed, so any intra-day prices prior to today are read-only. So once I have moved these into their own files they can be distributed over the content delivery network, so that the file is available near the person who wants it. Each file name will be [stock id].[date].dat and because I know the price frequency I don't need to store the time in the file - it is effectively an array index.
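
    To make the "time is an array index" idea concrete, here is a rough sketch. The record layout (one float64 per slot), the 09:30 session open and the one-minute frequency are all assumptions for illustration, as are the function names.

```python
import struct
from datetime import time

RECORD = struct.Struct("<d")   # assumption: one float64 price per slot
SESSION_OPEN = time(9, 30)     # assumption: time of the first slot in each file
INTERVAL_SECONDS = 60          # assumption: one price per minute


def file_name(stock_id, day_iso):
    """Build the [stock id].[date].dat name described above."""
    return f"{stock_id}.{day_iso}.dat"


def _seconds(t):
    return t.hour * 3600 + t.minute * 60 + t.second


def slot_index(t):
    """Map a time of day to its position in the file."""
    elapsed = _seconds(t) - _seconds(SESSION_OPEN)
    if elapsed < 0 or elapsed % INTERVAL_SECONDS:
        raise ValueError(f"{t} is not on a price slot")
    return elapsed // INTERVAL_SECONDS


def price_at(data, t):
    """Read one price from the raw file contents at index * record size."""
    offset = slot_index(t) * RECORD.size
    if offset + RECORD.size > len(data):
        raise IndexError(f"no price recorded for {t}")
    return RECORD.unpack_from(data, offset)[0]


# Three slots: 09:30, 09:31 and 09:32.
data = b"".join(RECORD.pack(p) for p in [10.0, 10.5, 11.0])
print(file_name("GM", "2014-01-02"))
print(price_at(data, time(9, 31)))
```

    The catch with this layout is that a missing price needs a placeholder value, otherwise every later slot shifts by one.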

  7. #7
    Super Moderator FunkyDexter's Avatar
    Join Date
    Apr 2005
    Location
    An obscure body in the SK system. The inhabitants call it Earth
    Posts
    7,957

    Re: Multiple backing-source data

    The problem with a "Prices" table is that in fact the things in it are largely uncorrelated
    An index on Product and Time would sort that out for you. That's an important thing: records in a database aren't "consecutive" because they're not ordered except within the context of an index. You can have as many indexes as you like, so "consecutive" becomes a concept you have complete control over.
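
    A small illustration of the index point, using SQLite purely as a stand-in (the thread does not say which database is in use). The table, column and index names are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prices (product TEXT, quoted_at TEXT, price REAL)")

# Rows arrive interleaved, as in the wheat / General Motors example.
conn.executemany(
    "INSERT INTO prices VALUES (?, ?, ?)",
    [
        ("WHEAT", "12:15", 6.10),
        ("GM", "12:15", 40.50),
        ("WHEAT", "12:16", 6.12),
        ("GM", "12:16", 40.55),
    ],
)

# The composite index defines an order in which each product's prices are adjacent.
conn.execute("CREATE INDEX ix_prices_product_time ON prices (product, quoted_at)")

gm = conn.execute(
    "SELECT quoted_at, price FROM prices WHERE product = ? ORDER BY quoted_at",
    ("GM",),
).fetchall()
print(gm)

# Ask the planner how it would run the same query.
plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT quoted_at, price FROM prices WHERE product = 'GM' ORDER BY quoted_at"
).fetchall()
for row in plan:
    print(row[-1])
```

    The physical insert order no longer matters to the query; the index supplies the "consecutive" view per product.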

    If you want to get data near to the person who wants it, use local databases replicated from a central source. In a situation like yours, where the data is read-only, you'd use a simple publisher-subscriber model, which is actually very easy to set up. You could presumably set the refresh interval to a day at a time so the network traffic would be absolutely minimal. And your "Unit of Work" concern is neatly addressed in a replicated environment because the system will simply use a three-phase commit where appropriate.

    About the only good argument I've ever come across for splitting data across files and a DB was where a company needed razor-fast writes into the system because multiple clients would all flush their data in at around the same time of day. Writes were actually made into a text file (which is much quicker than a DB insert), effectively allowing deferral of the DB insert to a quieter time. For finding and reading data, as you seem to be doing, having that data in a DB rather than a file is going to be orders of magnitude quicker - particularly for the "finding" part.
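
    For what it's worth, the deferred-write pattern looks roughly like this. It is a sketch under assumptions, not that company's code: SQLite stands in for the database, the spool is a CSV text file, and append_price / drain_spool are hypothetical names.

```python
import csv
import sqlite3
import tempfile
from pathlib import Path


def append_price(spool, stock_id, quoted_at, price):
    """Fast path: append one line to the spool file, no database involved."""
    with open(spool, "a", newline="") as f:
        csv.writer(f).writerow([stock_id, quoted_at, price])


def drain_spool(spool, conn):
    """Quiet-time path: insert every spooled line in one transaction, then empty the spool."""
    with open(spool, newline="") as f:
        rows = [(s, t, float(p)) for s, t, p in csv.reader(f)]
    with conn:  # single transaction for the whole batch
        conn.executemany("INSERT INTO prices VALUES (?, ?, ?)", rows)
    Path(spool).write_text("")  # truncate only after the commit succeeded
    return len(rows)


spool = Path(tempfile.mkdtemp()) / "prices.spool"
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prices (stock_id TEXT, quoted_at TEXT, price REAL)")

append_price(spool, "GM", "12:15", 40.5)
append_price(spool, "WHEAT", "12:15", 6.1)
loaded = drain_spool(spool, conn)
print(loaded)
```

    A crash between the commit and the truncate would replay the same rows on the next drain, so a real version would need some way to make the load idempotent.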

  8. #8

    Thread Starter
    PowerPoster
    Join Date
    Jul 2002
    Location
    Dublin, Ireland
    Posts
    2,148

    Re: Multiple backing-source data

    Aha - lightbulb moment.
    If I partition the table by stock type I effectively have one file per stock type anyway but can still use SQL and EF. Hurrah - much easier...thanks all.
