-
heavy table
Hi all,
one of our tables is now contain 6 mil rows and growing in about 10k + rows each day, I know it's not consider huge table, but querying this table is still considerably slow compare to other tables. I know for sure that this table is pretty good indexed and i'm not sure what is my next step to improve query speed, I'm think of partition is this what i should go for ? is there any more steps i could take to improve this table performance ?
Thanks
-
Re: heavy table
We had a similar problem on a number of tables at my last job. The way we solved it was to create "Historical" tables into which we moved historical data on a nightly basis.
-tg
-
Re: heavy table
lol, that exactly what my boss suggested we'll do, but I'm not a fan of that idea, it feels like cheating, I know that sql database should be able to handle much larger tables, any one had any experience with partitions ? from what i read its looks like it might be just what i need, i just don't want (and dont have the spare time) to dive into learning the subject if it's not going to help us with this problem.
if i won't find any more elegant solutions to this I'll go with the split data into separate tables idea.
-
Re: heavy table
But it isn't cheating if the table is getting in the way of performance. The end result is that the data that's left in the active table is the current data. If there's archived data that's getting in the way, then it's a problem.
In short, you have to decide what's more important... the 75% of the data you're not using... or performance.
This is actually a common solution for this common problem.
-tg
-
Re: heavy table
Depends on overall data architecture and how you will use the data in the table... no point in partitioning on one column if there will be other queries based on other columns, e.g. partitioned on PK but other queries based on date or other criteria. Also no point in partitioning if aggregate queries aren't run on table. Depending on how query is written, indexes often slow down aggregate queries because database pages/blocks are revisited (per record) instead of being processed only once (one-pass, multi-records read). If you are partitioning solely for aggregate queries then might as well consider data warehouse/mart or as suggested a historical table.
The only drawback of historical table is that you can't rely on table PK to enforce uniqueness; key may already exist in historical table. Again, impact of that depends on your overall data architecture, particularly how you manage keys/control numbers. You can maintain record in both OLTP and data warehouse to segregate aggregate queries (moved to warehouse) but trade-off for performance is storage and manageability. Start-up is expensive (set-up of warehouse) but once it's up it leverages on the fact that storage is relatively cheap compared to never ending fire-fighting of OLTP performance problems; salary is more expensive in the long run compared to hardware.
Another interim (emphasis on not permanent) solution would be to create a daily duplicate of OLTP database for generating aggregate reports; database A captures the transactions, database A-copy is used to run the heavy reports. You can use existing mid-tier and front-end system, just have them connect to appropriate database. Trade-off is duplication of host and storage among others. But you can consider it as a daily exercise of your disaster recovery procedures and you can implement this in a few hours with minimal testing.