Results 1 to 11 of 11

Thread: [RESOLVED] Help with improving my SQL query (SQLite)

  1. #1

    Thread Starter
    PowerPoster
    Join Date
    Aug 2010
    Location
    Canada
    Posts
    2,412

    Resolved [RESOLVED] Help with improving my SQL query (SQLite)

    Hi,

    I have an SQL query that works now and is reasonably fast, but I have a feeling there must be a better way to write it and I am hoping for some feedback.

    First some background -

    I have 5 tables (T1 through T5) that represent 5 stages in the life of a piece of "work". T5 is the "oldest" stage, and T1 is the newest/final stage. The work have traveled through any of the 5 stages (skipping any of them too) from T5 to T1. I'm only interested in work that has reached stage T1.

    Each stage has a "GUID" field to identify uniquely and a "Reason" field that is a piece of text identifying it's reason for existence. Tables T1 through T4 also have a "ParentGUID" field that can link it to a record in an earlier table in the lifecycle, but this can be NULL if there is no back link. Any referenced record can also have a ParentGUID that links it to any earlier table, or be NULL to indicate the "ancestry" is terminated.

    Unless I'm mistaken, I've determined that there are 15 possible paths from T1 to previous tables:

    Code:
    T1
    T1 > T2
    T1 > T3
    T1 > T4
    T1 > T5
    T1 > T2 > T3
    T1 > T2 > T4
    T1 > T2 > T5
    T1 > T3 > T4
    T1 > T3 > T5
    T1 > T4 > T5
    T1 > T2 > T3 > T4
    T1 > T2 > T3 > T5
    T1 > T2 > T4 > T5
    T1 > T3 > T4 > T5
    T1 > T2 > T3 > T4 > T5
    What I need to do is get every record in T1 and check the value of the Reason field. If I find a non-NULL value then I stop and use the Reason value in my result set. If the value is NULL, then I check the ParentGUID field for a back-reference. If ithe ParentGUID is NULL, then I use NULL for the Reason in my result set. If I find a ParentGUID value I need to walk through T2, T3, T4, T5 until I find a matched record with either a Reason & ParentGUID combination that is NULL (in which case my result is NULL) or the first non-NULL Reason I find becomes my Result. This last bit means that even if a T1 record chain references T2 > T3 > T4 > T5, I stop at the first table in that chain that has a non-NULL Reason.

    Below is a link to an SQL Fiddle with some data & my working SQL:
    SQL Fiddle Here

    As you can see the SQL is fairly long, and if I ever need to add T6, T7, T8, etc... tables then it will get much bigger to accommodate all of the possible paths. Even though I can write a routine to generate the SQL, I'd rather use a better method if I can.

    Anyone have any pointers or ideas on how I might be able to improve my SQL statement? Thanks in advance for any help.

  2. #2
    Smooth Moperator techgnome's Avatar
    Join Date
    May 2002
    Posts
    34,532

    Re: Help with improving my SQL query (SQLite)

    Well.... if it had all been in one table, it would have been a snap to create a recursive CTE that walked the ancestry lines for you... but because they are in different tables... (but then I haven't looked at the fiddle yet either) ... my kiddo is getting ready for her graduation dance later this evening. ... perhaps after I get back from dinner, if I remember I'll take a look at it and see if I can come up with something. Just based on the description alone though, I suspect there's a number of left joins, case statements, or COALLECE and some other tom foolery is going to be needed.

    -tg
    * I don't respond to private (PM) requests for help. It's not conducive to the general learning of others.*
    * I also don't respond to friend requests. Save a few bits and don't bother. I'll just end up rejecting anyways.*
    * How to get EFFECTIVE help: The Hitchhiker's Guide to Getting Help at VBF - Removing eels from your hovercraft *
    * How to Use Parameters * Create Disconnected ADO Recordset Clones * Set your VB6 ActiveX Compatibility * Get rid of those pesky VB Line Numbers * I swear I saved my data, where'd it run off to??? *

  3. #3

    Thread Starter
    PowerPoster
    Join Date
    Aug 2010
    Location
    Canada
    Posts
    2,412

    Re: Help with improving my SQL query (SQLite)

    Thanks for the response. Yeah, there are separate tables because each stage has some other fields that are only relevant to the individual stages. Maybe I should punt the relevant bits out to a temporary table and grab the results from there...I'll give that a try.

  4. #4
    Hyperactive Member
    Join Date
    Jul 2013
    Posts
    400

    Re: Help with improving my SQL query (SQLite)

    I think I agree with techgnome.
    What I see is identical tables with T1 not having a ParentGUID, and a typical case for a CTE.
    Carlos

  5. #5
    Super Moderator FunkyDexter's Avatar
    Join Date
    Apr 2005
    Location
    An obscure body in the SK system. The inhabitants call it Earth
    Posts
    7,900

    Re: Help with improving my SQL query (SQLite)

    The maximum number of iterations on this is determinate and for that reason I wouldn't recommend a recursive approach. Recursion will work but seems unnecessary in this case. This should just be a series of left joins with a coalesce in the select. You also need to consider that a record can "arrive" in T1 by skipping any or all of T2, T3, T4 and T5 so you need to join to all potential candidates. Something like:-
    Code:
    Select T1.GUID, Coalesce(T1.Reason, T2.Reason, T3.Reason, T4.Reason, T5.Reason)
    From T1
    Left Join T2
       on T2.Guid = T1.ParentGuid
    Left Join T3
       on T3.Guid in (T1.ParentGuid, T2.ParentGuid)
    Left Join T4
       on T4.Guid in (T1.ParentGuid, T2.ParentGuid, T3.ParentGuid)
    Left Join T5
       on T5.Guid in (T1.ParentGuid, T2.ParentGuid, T3.ParentGuid, T4.ParentGuid)
    Note, if it gets to the end of the chain without finding a reason it will report null.

    I think that should work but I haven't tried it in a query window so be leary of syntax errors.


    Regards
    Tom Fool




    Edit>Be slightly cautious. Techgnome is absolutely right to bring up the possibility of putting all this into one table and you should certainly consider it but it's not always the right choice. There are three basic ways of implementing inheritance in a database:-

    1. Separate tables. Basically what you've done. Each entity gets a completely separate table. This is approach tends to be best where the entities are conceptually different, even if they share similar fields. They're likely to have different relationships with other entities. I.E. Salesman and Customer are both "people" and will share a lot of similar information, like name etc. but the way you'll use them in your system is likely to be very different. You're unlikely ever to pay commission to a customer or send an invoice to a salesman. And you'll probably never going to want to see a combined view of salesmen and customers. There is little to be gained from putting entities like this into an inheritance structure.

    2. Single table with redundancy (i.e. nulls). Everything goes into a single table and an entity that doesn't need to use a particular field simply gets a null in it. This is a fairly simple design and is useful when the entities are very similar in their structure with only a few differences and when you're frequently going to want to use them in a similar manner. The advantage is that combines views are easy to pull up but the disadvantage is that you've got a lot of empty space in there (which is likely to be taking up disk, depending on how you've set your datatype). Strictly speaking it's a violation of third normal form.

    3. Single table with Children. All the "common" fields are in the parent table and you have child tables for each different entity. The primary key for the child tables match the primary key of the parent table and should also act as the foreign key (this ensures a 0 to 1 relationship). This is arguably the "correct" approach and is the only one that truly conforms to Third Normal Form. However, it's got lots of pitfalls. Although each parent entity can only have a single child of a given type, there's no easy way to ensure that a single parent entity doesn't have children of several different type. It can also become very unwieldy if the inheritance structure is complex (eg type A and B share some field that C does not). Finally, your queries with this approach can become pretty unwieldy when looking for combined views as you have to left join to every potential child entity. That said, it will give you the smallest space on disk and it's likely that your queries, while a bit ugly, will perform better than approach 2.
    Last edited by FunkyDexter; Jun 4th, 2019 at 02:51 AM.
    The best argument against democracy is a five minute conversation with the average voter - Winston Churchill

    Hadoop actually sounds more like the way they greet each other in Yorkshire - Inferrd

  6. #6
    PowerPoster Zvoni's Avatar
    Join Date
    Sep 2012
    Location
    To the moon and then left
    Posts
    4,418

    Re: Help with improving my SQL query (SQLite)

    IMO there is a 4th way to FD's three approaches:
    A combination of 1 and 2 of FD's approaches.
    It's basically his 3rd approach but turned around:
    The parent-tables are the tables with the unique fields, the child-table is the table with the common fields.
    the child-table has its own PK, it has a combined FK of "Parent-Table" and "Parent-PK" in a 1:1-relationship to the Parent-Table
    To stay in FD's example with Salesman and customer:
    The Child-Table would have the Columns of Name, FamilyName, Address, ZIPCode, City, Country, Phone etc.
    The Parent-Table would have Columns like "Commission" (for Salesman) or "Balance" (For Customer)

    my 2 cents

    No idea about query-complexity though.
    I just added it for completeness' sake

    Opinions?
    Last edited by Zvoni; Tomorrow at 31:69 PM.
    ----------------------------------------------------------------------------------------

    One System to rule them all, One Code to find them,
    One IDE to bring them all, and to the Framework bind them,
    in the Land of Redmond, where the Windows lie
    ---------------------------------------------------------------------------------
    People call me crazy because i'm jumping out of perfectly fine airplanes.
    ---------------------------------------------------------------------------------
    Code is like a joke: If you have to explain it, it's bad

  7. #7
    Super Moderator FunkyDexter's Avatar
    Join Date
    Apr 2005
    Location
    An obscure body in the SK system. The inhabitants call it Earth
    Posts
    7,900

    Re: Help with improving my SQL query (SQLite)

    a combined FK of "Parent-Table" and "Parent-PK"
    The thing to remember here is that you can't set up a foreign key constraint that does that because "Parent Table" isn't actually a field in the parent table - it's a table name. So you can't enforce consistency in the database without setting up programmed constraints.

    I will say that the TableName/TableValue approach is usually considered bad practice. If it's used for normal values then it's known as the entity attribute value pair anti pattern. Used for relationships it means no data integrity constraints and particularly complex querying (trust me, I'm working with an insurance system where they've done that and it's a PITA to query).

    That said I've never seen it used to implement inheritance. I think it's worth consideration but it's also worth approaching with caution and considering the known issues with it. I suspect it'll have some pitfalls but all the approaches I described come with pitfalls too.
    The best argument against democracy is a five minute conversation with the average voter - Winston Churchill

    Hadoop actually sounds more like the way they greet each other in Yorkshire - Inferrd

  8. #8
    PowerPoster Zvoni's Avatar
    Join Date
    Sep 2012
    Location
    To the moon and then left
    Posts
    4,418

    Re: Help with improving my SQL query (SQLite)

    It's the reason why i mentioned it.
    Better to have the pro's and con's out in the open instead of figuring out later it's actually the approach used and then trying to repair everything.

    Ideas are like ammunition: better having it and not needing it, than the other way 'round

    EDIT: Just a crazy thought i had regarding the "Parent-Table"-Constraint:
    Since it's SQLite: what about using the ROWID of the sqlite_master-Entry for the Parent-Table?
    But please remember: It's a crazy (!!) thought....
    Last edited by Zvoni; Tomorrow at 31:69 PM.
    ----------------------------------------------------------------------------------------

    One System to rule them all, One Code to find them,
    One IDE to bring them all, and to the Framework bind them,
    in the Land of Redmond, where the Windows lie
    ---------------------------------------------------------------------------------
    People call me crazy because i'm jumping out of perfectly fine airplanes.
    ---------------------------------------------------------------------------------
    Code is like a joke: If you have to explain it, it's bad

  9. #9
    Super Moderator FunkyDexter's Avatar
    Join Date
    Apr 2005
    Location
    An obscure body in the SK system. The inhabitants call it Earth
    Posts
    7,900

    Re: Help with improving my SQL query (SQLite)

    what about using the ROWID of the sqlite_master-Entry for the Parent-Table?
    I honestly don't know enough about sqllite to comment but it sounds analogous to the Information_Schema in SQL Server. The two questions would be:-
    1. Is the constraint logically expressable?
    2. Are you allowed to reference system entities in that way?

    If the answers Yes to both of those I imagine it would work.
    The best argument against democracy is a five minute conversation with the average voter - Winston Churchill

    Hadoop actually sounds more like the way they greet each other in Yorkshire - Inferrd

  10. #10

    Thread Starter
    PowerPoster
    Join Date
    Aug 2010
    Location
    Canada
    Posts
    2,412

    Re: Help with improving my SQL query (SQLite)

    Hi everyone, thanks a lot for all the feedback there's some great stuff here.

    I haven't used CTEs before, but I've been doing some studying up on them, so thanks for pointing them out techgnome and Carlos. I can see some uses for them and they'll be a good tool to add to my toolbox.

    Funky - that query is great and does exactly what I need in a way that's easy to understand at a glance, and easy to generate dynamically through code so I think I'm going to go ahead and use that approach - I don't have the option to change the database schema right now to use any of the other suggestions (thanks to both you and Zvoni for the additional commentary!), so your approach looks like my preferred choice right now.

    Again, thank you all! I'll mark this resolved.

  11. #11
    PowerPoster Zvoni's Avatar
    Join Date
    Sep 2012
    Location
    To the moon and then left
    Posts
    4,418

    Re: [RESOLVED] Help with improving my SQL query (SQLite)

    A last note to the 4th approach i mentioned:
    As unlikely it might be, that you come into a situation that you have to use it
    (trying to avoid it is still the best option!), you have to know, that in that 4th approach you cannot implement a "hard"
    constraint for the DB-engine to enforce integrity, like deleting just a record from the parent-table ("Salesman") it will
    automatically delete the corresponding record from the child-table ("Common Fields").
    You have an entry from "Salesman" with ID 4 as FK in the child-table, but you also have an entry in "Customer" with ID 4 as FK in the child-table.
    What happens if you delete Salesman with ID4? Is it going to JUST delete the Salesman-childdata with FK=4, or is it going to delete the customer-child-data with FK=4 too?
    In case you ever have to use the 4th approach, you have to implement a "soft" constraint, meaning you have to do all operations from your frontend or you can use triggers (before DELETE, After UPDATE etc.),
    because with triggers you can then use the "table-name"-Entity in the SQL-Statement inside the trigger
    Last edited by Zvoni; Tomorrow at 31:69 PM.
    ----------------------------------------------------------------------------------------

    One System to rule them all, One Code to find them,
    One IDE to bring them all, and to the Framework bind them,
    in the Land of Redmond, where the Windows lie
    ---------------------------------------------------------------------------------
    People call me crazy because i'm jumping out of perfectly fine airplanes.
    ---------------------------------------------------------------------------------
    Code is like a joke: If you have to explain it, it's bad

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  



Click Here to Expand Forum to Full Width