Results 1 to 13 of 13

Thread: [RESOLVED] Query of a Query

  1. #1

    Thread Starter
    Super Moderator Shaggy Hiker's Avatar
    Join Date
    Aug 2002
    Location
    Idaho
    Posts
    38,988

    Resolved [RESOLVED] Query of a Query

    I have encountered an odd situation and am asking for help writing a query.

    I have two tables. They have primary keys, but those are different between the two tables, so that field, unfortunately, has to be ignored for this question. The two tables should, in theory, be identical in every way EXCEPT the primary keys, which should be different for each record. The rest of the data consists of many fields, an unfortunate number of which are mind boggling mistakes, but that's a different matter. The key point is that table B is missing 1300 records that are found in the table A. Table A is correct.

    What I'm trying to do is look for some explanation for the missing records, so I'm trying to write a query that finds them. As far as I can tell, aside from the primary key field, there is no field, or combination of fields, that is unique, so I can't say "find everything in table A that is not found in table B".

    My first thought is that there is a key date field, which is just date with no time. For any given date, there can be hundreds, or even a few thousand, records, so they are far from unique, but a query that would help would be:

    Find the difference in the count of records with the same date between tables A and B. For example, if table A has 20 on 1/1/2022, and table B has 15 on 1/1/2022, then return 5 for 1/1/2022. This would result in a result set with the date in one column and the difference in another.

    I believe I could hack this together, though I have never written such a query. I just suspect that somebody on here can do it off the top of their head.
    My usual boring signature: Nothing

  2. #2
    Super Moderator si_the_geek's Avatar
    Join Date
    Jul 2002
    Location
    Bristol, UK
    Posts
    41,929

    Re: Query of a Query

    Here's a quick thought:
    Code:
    SELECT A.DateField, ACount - BCount
    FROM  (  SELECT DateField, Count(*) as ACount
             FROM TableA
             GROUP BY DateField
          ) as A
    LEFT JOIN
          (  SELECT DateField, Count(*) as BCount
             FROM TableB
             GROUP BY DateField
          ) as B
         ON A.DateField = B.DateField
    ...but note that you'll probably want to use a more complicated version of ACount - BCount to cope with instances of no matching records in TableB (so BCount is likely to be null)
    Last edited by si_the_geek; Aug 12th, 2022 at 03:57 PM. Reason: had missed out Group By

  3. #3

    Thread Starter
    Super Moderator Shaggy Hiker's Avatar
    Join Date
    Aug 2002
    Location
    Idaho
    Posts
    38,988

    Re: Query of a Query

    That looks like it will mostly work. I'll have to see about Nulls.
    My usual boring signature: Nothing

  4. #4

    Thread Starter
    Super Moderator Shaggy Hiker's Avatar
    Join Date
    Aug 2002
    Location
    Idaho
    Posts
    38,988

    Re: Query of a Query

    Darn. There's no pattern to the data. It's just goofiness all over the place. A few missing here, a few missing there. A few dozen on one day, a couple on another.
    My usual boring signature: Nothing

  5. #5
    Super Moderator si_the_geek's Avatar
    Join Date
    Jul 2002
    Location
    Bristol, UK
    Posts
    41,929

    Re: Query of a Query

    Cool, by the way I missed something earlier... you'll need to add GROUP BY DateField to each of the sub-queries.

  6. #6
    Super Moderator si_the_geek's Avatar
    Join Date
    Jul 2002
    Location
    Bristol, UK
    Posts
    41,929

    Re: Query of a Query

    Quote Originally Posted by Shaggy Hiker View Post
    Darn. There's no pattern to the data. It's just goofiness all over the place. A few missing here, a few missing there. A few dozen on one day, a couple on another.
    That's annoying

  7. #7

    Thread Starter
    Super Moderator Shaggy Hiker's Avatar
    Join Date
    Aug 2002
    Location
    Idaho
    Posts
    38,988

    Re: Query of a Query

    Quote Originally Posted by si_the_geek View Post
    Cool, by the way I missed something earlier... you'll need to add GROUP BY DateField to each of the sub-queries.
    Yeah, I found that, of course. A modest change to make.
    My usual boring signature: Nothing

  8. #8
    PowerPoster Zvoni's Avatar
    Join Date
    Sep 2012
    Location
    To the moon and then left
    Posts
    4,418

    Re: [RESOLVED] Query of a Query

    Yeah. "Officially" resolved (just saw the post now)
    After reading your description of the problem, my first try would be:

    1) make copies of the two tables (since we don't want to destroy them)
    2) Declare all fields (except the primary key) of each table as unique combined (kind of when you declare a compound primary key)

    That way you can easily see if there is duplicate data/records in each table.
    If no, then it's a simple LEFT JOIN (to check for missing rows), or an UPSERT/INSERT IGNORE (depending on the DBMS) from Table A into Table B
    Last edited by Zvoni; Tomorrow at 31:69 PM.
    ----------------------------------------------------------------------------------------

    One System to rule them all, One Code to find them,
    One IDE to bring them all, and to the Framework bind them,
    in the Land of Redmond, where the Windows lie
    ---------------------------------------------------------------------------------
    People call me crazy because i'm jumping out of perfectly fine airplanes.
    ---------------------------------------------------------------------------------
    Code is like a joke: If you have to explain it, it's bad

  9. #9

    Thread Starter
    Super Moderator Shaggy Hiker's Avatar
    Join Date
    Aug 2002
    Location
    Idaho
    Posts
    38,988

    Re: [RESOLVED] Query of a Query

    Having looked more closely at a wider selection of data, UPSERT/INSERT IGNORE is not going to be happening.

    The data is coming from an online DB via an automatic download. That download includes no indication (currently) of when records were added or edited. When this was started, we assumed that nobody would add/edit records more than a month back. That assumption has proven to be...non functional. Therefore, the first thing I'm going to try to do is get the download to include insert date and edit date fields, if those are available.

    Basically, the process is working nicely. The data...is not so goodly.
    My usual boring signature: Nothing

  10. #10
    PowerPoster Zvoni's Avatar
    Join Date
    Sep 2012
    Location
    To the moon and then left
    Posts
    4,418

    Re: [RESOLVED] Query of a Query

    There are no Transaction-logs on your db-server?
    Last edited by Zvoni; Tomorrow at 31:69 PM.
    ----------------------------------------------------------------------------------------

    One System to rule them all, One Code to find them,
    One IDE to bring them all, and to the Framework bind them,
    in the Land of Redmond, where the Windows lie
    ---------------------------------------------------------------------------------
    People call me crazy because i'm jumping out of perfectly fine airplanes.
    ---------------------------------------------------------------------------------
    Code is like a joke: If you have to explain it, it's bad

  11. #11

    Thread Starter
    Super Moderator Shaggy Hiker's Avatar
    Join Date
    Aug 2002
    Location
    Idaho
    Posts
    38,988

    Re: [RESOLVED] Query of a Query

    Not my server, actually. What is happening is that a third party is getting data from sundry folks, then we go to an API they provide to get data from them. We made an assumption, based on the nature of the data, that nobody would be adding/editing data after a month and a half, roughly. That assumption has proven to be wrong, but since the output includes no information as to when a record was added/edited, we're a bit hamstrung.
    My usual boring signature: Nothing

  12. #12
    PowerPoster Zvoni's Avatar
    Join Date
    Sep 2012
    Location
    To the moon and then left
    Posts
    4,418

    Re: [RESOLVED] Query of a Query

    Quote Originally Posted by Shaggy Hiker View Post
    Not my server, actually. What is happening is that a third party is getting data from sundry folks, then we go to an API they provide to get data from them. We made an assumption, based on the nature of the data, that nobody would be adding/editing data after a month and a half, roughly. That assumption has proven to be wrong, but since the output includes no information as to when a record was added/edited, we're a bit hamstrung.
    Oh dear.
    Well, should be a lesson to those other parties......
    and maybe a redesign of the output they provide.....
    Last edited by Zvoni; Tomorrow at 31:69 PM.
    ----------------------------------------------------------------------------------------

    One System to rule them all, One Code to find them,
    One IDE to bring them all, and to the Framework bind them,
    in the Land of Redmond, where the Windows lie
    ---------------------------------------------------------------------------------
    People call me crazy because i'm jumping out of perfectly fine airplanes.
    ---------------------------------------------------------------------------------
    Code is like a joke: If you have to explain it, it's bad

  13. #13

    Thread Starter
    Super Moderator Shaggy Hiker's Avatar
    Join Date
    Aug 2002
    Location
    Idaho
    Posts
    38,988

    Re: [RESOLVED] Query of a Query

    Quote Originally Posted by Zvoni View Post
    Oh dear.
    Well, should be a lesson to those other parties......
    and maybe a redesign of the output they provide.....
    Yeah...no comment.
    My usual boring signature: Nothing

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  



Click Here to Expand Forum to Full Width