-
Aug 12th, 2022, 12:28 PM
#1
[RESOLVED] Query of a Query
I have encountered an odd situation and am asking for help writing a query.
I have two tables. They have primary keys, but those are different between the two tables, so that field, unfortunately, has to be ignored for this question. The two tables should, in theory, be identical in every way EXCEPT the primary keys, which should be different for each record. The rest of the data consists of many fields, an unfortunate number of which are mind boggling mistakes, but that's a different matter. The key point is that table B is missing 1300 records that are found in the table A. Table A is correct.
What I'm trying to do is look for some explanation for the missing records, so I'm trying to write a query that finds them. As far as I can tell, aside from the primary key field, there is no field, or combination of fields, that is unique, so I can't say "find everything in table A that is not found in table B".
My first thought is that there is a key date field, which is just date with no time. For any given date, there can be hundreds, or even a few thousand, records, so they are far from unique, but a query that would help would be:
Find the difference in the count of records with the same date between tables A and B. For example, if table A has 20 on 1/1/2022, and table B has 15 on 1/1/2022, then return 5 for 1/1/2022. This would result in a result set with the date in one column and the difference in another.
I believe I could hack this together, though I have never written such a query. I just suspect that somebody on here can do it off the top of their head.
My usual boring signature: Nothing
-
Aug 12th, 2022, 12:39 PM
#2
Re: Query of a Query
Here's a quick thought:
Code:
SELECT A.DateField, ACount - BCount
FROM ( SELECT DateField, Count(*) as ACount
FROM TableA
GROUP BY DateField
) as A
LEFT JOIN
( SELECT DateField, Count(*) as BCount
FROM TableB
GROUP BY DateField
) as B
ON A.DateField = B.DateField
...but note that you'll probably want to use a more complicated version of ACount - BCount to cope with instances of no matching records in TableB (so BCount is likely to be null)
Last edited by si_the_geek; Aug 12th, 2022 at 03:57 PM.
Reason: had missed out Group By
-
Aug 12th, 2022, 03:13 PM
#3
Re: Query of a Query
That looks like it will mostly work. I'll have to see about Nulls.
My usual boring signature: Nothing
-
Aug 12th, 2022, 03:55 PM
#4
Re: Query of a Query
Darn. There's no pattern to the data. It's just goofiness all over the place. A few missing here, a few missing there. A few dozen on one day, a couple on another.
My usual boring signature: Nothing
-
Aug 12th, 2022, 03:57 PM
#5
Re: Query of a Query
Cool, by the way I missed something earlier... you'll need to add GROUP BY DateField to each of the sub-queries.
-
Aug 12th, 2022, 03:58 PM
#6
Re: Query of a Query
Originally Posted by Shaggy Hiker
Darn. There's no pattern to the data. It's just goofiness all over the place. A few missing here, a few missing there. A few dozen on one day, a couple on another.
That's annoying
-
Aug 12th, 2022, 05:33 PM
#7
Re: Query of a Query
Originally Posted by si_the_geek
Cool, by the way I missed something earlier... you'll need to add GROUP BY DateField to each of the sub-queries.
Yeah, I found that, of course. A modest change to make.
My usual boring signature: Nothing
-
Aug 16th, 2022, 01:22 AM
#8
Re: [RESOLVED] Query of a Query
Yeah. "Officially" resolved (just saw the post now)
After reading your description of the problem, my first try would be:
1) make copies of the two tables (since we don't want to destroy them)
2) Declare all fields (except the primary key) of each table as unique combined (kind of when you declare a compound primary key)
That way you can easily see if there is duplicate data/records in each table.
If no, then it's a simple LEFT JOIN (to check for missing rows), or an UPSERT/INSERT IGNORE (depending on the DBMS) from Table A into Table B
Last edited by Zvoni; Tomorrow at 31:69 PM.
----------------------------------------------------------------------------------------
One System to rule them all, One Code to find them,
One IDE to bring them all, and to the Framework bind them,
in the Land of Redmond, where the Windows lie
---------------------------------------------------------------------------------
People call me crazy because i'm jumping out of perfectly fine airplanes.
---------------------------------------------------------------------------------
Code is like a joke: If you have to explain it, it's bad
-
Aug 17th, 2022, 02:30 PM
#9
Re: [RESOLVED] Query of a Query
Having looked more closely at a wider selection of data, UPSERT/INSERT IGNORE is not going to be happening.
The data is coming from an online DB via an automatic download. That download includes no indication (currently) of when records were added or edited. When this was started, we assumed that nobody would add/edit records more than a month back. That assumption has proven to be...non functional. Therefore, the first thing I'm going to try to do is get the download to include insert date and edit date fields, if those are available.
Basically, the process is working nicely. The data...is not so goodly.
My usual boring signature: Nothing
-
Aug 17th, 2022, 02:53 PM
#10
Re: [RESOLVED] Query of a Query
There are no Transaction-logs on your db-server?
Last edited by Zvoni; Tomorrow at 31:69 PM.
----------------------------------------------------------------------------------------
One System to rule them all, One Code to find them,
One IDE to bring them all, and to the Framework bind them,
in the Land of Redmond, where the Windows lie
---------------------------------------------------------------------------------
People call me crazy because i'm jumping out of perfectly fine airplanes.
---------------------------------------------------------------------------------
Code is like a joke: If you have to explain it, it's bad
-
Aug 18th, 2022, 01:56 PM
#11
Re: [RESOLVED] Query of a Query
Not my server, actually. What is happening is that a third party is getting data from sundry folks, then we go to an API they provide to get data from them. We made an assumption, based on the nature of the data, that nobody would be adding/editing data after a month and a half, roughly. That assumption has proven to be wrong, but since the output includes no information as to when a record was added/edited, we're a bit hamstrung.
My usual boring signature: Nothing
-
Aug 19th, 2022, 01:54 AM
#12
Re: [RESOLVED] Query of a Query
Originally Posted by Shaggy Hiker
Not my server, actually. What is happening is that a third party is getting data from sundry folks, then we go to an API they provide to get data from them. We made an assumption, based on the nature of the data, that nobody would be adding/editing data after a month and a half, roughly. That assumption has proven to be wrong, but since the output includes no information as to when a record was added/edited, we're a bit hamstrung.
Oh dear.
Well, should be a lesson to those other parties......
and maybe a redesign of the output they provide.....
Last edited by Zvoni; Tomorrow at 31:69 PM.
----------------------------------------------------------------------------------------
One System to rule them all, One Code to find them,
One IDE to bring them all, and to the Framework bind them,
in the Land of Redmond, where the Windows lie
---------------------------------------------------------------------------------
People call me crazy because i'm jumping out of perfectly fine airplanes.
---------------------------------------------------------------------------------
Code is like a joke: If you have to explain it, it's bad
-
Aug 19th, 2022, 09:25 AM
#13
Re: [RESOLVED] Query of a Query
Originally Posted by Zvoni
Oh dear.
Well, should be a lesson to those other parties......
and maybe a redesign of the output they provide.....
Yeah...no comment.
My usual boring signature: Nothing
Posting Permissions
- You may not post new threads
- You may not post replies
- You may not post attachments
- You may not edit your posts
-
Forum Rules
|
Click Here to Expand Forum to Full Width
|