[RESOLVED] Abnormal Unrolling
I can't believe I'm asking this, but here it is: We have a normalized database where somebody insists that it would be better if it were somewhat de-normalized. Everybody else thinks that's a bad idea, and it's not going to happen, but I thought there might be an intermediate solution to have it both ways, and that's what this question is about.
Suppose two tables:
Table 1:
PK
SomeStuff
SomeCount
Table 2:
PK
FK
OtherStuff
MoreOtherStuff.
If I had a query with a LEFT OUTER JOIN, it could look like this:
SELECT SomeStuff,SomeCount,OtherStuff,MoreOtherStuff
From Table1 LEFT OUTER JOIN Table 2 ON Table1.PK = Table2.FK
This will do almost exactly what the person who wants the tables denormalized wants, with one exception.
The SomeCount field in Table 1 could be some integer of 1 or greater. Quite often, if the count is > 1, then there will be no records in Table 2, but there MIGHT be. If I could say for certain that whenever SomeCount > 1, then there will be no record in table 2, then the LEFT OUTER JOIN query would return only one record for each record in Table 1, and a Sum of SomeCount will be correct.
I can't say that, though, so SomeCount could be 3, and the LEFT OUTER JOIN query might result in two or three rows, each with SomeCount = 3, and with whatever data from the two or three rows in Table 2. That would mean that a Sum of SomeCount would be different from just summing the SomeCount in Table 1. In other words:
SELECT SUM(SomeCount) FROM Table1
Would not return the same value as:
SELECT SUM(SomeCount)
From Table1 LEFT OUTER JOIN Table 2 ON Table1.PK = Table2.FK
This would bother me not at all, but it may bother the one guy who insists on denormalizing the database. So, what I'm asking is whether or not there is some reasonable approach that would essentially 'unroll' the case where the LEFT OUTER JOIN results in more than one record?
That's a more difficult problem than it might appear at first, so I don't think there is a REASONABLE way to do this (and it's too stupid a need to waste any time on, when it could be solved so much easier through code). The issue is that if SomeCount > 1, then there need not be SomeCount records in Table 2. In other words, if SomeCount were 5, there could be 3 records in Table 2, so the unrolled view would have to end up looking like this:
SomeStuff,1,OtherStuff,MoreOtherStuff
SomeStuff,1,OtherStuff,MoreOtherStuff
SomeStuff,1,OtherStuff,MoreOtherStuff
SomeStuff,2,Null,Null
Where that last record has a count of 2, as there are two records with no matching records in Table 2.
Is there a simple way to do this?
Re: [RESOLVED] Abnormal Unrolling
Re: [RESOLVED] Abnormal Unrolling
Here is my attempt
Code:
SELECT SUM(t2.SomeCount)
FROM (
SELECT t2.FK
, 1 AS SomeCount
, t2.OtherStuff
, t2.MoreStuff
FROM table2 t2
UNION ALL
SELECT t1.PK
, MAX(t1.SomeCount) - COUNT(t2.PK) AS SomeCount
, NULL AS OtherStuff
, NULL AS MoreStuff
FROM table1 t1
LEFT JOIN table2 t2
ON t2.FK = t1.PK
GROUP BY t1.PK
HAVING MAX(t1.SomeCount) - COUNT(t2.PK) <> 0
) t2
And here is my reasoning:
- I'm adding a new column SomeCount to t2 with a constant value of 1.
- I'm adding "missing" rows to t2 with enough SomeCount so that for each t2.FK there is enough SUM(t2.SomeCount) to match the total in t1.SomeCount. This is what the second SELECT after UNION ALL does.
- On new rows the values of OtherStuff, MoreStuff, CoolStuff, etc. are NULLs
- I'm using basic ANSI SQL-92 -- no CTEs, no windowing functions so this should work on MSSQL 2000, MS Access, dBase (probably:-))
cheers,
</wqw>
Re: [RESOLVED] Abnormal Unrolling
The reason for my way (since wqw explained his, i'll add my explanation to it, since someone else might learn something from it):
1) CTE --> for DBMS not supporting CTE's (e.g. MySQL5.7 and lower) it's easy to resolve the CTE's as nested Queries within its according JOIN
2) T2: I like to "prepare" the Dataset beforehand
3) T4: The reason why T4 selects from T2 (instead from table2 directly) is:
If you "introduce" a Filter ("WHERE ....") on T2 (whether directly within T2 or in the upper half of the UNION, though there within the ON-Clause of the LEFT JOIN --> NOT IN A WHERE-CLAUSE!!),
that Filter propagates automatically to T4! No need to filter in multiple places
4) In the upper half of the UNION i use a LEFT JOIN since we might get an unequal "count" of childrecords compared to "SomeCount". The Coalesce ensures that if NO childrecords exist to return the "SomeCount" from the Parentrecord
5) In the lower half of the UNION i use an INNER JOIN, since the "overhanging SomeCount" is only applicable, if there actually exist Childrecords. It's a simple subtraction with no need for GROUP BY
6) The only aggregation with a GROUP BY is very simple within T4 --> Frankly, Aggregations and its GROUP BY's have driven me barmy in the past, so i try to keep them as simple as possible, nevermind that i loathe Aggregation/Grouping in the "main" Query
EDIT: @1
here without CTE's
https://dbfiddle.uk/MOEO-0I_
Code:
SELECT
T1.SomeStuff, COALESCE(T2.RNPK, T1.SomeCount) As SomeCount, T2.OtherStuff, T2.MoreStuff
FROM Table1 AS T1 LEFT JOIN
(SELECT PK, FK, OtherStuff, MoreStuff, 1 As RNPK FROM Table2) AS T2
ON T1.PK=T2.FK
UNION ALL
SELECT T3.SomeStuff, T3.SomeCount-T4.RNFK, Null, Null
FROM table1 AS T3 INNER JOIN
(SELECT FK, COUNT(PK) AS RNFK FROM
(SELECT PK, FK, OtherStuff, MoreStuff, 1 As RNPK FROM Table2)
GROUP BY FK) AS T4
ON T3.PK=T4.FK
WHERE T3.SomeCount-T4.RNFK>0
ORDER BY T1.SomeStuff, SomeCount
Same results,
.... BUT: As you can see, i execute the Query on table2 twice!
Re: [RESOLVED] Abnormal Unrolling
Quote:
Originally Posted by
Zvoni
6) The only aggregation with a GROUP BY is very simple within T4 --> Frankly, Aggregations and its GROUP BY's have driven me barmy in the past, so i try to keep them as simple as possible, nevermind that i loathe Aggregation/Grouping in the "main" Query
You cannot possibly do it without aggregation here and using GROUP BY in the "main" is most probable to use any indexing on the base table compared to windowed functions on CTEs on derived tables on nested views.
cheers,
</wqw>
Re: [RESOLVED] Abnormal Unrolling
One thing this made me realize is that I've been using something like CTEs for a long time. What we were doing was views of views. The query analyzer doesn't do a good job with those, because it gives up on the optimizing at some point. I came in late to the party, and have stuck with what people were comfortable with, but CTEs would have allowed the query analyzer to work more efficiently. However, since the folks I work with tend to be happier in Access than in SSMS, I'm not sure that they'd be willing to come around to using CTEs. I like to write SQL freehand, while they prefer 'designers' of one sort or another, and try to avoid writing SQL directly. I'm not sure you can write a CTE in SSMS, but you sure can't in Access.
Re: [RESOLVED] Abnormal Unrolling
Quote:
Originally Posted by
Shaggy Hiker
I'm not sure you can write a CTE in SSMS,
Yes, you can (to quote Obama :-))
Quote:
but you sure can't in Access.
No, you can't (to quote Trump *?*)
Re: [RESOLVED] Abnormal Unrolling
That was poorly stated on my part. I know you can write a CTE in SSMS. What I meant to say was that I'm not sure you can write a CTE in SSMS without writing SQL. If all you do is drag tables into a designer, fiddle with joins, etc., without ever touching the raw SQL, is it still possible to create a CTE?
Re: [RESOLVED] Abnormal Unrolling
Quote:
Originally Posted by
Shaggy Hiker
That was poorly stated on my part. I know you can write a CTE in SSMS. What I meant to say was that I'm not sure you can write a CTE in SSMS without writing SQL. If all you do is drag tables into a designer, fiddle with joins, etc., without ever touching the raw SQL, is it still possible to create a CTE?
Ahh….. don’t think so.
OTOH, never used those „click and dragdrop“-thingies in any db-client.
always used SQL, so I shouldn’t talk :)