1. ## [RESOLVED] Abnormal Unrolling

I can't believe I'm asking this, but here it is: We have a normalized database where somebody insists that it would be better if it were somewhat de-normalized. Everybody else thinks that's a bad idea, and it's not going to happen, but I thought there might be an intermediate solution to have it both ways, and that's what this question is about.

Suppose two tables:

Table 1:

PK
SomeStuff
SomeCount

Table 2:
PK
FK
OtherStuff
MoreOtherStuff.

If I had a query with a LEFT OUTER JOIN, it could look like this:

SELECT SomeStuff,SomeCount,OtherStuff,MoreOtherStuff
From Table1 LEFT OUTER JOIN Table 2 ON Table1.PK = Table2.FK

This will do almost exactly what the person who wants the tables denormalized wants, with one exception.

The SomeCount field in Table 1 could be some integer of 1 or greater. Quite often, if the count is > 1, then there will be no records in Table 2, but there MIGHT be. If I could say for certain that whenever SomeCount > 1, then there will be no record in table 2, then the LEFT OUTER JOIN query would return only one record for each record in Table 1, and a Sum of SomeCount will be correct.

I can't say that, though, so SomeCount could be 3, and the LEFT OUTER JOIN query might result in two or three rows, each with SomeCount = 3, and with whatever data from the two or three rows in Table 2. That would mean that a Sum of SomeCount would be different from just summing the SomeCount in Table 1. In other words:

SELECT SUM(SomeCount) FROM Table1

Would not return the same value as:

SELECT SUM(SomeCount)
From Table1 LEFT OUTER JOIN Table 2 ON Table1.PK = Table2.FK

This would bother me not at all, but it may bother the one guy who insists on denormalizing the database. So, what I'm asking is whether or not there is some reasonable approach that would essentially 'unroll' the case where the LEFT OUTER JOIN results in more than one record?

That's a more difficult problem than it might appear at first, so I don't think there is a REASONABLE way to do this (and it's too stupid a need to waste any time on, when it could be solved so much easier through code). The issue is that if SomeCount > 1, then there need not be SomeCount records in Table 2. In other words, if SomeCount were 5, there could be 3 records in Table 2, so the unrolled view would have to end up looking like this:

SomeStuff,1,OtherStuff,MoreOtherStuff
SomeStuff,1,OtherStuff,MoreOtherStuff
SomeStuff,1,OtherStuff,MoreOtherStuff
SomeStuff,2,Null,Null

Where that last record has a count of 2, as there are two records with no matching records in Table 2.

Is there a simple way to do this?

2. ## Re: Abnormal Unrolling

Either a CTE that gets the sum independent of the other table, or a sub query that does the same sounds like the way to solve it ... maybe? Also, it sounds like he wants a Data Warehouse which would have everything denormalized and flattened. again, maybe?

-tg

3. ## Re: Abnormal Unrolling

Yeah, that's kind of where I'm coming down, also. He'll be the only person using that monstrosity, and I'm not quite sure what he's using it for. As far as I know, he never needs both the count AND the information from table 2 at the same time, so I'm inclined to offer up the LEFT OUTER JOIN with the caveat that he can't sum the Count on that view. If he wants the total count, he can sum Table 1 directly. I don't believe he ever needs both.

The whole thing is absurd to begin with.

4. ## Re: Abnormal Unrolling

Do you have any concrete examples? I'm trying to wrap my mind around how a correct result might look like.

But i tend in tg's direction, too: Sum first, Join second

EDIT: I think i figured it out, what it should look like (missed your last sample)
Give me a bit of time to think on it

I take it, it's MS SQL-Server?

EDIT2: Done in SQLite. Couldn't find a fiddle with MS-SQL
Code:
```WITH
T2    AS (SELECT FK, OtherStuff, MoreStuff,
ROW_NUMBER() OVER(PARTITION BY PK ORDER BY PK) AS RNPK,
ROW_NUMBER() OVER(PARTITION BY FK ORDER BY PK) AS RNFK
FROM Table2)

SELECT
T1.SomeStuff, T2.RNPK As SomeCount, T2.OtherStuff, T2.MoreStuff
FROM Table1 AS T1 LEFT JOIN T2 ON T1.PK=T2.FK
UNION ALL
SELECT T3.SomeStuff, T3.SomeCount-COALESCE(MAX(T2.RNFK), 0), Null, Null
FROM table1 AS T3 LEFT JOIN T2 ON T3.PK=T2.FK GROUP BY T3.SomeStuff
HAVING T3.SomeCount-COALESCE(MAX(T2.RNFK), 0)>0```
Returns
SomeStuff SomeCount OtherStuff MoreStuff
Shaggy 1 Hiker1 Burp
Shaggy 1 Hiker2 Fart
Shaggy 1 Hiker3 Gulp
Shaggy 2

If you change SomeCount to 3 in T1, the last line is not returned

EDIT3: Found a fiddle: https://dbfiddle.uk/IYmGLyJG

had to change the query a bit, since MSSQL is stricter than SQLite

Tables
Code:
```CREATE TABLE table1 (
PK    INTEGER,
SomeStuff    VARCHAR(255),
SomeCount    INTEGER
);
CREATE TABLE table2 (
PK    INTEGER,
FK    INTEGER,
OtherStuff    VARCHAR(255),
MoreStuff    VARCHAR(255)
);```
Inserts
Code:
```INSERT INTO table1 (PK, SomeStuff, SomeCount) VALUES (1, 'Shaggy', 5);
INSERT INTO table1 (PK, SomeStuff, SomeCount) VALUES (2, 'DDay', 3);
INSERT INTO table2 (PK, FK, OtherStuff, MoreStuff) VALUES (1, 1, 'Hiker1', 'Burp');
INSERT INTO table2 (PK, FK, OtherStuff, MoreStuff) VALUES (2, 1, 'Hiker2', 'Fart');
INSERT INTO table2 (PK, FK, OtherStuff, MoreStuff) VALUES (3, 1, 'Hiker3', 'Gulp');```
Query
Code:
```WITH
T2  AS (SELECT PK, FK, OtherStuff, MoreStuff, 1 As RNPK FROM Table2),
T4  AS (SELECT FK, COUNT(PK) AS RNFK FROM T2 GROUP BY FK)

SELECT
T1.SomeStuff, COALESCE(T2.RNPK, T1.SomeCount) As SomeCount, T2.OtherStuff, T2.MoreStuff
FROM Table1 AS T1 LEFT JOIN T2 ON T1.PK=T2.FK
UNION ALL
SELECT T3.SomeStuff, T3.SomeCount-T4.RNFK, Null, Null
FROM table1 AS T3 INNER JOIN T4 ON T3.PK=T4.FK
WHERE T3.SomeCount-T4.RNFK>0
ORDER BY T1.SomeStuff, SomeCount```
If you need an explanation how it works, just give a shout

5. ## Re: Abnormal Unrolling

Wow, you even got the MoreStuff items right, and I didn't even include them

I might just see whether or not that makes everybody happy, whether it is a bridge too far, or whether people just want to fight about this for the sake of it. Won't be this week, though.

6. ## Re: Abnormal Unrolling

Yeah, that works. Clever approach. After yet another meeting, I was asked to put it together so that we can whip it out if needed.

7. ## Re: Abnormal Unrolling

Originally Posted by Shaggy Hiker
Yeah, that works. Clever approach. After yet another meeting, I was asked to put it together so that we can whip it out if needed.
Just don't Burp and Fart during one of those meetings

Rawhide!

9. ## Re: [RESOLVED] Abnormal Unrolling

Here is my attempt

Code:
```SELECT      SUM(t2.SomeCount)
FROM        (
SELECT      t2.FK
, 1                 AS SomeCount
, t2.OtherStuff
, t2.MoreStuff
FROM        table2 t2

UNION ALL

SELECT      t1.PK
, MAX(t1.SomeCount) - COUNT(t2.PK) AS SomeCount
, NULL              AS OtherStuff
, NULL              AS MoreStuff
FROM        table1 t1
LEFT JOIN   table2 t2
ON          t2.FK = t1.PK
GROUP BY    t1.PK
HAVING      MAX(t1.SomeCount) - COUNT(t2.PK) <> 0
) t2```
And here is my reasoning:
- I'm adding a new column SomeCount to t2 with a constant value of 1.
- I'm adding "missing" rows to t2 with enough SomeCount so that for each t2.FK there is enough SUM(t2.SomeCount) to match the total in t1.SomeCount. This is what the second SELECT after UNION ALL does.
- On new rows the values of OtherStuff, MoreStuff, CoolStuff, etc. are NULLs
- I'm using basic ANSI SQL-92 -- no CTEs, no windowing functions so this should work on MSSQL 2000, MS Access, dBase (probably:-))

cheers,
</wqw>

10. ## Re: [RESOLVED] Abnormal Unrolling

The reason for my way (since wqw explained his, i'll add my explanation to it, since someone else might learn something from it):

1) CTE --> for DBMS not supporting CTE's (e.g. MySQL5.7 and lower) it's easy to resolve the CTE's as nested Queries within its according JOIN
2) T2: I like to "prepare" the Dataset beforehand
3) T4: The reason why T4 selects from T2 (instead from table2 directly) is:
If you "introduce" a Filter ("WHERE ....") on T2 (whether directly within T2 or in the upper half of the UNION, though there within the ON-Clause of the LEFT JOIN --> NOT IN A WHERE-CLAUSE!!),
that Filter propagates automatically to T4! No need to filter in multiple places
4) In the upper half of the UNION i use a LEFT JOIN since we might get an unequal "count" of childrecords compared to "SomeCount". The Coalesce ensures that if NO childrecords exist to return the "SomeCount" from the Parentrecord
5) In the lower half of the UNION i use an INNER JOIN, since the "overhanging SomeCount" is only applicable, if there actually exist Childrecords. It's a simple subtraction with no need for GROUP BY
6) The only aggregation with a GROUP BY is very simple within T4 --> Frankly, Aggregations and its GROUP BY's have driven me barmy in the past, so i try to keep them as simple as possible, nevermind that i loathe Aggregation/Grouping in the "main" Query

EDIT: @1
here without CTE's
https://dbfiddle.uk/MOEO-0I_
Code:
```SELECT
T1.SomeStuff, COALESCE(T2.RNPK, T1.SomeCount) As SomeCount, T2.OtherStuff, T2.MoreStuff
FROM Table1 AS T1 LEFT JOIN
(SELECT PK, FK, OtherStuff, MoreStuff, 1 As RNPK FROM Table2) AS T2
ON T1.PK=T2.FK
UNION ALL
SELECT T3.SomeStuff, T3.SomeCount-T4.RNFK, Null, Null
FROM table1 AS T3 INNER JOIN
(SELECT FK, COUNT(PK) AS RNFK FROM
(SELECT PK, FK, OtherStuff, MoreStuff, 1 As RNPK FROM Table2)
GROUP BY FK) AS T4
ON T3.PK=T4.FK
WHERE T3.SomeCount-T4.RNFK>0
ORDER BY T1.SomeStuff, SomeCount```
Same results,
.... BUT: As you can see, i execute the Query on table2 twice!

11. ## Re: [RESOLVED] Abnormal Unrolling

Originally Posted by Zvoni
6) The only aggregation with a GROUP BY is very simple within T4 --> Frankly, Aggregations and its GROUP BY's have driven me barmy in the past, so i try to keep them as simple as possible, nevermind that i loathe Aggregation/Grouping in the "main" Query
You cannot possibly do it without aggregation here and using GROUP BY in the "main" is most probable to use any indexing on the base table compared to windowed functions on CTEs on derived tables on nested views.

cheers,
</wqw>

12. ## Re: [RESOLVED] Abnormal Unrolling

One thing this made me realize is that I've been using something like CTEs for a long time. What we were doing was views of views. The query analyzer doesn't do a good job with those, because it gives up on the optimizing at some point. I came in late to the party, and have stuck with what people were comfortable with, but CTEs would have allowed the query analyzer to work more efficiently. However, since the folks I work with tend to be happier in Access than in SSMS, I'm not sure that they'd be willing to come around to using CTEs. I like to write SQL freehand, while they prefer 'designers' of one sort or another, and try to avoid writing SQL directly. I'm not sure you can write a CTE in SSMS, but you sure can't in Access.

13. ## Re: [RESOLVED] Abnormal Unrolling

Originally Posted by Shaggy Hiker
I'm not sure you can write a CTE in SSMS,
Yes, you can (to quote Obama :-))

but you sure can't in Access.
No, you can't (to quote Trump *?*)

14. ## Re: [RESOLVED] Abnormal Unrolling

That was poorly stated on my part. I know you can write a CTE in SSMS. What I meant to say was that I'm not sure you can write a CTE in SSMS without writing SQL. If all you do is drag tables into a designer, fiddle with joins, etc., without ever touching the raw SQL, is it still possible to create a CTE?

15. ## Re: [RESOLVED] Abnormal Unrolling

Originally Posted by Shaggy Hiker
That was poorly stated on my part. I know you can write a CTE in SSMS. What I meant to say was that I'm not sure you can write a CTE in SSMS without writing SQL. If all you do is drag tables into a designer, fiddle with joins, etc., without ever touching the raw SQL, is it still possible to create a CTE?
Ahh….. don’t think so.

OTOH, never used those „click and dragdrop“-thingies in any db-client.
always used SQL, so I shouldn’t talk

#### Posting Permissions

• You may not post new threads
• You may not post replies
• You may not post attachments
• You may not edit your posts
•