[RESOLVED] SQL Server - blank out col if another col's value has already been seen, in a query
Hi. If this isn't possible, I am happy with a "no" answer and I will solve it in my front-end C# code (which I think I can do) rather than the backend database.
I kind of struggled to say in a few words what I want to do, so excuse my Title (which was getting so long it got truncated!)
I have a query that is returning me rows that can have a duplicated value in one of the columns and when that happens I want one of the other column to only have a value in the first row. All subsequent rows should have a blank or zero.
This is a lot easier to explain in a picture:
SONumber = 1076431 is there 2x. This is because Emp 17 worked on a task on 1/5 and another task on 1/24 and both these tasks belong to that SO/Service Order.
SA Credit is a service agreement credit the customer has.
This data goes to a grid in my application and the company prez exports the grid to an Excel file. She wants to put a SUM formula at the bottom of the SA Credit column and add them all. Additional values of -539.00 overinflate how much credit the customer has. So ideally, row #2 should have -539 and then when we get to row #6, since we've already seen SONumber = 1076431, SA Credit should be 0. And for all the rest of the occurrences of 1076431, which can happen (I only ran this query for January for the purpose of less data, and for one employee).
Here is the query. I stripped this way down from the query that it really is, inside a sproc. But this is enough to illustrate (and maybe solve?) the problem.
Code:
-- Tues 2/4/25
SELECT
Trips.EmployeeID AS Emp,
jobs.sonumber,
CONVERT(varchar(10), Trips.ScheduledTime, 101) AS ScheduledDate,
CASE
WHEN (SELECT SUM(PRICE) FROM xqryP21LinesRelated WHERE SONO = Jobs.SONumber AND ITEM like 'SA[_]%' AND ITEM NOT LIKE 'SA[_]BILL%' AND ISNULL(disposition, '') != 'C') != 0
THEN (SELECT STR(SUM(PRICE),16,2) FROM xqryP21LinesRelated WHERE SONO = Jobs.SONumber AND ITEM like 'SA[_]%' AND ITEM NOT LIKE 'SA[_]BILL%' AND ISNULL(disposition, '') != 'C')
ELSE 'N'
END AS [SA Credit]
FROM Trips
INNER JOIN Jobs ON Trips.relJobControl = Jobs.Control
JOIN Emps on Emps.EmpID = Trips.EmployeeID
WHERE (Trips.EmployeeID = '17')
AND (Trips.ScheduledTime BETWEEN '1/1/24' AND '1/31/24')
ORDER BY Emp, ScheduledDate
Thank you.
There are 10 kinds of people in this world. Those who understand binary, and those who don't.
Re: SQL Server - blank out col if another col's value has already been seen, in a que
Here you go.
Note: I tried to recreate your Structure in SQLite, so be careful about it
Had to replace your IsNull with IfNull, or i was skipping your conversion-reoutine for the Date,
and i've left out the Employee-Table completely
Bottom line: You have to count up how many times the relJobControl occurs in Trips. That's what the ROW_NUMBER does
i've turned around your check for != 0
My CASE WHEN checks if there is a child-record in the first place (the Coalesce) or if the returned SUM(Price) of the Child-record = 0 then 'N' Else
the second/nested CASE WHEN checks if we are in the first Line of Trips (ordered by ScheduledTime, see ROW_NUMBER), then return the Price otherwise BLANK
Code:
WITH
TR AS (SELECT relJobControl,EmployeeID As Emp, ScheduledTime, ROW_NUMBER() OVER(PARTITION BY relJobControl ORDER BY ScheduledTime) AS RN
FROM Trips),
P21 AS (SELECT SONO, SUM(Price) AS Price
FROM xqryP21LinesRelated
WHERE ITEM like 'SA[_]%' AND ITEM NOT LIKE 'SA[_]BILL%' AND IFNULL(disposition, '') != 'C'
GROUP BY SONO)
SELECT
TR.Emp, jobs.sonumber,
--CONVERT(varchar(10), Trips.ScheduledTime, 101) AS ScheduledDate,
TR.ScheduledTime As ScheduledDate,
CASE WHEN COALESCE(P21.SONO, 'N')='N' OR P21.Price=0 THEN 'N' ELSE
(CASE WHEN TR.RN=1 THEN P21.Price ELSE '' END) END As 'SA Credit'
FROM TR
INNER JOIN Jobs ON TR.relJobControl = Jobs.Control
LEFT JOIN P21 ON P21.SONO=Jobs.SONumber
--JOIN Emps on Emps.EmpID = Trips.EmployeeID
WHERE (TR.Emp = 17)
AND (TR.ScheduledTime BETWEEN '01/01/24' AND '01/31/24')
ORDER BY Emp, ScheduledDate
Last edited by Zvoni; Tomorrow at 31:69 PM.
----------------------------------------------------------------------------------------
One System to rule them all, One Code to find them,
One IDE to bring them all, and to the Framework bind them,
in the Land of Redmond, where the Windows lie
---------------------------------------------------------------------------------
People call me crazy because i'm jumping out of perfectly fine airplanes.
---------------------------------------------------------------------------------
Code is like a joke: If you have to explain it, it's bad
Re: SQL Server - blank out col if another col's value has already been seen, in a que
Good morning, thank you. I read your explanation and it sounds like you completely understand the data structures I'm working with. I will give it a shot sometime today!
There are 10 kinds of people in this world. Those who understand binary, and those who don't.
Re: SQL Server - blank out col if another col's value has already been seen, in a que
Originally Posted by MMock
Good morning, thank you. I read your explanation and it sounds like you completely understand the data structures I'm working with. I will give it a shot sometime today!
There are some things i don't understand in your initial try:
In your Case When you Check on SUM(PRICE) blablabla
but in the THEN-Part you do a STR(SUM(PRICE),16,2)?
Looks like a Conversion/Formatting into Decimal converted to VARCHAR
If my version is "successful" i'd do the conversion with the nested CASE WHEN
Not sure about Syntax. Am not familiar with MS SQL
CASE WHEN COALESCE(P21.SONO, 'N')='N' OR P21.Price=0 THEN 'N' ELSE
STR(CASE WHEN TR.RN=1 THEN P21.Price ELSE '' END, 16, 2) END As 'SA Credit'
Last edited by Zvoni; Tomorrow at 31:69 PM.
----------------------------------------------------------------------------------------
One System to rule them all, One Code to find them,
One IDE to bring them all, and to the Framework bind them,
in the Land of Redmond, where the Windows lie
---------------------------------------------------------------------------------
People call me crazy because i'm jumping out of perfectly fine airplanes.
---------------------------------------------------------------------------------
Code is like a joke: If you have to explain it, it's bad
Re: SQL Server - blank out col if another col's value has already been seen, in a que
Yes, I needed to differentiate beteen instances when SA Credit was 0 because there wasn't any credit remaining vs when the customer didn't have a service agreement at all, in which case I wanted an N. So that was my way (and there may be bettter ways) of having the datatype be the same - a string - whether the output was a legitimate credit of 0 or a greater than zero amount versus no credit ever because not applicable to that customer.
There are 10 kinds of people in this world. Those who understand binary, and those who don't.
Re: SQL Server - blank out col if another col's value has already been seen, in a que
As i wrote above: The secret is to count how many times a relJobControl occurs in Trips.
A simple SELECT COUNT(Whatever) doesn't work, because it returns just that: A count.
What you need is a "numbering" magic, which the ROW_NUMBER does.
But ROW_NUMBER has a downside (as any Window-Function): You cannot access its values in the WHERE-Clause of the Statement, that creates it
something like
SELECT blablaba, ROW_NUMBER() OVER(PARTITION BY SomeTing ORDE BY SomeOrder) AS RN FROM SomeTable
WHERE RN=1 (... WHERE ROW_NUMBER() OVER(PARTITION BY SomeTing ORDE BY SomeOrder)>1)
That doesn't work.
It's the reason, why you have to get your Values for Trips from a CTE (you could also use a Sub-Select instead of a CTE)
Just execute the Trips-Select:
Code:
SELECT
relJobControl,
EmployeeID As Emp,
ScheduledTime,
/*Whatever other Fields*/
ROW_NUMBER() OVER(PARTITION BY relJobControl ORDER BY ScheduledTime) AS RN
FROM Trips
and you will see the "magic" appearing.
The PARTITION and ORDER-Clause of ROW_NUMBER is a Well of fun to play with
Last edited by Zvoni; Tomorrow at 31:69 PM.
----------------------------------------------------------------------------------------
One System to rule them all, One Code to find them,
One IDE to bring them all, and to the Framework bind them,
in the Land of Redmond, where the Windows lie
---------------------------------------------------------------------------------
People call me crazy because i'm jumping out of perfectly fine airplanes.
---------------------------------------------------------------------------------
Code is like a joke: If you have to explain it, it's bad
Re: SQL Server - blank out col if another col's value has already been seen, in a que
I will definitely be getting back to this but my boss wants other things first. So it actually may be a few weeks (months?). This code she usually only uses the beginning of the year and I believe she has already manually filtered out the extraneous service agreement credits. But it's on my to-do list even though you may not hear anything further for a while :-). Thanks again.
There are 10 kinds of people in this world. Those who understand binary, and those who don't.
Re: SQL Server - blank out col if another col's value has already been seen, in a que
I am back on this today.
So, if you remember this issue: I am selecting tasks that employees spent time on, given a date range and possibly the individual employee to run the query for. We can run the query by providing one employee id or all the employee id's that work for a specified department (that in itself was very interesting/challenging! I have probably used/learned 80% of my sql knowledge from writing this one stored procedure!)
We started out with emp 17, 1/1/24 - 1/31/24.
The first thing I noticed was there was only one row for 17 for case/job = 1073666 but it was given an RN RowNumber = 3. RN 1 and 2 weren't returned. That is because emp 41 worked on two tasks during that date range.
Then I saw only one ocurrence of case/job = 1071829 but it has a RN of 2. So I had to find what was RN = 1. I see Emp 17 also had a task on this case on 8/17/23 so that is RN1 which I don't select in my resulset since I'm looking at Jan 2024.
It all came together when I took the plunge and put our little code snippet into the larger stored procedure and modified the table that's partitioned:
Code:
WITH
TR AS (
SELECT relJobControl,
Control,
EmployeeID As Emp,
ScheduledTime,
ROW_NUMBER() OVER(PARTITION BY relJobControl ORDER BY ScheduledTime) AS RN
FROM Trips
JOIN Emps on Emps.EmpID = Trips.EmployeeID
JOIN EmpTeamDepartments ON EmpTeamDepartments.[EmpDepartmentID] = Emps.TeamDepartment
WHERE ScheduledTime BETWEEN @BeginDate AND @EndDate
AND (EmpTeamDepartments.EmpDepartmentID in (SELECT [EmpDepartmentID] FROM @ListOfEmpDepartments))
),
P21 AS (SELECT SONO, SUM(Price) AS Price
FROM xqryP21LinesRelated
WHERE ITEM like 'SA[_]%' AND ITEM NOT LIKE 'SA[_]BILL%'
--AND IFNULL(disposition, '') != 'C'
AND ISNULL(disposition, '') != 'C'
GROUP BY SONO)
etc...the rest of the sproc...
I need to run this for more date ranges
There are 10 kinds of people in this world. Those who understand binary, and those who don't.
Re: SQL Server - blank out col if another col's value has already been seen, in a que
OK, I had some problems which I believe I worked through.
I realized that I had to put all the Trips columns I ultimately wanted to select into the TR-row-partition code. I was working with a "minimal" TR that was built from the Trips table then I was joining TR to Trips to get the rest of the columns. That is because when I plugged TR into the larger sproc I neglected to move criteria up from the main query into the TR code as well as some of the columns. That is because I am learning, LOL! Honestly, it wouldn't have been any fun if I had just plugged in your snippet and voila, it worked. This way I dissected a lot of it which led to a better understanding.
So here's my final TR:
Code:
WITH
TR AS (
SELECT relJobControl,
Control,
EmployeeID As Emp,
ScheduledTime,
Time_Start, Time_Arrive, Time_Travel,
Time_Depart, Time_Service, Time_End,
HoursLunch, bTaskQuerySA, BilledServiceAdjusted,
PrevailingWage, Trips.Trip_Type, LaborLocation, TripStatus,
Resolution, TaskNotes,
ROW_NUMBER() OVER(PARTITION BY relJobControl ORDER BY ScheduledTime) AS RN
FROM Trips
JOIN Emps on Emps.EmpID = Trips.EmployeeID
JOIN EmpTeamDepartments ON EmpTeamDepartments.[EmpDepartmentID] = Emps.TeamDepartment
-- I think I need the same exact WHERE clause here as in the main query. This was only a couple of the criteria. When I ran the current sproc against this updated one, this one had
-- some blank SA Credit rows.
--WHERE ScheduledTime BETWEEN @BeginDate AND @EndDate
--AND (EmpTeamDepartments.EmpDepartmentID in (SELECT [EmpDepartmentID] FROM @ListOfEmpDepartments))
LEFT OUTER JOIN TripTypes ON Trips.Trip_Type = TripTypes.Trip_Type
WHERE (Trips.EmployeeID LIKE ISNULL(@EmpID,'%'))
AND (TripTypes.IncludeTimeCard = 1)
AND (TripTypes.ShopItem = 0)
AND (Trips.Trip_Type <> 'Follow Up')
AND (Trips.Trip_Type <> 'Cancelled')
AND (Trips.ScheduledTime BETWEEN @BeginDate AND @EndDate)
AND (Trips.FieldStatus <> 'VOID')
AND
(EmpTeamDepartments.EmpDepartmentID in (SELECT [EmpDepartmentID] FROM @ListOfEmpDepartments))
),
How I am testing it is comparing two excel files which I export the results into, one file is the old code and one is the new. The new code was good from the beginning at eliminating the dupes but it also had some blank SA Credit where it should've had a dollar amount. That is when I returned to the drawing board and came out with the code above. So I will do a bit more comparing like this with different input parameters.
Thank you very much for the foundation of all this!
There are 10 kinds of people in this world. Those who understand binary, and those who don't.
Re: SQL Server - blank out col if another col's value has already been seen, in a que
Well done!
... and suddenly you come to love CTEs, Window-Functions (ROW_NUMBER!!)), since those can clear up some confusion.
The Fun starts, when using CTEs recursively
btw: I noticed something in your "TR"-CTE
Code:
....
WHERE ScheduledTime BETWEEN @BeginDate AND @EndDate
...
Variables/Parameters inside a CTE?
huh? MS SQL allows that????
Didn't know that
looking at your issue with the "wrong" Row-Numbers: Ah, didn't know that you also have to respect the employee.
Easy. Just add the EmpID into the PARTITION BY-Clause of the Row-Number
Code:
ROW_NUMBER() OVER(PARTITION BY relJobControl, EmployeeID ORDER BY ScheduledTime) AS RN
and run that (inner) SELECT again.
You will see, that RN starts numbering partitioned by relJobControl AND EmpID,
so for each relJobControl AND EmpID you WILL have at least once a RN=1 if it appears within your filtered Date-Window
EDIT:
Maybe as an excersize for you to get familiar with ROW_NUMBER
Imagine a Room with a 1000 people inside.
1) Task: Find out the oldest person without using MAX-Function
SELECT blablabbal, ROW_NUMBER() OVER(ORDER By DateOfBirth) AS RN FROM blablabla
-- RN=1 is the oldest Person
2) Task: find out the oldest person of each gender
SELECT blablabbal, ROW_NUMBER() OVER(PARTITION BY Gender ORDER By DateOfBirth) AS RN FROM blablabla
-- RN=1 is the oldest Person of each gender
3) Task: Find out the oldest person per Gender and occupation
SELECT blablabbal, ROW_NUMBER() OVER(PARTITION BY Gender, Occupation ORDER By DateOfBirth) AS RN FROM blablabla
-- RN=1 is the oldest Person per Gender and Occupation
Bottom Line: I use ROW_NUMBER in CTE's extensively, since i often have to return the Min/Max-Value of something, but with additional information.
The "Classic" being "Which client has the lowest sales-price?"
Last edited by Zvoni; Feb 12th, 2025 at 02:34 AM.
Last edited by Zvoni; Tomorrow at 31:69 PM.
----------------------------------------------------------------------------------------
One System to rule them all, One Code to find them,
One IDE to bring them all, and to the Framework bind them,
in the Land of Redmond, where the Windows lie
---------------------------------------------------------------------------------
People call me crazy because i'm jumping out of perfectly fine airplanes.
---------------------------------------------------------------------------------
Code is like a joke: If you have to explain it, it's bad
Re: SQL Server - blank out col if another col's value has already been seen, in a que
Originally Posted by Zvoni
...
btw: I noticed something in your "TR"-CTE
Code:
....
WHERE ScheduledTime BETWEEN @BeginDate AND @EndDate
...
Variables/Parameters inside a CTE?
huh? MS SQL allows that????
Didn't know that
Yes, thank God! That was a nice discovery when I found it!
Originally Posted by Zvoni
...
looking at your issue with the "wrong" Row-Numbers: Ah, didn't know that you also have to respect the employee.
Easy. Just add the EmpID into the PARTITION BY-Clause of the Row-Number
Code:
ROW_NUMBER() OVER(PARTITION BY relJobControl, EmployeeID ORDER BY ScheduledTime) AS RN
and run that (inner) SELECT again.
You will see, that RN starts numbering partitioned by relJobControl AND EmpID,
so for each relJobControl AND EmpID you WILL have at least once a RN=1 if it appears within your filtered Date-Window
Yes, it was partly thowing in EmployeeID and partly restructuring the query. Here's why.
A "case" can go on and on for years. We track employees' PTO hours (Personal Time Off) using a case, each day off (or hours off if not an entire day) is a single "task". The case is typically named "EMP <emp id>". Since employee 17 has worked at the company for like forever, he has a PTO case that spans many many years. So if I run this simple query:
Code:
WITH
TR AS (
SELECT relJobControl,
EmployeeID As Emp,
ScheduledTime,
ROW_NUMBER() OVER(PARTITION BY relJobControl, EmployeeID ORDER BY ScheduledTime) AS RN
FROM Trips
)
SELECT * from TR where TR.Emp = '17' AND (TR.ScheduledTime BETWEEN '01/01/24' AND '01/31/24')
I get 28 rows in my resultset. These are all "tasks" 17 has worked on during the month of January last year. Some of them being his PTO (relJobControl = 924 is case EMP 17). It looks like RN is wrong. At least, it is not what I need it to be.
The total tasks for 17's PTO Case is 1,238. So it makes sense that the row numbers are in the 800's, because the query PARTITION BY relJobControl, EmployeeID looks at all 1238. But I need the rows returned to start at 1.
When I rewrite the query to build TR with the desired date range first...
Code:
WITH
TR AS (
SELECT relJobControl,
EmployeeID As Emp,
ScheduledTime,
ROW_NUMBER() OVER(PARTITION BY relJobControl, EmployeeID ORDER BY ScheduledTime) AS RN
FROM Trips WHERE ScheduledTime BETWEEN '01/01/24' AND '01/31/24'
)
SELECT * from TR where TR.Emp = '17'
I get results below which are absolutely beautiful to my eyes. And why I say Thank God for allowing parameters *inside* the CTE (because of course that date range is input to the sproc, not hardcoded). And that's why I enjoyed working on this and building it up into its final product. Thanks again and again and again!
There are 10 kinds of people in this world. Those who understand binary, and those who don't.
Re: [RESOLVED] SQL Server - blank out col if another col's value has already been see
Last word of advice regarding ROW_NUMBER():
It took me quite a bit of time to figure out the ins and outs of that function, but once i grasped it, it has become a standard-tool for me, even going so far as pretty much abandoning Min- and Max-Functions.
My advice: play around with it.
Switch the order in the Partition By-Clause, add criteria in its Order by and so on etc.
Apply Filter early, apply filter late, join first, filter second or the other way around, filter in the ON-Clause of the join vs. Filter within CTE etc.
I remember using ROW_NUMBER 6 times, to retrieve 6 Min and Max-values with a SINGLE query, incl. returning to which row which value belonged (Where the usual Min- and Max-Functions fail)
(Stock market --> Week opening min and max, total min and max over the week, min and max closing week including which share had which "Position")
That query was sitting within a CTE, and i just joined 6 times back to the same CTE (each JOIN aliased differently), and set the corresponding RN=1 in the ON-Clause.
Everyone who looked at my Query was completely baffled how it was supposed to work, but it did work
Last edited by Zvoni; Feb 13th, 2025 at 03:25 AM.
Last edited by Zvoni; Tomorrow at 31:69 PM.
----------------------------------------------------------------------------------------
One System to rule them all, One Code to find them,
One IDE to bring them all, and to the Framework bind them,
in the Land of Redmond, where the Windows lie
---------------------------------------------------------------------------------
People call me crazy because i'm jumping out of perfectly fine airplanes.
---------------------------------------------------------------------------------
Code is like a joke: If you have to explain it, it's bad