Results 1 to 17 of 17

Thread: Access SQL/Query grouping problem

  1. #1

    Thread Starter
    Lively Member
    Join Date
    Oct 1999
    Posts
    90

    Question

    I'm having a problem getting some Access SQL to do what I want for reporting. I have a SCANS table that captures timestamps of student scans for attendance purposes. I want to grab, for each user id that has a scan, the lowest date and the highest date (for a given date range, but I'll worry about that later). To start, I'm just trying to get the highest (max) date, but I also need all the other fields each that record, which is where I'm having the problem. First, here's some sample data:

    Code:
    user_id	timestamp		station	building
    1	01/01/1999 12:00:00 am	1	1
    1	05/19/2000 11:56:10 pm	2	2
    1	05/19/2000 11:57:47 pm	2	2
    5	09/27/1999 12:00:00 am	1	1
    5	05/19/2000 11:56:10 pm	2	2
    5	05/19/2000 11:57:47 pm	2	2
    6	05/19/2000 12:00:00 am	1	1
    6	05/19/2000 11:56:10 pm	2	2
    6	05/19/2000 11:57:47 pm	2	2
    7	05/19/2000 11:56:10 pm	2	2
    7	05/19/2000 11:57:47 pm	2	2
    The user id of the user who was scanned, the timestamp of the scan, and the station and building where it happened (there's more, but I left them out here for example).

    Anyway, when I do this query:

    SELECT QSA.user_id, Max(QSA.timestamp) AS MaxDate
    FROM Q_SCANS_ALL AS QSA
    GROUP BY QSA.user_id;

    I get the correct results:

    Code:
    user_id	MaxDate
    1	05/19/00 11:57:47 PM
    5	05/19/00 11:57:47 PM
    6	05/19/00 11:57:47 PM
    7	05/19/00 11:57:47 PM
    but I need all the fields. So when I do a query like this:

    SELECT QSA.user_id, Max(QSA.timestamp) AS MaxDate, QSA.station, QSA.building
    FROM Q_SCANS_ALL AS QSA
    GROUP BY QSA.user_id, QSA.station, QSA.building;

    (you're required to include all the shown fields in the GROUP BY if they're not in an aggregate function like MAX), I get this:

    Code:
    user_id	MaxDate			station	building
    1	01/01/99 12:00:00 AM	1	1
    1	05/19/00 11:57:47 PM	2	2
    5	09/27/99 12:00:00 AM	1	1
    5	05/19/00 11:57:47 PM	2	2
    6	05/19/00 12:00:00 AM	1	1
    6	05/19/00 11:57:47 PM	2	2
    7	05/19/00 11:57:47 PM	2	2
    Because it considers each complete record that does not have identical info in the GROUP BY fields to be a different group, each difference in station and building creates a new group for that user and a new MAX date. What I want is to group by just the user id, but still return all the fields in the matching record that has the MAX date in it.

    I've tried various things, like using HAVING or some sort of ORDER BY or a subquery, but nothing seems like it'll do what I need. Can anyone think of anything that might help...I didn't think this fairly simple thing would be such a pain. I don't want to have to do this programmatically if I can avoid it.

    Thanks,
    -JoeyCode

  2. #2
    Hyperactive Member
    Join Date
    Nov 1999
    Location
    Leavenworth KS USA
    Posts
    482
    You're on the right track. To keep it simple, save your
    "correct resultset" as a query ( SELECT QSA.user_id,
    Max(QSA.timestamp) AS MaxDate FROM Q_SCANS_ALL AS QSA
    GROUP BY QSA.user_id;).

    Then build a second SELECT * query that joins Q_SCANS_ALL
    with the user_id and MaxDate from your first.

  3. #3

    Thread Starter
    Lively Member
    Join Date
    Oct 1999
    Posts
    90

    Unhappy

    I can't do that, Mongo. The key for that table is user_id, timestamp, station and building combined, so I need to use them all for matching. While it is highly unlikely, it is POSSIBLE that there could be two SCANS records with the same user_id and timestamp (a machine with an incorrect clock, for instance). Yes, it's nearly impossible, but I have to allow for it, so I need to retrieve the entire record (or at least the entire key).

    Thanks,
    -JoeyCode

  4. #4

    Thread Starter
    Lively Member
    Join Date
    Oct 1999
    Posts
    90

    Lightbulb

    On second thought, what difference does it make if two or more records have the same user_id/timestamp combo? I just want one of them, I guess it doesn't matter which. But, if anyone can help further with this grouping issue, I'd appreciate it. Thanks Mongo.

    -JoeyCode

  5. #5
    Addicted Member
    Join Date
    May 1999
    Posts
    161

    Unhappy

    I think your solution is in a subquery. I don't use Access databases but here is an example of a query I have done in Oracle....

    The table I am looking at contains records that I need to retrieve by barcode, but one barcode can have several groups or records, identified by an id. I want to retrieve the most recent data.

    Code:
        sSQL = "select dcd.pos_begin, dcd.pos_end, dcd.good from " _
             & "draw_cutplans dc, draw_cutplan_details dcd " _
             & "where dc.id = dcd.id and dc.id in " _
             & "(select max(id) from draw_cutplans " _
             & "where bc = '" & frmMain.txtDrawBC.Text & "' group by bc) " _
             & "order by dcd.pos_begin desc"
    I think you should be able to use the same kind of technique. You would just need a group by user_id clause in your subquery.

    Does that help ?

  6. #6

    Thread Starter
    Lively Member
    Join Date
    Oct 1999
    Posts
    90

    Unhappy

    Francis, I'm working on something like that...I made this query:

    Code:
    SELECT QSA.user_id, QSA.timestamp AS MaxDate, QSA.station, QSA.building
    FROM Q_SCANS_ALL AS QSA
    WHERE QSA.timestamp = (SELECT max(QSA2.timestamp)
    FROM Q_SCANS_ALL QSA2
    WHERE QSA2.user_id = QSA.user_id);
    And it works, EXCEPT if there are duplicate timestamps for the same user. Again, that is very unlikely to happen, but it could. So, if I've got some records like this:

    Code:
    user_id	timestamp		station	building
    6	05/19/2000 02:50:44 am	1	4
    6	05/19/2000 11:57:47 pm	2	3
    6	05/19/2000 11:57:47 pm	1	4
    I'll end up with two records for user_id 6 instead of just one. I don't care which one, but I just want one. That's where I'm stuck right now.

    I can't do a max(station) and max(building), because then I'd end up with:
    Code:
    user_id	timestamp		station	building
    6	05/19/2000 11:57:47 pm	2	4
    Which is not accurate, station 2 is not in building 4. I need the original record.

    Any idea how to filter out duplicate records, using only queries, in this scenario? Any way to have one query supply parameters for another in Access? Or a way to iterate through a series of user_ids in a query, like with a FOR EACH type of loop?

    Thanks,
    -JoeyCode

  7. #7
    Guru Clunietp's Avatar
    Join Date
    Oct 1999
    Location
    USA
    Posts
    1,844
    I did this in SQL 7
    Code:
    SELECT *
    FROM Products
    WHERE productid =
            (SELECT TOP 1 MAX(CategoryID)
          FROM categories)
    this basically shows you to use TOP 1, and because you don't care which record, you don't need to specify any more criteria in your subquery....you could probably also use DISTINCT instead of TOP 1 as well....

    I'm pretty sure Access 2000 supports the TOP keyword, but I'm not sure about Access 97

  8. #8

    Thread Starter
    Lively Member
    Join Date
    Oct 1999
    Posts
    90

    Unhappy Same problem...

    Clunietp, Access 97 does support TOP and DISTINCT and all that stuff, but none of those will help in my subquery. I get my MAX timestamp no problem with the query I included in my last message above, but then it matches two different records, both of which are returned.

    If I used TOP, I'd get the same result: the subquery, using TOP, would give me a timestamp to match, and the main query would return the two records that match it. I need to be able to then somehow filter to get only one. This must be a tough one, because so far nobody I've asked (and I've asked a lot of people) can figure it out.

    Thanks,
    -JoeyCode

  9. #9
    Guru Clunietp's Avatar
    Join Date
    Oct 1999
    Location
    USA
    Posts
    1,844
    So your subquery is just fine, but your main query is the one that gives you the multiple records?

    I know for a fact that Access 97 supports Distinct and DistinctRow

  10. #10

    Thread Starter
    Lively Member
    Join Date
    Oct 1999
    Posts
    90

    Yup...

    Yes, Access 97 supports DISTINCT and DISTINCTROW. Did someone say different?

    My subquery returns the max timestamp, but the main query returns ALL records that match it. Do you see what I'm getting at here? The subquery only grabs criteria for matching, if I have two or more records that match its result, that's what the main query returns. The question is, how do I filter out all but one? They're not identical records, so DISTINCT and DISTINCTROW will not work.


  11. #11
    Member
    Join Date
    Jun 2000
    Location
    Perth Western Australia
    Posts
    41
    Here we go..................

    This first SQL statement creates a base recordset to group the Max Timestamp with the User_ID. I named this query qryGroupMax
    ------------------------------------------------------------
    SELECT tblScans.user_id, Max(tblScans.timestamp) AS MaxDate
    FROM tblScans
    GROUP BY tblScans.user_id;
    ------------------------------------------------------------

    Then we use this next query to seperate just one record from that and bring forward the other fields (Station & Building).

    Note I had to use the Last() function incase you have to timestamps exactly the same (User_ID + timestamp)
    Note also that I had to use to inner joins to get the correct result set.

    ------------------------------------------------------------
    SELECT qryGroupMax.user_id, qryGroupMax.MaxDate, Last(tblScans.station) AS LastOfstation, Last(tblScans.building) AS LastOfbuilding
    FROM qryGroupMax INNER JOIN tblScans ON (qryGroupMax.MaxDate = tblScans.timestamp) AND (qryGroupMax.user_id = tblScans.user_id)
    GROUP BY qryGroupMax.user_id, qryGroupMax.MaxDate;
    ------------------------------------------------------------

    I obviously did this in Access queries, but it can be done in VB on two recordsets.

  12. #12
    Lively Member
    Join Date
    Jul 2000
    Location
    Stockholm, Sweden
    Posts
    83
    Should not

    SELECT TOP 1 QSA.user_id, QSA.timestamp AS MaxDate, QSA.station, QSA.building
    FROM Q_SCANS_ALL AS QSA
    WHERE QSA.timestamp = (SELECT max(QSA2.timestamp)
    FROM Q_SCANS_ALL QSA2
    WHERE QSA2.user_id = QSA.user_id);

    do it ?

  13. #13

    Thread Starter
    Lively Member
    Join Date
    Oct 1999
    Posts
    90

    Smile Yes and No...

    Clunietp and AKA, I don't want just one record, I want one record PER USER. Please read from the top to see what I'm after.

    CGTS, the FIRST and LAST functions might just be exactly what I was looking for...thanks!

    -JoeyCode

  14. #14
    Lively Member
    Join Date
    Jul 2000
    Location
    Stockholm, Sweden
    Posts
    83

    Thumbs up

    Ok, I started to read the thread yeasterday and I forgot the start when reading today.

    SELECT
    QSA.user_id,
    Last(QSA.timestamp) AS LastTimeStamp,
    Last(QSA.station) AS LastStation,
    Last(QSA.building) AS LastBuilding,
    First(QSA_F.timestamp) AS FirstTimeStamp,
    First(QSA_F.station) AS FirstStation,
    First(QSA_F.building) AS FirstBuilding
    FROM
    tblScans AS QSA,
    tblScans AS QSA_F
    WHERE
    QSA.timestamp = ( SELECT
    max(QSA2.timestamp)
    FROM
    tblSCANS QSA2
    WHERE
    QSA2.User_Id = QSA.User_Id))
    AND
    QSA_F.timestamp = ( SELECT
    min(QSA2.timestamp)
    FROM
    tblSCANS QSA2
    WHERE
    QSA2.User_Id = QSA.User_Id)
    AND
    QSA_F.user_id = QSA.User_ID
    GROUP BY
    QSA.user_id ;

    Is my new answer to your problem. I do not realy know why it works but in my small Access test database it does. Please conferm if it works for you to. /AKA

  15. #15

    Thread Starter
    Lively Member
    Join Date
    Oct 1999
    Posts
    90

    Smile Good Solution...

    That's a very clever solution, AKA. I'm wondering if using the QSA2 alias in both the min and max subqueries might be problematic. Also, would it be correct to say that you could use either only the FIRST function or only LAST for all the selections, since QSA has only MAX records and QSA_F has only MIN records?

    Your SQL here makes me think of another solution...no need for subqueries or more than one reference to SCANS; just GROUP BY user_id, ORDER BY timestamp, then grab the FIRST and LAST fields, giving the min and max. The only thing about that is I believe there's a relationship between GROUP BY and ORDER BY that might be problem, but I'm gonna try it anyway.

    Thanks for your help,
    -JoeyCode

  16. #16
    Lively Member
    Join Date
    Jul 2000
    Location
    Stockholm, Sweden
    Posts
    83
    I dont know if same alias in diffrent subqueries are a problem for SQL Server, I think that they belong to diffrent context and therfore can have the same name.

    I belive it is correct to use either the FIRST or LAST for all selections, I used both just to test them since it was first time I used them.

    Even better solution comming up ? Good ! Post it so I can learn more as I hope all of us do.
    Yesterday, all my troubles seemed so far away...
    Help, I need somebody, Help...
    Now MCSD and still locking for intresting job in the south parts of Stockholm, Sweden.

  17. #17

    Thread Starter
    Lively Member
    Join Date
    Oct 1999
    Posts
    90

    Unhappy My idea doesn't work...

    For those of you that are interested...

    My idea of GROUPing by user_id, ORDERing by timestamp, then grabbing the FIRST and LAST records...it doesn't work. You apparently can't ORDER BY a field that's not included in the GROUP BY, in this case "timestamp", which makes it not possible to sort within the user_id groups and grab the FIRST and LAST like I hoped. And including timestamp in the GROUP BY will create separate groups for every unique timestamp and obviously won't yield the correct results. Too bad, I was hoping it would work.

    However, I used a multiple subquery and FIRST function solution like AKA's suggestion above and it works like a charm. Thanks for your help everybody.

    Now, look for a new thread about my problems with multiple LEFT JOINS...

    -JoeyCode

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  



Click Here to Expand Forum to Full Width