-
[RESOLVED] Nearest Minute Causing Duplicates in PrimaryKey of Date (Access)
I find that once in awhile (several days to month can go by) that I end up with a duplicate record.
The PrimaryKey is DATETIME in Access. What appears to be happening is
that when Access UPDATES datetime (which is a double), the time portion of the double
may be one digit larger (e.g. 9 vs 8) or one digit smaller (e.g. 7 vs 8) even though
the time is being rounded to the nearest minute. For whatever reason it then
considers this to be a different DATETIME even though when examining the database
Both records appears with the same datetime stamp.
Any suggestions appreciated.
-
Re: Nearest Minute Causing Duplicates in PrimaryKey of Date (Access)
Don't use the date/time as the primary key.
-
Re: Nearest Minute Causing Duplicates in PrimaryKey of Date (Access)
jmcihinney:
Even if I go with an AutoNumber field as the PrimaryKey, I'm going to end up with duplicate records (as far as date is concerned) .
I guess I could seek to see if a record with the same time (hh:mm:00") exists, to see if the hours and minutes match prior to writing the current record,
This would be a work around to .NoMatch where the entire datetime is taken into account.
Don't see what advantage would be as to whether date is PrimatryKey or the date resides in another keyed field -- or -- non-keyed field.
Got a better alternative?
-
Re: Nearest Minute Causing Duplicates in PrimaryKey of Date (Access)
the point of a PKey is to uniquely identify a row in a table... so by definition it needs to be unique... what you're finding out is that sometimes your PKey isn't unique enough... you're getting duplicates... therefore you really should find an alternate PKey, preferably something in the data already... it might be two fields... but if there is nothing that can be used to create the unique key, then you have to go with a surrogate and generate it through other means (either an auotgenerated number, or a guid, or what ever)...
-tg
-
Re: Nearest Minute Causing Duplicates in PrimaryKey of Date (Access)
techgnome:
Point taken regarding NOT UNIQUE ENOUGH. Maybe by truncating and adding back the seconds as "00" might resolve it. Thought I was doing it but will double check code.
You might find this article of interest:
http://www.made2mentor.com/2011/05/d...te-dimensions/
-
Re: Nearest Minute Causing Duplicates in PrimaryKey of Date (Access)
If you want that all your records should have unique date/time, and no other key qualifies as a primary key, then making that datetime field as Primary Key makes perfect sense.
Whatever datatype your primary key is, if you try to enter duplicate records, it won't let you do that and throw error. Just handle that error whatever way you want to - either dicard that record, or try to reinsert it with datetime field value changed.
-
Re: Nearest Minute Causing Duplicates in PrimaryKey of Date (Access)
On a side note, DateTime fields in Access do have seconds precision.
Probably you have set the "Format" or "Input Mask" property on that field, because of which seconds is not shown. What are the values of these two properties for that field?
-
Re: Nearest Minute Causing Duplicates in PrimaryKey of Date (Access)
Ok... I read the article... all I got out of it was disk space savings using a datetime rather than an integer... negligible in my opinion. The rest of it seems to prove what I was trying to say... sure the DT makes sense as a PK IF IT CAN BE UNIQUE... but in your case, you're finding situations where this is not the case... you need to 1) pair it with a second field, or 2) use something else entirely (third alternative, remove the PK)... If you insist on keeping the DT as the PK, that's fine... but then you have to then decide what to do when you do into those potential duplicate records... toss it? change the date value? store it somewhere else? You may want to also look into whether the data truly is a duplicate, or if it is only the time component that is being duplicated. I never said don't use the Dt as a PKey... I was merely trying to point out that maybe the PKey needed to be re-thought out...
-tg
-
Re: Nearest Minute Causing Duplicates in PrimaryKey of Date (Access)
techgnome:
Article for me pointed out three things:
1) Nothing wrong using Date as PK (this is what I got from jmcilhinney post -- my read between the lines)
2) May be faster
3) Disk Space Savings
As you pointed out (and I agree) -- DT makes sense as a PK IF IT CAN BE UNIQUE
In my case thought it was UNIQUE but found out NOT.
Records come from a Real-time datafeed so only front end work I do is strip off seconds (per code).
So am going to add "00" back as seconds so, ONLY hh:mm:00 (I hope) will be taken into account.
Pradeep1210: See post to techgnome as not using Format or Input Mask.
-
Re: Nearest Minute Causing Duplicates in PrimaryKey of Date (Access)
1) again, I never said there was anything inherently wrong with it... but that it's clear it ISN'T unique after all... therefore it is potentially a poor choice... maybe the stripping of the seconds is the problem...
2) That's a misleading assumption... it was faster ONLY WHEN the JOIN was removed... but that's ALWAYS going to happen... if you remove a join the query will inherently run faster. It was a bad comparison and a bad result.
3) Seriously? Pffft... disk savings is negligible. the reason there was a disk savings in that experiment is because it takes more room to store a number AND a date rather than just a date... they may both be apples, but a Fiji isn't a Granny Smith...
Personally I think it was a flawed article with a flawed experiment with disastrous results. I'd have been more impressed with a table, single col, 1million dates... another table, single column, 1 million numbers... then see what the disk savings is between the two... it's closer than you might thing... and better yet, make the numbers decimal, not integers... that's going to be a closer comparison, because that's how dates are stored... as decimal values. to compare the joining... two tables two columns each... datetime and integer... fill them both with 1 million records.... then select from one, inner joined to the other on the date, then a second query with the two tables inner joined on the integer field... select the top 1000 ... THEN compare THOSE results...
But hey, it's your app, your data... you do what you think you need to... I've given my $0.02 worth... and it sounds like you chose to ignore it.
-tg
-
Re: Nearest Minute Causing Duplicates in PrimaryKey of Date (Access)
Quote:
Article for me pointed out three things
The article is talking about a very specifc situation. It's describing how you should define a date dimension in an analysis/reporting data warehouse. Unless that's what you're doing (which I don't think you are... and if you are you should definitely NOT be using Access) it doesn't apply to you.
There's nothing wrong per-se with using a datetime as a primary key it would be an unusual choice. A primary key isn't just unique, it's a unique identifier. In other words, it's the thing that succinctly and uniquely describes the entity a record represents. It's the defining characteristic. You haven't said what you table contains but would you say it's defining characteristic is a datetime? If so then it's an apropriate primary key. Incidentally, that's why it's apropriate in that article, because a date dimension is a list of days and the defining characteristic of a day is a date.
One thing that makes me think a datetime is not a defining characteristic of your entity is this:-
Quote:
What appears to be happening is that when Access UPDATES datetime
You should almost never need to update a primary key value. If you have to then it didn't do it's job of being a defining characteristic. If you find yourself updating primary keys you are in for a world of hurt. You're going to be dealing with data inconsistencies all over the place or relying on hungry cascades which will tank your performance (I'm not sure Access even supports cascades).
So the answer is that the date might be fine as a primary key but probably isn't.
I think we probably need more info before we can give you really good suggestions though. From what you've said I'm not convinced that you're really dealing with duplicates. I suspect Pradeep was closer to the mark when he said you're looking at different values formatted to look the same. It's hard to be sure from the information you've provided though.
1. What does your table contain?
2. Why are you updating it?
-
Re: Nearest Minute Causing Duplicates in PrimaryKey of Date (Access)
I think your real problem is stripping off the seconds from your Date value.
What do you want to do with the records that come as duplicates (after stripping off seconds part)?
If you want to keep them then:
I would suggest not stripping off the seconds value and keep it as it is.
And since you don't want to see the seconds, just set the Format = mm/dd/yyyy h:nn ampm (or whatever format you want without seconds.)
If you don't want to keep them then:
Your current design is ok. It won't let you add duplicates. You just need to handle the error thrown if any.
-
Re: Nearest Minute Causing Duplicates in PrimaryKey of Date (Access)
techgnome:
Quote:
But hey, it's your app, your data... you do what you think you need to... I've given my $0.02 worth... and it sounds like you chose to ignore it.
Not in the least. Always appreciate your $0.02. Thought we were having a discussion. I didn't take it personal, hope you didn't.
///////////////////
FunkyDexter:
As I see it Datetime is the correct primary key. This table is for stock data and the only distinquishing characteristic is DateTime.
Re: UPDATE: The record may need to be edited for whatever reason (bad data for example). In this case NOT UPDATING the primaryKey ONLY the data.
////////////////////////
Pradeep1210:
What I think is happening is this:
Let say I have a table of data with DateTime as the PrimaryKey.
I find that some of the records are wrong, so I go out to the server an download the same set of records to the client machine.
Rather than deleting the original records between a Begin and End Date and Adding the Same recordset back to the table,
I "Seek" the particular record of interest using its DateTime value and then EDIT that record with the new data from the server.
For whatever reason the "Seek" may not find the Record with the DateTime Stamp in the Table and assumes it is a new record
and then it is added rather than edited. As previously posted, the only reason I can see is if Access is comparing the two records during "Seek" and one of them
is slightly different. Since the DateTime is stored as a double, maybe one of the records has its DateTime value carried out to more significant digits.
When making the comparison it "ASSUMES" they are different -- which in fact they are -- but when displayed in Access it "shows" as two
separate records with the same DateTime.
-
Re: Nearest Minute Causing Duplicates in PrimaryKey of Date (Access)
"I didn't take it personal, hope you didn't." of course not... it's all business man... all business...
"This table is for stock data and the only distinquishing characteristic is DateTime." and the stock symbol...
"maybe one of the records has its DateTime value carried out to more significant digits." -- all datetime values are going to have the same number of digits in them... time tracking is finite... there's only so many milliseconds in a day... so the significant digits should be static... what maybe different is if trailing zeroes are dropped off... but if you're comparing date time to date time... then that shouldn't be affecting things... maybe we need to see how you're actually doing the seek, and/or if there is a better way (I'm not a fan of the seek function... I'd just assume issue a direct (select)update/insert statement.
-tg
-
Re: Nearest Minute Causing Duplicates in PrimaryKey of Date (Access)
Here is the "Seek" code --- very straightforward.
Code:
With rsTemp
'--------------------
'Save it
'--------------------
'See if Record Exists in DB
.Index = "PrimaryKey"
.Seek "=", dtmDateTime
'Add Record if Seek fails, Otherwise edit it
If .NoMatch Then
.AddNew
'Save the Data
!fldHistDateTime = dtmDateTime
!fldHistOpen = CSng(TContract(reqID).THData(i).HOpen)
!fldHistHigh = CSng(TContract(reqID).THData(i).HHigh)
!fldHistLow = CSng(TContract(reqID).THData(i).HLow)
!fldHistClose = CSng(TContract(reqID).THData(i).HClose)
' !fldHistVol = TContract(reqID).THData(i).HVol
.Update
Else
If iReplaceRcds = vbChecked Then
.Edit
'Save the Data
!fldHistDateTime = dtmDateTime
!fldHistOpen = CSng(TContract(reqID).THData(i).HOpen)
!fldHistHigh = CSng(TContract(reqID).THData(i).HHigh)
!fldHistLow = CSng(TContract(reqID).THData(i).HLow)
!fldHistClose = CSng(TContract(reqID).THData(i).HClose)
' !fldHistVol = TContract(reqID).THData(i).HVol
.Update
End If 'Replace Rcds
End If 'NoMatch
End With
Since dates are stored as double, I believe we are in agreement it is a # of precision issue. As I recall a double goes out to 15 decimal points.
Only solution as I see it is convert the DateTime sequence to a double, round it to some precision level (say seconds -- however many decimal points that represents) and then make
sure I store the Datetime based on that precision as well as use that precision when Seeking.
May be better to truncate rather than round at some precision level as rounding might also cause issues.
Got a better idea??
-
Re: Nearest Minute Causing Duplicates in PrimaryKey of Date (Access)
Quote:
This table is for stock data and the only distinquishing characteristic is DateTime
:confused: I don't understand why a datetime would have anything to do with stock at all. Do you mean it's holding transactions against stock, perhaps? e.g. "We took 5 widgets out of stock on the 2 May 2013 at 05:30 for sales order ABC123", that sort of thing? Or I notice that your field names contain the word Hist in them, is this perhaps a series of snapshots of your stock table taken at points in time? Either way I don't understand how a datetime could uniquely identify a record, you'd need a part number in there, surely?
It really sounds like you've got a design problem and you're probably creating further problems for yourself but, as TG said, it's your App. If it makes sense to you then go with it. A couple of things though:-
You've said you're not updating the primary key but it looks like you are:-
Quote:
!fldHistDateTime = dtmDateTime
Assuming fldHistDateTime has been defined as your primary key and is therefore what you're seeking against then you are updating the primary key. If it's not defined as the primary key then your seeking against some other field based on dtmDateTime but updating fldHistDateTime to dtmDateTime. Either way you're likely to experience data inconsistency problems or (assuming it is the priimary key) have to rely on cascades, assuming you've fully set up your referential integrity, which are potentially very hungry.
Quote:
I believe we are in agreement it is a # of precision issue
Having seen that section your code I doubt it is but still can't be sure. I suspect what's actually happening is that the source of dtmDateTime is holding a slightly different value to the value held in fldHistDateTime. You may not be seeing a difference if they're being presented to you without the full milliseconds. You could check that by pulling a few records you'd expect to match from both your source and destination and displaying the datetime values fully, with no formatting.
If that's the case then there's a few things you could do. The best (but not necessarily practical) option would be to examine why the values are different and fix that problem. Are you taking the datetime off two different server's system clocks, for example? Or perhaps uysing the system time at the point of record creation but creating the records a few milliseconds apart?
A more simplistic but imperfect soution would be to do a less granular comparison, similar to what your describing with rounding a float, but the floats a needlessly complicated way of doing it. Instead why not just format the date on both sides of the comparison to only go down to the minute, hour day etc. as apropriate. Be aware that this isn't a perfect solution, though, as the a millisecond's difference could still flip a record from one minute into the next.
-
Re: Nearest Minute Causing Duplicates in PrimaryKey of Date (Access)
FunkyDexter:
Re: Design Issue: Program works great since 1994 -- other than an infrequent problem that is uncovered (this post case in point).
Quote:
You've said you're not updating the primary key but it looks like you are:-
!fldHistDateTime = dtmDateTime
GOOD CATCH. Have remmed this line under EDIT.
Dumbfounded that I missed it. Oh Well!!!!!
FWIW: Just went through my code and found 3 other PrimaryKeys (datetime) I was Editing.
Forgot to Rem or delete when I copied block from the Add code. THANKS.
Re: Precision.
I don't believe formatting the date will do anything to the underlying DateTime double which is what is actually stored. Will VERIFY.
On my system a DateTime converted to a double shows up with a precision of 10 decimal digits.
All decimals are filled no matter whether formatted to hours, minutes, or seconds. Access (to my knowledge) does NOT handle milliseconds.
-
Re: Nearest Minute Causing Duplicates in PrimaryKey of Date (Access)
Quote:
I don't believe formatting the date will do anything to the underlying DateTime double which is what is actually stored
You're right it won't, but you've completely missed the point. I'm not saying store your data in a different way, I'm saying use a different comparison.
Instead of converting your date/time to a double before comparison (which may or may not introduce an inexact value dependent on the parts of the datetime which are more granular than you're concerned with), format it to a string. That way you will be able to see exactly what is going on and you know that you are discarding the DateTime elements of precision rather than the Numeric elements of precision.
E.g. 2 Mar 2012 15:15:33 might convert to 58458625872 expressed as a double (it almost certainly doesn't, I just pulled a number out of the air to use as an example). If you decide you want to round to minutes, how exactly are you going to round that double? You can't. Whereas formatting the datetime to dd MM yyyy hh mm is guaranteed to give you a comparison based correctly on the minutes.
-
Re: Nearest Minute Causing Duplicates in PrimaryKey of Date (Access)
Quote:
If you decide you want to round to minutes, how exactly are you going to round that double? You can't. Whereas formatting the datetime to dd MM yyyy hh mm is guaranteed to give you a comparison based correctly on the minutes.
Got the point of Format now.
Had given thought of using a string but rejected it in favor of comparing using DateDiff.
Code:
If (DateDiff(strDateInterval, dtmDate, dtmDateToCheck) < 0) Then
blnReturn = False
' MsgBox "Different"
' Else
' blnReturn = True
' MsgBox "Same"
End If
Will look into Format and that approach.
-
Re: Nearest Minute Causing Duplicates in PrimaryKey of Date (Access)
Funky
Quote:
Originally Posted by
FunkyDexter
:confused: I don't understand why a datetime would have anything to do with stock at all.
My guess is that DW is dealing with Time and Sales data for stock market data feeds
Spoo
-
Re: Nearest Minute Causing Duplicates in PrimaryKey of Date (Access)
Funky:
This is what I get using format:
Code:
Private Sub Form_Load()
Dim x As Double
Dim h As Date
Dim m As Date
Dim s As Date
Dim dt As Date
dt = Now
h = Format(dt, "hh:00:00")
x = CDbl(h)
Debug.Print x
m = Format(dt, "hh:nn:00")
x = CDbl(m)
Debug.Print x
s = Format(dt, "hh:nn:ss")
x = CDbl(s)
Debug.Print x
End Sub
Results:
0.916666666666667
0.949305555555556
0.949386574074074
Spoo is correct in his guess.
I cant think of a situation where Datetime is Not relevant as it relates to the markets.
Thanks All for your input. Now its a waiting game as to WHEN so I can test for the problem.
-
Re: [RESOLVED] Nearest Minute Causing Duplicates in PrimaryKey of Date (Access)
Quote:
I cant think of a situation where Datetime is Not relevant as it relates to the markets.
When you said Stock I thought you were referring to stock levels rather than stock markets. i.e. How many washers do we have on hand at the moment?