[RESOLVED] PK/Import Problem

Printable View

Jun 6th, 2022, 05:47 PM
Shaggy Hiker

[RESOLVED] PK/Import Problem

I have a database in SQL Server with a single table in it. This ends up receiving information from a service. The data coming in is essentially a flat file (it's just a CSV packed into a JSON field) that gets directly imported into the existing table. That incoming data has no PK. At some point in the past, a PK field was added to the destination table. From looking at the data, it is clear that people changed their minds a few times. Some of the records are GUIDs, then for a time it was some concatenated string made up of a few fields from the table...and maybe the current date. Then it went back to GUIDs.

Therefore, the PK field is not a uniqueidentifier field, it's a varchar field. That's life.

My original attempt was to suck the CSV flat file into a datatable, then push that to the DB table using a command builder. That almost worked, but not quite. The CSV flat file lacks the PK field.

My first thought was to add the PK field to the datatable prior to pushing the DT to the DB, but that doesn't work, because filling in the PK field means that the rowstates are set to Modified, and I need them set to Added, since these are all new rows.

I then thought I might leave out the PK field entirely and put a default value for the row into the table in the DB. That might work, but only if I write out ALL the fields in this monster table. Otherwise, the source table has not field with the name of the PK field, so the CommandBuilder fails, since I built the commandBuilder using "SELECT * FROM DestinationTable". Naturally, that has the PK field.

If I try to add the PK field to the source table in the query that fills the source table, that also doesn't work, because it puts in a field.

The only thing that seems like it would work is to write out every single field for the CommandBuilder Select query, and there are a LOT of them. Worse, I can't guarantee that the set won't change.

Is there a better way to do this?

There are 63 columns in the DT. When I wrote them out, the Update on the dataadapter fails with a message saying that it can't insert Null into the PK field. The PK field was left out because I thought it would be sufficient to set a DefaultValue in the DB table to NewID(). That doesn't appear to work, and I'm not sure why.

What I know I can do is to write out a horrific INSERT query for the DA, but with 63 fields, about half of which follow a stupid naming convention that requires square brackets....Arrrrrggghhhh!
Jun 7th, 2022, 02:24 AM
Zvoni

Re: PK/Import Problem

I take it, the PK is NOT a surrogate PK (as in AutoIncrement Integer)?
The only thing coming to mind, if it doesn't matter WHAT the PK actually is (GUID, some Concat of Fields, dirty socks, girlfriends) and the PK stays a VARCHAR, is to SELECT current DateTime+Row-Number additionally and send that one as a Value for the PK

EDIT: As for writing out the SELECT: By any chance: Are the Fieldnames of the CSV exactly the same as the Fieldnames in the Destination-Table?
I'm thinking of querying the INFORMATION_SCHEMA (or other Sys-Tables), kind of like "SELECT STRING_AGG(FieldName, ', ') AS MyColumns FROM INFORMATION_SCHEMA WHERE Table_name='MyTable',
and then use the returned String to build your final SELECT
https://docs.microsoft.com/en-us/sql...l-server-ver16

EDIT2: You can even use the above approach to exclude Columns:
SELECT STRING_AGG(FieldName, ', ') AS MyColumns FROM INFORMATION_SCHEMA WHERE Table_name='MyTable' And Fieldname<>'PKField' ORDER BY OrdinalPos
Or excluding multiple Columns:
SELECT STRING_AGG(FieldName, ', ') AS MyColumns FROM INFORMATION_SCHEMA WHERE Table_name='MyTable' And Fieldname NOT IN ('Field1', 'Field2') ORDER BY OrdinalPos

EDIT3: Just took a look at it: Your best bet would probably be sys.columns joined with sys.tables and/or sys.objects
Jun 7th, 2022, 08:23 AM
Shaggy Hiker

Re: PK/Import Problem

I like that approach suggested in the Edit.

The PK...is a mess. Having it change from one thing to another, and back again, has made a mess of the whole column. I'm wondering whether it wouldn't be better to just clean that up. I'm not sure who is using it, and for what. The table doesn't link to anything, so there certainly aren't any foreign key relationships to worry about. I'm thinking that it could be an autonumber just as easily, and I'm not sure why it isn't. Perhaps that can be changed over.

Since writing that, I found out that there were some other peculiar decisions that were made, and I have to determine whether or not I have to adhere to those decisions, as well. There are several columns that will be mostly empty. For some reason, almost all (but not QUITE all) of those columns were made non-nullable. So...what's in those fields when they LOOK empty? I'm pretty sure that they are just empty strings. Does that matter? It looks Null, but isn't REALLY Null.

If I care, then it is looking like a big hurkin' INSERT statement will be the way to go, in which case I can do whatever I want with the PK, and I might as well go ahead and ignore the schema, as well, because nearly a quarter of the 63 fields will have to have some logic to transform Null into something else anyways.
Jun 7th, 2022, 08:35 AM
Zvoni

Re: PK/Import Problem

Maybe those Columns are NON-NULL(-able) but have a Default of empty String?
Should be easy to test with a
SELECT .... FROM Table WHERE MyColumn IS NULL
If the records show up, then it's a NULL, if not....

Quote:

Originally Posted by Shaggy Hiker

If I care, then it is looking like a big hurkin' INSERT statement will be the way to go, in which case I can do whatever I want with the PK, and I might as well go ahead and ignore the schema, as well, because nearly a quarter of the 63 fields will have to have some logic to transform Null into something else anyways.

Well, there is always this one:
COALESCE((CASE WHEN Trim(SomeFieldName)='' THEN NULL ELSE SomeFieldName END), 'SomeThingElse')
Catches both: Empty String as well as NULL, and transforms them to SomeThingElse
Jun 7th, 2022, 08:35 AM
ChrisE

Re: PK/Import Problem

I allways create a Linked Table in Access an take a look at the (crap) CSV
don't know if this an option, but here you go, isn't really that much code

Code:

Option Strict On Imports Microsoft.Office.Interop.Access.Dao Private Sub Button1_Click(sender As System.Object, e As System.EventArgs) Handles Button1.Click Dim AccessDatabaseEngine As New Microsoft.Office.Interop.Access.Dao.DBEngine() Dim AccessDatabase As Microsoft.Office.Interop.Access.Dao.Database Dim AccessTableDef As Microsoft.Office.Interop.Access.Dao.TableDef Try AccessDatabase = AccessDatabaseEngine.OpenDatabase("D:\DbCSV_Linked.accdb") AccessTableDef = AccessDatabase.CreateTableDef("tbl_Adressen") AccessTableDef.Connect = "Text;FMT=Delimited;HDR=YES;DATABASE=E:\CSVDateien" AccessTableDef.SourceTableName = "Adressen.csv" AccessDatabase.TableDefs.Append(AccessTableDef) AccessDatabase.Close() Catch ex As Exception MessageBox.Show(ex.Message) Finally AccessDatabase = Nothing End Try End Sub
Jun 7th, 2022, 11:46 AM
Shaggy Hiker

Re: [RESOLVED] PK/Import Problem

This has to be done unattended, so looking at it first is not an option.

However, I found a bunch of bizarre business rules were being applied, and found a better way to solve the whole thing.

What I ended up doing was sucking the data into one datatable, but that datatable would end up being wrong in a variety of interesting ways. For example, it would infer that a certain field was an integer, because it might be for all 2000 records....but it actually is NOT an integer, because somebody decided that making it a field that was something of a coded string. It could be 1, or it could be "1,1", and good luck understanding what THAT means. I figured it out, but can't imagine how anybody would find it useful.

Anyways, once I had that datatable, I created a new datatable that was a clone of the first. I then had to go through the clone and fix the datatype for the incorrectly inferred columns, of which there were only a dozen, or so, with one of three different corrections.

After that, I iterated through the original datatable moving the data over to a new row in the new datatable with the odd conversions that were required by another set of rules. Some fields were nullable, some fields were not nullable, but were strings. For those that were not nullable, sometimes Null meant empty string, sometimes there was a value. Whatever....

Anyways, that proved to be quick to write, though finding all the edge cases in those weird rules was not so quick.
Jun 8th, 2022, 10:53 AM
techgnome

Re: PK/Import Problem

Quote:

Originally Posted by Shaggy Hiker

I like that approach suggested in the Edit.

The PK...is a mess. Having it change from one thing to another, and back again, has made a mess of the whole column. I'm wondering whether it wouldn't be better to just clean that up. I'm not sure who is using it, and for what. The table doesn't link to anything, so there certainly aren't any foreign key relationships to worry about. I'm thinking that it could be an autonumber just as easily, and I'm not sure why it isn't. Perhaps that can be changed over.

Since writing that, I found out that there were some other peculiar decisions that were made, and I have to determine whether or not I have to adhere to those decisions, as well. There are several columns that will be mostly empty. For some reason, almost all (but not QUITE all) of those columns were made non-nullable. So...what's in those fields when they LOOK empty? I'm pretty sure that they are just empty strings. Does that matter? It looks Null, but isn't REALLY Null.

If I care, then it is looking like a big hurkin' INSERT statement will be the way to go, in which case I can do whatever I want with the PK, and I might as well go ahead and ignore the schema, as well, because nearly a quarter of the 63 fields will have to have some logic to transform Null into something else anyways.

I saw a situation where there was a varchar field ... non nullable ... which is fine, until someone decided at some point that they did want to pull null into that field. But rather than changing the table def to allow null on the col(s) in question ... the code was changed to insert the literal string "NULL" into the field ... some days it's just not worth getting out of bed.

-tg
Jun 8th, 2022, 03:09 PM
Shaggy Hiker

Re: [RESOLVED] PK/Import Problem

Back when I started here, which were medieval times to be sure, there was a program written in DBase III or IV. The idea was that a long string of numbers would be entered. The numbers all had meanings, but they were stuck together. So the first two digits might be one thing, the next three another, and so forth. The result was that a single record was a string of digits perhaps some 50-60 characters long.

Obviously, the chance for a typo in something like that was quite high, so the system was set up so that one person entered the number into the DBase app, while a second person entered it on a second computer. I'm not sure what the second person entered the number into. It might have been Notepad, though I think it was somewhat more than that. In any case, a 'file' would consist of about 20 of these records. A file would be saved to a 3.5" floppy disc (that's where the whole world was still at, back then), which was carried to the other computer. The original computer read the disc and compared the lines from one file to the next. It would say which lines were wrong, but wouldn't say in what WAY they were wrong. It was up to the users to stare at that godawful long number, trying to find which digit was wrong.

And then somebody entered an entire file using O in place of 0. I didn't join up until after that, but people still spoke about it. On those monitors, there was no visible difference between "O" and zero. Apparently, it took a VERY long time of people staring at twenty rows of several dozen digits each, trying to find a difference, before somebody figured out what had happened.