|
-
Aug 7th, 2025, 04:27 AM
#1
Looking for a suggestion for "dynamic" tables
Bit of background:-
I'm implementing a system where a user can specify multiple sources of entities (in this case people) and then define some matching criteria that can be used to identify common entities across sources. E.g. Joe Bloggs in Source A is the same person as Joe Bloggs source B because they have the same forename and surname.
(matching criteria will actually be far more complex than that - I use Levenshtein distancing to calculate similarity scores on each field then a weighted average to generate an overall score and assign a match based on an overall threshold being reached)
I cannot know the sources a user will have at design time.
So my basic approach is to have a UI where the user can define a source, it's properties and it's matching criteria to another source. That can then create the necessary supporting tables - basically a table per source with a field per property (call this a source data table). The user will then be responsible for populating the source tables and my matching algorithm will run over the resultant data.
The problem:-
How to create those source data tables. At present I'm using dynamic sql to generate the necessary create table statements but you can't put metadata (table name, column name etc.) in a sql parameters. That means I'm having to use concatenation and that, in turn, risks sql injection. That's what I'd like to get past. Can anyone suggest a way of sanitizing the inputs for this.
I should say that this app will be used by trusted users who would have direct DB access anyway so injection risks aren't the worst thing in the world for me but it still feels shoddy.
Failing that, I could consider a redesign of the matching part of the system. Two obvious approaches would be a NoSQL database or an single Entity Attribute Value Pair table. I don't like either of these though. It would make it difficult for the users to populate the data - they're usually decent sql users but they're not going to know a NoSql platform and asking them to populate a EAV design would be a nightmare. (Aside, for anyone discovering it for the first time, I do NOT recommend EAV - it seems like a good idea at first, you WILL pay for it later.)
I also considered a single table for all sources with the combined field set for all sources but that feels pretty ugly too. It would have a massive number of empty fields on any given row.
I'm open to suggestions for a redesign but it needs to be easily populatable to a moderately able but non professional sql user.
So, any thoughts?
Last edited by FunkyDexter; Aug 7th, 2025 at 04:32 AM.
The best argument against democracy is a five minute conversation with the average voter - Winston Churchill
Hadoop actually sounds more like the way they greet each other in Yorkshire - Inferrd
-
Aug 7th, 2025, 11:52 AM
#2
Re: Looking for a suggestion for "dynamic" tables
Could you stage the table definition process in Access, via a TableDef no SQL statements at all. Then you can convert to SQL server from there.
-
Aug 7th, 2025, 11:55 AM
#3
Re: Looking for a suggestion for "dynamic" tables
I don't think I want to get access involved as a middleman but, I'll be honest, I haven't used access in decades and can't remember it's capabilities so I don't want to dismiss it out of hand. Could you provide a little more detail about what it would involve?
Edit> N.b. I wouldn't want the user to have to open up access and design the table in there. If that's what you're suggesting then it's not for me. I want them to work solely in my gui.
The best argument against democracy is a five minute conversation with the average voter - Winston Churchill
Hadoop actually sounds more like the way they greet each other in Yorkshire - Inferrd
-
Aug 7th, 2025, 12:04 PM
#4
Re: Looking for a suggestion for "dynamic" tables
Yes, you'd be in control via your GUI, I'm suggesting to use a staging area that doesn't have the possibility of SQL Injection (Doesn't need to be access of course). Then you can create your SQL Create table statement from that. The user would never know/see this staging step.
-
Aug 8th, 2025, 04:15 AM
#5
Re: Looking for a suggestion for "dynamic" tables
OK. I don't really want to use Access but you did provide me some inspiration. I was surprised to learn that access wasn't vulnerable to injection so I had a google and found this site. It's actually not completely immune but it is much harder.
Looking at the reasons it's harder, they basically come down to the limited syntax it accepts which excludes the typical attack opportunities. OK. So I think I can sanitize my inputs manually by excluding any non alphanumerics. The only values I'm having to concatenate are table and field names and these will be wrapped with []. The keywords that open up injection aren't supported by TSql so I think a limited character set of [A-Z][a-z][0-9] should be safe.
So I guess, here's me throwing out a general challenge. Given the following:-
c# Code:
//New Table. Create it.
sql = "Create Table SourceData." + WrappedName(tableName)
+ "(Id Int Primary Key Identity(1, 1), "
+ string.Join(", ", efcontext.SourceFields
.Where(sf => sf.SourceId == source.Id)
.Select(sf => WrappedName(sf.Name) + " varchar(" + sf.MaxLength.ToString() + ")")
.ToArray()) + ")";
directcontext.Database.ExecuteSqlCommand(sql);
Can anyone manage to successfully inject into that? (Don't post attacks here - please PM me).
Edit (WrappedName just adds the [])
The best argument against democracy is a five minute conversation with the average voter - Winston Churchill
Hadoop actually sounds more like the way they greet each other in Yorkshire - Inferrd
-
Aug 11th, 2025, 04:07 AM
#6
Re: Looking for a suggestion for "dynamic" tables
If it's not Access, you could use a Staging Table, that contains the "final" SQL-Statement in a Computed Column
if i understood you correctly, you want to do something along the lines of
Code:
UPDATE :ATableName SET :AColumnName=:ColumnParam WHERE ID=:RowParam
which is not possible.
I've come up with this:
Use a Staging table that has the "final" SQL-Statement as a computed Column
Code:
CREATE TABLE "tbl_update" (
"TableName" TEXT,
"FieldName" TEXT,
"FieldParam" TEXT,
"RowParam" TEXT,
"SQLString" TEXT GENERATED ALWAYS AS ('UPDATE ' || "TableName" || ' SET ' || "FieldName" || '=' || "FieldParam" || ' WHERE ID=' || "RowParam") VIRTUAL
);
Then you can use it as follows
Code:
INSERT INTO tbl_update
("TableName", "FieldName", "FieldParam", "RowParam")
VALUES (:TableNameParam,:FieldNameParam,':FieldParam',':RowParam'); //Attention: the last 2 Values are STRINGS
//Or
INSERT INTO tbl_update
("TableName", "FieldName", "FieldParam", "RowParam")
VALUES ('MyTable','MyColumn',':FieldParam',':RowParam'); //Attention: the last 2 Values are STRINGS
In the first version you can use Params for 1st and 2nd argument.
The second version would be Concatenated/hardcoded
Now you read the column "SQLString"
Code:
SELECT SQLString FROM tbl_update
Result:
UPDATE MyTable SET MyColumn=:FieldParam WHERE ID=:RowParam
Assign this to a Query-Object, assign the Params, fire!
I'm aware that a Create Table-Statement is more complex than my sample (which is in SQLite), but still solvable with my approach.
e.g. The Column-Names with its Attributes could be constructed in GUI as a single String from a Collection that doesn't allow duplicates for the Column-Names
Like
Code:
ID INT PRIMARY KEY AUTO_INCREMENT
FirstColumn VARCHAR(100) UNIQUE
SecondColumn INT
ThirdColumn DOUBLE
Throw that into an Array, use a Join-Function with ", " as delimiter, and you get
ID INT PRIMARY KEY AUTO_INCREMENT, FirstColumn VARCHAR(100) UNIQUE, SecondColumn INT, ThirdColumn DOUBLE
Insert this into the staging-Table
and the Computed column would be along the lines of
"SQLCreateTable" TEXT GENERATED ALWAYS AS ('CREATE TABLE IF NOT EXIST ' || "TableName" || ' (' || "FieldName" || ')') VIRTUAL
Result
CREATE TABLE IF NOT EXIST MyTable (ID INT PRIMARY KEY AUTO_INCREMENT, FirstColumn VARCHAR(100) UNIQUE, SecondColumn INT, ThirdColumn DOUBLE)
Of course you should make sure that the staging table is empty before inserting into it.
nutshell: at any given time the staging table is either empty or has exactly one record
EDIT:
Another approach
If i understood you correctly, you won't know beforehand that Source-Table A has 27 columns (in whichever DataType-Combination and order), and Source-Table B has 19 Columns with different Columnnames, Datatypes, whatever) compared to Source A
If your DBMS supports the INFORMATION_SCHEMA you could actually query that for the Columnnames and its Attributes, once you have the Table-Name of the Source
Last edited by Zvoni; Aug 11th, 2025 at 04:43 AM.
Last edited by Zvoni; Tomorrow at 31:69 PM.
----------------------------------------------------------------------------------------
One System to rule them all, One Code to find them,
One IDE to bring them all, and to the Framework bind them,
in the Land of Redmond, where the Windows lie
---------------------------------------------------------------------------------
People call me crazy because i'm jumping out of perfectly fine airplanes.
---------------------------------------------------------------------------------
Code is like a joke: If you have to explain it, it's bad
-
Aug 11th, 2025, 04:51 AM
#7
Re: Looking for a suggestion for "dynamic" tables
OK, I think I follow that but I'm away from the project at the moment and won't get to try it out until later in the week.
I can see a couple of issues. Can probably work around them but would appreciate your thoughts.
1. Indeterminate number of fields - Not only do I not know the nature of the fields at design time, I don't know the number either. So I can't use a static quantity of 3. I think I could get around this with an extra abstraction - e.g. storing them as rows in a column and then pivoting them. Or I could have a create statement to create the basic table (the ID field is ALWAYS present so it could just create the table with that single column) and then a series of alter statements to create the dynamic fields as subsequent operations using something similar to your update statement - except an Alter Add Column.
2. Does this actually prevent injection? (Might be better taking this one to the pms) but couldn't a hacker inject an attack into the values held in tlbUpdate. E.g. an attack into the RowParam field. Ultimately the sql statement is still be produced by concatenation in the Generated SqlString column so I think it's still vulnerable. I'm likely missing a trick there though.
Edit> Also, take a look at my challenge in post 5. I think that might be good enough but I know that, if anyone can inject an attack into it, you can. Obviously, pm me a successful attack - don't post it in the main forum.
Last edited by FunkyDexter; Aug 11th, 2025 at 04:54 AM.
The best argument against democracy is a five minute conversation with the average voter - Winston Churchill
Hadoop actually sounds more like the way they greet each other in Yorkshire - Inferrd
-
Aug 11th, 2025, 05:03 AM
#8
Re: Looking for a suggestion for "dynamic" tables
 Originally Posted by FunkyDexter
2. Does this actually prevent injection? (Might be better taking this one to the pms) but couldn't a hacker inject an attack into the values held in tlbUpdate. E.g. an attack into the RowParam field. Ultimately the sql statement is still be produced by concatenation in the Generated SqlString column so I think it's still vulnerable. I'm likely missing a trick there though.
Look closely at the first Version of the INSERT into the Staging table:
the first 2 Arguments are Params, Args 3 and 4 are hardcoded Strings
For your Create Table you'd need an INSERT with just 2 Params: TableName, FieldList (or whatever you want to call it)
Like
INSERT INTO tblCreate ("TableName", "FieldList") VALUES(:MyParamTableName, :MyParamFieldList) --> This is basically hardcoded in your GUI
Set Params
execute
Read SQLCreateTable
assign to Query-Object
Execute
Yes, in a way it's still vulnerable to Injection, but only if you allow the User to enter something like "); " (closing paranthesis followed by semicolon)
Do note, that the "final" SQL-Statement is stored in the Database itself, so it's accessible via SSMS (or whatever other Tool) in plain text
One "security-wise" approach might be to encrypt those entries
OTOH, you can "tinker" with the SQL-Statements outside your app as long as you don't change the "Interface" of it
Last edited by Zvoni; Aug 11th, 2025 at 05:19 AM.
Last edited by Zvoni; Tomorrow at 31:69 PM.
----------------------------------------------------------------------------------------
One System to rule them all, One Code to find them,
One IDE to bring them all, and to the Framework bind them,
in the Land of Redmond, where the Windows lie
---------------------------------------------------------------------------------
People call me crazy because i'm jumping out of perfectly fine airplanes.
---------------------------------------------------------------------------------
Code is like a joke: If you have to explain it, it's bad
-
Feb 8th, 2026, 12:52 AM
#9
Junior Member
Re: Looking for a suggestion for "dynamic" tables
This is common for data integration 'stuff'. In Microsoft world, it is SSIS. You can DL tools to do this for free. But you will be wiring up processes to pull the data, enrich it, and then output it however you want (across the variety of machines or servers you have access to).
Posting Permissions
- You may not post new threads
- You may not post replies
- You may not post attachments
- You may not edit your posts
-
Forum Rules
|
Click Here to Expand Forum to Full Width
|