dcsimg
Results 1 to 19 of 19

Thread: One table for all documents

  1. #1

    Thread Starter
    Frenzied Member schoemr's Avatar
    Join Date
    Apr 2016
    Location
    South Africa
    Posts
    1,081

    One table for all documents

    Hi,

    I have several scenarios where a scanned document must be uploaded to the database. I shall only use two examples for the concept..

    Say I have these two scenarios:


    Meetings: Persons signs attendance register and that gets scanned and pdf uploaded to database.
    Assessments: Persons complete assessment form and that gets scanned and pdf uploaded to database.

    So, currently I have separate tables that "holds" this different documents. This is what I am currently do:

    Name:  1.JPG
Views: 165
Size:  34.3 KB


    But now I am thinking... Maybe it is better to have ONE table that holds ALL documents? Is that the better approach or not??

    Will this be the right design?

    Name:  2.JPG
Views: 164
Size:  32.3 KB

    If so... say I upload a meeting document, there will be MeetingID, what then happened to the AssessmentID? Shall it be blank?

    Thanks....
    Don't miss the whole point of the dance...

    https://www.youtube.com/watch?v=qHnIJeE3LAI

  2. #2
    Addicted Member Goggy's Avatar
    Join Date
    Oct 2017
    Posts
    144

    Re: One table for all documents

    One could also do something like so

    tblScannedDocuments
    - DocumentID
    - Type (Type of document)
    - ForeignId (This can point to different tables depending on the type of document )
    Utterly useless, but always willing to help

    As a finishing touch god created the dutch

  3. #3

    Thread Starter
    Frenzied Member schoemr's Avatar
    Join Date
    Apr 2016
    Location
    South Africa
    Posts
    1,081

    Re: One table for all documents

    Quote Originally Posted by Goggy View Post
    One could also do something like so

    tblScannedDocuments
    - DocumentID
    - Type (Type of document)
    - ForeignId (This can point to different tables depending on the type of document )
    Goggy.............

    I am not sure about your answer... The reason is, the same "ID" can be in both tables...

    E.g. I can have a MeetingID = "10" AND a AssessmentID that is also "10" - how will the tblScannedDocuments then know whose "10" that is?

    So.... I will have to filter on DocumentType AND ID ... Okay!! ... that can work
    Last edited by schoemr; Sep 5th, 2019 at 05:17 AM.
    Don't miss the whole point of the dance...

    https://www.youtube.com/watch?v=qHnIJeE3LAI

  4. #4
    .NUT jmcilhinney's Avatar
    Join Date
    May 2005
    Location
    Sydney, Australia
    Posts
    102,754

    Re: One table for all documents

    This question has exactly zero to do with VB.NET. If you have a database question, ask it in the Database Development forum. I have asked the mods to move this thread.

  5. #5

    Thread Starter
    Frenzied Member schoemr's Avatar
    Join Date
    Apr 2016
    Location
    South Africa
    Posts
    1,081

    Re: One table for all documents

    Quote Originally Posted by jmcilhinney View Post
    This question has exactly zero to do with VB.NET. If you have a database question, ask it in the Database Development forum. I have asked the mods to move this thread.
    Its funny, other people can ask all sorts of questions here then you are as quiet as the gallows... But let me DARE ask a question then you are very quick to jump on your high horse.

    When all that REPORTING questions was asked recently here, why didn't you complain? When I had a reporting question a while back you also were very quick to complain.
    Don't miss the whole point of the dance...

    https://www.youtube.com/watch?v=qHnIJeE3LAI

  6. #6

    Thread Starter
    Frenzied Member schoemr's Avatar
    Join Date
    Apr 2016
    Location
    South Africa
    Posts
    1,081

    Re: One table for all documents

    http://www.vbforums.com/showthread.p...ED-Report-Rdlc

    This is a reporting question, you even help to answer. Why didn't you complain it must be moved to reporting section???

    Is it because I am female? Are you sexist?
    Don't miss the whole point of the dance...

    https://www.youtube.com/watch?v=qHnIJeE3LAI

  7. #7
    .NUT jmcilhinney's Avatar
    Join Date
    May 2005
    Location
    Sydney, Australia
    Posts
    102,754

    Re: One table for all documents

    Get over yourself. I've reported loads of questions to the mods in order to have them moved to the appropriate forum and I have posted comments almost exactly the same as the one I did here on most of those occasions. I don't necessarily open every thread so I don't necessarily see every thread that is posted in the wrong place but if I do, I report it. Many questions are borderline and I'll leave those alone but if a question is obviously in the wrong place, as this one clearly is, I'll report it.
    http://www.vbforums.com/showthread.p...ED-Report-Rdlc

    This is a reporting question, you even help to answer. Why didn't you complain it must be moved to reporting section???
    Because it's NOT a reporting question. Did you even read it? The question is specifically about using a SqlDataAdapter. That has got nothing at all to do with reporting. The fact that the data retrieved is going to be used in a report is irrelevant because use of a data adapter is the same regardless of what you're going to do with the data, just as the language you use to write your application is irrelevant to the design of your database schema. It seems that you have trouble identifying the actual subject of a question, your own included.
    Is it because I am female? Are you sexist?
    No it isn't. No I'm not.
    Last edited by jmcilhinney; Sep 5th, 2019 at 06:15 AM.

  8. #8
    Super Moderator si_the_geek's Avatar
    Join Date
    Jul 2002
    Location
    Bristol, UK
    Posts
    41,096

    Re: One table for all documents

    Thread moved to the 'Database Development' forum ... because there is nothing VB specific here


    Using just one table will add complexity to joins etc (which isn't good, but not terrible), and increase storage space for the extra fields (which is debatable depending on the details of the circumstances, now and in the future).

    What benefit do you think you will get from having just one table?

  9. #9
    .NUT jmcilhinney's Avatar
    Join Date
    May 2005
    Location
    Sydney, Australia
    Posts
    102,754

    Re: One table for all documents

    As I pretty much always do in these situations, I will give my answer now that the thread has been moved. I wait until it has been moved so as to not encourage people to post in the wrong forum again. I've been here before, many times.

    We're working on a project in my office at the moment that is rather relevant to this question. Using the terminology from the question, we would have Meeting, Assessment and ScannedDocument tables, all with their own PK. We would also then have MeetingScannedDocument and AssessmentScannedDocument tables that simply join the others. This is by no means the only way and I'd accept arguments for other options being better - we discussed other options before going this way - but that's the way we did it and that's the way I'd do it again.

  10. #10
    PowerPoster Zvoni's Avatar
    Join Date
    Sep 2012
    Location
    To the moon and then left
    Posts
    1,870

    Re: One table for all documents

    Quote Originally Posted by si_the_geek View Post
    Thread moved to the 'Database Development' forum ... because there is nothing VB specific here


    Using just one table will add complexity to joins etc (which isn't good, but not terrible), and increase storage space for the extra fields (which is debatable depending on the details of the circumstances, now and in the future).

    What benefit do you think you will get from having just one table?
    Isn't that the same question about using "Entities" we had some weeks ago?
    One System to rule them all, One IDE to find them,
    One Code to bring them all, and to the Framework bind them,
    in the Land of Redmond, where the Windows lie
    ---------------------------------------------------------------------------------
    People call me crazy because i'm jumping out of perfectly fine airplanes.
    ---------------------------------------------------------------------------------
    For health reasons i try to avoid reading unformatted Code

  11. #11
    PowerPoster Zvoni's Avatar
    Join Date
    Sep 2012
    Location
    To the moon and then left
    Posts
    1,870

    Re: One table for all documents

    Quote Originally Posted by schoemr View Post
    Goggy.............

    I am not sure about your answer... The reason is, the same "ID" can be in both tables...

    E.g. I can have a MeetingID = "10" AND a AssessmentID that is also "10" - how will the tblScannedDocuments then know whose "10" that is?

    So.... I will have to filter on DocumentType AND ID ... Okay!! ... that can work
    The Field "Type" (of Document)
    One System to rule them all, One IDE to find them,
    One Code to bring them all, and to the Framework bind them,
    in the Land of Redmond, where the Windows lie
    ---------------------------------------------------------------------------------
    People call me crazy because i'm jumping out of perfectly fine airplanes.
    ---------------------------------------------------------------------------------
    For health reasons i try to avoid reading unformatted Code

  12. #12
    Super Moderator FunkyDexter's Avatar
    Join Date
    Apr 2005
    Location
    An obscure body in the SK system. The inhabitants call it Earth
    Posts
    7,327

    Re: One table for all documents

    The Field "Type" (of Document)
    Don't do this. It's heading dangerously close to the entity attribute value pair anti-pattern and is fraught with problems. The most obvious being that it will be impossible to enforce referential integrity through the foreign keys. Instead you will have to enforce it through programmed constraints or triggers which will affect performance or you will have to rely on your application to enforce it which is just straight up dangerous.

    @Schoemr, both of your suggested designs are OK and broadly meet third normal form. I personally would suggest the first approach (where you keep the different types of documents in separate tables) because the second approach introduces a need for nulls. Nulls are useful but when they're best avoided when you have the choice, they're a pain to check for in your queries and they take up space. In particular, they take up space on indexes and your foreign keys are likely candidates for indexing.

    There is another consideration though: do you think of the different types of documents as being the same? Are you ever going to want to, for example, query for ALL documents? If so that starts to suggest a design that puts all the documents in a single table, like your second.

    But I would suggest a better way would be to have a single table of documents and the separate tables to link each document type to each type of parent. So you'd have a table called "Meeting_Document" that held a document id and a meeting id. You'd have a similar table for Assessment_Documents. This allows you to enforce referential integrity correctly through the use of foreign keys. It also means there's no nulls creating redundancy in the relationships.

    But it also allow for a document to belong to more than one Meeting. (It's behaving as a linking entity). This is easily solved by making Document_ID the primary key in those tables.

    The only weakness that's left is that a document could still belong to e.g. a single meeting and a single assessment. I've never come up with a good solution to that I'm afraid.

    This is quite a common way of implementing inheritance in a database.
    You can depend upon the Americans to do the right thing. But only after they have exhausted every other possibility - Winston Churchill

    Hadoop actually sounds more like the way they greet each other in Yorkshire - Inferrd

  13. #13
    PowerPoster Zvoni's Avatar
    Join Date
    Sep 2012
    Location
    To the moon and then left
    Posts
    1,870

    Re: One table for all documents

    FD, your alternate approach (the one with the "meeting_document"-Table) is basically a m:n-relation.
    In that scenario, wouldn't the (single) document-table be a Parent-table, too?
    How would you enforce referential integrity down to the document-table?
    For discussions sake: Say, a meeting is deleted. Yes, the RI forces the the records from "meeting_document" to be deleted, but onwards to the document-table?
    One System to rule them all, One IDE to find them,
    One Code to bring them all, and to the Framework bind them,
    in the Land of Redmond, where the Windows lie
    ---------------------------------------------------------------------------------
    People call me crazy because i'm jumping out of perfectly fine airplanes.
    ---------------------------------------------------------------------------------
    For health reasons i try to avoid reading unformatted Code

  14. #14
    Super Moderator FunkyDexter's Avatar
    Join Date
    Apr 2005
    Location
    An obscure body in the SK system. The inhabitants call it Earth
    Posts
    7,327

    Re: One table for all documents

    m:n-relation.
    Was that a typo of m:m (as in many to many). If so, no it's not. By making document_id the primary key in the document_meeting table you force it into a one-to-many relationship because each document can only appear in that table once. So each document can relate to only one meeting_document and therefore only one meeting.
    Yes, the RI forces the the records from "meeting_document" to be deleted, but onwards to the document-table?
    Agreed, that situation is not handled. It's really part of the weakness I described in my second to last paragraph. Because this model allows a document to belong to both a meeting and an assessment, referential integrity alone would be insufficient to identify that the deletion of a meeting should result in the deletion of an associated document. It's important to acknowledge and understand the implications of that weakness.

    The fundamental problem here is that any approach that combines the documents in a single table must, by definition, result in the possibility that a document has more than one parent. Or just as significantly, that a given document does not predictably have a given type of parent. It creates the possibility that a document does not have a parent at all. This is where the risk comes from and is why, in my second paragraph, I primarily advocated not combining the tables at all. Everything after that is about how to mitigate as much of the risk as possible if you choose to combine them but, to my almost certain knowledge, there is no way of entirely mitigating the risk. (Excluding triggers, programmed constraints etc. which aren't "design")

    As an aside, this discussion is really about how you implement inheritance in a database design. The answer is, you don't. You can't. Databases (relational ones at least) are not object orientated and don't support object orientated patterns. But there are three widely acknowledged patterns that get you somewhere close. My favourite (the one I'm really advocating for by having separate tables for different types of document) is called "Table-per-Concrete". The approach where you have linking tables per document type is "Table per Type" and the approach where you have a single documents table with multiple foreign key fields is called "Table per Hierarchy". There's some good discussion about the various approaches here.

    You'll notice, though, that having a Type field that describes the behaviour of a single foreign key field is not among these options. LOTS of people do this without understanding the impacts it can have, both on the ability to ensure the integrity of the data and on performance at scale. It's probably the one that's the simplest for us to understand, which is why so many people choose to implement it, but it is by far the worst approach and comes with all the worst points of the three recognised approaches plus a bunch of it's own. It's a form of the Entity Attribute Value Pair Anti Pattern.

    Note, EAV is fine if you're using an object orientated database, but in a relational one it's horrible.


    Edit> Sorry for the War and Peace essay. Inheritance in databases is a particular bug bear of mine and one that I think very few developers and DB professionals really understand the implications of.


    Edit 2> I thought I'd add another note further supporting my preference for the Table per Concrete approach. It's the only approach of the three that truly meets Codd's original principles. It's third normal form with no sparse columns and it's the only one that truly embeds referential integrity in the design itself.

    Its only weakness (that I can think of) is that it doesn't provide a single combined set of all documents to query against but sql server provides a very good solution to that: create an All_Documents view.
    Last edited by FunkyDexter; Sep 6th, 2019 at 03:22 AM.
    You can depend upon the Americans to do the right thing. But only after they have exhausted every other possibility - Winston Churchill

    Hadoop actually sounds more like the way they greet each other in Yorkshire - Inferrd

  15. #15
    .NUT jmcilhinney's Avatar
    Join Date
    May 2005
    Location
    Sydney, Australia
    Posts
    102,754

    Re: One table for all documents

    Quote Originally Posted by jmcilhinney View Post
    I've reported loads of questions to the mods in order to have them moved to the appropriate forum and I have posted comments almost exactly the same as the one I did here on most of those occasions.
    Just one more example:

    http://www.vbforums.com/showthread.php?878421

    It's almost like, if I see a thread that I think has been posted in the wrong forum, I report it so that it can be moved to the correct forum, without regard for who the OP is. It's not a means to punish the poster but rather a means to keep the site order the way it's supposed to be. I tell the OP that I have done it not to make them feel bad but rather to let them know what their mistake was so that they may avoid repeating it in the future.

  16. #16
    Frenzied Member ChrisE's Avatar
    Join Date
    Jun 2017
    Location
    Frankfurt
    Posts
    1,992

    Re: One table for all documents

    well I am no friend of saveing Documents to a Database, I would always only
    save the Path

    another approach would be to use Outlook, don't know if the Task -Meetings- is
    within a Company.

    here a sample how you could use Outlook to send :
    a) Appointment to User(s), with Date, Time, Subject
    b) add that to there Calendar
    c) add an Attachment (your Documents)

    you can query each Folder in Outlook like a Database Table

    Code:
    Imports Microsoft.Office.Interop.Outlook
    Imports Microsoft.Office.Interop
    Imports System.Runtime.InteropServices
    
    Public Class Form1
    
        Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
    
            Dim OutlookApp As Microsoft.Office.Interop.Outlook.Application
            OutlookApp = CreateObject("Outlook.Application")
            Dim ns As Outlook.NameSpace = Nothing
            Dim calendarFolder As Outlook.MAPIFolder = Nothing
            Dim items As Outlook.Items = Nothing
            Dim appItem As Outlook.AppointmentItem = Nothing
    
            ns = OutlookApp.GetNamespace("MAPI")
            calendarFolder = ns.GetDefaultFolder(Outlook.OlDefaultFolders.olFolderCalendar)
            items = calendarFolder.Items
    
            appItem = items.Add(Outlook.OlItemType.olAppointmentItem)
    
            appItem.Subject = "Meeting"
            appItem.Location = "SomeWhere"
            appItem.Start = Convert.ToDateTime("07.09.2019 10:00:00")
            appItem.End = Convert.ToDateTime("07.09.2019 11:45:00")
            appItem.Importance = OlImportance.olImportanceHigh
           
            appItem.Save() '<-- save !!!
            'IMPOTANT! the Appointment must be saved first !!
            'before you can send to a User
    
    
            'next operation, send Apponitment to user
    
    
            Dim mailItem As Outlook.MailItem = appItem.ForwardAsVcal()
            mailItem.To = "xyz@t-online.de"
            mailItem.Subject = "come to the Meeting"
            mailItem.Body = "some Text"
            mailItem.Attachments.Add("E:\TestFolder\op4.pdf")
            mailItem.Send()
    
    
            'clean up !!
            If Not IsNothing(appItem) Then Marshal.ReleaseComObject(appItem)
            If Not IsNothing(items) Then Marshal.ReleaseComObject(items)
            If Not IsNothing(calendarFolder) Then Marshal.ReleaseComObject(calendarFolder)
            If Not IsNothing(ns) Then Marshal.ReleaseComObject(ns)
        End Sub
    End Class
    hth
    to hunt a species to extinction is not logical !
    since 2010 the number of Tigers are rising again in 2016 - 3900 were counted. with Baby Callas it's 3901, my wife and I had 2-3 months the privilege of raising a Baby Tiger.

  17. #17
    .NUT jmcilhinney's Avatar
    Join Date
    May 2005
    Location
    Sydney, Australia
    Posts
    102,754

    Re: One table for all documents

    Quote Originally Posted by ChrisE View Post
    well I am no friend of saveing Documents to a Database, I would always only
    save the Path
    Actually, SQL Server has a good answer to this issue with its FILESTREAM feature, where clients can access the database as though the document data is stored directly but it is actually stored in the file system and external to the database file. You can even have it create a share so that the files can be accessed directly.

  18. #18
    Frenzied Member ChrisE's Avatar
    Join Date
    Jun 2017
    Location
    Frankfurt
    Posts
    1,992

    Re: One table for all documents

    Quote Originally Posted by jmcilhinney View Post
    Actually, SQL Server has a good answer to this issue with its FILESTREAM feature, where clients can access the database as though the document data is stored directly but it is actually stored in the file system and external to the database file. You can even have it create a share so that the files can be accessed directly.
    sounds Like the FILESTREAM feature is the option to use, but that's up to the OP
    to hunt a species to extinction is not logical !
    since 2010 the number of Tigers are rising again in 2016 - 3900 were counted. with Baby Callas it's 3901, my wife and I had 2-3 months the privilege of raising a Baby Tiger.

  19. #19
    PowerPoster techgnome's Avatar
    Join Date
    May 2002
    Posts
    32,413

    Re: One table for all documents

    The way I'd probably do it is to have a Document table that has an ID, Type (tells me if the document is a Word, Excel, PDF, etc), and the Binary (or pointer if using the FILESTREAM option) ... The document doesn't care if it belongs to a meeting or not... it's just a document.
    If a meeting can have more than one document (and I'm betting it could) then I'd have a middle table tblMeetingDocuments that links Meetings to Documents); If Assessments can have more than one document, I'd do the same there... if not, then I'd put the documentID right on the Assessment table (or on the AssessmentForms, which ever is more appropriate).

    Yes, that does mean that documents can live outside of meetings and assessments... but for archival purposes, that should be fine... if you're storing documents related to these things, odds are those things should probably not be deleted in the first place (which can be stopped with a well-placed trigger and appropriate security roles in place). Mark things with a soft-delete and hide it from the user.

    -tg
    * I don't respond to private (PM) requests for help. It's not conducive to the general learning of others.*
    * I also don't respond to friend requests. Save a few bits and don't bother. I'll just end up rejecting anyways.*
    * How to get EFFECTIVE help: The Hitchhiker's Guide to Getting Help at VBF - Removing eels from your hovercraft *
    * How to Use Parameters * Create Disconnected ADO Recordset Clones * Set your VB6 ActiveX Compatibility * Get rid of those pesky VB Line Numbers * I swear I saved my data, where'd it run off to??? *

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  



Featured


Click Here to Expand Forum to Full Width