Results 1 to 14 of 14

Thread: [Access 2010] Quotes are Screwing Up Recordset of Tab Delimited Data

  1. #1

    Thread Starter
    Addicted Member
    Join Date
    Sep 2006
    Posts
    223

    [Access 2010] Quotes are Screwing Up Recordset of Tab Delimited Data

    I have a tab delimited file. One record contains double quotes.

    Name:  excel.jpg
Views: 1245
Size:  7.7 KB

    I'm reading the text TSV file using ADODB and writing the results to MS Access. That double quote is really screwing things up. It appears to end the recordset!

    Name:  VBA.JPG
Views: 1907
Size:  23.3 KB

    The only answer I see from Google is to read the TSV file using VBA, not ADODB. But this really isn't an answer -- it avoids the problem. I want to use ADODB. It's hard to believe there's no way to read TSV files with a double quote in them. Any ideas?


    This is the code that opens the ADODB connection and recordset:

    Code:
    Public Sub openADODBConnection()
        Set c_cn = New ADODB.Connection
    
        c_cn.Open "Provider=Microsoft.Jet.OLEDB.4.0;" & _
            "Data Source=""" & c_crFolder.path & """;" & _
            "Extended Properties=""Text;HDR=Yes;FMT=Delimited"""
    
        If c_cn.State <> adStateOpen Then
            Call Me.closeDBConnection(exit_failure, "DB Connection Failed.")
            Call MsgBox("ADO Connection failed with: " & c_crFolder.path, , "Connection Err")
            Exit Sub
        End If
    End Sub
    
    
    Public Sub openADODBRecordset(table As Tables)
        Dim file As String, sql As String
        Set c_rs = New ADODB.Recordset
    
        If table = Tables.bulkpor Then
            file = "FFIEC CDR Call Bulk POR " & Me.dateString & ".txt"
        ElseIf table = Tables.rc Then
            file = "FFIEC CDR Call Schedule RC " & Me.dateString & ".txt"
        End If
    
        sql = "SELECT * FROM [" & file & "]"
        c_rs.Open sql, Me.dbConnection, adOpenForwardOnly, adLockReadOnly, adCmdText
        If c_rs.State <> adStateOpen Then Call Me.closeDBRecordSet(exit_failure, "Recordset Err")
    End Sub
    Additionally, I use a schema.init file to force Access to read the file using tab delimiters.

    Code:
    [FFIEC CDR Call Schedule RCRII 06302014(1 of 2).txt]
    Format=TabDelimited

    There surely has to be a way to handle this, right? If I had to use a ham-fisted approach, I'd probably pre-process the files to remove the double quotes physically rather than switch to reading the TSV files with a VBA file read/split operation.

  2. #2
    Lively Member
    Join Date
    May 2015
    Posts
    111

    Re: [Access 2010] Quotes are Screwing Up Recordset of Tab Delimited Data

    I don't know if this will work, but Chr(34) is the ASCII code for double quotes. I know you can use that in VBA. So I'm understanding you correctly you want to search for the double quotes. You can replaced them with Chr(34).

  3. #3

    Thread Starter
    Addicted Member
    Join Date
    Sep 2006
    Posts
    223

    Re: [Access 2010] Quotes are Screwing Up Recordset of Tab Delimited Data

    Sorry if I wasn't clear --- there are double quotes in my tab delimited file which are screwing up how ADODB is reading the file.

    Basically, the first appearance of a double quote is causing the recordset to end.

    I think maybe the best thing to do would be to cook up a small toy problem and post the problem here. I'll try to do that today.

  4. #4
    Smooth Moperator techgnome's Avatar
    Join Date
    May 2002
    Posts
    34,537

    Re: [Access 2010] Quotes are Screwing Up Recordset of Tab Delimited Data

    IS there a reason to use ADO for this? Why not just open the file, read each line, split on the TAB and write it out to the database?
    I suspect it's because the parser isn't expecting the quote identifier to be in the middle of the data. I know with CSV files, I have to indicate if the fields are quote identified so it knows the difference between a comma in the middle of a quote (part of the value) vs outside the quote (separating values).

    You don't show it, but what about the next record in the file... did it get read in OK?
    Also, a couple things to try to help narrow down the issue - put double quotes around the whole field, replacing the double quotes with single quotes, and removing the quotes all together.

    I'm curious as to what the results would be. I have my suspicions.

    -tg
    * I don't respond to private (PM) requests for help. It's not conducive to the general learning of others.*
    * I also don't respond to friend requests. Save a few bits and don't bother. I'll just end up rejecting anyways.*
    * How to get EFFECTIVE help: The Hitchhiker's Guide to Getting Help at VBF - Removing eels from your hovercraft *
    * How to Use Parameters * Create Disconnected ADO Recordset Clones * Set your VB6 ActiveX Compatibility * Get rid of those pesky VB Line Numbers * I swear I saved my data, where'd it run off to??? *

  5. #5
    PowerPoster
    Join Date
    Oct 2010
    Posts
    2,141

    Re: [Access 2010] Quotes are Screwing Up Recordset of Tab Delimited Data

    Additionally, I use a schema.init file to force Access to read the file using tab delimiters.
    I hope that ".init" is a typo; it should be named "Schema.ini".

    Quote Originally Posted by caffeine View Post
    Basically, the first appearance of a double quote is causing the recordset to end.

    I think maybe the best thing to do would be to cook up a small toy problem and post the problem here. I'll try to do that today
    A sample text file would help as I am not able to reproduce your issue.

  6. #6

    Thread Starter
    Addicted Member
    Join Date
    Sep 2006
    Posts
    223

    Re: [Access 2010] Quotes are Screwing Up Recordset of Tab Delimited Data

    Quote Originally Posted by TnTinMN View Post
    I hope that ".init" is a typo; it should be named "Schema.ini".


    A sample text file would help as I am not able to reproduce your issue.

    Totally a typo. I don't think it's possible to read TSV files with ADO without a proper schema.ini.

    Just got back from taking my daughter to the eye doctor. I'll try to cook a toy problem right now.

  7. #7
    PowerPoster
    Join Date
    Dec 2004
    Posts
    25,618

    Re: [Access 2010] Quotes are Screwing Up Recordset of Tab Delimited Data

    i was easily able to duplicate your problem

    if a field is enclosed in quotes they are stripped out, no problem, unless the quotes are required to be kept
    but if a field contains a quote within the text, the quote and the rest of that record are terminated, the next record is fine (up to any quote)

    i found no solution within ADO

    the only simple solution, to keep using the TSV as a recordset, is to, first, open the file for i/o and replace all quotes with some other character, i did test, single quotes do not give the same problem

    I don't think it's possible to read TSV files with ADO without a proper schema.ini.
    there is a registry setting, that can be changed
    i do my best to test code works before i post it, but sometimes am unable to do so for some reason, and usually say so if this is the case.
    Note code snippets posted are just that and do not include error handling that is required in real world applications, but avoid On Error Resume Next

    dim all variables as required as often i have done so elsewhere in my code but only posted the relevant part

    come back and mark your original post as resolved if your problem is fixed
    pete

  8. #8

    Thread Starter
    Addicted Member
    Join Date
    Sep 2006
    Posts
    223

    Re: [Access 2010] Quotes are Screwing Up Recordset of Tab Delimited Data

    I have a toy problem that illustrates the behavior which I've attached. The zip file contains:

    • an empty Access 2010 file with only a VBA module (just run the only procedure)
    • a data file (data.txt)
    • a readme file describing the problem
    • a schema file (schema.ini) which is necessary for reading TSV files


    I've also investigated the problem a little more and learned a bit more about its behavior. This feels like a bug because the behavior is inconsistent. Not only is the rest of the given recordset null (the rest of the recordsets are unaffected and normal) but what shows up depends on where the quoted material is.

    • When the quoted material is at the beginning of the record, the quoted material is printed.
    • When the quoted material is in the middle of the record, the beginning unquoted material is printed.
    • When the quoted material is at the end of the record, the non-quoted material is printed.



    Case A) Entire field is quoted: No problem here.

    Input: A<TAB>"hello and goodbye"<TAB>C

    Field 0: "A"
    Field 1: "hello and goodbye"
    Field 2: "C"



    Case B) Beginning material is quoted: Problem exists.

    Input: A<TAB>"hello" and goodbye<TAB>C

    Field 0: A
    Field 1: hello
    Field 2: Null



    Case B) Middle material is quoted: Problem exists.

    Input: A<TAB>hello "and" goodbye<TAB>C

    Field 0: A
    Field 1: hello
    Field 2: Null



    Case B) End material is quoted: Problem exists.

    Input: A<TAB>hello and "goodbye"<TAB>C

    Field 0: A
    Field 1: hello and
    Field 2: Null



    In case you don't have Access 2010, I'm quoting my code here so you can cut and paste into whatever version of Access you have (don't forget to include a reference to Microsoft ActiveX Data Objects 2.x.


    Code:
    Option Compare Database
    Option Explicit
    
    
    
    ''' Requires a reference to Microsoft ActiveX Data Objects 2.x DLL
    '''
    Public Sub test()
        Dim cn As ADODB.Connection, rs As ADODB.Recordset
        Dim path As String, sql As String, vTmp As Variant, s As String
    
        Set cn = New ADODB.Connection
        Set rs = New ADODB.Recordset
        path = CurrentProject.path
    
    
        ' Open the connection
        '
        cn.Open _
            "Provider=Microsoft.Jet.OLEDB.4.0;" & _
            "Data Source=""" & path & """;" & _
            "Extended Properties=""Text;HDR=Yes;FMT=Delimited"""
    
        If cn.State <> ADODB.ObjectStateEnum.adStateOpen Then
            Call MsgBox("ADO Connection failed", , "Error: ADO Connection")
            GoTo DONE
        End If
    
    
        ' Open the recordset
        '
        sql = "SELECT * FROM [data.txt]"
        rs.Open sql, cn, adOpenForwardOnly, adLockReadOnly, adCmdText
    
        If rs.State <> ADODB.ObjectStateEnum.adStateOpen Then
            Call MsgBox("ADO Recordset failed", , "Error: ADO Recordset")
            GoTo DONE
        End If
    
    
        ' Print recordsets
        '
        Do While rs.EOF = False
            s = vbNullString
            For Each vTmp In rs.Fields
                s = s & vTmp & ","
            Next vTmp
            
            Debug.Print s
            rs.MoveNext
        Loop
    
    
    DONE:
        On Error Resume Next
        rs.Close
        Set rs = Nothing
        cn.Close
        Set rs = Nothing
        On Error GoTo 0
    End Sub
    Attached Files Attached Files

  9. #9
    PowerPoster
    Join Date
    Oct 2010
    Posts
    2,141

    Re: [Access 2010] Quotes are Screwing Up Recordset of Tab Delimited Data

    Quote Originally Posted by westconn1 View Post
    i was easily able to duplicate your problem
    And so would I have if I didn't get lazy and not look at how Excel saved my test data as a Tab Delimited file.
    Sorry about that.

  10. #10

    Thread Starter
    Addicted Member
    Join Date
    Sep 2006
    Posts
    223

    Re: [Access 2010] Quotes are Screwing Up Recordset of Tab Delimited Data

    You certainly don't remember me, but when I was a developer at Fitch Ratings you saved my butt a few times, and if I have semi-decent VBA coding skills, you're definitely to blame for that. Thank you!


    Quote Originally Posted by westconn1 View Post
    i was easily able to duplicate your problem

    if a field is enclosed in quotes they are stripped out, no problem, unless the quotes are required to be kept
    but if a field contains a quote within the text, the quote and the rest of that record are terminated, the next record is fine (up to any quote)
    That's pretty much what I found, too.

    Quote Originally Posted by westconn1 View Post
    i was easily able to duplicate your problem
    i found no solution within ADO

    the only simple solution, to keep using the TSV as a recordset, is to, first, open the file for i/o and replace all quotes with some other character, i did test, single quotes do not give the same problem
    D'oh! I was afraid it would come to that, but I was crossing my fingers there was some other way to side-step the problem than to do I/O before reading the file with ADO.


    Quote Originally Posted by westconn1 View Post
    i was easily able to duplicate your problem
    there is a registry setting, that can be changed
    Yeah, I came across that in my Googling on how to read TSV files (I've never worked with them before), but since the people I work with aren't tech savvy I guess I wasn't considering that. My bad!

  11. #11
    PowerPoster
    Join Date
    Oct 2010
    Posts
    2,141

    Re: [Access 2010] Quotes are Screwing Up Recordset of Tab Delimited Data

    Well after my earlier foot in keyboard moment , I did some digging and found this
    The Format option in Schema.ini specifies the format of the text file. The Text IISAM can read the format automatically from most character-delimited files. You can use any single character as a delimiter in the file except the double quotation mark (").
    Source: https://msdn.microsoft.com/en-us/lib...=vs.85%29.aspx
    What this does not say is, is that if you must have a double-quote between the delimiters then they must be double double-quote characters the same as you would do in your VBA code to insert a double-quote into a string constant.

    Warning! The following is meant as a joke.

    Since you appear adamant to use ADODB to accomplish this, there is hope.

    Second warning! This may make you sick.
    Read it line by line with ADODB and split the lines.
    Code:
       conn.Open "Provider=Microsoft.Jet.OLEDB.4.0;" & _
       "Extended Properties='Text;';" & _
       "Data Source='" & path & "';"
          
       Dim rs As ADODB.Recordset
       Set rs = New ADODB.Recordset
       
       rs.Open "Select * From [Tabbed.txt]", conn, adOpenForwardOnly, adLockReadOnly, adCmdText
       Dim i As Integer
       Dim fieldNames() As String
       Dim dataFields() As String
       
       While Not rs.EOF
          fieldNames = Split(rs(0), vbTab)
          rs.MoveNext
          While Not rs.EOF
             dataFields = Split(rs(0), vbTab)
             For i = 0 To UBound(fieldNames)
                Debug.Print fieldNames(i) & ";" & vbTab & dataFields(i)
             Next i
             rs.MoveNext
          Wend
       Wend
    The "Schema.ini" would look like this:
    Code:
    [Tabbed.txt]
    ColNameHeader=False
    MaxScanRows=0
    Format=FixedLength
    CharacterSet=OEM
    Rem set the Width to be longer than you need.
    Col1=F1 Text Width 1000

  12. #12

    Thread Starter
    Addicted Member
    Join Date
    Sep 2006
    Posts
    223

    Re: [Access 2010] Quotes are Screwing Up Recordset of Tab Delimited Data

    Hey TnTinMN!

    I think I see what you're doing. You're basically gulping the entire line (treating it as a fixed width data file) using an ADODB.Recordset and doing the splitting yourself via Split (and moving one past the header).

    I think maybe that's the best solution here. Preprocessing all the files would be really wasteful. So be it. Thanks for the idea!

  13. #13
    PowerPoster
    Join Date
    Oct 2010
    Posts
    2,141

    Re: [Access 2010] Quotes are Screwing Up Recordset of Tab Delimited Data

    Quote Originally Posted by caffeine View Post
    I think maybe that's the best solution here.
    You jest I hope.

    In Access VBA IDE, open the Help system (F1) and search for "Line Input" to see an example of reading a file. Then you can use the logic that I outlined to parse each line using the Split function.

  14. #14
    PowerPoster
    Join Date
    Dec 2004
    Posts
    25,618

    Re: [Access 2010] Quotes are Screwing Up Recordset of Tab Delimited Data

    depending on the size of the file

    Code:
    open "somepath\filename" for input as 1
    strfile = input(lof(1),#1)
    close 1
    open "somepath\filename" for output as 1     ' change filename not to overwrite original if desired
    print #1, replace(strfile, """", "'")     'replace quotes with single quote
    close 1
    of course the file may be too big for this simplistic solution, but you have not specified anywhere
    if there are multiple file to process, you could convert them all first, or in turn before opening as recordset
    i do my best to test code works before i post it, but sometimes am unable to do so for some reason, and usually say so if this is the case.
    Note code snippets posted are just that and do not include error handling that is required in real world applications, but avoid On Error Resume Next

    dim all variables as required as often i have done so elsewhere in my code but only posted the relevant part

    come back and mark your original post as resolved if your problem is fixed
    pete

Tags for this Thread

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  



Click Here to Expand Forum to Full Width