
Thread: [RESOLVED] seriously wacky left() behaviour

  1. #1

    Thread Starter
    Fanatic Member
    Join Date
    Dec 2007
    Location
    West Yorkshire, UK
    Posts
    791

    [RESOLVED] seriously wacky left() behaviour

    I have a program that takes the HTML from a web page, strips the formatting and leaves a list of conjugations for a verb. A sample of the result is here:
    Barrer : sweep
    *present indicative
    Yo barro
    Tú barres
    Él/usted barre
    Nosotros barremos
    Vosotros barréis
    Ellos/ustedes barren
    *Imperfect:
    Yo barría
    Tú barrías
    Él/usted barría
    Nosotros barríamos
    Vosotros barríais
    Ellos/ustedes barrían
    *preterite:
    Yo barrí
    Etc . . .
    etc . . .

    I inserted the "*"s to mark the header lines, because I want to skip those lines when I write the list to the database.
    It starts off fine with the first record, "Barrer : sweep".
    For the second record (the first one that needs to be ignored), iPos is 1, sFirst is "*" and it executes the code in Case "*".
    So far so good.
    It also works fine for the next 6 lines, which I need, and adds them to the database record.
    The problem comes with "*Imperfect:":
    iPos = 3, and both sItem(iLoopCtr) and sFirst appear to be "".
    The same happens with every later line that starts with "*": iPos = 3 for all of them.
    Why 3?
    What am I missing?
    Please help, this is driving me batty!
    Code:
    Option Explicit
    Private cn As ADODB.Connection
    Private rs As ADODB.Recordset
    Private sHTML As String
    Private sItem() As String
    Code:
    Private Sub LoadDatabase()
    Dim strConn As String
    Dim iBasePtr As Integer
    Dim iLoopCtr As Integer
    Dim sFirst As String
    Dim iPos As Integer
        Set cn = New ADODB.Connection
    '    strConn = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\Users\hp\Documents\VBProgs\Get From Web\SpanishVerbs.mdb;Persist Security Info=False"
        cn.Open "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\Users\hp\Documents\VBProgs\Get From Web\SpanishVerbs.mdb;Persist Security Info=False"
    
        Set rs = New ADODB.Recordset
        rs.Open "verblist", cn, adOpenKeyset, adLockPessimistic, adCmdTable
        
        rs.AddNew 'adding new record
    'here we need two counts so that when we encounter a field with a * at the beginning
    'it doesn't add it to the database
    
        iBasePtr = 1
        For iLoopCtr = 0 To UBound(sItem)
            iPos = InStr(1, sItem(iLoopCtr), "*", vbTextCompare)
            If iPos > 1 Then
                sItem(iLoopCtr) = Left(sItem(iLoopCtr), iPos - 1)
            End If
            sFirst = Left(sItem(iLoopCtr), 1)
            Select Case sFirst
                Case "*"
                    iBasePtr = iBasePtr     'do nothing and DON'T increase iBasePtr
                Case Else
                    rs.Fields(iBasePtr).Value = sItem(iLoopCtr)
                    iBasePtr = iBasePtr + 1
            End Select
        Next iLoopCtr
        rs.Update 'this updates the recordset
        rs.Close
        Set rs = Nothing
        Set cn = Nothing
    End Sub

  2. #2
    VB6, XHTML & CSS hobbyist Merri's Avatar
    Join Date
    Oct 2002
    Location
    Finland
    Posts
    6,654

    Re: seriously wacky left() behaviour

    Where do you fill sItem? I see no code assigning anything to it, and the problem is very likely in the code that parses the HTML: there may be an extra character, such as a newline, that breaks your code.

    As for InStr with vbTextCompare, it is unnecessary when you are looking for a special character such as "*". Use vbBinaryCompare instead; it is a lot faster.
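
    A minimal sketch of that change, assuming the same variables as in LoadDatabase. StarPos is a hypothetical helper name, not something from the original program:

    ```vb
    ' Hypothetical helper, not part of the original program.
    ' Returns the 1-based position of the first "*" in sLine, or 0 if there is none.
    Private Function StarPos(ByVal sLine As String) As Long
        ' A binary comparison matches "*" exactly and skips the
        ' case-insensitive text matching that vbTextCompare performs.
        StarPos = InStr(1, sLine, "*", vbBinaryCompare)
    End Function
    ```

    Inside the loop this would be called as iPos = StarPos(sItem(iLoopCtr)).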

  3. #3

    Thread Starter
    Fanatic Member
    Join Date
    Dec 2007
    Location
    West Yorkshire, UK
    Posts
    791

    Re: seriously wacky left() behaviour

    Sorry, the code that parses the HTML is below. I can't see how it could be pulling in a newline, because I also write the list to a listbox and it all looks fine in there.
    Code:
    Private Sub GetList()
    'Tags 1 and 2 mark the text at the beginning and end of the area that I want
    Const Tag1 = "<CENTER><FONT color=#0033ff size=4><STRONG>"
    Const Tag2 = "<CENTER><STRONG><FONT color=#ff0000>"
    Dim i As Long, sWhole As String
    
        sItem = Split(sHTML, Tag1)      'split the string based on the start of the block
        For i = 1 To UBound(sItem)      '-- (i - 1) is used to remove the first item (0)
                                        '- effectively chops off the start of the string
                                        'using this method allows for more than one instance of the required string
            sItem(i - 1) = Split(sItem(i), Tag2, 2)(0)
        Next
        ReDim Preserve sItem(UBound(sItem) - 1) '-- one item short to effectively remove the last part of the string
        sWhole = sItem(0)               'We only get one instance for this application
        sWhole = Replace(sWhole, "<BR><BR>", "<BR>")
        sWhole = Replace(sWhole, "<STRONG><FONT color=#ff0000>", "")
        sWhole = Replace(sWhole, "</STRONG>", "")
        sWhole = Replace(sWhole, "<STRONG>", "*")   'used to mark the headers which will be ignored
        sWhole = Replace(sWhole, "</FONT>", "")
        sWhole = Replace(sWhole, "<FONT color=#0033ff size=4>", "")
        sWhole = Replace(sWhole, "</CENTER>", "")
        sWhole = Replace(sWhole, "</TD>", "")
        sWhole = Replace(sWhole, "</TR>", "")
        sWhole = Replace(sWhole, "<TR>", "")
        sWhole = Replace(sWhole, "<TD 50%?? 2??>", "")
        sWhole = Replace(sWhole, "<TD 25%??>", "")
        sWhole = Replace(sWhole, "</TABLE>", "")
        sWhole = Replace(sWhole, "</DIV>", "")
        sWhole = Replace(sWhole, "</TBODY>", "")
        sWhole = Replace(sWhole, "<BR>", "$")       'put a placemarker in between the lines
    
        sItem = Split(sWhole, "$")                  'then split it up
        ReDim Preserve sItem(UBound(sItem))
        
        For i = 0 To UBound(sItem)
            If i = 1 Then                   'it's complicated format info. Easier to ignore it
                List1.AddItem "*present indicative"
                sItem(i) = "*present indicative"
            Else
                List1.AddItem sItem(i)      'Add it to the list
            End If
        Next
        
        LoadDatabase
    End Sub
    Last edited by Españolita; May 28th, 2009 at 02:23 PM.

  4. #4
    VB-aholic & Lovin' It LaVolpe's Avatar
    Join Date
    Oct 2007
    Location
    Beside Waldo
    Posts
    19,541

    Re: seriously wacky left() behaviour

    I don't see the problem right away, but if you are curious what the first two characters are when iPos = 3:
    Code:
    If iPos = 3 Then
        Debug.Print Asc(Left$(sItem(iLoopCtr),1)), Asc(Mid$(sItem(iLoopCtr),2,1))
    End If
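
    The same idea can be extended to every character in the string. This is only a sketch, and DumpCharCodes is a hypothetical helper name:

    ```vb
    ' Hypothetical helper: prints the position and character code of every
    ' character in sText to the Immediate window, so that invisible characters
    ' such as 13 (CR) and 10 (LF) become visible.
    Private Sub DumpCharCodes(ByVal sText As String)
        Dim i As Long
        ' The loop body never runs for an empty string, which avoids the
        ' run-time error that Asc/AscW raise when given "".
        For i = 1 To Len(sText)
            Debug.Print i, AscW(Mid$(sText, i, 1))
        Next i
    End Sub
    ```

    Calling DumpCharCodes sItem(iLoopCtr) at the top of the loop would show exactly what each array element starts with.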
    Insomnia is just a byproduct of, "It can't be done"

    Classics Enthusiast? Here's my 1969 Mustang Mach I Fastback. Her sister '67 Coupe has been adopted

    Newbie? Novice? Bored? Spend a few minutes browsing the FAQ section of the forum.
    Read the HitchHiker's Guide to Getting Help on the Forums.
    Here is the list of TAGs you can use to format your posts
    Here are VB6 Help Files online


    {Alpha Image Control} {Memory Leak FAQ} {Unicode Open/Save Dialog} {Resource Image Viewer/Extractor}
    {VB and DPI Tutorial} {Manifest Creator} {UserControl Button Template} {stdPicture Render Usage}

  5. #5

    Thread Starter
    Fanatic Member
    Join Date
    Dec 2007
    Location
    West Yorkshire, UK
    Posts
    791

    Re: seriously wacky left() behaviour

    Hi LaVolpe,
    the first two characters are 13 & 10 every time.

    *scratches head* Isn't that CR followed by LF, i.e. a newline?
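
    A short trace of why those two characters give iPos = 3 and an apparently empty sFirst. The sample string is an assumption about what one array element holds; the rest mirrors the loop in LoadDatabase:

    ```vb
    Private Sub TraceLeadingCrLf()
        Dim s As String
        Dim iPos As Integer

        ' Assumed content of one array element: a CR/LF pair ahead of the header text.
        s = vbCrLf & "*Imperfect:"

        iPos = InStr(1, s, "*", vbBinaryCompare)
        Debug.Print iPos                    ' 3: the "*" is the third character

        ' iPos > 1, so the loop in LoadDatabase keeps only the text before the "*".
        s = Left$(s, iPos - 1)
        Debug.Print Len(s)                  ' 2: only the CR and LF remain

        ' The first character is now CR, which is not "*" and has no visible glyph.
        Debug.Print Asc(Left$(s, 1))        ' 13
    End Sub
    ```

    With sFirst holding a CR, the Select Case falls through to Case Else and the CR/LF pair is written to the field. The "*present indicative" line escapes this only because GetList overwrites that element with a clean string.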
    Last edited by Españolita; May 28th, 2009 at 02:57 PM.

  6. #6

    Thread Starter
    Fanatic Member
    Join Date
    Dec 2007
    Location
    West Yorkshire, UK
    Posts
    791

    Re: seriously wacky left() behaviour

    Well, that's sorted that out.
    I used Replace on those two characters and it solved the problem.

    So just out of interest, why didn't they show up when I put the lines in the listbox?

  7. #7
    VB-aholic & Lovin' It LaVolpe's Avatar
    Join Date
    Oct 2007
    Location
    Beside Waldo
    Posts
    19,541

    Re: seriously wacky left() behaviour

    13 and 10 equate to vbCrLf (or vbNewLine). That would explain things, I think. You may want to strip those characters out of your sHTML string before you start processing it.

    Edited: I see we posted at about the same time. Glad you resolved it. ListBoxes do not render carriage returns as line breaks, but they should have displayed two vertical bars, one for each character (13 and 10).
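
    One way to strip those characters up front, as a sketch only. StripLineBreaks is a hypothetical helper name, and it assumes no line break inside the page is meaningful to the parsing that follows:

    ```vb
    ' Hypothetical helper: removes every carriage return and line feed from sText.
    Private Function StripLineBreaks(ByVal sText As String) As String
        ' CR and LF are handled separately so that a lone CR or a lone LF
        ' is removed as well as the usual CR/LF pair.
        sText = Replace(sText, vbCr, "")
        sText = Replace(sText, vbLf, "")
        StripLineBreaks = sText
    End Function
    ```

    In GetList this could run as sHTML = StripLineBreaks(sHTML) before the first Split, so that no array element can begin with a hidden CR/LF.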
