Results 1 to 7 of 7

Thread: [RESOLVED] Excel VBA IE scraper - read all rows in table?

  1. #1

    Thread Starter
    New Member
    Join Date
    Dec 2017
    Posts
    8

    Resolved [RESOLVED] Excel VBA IE scraper - read all rows in table?

    Hello,

    I have this code that works but only for one row or last row in table. How could it be modified to output all the rows in a table. I considered obtaining the number of rows shown on page and modify loop but all entries are just the last entry in table. Any help appreciated thanks
    Mike

    Code:
    Set the_input_elements = objIE.document.getElementsByTagName("td")
        For Each input_element In the_input_elements
            
          
            If input_element.getAttribute("headers") = "STREET_NO" Then
                Cells(i, 3) = input_element.innerText
            ElseIf input_element.getAttribute("headers") = "STREET_NAME" Then
                Cells(i, 4) = input_element.innerText
            ElseIf input_element.getAttribute("headers") = "POSTAL_CODE" Then
                Cells(i, 5) = input_element.innerText
            ElseIf input_element.getAttribute("headers") = "BUSINESS_NAME" Then
                Cells(i, 6) = input_element.innerText
            ElseIf input_element.getAttribute("headers") = "ACCOUNT_STATUS" Then
                Cells(i, 7) = input_element.innerText
            End If
           
        Next input_element

  2. #2
    PowerPoster
    Join Date
    Dec 2004
    Posts
    25,618

    Re: Excel VBA IE scraper - read all rows in table?

    not enough information, post the full htm code for the table, ideally you should be returning the "tr" elements, then reading the "td" elements from each "tr", but i still do not see why it only gets the last row unless each row overwrites the previous row,
    where are you incrementing i between rows?
    i do my best to test code works before i post it, but sometimes am unable to do so for some reason, and usually say so if this is the case.
    Note code snippets posted are just that and do not include error handling that is required in real world applications, but avoid On Error Resume Next

    dim all variables as required as often i have done so elsewhere in my code but only posted the relevant part

    come back and mark your original post as resolved if your problem is fixed
    pete

  3. #3

    Thread Starter
    New Member
    Join Date
    Dec 2017
    Posts
    8

    Re: Excel VBA IE scraper - read all rows in table?

    I am very new to this vba and IE still. I would think it would read every td tag? When I run a loop displaying everything in td it shows all addresses and obviously all the data.
    The number of rows in a typical table are usually one but this html code example below has 4 rows based on 4 dundas st address:
    I would like to pull the data as shown in my original code above which seems to pull only from the last row. Thank you!


    Code:
    <tbody>
    <tr style="vertical-align: top;"><th align="center" class="t13ReportHeader" id="OPEN">Open</th><th align="left" class="t13ReportHeader" id="CLIENT_TYPE">Customer Type</th><th align="left" class="t13ReportHeader" id="ACCOUNT_NO">Account No</th><th align="left" class="t13ReportHeader" id="BUSINESS_NAME">Business Name</th><th align="left" class="t13ReportHeader" id="GEO_ID">Geo ID</th><th align="right" class="t13ReportHeader" id="STREET_NO">ST #</th><th align="left" class="t13ReportHeader" id="SUFFIX">Suffix</th><th align="left" class="t13ReportHeader" id="STREET_NAME">Street Name</th><th align="left" class="t13ReportHeader" id="UNIT_NO">Unit No</th><th align="center" class="t13ReportHeader" id="POSTAL_CODE">Postal Code</th><th align="left" class="t13ReportHeader" id="START_DATE">Start Date</th><th align="left" class="t13ReportHeader" id="END_DATE">End Date</th><th align="left" class="t13ReportHeader" id="APPL_STATUS">Appl Status</th><th align="left" class="t13ReportHeader" id="ACCOUNT_STATUS">Acct Status</th></tr>
    
    <tr style="vertical-align: top;"><td align="center" class="t13dataalt" headers="OPEN"><a href="f?p=1275:601:9158715575684:UPDATE:NO:RP,601:P600_CUR_ID,P601_CLIENT_ID,P601_LOCATION_ID,P601_ACCOUNT_ID,P601_BUILDING_ID,P601_PREV_PAGE,P601_REQUEST:14423,14423,14045,14423,7146,600,UPDATE"><img alt="Open the customer info" src="/i5cfsb/e2.gif" border="0"></a></td><td class="t13dataalt" headers="CLIENT_TYPE">COMMERCIAL</td><td class="t13dataalt" headers="ACCOUNT_NO">-07</td><td class="t13dataalt" headers="BUSINESS_NAME">SE LTD</td><td class="t13dataalt" headers="GEO_ID">1498</td><td align="right" class="t13dataalt" headers="STREET_NO">4</td><td class="t13dataalt" headers="SUFFIX"><br></td><td class="t13dataalt" headers="STREET_NAME">DUNDAS ST </td><td class="t13dataalt" headers="UNIT_NO">    1</td><td class="t13dataalt" headers="POSTAL_CODE">M5</td><td class="t13dataalt" headers="START_DATE">27-NOV-2012</td><td class="t13dataalt" headers="END_DATE"><br></td><td class="t13dataalt" headers="APPL_STATUS">APPROVED</td><td class="t13dataalt" headers="ACCOUNT_STATUS">ACTIVE</td></tr>
    
    <tr style="vertical-align: top;"><td align="center" class="t13data" headers="OPEN"><a href="f?p=1275:601:9158715575684:UPDATE:NO:RP,601:P600_CUR_ID,P601_CLIENT_ID,P601_LOCATION_ID,P601_ACCOUNT_ID,P601_BUILDING_ID,P601_PREV_PAGE,P601_REQUEST:14454,14454,14450,14453,7146,600,UPDATE"><img alt="Open the customer info" src="/i5cfsb/e2.gif" border="0"></a></td><td class="t13data" headers="CLIENT_TYPE">COMMERCIAL</td><td class="t13data" headers="ACCOUNT_NO">--</td><td class="t13data" headers="BUSINESS_NAME">FOODN RESTAURANT</td><td class="t13data" headers="GEO_ID">4958</td><td align="right" class="t13data" headers="STREET_NO">4</td><td class="t13data" headers="SUFFIX"><br></td><td class="t13data" headers="STREET_NAME">DUNDAS ST</td><td class="t13data" headers="UNIT_NO"> BSMT</td><td class="t13data" headers="POSTAL_CODE">M5T </td><td class="t13data" headers="START_DATE">28-MAR-2011</td><td class="t13data" headers="END_DATE"><br></td><td class="t13data" headers="APPL_STATUS">APPROVED</td><td class="t13data" headers="ACCOUNT_STATUS">ACTIVE</td></tr>
    
    <tr style="vertical-align: top;"><td align="center" class="t13dataalt" headers="OPEN"><a href="f?p=1275:601:9158715575684:UPDATE:NO:RP,601:P600_CUR_ID,P601_CLIENT_ID,P601_LOCATION_ID,P601_ACCOUNT_ID,P601_BUILDING_ID,P601_PREV_PAGE,P601_REQUEST:15810,15810,7165,15698,7146,600,UPDATE"><img alt="Open the customer info" src="/i5cfsb/e2.gif" border="0"></a></td><td class="t13dataalt" headers="CLIENT_TYPE">COMMERCIAL</td><td class="t13dataalt" headers="ACCOUNT_NO">-06</td><td class="t13dataalt" headers="BUSINESS_NAME">NOODLES</td><td class="t13dataalt" headers="GEO_ID">104944</td><td align="right" class="t13dataalt" headers="STREET_NO">4</td><td class="t13dataalt" headers="SUFFIX"><br></td><td class="t13dataalt" headers="STREET_NAME">DUNDAS ST</td><td class="t13dataalt" headers="UNIT_NO"><br></td><td class="t13dataalt" headers="POSTAL_CODE">M5T</td><td class="t13dataalt" headers="START_DATE">10-JUL-2015</td><td class="t13dataalt" headers="END_DATE"><br></td><td class="t13dataalt" headers="APPL_STATUS">APPROVED</td><td class="t13dataalt" headers="ACCOUNT_STATUS">ACTIVE</td></tr>
    
    <tr style="vertical-align: top;"><td align="center" class="t13data" headers="OPEN"><a href="f?p=1275:601:9158715575684:UPDATE:NO:RP,601:P600_CUR_ID,P601_CLIENT_ID,P601_LOCATION_ID,P601_ACCOUNT_ID,P601_BUILDING_ID,P601_PREV_PAGE,P601_REQUEST:7291,7291,7165,7291,7146,600,UPDATE"><img alt="Open the customer info" src="/i5cfsb/e2.gif" border="0"></a></td><td class="t13data" headers="CLIENT_TYPE">COMMERCIAL</td><td class="t13data" headers="ACCOUNT_NO">-06</td><td class="t13data" headers="BUSINESS_NAME">UA</td><td class="t13data" headers="GEO_ID">10494458</td><td align="right" class="t13data" headers="STREET_NO">4</td><td class="t13data" headers="SUFFIX"><br></td><td class="t13data" headers="STREET_NAME">DUNDAS ST</td><td class="t13data" headers="UNIT_NO"><br></td><td class="t13data" headers="POSTAL_CODE">M5T </td><td class="t13data" headers="START_DATE">27-NOV-2001</td><td class="t13data" headers="END_DATE">09-JUL-2015</td><td class="t13data" headers="APPL_STATUS">APPROVED</td><td class="t13data" headers="ACCOUNT_STATUS">TERMINATED</td></tr>

  4. #4
    PowerPoster
    Join Date
    Nov 2017
    Posts
    3,630

    Re: Excel VBA IE scraper - read all rows in table?

    Quoting for emphasis:

    Quote Originally Posted by westconn1 View Post
    where are you incrementing i between rows?
    In your code you don't have any reference to what the variable i is or what value it has initially, but this should work:

    Code:
        Set the_input_elements = objIE.document.getElementsByTagName("td")
        i=1  'Adjust this value if the starting row needs to be something else
        For Each input_element In the_input_elements
            If input_element.getAttribute("headers") = "STREET_NO" Then
                Cells(i, 3) = input_element.innerText
            ElseIf input_element.getAttribute("headers") = "STREET_NAME" Then
                Cells(i, 4) = input_element.innerText
            ElseIf input_element.getAttribute("headers") = "POSTAL_CODE" Then
                Cells(i, 5) = input_element.innerText
            ElseIf input_element.getAttribute("headers") = "BUSINESS_NAME" Then
                Cells(i, 6) = input_element.innerText
            ElseIf input_element.getAttribute("headers") = "ACCOUNT_STATUS" Then
                Cells(i, 7) = input_element.innerText
            End If
            i=i+1
        Next input_element

  5. #5

    Thread Starter
    New Member
    Join Date
    Dec 2017
    Posts
    8

    Re: Excel VBA IE scraper - read all rows in table?

    Sorry, missed that part of the answer. Thanks for the prompt response.
    the variable i is in a for/next loop that reads addresses from columns in excel to IE which then pulls a table (html above)


    Code:
    For i = 3 To lastrowA ' reads addresses from column in excel to IE which then pulls a table (html above)
    
    
    Set the_input_elements = objIE.document.getElementsByTagName("td")
        For Each input_element In the_input_elements
            
          
            If input_element.getAttribute("headers") = "STREET_NO" Then
                Cells(i, 7) = input_element.innerText
                   
            ElseIf input_element.getAttribute("headers") = "STREET_NAME" Then
                Cells(i, 8) = input_element.innerText
                ElseIf input_element.getAttribute("headers") = "UNIT_NO" Then
                Cells(i, 9) = input_element.innerText
            ElseIf input_element.getAttribute("headers") = "POSTAL_CODE" Then
                Cells(i, 10) = input_element.innerText
            ElseIf input_element.getAttribute("headers") = "BUSINESS_NAME" Then
                Cells(i, 4) = input_element.innerText
            ElseIf input_element.getAttribute("headers") = "APPL_STATUS" Then
                Cells(i, 6) = input_element.innerText
            ElseIf input_element.getAttribute("headers") = "ACCOUNT_STATUS" Then
                Cells(i, 5) = input_element.innerText
            End If
        
        Next input_element

  6. #6
    PowerPoster
    Join Date
    Nov 2017
    Posts
    3,630

    Re: Excel VBA IE scraper - read all rows in table?

    Quote Originally Posted by mikethebike View Post
    Sorry, missed that part of the answer. Thanks for the prompt response.
    the variable i is in a for/next loop that reads addresses from columns in excel to IE which then pulls a table (html above)


    Code:
    For i = 3 To lastrowA ' reads addresses from column in excel to IE which then pulls a table (html above)
    
    
    Set the_input_elements = objIE.document.getElementsByTagName("td")
        For Each input_element In the_input_elements
            
          
            If input_element.getAttribute("headers") = "STREET_NO" Then
                Cells(i, 7) = input_element.innerText
                   
            ElseIf input_element.getAttribute("headers") = "STREET_NAME" Then
                Cells(i, 8) = input_element.innerText
                ElseIf input_element.getAttribute("headers") = "UNIT_NO" Then
                Cells(i, 9) = input_element.innerText
            ElseIf input_element.getAttribute("headers") = "POSTAL_CODE" Then
                Cells(i, 10) = input_element.innerText
            ElseIf input_element.getAttribute("headers") = "BUSINESS_NAME" Then
                Cells(i, 4) = input_element.innerText
            ElseIf input_element.getAttribute("headers") = "APPL_STATUS" Then
                Cells(i, 6) = input_element.innerText
            ElseIf input_element.getAttribute("headers") = "ACCOUNT_STATUS" Then
                Cells(i, 5) = input_element.innerText
            End If
        
        Next input_element
    Yeah, so work through your code here. First pass of outer loop (for i = ...), i = 3. And, lets give the_input_elements three "td" sections.

    So for the first pass of the inner loop (For Each input_element ...), it displays the specific attributes from the first input_element into the defined cells in Row 3, since i=3. It then displays (and overwrites the attributes from the first input_element) the specific attributes from the second input_element into the defined cells in Row 3, since i=3. It then displays (and overwrites the attributes from the second input_element) the specific attributes from the third input_element into the defined cells in Row 3, since i=3.

    I trust you see the problem here. i isn't changing inside the inner loop, so data from each input_element overwrites the previous, until there are no input_elements left in the_input_elements, resulting in the "last" input_element being displayed, which is what you are reporting.

  7. #7

    Thread Starter
    New Member
    Join Date
    Dec 2017
    Posts
    8

    Re: Excel VBA IE scraper - read all rows in table?

    Thank you!!!
    It was so obvious but couldn't see it. I put the j=j+1 in the if statemnt

Tags for this Thread

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  



Click Here to Expand Forum to Full Width