Thread: GET icon from web

  1. #1

    Thread Starter
    Frenzied Member
    Join Date
    Mar 2005
    Location
    Italy-Napoli
    Posts
    1,944

    GET icon from web

    my code:

    url: https://www.tuttitalia.it/banche/classifica/1/

    Code:
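    Sub ExtractDetails(doc)
    
        Dim R As Long, TS As Object, COLONNE As Long, RIGHE As Long
        Dim BANCA As String, ABI As String, NRAG As Long
    
        Set TS = doc.getElementsByTagName("table")
    
        ' The bank ranking is the third table on the page (index 2)
        COLONNE = TS(2).rows(1).cells.length
        RIGHE = TS(2).rows.length
    
        ' Start at 1 to skip the first row of the table
        For R = 1 To RIGHE - 1
            ' Bank name: upper-cased, taken from character 7 of the first cell
            BANCA = Trim$(Mid$(UCase$(TS(2).rows(R).cells(0).innerText), 7, 60))
            ABI = TS(2).rows(R).cells(1).innerText
            NRAG = CLng(TS(2).rows(R).cells(2).innerText)
        Next R
    
    End Sub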
    How do I get the icon from each bank row and store it in an Access table?

    For an example, see the attached image.

    Note: not all banks have an icon.
    Attached Images

  2. #2

    Thread Starter
    Frenzied Member
    Join Date
    Mar 2005
    Location
    Italy-Napoli
    Posts
    1,944

    Re: GET icon from web

    Sorry for the bump.

  3. #3
    PowerPoster
    Join Date
    Feb 2006
    Posts
    20,843

    Re: GET icon from web

    Not many people will help with web scraping any more.

    Unless a web site explicitly says "go ahead and scrape our data" the assumption is that you can't. Not even a copyright notice is required. This is like finding someone's door unlocked and presuming you can go in and take or do anything you want. You can't.

    Many sites get hit and either lose intellectual property, lose data sales (many offer bulk data access APIs they charge for), or suffer excessive bot-driven load on their web servers. As a result they may take steps to block access or obfuscate their pages. A popular trick today is to put only a template in the HTML itself along with script that loads data from separate URLs to populate the template page. Sometimes it has many layers, for example only a bootstrap script on the page that downloads and runs additional script to fetch data and populate the template.

    These days most web secretaries just follow "recipes" anyway, often based on giant balls of open sores JavaScript "libraries." They don't even realize (or care) what a huge pile of poo their web server is excreting or how bad it smells. Some may even be rewarded for heavy diapers by clueless "project manager" types (i.e. somebody's niece or nephew with all the professional credentials of a strip mall shoe salesman whose own family can't trust the idiot with bus fare).

    But let's assume this site you are targeting doesn't care and has not taken any defensive measures.


    Then you start by downloading the raw HTML and examining it by eye to try to find patterns that will allow you to write a parser to extract the desired information. And then you write the parser.
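
    As a rough sketch of that first step, something like the following could pull the raw HTML and dump whatever image URL it finds in each table row, so you can eyeball the pattern before writing a real parser. Everything here is an assumption rather than anything known about the site: the function names FetchHtml and ListIconUrls are made up, the ranking is taken to be the third table (index 2) as in the original ExtractDetails, and any icon is taken to be an img element somewhere in the row. It is late-bound, so no project references are needed.

    ```vb
    Option Explicit

    ' Hypothetical helper: download a page synchronously and return its text.
    Function FetchHtml(ByVal Url As String) As String
        Dim Http As Object
        Set Http = CreateObject("MSXML2.XMLHTTP.6.0")
        Http.Open "GET", Url, False          ' False = wait for the response
        Http.send
        If Http.Status = 200 Then FetchHtml = Http.responseText
    End Function

    ' Hypothetical helper: print the first <img> src found in each data row.
    Sub ListIconUrls()
        Dim Doc As Object, Tbl As Object, Imgs As Object
        Dim R As Long

        Set Doc = CreateObject("htmlfile")
        Doc.write FetchHtml("https://www.tuttitalia.it/banche/classifica/1/")
        Doc.Close

        ' Assumption: the ranking is the third table on the page.
        Set Tbl = Doc.getElementsByTagName("table").Item(2)
        If Tbl Is Nothing Then Exit Sub

        For R = 1 To Tbl.rows.length - 1
            Set Imgs = Tbl.rows(R).getElementsByTagName("img")
            If Imgs.length > 0 Then
                Debug.Print R, Imgs.Item(0).getAttribute("src")
            Else
                Debug.Print R, "(no icon)"   ' not every bank has one
            End If
        Next R
    End Sub
    ```

    Run ListIconUrls and read the Immediate window. The printed values may be relative paths or otherwise need cleaning up before they can be downloaded, and if the table is filled in by script after the page loads, this will find nothing at all.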

    HTML is ugly and often very poorly constructed. These days there is a lot of UTF-8 out there, and working around the problems that brings is not as trivial as one might hope. There might be just one crappy misplaced character in a web page (usually junk like and other misanthropic variations on the alphabet) and you can end up chasing your tail for days trying to clean it up.

    That particular character can end up as three cruddy characters when you translate from UTF-8 to a human encoding like the Unicode we use in VB6, two of which display as "?" just to help make things clumsy.
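
    One common way around that, sketched here under the assumption that the server really is sending UTF-8, is to take the raw response bytes and let ADODB.Stream do the decoding instead of trusting whatever conversion you get by default. The function name Utf8BytesToString is made up for the example.

    ```vb
    Option Explicit

    ' Hypothetical helper: decode a UTF-8 byte array into a VB String.
    ' Bytes is a Variant holding a Byte array, e.g. an XMLHTTP responseBody.
    Function Utf8BytesToString(ByVal Bytes As Variant) As String
        Dim Stm As Object
        Set Stm = CreateObject("ADODB.Stream")

        Stm.Type = 1                ' adTypeBinary: accept the raw bytes
        Stm.Open
        Stm.Write Bytes

        Stm.Position = 0            ' rewind before switching to text mode
        Stm.Type = 2                ' adTypeText
        Stm.Charset = "utf-8"       ' assumption: the page is UTF-8
        Utf8BytesToString = Stm.ReadText

        Stm.Close
    End Function
    ```

    With an XMLHTTP request object that has already completed, usage would look like Html = Utf8BytesToString(Http.responseBody). It does nothing for pages that claim one encoding and deliver another, which is where the tail chasing starts.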

    Basically there is a ton of investigation, coding around special cases, and trial and error involved. I'm not sure how anyone can help you with that aside from doing all of the work for you.


    But let's say you find some sucker here with low enough morals not to question web scraping who will do the entire thing for you. Perhaps after more begging and cajoling you get exactly the perfect shiny results you want.

    Two days later the web site goes through a redesign. Or they change something trivial to a human reader, like some extra empty tag gets added somewhere. Or maybe they correct an error. Or one day the data is just a little different from what was there last week.

    Poof! Your scrape-bot no longer works.

  4. #4
    Hyperactive Member
    Join Date
    Mar 2018
    Posts
    369

    Re: GET icon from web

    Quote: Originally Posted by dilettante
    Not many people will help with web scraping any more.

    Unless a web site explicitly says "go ahead and scrape our data" the assumption is that you can't.
    This assumption is being challenged in court, and judges are ruling in favor of the scrapers:
    https://www.eff.org/deeplinks/2018/0...computer-crime

  5. #5
    PowerPoster
    Join Date
    Feb 2006
    Posts
    20,843

    Re: GET icon from web

    I guess I was thinking of my personal assumption.

    I had a contract employer a few years back who had their data swiped this way after every daily update. They battled it for a while and decided to just pull the web pages and quit providing any free access at all.

    I can't imagine any court deciding that unlocked doors with no Keep Out signs are fair game though.
