
Thread: [RESOLVED] Download a web page into a text file

  1. #1

    Thread Starter
    Addicted Member
    Join Date
    Sep 2005
    Posts
    177

    [RESOLVED] Download a web page into a text file

    Hello !

    In an application I declare:

    Code:
    Public Declare Function URLDownloadToFile Lib "urlmon" _
        Alias "URLDownloadToFileA" (ByVal pCaller As Long, _
        ByVal szURL As String, ByVal szFileName As String, _
        ByVal dwReserved As Long, ByVal lpfnCB As Long) As Long
    use the function:

    Code:
    Public Function DownloadFile(URL As String, LocalFilename As String) As Boolean
        Dim lngRetVal As Long
        lngRetVal = URLDownloadToFile(0, URL, LocalFilename, 0, 0)
        If lngRetVal = 0 Then DownloadFile = True
    End Function
    and:

    Code:
    URLDownloadToFile 0, "https://some_url.com", "TextFile.txt", 0, 0
    .....
    .....
    to download a fairly small web page into a text file.

    When I start the application for the first time, often nothing happens (nothing is downloaded).

    But if I first open the same website manually in a web browser, close it, and then re-run the application, everything works as it should, on every attempt!

    At least that's the case in design mode...

    What can I add to the application to make it download the web page on the first start, without having to open the website in a browser first?

    Best regards and thanks in advance.

    /Kalle

  2. #2
    PowerPoster
    Join Date
    Jul 2010
    Location
    NYC
    Posts
    7,667

    Re: Download a web page into a text file

    Try the BINDF_GETNEWESTVERSION flag.

    If that doesn't work, the function I usually use for simple http text grabs is this:

    Code:
    ' The API returns a 32-bit BOOL, so the return type is declared As Long
    Public Declare Function InternetCloseHandle Lib "wininet.dll" (ByVal hInet As Long) As Long
    
    Public Declare Function InternetOpen Lib "wininet.dll" Alias "InternetOpenA" _
                                                                                (ByVal lpszAgent As String, _
                                                                                ByVal dwAccessType As Long, _
                                                                                ByVal lpszProxyName As String, _
                                                                                ByVal lpszProxyBypass As String, _
                                                                                ByVal dwFlags As Long) As Long
    
    Public Declare Function InternetOpenUrl Lib "wininet.dll" Alias "InternetOpenUrlA" (ByVal hInternetSession As Long, _
                                                                                        ByVal lpszUrl As String, _
                                                                                        ByVal lpszHeaders As String, _
                                                                                        ByVal dwHeadersLength As Long, _
                                                                                        ByVal dwFlags As Long, _
                                                                                        ByVal dwContext As Long) As Long
    
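    Public Declare Function InternetReadFile Lib "wininet.dll" (ByVal hFile As Long, _
                                                                ByVal lpBuffer As String, _
                                                                ByVal dwNumberOfBytesToRead As Long, _
                                                                lNumberOfBytesRead As Long) As Long
    
    ' Constants used below (values from wininet.h)
    Private Const INTERNET_OPEN_TYPE_PRECONFIG As Long = 0
    Private Const INTERNET_FLAG_EXISTING_CONNECT As Long = &H20000000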
    
    
    Private Sub CopyURLToFile(ByVal URL As String, ByVal sFilename As String)
        Dim hInternetSession As Long
        Dim hUrl As Long
        Dim hFile As Integer
        Dim hr As Boolean
        Dim lBytes As Long
        Dim sBuffer As String
        Dim fileIsOpen As Boolean
    
        On Error GoTo e0
    
        If (Len(URL) = 0) Or (Len(sFilename) = 0) Then Exit Sub
    
        hInternetSession = InternetOpen(App.EXEName, INTERNET_OPEN_TYPE_PRECONFIG, "", "", 0)
        If hInternetSession = 0 Then Err.Raise vbObjectError + 1000, , _
            "An error occurred calling InternetOpen function"
    
        hUrl = InternetOpenUrl(hInternetSession, URL, vbNullString, 0, _
            INTERNET_FLAG_EXISTING_CONNECT, 0)
        If hUrl = 0 Then Err.Raise vbObjectError + 1000, , _
            "An error occurred calling InternetOpenUrl function"
    
        On Error Resume Next
        Kill sFilename
    
        On Error GoTo e0
        
        hFile = FreeFile
        Open sFilename For Binary As hFile
        fileIsOpen = True
    
        sBuffer = Space(4096)
        
        Do
            hr = InternetReadFile(hUrl, sBuffer, Len(sBuffer), lBytes)
            If lBytes = 0 Or Not hr Then Exit Do
            Put #hFile, , Left$(sBuffer, lBytes)
        Loop
    
    e0:
        If fileIsOpen Then Close #hFile   ' only close the file if it was actually opened
        If hUrl Then InternetCloseHandle hUrl
        If hInternetSession Then InternetCloseHandle hInternetSession
        If Err Then
            Debug.Print "CopyURLToFile.Error->" & Err.Description & " (" & Err.Number & ")"
        End If
    End Sub

  3. #3

    Thread Starter
    Addicted Member
    Join Date
    Sep 2005
    Posts
    177

    Re: Download a web page into a text file

    Thanks fafalone !

    I've tested different variations of your suggestion!

    The problem seems to be 'InternetOpenUrl', which mostly returns 0 but, for unknown reasons, suddenly returns an eight-digit number, which means that contact has been established with the URL and the application can continue.
    Repeated stops and runs then keep succeeding, each with a different eight-digit number, as long as I don't unload my program.

    But if I close the program (unload it, still in design mode) and then load it and start again, the result is back to 0, until it suddenly and magically becomes high (nonzero) again after a number of attempts...

    I can't see why this is happening, because I don't change any parameters! I want this line to work every time, right from the start!

    Maybe it depends on which URL I try to connect to?
    (It makes a big difference if I try "http://www.microsoft.com"! There it works from the very start and keeps working...)

    I'm trying to look up different IP addresses at "https://www.db-ip.com/IP".


    Do you have any explanation?
    Why does it stay high for the rest of the session?
    Why does it stay at zero for so many attempts at the start?

    /Kalle

  4. #4
    PowerPoster
    Join Date
    Dec 2004
    Posts
    25,618

    Re: Download a web page into a text file

    try deleting the cache entry for the url first
    i do my best to test code works before i post it, but sometimes am unable to do so for some reason, and usually say so if this is the case.
    Note code snippets posted are just that and do not include error handling that is required in real world applications, but avoid On Error Resume Next

    dim all variables as required as often i have done so elsewhere in my code but only posted the relevant part

    come back and mark your original post as resolved if your problem is fixed
    pete

  5. #5

    Thread Starter
    Addicted Member
    Join Date
    Sep 2005
    Posts
    177

    Re: Download a web page into a text file

    Quote Originally Posted by westconn1
        try deleting the cache entry for the url first

    Yes! I sure will.

    But how? Programmatically, or...?

    How would that help solve the problem?

  6. #6
    PowerPoster
    Join Date
    Jun 2001
    Location
    Trafalgar, IN
    Posts
    4,141

    Re: Download a web page into a text file

    I see in your sample code the URL is https. Do you need to log into the site before accessing content? If so this may be a problem with your approach.

  7. #7

    Thread Starter
    Addicted Member
    Join Date
    Sep 2005
    Posts
    177

    Re: Download a web page into a text file

    Quote Originally Posted by MarkT
        I see in your sample code the URL is https. Do you need to log into the site before accessing content? If so this may be a problem with your approach.

    The website is reachable both with and without the "s" after "http", but the "s" is added automatically when the page opens.

    Just to try (again!):

    I opened the site with my web browser (Mozilla Firefox).
    Then I used my program once again to open the same site, and it failed, until I closed the web browser. Then the program worked!
    I opened the web browser at the site again, and the program kept working.
    I closed the VB design environment and the browser.

    I started the VB design environment. Result = 0. I opened the site again with the web browser. Result = 0, even after repeatedly closing the web browser...

    I wrote those last lines, tried again, and the result was a high eight-digit number...

    There is no pattern to rely on...

    /Kalle

  8. #8
    LaVolpe
    VB-aholic & Lovin' It
    Join Date
    Oct 2007
    Location
    Beside Waldo
    Posts
    19,541

    Re: Download a web page into a text file

    What about accessing very common sites, i.e., google, cnn, foxnews, etc? Same issues?

    If not, maybe that site is restricting how fast you can access it between hits, or how many simultaneous connections it allows from the same IP? Just a guess. Also, the InternetOpenUrl function has several optional flags.

    Ensure you are closing all handles created by those APIs before you exit your routine.
    Insomnia is just a byproduct of, "It can't be done"

    Classics Enthusiast? Here's my 1969 Mustang Mach I Fastback. Her sister '67 Coupe has been adopted

    Newbie? Novice? Bored? Spend a few minutes browsing the FAQ section of the forum.
    Read the HitchHiker's Guide to Getting Help on the Forums.
    Here is the list of TAGs you can use to format your posts
    Here are VB6 Help Files online


    {Alpha Image Control} {Memory Leak FAQ} {Unicode Open/Save Dialog} {Resource Image Viewer/Extractor}
    {VB and DPI Tutorial} {Manifest Creator} {UserControl Button Template} {stdPicture Render Usage}

  9. #9

    Thread Starter
    Addicted Member
    Join Date
    Sep 2005
    Posts
    177

    Re: Download a web page into a text file

    Quote Originally Posted by LaVolpe
        What about accessing very common sites, i.e., google, cnn, foxnews, etc? Same issues?

        If not, maybe that site is restricting how fast you can access it between hits or how many simultaneous connections to the site from the same IP? Just a guess. Also that InternetOpenURL function has several optional flags

        Ensure you are closing all handles created by those APIs before you exit your routine.

    As I wrote earlier: it makes a big difference if I try "http://www.microsoft.com"! There it works from the very start and keeps working...

    Also, when in "high eight-digit mode" I can access this site as many times as I want, and it fails just as reliably when not in that mode...

    I'm still in design mode in VB, so it's not actually a finished program. Therefore I've put a breakpoint just after the call to 'InternetOpenUrl', so that the program halts.
    After checking the result of 'InternetOpenUrl', I end the program. So I think all handles are closed then...

    Am I wrong somehow?

    I will take a closer look at the flags you mention. Any in particular?
    Last edited by Karl-Erik; Jan 27th, 2015 at 12:46 PM. Reason: complete

  10. #10
    LaVolpe
    VB-aholic & Lovin' It
    Join Date
    Oct 2007
    Location
    Beside Waldo
    Posts
    19,541

    Re: Download a web page into a text file

    I don't recommend ending it after that check. For debugging, you can add a couple of lines that jump to the clean-up routine (the If True = True block below). The modifications below use the sample code in post #2 above.
    Code:
    ...
        hUrl = InternetOpenUrl(hInternetSession, URL, vbNullString, 0, _
            INTERNET_FLAG_EXISTING_CONNECT, 0)
    
        If True = True Then   ' debugging only. When not needed, either remove IF/EndIF code or change 1 True to False
            Stop
            GoTo e0
        End If
    
    ...
    
    e0:
        If fileIsOpen Then Close #hFile   ' the file is not open yet when jumping here from the debug block
        If hUrl Then InternetCloseHandle hUrl
        If hInternetSession Then InternetCloseHandle hInternetSession
        If Err Then
            Debug.Print "CopyURLToFile.Error->" & Err.Description & " (" & Err.Number & ")"
        End If
    ...
    As for the flags, maybe try this one: INTERNET_FLAG_NO_CACHE_WRITE. But first see whether not closing your handles was the problem.

    I do recommend saving the project and closing all instances of VB first, then starting fresh. If you had any open handles, they should be released once VB is completely unloaded.
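
    For anyone trying the flag route, here is a minimal sketch of how the cache-related flags could be passed to InternetOpenUrl. The flag values are the ones from wininet.h. The wrapper name OpenUrlBypassingCache is made up for illustration, and combining INTERNET_FLAG_RELOAD with INTERNET_FLAG_NO_CACHE_WRITE is an assumption about what might help here, not a confirmed fix. When the call returns 0 it also prints Err.LastDllError, which should say more than the bare 0 does (codes in the 12000 range are WinINet errors, e.g. 12002 is a timeout and 12029 is "cannot connect").

```vb
Option Explicit

' Flag values from wininet.h
Private Const INTERNET_FLAG_RELOAD As Long = &H80000000          ' fetch from the server, not the cache
Private Const INTERNET_FLAG_NO_CACHE_WRITE As Long = &H4000000   ' do not store the response in the cache

Private Declare Function InternetOpenUrl Lib "wininet.dll" Alias "InternetOpenUrlA" _
    (ByVal hInternetSession As Long, ByVal lpszUrl As String, _
     ByVal lpszHeaders As String, ByVal dwHeadersLength As Long, _
     ByVal dwFlags As Long, ByVal dwContext As Long) As Long

' Hypothetical wrapper (not part of the code in post #2).
' hSession is the handle returned by InternetOpen.
' Returns the URL handle, or 0 on failure.
Private Function OpenUrlBypassingCache(ByVal hSession As Long, ByVal URL As String) As Long
    OpenUrlBypassingCache = InternetOpenUrl(hSession, URL, vbNullString, 0, _
        INTERNET_FLAG_RELOAD Or INTERNET_FLAG_NO_CACHE_WRITE, 0)

    If OpenUrlBypassingCache = 0 Then
        ' Err.LastDllError holds the GetLastError value of the last Declare'd call
        Debug.Print "InternetOpenUrl failed, LastDllError = " & Err.LastDllError
    End If
End Function
```

    The caller is still responsible for passing the returned handle to InternetCloseHandle when done.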
    Last edited by LaVolpe; Jan 27th, 2015 at 02:10 PM.

  11. #11
    PowerPoster
    Join Date
    Dec 2004
    Posts
    25,618

    Re: Download a web page into a text file

    Quote:
        But how ? Programmatically or....

    Use the DeleteUrlCacheEntry API (in wininet.dll).
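
    A minimal sketch of how that could be combined with the URLDownloadToFile call from post #1, assuming a stale or missing cache entry really is the cause. The helper name DownloadFresh is made up for illustration; the two Declare statements are the standard ANSI entry points.

```vb
Option Explicit

' Removes the cached copy of a URL from the WinINet cache.
' Returns nonzero on success, 0 on failure (for example when nothing is cached).
Private Declare Function DeleteUrlCacheEntry Lib "wininet.dll" _
    Alias "DeleteUrlCacheEntryA" (ByVal lpszUrlName As String) As Long

Private Declare Function URLDownloadToFile Lib "urlmon" _
    Alias "URLDownloadToFileA" (ByVal pCaller As Long, _
    ByVal szURL As String, ByVal szFileName As String, _
    ByVal dwReserved As Long, ByVal lpfnCB As Long) As Long

' Hypothetical helper: clear any cached copy first, then download.
Public Function DownloadFresh(ByVal URL As String, ByVal LocalFilename As String) As Boolean
    ' The result is ignored on purpose: failure usually just means there was no cache entry.
    DeleteUrlCacheEntry URL

    ' URLDownloadToFile returns S_OK (0) on success.
    DownloadFresh = (URLDownloadToFile(0, URL, LocalFilename, 0, 0) = 0)
End Function
```

    Called as DownloadFresh("https://some_url.com", "TextFile.txt"), it returns True only when the download itself succeeds.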

  12. #12
    Bonnie West
    Default Member
    Join Date
    Jun 2012
    Location
    InIDE
    Posts
    4,060

    Re: Download a web page into a text file

    In another recent thread, I posted a variation of the following. This might also work for you:

    Code:
    Public Sub SaveWebPageToFile(ByRef URL As String, ByRef FileName As String, Optional ByRef Charset As String = "utf-8")
        Const adSaveCreateOverWrite = 2&
        Dim oHttpReq As Object
    
        Set oHttpReq = CreateObject("WinHttp.WinHttpRequest.5.1")
        oHttpReq.Open "GET", URL
        oHttpReq.Send
    
        With CreateObject("ADODB.Stream")
            .Open
            .Charset = Charset
            .WriteText oHttpReq.ResponseText
            .SaveToFile FileName, adSaveCreateOverWrite
            .Close
        End With
    End Sub
    Code:
    Private Sub Main()
        SaveWebPageToFile "https://www.db-ip.com/", "index.htm"
    End Sub
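
    If you want to know whether the request actually succeeded before writing the file, the same idea can be wrapped in a function that checks the HTTP status. This is only a sketch: the name TrySaveWebPageToFile, the timeout values and the "only accept 200" rule are assumptions, not part of the code above. Status, StatusText and SetTimeouts are regular members of the WinHttpRequest object.

```vb
Option Explicit

' Hypothetical variant of SaveWebPageToFile that reports success or failure.
Public Function TrySaveWebPageToFile(ByVal URL As String, ByVal FileName As String, _
                                     Optional ByVal Charset As String = "utf-8") As Boolean
    Const adSaveCreateOverWrite As Long = 2
    Dim oHttpReq As Object

    On Error GoTo Failed

    Set oHttpReq = CreateObject("WinHttp.WinHttpRequest.5.1")
    ' Resolve, connect, send and receive timeouts in milliseconds (assumed values)
    oHttpReq.SetTimeouts 5000, 5000, 10000, 10000
    oHttpReq.Open "GET", URL, False   ' False = synchronous request
    oHttpReq.Send

    ' Anything other than 200 OK is treated as a failure in this sketch
    If oHttpReq.Status <> 200 Then
        Debug.Print "HTTP " & oHttpReq.Status & " " & oHttpReq.StatusText
        Exit Function
    End If

    With CreateObject("ADODB.Stream")
        .Open
        .Charset = Charset
        .WriteText oHttpReq.ResponseText
        .SaveToFile FileName, adSaveCreateOverWrite
        .Close
    End With

    TrySaveWebPageToFile = True
    Exit Function

Failed:
    Debug.Print "TrySaveWebPageToFile failed: " & Err.Description & " (" & Err.Number & ")"
End Function
```

    A failed name lookup, connection or timeout raises a run-time error from Send, so it ends up in the Failed handler and the function returns False.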
    On Local Error Resume Next: If Not Empty Is Nothing Then Do While Null: ReDim i(True To False) As Currency: Loop: Else Debug.Assert CCur(CLng(CInt(CBool(False Imp True Xor False Eqv True)))): Stop: On Local Error GoTo 0
    Declare Sub CrashVB Lib "msvbvm60" (Optional DontPassMe As Any)

  13. #13

    Thread Starter
    Addicted Member
    Join Date
    Sep 2005
    Posts
    177

    Re: Download a web page into a text file

    Many thanks, Bonnie West !

    I'm now a happy man !
    With your code, it worked perfectly from the start and repeatedly with different IP numbers !
    Great ! ! You're worth a golden star !

    But also thanks to all the others for your efforts : fafalone, Westconn1, MarkT and LaVolpe !
    You gave me lots of inspiration (and some transpiration !) !

    / Karl-Erik in Sweden
    Last edited by Karl-Erik; Jan 28th, 2015 at 12:23 PM.
