
Thread: [RESOLVED] Get the root domain from a url?

  1. #1

    Thread Starter
    Frenzied Member longwolf's Avatar
    Join Date
    Oct 2002
    Posts
    1,343

    [RESOLVED] Get the root domain from a url?

    Is there an API that can pull the root domain from a URL?
    Or maybe a way to do it with the WebBrowser control?

    Or am I going to have to think of some fancy string manipulation?

  2. #2
    Banned timeshifter's Avatar
    Join Date
    Mar 2004
    Location
    at my desk
    Posts
    2,465

    Re: Get the root domain from a url?

    String manipulation isn't hard. Just look for the third slash... that's the end of the root. Easy code to put together.

  3. #3

    Thread Starter
    Frenzied Member longwolf's Avatar
    Join Date
    Oct 2002
    Posts
    1,343

    Re: Get the root domain from a url?

    Not completely right.
    For example:
    http://by128fd.bay128.hotmail.msn.co...mbox=00xxxxxxx

    The root is msn.com, so cutting at the third slash leaves the subdomains in place.
    And there are other, much stranger ways a URL can be written.

  4. #4
    "Digital Revolution"
    Join Date
    Mar 2005
    Posts
    4,471

    Re: Get the root domain from a url?

    For that example, you could start at the third slash and work your way back to the second dot before it.

    You will also need to take into account that the user may not enter the http:// part, or may leave out the www. There also may not be a slash after the domain name, e.g. hotmail.msn.com.
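
    A minimal VB6 sketch of the scan described above, for illustration only. The function name RootByScanning is hypothetical, and the sketch assumes a plain host name: ports, user names, IP addresses and two-label suffixes such as co.uk are not handled.

    ```vb
    ' Hypothetical helper: returns the last two labels of the host in sUrl.
    Public Function RootByScanning(ByVal sUrl As String) As String
        Dim lSlash As Long, lScheme As Long
        Dim lLast As Long, lPrev As Long

        ' Drop the scheme ("http://", "https://", ...) when present.
        ' It only counts as a scheme if its slashes are the first ones in the string.
        lSlash = InStr(1, sUrl, "/")
        lScheme = InStr(1, sUrl, "://")
        If lScheme > 0 And lScheme + 1 = lSlash Then sUrl = Mid$(sUrl, lScheme + 3)

        ' Cut at the next slash (the "third slash" of a full URL), if there is one.
        lSlash = InStr(1, sUrl, "/")
        If lSlash > 0 Then sUrl = Left$(sUrl, lSlash - 1)

        ' Walk back from the end of the host to the second-to-last dot.
        lLast = InStrRev(sUrl, ".")
        If lLast > 1 Then lPrev = InStrRev(sUrl, ".", lLast - 1)

        If lPrev > 0 Then
            RootByScanning = Mid$(sUrl, lPrev + 1)
        Else
            ' Fewer than two dots: the host is already as short as it gets.
            RootByScanning = sUrl
        End If
    End Function
    ```

    With this sketch, RootByScanning("hotmail.msn.com") returns "msn.com", with or without a leading http:// or a trailing slash.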

  5. #5

    Thread Starter
    Frenzied Member longwolf's Avatar
    Join Date
    Oct 2002
    Posts
    1,343

    Re: Get the root domain from a url?

    Quote Originally Posted by DigiRev
    For that example, you could start at the third slash and work your way back to the second dot before it.

    You will also need to take into account that the user may not enter the http:// part, or may leave out the www. There also may not be a slash after the domain name, e.g. hotmail.msn.com.

    Exactly, that's why I'd like a nice reliable API or WebBrowser method.
    It'd be a lot easier and cleaner than trying to think of all the possibilities.

  6. #6
    "Digital Revolution"
    Join Date
    Mar 2005
    Posts
    4,471

    Re: Get the root domain from a url?

    Quote Originally Posted by longwolf
    Exactly, that's why I'd like a nice reliable API or WebBrowser method.
    It'd be a lot easier and cleaner than trying to think of all the possibilities.

    Well, it would be pretty simple string manipulation. It would just be a lengthy function, because you have to consider all the possibilities.

    To my knowledge, there is no API function for this. And there are no functions available in the WebBrowser control that will give you the root domain name.

    I don't have a lot of experience with the WebBrowser control, though, so I could be wrong.

  7. #7

    Thread Starter
    Frenzied Member longwolf's Avatar
    Join Date
    Oct 2002
    Posts
    1,343

    Re: Get the root domain from a url?

    The WebBrowser control has a LOT to it; there just aren't any help files with it.

    There are several API calls for working with URLs, and it seems like I saw one once for doing this.
    But I can't find it now.

    I'm very good at string manipulation, I'm just hoping for a better option here.

  8. #8

    Thread Starter
    Frenzied Member longwolf's Avatar
    Join Date
    Oct 2002
    Posts
    1,343

    Re: Get the root domain from a url?

    I found it.
    VB Code:
    Private Declare Function UrlGetPart Lib "shlwapi" Alias "UrlGetPartA" (ByVal pszIn As String, ByVal pszOut As String, pcchOut As Long, ByVal dwPart As Long, ByVal dwFlags As Long) As Long

    oops, had the wrong URL, here's the right one:
    http://planetsourcecode.com/vb/scrip...27444&lngWId=1
    Last edited by longwolf; Dec 3rd, 2006 at 12:40 AM.

  9. #9

    Thread Starter
    Frenzied Member longwolf's Avatar
    Join Date
    Oct 2002
    Posts
    1,343

    Re: Get the root domain from a url?

    Beats the heck out of a class file I found.
    It was over 500 lines, and that was with it using RegEx!

  10. #10
    "Digital Revolution"
    Join Date
    Mar 2005
    Posts
    4,471

    Re: [RESOLVED] Get the root domain from a url?

    Nice find. Never knew that API function existed.

  11. #11

    Thread Starter
    Frenzied Member longwolf's Avatar
    Join Date
    Oct 2002
    Posts
    1,343

    Re: [RESOLVED] Get the root domain from a url?

    Just been playing with it.
    I'll still have to do a little string work to get the root domain, but not much.

  12. #12

    Thread Starter
    Frenzied Member longwolf's Avatar
    Join Date
    Oct 2002
    Posts
    1,343

    Re: [RESOLVED] Get the root domain from a url?

    Turns out there's an easy way to get the same thing from the WebBrowser too:
    VB Code:
    Debug.Print wbrBrowser.Document.domain
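
    A short sketch of where that line could go, assuming a WebBrowser control named wbrBrowser on the form, as in the line above. Reading the property inside DocumentComplete is an assumption about usage, not something stated in this thread. Note that Document.domain normally holds the document's full host name (for example www.example.com), so a little string work may still be needed to reduce it to the root domain.

    ```vb
    ' Fires once for every frame and once more for the top-level document.
    Private Sub wbrBrowser_DocumentComplete(ByVal pDisp As Object, URL As Variant)
        ' Only react when the top-level document has finished loading.
        If pDisp Is wbrBrowser.Object Then
            Debug.Print wbrBrowser.Document.domain
        End If
    End Sub
    ```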

  13. #13
    Junior Member scrapersNbots.com's Avatar
    Join Date
    Dec 2016
    Location
    Torrington, CT
    Posts
    25

    Re: [RESOLVED] Get the root domain from a url?

    Found a Good Simple Solution at http://vb6.info/internet/extract-root-url-url-string/
    Overall pretty reliable and pretty short compared to other solutions.

    at top of module:

    Code:
    Private Const CONST_HOSTNAME As Long = 2 ' URL_PART_HOSTNAME in shlwapi.h
    Private Declare Function UrlGetPart Lib "shlwapi" Alias "UrlGetPartA" (ByVal pszIn As String, ByVal pszOut As String, pcchOut As Long, ByVal dwPart As Long, ByVal dwFlags As Long) As Long
    then

    Code:
    Public Function sRet(ByVal strUrl As String) As String
        ' Returns the last two labels of the URL's host name, or "error" on failure.
        On Error GoTo oops

        Dim lngLen As Long
        Dim strSection As String

        If Len(strUrl) = 0 Then Exit Function

        ' Output buffer; lngLen passes its size in and receives the copied length.
        strSection = Space$(260)
        lngLen = Len(strSection)

        ' UrlGetPart returns S_OK (0) on success.
        If UrlGetPart(strUrl, strSection, lngLen, CONST_HOSTNAME, 0&) <> 0 Then
            sRet = "error"
            Exit Function
        End If

        ' To avoid issues with complicated subdomains, keep only the last two labels.
        sRet = sfuncRootFromSubDomains(Left$(strSection, lngLen))
        Exit Function

    oops:
        sRet = "error"
    End Function


    Private Function sfuncRootFromSubDomains(ByVal sUrl As String) As String
        ' Note: two-label suffixes such as "co.uk" and raw IP addresses are not handled.
        Dim sParts() As String
        Dim iUpp As Long

        ' Remove http:// and https:// in case a full URL was passed in.
        sUrl = Replace(sUrl, "http://", "")
        sUrl = Replace(sUrl, "https://", "")

        sParts = Split(sUrl, ".")
        iUpp = UBound(sParts)

        If iUpp < 1 Then
            ' No dot (e.g. "localhost") or empty input: return it unchanged.
            sfuncRootFromSubDomains = sUrl
        Else
            ' Taking the last two labels also drops a leading "www.".
            sfuncRootFromSubDomains = sParts(iUpp - 1) & "." & sParts(iUpp)
        End If
    End Function
    〘SCRAPER〙software that extracts, organizes and displays data from the web.
    〘BOT〙software that automates and speeds up tasks on the web, mimicking human behavior.

  14. #14
    Junior Member scrapersNbots.com's Avatar
    Join Date
    Dec 2016
    Location
    Torrington, CT
    Posts
    25

    Re: Get the root domain from a url?

    The planet source code solution has been removed for some reason.

  15. #15
    Hyperactive Member
    Join Date
    Oct 2016
    Posts
    369

    Re: [RESOLVED] Get the root domain from a url?

    If post #12 is correct, it seems a lot simpler than post #13.

  16. #16
    coder. Lord Orwell's Avatar
    Join Date
    Feb 2001
    Location
    Elberfeld, IN
    Posts
    7,628

    Re: [RESOLVED] Get the root domain from a url?

    Couldn't you split by dots and take the last array element, then split again by slashes and take ... the 2nd?
    My light show youtube page (it's made the news) www.youtube.com/@lightsofelberfeld
    Contact me on the socials www.facebook.com/lordorwell
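
    A hedged sketch of the Split idea, with the order reversed: splitting by slashes first isolates the host, so dots in the path (file names, query values) cannot interfere. The function name RootBySplit is hypothetical, and the sketch shares the limits of the other string approaches in this thread (no handling of ports, IP addresses, or suffixes such as co.uk).

    ```vb
    ' Hypothetical helper illustrating the Split approach; not a full URL parser.
    Public Function RootBySplit(ByVal sUrl As String) As String
        Dim sParts() As String
        Dim sHost As String
        Dim lUpp As Long

        ' "http://host/path" splits into "http:", "", "host", "path".
        sParts = Split(sUrl, "/")
        lUpp = UBound(sParts)
        If lUpp < 0 Then Exit Function      ' empty input

        ' Without a scheme the host is the first element.
        sHost = sParts(0)
        If lUpp >= 2 Then
            ' With a scheme the host is the third element.
            If Right$(sParts(0), 1) = ":" And Len(sParts(1)) = 0 Then sHost = sParts(2)
        End If

        ' Now split the host by dots and keep the last two labels.
        sParts = Split(sHost, ".")
        lUpp = UBound(sParts)
        If lUpp >= 1 Then
            RootBySplit = sParts(lUpp - 1) & "." & sParts(lUpp)
        Else
            RootBySplit = sHost
        End If
    End Function
    ```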

  17. #17
    Junior Member scrapersNbots.com's Avatar
    Join Date
    Dec 2016
    Location
    Torrington, CT
    Posts
    25

    Re: [RESOLVED] Get the root domain from a url?

    Yes, that works, but only if a web browser is being used.
