
Thread: Is there an ActiveX component for parsing a URL?

  1. #1

    Thread Starter
    Frenzied Member
    Join Date
    Oct 2008
    Posts
    1,181

    Is there an ActiveX component for parsing a URL?

    I would like to add a compiled ActiveX component to my VB6 project (either an ActiveX DLL or an OCX ActiveX control) that provides a set of objects representing the different parts of a URL. A URL is typically something like http://www.mywebsite.com/folder1/folder2/file.txt but it can be more complicated. It can also be http://username@www.mywebsite.com/fo...lder2/file.txt or even http://username:password@www.mywebsi...lder2/file.txt. And that's not all. If you have a port number in the URL, the URL might now look like http://username:password@www.mywebsi...lder2/file.txt and it's still not as complex as possible. There's also a query string that might be present, which can itself contain one or more variables (each of which is a name=value pair), in which case the URL might look like this, which is the most complete kind of URL I can think of:
    http://username:password@www.mywebsi...ar=ABC&xyz=123

    Of course the simplest URL is actually http://www.mywebsite.com and it may or may not have an extra slash at the end. With the extra slash it would look like http://www.mywebsite.com/ so that needs to be considered too. The same goes for longer URLs: if the last item is a folder instead of a file, it may have a trailing slash, and if the slash is absent it is still implied. In both cases, specifying a folder in a URL implies that the desired file is the default one in that folder (usually index.html, index.htm, or index.php). So even a complex URL with all the parts, if it specifies a folder instead of a file, may or may not have a trailing slash after the folder (but before the question mark, if a query string is present). And if the connection is encrypted, the URL will start with https instead of http.

    As you can see, even something normally thought of as simple is actually VERY complex. The simplest URLs can be split in VB6 using the Split() function, but that assumes you know what kind of URLs your software will be encountering. If you need software that can handle ANY URL that's thrown at it (for example, server software that's actually facing the internet, where anybody who connects could send it any possible URL), it suddenly gets so complex that I can't even BEGIN to figure out how to write pure VB6 code to process the URLs.
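For illustration only, here is a rough sketch of the order in which the parts described above can be peeled off with nothing more than basic string searches. It is written in Python rather than VB6, the function name `split_url` is made up for this example, and the URL in the usage lines is a hypothetical one, not any of the truncated URLs above. It assumes an http-style URL with a `://` separator and does not handle IPv6 host literals or malformed input.

```python
def split_url(url):
    """Split an http-style URL into parts using only simple string searches."""
    parts = {"scheme": None, "username": None, "password": None,
             "host": None, "port": None, "path": "/",
             "query": None, "fragment": None}

    # 1. Scheme: everything before "://".
    scheme, sep, rest = url.partition("://")
    if not sep:
        raise ValueError("no '://' found in URL")
    parts["scheme"] = scheme.lower()

    # 2. Fragment: everything after the first "#".
    rest, sep, fragment = rest.partition("#")
    if sep:
        parts["fragment"] = fragment

    # 3. Query string: everything after the first "?".
    rest, sep, query = rest.partition("?")
    if sep:
        parts["query"] = query

    # 4. The first "/" separates the authority from the path.
    #    A missing path is treated as "/" (the implied trailing slash).
    authority, sep, path = rest.partition("/")
    if sep:
        parts["path"] = "/" + path

    # 5. User info: everything before the last "@" in the authority.
    userinfo, sep, hostport = authority.rpartition("@")
    if sep:
        username, colon, password = userinfo.partition(":")
        parts["username"] = username
        if colon:
            parts["password"] = password

    # 6. Port: the digits after the last ":" in what remains.
    host, sep, port = hostport.rpartition(":")
    if sep and port.isdigit():
        parts["host"] = host.lower()
        parts["port"] = int(port)
    else:
        parts["host"] = hostport.lower()
    return parts


# Hypothetical example URL containing every optional part.
full = split_url("http://username:password@www.mywebsite.com:8080"
                 "/folder1/folder2/file.txt?var=ABC&xyz=123")
bare = split_url("http://www.mywebsite.com")
```

Each step maps onto a single search for a delimiter, so the same sequence could in principle be written with VB6's InStr, InStrRev, Left$ and Mid$. The order matters: the query string is removed before looking for the first slash, and the user info is removed before looking for the port colon.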

    That's where my request for somebody to help me find an ActiveX component comes in. I would hope the ActiveX component would contain all the complex code for parsing a URL, and allow my program to simply request from the ActiveX component the parts of the URL that my program will actually need to use. Such an ActiveX component would need to be able to read a URL, figure out which parts of the URL exist (for example if it contains a port number or not), and then present that info to my program.

    I can imagine such OCX or DLL working like this. Each URL would be represented by a URL object. Parsing of a URL text string would occur when you called the Parse function of the URL object, with the URL string as the parameter to that function. Each URL object would have a number of properties, including boolean values that would tell whether or not each part of the URL was present. It would also contain other properties that would store the value for those parts of the URL if present (such as protocol, domain, username, password, port number, file path). The query string, if present, would be split into several objects of type QueryVariable, and these would be stored in a Collection object that would be presented to any VB6 program that needed to look at the query variables. Each QueryVariable object would have 2 properties called Name and Value, which would store the name of the query variable, and the value that it was set to.
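To make the shape of that imagined component concrete, here is a minimal sketch of the same object model. It is Python, not an actual ActiveX component, and the class and member names (`Url`, `QueryVariable`, `parse`, `has_port`, and so on) are hypothetical names chosen to mirror the description above. The parsing itself is delegated to Python's standard `urllib.parse` module.

```python
from dataclasses import dataclass, field
from typing import List, Optional
from urllib.parse import parse_qsl, urlsplit


@dataclass
class QueryVariable:
    """One name=value pair from the query string."""
    name: str
    value: str


@dataclass
class Url:
    """Holds the parts of one parsed URL; a part that is absent is None."""
    protocol: Optional[str] = None
    username: Optional[str] = None
    password: Optional[str] = None
    domain: Optional[str] = None
    port: Optional[int] = None
    file_path: str = "/"
    query_variables: List[QueryVariable] = field(default_factory=list)

    def parse(self, text: str) -> None:
        """Parse a URL string and fill in the properties."""
        parts = urlsplit(text)
        self.protocol = parts.scheme or None
        self.username = parts.username
        self.password = parts.password
        self.domain = parts.hostname
        self.port = parts.port
        self.file_path = parts.path or "/"   # empty path means the root
        self.query_variables = [
            QueryVariable(name, value)
            for name, value in parse_qsl(parts.query, keep_blank_values=True)
        ]

    # Boolean "is this part present?" properties.
    @property
    def has_username(self) -> bool:
        return self.username is not None

    @property
    def has_password(self) -> bool:
        return self.password is not None

    @property
    def has_port(self) -> bool:
        return self.port is not None

    @property
    def has_query(self) -> bool:
        return len(self.query_variables) > 0


# Hypothetical example URL: user name but no password and no port.
u = Url()
u.parse("https://user@www.mywebsite.com/folder1/?var=ABC&xyz=123")
names = [qv.name for qv in u.query_variables]
```

In this sketch the boolean properties are derived from the stored values instead of being stored separately, so they cannot drift out of sync with the data after a second call to `parse`.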

    Now I don't know if an ActiveX component like this exists yet or not. If it did, I could use it easily in VB6. I do know something like this exists in .Net, where it is called a System.Uri object, but sadly VB6 can't use .Net objects. Therefore, I'm trying to find an ActiveX component that works identically (or nearly identically) to the System.Uri object in .Net. Despite the best Google searching I could do for such an ActiveX component, I couldn't find one. I hope some of you here on this forum will have better luck finding one than I did. I feel almost certain that one exists, because parsing URLs is a very common thing in internet software like webservers.

  2. #2
    PowerPoster
    Join Date
    Feb 2006
    Posts
    24,482

    Re: Is there an ActiveX component for parsing a URL?


  3. #3

    Thread Starter
    Frenzied Member
    Join Date
    Oct 2008
    Posts
    1,181

    Re: Is there an ActiveX component for parsing a URL?

    Originally posted by dilettante:
    Thanks for that. That will help. Now I just have one more issue with processing URLs. If a URL includes special characters, it also needs a URL-decoding step. That is easy enough for plain ASCII characters (byte values 0 to 127, such as the symbols $, @, or #), where each %XX escape decodes to a single character. It won't work for characters outside ASCII, though, because the escaped bytes in a URL are normally UTF-8, which is a variable-length encoding. Characters with values from 0 to 127 take one byte, but characters from 128 to 255 (what I'd call extended ASCII) take two bytes, and other Unicode characters can take three or four. This is because in UTF-8 the high bit of a byte has a special meaning: it marks the byte as part of a multi-byte sequence, so a character that would fit in one byte in an ANSI code page ends up split across two. This means that my program will not only need to handle 1-byte ASCII characters, but if it is to work in all situations, it must properly process UTF-8 encoded characters. Any idea how to do that correctly?
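As a rough illustration of the decoding step being asked about, the usual approach is to do it in two passes: first turn every %XX escape back into a raw byte, and only then interpret the whole byte sequence as UTF-8. Decoding each escape to a character on its own is what breaks multi-byte characters. The sketch below is Python, not VB6, and the function name `url_decode` is hypothetical. It assumes the escaped bytes are UTF-8 and replaces invalid sequences instead of raising an error.

```python
HEX_DIGITS = "0123456789abcdefABCDEF"


def url_decode(text):
    """Percent-decode a string, treating the escaped bytes as UTF-8."""
    raw = bytearray()
    i = 0
    while i < len(text):
        # A valid escape is "%" followed by exactly two hex digits.
        if (text[i] == "%" and i + 2 < len(text)
                and text[i + 1] in HEX_DIGITS
                and text[i + 2] in HEX_DIGITS):
            raw.append(int(text[i + 1:i + 3], 16))   # pass 1: escape -> byte
            i += 3
        else:
            # Anything else is copied through as its UTF-8 bytes.
            raw.extend(text[i].encode("utf-8"))
            i += 1
    # Pass 2: interpret the collected bytes as one UTF-8 string.
    return raw.decode("utf-8", errors="replace")


two_byte = url_decode("caf%C3%A9")      # C3 A9 is one 2-byte character
three_byte = url_decode("%E2%82%AC5")   # E2 82 AC is one 3-byte character
ascii_only = url_decode("a%24b")        # 24 is the single byte for "$"
```

In VB6 the first pass would fill a Byte array, and the second pass would be a UTF-8 to UTF-16 conversion of that array, for example through the Win32 MultiByteToWideChar function with the CP_UTF8 code page.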

  4. #4
    PowerPoster
    Join Date
    Feb 2006
    Posts
    24,482

    Re: Is there an ActiveX component for parsing a URL?

    Look at the dwFlags argument. As far as I know that should be able to handle encoded UTF-8.

    There is also UrlUnescapeW function (shlwapi.h) and friends to consider.

    If your program must run under a service (even Task Scheduler) you might look into the safer alternatives here: https://docs.microsoft.com/en-us/win...ttp-start-page
