string formatting/manipulation/parsing
Does anyone know how I can parse this:
Code:
<tr bgcolor="#e3e6ea" align="center" class="menulink1">
<td align="left" height=17>193.131.192.85</td>
<td height=17>80</td>
<td>anonymous</td>
<td>Great Britain (UK)</td>
<td>29.06.2005</td>
<td><a class="small" href="/cgi-bin/whois.cgi?domain=193.131.192.85" target="_blank"><b>Whois</b></a></td>
</tr>
into
193.131.192.85:80
Code:
<tr bgcolor="#ffffff" class="proxy_text" height=10>
<td>82.201.185.22</td>
<td>8080</td>
<td>anonymous</td>
<td>Egypt</td>
<td>29.06.2005</td>
<td><a href="/cgi-bin/whois.cgi?domain=82.201.185.22" target="_blank">Whois</a></td>
</tr>
into 82.201.185.22:8080
Code:
<td width="24%" height="1"><font face="Tahoma">
<span style="font-size: 8.5pt"><font color="#808080">
218.219.154.126</font></span></font></td>
<td width="13%" height="1"><font face="Tahoma" color="#808080">
<span style="font-size: 8.5pt">80</span></font></td>
<td width="19%" height="1"><font face="Tahoma" color="#808080">
<span style="font-size: 8.5pt">High anonymity</span></font></td>
<td width="22%" height="1"><font face="Tahoma" color="#808080">
<span style="font-size: 8.5pt">Japan</span></font></td>
<td width="22%" height="1">
<font face="Tahoma" style="font-size: 8.5pt" color="#808080">
Sun Mar 27 2005</font></td>
</tr>
into 218.219.154.126:80
all in the same code/function?
Re: string formatting/manipulation/parsing
If it is three separate strings, then it can be done, but if it is one long page then it will be more difficult. You would have to search for the "<TD>" strings, and knowing which ones you want to parse may be an issue.
Can you save them as three strings to be parsed?
Re: string formatting/manipulation/parsing
It's possible. The way I did it was to remove the HTML tags and replace them with a special character, then collapse any doubled-up special characters, newlines, and spaces, and finally Split the string into an array. There's a bit more to it, so I've attached the example anyway.
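For readers without the attachment, the steps described above can be sketched roughly as follows. This is not the attached example; it is a minimal Python illustration, and the separator character `|` and the function name `strip_and_split` are assumptions chosen for the sketch. It also assumes the page text itself never contains the separator and that no attribute value contains a `>`.

```python
import re

SEP = "|"  # hypothetical separator; assumed not to occur in the page text


def strip_and_split(html):
    """Replace tags with SEP, collapse runs of SEP/whitespace, then split."""
    # Step 1: every HTML tag becomes a single separator character.
    text = re.sub(r"<[^>]*>", SEP, html)
    # Step 2: collapse doubled-up separators together with the newlines
    # and spaces around them into one separator.
    pattern = r"\s*(?:" + re.escape(SEP) + r"\s*)+"
    text = re.sub(pattern, SEP, text)
    # Step 3: split into an array, dropping the empty ends.
    return [field for field in text.strip().strip(SEP).split(SEP) if field]


fields = strip_and_split("<td>82.201.185.22</td>\n<td>8080</td>")
print(fields[0] + ":" + fields[1])  # prints 82.201.185.22:8080
```

With the rows posted above, the first two array elements of each row are the IP and the port, so joining them with a colon gives the wanted form.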
HTH
chem
Re: string formatting/manipulation/parsing
That won't work, since the output has to be in the format
ip:port
ip:port
ip:port
or
ip;port(separator)ip;port(sep)ip;port
and so on (; is used here because the forum turns : into a smilie), so that it can be split into an array and added to a listbox.
It has to parse all the way through the page and pick up every proxy.
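A rough sketch of that requirement, assuming the page is already in one string: strip the tags, then look for a dotted IP address followed by a port number anywhere in what is left. The function name `extract_proxies` and the regular expression are assumptions for illustration only, and the pattern does not check that each octet is 255 or less.

```python
import re

# An IPv4-looking value followed by whitespace and a 1-5 digit port.
PROXY_RE = re.compile(r"\b(\d{1,3}(?:\.\d{1,3}){3})\s+(\d{1,5})\b")


def extract_proxies(page):
    """Return every proxy found on the page as an 'ip:port' string."""
    # Replace tags with spaces so neighbouring cells stay separated.
    text = re.sub(r"<[^>]*>", " ", page)
    return [ip + ":" + port for ip, port in PROXY_RE.findall(text)]


def as_lines(page):
    """One ip:port per line, ready to split and add to a list control."""
    return "\n".join(extract_proxies(page))
```

Because the match runs over the whole page, it should not matter which of the three table layouts a row uses, as long as the port is the next visible text after the IP.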
Re: string formatting/manipulation/parsing
Well, will you know the format of the page every time? I mean, are you just going to be handling the 3 page formats listed above, or will they be random pages with proxies on them? I ask because I am writing an HTML parser Class Module right now that would work really well for the purpose you mentioned. You would still have to do some of the work in your program, but my Class could get you the inner HTML pretty well.
It won't be ready for a few days (maybe a week to be completely done). If you're interested, let me know.
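To give an idea of what the inner-HTML approach might look like, here is a small stand-in written with Python's standard `html.parser` module. It is not the Class Module mentioned above, which isn't shown in this thread; the class name `CellText` and the helper `cell_texts` are made up for the sketch. It collects the visible text of each `<td>` cell and ignores any formatting tags nested inside it.

```python
from html.parser import HTMLParser


class CellText(HTMLParser):
    """Collect the text found inside each <td> ... </td> pair."""

    def __init__(self):
        super().__init__()
        self.cells = []
        self._buf = None  # None while we are outside a <td>

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._buf = []

    def handle_data(self, data):
        if self._buf is not None:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._buf is not None:
            # Join the pieces and squeeze newlines/spaces to single spaces.
            self.cells.append(" ".join("".join(self._buf).split()))
            self._buf = None


def cell_texts(html):
    parser = CellText()
    parser.feed(html)
    parser.close()
    return parser.cells
```

The calling program would still have to decide which cells are the IP and the port, for example by taking the first two cells of every row.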
Re: string formatting/manipulation/parsing
Well, you can make my example do that. If I understand correctly, you want to add them to a listbox in the format IP:*Port. If so, I attached a way to do that:
chem
Re: string formatting/manipulation/parsing
That's a good solution for this and works perfectly. I was just offering a more general inner HTML solution. Hope chemicalNova's mini-app was what you were looking for, ICENOVA. Good luck.
Re: string formatting/manipulation/parsing