-
Mar 20th, 2006, 05:30 PM
#1
Thread Starter
Addicted Member
[RESOLVED] How to Extract this text from html
Hi all,
Is there a way to extract this text from the current page:
222.92.45.228:20861
86.12.56.187:7212
212.12.185.115:8080
62.150.77.94:27640
24.6.236.232:8081
203.70.47.83:4355
80.55.8.227:23176
66.226.34.68:31733
69.210.211.186:22788
68.53.26.248:7212
24.75.91.18:29122
58.145.97.93:50050
59.14.44.110:50050
58.236.22.118:50050
59.187.231.124:50050
61.32.111.51:50050
61.40.64.46:50050
61.38.147.214:50050
61.101.5.200:50050
61.106.84.127:50050
68.87.66.101:553
69.147.39.38:553
69.147.27.250:553
61.32.75.186:8002
24.255.32.185:8081
125.250.185.108:9597
203.133.27.128:15802
203.160.1.170:553
211.194.117.204:50050
211.213.153.162:50050
211.236.210.87:50050
211.172.142.151:50050
210.114.183.194:50050
217.216.149.200:3382
I am trying to get a list of proxies from some pages.
Last edited by _Conan_; Mar 21st, 2006 at 02:37 PM.
-
Mar 20th, 2006, 06:12 PM
#2
Re: How to Extract this text from html
Um, you could use a WebBrowser object and grab the text through one of its properties. Or you could use a Regex to try to extract the text.
Honestly, it would be better if you generated these IP addresses and put them into an XML file (unless you did not generate these).
If you didn't generate these, then this is not a good idea. Many proxies require special agreements (even if they're free), and what one proxy allows may be against the rules on another. Plus, you never know when the HTML will change, which could throw off your parsing by a lot. The site could even go down.
-
Mar 20th, 2006, 06:32 PM
#3
Re: How to Extract this text from html
If you can read the HTML into one long string, then you can use Regex to find all of the matches inside the string... something like...
VB Code:
Dim TestString As String = "this 192.168.1.32:84 is a test 192.43.234.43:54"
Dim Regex As New System.Text.RegularExpressions.Regex("\d*.\d*.\d*.\d*:\d*")
Dim MyMatches As System.Text.RegularExpressions.MatchCollection = Regex.Matches(TestString)
For Each Match As System.Text.RegularExpressions.Match In MyMatches
    MessageBox.Show(Match.Value) 'shows each IP address
Next
Replace "TestString" with your HTML string.
-
Mar 20th, 2006, 07:06 PM
#4
Thread Starter
Addicted Member
Re: How to Extract this text from html
Many thanks, everyone, and thanks again to GigemBoy; that was sweet code.
But when I run this against the current page:
VB Code:
Dim wc As New Net.WebClient
Dim sSource As String = ""
Dim reader As New IO.StreamReader(wc.OpenRead("http://www.vbforums.com/showthread.php?t=394163"))
sSource = reader.ReadToEnd
Dim Regex As New System.Text.RegularExpressions.Regex("\d*.\d*.\d*.\d*:\d*")
Dim MyMatches As System.Text.RegularExpressions.MatchCollection = Regex.Matches(sSource)
For Each Match As System.Text.RegularExpressions.Match In MyMatches
    MessageBox.Show(Match.Value) 'shows each IP address
Next
I got message boxes for strings like this:
htt:
bst:
ssc:
=
So how can I fix this?
Edit: Is there a way to check whether a string is a proxy?
Last edited by _Conan_; Mar 20th, 2006 at 08:33 PM.
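On the edit question: a regex match only shows that the text has the right shape. Below is a minimal sketch of a stricter syntactic check, assuming .NET 2.0 or later (for `IPAddress.TryParse` and `Integer.TryParse`). `LooksLikeProxy` is a hypothetical helper name, not something from this thread. It cannot tell whether a proxy is actually listening at that address; that would take a real connection attempt.

```vbnet
Imports System
Imports System.Net

Module ProxyCheck
    ' Hypothetical helper: True when the candidate has the form "a.b.c.d:port".
    Function LooksLikeProxy(ByVal candidate As String) As Boolean
        Dim parts() As String = candidate.Split(":"c)
        If parts.Length <> 2 Then Return False

        ' Require four dotted parts; IPAddress.TryParse alone also accepts short forms.
        If parts(0).Split("."c).Length <> 4 Then Return False

        Dim address As IPAddress = Nothing
        If Not IPAddress.TryParse(parts(0), address) Then Return False

        ' The port must be a number in the valid TCP range.
        Dim port As Integer
        If Not Integer.TryParse(parts(1), port) Then Return False
        Return port >= 1 AndAlso port <= 65535
    End Function

    Sub Main()
        Console.WriteLine(LooksLikeProxy("222.92.45.228:20861")) ' True
        Console.WriteLine(LooksLikeProxy("htt:"))                ' False
    End Sub
End Module
```

Calling this on each `Match.Value` before showing it would drop fragments such as `htt:` that merely fit a loose pattern.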
-
Mar 20th, 2006, 08:31 PM
#5
Thread Starter
Addicted Member
Re: How to Extract this text from html
I tried this and it works (not badly), but I'm sure it isn't a good solution:
VB Code:
Dim wc As New Net.WebClient
Dim sSource As String = ""
Dim reader As New IO.StreamReader(wc.OpenRead("http://www.vbforums.com/showthread.php?t=394163"))
sSource = reader.ReadToEnd
Dim Regex As New System.Text.RegularExpressions.Regex("\d*.\d*.\d*.\d*:\d*")
Dim MyMatches As System.Text.RegularExpressions.MatchCollection = Regex.Matches(sSource)
For Each Match As System.Text.RegularExpressions.Match In MyMatches
    If Not Match.Value.Length < 15 Then
        MsgBox(Match.Value)
    End If
Next
Can anyone help?
-
Mar 20th, 2006, 08:47 PM
#6
Thread Starter
Addicted Member
Re: How to Extract this text from html
I finally did it. Can anyone tell me whether this is right?
VB Code:
Dim wc As New Net.WebClient
Dim sSource As String = ""
Dim reader As New IO.StreamReader(wc.OpenRead("http://www.vbforums.com/showthread.php?t=394163"))
sSource = reader.ReadToEnd
Dim Regex As New System.Text.RegularExpressions.Regex("[0-9]+.[0-9]+.[0-9]+.[0-9]+:[0-9]+")
Dim MyMatches As System.Text.RegularExpressions.MatchCollection = Regex.Matches(sSource)
For Each Match As System.Text.RegularExpressions.Match In MyMatches
    'If Not Match.Value.Length < 15 Then
    MsgBox(Match.Value)
    'End If
Next
-
Mar 20th, 2006, 11:12 PM
#7
Thread Starter
Addicted Member
Re: [RESOLVED] How to Extract this text from html
LOL ... when I tried the code on a few sites, I came back with this:
12.150.244.9:8080
216.50.61.168:8000
216.120.43.88:8080
213.25.170.98:8080
213.162.13.82:8080
02/11/27 03:06
02/11/27 03:07
02/11/27 03:07
02/11/27 03:10
02/11/27 03:14
02/11/27 03:15
02/11/27 03:15
02/11/27 03:16
02/11/27 03:16
02/11/27 03:19
02/11/27 03:21
02/11/27 03:21
02/11/27 03:22
02/11/27 03:23
02/11/27 03:24
02/11/27 03:25
02/11/27 03:26
02/11/27 03:26
02/11/27 03:27
02/11/27 03:35
02/11/27 03:35
02/11/27 03:36
02/11/27 03:37
02/11/27 03:39
02/11/27 03:41
02/11/27 03:43
03-19-2006 11:50
02-22-2006 10:18
03-18-2006 12:04
03-18-2006 05:13
03-01-2006_20:04
16-11-2005_10:27
===
So can anyone help?
-
Mar 21st, 2006, 12:07 AM
#8
Re: [Un-RESOLVED] How to Extract this text from html
My original post worked for my example, but when testing it on this page, I see it didn't work. All that was needed was an escaping backslash in front of each period, like below:
VB Code:
Dim wc As New Net.WebClient
Dim sSource As String = ""
'Read the whole page into one string
Dim reader As New IO.StreamReader(wc.OpenRead("http://www.vbforums.com/showthread.php?t=394163"))
sSource = reader.ReadToEnd
reader.Close()
'"\." matches a literal period; a bare "." matches almost any character
Dim Regex As New System.Text.RegularExpressions.Regex("\d*\.\d*\.\d*\.\d*:\d*")
Dim MyMatches As System.Text.RegularExpressions.MatchCollection = Regex.Matches(sSource)
For Each Match As System.Text.RegularExpressions.Match In MyMatches
    MessageBox.Show(Match.Value) 'shows each IP address and port
Next
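Even with the periods escaped, `\d*` allows zero digits, so a string such as `...:` still matches. A possible tightening is sketched below, assuming .NET 2.0 or later for `List(Of String)`. The pattern and the `FindEndpoints` helper are illustrative choices, not something posted in this thread: each part must have one to three digits, the port one to five, and `\b` keeps the match from starting or ending in the middle of a longer number.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module ExtractEndpoints
    ' Assumed pattern: four groups of 1-3 digits joined by literal dots, a colon, then a 1-5 digit port.
    Private ReadOnly EndpointPattern As New Regex("\b(?:\d{1,3}\.){3}\d{1,3}:\d{1,5}\b")

    ' Hypothetical helper: returns every "a.b.c.d:port" substring found in the text.
    Function FindEndpoints(ByVal html As String) As List(Of String)
        Dim results As New List(Of String)
        For Each m As Match In EndpointPattern.Matches(html)
            results.Add(m.Value)
        Next
        Return results
    End Function

    Sub Main()
        ' Sample text mixes one real-looking endpoint with the date strings and fragments seen earlier.
        Dim sample As String = "<td>12.150.244.9:8080</td> 02/11/27 03:06 03-19-2006 11:50 htt: ...:"
        For Each endpoint As String In FindEndpoints(sample)
            Console.WriteLine(endpoint) ' prints only 12.150.244.9:8080
        Next
    End Sub
End Module
```

The pattern still accepts out-of-range values such as 999.1.1.1:99999, so a numeric range check on the parts is worth adding if that matters.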
-
Mar 21st, 2006, 02:33 PM
#9
Thread Starter
Addicted Member
Re: [Un-RESOLVED] How to Extract this text from html
Many thanks, dude.
Edit:
Dude, is there a way to use a proxy with WebClient?
-
Mar 21st, 2006, 06:08 PM
#10
Re: [RESOLVED] How to Extract this text from html
I actually made a post about the proxy question a while back, and was able to find it 
http://www.vbforums.com/showthread.php?t=378043
The guy never replied back to see if it actually worked, however...
-
Mar 21st, 2006, 10:09 PM
#11
Thread Starter
Addicted Member
Re: [RESOLVED] How to Extract this text from html
I tried it, but it doesn't work.
I think the only way is to move to VS.NET 2005,
because I read somewhere that WebClient has a Proxy property there.
Many thanks, Gigem. I tried to rate your post, but I got:
You must spread some Reputation around before giving it to gigemboy again.
Cheers
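For reference, `WebClient` does expose a `Proxy` property from .NET 2.0 onward. A minimal sketch of using it follows; the proxy host `proxy.example.com`, the port, and the `CreateProxiedClient` helper are placeholders chosen for illustration, so substitute an address you are permitted to use.

```vbnet
Imports System
Imports System.Net

Module ProxyDownload
    ' Hypothetical helper: builds a WebClient that sends its requests through the given proxy.
    Function CreateProxiedClient(ByVal host As String, ByVal port As Integer) As WebClient
        Dim client As New WebClient()
        client.Proxy = New WebProxy(host, port)
        Return client
    End Function

    Sub Main()
        ' Placeholder proxy address; this host is an assumption and will not resolve to a real proxy.
        Using client As WebClient = CreateProxiedClient("proxy.example.com", 8080)
            Try
                Dim html As String = client.DownloadString("http://www.vbforums.com/showthread.php?t=394163")
                Console.WriteLine(html.Length)
            Catch ex As WebException
                ' Dead or unreachable proxies surface here rather than crashing the program.
                Console.WriteLine("Request failed: " & ex.Message)
            End Try
        End Using
    End Sub
End Module
```

Setting the proxy per client keeps the change local to this one object, so nothing system-wide has to be changed and restored afterwards.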
-
Mar 22nd, 2006, 12:39 AM
#12
Re: [RESOLVED] How to Extract this text from html
Well, now that I look at it, I think that example gets the proxy settings that are set up for Internet Explorer...
-
Mar 24th, 2006, 11:38 PM
#13
Thread Starter
Addicted Member
Re: [RESOLVED] How to Extract this text from html
What I ended up doing was changing the MS Internet Explorer setting and resetting it when the function finished, and it works well.
But Mr. Gigemboy,
is there a way to extract only the hyperlinks from the source of the page?
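One possible approach, in the same Regex style used earlier in the thread, is to capture the `href` value of each anchor tag. This is a rough sketch, not a full HTML parser: the pattern and the `FindLinks` helper are assumptions made for illustration, and it only handles `href` values wrapped in single or double quotes.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module ExtractLinks
    ' Assumed pattern: an <a ...> tag whose href value is quoted; group 1 captures the URL.
    Private ReadOnly HrefPattern As New Regex("<a\b[^>]*?\bhref\s*=\s*[""']([^""']*)[""']", RegexOptions.IgnoreCase)

    ' Hypothetical helper: returns the href value of every anchor tag found in the HTML.
    Function FindLinks(ByVal html As String) As List(Of String)
        Dim links As New List(Of String)
        For Each m As Match In HrefPattern.Matches(html)
            links.Add(m.Groups(1).Value)
        Next
        Return links
    End Function

    Sub Main()
        Dim sample As String = "<p><a href=""http://www.vbforums.com/showthread.php?t=378043"">proxy post</a> <A class='x' HREF='/faq'>FAQ</A></p>"
        For Each link As String In FindLinks(sample)
            Console.WriteLine(link)
        Next
    End Sub
End Module
```

Relative links such as `/faq` come back as written, so they would need to be combined with the page address (for example with `System.Uri`) before being requested.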