[RESOLVED] [2008] searching inside a string
I am working on a program that gets the source of a web page (i.e. HTML or PHP). I have gotten as far as retrieving the page source, but now I need to search it for something specific, like a <body> tag or a <p> tag. Does anyone know how to do this with the string functions? I can't seem to get any examples to work for me. Could someone please help me? Thank you :)
Re: [2008] searching inside a string
Are you trying to get the actual code or the contents of the page?
Re: [2008] searching inside a string
I am after a phone number, actually; I just used <html> as an example. I need to search for the area code, and then I need to search the text around where the area code was found for the rest of the phone number.
Re: [2008] searching inside a string
vb Code:
SomeString = YourString.SubString(YourString.IndexOf("<body>"), 10)
That finds the index of the first occurrence of the text you're looking for and takes 10 characters starting at that index, so the matched text itself is included. Use a bigger or smaller number to take more or less.
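One thing to watch: IndexOf returns -1 when the text isn't found, and Substring throws an exception if the start index is negative or the length runs past the end of the string. Here is a minimal sketch of a guarded version. TakeFrom and the sample HTML are made-up names for illustration, not anything from your project.

```vbnet
Imports System

Module SubstringSketch
    ' Hypothetical helper: returns up to maxLength characters starting at the
    ' first occurrence of marker, or an empty string if marker is not found.
    Function TakeFrom(ByVal source As String, ByVal marker As String, ByVal maxLength As Integer) As String
        Dim startIndex As Integer = source.IndexOf(marker)
        If startIndex < 0 Then
            Return String.Empty
        End If
        ' Clamp the length so Substring never runs past the end of the string.
        Dim length As Integer = Math.Min(maxLength, source.Length - startIndex)
        Return source.Substring(startIndex, length)
    End Function

    Sub Main()
        ' Invented sample input.
        Dim html As String = "<html><body>Hello</body></html>"
        Console.WriteLine(TakeFrom(html, "<body>", 10))         ' prints <body>Hell
        Console.WriteLine(TakeFrom(html, "<table>", 10).Length) ' prints 0
    End Sub
End Module
```

The first call shows that the 10 characters include the 6 characters of "<body>" itself, so add marker.Length to the start index if you only want what follows the tag.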
Re: [2008] searching inside a string
Well, I don't think this is what you are after but, just in case, I will post the code. This will return anything between two HTML tags.
VB Code:
Private Function ShowContent(ByVal strURL As String) As String
    Dim objRequest As Net.WebRequest = Net.WebRequest.Create(strURL)
    Dim strContent As String
    ' The Using blocks close the response and the reader even if reading fails.
    Using objResponse As Net.WebResponse = objRequest.GetResponse()
        Using strmReader As New IO.StreamReader(objResponse.GetResponseStream())
            strContent = strmReader.ReadToEnd()
        End Using
    End Using
    ' The RegEx will check the web page passed to this function
    ' and then pass everything between the two tags back to the
    ' web browser control. You could pass an entire page by
    ' including the beginning and ending <html> tags.
    Dim RegEx As System.Text.RegularExpressions.Regex = _
        New System.Text.RegularExpressions.Regex _
        ("<!-- NEWS -->((.|\n)*?)<!-- End of NEWS -->", _
        System.Text.RegularExpressions.RegexOptions.IgnoreCase)
    ' To get this to work in your own app just set your page and
    ' replace the beginning tag <!-- NEWS --> and the ending tag
    ' <!-- End of NEWS --> with your own.
    Dim getMatch As System.Text.RegularExpressions.Match = _
        RegEx.Match(strContent)
    ' Value includes the two marker tags; use getMatch.Groups(1).Value for
    ' only the text between them. Both are empty if nothing matched.
    Return getMatch.Value
End Function
Re: [2008] searching inside a string
It depends on exactly what you are going to do with the data you return and which parts of the HTML you want to extract, but if you are using a WebBrowser control to get the code then you may want to look at something like this:
vb Code:
Dim bodytext As Windows.Forms.HtmlElementCollection = mywebbrowser.Document.GetElementsByTagName("body")
For Each Tag As HtmlElement In bodytext
MessageBox.Show(Tag.InnerText)
Next
Obviously that example is just to give you an idea of how it works. InnerText returns the text between the start and end tag with the markup stripped out; if you want the HTML source between the tags, use InnerHtml instead.
If that doesn't suit, then I would just use the IndexOf method as Maximilian said.
Re: [2008] searching inside a string
None of these are what I need, so I will explain a little more. The page I am getting my source from is whitepages.com. You submit a form, which I automate (based on input from a database), and once the page loads I use webbrowser1.documenttext to get the source. Then I want to search that source, not for the HTML tags (those were just an example in the first post), but for the phone number, so it would be something like (416) 328-7493. There might be multiple results for a search, so I would need to get the first phone number in the results (most likely the first one that appears in the code). Also, it is a multi-line string, so I don't know if that makes any difference. But thanks for your help so far, and I hope I get some more help and an answer :)
Re: [2008] searching inside a string
You have an HtmlDocument, so can you call GetElementById or GetElementsByTagName and search through the HtmlElementCollection to find what you're after?
Re: [2008] searching inside a string
Well, the IndexOf method that Maximilian posted will just retrieve the first occurrence of the string you search for, so wouldn't that do what you want?
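The line breaks shouldn't matter either, since IndexOf treats the whole string as one run of characters. Here is a rough sketch, assuming the number always appears in the "(416) 328-7493" layout, i.e. 14 characters starting at the area code. The sample page text is invented and just stands in for the real page source.

```vbnet
Imports System
Imports Microsoft.VisualBasic

Module AreaCodeSketch
    Sub Main()
        ' Invented stand-in for the page source (e.g. webbrowser1.documenttext).
        Dim source As String = "<div class=""result"">" & vbCrLf & _
                               "  <span>(416) 328-7493</span>" & vbCrLf & _
                               "</div>"
        Dim areaCode As String = "(416)"
        ' "(416) 328-7493" is 14 characters long (assumed fixed layout).
        Const PhoneLength As Integer = 14

        ' IndexOf returns the position of the first occurrence, or -1 if none.
        Dim startIndex As Integer = source.IndexOf(areaCode)
        If startIndex >= 0 AndAlso startIndex + PhoneLength <= source.Length Then
            Console.WriteLine(source.Substring(startIndex, PhoneLength)) ' prints (416) 328-7493
        Else
            Console.WriteLine("Area code not found")
        End If
    End Sub
End Module
```

This only works if you already know the area code and the site always formats the number the same way; if either can vary, a fixed-length Substring will grab the wrong text.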
Re: [2008] searching inside a string
Quote:
Originally Posted by stanav
You have an HtmlDocument, so can you call GetElementById or GetElementsByTagName and search through the HtmlElementCollection to find what you're after?
That's what I already suggested, isn't it? He says he just wants to search for a telephone number in the HTML/PHP code, so using mystring.IndexOf would be fine, wouldn't it?
Re: [2008] searching inside a string
OK then, I will try all of those ideas. Thanks again.
Re: [2008] searching inside a string
Quote:
Originally Posted by bagstoper
None of these are what I need, so I will explain a little more. The page I am getting my source from is whitepages.com. You submit a form, which I automate (based on input from a database), and once the page loads I use webbrowser1.documenttext to get the source. Then I want to search that source, not for the HTML tags (those were just an example in the first post), but for the phone number, so it would be something like (416) 328-7493. There might be multiple results for a search, so I would need to get the first phone number in the results (most likely the first one that appears in the code). Also, it is a multi-line string, so I don't know if that makes any difference. But thanks for your help so far, and I hope I get some more help and an answer :)
Well, since you know the phone number format, and you also know that the first result is likely the one you need, I think a regular expression is the best approach.
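Something along these lines might do it. This is only a sketch: the pattern assumes every number looks like (416) 328-7493, and FirstPhoneNumber and the sample text are invented for illustration.

```vbnet
Imports System
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module PhoneRegexSketch
    ' Assumed format: (416) 328-7493, with optional whitespace after the
    ' closing parenthesis. Numbers written any other way will not match.
    Private ReadOnly PhonePattern As New Regex("\(\d{3}\)\s*\d{3}-\d{4}")

    ' Hypothetical helper: returns the first phone number found in source,
    ' or an empty string if there is none.
    Function FirstPhoneNumber(ByVal source As String) As String
        Dim found As Match = PhonePattern.Match(source)
        If found.Success Then
            Return found.Value
        End If
        Return String.Empty
    End Function

    Sub Main()
        ' Invented sample standing in for the downloaded page source.
        Dim source As String = "<li>First result (416) 328-7493</li>" & vbCrLf & _
                               "<li>Second result (905) 555-0142</li>"
        Console.WriteLine(FirstPhoneNumber(source)) ' prints (416) 328-7493
    End Sub
End Module
```

Regex.Match scans from the start of the string and stops at the first hit, so you get the first number in the source without needing to know the area code in advance.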
Re: [2008] searching inside a string
Quote:
Originally Posted by MaximilianMayrhofer
vb Code:
SomeString = YourString.SubString(YourString.IndexOf("<body>"), 10)
That finds the index of the first occurrence of the text you're looking for and takes 10 characters starting at that index, so the matched text itself is included. Use a bigger or smaller number to take more or less.
Turns out this worked. I tried it and it worked with a little tweaking. So thank you, this post has made my life a lot easier, and sorry for not trying it sooner.
Again, thanks, this is great :)