Jul 15th, 2006, 06:38 AM
#2
Re: How to extract the URLs, phone numbers, fax numbers, and meta tags from web pages
Every web page is different, so there is no one-size-fits-all extractor. First collect a set of sample pages that contain the information you want. Go through their source and learn how they are structured, then design a string-parsing algorithm that walks the page source and pulls out the fields you need.
For the parsing itself you can use simple string searches, regular expressions, or a combination of both. Since you will know what the page sources look like by then, you are the one who has to decide which approach fits.
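As a rough illustration of the combined approach, here is a minimal Python sketch. It is only one possible way to do it, and it rests on assumptions that are not from your pages: the sample HTML is made up, the names `PageScanner`, `LABELED_RE`, and `extract` are hypothetical, and the numbers are assumed to be US-style and preceded by a "Phone:" or "Fax:" label. For the links and meta tags it uses the standard library's `html.parser` rather than raw string searches, and it uses a regex for the numbers.

```python
import re
from html.parser import HTMLParser


class PageScanner(HTMLParser):
    """Collects <a href> targets and named <meta> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.urls = []
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.urls.append(attrs["href"])
        elif tag == "meta" and attrs.get("name"):
            self.meta[attrs["name"].lower()] = attrs.get("content") or ""


# Assumption: numbers look like (555) 123-4567 or 555-123-4567 and follow
# a "Phone:" or "Fax:" label. Adjust this after studying your sample pages.
LABELED_RE = re.compile(
    r"(Phone|Fax)\s*:\s*(\(?\d{3}\)?[-. ]\d{3}[-. ]\d{4})",
    re.IGNORECASE,
)


def extract(html):
    """Return (urls, meta tags, labeled numbers) found in the page source."""
    scanner = PageScanner()
    scanner.feed(html)
    numbers = {"phone": [], "fax": []}
    for label, number in LABELED_RE.findall(html):
        numbers[label.lower()].append(number)
    return scanner.urls, scanner.meta, numbers


# Made-up sample page, standing in for one of your real samples.
SAMPLE = """<html><head>
<meta name="description" content="Acme widgets">
<meta name="keywords" content="widgets, acme">
</head><body>
<a href="http://example.com/contact">Contact</a>
<p>Phone: (555) 123-4567<br>Fax: 555-987-6543</p>
</body></html>"""

if __name__ == "__main__":
    urls, meta, numbers = extract(SAMPLE)
    print(urls)
    print(meta)
    print(numbers)
```

The regex is the part most likely to need changing. Real pages label and format numbers in many ways, which is why looking at your sample pages first matters.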