-
Aug 17th, 2000, 09:56 AM
#1
Here's the situation: I'm taking web pages and extracting the data out of them. I need to take the text enclosed by the HTML tags and write it to another file. I know how to read the pages and write to the new file; I just can't figure out how to get only the part of the string enclosed by the tags. For example:
For this line: <TD WIDTH="20"><B><U>1</U></B></TD>
I need to get just the number 1.
For this line: <TD ALIGN="RIGHT" WIDTH="60">Half Ounce</TD>
I need to get just "Half Ounce".
Does anyone know how to accomplish this?
-
Aug 17th, 2000, 10:09 AM
#2
transcendental analytic
Try this out. Open your output file in binary mode and change #1 to the file number you want to use. The variable text is assumed to hold the contents of the page.
Code:
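Dim X As Long, Y As Long
Dim piece As String                   ' Put needs a variable, not an expression
X = InStr(text, "<")                  ' start of the first tag
Do While X > 0
    X = InStr(X + 1, text, ">")       ' end of the current tag
    If X = 0 Then Exit Do             ' unterminated tag: nothing more to extract
    Y = InStr(X + 1, text, "<")       ' start of the next tag, 0 if there is none
    If Y = 0 Then
        piece = Mid$(text, X + 1)     ' everything after the last tag
    Else
        piece = Mid$(text, X + 1, Y - X - 1)
    End If
    If Len(piece) > 0 Then            ' skip the empty gaps between adjacent tags
        piece = piece & vbCrLf        ' one extracted piece per line
        Put #1, , piece
    End If
    X = Y                             ' continue from the next tag, or stop when Y = 0
Loop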
writing software in C++ is like driving rivets into steel beam with a toothpick.
writing haskell makes your life easier:
reverse (p (6*9)) where p x|x==0=""|True=chr (48+z): p y where (y,z)=divMod x 13
To throw away OOP for low level languages is myopia, to keep OOP is hyperopia. To throw away OOP for a high level language is insight.
-
Aug 17th, 2000, 10:13 AM
#3
Junior Member
Use the Microsoft HTML Object Library (MSHTML.TLB). Then you don't need to extract everything manually; you get a wonderful object model to work with.
Frank
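For anyone who wants to see roughly what the MSHTML route looks like, here is a minimal sketch. It is not a tested solution: it assumes Internet Explorer's MSHTML is installed, it creates the document late-bound through the "htmlfile" ProgID so no project reference is needed, and GetCellTexts is a made-up helper name, not something MSHTML provides. It also assumes the TD cells sit inside a TABLE, as they would in a full page, since the parser may discard stray TD tags.

```vb
' Sketch only: late-bound MSHTML, assuming IE's MSHTML is installed.
' GetCellTexts is a hypothetical helper name, not part of MSHTML.
Public Function GetCellTexts(ByVal html As String) As Collection
    Dim doc As Object        ' MSHTML document created through its ProgID
    Dim cells As Object      ' collection of the TD elements in the page
    Dim i As Long
    Dim result As New Collection

    Set doc = CreateObject("htmlfile")
    doc.write html           ' let MSHTML parse the markup
    doc.Close

    Set cells = doc.getElementsByTagName("TD")
    For i = 0 To cells.Length - 1
        ' innerText is the cell's text with any nested tags left out
        result.Add cells.Item(i).innerText
    Next i

    Set GetCellTexts = result
End Function
```

You would pass in the whole page as one string and then loop over the returned Collection with For Each, writing each item to the output file. With the two example lines from the first post wrapped in a TABLE and TR, the items should come back as the bare cell texts, but check that against your real pages.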