-
Here's the situation. I'm taking web pages and extracting the data from them. I need to take the text enclosed by the HTML tags and write it to another file. I know how to read the page and write the new file; I just can't figure out how to get only the part of the string that sits between the tags. For example:
For this line...<TD WIDTH="20"><B><U>1</U></B></TD>
I'll need to get just the number 1
For this line...<TD ALIGN="RIGHT" WIDTH="60">Half Ounce</TD>
I'll need to get just "Half Ounce"
Does anyone know how to accomplish this?
-
Try this out. You have to open your output file in Binary mode and change #1 to the file number you want to use. The page source is assumed to be in a string variable named text.
Code:
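Dim X As Long, Y As Long, piece As String
X = InStr(text, "<")                  ' start of the first tag
Do While X > 0
    X = InStr(X + 1, text, ">")       ' end of the current tag
    If X = 0 Then Exit Do             ' unterminated tag: nothing more to read
    Y = InStr(X + 1, text, "<")       ' start of the next tag
    If Y = 0 Then
        piece = Mid$(text, X + 1)     ' no more tags: take the rest of the string
    Else
        piece = Mid$(text, X + 1, Y - X - 1)
    End If
    If Len(piece) > 0 Then Put #1, , piece   ' skip empty gaps between adjacent tags
    X = Y                             ' continue from the next tag (0 ends the loop)
Loop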
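In case it helps, here is a rough sketch of the file handling around that loop. The paths and the inFile variable are made up for illustration, so swap in your own. It assumes the whole page fits in one string.

```vb
' Hypothetical paths; replace with your own files.
Dim inFile As Integer
Dim text As String

inFile = FreeFile
Open "C:\pages\input.htm" For Binary Access Read As #inFile
text = Space$(LOF(inFile))    ' size the buffer to the file length
Get #inFile, , text           ' read the whole page into text
Close #inFile

Open "C:\pages\output.txt" For Binary Access Write As #1
' ... run the extraction loop here ...
Close #1
```

One thing to watch: opening an existing file For Binary does not truncate it, so if the output file already exists and is longer than what you write, the old tail stays behind. Delete it first (Kill) if that matters. Also, Put writes the pieces back to back with no separator, so add a vbCrLf to each piece if you want one value per line.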
-
Use the Microsoft HTML Object Library (MSHTML.TLB) so you don't need to extract everything manually. It gives you a full object model to work with.
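A minimal sketch of that approach, assuming the page source is already in a string (the html variable here is hypothetical) and that you have added the library under Project > References. It is untested against your pages, so treat it as a starting point rather than a finished solution.

```vb
' Requires a reference to "Microsoft HTML Object Library" (MSHTML.TLB).
' html is a hypothetical String variable holding the page source.
Dim doc As MSHTML.HTMLDocument
Dim cell As Object

Set doc = New MSHTML.HTMLDocument
doc.body.innerHTML = html                 ' let MSHTML parse the markup

' Walk every table cell and print only its text, without the tags.
For Each cell In doc.getElementsByTagName("TD")
    Debug.Print cell.innerText
Next
```

The parser handles nested tags such as the B and U inside a TD, so innerText should give you just the visible text of each cell. Replace Debug.Print with your own file-writing code.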