Thread: Help extracting dynamic data from HTML code

  1. #1

    Thread Starter
    Frenzied Member
    Join Date
    Apr 2005
    Posts
    1,907

    Help extracting dynamic data from HTML code

    Hi all. I have an HTML page that I download with Inet.OpenURL and put into a textbox. Now I want to extract the artist name, album name, song names and artist picture from it. The code below already extracts the song IDs, but I want to output the song name and the other information along with each ID. I would be grateful if an expert could show me an easy way to extract this data. Thanks.

    Note: each page holds one album by a single artist, but multiple song names.
    Note: the bold parts (marked [B]...[/B]) are dynamic, and they are what I want to extract.

    These blocks of HTML sit among other code that I removed from my post.
    HTML part that holds each song name:

    Code:
    <img border="0" src="../images/download.gif" width="16" height="16" longdesc="Download [B]songname[/B]" alt="Download [B]songname[/B]"></a>
                                &nbsp;[B]songname[/B]
                                </td>

    HTML part that holds the artist name and album name:

    Code:
    ::: Singer: <b>[B]artistname[/B]</b> Album <b>[B]albumname[/B]</b>::::</font></td>

    HTML part that holds the artist image:
    Code:
    <td>
                    <br>
                    <a href="http://localhost/ShowImage.asp?img=http://localhost/[B]artistpic.jpg[/B]" target=_blank>
                    <img border="0" src="../CdImages/artistpic.jpg" width="180" height="180" longdesc="Click here to Enlarge" alt="Click here to Enlarge" >
                    </a>
                    </td>


    VB Code:
    Private Sub Command1_Click(Index As Integer)
        Select Case Index
            Case 0
                ' Download the page source into the RichTextBox
                If txtURL.Text <> "" Then
                    RichTextBox1.Text = Inet1.OpenURL(txtURL.Text, icString)
                End If
            Case 1
                End
        End Select
    End Sub

    Private Sub Command2_Click()
        Dim sResult() As String, n As Long

        ' Collect every chunk that starts at the player link and stops before "')"
        If GetLine(RichTextBox1.Text, "../player/player.asp?id=", "')", sResult) Then
            For n = LBound(sResult) To UBound(sResult)
                List1.AddItem sResult(n)
            Next n
        Else
            ' No occurrences were found
        End If
    End Sub

    ' Returns True and fills sArr with every substring that begins at sStart
    ' (the start marker itself is included) and runs up to the next sEnd.
    Private Function GetLine(ByVal sText As String, ByVal sStart As String, ByVal sEnd As String, ByRef sArr() As String) As Boolean
        Dim lPos As Long, lEnd As Long, lCount As Long, sTemp() As String

        ReDim sTemp(100)

        lPos = InStr(1, sText, sStart, vbTextCompare)
        Do While lPos > 0
            lEnd = InStr(lPos, sText, sEnd, vbTextCompare)
            If lEnd > 0 Then
                sTemp(lCount) = Mid$(sText, lPos, lEnd - lPos)
                lPos = InStr(lEnd, sText, sStart, vbTextCompare)
            Else
                ' No end marker left: take the rest of the text and stop
                sTemp(lCount) = Mid$(sText, lPos)
                lPos = 0
            End If
            lCount = lCount + 1
            ' Grow the buffer when it fills up
            If lCount > UBound(sTemp) Then ReDim Preserve sTemp(100 + lCount)
        Loop

        If lCount > 0 Then
            ReDim Preserve sTemp(lCount - 1)
            sArr = sTemp
        End If
        GetLine = (lCount > 0)
    End Function
    Last edited by tony007; May 3rd, 2006 at 03:01 AM.

  2. #2
    VB6, XHTML & CSS hobbyist Merri's Avatar
    Join Date
    Oct 2002
    Location
    Finland
    Posts
    6,654

    Re: Help extracting dynamic data from HTML code

    See this thread about parsing for how to make data easy to parse.

  3. #3

    Thread Starter
    Frenzied Member
    Join Date
    Apr 2005
    Posts
    1,907

    Re: Help extracting dynamic data from HTML code

    Quote Originally Posted by Merri
    See this thread about parsing for how to make data easy to parse.
    Many thanks for your reply. Unfortunately, I do not know how to use Replace for my own kind of data!

    VB Code:
    Text = Replace$(Text, " width=" & ChrW$(34) & " 8%" & ChrW$(34), vbNullString)
    Text = Replace$(Text, " align=" & ChrW$(34) & "right" & ChrW$(34), vbNullString)

    The sample data in my first post is strange and hard to parse. I have never used the Replace function and I do not know how to construct the Replace criteria, so I would be happy if you could show me how it applies to my kind of data.

    Furthermore, the HTML blocks I posted are only part of the page. Some of those blocks repeat many times with different data, and some appear only once on the page I want to extract data from. So I wonder whether this method works at all. How many times would I have to call Replace for the whole page, given how many HTML parts it has? Don't you think I might, by mistake, replace something that I want to retrieve? Thanks
    Last edited by tony007; May 3rd, 2006 at 10:28 AM.

  4. #4
    VB6, XHTML & CSS hobbyist Merri's Avatar
    Join Date
    Oct 2002
    Location
    Finland
    Posts
    6,654

    Re: Help extracting dynamic data from HTML code

    In that case your job is somewhat easier. You just need to locate something unique in the HTML that won't appear in the data you process, and then jump from there to what you need. For example, to pull out the song name:

    VB Code:
    Dim lngPos As Long, lngEndPos As Long, strHTML As String

    strHTML = RichTextBox1.Text

    ' find unique location before song name
    lngPos = InStr(strHTML, "longdesc=" & ChrW$(34) & "Download ")

    ' find some data right before the song name (6 = length of "&nbsp;")
    lngPos = InStr(lngPos, strHTML, "&nbsp;") + 6

    ' find some data right after the song name
    lngEndPos = InStr(lngPos, strHTML, "</td>")

    ' show the result
    MsgBox "Song name is '" & Mid$(strHTML, lngPos, lngEndPos - lngPos) & "'"

    There might be some quirk in that as I didn't try it myself, but that's the basics. Identify something unique, search for that, then search for something right before the data and add up the length of that data so you get the beginning of the data you want and then also get the position where the data ends, so you can rip off the data you want with Mid$.

    You might want to add error checking as well in case you get a page that doesn't follow the same logic as most of the pages.
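
The error checking mentioned above might look something like the sketch below. It is not code from the thread: `TextBetween` and `ShowFirstSong` are hypothetical names, and the marker strings are taken from the HTML sample in the first post. The point is that InStr returns 0 when nothing is found, and passing 0 back in as the start position raises a run-time error, so every search result is checked before it is used.

```vb
' Hypothetical helper: returns the text between strBefore and strAfter,
' searching from lngStart. On success lngStart is moved past the match;
' on failure lngStart is set to 0 and an empty string is returned.
Private Function TextBetween(ByVal strHTML As String, ByVal strBefore As String, _
                             ByVal strAfter As String, ByRef lngStart As Long) As String
    Dim lngPos As Long, lngEndPos As Long

    ' InStr raises an error for a start position below 1
    If lngStart < 1 Then lngStart = 1

    lngPos = InStr(lngStart, strHTML, strBefore, vbTextCompare)
    If lngPos = 0 Then
        lngStart = 0
        Exit Function
    End If

    lngPos = lngPos + Len(strBefore)
    lngEndPos = InStr(lngPos, strHTML, strAfter, vbTextCompare)
    If lngEndPos = 0 Then
        lngStart = 0
        Exit Function
    End If

    TextBetween = Mid$(strHTML, lngPos, lngEndPos - lngPos)
    lngStart = lngEndPos + Len(strAfter)
End Function

' Hypothetical usage: show the first song name, or a message if the
' page does not follow the expected layout.
Private Sub ShowFirstSong(ByVal strHTML As String)
    Dim lngAt As Long, strName As String

    lngAt = InStr(1, strHTML, "longdesc=" & ChrW$(34) & "Download ")
    If lngAt > 0 Then strName = TextBetween(strHTML, "&nbsp;", "</td>", lngAt)

    If lngAt > 0 Then
        ' Drop the line break and padding that sit before </td> in the sample
        strName = Trim$(Replace$(Replace$(strName, vbCr, ""), vbLf, ""))
        MsgBox "Song name is '" & strName & "'"
    Else
        MsgBox "No song name found"
    End If
End Sub
```

Because `lngStart` is passed ByRef and advanced past each match, the same helper can be called repeatedly to walk through a page.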

  5. #5

    Thread Starter
    Frenzied Member
    Join Date
    Apr 2005
    Posts
    1,907

    Re: Help extracting dynamic data from HTML code

    Thanks, I tried that and it worked, but only for the first song name. How can I make it go on and find the rest of the song names? Furthermore, how do I search for several things at once? For example, I want to collect each song ID with its song name, plus the artist name, album name and artist picture (all shown in bold), and print them out in a ListView:

    Note: the song name block repeats more than once, with a different song name for each song ID.
    Code:
    <img border="0" src="../images/download.gif" width="16" height="16" longdesc="Download [B]songname[/B]" alt="Download [B]songname[/B]"></a>
                                &nbsp;[B]songname[/B]
                                </td>
                                <td><a href="javascript:newWindow('../../RateSong.asp?SongID=[B]15719[/B]')">

    Code:
    ::: Singer: <b>[B]artistname[/B]</b> Album <b>[B]albumname[/B]</b>::::</font></td>

    Code:
    <td>
                    <br>
                    <a href="http://localhost/ShowImage.asp?img=http://localhost/[B]artistpic.jpg[/B]" target=_blank>
                    <img border="0" src="../CdImages/artistpic.jpg" width="180" height="180" longdesc="Click here to Enlarge" alt="Click here to Enlarge" >
                    </a>
                    </td>
    Thanks
    Last edited by tony007; May 3rd, 2006 at 12:04 PM.

  6. #6
    VB6, XHTML & CSS hobbyist Merri's Avatar
    Join Date
    Oct 2002
    Location
    Finland
    Posts
    6,654

    Re: Help extracting dynamic data from HTML code

    You can just keep looping until the unique code can't be found anymore. Just check whether lngPos is smaller than 1 (= error condition), and then you know there are no more songs. You can find a song ID by searching for SongID= after you have found the song name, and then doing practically the same as was done with the song name. I'm not writing more code, because what I've already explained here is pretty much enough to know how to parse the data.

    How you store the data is a different matter. I don't know what format you need it in.
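
A rough sketch of the loop described above, for illustration only. `ListSongs` and `lstSongs` are made-up names; the markers (`longdesc="Download `, `&nbsp;`, `</td>`, `SongID=`) come from the HTML posted earlier in the thread. It also assumes the first `SongID=` after each song name is the RateSong link, whose ID ends at a single quote. A real page may need different markers.

```vb
' Sketch only: adds "SongID <tab> song name" to a ListBox for every song
' found in the page source. Stops as soon as a marker can't be found.
Private Sub ListSongs(ByVal strHTML As String, lstSongs As ListBox)
    Dim lngPos As Long, lngEndPos As Long
    Dim strName As String, strID As String, strMarker As String

    strMarker = "longdesc=" & ChrW$(34) & "Download "
    lngPos = InStr(1, strHTML, strMarker)

    Do While lngPos > 0
        ' The song name sits between "&nbsp;" and "</td>"
        lngPos = InStr(lngPos, strHTML, "&nbsp;")
        If lngPos = 0 Then Exit Do
        lngPos = lngPos + Len("&nbsp;")
        lngEndPos = InStr(lngPos, strHTML, "</td>")
        If lngEndPos = 0 Then Exit Do
        strName = Mid$(strHTML, lngPos, lngEndPos - lngPos)
        strName = Trim$(Replace$(Replace$(strName, vbCr, ""), vbLf, ""))

        ' The song ID sits between "SongID=" and the closing single quote
        lngPos = InStr(lngEndPos, strHTML, "SongID=")
        If lngPos = 0 Then Exit Do
        lngPos = lngPos + Len("SongID=")
        lngEndPos = InStr(lngPos, strHTML, "'")
        If lngEndPos = 0 Then Exit Do
        strID = Mid$(strHTML, lngPos, lngEndPos - lngPos)

        lstSongs.AddItem strID & vbTab & strName

        ' Look for the next song; a result of 0 ends the loop
        lngPos = InStr(lngEndPos, strHTML, strMarker)
    Loop
End Sub
```

With the controls from the first post, it could be called as `ListSongs RichTextBox1.Text, List1`.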

  7. #7

    Thread Starter
    Frenzied Member
    Join Date
    Apr 2005
    Posts
    1,907

    Re: Help extracting dynamic data from HTML code

    Quote Originally Posted by Merri
    You can just keep looping until the unique code can't be found anymore. Just check whether lngPos is smaller than 1 (= error condition), and then you know there are no more songs. You can find a song ID by searching for SongID= after you have found the song name, and then doing practically the same as was done with the song name. I'm not writing more code, because what I've already explained here is pretty much enough to know how to parse the data.

    How you store the data is a different matter. I don't know what format you need it in.
    How do I loop?

  8. #8
    VB6, XHTML & CSS hobbyist Merri's Avatar
    Join Date
    Oct 2002
    Location
    Finland
    Posts
    6,654

    Re: Help extracting dynamic data from HTML code

    VB Code:
    Do While CONDITIONISTRUE
        ' code here
    Loop

    For example:

    VB Code:
    Dim lngA As Long

    ' shows 1, 2, 3, 4 and 5, then stops
    Do While lngA < 5
        lngA = lngA + 1
        MsgBox lngA
    Loop


    If your code gets stuck in a never-ending loop, try pressing Ctrl + Pause/Break.

  9. #9

    Thread Starter
    Frenzied Member
    Join Date
    Apr 2005
    Posts
    1,907

    Re: Help extracting dynamic data from HTML code

    I give up. Your method left me more confused instead of solving my problem! It does not work for the type of data I have. First you suggested Replace, then you came up with new code... so I hope someone else can help me here.

  10. #10
    eltiT resU motsuC Static's Avatar
    Join Date
    Oct 2000
    Location
    Rochester, NY
    Posts
    9,390

    Re: Help extracting dynamic data from HTML code

    This looks familiar... didn't you post another thread with the same style of question? I had you use the WebBrowser control to parse the data...
    JPnyc rocks!! (Just ask him!)
    If u have your answer please go to the thread tools and click "Mark Thread Resolved"

  11. #11

    Thread Starter
    Frenzied Member
    Join Date
    Apr 2005
    Posts
    1,907

    Re: Help extracting dynamic data from HTML code

    Quote Originally Posted by Static
    This looks familiar... didn't you post another thread with the same style of question? I had you use the WebBrowser control to parse the data...
    Static, I do a lot of raw data extraction from HTML, but unfortunately there is no one standard method for it. So yes, I have asked a few questions similar to this one, but with different data patterns. I have looked everywhere and did not find any standard way to deal with raw data from HTML. One method works for one pattern but does not work for other patterns. Now I cannot proceed with this and I am totally lost!

  12. #12
    eltiT resU motsuC Static's Avatar
    Join Date
    Oct 2000
    Location
    Rochester, NY
    Posts
    9,390

    Re: Help extracting dynamic data from HTML code

    I have found that working with and parsing HTML is best done with a WebBrowser control and the HTML Object Library...

    You can loop through tables, images, links, etc., and even change the appearance of things.
    Perfect example...

    I wrote a browser that does things like this on the fly.
    See images below:

    1) highlights new threads green, threads I've started red, threads I've replied to blue... (can even hide resolved threads)
    2) it grabs the forum sections and puts them in a menu
    3) grabs who's viewing the section or thread you are in...

    All done with the HTML Object Library...
    Attached Images

  13. #13

    Thread Starter
    Frenzied Member
    Join Date
    Apr 2005
    Posts
    1,907

    Re: Help extracting dynamic data from HTML code

    Static, thank you for your nice reply. I really do not understand what you mean when you say:

    parsing HTML is best done with a webbrowser control and the HTML Object library...
    I have not used the WebBrowser control for more than displaying a simple web page such as www.cnn.com! What I want at this point is to extract that data from the HTML and display it in a ListView or something similar, so that I can put it in a database. Maybe in the future I will use the WebBrowser to show it in a different format, the way you did. As I mentioned, I will be doing a lot of HTML data extraction, so it would be kind of you to show me how to use the method you mentioned to extract the song ID, song name, artist name, album name and artist picture and put them in a ListView. Thank you, and I look forward to your reply.

  14. #14
    eltiT resU motsuC Static's Avatar
    Join Date
    Oct 2000
    Location
    Rochester, NY
    Posts
    9,390

    Re: Help extracting dynamic data from HTML code

    Can you either give me the URL or upload (attach) the HTML source?

  15. #15

    Thread Starter
    Frenzied Member
    Join Date
    Apr 2005
    Posts
    1,907

    Re: Help extracting dynamic data from HTML code

    Quote Originally Posted by Static
    Can you either give me the URL or upload (attach) the HTML source?
    Thanks for your reply. I am attaching the HTML as a zip file, since the site does not allow pasting large HTML code or uploading an HTML file.

    This is a simplified version of the actual page on the web server. There is a lot of JavaScript, frames and so on around it, but the HTML file shows the main part I am interested in. I made it short and readable, and the parts I care about are all in it, so I hope you will be able to help me extract the data. Furthermore, some of the data, such as the song ID, is not visible on the page and sits inside the HTML. The number of songs varies from page to page, so the content is dynamic. Thanks
    Note: the page is on a web server, not an offline copy, so I am extracting data from live content rather than from offline HTML pages.
    Attached Files
    Last edited by tony007; May 4th, 2006 at 08:47 AM.

  16. #16
    eltiT resU motsuC Static's Avatar
    Join Date
    Oct 2000
    Location
    Rochester, NY
    Posts
    9,390

    Re: Help extracting dynamic data from HTML code

    OK... that's not bad at all (still working on the artist pic; there isn't one in the web page, so...).

    Anyway, all the info (except the pic) is in ONE href: the WriteLyrics link.

    So here it is so far:
    Attached Files

  17. #17

    Thread Starter
    Frenzied Member
    Join Date
    Apr 2005
    Posts
    1,907

    Re: Help extracting dynamic data from HTML code

    Quote Originally Posted by Static
    OK... that's not bad at all (still working on the artist pic; there isn't one in the web page, so...).

    Anyway, all the info (except the pic) is in ONE href: the WriteLyrics link.

    So here it is so far:

    Static, many, many thanks to you. Thanks for introducing me to this method.

    One thing: may I know why you could not catch the name of the artist pic, which ends with .jpg? I just want to put artistpic.jpg in the ListView too, so I would be happy if you could show me how that can be done.

    Furthermore, when I try it on the live content on the internet, I get the data but I also get an error. Could you tell me how to get rid of it?

    The error points to this line:

    VB Code:
    Set LI = ListView1.ListItems.Add(, "ID:" & Replace(TMP(1), "&Singer", ""), Replace(TMP(1), "&Singer", ""))

    I really need to learn this method. Could you point me to some tutorials, and also explain how to construct this if I want to get data from a different type of page? I would be happy if you could show me how to make one for the attached HTML. It is simpler, and I need the song ID, song name, artist name and album name from it. Thanks
    Attached Files
    Last edited by tony007; May 4th, 2006 at 10:11 AM.

  18. #18
    eltiT resU motsuC Static's Avatar
    Join Date
    Oct 2000
    Location
    Rochester, NY
    Posts
    9,390

    Re: Help extracting dynamic data from HTML code

    I never found any good tutorials... all my work has been trial and error, and digging through MSDN for samples...

    That error means it's trying to add the same song ID twice. Is the song ID listed more than once in the page (or on multiple pages)? You will need to test for the existence of the list item before adding it...


    As far as the pic... well, can you give me a real copy of the HTML, with images etc., for me to test? (PM it to me.) The best would be a live version for me to work on, but I know that's not always possible.

  19. #19
    eltiT resU motsuC Static's Avatar
    Join Date
    Oct 2000
    Location
    Rochester, NY
    Posts
    9,390

    Re: Help extracting dynamic data from HTML code

    The pic WILL be a little harder. First, it's large, so it might look funny in the ListView.
    Second, it's "after" the song list, so you need to loop back through the songs to add it...
    Hmm, I will have to think about that one. Would you want the pic made smaller?
    Or how about clicking on the list item and having the pic display in a PictureBox?



    Some more questions: is this data on your PC, meaning you have direct access to the images? Or are you pulling this off the web (which means the pic must be downloaded)?

  20. #20

    Thread Starter
    Frenzied Member
    Join Date
    Apr 2005
    Posts
    1,907

    Re: Help extracting dynamic data from HTML code

    Quote Originally Posted by Static
    The pic WILL be a little harder. First, it's large, so it might look funny in the ListView.
    Second, it's "after" the song list, so you need to loop back through the songs to add it...
    Hmm, I will have to think about that one. Would you want the pic made smaller?
    Or how about clicking on the list item and having the pic display in a PictureBox?



    Some more questions: is this data on your PC, meaning you have direct access to the images? Or are you pulling this off the web (which means the pic must be downloaded)?

    Many thanks for helping me. Actually, I just want the name of the pic (for example CdImages/artistpic.jpg) printed in another ListBox so I can reference it later, and the pic itself displayed once in a PictureBox, since the pic appears only once in the whole page and belongs to the single album.

    The pic is on a remote server, so it would be nice to download it somehow on load and save it to the hard disk.

    As for the error: there is no song repetition there, I checked! It does pull all the data, but that error still comes up.

    Did you have a chance to look at the second HTML page I sent you? Thanks
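
One possible way to get the picture name and save the picture, sketched under assumptions: the page contains the `ShowImage.asp?img=` link shown in the first post, the form has the `Inet1` control from that post, and the target folder exists and is given without a trailing backslash. `GetArtistPicUrl`, `FileNameFromUrl` and `SaveArtistPic` are hypothetical names, and there is no handling for failed downloads.

```vb
' Returns the picture URL that follows "ShowImage.asp?img=" in the page
' source, or an empty string if the link is not there.
Private Function GetArtistPicUrl(ByVal strHTML As String) As String
    Const MARKER As String = "ShowImage.asp?img="
    Dim lngPos As Long, lngEndPos As Long

    lngPos = InStr(1, strHTML, MARKER, vbTextCompare)
    If lngPos = 0 Then Exit Function
    lngPos = lngPos + Len(MARKER)

    ' The URL ends at the closing double quote of the href attribute
    lngEndPos = InStr(lngPos, strHTML, ChrW$(34))
    If lngEndPos = 0 Then Exit Function

    GetArtistPicUrl = Mid$(strHTML, lngPos, lngEndPos - lngPos)
End Function

' Returns everything after the last "/" (the whole string if there is none).
Private Function FileNameFromUrl(ByVal strUrl As String) As String
    FileNameFromUrl = Mid$(strUrl, InStrRev(strUrl, "/") + 1)
End Function

' Downloads the picture as raw bytes and writes it to strFolder.
Private Sub SaveArtistPic(ByVal strUrl As String, ByVal strFolder As String)
    Dim bytPic() As Byte, intFile As Integer, strFile As String

    bytPic = Inet1.OpenURL(strUrl, icByteArray)

    strFile = strFolder & "\" & FileNameFromUrl(strUrl)
    ' Binary mode does not truncate, so remove any older copy first
    If Dir$(strFile) <> "" Then Kill strFile

    intFile = FreeFile
    Open strFile For Binary Access Write As #intFile
    Put #intFile, , bytPic
    Close #intFile
End Sub
```

The name returned by `FileNameFromUrl` can go straight into a second ListBox with `AddItem`, and the saved file can be shown with `LoadPicture` in a PictureBox.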
    Last edited by tony007; May 4th, 2006 at 10:30 AM.

  21. #21
    eltiT resU motsuC Static's Avatar
    Join Date
    Oct 2000
    Location
    Rochester, NY
    Posts
    9,390

    Re: Help extracting dynamic data from HTML code

    Getting the name of the pic into another ListBox should be easy... How did you send the other page? PM? Email?

  22. #22

    Thread Starter
    Frenzied Member
    Join Date
    Apr 2005
    Posts
    1,907

    Re: Help extracting dynamic data from HTML code

    Quote Originally Posted by Static
    Getting the name of the pic into another ListBox should be easy... How did you send the other page? PM? Email?
    Oh, I attached it in one of my previous posts. I will attach it again.

    Regarding the error: if you look at the HTML in the first zip file I uploaded, the song ID value is repeated a few times for each song, but in different patterns, for example once for emailing the song and once for rating it. As you can see, all of them refer to the same song, so I just want it to show once in the ListView. It is strange that it worked without error on the local file, but on the remote site it displays the data and then gives that error!

    Oh, one more thing: each song has more rows than in the example I sent you. Do you think that causes the error? Those rows are dynamic and their number is not always the same. Thanks



    I tried your code on the following HTML, but it did not work:
    Note: I also uploaded the HTML page as a zip attachment.

    Code:
    <html>
    
    <table width="400" border="1" cellspacing="0" bordercolor="#9999FF">
      <tr>
        <td colspan="5">artistname Album albumname</td>
      </tr>
      <tr>
        <td colspan="5"><div align="left">some discription here<br><b>Release Year :</b> 2000<br>
      </div></td>
      </tr>
      <tr>
        <th width="35" scope="row">#</th>
        <td width="30" align="center">&nbsp;</td>
        <td width="270" align="center">Song</td>
        <td width="30" align="center">Send</td>
        <td width="35" align="center">Rate</td>
      </tr>
      <tr>
        <td align="center" scope="row">1</td>
        <td align="center"><INPUT TYPE="Checkbox" NAME="song_id" ONCLICK="reviewSelection();" VALUE="3940"></td>
        <td><a href="#" class="song_title" onclick="loadPlayer('3940');return false;">songname1
    </a> </td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
      </tr>
      <tr>
        <td align="center" scope="row">2</td>
        <td align="center"><INPUT TYPE="Checkbox" NAME="song_id" ONCLICK="reviewSelection();" VALUE="3941"></td>
        <td><a href="#" class="song_title" onclick="loadPlayer('3941');return false;">songname2
    </a> </td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
      </tr>
      <tr>
        <td align="center" scope="row">3</td>
        <td align="center"><INPUT TYPE="Checkbox" NAME="song_id" ONCLICK="reviewSelection();" VALUE="3942"></td>
        <td><a href="#" class="song_title" onclick="loadPlayer('3942');return false;">songname3</a> </td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
      </tr>
      <tr>
        <td align="center" scope="row">4</td>
        <td align="center"><INPUT TYPE="Checkbox" NAME="song_id" ONCLICK="reviewSelection();" VALUE="3943"></td>
        <td><a href="#" class="song_title" onclick="loadPlayer('3943');return false;">songname4</a> </td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
      </tr>
      <tr>
        <td align="center" scope="row">5</td>
        <td align="center"><INPUT TYPE="Checkbox" NAME="song_id" ONCLICK="reviewSelection();" VALUE="3944"></td>
        <td><a href="#" class="song_title" onclick="loadPlayer('3944');return false;">songname5
    </a> </td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
      </tr>
      <tr>
        <td align="center" scope="row">6</td>
        <td align="center"><INPUT TYPE="Checkbox" NAME="song_id" ONCLICK="reviewSelection();" VALUE="3945"></td>
        <td><a href="#" class="song_title" onclick="loadPlayer('3945');return false;">songname6
    </a> </td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
      </tr>
      <tr>
        <td align="center" scope="row">7</td>
        <td align="center"><INPUT TYPE="Checkbox" NAME="song_id" ONCLICK="reviewSelection();" VALUE="3946"></td>
        <td><a href="#" class="song_title" onclick="loadPlayer('3946');return false;">songname7
    </a> </td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
      </tr>
      <tr>
        <td align="center" scope="row">8</td>
        <td align="center"><INPUT TYPE="Checkbox" NAME="song_id" ONCLICK="reviewSelection();" VALUE="3947"></td>
        <td><a href="#" class="song_title" onclick="loadPlayer('3947');return false;">songname8
    </a> </td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
      </tr>
      <tr>
        <td align="center" scope="row">9</td>
        <td align="center"><INPUT TYPE="Checkbox" NAME="song_id" ONCLICK="reviewSelection();" VALUE="3948"></td>
        <td><a href="#" class="song_title" onclick="loadPlayer('3948');return false;">songname9
    </a> </td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
      </tr>
      <tr>
        <td colspan="5" scope="row" align="center"> 
        <INPUT TYPE="Button" NAME="PlayBtn" VALUE=" Play " ONCLICK="javascript:buildList();">&nbsp;&nbsp; 
        <INPUT TYPE="button" NAME="selectbutton" VALUE="Select All" ONCLICK="javascript:changeAll();"> 
        <input type="hidden" Name="Selection" value="false"> 
      </tr></td>
    
    </table></form>
    
    
    
    </html>
    Attached Files
    Last edited by tony007; May 4th, 2006 at 12:17 PM.
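
For the second page layout posted above, the plain InStr approach from earlier in the thread can be pointed at different markers. The sketch below is an illustration, not Static's WebBrowser code: `ListSongsFromPlayerLinks` and `lstSongs` are made-up names, and it assumes every song link looks like the sample, with the ID inside `loadPlayer('...')` and the name between the next `>` and `</a>`.

```vb
' Sketch only: adds "song ID <tab> song name" to a ListBox for each
' loadPlayer('id') link found in the page source.
Private Sub ListSongsFromPlayerLinks(ByVal strHTML As String, lstSongs As ListBox)
    Const MARKER As String = "loadPlayer('"
    Dim lngPos As Long, lngEndPos As Long
    Dim strID As String, strName As String

    lngPos = InStr(1, strHTML, MARKER, vbTextCompare)

    Do While lngPos > 0
        ' The ID runs from the marker to the closing single quote
        lngPos = lngPos + Len(MARKER)
        lngEndPos = InStr(lngPos, strHTML, "'")
        If lngEndPos = 0 Then Exit Do
        strID = Mid$(strHTML, lngPos, lngEndPos - lngPos)

        ' The name runs from the end of the <a ...> tag to </a>
        lngPos = InStr(lngEndPos, strHTML, ">")
        If lngPos = 0 Then Exit Do
        lngPos = lngPos + 1
        lngEndPos = InStr(lngPos, strHTML, "</a>", vbTextCompare)
        If lngEndPos = 0 Then Exit Do
        strName = Mid$(strHTML, lngPos, lngEndPos - lngPos)

        ' Some names in the sample are followed by a line break
        strName = Trim$(Replace$(Replace$(strName, vbCr, ""), vbLf, ""))

        lstSongs.AddItem strID & vbTab & strName

        lngPos = InStr(lngEndPos, strHTML, MARKER, vbTextCompare)
    Loop
End Sub
```

Since each song has exactly one `loadPlayer` link in the sample, each ID should be listed once, although a live page that also mentions `loadPlayer('` inside a script block would need a more specific marker.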

  23. #23

    Thread Starter
    Frenzied Member
    Join Date
    Apr 2005
    Posts
    1,907

    Re: Help extracting dynamic data from HTML code

    Static, I found that if I remove the following from my HTML (the part after the <body> tag), I do not get that error. Could you somehow modify the code so that it copes with this?
    This code sits at the top of the table and causes the problem. Thanks

    Code:
    <html>
    <head>
    
    </head>
    <body topmargin="0" leftmargin="0" ONLOAD="preloadImages();">
    
    
    <div align="center">
      <center>
      <tablewidth="100%" id="AutoNumber1">
        <tr>
          <td width="100%">
          <table width="100%" cellspacing="0" cellpadding="0" border="0">
          	<tr>
          		<td>
          		 <!-- ImageReady Preload Script (toplogo.psd) -->
    <SCRIPT TYPE="text/javascript">
    <!--
    function newImage(arg) {
    	if (document.images) {
    		rslt = new Image();
    		rslt.src = arg;
    		return rslt;
    	}
    }
    function changeImages() {
    	if (document.images && (preloadFlag == true)) {
    		for (var i=0; i<changeImages.arguments.length; i+=2) {
    			document[changeImages.arguments[i]].src = changeImages.arguments[i+1];
    		}
    	}
    }
    var preloadFlag = false;
    function preloadImages() {
    	if (document.images) {
    		home_over = newImage("../images/home-over.jpg");
    		forum_home_over = newImage("../images/forum-home_over.jpg");
    		forum_over = newImage("../images/forum-over.jpg");
    		forum_download_over = newImage("../images/forum-download_over.jpg");
    		download_forum_over = newImage("../images/download-forum_over.jpg");
    		download_over = newImage("../images/download-over.jpg");
    		download_buy_over = newImage("../images/download-buy_over.jpg");
    		buy_download_over = newImage("../images/buy-download_over.jpg");
    		buy_over = newImage("../images/buy-over.jpg");
    		buy_chat_over = newImage("../images/buy-chat_over.jpg");
    		chat_buy_over = newImage("../images/chat-buy_over.jpg");
    		chat_over = newImage("../images/chat-over.jpg");
    		chat_contact_over = newImage("../images/chat-contact_over.jpg");
    		contact_over = newImage("../images/contact-over.jpg");
    		preloadFlag = true;
    	}
    }
    // -->
    </SCRIPT>
    <!-- ImageReady Slices (toplogo.psd) -->
    <TABLE WIDTH="100%" BORDER=0 CELLPADDING=0 CELLSPACING=0 bgcolor="#000000">
    	<TR>
    		<TD>
    			<IMG SRC="../images/toplogo_01.jpg" WIDTH=131 HEIGHT=120 ALT="">
    		</TD>
    		<TD align="left">
    			<IMG SRC="../images/toplogo_02.jpg" WIDTH=130 HEIGHT=120 ALT="">
    		</TD>
    		<TD>
    			<IMG SRC="../images/toplogo_03.jpg" WIDTH=136 HEIGHT=120 ALT="">
    		</TD>
    		<TD>
    			<IMG SRC="../images/toplogo_04.jpg" WIDTH=135 HEIGHT=120 ALT="">
    		</TD>
    		<TD>
    			<IMG SRC="../images/toplogo_05.jpg" WIDTH=131 HEIGHT=120 ALT="">
    		</TD>
    		<TD>
    			<IMG SRC="../images/toplogo_06.jpg" WIDTH=127 HEIGHT=120 ALT="">
    		</TD>
    		<TD background="../images/extend.jp" align="left" width="100%">
    			<img src="../images/extend.jpg" height="120" width="100%">
    		</TD>
    	</TR>
    
    	<TR bgcolor="#515151">
    		<TD align="center"><a href="../../"><font color="#FFFFFF">
    		<b></font></a></b></TD>
    		<TD align="center">
    		<b><a href="../../Music/"><font color="#FFFFFF">
    		<b>Music</font></a></b></TD>
    		<TD align="center"><b><a href="../../Videos/">
    		<font color="#FFFFFF"><b> Music Videos</font></a></b></TD>
    		<TD align="center">
    		<a href="../../join.asp"><font color="#FFFFFF"><b>Join Us</b></font></a>		
    		</TD>
    		<TD align="center"><a href="../../FeedBack.asp">
    		<font color="#FFFFFF"><b>Contact/Feedback</b></font></a></TD>
    		<TD align="center"><a href="../../Top50.asp"><font color="#FFFFFF"><b>TOP 50</b></font></a>	
    		<font color="#FF0000">
    		<span style="font-style: italic; background-color: #FFFF00">New</span></font>&nbsp;</TD>
    		<TD align="center">
    		</TD>
    	</TR>
    </TABLE>  
          		&nbsp;</td>
          	</tr>
          	<tr>
    			<!-- Main --->
          		<td>
    					<table width="100%">
    						<tr>
    							<td width="15%" valign="top">
    							<table border="0" width="90%" id="table1" cellspacing="0" style="border-style:dashed>
    	<tr>
    		<td>
    		<p align="center"><b><i>::: Search <font color="#FF0000"><i><b>New</b></i></font> :::</i></b>
    		<br>
    		<form method="GET" action="../search.asp" name="Search">
    		<input type="text" name="s" size="20">
    		<input type="submit" value="Search" name="B1">
    		</form>
    		</td>
    	</tr>
    </table>
    Last edited by tony007; May 4th, 2006 at 11:47 AM.

  24. #24
    eltiT resU motsuC Static's Avatar
    Join Date
    Oct 2000
    Location
    Rochester, NY
    Posts
    9,390

    Re: Help extracting dynamic data from HTML code

    This will take care of version 1 (no dupes; added a "check if exists"):

    VB Code:
    ' Needs project references to the Microsoft HTML Object Library
    ' and a ListView (Microsoft Windows Common Controls) on the form.
    Private Sub Form_Load()
        WebBrowser1.Navigate "file:///C:/VB/tony007/htmlpagehere/page.html"
    End Sub

    Private Sub WebBrowser1_DocumentComplete(ByVal pDisp As Object, URL As Variant)
        ' Only parse once the top-level document has finished loading
        If (pDisp Is WebBrowser1.Application) Then
            GetInfo WebBrowser1.Document
        End If
    End Sub

    Private Sub GetInfo(HTML As HTMLDocument)
        Dim HTMLA As HTMLAnchorElement
        Dim HTMLT As HTMLTable
        Dim HTMLR As HTMLTableRow
        Dim HTMLC As HTMLTableCell
        Dim TMP() As String
        Dim sID As String
        Dim LI As ListItem
        For Each HTMLT In HTML.getElementsByTagName("table")
            For Each HTMLR In HTMLT.rows
                For Each HTMLC In HTMLR.cells
                    For Each HTMLA In HTMLC.getElementsByTagName("a")
                        If InStr(HTMLA.href, "WriteLyrics.asp?SongID=") Then
                            ' Split on "=": TMP(1) = ID, TMP(2) = singer, TMP(3) = album, TMP(4) = song
                            TMP = Split(HTMLA.href, "=")
                            sID = Replace(TMP(1), "&Singer", "")
                            ' ListView keys must be unique, so skip IDs already added
                            If Not LIExists(sID) Then
                                Set LI = ListView1.ListItems.Add(, "ID:" & sID, sID)
                                LI.ListSubItems.Add , , TMP(4)
                                LI.ListSubItems.Add , , Replace(TMP(2), "&Album", "")
                                LI.ListSubItems.Add , , Replace(TMP(3), "&Song", "")
                            End If
                        End If
                    Next
                Next
            Next
        Next
    End Sub

    ' Returns True if a list item with key "ID:" & sKEY already exists.
    Private Function LIExists(sKEY As String) As Boolean
        On Error GoTo ItsNotThere
        Dim LI2 As ListItem
        Set LI2 = ListView1.ListItems("ID:" & sKEY)
        LIExists = True

        Exit Function
    ItsNotThere:
        ' Looking up a missing key raises an error; the function then returns False
    End Function
    JPnyc rocks!! (Just ask him!)
    If u have your answer please go to the thread tools and click "Mark Thread Resolved"
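
    A more defensive way to read the link, as a sketch only: split the query string on "&" first and then on "=", so each value is looked up by name instead of by position. The parameter names SongID, Singer, Album and Song are inferred from the Replace calls in the code above, and GetQueryValue is a hypothetical helper, not part of the posted code. It does not decode escaped characters such as %20.

    VB Code:
    ' Hypothetical helper: returns the value of one named parameter from a URL's
    ' query string, or "" when the URL has no query string or the name is missing.
    Private Function GetQueryValue(ByVal sURL As String, ByVal sName As String) As String
        Dim Pairs() As String
        Dim i As Long
        Dim p As Long

        p = InStr(sURL, "?")
        If p = 0 Then Exit Function                 ' no query string at all

        Pairs = Split(Mid$(sURL, p + 1), "&")       ' "name=value" pieces
        For i = LBound(Pairs) To UBound(Pairs)
            p = InStr(Pairs(i), "=")
            If p > 0 Then
                If StrComp(Left$(Pairs(i), p - 1), sName, vbTextCompare) = 0 Then
                    GetQueryValue = Mid$(Pairs(i), p + 1)
                    Exit Function
                End If
            End If
        Next
    End Function

    Inside the loop it could be called as sID = GetQueryValue(HTMLA.href, "SongID"), and likewise for "Singer", "Album" and "Song", which avoids the TMP(1)..TMP(4) indexing and the Replace calls.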

  25. #25

    Thread Starter
    Frenzied Member
    Join Date
    Apr 2005
    Posts
    1,907

    Re: Help extracting dyanamic data from html code

    Thanks man, that fixed the error I was getting. I am still waiting for the second HTML page, and also for adding the pic and pic name to a separate listbox. I appreciate it a lot; I am learning a lot with your help. Thanks.

  26. #26
    eltiT resU motsuC Static's Avatar
    Join Date
    Oct 2000
    Location
    Rochester, NY
    Posts
    9,390

    Re: Help extracting dyanamic data from html code

    OK, this will handle BOTH types of HTML tables (hopefully on the "live" version too).

    It worked on the tests. The only thing is the "artistANDalbum" text: is there anything separating them? I would need to know how to split it to get them individually.

    VB Code:
    Private Sub Form_Load()
        ' Load the saved test page into the WebBrowser control
        WebBrowser1.Navigate "file:///C:/VB/tony007/htmlpagehere/page.html"
    End Sub

    Private Sub WebBrowser1_DocumentComplete(ByVal pDisp As Object, URL As Variant)
        ' Parse only when the top-level document has finished loading
        If (pDisp Is WebBrowser1.Application) Then
            GetInfo WebBrowser1.Document
        End If
    End Sub

    Private Sub GetInfo(HTML As HTMLDocument)
        Dim HTMLA As HTMLAnchorElement
        Dim HTMLT As HTMLTable
        Dim HTMLR As HTMLTableRow
        Dim HTMLC As HTMLTableCell
        Dim HTMLI As HTMLInputElement
        Dim pTYPE As Integer
        Dim TMP() As String
        Dim sID As String
        Dim tArtist As String
        Dim LI As ListItem
        For Each HTMLT In HTML.getElementsByTagName("table")
            For Each HTMLR In HTMLT.rows
                For Each HTMLC In HTMLR.cells
                    ' A cell spanning 5 columns marks the second page layout and holds the "artistANDalbum" text
                    If HTMLC.colSpan = 5 Then
                        tArtist = HTMLC.innerText
                        pTYPE = 1
                    End If
                    If pTYPE = 1 Then
                        ' Second layout: each song has a checkbox whose value is the song ID
                        For Each HTMLI In HTMLC.getElementsByTagName("input")
                            If HTMLI.Type = "checkbox" Then
                                Set LI = ListView1.ListItems.Add(, "ID:" & HTMLI.Value, HTMLI.Value)
                            End If
                        Next
                        ' The song link mentions that ID in its onclick handler.
                        ' LI is Nothing until the first checkbox has been found.
                        If Not LI Is Nothing Then
                            For Each HTMLA In HTMLC.getElementsByTagName("a")
                                If InStr(HTMLA.onclick, LI.Text) Then
                                    LI.ListSubItems.Add , , HTMLA.innerText
                                    LI.ListSubItems.Add , , tArtist
                                    Exit For
                                End If
                            Next
                        End If
                    Else
                        ' First layout: all fields are in the WriteLyrics.asp link
                        For Each HTMLA In HTMLC.getElementsByTagName("a")
                            If InStr(HTMLA.href, "WriteLyrics.asp?SongID=") > 0 Then
                                TMP = Split(HTMLA.href, "=")
                                If UBound(TMP) >= 4 Then
                                    sID = Replace(TMP(1), "&Singer", "")
                                    If Not LIExists(sID) Then
                                        Set LI = ListView1.ListItems.Add(, "ID:" & sID, sID)
                                        LI.ListSubItems.Add , , TMP(4)
                                        LI.ListSubItems.Add , , Replace(TMP(2), "&Album", "")
                                        LI.ListSubItems.Add , , Replace(TMP(3), "&Song", "")
                                    End If
                                End If
                            End If
                        Next
                    End If
                Next
            Next
        Next
    End Sub

    Private Function LIExists(sKEY As String) As Boolean
        ' Looking up a missing key raises an error, which leaves LIExists = False
        On Error GoTo ItsNotThere
        Dim LI2 As ListItem
        Set LI2 = ListView1.ListItems("ID:" & sKEY)
        LIExists = True
        Exit Function
    ItsNotThere:
    End Function

  27. #27

    Thread Starter
    Frenzied Member
    Join Date
    Apr 2005
    Posts
    1,907

    Re: Help extracting dyanamic data from html code

    Many thanks. Yes, the word Album separates them. It is like this:

    Code:
    <table width="400" border="1" cellspacing="0" bordercolor="#9999FF">
      <tr>
        <td colspan="5">artistname Album albumname</td>
      </tr>
      <tr>
        <td colspan="5"><div align="left">some discription here<br><b>Release Year :</b> 2000<br>
      </div></td>
      </tr>
      <tr>
    
    ........... rest of html
    I tried it and it worked on the live version, but the artist name is missing. I get the album name in the column that has the artistname title, and no artist name info in the listview! And still no good news on the artist pic name :-)
    Last edited by tony007; May 4th, 2006 at 12:31 PM.

  28. #28
    eltiT resU motsuC Static's Avatar
    Join Date
    Oct 2000
    Location
    Rochester, NY
    Posts
    9,390

    Re: Help extracting dyanamic data from html code

    give this a shot:

    1) add a listbox to the form...

    VB Code:
    Private Declare Function URLDownloadToFile Lib "urlmon" Alias "URLDownloadToFileA" (ByVal pCaller As Long, ByVal szURL As String, ByVal szFileName As String, ByVal dwReserved As Long, ByVal lpfnCB As Long) As Long

    ' Downloads URL to LocalFilename; returns True on success (URLDownloadToFile returns 0)
    Public Function DownloadFile(URL As String, LocalFilename As String) As Boolean
        Dim lngRetVal As Long
        lngRetVal = URLDownloadToFile(0, URL, LocalFilename, 0, 0)
        If lngRetVal = 0 Then DownloadFile = True
    End Function

    Private Sub Form_Load()
        ' Load the saved test page into the WebBrowser control
        WebBrowser1.Navigate "file:///C:/VB/tony007/htmlpagehere/page2.html"
    End Sub

    Private Sub WebBrowser1_DocumentComplete(ByVal pDisp As Object, URL As Variant)
        ' Parse only when the top-level document has finished loading
        If (pDisp Is WebBrowser1.Application) Then
            GetInfo WebBrowser1.Document
        End If
    End Sub

    Private Sub GetInfo(HTML As HTMLDocument)
        Dim HTMLA As HTMLAnchorElement
        Dim HTMLT As HTMLTable
        Dim HTMLR As HTMLTableRow
        Dim HTMLC As HTMLTableCell
        Dim HTMLI As HTMLInputElement
        Dim pTYPE As Integer
        Dim TMP() As String
        Dim sID As String
        Dim pName As String     ' was undeclared; needed under Option Explicit
        Dim tArtist As String
        Dim tAlbum As String
        Dim LI As ListItem
        For Each HTMLT In HTML.getElementsByTagName("table")
            For Each HTMLR In HTMLT.rows
                For Each HTMLC In HTMLR.cells
                    ' The 5-column cell containing " Album " holds "artistname Album albumname"
                    If HTMLC.colSpan = 5 Then
                        If InStr(HTMLC.innerText, " Album ") > 0 Then
                            TMP = Split(HTMLC.innerText, " Album ")
                            tArtist = TMP(0)
                            tAlbum = TMP(1)
                            pTYPE = 1
                        End If
                    End If
                    If pTYPE = 1 Then
                        ' Second layout: each song has a checkbox whose value is the song ID
                        For Each HTMLI In HTMLC.getElementsByTagName("input")
                            If HTMLI.Type = "checkbox" Then
                                Set LI = ListView1.ListItems.Add(, "ID:" & HTMLI.Value, HTMLI.Value)
                            End If
                        Next
                        ' LI is Nothing until the first checkbox has been found
                        If Not LI Is Nothing Then
                            For Each HTMLA In HTMLC.getElementsByTagName("a")
                                If InStr(HTMLA.onclick, LI.Text) Then
                                    LI.ListSubItems.Add , , HTMLA.innerText
                                    LI.ListSubItems.Add , , tArtist
                                    LI.ListSubItems.Add , , tAlbum
                                    Exit For
                                End If
                            Next
                        End If
                    Else
                        For Each HTMLA In HTMLC.getElementsByTagName("a")
                            ' First layout: all fields are in the WriteLyrics.asp link
                            If InStr(HTMLA.href, "WriteLyrics.asp?SongID=") > 0 Then
                                TMP = Split(HTMLA.href, "=")
                                If UBound(TMP) >= 4 Then
                                    sID = Replace(TMP(1), "&Singer", "")
                                    If Not LIExists(sID) Then
                                        Set LI = ListView1.ListItems.Add(, "ID:" & sID, sID)
                                        LI.ListSubItems.Add , , TMP(4)
                                        LI.ListSubItems.Add , , Replace(TMP(2), "&Album", "")
                                        LI.ListSubItems.Add , , Replace(TMP(3), "&Song", "")
                                    End If
                                End If
                            End If
                            ' Picture link: the image URL follows the first "=" in the href
                            If InStr(HTMLA.href, "/CdImages/") <> 0 Then
                                TMP = Split(HTMLA.href, "=")
                                If UBound(TMP) >= 1 Then
                                    ' File name = everything after the last "/"
                                    pName = Right(TMP(1), Len(TMP(1)) - InStrRev(TMP(1), "/"))
                                    DownloadFile TMP(1), App.Path & "\" & pName
                                    List1.AddItem pName
                                End If
                            End If
                        Next
                    End If
                Next
            Next
        Next
    End Sub

    Private Function LIExists(sKEY As String) As Boolean
        ' Looking up a missing key raises an error, which leaves LIExists = False
        On Error GoTo ItsNotThere
        Dim LI2 As ListItem
        Set LI2 = ListView1.ListItems("ID:" & sKEY)
        LIExists = True
        Exit Function
    ItsNotThere:
    End Function
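
    The " Album " split can also be pulled into a small helper, sketched here under the assumption that the cell text always reads "artistname Album albumname" as described in post #27. SplitArtistAlbum is a hypothetical name. It cuts at the first " Album " only, so an album title that itself contains the word is kept whole; an artist name containing " Album " would still be split in the wrong place.

    VB Code:
    ' Hypothetical helper: splits "artistname Album albumname" at the first " Album ".
    ' Returns False (and leaves both outputs empty) when the separator is not found.
    Private Function SplitArtistAlbum(ByVal sText As String, sArtist As String, sAlbum As String) As Boolean
        Const SEP As String = " Album "
        Dim p As Long

        sArtist = ""
        sAlbum = ""
        p = InStr(1, sText, SEP, vbBinaryCompare)
        If p = 0 Then Exit Function

        sArtist = Trim$(Left$(sText, p - 1))        ' text before the separator
        sAlbum = Trim$(Mid$(sText, p + Len(SEP)))   ' everything after it
        SplitArtistAlbum = True
    End Function

    In GetInfo it could replace the InStr/Split pair: If SplitArtistAlbum(HTMLC.innerText, tArtist, tAlbum) Then pTYPE = 1.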

  29. #29

    Thread Starter
    Frenzied Member
    Join Date
    Apr 2005
    Posts
    1,907

    Re: Help extracting dyanamic data from html code

    Wow, thanks. That solved the pic name problem, but instead of outputting the artist name one time, it repeated it: more than once, but fewer times than the number of songs!! For the second HTML page some information is missing, such as the artist name and release year info. :-))

  30. #30
    eltiT resU motsuC Static's Avatar
    Join Date
    Oct 2000
    Location
    Rochester, NY
    Posts
    9,390

    Re: Help extracting dyanamic data from html code

    bah.. picky picky!!

    ok.. I can't really go further without a real copy of the pages.. it's hard to see patterns in mock-up versions

  31. #31

    Thread Starter
    Frenzied Member
    Join Date
    Apr 2005
    Posts
    1,907

    Re: Help extracting dyanamic data from html code

    Quote Originally Posted by Static
    bah.. picky picky!!

    ok.. I can't really go further without a real copy of the pages.. it's hard to see patterns in mock-up versions
    Man, I know I am picky. I was looking for such a solution for more than a year but never figured it out!! I zipped both page 2 and page 3 in HTML format; the bold parts in page 3 are the ones I want to extract (Artistname, Album Name, release year, songs in page 3, which is called albumlist.html).
    The only things that are missing from page2.html are the artist name and release year, which it displays once at the top of the table.

    Thanks

    Code:
    <html>
    
    <table width="420" border="0" cellpadding="2" cellspacing="2">
      <tr>
        <td colspan="3" bgcolor="#AEA7ED">Artistname</td>
      </tr>
      <tr>    <td bgcolor="#BFD1FB" align="center">Album Name </td>
        <td align="center" bgcolor="#BFD1FB">Release Year</td>
        <td align="center" bgcolor="#BFD1FB">Songs</td>
      </tr>
      <tr>
        <td width="277" bgcolor="#BFD1FB"><a href="songs.php?show_songs=AHT03&JALSA=a5f964470e9a2c7150c7a8f81dc9e273;allow=NO;mohim=Download" title="some data to be passed">albumname</a></td>
        <td width="75" align="center" bgcolor="#BFD1FB">2003</td>
        <td width="48" align="center" bgcolor="#BFD1FB">10</td>
      </tr>
      <tr>
        <td width="277" bgcolor="#BFD1FB"><a href="songs.php?show_songs=AHT04&JALSA=a5f964470e9a2c7150c7a8f81dc9e273;allow=NO;mohim=Download" title="some data to be passed">albumname2</a></td>
        <td width="75" align="center" bgcolor="#BFD1FB">2002</td>
        <td width="48" align="center" bgcolor="#BFD1FB">11</td>
      </tr>
      <tr>
        <td width="277" bgcolor="#BFD1FB"><a href="songs.php?show_songs=AHT02&JALSA=a5f964470e9a2c7150c7a8f81dc9e273;allow=NO;mohim=Download" title="some data to be passed">albumname3</a></td>
        <td width="75" align="center" bgcolor="#BFD1FB">2001</td>
        <td width="48" align="center" bgcolor="#BFD1FB">11</td>
      </tr>
      <tr>
        <td width="277" bgcolor="#BFD1FB"><a href="songs.php?show_songs=AHT05&JALSA=a5f964470e9a2c7150c7a8f81dc9e273;allow=NO;mohim=Download" title="some data to be passed">albumname4</a></td>
        <td width="75" align="center" bgcolor="#BFD1FB">2001</td>
        <td width="48" align="center" bgcolor="#BFD1FB">12</td>
      </tr>
      <tr>
        <td width="277" bgcolor="#BFD1FB"><a href="songs.php?show_songs=AHT01&JALSA=a5f964470e9a2c7150c7a8f81dc9e273;allow=NO;mohim=Download" title="some data to be passed">albumname5</a></td>
        <td width="75" align="center" bgcolor="#BFD1FB">2000</td>
        <td width="48" align="center" bgcolor="#BFD1FB">14</td>
      </tr>
      <tr>
        <td width="277" bgcolor="#BFD1FB"><a href="songs.php?show_songs=AHT06&JALSA=a5f964470e9a2c7150c7a8f81dc9e273;allow=NO;mohim=Download" title="some data to be passed">albumname6</a></td>
        <td width="75" align="center" bgcolor="#BFD1FB">1999</td>
        <td width="48" align="center" bgcolor="#BFD1FB">11</td>
      </tr>
    </table>
    
    
    
    </html>
    Last edited by tony007; May 4th, 2006 at 01:37 PM.
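
    For the albumlist.html layout above, the same table-walking approach used earlier in the thread might look like the sketch below. It is untested against the real page and rests on assumptions taken only from the mock-up: the artist name is the single cell with colspan="3", and every album row has exactly three cells (album link, release year, song count). GetAlbumList and ListView2 (a second ListView with four columns) are hypothetical names.

    VB Code:
    ' Sketch: adds one ListView2 row per album (album, artist, year, song count).
    Private Sub GetAlbumList(HTML As HTMLDocument)
        Dim HTMLT As HTMLTable
        Dim HTMLR As HTMLTableRow
        Dim HTMLC As HTMLTableCell
        Dim HTMLA As HTMLAnchorElement
        Dim tArtist As String
        Dim LI As ListItem

        For Each HTMLT In HTML.getElementsByTagName("table")
            For Each HTMLR In HTMLT.rows
                If HTMLR.cells.length = 1 Then
                    ' Single cell spanning all three columns: the artist name
                    Set HTMLC = HTMLR.cells.Item(0)
                    If HTMLC.colSpan = 3 Then tArtist = Trim$(HTMLC.innerText)
                ElseIf HTMLR.cells.length = 3 Then
                    Set HTMLC = HTMLR.cells.Item(0)
                    ' The header row has no link in its first cell, so it adds nothing
                    For Each HTMLA In HTMLC.getElementsByTagName("a")
                        Set LI = ListView2.ListItems.Add(, , Trim$(HTMLA.innerText))
                        LI.ListSubItems.Add , , tArtist
                        LI.ListSubItems.Add , , Trim$(HTMLR.cells.Item(1).innerText)
                        LI.ListSubItems.Add , , Trim$(HTMLR.cells.Item(2).innerText)
                        Exit For
                    Next
                End If
            Next
        Next
    End Sub

    It would be called from WebBrowser1_DocumentComplete in place of GetInfo when the loaded page is the album list. If the real page nests this table inside other tables, the row and cell counts may need adjusting.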
