
Thread: extracting data from html

  1. #1

    Thread Starter
    Member
    Join Date
    Jan 2007
    Posts
    62

    extracting data from html

    Okay, I have some output that looks exactly like this:

    Code:
    		
    	<div id="blogList" class="blogList">
    	  <h3 style="color: #666666">Recently Updated Weblogs...</h3>
    
                <div id="pingedBlogs" class="blogroller">
                	
                    <div id="19" class="blog">
                    <table class="blog"><tr>
                        <td class="blogName">
                            <a href="http://lucky007.18.dtiblog.com/" class="pingLink">
                                BLOG NAME ONE</a>
                        </td>
                        <td class="blogPingTime">
                            1:34 PM
                        </td>
                        <td>
                        
                            <img src="/images/no_feed_icon.png" title="No XML feed" alt="No XML feed" border="0">
                            
                        </td>
                    </tr></table>
                    </div>
    					
                    <div id="18" class="blog">
                    <table class="blog"><tr>
                        <td class="blogName">
                            <a href="http://onihanemuru.cocolog-nifty.com/blog/" class="pingLink">
                                å¿ƒæ˜ ã™æ°´é¡ã¯</a>
                        </td>
                        <td class="blogPingTime">
                            11:33 AM
                        </td>
                        <td>
                        
                            <img src="/images/no_feed_icon.png" title="No XML feed" alt="No XML feed" border="0">
                            
                        </td>
                    </tr></table>
                    </div>
    					
                    <div id="17" class="blog">
                    <table class="blog"><tr>
                        <td class="blogName">
                            <a href="http://blog.myspace.com/happinessisfreeforever" class="pingLink">
                                Wounds</a>
                        </td>
                        <td class="blogPingTime">
                            12:43 PM
                        </td>
                        <td>
                        
                            <img src="/images/no_feed_icon.png" title="No XML feed" alt="No XML feed" border="0">
                            
                        </td>
                    </tr></table>
                    </div>
    					
                    <div id="16" class="blog">
                    <table class="blog"><tr>
                        <td class="blogName">
                            <a href="http://jpriestbcn.blogspot.com/" class="pingLink">
                                Weblog&#32;de&#32;JPriest</a>
                        </td>
                        <td class="blogPingTime">
                            12:43 PM
                        </td>
                        <td>
                        
                            <a href="http://jpriestbcn.blogspot.com/" target="_blank">
                                <img src="/images/feed_icon.png" title="XML feed" alt="XML feed" border="0">
                            </a>
                            
                        </td>
                    </tr></table>
                    </div>
    					
                    <div id="15" class="blog">
                    <table class="blog"><tr>
                        <td class="blogName">
                            <a href="http://norinori1122.18.dtiblog.com/" class="pingLink">
                                ãƒã‚¤ãƒ¬ã‚°ã€€ãƒ¬ãƒ¼ã‚¹ã‚¯ã‚¤ãƒ¼ãƒ³â—†æƒ…å ±é¤¨</a>
                        </td>
                        <td class="blogPingTime">
                            1:34 PM
                        </td>
                        <td>
                        
                            <img src="/images/no_feed_icon.png" title="No XML feed" alt="No XML feed" border="0">
                            
                        </td>
                    </tr></table>
                    </div>
    					
                    <div id="14" class="blog">
                    <table class="blog"><tr>
                        <td class="blogName">
                            <a href="http://gloves.yourinformationpro.com" class="pingLink">
                                gloves.yourinformationpro.com</a>
                        </td>
                        <td class="blogPingTime">
                            12:43 PM
                        </td>
                        <td>
                        
                            <a href="http://gloves.yourinformationpro.com" target="_blank">
                                <img src="/images/feed_icon.png" title="XML feed" alt="XML feed" border="0">
                            </a>
                            
                        </td>
                    </tr></table>
                    </div>
    					
                    <div id="13" class="blog">
                    <table class="blog"><tr>
                        <td class="blogName">
                            <a href="http://askthewebhost.net/blogs/hosting-rating-web" class="pingLink">
                                Hosting&#32;rating&#32;web</a>
                        </td>
                        <td class="blogPingTime">
                            11:33 AM
                        </td>
                        <td>
                        
                            <img src="/images/no_feed_icon.png" title="No XML feed" alt="No XML feed" border="0">
                            
                        </td>
                    </tr></table>
                    </div>
    					
                    <div id="12" class="blog">
                    <table class="blog"><tr>
                        <td class="blogName">
                            <a href="http://blog.myspace.com/fullfrontalliberty" class="pingLink">
                                A&#32;Texas&#32;Dose&#32;of&#32;Freedom</a>
                        </td>
                        <td class="blogPingTime">
                            1:34 PM
                        </td>
                        <td>
                        
                            <img src="/images/no_feed_icon.png" title="No XML feed" alt="No XML feed" border="0">
                            
                        </td>
                    </tr></table>
                    </div>
    					
                    <div id="11" class="blog">
                    <table class="blog"><tr>
                        <td class="blogName">
                            <a href="http://safruddin.wordpress.com/" class="pingLink">
                                My&#32;Inspiration</a>
                        </td>
                        <td class="blogPingTime">
                            11:33 AM
                        </td>
                        <td>
                        
                            <img src="/images/no_feed_icon.png" title="No XML feed" alt="No XML feed" border="0">
                            
                        </td>
                    </tr></table>
                    </div>
    					
                    <div id="10" class="blog">
                    <table class="blog"><tr>
                        <td class="blogName">
                            <a href="http://ciaopanic.fasion.info/archives/50050112.html" class="pingLink">
                                ciaopanic&#37;83&#37;60&#37;83&#37;83&#37;83I&#37;83p&#37;83j&#37;8...</a>
                        </td>
                        <td class="blogPingTime">
                            1:34 PM
                        </td>
                        <td>
                        
                            <img src="/images/no_feed_icon.png" title="No XML feed" alt="No XML feed" border="0">
                            
                        </td>
                    </tr></table>
                    </div>
    				
                </div>
                <script language="JavaScript" type="text/javascript">
                var blogListViewPort = new ViewPort(10);
                blogListViewPort.setUiViewComponent($('pingedBlogs'));
                blogListViewPort.setI(19);
                </script>            
    
    	</div>
    I want to take all of the blog titles and add them to a listbox. Each title is inside the td tag with the class "blogName". So ideally I want to take the first one, "BLOG NAME ONE", and add it to a list, and then take all of the other ones and add them to the same list (just the title text, without the link that wraps each title, of course).

    Anyone up for trying to parse this out? I was toying with it but couldn't figure it out.

  2. #2
    "Digital Revolution"
    Join Date
    Mar 2005
    Posts
    4,471

    Re: extracting data from html

    Wrote this up real quickly, see if it's what you need.
    Attached Files
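
    For anyone reading along, here is a rough sketch of one way to pull out the titles. It is not the attached code: it is written in Python and uses only the standard-library html.parser module. The names BlogNameParser and extract_blog_names are made up for this example, and the sample string is a trimmed copy of two entries from the first post. It collects the text of the link inside each td with class "blogName" and leaves the link itself behind.

```python
from html.parser import HTMLParser


class BlogNameParser(HTMLParser):
    """Collects the link text found inside <td class="blogName"> cells.

    Hypothetical helper written for this example only.
    """

    def __init__(self):
        # convert_charrefs=True decodes references such as &#32; into a space
        super().__init__(convert_charrefs=True)
        self.titles = []
        self._in_cell = False  # currently inside <td class="blogName">
        self._in_link = False  # currently inside the <a> within that cell
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "td" and attrs.get("class") == "blogName":
            self._in_cell = True
        elif tag == "a" and self._in_cell:
            self._in_link = True
            self._chunks = []

    def handle_data(self, data):
        if self._in_link:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._in_link:
            # Collapse the newlines and indentation around the title text
            title = " ".join("".join(self._chunks).split())
            if title:
                self.titles.append(title)
            self._in_link = False
        elif tag == "td":
            self._in_cell = False


def extract_blog_names(html):
    """Return the blog titles in document order."""
    parser = BlogNameParser()
    parser.feed(html)
    parser.close()
    return parser.titles


# Trimmed copy of two entries from the output in the first post
sample = """
<div id="19" class="blog">
<table class="blog"><tr>
    <td class="blogName">
        <a href="http://lucky007.18.dtiblog.com/" class="pingLink">
            BLOG NAME ONE</a>
    </td>
    <td class="blogPingTime">1:34 PM</td>
</tr></table>
</div>
<div id="16" class="blog">
<table class="blog"><tr>
    <td class="blogName">
        <a href="http://jpriestbcn.blogspot.com/" class="pingLink">
            Weblog&#32;de&#32;JPriest</a>
    </td>
    <td class="blogPingTime">12:43 PM</td>
</tr></table>
</div>
"""

titles = extract_blog_names(sample)
print(titles)
```

    Each entry in titles can then be added to the listbox one at a time. The feed-icon links in the last column are skipped because they sit outside the "blogName" cell.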
