Thread: PHP parsing Help

  1. #1

    Thread Starter
    Junior Member
    Join Date
    May 2008
    Posts
    21

    PHP parsing Help

    I want to parse an online webpage. Its source looks like this:
    HTML Code:
    <tr>
        <td colspan="2" align="center"><font color="#400040"><b>Register No</b></font></td>
        <th colspan="2"><font color="Brown">30507205022</font></th>
        <td colspan="2" align="center"><font color="#400040"><b>Name</b></font></td>
        <th colspan="2"><font color="Brown">GAURAV DIXIT S</font></th>
      </tr>
      <tr>
        <td colspan="2"><font color="blue"><center><b>Subject</b></font></td>
        <td colspan="2"><font color="blue"><center><b>Internal</b></font></td>
        <td colspan="2"><font color="blue"><center><b>External</b></font></td>
        <td colspan="2"><font color="blue"><center><b>Result</b></font></td>
      </tr>
      <tr>
        <td colspan="2"><center> CS1151</td>
        <td colspan="2"><center> 17</td>
        <td colspan="2"><center> 54</td>
        <td colspan="2"><center> P
    </td>
      </tr>

      <tr>
        <td colspan="2"><center> CS1152</td>
        <td colspan="2"><center> 18</td>
        <td colspan="2"><center> 68</td>
        <td colspan="2"><center> P
    </td>
      </tr>

      <tr>
        <td colspan="2"><center> CS1202</td>
        <td colspan="2"><center> 18</td>
        <td colspan="2"><center> 42</td>
        <td colspan="2"><center> P
    </td>
      </tr>

      <tr>
        <td colspan="2"><center> CS1204</td>
        <td colspan="2"><center> 17</td>
        <td colspan="2"><center> 42</td>
        <td colspan="2"><center> P
    </td>
      </tr>

      <tr>
        <td colspan="2"><center> CS1205</td>
        <td colspan="2"><center> 16</td>
        <td colspan="2"><center> 44</td>
        <td colspan="2"><center> P
    </td>
      </tr>

      <tr>
        <td colspan="2"><center> CS1206</td>
        <td colspan="2"><center> 18</td>
        <td colspan="2"><center> 60</td>
        <td colspan="2"><center> P
    </td>
      </tr>

      <tr>
        <td colspan="2"><center> IT1201</td>
        <td colspan="2"><center> 17</td>
        <td colspan="2"><center> 39</td>
        <td colspan="2"><center> P
    </td>
      </tr>

      <tr>
        <td colspan="2"><center> IT1202</td>
        <td colspan="2"><center> 16</td>
        <td colspan="2"><center> 42</td>
        <td colspan="2"><center> P
    </td>
      </tr>

      <tr>
        <td colspan="2"><center> MA1201</td>
        <td colspan="2"><center> 14</td>
        <td colspan="2"><center> 23</td>
        <td colspan="2"><center> F
    </td>
      </tr>

    I want to get the name, register number, and marks from these tables.

  2. #2
    PowerPoster
    Join Date
    Sep 2003
    Location
    Edmonton, AB, Canada
    Posts
    2,629

    Re: PHP parsing Help

    You can grab the page's source with file_get_contents(), and then use a simple regular expression with preg_match_all() to grab everything relevant and put it into an array.
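
    A minimal sketch of that idea, assuming the source is already in a string. The $html sample is trimmed from the first post (a live page would come from file_get_contents() with the real address, which isn't given in this thread), and the pattern assumes subject codes are always two letters followed by four digits, as in the sample:

    ```php
    <?php
    // Two rows copied from the first post; a live page would be fetched
    // with file_get_contents() instead of being hard-coded here.
    $html = '<tr>
        <td colspan="2"><center> CS1151</td>
        <td colspan="2"><center> 17</td>
        <td colspan="2"><center> 54</td>
        <td colspan="2"><center> P
    </td>
      </tr>
      <tr>
        <td colspan="2"><center> MA1201</td>
        <td colspan="2"><center> 14</td>
        <td colspan="2"><center> 23</td>
        <td colspan="2"><center> F
    </td>
      </tr>';

    // One capture group per cell: subject code, internal, external, result.
    // The subject-code shape (two letters, four digits) is an assumption.
    $pattern = '/<center>\s*([A-Z]{2}\d{4})\s*<\/td>\s*'
             . '<td[^>]*><center>\s*(\d+)\s*<\/td>\s*'
             . '<td[^>]*><center>\s*(\d+)\s*<\/td>\s*'
             . '<td[^>]*><center>\s*([A-Z])\s*<\/td>/';

    // PREG_SET_ORDER gives one array per matched row instead of one per group.
    preg_match_all($pattern, $html, $rows, PREG_SET_ORDER);

    foreach ($rows as $row) {
        echo "{$row[1]}: internal {$row[2]}, external {$row[3]}, result {$row[4]}\n";
    }
    ```

    A pattern this specific only matches the mark rows, so the Register No and Name cells would need their own expression.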

  3. #3
    Frenzied Member
    Join Date
    Apr 2009
    Location
    CA, USA
    Posts
    1,516

    Re: PHP parsing Help

    Since you asked about parsing, I'm gonna skip any explanation about retrieving the data. Assuming you've already got the source code there in a string, here's what you might do...

    Code:
    //this contains your HTML source, on a single line:
    $htmlStr = "<tr>     <td colspan=\"2\" align=\"center\"><font color=\"#400040\"><b>Register No</b>...";
    
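    //get rid of all tags except td and th (the text inside the removed tags is kept)
    //note: [^(td|th|\/)] is a character class, not an alternation, so a negative lookahead is used instead
    $htmlStr = preg_replace("/<(?!\/?(?:td|th)\b)[^>]*>/i","",$htmlStr);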
    
    //now get the td/th data in an array
    preg_match_all("/>([^<]*)</",$htmlStr,$matches);
    
    //$matches[1] now has the data you want, but also some junk whitespace
    //so, filter out the whitespace, then re-key the array
    
    function noSpace($var){
      //keep only values that are not empty after trimming
      return trim($var) != "";
    }
    $tableValues = array_values(array_filter($matches[1],"noSpace"));
    
    print_r($tableValues);
    The array $tableValues should have all the data you want at the end. Please ask questions about any part of it you'd like more details on.
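
    If it helps, here is a rough sketch of turning that flat array into something keyed by subject. The $tableValues below is a hand-written stand-in shaped like what the code above should produce for the sample (only two mark rows shown), and the layout it relies on (two label/value pairs, four headings, then four values per subject) is an assumption taken from that sample:

    ```php
    <?php
    // Hypothetical stand-in for $tableValues; values still carry stray whitespace.
    $tableValues = array(
        "Register No", "30507205022", "Name", "GAURAV DIXIT S",
        "Subject", "Internal", "External", "Result",
        " CS1151", " 17", " 54", " P\n",
        " MA1201", " 14", " 23", " F\n",
    );

    // Strip the leftover spaces and newlines from every value.
    $values = array_map('trim', $tableValues);

    // Entries 0-3 are label/value pairs for the student.
    $student = array(
        'registerNo' => $values[1],
        'name'       => $values[3],
    );

    // Entries 4-7 are column headings; the rest are marks, four per subject.
    $marks = array();
    foreach (array_chunk(array_slice($values, 8), 4) as $row) {
        list($subject, $internal, $external, $result) = $row;
        $marks[$subject] = array(
            'internal' => (int) $internal,
            'external' => (int) $external,
            'result'   => $result,
        );
    }

    print_r($student);
    print_r($marks);
    ```

    This breaks if the page ever adds or drops a column, so check the number of values before trusting the offsets.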

  4. #4

    Thread Starter
    Junior Member
    Join Date
    May 2008
    Posts
    21

    Re: PHP parsing Help

    Thanks, SambaNeko, for your help. So you mean I have to get the source using:
    $htmlstr = file_get_contents();

  5. #5
    Frenzied Member
    Join Date
    Apr 2009
    Location
    CA, USA
    Posts
    1,516

    Re: PHP parsing Help

    Looks like that'd do it...

    Code:
    $htmlStr = file_get_contents("http://www.example.com/");
    ...but the code I gave above was designed for the sample source code you provided - if you retrieve the contents of an entire page (and it has other elements), I'm not sure what'd happen. There would certainly be interference if there are any other tables on the page.
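
    One way around the interference, as a sketch only: instead of regular expressions, load the whole page into PHP's DOMDocument and use DOMXPath to pick out just the table that contains a "Subject" cell. This is a different technique from the code above. The $page string is a made-up stand-in for a full page, and selecting the table by its "Subject" heading is an assumption about the real page:

    ```php
    <?php
    // Made-up stand-in for a full page: an unrelated table plus the results table.
    $page = '<html><body>
    <table><tr><td>Menu</td><td>Home</td></tr></table>
    <table>
      <tr><td>Subject</td><td>Internal</td><td>External</td><td>Result</td></tr>
      <tr><td><center> CS1151</td><td><center> 17</td><td><center> 54</td><td><center> P
    </td></tr>
    </table>
    </body></html>';

    // The markup is sloppy (unclosed center tags), so collect parser
    // warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($page);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);

    // Only rows of the table that has a cell reading "Subject".
    $rows = $xpath->query('//table[.//td[normalize-space(.)="Subject"]]//tr');

    $data = array();
    foreach ($rows as $tr) {
        $cells = array();
        // Direct td/th children of this row, with surrounding whitespace removed.
        foreach ($xpath->query('td|th', $tr) as $cell) {
            $cells[] = trim($cell->textContent);
        }
        $data[] = $cells;
    }

    print_r($data);
    ```

    Other tables on the page are ignored because they never match the XPath condition.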
