Thread: 2 things. getting source of another site and parsing.

  1. #1

    Thread Starter
    Addicted Member
    Join Date
    Oct 2003
    Location
    england
    Posts
    161

    2 things. getting source of another site and parsing.

    hi. i have two little problems. firstly, how do i get php to get the source code of another website? do i put an iframe in the page, or is there some sort of syntax to do it?

    then, (i kind of know this, but just reminded myself...) what is the best way to parse some text out of a big piece of text (which can be in a variable...)

    a few ways it could be done which would suit me would be to:

    remove all the text before a specified point, eg. the text: "hello"

    get a bit of text between two other bits of text. so, between "my " and " is", and i want to get "name". yah?

    anyway, any help would be great, thanks.

  2. #2
    Ex-Super Mod'rater Electroman's Avatar
    Join Date
    Sep 2000
    Location
    Newcastle, England
    Posts
    4,349
    As to the first question: you can't get the PHP source of a file on a different server. It's just not allowed; otherwise people would be able to see the passwords in your files. I assume you want to use someone else's page in one of yours, and for that the PHP file is treated like a normal HTML file. This is because when your site requests the URL of a PHP file, the remote server runs the file itself and sends you only the resulting HTML. So an iframe would work, but you have to understand that the source code never leaves the server.

    As for the second point, strpos() and substr() should help you. I don't remember the exact syntax, so it would be best to check it on www.php.net
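
    A minimal sketch of how strpos() and substr() could cover both cases from the first post. The helper names text_from() and text_between() are made up for illustration, and the sample strings are assumptions based on the "my " / " is" example:

```php
<?php
// Hypothetical helpers; the names are illustrative, not from the thread.

// Return everything from the first occurrence of $marker onward,
// or null if $marker does not occur in $text.
function text_from(string $text, string $marker): ?string
{
    $pos = strpos($text, $marker);
    if ($pos === false) {
        return null;
    }
    return substr($text, $pos);
}

// Return the text between the first $start and the next $end after it,
// or null if either delimiter is missing.
function text_between(string $text, string $start, string $end): ?string
{
    $from = strpos($text, $start);
    if ($from === false) {
        return null;
    }
    $from += strlen($start);           // skip past the opening delimiter
    $to = strpos($text, $end, $from);  // search only after the opening delimiter
    if ($to === false) {
        return null;
    }
    return substr($text, $from, $to - $from);
}

echo text_between("my name is", "my ", " is"), "\n"; // name
echo text_from("well, hello there", "hello"), "\n";  // hello there
```

    Note the === false checks: strpos() returns 0 when the match is at the very start of the string, and a loose comparison would mistake that for "not found".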
    When your thread has been resolved please edit the original post in the thread ()
    and amend "-[RESOLVED]-" to the end of the title and change the icon to , Thank you.

    When posting Code use the [VBCode]Code Here[/VBCode] tags to be able to use the code highlighting.

  3. #3
    PowerPoster
    Join Date
    Sep 2003
    Location
    Edmonton, AB, Canada
    Posts
    2,629
    You can use fopen() to open any web URL. You cannot download their literal source, but you can download and store the browser output into a variable.

    This should work:
    PHP Code:
    $f = fopen("http://www.vbforums.com", "r");
    $r = fread($f, 1024000);
    echo $r;
    fclose($f);
    Like Archer? Check out some Sterling Archer quotes.

  4. #4

    Thread Starter
    right. but will fopen open the website from the server the script is running from, or the pc of the person who is running the script? i want it to open from the pc of the person who is running the script...

    edit: i've tried this, and i only get the first 100 or so lines of source (on these forums only the top bar shows up). how come the whole page doesn't load?
    Last edited by shaunyboy; Apr 27th, 2004 at 10:54 AM.

  5. #5

    Thread Starter
    hmm i used this instead, and it got all the code:

    PHP Code:
    $contents = file_get_contents("http://www.vbforums.com");
    echo $contents;
    however, it gets the site from the server, not the user's pc. damn. what can i do? maybe this could be done in javascript or something like that... but i don't know javascript

  6. #6
    Ex-Super Mod'rater Electroman's Avatar
    Posted by shaunyboy
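    edit: i've tried this, and i only get the first 100 or so lines of source (on these forums only the top bar shows up). how come the whole page doesn't load?
    Because of the line:
    PHP Code:
    $r = fread($f, 1024000);
    this asks for at most 1,024,000 bytes (about 1000KB), but on a network stream a single fread() call returns as soon as some data has arrived, so it can hand back far less than that. Call fread() in a loop until feof() reports the end of the stream, or use file_get_contents(), to get the whole page.

    Are you after the PHP source or the generated HTML anyway?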
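
    A minimal sketch of that read loop. The helper name read_all() and the 8192-byte chunk size are assumptions, and an in-memory stream stands in for the URL so the snippet runs without network access:

```php
<?php
// Hypothetical helper (not from the thread): read a stream to the end.
function read_all($handle, int $chunkSize = 8192): string
{
    $data = '';
    while (!feof($handle)) {
        $chunk = fread($handle, $chunkSize);
        if ($chunk === false || $chunk === '') {
            break; // read error or nothing left to read
        }
        $data .= $chunk;
    }
    return $data;
}

// Offline demonstration with an in-memory stream standing in for a URL.
$h = fopen('php://memory', 'w+');
fwrite($h, str_repeat('x', 20000));
rewind($h);
$page = read_all($h);
fclose($h);
echo strlen($page), "\n"; // 20000
```

    For a real page the handle would come from fopen() on the URL, as in the code above. PHP's built-in stream_get_contents() does the same job in one call.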

  7. #7

    Thread Starter
    i'm after the source. I realised what that did when i looked it up on php.net, thanks anyway.

  8. #8
    Ex-Super Mod'rater Electroman's Avatar
    Originally posted by shaunyboy
    i'm after the source. I realised what that did when i looked it up on php.net, thanks anyway.
    To get the source you can use this:
    PHP Code:
    $f = fopen("file.php", "r");
    $r = fread($f, 1024000);
    echo htmlspecialchars($r); // escape it, or the browser will hide the PHP tags
    fclose($f);
    However, this must be a local filesystem path rather than a URL for you to get the source, which means it will only work for files on the server the script is running on; with a URL you get the generated HTML. BTW, as I pointed out in my first post, it is impossible to get the source of files that aren't on the same server. This protects the source, and it is Apache (or whatever web server is being used) that stops you: your server sends a request for the file to the server it lives on, Apache receives the request, and its configuration tells it that PHP files must be parsed before they are sent (hence the impossible).

  9. #9

    Thread Starter
    oh, you mean the actual script/code/whatever. no no, i just want to get the html the server generates. i'm not crazy.

  10. #10
    Ex-Super Mod'rater Electroman's Avatar
    In that case I suppose this thread is resolved

  11. #11

    Thread Starter
    no it's not.

    say the user is logged onto a site. i want the script to navigate to that site ON THEIR COMPUTER, so i can get the source and, for example, parse out the username, so i can then say: 'you are logged into site A as username: whatever'. see?

    but, since php is a server-side language, this aint gonna happen, right?
    Last edited by shaunyboy; Apr 27th, 2004 at 04:34 PM.

  12. #12
    Ex-Super Mod'rater Electroman's Avatar
    Originally posted by shaunyboy
    no it's not.

    say the user is logged onto a site. i want the script to navigate to that site ON THEIR COMPUTER, so i can get the source and, for example, parse out the username, so i can then say: 'you are logged into site A as username: whatever'. see?

    but, since php is a server-side language, this aint gonna happen, right?
    Correct. However, if you have your own PHP-enabled site, put the script you are talking about on it, then build a VB app that uses the browser control to open this script of yours. The browser control will then hold the HTML, and all that is left is to parse it.

  13. #13
    PowerPoster
    Actually, you could grab the user information from the site you're logged into, as long as the username is printed SOMEWHERE on the page. Getting it to parse correctly might be a problem, especially if they change their template a lot, but you can always grab that string. Finding a way to extract it, and only it, reliably is the hard part. Take this example I just whipped up:

    Page to get the username from: (change the GET request of 'user' to change the username that appears, eg: add ?user=name for 'name' to be the username)
    http://david.gamersepitome.net/files/php/parse/user.php

    Page that parses that page and grabs the username:
    http://david.gamersepitome.net/files...rse/parser.php

    Source of the page that has the username on it:
    PHP Code:
    <?php
      //set a default user if the $_GET['user'] var isn't set
      $_GET['user'] = (isset($_GET['user']) && $_GET['user'] != "") ? $_GET['user'] : "username";
    ?>
    <html>
      <head>
        <title>parse test</title>
      </head>
      <style type="text/css">
        body { background: #ffffff; color: #000000; }
        table { border: 1px solid #000000; }
      </style>
      <body>
        <table width="600" height="400">
          <tr>
            <td colspan="2" class="top" align="right">
              <?=date("F jS, Y", time()) . "\n";?>
            </td>
          </tr>
          <tr>
            <td colspan="2"><h1>parse test site</h1></td>
          </tr>
          <tr>
            <td width="100" valign="top">
              Logged in as:<br>
              <!-- TAKE NOTE OF THE BELOW HTML COMMENTS! -->

                <!-- username --><?=htmlspecialchars($_GET['user']);?><!-- username -->

              <!-- TAKE NOTE OF THE ABOVE HTML COMMENTS! -->
            </td>
            <td width="500" valign="top">
              <b>news:</b><br>
              <blockquote>
                in the news today, some guy was shot down while walking to the deli to buy a ham sandwich. we do not know why ham sandwiches are hated, but we intend to find out.
              </blockquote>
            </td>
          </tr>
        </table>
      </body>
    </html>
    Source of the page that parses that page and gets the username:
    PHP Code:
    <?php $site = "http://david.gamersepitome.net/files/php/parse/user.php?user=blah"; ?>
    reading and parsing source from "<i><?=$site;?></i>"....<br><br>

    <b>parsed result:</b>
    <?php
      $f = fopen($site, "r");
      $r = "";
      while (!feof($f)) {
        $r .= fread($f, 8192); //a network read can return less than requested, so loop until EOF
      }
      fclose($f);

      $p = explode("<!-- username -->", $r); //split the file by an html comment
      $puser = $p[(count($p) - 2)];

      echo "you're logged into site A with username: " . $puser . "\n";
    ?>
    <br><br><br>
    the full html source of "<i><?=$site;?></i>":
    <hr width="100%" size="1" color="#000000">
    <blockquote>
    <pre><?=htmlspecialchars($r);?></pre>
    </blockquote>
    <hr width="100%" size="1" color="#000000">
    Hope that helps you out a bit.. post any questions you might have.
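
    The explode() index above assumes the marker comment appears exactly twice. As an alternative sketch (swapping in preg_match(), which is not what the example above uses), the helper below returns null when the pair of markers is missing. The name extract_username() and the sample HTML are assumptions for illustration:

```php
<?php
// Hypothetical helper: return the text wrapped by a pair of
// "<!-- username -->" comments, or null when the pair is not found.
function extract_username(string $html): ?string
{
    $marker = preg_quote('<!-- username -->', '/');
    // (.*?) is non-greedy, so matching stops at the first closing marker;
    // the /s flag lets "." span line breaks.
    if (preg_match('/' . $marker . '(.*?)' . $marker . '/s', $html, $m) === 1) {
        return trim($m[1]);
    }
    return null;
}

$sample = "Logged in as:<br>\n<!-- username -->blah<!-- username -->\n</td>";
var_dump(extract_username($sample));             // string(4) "blah"
var_dump(extract_username("<p>no markers</p>")); // NULL
```

    Either way, the extraction only works while the target page keeps those comments in its template.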

  14. #14
    Ex-Super Mod'rater Electroman's Avatar
    I've just thought: I don't think that would work. Logging in to the page relies on cookies stored on the user's machine, but here the request comes from the PHP script on the server, so the user's cookies are never sent and the site sees the server as a visitor that isn't logged in. Is that right?

    BTW, what you said above is what I was kind of trying to get at, but I worded it badly.
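
    To make the cookie point concrete, here is a rough sketch of how a server-side request gets its cookies: only the ones the script puts into the stream context are sent, and the script has no access to cookies the visitor's browser holds for another site. The helper name context_with_cookies() and the cookie name and value are made up for illustration:

```php
<?php
// Hypothetical helper: build an HTTP stream context whose only cookies
// are the ones passed in explicitly.
function context_with_cookies(array $cookies)
{
    $pairs = [];
    foreach ($cookies as $name => $value) {
        $pairs[] = $name . '=' . rawurlencode($value);
    }
    return stream_context_create([
        'http' => [
            'method' => 'GET',
            'header' => 'Cookie: ' . implode('; ', $pairs) . "\r\n",
        ],
    ]);
}

// "session" and "abc123" are made-up values for illustration.
$ctx = context_with_cookies(['session' => 'abc123']);
$opts = stream_context_get_options($ctx);
echo $opts['http']['header']; // Cookie: session=abc123
// A fetch would then look like: file_get_contents($url, false, $ctx);
```

    Without such a context, fopen() and file_get_contents() send no cookies at all, which is why the remote site treats the request as coming from a logged-out visitor.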

  15. #15

    Thread Starter
    wow, thanks, but again, like electroman says, it won't work... thanks for the parsing help though.

  16. #16
    Ex-Super Mod'rater Electroman's Avatar
    Is there anything stopping you from using a browser control to open the page and use VB to parse the HTML from the control?

  17. #17

    Thread Starter
    Originally posted by Electroman
    Is there anything stopping you from using a browser control to open the page and use VB to parse the HTML from the control?
    yes. I'm looking for a web-based solution; i am not making a program, and people are very cautious about what they download and run these days... and also for ease of use reasons.
