
Thread: How can I modify this code to search non-English text?

  1. #1

    Thread Starter
    New Member
    Join Date
    Aug 2000
    Posts
    12
    hi

    Code:
    #!/usr/local/bin/perl
    ####################################################################
    # URL Search Engine
    # Copyright 1996 Techno Trade  http://www.technotrade.com
    # Written By : Sammy Afifi   [email protected]
    # Date Last Modified : Jan 14, 1997
    ####################################################################
    #
    # This script is free of charge.
    # Please link back to http://technotrade.com, thank you :-)
    #######
    #
    # Jan 14, stops people from entering html tags, converts < to &lt;
    #
    ####################################################################
    
    #  $linktitle, $linkdescrip,  $linkwords, $linkemail, $linkurl
    # define some global variables
    
        $fields = 5;                       # Number of fields in each record
        $filename = "urls.txt";      # The database text file
        $results = 1000;               # maximum number of results to display
    
        &parse_form;
        
        $searchstring = $FORM{'searchstring'};
    
        &addrecord if ($searchstring eq "**ADD RECORD**");
    
    
        &open_file("FILE1","",$filename);
    
        print "Content-type: text/html\n\n";
        print "<HTML>\n";
        print "<BODY BGCOLOR=#FFFFFF TEXT=#000000 LINK=#0000FF VLINK=#800040 ALINK=#800040>\n";
        print "<TITLE>Search Results</TITLE>\n";
        print "<CENTER><BR>\n";
        print "<FONT SIZE=5 COLOR=000000 FACE=\"ARIAL,TIMES NEW ROMAN\"><B>Results :</b></FONT></CENTER>\n";
    
        print "<HR width=80% noshade><BR><UL>\n";
        $counter = 0;
    
        while (($line = &read_file("FILE1")) && ($counter < $results)) {
             # split the fields at the | character     
             @tabledata = split(/\s*\|\s*/,$line ,$fields);
              &check_record;
              if ($found == 1) {
                $counter++;
                &print_record;
              }
    
        }
        close(FILE1);
        print "</UL>\n";
    
        if ($counter == 0) {
           print "<BR><center><B>Sorry ! not found.</B>\n";
        }
        
        
        print "<CENTER>\n";
        print "<HR width=80% noshade>\n";
        print "</CENTER>\n";
        print "</A></BODY></HTML>\n";
    
    
    
    #########################################
    #
    #  Print the matched record
    #
    #########################################
    sub print_record {
           print "<BR>\n";
           print "<LI><A HREF=\"$linkurl\">$linktitle</A>  : $linkdescrip<BR>\n";
    }
    
    
    ##########################################
    #
    #  Check to see if record matches search criteria
    #
    ##########################################
    sub check_record {
        # get the data from the record read from the file. $tabledata
    
       $linktitle = $tabledata[0];
       $linkdescrip = $tabledata[1];
       $linkwords = $tabledata[2];
       $linkemail = $tabledata[3];
       $linkurl   = $tabledata[4];
       chomp($linkurl);   # remove only the trailing newline (chop would drop any last character)
    
        #build the search line with all fields we want to search in
        $searchline = $linktitle . " " . $linkdescrip . " " . $linkwords;
    
    
       #search by keywords
       # only perform the keyword search if the search string is longer than 1 character
       # (the idea is to skip trivial searches; short words such as "or" still pass this check)
       $sfound = 0;
       $found = 0;
       $notfound = 1; 
    
       $stlen = length($searchstring);
       if ($stlen > 1) {
           @words = split(/ +/,$searchstring);
            foreach $aword (@words) {
               if ($searchline =~ /\b\Q$aword\E/i) {   # \Q..\E: match the word literally, not as a regex
                      $sfound = 1;
               } 
               else {
                      $notfound = 0;
                }
             }
         }
        if ($sfound == 1 && $notfound == 1) {
            $found = 1;
         }
    
        # if search string is too small .. set found to 1
        if ($stlen <= 1) {
            $found = 1;
        }
        #if page doesn't have a title then return not found
        $tlen = length($linktitle);
        if ($tlen < 1) {
            $found = 0;
        }
    }
    
    
    ############################################
    #
    #  Add Record
    #
    ############################################
    
    sub addrecord {
    
      $linktitle    = $FORM{'linktitle'};
      $linkdescrip    = $FORM{'linkdescrip'};
      $linkwords    = $FORM{'linkwords'};
      $linkemail   = $FORM{'linkemail'};
      $linkurl = $FORM{'linkurl'};
    
      # Convert < to &lt; so that submitted HTML tags are not rendered
      $linktitle =~ s/</&lt;/g;
      $linkdescrip =~ s/</&lt;/g;
      $linkwords =~ s/</&lt;/g;
      $linkemail =~ s/</&lt;/g;
      $linkurl =~ s/</&lt;/g;
    
    
      
      &open_file("FILE1",">>",$filename);
    
      &write_file("FILE1",$linktitle . "|". $linkdescrip. "|" .$linkwords ."|" .$linkemail ."|" .$linkurl ."\n");
       close(FILE1);
       print "Content-type: text/html\n\n";
       print "<html><head><title>Thank You</title></head>\n";
       print "<body background=\"/images/bground.gif\" bgcolor=\"#ffffff\" text=\"#000000\" marginheight=\"0\" topmargin=\"0\"><BR><h3><CENTER>Thank you !</CENTER></h3>\n";
       print "</body></html>\n";
       exit;
    }
    
    
    
    
    
    
    
    
    
    
    
    sub parse_form {
    
       read(STDIN, $buffer, $ENV{'CONTENT_LENGTH'});
       if (length($buffer) < 5) {
             $buffer = $ENV{QUERY_STRING};
        }
     
      @pairs = split(/&/, $buffer);
       foreach $pair (@pairs) {
          ($name, $value) = split(/=/, $pair);
    
          $value =~ tr/+/ /;
          $value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
    
          $FORM{$name} = $value;
       }
    }
    
    
    
    sub open_file {
    
      local ($filevar, $filemode, $filename) = @_;
      
      open ($filevar,$filemode . $filename) ||
         die ("Can't open $filename");
    }
    
    sub read_file {
    
      local ($filevar) = @_;
    
      <$filevar>;  
    }
    
    sub write_file {
    
      local ($filevar, $line) = @_;
    
      print $filevar ($line);
    }
    I'm waiting ...

  2. #2
    Addicted Member
    Join Date
    Aug 1999
    Location
    Ilirska Bistrica, Slovenia
    Posts
    242
    As far as I know Perl, there's no need to modify that code. It's just a search engine, and it will search for the specified string no matter what language that string is written in.

    An analogy:
    If you wanted to write a Spanish program in VB, Microsoft would have to translate all the VB keywords, operators, statements and so on into Spanish before you could do it. No offence, but isn't that stupid?
    Zvonko Bostjancic
    Ilirska Bistrica, Slovenia
    [email protected]
    Using VS6 Professional with SP3
    Programming mostly in VB and I've started to learn VC++ & MFC

  3. #3

    Thread Starter
    New Member
    Join Date
    Aug 2000
    Posts
    12
    Wanted: Perl programmer

  4. #4
    Frenzied Member
    Join Date
    Mar 2000
    Location
    Amsterdam, the Netherlands
    Posts
    1,986
    Hey, I think Zvonko's right. Of course it will search in any language; that has nothing to do with the code. Just be sure to enter the keywords in the desired language.

    And BTW, why are you posting a Perl question in a VB forum?
    Jop - validweb.nl

    Alcohol doesn't solve any problems, but then again, neither does milk.

  5. #5
    Fanatic Member
    Join Date
    Feb 2000
    Location
    Japan
    Posts
    840
    Maybe Perl has the same problem that VB has with double-byte characters. In VB, Len() returns the length in characters, whereas LenB() returns it in bytes.

    Not sure how Perl handles strings, but it'd be worth investigating.
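
    Here is a rough sketch of what that investigation might look like in Perl. It assumes the form data and urls.txt are UTF-8 encoded (the thread does not say which encoding is in use), and it uses the core Encode module. The sample word is the Russian "привет", written as byte escapes so the source stays ASCII; the variable names are made up for the illustration. It runs the same kind of test as check_record, first on raw bytes and then on decoded character strings.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode);

# Assumption: the data is UTF-8. These are the raw UTF-8 bytes of the
# Russian word "privet" in lower case and in upper case (6 letters each),
# as they would arrive from the CGI form or from the text file.
my $lower_bytes = "\xd0\xbf\xd1\x80\xd0\xb8\xd0\xb2\xd0\xb5\xd1\x82";
my $upper_bytes = "\xd0\x9f\xd0\xa0\xd0\x98\xd0\x92\xd0\x95\xd0\xa2";

# On undecoded data, length() counts bytes.
my $byte_len = length($lower_bytes);

# The same kind of test check_record performs, applied to raw bytes.
my $byte_match = ($upper_bytes =~ /\b\Q$lower_bytes\E/i) ? 1 : 0;

# Decode the bytes into Perl character strings.
my $lower_chars = decode('UTF-8', $lower_bytes);
my $upper_chars = decode('UTF-8', $upper_bytes);

# On decoded data, length() counts characters.
my $char_len = length($lower_chars);

# The same test again, now on character strings.
my $char_match = ($upper_chars =~ /\b\Q$lower_chars\E/i) ? 1 : 0;

print "length in bytes:      $byte_len\n";
print "length in characters: $char_len\n";
print "match on raw bytes:   $byte_match\n";
print "match after decoding: $char_match\n";
```

    On raw bytes the word is 12 long and the match fails: the bytes above 127 are not treated as word characters, so \b never matches, and /i cannot fold the case of a multi-byte letter. After decoding, the length is 6 and the match succeeds. If that is the problem here, one possible fix would be to decode $searchstring and each $line before check_record runs and to encode the output again before printing, but that is a guess until the encoding of the data is known.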
    Paul Dwyer
    Network Engineer
    Aussie In Tokyo

    Using Powerbasic 6 & VB6 SP4 (Please also add your VB Version to your signature!)
