Results 1 to 14 of 14

Thread: keyword search

  1. #1

    Thread Starter
    Addicted Member
    Join Date
    Jul 2002
    Location
    Grand Rapids, MI
    Posts
    166

    keyword search

    I need to do a keyword search on various fields in a database. However, I'm not getting my keywords from a textbox but from a string. A user enters a question or sentence, and I need to find other records that may be related.

    Problem 1) I figure to get the keywords, I would split the string on spaces, but I don't want to search for all the words, especially not the more common ones (like the, a, an, but, etc.) Could I do this with a text file of common words, or maybe a table of common words in the database?

    Problem 2) Assuming I have the keywords, how would I do the search? I can't assume that the keywords will always be in the same order...

    Any help would be appreciated. Code would be ok, but I'm looking more just for the general logic that I would need for such a project. Thanks.

  2. #2
    -= B u g S l a y e r =- peet's Avatar
    Join Date
    Aug 2000
    Posts
    9,629
    Hi there Feetstink, sounds to me that you already have thought this through.

    1. As you suggested yourself, split the sting into an array.
    Load the nono words from a file. Loop through the array and build up your SQL expr. While looping check that the current word is not a nono word by using InStr.

    2. The strings you want must contain all the valid words from the array in 1. You can achieve this by using AND in your sql

    WHERE Field1 Like '%asdfasd%' And Field1 Like '%axcfrewr%' ......
    -= a peet post =-

  3. #3

    Thread Starter
    Addicted Member
    Join Date
    Jul 2002
    Location
    Grand Rapids, MI
    Posts
    166
    Thanks for the reply

    The first part sounds good, and I think I can code it.

    The second part is good too, but I need something a little different. I need it to bring back matches that are like >75% right too... I dont' want to eliminate a match because someone used the word Compaq instead of computer. You know what I mean?

    thanks..

  4. #4
    -= B u g S l a y e r =- peet's Avatar
    Join Date
    Aug 2000
    Posts
    9,629
    Oh.. I see. Think that will be a tough one. I have never done such coding, so I really can't give you any usefull tip on that subject ... sorry
    -= a peet post =-

  5. #5

    Thread Starter
    Addicted Member
    Join Date
    Jul 2002
    Location
    Grand Rapids, MI
    Posts
    166
    anyone?

  6. #6
    Super Moderator si_the_geek's Avatar
    Join Date
    Jul 2002
    Location
    Bristol, UK
    Posts
    41,974
    I looked at this and thought "I should know how to do this!".. so i've worked it out!

    you need to do this in a few stages, including temporary tables (temp table syntax may be different depending on you database system). You'll need to change field/table names to what you have in your database

    Step 1: put all matching rows into temporary table
    Code:
    select *, 0 as tmp_count from department
    where dept_name like '%Di%'
    or dept_name like '%ec%'
    or ....
    into my_temp
    Step 2: for each row n the temp table, calculate how many of the words match (repeat for each word)
    Code:
    update my_temp
    set tmp_count = tmp_count + 1
    where dept_name like '%Di%'
    ;
    update my_temp 
    set tmp_count = tmp_count + 1
    where dept_name like '%ec%'
    ;
    ...
    Step 3: get the results! (in descending order or relevance)
    Code:
    select * from my_temp
    order by tmp_count desc
    Step 4: remove the temporary table
    Code:
    drop temp table my_temp

  7. #7
    Super Moderator si_the_geek's Avatar
    Join Date
    Jul 2002
    Location
    Bristol, UK
    Posts
    41,974
    update: to get the percentage relevance too, change the SQL in step 3 to this:

    Code:
    select *, ( tmp_count /  <number of words to search for>  ) * 100 
    from my_temp
    order by tmp_count desc

  8. #8
    Frenzied Member swatty's Avatar
    Join Date
    Aug 2002
    Location
    somewhere on earth
    Posts
    1,478
    Which database your checking against ??


    If it is SQL Server there is something like English Query that is available for this kind of stuff.
    Code:
    If Question = Incomplete Then
       AnswerNextOne
    Else
       ReplyIfKnown
    End If
    cu Swatty

  9. #9

    Thread Starter
    Addicted Member
    Join Date
    Jul 2002
    Location
    Grand Rapids, MI
    Posts
    166

    Temporary Tables

    I'm sorry i took such a long time to reply to my own question. Thanks for your help. I'm using Microsoft Access 2000 for my database. I can't find anything about temporary tables in the help for Access. is it possible to do?
    In the first line of Step 1: it says
    select *, 0 as tmp_count from department
    what does the 0 do?

    Also does anyone know where I can find a text file of common words so I don't have to start one from scratch? preferably comma delmited...

    thanks again..

  10. #10
    Super Moderator si_the_geek's Avatar
    Join Date
    Jul 2002
    Location
    Bristol, UK
    Posts
    41,974
    the "0 as tmp_count" means that when the temporary table is filled with the matching rows, there is an extra column added (called tmp_count), which is later used to count the number of times that the word(s) appear (in step 2).

    This way you can get them ordered by relevance, rather than just in a random order.


    the alternative would be to just do this instead of all the steps I listed before:

    select * from department
    where dept_name like '%Di%'
    or dept_name like '%ec%'
    or ....

    of course this doesn't put the most relevant ones first.



    I've just checked Access, the INTO TEMP bit moves up to before the FROM, eg:

    select *, 0 as tmp_count into my_temp
    from department
    where dept_name like '%Di%'
    or dept_name like '%ec%'
    or ....

    FROM Countries;

  11. #11
    Addicted Member
    Join Date
    Aug 2002
    Location
    Belfast, N. Ireland
    Posts
    167
    Maybe this will be some help,

    1000 most common English words

  12. #12

    Thread Starter
    Addicted Member
    Join Date
    Jul 2002
    Location
    Grand Rapids, MI
    Posts
    166
    thank you both for your replies, si especially...
    i'll probably be back sometime, but it may take a long time for me to get enough of a working version to have more questions.

  13. #13
    Super Moderator si_the_geek's Avatar
    Join Date
    Jul 2002
    Location
    Bristol, UK
    Posts
    41,974
    no problem, it was useful for me too... not that I'm gonna use it at the moment!

    have fun with the rest of it

  14. #14

    Thread Starter
    Addicted Member
    Join Date
    Jul 2002
    Location
    Grand Rapids, MI
    Posts
    166
    where did you find the information for the temporary tables in Access? I can't find anything about dropping them... i just need a site where i can find more of that kind of information, if you know of one. thanx.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  



Click Here to Expand Forum to Full Width