
Thread: normalize weighting

  1. #1

    Thread Starter
    New Member
    Join Date
    Mar 2006
    Posts
    1

    normalize weighting

    I'm trying to figure out a fair weighting system for words in a file. Here is what I mean. Let's say I have a file that has 200 unique words, and the word 'foo' appears in it 5 times. Then I have another file that has 150 unique words, with 'foo' also appearing 5 times. That works out to 5/200 = 0.025 and 5/150 = 0.033. Just because the second file has fewer words doesn't make it a better match for 'foo'. They should be equal. Is there any way I can normalize the weights? I can't just use the number of occurrences, because there are other metrics involved. I hope this makes sense. Any help would be appreciated.
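
For reference, the arithmetic in the question can be reproduced with a short sketch. The function name `weight` is made up for illustration, and the divisor is the number of unique words, as in the post; it is not a recommended weighting scheme.

```python
def weight(occurrences: int, unique_words: int) -> float:
    """Occurrences of a word divided by the number of unique words in the file."""
    if unique_words <= 0:
        raise ValueError("unique_words must be positive")
    return occurrences / unique_words


# The two files from the question: 'foo' appears 5 times in each.
file_a = weight(5, 200)  # file with 200 unique words
file_b = weight(5, 150)  # file with 150 unique words

print(f"{file_a:.3f} {file_b:.3f}")  # prints "0.025 0.033"
```

The smaller file scores higher purely because its divisor is smaller, which is the effect being asked about.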

  2. #2
    Frenzied Member zaza's Avatar
    Join Date
    Apr 2001
    Location
    Borneo Rainforest Habits: Scratching
    Posts
    1,486

    Re: normalize weighting

    Welcome to the Forums


    Quote: "just because the second file has fewer words doesn't make it a better match for foo"
    Yes, it does. A higher proportion of the words in that file are "foo", and hence the file has more relevance. Unless by "better match" you mean that a 200,000-word file with 2 instances of "foo" would be a better match than a 1-word file consisting solely of "foo", in which case you have to go by the absolute number.
    If you want to normalise something, then you have to know what you're normalising it to. If you want a file to count as more relevant when "foo" is its first word than when it is its second, there are ways of doing such things, but you need to decide what makes one file a "better match" than another before you can proceed.
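
A minimal sketch of why the definition has to come first: the same two files rank in opposite order depending on whether "better match" means absolute count or proportion. The helper names (`absolute_count`, `proportion`, `best_match`) and the filler word "bar" are assumptions for illustration, and the proportion here divides by the total word count rather than the unique word count.

```python
from collections import Counter


def absolute_count(words, term):
    """Score a file by how many times the term appears."""
    return Counter(words)[term]


def proportion(words, term):
    """Score a file by the fraction of its words that are the term."""
    return Counter(words)[term] / len(words) if words else 0.0


def best_match(files, term, score):
    """Return the name of the file with the highest score for the term."""
    return max(files, key=lambda name: score(files[name], term))


files = {
    # 200,000 words, 2 of which are "foo" (filler word is arbitrary)
    "big": ["foo", "foo"] + ["bar"] * 199_998,
    # a 1-word file consisting solely of "foo"
    "tiny": ["foo"],
}

print(best_match(files, "foo", absolute_count))  # prints "big"
print(best_match(files, "foo", proportion))      # prints "tiny"
```

Neither answer is wrong; they just answer different questions, so the scoring function has to be chosen before any normalisation makes sense.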

    zaza
    I use VB 6, VB.Net 2003 and Office 2010
