Thread: Math question - how to remove values way different from others in an array of numbers

  1. #1

    Thread Starter
    Frenzied Member
    Join Date
    Jul 2003
    Posts
    1,269

    Math question - how to remove values way different from others in an array of numbers

    Let's say I have an array of 10 items:

    12834
    34252
    53526
    12320
    12140
    353622
    534
    53277
    89233
    23512

    How can I 'detect' that 353622 is wildly different from the others in the list, as is 534? And if I wanted to be more stringent, detect 89233 as different too?

  2. #2

  3. #3

    Thread Starter
    Frenzied Member
    Join Date
    Jul 2003
    Posts
    1,269

    Re: Math question - how to remove values way different from others in an array of numbers

    Quote Originally Posted by MartinLiss
    What are the rules that define something as different?
    I'm not sure. I suppose these would be defined as 'outliers' in statistics.

  4. #4

  5. #5
    Fanatic Member Matt_T_hat's Avatar
    Join Date
    Dec 2001
    Location
    '76 Male Body Evil-Errors: 666
    Posts
    774

    Re: Math question - how to remove values way different from others in an array of numbers

    Let me see how much I can remember from statistics.

    You would want to find the mean (the sum of the numbers divided by how many numbers there are, n). However, if you have significant rogue numbers you may wish to use the geometric mean instead: multiply all the numbers together and take the nth root.

    In your case the tenth root.
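
    As a rough illustration of the two means described above, here is a minimal Python sketch. The thread itself names no language, so Python and the variable names are my own assumptions; `statistics.geometric_mean` needs Python 3.8 or later and strictly positive inputs.

```python
import statistics

# The ten values from the first post.
values = [12834, 34252, 53526, 12320, 12140, 353622, 534, 53277, 89233, 23512]

# Arithmetic mean: sum of the values divided by n.
arithmetic_mean = statistics.mean(values)

# Geometric mean: nth root of the product of the values (here the tenth root).
geometric_mean = statistics.geometric_mean(values)

print(f"arithmetic mean: {arithmetic_mean:.1f}")
print(f"geometric mean:  {geometric_mean:.1f}")
```

    The geometric mean is pulled up less by a huge value like 353622, but note that it is pulled down by very small values such as 534, so it is not immune to outliers either.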

    Now that you have a reliable mean, you can work out the standard deviation. The formula for that should be easy to find on Google (I can't remember it off the top of my head). Spreadsheet packages have such functions built in.

    Then you would want to find everything within one standard deviation of the mean, which should be around 68% of your population (those numbers) if the data are roughly normally distributed.

    This page seems to cover that pretty well:

    http://www.statcan.ca/english/edu/po...2/variance.htm

    Now you would be looking for a number of standard deviations that contains 95% or more of your population sample (about two, for roughly normal data). As your sample is so small, you are going to be looking for a range that contains all but 2 or 3 of the values.

    inside_range > (n - 3)
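
    To make the `inside_range > (n - 3)` check concrete, here is a small Python sketch that counts how many values fall within k standard deviations of the mean. The function name `count_within` and the choice of the sample standard deviation (`statistics.stdev`) are assumptions on my part, not something stated in the thread.

```python
import statistics

# The ten values from the first post.
values = [12834, 34252, 53526, 12320, 12140, 353622, 534, 53277, 89233, 23512]

def count_within(data, k):
    """Count how many items lie within k standard deviations of the mean."""
    mean = statistics.mean(data)
    sd = statistics.stdev(data)  # sample standard deviation (an assumption)
    lower, upper = mean - k * sd, mean + k * sd
    return sum(lower <= x <= upper for x in data)

n = len(values)
for k in (1, 2, 3):
    inside_range = count_within(values, k)
    # Apply the check from the post for each candidate number of SDs.
    print(k, inside_range, inside_range > (n - 3))
```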

    Using code you will have an upper bound and a lower bound outside of which are values you don't want. You could drop these from the array and recalculate the mean and the SD and repeat until you are happy.
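
    The drop-and-recalculate loop could be sketched like this in Python. Treat it as one possible reading of the idea rather than a definitive implementation: `trim_outliers`, the default of k = 2 standard deviations and the `max_rounds` cap are all hypothetical choices.

```python
import statistics

# The ten values from the first post.
values = [12834, 34252, 53526, 12320, 12140, 353622, 534, 53277, 89233, 23512]

def trim_outliers(data, k=2.0, max_rounds=10):
    """Repeatedly drop values outside mean +/- k*SD until nothing changes."""
    kept = list(data)
    for _ in range(max_rounds):
        if len(kept) < 3:
            break  # too few values left for a meaningful SD
        mean = statistics.mean(kept)
        sd = statistics.stdev(kept)  # sample standard deviation (an assumption)
        lower, upper = mean - k * sd, mean + k * sd
        inside = [x for x in kept if lower <= x <= upper]
        if len(inside) == len(kept):
            break  # nothing was dropped this round, so stop
        kept = inside
    return kept

print(trim_outliers(values))
```

    Lowering k makes the filter more stringent. Each round recomputes the mean and SD from the survivors, so a value that looked fine in the first pass can still be dropped later.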

    The other option would be to create a function that takes the mean, the SD and a value as arguments and returns the number of SDs the value is from the mean, i.e. (value - mean) / SD. It is fairly simple stuff. You could then test for values whose distance in SDs is outside of the acceptable range and drop them at that stage.
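
    A function of that shape might look like the following Python sketch. The name `sd_distance` and the threshold of 2 are my own illustrative choices; statisticians usually call this quantity a z-score.

```python
import statistics

# The ten values from the first post.
values = [12834, 34252, 53526, 12320, 12140, 353622, 534, 53277, 89233, 23512]

def sd_distance(value, mean, sd):
    """Return how many standard deviations `value` lies from `mean` (signed)."""
    if sd == 0:
        raise ValueError("standard deviation is zero; all values are identical")
    return (value - mean) / sd

mean = statistics.mean(values)
sd = statistics.stdev(values)  # sample standard deviation (an assumption)

threshold = 2.0  # hypothetical cut-off: flag anything more than 2 SDs away
flagged = [x for x in values if abs(sd_distance(x, mean, sd)) > threshold]
print(flagged)
```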

    I hope that helps.

  6. #6
    Junior Member
    Join Date
    Aug 2007
    Posts
    17

    Re: Math question - how to remove values way different from others in an array of numbers

    Yeah, I was thinking something similar.

    First, take the mean of all the numbers
    Second, find the range of all the numbers
    Third, create a new range based on a preset % of the old range
    new_range = [mean - pct*old_range/2, mean + pct*old_range/2], where pct is the preset percentage written as a fraction
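
    The three steps above can be sketched in Python as follows. The function name `range_filter` and the example value of 0.5 for the preset percentage are assumptions made purely for illustration.

```python
import statistics

# The ten values from the first post.
values = [12834, 34252, 53526, 12320, 12140, 353622, 534, 53277, 89233, 23512]

def range_filter(data, pct):
    """Keep values inside a window centred on the mean whose width is
    pct (a fraction, e.g. 0.5 for 50%) of the original range."""
    mean = statistics.mean(data)
    old_range = max(data) - min(data)
    half_width = pct * old_range / 2
    lower, upper = mean - half_width, mean + half_width
    return [x for x in data if lower <= x <= upper]

kept = range_filter(values, 0.5)
print(kept)
```

    Because the window is centred on the mean and sized from the full range, a single extreme value both shifts and widens it, so a smaller percentage is needed to catch milder outliers.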
