Using a mathematical identity to speed up certain calculations.-VBForums
Page 1 of 2 12 LastLast
Results 1 to 40 of 49

Thread: Using a mathematical identity to speed up certain calculations.

  1. #1

    Thread Starter
    Hyperactive Member Maven's Avatar
    Join Date
    Feb 2003
    Location
    Greeneville, TN
    Posts
    321

    Using a mathematical identity to speed up certain calculations.

    A long time ago, a mathematician by the name of Carl Friedrich Gauss was attending grade school. The teacher of his class wanted a break, so the teacher decided to assign the class a very time consuming problem. Sum all of the numbers between 1 and 50. 1+2+3..+..+49+50. So the students of the class set to work on the daunting task of summing all of these numbers. After a minute or two, the teacher noticed that Gauss wasn't doing anything. After investigation, the teacher found out that Gauss had discovered a new way to solve the problem without having to sum up each element by hand. The identity he found was: the sum of i from 1 to n is (n(n+1))/2.

    What is the moral of the story? Don't use a loop if you can do something in constant time.

    A good web site for practicing programming skills is Project Euler. Of particular interest, the very first problem can be solved using the method above.

    Try to solve it on your own before reading the rest of this post.






    Keep trying......






    Alright. So you need to know the sum of multiples of three or five for every single number under one thousand.

    The first step is to calculate the sum of multiples of threes under 1000:

    The sum of i from [1 to n] is (n(n+1))/2; as a result, 3 * sum of i from [1 to n] is 3(n(n+1))/2.
    where n = floor(N/3) => floor(999/3) = 333
    N is the total number you want to count up to.
    The next step is to calculate the sum of multiples of fives under 1000:

    The sum of i from [1 to m] is (m(m+1))/2; as a result, 5 * sum of i from [1 to m] is 5(n(n+1))/2.
    where m = floor(N/5) => floor(999/5) = 166
    We can now begin to construct a formula.

    3(n(n+1))/2 + 5(m(m+1))/2
    where n = floor(N/3) => floor(999/3) => 333
    where m = floor(N/5) => floor(999/5) => 199
    If you try this formula, you might get an incorrect answer. Why?

    It turns out that there is a set theory problem lurking behind the scenes. The left hand component will generate values like 3*5 while the right hand component will generate values like 5 * 3. So we are calculating 15 twice in the above formula (along with other similar values).

    So it turns out we got one more step to do. We need to subtract off the extra calculation we are doing. How do we do that?

    3(n(n+1))/2 + 5(m(m+1))/2 - (3*5)(o(o+1)/2
    where n = floor(N/3) => floor(999/3) => 333
    where m = floor(N/5) => floor(999/5) => 166
    where o = floor(N/15) => floor(999/15) => 66
    Now we should get the correct value. I apologize for the syntax, but I don't think this thread supports latex.

    The above method is the smart man's approach to solving the problem. Notice that it can be implemented in code without using a single loop.

    *updated thanks to Schmidt for pointing out the index error.

    Here is how the above mathematics is implemented in visual basic.

    Code:
    Public Class Form1
    
        Function CalculateResultBest(ByVal num As Long) As Long
            Dim n As Integer
            Dim m As Integer
            Dim o As Integer
    
            n = Math.Floor(num / 3)
            m = Math.Floor(num / 5)
            o = Math.Floor(num / 15)
    
            CalculateResultBest = ((3 * n * (n + 1)) / 2) + ((5 * m * (m + 1)) / 2) - ((15 * o * (o + 1)) / 2)
        End Function
    
        Private Sub Form1_Load(sender As Object, e As EventArgs) Handles MyBase.Load
            MsgBox(CalculateResultBest(999))
        End Sub
    End Class
    Last edited by Maven; Jun 28th, 2014 at 06:40 PM.
    Education is an admirable thing, but it is well to remember from time to time that nothing that is worth knowing can be taught. - Oscar Wilde

  2. #2
    Noodly Appendage wossname's Avatar
    Join Date
    Aug 2002
    Location
    #!/bin/bash
    Posts
    5,669

    Re: Using a mathematical identity to speed up certain calculations.

    Come again?
    I don't live here any more.

  3. #3

    Thread Starter
    Hyperactive Member Maven's Avatar
    Join Date
    Feb 2003
    Location
    Greeneville, TN
    Posts
    321

    Re: Using a mathematical identity to speed up certain calculations.

    Quote Originally Posted by wossname View Post
    Come again?
    You can solve the problem in O(1) time using the above method. The formula 3(n(n+1))/2 + 5(n(n+1))/2 - (3*5)(n(n+1)/2 will solve it so that you don't need a loop. The formula will allow you to solve it for any n, but the problem asks for 1000. [error]Simply replace n with 1000 and do the calculation in your program to yield the answer to the problem.[end of error]

    [correction made in op]

    If you have a specific question about the mathematics behind it, well ask.

    But this is the most efficient way to solve this type of problem. The dumb method is to use loops and brute force everything.
    Last edited by Maven; Jun 28th, 2014 at 05:36 PM.
    Education is an admirable thing, but it is well to remember from time to time that nothing that is worth knowing can be taught. - Oscar Wilde

  4. #4
    Banned
    Join Date
    Jun 2013
    Posts
    1,002

    Re: Using a mathematical identity to speed up certain calculations.

    Quote Originally Posted by Maven View Post
    You can solve the problem in O(1) time using the above method. The formula 3(n(n+1))/2 + 5(n(n+1))/2 - (3*5)(n(n+1)/2 will solve it so that you don't need a loop. The formula will allow you to solve it for any n, but the problem asks for 1000. Simply replace n with 1000 and do the calculation in your program to yield the answer to the problem.

    If you have a specific question about the mathematics behind it, well ask.
    I'm tempted, but from what I've seen so far I'm not sure if that is a good idea...

    Quote Originally Posted by Maven View Post
    But this is the most efficient way to solve this type of problem. The dumb method is to use loops and brute force everything.
    Well, among "dumb things a developer can do", premature optimization has a high ranking.

    So, what's wrong in using a simple "naive loop", as long as it is fast enough I might ask:

    E.g. for your given problem at hand, this would be the easily understood snippet below:
    Code:
    Function CalculateResult(ByVal n As Long) As Long
    Dim i As Long, Result As Long
      For i = 1 To n
        If i Mod 3 = 0 Then Result = Result + i
        If i Mod 5 = 0 Then Result = Result + i
      Next i
      CalculateResult = Result
    End Function
    Sure, if you're a bright candle and don't have to think about the more optimized approach
    very hard, then there's nothing wrong in writing it up "as you go" of course.

    Though the *most dumb* thing a developer can do, is to use (or promote) an "optimized approach",
    when it spits out the wrong results (as your recommended solution surely does).

    When I add the missing bracket in your expression, testing it out in the DirectWindow it gives:
    n=1000: ? 3*(n*(n+1))/2 + 5*(n*(n+1))/2 - (3*5)*(n*(n+1))/2
    -3503500

    Which obviously isn't the right result.

    So (as I read it earlier somewhere):

    Try again...

    You might want to take a look at the optimized approach below, to get some ideas on what you did wrong:
    Code:
    Function CalculateResultOptimized(ByVal n As Long) As Long
    Dim n3 As Long: n3 = n \ 3
    Dim n5 As Long: n5 = n \ 5
      CalculateResultOptimized = (3 * n3 * (n3 + 1) + 5 * n5 * (n5 + 1)) \ 2
    End Function
    Both approaches (the naive loop, as well as the optimized snippet above) give the correct result:
    ?CalculateResult(1000), CalculateResultOptimized(1000)
    267333 267333

    Olaf
    Last edited by Schmidt; Jun 28th, 2014 at 12:54 PM.

  5. #5
    Fanatic Member
    Join Date
    Apr 2012
    Posts
    616

    Re: Using a mathematical identity to speed up certain calculations.

    You are including numbers that are multiples of both 3 and 5 twice

    e.g. When i is 15, both If statements are true and, hence, 30 is added to your Result variable. This is the 'set theory' problem the OP mentions.

    Code:
    Function CalculateResult(ByVal n As Long) As Long
    Dim i As Long, Result As Long
      For i = 1 To n
        If i Mod 3 = 0 Then Result = Result + i
        If i Mod 5 = 0 Then Result = Result + i
      Next i
      CalculateResult = Result
    End Function
    Oh, and your loop should go up to n-1 since that is also stated in the problem.

    So..

    Code:
    Function CalculateResult(ByVal n As Long) As Long
    Dim i As Long, Result As Long
      For i = 1 To n - 1
        If i Mod 3 = 0 Or i Mod 5 = 0 Then Result = Result + i
      Next i
      CalculateResult = Result
    End Function
    ...is correct.
    Last edited by ColinE66; Jun 28th, 2014 at 03:22 PM.
    If you don't know where you're going, any road will take you there...

    My VB6 love-children: Vee-Hive and Vee-Launcher

  6. #6
    Banned
    Join Date
    Jun 2013
    Posts
    1,002

    Re: Using a mathematical identity to speed up certain calculations.

    Quote Originally Posted by ColinE66 View Post
    You are including numbers that are multiples of both 3 and 5 twice

    e.g. When i is 15, both If statements are true and, hence, 30 is added to your Result variable. This is the 'set theory' problem the OP mentions.

    Code:
    Function CalculateResult(ByVal n As Long) As Long
    Dim i As Long, Result As Long
      For i = 1 To n
        If i Mod 3 = 0 Then Result = Result + i
        If i Mod 5 = 0 Then Result = Result + i
      Next i
      CalculateResult = Result
    End Function
    Oh, and your loop should go up to n-1 since that is also stated in the problem.

    So..

    Code:
    Function CalculateResult(ByVal n As Long) As Long
    Dim i As Long, Result As Long
      For i = 1 To n - 1
        If i Mod 3 = 0 Or i Mod 5 = 0 Then Result = Result + i
      Next i
      CalculateResult = Result
    End Function
    ...is correct.
    Believe me or not Colin, but I was aware of the problem of "multiples of the factors" being sumed-up
    twice, when I posted my examples (the OP mentioned the problem clearly enough in his post).

    My comment was thought to point out the premature optimization problem - and in addition
    the wrong expressions the OP was using to calculate the "sum of the set-parts" directly
    (not taking into account, that any given SubSet-Sum should be calculated, based on:
    nFactor = n IntegerDiv Factor - and not on n itself).

    And as for n-1 in the loop - that would be wrong IMO, since you can hand over
    a 999-Parameter as 'n' into the Functions quite easily (if that's the upper bound
    of the range you're doing your calculations on).

    Olaf

  7. #7
    Fanatic Member
    Join Date
    Apr 2012
    Posts
    616

    Re: Using a mathematical identity to speed up certain calculations.

    Hey, I was just pointing it out since I could see straight away that your code did not address the stated problem.

    As for the n-1, it did occur to me that you could pass 999 to the function but since the problem was stated as solving for a value less than n (and also because you yourself called your own function with 1000, in line with the stipulated parameter for the problem) I subtracted the 1 inside the function. Difference of opinion on that one, I'm afraid
    If you don't know where you're going, any road will take you there...

    My VB6 love-children: Vee-Hive and Vee-Launcher

  8. #8
    Banned
    Join Date
    Jun 2013
    Posts
    1,002

    Re: Using a mathematical identity to speed up certain calculations.

    And as for the "balance" of the whole thing with regards to "performance vs. efforts"...

    The "naive loop" is still not bad performancewise, even when we consider the task being:
    "Do that for a somewhat larger list of PrimeNumbers, not only hardwired to the values of 3 and 5."

    Code:
    Option Explicit
    
    Private Sub Form_Click()
    Dim PrimeFactorList() As Long
       With New_c.ArrayList(vbLong, 2, 3, 5, 7, 11, 13, 17, 19)
         .CopyToArray PrimeFactorList
       End With
       
       Cls
       New_c.Timing True: Print CalculateResult(100000, PrimeFactorList), New_c.Timing
    End Sub
     
    Private Function CalculateResult(ByVal n As Long, PrimeFactorList() As Long) As Currency
    Dim i As Long, j As Long, UBPrimeFactorList As Long
      UBPrimeFactorList = UBound(PrimeFactorList)
      For i = 1 To n
        For j = 0 To UBPrimeFactorList
          If i Mod PrimeFactorList(j) = 0 Then CalculateResult = CalculateResult + i: Exit For
        Next j
      Next i
    End Function
    The above naive loop will respond (even for a larger n of 100000) within about 1.5msec (native compiled).

    Now try to come up with the appropriate (generic) function which does the more direct calculation,
    based on the n * (n+1) / 2 approach.

    Keep in mind, that now any possible combination of the "multiple-factors-problem" has to be
    considered (considered, not always applied), to subtract properly from the resulting Set-Sums.

    Olaf
    Last edited by Schmidt; Jun 28th, 2014 at 04:21 PM.

  9. #9
    Fanatic Member
    Join Date
    Apr 2012
    Posts
    616

    Re: Using a mathematical identity to speed up certain calculations.

    What does your CalculateResultOptimized function look like now, Olaf? Incorporating a solution to the 'set-theory intersection problem'?
    If you don't know where you're going, any road will take you there...

    My VB6 love-children: Vee-Hive and Vee-Launcher

  10. #10
    Banned
    Join Date
    Jun 2013
    Posts
    1,002

    Re: Using a mathematical identity to speed up certain calculations.

    Quote Originally Posted by ColinE66 View Post
    What does your CalculateResultOptimized function look like now, Olaf? Incorporating a solution to the 'set-theory intersection problem'?
    Well, that's what I'm currently struggling with (to come up with a generically usable approach, handing out similar results as the Looping-Function).

    I'm not a mathematician - and although the simplified case of two numbers has a trivial solution,
    which gives the right result.

    ...still hardwired at the moment:
    Code:
    Private Function CalculateResultOptimized(ByVal n As Long) As Currency
    Dim n3 As Long:  n3 = n \ 3
    Dim n5 As Long:  n5 = n \ 5
      CalculateResultOptimized = CalculateResultOptimized + 3 * n3 * (n3 + 1) \ 2
      CalculateResultOptimized = CalculateResultOptimized + 5 * n5 * (n5 + 1) \ 2
    
     'now the set-subtraction for the occurence of "combined factors"
     Dim n15 As Long: n15 = n \ 15
      CalculateResultOptimized = CalculateResultOptimized - 15 * n15 * (n15 + 1) \ 2
    End Function

    ...already the still simple case of only 3 factors (2, 3 and 5) is currently giving me headaches,
    since simply applying the logic from above will not work "straight out" as it seems:

    e.g. the following will not produce the correct result
    (and albeit having "slight clues" why it doesn't work - I've not yet found where I have
    the knot in my brain...)

    Code:
    Private Function CalculateResultOptimized(ByVal n As Long) As Currency
    Dim n2 As Long:  n2 = n \ 2
    Dim n3 As Long:  n3 = n \ 3
    Dim n5 As Long:  n5 = n \ 5
      CalculateResultOptimized = CalculateResultOptimized + 2 * n2 * (n2 + 1) \ 2
      CalculateResultOptimized = CalculateResultOptimized + 3 * n3 * (n3 + 1) \ 2
      CalculateResultOptimized = CalculateResultOptimized + 5 * n5 * (n5 + 1) \ 2
    
    'now the set-subtractions for all potential occurences of multiples of the factors
    Dim n6 As Long:  n6 = n \ 6    '2*3
    Dim n10 As Long: n10 = n \ 10  '2*5
    Dim n15 As Long: n15 = n \ 15  '3*5
    Dim n30 As Long: n30 = n \ 30  '2*3*5
      CalculateResultOptimized = CalculateResultOptimized - 6 * n6 * (n6 + 1) \ 2
      CalculateResultOptimized = CalculateResultOptimized - 10 * n10 * (n10 + 1) \ 2
      CalculateResultOptimized = CalculateResultOptimized - 15 * n15 * (n15 + 1) \ 2
      CalculateResultOptimized = CalculateResultOptimized - 30 * n30 * (n30 + 1) \ 2
    End Function
    Olaf
    Last edited by Schmidt; Jun 28th, 2014 at 05:02 PM.

  11. #11
    Fanatic Member
    Join Date
    Apr 2012
    Posts
    616

    Re: Using a mathematical identity to speed up certain calculations.

    Yeah, I'm playing around with this, too. Independently of your initial Optimised approach (I didn't really study it) I came up with something identical and am now trying to find a way of eliminating the multiple-factors problem from your post #8

    But I'm not a mathematician either ;(
    If you don't know where you're going, any road will take you there...

    My VB6 love-children: Vee-Hive and Vee-Launcher

  12. #12
    Fanatic Member
    Join Date
    Apr 2012
    Posts
    616

    Re: Using a mathematical identity to speed up certain calculations.

    Success! Works for original problem; just changed your arraylist to contain only 3 and 5 as below. Should work with more provided array is sorted in ascending order.

    Code:
    Option Explicit
    
    Private Sub Form_Click()
    Dim PrimeFactorList() As Long
       With New_c.ArrayList(vbLong, 3, 5)
         .CopyToArray PrimeFactorList
       End With
       Cls
       New_c.Timing True: Print CalculateResult(1000, PrimeFactorList), New_c.Timing
    End Sub
     
    Private Function CalculateResult(ByVal n As Long, PrimeFactorList() As Long) As Currency
    Dim i As Long, j As Long, k As Long, UBPrimeFactorList As Long, AlreadyCounted As Boolean
      UBPrimeFactorList = UBound(PrimeFactorList)
      For i = 1 To n - 1
        For j = 0 To UBPrimeFactorList
          AlreadyCounted = False
          For k = 0 To j - 1
             If i Mod PrimeFactorList(k) = 0 Then
                AlreadyCounted = True
                Exit For
             End If
          Next k
          If Not AlreadyCounted And i Mod PrimeFactorList(j) = 0 Then CalculateResult = CalculateResult + i: Exit For
        Next j
      Next i
    End Function
    EDIT: 41msec to process your lengthy arraylist for 100,000 numbers (not compiled)
    Last edited by ColinE66; Jun 28th, 2014 at 05:31 PM.
    If you don't know where you're going, any road will take you there...

    My VB6 love-children: Vee-Hive and Vee-Launcher

  13. #13
    Fanatic Member
    Join Date
    Apr 2012
    Posts
    616

    Re: Using a mathematical identity to speed up certain calculations.

    re post #10, I can see that you are going to have great difficulty avoiding subtracting the same numbers more than once when different multiples 'intersect'. I assume that is what you are struggling to resolve...
    If you don't know where you're going, any road will take you there...

    My VB6 love-children: Vee-Hive and Vee-Launcher

  14. #14
    Banned
    Join Date
    Jun 2013
    Posts
    1,002

    Re: Using a mathematical identity to speed up certain calculations.

    Quote Originally Posted by ColinE66 View Post
    re post #10, I can see that you are going to have great difficulty avoiding subtracting the same numbers more than once when different multiples 'intersect'. I assume that is what you are struggling to resolve...
    Yep - I'm trying to find a generic function which internally makes use of the O(1)-category n*(n+1)/2 approach,
    to avoid the O(n) looping over the whole range.

    Not sure, if your function in #12 really does speed-up what I posted in #8,
    since I already had an early exit there, as soon as a valid factor was found
    (not possible to count something twice).

    When you use my function in #8 with n-1 as the parameter, then it wil always produce
    the same results as yours for the same list of PrimeFactors.

    Olaf

  15. #15
    Fanatic Member
    Join Date
    Apr 2012
    Posts
    616

    Re: Using a mathematical identity to speed up certain calculations.

    Awww shite. I mis-read your post #8 and thought that you hadn't allowed for the multiple-multiples scenario. Didn't even test it, in fact. Just went to work trying to eliminate them! Laughing at my own stupidity now. And, yes, it will definitely be slower than your code as I (obviously) just added a big overhead!
    If you don't know where you're going, any road will take you there...

    My VB6 love-children: Vee-Hive and Vee-Launcher

  16. #16

    Thread Starter
    Hyperactive Member Maven's Avatar
    Join Date
    Feb 2003
    Location
    Greeneville, TN
    Posts
    321

    Re: Using a mathematical identity to speed up certain calculations.

    Quote Originally Posted by Schmidt View Post
    Well, among "dumb things a developer can do", premature optimization has a high ranking.
    You should read the post you link more carefully:
    Code:
    All optimization is premature unless
    
    A program is too slow (many people forget this part).
    
    You have a measurement (profile or similar) showing that the optimization could improve things.
    We can measure my optimization very easily. Your algorithm (which is incorrect by the way), is an O(N) algorithm; as a result, the time it takes your program to complete is linear with the input size. My algorithm is O(1) and it will complete in constant time regardless of input size.

    Brute force
    Brute force is the by definition the worse possible algorithm.


    E.g. for your given problem at hand, this would be the easily understood snippet below:

    Sure, if you're a bright candle and don't have to think about the more optimized approach
    very hard, then there's nothing wrong in writing it up "as you go" of course.
    One should strive for efficient algorithms. You're abusing Kunth's quote. He was talking about optimizing without understanding the gains.

    Though the *most dumb* thing a developer can do, is to use (or promote) an "optimized approach",
    when it spits out the wrong results (as your recommended solution surely does).
    I made a small indexing error. I typed the post straight out without testing. At any rate, I've corrected it.


    The correct answer is 233168.
    Education is an admirable thing, but it is well to remember from time to time that nothing that is worth knowing can be taught. - Oscar Wilde

  17. #17

    Thread Starter
    Hyperactive Member Maven's Avatar
    Join Date
    Feb 2003
    Location
    Greeneville, TN
    Posts
    321

    Re: Using a mathematical identity to speed up certain calculations.

    Here is the code and how you would implement the mathematics from the OP.

    Code:
    Public Class Form1
    
        Function CalculateResultBest(ByVal num As Long) As Long
            Dim n As Integer
            Dim m As Integer
            Dim o As Integer
    
            n = Math.Floor(num / 3)
            m = Math.Floor(num / 5)
            o = Math.Floor(num / 15)
    
            CalculateResultBest = ((3 * n * (n + 1)) / 2) + ((5 * m * (m + 1)) / 2) - ((15 * o * (o + 1)) / 2)
        End Function
    
        Private Sub Form1_Load(sender As Object, e As EventArgs) Handles MyBase.Load
            MsgBox(CalculateResultBest(999))
        End Sub
    End Class
    If you would rather the function behave as a less than num, simply add num = num-1 at the top of the function.

    At any rate, you will not obtain a faster result. And it's rather easy mathematics. My apologies for not testing it before posting (too much confidence causes mistake now and again). I made a simple indexing error without even thinking about it.

    One could generalize this algorithm easily to work for more cases. Just remember to subtract off intersections.
    Last edited by Maven; Jun 28th, 2014 at 06:38 PM.
    Education is an admirable thing, but it is well to remember from time to time that nothing that is worth knowing can be taught. - Oscar Wilde

  18. #18
    Banned
    Join Date
    Jun 2013
    Posts
    1,002

    Re: Using a mathematical identity to speed up certain calculations.

    Quote Originally Posted by Maven View Post
    You should read the post you link more carefully:
    Code:
    All optimization is premature unless
    
    A program is too slow (many people forget this part).
    
    You have a measurement (profile or similar) showing that the optimization could improve things.
    I've read it - and am aware of what Hoare/Knuth were really saying, before posting the link.

    I'm just not sure about a practical use-case for the Gauss-optimization in most
    of a developers daily (often non-scientific) work - so, if you have such a case where
    this O(1)-level Gauss-optimization plays a larger role, I'd like to hear about it.

    Currently I'm trying to come up with a generic function for that, "just for fun".

    Quote Originally Posted by Maven View Post
    We can measure my optimization very easily. Your algorithm (which is incorrect by the way),
    Wich is the next wrong information I had to read from you... (since there's nothing incorrect
    with the function I posted in #8 - and neither was there anything incorrect with my first
    example, which was made to point out what you did wrong in applying Gauss to the problem
    at hand).

    Quote Originally Posted by Maven View Post
    is an O(N) algorithm; as a result, the time it takes your program to complete is linear with the input size. My algorithm is O(1) and it will complete in constant time regardless of input size.
    You didn't post any algorithm - you posted misleading information (now corrected, although not yet in #3).
    But in its original version nobody would have been able to make any use of the Gauss-formula you
    wrongly applied to the given problem.

    Quote Originally Posted by Maven View Post
    Brute force is the by definition the worse possible algorithm.
    That's like saying: "Bubble-Sort doesn't have its place, because there's much faster sorting-algos".

    Quote Originally Posted by Maven View Post
    One should strive for efficient algorithms.
    Sure - when they are really needed in your application - or when you're implementing a common base-library
    which you don't know what future usage-scenarios might be thrown at it, *and* when your personal capabilities
    are sufficient enough, to fullfill the promise of a given "theory" with an *error-free* implementation -
    then your statement makes sense.

    Quote Originally Posted by Maven View Post
    I made a small indexing error. I typed the post straight out without testing.
    Sorry, but it was not "a small indexing error" - you forgot a badly needed, separate integer-divide
    on each of the Sub-Sets, before applying the differing 'ns' to the Gauss-Function to
    determine a Sub-Sets Sum.

    Olaf

  19. #19
    Fanatic Member
    Join Date
    Apr 2012
    Posts
    616

    Re: Using a mathematical identity to speed up certain calculations.

    Quote Originally Posted by Schmidt View Post
    Sorry, but it was not "a small indexing error" - you forgot a badly needed, separate integer-divide
    on each of the Sub-Sets, before applying the differing 'ns' to the Gauss-Function to
    determine a Sub-Sets Sum.

    Olaf
    Concur. Original post was a "large arithmetic error".
    If you don't know where you're going, any road will take you there...

    My VB6 love-children: Vee-Hive and Vee-Launcher

  20. #20
    Banned
    Join Date
    Jun 2013
    Posts
    1,002

    Re: Using a mathematical identity to speed up certain calculations.

    Quote Originally Posted by Maven View Post
    Here is the code and how you would implement the mathematics from the OP.

    Code:
    Public Class Form1
    
        Function CalculateResultBest(ByVal num As Long) As Long
            Dim n As Integer
            Dim m As Integer
            Dim o As Integer
    
            n = Math.Floor(num / 3)
            m = Math.Floor(num / 5)
            o = Math.Floor(num / 15)
    
            CalculateResultBest = ((3 * n * (n + 1)) / 2) + ((5 * m * (m + 1)) / 2) - ((15 * o * (o + 1)) / 2)
        End Function
    
        Private Sub Form1_Load(sender As Object, e As EventArgs) Handles MyBase.Load
            MsgBox(CalculateResultBest(999))
        End Sub
    End Class
    That was already posted in #10 (the first function).

    Quote Originally Posted by Maven View Post
    One could generalize this algorithm easily to work for more cases. Just remember to subtract off intersections.
    Well - "easily" is a bit optimistic, as we already found problems with simply
    "subtracting the intersections".

    Please show us, how easily you can come up with an implementation for a Factorlist of more
    than only two factors - say, for the list of factors: 2, 3, 5, 7, 11, 13, 17, 19
    (as demonstrated with the still pretty simple looping-function in #8).

    Olaf

  21. #21

    Thread Starter
    Hyperactive Member Maven's Avatar
    Join Date
    Feb 2003
    Location
    Greeneville, TN
    Posts
    321

    Re: Using a mathematical identity to speed up certain calculations.

    Quote Originally Posted by Schmidt View Post
    I've read it - and am aware of what Hoare/Knuth were really saying, before posting the link.

    I'm just not sure about a practical use-case for the Gauss-optimization in most
    of a developers daily (often non-scientific) work - so, if you have such a case where
    this O(1)-level Gauss-optimization plays a larger role, I'd like to hear about it.
    The topic already explained when and where this kind of method is useful. It works when you need to sum up a set of numbers. In addition, it teaches algorithms.


    You didn't post any algorithm - you posted misleading information (now corrected, although not yet in #3).
    But in its original version nobody would have been able to make any use of the Gauss-formula you
    wrongly applied to the given problem.
    The Guass identity is the correct method to use to solve this type of problem.

    That's like saying: "Bubble-Sort doesn't have its place, because there's much faster sorting-algos".
    The bubble sort is mostly used as an introduction to algorithms. Mostly, sorting algorithms are selected based upon various criterias. First the implementation size has to be compared with the amount of work that has to be done. In addition, sorting algorithms are data dependent. So knowing what kind of data your working with can influence your decision on sorting algorithms. And finally, algorithmic complexity is a big deal. The bubble sort is a O(N^2) algorithm. The heap sort is a o(Nlog(N)) algorithm. There is also the selection and insertion sort that both run in O(N^2). The insertion sort is the most useful of the O(N^2) class, and it's sometimes used in quick sort when the numbers get low (like n<10).

    In a basic nutshell, one should select the best algorithm for the job. In the case of summing up sets of numbers, I gave you the gold standard of O(1).

    Sure - when they are really needed in your application - or when you're implementing a common base-library
    which you don't know what future usage-scenarios might be thrown at it, *and* when your personal capabilities
    are sufficient enough, to fullfill the promise of a given "theory" with an *error-free* implementation -
    then your statement makes sense.
    The identity only requires simple algebra and it's quite famous.

    Sorry, but it was not "a small indexing error" - you forgot a badly needed, separate integer-divide
    on each of the Sub-Sets, before applying the differing 'ns' to the Gauss-Function to
    determine a Sub-Sets Sum.
    Olaf
    It was a simple indexing error. In my original post, the index was setup to run over 1000 iterations for each component of the formula. I simply forgot to adjust them. The where clauses I added was simply to control indexes for the summation.

    While I'm grateful you pointed out that I had some error, you going a little too far in your arguments. What are you trying to demonstrate other than trolling?
    Last edited by Maven; Jun 28th, 2014 at 11:43 PM.
    Education is an admirable thing, but it is well to remember from time to time that nothing that is worth knowing can be taught. - Oscar Wilde

  22. #22

    Thread Starter
    Hyperactive Member Maven's Avatar
    Join Date
    Feb 2003
    Location
    Greeneville, TN
    Posts
    321

    Re: Using a mathematical identity to speed up certain calculations.

    Quote Originally Posted by Schmidt View Post
    That was already posted in #10 (the first function).



    Well - "easily" is a bit optimistic, as we already found problems with simply
    "subtracting the intersections".

    Please show us, how easily you can come up with an implementation for a Factorlist of more
    than only two factors - say, for the list of factors: 2, 3, 5, 7, 11, 13, 17, 19
    (as demonstrated with the still pretty simple looping-function in #8).

    Olaf
    You have to take a power set of it. Use the single elements in the formula, the rest of the elements are used for intersections.

    Example: The power set of <1, 2, 3>

    <<1>, <2>, <3>, <1,2>, <1,3>, <2,3>, <1,2,3>>


    The first three are your numbers for the formula. The rest of them are your intersections.

    Just do a search for recursive power set algorithm on google.

    Here is one:
    http://stackoverflow.com/questions/1...hout-any-loops
    Last edited by Maven; Jun 28th, 2014 at 10:30 PM.
    Education is an admirable thing, but it is well to remember from time to time that nothing that is worth knowing can be taught. - Oscar Wilde

  23. #23
    Fanatic Member
    Join Date
    Apr 2012
    Posts
    616

    Re: Using a mathematical identity to speed up certain calculations.

    I think it's pretty obvious that one needs the combinations of all elements in order to exclude the intersections. And this, of course, is easily coded when the number of elements and their values are known in advance. Personally, I'd really like to see something optimal that doesn't know those parameters up-front.
    If you don't know where you're going, any road will take you there...

    My VB6 love-children: Vee-Hive and Vee-Launcher

  24. #24
    Banned
    Join Date
    Jun 2013
    Posts
    1,002

    Re: Using a mathematical identity to speed up certain calculations.

    Quote Originally Posted by Maven View Post
    In a basic nutshell, one should select the best algorithm for the job.
    In the case of summing up sets of numbers, I gave you the gold standard of O(1).
    See, that's the problem I have with your postings.

    You didn't gave us *anything* so far in this thread.

    What you gave us, was the (quite wellknown) story about little Gauss -
    and *his* famous (though still easy to grasp for non-mathematicians) Sum-Formula.

    What you then gave us on top of that was a problem you said is
    "easily solved" by applying the Gauss-Sum-Formula - but the way you
    applied it to the problem was entirely wrong, misleading the reader.

    Am I aggressive in pointing that out (even going to some length, to
    correct your wrong adaption)?

    Then on top of that you ridiculed everybody who would engage in applying
    "lesser algorithms from now on" as dumb:
    "...The dumb method is to use loops and brute force everything..."

    Well, such a "dumb method" is still standing there in my post #8 -
    solving the problem in a generic fashion for arbitrary lists of numbers...
    Reliably - and up to an n of 1Mio still fast enough (when native compiled).

    So, what I would like to see from you here, is "making good on your own
    words" - show us, how the problem can be approached (applying my corrected
    version of the Gauss-Sum-Formula-adaption to it) - handing out the same correct
    results as the "dumb function" in #8, for the set of numbers:
    2, 3, 5, 7, 11, 13, 17, 19
    (the correct result for these factors with n=100000 is: 4144972731)

    Please don't wiggle around - make good on your own words - and demonstrate
    how easy it is, to write an O(1)-algo for this case.

    I mean;
    - you say "anybody who's resorting to a loop, is doing something dumb"
    - you also say "it's easy"

    And yes, coming up with the power-sets in a generic fasion *is* the easy part -
    what now only remains, is to apply "proper summation or subtraction" of
    these Powersets.

    Did you read post #10 at all (studying the second function and what was said there)?

    Quote Originally Posted by Maven View Post
    The identity only requires simple algebra and it's quite famous.
    Yes, the Gauss-Sum-Formula *is* trivial, Maven.
    It's applying it to a concrete problem, a concrete implementation which is non-trivial
    (and you stumbled over that problem already).

    My plea to you is simply, to show the un-enlightened here, how the master himself
    applies it correctly in a *generic* fashion for an *arbitrary* list of factors.

    I already broke a lance for "keeping things and algorithms simple, when there's no
    need to come up with a faster implementation".

    You said: "...one should strive for efficient algorithms"

    Sure, but now - can you deliver please - or will we see further "evasive actions"?

    Olaf

  25. #25
    Banned
    Join Date
    Jun 2013
    Posts
    1,002

    Re: Using a mathematical identity to speed up certain calculations.

    FWIW, I have it working here now (sha1-checksum for my Zip-File is: b08cec4ca48ba84043b38040e94b14c4896cdff0)

    (...just waiting a bit for Maven to finish his implementation too...)

    And sure, for smaller lists (as for example {3 , 5}) the Gauss-approach is working much
    faster than then naive looping one - especially for larger n of 100000 or 1Mio.

    Though the O(1) behaviour (with regards to n) is only *one* thing - the other one
    is the amount of factor-combinations, when we talk about larger lists of Primes, as e.g.:
    {2, 3, 5, 7, 11, 13, 17, 19} or
    {2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43}

    The Powerset grows quite fast (exponentially) with the Factors-count, so there's
    a large trade-off for the Gauss-Implementation (whereas the loop-approach shrugs
    this larger list-count off, scaling roughly linearly with the amount of factors to check.

    So there's a "balance-point" (depending on n and the factors-count), where either
    the one - or the other of the algorithms can shine.

    Below's two ScreenShots with the results...

    For n = 100,000:
    http://vbRichClient.com/Downloads/GaussSumFormula.png

    And for n = 1Mio
    http://vbRichClient.com/Downloads/Ga...ormula1Mio.png

    Olaf
    Last edited by Schmidt; Jun 29th, 2014 at 11:25 AM.

  26. #26
    Fanatic Member
    Join Date
    Apr 2012
    Posts
    616

    Re: Using a mathematical identity to speed up certain calculations.

    Can you PM me the link, please, Olaf? Keen to take a look...
    If you don't know where you're going, any road will take you there...

    My VB6 love-children: Vee-Hive and Vee-Launcher

  27. #27
    Banned
    Join Date
    Jun 2013
    Posts
    1,002

    Re: Using a mathematical identity to speed up certain calculations.

    Quote Originally Posted by ColinE66 View Post
    Can you PM me the link, please, Olaf? Keen to take a look...
    Sure - take a good look at the IIF-function and what it finally accomplishes,
    since that was the only thing I was struggling with the whole time, until the light
    finally came on... ;-)

    Olaf

  28. #28

    Thread Starter
    Hyperactive Member Maven's Avatar
    Join Date
    Feb 2003
    Location
    Greeneville, TN
    Posts
    321

    Re: Using a mathematical identity to speed up certain calculations.

    Quote Originally Posted by ColinE66 View Post
    I think it's pretty obvious that one needs the combinations of all elements in order to exclude the intersections. And this, of course, is easily coded when the number of elements and their values are known in advance. Personally, I'd really like to see something optimal that doesn't know those parameters up-front.
    A power set algorithm is what is needed. There have been many different power set algorithms made over the years, and one can find one of his or her liking with a quick google search.
    Education is an admirable thing, but it is well to remember from time to time that nothing that is worth knowing can be taught. - Oscar Wilde

  29. #29

    Thread Starter
    Hyperactive Member Maven's Avatar
    Join Date
    Feb 2003
    Location
    Greeneville, TN
    Posts
    321

    Re: Using a mathematical identity to speed up certain calculations.

    Quote Originally Posted by Schmidt View Post
    See, that's the problem I have with your postings.

    You didn't gave us *anything* so far in this thread.

    What you gave us, was the (quite wellknown) story about little Gauss -
    and *his* famous (though still easy to grasp for non-mathematicians) Sum-Formula.

    What you then gave us on top of that was a problem you said is
    "easily solved" by applying the Gauss-Sum-Formula - but the way you
    applied it to the problem was entirely wrong, misleading the reader.
    And it is easily solved by the Gauss identity. And it's rather straight forwards. After all, you implemented the method in one of your posts.


    Am I aggressive in pointing that out (even going to some length, to
    correct your wrong adaption)?
    The only thing wrong was the index over the summations were not setup. There is a difference in pointing out that an equation is yielding the wrong answer and trolling.

    Then on top of that you ridiculed everybody who would engage in applying
    "lesser algorithms from now on" as dumb:
    "...The dumb method is to use loops and brute force everything..."
    Brute force is the dumb method. One can't do worse than a brute force algorithm. Hence its name.



    Please don't wiggle around - make good on your own words - and demonstrate
    how easy it is, to write an O(1)-algo for this case.

    I mean;
    - you say "anybody who's resorting to a loop, is doing something dumb"
    - you also say "it's easy"

    And yes, coming up with the power-sets in a generic fasion *is* the easy part -
    what now only remains, is to apply "proper summation or subtraction" of
    these Powersets.
    A power set algorithm is a different algorithm from the calculation algorithm. You should use it to obtain combinations.

    I said a brute force algorithm is dumb, and it is. Your now making strawmen.


    Did you read post #10 at all (studying the second function and what was said there)?
    I just did one test with your algorithm using N=10 and it produced the correct result of 47.

    One thing you will have problems with is common multiples.

    2*5 = 10 and 2*5*3 = 30

    So every so many hops your double counting.
    Last edited by Maven; Jun 30th, 2014 at 12:14 AM.
    Education is an admirable thing, but it is well to remember from time to time that nothing that is worth knowing can be taught. - Oscar Wilde

  30. #30
    Banned
    Join Date
    Jun 2013
    Posts
    1,002

    Re: Using a mathematical identity to speed up certain calculations.

    Quote Originally Posted by Maven View Post
    The function you created does not give the correct answer. You didn't subtract off intersections. Should I troll too now?
    That's again wrong information - and as for trolling - IMO you're doing it
    the whole time in our "conversation" (if it may be called that).

    Quote Originally Posted by Maven View Post
    In addition, I already told you the algorithm you need to use to generalize the problem.
    IE: A power set algorithm.
    Nope - I can only repeat myself here - you so far didn't tell anybody anything useful in this
    thread, all the stuff you're now riding around with, was already there - posted by others.

    E.g. the only code-snippet you came up so far (in #17), is a copy of #10 (first function).

    And pointing out a "set of powers" to be able to tackle the problem was *obvious* Maven.

    No need to mention it in your post #22 for the first time, after the representation of a
    complete Powerset was already posted (in #10 again, second function - showing the
    powerset for 2,3,5).

    Quote Originally Posted by Maven View Post
    A power set algorithm is a different algorithm from the calculation algorithm.
    You should use it to obtain combinations.
    Yeah - tell us news...


    Quote Originally Posted by Maven View Post
    I said a brute force algorithm is dumb, and it is. Your now making strawmen.
    Maven, do you read the postings of others at all?
    I mean, I've posted already the results, where the "dumb bruteforce loop"
    as you call it, outperforms the Gauss-Sum-approach by factor 15 or so.

    Quote Originally Posted by Maven View Post
    pseudocode:
    If combinations is greater than 1
    subtact
    else
    add
    endif
    And that's clearly wrong, since it will not work this way for a factor-count > 2
    (also already mentioned in #10).

    Quote Originally Posted by Maven View Post
    I just did one test with your algorithm using N=10 and it produced the correct result of 47.
    Yeah - still wiggling around the issue, I see - for N = 10 the result is correct of course -
    but for N >= 30, function 2 in post #10 will fail (since that's where the product of the largest
    "set of powers" (2*3*5 = 30) will start coming into play).

    Quote Originally Posted by Maven View Post
    If you were using your algorithm in post #8 to test it for accuracy, your algorithm in post #8 is incorrect.
    And another, clearly wrong statement from you.

    I mean, you *do* want to be taken seriously, won't you?

    Why then posting all that nonsense?
    Why not simply delivering an implementation of your own, to show how it's done correctly?

    I have mine ready - and can clearly tell you, that your advice:
    "...If combinations is greater than 1 subtract else add..." will not work for
    more than two factors.

    Olaf
    Last edited by Schmidt; Jun 29th, 2014 at 06:08 PM.

  31. #31
    Fanatic Member
    Join Date
    Apr 2012
    Posts
    616

    Re: Using a mathematical identity to speed up certain calculations.

    Quote Originally Posted by Maven View Post
    One thing you will have problems with is common multiples.

    2*5 = 10 and 2*5*3 = 30

    So every so many hops your double counting.
    Pretty sure you added that to your post after Olaf's reply in #30 (i.e. in your edit of 30/6/14 @ 12:14am) - back-dating your posts is hardly playing fair!

    I'd quite like to see your solution to the problem, too, out of interest. Saying it is simple and pointing the reader to Google in a thread you began in order to educate people is a little strange, IMO. Eliminating the intersections of common multiples is most certainly not 'straight-forward'.
    If you don't know where you're going, any road will take you there...

    My VB6 love-children: Vee-Hive and Vee-Launcher

  32. #32
    Addicted Member
    Join Date
    May 2014
    Location
    Kanallaki Greece
    Posts
    150

    Re: Using a mathematical identity to speed up certain calculations.

    Back in school...i found an easy way to add all even numbers in a range of i,j

    The problem was supposed to have a solution with a loop of j-i times, and inside using mod to find if number is even..and then if yes add to an accumulator.

    The formula is simple
    if i is even then i=i-1
    if j is even then j=j+1
    so the sum is (j*j-i*i)/4
    and the number of even numbers is (j-i)/2

  33. #33

    Thread Starter
    Hyperactive Member Maven's Avatar
    Join Date
    Feb 2003
    Location
    Greeneville, TN
    Posts
    321

    Re: Using a mathematical identity to speed up certain calculations.

    Quote Originally Posted by Schmidt View Post
    Yeah - still wiggling around the issue, I see - for N = 10 the result is correct of course -
    but for N >= 30, function 2 in post #10 will fail (since that's where the product of the largest
    "set of powers" (2*3*5 = 30) will start coming into play).
    Olaf
    Here... I post code for you


    You need a power set function... I simply implemented one of the java functions from here: http://rosettacode.org/wiki/Power_Set

    Code:
        Public Function powerSet(ByRef elementList As List(Of Long)) As List(Of List(Of Long))
            Dim ps As List(Of List(Of Long)) = New List(Of List(Of Long))
            Dim i As Integer
            Dim j As Integer
    
            ps.Add(New List(Of Long)()) 'empty set
    
            'now do powerset
            For i = 0 To elementList.Count - 1
                'Start a new list
                Dim newPS As List(Of List(Of Long)) = New List(Of List(Of Long))
    
                For j = 0 To ps.Count() - 1
                    'Add current subset
                    newPS.Add(New List(Of Long)(ps(j)))
    
    
                    Dim final As List(Of Long) = New List(Of Long)(ps(j))
                    final.Add(elementList(i))
                    newPS.Add(final)
                Next
                ps = newPS
            Next
            powerSet = ps
        End Function
    To do the calculation using the power set, you simply need to loop over the sets and subtract off the appropriate number of intersections. It's also called: http://en.wikipedia.org/wiki/Inclusi...sion_principle

    Code:
        Public Function mulSet(myset As List(Of Long))
            Dim i As Integer
            Dim result As Long = 1
            For i = 0 To myset.Count() - 1
                result = result * myset(i)
            Next
    
            mulSet = result
        End Function
    
    
        Public Function sumMultiples(powerSet As List(Of List(Of Long)), N As Long) As Long
            Dim x As Long
            Dim aIndex As Long
            Dim i As Integer
            Dim result As Long
    
    
            For i = 0 To powerSet.Count() - 1
                Dim currentList As List(Of Long) = powerSet(i)
                If currentList.Count() > 0 Then
    
    
                    If currentList.Count() Mod 2 = 0 Then
                        x = mulSet(currentList)
                        aIndex = N \ x
    
                        result = result - x * aIndex * (aIndex + 1) / 2
                    Else
                        x = mulSet(currentList)
                        aIndex = N \ x
    
                        result = result + x * aIndex * (aIndex + 1) / 2
                    End If
                End If
            Next
    
            sumMultiples = result
        End Function
    To use the above two funcitons, you simply need to call them like so
    Code:
        Public Function buildArray() As List(Of Long)
            Dim t1 As List(Of Long) = New List(Of Long)
            t1.Add(2)
            t1.Add(3)
            t1.Add(5)
            t1.Add(7)
            t1.Add(11)
            t1.Add(13)
            t1.Add(17)
            t1.Add(19)
            buildArray = t1
        End Function
    
        Private Sub Form1_Load(sender As Object, e As EventArgs) Handles MyBase.Load
    
            Dim N As Long = 1000000000
    
            Dim elements As List(Of Long) = buildArray()
            Dim sets = powerSet(elements)
            Dim reducedSet As Long = sumMultiples(sets, N)
    
            MsgBox(reducedSet)
    
        End Sub
    The complexity of the algorithm is decided by the power set for the inputs in the list. The calculation is still O(1). In other words, the size of N isn't going to change the speed of the code.
    Last edited by Maven; Jun 30th, 2014 at 03:22 PM.
    Education is an admirable thing, but it is well to remember from time to time that nothing that is worth knowing can be taught. - Oscar Wilde

  34. #34

    Thread Starter
    Hyperactive Member Maven's Avatar
    Join Date
    Feb 2003
    Location
    Greeneville, TN
    Posts
    321

    Re: Using a mathematical identity to speed up certain calculations.

    Quote Originally Posted by ColinE66 View Post
    Pretty sure you added that to your post after Olaf's reply in #30 (i.e. in your edit of 30/6/14 @ 12:14am) - back-dating your posts is hardly playing fair!

    I'd quite like to see your solution to the problem, too, out of interest. Saying it is simple and pointing the reader to Google in a thread you began in order to educate people is a little strange, IMO. Eliminating the intersections of common multiples is most certainly not 'straight-forward'.
    I just seen he had an exit for in one of his algorithms. It's been a long time since I've looked at vb code, and I didn't notice it when I first posted.

    I posted the code above. And yes its straight forward. The power set algorithm is the only hard part to this algorithm.
    Last edited by Maven; Jun 30th, 2014 at 03:29 PM.
    Education is an admirable thing, but it is well to remember from time to time that nothing that is worth knowing can be taught. - Oscar Wilde

  35. #35
    Only Slightly Obsessive jemidiah's Avatar
    Join Date
    Apr 2002
    Posts
    2,371

    Re: Using a mathematical identity to speed up certain calculations.

    Since this is "code-it-better", I thought I'd mention the Python implementation of some of these algorithms, since many of these operations are particularly straightforward in standard Python. The "sum_quotients5" below in particular would be much longer in standard C# or VB.NET.

    The original problem's solutions, first naive, then with "Gauss's formula" (I call it "the triangular number formula"; it's very ancient):

    Code:
    def sum_quotients1(a, b, n_max):
        return sum(n for n in range(n_max) if n%a == 0 or n%b == 0)
    
    def sum_quotients2(a, b, n_max):
        n_max = n_max - 1
        def b2(n):
            return n*(n+1)/2
        return a * b2(n_max//a) + b * b2(n_max//b) - a*b * b2(n_max//(a*b))
    Of course, the second one only works when a, b are relatively prime positive integers. When they are general integers, the "set-theoretic problem" is that multiples of their lcm are included; recall lcm(a, b) * gcd(a, b) = a*b. The following accounts for that possibility (and changes b2 into a "lambda function"):

    Code:
    from fractions import gcd
    
    def lcm2(a, b):
        return a*b / gcd(a, b)
    
    def sum_quotients3(a, b, n_max):
        n_max = n_max - 1
        b2 = lambda n: n*(n+1)/2
        return a * b2(n_max//a) + b * b2(n_max//b) - lcm2(a, b) * b2(n_max//lcm2(a, b))
    The more general problem allowing a list of (let's say positive) integers in place of just a and b can be solved naively like this:

    Code:
    def sum_quotients4(L, n_max):
        return sum(n for n in range(n_max) if any(n%a==0 for a in L))
    The "inclusion-exclusion" approach can be done as follows. We need an n-ary lcm function first (modified from here for convenience):

    Code:
    def lcm(L):
        return reduce(lcm2, L)
    Next we need to iterate over the power set of L, alternatively adding and subtracting the lcm of subsets as we go:

    Code:
    from itertools import combinations
    
    def sum_quotients5(L, n_max):
        n_max = n_max - 1
        b2 = lambda n: n*(n+1)/2
        
        s = 0
        for i in range(len(L)):
            for subset in combinations(L, i+1):
                lcm_subset = lcm(subset)
                s += (-1)**i * lcm_subset * b2(n_max//lcm_subset)
        return s
    (The "s=0" through "return s" could have been written as

    Code:
        return sum((-1)**i * lcm(subset) * b2(n_max//lcm(subset)) for i in range(len(L)) for subset in combinations(L, i+1))
    )

    As for correctness, these all return 233168 on the input (3, 5, 1000) (or ([3, 5], 1000) as appropriate). Versions 1, 3, 4, and 5 also all return 249500 on (2, 2, 1000) or ([2, 2], 1000) and (2, 4, 1000) or ([2, 4], 1000).) Moreover, versions 4 and 5 both give 327958 on ([3, 4, 5, 6, 7], 1000), which strongly suggests this is all correct: version 4 is extremely simple and is nigh-guaranteed to be correct for that reason.

    Of course, since there are 2^N subsets of a list of length N (to construct each subset, for each item in the list, just choose whether to include it in the subset), so the inclusion-exclusion approach is terribly inefficient in that direction. If the elements of L are relatively large compared to n_max, then it becomes more efficient.

    If we're using ipython's interpreter to run these commands, the "magic" commands %time, %timeit, and %prun are wonderfully useful. %time tells you how long a command takes to run; %timeit roughly averages %time over many runs, with the number of runs chosen adaptively; and %prun profiles your code, giving you a function-by-function breakdown of how much time is spent where and how often things are getting called.

    (Finally, I'd like to say I have no interest whatsoever in getting involved some sort of personal debate. I'm only here since dee-u PM'd me asking for my input on this thread, and here it is.)
    The time you enjoy wasting is not wasted time.
    Bertrand Russell

    <- Remember to rate posts you find helpful.

  36. #36
    Banned
    Join Date
    Jun 2013
    Posts
    1,002

    Re: Using a mathematical identity to speed up certain calculations.

    Well, below is the link to the Zip I've posted the SHA1-checksum for in #25:
    http://vbRichClient.com/Downloads/Ga...ulaArchive.zip

    And below's the code to paste into a VB6-Form - in the meantime also containing an LCM-
    corrected version, to allow for non-primes (in the List you pass to the Gauss-based algo) too.
    (thanks to jemidiah for the hint and his nice contribution to the thread)

    The VB-code may BTW serve as an example, that one can achieve efficient routines also without lambdas.

    Into a VB6-Form (compile native with all extended options, to reach the speeds as shown in the screenshot below)
    http://vbRichClient.com/Downloads/GaussSum.png

    Code:
    Option Explicit
    
    Private Declare Function timeGetTime Lib "winmm.dll" () As Long
    Private Declare Function timeBeginPeriod Lib "winmm.dll" (ByVal uPeriod As Long) As Long
    
    Private Sub Form_Load()
      timeBeginPeriod 1
      AutoRedraw = True
      Caption = "Click the Form..."
    End Sub
    
    Function LongArr(ParamArray P()) As Long()
    Dim i As Long, L() As Long
      ReDim L(UBound(P))
      For i = 0 To UBound(P): L(i) = P(i): Next
      LongArr = L
    End Function
     
    Private Sub Form_Click()
    Dim n As Long, PrimeFactorList() As Long, T As Long
       Cls
       
       n = 100000
       Print vbLf; "n ="; n; vbLf
       
       Print "Result for Factors(3, 4, 5, 6, 7) ... using n = 999"
       PrimeFactorList = LongArr(3, 4, 5, 6, 7)
       T = timeGetTime: Print "Naive", SumNaive(999, PrimeFactorList), Format(timeGetTime - T, "0msec")
       T = timeGetTime: Print "Gauss", SumGauss(999, PrimeFactorList), Format(timeGetTime - T, "0msec"), "<- wrong result"
       T = timeGetTime: Print "GaussLCM", SumGaussLCM(999, PrimeFactorList), Format(timeGetTime - T, "0msec")
       Print
       
       Print "Result for Primes (2, 3, 5, 7, 11, 13, 17, 19)"
       PrimeFactorList = LongArr(2, 3, 5, 7, 11, 13, 17, 19)
       T = timeGetTime: Print "Naive", SumNaive(n, PrimeFactorList), Format(timeGetTime - T, "0msec")
       T = timeGetTime: Print "Gauss", SumGauss(n, PrimeFactorList), Format(timeGetTime - T, "0msec")
       T = timeGetTime: Print "GaussLCM", SumGaussLCM(n, PrimeFactorList), Format(timeGetTime - T, "0msec")
       Print
       
       Print "Result for Primes (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43)"
       PrimeFactorList = LongArr(2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43)
       T = timeGetTime: Print "Naive", SumNaive(n, PrimeFactorList), Format(timeGetTime - T, "0msec")
       T = timeGetTime: Print "Gauss", SumGauss(n, PrimeFactorList), Format(timeGetTime - T, "0msec")
       T = timeGetTime: Print "GaussLCM", SumGaussLCM(n, PrimeFactorList), Format(timeGetTime - T, "0msec")
    End Sub
     
    Function SumNaive(ByVal n As Long, L() As Long) As Currency
    Dim i As Long, j As Long
      For i = 1 To n
        For j = 0 To UBound(L)
          If i Mod L(j) = 0 Then SumNaive = SumNaive + i: Exit For
      Next j, i
    End Function
    
    Function SumGauss(ByVal n As Long, L() As Long) As Currency
    Dim i As Long, j As Long, Pow As Double, ni As Long
      For i = 1 To 2 ^ (UBound(L) + 1) - 1
        Pow = -1
        For j = 0 To UBound(L)
          If i And 2 ^ j Then Pow = -Pow * L(j)
        Next j
        ni = Int(n / Abs(Pow))
        SumGauss = SumGauss + Pow * ni * (ni + 1) / 2
      Next i
    End Function
    
    Function SumGaussLCM(ByVal n As Long, L() As Long) As Currency
    Dim i As Long, j As Long, Pow As Double, ni As Long
      For i = 1 To 2 ^ (UBound(L) + 1) - 1
        Pow = -1
        For j = 0 To UBound(L)
          If i And 2 ^ j Then Pow = -Int(Pow * L(j) / GCD(Abs(Pow), L(j)))
        Next j
        ni = Int(n / Abs(Pow))
        SumGaussLCM = SumGaussLCM + Pow * ni * (ni + 1) / 2
      Next i
    End Function
    
    'used in SumGaussLCM only, to calculate the least-common-multiple there
    Function GCD(ByVal a As Double, ByVal b As Double) As Double
      If b > 0 Then GCD = GCD(b, a - b * Int(a / b)) Else GCD = a
    End Function
    Olaf
    Last edited by Schmidt; Jul 1st, 2014 at 12:13 AM.

  37. #37
    Only Slightly Obsessive jemidiah's Avatar
    Join Date
    Apr 2002
    Posts
    2,371

    Re: Using a mathematical identity to speed up certain calculations.

    Here's a little speed comparison data using the code in my previous post on my personal machine:

    Code:
    : %timeit sum_quotients5([174, 192, 934, 554, 1234, 4321], 100000)
    10000 loops, best of 3: 199 µs per loop
    
    : %timeit sum_quotients4([174, 192, 934, 554, 1234, 4321], 100000)
    10 loops, best of 3: 130 ms per loop
    So, the naive version is very roughly 1000 times slower than the other version in this particular use case. Some profiling data, mostly for illustration:

    Code:
    : %prun %timeit sum_quotients5([174, 192, 934, 554, 1234, 4321], 100000)
    
             1851399 function calls (1851298 primitive calls) in 1.258 seconds
    
       Ordered by: internal time
    
       ncalls  tottime  percall  cumtime  percall filename:lineno(function)
       530319    0.477    0.000    0.477    0.000 fractions.py:18(gcd)
       530319    0.231    0.000    0.709    0.000 <ipython-input-73-77bf6e782cd3>:3(lcm2)
         4111    0.226    0.000    1.251    0.000 <ipython-input-99-b022e6efa616>:3(sum_quotients5)
       258993    0.171    0.000    0.879    0.000 {reduce}
       258993    0.081    0.000    0.960    0.000 <ipython-input-51-dc0a7396641b>:4(lcm)
       258993    0.063    0.000    0.063    0.000 <ipython-input-99-b022e6efa616>:5(<lambda>)
            7    0.002    0.000    1.254    0.179 <magic-timeit>:1(inner)
         4116    0.002    0.000    0.002    0.000 {range}
            8    0.001    0.000    0.001    0.000 {compile}
         4113    0.000    0.000    0.000    0.000 {len}
            8    0.000    0.000    0.000    0.000 shlex.py:120(read_token)
    ...[continued]...
    (Normally I would just have done "%prun [code to run]" rather than "%prun %timeit [code to run]", but there was very little data without running it a bunch of times; more complicated operations like some of the research code I've used have much more interesting profile info.)

    Here you can see the main bottleneck on this particular data is the gcd function I'm using, followed closely by the lcm2 function. Reduce is getting called a lot, as you'd expect. The lambda I called b2 is listed there as well just under lcm.

    I checked a few other numbers against Schmidt's and am frankly impressed that Python is able to perform so closely to a compiled language, though to be fair VB6 is quite old. One advantage of Python in this context is that it has built-in arbitrary-length integers. For instance:

    Code:
    In [109]: sum_quotients5([3, 5], 10**100)
    23333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333331666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666668L
    The time you enjoy wasting is not wasted time.
    Bertrand Russell

    <- Remember to rate posts you find helpful.

  38. #38
    Banned
    Join Date
    Jun 2013
    Posts
    1,002

    Re: Using a mathematical identity to speed up certain calculations.

    Quote Originally Posted by jemidiah View Post
    Here's a little speed comparison data using the code in my previous post on my personal machine:

    Code:
    : %timeit sum_quotients5([174, 192, 934, 554, 1234, 4321], 100000)
    10000 loops, best of 3: 199 µs per loop
    
    : %timeit sum_quotients4([174, 192, 934, 554, 1234, 4321], 100000)
    10 loops, best of 3: 130 ms per loop
    So, the naive version is very roughly 1000 times slower than the other version in this particular use case. Some profiling data, mostly for illustration:
    It's about 650 times slower - but yes, the difference is quite large - and overproportionally so,
    when we e.g. compare it with the VB6-native-compiler-timing when I run the same task:
    n = 100000, [174, 192, 934, 554, 1234, 4321].

    The naive loop in VB6 needs only 1.98ms in that scenario (about factor 65 faster than python) -
    whilst the LCM-corrected Gauss needs 18µsec (about 11 times as fast as python).

    The speed-relation between Naive and GaussLCM in VB6 is about 1:110 - not 1:650 as in python.
    Perhaps the naive Python-loop could gain when the implementation is done a little bit differently.

    Quote Originally Posted by jemidiah View Post
    I checked a few other numbers against Schmidt's and am frankly impressed that Python is able to perform so closely to a compiled language, though to be fair VB6 is quite old.
    I'd say that the advantage of factor 10-20 is a quite normal relation - it's basically the same speedup
    we see, when we compare VB6-PCode (IDE-DebugMode) vs. the VC6-C2.exe generated native VB6-binaries.
    As for "old" - the progress with modern C/C++ compiler-optimizations is somewhere in the range of 40-80%
    (when we compare VC-6 native compiles with all optimizations vs. those of e.g. VC-2012).
    Not really "a magnitude" we're talking about there.

    Quote Originally Posted by jemidiah View Post
    One advantage of Python in this context is that it has built-in arbitrary-length integers.
    Ack, that's quite a nice feature to have available built in.
    For VB6 one would have to resort to an appropriate library for that.

    Well, out of interest, I'd like to see a timing for both, your naive and the GaussLCM-version for this longer list of Primes:
    [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43] on n = 100000

    since that is what's used in the following code (slightly updated for speed), which produced
    native compiled the following result (Gauss now roughly 6 times slower than the naive algo).
    http://vbRichClient.com/Downloads/GaussSumOpt.png

    Into a Form:
    Code:
    Option Explicit
    
    Private Declare Function timeGetTime Lib "winmm" () As Long
    Private Declare Function timeBeginPeriod Lib "winmm" (ByVal uPeriod As Long) As Long
    
    Private Sub Form_Load()
      timeBeginPeriod 1
      AutoRedraw = True
      Caption = "Click the Form..."
    End Sub
    
    Function LongArr(ParamArray P()) As Long()
    Dim i As Long, L() As Long
      ReDim L(UBound(P))
      For i = 0 To UBound(P): L(i) = P(i): Next
      LongArr = L
    End Function
     
    Private Sub Form_Click()
    Dim n As Long, PrimeFactorList() As Long, i As Long, T As Long, Result As Currency
       Cls
       
       n = 100000
       Print vbLf; "n ="; n; vbLf
    
       Print "Result for Factors(174, 192, 934, 554, 1234, 4321)"
       PrimeFactorList = LongArr(174, 192, 934, 554, 1234, 4321)
       T = timeGetTime
         For i = 1 To 500: Result = SumNaive(n, PrimeFactorList): Next
       Print "Naive", Result, Format((timeGetTime - T) / 500, "0.000msec")
       T = timeGetTime
         For i = 1 To 500: Result = SumGaussLCM(n, PrimeFactorList): Next
       Print "GaussLCM", Result, Format((timeGetTime - T) / 500, "0.000msec"); vbLf
     
       
       Print "Result for Primes (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43)"
       PrimeFactorList = LongArr(2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43)
       T = timeGetTime
         For i = 1 To 10: Result = SumNaive(n, PrimeFactorList): Next
       Print "Naive", Result, Format((timeGetTime - T) / 10, "0.0msec")
       T = timeGetTime
         For i = 1 To 10: Result = SumGaussLCM(n, PrimeFactorList): Next
       Print "GaussLCM", Result, Format((timeGetTime - T) / 10, "0.0msec"); vbLf
    End Sub
     
    Private Function SumNaive(ByVal n As Long, L() As Long) As Double
    Dim i As Long, j As Long, UBL As Long
      UBL = UBound(L)
      For i = 1 To n
        For j = 0 To UBL
          If i Mod L(j) = 0 Then SumNaive = SumNaive + i: Exit For
      Next j, i
    End Function
     
    Private Function SumGaussLCM(ByVal n As Long, L() As Long) As Double
    Dim i As Long, j As Long, s As Double, Pow As Double, ni As Long
    Static P2(0 To 30) As Long: If P2(30) = 0 Then For i = 0 To 30: P2(i) = 2 ^ i: Next
    
      For i = 1 To P2(UBound(L) + 1) - 1
        Pow = 1
        s = -0.5
        For j = 0 To UBound(L)
          If i And P2(j) Then s = -s: Pow = Int(Pow * L(j) / GCD(Pow, L(j)))
        Next j
        ni = Int(n / Pow)
        SumGaussLCM = SumGaussLCM + s * Pow * ni * (ni + 1)
      Next i
    End Function
    
    'used in SumGaussLCM only, to calculate the least-common-multiple there
    Private Function GCD(ByVal a As Double, ByVal b As Long) As Double
    Dim h As Long
      Do While b > 0
        If a < 2147483647# Then h = a Mod b Else h = a - b * Int(a / b)
        a = b: b = h
      Loop
      GCD = a
    End Function
    Olaf
    Last edited by Schmidt; Jul 1st, 2014 at 06:16 PM.

  39. #39
    Only Slightly Obsessive jemidiah's Avatar
    Join Date
    Apr 2002
    Posts
    2,371

    Re: Using a mathematical identity to speed up certain calculations.

    The timings you requested, first "GaussLCM", then naive:

    Code:
    In [110]: %timeit sum_quotients5([2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43], 100000)
    10 loops, best of 3: 93.9 ms per loop
    
    In [111]: %timeit sum_quotients4([2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43], 100000)
    10 loops, best of 3: 131 ms per loop
    I had actually run the first one earlier, and had compared the results to your GaussLCM result in #36, which took 51 ms, hence my "surprisingly close" comments. I'm curious why Python seems so much slower with the Naive method than VB6. I tried a few minor variations and they took essentially the same amount of time as my original.

    That said, I think this type of comparison is mostly an intellectual exercise. I wouldn't use Python in an important, speed-critical application; I'd use Cython or C. The main advantage of Python in this case in my mind is its simplicity, like the naive method being a single line; the fact that gcd, arbitrary-length integers, and more powerful iteration are built-in, no extra debugging or thought needed; and the overall "readability" of the code. I'd say efficiency isn't just a measurement of how fast the code runs; it also includes how long it took to correctly write and how much effort it cost the coder. For solving Project Euler's first problem, I'd say Python wins by a mile on that front: just type "sum(n for n in range(1000) if n%3 == 0 or n%5 == 0)" into an interpreter and you have your answer immediately.
    The time you enjoy wasting is not wasted time.
    Bertrand Russell

    <- Remember to rate posts you find helpful.

  40. #40
    Super Moderator si_the_geek's Avatar
    Join Date
    Jul 2002
    Location
    Bristol, UK
    Posts
    37,039

    Re: Using a mathematical identity to speed up certain calculations.

    Quote Originally Posted by Schmidt View Post
    since that is what's used in the following code (slightly updated for speed), which produced
    native compiled the following result (Gauss now roughly 6 times slower than the naive algo).
    While I don't dispute the change of preference for method based on a different data set, there is obvious bias in your code against the Gauss method.

    In SumNaive you optimise the Ubound call (using the variable UBL), but don't do the equivalent in SumGaussLCM.

    In GCD and SumGaussLCM you do slow integer division (using Int(a / b) etc) rather than the proper method (eg: (a \ b) ).


    There are also other optimisations that could be done for SumGaussLCM (such as a Mod replacement), but the ones I spotted are all things that would take noticeable effort - so it is understandable that you didn't do them for something you only claimed was slightly updated for speed.

Page 1 of 2 12 LastLast

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  



Featured


Click Here to Expand Forum to Full Width

Survey posted by VBForums.