Results 1 to 40 of 49

Thread: Using a mathematical identity to speed up certain calculations.

Hybrid View

  1. #1

    Thread Starter
    Hyperactive Member Maven's Avatar
    Join Date
    Feb 2003
    Location
    Greeneville, TN
    Posts
    322

    Using a mathematical identity to speed up certain calculations.

    A long time ago, a mathematician by the name of Carl Friedrich Gauss was attending grade school. The teacher of his class wanted a break, so the teacher decided to assign the class a very time consuming problem. Sum all of the numbers between 1 and 50. 1+2+3..+..+49+50. So the students of the class set to work on the daunting task of summing all of these numbers. After a minute or two, the teacher noticed that Gauss wasn't doing anything. After investigation, the teacher found out that Gauss had discovered a new way to solve the problem without having to sum up each element by hand. The identity he found was: the sum of i from 1 to n is (n(n+1))/2.

    What is the moral of the story? Don't use a loop if you can do something in constant time.

    A good web site for practicing programming skills is Project Euler. Of particular interest, the very first problem can be solved using the method above.

    Try to solve it on your own before reading the rest of this post.






    Keep trying......






    Alright. So you need to know the sum of multiples of three or five for every single number under one thousand.

    The first step is to calculate the sum of multiples of threes under 1000:

    The sum of i from [1 to n] is (n(n+1))/2; as a result, 3 * sum of i from [1 to n] is 3(n(n+1))/2.
    where n = floor(N/3) => floor(999/3) = 333
    N is the total number you want to count up to.
    The next step is to calculate the sum of multiples of fives under 1000:

    The sum of i from [1 to m] is (m(m+1))/2; as a result, 5 * sum of i from [1 to m] is 5(n(n+1))/2.
    where m = floor(N/5) => floor(999/5) = 166
    We can now begin to construct a formula.

    3(n(n+1))/2 + 5(m(m+1))/2
    where n = floor(N/3) => floor(999/3) => 333
    where m = floor(N/5) => floor(999/5) => 199
    If you try this formula, you might get an incorrect answer. Why?

    It turns out that there is a set theory problem lurking behind the scenes. The left hand component will generate values like 3*5 while the right hand component will generate values like 5 * 3. So we are calculating 15 twice in the above formula (along with other similar values).

    So it turns out we got one more step to do. We need to subtract off the extra calculation we are doing. How do we do that?

    3(n(n+1))/2 + 5(m(m+1))/2 - (3*5)(o(o+1)/2
    where n = floor(N/3) => floor(999/3) => 333
    where m = floor(N/5) => floor(999/5) => 166
    where o = floor(N/15) => floor(999/15) => 66
    Now we should get the correct value. I apologize for the syntax, but I don't think this thread supports latex.

    The above method is the smart man's approach to solving the problem. Notice that it can be implemented in code without using a single loop.

    *updated thanks to Schmidt for pointing out the index error.

    Here is how the above mathematics is implemented in visual basic.

    Code:
    Public Class Form1
    
        Function CalculateResultBest(ByVal num As Long) As Long
            Dim n As Integer
            Dim m As Integer
            Dim o As Integer
    
            n = Math.Floor(num / 3)
            m = Math.Floor(num / 5)
            o = Math.Floor(num / 15)
    
            CalculateResultBest = ((3 * n * (n + 1)) / 2) + ((5 * m * (m + 1)) / 2) - ((15 * o * (o + 1)) / 2)
        End Function
    
        Private Sub Form1_Load(sender As Object, e As EventArgs) Handles MyBase.Load
            MsgBox(CalculateResultBest(999))
        End Sub
    End Class
    Last edited by Maven; Jun 28th, 2014 at 06:40 PM.
    Education is an admirable thing, but it is well to remember from time to time that nothing that is worth knowing can be taught. - Oscar Wilde

  2. #2
    type Woss is new Grumpy; wossname's Avatar
    Join Date
    Aug 2002
    Location
    #!/bin/bash
    Posts
    5,682

    Re: Using a mathematical identity to speed up certain calculations.

    Come again?
    I don't live here any more.

  3. #3

    Thread Starter
    Hyperactive Member Maven's Avatar
    Join Date
    Feb 2003
    Location
    Greeneville, TN
    Posts
    322

    Re: Using a mathematical identity to speed up certain calculations.

    Quote Originally Posted by wossname View Post
    Come again?
    You can solve the problem in O(1) time using the above method. The formula 3(n(n+1))/2 + 5(n(n+1))/2 - (3*5)(n(n+1)/2 will solve it so that you don't need a loop. The formula will allow you to solve it for any n, but the problem asks for 1000. [error]Simply replace n with 1000 and do the calculation in your program to yield the answer to the problem.[end of error]

    [correction made in op]

    If you have a specific question about the mathematics behind it, well ask.

    But this is the most efficient way to solve this type of problem. The dumb method is to use loops and brute force everything.
    Last edited by Maven; Jun 28th, 2014 at 05:36 PM.
    Education is an admirable thing, but it is well to remember from time to time that nothing that is worth knowing can be taught. - Oscar Wilde

  4. #4
    PowerPoster
    Join Date
    Jun 2013
    Posts
    7,219

    Re: Using a mathematical identity to speed up certain calculations.

    Quote Originally Posted by Maven View Post
    You can solve the problem in O(1) time using the above method. The formula 3(n(n+1))/2 + 5(n(n+1))/2 - (3*5)(n(n+1)/2 will solve it so that you don't need a loop. The formula will allow you to solve it for any n, but the problem asks for 1000. Simply replace n with 1000 and do the calculation in your program to yield the answer to the problem.

    If you have a specific question about the mathematics behind it, well ask.
    I'm tempted, but from what I've seen so far I'm not sure if that is a good idea...

    Quote Originally Posted by Maven View Post
    But this is the most efficient way to solve this type of problem. The dumb method is to use loops and brute force everything.
    Well, among "dumb things a developer can do", premature optimization has a high ranking.

    So, what's wrong in using a simple "naive loop", as long as it is fast enough I might ask:

    E.g. for your given problem at hand, this would be the easily understood snippet below:
    Code:
    Function CalculateResult(ByVal n As Long) As Long
    Dim i As Long, Result As Long
      For i = 1 To n
        If i Mod 3 = 0 Then Result = Result + i
        If i Mod 5 = 0 Then Result = Result + i
      Next i
      CalculateResult = Result
    End Function
    Sure, if you're a bright candle and don't have to think about the more optimized approach
    very hard, then there's nothing wrong in writing it up "as you go" of course.

    Though the *most dumb* thing a developer can do, is to use (or promote) an "optimized approach",
    when it spits out the wrong results (as your recommended solution surely does).

    When I add the missing bracket in your expression, testing it out in the DirectWindow it gives:
    n=1000: ? 3*(n*(n+1))/2 + 5*(n*(n+1))/2 - (3*5)*(n*(n+1))/2
    -3503500

    Which obviously isn't the right result.

    So (as I read it earlier somewhere):

    Try again...

    You might want to take a look at the optimized approach below, to get some ideas on what you did wrong:
    Code:
    Function CalculateResultOptimized(ByVal n As Long) As Long
    Dim n3 As Long: n3 = n \ 3
    Dim n5 As Long: n5 = n \ 5
      CalculateResultOptimized = (3 * n3 * (n3 + 1) + 5 * n5 * (n5 + 1)) \ 2
    End Function
    Both approaches (the naive loop, as well as the optimized snippet above) give the correct result:
    ?CalculateResult(1000), CalculateResultOptimized(1000)
    267333 267333

    Olaf
    Last edited by Schmidt; Jun 28th, 2014 at 12:54 PM.

  5. #5

    Thread Starter
    Hyperactive Member Maven's Avatar
    Join Date
    Feb 2003
    Location
    Greeneville, TN
    Posts
    322

    Re: Using a mathematical identity to speed up certain calculations.

    Quote Originally Posted by Schmidt View Post
    Well, among "dumb things a developer can do", premature optimization has a high ranking.
    You should read the post you link more carefully:
    Code:
    All optimization is premature unless
    
    A program is too slow (many people forget this part).
    
    You have a measurement (profile or similar) showing that the optimization could improve things.
    We can measure my optimization very easily. Your algorithm (which is incorrect by the way), is an O(N) algorithm; as a result, the time it takes your program to complete is linear with the input size. My algorithm is O(1) and it will complete in constant time regardless of input size.

    Brute force
    Brute force is the by definition the worse possible algorithm.


    E.g. for your given problem at hand, this would be the easily understood snippet below:

    Sure, if you're a bright candle and don't have to think about the more optimized approach
    very hard, then there's nothing wrong in writing it up "as you go" of course.
    One should strive for efficient algorithms. You're abusing Kunth's quote. He was talking about optimizing without understanding the gains.

    Though the *most dumb* thing a developer can do, is to use (or promote) an "optimized approach",
    when it spits out the wrong results (as your recommended solution surely does).
    I made a small indexing error. I typed the post straight out without testing. At any rate, I've corrected it.


    The correct answer is 233168.
    Education is an admirable thing, but it is well to remember from time to time that nothing that is worth knowing can be taught. - Oscar Wilde

  6. #6
    PowerPoster
    Join Date
    Jun 2013
    Posts
    7,219

    Re: Using a mathematical identity to speed up certain calculations.

    Quote Originally Posted by Maven View Post
    You should read the post you link more carefully:
    Code:
    All optimization is premature unless
    
    A program is too slow (many people forget this part).
    
    You have a measurement (profile or similar) showing that the optimization could improve things.
    I've read it - and am aware of what Hoare/Knuth were really saying, before posting the link.

    I'm just not sure about a practical use-case for the Gauss-optimization in most
    of a developers daily (often non-scientific) work - so, if you have such a case where
    this O(1)-level Gauss-optimization plays a larger role, I'd like to hear about it.

    Currently I'm trying to come up with a generic function for that, "just for fun".

    Quote Originally Posted by Maven View Post
    We can measure my optimization very easily. Your algorithm (which is incorrect by the way),
    Wich is the next wrong information I had to read from you... (since there's nothing incorrect
    with the function I posted in #8 - and neither was there anything incorrect with my first
    example, which was made to point out what you did wrong in applying Gauss to the problem
    at hand).

    Quote Originally Posted by Maven View Post
    is an O(N) algorithm; as a result, the time it takes your program to complete is linear with the input size. My algorithm is O(1) and it will complete in constant time regardless of input size.
    You didn't post any algorithm - you posted misleading information (now corrected, although not yet in #3).
    But in its original version nobody would have been able to make any use of the Gauss-formula you
    wrongly applied to the given problem.

    Quote Originally Posted by Maven View Post
    Brute force is the by definition the worse possible algorithm.
    That's like saying: "Bubble-Sort doesn't have its place, because there's much faster sorting-algos".

    Quote Originally Posted by Maven View Post
    One should strive for efficient algorithms.
    Sure - when they are really needed in your application - or when you're implementing a common base-library
    which you don't know what future usage-scenarios might be thrown at it, *and* when your personal capabilities
    are sufficient enough, to fullfill the promise of a given "theory" with an *error-free* implementation -
    then your statement makes sense.

    Quote Originally Posted by Maven View Post
    I made a small indexing error. I typed the post straight out without testing.
    Sorry, but it was not "a small indexing error" - you forgot a badly needed, separate integer-divide
    on each of the Sub-Sets, before applying the differing 'ns' to the Gauss-Function to
    determine a Sub-Sets Sum.

    Olaf

  7. #7
    Frenzied Member
    Join Date
    Apr 2012
    Posts
    1,253

    Re: Using a mathematical identity to speed up certain calculations.

    You are including numbers that are multiples of both 3 and 5 twice

    e.g. When i is 15, both If statements are true and, hence, 30 is added to your Result variable. This is the 'set theory' problem the OP mentions.

    Code:
    Function CalculateResult(ByVal n As Long) As Long
    Dim i As Long, Result As Long
      For i = 1 To n
        If i Mod 3 = 0 Then Result = Result + i
        If i Mod 5 = 0 Then Result = Result + i
      Next i
      CalculateResult = Result
    End Function
    Oh, and your loop should go up to n-1 since that is also stated in the problem.

    So..

    Code:
    Function CalculateResult(ByVal n As Long) As Long
    Dim i As Long, Result As Long
      For i = 1 To n - 1
        If i Mod 3 = 0 Or i Mod 5 = 0 Then Result = Result + i
      Next i
      CalculateResult = Result
    End Function
    ...is correct.
    Last edited by ColinE66; Jun 28th, 2014 at 03:22 PM.
    If you don't know where you're going, any road will take you there...

    My VB6 love-children: Vee-Hive and Vee-Launcher

  8. #8
    PowerPoster
    Join Date
    Jun 2013
    Posts
    7,219

    Re: Using a mathematical identity to speed up certain calculations.

    Quote Originally Posted by ColinE66 View Post
    You are including numbers that are multiples of both 3 and 5 twice

    e.g. When i is 15, both If statements are true and, hence, 30 is added to your Result variable. This is the 'set theory' problem the OP mentions.

    Code:
    Function CalculateResult(ByVal n As Long) As Long
    Dim i As Long, Result As Long
      For i = 1 To n
        If i Mod 3 = 0 Then Result = Result + i
        If i Mod 5 = 0 Then Result = Result + i
      Next i
      CalculateResult = Result
    End Function
    Oh, and your loop should go up to n-1 since that is also stated in the problem.

    So..

    Code:
    Function CalculateResult(ByVal n As Long) As Long
    Dim i As Long, Result As Long
      For i = 1 To n - 1
        If i Mod 3 = 0 Or i Mod 5 = 0 Then Result = Result + i
      Next i
      CalculateResult = Result
    End Function
    ...is correct.
    Believe me or not Colin, but I was aware of the problem of "multiples of the factors" being sumed-up
    twice, when I posted my examples (the OP mentioned the problem clearly enough in his post).

    My comment was thought to point out the premature optimization problem - and in addition
    the wrong expressions the OP was using to calculate the "sum of the set-parts" directly
    (not taking into account, that any given SubSet-Sum should be calculated, based on:
    nFactor = n IntegerDiv Factor - and not on n itself).

    And as for n-1 in the loop - that would be wrong IMO, since you can hand over
    a 999-Parameter as 'n' into the Functions quite easily (if that's the upper bound
    of the range you're doing your calculations on).

    Olaf

  9. #9
    Frenzied Member
    Join Date
    Apr 2012
    Posts
    1,253

    Re: Using a mathematical identity to speed up certain calculations.

    Hey, I was just pointing it out since I could see straight away that your code did not address the stated problem.

    As for the n-1, it did occur to me that you could pass 999 to the function but since the problem was stated as solving for a value less than n (and also because you yourself called your own function with 1000, in line with the stipulated parameter for the problem) I subtracted the 1 inside the function. Difference of opinion on that one, I'm afraid
    If you don't know where you're going, any road will take you there...

    My VB6 love-children: Vee-Hive and Vee-Launcher

  10. #10
    PowerPoster
    Join Date
    Jun 2013
    Posts
    7,219

    Re: Using a mathematical identity to speed up certain calculations.

    And as for the "balance" of the whole thing with regards to "performance vs. efforts"...

    The "naive loop" is still not bad performancewise, even when we consider the task being:
    "Do that for a somewhat larger list of PrimeNumbers, not only hardwired to the values of 3 and 5."

    Code:
    Option Explicit
    
    Private Sub Form_Click()
    Dim PrimeFactorList() As Long
       With New_c.ArrayList(vbLong, 2, 3, 5, 7, 11, 13, 17, 19)
         .CopyToArray PrimeFactorList
       End With
       
       Cls
       New_c.Timing True: Print CalculateResult(100000, PrimeFactorList), New_c.Timing
    End Sub
     
    Private Function CalculateResult(ByVal n As Long, PrimeFactorList() As Long) As Currency
    Dim i As Long, j As Long, UBPrimeFactorList As Long
      UBPrimeFactorList = UBound(PrimeFactorList)
      For i = 1 To n
        For j = 0 To UBPrimeFactorList
          If i Mod PrimeFactorList(j) = 0 Then CalculateResult = CalculateResult + i: Exit For
        Next j
      Next i
    End Function
    The above naive loop will respond (even for a larger n of 100000) within about 1.5msec (native compiled).

    Now try to come up with the appropriate (generic) function which does the more direct calculation,
    based on the n * (n+1) / 2 approach.

    Keep in mind, that now any possible combination of the "multiple-factors-problem" has to be
    considered (considered, not always applied), to subtract properly from the resulting Set-Sums.

    Olaf
    Last edited by Schmidt; Jun 28th, 2014 at 04:21 PM.

  11. #11
    Frenzied Member
    Join Date
    Apr 2012
    Posts
    1,253

    Re: Using a mathematical identity to speed up certain calculations.

    What does your CalculateResultOptimized function look like now, Olaf? Incorporating a solution to the 'set-theory intersection problem'?
    If you don't know where you're going, any road will take you there...

    My VB6 love-children: Vee-Hive and Vee-Launcher

  12. #12
    PowerPoster
    Join Date
    Jun 2013
    Posts
    7,219

    Re: Using a mathematical identity to speed up certain calculations.

    Quote Originally Posted by ColinE66 View Post
    What does your CalculateResultOptimized function look like now, Olaf? Incorporating a solution to the 'set-theory intersection problem'?
    Well, that's what I'm currently struggling with (to come up with a generically usable approach, handing out similar results as the Looping-Function).

    I'm not a mathematician - and although the simplified case of two numbers has a trivial solution,
    which gives the right result.

    ...still hardwired at the moment:
    Code:
    Private Function CalculateResultOptimized(ByVal n As Long) As Currency
    Dim n3 As Long:  n3 = n \ 3
    Dim n5 As Long:  n5 = n \ 5
      CalculateResultOptimized = CalculateResultOptimized + 3 * n3 * (n3 + 1) \ 2
      CalculateResultOptimized = CalculateResultOptimized + 5 * n5 * (n5 + 1) \ 2
    
     'now the set-subtraction for the occurence of "combined factors"
     Dim n15 As Long: n15 = n \ 15
      CalculateResultOptimized = CalculateResultOptimized - 15 * n15 * (n15 + 1) \ 2
    End Function

    ...already the still simple case of only 3 factors (2, 3 and 5) is currently giving me headaches,
    since simply applying the logic from above will not work "straight out" as it seems:

    e.g. the following will not produce the correct result
    (and albeit having "slight clues" why it doesn't work - I've not yet found where I have
    the knot in my brain...)

    Code:
    Private Function CalculateResultOptimized(ByVal n As Long) As Currency
    Dim n2 As Long:  n2 = n \ 2
    Dim n3 As Long:  n3 = n \ 3
    Dim n5 As Long:  n5 = n \ 5
      CalculateResultOptimized = CalculateResultOptimized + 2 * n2 * (n2 + 1) \ 2
      CalculateResultOptimized = CalculateResultOptimized + 3 * n3 * (n3 + 1) \ 2
      CalculateResultOptimized = CalculateResultOptimized + 5 * n5 * (n5 + 1) \ 2
    
    'now the set-subtractions for all potential occurences of multiples of the factors
    Dim n6 As Long:  n6 = n \ 6    '2*3
    Dim n10 As Long: n10 = n \ 10  '2*5
    Dim n15 As Long: n15 = n \ 15  '3*5
    Dim n30 As Long: n30 = n \ 30  '2*3*5
      CalculateResultOptimized = CalculateResultOptimized - 6 * n6 * (n6 + 1) \ 2
      CalculateResultOptimized = CalculateResultOptimized - 10 * n10 * (n10 + 1) \ 2
      CalculateResultOptimized = CalculateResultOptimized - 15 * n15 * (n15 + 1) \ 2
      CalculateResultOptimized = CalculateResultOptimized - 30 * n30 * (n30 + 1) \ 2
    End Function
    Olaf
    Last edited by Schmidt; Jun 28th, 2014 at 05:02 PM.

  13. #13
    Frenzied Member
    Join Date
    Apr 2012
    Posts
    1,253

    Re: Using a mathematical identity to speed up certain calculations.

    Success! Works for original problem; just changed your arraylist to contain only 3 and 5 as below. Should work with more provided array is sorted in ascending order.

    Code:
    Option Explicit
    
    Private Sub Form_Click()
    Dim PrimeFactorList() As Long
       With New_c.ArrayList(vbLong, 3, 5)
         .CopyToArray PrimeFactorList
       End With
       Cls
       New_c.Timing True: Print CalculateResult(1000, PrimeFactorList), New_c.Timing
    End Sub
     
    Private Function CalculateResult(ByVal n As Long, PrimeFactorList() As Long) As Currency
    Dim i As Long, j As Long, k As Long, UBPrimeFactorList As Long, AlreadyCounted As Boolean
      UBPrimeFactorList = UBound(PrimeFactorList)
      For i = 1 To n - 1
        For j = 0 To UBPrimeFactorList
          AlreadyCounted = False
          For k = 0 To j - 1
             If i Mod PrimeFactorList(k) = 0 Then
                AlreadyCounted = True
                Exit For
             End If
          Next k
          If Not AlreadyCounted And i Mod PrimeFactorList(j) = 0 Then CalculateResult = CalculateResult + i: Exit For
        Next j
      Next i
    End Function
    EDIT: 41msec to process your lengthy arraylist for 100,000 numbers (not compiled)
    Last edited by ColinE66; Jun 28th, 2014 at 05:31 PM.
    If you don't know where you're going, any road will take you there...

    My VB6 love-children: Vee-Hive and Vee-Launcher

  14. #14

    Thread Starter
    Hyperactive Member Maven's Avatar
    Join Date
    Feb 2003
    Location
    Greeneville, TN
    Posts
    322

    Re: Using a mathematical identity to speed up certain calculations.

    Here is the code and how you would implement the mathematics from the OP.

    Code:
    Public Class Form1
    
        Function CalculateResultBest(ByVal num As Long) As Long
            Dim n As Integer
            Dim m As Integer
            Dim o As Integer
    
            n = Math.Floor(num / 3)
            m = Math.Floor(num / 5)
            o = Math.Floor(num / 15)
    
            CalculateResultBest = ((3 * n * (n + 1)) / 2) + ((5 * m * (m + 1)) / 2) - ((15 * o * (o + 1)) / 2)
        End Function
    
        Private Sub Form1_Load(sender As Object, e As EventArgs) Handles MyBase.Load
            MsgBox(CalculateResultBest(999))
        End Sub
    End Class
    If you would rather the function behave as a less than num, simply add num = num-1 at the top of the function.

    At any rate, you will not obtain a faster result. And it's rather easy mathematics. My apologies for not testing it before posting (too much confidence causes mistake now and again). I made a simple indexing error without even thinking about it.

    One could generalize this algorithm easily to work for more cases. Just remember to subtract off intersections.
    Last edited by Maven; Jun 28th, 2014 at 06:38 PM.
    Education is an admirable thing, but it is well to remember from time to time that nothing that is worth knowing can be taught. - Oscar Wilde

  15. #15
    PowerPoster
    Join Date
    Jun 2013
    Posts
    7,219

    Re: Using a mathematical identity to speed up certain calculations.

    Quote Originally Posted by Maven View Post
    Here is the code and how you would implement the mathematics from the OP.

    Code:
    Public Class Form1
    
        Function CalculateResultBest(ByVal num As Long) As Long
            Dim n As Integer
            Dim m As Integer
            Dim o As Integer
    
            n = Math.Floor(num / 3)
            m = Math.Floor(num / 5)
            o = Math.Floor(num / 15)
    
            CalculateResultBest = ((3 * n * (n + 1)) / 2) + ((5 * m * (m + 1)) / 2) - ((15 * o * (o + 1)) / 2)
        End Function
    
        Private Sub Form1_Load(sender As Object, e As EventArgs) Handles MyBase.Load
            MsgBox(CalculateResultBest(999))
        End Sub
    End Class
    That was already posted in #10 (the first function).

    Quote Originally Posted by Maven View Post
    One could generalize this algorithm easily to work for more cases. Just remember to subtract off intersections.
    Well - "easily" is a bit optimistic, as we already found problems with simply
    "subtracting the intersections".

    Please show us, how easily you can come up with an implementation for a Factorlist of more
    than only two factors - say, for the list of factors: 2, 3, 5, 7, 11, 13, 17, 19
    (as demonstrated with the still pretty simple looping-function in #8).

    Olaf

  16. #16
    Frenzied Member
    Join Date
    Apr 2012
    Posts
    1,253

    Re: Using a mathematical identity to speed up certain calculations.

    Yeah, I'm playing around with this, too. Independently of your initial Optimised approach (I didn't really study it) I came up with something identical and am now trying to find a way of eliminating the multiple-factors problem from your post #8

    But I'm not a mathematician either ;(
    If you don't know where you're going, any road will take you there...

    My VB6 love-children: Vee-Hive and Vee-Launcher

  17. #17
    Frenzied Member
    Join Date
    Apr 2012
    Posts
    1,253

    Re: Using a mathematical identity to speed up certain calculations.

    re post #10, I can see that you are going to have great difficulty avoiding subtracting the same numbers more than once when different multiples 'intersect'. I assume that is what you are struggling to resolve...
    If you don't know where you're going, any road will take you there...

    My VB6 love-children: Vee-Hive and Vee-Launcher

  18. #18
    PowerPoster
    Join Date
    Jun 2013
    Posts
    7,219

    Re: Using a mathematical identity to speed up certain calculations.

    Quote Originally Posted by ColinE66 View Post
    re post #10, I can see that you are going to have great difficulty avoiding subtracting the same numbers more than once when different multiples 'intersect'. I assume that is what you are struggling to resolve...
    Yep - I'm trying to find a generic function which internally makes use of the O(1)-category n*(n+1)/2 approach,
    to avoid the O(n) looping over the whole range.

    Not sure, if your function in #12 really does speed-up what I posted in #8,
    since I already had an early exit there, as soon as a valid factor was found
    (not possible to count something twice).

    When you use my function in #8 with n-1 as the parameter, then it wil always produce
    the same results as yours for the same list of PrimeFactors.

    Olaf

  19. #19
    Frenzied Member
    Join Date
    Apr 2012
    Posts
    1,253

    Re: Using a mathematical identity to speed up certain calculations.

    Awww shite. I mis-read your post #8 and thought that you hadn't allowed for the multiple-multiples scenario. Didn't even test it, in fact. Just went to work trying to eliminate them! Laughing at my own stupidity now. And, yes, it will definitely be slower than your code as I (obviously) just added a big overhead!
    If you don't know where you're going, any road will take you there...

    My VB6 love-children: Vee-Hive and Vee-Launcher

  20. #20
    Frenzied Member
    Join Date
    Apr 2012
    Posts
    1,253

    Re: Using a mathematical identity to speed up certain calculations.

    I think it's pretty obvious that one needs the combinations of all elements in order to exclude the intersections. And this, of course, is easily coded when the number of elements and their values are known in advance. Personally, I'd really like to see something optimal that doesn't know those parameters up-front.
    If you don't know where you're going, any road will take you there...

    My VB6 love-children: Vee-Hive and Vee-Launcher

  21. #21

    Thread Starter
    Hyperactive Member Maven's Avatar
    Join Date
    Feb 2003
    Location
    Greeneville, TN
    Posts
    322

    Re: Using a mathematical identity to speed up certain calculations.

    Quote Originally Posted by ColinE66 View Post
    I think it's pretty obvious that one needs the combinations of all elements in order to exclude the intersections. And this, of course, is easily coded when the number of elements and their values are known in advance. Personally, I'd really like to see something optimal that doesn't know those parameters up-front.
    A power set algorithm is what is needed. There have been many different power set algorithms made over the years, and one can find one of his or her liking with a quick google search.
    Education is an admirable thing, but it is well to remember from time to time that nothing that is worth knowing can be taught. - Oscar Wilde

  22. #22
    PowerPoster
    Join Date
    Jun 2013
    Posts
    7,219

    Re: Using a mathematical identity to speed up certain calculations.

    FWIW, I have it working here now (sha1-checksum for my Zip-File is: b08cec4ca48ba84043b38040e94b14c4896cdff0)

    (...just waiting a bit for Maven to finish his implementation too...)

    And sure, for smaller lists (as for example {3 , 5}) the Gauss-approach is working much
    faster than then naive looping one - especially for larger n of 100000 or 1Mio.

    Though the O(1) behaviour (with regards to n) is only *one* thing - the other one
    is the amount of factor-combinations, when we talk about larger lists of Primes, as e.g.:
    {2, 3, 5, 7, 11, 13, 17, 19} or
    {2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43}

    The Powerset grows quite fast (exponentially) with the Factors-count, so there's
    a large trade-off for the Gauss-Implementation (whereas the loop-approach shrugs
    this larger list-count off, scaling roughly linearly with the amount of factors to check.

    So there's a "balance-point" (depending on n and the factors-count), where either
    the one - or the other of the algorithms can shine.

    Below's two ScreenShots with the results...

    For n = 100,000:
    http://vbRichClient.com/Downloads/GaussSumFormula.png

    And for n = 1Mio
    http://vbRichClient.com/Downloads/Ga...ormula1Mio.png

    Olaf
    Last edited by Schmidt; Jun 29th, 2014 at 11:25 AM.

  23. #23
    Frenzied Member
    Join Date
    Apr 2012
    Posts
    1,253

    Re: Using a mathematical identity to speed up certain calculations.

    Can you PM me the link, please, Olaf? Keen to take a look...
    If you don't know where you're going, any road will take you there...

    My VB6 love-children: Vee-Hive and Vee-Launcher

  24. #24
    PowerPoster
    Join Date
    Jun 2013
    Posts
    7,219

    Re: Using a mathematical identity to speed up certain calculations.

    Quote Originally Posted by ColinE66 View Post
    Can you PM me the link, please, Olaf? Keen to take a look...
    Sure - take a good look at the IIF-function and what it finally accomplishes,
    since that was the only thing I was struggling with the whole time, until the light
    finally came on... ;-)

    Olaf

  25. #25
    Frenzied Member
    Join Date
    May 2014
    Location
    Kallithea Attikis, Greece
    Posts
    1,289

    Re: Using a mathematical identity to speed up certain calculations.

    Back in school...i found an easy way to add all even numbers in a range of i,j

    The problem was supposed to have a solution with a loop of j-i times, and inside using mod to find if number is even..and then if yes add to an accumulator.

    The formula is simple
    if i is even then i=i-1
    if j is even then j=j+1
    so the sum is (j*j-i*i)/4
    and the number of even numbers is (j-i)/2

  26. #26
    Only Slightly Obsessive jemidiah's Avatar
    Join Date
    Apr 2002
    Posts
    2,431

    Re: Using a mathematical identity to speed up certain calculations.

    Since this is "code-it-better", I thought I'd mention the Python implementation of some of these algorithms, since many of these operations are particularly straightforward in standard Python. The "sum_quotients5" below in particular would be much longer in standard C# or VB.NET.

    The original problem's solutions, first naive, then with "Gauss's formula" (I call it "the triangular number formula"; it's very ancient):

    Code:
    def sum_quotients1(a, b, n_max):
        return sum(n for n in range(n_max) if n%a == 0 or n%b == 0)
    
    def sum_quotients2(a, b, n_max):
        n_max = n_max - 1
        def b2(n):
            return n*(n+1)/2
        return a * b2(n_max//a) + b * b2(n_max//b) - a*b * b2(n_max//(a*b))
    Of course, the second one only works when a, b are relatively prime positive integers. When they are general integers, the "set-theoretic problem" is that multiples of their lcm are included; recall lcm(a, b) * gcd(a, b) = a*b. The following accounts for that possibility (and changes b2 into a "lambda function"):

    Code:
    from fractions import gcd
    
    def lcm2(a, b):
        return a*b / gcd(a, b)
    
    def sum_quotients3(a, b, n_max):
        n_max = n_max - 1
        b2 = lambda n: n*(n+1)/2
        return a * b2(n_max//a) + b * b2(n_max//b) - lcm2(a, b) * b2(n_max//lcm2(a, b))
    The more general problem allowing a list of (let's say positive) integers in place of just a and b can be solved naively like this:

    Code:
    def sum_quotients4(L, n_max):
        return sum(n for n in range(n_max) if any(n%a==0 for a in L))
    The "inclusion-exclusion" approach can be done as follows. We need an n-ary lcm function first (modified from here for convenience):

    Code:
    def lcm(L):
        return reduce(lcm2, L)
    Next we need to iterate over the power set of L, alternatively adding and subtracting the lcm of subsets as we go:

    Code:
    from itertools import combinations
    
    def sum_quotients5(L, n_max):
        n_max = n_max - 1
        b2 = lambda n: n*(n+1)/2
        
        s = 0
        for i in range(len(L)):
            for subset in combinations(L, i+1):
                lcm_subset = lcm(subset)
                s += (-1)**i * lcm_subset * b2(n_max//lcm_subset)
        return s
    (The "s=0" through "return s" could have been written as

    Code:
        return sum((-1)**i * lcm(subset) * b2(n_max//lcm(subset)) for i in range(len(L)) for subset in combinations(L, i+1))
    )

    As for correctness, these all return 233168 on the input (3, 5, 1000) (or ([3, 5], 1000) as appropriate). Versions 1, 3, 4, and 5 also all return 249500 on (2, 2, 1000) or ([2, 2], 1000) and (2, 4, 1000) or ([2, 4], 1000).) Moreover, versions 4 and 5 both give 327958 on ([3, 4, 5, 6, 7], 1000), which strongly suggests this is all correct: version 4 is extremely simple and is nigh-guaranteed to be correct for that reason.

    Of course, since there are 2^N subsets of a list of length N (to construct each subset, for each item in the list, just choose whether to include it in the subset), so the inclusion-exclusion approach is terribly inefficient in that direction. If the elements of L are relatively large compared to n_max, then it becomes more efficient.

    If we're using ipython's interpreter to run these commands, the "magic" commands %time, %timeit, and %prun are wonderfully useful. %time tells you how long a command takes to run; %timeit roughly averages %time over many runs, with the number of runs chosen adaptively; and %prun profiles your code, giving you a function-by-function breakdown of how much time is spent where and how often things are getting called.

    (Finally, I'd like to say I have no interest whatsoever in getting involved some sort of personal debate. I'm only here since dee-u PM'd me asking for my input on this thread, and here it is.)
    The time you enjoy wasting is not wasted time.
    Bertrand Russell

    <- Remember to rate posts you find helpful.

  27. #27
    PowerPoster
    Join Date
    Jun 2013
    Posts
    7,219

    Re: Using a mathematical identity to speed up certain calculations.

    Well, below is the link to the Zip I've posted the SHA1-checksum for in #25:
    http://vbRichClient.com/Downloads/Ga...ulaArchive.zip

    And below's the code to paste into a VB6-Form - in the meantime also containing an LCM-
    corrected version, to allow for non-primes (in the List you pass to the Gauss-based algo) too.
    (thanks to jemidiah for the hint and his nice contribution to the thread)

    The VB-code may BTW serve as an example, that one can achieve efficient routines also without lambdas.

    Into a VB6-Form (compile native with all extended options, to reach the speeds as shown in the screenshot below)
    http://vbRichClient.com/Downloads/GaussSum.png

    Code:
    Option Explicit
    
    Private Declare Function timeGetTime Lib "winmm.dll" () As Long
    Private Declare Function timeBeginPeriod Lib "winmm.dll" (ByVal uPeriod As Long) As Long
    
    Private Sub Form_Load()
      timeBeginPeriod 1
      AutoRedraw = True
      Caption = "Click the Form..."
    End Sub
    
    Function LongArr(ParamArray P()) As Long()
    Dim i As Long, L() As Long
      ReDim L(UBound(P))
      For i = 0 To UBound(P): L(i) = P(i): Next
      LongArr = L
    End Function
     
    Private Sub Form_Click()
    Dim n As Long, PrimeFactorList() As Long, T As Long
       Cls
       
       n = 100000
       Print vbLf; "n ="; n; vbLf
       
       Print "Result for Factors(3, 4, 5, 6, 7) ... using n = 999"
       PrimeFactorList = LongArr(3, 4, 5, 6, 7)
       T = timeGetTime: Print "Naive", SumNaive(999, PrimeFactorList), Format(timeGetTime - T, "0msec")
       T = timeGetTime: Print "Gauss", SumGauss(999, PrimeFactorList), Format(timeGetTime - T, "0msec"), "<- wrong result"
       T = timeGetTime: Print "GaussLCM", SumGaussLCM(999, PrimeFactorList), Format(timeGetTime - T, "0msec")
       Print
       
       Print "Result for Primes (2, 3, 5, 7, 11, 13, 17, 19)"
       PrimeFactorList = LongArr(2, 3, 5, 7, 11, 13, 17, 19)
       T = timeGetTime: Print "Naive", SumNaive(n, PrimeFactorList), Format(timeGetTime - T, "0msec")
       T = timeGetTime: Print "Gauss", SumGauss(n, PrimeFactorList), Format(timeGetTime - T, "0msec")
       T = timeGetTime: Print "GaussLCM", SumGaussLCM(n, PrimeFactorList), Format(timeGetTime - T, "0msec")
       Print
       
       Print "Result for Primes (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43)"
       PrimeFactorList = LongArr(2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43)
       T = timeGetTime: Print "Naive", SumNaive(n, PrimeFactorList), Format(timeGetTime - T, "0msec")
       T = timeGetTime: Print "Gauss", SumGauss(n, PrimeFactorList), Format(timeGetTime - T, "0msec")
       T = timeGetTime: Print "GaussLCM", SumGaussLCM(n, PrimeFactorList), Format(timeGetTime - T, "0msec")
    End Sub
     
    Function SumNaive(ByVal n As Long, L() As Long) As Currency
    Dim i As Long, j As Long
      For i = 1 To n
        For j = 0 To UBound(L)
          If i Mod L(j) = 0 Then SumNaive = SumNaive + i: Exit For
      Next j, i
    End Function
    
    Function SumGauss(ByVal n As Long, L() As Long) As Currency
    Dim i As Long, j As Long, Pow As Double, ni As Long
      For i = 1 To 2 ^ (UBound(L) + 1) - 1
        Pow = -1
        For j = 0 To UBound(L)
          If i And 2 ^ j Then Pow = -Pow * L(j)
        Next j
        ni = Int(n / Abs(Pow))
        SumGauss = SumGauss + Pow * ni * (ni + 1) / 2
      Next i
    End Function
    
    Function SumGaussLCM(ByVal n As Long, L() As Long) As Currency
    Dim i As Long, j As Long, Pow As Double, ni As Long
      For i = 1 To 2 ^ (UBound(L) + 1) - 1
        Pow = -1
        For j = 0 To UBound(L)
          If i And 2 ^ j Then Pow = -Int(Pow * L(j) / GCD(Abs(Pow), L(j)))
        Next j
        ni = Int(n / Abs(Pow))
        SumGaussLCM = SumGaussLCM + Pow * ni * (ni + 1) / 2
      Next i
    End Function
    
    'used in SumGaussLCM only, to calculate the least-common-multiple there
    Function GCD(ByVal a As Double, ByVal b As Double) As Double
      If b > 0 Then GCD = GCD(b, a - b * Int(a / b)) Else GCD = a
    End Function
    Olaf
    Last edited by Schmidt; Jul 1st, 2014 at 12:13 AM.

  28. #28
    Only Slightly Obsessive jemidiah's Avatar
    Join Date
    Apr 2002
    Posts
    2,431

    Re: Using a mathematical identity to speed up certain calculations.

    Here's a little speed comparison data using the code in my previous post on my personal machine:

    Code:
    : %timeit sum_quotients5([174, 192, 934, 554, 1234, 4321], 100000)
    10000 loops, best of 3: 199 µs per loop
    
    : %timeit sum_quotients4([174, 192, 934, 554, 1234, 4321], 100000)
    10 loops, best of 3: 130 ms per loop
    So, the naive version is very roughly 1000 times slower than the other version in this particular use case. Some profiling data, mostly for illustration:

    Code:
    : %prun %timeit sum_quotients5([174, 192, 934, 554, 1234, 4321], 100000)
    
             1851399 function calls (1851298 primitive calls) in 1.258 seconds
    
       Ordered by: internal time
    
       ncalls  tottime  percall  cumtime  percall filename:lineno(function)
       530319    0.477    0.000    0.477    0.000 fractions.py:18(gcd)
       530319    0.231    0.000    0.709    0.000 <ipython-input-73-77bf6e782cd3>:3(lcm2)
         4111    0.226    0.000    1.251    0.000 <ipython-input-99-b022e6efa616>:3(sum_quotients5)
       258993    0.171    0.000    0.879    0.000 {reduce}
       258993    0.081    0.000    0.960    0.000 <ipython-input-51-dc0a7396641b>:4(lcm)
       258993    0.063    0.000    0.063    0.000 <ipython-input-99-b022e6efa616>:5(<lambda>)
            7    0.002    0.000    1.254    0.179 <magic-timeit>:1(inner)
         4116    0.002    0.000    0.002    0.000 {range}
            8    0.001    0.000    0.001    0.000 {compile}
         4113    0.000    0.000    0.000    0.000 {len}
            8    0.000    0.000    0.000    0.000 shlex.py:120(read_token)
    ...[continued]...
    (Normally I would just have done "%prun [code to run]" rather than "%prun %timeit [code to run]", but there was very little data without running it a bunch of times; more complicated operations like some of the research code I've used have much more interesting profile info.)

    Here you can see the main bottleneck on this particular data is the gcd function I'm using, followed closely by the lcm2 function. Reduce is getting called a lot, as you'd expect. The lambda I called b2 is listed there as well just under lcm.

    I checked a few other numbers against Schmidt's and am frankly impressed that Python is able to perform so closely to a compiled language, though to be fair VB6 is quite old. One advantage of Python in this context is that it has built-in arbitrary-length integers. For instance:

    Code:
    In [109]: sum_quotients5([3, 5], 10**100)
    23333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333331666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666666668L
    The time you enjoy wasting is not wasted time.
    Bertrand Russell

    <- Remember to rate posts you find helpful.

  29. #29
    PowerPoster
    Join Date
    Jun 2013
    Posts
    7,219

    Re: Using a mathematical identity to speed up certain calculations.

    Quote Originally Posted by jemidiah View Post
    Here's a little speed comparison data using the code in my previous post on my personal machine:

    Code:
    : %timeit sum_quotients5([174, 192, 934, 554, 1234, 4321], 100000)
    10000 loops, best of 3: 199 µs per loop
    
    : %timeit sum_quotients4([174, 192, 934, 554, 1234, 4321], 100000)
    10 loops, best of 3: 130 ms per loop
    So, the naive version is very roughly 1000 times slower than the other version in this particular use case. Some profiling data, mostly for illustration:
    It's about 650 times slower - but yes, the difference is quite large - and overproportionally so,
    when we e.g. compare it with the VB6-native-compiler-timing when I run the same task:
    n = 100000, [174, 192, 934, 554, 1234, 4321].

    The naive loop in VB6 needs only 1.98ms in that scenario (about factor 65 faster than python) -
    whilst the LCM-corrected Gauss needs 18µsec (about 11 times as fast as python).

    The speed-relation between Naive and GaussLCM in VB6 is about 1:110 - not 1:650 as in python.
    Perhaps the naive Python-loop could gain when the implementation is done a little bit differently.

    Quote Originally Posted by jemidiah View Post
    I checked a few other numbers against Schmidt's and am frankly impressed that Python is able to perform so closely to a compiled language, though to be fair VB6 is quite old.
    I'd say that the advantage of factor 10-20 is a quite normal relation - it's basically the same speedup
    we see, when we compare VB6-PCode (IDE-DebugMode) vs. the VC6-C2.exe generated native VB6-binaries.
    As for "old" - the progress with modern C/C++ compiler-optimizations is somewhere in the range of 40-80%
    (when we compare VC-6 native compiles with all optimizations vs. those of e.g. VC-2012).
    Not really "a magnitude" we're talking about there.

    Quote Originally Posted by jemidiah View Post
    One advantage of Python in this context is that it has built-in arbitrary-length integers.
    Ack, that's quite a nice feature to have available built in.
    For VB6 one would have to resort to an appropriate library for that.

    Well, out of interest, I'd like to see a timing for both, your naive and the GaussLCM-version for this longer list of Primes:
    [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43] on n = 100000

    since that is what's used in the following code (slightly updated for speed), which produced
    native compiled the following result (Gauss now roughly 6 times slower than the naive algo).
    http://vbRichClient.com/Downloads/GaussSumOpt.png

    Into a Form:
    Code:
    Option Explicit
    
    Private Declare Function timeGetTime Lib "winmm" () As Long
    Private Declare Function timeBeginPeriod Lib "winmm" (ByVal uPeriod As Long) As Long
    
    Private Sub Form_Load()
      timeBeginPeriod 1
      AutoRedraw = True
      Caption = "Click the Form..."
    End Sub
    
    Function LongArr(ParamArray P()) As Long()
    Dim i As Long, L() As Long
      ReDim L(UBound(P))
      For i = 0 To UBound(P): L(i) = P(i): Next
      LongArr = L
    End Function
     
    Private Sub Form_Click()
    Dim n As Long, PrimeFactorList() As Long, i As Long, T As Long, Result As Currency
       Cls
       
       n = 100000
       Print vbLf; "n ="; n; vbLf
    
       Print "Result for Factors(174, 192, 934, 554, 1234, 4321)"
       PrimeFactorList = LongArr(174, 192, 934, 554, 1234, 4321)
       T = timeGetTime
         For i = 1 To 500: Result = SumNaive(n, PrimeFactorList): Next
       Print "Naive", Result, Format((timeGetTime - T) / 500, "0.000msec")
       T = timeGetTime
         For i = 1 To 500: Result = SumGaussLCM(n, PrimeFactorList): Next
       Print "GaussLCM", Result, Format((timeGetTime - T) / 500, "0.000msec"); vbLf
     
       
       Print "Result for Primes (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43)"
       PrimeFactorList = LongArr(2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43)
       T = timeGetTime
         For i = 1 To 10: Result = SumNaive(n, PrimeFactorList): Next
       Print "Naive", Result, Format((timeGetTime - T) / 10, "0.0msec")
       T = timeGetTime
         For i = 1 To 10: Result = SumGaussLCM(n, PrimeFactorList): Next
       Print "GaussLCM", Result, Format((timeGetTime - T) / 10, "0.0msec"); vbLf
    End Sub
     
    Private Function SumNaive(ByVal n As Long, L() As Long) As Double
    Dim i As Long, j As Long, UBL As Long
      UBL = UBound(L)
      For i = 1 To n
        For j = 0 To UBL
          If i Mod L(j) = 0 Then SumNaive = SumNaive + i: Exit For
      Next j, i
    End Function
     
    Private Function SumGaussLCM(ByVal n As Long, L() As Long) As Double
    Dim i As Long, j As Long, s As Double, Pow As Double, ni As Long
    Static P2(0 To 30) As Long: If P2(30) = 0 Then For i = 0 To 30: P2(i) = 2 ^ i: Next
    
      For i = 1 To P2(UBound(L) + 1) - 1
        Pow = 1
        s = -0.5
        For j = 0 To UBound(L)
          If i And P2(j) Then s = -s: Pow = Int(Pow * L(j) / GCD(Pow, L(j)))
        Next j
        ni = Int(n / Pow)
        SumGaussLCM = SumGaussLCM + s * Pow * ni * (ni + 1)
      Next i
    End Function
    
    'used in SumGaussLCM only, to calculate the least-common-multiple there
    Private Function GCD(ByVal a As Double, ByVal b As Long) As Double
    Dim h As Long
      Do While b > 0
        If a < 2147483647# Then h = a Mod b Else h = a - b * Int(a / b)
        a = b: b = h
      Loop
      GCD = a
    End Function
    Olaf
    Last edited by Schmidt; Jul 1st, 2014 at 06:16 PM.

  30. #30
    Super Moderator si_the_geek's Avatar
    Join Date
    Jul 2002
    Location
    Bristol, UK
    Posts
    41,929

    Re: Using a mathematical identity to speed up certain calculations.

    Quote Originally Posted by Schmidt View Post
    since that is what's used in the following code (slightly updated for speed), which produced
    native compiled the following result (Gauss now roughly 6 times slower than the naive algo).
    While I don't dispute the change of preference for method based on a different data set, there is obvious bias in your code against the Gauss method.

    In SumNaive you optimise the Ubound call (using the variable UBL), but don't do the equivalent in SumGaussLCM.

    In GCD and SumGaussLCM you do slow integer division (using Int(a / b) etc) rather than the proper method (eg: (a \ b) ).


    There are also other optimisations that could be done for SumGaussLCM (such as a Mod replacement), but the ones I spotted are all things that would take noticeable effort - so it is understandable that you didn't do them for something you only claimed was slightly updated for speed.

  31. #31
    PowerPoster
    Join Date
    Jun 2013
    Posts
    7,219

    Re: Using a mathematical identity to speed up certain calculations.

    Quote Originally Posted by si_the_geek View Post
    While I don't dispute the change of preference for method based on a different data set, there is obvious bias in your code against the Gauss method.

    In SumNaive you optimise the Ubound call (using the variable UBL), but don't do the equivalent in SumGaussLCM.

    In GCD and SumGaussLCM you do slow integer division (using Int(a / b) etc) rather than the proper method (eg: (a \ b) ).


    There are also other optimisations that could be done for SumGaussLCM (such as a Mod replacement), but the ones I spotted are all things that would take noticeable effort - so it is understandable that you didn't do them for something you only claimed was slightly updated for speed.
    There's no bias against one method or the other - since both are valid algorithms in their own right.
    For certain scenarios one would be choosen over the other.

    Feel free to optimize the GaussLCM-method in my post #38 further if you want,
    but I can tell you that neither the "Ubound-Optimization" will bring any speed-increase -
    nor will any of your suggested "Mod or Div optimizations" work, because those
    are only viable for the (positive) range of VBs Long-Type (Values < 2^31) -
    and for somewhat longer powersets we go far above this range (in the Pow-Variable,
    which is of type Double for a reason).

    I already applied a small optimization in the sub-routine of the GaussLCM (GCD), which
    is the one that takes the most time - where I did:
    ...If a < 2147483647# Then h = a Mod b ...
    but applying a likewise optimization in other places is not worthwhile, because the
    inner-loop within the (already changed to non-recursive) GCD-function is the most
    frequented piece of code.

    So your allegation that I was biased is unfounded - you didn't even checked
    the changes you suggested out on your own...

    In the same style as in the other discussions we had already, you engage in
    speculation, applying only half-knowledge - not posting any code.
    I hope I am allowed to say that much - and that you don't try to ban me again,
    for the sole reason of pointing out your imprecise and kind of insulting behaviour.

    Remember that we are in Code-It-Better here - if you think that you can speed
    up the last SumGaussLCM by factor 2 or 3, then by all means don't talk that much -
    just do it and post the code and the results of your native compiled binary.
    *Then* call me biased (after delivering proof).

    And no, other than you wrote further above that:
    "there are also other optimisations that could be done for SumGaussLCM (such as a Mod replacement)"
    there is no such thing in the GaussLCM-routine - there's only the GCD-Sub-Routine which
    contains a Modulo-Op (already optimized for Values where this is possible) ...
    whilst the SumGaussLCM itself contains only "faked Integer-Divs" (no Modulo-Ops at all).

    So, what remains is your Ubound-suggestion - well, here is the updated code which contains that:

    Code:
    Option Explicit
    
    Private Declare Function timeGetTime Lib "winmm" () As Long
    Private Declare Function timeBeginPeriod Lib "winmm" (ByVal uPeriod As Long) As Long
    
    Private Sub Form_Load()
      timeBeginPeriod 1
      AutoRedraw = True
      Caption = "Click the Form..."
    End Sub
    
    Function LongArr(ParamArray P()) As Long()
    Dim i As Long, L() As Long
      ReDim L(UBound(P))
      For i = 0 To UBound(P): L(i) = P(i): Next
      LongArr = L
    End Function
    
    Private Sub Form_Click()
    Dim n As Long, PrimeFactorList() As Long, i As Long, T As Long, Result As Currency
       Cls
       
       n = 100000
       Print vbLf; "n ="; n; vbLf
    
       Print "Result for Factors(174, 192, 934, 554, 1234, 4321)"
       PrimeFactorList = LongArr(174, 192, 934, 554, 1234, 4321)
       T = timeGetTime
         For i = 1 To 500: Result = SumNaive(n, PrimeFactorList): Next
       Print "Naive", Result, Format((timeGetTime - T) / 500, "0.000msec")
       T = timeGetTime
         For i = 1 To 500: Result = SumGaussLCM(n, PrimeFactorList): Next
       Print "GaussLCM", Result, Format((timeGetTime - T) / 500, "0.000msec"); vbLf
     
       Print "Result for Primes (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43)"
       PrimeFactorList = LongArr(2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43)
       T = timeGetTime
         For i = 1 To 10: Result = SumNaive(n, PrimeFactorList): Next
       Print "Naive", Result, Format((timeGetTime - T) / 10, "0.0msec")
       T = timeGetTime
         For i = 1 To 10: Result = SumGaussLCM(n, PrimeFactorList): Next
       Print "GaussLCM", Result, Format((timeGetTime - T) / 10, "0.0msec"); vbLf
       
       Print "Result for Primes (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71)"
       PrimeFactorList = LongArr(2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71)
       T = timeGetTime
         For i = 1 To 3: Result = SumNaive(n, PrimeFactorList): Next
       Print "Naive", Result, Format((timeGetTime - T) / 3, "0.0msec")
       T = timeGetTime
         For i = 1 To 3: Result = SumGaussLCM(n, PrimeFactorList): Next
       Print "GaussLCM", Result, Format((timeGetTime - T) / 3, "0.0msec"); vbLf
    End Sub
     
    Private Function SumNaive(ByVal n As Long, L() As Long) As Double
    Dim i As Long, j As Long, UBL As Long
      UBL = UBound(L)
      For i = 1 To n
        For j = 0 To UBL
          If i Mod L(j) = 0 Then SumNaive = SumNaive + i: Exit For
      Next j, i
    End Function
     
    Private Function SumGaussLCM(ByVal n As Long, L() As Long) As Double
    Dim i As Long, j As Long, s As Double, Pow As Double, ni As Long, UBL As Long
    Static P2(0 To 30) As Long: If P2(0) = 0 Then For i = 0 To 30: P2(i) = 2 ^ i: Next
     
      UBL = UBound(L)
      For i = 1 To P2(UBL + 1) - 1
        Pow = 1
        s = -0.5
        For j = 0 To UBL
          If i And P2(j) Then s = -s: Pow = Int(Pow * L(j) / GCD(Pow, L(j)))
        Next j
        ni = Int(n / Pow)
        SumGaussLCM = SumGaussLCM + s * Pow * ni * (ni + 1)
      Next i
    End Function
    
    'used in SumGaussLCM only, to calculate the least-common-multiple there
    Private Function GCD(ByVal a As Double, ByVal b As Long) As Double
    Dim h As Long
      Do While b > 0
        If a < 2147483647# Then h = a Mod b Else h = a - b * Int(a / b)
        a = b: b = h
      Loop
      GCD = a
    End Function
    And as you see, no changes speedwise when you compare the results for the already posted sets from #38:
    http://www.vbforums.com/images/ieimages/2014/07/1.png

    ...with these new ones:
    http://vbRichClient.com/Downloads/GaussSumOpt2.png

    And the best argument against your allegation that I was disfavouring the GaussLCM deliberately,
    is the last set in the above screen-shot- which was only increased by 6 additional Primes, but causing
    a disadvantage of the GaussLCM of about factor *500* now.

    With my earlier, shorter lists, I tried to point out, from what amount of Primes in the list,
    (roughly - and for n=100000) the GaussLCM is at a disadvantage with native VB6-code.
    But there *exists* such a Balance-point (as already mentioned in post #25) -
    and no factor 2 or even a factor 10 faster GaussLCM would help much, since the
    performance of this algo *will* get worse (with the Factors-Count of the list) exponentially.


    Olaf
    Last edited by Schmidt; Jul 2nd, 2014 at 09:36 PM.

  32. #32
    Super Moderator si_the_geek's Avatar
    Join Date
    Jul 2002
    Location
    Bristol, UK
    Posts
    41,929

    Re: Using a mathematical identity to speed up certain calculations.

    Quote Originally Posted by Schmidt View Post
    There's no bias against one method or the other - since both are valid algorithms in their own right.
    Both being valid clearly does not mean they are of equal quality. If you wrote one and a beginner wrote the other, we would expect the quality of yours to be better.

    In the context of a speed trial, it is unquestionable (but not necessarily intentional) bias to have a speed optimisation only in one, when it can clearly be applied to both. That applies whether or not the effect turns out to be noticeable.

    Your admission of only having "slightly updated for speed" clearly implies that there will probably be a speed bias towards the simpler one (as that is much easier to optimise). As I said before that is absolutely fine (because you made the caveat), but later claiming there is no bias at all is blatantly dishonest.

    but I can tell you that neither the "Ubound-Optimization" will bring any speed-increase -
    nor will any of your suggested "Mod or Div optimizations" work, because those
    are only viable for the (positive) range of VBs Long-Type (Values < 2^31)
    Regarding the UBound optimisation, you thought it was worth adding to the other method, so it was wrong to not also apply it in the same way to the Gauss one.

    While I made a mistake regarding the range of the \ operator in VB6, a Mod replacement clearly does not need to have the same limits as the built-in Mod operator.

    In the same style as in the other discussions we had already, you engage in
    speculation, applying only half-knowledge - not posting any code.
    My one mistake does not mean speculation or half-knowledge. It is childish for you to make up that kind of insult about people just because they point out some of the flaws in your posts.

    As for the comment regarding code, you are well aware that these days the majority of my forum time needs to be moderation, rather than code based.

    Stop the trolling and lies - as evidenced in that quote and several other places in your post, and openly admitted by you on multiple occasions in the past.

    I hope I am allowed to say that much - and that you don't try to ban me again,
    for the sole reason of pointing out your imprecise and kind of insulting behaviour.
    That is slanderous.

    As you are well aware, people only get banned for violating the rules - and in the case of your previous ban the violation was unquestionably fully intentional (as were the multiple times before that when you were only warned).

  33. #33
    Only Slightly Obsessive jemidiah's Avatar
    Join Date
    Apr 2002
    Posts
    2,431

    Re: Using a mathematical identity to speed up certain calculations.

    The timings you requested, first "GaussLCM", then naive:

    Code:
    In [110]: %timeit sum_quotients5([2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43], 100000)
    10 loops, best of 3: 93.9 ms per loop
    
    In [111]: %timeit sum_quotients4([2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43], 100000)
    10 loops, best of 3: 131 ms per loop
    I had actually run the first one earlier, and had compared the results to your GaussLCM result in #36, which took 51 ms, hence my "surprisingly close" comments. I'm curious why Python seems so much slower with the Naive method than VB6. I tried a few minor variations and they took essentially the same amount of time as my original.

    That said, I think this type of comparison is mostly an intellectual exercise. I wouldn't use Python in an important, speed-critical application; I'd use Cython or C. The main advantage of Python in this case in my mind is its simplicity, like the naive method being a single line; the fact that gcd, arbitrary-length integers, and more powerful iteration are built-in, no extra debugging or thought needed; and the overall "readability" of the code. I'd say efficiency isn't just a measurement of how fast the code runs; it also includes how long it took to correctly write and how much effort it cost the coder. For solving Project Euler's first problem, I'd say Python wins by a mile on that front: just type "sum(n for n in range(1000) if n%3 == 0 or n%5 == 0)" into an interpreter and you have your answer immediately.
    The time you enjoy wasting is not wasted time.
    Bertrand Russell

    <- Remember to rate posts you find helpful.

  34. #34

    Thread Starter
    Hyperactive Member Maven's Avatar
    Join Date
    Feb 2003
    Location
    Greeneville, TN
    Posts
    322

    Re: Using a mathematical identity to speed up certain calculations.

    Quote Originally Posted by jemidiah View Post
    The timings you requested, first "GaussLCM", then naive:

    Code:
    In [110]: %timeit sum_quotients5([2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43], 100000)
    10 loops, best of 3: 93.9 ms per loop
    
    In [111]: %timeit sum_quotients4([2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43], 100000)
    10 loops, best of 3: 131 ms per loop
    I had actually run the first one earlier, and had compared the results to your GaussLCM result in #36, which took 51 ms, hence my "surprisingly close" comments. I'm curious why Python seems so much slower with the Naive method than VB6. I tried a few minor variations and they took essentially the same amount of time as my original.

    That said, I think this type of comparison is mostly an intellectual exercise. I wouldn't use Python in an important, speed-critical application; I'd use Cython or C. The main advantage of Python in this case in my mind is its simplicity, like the naive method being a single line; the fact that gcd, arbitrary-length integers, and more powerful iteration are built-in, no extra debugging or thought needed; and the overall "readability" of the code. I'd say efficiency isn't just a measurement of how fast the code runs; it also includes how long it took to correctly write and how much effort it cost the coder. For solving Project Euler's first problem, I'd say Python wins by a mile on that front: just type "sum(n for n in range(1000) if n%3 == 0 or n%5 == 0)" into an interpreter and you have your answer immediately.
    I don't think one could tell much about comparing times like these. Processes are frequently swapped in and out by the operating system; as a result, one can't tell by looking at the system time how long exactly a process took to complete. Of course, this is outside of computer or operating system comparisons.

    In addition, ask yourself what your trying to optimize for exactly. You're going to get different problems with resulting strategies depending on how you answer the question.
    Education is an admirable thing, but it is well to remember from time to time that nothing that is worth knowing can be taught. - Oscar Wilde

  35. #35
    Only Slightly Obsessive jemidiah's Avatar
    Join Date
    Apr 2002
    Posts
    2,431

    Re: Using a mathematical identity to speed up certain calculations.

    Since the original topic is likely dead (after having been derailed twice, and having relatively little content to begin with), I'll make a few "meta" comments.

    Boy, this was a painful thread to participate in. Schmidt's interactions with si and Maven both got very personal very quickly. He managed to make everyone else (including me, a little) defensive after one reply each. Frankly Schmidt, you need to learn to play well with others. Having good ideas is not enough--having amazingly great ideas might be, but I haven't seen that from you so far. I'm reminded of an intern a friend of mine worked with. He had implemented some odd sorting algorithm, and during code review was told to change it to something standard. Instead, he wrote a proof of correctness of his original algorithm in the comments. Unsurprisingly, he was not hired after the internship. I don't know if the particulars of this example are applicable to you Schmidt, but the general idea certainly is: coding is about more than just telling a computer how to do things; having productive interactions with others is fundamentally important.

    On the flip side, Maven and si could have done better to not let Schmidt get under their skin. Just focus on the content; if you start writing something that depends on your character/skills or the character/skills of the other person, you should probably just delete it and let it go. For instance, I had started to reply to Schmidt's "650" comment above, saying something to the effect of, "I can divide 130 ms by 199 µs, thank you very much," but am glad I didn't actually post it, since it's just not helpful. (I'm sorry it ended up in this post, though it's a good example.) (I understand si's role as a moderator is different from my role as a random forum user. si and Schmidt's exchange seems almost entirely unhelpful regardless.)

    With all the bickering and nitpicking and talking past each other, this thread is *way* longer than it should have been, and I doubt it will help many people code anything better, if only because of the sheer amount of personal "spew" you have to wade through to get at the content.

    Anywho, best of luck everyone; no hard feelings; have a good life if I don't see you again.
    The time you enjoy wasting is not wasted time.
    Bertrand Russell

    <- Remember to rate posts you find helpful.

  36. #36

    Thread Starter
    Hyperactive Member Maven's Avatar
    Join Date
    Feb 2003
    Location
    Greeneville, TN
    Posts
    322

    Re: Using a mathematical identity to speed up certain calculations.

    Quote Originally Posted by jemidiah View Post
    Since the original topic is likely dead (after having been derailed twice, and having relatively little content to begin with), I'll make a few "meta" comments.

    Boy, this was a painful thread to participate in. Schmidt's interactions with si and Maven both got very personal very quickly. He managed to make everyone else (including me, a little) defensive after one reply each. Frankly Schmidt, you need to learn to play well with others. Having good ideas is not enough--having amazingly great ideas might be, but I haven't seen that from you so far. I'm reminded of an intern a friend of mine worked with. He had implemented some odd sorting algorithm, and during code review was told to change it to something standard. Instead, he wrote a proof of correctness of his original algorithm in the comments. Unsurprisingly, he was not hired after the internship. I don't know if the particulars of this example are applicable to you Schmidt, but the general idea certainly is: coding is about more than just telling a computer how to do things; having productive interactions with others is fundamentally important.

    On the flip side, Maven and si could have done better to not let Schmidt get under their skin. Just focus on the content; if you start writing something that depends on your character/skills or the character/skills of the other person, you should probably just delete it and let it go. For instance, I had started to reply to Schmidt's "650" comment above, saying something to the effect of, "I can divide 130 ms by 199 µs, thank you very much," but am glad I didn't actually post it, since it's just not helpful. (I'm sorry it ended up in this post, though it's a good example.) (I understand si's role as a moderator is different from my role as a random forum user. si and Schmidt's exchange seems almost entirely unhelpful regardless.)

    With all the bickering and nitpicking and talking past each other, this thread is *way* longer than it should have been, and I doubt it will help many people code anything better, if only because of the sheer amount of personal "spew" you have to wade through to get at the content.

    Anywho, best of luck everyone; no hard feelings; have a good life if I don't see you again.

    I can only blame myself since I made the post without proof reading or checking the formula. The main point to the topic was that mathematical identities are useful for speeding up certain calculations on computers. An observation that nobody seems to have made is that both algorithms come from the same identity. The left hand side of the identity was the basis for the algorithm posted by Schmidt. There is nothing interesting about the approach; instead, it was obvious. In mathematics, there are countless identities for various problems. When one faces a calculation, it's wroth while to check to see if an identity is readily available. Some identities can save many orders of magnitude of algorithmic complexity. For example, the right hand identity found by Guass saved us one order of complexity.

    I would like to point out that there is a difference between code optimization and algorithmic optimization. I think people tend to confuse the two. In code optimization, the goal is to get an algorithm to run faster on a specific machine. Outside of embedded systems, code optimization is mostly a waste of time. Algorithmic optimization is about reducing complexity of an algorithm. I don't mean complexity in the sense that the algorithm is easier to understand; instead, I mean complexity in terms of the number of steps required to solve a problem. For example, a selection sort has a worse case of O(n^2). If we're talking about sorting an array, the steps required to do it using a selection sort is the size of the array squared. The heap sort on the other hand has a O(nlogn) complexity. Again n here is referring to the size of the array. So the heap sort has a lower complexity than the selection sort even though the heap sort is more difficult to understand than the selection sort.

    So to bring this back to the above, mathematical identities can be useful to get lower complexity on many mathematical problems. They are superior to code optimization of higher complexity algorithms because not performing steps is faster than performing highly optimized steps.

    This ties into many arguments about languages. What is the fastest language etc. Algorithms are generally more important than language selection. One can optimize an O(n^2) algorithm in assembly code and bring the algorithm to its maximum potential on a machine; however, we could implement an algorithm with a lower complexity of o(n) in visual basic and run circles around the highly tuned o(n^2) algorithm written in assembler.
    Education is an admirable thing, but it is well to remember from time to time that nothing that is worth knowing can be taught. - Oscar Wilde

  37. #37
    Angel of Code Niya's Avatar
    Join Date
    Nov 2011
    Posts
    8,598

    Re: Using a mathematical identity to speed up certain calculations.

    Any thread with Si+Schmidt or me+axisdj+Carlos Rocha+Schmidt or dilettante+any MS fan is going to have some kind of drama. Donno what it is about these combination of personalities but it never ends well lol....
    Treeview with NodeAdded/NodesRemoved events | BlinkLabel control | Calculate Permutations | Object Enums | ComboBox with centered items | .Net Internals article(not mine) | Wizard Control | Understanding Multi-Threading | Simple file compression | Demon Arena

    Copy/move files using Windows Shell | I'm not wanted

    C++ programmers will dismiss you as a cretinous simpleton for your inability to keep track of pointers chained 6 levels deep and Java programmers will pillory you for buying into the evils of Microsoft. Meanwhile C# programmers will get paid just a little bit more than you for writing exactly the same code and VB6 programmers will continue to whitter on about "footprints". - FunkyDexter

    There's just no reason to use garbage like InputBox. - jmcilhinney

    The threads I start are Niya and Olaf free zones. No arguing about the benefits of VB6 over .NET here please. Happiness must reign. - yereverluvinuncleber

  38. #38
    Super Moderator si_the_geek's Avatar
    Join Date
    Jul 2002
    Location
    Bristol, UK
    Posts
    41,929

    Re: Using a mathematical identity to speed up certain calculations.

    I would disagree with the level of benefit from language chosen and code optimisation, because in the right cases they can each make a massive speed difference, and in many others they give a worthwhile benefit (which is often enough).

    I do agree however that the algorithm is the best place to start - because it is usually by far the biggest differentiator (and often takes less effort), plus it often alters the other two phases.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  



Click Here to Expand Forum to Full Width