FizzBuzz! Why is one method faster than the other?

Maximilian

Lifer
Feb 8, 2004
12,604
15
81
So I made a fizzbuzz program (yeah go me!) because it dawned on me ive never actually done that. I then made a modified version which in my mind should have been a bit faster but it wasent. fizzBuzz1 is the modified version, fizzBuzz2 is the original one I made.


After 1000 runs the average execution time of fizzBuzz1 was 1712
Code:
    public static long fizzBuzz1()
    {
        int maxNumber = 100000;
        boolean fizzModulus;
        boolean buzzModulus;
        long startTime = System.currentTimeMillis();
        long endTime;
        long executionTime;

        for (int count = 0; count <= maxNumber; count++)
        {
            //here we calculate if the number is fizz or buzz or both
            fizzModulus = count % 3 == 0;
            buzzModulus = count % 5 == 0;

            //we then use the results to print out the appropriate text
            if (fizzModulus && buzzModulus)
            {
                System.out.println("fizz buzz");
            }
            else
            {
                if (fizzModulus)
                {
                    System.out.println("fizz");
                }
                else
                {
                    if (buzzModulus)
                    {
                        System.out.println("buzz");
                    }
                    else
                    {
                        System.out.println(count);
                    }
                }                    
            }
        }
        endTime = System.currentTimeMillis();
        executionTime = endTime - startTime;
        System.out.println("Execution time: " + executionTime);
        return executionTime;
    }

After 1000 runs the average execution time of fizzBuzz1 was 1646
Code:
public static long fizzBuzz2()
    {
        int maxNumber = 100000;
        long startTime = System.currentTimeMillis();
        long endTime;
        long executionTime;

        for (int count = 0; count <= maxNumber; count++)
        {
            if (count % 3 == 0 && count % 5 == 0)
            {
                System.out.println("fizz buzz");
            }
            else
            {
                if (count % 3 == 0)
                {
                    System.out.println("fizz");
                }
                else
                {
                    if (count % 5 == 0)
                    {
                        System.out.println("buzz");
                    }
                    else
                    {
                        System.out.println(count);
                    }
                }                    
            }
        }
        endTime = System.currentTimeMillis();
        executionTime = endTime - startTime;
        System.out.println("Execution time: " + executionTime);
        return executionTime;
    }

Why is fizzBuzz2 faster? It does between 2 and 3 modulus calculations per run, fizzBuzz1 only ever does 2 calculations. I thought checking a stored boolean was faster than doing a calculation? What gives?
 

DaveSimmons

Elite Member
Aug 12, 2001
40,730
670
126
Perhaps the compiler couldn't optimize the code as well because the calculations are now variables that might be altered between the assignments and the two uses.

Optimizations don't always work like you'd expect when using a high-level language.

Edit: for example the 3-mod version might be optimized by the compiler to keep count in a register then do a mod on it to another register then a register compare, while the variables version adds slower stores to and fetches from RAM.
 
Last edited:

Markbnj

Elite Member <br>Moderator Emeritus
Moderator
Sep 16, 2005
15,682
14
81
www.markbetz.net
4%? Not really a huge difference. I'm not saying it isn't notable, but it's a small enough difference that figuring out where it came from in high level language code will be difficult, as Dave noted. I'm not sure 1000 runs is enough to average out stuff like demands for CPU from other processes, etc.
 

Schmide

Diamond Member
Mar 7, 2002
5,745
1,036
126
I altered the code a bit to make the example compatible with the interactive compiler. Without the passed fizz buzz it would optimize out the whole code.

(apparently the forums remove the links) Copy, paste and remove the space after the h and g (it's safe I promise it was generated from the IC). Otherwise you can just copy the examples in and such.

h ttp://g oo.gl/ffaK09


Code:
void fizzBuzz1(int &fizz, int &buzz) {
   int maxNumber = 100000;
   bool fizzModulus;
   bool buzzModulus;

   for (int count = 0; count <= maxNumber; count++)
   {
      //here we calculate if the number is fizz or buzz or both
      fizzModulus = count % 3 == 0;
      buzzModulus = count % 5 == 0;

      //we then use the results to print out the appropriate text
      if (fizzModulus && buzzModulus)
      {
         fizz++; buzz++;
      }
      else
      {
         if (fizzModulus)
         {
            fizz++; 
         }
         else
         {
            if (buzzModulus)
            {
               buzz++;
            }
         }                    
      }
   }
}

Output asm

Code:
::L__routine_start__Z9fizzBuzz1RiS(void):
fizzBuzz1(int&, int&):
        xorl      %r10d, %r10d                                  #6.19
        movl      $1, %r8d                                      #6.19
..B1.2:                         # Preds ..B1.7 ..B1.1
        movl      $1717986919, %eax                             #10.29
        movl      %r10d, %r9d                                   #10.29
        imull     %r10d                                         #10.29
        sarl      $31, %r9d                                     #10.29
        movl      $1431655766, %eax                             #9.29
        sarl      $1, %edx                                      #10.29
        movl      %r10d, %r11d                                  #10.29
        subl      %r9d, %edx                                    #10.29
        lea       (%rdx,%rdx,4), %ecx                           #10.29
        imull     %r10d                                         #9.29
        subl      %ecx, %r11d                                   #10.29
        movl      $0, %ecx                                      #10.7
        cmove     %r8d, %ecx                                    #10.7
        subl      %r9d, %edx                                    #9.29
        movl      %r10d, %r9d                                   #9.29
        lea       (%rdx,%rdx,2), %eax                           #9.29
        subl      %eax, %r9d                                    #9.29
        jne       ..B1.5        # Prob 50%                      #13.11
        incl      (%rdi)                                        #15.10
        testl     %ecx, %ecx                                    #13.26
        jne       ..B1.6        # Prob 50%                      #13.26
        jmp       ..B1.7        # Prob 100%                     #13.26
..B1.5:                         # Preds ..B1.2
        testl     %ecx, %ecx                                    #25.17
        je        ..B1.7        # Prob 50%                      #25.17
..B1.6:                         # Preds ..B1.3 ..B1.5
        incl      (%rsi)                                        #27.16
..B1.7:                         # Preds ..B1.5 ..B1.3 ..B1.6
        incl      %r10d                                         #6.44
        cmpl      $100000, %r10d                                #6.33
        jle       ..B1.2        # Prob 82%                      #6.33
        ret                                                     #32.1



h ttp://g oo.gl/ScJwv2

Code:
void fizzBuzz2(int &fizz, int &buzz) {
   int maxNumber = 100000;

   for (int count = 0; count <= maxNumber; count++)
   {
      if (count % 3 == 0 && count % 5 == 0)
      {
         fizz++; buzz++;
      }
      else
      {
         if (count % 3 == 0)
         {
            fizz++; 
         }
         else
         {
            if (count % 5 == 0)
            {
               buzz++;
            }
         }                    
      }
   }
}

Output asm
Code:
::L__routine_start__Z9fizzBuzz2RiS(void):
fizzBuzz2(int&, int&):
        xorl      %ecx, %ecx                                    #4.4
..B1.2:                         # Preds ..B1.7 ..B1.1
        movl      $1431655766, %eax                             #6.19
        movl      %ecx, %r8d                                    #6.19
        imull     %ecx                                          #6.19
        sarl      $31, %r8d                                     #6.19
        movl      %ecx, %r10d                                   #6.19
        subl      %r8d, %edx                                    #6.19
        lea       (%rdx,%rdx,2), %r9d                           #6.19
        subl      %r9d, %r10d                                   #6.19
        jne       ..B1.5        # Prob 50%                      #6.24
        movl      $1717986919, %eax                             #6.37
        movl      %ecx, %r8d                                    #6.37
        imull     %ecx                                          #6.37
        sarl      $31, %r8d                                     #6.37
        movl      %ecx, %r10d                                   #6.37
        sarl      $1, %edx                                      #6.37
        subl      %r8d, %edx                                    #6.37
        incl      (%rdi)                                        #8.10
        lea       (%rdx,%rdx,4), %r9d                           #6.37
        subl      %r9d, %r10d                                   #6.37
        je        ..B1.6        # Prob 50%                      #6.42
        jmp       ..B1.7        # Prob 100%                     #6.42
..B1.5:                         # Preds ..B1.2
        movl      $1717986919, %eax                             #18.25
        movl      %ecx, %r8d                                    #18.25
        imull     %ecx                                          #18.25
        sarl      $31, %r8d                                     #18.25
        movl      %ecx, %r10d                                   #18.25
        sarl      $1, %edx                                      #18.25
        subl      %r8d, %edx                                    #18.25
        lea       (%rdx,%rdx,4), %r9d                           #18.25
        subl      %r9d, %r10d                                   #18.25
        jne       ..B1.7        # Prob 50%                      #18.30
..B1.6:                         # Preds ..B1.3 ..B1.5
        incl      (%rsi)                                        #20.16
..B1.7:                         # Preds ..B1.3 ..B1.5 ..B1.6
        incl      %ecx                                          #4.4
        cmpl      $100001, %ecx                                 #4.4
        jb        ..B1.2        # Prob 82%                      #4.4
        ret                                                     #25.1

2 things I notice immediately.

They are using magic numbers to avoid divides. (0x55555556 for div 3 and 0x6666667 for div 5)

The compiler seems to do an early jump for fizzBuzz2 choosing to duplicate the div 5 code.

Although neither should be that far off from the other, the order of system outs is probably effecting the runtime.
 
Last edited:

Cogman

Lifer
Sep 19, 2000
10,286
145
106
printlns are a pretty big problem here. Giving up control to the OS is going to mess with any benchmark you are trying to do. You are effectively benchmarking println speed and seeing that it differs depending on what the OS is doing.

Take out output statements from benchmark code.

For the record, I got exactly opposite results from the OPs.

With printlns the code executes in 4000ms, without, both methods execute in under 20ms (which is pretty near the OS resolution).
 
Last edited:

Cogman

Lifer
Sep 19, 2000
10,286
145
106
Here is a corrected benchmark that more correctly benchmarks which method is faster.

Code:
	private static final long RUNTIME = 2 * 1000;

	public static void main(String[] args)
	{
		for (int i = 0; i < 10; i++)
		{
			long fizzbuzz1Num = 0;
			long start = System.currentTimeMillis();
			while (System.currentTimeMillis() - start < RUNTIME)
			{ 
				fizzBuzz1();
				++fizzbuzz1Num;
			}
			long fizzbuzz2Num = 0;
			start = System.currentTimeMillis();
			while (System.currentTimeMillis() - start < RUNTIME)
			{
				fizzBuzz2();
				++fizzbuzz2Num;
			}
			System.out.printf("fizzbuzz1 ran %d times and fizzbuzz2 ran %d times\n", fizzbuzz1Num, fizzbuzz2Num);
		}

	}

	public static long fizzBuzz1()
	{
		final int maxNumber = 100000;
		long arbNumber = 0;
		for (int count = 0; count <= maxNumber; count++)
		{
			//here we calculate if the number is fizz or buzz or both
			boolean fizzModulus = count % 3 == 0;
			boolean buzzModulus = count % 5 == 0;

			//we then use the results to print out the appropriate text
			if (fizzModulus && buzzModulus)
			{
				arbNumber += 1;
			}
			else
			{
				if (fizzModulus)
				{
					arbNumber += 2;
				}
				else
				{
					if (buzzModulus)
					{
						arbNumber += 3;
					}
					else
					{
						arbNumber += 4;
					}
				}
			}
		}
		return arbNumber;
	}

	public static long fizzBuzz2()
	{
		final int maxNumber = 100000;
		long arbNumber = 0;

		for (int count = 0; count <= maxNumber; count++)
		{
			if (count % 3 == 0 && count % 5 == 0)
			{
				arbNumber += 1;
			}
			else
			{
				if (count % 3 == 0)
				{
					arbNumber += 2;
				}
				else
				{
					if (count % 5 == 0)
					{
						arbNumber += 3;
					}
					else
					{
						arbNumber += 4;
					}
				}
			}
		}
		return arbNumber;
	}

And the output on my machine while I'm doing mostly nothing

Code:
fizzbuzz1 ran 9090 times and fizzbuzz2 ran 9125 times
fizzbuzz1 ran 47686740 times and fizzbuzz2 ran 169181425 times
fizzbuzz1 ran 164685086 times and fizzbuzz2 ran 174013853 times
fizzbuzz1 ran 170153380 times and fizzbuzz2 ran 173782804 times
fizzbuzz1 ran 170447111 times and fizzbuzz2 ran 173984947 times
fizzbuzz1 ran 170500965 times and fizzbuzz2 ran 174065139 times
fizzbuzz1 ran 171182655 times and fizzbuzz2 ran 174628662 times
fizzbuzz1 ran 171280194 times and fizzbuzz2 ran 175042304 times
fizzbuzz1 ran 171394829 times and fizzbuzz2 ran 174860114 times
fizzbuzz1 ran 170961805 times and fizzbuzz2 ran 174351304 times

As you can see from these results, runtime is pretty much equivalent. Of course, I kind of think that I'm more testing the speed of array allocation and insertion, but at least that is less intensive that printouts. I'll see if I can come up with a better test that gets to the crux of the issue.

EDIT OK, I've changed up the benchmark to remove array stuff. Now this is purely testing the speed of each function. It is somewhat impressive how much the optimizer helps the methods perform once it kicks in.

Either way, it looks to me like the methods are pretty comparable in speed (though the second does look ever so slightly faster).
 
Last edited:

Ken g6

Programming Moderator, Elite Member
Moderator
Dec 11, 1999
16,696
4,658
75
Modulus is an expensive operation. (http://www.agner.org/optimize/) On modern processors, an unpredictable conditional can also be an expensive operation. fizzBuzz2 trades having more moduli (sp?) in the code for having more conditionals that can skip them. ("&&" is one such conditional!)

Edit: For real speed improvement, try a sieve. :sneaky:
 

Maximilian

Lifer
Feb 8, 2004
12,604
15
81
Hmm interesting stuff, maybe one day ill understand assembly :awe:

I took out the println statements as suggested and ended up with a pretty meaningless 0ms average execution time for both methods. Cogmans measuring the number of method executions in 2000ms is a much better idea!

Learn something every day ;)
 

Cogman

Lifer
Sep 19, 2000
10,286
145
106
Hmm interesting stuff, maybe one day ill understand assembly :awe:

I took out the println statements as suggested and ended up with a pretty meaningless 0ms average execution time for both methods. Cogmans measuring the number of method executions in 2000ms is a much better idea!

Learn something every day ;)

I'm quite fond of it for micro benchmarks. It works pretty well for measuring short methods and it gives the VM a chance to optimize things (that is what you are seeing in the results). Taking multiple samples also helps to prevent you from having just one bad run, it allows you to throw out the outliers.