Optimization unstable code

Mark R

Diamond Member
Oct 9, 1999
8,513
16
81
I've just had a frustrating couple of hours trying to diagnose this in a hobby program I've been putting together.

Essentially, the key routine is a large Galois field multiply. Under debug, it worked perfectly, tested over millions of test vectors.

When compiled with optimization on, it would fail on a proportion of test vectors (about 0.1%).

I eventually worked out that this was due to integer overflow during the multiplication. But why the difference?

The multiply was performed in typical school-book long multiplication fashion (with some tweaking to take account of the fact that different coefficients had different radix.)

Through luck, rather more than judgement, the order in which my original code summed the coefficients tended to alternate sign - so that a positive product would then sum with a negative product, etc.

What I found with the optimized version was that various intermediate products were cached and reused. This changed the order of summing in certain places, and would occasionally cause an integer overflow.

I was using 32 bit signed ints, with 13 bit (max) coefficients. I thought that a 26 bit product would fit safely into a 32 bit int. But I had forgotten something - I needed an extra bit (for the radix correction) - OK, that's 27 bits - there's still usable 4 bits left.

The other thing I had forgotten was that with 20 coefficients, some of the coefficients required summing more than 16 intermediate products. In effect, these coefficients needed another 5 bits. And it was, indeed, exactly here that the overflow was occurring when the coefficients were summed in a non-prearranged order.

I fixed it by breaking the summing process in vulnerable "columns" into 2, and performing a carry step part way through the summing.
 

Snapshot1

Member
Dec 26, 2011
42
0
0
Although it is tempting to lay the blame for this failure mode on the optimization the fact is that if the process & data was vulnerable to overflow errors it was actually broken all along despite the good luck observed under debug and non-optimized testing.


Nice work figuring it out and a good lesson(re-)learned for anyone working with similar code.
 

Leros

Lifer
Jul 11, 2004
21,867
7
81
The only optimization bug I've found was pretty simple to figure out once I dug into the generated assembly. It was using an 8-bit operation on a 16-bit register.