GPU and CPU calculation results not identical (OpenCL issue)

boren

When I turn on OpenCL in DxO 8.1.5 (image processing application) the resulting files are slightly different than when OpenCL is turned off. The files are binary different, and I can even spot a few differences if I increase magnification significantly.

Is there an application I could use to validate that the OpenCL implementation of my card (AMD HD 7850 2GB) is correct and accurate? If it isn't, then I guess AMD is to blame and I'll contact their support and report this issue. If the results are accurate, then I report it to DxO support.
 

Atreidin

I'm unclear on why it would necessarily be AMD's fault; they didn't write that program.
 

boren

I didn't say it's necessarily AMD's fault. It's one of two possibilities:

Is there an application I could use to validate that the OpenCL implementation of my card (AMD HD 7850 2GB) is correct and accurate? If it isn't, then I guess AMD is to blame and I'll contact their support and report this issue. If the results are accurate, then I report it to DxO support.
The reason why I want to test it with a separate tool is to identify which possibility is the right one.
 

Ajay

It may be a case where the GPU renderer is using single-precision floating point for speed, whereas the CPU is using double precision. The real question is: 'Is the post-processed image accurate enough for your purposes?'
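
To illustrate the precision point, here is a minimal Python sketch. It is not DxO's code, and the helper names `to_float32` and `accumulate` are made up for this example. It adds the same value repeatedly, once in double precision and once with every intermediate result rounded to single precision, and the two totals come out slightly different.

```python
import struct

def to_float32(x: float) -> float:
    """Round a Python float (64-bit) to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]

def accumulate(values, single_precision: bool) -> float:
    """Sum values, optionally rounding every step to 32 bits."""
    total = 0.0
    for v in values:
        if single_precision:
            total = to_float32(total + to_float32(v))
        else:
            total += v
    return total

values = [0.1] * 10_000
double_total = accumulate(values, single_precision=False)
single_total = accumulate(values, single_precision=True)

# Both totals are close to 1000, but they are not the same number.
print(double_total, single_total, abs(double_total - single_total))
```

Neither total is "wrong"; the single-precision run simply carries fewer digits through each step, which is the kind of small discrepancy that can show up as slightly different pixels.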
 

ViRGE

It may be a case where the GPU renderer is using single-precision floating point for speed, whereas the CPU is using double precision. The real question is: 'Is the post-processed image accurate enough for your purposes?'
Indeed. DxO wouldn't be the first or the last program to take a different codepath when using the GPU. Sometimes it's a precision thing, other times it's just the difference between an algorithm that works well in serial on a CPU, and a similar algorithm that's better suited for wide execution on a GPU.
 

Phynaz

It's not uncommon for different processors to give different results.

That is why in applications such as distributed computing, where the same workset is sent to multiple machines, the work must be sent to machines of the same architecture.

Read the docs at wcgrid.org for more details.

Short version - there's not a problem.
 

Rakehellion

Floating point calculations are expected to give different results on different architectures. Use integer calculations if you want files that are bit identical.
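
A rough Python sketch of the integer suggestion, assuming a hypothetical 16.16 fixed-point scale (`SCALE`, `to_fixed` and `running_sum` are names invented for this example, not anything DxO uses). Once the values are quantized to integers, the order in which they are added no longer matters, whereas the float total can depend on the order.

```python
import random

SCALE = 1 << 16  # assumed 16.16 fixed-point scale, chosen for illustration

def to_fixed(x: float) -> int:
    """Quantize a float to a whole number of 1/SCALE steps."""
    return round(x * SCALE)

def running_sum(values, start):
    """Plain left-to-right accumulation."""
    total = start
    for v in values:
        total += v
    return total

rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [rng.uniform(0.0, 255.0) for _ in range(1000)]
fixed = [to_fixed(s) for s in samples]

forward_int = running_sum(fixed, 0)
backward_int = running_sum(reversed(fixed), 0)
forward_float = running_sum(samples, 0.0)
backward_float = running_sum(reversed(samples), 0.0)

# Integer addition is exact, so both orders agree.
print(forward_int == backward_int)
# Float addition rounds at every step, so the two orders may disagree.
print(forward_float == backward_float)
```

The price is the quantization step itself: the integers only approximate the original values, so this buys reproducibility rather than extra accuracy.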
 

BrightCandle

Floating point calculations are expected to give different results on different architectures. Use integer calculations if you want files that are bit identical.

That is why we have a floating-point standard (IEEE 754) that is meant to make the calculations come out the same. But you need to enable strict floating point in the code path to ensure it is used, and that has a performance cost.

I think it's more likely in this case that it's not the same algorithm; it's a completely different algorithm designed to work on the GPU. OpenCL does not allow a developer to reuse existing code to make their GPU program: you have to write in a special form of C where the iteration is taken out of your hands and put into the API. It's very likely that this change is responsible, as the developers can no longer use their previous code and have had to create a separate path.
 

DaveSimmons

It might be fastest to contact the authors of DxO.

This could be:
- bugs in the OpenCL driver
- bugs in DxO
- floating-point differences (for example with an x86 CPU you might be using 4-, 8-, or 10-byte floats)
- algorithm differences. Even with the same floats, the way results are created could propagate or magnify floating errors differently.
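
As a small illustration of the last point, here is a Python sketch (my own toy example, not anything taken from DxO) in which two summation algorithms are fed exactly the same double-precision inputs and still return different answers. `naive_sum` is a hypothetical helper; `math.fsum` is the standard library's exactly rounded summation.

```python
import math

def naive_sum(values):
    """Left-to-right accumulation, rounding after every addition."""
    total = 0.0
    for v in values:
        total += v
    return total

values = [1e16, 1.0, -1e16]

naive = naive_sum(values)        # the 1.0 is lost when added to 1e16
compensated = math.fsum(values)  # tracks the lost low-order bits

print(naive, compensated)  # prints "0.0 1.0"
```

Same floats, same inputs, different algorithm, different result.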
 

Schmide

This is the nature of DCT (discrete cosine transform) compression. It is lossy! If you parallelize the operation, the same work gets done in a different order.
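
A tiny Python sketch of why order matters (just an illustration with made-up numbers, not the actual DCT code): floating-point addition is not associative, so grouping the same three terms differently changes the last bit of the result.

```python
a, b, c = 0.1, 0.2, 0.3

serial_order = (a + b) + c     # left to right, as a simple loop would do it
regrouped_order = a + (b + c)  # same terms, grouped the way a split-up job might

print(serial_order == regrouped_order)  # prints "False"
print(repr(serial_order), repr(regrouped_order))
```

A parallel reduction effectively picks its own grouping, so a last-bit difference like this can end up in the output even when every individual operation is correct.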