AMD vs Intel Processors In Linux

Azuma Hazuki

Golden Member
Jun 18, 2012
1,532
866
131
So, Phoronix aside, I am wondering if Linux and benchmarks compiled on Linux show the same kind of utter rout of AMD CPUs by Intel as the standard binary ones do under Windows.

"Wintel monopoly" is a cliche for a reason, and we all remember when benchmark suites would check for the CPUID and cripple the chip if it didn't return GenuineIntel.

Specifically, would the Vishera and Kaveri CPUs gain anything noticeable if code were compiled specifically for them (as on Gentoo Linux), and how much? I have heard that the FX-8350 is excellent for compilation and gains significantly more performance with code compiled with the proper -march setting...
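On the CPUID point: the difference between gating a fast path on the vendor string and gating it on the features the chip actually reports can be sketched in a few lines of Python. This is an illustration only, not code from any real benchmark suite or compiler runtime; the function names, the path labels, and the choice of `avx` as the deciding feature are all hypothetical.

```python
# Hypothetical illustration: function names and path labels are invented here.

def pick_path_by_vendor(vendor: str, features: set[str]) -> str:
    """Vendor-string dispatch: the fast path is gated on the CPUID vendor ID."""
    if vendor == "GenuineIntel" and "avx" in features:
        return "avx"
    return "baseline"  # any other vendor falls back, whatever it supports


def pick_path_by_feature(vendor: str, features: set[str]) -> str:
    """Feature dispatch: only the advertised instruction sets matter."""
    return "avx" if "avx" in features else "baseline"


if __name__ == "__main__":
    amd = ("AuthenticAMD", {"sse2", "avx"})
    print(pick_path_by_vendor(*amd))   # vendor check ignores the AVX support
    print(pick_path_by_feature(*amd))  # feature check uses it
```

With the same feature set, the two functions disagree only when the vendor string is not `GenuineIntel`, which is exactly the behaviour being complained about.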
 

monstercameron

Diamond Member
Feb 12, 2013
3,818
1
0
It depends on the type of workload. Integer is where AMD is strong, but its floating-point performance is not very good, even compared to Phenom II. Any speedups would most likely be academic.
 

jhu

Lifer
Oct 10, 1999
11,918
9
81
It depends on the type of workload. Integer is where AMD is strong, but its floating-point performance is not very good, even compared to Phenom II. Any speedups would most likely be academic.

Not quite true about floating point. Mixed IPC (integer + float) for Piledriver has caught up to Phenom II, and Piledriver can be clocked much higher.

With regard to performance when recompiling on Linux, it depends on the compiler. Intel's compiler is best if you're able to utilize their MKL. Otherwise gcc isn't significantly slower than icc (from my testing it's about 0-4%; your workload may vary). LLVM is the slowest. I haven't been inclined to test others such as Open64 or the Portland Group compilers.
 

dealcorn

Senior member
May 28, 2011
247
4
76
Generally, Phoronix rates Intel's support for Linux as good even if Intel graphics are better supported in Windows. AMD's Linux support is generally not as well regarded. The Gentoo forums may provide better insights, but I would not over inflate your expectations. If you decide to check it out, do share your results.

The benefits of setting the correct -march compilation flag likely flow to all Linux distributions. Rather than testing on an unfamiliar distribution (for you), I suggest testing on whatever distribution you are most familiar with. In Debian, it is not hard to download a stock kernel and then set a preferred -march for compilation. Scrutinize your source reporting benefits from kernel tuning and try to identify other important flags. Your question is good. Try to start with the easiest testing procedure to get clear evidence whether you are on a good path.
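One easy way to start is a small timing harness. The sketch below is Python rather than anything distribution-specific: it runs a command several times, takes the median wall-clock time, and expresses the difference between two builds as a percentage. The helper names are invented for this example, and the binary names in the usage note are placeholders, not real files.

```python
import statistics
import subprocess
import time


def median_runtime(cmd: list[str], runs: int = 5) -> float:
    """Run cmd `runs` times and return the median wall-clock time in seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises if the benchmark exits with a non-zero status.
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def speedup_percent(baseline_s: float, tuned_s: float) -> float:
    """Percent reduction in runtime of the tuned build relative to the baseline."""
    return (baseline_s - tuned_s) / baseline_s * 100.0
```

For example, build the same source twice, once with generic flags and once with `-march=native`, then compare `median_runtime(["./bench_generic"])` against `median_runtime(["./bench_native"])` (both paths hypothetical). The median is used rather than the mean so that one slow outlier run does not skew the result.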
 

monstercameron

Diamond Member
Feb 12, 2013
3,818
1
0
Generally, Phoronix rates Intel's support for Linux as good even if Intel graphics are better supported in Windows. AMD's Linux support is generally not as well regarded. The Gentoo forums may provide better insights, but I would not over inflate your expectations. If you decide to check it out, do share your results.

*snip*

Don't conflate CPU and GPU drivers.
 

MiddleOfTheRoad

Golden Member
Aug 6, 2014
1,123
5
0
Generally, Phoronix rates Intel's support for Linux as good even if Intel graphics are better supported in Windows. AMD's Linux support is generally not as well regarded. The Gentoo forums may provide better insights, but I would not over inflate your expectations. If you decide to check it out, do share your results.

The benefits of setting the correct -march compilation flag likely flow to all Linux distributions. Rather than testing on an unfamiliar distribution (for you), I suggest testing on whatever distribution you are most familiar with. In Debian, it is not hard to download a stock kernel and then set a preferred -march for compilation. Scrutinize your source reporting benefits from kernel tuning and try to identify other important flags. Your question is good. Try to start with the easiest testing procedure to get clear evidence whether you are on a good path.

Yeah, Radeon drivers are pretty lousy under Linux, but we are talking about CPUs for the most part. As someone who runs both Intel and AMD CPUs, I can tell you that AMD processors are generally lame under Windows (but it seems that Microsoft is mostly to blame). The performance from Piledriver under Linux is pretty impressive for the retail price.

For example, my Intel i7 3770K @ 3.5 GHz benchmarks at 14000 integer and 2600 floating point under the World Community Grid. My AMD FX 8320 @ 3.5 GHz does 11000 integer and 2500 floating point (both machines running 64-bit Ubuntu). For years I had read that AMD chips can at best compete with an i3, but that is simply not true. It depends far more on the actual task, software and operating system. Under Linux, an FX can keep up with a comparable i7 for many tasks (and occasionally beat an Intel chip in some multi-threaded apps). When I ran this same World Community Grid benchmark under Windows 7 (64-bit) on my AMD FX-8320, the score dropped to 5900 integer and 1600 floating point. It just shows how poorly optimized Windows 7 is for the Bulldozer architecture (even after the Microsoft patch). The FX hardware is quite powerful, but a lot of software doesn't harness its full potential. Sad, but true.
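Putting those figures side by side as ratios makes the gap easier to read. This is just arithmetic on the scores quoted above, written as a short Python snippet; the dictionary layout and the `ratio` helper are illustrative names, not part of any benchmark tool.

```python
# Scores as quoted in the post above (World Community Grid benchmark).
scores = {
    "i7-3770K, Ubuntu": {"integer": 14000, "float": 2600},
    "FX-8320, Ubuntu": {"integer": 11000, "float": 2500},
    "FX-8320, Windows 7": {"integer": 5900, "float": 1600},
}


def ratio(a: str, b: str, kind: str) -> float:
    """Score of system a as a fraction of system b for one benchmark kind."""
    return scores[a][kind] / scores[b][kind]


if __name__ == "__main__":
    for kind in ("integer", "float"):
        fx_vs_i7 = ratio("FX-8320, Ubuntu", "i7-3770K, Ubuntu", kind)
        win_vs_linux = ratio("FX-8320, Windows 7", "FX-8320, Ubuntu", kind)
        print(f"{kind}: FX is {fx_vs_i7:.0%} of the i7 on Ubuntu; "
              f"Windows 7 score is {win_vs_linux:.0%} of the Ubuntu score")
```

On these numbers the FX-8320 reaches roughly 79% of the i7's integer score and 96% of its floating-point score under Ubuntu, while its Windows 7 scores are about 54% (integer) and 64% (floating point) of its own Ubuntu results.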
 

jhu

Lifer
Oct 10, 1999
11,918
9
81
Yeah, Radeon drivers are pretty lousy under Linux, but we are talking about CPUs for the most part. As someone who runs both Intel and AMD CPUs, I can tell you that AMD processors are generally lame under Windows (but it seems that Microsoft is mostly to blame). The performance from Piledriver under Linux is pretty impressive for the retail price.

For example, my Intel i7 3770K @ 3.5 GHz benchmarks at 14000 integer and 2600 floating point under the World Community Grid. My AMD FX 8320 @ 3.5 GHz does 11000 integer and 2500 floating point (both machines running 64-bit Ubuntu). For years I had read that AMD chips can at best compete with an i3, but that is simply not true. It depends far more on the actual task, software and operating system. Under Linux, an FX can keep up with a comparable i7 for many tasks (and occasionally beat an Intel chip in some multi-threaded apps). When I ran this same World Community Grid benchmark under Windows 7 (64-bit) on my AMD FX-8320, the score dropped to 5900 integer and 1600 floating point. It just shows how poorly optimized Windows 7 is for the Bulldozer architecture (even after the Microsoft patch). The FX hardware is quite powerful, but a lot of software doesn't harness its full potential. Sad, but true.

I think the issue has more to do with the compiler than Windows itself. MSVC is really pretty bad compared with Intel's compiler on Windows for anything computationally intensive. On the Linux side, gcc and icc are about comparable if Intel's MKL isn't being used. My testing reveals a Piledriver module = Sandy Bridge/Ivy Bridge core + HT when all threads are utilized and the source is available for recompiling.
 

Azuma Hazuki

Golden Member
Jun 18, 2012
1,532
866
131
My testing reveals a Piledriver module = Sandy Bridge/Ivy Bridge core + HT when all threads are utilized and the source is available for recompiling.

Thanks! THIS is the kind of information I was looking for :)

Though, doesn't this mean all the reviews saying Vishera's IPC is at below-Nehalem levels are actually wrong? SB was a rather big leap over Nehalem...
 

jhu

Lifer
Oct 10, 1999
11,918
9
81
Thanks! THIS is the kind of information I was looking for :)

Though, doesn't this mean all the reviews saying Vishera's IPC is at below-Nehalem levels are actually wrong? SB was a rather big leap over Nehalem...

Yes and no. Take a look at the link I provided, which is the most ideal setup possible: recompile of source to target architecture with the best compiler for the system. Under these conditions, a Piledriver module = Sandy Bridge/Ivy Bridge + HT at the same clock speed. But look at the single thread for Piledriver - it's still slower than Core 2.

Under normal conditions though, which is what most of the rest of us deal with (ie, precompiled and/or closed-source software), performance will be lower than peak theoretical (for both AMD and Intel processors) because, on the Windows side, a lot of software is compiled with MSVC, which isn't that fast. On the Linux side, they're compiled with gcc and generic flags - as I recall, this defaults to K8 tuning, which is slow for non-K8 processors (ie, almost anything new now).
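To see what a generic build may leave unused, it helps to check which instruction-set extensions the CPU actually advertises. The Python sketch below parses the `flags` line from Linux's `/proc/cpuinfo`. The list of extensions it looks for is an illustrative selection, and the helper names are invented for this example.

```python
from pathlib import Path

# Illustrative selection of extensions that a target-specific build might use.
INTERESTING = ("sse4_2", "avx", "avx2", "fma", "fma4", "xop")


def parse_flags(cpuinfo_text: str) -> set[str]:
    """Return the feature flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()  # no 'flags' line found (e.g. a non-x86 system)


def supported(cpuinfo_text: str) -> list[str]:
    """Subset of INTERESTING that the CPU reports, in a fixed order."""
    flags = parse_flags(cpuinfo_text)
    return [name for name in INTERESTING if name in flags]


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")  # present on Linux; other platforms differ
    if path.exists():
        print(supported(path.read_text()))
```

With gcc, `-march=native` picks the target from the build machine, and `bdver2` is gcc's name for Piledriver; which of the reported extensions a given build actually benefits from still has to be measured.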
 

MiddleOfTheRoad

Golden Member
Aug 6, 2014
1,123
5
0
Thanks! THIS is the kind of information I was looking for :)

Though, doesn't this mean all the reviews saying Vishera's IPC is at below-Nehalem levels are actually wrong? SB was a rather big leap over Nehalem...

That may be true under Windows benchmarks.

I know Vishera cores blow away Nehalem processors under Linux. My Vishera (FX 8320) and Ivy Bridge (i7 3770K), both clocked at 3.5 GHz, score within 100 points of each other in floating-point performance (Whetstone) under Ubuntu. Integer performance (Dhrystone) lagged a little on the AMD, though: 11k versus 14k. But that is still much closer than most people would expect. And yes, Vishera is usually right there with Sandy/Ivy in multi-threaded stuff.
 

Azuma Hazuki

Golden Member
Jun 18, 2012
1,532
866
131
That may be true under Windows benchmarks.

I know Vishera cores blow away Nehalem processors under Linux. My Vishera (FX 8320) and Ivy Bridge (i7 3770K), both clocked at 3.5 GHz, score within 100 points of each other in floating-point performance (Whetstone) under Ubuntu. Integer performance (Dhrystone) lagged a little on the AMD, though: 11k versus 14k. But that is still much closer than most people would expect. And yes, Vishera is usually right there with Sandy/Ivy in multi-threaded stuff.

Now, is compilation mostly integer or mostly FP? Because from the sound of it, this sounds like "properly optimized code runs at about Nehalem speeds for integer and about Gescher/Ivybridge speeds for FP." If that's the case, for compilation this is about equivalent to an i7 970 then?
 

jhu

Lifer
Oct 10, 1999
11,918
9
81
Now, is compilation mostly integer or mostly FP? Because from the sound of it, this sounds like "properly optimized code runs at about Nehalem speeds for integer and about Gescher/Ivybridge speeds for FP." If that's the case, for compilation this is about equivalent to an i7 970 then?

Compiling is going to be all integer. Can't think of a situation where compiling would touch the FPU.
 

TerryMathews

Lifer
Oct 9, 1999
11,464
2
0
Compiling is going to be all integer. Can't think of a situation where compiling would touch the FPU.
It wouldn't, unless the code called for the compiler to fold a floating-point expression into a constant at compile time.

For instance, having a value like pi precomputed at compile time instead of calculating it at run time. Still, effectively no effect.
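This kind of compile-time evaluation is usually called constant folding. As a toy illustration, the Python sketch below folds constant arithmetic in an expression tree using the standard `ast` module. It is not how gcc or any other C compiler is implemented; it only shows that the folding step itself performs floating-point arithmetic while "compiling". `ConstantFolder` and `fold` are names invented for this example.

```python
import ast
import operator

# Binary operators this toy folder knows how to evaluate.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


class ConstantFolder(ast.NodeTransformer):
    """Replace `constant <op> constant` with the computed constant."""

    def visit_BinOp(self, node: ast.BinOp) -> ast.AST:
        self.generic_visit(node)  # fold the operands first
        op = _OPS.get(type(node.op))
        left, right = node.left, node.right
        if (op is not None
                and isinstance(left, ast.Constant)
                and isinstance(right, ast.Constant)
                and isinstance(left.value, (int, float))
                and isinstance(right.value, (int, float))):
            try:
                value = op(left.value, right.value)
            except ZeroDivisionError:
                return node  # leave the error for run time
            return ast.copy_location(ast.Constant(value=value), node)
        return node


def fold(expr: str) -> str:
    """Parse an expression, fold its constants, and return the new source."""
    tree = ConstantFolder().visit(ast.parse(expr, mode="eval"))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree.body)  # ast.unparse needs Python 3.9+


if __name__ == "__main__":
    print(fold("2 * 3.5 * r"))
```

Here `2 * 3.5` is multiplied once by the folder, so the expression that remains only multiplies by `r`. That one-off multiplication is the only floating-point work the "compiler" does, which is why it has effectively no effect on compile times.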
 

DrMrLordX

Lifer
Apr 27, 2000
22,700
12,651
136
That's curious. I had thought that the shared FP units made Piledriver weaker in overall fp throughput than in integer.
 

TerryMathews

Lifer
Oct 9, 1999
11,464
2
0
Now, is compilation mostly integer or mostly FP? Because from the sound of it, this sounds like "properly optimized code runs at about Nehalem speeds for integer and about Gescher/Ivybridge speeds for FP." If that's the case, for compilation this is about equivalent to an i7 970 then?
I'm assuming your quote is talking about compiled code and not the act of compilation.
 

MiddleOfTheRoad

Golden Member
Aug 6, 2014
1,123
5
0
That's curious. I had thought that the shared FP units made Piledriver weaker in overall fp throughput than in integer.

It is very unusual -- which is why the BOINC benchmarks under Ubuntu surprised me. The previous reviews that I had read would have indicated the exact opposite -- that FX and Sandy would be close for integer performance, but farther apart in floating point.

I had always assumed that the Windows 7 benchmarks that I had used in the past were an accurate reflection of the AMD FX's slower/shared cores -- I only stumbled onto the better performance when I started tinkering with Linux. I'm curious how the FX performs under Windows 8 as well -- but I don't actually own a copy of that OS. I've heard Windows 8 performs well with AMD's APUs.
 

TeknoBug

Platinum Member
Oct 2, 2013
2,084
31
91
That may be true under Windows benchmarks.

I know Vishera cores blow away Nehalem processors under Linux. My Vishera (FX 8320) and Ivy Bridge (i7 3770K), both clocked at 3.5 GHz, score within 100 points of each other in floating-point performance (Whetstone) under Ubuntu. Integer performance (Dhrystone) lagged a little on the AMD, though: 11k versus 14k. But that is still much closer than most people would expect. And yes, Vishera is usually right there with Sandy/Ivy in multi-threaded stuff.

Funny... I have a 3770K and used to have an 8320 (gave it away and threw in a 4350) and in real world use (not benchmarking) the 8320 felt sluggish compared to the 3770K and I'm also a Linux user. Clock for clock, Intel's going to be ahead of AMD in a lot of ways.
 

Azuma Hazuki

Golden Member
Jun 18, 2012
1,532
866
131
Funny... I have a 3770K and used to have an 8320 (gave it away and threw in a 4350) and in real world use (not benchmarking) the 8320 felt sluggish compared to the 3770K and I'm also a Linux user. Clock for clock, Intel's going to be ahead of AMD in a lot of ways.

Could it be the system RAM? I know the APU series is very, very sensitive to memory bandwidth, whereas Ivy will happily run on DDR3-1333. Is the FX line as sensitive?
 

TerryMathews

Lifer
Oct 9, 1999
11,464
2
0
It is very unusual -- which is why the BOINC benchmarks under Ubuntu surprised me. The previous reviews that I had read would have indicated the exact opposite -- that FX and Sandy would be close for integer performance, but farther apart in floating point.

I had always assumed that the Windows 7 benchmarks that I had used in the past were an accurate reflection of the AMD FX's slower/shared cores -- I only stumbled onto the better performance when I started tinkering with Linux. I'm curious how the FX performs under Windows 8 as well -- but I don't actually own a copy of that OS. I've heard Windows 8 performs well with AMD's APUs.
My best guess is Visual Studio is failing to properly optimize for AMD in the same way ICC does. I know BOINC is compiled with VS. http://boinc.berkeley.edu/trac/wiki/CompileAppWin

I'm sure on Linux it's compiled with GCC, which is above marketing-driven optimization.

For further reading google "ICC cripple amd"
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
My best guess is Visual Studio is failing to properly optimize for AMD in the same way ICC does. I know BOINC is compiled with VS. http://boinc.berkeley.edu/trac/wiki/CompileAppWin

I'm sure on Linux it's compiled with GCC, which is above marketing-driven optimization.

For further reading google "ICC cripple amd"

And get results like this?

compiler.png


That really looks crippled.
 

Hans de Vries

Senior member
May 2, 2008
347
1,177
136
www.chip-architect.com
And get results like this?

compiler.png


That really looks crippled.

OMG. This really is a devious benchmarketing lie....

It compares partly multi-threaded code against single-threaded code. So what a wonder, multiple AMD cores are faster than a single AMD core.

The Open64 compiler generates executables that are on average 50% to 60% faster than those from Intel's ICC 12 on the same code. Yes, that really looks crippled!

SPEC_FP_2006_rate.jpg


Keep forgetting this?
 

mavere

Member
Mar 2, 2005
190
4
81
Relative runtime results (lower is better) on a Bulldozer-based Cray supercomputer at NERSC.

Fortran:
uHX9mhg.jpg


C++:
SNDlwWr.png


Honestly, Intel no longer has any reasons to give AMD a free PR campaign by explicitly crippling performance. An Intel manager doesn't even need to be particularly clever to realize that Intel would lose more from institutions/researchers abandoning x86-only ICC for multi-platform GCC than any high-performance product AMD could release in the medium-run.
 

TerryMathews

Lifer
Oct 9, 1999
11,464
2
0
Relative runtime results (lower is better) on a Bulldozer-based Cray supercomputer at NERSC.

Fortran:
uHX9mhg.jpg


C++:
SNDlwWr.png


Honestly, Intel no longer has any reasons to give AMD a free PR campaign by explicitly crippling performance. An Intel manager doesn't even need to be particularly clever to realize that Intel would lose more from institutions/researchers abandoning x86-only ICC for multi-platform GCC than any high-performance product AMD could release in the medium-run.
Weird how GCC gives the best, or nearly the best, performance on all of those tests, in some instances by a wide margin. Called it.