Question x86 and ARM architectures comparison thread.

511 · Wednesday at 6:29 AM

GCC 13 vs GCC 16 + Clang 16 oof

poke01 · Wednesday at 6:48 AM

511 said:
GCC 13 vs GCC 16 + Clang 16 oof

Btw the compilers aren’t comparable. GCC 16 didn’t even exist when this was tested.

It should just say Xcode 16.1 or just Apple clang 16.1; I don’t know why Michael adds the misleading stuff.

511 · Wednesday at 6:53 AM

poke01 said:
Btw the compilers aren’t comparable. GCC 16 didn’t even exist when this was tested.

It should just say Xcode 16.1 or just Apple clang 16.1; I don’t know why Michael adds the misleading stuff.

Sure, but clang 16 is 2024 and GCC 13 is like 2022, so I doubt changes for Zen 5/ARL would be in GCC 13. He would have been better off testing with an LLVM-based compiler from around the same timeframe; that would have been a fairer comparison. And if he is using Apple's proprietary compiler, then Intel/AMD might as well use their own compiler toolchains.

poke01 · Wednesday at 7:32 AM

511 said:
Sure, but clang 16 is 2024 and GCC 13 is like 2022, so I doubt changes for Zen 5/ARL would be in GCC 13. He would have been better off testing with an LLVM-based compiler from around the same timeframe; that would have been a fairer comparison. And if he is using Apple's proprietary compiler, then Intel/AMD might as well use their own compiler toolchains.

Here is a newer GCC, 14.2; it’s faster than 13.2. Michael didn’t test FFmpeg compilation in this review, but we can compare LLVM compilation.

AMD Ryzen 9 9950X3D Delivers Excellent Performance For Linux Developers, Creators & Technical Computing Review - Phoronix

www.phoronix.com

Also these don’t look like wall readings to me.

And here is the M4 Max in a MacBook.

M4Max Benchmarks [2503281-NE-M4MAX949371] - OpenBenchmarking.org

openbenchmarking.org

What impresses me most is the perf/W on the M4 Max, which uses around 100 watts at the wall.

I’m going to test my own 9800X3D on LLVM and see the time difference.

511 · Wednesday at 7:43 AM

@poke01 can you run it using AOCC as well?

MS_AT · Wednesday at 7:44 AM

poke01 said:
It should just say Xcode 16.1 or just Apple clang 16.1; I don’t know why Michael adds the misleading stuff.

Because he cares about scale and automation rather than the fine details. That is why you will see all kinds of inconsistencies when you scrutinize the benchmarks from Phoronix.

I guess on macOS gcc is simply an alias for clang, for convenience.

511 said:
I doubt changes for Zen 5/ARL would be in GCC 13. He would have been better off testing with an LLVM-based compiler from around the same timeframe; that would have been a fairer comparison. And if he is using Apple's proprietary compiler, then Intel/AMD might as well use their own compiler toolchains.

Support for Zen 5 in mainline LLVM is still a joke. It's a copy-paste of the Zen 4 backend, which itself only recently got fixed and was a copy-paste of Zen 3. AMD is dropping the ball there.

poke01 said:
Michael didn’t test FFmpeg compilation in this review, but we can compare LLVM compilation.

poke01 said:
I’m going to test my own 9800X3D on LLVM and see the time difference.

Be sure to build the same target on both machines. By default each build targets the architecture it is running on, which triggers different code/data paths in the compiler.

Make sure the options match as well, and use the right compiler package for each platform.

[As in my table, there was a large difference between clang 18.1.8 builds depending on the package source.]
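A rough sketch of what pinning the target and options could look like for an LLVM build. The CMake variables are real CMake/LLVM options, but the helper name llvm_configure_cmd, the build directory name, and the argument order are made up for illustration, and this is not necessarily the configuration PTS itself uses.

```shell
# Hypothetical helper: print (not run) a CMake configure line for LLVM with
# the compiler, build type, and code-generation target stated explicitly.
# Arguments: C compiler, C++ compiler, llvm-project checkout, target list.
llvm_configure_cmd() {
    printf 'cmake -G Ninja -S %s/llvm -B build' "$3"
    printf ' -DCMAKE_BUILD_TYPE=Release'
    printf ' -DCMAKE_C_COMPILER=%s -DCMAKE_CXX_COMPILER=%s' "$1" "$2"
    printf ' -DLLVM_TARGETS_TO_BUILD=%s\n' "$4"
}

# Same target list on both machines, so the x86 box and the Mac build the same code.
llvm_configure_cmd clang clang++ llvm-project X86
```

Printing the line instead of running it makes it easy to diff the exact configuration used on each machine before comparing build times.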

poke01 · Wednesday at 7:46 AM

511 said:
@poke01 can you run it using AOCC as well?

Will do tomorrow. It’s almost midnight here. I need to sleep.

Covfefe · Wednesday at 9:21 AM

Nothingness said:
How did they measure power for the x86 machines? As I previously wrote, I only trust power at the wall, after all this is what the machines I run consume.

Software readings for everything. Here's the review.

The M4 showing was all the more impressive when looking at the CPU power consumption exposed by powermetrics compared to the Intel/AMD RAPL/PowerCap results on Linux.
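For context on what a RAPL/PowerCap software reading is: on Linux the powercap interface exposes a cumulative energy counter in microjoules, and average power is the counter difference divided by elapsed time. The sketch below assumes the package-0 zone lives at /sys/class/powercap/intel-rapl:0 (the path can differ per system, and reading it usually needs root); rapl_avg_watts is a hypothetical helper name. It reports only what the CPU package counter sees, which is why it is not a substitute for a wall measurement.

```shell
# Hypothetical helper: average power in watts from two energy counter samples.
# Arguments: start_uj end_uj max_range_uj elapsed_seconds
rapl_avg_watts() {
    awk -v s="$1" -v e="$2" -v max="$3" -v t="$4" 'BEGIN {
        d = e - s
        if (d < 0) d += max          # counter wrapped once between samples
        printf "%.1f\n", d / 1e6 / t # microjoules -> joules -> watts
    }'
}

# Live reading, only attempted if the assumed zone is readable.
zone=/sys/class/powercap/intel-rapl:0
if [ -r "$zone/energy_uj" ]; then
    s=$(cat "$zone/energy_uj"); sleep 2; e=$(cat "$zone/energy_uj")
    rapl_avg_watts "$s" "$e" "$(cat "$zone/max_energy_range_uj")" 2
fi
```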

Nothingness · Wednesday at 10:08 AM

MS_AT said:
I guess on macOS gcc is simply an alias for clang, for convenience.

Correct. If one wants a real gcc, it can be installed via Homebrew and used this way:

$ gcc-15 --version
gcc-15 (Homebrew GCC 15.2.0) 15.2.0
$ gcc --version
Apple clang version 17.0.0 (clang-1700.6.3.2)
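Since the command name alone does not tell you which compiler ran, a benchmark script could record the family from the version banner instead. A minimal sketch, assuming banners shaped like the two shown above; compiler_family is a hypothetical helper name, not part of any toolchain.

```shell
# Hypothetical helper: classify a compiler from the first line of its
# --version output rather than from the name it was invoked under.
compiler_family() {
    case "$1" in
        *clang*)     echo clang ;;   # e.g. "Apple clang version 17.0.0 ..."
        *GCC*|gcc*)  echo gcc ;;     # e.g. "gcc-15 (Homebrew GCC 15.2.0) 15.2.0"
        *)           echo unknown ;;
    esac
}

# Classify whatever "gcc" resolves to on this machine.
compiler_family "$(gcc --version 2>/dev/null | head -n 1)"
```

The clang pattern is checked first because Apple's gcc shim reports itself as clang.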

poke01 · 2026-01-16T03:48:41-0500

Tested on an undervolted AMD 9800X3D. Power consumption in btop was 95 W with a peak of 110 W. @MS_AT @511

Latest clang 21.1.6

Test4 Benchmarks [2601158-NE-TEST4834135] - OpenBenchmarking.org

openbenchmarking.org

Latest GCC 15.2.1

Test2 Benchmarks [2601151-NE-TEST2270825] - OpenBenchmarking.org

openbenchmarking.org

Oh, and AMD's compiler is pure dogwater. It's based on clang 17.0.6.
PTS crashed after it finished, so no web link, but I managed to get a pic of the XML file.

It took 870 seconds, double the other compilers.

511 · 2026-01-16T05:10:54-0500

@poke01 I know of a funny thing: you can use Intel's compiler and pass a generic flag 🤣 (like -O2 -march=x86-64-v3, or something), depending on what you passed to those compilers.

MS_AT · 2026-01-16T05:34:20-0500

poke01 said:
Oh, and AMD's compiler is pure dogwater. It's based on clang 17.0.6.

I would expect it to be the slowest. After all, it's supposed to run extra optimization passes so that the binary it produces is faster; I doubt anyone pays much attention to how fast the compiler itself runs.

poke01 said:
Latest clang 21.1.6

I guess this was the distribution's clang, not the official clang, since the latest clang is 21.1.8? I am not sure how Cachy is sourcing it, or whether they are building it from scratch. Also, do you know how I can read the exact command used to build from OpenBenchmarking? I wasn't able to find it.

poke01 said:
Latest GCC 15.2.1

That is a surprisingly good result. I wonder if Cachy is doing the sane thing and has ditched ld underneath in favour of lld. Or maybe it's the other way around and the build defaults to the system linker, which by default should be ld, so both are using the slow linker, hmm. I am too unfamiliar with Linux to be able to do more than guess.

511 said:
I know of a funny thing: you can use Intel's compiler and pass a generic flag 🤣 (like -O2 -march=x86-64-v3, or something), depending on what you passed to those compilers.

Well, you can run icx and tell it to compile for znver5 outright; it's LLVM-based after all.

[It's another matter altogether that AMD still has not managed to merge proper scheduler data into upstream LLVM.] https://godbolt.org/z/Ts6TxPef4

511 · 2026-01-16T05:36:21-0500

MS_AT said:
Well, you can run icx and tell it to compile for znver5 outright; it's LLVM-based after all. [It's another matter altogether that AMD still has not managed to merge proper scheduler data into upstream LLVM.] https://godbolt.org/z/Ts6TxPef4

I just don't know whether Intel still does the fancy AMD bottlenecking in their compilers; this might get rid of it.

poke01 · 2026-01-16T05:39:23-0500

MS_AT said:
how can I read from openbenchmarking the exact command used to build?

phoronix-test-suite install pts/build-llvm to install

phoronix-test-suite run pts/build-llvm-1.6.0 to run

And select 1 (Ninja) when it asks you to choose the build system.

MS_AT · 2026-01-16T05:42:45-0500

511 said:
I just don't know whether Intel still does the fancy AMD bottlenecking in their compilers; this might get rid of it.

Their own compiler, I mean icc, is deprecated. Their new compiler (icx) is a tuned LLVM with extra optimization passes for Intel hardware, as far as I understand. So it does not cripple AMD chips the way the old one used to. It just doesn't apply the extra passes, but the generic tunings apply to AMD chips too. Mystical is using icx to build y-cruncher for Zen, or at least was the last time I checked, and he found it the best available at the time for the purpose.

poke01 said:
phoronix-test-suite install pts/build-llvm to install

phoronix-test-suite run pts/build-llvm-1.6.0 to run

Ah, I guess I was not precise enough. I meant the exact cmake command used to run the build itself, but I think that can be inferred from the PTS sources.

Thanks anyway
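One way to look for that cmake line locally, as a sketch only: it assumes a per-user PTS install that keeps downloaded test profiles under ~/.phoronix-test-suite/test-profiles/ and that the profile's install.sh is where the build commands are written. Both are assumptions about the layout, and pts_install_script is a hypothetical helper name.

```shell
# Hypothetical helper: guess where PTS keeps a test profile's install script.
# Assumes a per-user install; the directory layout is an assumption.
pts_install_script() {
    printf '%s/.phoronix-test-suite/test-profiles/%s/install.sh\n' "$HOME" "$1"
}

script=$(pts_install_script pts/build-llvm-1.6.0)

# Show any cmake/ninja lines if the profile has been downloaded.
if [ -r "$script" ]; then
    grep -n -E 'cmake|ninja' "$script" || true
fi
```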

511 · 2026-01-16T06:02:14-0500

MS_AT said:
Their own compiler, I mean icc, is deprecated. Their new compiler (icx) is a tuned LLVM with extra optimization passes for Intel hardware, as far as I understand. So it does not cripple AMD chips the way the old one used to. It just doesn't apply the extra passes, but the generic tunings apply to AMD chips too. Mystical is using icx to build y-cruncher for Zen, or at least was the last time I checked, and he found it the best available at the time for the purpose.

ICX is arguably the best compiler for x86_64, so I am not really surprised here; I just was not sure about the AMD paths. Thanks for letting me know.
