Mullins vs Tegra K1, Benchmark scores

Arkadrel

Diamond Member
I was reading the thread on S/A and someone commented that the Mullins 3DMark scores looked really good, considering that they beat the Tegra K1 scores.

This got me thinking: I've heard people say that the K1 will be ~5 W (under GPGPU load / heavy GPU load), and that Mullins has a 4.5 W TDP (2.8 W SDP).

Is Mullins actually more energy efficient than the K1 under GPU loads?


The pics:


Nvidia Tegra K1:

TH-lenovotegraK1-1.jpg




A10 Micro-6700T:

3dmark.png





Tegra K1 = 22,285 score
A10 Micro-6700T = 24,775 score
 
Is 3DMark Ice Storm running really badly on Android? Is it the Nvidia drivers?

A K1 at 11 W should destroy any x86 chip at 4.5 W, shouldn't it?

It just looks weird.

Maybe it's Android's fault though, or the fault of the people who coded Ice Storm for Android.
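
For anyone who wants to put numbers on that, here is a rough sketch of the arithmetic using only the two scores posted above and the power figures floated in this thread (~5 W and 11 W for the K1, 4.5 W TDP for Mullins). None of those wattages is a measurement, and the function names are made up for illustration, so treat the result as a sanity check and nothing more.

```python
# 3DMark Ice Storm scores as quoted in the first post.
SCORES = {"Tegra K1": 22_285, "A10 Micro-6700T": 24_775}

# Power figures mentioned in the thread. These are guesses and
# vendor numbers, NOT measured consumption during the benchmark.
POWER_GUESSES_W = {"Tegra K1": [5.0, 11.0], "A10 Micro-6700T": [4.5]}


def lead_percent(score_a: float, score_b: float) -> float:
    """How far score_a is ahead of score_b, in percent."""
    return (score_a / score_b - 1.0) * 100.0


def points_per_watt(score: float, watts: float) -> float:
    """Benchmark points per watt for an assumed power draw."""
    return score / watts


if __name__ == "__main__":
    lead = lead_percent(SCORES["A10 Micro-6700T"], SCORES["Tegra K1"])
    print(f"Mullins lead over K1: {lead:.1f}%")
    for chip, guesses in POWER_GUESSES_W.items():
        for watts in guesses:
            ppw = points_per_watt(SCORES[chip], watts)
            print(f"{chip} @ {watts} W (assumed): {ppw:.0f} points/W")
```

The score gap itself is small, so the points-per-watt comparison is decided almost entirely by which wattage you plug in for the K1.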
 
Arkadrel, Nvidia is known to overpromise and underdeliver in mobile SoCs. Mullins is a very good chip. It's going to be difficult to beat Mullins at the same TDP.

Jaguar is a faster core than Silvermont, which in turn is faster than A15. The 128 GCN cores clocked at 500 MHz provide an impressive 128 GFLOPS. Let's see what the K1 GPU clocks at in a sub-5 W envelope.
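
The 128 GFLOPS figure follows from the usual peak-throughput formula: cores × FLOPs per core per clock × clock. A minimal sketch, assuming one single-precision fused multiply-add (2 FLOPs) per core per clock for both GPUs; the helper names are hypothetical, and the 192-core count for the K1 is the one quoted elsewhere in this thread.

```python
def peak_gflops(shader_cores: int, clock_ghz: float,
                flops_per_core_per_clock: int = 2) -> float:
    """Theoretical single-precision peak, assuming one FMA per core per clock."""
    return shader_cores * flops_per_core_per_clock * clock_ghz


def matching_clock_ghz(target_gflops: float, shader_cores: int,
                       flops_per_core_per_clock: int = 2) -> float:
    """Clock a GPU would need to reach target_gflops on paper."""
    return target_gflops / (shader_cores * flops_per_core_per_clock)


if __name__ == "__main__":
    mullins = peak_gflops(128, 0.5)  # 128 GCN cores at 500 MHz
    print(f"Mullins peak: {mullins:.0f} GFLOPS")

    # Clock at which 192 CUDA cores would match that figure on paper.
    k1_clock = matching_clock_ghz(mullins, 192)
    print(f"K1 clock to match: {k1_clock * 1000:.0f} MHz")
```

This is paper throughput only; it says nothing about what either chip sustains inside a given power envelope.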
 
The Tegra was on early drivers.

Something seems off. That's pretty obvious given that the Atom 3740 does as well as the S800 and T4.
 
The Tegra K1 GPU should in general have superior performance and superior perf. per watt compared to the high-end Mullins GPU. Remember that Tegra K1 has 192 CUDA cores running at a much higher GPU clock than Mullins. In a large tablet it will likely have close to 300 GFLOPS of throughput. Obviously the Futuremark test is limited by something (bandwidth?), because TK1 is only 45% faster than T4. Anyway, to accurately compare GPU perf. per watt, one needs to measure the power consumed specifically by the GPU in the SoC while running a given app.
 
Phoronix got their Jetson K1 board. They are doing benchmarks right now.
So we will have numbers from Tegra K1 in a few hours or days.

However, Tegra K1 achieves 40 GFLOPS/W while Mullins is only capable of 14 GFLOPS/W.
 
You realize that if those numbers were true, Nvidia's K6000 (2880 CUDA cores) would have triple the FLOPS of a W9100 (2816 GCN cores)?

Tegra K1 uses Kepler.M, K6000 the normal Kepler version. Kepler.M is optimized for the highest performance under 10 W.
Furthermore, K6000 uses GK110, which is based on a chip architecture finished at the end of 2011 instead of mid-2013.

K6000 achieves 23 GFLOPS/W, K1 40 GFLOPS/W.
 
I'm quite skeptical of this number, given that bigger devices benefit from some scaling savings and will outperform smaller devices when the whole platform is taken into account. Here it's exactly the contrary, while being on the same 28nm process. We'll have to wait for independent reviews, as Nvidia's numbers seem quite exaggerated to me.
 
How on earth do you get to that number? :hmm: From the slide above, K1 has a theoretical max throughput of 326 GFLOPS at 11 W (if we can even assume 11 W is the TDP; it is simply listed as consumption when "working hard"). So the max cannot be higher than 30 GFLOPS/W. But even on a high-throughput scientific workload like SGEMM, it cannot achieve more than 290 GFLOPS, so we're down to ~26 SGEMM GFLOPS/W. And in the kind of workloads that you would actually run on a tablet SoC, you can be pretty sure you won't get SGEMM levels of throughput.

Anyway, it's daft to even guess at this point, as we don't have independently measured power consumption figures from either the K1 or Mullins.
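
The bounds in this post are just throughput divided by power. A quick sketch, taking the slide's 11 W as the power figure (the slide says "<11W", so this is an assumption that gives a lower bound on efficiency) and using the 326 GFLOPS peak and 290 GFLOPS SGEMM numbers quoted above; the function name is illustrative only.

```python
# Figures quoted in the post above; the power figure is the slide's
# "<11W (working hard)" taken at face value, not a measurement.
POWER_W = 11.0
PEAK_GFLOPS = 326.0    # theoretical maximum throughput
SGEMM_GFLOPS = 290.0   # best case on a real workload


def gflops_per_watt(gflops: float, watts: float) -> float:
    """Throughput per watt; rejects non-positive power."""
    if watts <= 0:
        raise ValueError("watts must be positive")
    return gflops / watts


if __name__ == "__main__":
    print(f"Peak:  {gflops_per_watt(PEAK_GFLOPS, POWER_W):.1f} GFLOPS/W")
    print(f"SGEMM: {gflops_per_watt(SGEMM_GFLOPS, POWER_W):.1f} GFLOPS/W")
```

Both results stay under 30 GFLOPS/W as long as the full 11 W is the denominator; a higher figure needs a smaller power number.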
 
Next time read the text, too.
It's <11W and 26GFLOPS SGEMM/W for the whole board (we went over it in the other Tegra K1 thread).
It's ~8W for the SoC alone.
 
Um. No. Let's read the text, shall we?

nvidia said:
GPU + CPU: <11W (working hard)

It is pretty clear that 11W refers to the SoC, not the entire board.

And the other bit of text at the bottom of the slide seems pretty relevant too: ~26 SP GFLOPs/W. Claiming 40 is patently absurd.
 
Comparing GFLOPS between different architectures, designs, etc. doesn't translate into real-world performance, power and/or energy efficiency.

Also, Mullins was tested inside a fanless tablet; where was K1 tested?
 
The tablet market will get interesting, with lots of decent choices, and ARM will also have to up its game, I think. A same-OS comparison is needed though; AMD needs to work on Android, and fast.
 
Um. No. Let's read the text, shall we?
It is pretty clear that 11W refers to the SoC, not the entire board.

The text from the twitter guy you got the slide from:
Steve Oberlin NVIDIA Jetson (28nm) power efficiency. Approx. 26GF/W SFP including GPU/CPU, 2GB memory, other I/O

NVIDIA&#12398;Steve Oberlin&#12364;Tegra K1(28nm)&#12505;&#12540;&#12473;&#12398;Jetson&#12508;&#12540;&#12489;&#12398;&#24615;&#33021;&#38651;&#21147;&#21177;&#29575;&#12398;&#30330;&#34920;&#12290;SGEMM 26GF/W&#12384;&#12364;&#12289;GPU/CPU&#12384;&#12369;&#12391;&#12394;&#12367;&#12289;2GB&#12513;&#12514;&#12522;(2-3W)&#12289;HDMI&#12420;&#12469;&#12454;&#12531;&#12489;&#31561;I/O&#31995;&#12434;&#21547;&#12416;

Google:
Steve Oberlin of NVIDIA's presentation of performance power efficiency of Jetson board of Tegra K1 (28nm)-based. But SGEMM 26GF / W, as well as GPU / CPU, including 2GB memory (2-3W), sound, etc. I / O system and HDMI

And the other bit of text at the bottom of the slide seems pretty relevant too: ~26 SP GFLOPs/W. Claiming 40 is patently absurd.

Only if you have no clue. :\
 
Seems like I was almost spot on with my early 2014 prediction. Again, if NVIDIA's claims hold true (60 FPS @ GFXBench 2.7) then K1 should have a comfortable lead.

MullinsChart-7.jpg


Fun fact: Mullins (AMD's next x86 tablet chip) scores 570 @ 3DMark11 according to AMD; that's a slightly lower score than Kabini A4-5000 (~590 pts). If Tegra K1 really does 60 FPS @ GFXBench 2.7 then it will outclass AMD's solution big time in the graphics department. Mullins should score around 35-36 FPS based on AnandTech's Kabini results.
 
So you're going off the word of an attendee of the talk, whose first language is not English, after it has been mangled by Google Translate, instead of what is actually written on the slide? Right then.
 
Looks interesting, but with all things nVidia you really need to wait for independent benchmarks, from reliable sources. Their track record isn't all that good in the SOC department.
 
The guy attended the session and even wrote in English that the 11W is for the whole board. :awe:
 
And said guy also claims that the 2GB of memory uses 2-3W when it's more like 1W. Which puts the figure at 10W for the SoC.
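
To see how much this disagreement moves the efficiency number, here is a rough sketch that subtracts each memory-power estimate from the thread (1 W here, 2-3 W in the tweet) from the 11 W figure and recomputes GFLOPS/W. It assumes 11 W is the whole-board number and ignores the other I/O the tweet mentions, so the SoC figures are upper bounds on power under that reading. All names are hypothetical.

```python
BOARD_POWER_W = 11.0   # slide's "<11W", read here as whole-board power
PEAK_GFLOPS = 326.0    # theoretical peak quoted earlier in the thread
SGEMM_GFLOPS = 290.0   # SGEMM throughput quoted earlier in the thread


def soc_power_w(board_w: float, memory_w: float) -> float:
    """SoC power left after subtracting the memory estimate (other I/O ignored)."""
    return board_w - memory_w


def efficiency_rows(memory_guesses_w):
    """Yield (memory W, SoC W, peak GFLOPS/W, SGEMM GFLOPS/W) per guess."""
    for mem_w in memory_guesses_w:
        soc_w = soc_power_w(BOARD_POWER_W, mem_w)
        yield mem_w, soc_w, PEAK_GFLOPS / soc_w, SGEMM_GFLOPS / soc_w


if __name__ == "__main__":
    for mem_w, soc_w, peak_eff, sgemm_eff in efficiency_rows([1.0, 2.0, 3.0]):
        print(f"memory {mem_w:.0f} W -> SoC {soc_w:.0f} W: "
              f"peak {peak_eff:.1f}, SGEMM {sgemm_eff:.1f} GFLOPS/W")
```

Under these assumptions a figure near 40 GFLOPS/W only shows up when the theoretical peak is divided by the lowest SoC estimate (8 W). That is one possible reading of where the number comes from, not something anyone in the thread confirmed.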
 
http://www.phoronix.com/scan.php?page=news_item&px=MTY3Nzg
Linux benchmarks^

http://openbenchmarking.org/result/1401068-MAKS-MAKSONL39
A4-5000 for comparison^

http://www.phoronix.com/scan.php?page=article&item=amd_athlon_sempron&num=4
Desktop Kabinis^

kU6fjY5.png

This benchmark either isn't optimized for ARM and TK1 or has a bug.

ENI7wwL.png

Better comparison.

If SciMark is multithreaded then TK1 has the edge over the Micro APUs [see, the branding is good]; otherwise the Micro APU will turbo beyond even the 5350 and thus be faster.
 