Kaby-G NUC Review

Bouowmx

Golden Member
Nov 13, 2016
1,138
550
146
I like the banner-style results.

There's a lot of potential for undervolting. The CPU runs at 1.2 V for 3.9-4.2 GHz, and who knows what auto voltage gives at 4.3 GHz, or what the GPU could do.
 

Glo.

Diamond Member
Apr 25, 2015
5,707
4,551
136
That is amazing: that is about 80% of the performance of an RX 580,
while having 66% of the RX 580's GCN core count and 80% of its core clock, but twice the number of ROPs.

Which also proves that ROPs are the most important thing for gaming performance.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
Glo. said:
Which also proves that ROPs are the most important thing for gaming performance.

Maybe, maybe not.

RX 580 GFlops/GBytes per second = 24.1
RX Vega M GH GFlops/GBytes per second = 17.9

RX 580 has 35% more shader firepower per GB of memory bandwidth. That's in addition to having more ROPs. Those will skew the relative Flop/perf comparison in favor of the RX Vega M GH, but RX 580 is still faster.
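
For anyone who wants to reproduce those ratios, here is a minimal sketch. The shader counts, boost clocks and bandwidth figures are assumed from public spec sheets rather than taken from this thread (only the Vega M GH's 1536 cores at 1.19 GHz is mentioned here), and peak FP32 is taken as 2 FLOPs per shader per clock.

```python
# Assumed spec-sheet figures (not from this thread):
# shader count, boost clock in GHz, memory bandwidth in GB/s.
SPECS = {
    "RX 580": {"shaders": 2304, "clock_ghz": 1.340, "bandwidth_gbs": 256.0},
    "RX Vega M GH": {"shaders": 1536, "clock_ghz": 1.190, "bandwidth_gbs": 204.8},
}


def gflops(shaders, clock_ghz):
    # Peak FP32 rate: 2 FLOPs (one fused multiply-add) per shader per clock.
    return 2 * shaders * clock_ghz


def gflops_per_gbs(spec):
    # Compute throughput available per unit of memory bandwidth.
    return gflops(spec["shaders"], spec["clock_ghz"]) / spec["bandwidth_gbs"]


ratios = {name: gflops_per_gbs(spec) for name, spec in SPECS.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f} GFLOPS per GB/s")

relative = ratios["RX 580"] / ratios["RX Vega M GH"]
print(f"RX 580 has {100 * (relative - 1):.0f}% more GFLOPS per GB/s")
```

With different clock assumptions (base instead of boost, or a factory-overclocked card) the ratios shift a little, so treat the result as a ballpark.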
 

Topweasel

Diamond Member
Oct 19, 2000
5,436
1,654
136
https://www.computerbase.de/2018-03/kaby-lake-g-nuc-test/

This is the top NUC model with the 100W 8809G. Performance is around the 1060 Max-Q. Too expensive, I imagine, for it to sell ($1000 for just the kit). They were able to OC the CPU to 4.3 GHz and the GPU to 1300 MHz.

Yeah, the idea behind it had a market. But I am not sure what it is now. This can't be what Apple really wanted? It's not going to be a good mobile CPU for their chassis, and they can already package an SL-X setup in their AIO. No one wants to pay 1k for a NUC. This is only available in an embedded setup.
 

Glo.

Diamond Member
Apr 25, 2015
5,707
4,551
136
IntelUser2000 said:
Maybe, maybe not.

RX 580 GFlops/GBytes per second = 24.1
RX Vega M GH GFlops/GBytes per second = 17.9

RX 580 has 35% more shader firepower per GB of memory bandwidth. That's in addition to having more ROPs. Those will skew the relative Flop/perf comparison in favor of the RX Vega M GH, but RX 580 is still faster.

Erm... it's the Vega M in the NUC that has twice as many ROPs as the RX 580...

Vega M has 64 ROPs; the RX 580 has 32.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
Glo. said:
Erm... it's the Vega M in the NUC that has twice as many ROPs as the RX 580...

Vega M has 64 ROPs; the RX 580 has 32.

Have you not read what I posted?

Vega M GH has more bandwidth per flop than RX 580. That's in addition to having 2x the ROPs. So if you do a simple Flop-based calculation, the Vega M GH will look better.
 

Glo.

Diamond Member
Apr 25, 2015
5,707
4,551
136
IntelUser2000 said:
Have you not read what I posted?

Vega M GH has more bandwidth per flop than RX 580. That's in addition to having 2x the ROPs. So if you do a simple Flop-based calculation, the Vega M GH will look better.

You believe that memory bandwidth affects a GPU's performance so much more than the sheer horsepower of the cores and the fillrate throughput?

I don't think so ;).

If a GPU with 66% of the cores and 80% of the GPU clock still gets 80% of the performance of the GPU it is compared against, it means the other GPU is bottlenecked somewhere. Is it memory bandwidth? No. It's the front end, plain and simple.
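
To put rough numbers on that argument, here is a quick sketch that uses only the percentages quoted above. It assumes a naive model in which performance would scale with cores times clock; that model is an assumption for illustration, and it says nothing about which part of the chip is actually the limit.

```python
# Figures as quoted in the post, relative to the RX 580 (= 1.0).
core_fraction = 0.66
clock_fraction = 0.80
observed_perf_fraction = 0.80

# Naive expectation if performance scaled with cores * clock.
theoretical_fraction = core_fraction * clock_fraction

# How much more performance the smaller GPU extracts per unit of
# theoretical shader throughput, under that naive model.
perf_per_flop_gain = observed_perf_fraction / theoretical_fraction

print(f"Expected from cores * clock: {theoretical_fraction:.0%} of RX 580")
print(f"Observed: {observed_perf_fraction:.0%} of RX 580")
print(f"Performance per theoretical FLOP: {perf_per_flop_gain:.2f}x")
```

So under these figures the Vega M delivers roughly one and a half times the performance per theoretical FLOP, which is the gap the whole argument is about.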
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
Glo. said:
You believe that memory bandwidth affects a GPU's performance so much more than the sheer horsepower of the cores and the fillrate throughput?

I don't believe ROPs are 100% responsible for the better relative performance. There are three significant things that affect the better relative scaling: better memory BW/Flops, more ROPs, and the Vega architecture. None of these can be easily isolated.

Polaris chips are thought to be memory bandwidth bound as well.

Glo. said:
If a GPU with 66% of the cores and 80% of the GPU clock still gets 80% of the performance of the GPU it is compared against, it means the other GPU is bottlenecked somewhere. Is it memory bandwidth? No. It's the front end, plain and simple.

80% of RX 580 puts it at RX 470 level: https://www.techpowerup.com/reviews/Gigabyte/AORUS_RX_580_XTR/29.html
 

Glo.

Diamond Member
Apr 25, 2015
5,707
4,551
136
IntelUser2000 said:
I don't believe ROPs are 100% responsible for the better relative performance. There are three significant things that affect the better relative scaling: better memory BW/Flops, more ROPs, and the Vega architecture. None of these can be easily isolated.

Polaris chips are thought to be memory bandwidth bound as well.

80% of RX 580 puts it at RX 470 level: https://www.techpowerup.com/reviews/Gigabyte/AORUS_RX_580_XTR/29.html

Vega in a Raven Ridge APU can be memory bound. Polaris cannot: it has a 256-bit memory bus with GDDR5 memory.

And the RX 470 is around the same performance level as the GTX 1060 Max-Q.

I do like this GPU a lot. I wonder if AMD will add Vega Mobile (28 CU version?) to the desktop lineup. With proper clocks it could be around 75W TDP (Vega M has 55W TDP with 1536 GCN cores @ 1.19 GHz) and land between the GTX 1060 3GB and 6GB. The thing is...

Nvidia GPUs that will replace GTX 1050 and GTX 1050 Ti should also be on the same level of performance. So nothing groundbreaking for AMD.
 

Shivansps

Diamond Member
Sep 11, 2013
3,855
1,518
136
Mmmmm, what to call this thing? It's not an IGP, but it's not entirely a dGPU either. I think I'll settle on calling it a "dIGP".
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
It's not wrong to say it's an iGPU either. It is all in one chip, and it has advantages that a discrete part can't have, like its power sharing feature.

Glo. said:
Vega M has 55W TDP with 1536 GCN cores@1.19 GHz) and around GTX 1060 3GB/6GB(between them

That assumes the CPU part is fixed at 45W, and GPU at 55W and there's nothing else going on. They did state it has a proprietary power sharing framework that brings 10-15% benefit in efficiency. This may be the key in bringing it on par with Nvidia parts, which are more efficient.

The performance part isn't a problem. The pricing and whether the 100W version will be in more form factors than just a single NUC is. Out of the two laptops disclosed, the Dell isn't even running the part at 65W. They are apparently experimenting to see if 50-55W works.
 

Olikan

Platinum Member
Sep 23, 2011
2,023
275
126
All GCN GPUs since Hawaii are severely ROP starved.
Some of the newer ASICs are bandwidth starved as well (Polaris 10, Vega, etc).
Dunno, shouldn't Vega have increased performance per clock from Fiji, with DSBR and L2 cache?
 

Shivansps

Diamond Member
Sep 11, 2013
3,855
1,518
136
I ran the ACO benchmark on my PC (R7 1700/RX 580) and got 48 fps avg at 1080p Ultra High. That means the NUC is getting 71% of the performance of my PC at much lower power, which is impressive.
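
As a rough sanity check on that 71% figure, the arithmetic can be run backwards. The NUC's own frame rate is not stated here, so the value computed below is only what the two quoted numbers imply; the variable and function names are just illustrative.

```python
def relative_performance(fps_a, fps_b):
    """Return fps_a as a fraction of fps_b."""
    return fps_a / fps_b


rx580_pc_fps = 48.0   # from the post: R7 1700 / RX 580, 1080p Ultra High
nuc_fraction = 0.71   # from the post

# Back out the NUC frame rate implied by those two figures.
implied_nuc_fps = rx580_pc_fps * nuc_fraction

print(f"Implied NUC average: {implied_nuc_fps:.1f} fps")
print(f"Check: {relative_performance(implied_nuc_fps, rx580_pc_fps):.0%} of the RX 580 system")
```

It is a single benchmark with different CPUs on each side, so the comparison is only indicative.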
 

wahdangun

Golden Member
Feb 3, 2011
1,007
148
106
Shivansps said:
I ran the ACO benchmark on my PC (R7 1700/RX 580) and got 48 fps avg at 1080p Ultra High. That means the NUC is getting 71% of the performance of my PC at much lower power, which is impressive.

If the efficiency is this good, then prepare for it to be gone in a minute to the miners.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,785
136
wahdangun said:
If the efficiency is this good, then prepare for it to be gone in a minute to the miners.

NUC mining? LOL!

Seriously though, at $999 it isn't that attractive. Also, unless the HBM2 timings can be adjusted with a modified BIOS, it'll hash only a little better than RX 560 cards. Whether a modded BIOS will be allowed to work is up to Intel.

If they made a quad-socket motherboard and got socketable Core i7 8809G chips out, it might be a different story.
 

HurleyBird

Platinum Member
Apr 22, 2003
2,684
1,268
136
Given that this is a single benchmark at 1080p, there's only so much that can be drawn from it. I wouldn't be surprised to see Kaby-G slightly behind a 580 at 1440p and ahead at 4K, given how twice the ROPs should help more at higher resolutions.
 

coercitiv

Diamond Member
Jan 24, 2014
6,199
11,895
136
IntelUser2000 said:
Also, unless the HBM2 timings can be adjusted with a modified BIOS, it'll hash only a little better than RX 560 cards.

Where does this estimate come from? In terms of bandwidth and compute power it's quite close to the RX 570, with lower memory latency. Depending on the mining algo it may end up trading blows with it.