Vega refresh - Expected? How might it look?


DiogoDX

Senior member
Oct 11, 2012
Looks like a placeholder; Vega 20 is 4096 bits wide and around 1 TB/s even using slower HBM2. The code in the Vega 20 bring-up branch should be more trustworthy than TPU.
Also, at the same performance, 7nm should offer a lower TDP than the 250W TPU states, unless something is really wrong.

Sidenote: Suzanne Plummer is now CVP of Radeon Technologies Group. From CPU to graphics, integrating some common stuff. Interesting.
If Vega 20 has 1/2-rate FP64, then 250W will be possible.
 

rainy

Senior member
Jul 17, 2013
Sidenote: Suzanne Plummer is now CVP of Radeon Technologies Group. From CPU to graphics, integrating some common stuff. Interesting.

This is indeed interesting.
Will RTG start implementing Ryzen's power management, or something similar, in Radeons?
 

Stuka87

Diamond Member
Dec 10, 2010
This is indeed interesting.
Will RTG start implementing Ryzen's power management, or something similar, in Radeons?

Radeons have power management; the reason they consume more power than Nvidia's cards is that AMD clocks them higher than they were designed for. When you surpass the 'sweet spot', power consumption skyrockets. Downclock them even a small amount and power consumption drops like a rock. But to AMD marketing, that 5% performance is apparently worth 20% extra power.
 

rainy

Senior member
Jul 17, 2013
Radeons have power management; the reason they consume more power than Nvidia's cards is that AMD clocks them higher than they were designed for.

I know that AMD GPUs have power management, but it seems less advanced than Ryzen's - I think there's nothing wrong with making some improvements there.
 

Flash831

Member
Aug 10, 2015
You're overgeneralizing. They have failed to execute on major roadmap achievements and lost the head of the graphics business to their competitor.

Also.. "Where would AMD CPUs be without integrated graphics?" Well.. in terms of Ryzen.. they'd be exactly where they are now.

Roadmaps may change, but theirs hasn't; it's just fallen silent, with major milestones missed. If they don't spin off RTG with some sort of legacy licensing deal, then I see it limping along as basically the iGPU component of the increasingly competitive Ryzen products. Console APUs helped them, but even with 100% of the mainstream console market it barely kept them afloat while total revenues were down big time.

Once Ryzen/Epyc pick up in terms of market share the console APUs and semi-custom in general will start to pale in comparison in terms of overall revenue. Console APUs are already very low margin and this works against their goal of raising gross corporate margins.

Vega arrived super late and extremely inefficient for its large die area, and it won't see a successor until either next summer's 7nm Vega or a late-2019/early-2020 7nm Navi. The trend is downward for AMD in the discrete graphics market of any type, unfortunately. I support the company and would like to see them reverse this trend, but a trend it is!
I agree that RTG hasn't lived up to its goals over the last few years. However, graphics remains a huge opportunity for AMD to grow into once EPYC and additional Ryzen revenue comes in. They have major IP, and the opportunities in graphics are bigger than the ones in CPUs. Gaming GPUs are only part of the opportunity for RTG. Give it some time.
 

DisEnchantment

Golden Member
Mar 3, 2017
Vega 20 is in tests
AMD said:
7nm @RadeonInstinct product for machine learning is running in our labs. http://bit.ly/2vRDOl4

Bob Marston said:
From the AMD teleconference: Lisa Su was asked about the confidence AMD has in TSMC as a supplier of the 7nm Epyc products. Lisa responded that AMD would use both TSMC and GlobalFoundries for its 7nm products and that TSMC would go first. Lisa also said AMD has confidence in TSMC.
https://twitter.com/AMDNews/status/989258151345229825
 

Ancalagon44

Diamond Member
Feb 17, 2010
Radeons have power management; the reason they consume more power than Nvidia's cards is that AMD clocks them higher than they were designed for. When you surpass the 'sweet spot', power consumption skyrockets. Downclock them even a small amount and power consumption drops like a rock. But to AMD marketing, that 5% performance is apparently worth 20% extra power.

I believe Nvidia's cards use less power not because they are clocked less aggressively but because AMD is a lot more conservative with their voltages.

Nvidia's voltages are pretty low in general. In fact, I notice that Nvidia users who overclock don't bother to undervolt.

AMD users on the other hand, must undervolt in order to overclock. Undervolting reduces power consumption and heat generation, and reducing the voltage then allows for higher overclocks.

Why doesn't AMD just set their voltages lower by default? I'm not sure, but I believe it's because some of their silicon can't handle the lower voltages. Better to use a higher voltage to guarantee stable performance at the selected clocks than to have some cards unable to meet those performance targets. At least, that's what I think.
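For what it's worth, first-order CMOS dynamic power scales roughly with C·V²·f, which is why a modest undervolt pays off so much. A toy Python sketch; the 1.15 V / 1.4 GHz baseline is purely illustrative, not an actual Vega operating point:

```python
# Toy model: CMOS dynamic power ~ C * V^2 * f. Baseline 1.15 V / 1.4 GHz is an
# illustrative guess, not a real Vega operating point.
def relative_power(volts, ghz, v0=1.15, f0=1.4):
    """Dynamic power relative to the (made-up) baseline point."""
    return (volts / v0) ** 2 * (ghz / f0)

print(relative_power(1.05, 1.4))    # ~0.83: a 100 mV undervolt at the same clock
print(relative_power(1.265, 1.47))  # ~1.27: +10% voltage for +5% clock costs ~27% power
```

The quadratic voltage term is the whole story here: a small voltage drop buys a big power drop, while chasing the last few percent of clock at higher voltage costs disproportionately.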
 

Stuka87

Diamond Member
Dec 10, 2010
I believe Nvidia's cards use less power not because they are clocked less aggressively but because AMD is a lot more conservative with their voltages.

Nvidia's voltages are pretty low in general. In fact, I notice that Nvidia users who overclock don't bother to undervolt.

AMD users on the other hand, must undervolt in order to overclock. Undervolting reduces power consumption and heat generation, and reducing the voltage then allows for higher overclocks.

Why doesn't AMD just set their voltages lower by default? I'm not sure, but I believe it's because some of their silicon can't handle the lower voltages. Better to use a higher voltage to guarantee stable performance at the selected clocks than to have some cards unable to meet those performance targets. At least, that's what I think.

This is also true. I have my RX 480 undervolted by 100 mV and it runs fine, and WAY cooler. My guess is they set their voltages higher so that fewer dies have to be tossed out.
 

DisEnchantment

Golden Member
Mar 3, 2017
694X : Kaby Lake G
69AX : Vega 12

Vega M has shown up in the kernel-side drivers.

+ /* VEGAM */
+ {0x1002, 0x694C, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGAM},
+ {0x1002, 0x694E, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGAM},

https://cgit.freedesktop.org/~agd5f...p&id=a76dedc836a602f70514f22ef6348aada2cd9db6

PRODUCT was not announced.

The chip was. Vega Mobile. The product is Vega 12, and all of the SKUs it will spawn.

Vega M is Polaris.
This is just one snippet; it's all over the place, and the code path is the same for Polaris and Vega M.
Kernel side code
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
index 6721b04..1edbe6b 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
@@ -569,9 +569,10 @@ static int gmc_v8_0_mc_init(struct amdgpu_device *adev)
/* set the gart size */
if (amdgpu_gart_size == -1) {
switch (adev->asic_type) {
- case CHIP_POLARIS11: /* all engines support GPUVM */
case CHIP_POLARIS10: /* all engines support GPUVM */
+ case CHIP_POLARIS11: /* all engines support GPUVM */
case CHIP_POLARIS12: /* all engines support GPUVM */
+ case CHIP_VEGAM: /* all engines support GPUVM */

This one is from userland Mesa:
@@ -112,6 +112,7 @@ const char *ac_get_llvm_processor_name(enum radeon_family family)
return "polaris10";
case CHIP_POLARIS11:
case CHIP_POLARIS12:
+ case CHIP_VEGAM:
return "polaris11";

Vega 12
Plumbing for Vega 12 is already integrated in mesa 18.1 and Kernel 4.17. It is already live.
The question now is what Vega 12 really is.
Vega 12 is much closer to Vega 10, but has some new changes in HW.
There were some HW fixes, and it also has new tiling updates.
 


DisEnchantment

Golden Member
Mar 3, 2017
1,605
5,795
136

Vega 20 3DMark scores
The core clock could be wrong (reported as 1 GHz). If it's not wrong, this could be quite potent.
But that is indeed 4× 8-Hi stacks of Aquabolt HBM2 at 1250 MHz/2.5 Gbps: 32 GB and 1.28 TB/s.
Interesting that it is running the HBM at 2.5 Gbps instead of the 2.4 advertised by Samsung. Lower voltages on the Aquabolt and 7nm on the core could possibly help tame the power draw.
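As a sanity check on those numbers, peak bandwidth is just total pins times per-pin data rate. A throwaway Python sketch (the stack and pin counts are the rumoured figures, not confirmed specs):

```python
# Sanity-check the rumoured numbers: peak bandwidth = total pins * per-pin rate.
def hbm2_bandwidth_tbs(stacks, bits_per_stack, gbps_per_pin):
    """Peak bandwidth in TB/s (Gbit/s -> GB/s -> TB/s)."""
    total_pins = stacks * bits_per_stack          # 4 x 1024 = a 4096-bit bus
    return total_pins * gbps_per_pin / 8 / 1000

print(hbm2_bandwidth_tbs(4, 1024, 2.5))  # 1.28 -> matches the reported 1.28 TB/s
print(hbm2_bandwidth_tbs(4, 1024, 2.4))  # ~1.23 TB/s at Samsung's advertised rate
```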

3dmark links
two Vega 20 tests at 1GHz
Vega 20 vs RX Vega 64 Liquid

[attached 3DMark score screenshots]

Makes me wonder if AMD locked the frequency to 1 GHz and intentionally let out a teaser.
How high CLN7FF can clock is a very interesting question... at 2 GHz+ it would be a really potent GPU.
 

DisEnchantment

Golden Member
Mar 3, 2017
Digging through some code in AMD's amdgpu source tree on the Vega 20 branch brought up a lot of interesting tidbits.

Some Vega 12/20 support was also merged into LLVM.

This confirms Vega 12 is a desktop part.


+ ``gfx904`` ``amdgcn`` dGPU - xnack [off]  *TBA* .. TODO Add product names.
+ ``gfx906`` ``amdgcn`` dGPU - xnack [off]  *TBA* .. TODO Add product names.

Vega 20 is GFX906, Vega 12 is GFX904, RR is GFX901, RX Vega 64 is GFX900.

Apparently the folks at AMD had been working on the driver on an emulator since H2 last year, long before they had the HW.
From the code changes, it seems they got Vega 20 back more than a month ago.
Some interesting things came up, like hitting a memory allocation limit (Vega 20 has 32 GB of HBM, which is a lot) and a bunch of tweaks related to mGPU.
Vega 20 has a bunch of new instructions tailored for machine learning, especially half-precision math, vector, and fused instructions. This makes it less probable that Vega 20 will be launched for gaming.
Vega 12, on the other hand, has Mesa (the Linux graphics library for OpenGL/Vulkan) changes, including some patches for HW changes, which makes it more probable to be a gaming part.
Vega 20 uses a new Infinity Fabric version.


I am not planning to buy any Vega 20 because many things are not working yet (TensorFlow, CNTK, the cumbersome HIP tool, etc.), but I am curious what they are up to.
Interesting times.
 

DisEnchantment

Golden Member
Mar 3, 2017
1,605
5,795
136
That's probably Kaby-G. Vega 12 I still think is Vega mobile.
I posted code from the Linux display driver, and it indicates Vega 12 is not Vega M. Vega M is Kaby G, since the PCI ID is the same.

But AMD could confuse us with their naming again.
The Kaby G GPU is Vega M(obile) in software but is actually Polaris.
Vega 12 could be a mobile Vega part, except that the LLVM patches show it is a dGPU.
 

Glo.

Diamond Member
Apr 25, 2015
1.3 GHz, 1280 GCN5 cores, 4 GB HBM2, 192 GB/s, 60% faster than the Radeon Pro 560X (very close in performance to a GTX 1060 Max-Q).

35W TDP. This would be a fanless GPU on the desktop. A 65-75W TDP target would result in an over-1.5 GHz core clock on the Vega 12 chip.
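Back-of-the-envelope, peak FP32 throughput for those specs is just cores × 2 FLOPs per clock (one FMA) × clock. A quick Python sketch; the core count and clocks are the rumoured figures above, nothing official:

```python
# Peak FP32 = shader cores * 2 FLOPs per clock (one FMA) * clock in GHz.
# Core count and clocks are the rumoured Vega 12 figures, nothing official.
def peak_fp32_tflops(cores, ghz):
    return cores * 2 * ghz / 1000

print(peak_fp32_tflops(1280, 1.3))  # ~3.33 TFLOPs at the rumoured 35W clock
print(peak_fp32_tflops(1280, 1.5))  # 3.84 TFLOPs at a 65-75W clock target
```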
 

JDG1980

Golden Member
Jul 18, 2013
This would be a fanless GPU on the desktop. A 65-75W TDP target would result in an over-1.5 GHz core clock on the Vega 12 chip.

For maximum performance in a 75W envelope, I think it might have trouble beating the Radeon Pro WX 5100. Sure, Vega can do higher clocks than Polaris, but I'm not sure that is enough to make up for the WX 5100's eight additional CUs.

The WX 5100 isn't fanless in default config, but you could almost certainly stick an Arctic Accelero S3 on it if you wanted to.
 

Glo.

Diamond Member
Apr 25, 2015
For maximum performance in a 75W envelope, I think it might have trouble beating the Radeon Pro WX 5100. Sure, Vega can do higher clocks than Polaris, but I'm not sure that is enough to make up for the WX 5100's eight additional CUs.

The WX 5100 isn't fanless in default config, but you could almost certainly stick an Arctic Accelero S3 on it if you wanted to.
A 75W TDP is over twice the power envelope it will get in MBPs, with their 35W TDP and 1300 MHz core clock ;).

With a 75W TDP we are most likely looking at over 1600 MHz and 2.4 Gbps HBM2. And at that point it would definitely be faster than the WX 5100 ;).
 

JDG1980

Golden Member
Jul 18, 2013
A 75W TDP is over twice the power envelope it will get in MBPs, with their 35W TDP and 1300 MHz core clock ;).

With a 75W TDP we are most likely looking at over 1600 MHz and 2.4 Gbps HBM2. And at that point it would definitely be faster than the WX 5100 ;).

1600 MHz, quite probably. 2.4 Gbps HBM2? I don't think the vendors make chips that fast, and even if they did, it's unclear whether Vega's memory controller could handle it. The WX 8200 got its memory clock to 2.0 Gbps (the original design spec, but HBM chips at that speed were unavailable back in 2017); that's the fastest we've seen from Vega yet.
 
Mar 11, 2004
This is indeed interesting.
Will RTG start implementing Ryzen's power management, or something similar, in Radeons?

I think that's more because embedded is part of RTG and is a huge part of AMD's business these days. But hopefully it will improve the GPUs as well, and I'm sure they can improve the power management with some of what they've developed for CPUs. But I think it's mostly about getting CPUs and GPUs working together better (which will help overall efficiency). And AMD is simply trying to get their GPU stuff back on track after letting it languish while focusing resources on Zen. They've said that GPU is integral to what they're working towards and will help them sell CPUs. Gaming and HPC/enterprise are two markets that have been growing even as the overall PC market slowed, and both seem to have plenty of room for future growth.

Radeons have power management; the reason they consume more power than Nvidia's cards is that AMD clocks them higher than they were designed for. When you surpass the 'sweet spot', power consumption skyrockets. Downclock them even a small amount and power consumption drops like a rock. But to AMD marketing, that 5% performance is apparently worth 20% extra power.

It's not even that; it's that AMD doesn't bin or tune the voltage settings much, so you can often keep the exact same clocks (and sometimes even increase them) but drop the voltage and cut power use substantially. It also tends to help sustained performance, since it reduces throttling. This has been hurting AMD basically since GCN, and I really can't fathom why they did nothing to address it the whole time. Microsoft recognized it and made a big fuss about binning/tuning the power for each chip on the One X.

Digging through some code in AMD's amdgpu source tree on the Vega 20 branch brought up a lot of interesting tidbits.

Some Vega 12/20 support was also merged into LLVM.

This confirms Vega 12 is a desktop part.

Vega 20 is GFX906, Vega 12 is GFX904, RR is GFX901, RX Vega 64 is GFX900.

Apparently the folks at AMD had been working on the driver on an emulator since H2 last year, long before they had the HW.
From the code changes, it seems they got Vega 20 back more than a month ago.
Some interesting things came up, like hitting a memory allocation limit (Vega 20 has 32 GB of HBM, which is a lot) and a bunch of tweaks related to mGPU.
Vega 20 has a bunch of new instructions tailored for machine learning, especially half-precision math, vector, and fused instructions. This makes it less probable that Vega 20 will be launched for gaming.
Vega 12, on the other hand, has Mesa (the Linux graphics library for OpenGL/Vulkan) changes, including some patches for HW changes, which makes it more probable to be a gaming part.
Vega 20 uses a new Infinity Fabric version.

I am not planning to buy any Vega 20 because many things are not working yet (TensorFlow, CNTK, the cumbersome HIP tool, etc.), but I am curious what they are up to.
Interesting times.

AMD has outright said Vega 20 is not a gamer chip at all, and they are not going to release it to that market. I don't know why people keep speculating when AMD has been clear in quashing any such speculation. They're all-in on Navi as their chip for gamers (which should be able to offer similar gaming performance to Vega 20, unless AMD gimps it stupidly somehow). They should easily be able to match the CU count of Vega 20, plus it'll have newer features that should make it better per CU and per clock in gaming than Vega. And without all the other bits, it should be able to push clock speeds more. Some of the Vega 20 features will probably end up in Navi as well (reduced-precision math; I think game developers have said they could make use of 8-bit, but probably don't need 4-bit yet unless they start pushing machine learning a lot).

I feel like I'm having a brain slip here, as I swear they touted FP16 Rapid Packed Math as a feature with Polaris (where they could do two FP16 ops or one FP32 op per clock, whereas before you could do lower precision but only at the same rate as full precision), and that Vega extended that to 8-bit. But maybe that was very early on and never came out for Polaris? Or maybe it's because the PS4 Pro chip had that feature (and was mostly similar to Polaris). I assume Vega 20 takes that down to 4-bit precision (even the original Vega's 8-bit support was apparently limited).
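The packing itself is just a width trick: two halves fit in the space of one single, so an idealized FP32 rate doubles at FP16 and quadruples at 8-bit. A toy Python sketch; the 12.5 TFLOPs base rate is a made-up number for illustration, not any real card's spec:

```python
import struct

# Two IEEE 754 half-precision values occupy the same 32 bits as one single,
# which is the width trick behind packing two FP16 ops into one FP32 lane.
assert len(struct.pack('f', 1.0)) == 4  # single precision: 4 bytes
assert len(struct.pack('e', 1.0)) == 2  # half precision: 2 bytes

def packed_rate(fp32_tflops, bits):
    """Idealized throughput if each 32-bit lane splits into 32/bits narrower ops."""
    return fp32_tflops * (32 // bits)

base = 12.5                    # made-up FP32 TFLOPs figure, purely illustrative
print(packed_rate(base, 16))   # 25.0 -> 2x at FP16
print(packed_rate(base, 8))    # 50.0 -> 4x at 8-bit, if the HW supports it
```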
 

Glo.

Diamond Member
Apr 25, 2015
1600 MHz, quite probably. 2.4 Gbps HBM2? I don't think the vendors make chips that fast, and even if they did, it's unclear whether Vega's memory controller could handle it. The WX 8200 got its memory clock to 2.0 Gbps (the original design spec, but HBM chips at that speed were unavailable back in 2017); that's the fastest we've seen from Vega yet.
Yes, they do make 2.4 Gbps HBM2 ;). Samsung, to be precise ;).
 

Qwertilot

Golden Member
Nov 28, 2013
It's not even that; it's that AMD doesn't bin or tune the voltage settings much, so you can often keep the exact same clocks (and sometimes even increase them) but drop the voltage and cut power use substantially. It also tends to help sustained performance, since it reduces throttling. This has been hurting AMD basically since GCN, and I really can't fathom why they did nothing to address it the whole time. Microsoft recognized it and made a big fuss about binning/tuning the power for each chip on the One X.

Money. They didn't have any :( You can tell from how many glaringly obvious holes they've been leaving in their GPU line-ups, etc.
 

torlen11cc

Member
Jun 22, 2013
The Vega architecture isn't good, and that's why they had to use HBM2. If the move to 7nm solves that, so they can use GDDR5X/6 and reduce the price to a large extent, then at least they'll be able to compete in the mainstream and the mid-high end. Hope that's what will happen.
 