Vega refresh - Expected? How might it look?

tential

Diamond Member
May 13, 2008
7,348
642
121
Ahh good grief. Drama is fine to a certain degree, but AMD's stock price is 300% of what it was a year ago, and there are good reasons for it. Their product portfolio and production capacity are in good shape.
That was part of a sector spike. If you're going to quote the markets, then pay attention to them.
Investors are bullish on GPUs. They aren't bullish on AMD specifically...
 

Glo.

Diamond Member
Apr 25, 2015
5,930
4,991
136
As a side note, my main development occurs on Nvidia cards. I bought more when the RX Vega 64 pricing fiasco hit. I also demoted my evaluation basis for Vega to one unit. So, they've got one try to deliver on their hype. Above was a deconstruction of the hype. So, I am 100% aware of what I'm getting: a basic GPU whose driver/software stack hasn't been figured out.
It appears you have an opinion, not an educated one, about the hardware you use.

Nvidia's only advantage currently is the software. Nvidia's GPUs in compute have lower throughput than AMD ones, but they have better software. Do you know however what happens when Software matures, and starts using AMD GPUs properly?

AMD GPUs start to walk away from Nvidia's because they have much higher compute capabilities.
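
To put a rough number on the "higher compute capabilities" claim: peak FP32 throughput is commonly estimated as 2 FLOPs (one fused multiply-add) per shader per clock. The sketch below applies that rule of thumb to the two cards compared in this post. The shader counts and boost clocks are the vendors' published reference specs as best recalled, not figures from this thread, so treat them as assumptions. Peak numbers also say nothing about how well real software uses the hardware.

```python
def peak_fp32_tflops(shaders, boost_mhz):
    """Theoretical peak: 2 FLOPs (one FMA) per shader per clock cycle."""
    return 2 * shaders * boost_mhz * 1e6 / 1e12


# Assumed reference specs (not taken from this thread).
rx_480 = peak_fp32_tflops(shaders=2304, boost_mhz=1266)
gtx_1060 = peak_fp32_tflops(shaders=1280, boost_mhz=1708)

print(f"RX 480:   {rx_480:.2f} TFLOPS")
print(f"GTX 1060: {gtx_1060:.2f} TFLOPS")
print(f"ratio:    {rx_480 / gtx_1060:.2f}x")
```

On paper the gap is roughly a third, so a 2x result in a specific renderer would have to come from more than raw FP32 throughput.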

Let me show you something. Blender recently added a split kernel to its AMD OpenCL code path, to mimic how CUDA executes. It is a simple change, but AMD had to pay for it before Blender would start using it.

The Blender team compared the effect on similarly priced GPUs from both vendors: a GTX 1060 using CUDA and an RX 480 using OpenCL with the new implementation.
This is the result:
Timings.png

Sometimes RX 480 is two times faster.

But that would not be strange if you watch something like this:

RX 480 being faster than GTX 1070 in Adobe Premiere Apps.

About HBCC: it is not DMA. It is a hardware memory controller capable of handling the HSA 2.1 specification (currently HSA 2.0 compatible, because the software is not ready). The High Bandwidth Cache Controller indexes all of the data available in your system (RAM, HDDs, SSDs, network storage) so it can dynamically access the data that is needed, when it is needed. It has access to ALL the data it has indexed, so it increases the virtual memory space available to the GPU. It does not need an NVMe SSD attached to the GPU; however, attaching an SSD to the GPU increases the bandwidth available for large data sets and lowers their access time. It was created for data centers in the first place. You are too attached to the idea of an NVMe SSD on the GPU, creating an SSG. You forgot that Fiji also had an SSG; Vega expands on it a bit.

It's funny that you can comment on the proprietary nature of hardware features, but when Nvidia delivers proprietary software "standards" everything is fine. I guess double standards are what is apparent in the industry.


Unfortunately, you posted dozens of posts on the matter of HBCC, but in all of them you missed the point, because you simply do not understand what it does and what its role is in the Vega architecture.
 

ub4ty

Senior member
Jun 21, 2017
749
898
96
Liked your post because it's informative. That being said, I haven't missed a single point. With today's offering of Vega 56 having come and gone in less than a minute and the stock showing as 7 on Amazon, what Radeon has missed is an opportunity to capture someone in their ecosystem.

How did they miss this? By not being clear and concise about the features you just highlighted above in their advertisement of a flagship product when they are known to be an underdog in several areas. I get the 'fine wine' meme which is why one exists for Radeon cards. I also have several AMD CPU hardware in my test rigs evaluating them currently. Why? because they detailed their product through their own channels and delivered. I have hardware from Intel/AMD CPU and I have Nvidia cards. Even through all of my critiques, I was hoping to evaluate AMD's GPU line but they seem to have a fundamental issue : they don't clearly communicate what it is exactly that makes their cards better nor do they clearly communicate their fine wine software development track. You lose people doing this. There are individuals who've waited years for these cards and Radeon didn't even have the decency to do a showcase launch of RX Vega consumer. They promoted their pro line at Siggraph and let RX Vega and RX Vega FE literally fall out the back of a truck. You speak of features like FP16 then go in no detail about how it will function and whether it will be available for compute OpenCL. You have an ISA that details various FP16 but where is the prominent engineering discussion about this feature? What does this mean to someone who does compute flows? Is Radeon's business development/marketing group this inept?

HBCC, you've just provided information that I was unaware of and is positive for the product and understanding HBCC. Thus, where the heck is the Radeon group's presentation of this feature? Where is the video detailing this? How does it perform? What is the latency? What is the throughput? What are some use cases beyond gaming? What does it even do functionally in gaming to accelerate performance? Whereas, when Nvidia launches a product they detail the whole micro-architecture and what it implies for customers. They don't rely on a bunch of clowns on youtube to sell their product. They don't say : oh hey, world changing feature.... and then not go into technical detail about it.

They respect their customers and respect their understanding of their product such that they detail highly technical things such as these. Meanwhile, I have to look high and low and engage in passionate discussions across multiple forums to get answers about a product. Now you tell me what's wrong w/ this and how you end up losing sales and potential bigger relationships this way.. it's not like Radeon is an industry leader. Yet, they're not doing anything about their competition to make the sale.

I'm not a fanboy, nor do I have any bias. I have hardware from just about every manufacturer. I don't have Radeon's GPUs because they don't know how to communicate what their cards' features are or their software development roadmap. In general, finewine with no roadmap is a huge problem for active developers. I have no clue what features your card will support or how it will perform. You are not the industry leader, yet you expect me to buy on faith? No man.. that's not how it works, and it's a shame that you have business/marketing people who think that way consistently. The engineering department seems to be pushing the envelope and is doing amazing things within the budget they have. I can respect that. I can even put my money behind that beyond my best interest. But the Vega launch has been a *@#% show. From pricing to power utilization to performance to a lack of detail as to what features the card has, whether they will be gimped, and what exactly those features mean for gaming/compute.

People have timelines.. Gamers as well as developers. You monkey around like this and you lose people for some time (years) or for good. I know exactly what stunts Nvidia pulls but they're clear about them. Radeon pulls the same kind of stunts. Let me give you an example :
https://github.com/RadeonOpenCompute/ROCnRDMA

Is this available on your consumer cards? Nope .. you've gimped it.
Nvidia did the same thing :
https://developer.nvidia.com/gpudirect

An industry feature that's been around for as long as I can remember, yet you gimp it in software/drivers and force people to buy your pro cards. Now what am I to believe about HBCC when you say it adheres to HSA 2.1, knowing damn well that Radeon gimps consumer cards and it's not the full feature set of the pro line... It has no NVMe, yet you market it as a solution for running out of GPU memory. Logic goes: well, are you going to evict stuff out of HBM2? Then you tell me that it conforms to HSA 2.1 .. whatever that means... and has access to my whole memory space (which it does not, nor do I want it to, nor is there any logical reason for it to).. But yes, say I have a pinned grouping of memory for my GPU in RAM (apparently it functions like this and has a minimum of 11GB .. no clue why it's so high). So, you have pinned memory and have access rights to r/w from it. What's the performance? No details given.. Good job.

Has Radeon said what will or won't be available on HBCC for RX Vega consumer? What about RDMA? We finally gonna get that on RX Vega or Vega FE? Nope. What about FP16? Is it only available to GPU flows or will it be available for compute in ROCm? What about any of the other flagship features that are discussed on marketing slides?

This is what GPU customers ask when buying cards these days... Yet Radeon relies on sending expensive review packages to ecelebs that run canned benchmarks to showcase and sell their product. This is how you lose out on customers that matter and you've lost me. I don't care about finewine because I have my own product to procure and it has a timeline. I need to know what the card is capable of and when so I can plan accordingly. That's not provided so I bought more Nvidia cards and will be focusing development on CUDA. I don't care if you get your act together 1-2 years from now w.r.t to OpenCL and outperform Nvidia. You don't now and you haven't detailed when or how you will.

I don't buy things based on faith. I may buy it because I appreciate your efforts as a company. I may buy it because I understand what its like to produce quality products on a tight budget. I will take personal risk to support a company when I feel they are being honest and straight forward with me... When they've detailed their roadmap. I've gotten none of that after wasting precious time trying to find answers. So, I'm done.

Development is hard enough when the hardware features that were promised work and the development software is top notch.... The state of Radeon group is too messy to take the risk, especially when you've somehow managed to price your product higher than the top ranking competition and have 10 cards on hand for launch (yet told people you're building up tons of supply). You can't keep disappointing people consistently. I'm thoroughly disappointed on multiple levels and I haven't even been on the bandwagon for long. So, again, I'm done with Radeon. Maybe I revisit this conversation a year or two down the road when they've managed to get their act together. Maybe I won't because my development will have been tied significantly to Nvidia's proprietary but functional code stack. My work goes forward with team Red CPUs and team Green GPUs. It was fun wasting a week or two getting excited about Vega's potential.
 

Glo.

Diamond Member
Apr 25, 2015
5,930
4,991
136
HBCC, you've just provided information that I was unaware of and is positive for the product and understanding HBCC. Thus, where the heck is the Radeon group's presentation of this feature? Where is the video detailing this? Whereas, when Nvidia launches a product they detail the whole micro-architecture and what it implies for customers. They don't rely on a bunch of clowns on youtube to sell their product. They respect their customers and respect their understanding of their product such that they detail highly technical things such as these. Meanwhile, I have to look high and low and engage in passionate discussions across multiple forums to get answers about a product. Now you tell me what's wrong w/ this and how you end up losing sales and potential bigger relationships this way.
If I was able to educate myself on the matter of GCN, and its features, how come you were not able to do the same thing?
 

ub4ty

Senior member
Jun 21, 2017
749
898
96
If I was able to educate myself on the matter of GCN, and its features, how come you were not able to do the same thing?
I am quite educated on it which is why I know what questions to ask beyond fluff. If you're educated on it above what I am, please let me know what the latency figures are for HBCC, why it needs a minimum of 11GB of allocation, what's the page size? Is it variable? Can it pin less memory than 11GB? Why not? and how is it any different than Nvidia's pinned memory DMA access to system memory? Also, let me know what the bandwidth is ...

Nvidia CUDA :
Code:
    Host to Device Bandwidth, 1 Device(s)
    PINNED Memory Transfers
      Transfer Size (Bytes)    Bandwidth(MB/s)
      33554432                 11884.2
So, while there was an aspect of his post that was informative, it has nothing to do w/ how educated I am on the matter at hand. I am quite educated and Radeon's packaging of information about their product is lackluster. Snip the important aspect of my comments from this post.

Nvidia has this implemented and it works and there are figures for the feature.
Radeon claims it adheres to HSA 2.1 and does magic but there is nothing backing up that statement. If there is, they've done a piss-poor job of detailing it.

Meanwhile, at a company that knows how to provide such details :
https://devblogs.nvidia.com/parallelforall/how-optimize-data-transfers-cuda-cc/

When you get a chance, since you're so educated on the matter and it's so readily available, try answering the questions I posed in my previous post and this one. I'll wait..
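
For scale, the pinned-transfer figure quoted in this post can be turned into a per-transfer time and compared with a link's theoretical peak. This is a minimal sketch under two assumptions that the thread does not confirm: that "MB/s" means 10^6 bytes per second (the tool may use 2^20 instead, hence the `bytes_per_mb` parameter), and that the card sat on a PCIe 3.0 x16 link.

```python
# PCIe 3.0: 8 GT/s per lane, 128b/130b encoding, 16 lanes, 8 bits per byte.
PCIE3_X16_PEAK_GBPS = 8e9 * 16 * (128 / 130) / 8 / 1e9  # about 15.75 GB/s


def transfer_stats(size_bytes, bandwidth_mb_s, bytes_per_mb=1_000_000):
    """Return (bandwidth in GB/s, time in ms) for one transfer of size_bytes."""
    bytes_per_s = bandwidth_mb_s * bytes_per_mb
    return bytes_per_s / 1e9, size_bytes / bytes_per_s * 1e3


# Figures from the quoted output: 33554432 bytes at 11884.2 MB/s.
gb_s, ms = transfer_stats(33554432, 11884.2)
print(f"{gb_s:.2f} GB/s, {ms:.2f} ms per 32 MiB transfer")
print(f"{gb_s / PCIE3_X16_PEAK_GBPS:.0%} of the assumed PCIe 3.0 x16 peak")
```

Numbers like these are what the questions about HBCC latency and throughput are asking for on the Radeon side.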
 

Glo.

Diamond Member
Apr 25, 2015
5,930
4,991
136
I am quite educated on it which is why I know what questions to ask beyond fluff. If you're educated on it above what I am, please let me know what the latency figures are for HBCC, why it needs a minimum of 11GB of allocation, why it can't pin less memory than that, and how is it any different than Nvidia's pinned memory DMA access to system memory. Also, let me know what the bandwidth is ...

Nvidia CUDA :
Code:
    Host to Device Bandwidth, 1 Device(s)
    PINNED Memory Transfers
      Transfer Size (Bytes)    Bandwidth(MB/s)
      33554432                 11884.2
So, while there was an aspect of his post that was informative, it has nothing to do w/ how educated I am on the matter at hand. I am quite educated and Radeon's packaging of information about their product is lackluster. Snip the important aspect of my comments from this post.
You understand that with HBCC, you do not have to manage memory and borders, because HBCC's point is to manage it for you?
 

ub4ty

Senior member
Jun 21, 2017
749
898
96
You understand that with HBCC, you do not have to manage memory and borders, because HBCC's point is to manage it for you?
Talk to me in engineering language and provide links to detailed spec docs, benchmarks, limits, and throughput information.
 

Magic Hate Ball

Senior member
Feb 2, 2017
290
250
96
I am quite educated on it which is why I know what questions to ask beyond fluff. If you're educated on it above what I am, please let me know what the latency figures are for HBCC, why it needs a minimum of 11GB of allocation, what's the page size? Is it variable? Can it pin less memory than 11GB? Why not? and how is it any different than Nvidia's pinned memory DMA access to system memory? Also, let me know what the bandwidth is ...

Nvidia CUDA :
Code:
    Host to Device Bandwidth, 1 Device(s)
    PINNED Memory Transfers
      Transfer Size (Bytes)    Bandwidth(MB/s)
      33554432                 11884.2
So, while there was an aspect of his post that was informative, it has nothing to do w/ how educated I am on the matter at hand. I am quite educated and Radeon's packaging of information about their product is lackluster. Snip the important aspect of my comments from this post.

Nvidia has this implemented and it works and there are figures for the feature.
Radeon claims it adheres to HSA 2.1 and does magic but there is nothing backing up that statement. If there is, they've done a piss-poor job of detailing it.

Meanwhile, at a company that knows how to provide such details :
https://devblogs.nvidia.com/parallelforall/how-optimize-data-transfers-cuda-cc/

When you get a chance, since you're so educated on the matter and it's so readily available, try answering the questions I posed in my previous post and this one. I'll wait..

2ngbnki.jpg

What an incredibly snide way to have a discussion.
 

Glo.

Diamond Member
Apr 25, 2015
5,930
4,991
136
Talk to me in engineering language and provide links to detailed spec docs, benchmarks, limits, and throughput information.

The memory controller deals with the data and the memory management for you. I have no idea why you would want to know these specifics, when nothing here should bother you.

The documentation is not released, the same as for Primitive Shaders. In the current state of the drivers, it is the driver's job to manage Primitive Shaders for you.
 

ub4ty

Senior member
Jun 21, 2017
749
898
96
What an incredibly snide way to have a discussion.
The uninformed need to learn that when they've reached their intellectual limit, they need to stop pushing incorrect and uninformed viewpoints. What's amazing is an individual being uneducated and uninformed yet debating ad nauseam with someone who is... Meanwhile, when they reach the depths of their limited understanding they push the conversation into snide exchanges in order to mask that fact, as if an intelligent person won't know what they're doing. The world is in an ugly place because of people like this. There used to be a time, when you weren't educated on something, that you just shut your mouth and let informed people have the floor and learn. Yet....

https://forums.anandtech.com/thread...ow-might-it-look.2516665/page-5#post-39054445

Over and over and over again....
I'm off this train. I've had enough.
 

zlatan

Senior member
Mar 15, 2011
580
291
136
please let me know what the latency figures are for HBCC
The HBC can be accessed in 60-80 ns.

why it needs a minimum of 11GB of allocation
This is just a minimum limit of the driver for the HBCC segment. It can work with 8.001 GB for example, but this is not useful for the typical workloads.

what's the page size? Is it variable?
Yes, variable. 4 kB and 64 kB are the sizes in the current implementations.

and how is it any different than Nvidia's pinned memory DMA access to system memory?
HBCC is a more general memory paging system. It can work with any API/platform not just CUDA, supports fine-grained data movements, can address byte- and block-based memories, and it can access the CPU's page tables directly.

Also, let me know what the bandwidth is ...
In HBCC mode the HBC is simply a cache, so the bandwidth is basically what you can get with the HBC alone.
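
The "HBC is simply a cache" answer, combined with the 4 kB and 64 kB page sizes, can be illustrated with a toy page-cache simulation. This is a conceptual sketch only: the LRU replacement policy, the function name, and the access pattern are assumptions made for illustration, not a description of how AMD's controller actually decides what stays resident.

```python
from collections import OrderedDict


def simulate_page_cache(addresses, cache_bytes, page_bytes):
    """Count (hits, misses) for an LRU page cache of cache_bytes capacity."""
    capacity_pages = cache_bytes // page_bytes
    resident = OrderedDict()  # page number -> True, oldest entry first
    hits = misses = 0
    for addr in addresses:
        page = addr // page_bytes
        if page in resident:
            hits += 1
            resident.move_to_end(page)  # mark as most recently used
        else:
            misses += 1
            resident[page] = True
            if len(resident) > capacity_pages:
                resident.popitem(last=False)  # evict least recently used
    return hits, misses


# Hypothetical pattern: one read every 1 KiB across a 256 KiB buffer.
pattern = range(0, 256 * 1024, 1024)
for page_bytes in (4 * 1024, 64 * 1024):
    hits, misses = simulate_page_cache(pattern, 128 * 1024, page_bytes)
    print(f"{page_bytes // 1024:>2} KiB pages: {hits} hits, {misses} misses")
```

For a sequential pattern like this, larger pages mean fewer misses, but each miss moves more data. That trade-off is one plausible reason for supporting more than one page size.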
 

zlatan

Senior member
Mar 15, 2011
580
291
136
The memory controller deals with the data and the memory management for you. I have no idea why you would want to know these specifics, when nothing here should bother you.

Well, this is true, but very simplified. In exclusive caching the HBCC segment will be the "traditional VRAM" with flat memory mapping, so it doesn't need direct support. Inclusive caching uses hierarchical memory mapping, which will need some direct support in the application.
 

Glo.

Diamond Member
Apr 25, 2015
5,930
4,991
136
Well, this is true, but very simplified. In exclusive caching the HBCC segment will be the "traditional VRAM" with flat memory mapping, so it doesn't need direct support. Inclusive caching uses hierarchical memory mapping, which will need some direct support in the application.
Can you tell us more about Primitive Shaders? What do they do exactly, and what difference in performance can we expect from them?
 

krumme

Diamond Member
Oct 9, 2009
5,956
1,596
136
What about the tile based rasterization. To what degree is it functional?
How similar is it to nv solution?
 

DA CPU WIZARD

Member
Aug 26, 2013
117
7
81
That was part of a sector spike. If you're going to quote the markets, then pay attention to them.
Investors are bullish on GPUs. They aren't bullish on AMD specifically...

You can not ignore the other half of the story. Their new CPU offerings are, what I would frame as, more competitive than anyone expected. Besides, doesn't AMD's stock price specifically suggest that investors are in fact bullish on AMD specifically?
 

Krteq

Golden Member
May 22, 2015
1,009
729
136
What about the tile based rasterization. To what degree is it functional?
How similar is it to nv solution?
Well, we don't know. It's been enabled for AMD's internal testing only, but DSBR is still disabled in drivers released to public.
 

DisEnchantment

Golden Member
Mar 3, 2017
1,777
6,791
136
Videocardz have been spot on with the Vega leaks so far.

AMD-VEGA-10-specifications.jpg


AMD-VEGA-20-specifications.jpg


AMD-VEGA-10-VEGA20-VEGA-11-NAVI-roadmap.jpg


AMDGPU.jpg


What we can speculate, if AMD execute according to plan in these slides.
2017
- Vega 11, as the Polaris 10/11 replacement, will be due first. From the Linux sources, GFX9/Vega is not tied to HBM; Vega 10 is. (Vega 10 is an implementation of the Vega arch.) So Vega 11 might not come with HBM if it is a low-cost part. Jeffrey Cheng, AMD Senior Fellow, also confirmed that the Vega arch is independent of the memory type.
- Vega 10 x2 is also due in H2. There are rumors of ASUS already working on such a part. Incidentally, ASUS is also the first AIB to have non-reference designs for RX Vega 64/56. Vega 10 x2 has much higher BW than RX Vega 64. According to some knowledgeable posters here, the Samsung HBM2 on RX Vega 64 does not have clock issues but is most likely downclocked to save some power. It is not clear whether this means 2x the HBM2 stacks only, or 2x Vega 10 CUs + HBM2 stacks. But the slide reads 64 CUs.
- GFX9 Design bugs get squashed in the meanwhile?
- AMD keep mentioning that Vega 10 is the first implementation of the Vega architecture and also the first Infinity Fabric GPU.
2018
- Vega 20 happens to have the xGMI IF, which, like in EPYC, is used for multi-socket communication across PCIe. With PCIe Gen4, there may be a possibility for this to be used to let the dual GPUs behave as a single GPU to the SW, doing away with Crossfire, for which AMD is incidentally also keen to drop support in the future. Much higher DP FP with ECC. Looks tailored for compute.
-
2019
- Navi scalable arch would probably be an evolution of Vega 20 seems to me. Next gen memory + Scalability, according to some AMD slides. Low cost HBM3?

The Linux sources here give a wealth of information.

Not present in the slides is also Raven Ridge with Vega/GFX9.1 Graphics. The driver for GFX9 better improve otherwise AMD has a lot to lose, maybe not as much in Compute.

EDIT:
With Navi tapeout already planned for H2 2017, The likely candidates for Navi memory would have been set. HBM3 or GDDR6?

One of the interesting thoughts I have in mind is how the challenges being solved for Infinity Fabric on CPUs (client and server) and GPUs are being fed back into the design to make it more scalable than it is at IF v1.0 for Ryzen 7. How will the Infinity Fabric evolve? To me the Infinity Fabric is the most exciting part of Ryzen and Vega.

Disclaimer: I have a lot of AMD shares, so I have something at stake :oops:.
 

DisEnchantment

Golden Member
Mar 3, 2017
1,777
6,791
136
In Windows,
I find three Vega part numbers in the driver for the Crimson Relive 17.8.2
687f.1, 687f.2, 687f.3
In Linux
Code:
    {0x1002, 0x6860, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGA10|AMD_EXP_HW_SUPPORT},
    {0x1002, 0x6861, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGA10|AMD_EXP_HW_SUPPORT},
    {0x1002, 0x6862, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGA10|AMD_EXP_HW_SUPPORT},
    {0x1002, 0x6863, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGA10|AMD_EXP_HW_SUPPORT},
    {0x1002, 0x6864, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGA10|AMD_EXP_HW_SUPPORT},
    {0x1002, 0x6867, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGA10|AMD_EXP_HW_SUPPORT},
    {0x1002, 0x6868, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGA10|AMD_EXP_HW_SUPPORT},
    {0x1002, 0x686c, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGA10|AMD_EXP_HW_SUPPORT},
    {0x1002, 0x687f, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGA10|AMD_EXP_HW_SUPPORT},

However, I cannot correlate the parts with their marketing names. Can anyone help?
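
One way to cross-check the two lists is to parse the Linux table and intersect its device IDs with the Windows part numbers. This is a rough sketch: the helper names are made up for illustration, and it assumes that the part before the dot in the Windows entries (e.g. `687f.1`) is the hexadecimal PCI device ID. It does not recover marketing names, which the driver table does not contain.

```python
LINUX_TABLE = """
    {0x1002, 0x6860, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGA10|AMD_EXP_HW_SUPPORT},
    {0x1002, 0x6861, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGA10|AMD_EXP_HW_SUPPORT},
    {0x1002, 0x6862, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGA10|AMD_EXP_HW_SUPPORT},
    {0x1002, 0x6863, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGA10|AMD_EXP_HW_SUPPORT},
    {0x1002, 0x6864, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGA10|AMD_EXP_HW_SUPPORT},
    {0x1002, 0x6867, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGA10|AMD_EXP_HW_SUPPORT},
    {0x1002, 0x6868, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGA10|AMD_EXP_HW_SUPPORT},
    {0x1002, 0x686c, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGA10|AMD_EXP_HW_SUPPORT},
    {0x1002, 0x687f, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_VEGA10|AMD_EXP_HW_SUPPORT},
"""
WINDOWS_IDS = ["687f.1", "687f.2", "687f.3"]


def parse_pci_table(text):
    """Return (vendor_id, device_id, flags) for each '{...},' table row."""
    entries = []
    for line in text.splitlines():
        line = line.strip().rstrip(",")
        if not (line.startswith("{") and line.endswith("}")):
            continue
        fields = [field.strip() for field in line[1:-1].split(",")]
        vendor, device = int(fields[0], 16), int(fields[1], 16)
        entries.append((vendor, device, fields[-1].split("|")))
    return entries


def windows_device_ids(part_numbers):
    """Assumption: the text before the dot is the hex PCI device ID."""
    return {int(part.split(".")[0], 16) for part in part_numbers}


linux_devices = {device for _, device, _ in parse_pci_table(LINUX_TABLE)}
shared = sorted(windows_device_ids(WINDOWS_IDS) & linux_devices)
print([hex(device) for device in shared])
```

Under that assumption, all three Windows entries collapse onto one Linux device ID, which would suggest they are variants of the same device rather than three separate chips.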
 

Ancalagon44

Diamond Member
Feb 17, 2010
3,274
202
106
Those slides aren't quite correct because a full Vega 10 uses a lot more than 225W of power. In fact, only Vega 56 can fit into that envelope under normal conditions as far as I know. Vega 64 will only use 225W if you use the secondary BIOS in power saver mode (I think).

What does ROCm mean anyway?
 

DisEnchantment

Golden Member
Mar 3, 2017
1,777
6,791
136
Those slides aren't quite correct because a full Vega 10 uses a lot more than 225W of power. In fact, only Vega 56 can fit into that envelope under normal conditions as far as I know. Vega 64 will only use 225W if you use the secondary BIOS in power saver mode (I think).

What does ROCm mean anyway?

ROCm is Radeon Open Compute.
AMD has recently open-sourced a lot of stuff in this area. For established players this doesn't mean much.
If you are a newcomer to AI and machine learning, it means everything.

About the power, I think the slide is from 2016 and shows generic targets. So I would not read into it down to the last watt.

Also if you are into SW development you will like the GPU profiler from AMD.
Machine Learning/AI/Compute is converging a lot, and traditional tech companies (like ours) are scrambling to join the action because there is so much of Disruptive tech in play and we are just feeling the heat trying to keep abreast.
 
May 11, 2008
22,565
1,472
126
Very good detailing and eye you have. The two things that my comments center on is :




Also HBCC being a cut down hand me down from the pro-line for handling the on board SSD that RX vega doesn't have :
pro-ssg.jpg


So, you see where my critical comments come from. For RX Vega, until it is further detailed and performance numbers are provided, I consider it TBD marketing hype. IOMMU and DMA async pre-fetch transfers are already hard enough and Nvidia kicks Radeon's behind here. So, for Radeon to come out of nowhere and fix their IOMMU/DMA performance as well as add incredibly new features that are heavily dependent on drivers (where they lack).. Yeah, I'm going to be quite skeptical, and it's warranted.

I have been thinking more about it.
We can safely assume the HBCC is a memory controller, DMA engine and IOMMU.
When looking up IOMMU, the wiki page describes the GART as an IOMMU, and the GART allows a graphics card to expand its limited GPU memory into system memory.
And AMD continued to work on that idea.
https://en.wikipedia.org/wiki/Graphics_address_remapping_table

But I think it does not stop there.
The whole HSA idea of AMD is to turn the GPU into some sort of specialized CPU.
And a CPU is not complete without an MMU, a memory management unit.
https://en.wikipedia.org/wiki/Memory_management_unit
And an MMU can divide the virtual address space into pages with the help of a TLB.
And that sounds very familiar when looking at this slide that I noticed in the post of another forum user:

slides-09.jpg



The HBCC is in my opinion a mixture of an IOMMU, an MMU, a DMA engine and a memory controller.
And this allows AMD to do nifty things. :)
I see your point that the HBCC is most useful for professional workloads, especially on cards with an NVMe SSD on board.
And I think it will be useful for running virtual machines that share the GPU's processing power with proper security measures.
For gaming perhaps not at the moment, but using system memory as a place to evict pages to can be useful once game developers are familiar with these capabilities and use them.
AMD does have a history of thinking up great ways to increase performance for the end user.
I give them the benefit of the doubt, even when it takes a few driver updates and a few extra months before they can release what they presented in slides.
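
The MMU idea described in this post (splitting a virtual address into a page number and an offset, then looking the page up in a table) can be sketched in a few lines. This is a generic illustration of paged address translation, not the HBCC's actual mechanism. The dictionary stands in for a page table, and the function names are hypothetical.

```python
def split_virtual_address(vaddr, page_bytes):
    """Split an address into (virtual page number, offset within the page)."""
    if page_bytes <= 0 or page_bytes & (page_bytes - 1):
        raise ValueError("page size must be a power of two")
    offset_bits = page_bytes.bit_length() - 1
    return vaddr >> offset_bits, vaddr & (page_bytes - 1)


def translate(vaddr, page_table, page_bytes):
    """Map a virtual address to a physical one via a {vpn: frame} table."""
    vpn, offset = split_virtual_address(vaddr, page_bytes)
    if vpn not in page_table:
        # A real MMU would raise a page fault here so the page can be fetched.
        raise LookupError(f"page fault on virtual page {vpn:#x}")
    return page_table[vpn] * page_bytes + offset


# The same address under 4 KiB and 64 KiB pages.
print(split_virtual_address(0x12345, 4 * 1024))
print(split_virtual_address(0x12345, 64 * 1024))
print(hex(translate(0x12345, {0x12: 0x7}, 4 * 1024)))
```

A TLB is essentially a small, fast cache of recent entries from such a table, so most translations skip the full lookup.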
 
May 11, 2008
22,565
1,472
126
I was thinking of what could happen with future Vega and Zen CPU+SoC versions. The current Ryzen CPUs are really SoCs, systems on a chip.
Since PCIe, IF and possibly GMI can share the same pins, and the selection is just a matter of configuring the pins and SerDes through configuration registers in the SoC, we might see a future Zen system that has a 16x PCIe link by default when using older AMD GPUs or Nvidia GPUs, and that can perhaps turn the 16x PCIe link into a GMI or IF link when Vega 20 is detected by the SoC software in the AGESA. But that might have to wait until PCIe v5 is in effect, because of the higher clock frequencies and the new PCIe connectors they require.
https://www.extremetech.com/computi...ns-launch-pcie-5-0-2019-4x-bandwidth-pcie-3-0
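
The "4x bandwidth" in the link above follows directly from the per-lane signalling rates. A small sketch, assuming the published PCIe rates of 8, 16 and 32 GT/s for Gen3, Gen4 and Gen5 with 128b/130b encoding; it ignores protocol overhead and says nothing about what a GMI or IF link over the same pins would achieve.

```python
# Per-lane raw rates in GT/s; Gen3 and later use 128b/130b line encoding.
PCIE_GT_PER_S = {3: 8.0, 4: 16.0, 5: 32.0}


def pcie_gb_per_s(gen, lanes=16):
    """Usable bandwidth per direction in GB/s, before protocol overhead."""
    return PCIE_GT_PER_S[gen] * lanes * (128 / 130) / 8


for gen in sorted(PCIE_GT_PER_S):
    print(f"PCIe {gen}.0 x16: {pcie_gb_per_s(gen):.2f} GB/s per direction")
print(f"Gen5 / Gen3 = {pcie_gb_per_s(5) / pcie_gb_per_s(3):.0f}x")
```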
 

DaQuteness

Senior member
Mar 6, 2008
200
34
86
Probably the Redux driver will allow the NGG path. This is the next big Crimson release. Sometime november or december...

I'll be completely honest and say I haven't carefully gone through every single post in this thread, but there is a good indication that the next iteration of PRO Crimson driver will be around the 13th September.

https://pro.radeon.com/en-us/announcing-radeon-pro-software-crimson-relive-edition-vega-based-radeon-professional-graphics/
Radeon™ Pro Software Crimson ReLive Edition for “Vega”-based Radeon™ Professional Graphics is expected to be available to download with the availability of the Radeon™ Pro WX 9100 and Radeon™ Pro SSG graphics cards.
http://www.amd.com/en-us/press-releases/Pages/radeon-pro-wx-9100-2017jul30.aspx
The Radeon Pro WX 9100 and Radeon Pro SSG graphics cards' planned availability is September 13. The expected MSRP of the Radeon Pro WX 9100 is $2,199 and the expected MSRP of the Radeon Pro SSG is $6,999.

Here's to at least two weeks of restless anticipation... $7k for the SSG... wow :|

Edit: Just realised that's still within RMA period :)) I wouldn't return my FE though, there have been so many screw-ups with this launch it's an almost historical event, imbuing Vega with almost collectible value =)) The card itself though is solid. Even with crap drivers it's still swinging punches at 1080Ti in terms of compute power, and even higher end Quadros.
 