Vega refresh - Expected? How might it look?

It's an old picture. There are newer ones w/ updated components and respective latencies.
Picked the 1st one I found while googling as it conveys the point of wildly differing access latencies for respective components and scales therein. People often throw around components while ignoring the access latencies.

Also, to put this marketing gimmick to rest: all modern video cards have an HBCC equivalent, as it's nothing more than a DMA controller paging memory and communicating back and forth through system memory. The only questions are how it's implemented, how well it performs, and what code/driver support it needs to function. Radeon hasn't provided any of these details, so it's nothing more than fanciful marketing until then.

Hilariously, Nvidia outperforms Radeon cards by a factor of ten in this area of the pipeline. So don't expect some earth-shattering change when the details arrive, along with real-world, non-canned performance numbers.

I'm really getting tired of the Vega b.s and I was 100% on board waiting for its arrival.
The more I look into the technical details about the card the more I understand how little of note these marketed features are. I don't judge AMD CPU division in the same light. They seem to have actually gotten themselves together and fixed such glaring issues in their hardware pipeline.

Interesting that you mention DMA, because that is what I was thinking about too when reading about HBCC in the previous posts, and I was thinking about the IOMMU as well. The DMA controllers do indeed retrieve data in parallel with current execution. Normally it is the game engine or 3D CAD software that must schedule the DMA transfers correctly, so that they actually prefetch data before the execution units in the GPU need it. Otherwise the execution units stall while waiting for data, which nullifies the benefit of DMA in these cases. That is DMA under software control; now to the GPU IOMMU.
The IOMMU translates the device's virtual addresses to physical addresses in addressable system memory (all the memory locations the CPU can address, not just the physically present and available memory).
I think the IOMMU can also perform some caching tasks, but I am not sure about this.
And that is what got me wondering.
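
To make the IOMMU part concrete, here is a toy sketch of device-virtual to physical translation with a small translation cache (real IOMMUs do keep one, usually called an IOTLB). The class name, the 4 KiB page size, and the single flat page table are my own assumptions for illustration; this does not describe how any particular GPU or the HBCC does it.

```python
PAGE_SIZE = 4096  # assumed 4 KiB pages, purely for illustration


class ToyIOMMU:
    """Toy model: maps device-virtual page numbers to physical frame numbers."""

    def __init__(self):
        self.page_table = {}  # device-virtual page -> physical frame
        self.iotlb = {}       # cache of recently used translations
        self.iotlb_hits = 0

    def map_page(self, virt_page, phys_frame):
        """Install a mapping, as a driver or OS would do before starting DMA."""
        self.page_table[virt_page] = phys_frame

    def translate(self, device_addr):
        """Translate a device-virtual address to a physical address."""
        page, offset = divmod(device_addr, PAGE_SIZE)
        if page in self.iotlb:
            # Fast path: the translation is already cached.
            self.iotlb_hits += 1
            frame = self.iotlb[page]
        else:
            # Slow path: walk the page table, then cache the result.
            if page not in self.page_table:
                raise LookupError(f"IOMMU fault: device page {page} is not mapped")
            frame = self.page_table[page]
            self.iotlb[page] = frame
        return frame * PAGE_SIZE + offset
```

The first access to a page takes the slow path and every later access to the same page hits the cache, which is the "cache task" in question here.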

I wonder if HBCC works well in an HSA environment.
The whole zero-copy idea of HSA was that only pointers are passed and no data is copied unnecessarily, saving bandwidth and avoiding high latencies.
On a CPU/GPU combination like an APU with a single memory space, that works fantastically.
On a CPU + GPU system where the GPU is connected over a high-speed serial link (lots of serial lanes in parallel, like PCIe or IF), only pointers are passed over the link, and the GPU can access main system memory over it. But as you mentioned, that is indeed the slowdown factor, and the only way to avoid it is intelligent prefetching.
Normally the game engine or 3D CAD program takes care of this prefetching. (By prefetching I mean here that the data is fetched ahead of actual use, hiding the latency.)
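
A rough way to see why prefetching hides latency is a back-of-the-envelope timing model. The function names and the millisecond figures below are made up for illustration; the model assumes simple double buffering, where the next chunk is transferred while the current one is being processed.

```python
def serial_time(n_chunks, load_ms, compute_ms):
    """No prefetch: each chunk is transferred, then processed, one after another."""
    return n_chunks * (load_ms + compute_ms)


def prefetched_time(n_chunks, load_ms, compute_ms):
    """Double buffering: chunk i+1 is transferred while chunk i is processed."""
    if n_chunks == 0:
        return 0
    # The first transfer cannot be hidden. After that, each step costs the
    # slower of the two overlapped activities. The final compute step has
    # nothing left to overlap with.
    return load_ms + (n_chunks - 1) * max(load_ms, compute_ms) + compute_ms


# Hypothetical numbers: 10 chunks, 4 ms transfer, 6 ms compute per chunk.
print(serial_time(10, 4, 6))      # 100
print(prefetched_time(10, 4, 6))  # 64
```

In this model the transfer cost disappears only while compute time per chunk is at least as long as transfer time. Once the link is the slower side, the execution units stall no matter how well the transfers are scheduled.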

190px-MMU_and_IOMMU.svg.png


720px-HSA-enabled_virtual_memory_with_distinct_graphics_card.svg.png



If I understand correctly, AMD boasts that they can do that prefetching automatically through the driver, even from slower devices such as SSD storage.
I wonder how well that works. The software does have to tell the hardware what to retrieve in advance.
It would be really difficult to track all the behavior of a program to know what is needed.
That would make a GPU driver very complex. :neutral:
And it seems that AMD already has a dire need to expand the GPU software department.

Also, HBCC should extend addressing beyond system memory, which is all a normal IOMMU covers, to data on storage as well (HBCC is designed with 512 TB of addressing range).
So, virtual addressing to system memory and other storage devices. At least, that is how I interpreted it.
Looking at the diagram from this AMD whitepaper, the HBCC replaces the memory controller. Note the lack of an IOMMU. So I assume the HBCC is also doing the IOMMU tasks.

vega.jpg~original





When reading here,
http://www.guru3d.com/articles-pages/amd-radeon-rx-vega-64-8gb-review,31.html

Starting with Vega architecture you have the ability to use a bit of your system memory and assign it to the graphics card. In the drivers (global settings) you will see a HBCC (High Bandwidth Cache Controller) entry, you can assign a part of your system memory that the graphics card can then use as extra cache.
Now as my editor Ian reminded me on, this idea is pretty similar to technologies used for years like Turbocache (Nv) or Hypermemory (ATi) like ten years ago. HBCC it is totally different and yet offer the same.

That sure is interesting.
And that is what you mentioned. :)
Nvidia turbocache and ATI hypermemory.
But AMD claims they can do it better now by making use of mass storage devices as well.

edit:
Forgot to mention that the DMA engine is missing from the picture too.
XDMA is for CrossFire; I assume that the HBCC is doing the DMA and IOMMU tasks in addition to being a memory controller.
 

krumme

Diamond Member
Dude you've talked a lot about how everything already does the "HBCC" thing, but you are missing most of the points.

I think you need to accept/realise the huge CPU/driver overhead advantage with dGPU "directly" accessing virtual memory over a conventional PCIe link (or any link for that matter). HBCC with hardware virtual memory and direct access improves over the conventional methods without even talking about the HSA/hUMA/IF hardware capabilities. The alternative, and what you seem to think is comparable, is a relative mess of CPU run drivers and memory conversions with a massive latency disadvantage.
Set overhead and hardware aside.
In the real world, with real-world developers, is it even economically feasible?
I mean, you need something that has a minimum of transparency for the people working with it, or it's a gigantic mess of complexity.
A point Zlatan often makes is the degree to which it's transparent and accessible to the developer.
 

maddie

Diamond Member
That's the thing.

A few of the negative HBCC crowd states:
1) HBCC is just a renamed DMA that has been in use forever by both parties.
2) You have to do extraordinary coding for it to work well.

I give up. We'll see soon enough.
 

eek2121

Diamond Member
Once they get the money, they need to cut completely new dies for Vega w/o all the extra power-sucking pro features that will either be gimped or are running in the background disabled. There's a reason why Nvidia has so many die cuts for the same micro-architecture and why they use far less power than RX Vega per unit of performance. I can respect Radeon for what they've done on their budget and understand it's quite amazing. I also understand they likely didn't have the budget to cut completely separate dies for consumers. In a way they signaled this by reducing how much they were pushing RX Vega and by how it rolled out. I'm just surprised, with it out now, how much they continue to highlight features that I know are either disabled or going to be gimped. It's here that they are setting up false expectations that will cost them in the long run, because they're setting people up for disappointment. Now, could I be wrong? Yeah... but don't you think they'd confirm it and not leave it up in the air if they intended to keep such features and make them functional? They'd have to have brain-dead marketing not to... So, there it is.

I was hoping for magic but am slowly realizing it's not gonna happen... they have a pro line that costs 4-40x as much to sell, after all.

Vega you see today is literally a cut die made for pro compute workloads w/a GPU pipeline shoved in. Shrinking gate size doesn't fix this. Spinning a custom consumer die w/ out all the pro level features does.

It has nothing to do with 'pro features'. All the extra die space was needed to scale clocks up on 14LPP. AMD themselves stated that much.
 

Krteq

Golden Member
A few of the negative HBCC crowd states:
1) HBCC is just a renamed DMA that has been in use forever by both parties.
2) You have to do extraordinary coding for it to work well.
Both of these claims are terribly wrong
 

ub4ty

Senior member
Very good detailing and eye you have. The two things that my comments center on are:
i assume that the HBCC is doing the DMA and IOMMU tasks aside from being a memory controller.

That would make a gpu driver very complex.
And it seems that AMD already has a dire need to increase the software department for the gpu.

Also HBCC being a cut down hand me down from the pro-line for handling the on board SSD that RX vega doesn't have :
pro-ssg.jpg


So, you see where my critical comments come from. For RX Vega, until it is further detailed and performance numbers are provided, I consider it TBD marketing hype. IOMMU and DMA async prefetch transfers are already hard enough, and Nvidia kicks Radeon's behind here. So, for Radeon to come out of nowhere and fix their IOMMU/DMA performance as well as add incredibly new features that are heavily dependent on drivers (where they lack)... Yeah, I'm going to be quite skeptical, and it's warranted.
 

Krteq

Golden Member
Neither are incorrect. Your lack of follow up of technical detailing to the contrary show the weight of your off hand comment.
AMD clearly states that HBCC is driven internally by driver and there is nothing exposed by driver for any kind of "extraordinary coding" from app/game developers.

Regarding HBCC being a simple DMA engine with some "secret sauce". Yes, It's using DMA (like any other memory interface :rolleyes:), but there is a new paging system and "secret sauce" doing a magic.

Please re-read Vega Whitepaper for more info and stop spreading misinformation.
 

Krteq

Golden Member
Yes, we know how it works, but there are some users thinking that all of this is some kind of "marketing hype" :rolleyes:
 

ub4ty

Senior member
AMD clearly states that HBCC is driven internally by driver and there is nothing exposed by driver for any kind of "extraordinary coding" from app/game developers.
So, either it's garbage, or it's not and requires developer support. You can't handle a diverse range of data flows using automated paging w/o developer support. You can make a limited algorithm that accurately pages some flows and only performs well for the ones you've codified or instantiated in hardware. Clearly your knowledge of caching is limited, so please stop repeating what AMD said as if it's gold or defies the laws of nature.

Regarding HBCC being a simple DMA engine with some "secret sauce". Yes, It's using DMA (like any other memory interface :rolleyes:), but there is a new paging system and "secret sauce" doing a magic.

Please re-read Vega Whitepaper for more info and stop spreading misinformation.
DMA + IOMMU + some caching logic w.r.t. on-demand paging/prefetch hardware/software wasn't invented yesterday. Stop shilling or behaving like marketing speak means anything to engineers who've designed such hardware. Until they prove it performs leaps and bounds better, or detail what exactly is behind their rebranding of industry-standard subsystems, HBCC is nothing more than what everyone calls it.
 

PeterScott

Platinum Member
Yes, we know how it works, but there are some users thinking that all of this is some kind of "marketing hype" :rolleyes:

Not sure what you are rolling your eyes about. HBCC is marketing hype. It's doing what everyone already does. AMD might have tweaked their algorithms slightly, but having faster memory on your card that you move data into across PCIe from main memory is what GPUs have done almost as long as there have been GPUs.

AMD has given a new marketing name, but it does nothing significant for any normal performance case.
 

ub4ty

Senior member
slides-08.jpg

slides-09.jpg

The amount of nonsense spread over Vega on forums lately is absurd. I hope this ends any discussion about HBCC.

One thing a whole swath of the population needs to grasp is that you don't know a single thing about an advanced topic just because you watched a youtube video or are able to link to marketing slides. Nothing in the above pictures is revolutionary. All modern video cards do this.
https://en.wikipedia.org/wiki/Paging

It's been around since the 1960s. Only in 2017 does someone argue ad nauseam about something they know absolutely nothing about and never studied, simply because they have access to marketing slides.

> Memory pages
> Dynamic Paging

OMFG, Radeon is changing everything....
You're not impressing anyone who knows how this stuff works. So, cut it out. If you want to speak definitively on something, please spend some time googling terms in a marketing slide so you don't come off as a brainlet.

So, for one last time : HBCC is nothing worthy of note until proven otherwise.
2ba776f41dc9af86e37858c6b8260293-650-80.jpg

Next topic ! Everyone's a closet Computer Engineering major all of a sudden. The amount of noise coming from the uninformed masses is insufferable sometimes.

Lastly, because people making wild comments about magical HBCC have not even the most basic understanding of it, they manage to forget that the paged in memory has to go somewhere. If your onboard VRAM is fully utilized causing you to need to page in memory, where are you going to put it? Evict things out of VRAM. What if what you just evicted is needed? What if this happens over and over again? https://www.quora.com/What-is-cache-thrashing
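
For anyone who wants to see the thrashing case rather than read about it, here is a minimal sketch. It assumes plain LRU eviction and a made-up looping access pattern; nothing public says what eviction policy HBCC actually uses, so treat `count_page_faults` and the numbers as illustration only.

```python
from collections import OrderedDict


def count_page_faults(accesses, vram_pages):
    """Simulate an LRU-managed pool of `vram_pages` slots and count page faults."""
    resident = OrderedDict()  # page -> True, ordered from least to most recent
    faults = 0
    for page in accesses:
        if page in resident:
            resident.move_to_end(page)  # mark as most recently used
        else:
            faults += 1
            if len(resident) >= vram_pages:
                resident.popitem(last=False)  # evict the least recently used page
            resident[page] = True
    return faults


# Hypothetical workload: loop over a 5-page working set, 100 times.
accesses = list(range(5)) * 100

print(count_page_faults(accesses, vram_pages=5))  # 5   (only the cold misses)
print(count_page_faults(accesses, vram_pages=4))  # 500 (every access faults)
```

One page short of the working set and every single access turns into a transfer over the link. That is the evict, refetch, evict cycle described above.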

SSG PRO cards put the paged-in memory onto NVMe, which is huge, and then the HBCC manages it from there. As this only works well for file-based contiguous pages, it is mostly beneficial for rendering/video editing. This is going to perform like arse in a gaming environment, or no better than what already exists, because what already exists dynamically loads content into video memory for use.

You got duped by marketing and a hand-me-down SSG PRO feature except your RX consumer vega has no NVME so this is literally retarded.
 

Glo.

Diamond Member
http://www.techarp.com/articles/amd-vega-memory-architecture/

  • AMD Vega was specifically architected to handle big datasets, with a heterogenous memory architecture, a wide and flat address space, and a High Bandwidth Cache Controller (see 1:34).
  • Large amounts of DRAM can be used to handle big datasets, but this is not the best solution because DRAM is costly and consumes lots of power (see 2:54).
  • AMD chose to design a heterogenous memory architecture to support various memory technologies like HBM2 and even non-volatile memory (e.g. Radeon Solid State Graphics) (see 4:40 and 8:13).
  • At any given moment, the amount of data processed by the GPU is limited, so it doesn’t make sense to store a large dataset in DRAM. It would be better to cache the data required by the GPU on very fast memory (e.g. HBM2), and intelligently move them according to the GPU’s requirements (see 5:40).
  • The AMD Vega’s heterogenous memory architecture allows for easy integration of future memory technologies like storage-class memory (flash memory that can be accessed in bytes, instead of blocks) (see 8:13).
  • The AMD Vega has a 64-bit flat address space for its shaders (see 12:08, 12:36 and 18:21), but like NVIDIA, AMD is (very likely) limiting the addressable memory to 49-bits, giving it 512 TB of addressable memory.
  • AMD Vega has full access to the CPU’s 48-bit address space, with additional bits beyond that used to handle its own internal memory, storage and registers (see 12:16). This ties back to the High Bandwidth Cache Controller and heterogenous memory architecture, which allows the use of different memory and storage types.
  • Game developers currently try to manage data and memory usage, often extremely conservatively to support graphics cards with limited amounts of graphics memory (see 16:29).
  • With the introduction of AMD Vega, AMD wants game developers to leave data and memory management to the GPU. Its High Bandwidth Cache Controller and heterogenous memory system will automatically handle it for them (see 17:19).
  • The memory architectural advantages of AMD Vega will initially have little impact on gaming performance (due to the current conservative approach of game developers). This will change when developers hand over data and memory management to the GPU. (see 24:42).
  • The improved memory architecture in AMD Vega will mainly benefit AI applications (e.g. deep machine learning) with their large datasets (see 24:52).
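
The address-space figures in the list above are easy to sanity-check. A quick sketch, assuming byte-addressable memory and binary units (so "512 TB" is read as 512 TiB); the helper name is mine:

```python
GIB = 2**30
TIB = 2**40


def addressable_bytes(bits):
    """Number of distinct byte addresses reachable with `bits` address bits."""
    return 2**bits


print(addressable_bytes(49) // TIB)  # 512 -> the 49-bit, 512 TB figure
print(addressable_bytes(48) // TIB)  # 256 -> the CPU's 48-bit address space
print(addressable_bytes(32) // GIB)  # 4   -> a 32-bit space tops out at 4 GiB
```

So the 49th bit doubles the CPU's 256 TiB space, which matches the article's description of additional bits being used for the GPU's own memory, storage and registers.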
 

ub4ty

Senior member
http://www.techarp.com/articles/amd-vega-memory-architecture/

Heterogeneous memory architecture = marketing speak.
All modern computing systems and GPUs have a heterogeneous memory architecture in that they have multiple levels of memory interconnected by various levels of caching/paging. Radeon slapped NVMe solid-state storage on their PRO Vega and added another layer of memory. NVMe access times are dog **** compared to HBM/DDR. This works excellently for video editing and graphics rendering, where you have large asset files. All this new memory element does is cut down on PCIe transfers by allowing a more significant block of assets to be stored on the GPU.

RX Vegas don't have NVMe storage on the GPU. Thus, in the case of RX Vega (consumer cards), HBCC literally does nothing beyond DMA/IOMMU dynamic paging, just like any other GPU, including their current line of pre-Vega GPUs.

  • Large amounts of DRAM can be used to handle big datasets, but this is not the best solution because DRAM is costly and consumes lots of power (see 2:54).
Thus why you dynamically page in data across PCIE into GPU memory which is what every modern video card does. HBCC on RX Vega can't do anything beyond this because it has no place to put the paged in memory besides into the traditional GPU pipeline memory. On PRO SSG, it can put it into NVME storage and what it can put there are typically large and contiguous asset files. Video games don't have the same access patterns thus why this is overhyped FUD.

Radeon slapped NVME on their PRO line GPUs for large asset storage. They then added extra logic to a traditional DMA/IOMMU to handle another hierarchy of memory. Marketing then used a non standard name to wow brainlets and you get the second coming : HBCC. RX Vega has no NVME and thus you get extra transistors in a memory subsystem that just burn energy doing nothing. Welcome to one of many RX Vega power leeches.

  • At any given moment, the amount of data processed by the GPU is limited, so it doesn’t make sense to store a large dataset in DRAM. It would be better to cache the data required by the GPU on very fast memory (e.g. HBM2), and intelligently move them according to the GPU’s requirements (see 5:40).
Make up your mind, Radeon. L1->L2->L3->HBM2->NVME->SYS memory
NVMe is a new memory hierarchy element. All Nvidia cards have:
L1->L2->L3->GDDR5/GDDR5X->SYS Memory


  • The AMD Vega’s heterogenous memory architecture allows for easy integration of future memory technologies like storage-class memory (flash memory that can be accessed in bytes, instead of blocks) (see 8:13).
Literal filler

  • The AMD Vega has a 64-bit flat address space for its shaders (see 12:08, 12:36 and 18:21), but like NVIDIA, AMD is (very likely) limiting the addressable memory to 49-bits, giving it 512 TB of addressable memory.
Computer architecture 101.

  • AMD Vega has full access to the CPU’s 48-bit address space, with additional bits beyond that used to handle its own internal memory, storage and registers (see 12:16). This ties back to the High Bandwidth Cache Controller and heterogenous memory architecture, which allows the use of different memory and storage types.
This is how all computing hardware works :
64-bit+Logical+Address+Space.jpg


  • Game developers currently try to manage data and memory usage, often extremely conservatively to support graphics cards with limited amounts of graphics memory (see 16:29).
Game developers have different modes w/ different memory allocations and different access patterns for various GPUs. This wont change.

Proprietary nightmare that will likely perform like arse for some games. Not gonna happen.

  • The memory architectural advantages of AMD Vega will initially have little impact on gaming performance (due to the current conservative approach of game developers). This will change when developers hand over data and memory management to the GPU. (see 24:42).
  • The improved memory architecture in AMD Vega will mainly benefit AI applications (e.g. deep machine learning) with their large datasets (see 24:52).

LOL, given how much b.s is packed into their detailing of these features, how little they've disclosed to technically competent individuals who do this for a living, and how most of these features are disabled or have support software/drivers in beta stages, this is not going to happen. Nvidia dominates GPU compute because their software is mature enough to handle it, and because they speak to their target markets w/o fluff and get to hard #'s and industry-standard capabilities. Vega is halfway between here and there and isn't going to benefit anyone until the marketing and business group learns how to let engineers speak about their product and speak to the technically competent individuals who would likely buy it. Spreading these amateur-hour slides around isn't going to get someone who knows GPU architectures inside and out to buy your cards, especially when they know what deficiencies your micro-architecture has and how much of a catch-up game you're playing.

Learn how to be straight with people and speak to your target market like you respect them and maybe you might make some sales.

I am making an eval decision by the end of the night.
I will be buying one unit of Vega understanding that driver/software support are arse and understanding that all of these hyped up features will likely under deliver, be gimped, or outright disabled. If this occurs, I am never touching another Radeon card again. They want to hype, so lets see where the hype takes things... If they disappoint me, I'm done w/ them.

As a side note, my main development occurs on Nvidia cards. I bought more when the RX Vega 64 pricing fiasco hit. I also demoted my evaluation basis for Vega to one unit. So, they've got one try to deliver on their hype. Above was a deconstruction of the hype... So, I am 100% aware of what I'm getting: a basic GPU whose driver/software stack hasn't been figured out.
 

Krteq

Golden Member
One thing a whole swath ...
You are kidding me right? Calm down, take a long breath... and return back to studies.

Yes. paging system is used by every memory systems since ages ago, same as DMA. What you are missing is a link between implementation in previous/current GPUs and a new system introduced in Vega.

As I told you before, go back to Vega whitepaper and find a "New memory hierarchy and the High-Bandwidth Cache Controller" section.

Regarding that GTC slide. Yes, GP100 (only) is also capable of 49-bit addressing, but with a far worse paging granularity. Also, it's only exposed via last version of CUDA, you can't use it outside CUDA environment.


//OMG, you are mixing apples and bananas in that last post *facepalm*
 

IllogicalGlory

Senior member
Interestingly, while 4GB of HBM was definitely a downside to Fiji, maybe the talk about HBM acting like cache and not being subject to traditional limitations may have been somewhere close to the mark.

Regardless, HBCC has never been marketed as providing a huge increase in performance, (I remember a slide claiming up to 7% improvement in games, most likely on special case, can't find it now) but rather being useful in situations when there is too little available VRAM. The two demos they gave showed its benefits when the chip was limited to 2GB of VRAM, where it had a dramatic effect. I don't think that 2GB NVIDIA cards or older AMD cards are capable of handling Deus Ex and Tomb Raider at 4K. I don't see how it can be concluded that there's nothing special about the solution when it provides results we definitely haven't seen before. Whether this is a problem that needed solving for consumer/gaming uses is quite a different question, and the answer is probably "no". Might've been useful on Fiji, though.
 

ub4ty

Senior member
You are kidding me right? Calm down, take a long breath... and return back to studies.
Let's see what backs up your unsubstantiated comment, shall we...

Yes. paging system is used by every memory systems since ages ago, same as DMA.
K, so I am correct .. everyone's on the same basic page and one needs to clearly define what exactly they're doing different, why it changes everything, and how it performs better.... That hasn't occurred. So, it's just DMA. Extending addressing means donkey snot when you have nowhere to put it efficiently. Am I wrong? PRO SSGs have NVMes.. RX vega doesn't. So what's the plan here?

What you are missing is a link between implementation in previous/current GPUs and a new system introduced in Vega.
As I told you before, go back to Vega whitepaper and find a "New memory hierarchy and the High-Bandwidth Cache Controller" section.
I'm not missing it. I detailed it. PRO SSG has NVME. That's the new element in the memory hierarchy. HBCC = new handler for NVME. Explain what in the world HBCC is w/o NVME. I'll wait. I've read the white paper. I know what I'm looking for. Now, explain to me what's new when you take NVME out of the picture? which RX vega doesn't have.

Regarding that GTC slide. Yes, GP100 (only) is also capable of 49-bit addressing
Ah', the magic wears off when you begin getting into technical details. So, Nvidia has had this tech since GP100? Consumer SM has 32-bit addressing and thus far less power utilization. 32-bit = 4GB. Hmmm, mfw when your GPU has 8GB, that seems reasonable.... 512TB addressable and only 8GB GPU. Seems out of scale no? How much RAM do people usually have? How much is even supported on modern CPUs? 64GB/128GB? 49bit addressing ...
> mfw your compute doesn't even support the amount of RAM.
But ah, this makes perfect sense for the server market, which has that kind of addressable RAM. So, apples to apples, we're talking about comparing PRO SSG to Nvidia's high-end server cards, not RX Vega w/ no NVMe and no mobo that supports 48 bits' worth of addressable RAM. Dat hand-me-down tech from pro cards that is a fish out of water.

but with a far worse paging granularity.
Compared to what? a non existent detailed HBCC? Is there something you're not telling me about what you know friendo...? This can't wait until September so come out w/ it.

Also, it's only exposed via last version of CUDA, you can't use it outside CUDA environment.

Is Radeon going to expose FP16 and HBCC to compute? Or are they going to pull a feature disparity game? Tell Radeon's marketing/business group to take the handcuffs off its engineers and products and get an answer out. You don't have much time to clarify. Also, creative fingers will be dialing in whatever is gimped. So, let's not make this more difficult than it has to be.

//OMG, you are mixing apples and bananas in that last post *facepalm*
Yeah yeah yeah...
 

evilr00t

Member
Regarding that GTC slide. Yes, GP100 (only) is also capable of 49-bit addressing, but with a far worse paging granularity.
This seems to be a Pascal (CC6.0+) feature, and doesn't seem restricted to GP100.
Citation needed on paging granularity, and why this matters.

Here's a dump of a modified deviceQuery on my GP107 card:
Code:
Device 2: "GeForce GTX 1050 Ti"
  CUDA Driver Version / Runtime Version          8.0 / 8.0
  CUDA Capability Major/Minor version number:    6.1
  Device supports Unified Addressing (UVA):      Yes
  Device has concurrent managed memory access:   Enabled
Concurrent Managed Memory access - the pageable memory feature you're looking for.
"The device attribute concurrentManagedAccess tells whether the GPU supports hardware page migration and the concurrent access functionality it enables. A value of 1 indicates support. At this time it is only supported on Pascal and newer GPUs running on 64-bit Linux."

Also, it's only exposed via last version of CUDA, you can't use it outside CUDA environment.
afaik HBCC's implementation allows Windows to oversubscribe memory ie. pretend to have more vram on board. This might be a good idea if there were a NVMe SSD (ie. something faster than PCIe-linked host memory) linked to the HBCC controller, but there isn't on this product...

As I see it, HBCC-oversubscribed memory, without a NVMe backing store, is a hardware solution to something that has already been solved in software--in a superior fashion--through developer management of separate device and host memory pools.

Now if we start seeing Vega cards with NVMe slots, HBCC becomes much more interesting.
 

ub4ty

Senior member
afaik HBCC's implementation allows Windows to oversubscribe memory ie. pretend to have more vram on board. This might be a good idea if there were a NVMe SSD (ie. something faster than PCIe-linked host memory) linked to the HBCC controller, but there isn't on this product...

As I see it, HBCC-oversubscribed memory, without a NVMe backing store, is a hardware solution to something that has already been solved in software--in a superior fashion--through developer management of separate device and host memory pools.

Now if we start seeing Vega cards with NVMe slots, HBCC becomes much more interesting.

Caveat being if AMD opens up about this and allows developer access to see what the community can come up with. But they're being tight lipped and mum and they already have cards moving off the shelves.

This is going to be some very serious fine wine...
kek
 

Sonikku

Lifer
Once everything has been mined, some years down the road, I fear we could be looking at a de facto Nvidia monopoly with the way things have been going with AMD.
 

krumme

Diamond Member
Once everything has been mined some years down the road I fear we could be looking at a defacto Nvidia monopoly with the way things have been going with AMD.
Ahh, good grief. Drama is fine to a certain degree, but AMD's stock price is 300% of what it was a year ago, and there are good reasons for it. Their product portfolio and production capacity are in good shape.
 

krumme

Diamond Member
It will be the consoles and APUs putting Vega tech to market. No matter the paging granularity or whatever, I fail to see the benefit of HBCC here.
 

Cookie Monster

Diamond Member
Ahh good grief. Drama is fine for a certain degree but AMD stock price is 300% of what is was a year ago and there is good reasons for it. Their product portfolio and production capacity is in good shape.

And that's probably due to Ryzen and, more importantly, the EPYC platform that will shake Intel's stranglehold on the lucrative server market.