Question Speculation: RDNA2 + CDNA Architectures thread

uzzi38 · Apr 28, 2020

All die sizes are within 5mm^2. The poster here has been right on some things in the past afaik, and to his credit was the first to saying 505mm^2 for Navi21, which other people have backed up. Even still though, take the following with a pich of salt.

Navi21 - 505mm^2

Navi22 - 340mm^2

Navi23 - 240mm^2

Source is the following post: https://www.ptt.cc/bbs/PC_Shopping/M.1588075782.A.C1E.html

jamescox · Sep 12, 2020

Olikan said:
I mean, the pieces of the puzzle kinda makes sense...
1- N21 have 16 slices of L2, this means 256 or 512 memory width... 512 seems very unlike, also we have that photo leak with256-bit

2- a small density increase, and this huge cache can fit with 80Cu... also, Ray Tracing loves huge caches

3- N21 support xGMI, we were all puzzled why...

If they are calling it Infinity cache, then that may imply that it is a separate cache chip of some kind rather than on die. On die would obviously provide the highest bandwidth. Embedded DRAM seems kind of unlikely. It doesn’t seem like it would be stacked although AMD X3D slides show memory stacked on top of GPUs. I would think such stacked memory would be HBM coming much later. I am currently thinking it may be a separate chip of some kind connected in the same package using infinity fabric.

If it is an SRAM cache chip, then that could be low enough latency to act as cpu cache chip also. Perhaps they could reuse such a thing in Epyc processors. They could replace some of the cpu die with cache chips perhaps. It doesn’t seem like it would be that high of bandwidth for a gpu though. Perhaps they could use more or wider links on the gpu.

xilli_fiberbit said:
VEGA 20 has too xGMI feature so with a specific connector for crossfire, on NAVI 1x it was not needed for crossfire

The name “infinity cache“, if that is actually what they are calling it, would seem to imply that it is a separate infinity fabric connected cache chip. Some of AMD’s infinity architecture slides show up to 8 connected GPUs. The slides that I have seen though show only 6 links coming off each gpu. That is CDNA GPUs rather than RDNA GPUs, but the die area taken for infinity fabric connections is quite small. The IFOP style connections (very short distance on package) are probably even smaller than the IFIS connections (off package across system board of gpu board). It really isn’t that high of bandwidth compared to graphics memory though, so it will be interesting to see what they have done. Giant caches would probably be much more useful for ray tracing than rasterization, so perhaps it is an infinity fabric cache specifically for ray tracing. They may be able to make good use of it for rasterization also since AMD GPUs have a cpu like virtual memory management system. They have been using HBM as cache for a while.

The stuff with ray ray tracing is kind of bogus though. Once ray tracing is actually commonly used, the first and current generation GPUs will be terrible at it. It seems like they are basically ray tracing at low res and using AI to patch it up. Ray tracing is essentially the opposite memory access patterns from rasterization. It is much more latency sensitive than bandwidth sensitive. Bandwidth can be scaled with some cost, but latency reduction is much more difficult. Perhaps AMD will have significantly better ray tracing than Nvidia. The image quality on AMD May be better in that case. Rasterization can only do so much be for equality improvements are negligible.

Both Intel and Nvidia have basically been doing the same thing for years. Nvidia just makes massive GPUs. They spent a lot on DX11 optimizations and software scheduling that kind of goes away with DX12. Intel was just pushing 4 cores forever since it maintained ridiculous prices for Xeons with more cores for those that needed them. With the die sizes at the time, we could have had 8 core mainstream at 20 nm. AMD did something entirely different with Zen to compete with Intel and they did more than compete. I wouldn’t even consider buying an Intel cpu right now. They weren’t going to compete with Nvidia by just building a massive gpu to fight Nvidia’s massive GPU. They had to actually innovate, which seems to not be happening as much at Intel or Nvidia. I expect an oustide the box solution from AMD. It may work out or it may not. We will just have to wait and see.

edit: oops, accidentally left an earlier reply...

Olikan · Sep 12, 2020

Just a reminder... BIG PAGE... I posted this some time ago, and i thought it was color compression

set BIG_PAGE on GFX 10.3
"It reduces traffic between CB, DB and TCP blocks if buffers respect a certain alignment."

radv: set BIG_PAGE to improve performance on GFX10.3

Zstream · Sep 12, 2020

So does anyone have info on their two, or three monitor drivers being resolved on this spin? I had to sell my 480 due to monitor sleep bug, and would like to upgrade my 2060 by end of year.

blckgrffn · Sep 12, 2020

JujuFish said:
I've thought about it, but moving up from 1440p 144hz to something better is a pretty hefty investment at the same time as a GPU upgrade.

HDR is a killer feature that you deserve.

SpaceBeer · Sep 13, 2020

Zstream said:
So does anyone have info on their two, or three monitor drivers being resolved on this spin? I had to sell my 480 due to monitor sleep bug, and would like to upgrade my 2060 by end of year.

I don't know what bug you are talking about, but I had 3 monitor setup for some time, with R7 260X and RX 570, and no problems

A/// · Sep 13, 2020

No one said the cache rumor implied it'll be on the die itself. @jamescox has the right idea. A pure cache chip connected with IF is dead cheap. How it'll be used... dunno.

Zstream · Sep 13, 2020

SpaceBeer said:
I don't know what bug you are talking about, but I had 3 monitor setup for some time, with R7 260X and RX 570, and no problems

It was where the main monitor would not turn on after hibernate or sleep. You had to either disconnect the display cord or power off the machine.

Trust me, I’m not trying to disparage AMD. However this bug was the only issue and I’ve dealt with it for years. I was so bothered by it one day I went to micro center and bought a way overpriced 2060 and haven’t had the problem since.

eek2121 · Sep 13, 2020

A/// said:
Actually I would have predicted Zen. You're either really young or need to brush up on your AMD history. Historically, AMD had one-upped Intel at every corner and whooped their butt for years. They brought the 64bit extension to 32bit, and killed Intel's IA-64 goals dead in the water. At one point, AMD was in most datacenters. Intel's major save from the Pentium 4 era was their Core 2 Duo. I won't even go into the hot mess that Pentium D was, emphasis on HOT MESS. AMD has always been an innovater. That is a fact.

However, to expect a 50% increase in performance over RDNA... that's dreaming. I'm sure it'll be good, but the environment that NVidia have developed over time isn't there. This isn't really an AMD fault either. If you're old enough to have owned ATI cards, you'd know they were terrible in software, too.

I knew about Zen long before it came out. If you paid attention to the tech press, you would have, too. Maybe you did, I don't know. I was impressed with Zen. I was impressive with Zen+. I was and still am impressed by Zen2. I know Zen 3 is going to be very good. However, my big question is not whether TSMC can provide in the future. They have to or they lose their major customers. I'm wondering if AMD can continue delivering excellent performance gains per generation.

I don't have a lot of confidence in Intel's rumored 11th gen desktop, Rocket Lake. I have almost zero confidence in Alderlake. I'm sure they'll deliver it, eventually, but as for it being a beast? Heck no.

AMD has publicly stated they are targeting a 50% improvement in efficiency. The console parts show they’ve hit that goal. Why is that so hard to believe? You should see what they were able to do with Vega in Renior.

Midwayman · Sep 13, 2020

eek2121 said:
AMD has publicly stated they are targeting a 50% improvement in efficiency. The console parts show they’ve hit that goal. Why is that so hard to believe? You should see what they were able to do with Vega in Renior.

Efficiency doesn't always translate into performance. I remember them talking up performance per watt with Polaris and then it couldn't hit the clocks needed. The consoles are looking really good though and its on a pretty small die. Hopefully it will scale.

Timmah! · Sep 13, 2020

A/// said:
No one said the cache rumor implied it'll be on the die itself. @jamescox has the right idea. A pure cache chip connected with IF is dead cheap. How it'll be used... dunno.

I honestly doubt you can have cache on separate die. That is IMO similar nonsense as that rumored ray-traversal coprocessor.

Glo. · Sep 13, 2020

Midwayman said:
Efficiency doesn't always translate into performance. I remember them talking up performance per watt with Polaris and then it couldn't hit the clocks needed. The consoles are looking really good though and its on a pretty small die. Hopefully it will scale.

This RTG team, the AMD one also talked about RDNA1 performance/watt improvements over previous generations.

And they pretty much said the truth.

uzzi38 · Sep 13, 2020

Midwayman said:
I remember them talking up performance per watt with Polaris and then it couldn't hit the clocks needed.

Nvidia did a great job following in AMD's footsteps by doing the same thing with Ampere just now. Muh 1.9x perf/W improvement.

Anyway AMD's methodology for perf/W is at equal power consumption, so fret not.

A/// · Sep 14, 2020

Timmah! said:
I honestly doubt you can have cache on separate die. That is IMO similar nonsense as that rumored ray-traversal coprocessor.

I'd like to take credit for that theory, but an Anandtech Editor said it's plausible. It explains the anemicness of the leak from last week but also doesn't make sense because such a large cache would have to be fast and be able to constantly flush and update due to how dynamic games are compared to say a static or fixed motion render.

The Traversal coprocessor was BS due to where it was suggested it would be. Additionally, the same person suggested the VRMs and the memory chips would be on a second PCB connected to the main PCB. This is plain wrong on so many levels yet people fell for it. The Game Cache Andrei mentions as a chiplet is also suspect. I don't think it's been done on a B2C product before, but I could be very wrong here. I'm more interested in how AMD are making up for an otherwise anemic card per fairly accurate leaks (accuracy based on prior rumors). If anyone can get Andrei to hop in this thread and clarify, that would be neat. I'd love to learn something new because I didn't think this was possible. I don't have a reddit and refuse to make one.

I should note that the thread was based on a video titled "RDNA 2 Is Monstrous | Insane Cache System & Performance Info - EXCLUSIVE" which is by "R3d G@m!ng T3ch, I've used alternative symbols to prevent SEO improvement. I don't find his videos viable in the least, and I'm skeptical of his so-called sources which may very well be his son's teddy bear collection for all we know. For the moment the idea of using a large cache only chip is right up AMD's alley. I don't believe any of these Youtube or blog charlatans. We have nothing to go on at the moment other than an anemic engineering board that was leaked last week. It's fun to discuss BS.

Andrei Frumusanu said:
It infers to a 128MB cache chiplet connected to the GPU. This chiplet would probably also see use in Zen3 CPUs.

Andrei Frumusanu said:
There's no reason to believe they're going with organic substrates for this or Milan this year, or be limited to just 32b buses. Making use of those TSMC packaging and a simple RDL enables way more than what we've seen on Zen2.
128MB cache added to an actual active chip is expensive as hell, as a standalone chiplet with just cache and nothing else, which could be on cheaper node, it would be cheap as hell.

Olikan · Sep 14, 2020

A/// said:
I don't find his videos viable in the least, and I'm skeptical of his so-called sources which may very well be his son's teddy bear collection for all we know.

I do, and It's probably AMD marketing the leakers, that's the game it's played today... a non-official, hype machine.

After he hype, near product reveal, he will correct he info "oh, it was total sram on die... infinity cache is a improved HBCC" (that had 20MB)

A/// · Sep 14, 2020

Olikan said:
I do, and It's probably AMD marketing the leakers, that's the game it's played today... a non-official, hype machine.

After he hype, near product reveal, he will correct he info "oh, it was total sram on die... infinity cache is a improved HBCC" (that had 20MB)

You're defending him and then agreeing with me. Which is it?

jamescox · Sep 14, 2020

A/// said:
I'd like to take credit for that theory, but an Anandtech Editor said it's plausible. It explains the anemicness of the leak from last week but also doesn't make sense because such a large cache would have to be fast and be able to constantly flush and update due to how dynamic games are compared to say a static or fixed motion render.

The Traversal coprocessor was BS due to where it was suggested it would be. Additionally, the same person suggested the VRMs and the memory chips would be on a second PCB connected to the main PCB. This is plain wrong on so many levels yet people fell for it. The Game Cache Andrei mentions as a chiplet is also suspect. I don't think it's been done on a B2C product before, but I could be very wrong here. I'm more interested in how AMD are making up for an otherwise anemic card per fairly accurate leaks (accuracy based on prior rumors). If anyone can get Andrei to hop in this thread and clarify, that would be neat. I'd love to learn something new because I didn't think this was possible. I don't have a reddit and refuse to make one.

I should note that the thread was based on a video titled "RDNA 2 Is Monstrous | Insane Cache System & Performance Info - EXCLUSIVE" which is by "R3d G@m!ng T3ch, I've used alternative symbols to prevent SEO improvement. I don't find his videos viable in the least, and I'm skeptical of his so-called sources which may very well be his son's teddy bear collection for all we know. For the moment the idea of using a large cache only chip is right up AMD's alley. I don't believe any of these Youtube or blog charlatans. We have nothing to go on at the moment other than an anemic engineering board that was leaked last week. It's fun to discuss BS.

I don’t see why they wouldn’t be able to make a cache as a separate die. It wold not be super high bandwidth with a single IF link. They could achieve relatively high bandwidth with IF but it would take a lot of links unless they have pci-e 5.0 speeds. Such a cache chip operating over one IF link could provide much lower latency than DRAM, which would probably be very useful for ray tracing. I don’t know how it would be used for just rasterization though. If their top chip is using a 256-bit GDDR6 bus, then they are probably going to need something to make up for that. I don’t know if they could make the cache chip at Global Foundries; perhaps new 12LP+ would be sufficient. That would probably be cheaper than 7 nm and avoid 7 nm wafer production constraints. Using a separate cache chip isn’t a new idea. IBM Power 5 from 2004 was an MCM with 4 cpu die and 4 cache chips.

A/// · Sep 14, 2020

jamescox said:
I don’t see why they wouldn’t be able to make a cache as a separate die. It wold not be super high bandwidth with a single IF link. They could achieve relatively high bandwidth with IF but it would take a lot of links unless they have pci-e 5.0 speeds. Such a cache chip operating over one IF link could provide much lower latency than DRAM, which would probably be very useful for ray tracing. I don’t know how it would be used for just rasterization though. If their top chip is using a 256-bit GDDR6 bus, then they are probably going to need something to make up for that. I don’t know if they could make the cache chip at Global Foundries; perhaps new 12LP+ would be sufficient. That would probably be cheaper than 7 nm and avoid 7 nm wafer production constraints. Using a separate cache chip isn’t a new idea. IBM Power 5 from 2004 was an MCM with 4 cpu die and 4 cache chips.

The quirky part of all this discussion is that it revolves around an image and a tweet about the bus. And, AFAIK, the latter is due to the former. And to be frank, we don't know how old that image even is. Do we?

All I can tell is the mobo looks like an Asus X570 Prime Pro. What doesn't add up is:

1. If AIB partners have yet to get any information on these GPUs, then how does an AIB partner have an engineering sample already?
2. AIB cards are due in 2021, not near launch. Again, why would an AIB partner have a card at this point, days after sites reporting that partners have been left in the dark?
3. If this isn't from an AIB partner, then where is it from? AMD themselves? Why would they leak something?

Olikan · Sep 14, 2020

A/// said:
You're defending him and then agreeing with me. Which is it?

He nailed Radeon7 and consoles specs... so, he do have legit sources, just this time, imo, there is a "jebait".... and maybe even a planned one

Gideon · Sep 14, 2020

This channel is surprisingly informative compared to most other youtubers (with less speculation and more architecture facts):

His RDNA2 and Ampere architecture videos are really good as well.

Helis4life · Sep 14, 2020

A/// said:
The quirky part of all this discussion is that it revolves around an image and a tweet about the bus. And, AFAIK, the latter is due to the former. And to be frank, we don't know how old that image even is. Do we?

All I can tell is the mobo looks like an Asus X570 Prime Pro. What doesn't add up is:

1. If AIB partners have yet to get any information on these GPUs, then how does an AIB partner have an engineering sample already?
2. AIB cards are due in 2021, not near launch. Again, why would an AIB partner have a card at this point, days after sites reporting that partners have been left in the dark?
3. If this isn't from an AIB partner, then where is it from? AMD themselves? Why would they leak something?

The image would be from a amd worker, they said they were working on a vrs bug. An aib wouldnt have a board like that i wouldnt have thought.

Also whats to say that the board itself isnt just from a pcb template, and that it actually has the 8 memory dies connected to it. The could be unpopulated right? Its not without precedent, 2080ti boards did it with one chip, 3080 boards do it with 2. could it not potentially be a chip with 16gb samsung hbm2?

leoneazzurro · Sep 14, 2020

Helis4life said:
The image would be from a amd worker, they said they were working on a vrs bug. An aib wouldnt have a board like that i wouldnt have thought.

Also whats to say that the board itself isnt just from a pcb template, and that it actually has the 8 memory dies connected to it. The could be unpopulated right? Its not without precedent, 2080ti boards did it with one chip, 3080 boards do it with 2. could it not potentially be a chip with 16gb samsung hbm2?

It could be everything, even a full Navi21 with a castrated bus because of testing purposes, as in RDNA the bus is decoupled from the RBEs, as demonstrated with the 5600XT.

A/// · Sep 14, 2020

Helis4life said:
he image would be from a amd worker

Source?

Helis4life · Sep 14, 2020

A/// said:
Source?

Somewhere the picture came with a description in Chinese i guess that stated it was big Navi, the worker was working on a vrs bug, they couldn't run benches as the were using Linux/android. I'll see if I can find it

Helis4life · Sep 14, 2020

Konan said:
Well the 505mm2 and the only source of die sizes came from here (he works for ASE Group)
Twitter KatCorgi questions the size too

https://twitter.com/x/status/1303986125225095169

And regarding the original source of the that Engineering Sample and comments:
View attachment 29611

Specifically calls it Big Navi (say he fixing the VRS bug of big Navi / He can't run 3Dmark because working on Android/ Linux system)
Wouldn't have called it if Big Navi if it was N22...
So to answer your question, no not really

This has the original image with the description

Stuka87 · Sep 14, 2020

Helis4life said:
The image would be from a amd worker, they said they were working on a vrs bug. An aib wouldnt have a board like that i wouldnt have thought.

Also whats to say that the board itself isnt just from a pcb template, and that it actually has the 8 memory dies connected to it. The could be unpopulated right? Its not without precedent, 2080ti boards did it with one chip, 3080 boards do it with 2. could it not potentially be a chip with 16gb samsung hbm2?

IF AMD is planning on a later October unveiling, meaning a November release, AIB's already have their hands on boards, and production is starting to ramp up. They may lack drivers or firmware, but hardware is in hand.

Question Speculation: RDNA2 + CDNA Architectures thread

Platinum Member

Senior member

Platinum Member

Diamond Member

Diamond Member

Senior member

Diamond Member

Diamond Member

Diamond Member

Diamond Member

Golden Member

Diamond Member

Platinum Member

Diamond Member

Platinum Member

Diamond Member

Senior member

Diamond Member

Platinum Member

Platinum Member

Member

Golden Member

Diamond Member

Member

Member

Diamond Member