Question AMD Rembrandt/Zen 3+ APU Speculation and Discussion

izaic3 · Apr 2, 2021

Alright, so we've had some leaks so far. I don't know if any of it's been confirmed yet, as it's pretty early, but here is what I've surmised so far (massive grain of salt of course):

If it turns out to have RDNA 2 and 12 CUs, I could see iGPU performance potentially almost doubling over Cezanne.

If I've made any mistakes or gotten anything wrong, please let me know. I'd also love to hear more knowledgeable people weigh in on their expectations.

majord · Oct 17, 2021

LightningZ71 said:
What's incredible about it? With a cache hit rate of 50%, the effective bandwidth of that arrangement is over 500GB/sec. With 500GB/sec of bandwidth, you had better have impressive performance! And, at less than 1/5th of that bandwidth, you're not getting too far too fast.

The 50% hit rate is only at 1080p. I'm talking more about, say, 4K, where the hit rate is as low as 25%, and while performance takes a noticeable hit, it's still pulling good numbers, bearing in mind that 25% is just a theoretical marketing number. So yes, while from a product point of view people are disappointed in the small amount of Infinity Cache, and thus the relatively "poor" 4K performance, from a technical point of view I still find it impressive. 32MB is not a lot of cache after all.
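For anyone who wants to play with the hit-rate argument, here is a minimal sketch of the usual back-of-the-envelope model: if a fraction of memory requests is served from cache, only the misses reach DRAM, so the DRAM bandwidth is stretched by 1 / (1 - hit rate). The function name and the 100 GB/s figure are made up for illustration, and the model ignores the cache's own bandwidth limit and any latency effects.

```python
def effective_bandwidth(dram_gbps: float, hit_rate: float) -> float:
    """Simplified model (an assumption, not AMD's method): only cache
    misses reach DRAM, so DRAM bandwidth stretches by 1 / (1 - hit_rate)."""
    if not 0.0 <= hit_rate < 1.0:
        raise ValueError("hit_rate must be in [0, 1)")
    return dram_gbps / (1.0 - hit_rate)


# Illustrative DRAM bandwidth only, not a measured figure for any product.
dram = 100.0
for hit_rate in (0.50, 0.25):
    print(f"hit rate {hit_rate:.0%}: ~{effective_bandwidth(dram, hit_rate):.0f} GB/s effective")
```

Under this model, dropping from a 50% to a 25% hit rate shrinks the multiplier from 2x to about 1.33x, which is why the 4K numbers take a hit without falling off a cliff.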

AMD Radeon RX 6600 Review - Great for 1080p Gaming

AMD's Radeon RX 6600 is an excellent choice for 1080p Full HD gaming. As our review shows, the red team's latest release has enough power to achieve 60 FPS at maximum details, yet doesn't require much power to do so. It's actually the most energy-efficient graphics card we ever tested, which...

www.techpowerup.com

The more relevant point I was trying to make is that it's not like they've just thrown bandwidth preservation in the bin and slapped cache on to compensate. It's working in harmony, and for argument's sake, I think it can be assumed it's at least marginally better than RDNA1 and, in any case, certainly far better than Vega, which is our baseline here.

Probably also worth noting, as a side note and from a different angle: the GTX 1650 has 128GB/s from its GDDR5, vs. 192GB/s for the GDDR6 version. The latter is only 12% faster @ 1080p. Out of curiosity, where would you place a GTX 1650 with a theoretical 100GB/s of bandwidth on this chart? Smack between the 1050 Ti and the 1650 at worst? If so, why is it so hard to believe that an RDNA2 IGP with the same bandwidth, and similar if not higher shader horsepower, would at least match the 1050 Ti, if not beat it? Do you have any reason to believe RDNA2 is inferior to Turing in this regard?
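One rough way to put a number on the 100GB/s question is to fit a power law to the two GTX 1650 data points quoted above (50% more bandwidth for 12% more performance) and extrapolate downwards. This is only a sketch: the power-law form and the function names are assumptions for illustration, and it treats memory as the only difference between the two cards, which may not hold.

```python
import math


def bandwidth_elasticity(bw_ratio: float, perf_ratio: float) -> float:
    """Exponent e such that perf_ratio == bw_ratio ** e (assumed power-law scaling)."""
    return math.log(perf_ratio) / math.log(bw_ratio)


def relative_performance(bw_new: float, bw_ref: float, elasticity: float) -> float:
    """Performance at bw_new relative to the reference card at bw_ref."""
    return (bw_new / bw_ref) ** elasticity


# Figures from the post: 128 GB/s (GDDR5) vs 192 GB/s (GDDR6), ~12% faster at 1080p.
e = bandwidth_elasticity(192 / 128, 1.12)
rel = relative_performance(100, 128, e)
print(f"elasticity ~{e:.2f}, 100 GB/s card at ~{rel:.0%} of the GDDR5 1650")
```

With these inputs the hypothetical 100GB/s card lands only a few percent below the GDDR5 1650, which is consistent with placing it between the 1050 Ti and the 1650.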

jpiniero · Oct 17, 2021

majord said:
I maintain around 1050 Ti levels for a desktop 12CU part, 1050 levels for mobile, nominally. It depends on clocks and RAM speeds. The other unknown is ROPs: will they move to 32 or just remain at 16? In fact, I'd go so far as to say this would be THE primary factor, not RAM bandwidth.

Neglected to mention that I watched a fairly recent youtube video comparing Cezanne w/3600 to 1050 Ti. Some games it was only 20-30% behind, others it was closer to 50%.

Desktop Rembrandt should easily be able to be faster than a 1050 Ti.

majord · Oct 17, 2021

jpiniero said:
Neglected to mention that I watched a fairly recent youtube video comparing Cezanne w/3600 to 1050 Ti. Some games it was only 20-30% behind, others it was closer to 50%.

Desktop Rembrandt should easily be able to be faster than a 1050 Ti.

Agreed, but I'm just using it as a goalpost at least. Ultimately, APU performance can be tricky to predict. Sometimes the execution doesn't quite match expectations on paper, due to the complexities of balancing the CPU and GPU TDPs so that neither bottlenecks the other, and of correctly managing VRAM bandwidth. This was a real problem in times gone by: Excavator, for example, was just absolutely terrible. Cezanne and Renoir, on the other hand, perform really well for their specs. I think AMD nailed it there, and if that's a trend, then yeah, it should.

Initially these are mobile parts though, so I guess all this is not really relevant, and they will be completely limited by TDP, certainly not bandwidth. It will be interesting to see. I'll be keen to see what can be had in the ~25W or so space for my next notebook.

jpiniero · Oct 17, 2021

majord said:
Initially these are mobile parts though, so I guess all this is not really relevant, and they will be completely limited by TDP, certainly not bandwidth. It will be interesting to see. I'll be keen to see what can be had in the ~25W or so space for my next notebook.

I was going to say power limits might become an issue.

Panino Manino said:
That's what you understood? I didn't mean to say that it would be as fast as a midrange discrete GPU!

In a non-mining world, the 6600 is probably $279 or maybe even $249 MSRP, with street prices even lower than that. Navi 24, if it gets a DIY release, would be faster than Rembrandt and might be more cost-effective given fast DDR5 prices.

LightningZ71 · Oct 17, 2021

Of course all of this is highly situational! Some games use a lot more VRAM bandwidth than others, and others will pull a lot more main memory bandwidth, reducing what's available for the iGPU. What we're looking at here is averages.

I don't know where people are getting the idea that Rembrandt will ever get 100+GB/sec of bandwidth from RAM. That's the current maximum JEDEC spec for DDR5, and there are currently no mass-production DIMMs that support that transfer rate, nor do we have any confidence that AMD's first DDR5 RAM controller can even hit those speeds.

What we're looking at in the next year is mobile applications that are going to struggle to get into the 80 GB/sec range with simply atrocious latencies, which is something we've seen that hurts iGPU performance. We're possibly going to see desktop parts that might see 90 GB/sec in overclocked setups, but still suffering from latency issues. Both of these will still have to share bandwidth with their CPU cores, and both will have real power limits and thermals to manage.

Approaching and maintaining 1050/560 performance should be doable for many games, but it's going to be very situational and the exception, not the rule. It will certainly, combined with Alder Lake's improvements, completely eliminate any realistic market for any of the existing line of Nvidia MX dGPUs in mobile, as the MX450 won't be able to improve on anything. Low-power configurations of the 1650 will also not justify their cost. I also have a feeling that 4GB dGPUs will all feel the pressure, as these iGPUs won't have those sorts of limits in modern games and won't suffer from having to swap across the PCIe bus to change textures.

But, I'm not expecting world changing performance. If they can give decent FPS in 1080p with mid to high detail settings, that's going to be enough for most. If FSR/XeSS can make native 900p rendering look great at high and ultra detail with playable frame rates, all the better.

jpiniero · Oct 17, 2021

LightningZ71 said:
I don't know where people are getting the idea that Rembrandt will ever get 100+GB/sec of bandwidth from RAM. That's the current maximum JEDEC spec for DDR5, and there are currently no mass-production DIMMs that support that transfer rate, nor do we have any confidence that AMD's first DDR5 RAM controller can even hit those speeds.

AT's front page literally has an article about 6400 CL36 ram being available in November. No data about pricing of course but desktop Rembrandt isn't shipping for another 6 months either.

6400 btw is peak 51.2 GB/sec per (dual 32-bit) channel.
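The arithmetic behind that figure is just transfer rate times bus width. A quick sketch for checking other speeds, assuming a 64-bit data path per DIMM (the two 32-bit subchannels mentioned above) and theoretical peak rates only; the function name is made up for illustration.

```python
def peak_bandwidth_gbps(mt_per_s: int, bus_width_bits: int = 64, channels: int = 1) -> float:
    """Theoretical peak bandwidth in GB/s: transfers/s * bytes per transfer.
    Ignores refresh, bus turnaround, and any sharing with CPU cores."""
    bytes_per_transfer = bus_width_bits / 8
    return mt_per_s * bytes_per_transfer * channels / 1000  # MT/s * bytes -> MB/s -> GB/s


for speed in (4800, 6400):
    single = peak_bandwidth_gbps(speed)
    dual = peak_bandwidth_gbps(speed, channels=2)
    print(f"DDR5-{speed}: {single:.1f} GB/s per 64-bit channel, {dual:.1f} GB/s dual channel")
```

So a dual-channel DDR5-6400 setup only just clears 100 GB/s on paper, and DDR5-4800 sits in the high 70s, before the CPU cores take their share.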

LightningZ71 · Oct 17, 2021

True, they are coming soon. We have no idea about initial volume numbers on those parts, but DIMMs are coming. No word on SO-DIMMs though. Can't wait to see the prices on those parts as well...

Joe NYC · Oct 17, 2021

LightningZ71 said:
The "bad" 6600 still has at least some Infinity Cache to help hide its VRAM throughput deficit. Rembrandt has none and dramatically lower VRAM bandwidth. It's not going to be amazing.
(Edited to fix "throughput")

AMD had some nice charts showing Infinity Cache hit rate. Size vs. Resolution.

Here is an image with some commentary and additional numbers by Twitter user Lacusa:

If you extend it far enough on the Infinity Cache size and keep the resolution at 1080, you can get to the point of VRAM being redundant.

An interesting question would then be: cost comparison of the Infinity Cache vs. cost of separate VRAM. It seems to me that with 3D stacking, Cache would be much cheaper, both in terms of silicon components and assembly... Say 128-256 MB of SRAM.

So it would be possible to have near dGPU level performance in iGPU at cheaper price.

With Rembrandt, the good news is that it is so close to release (Q1), advancing the AM5 platform.

The bad news is that, with Rembrandt being out so early, it likely missed out on 3D V-Cache and Infinity Cache, so it will likely just fall short of being a revolutionary product for notebooks.

It seems that Rembrandt will be a nice enough entry, but with Alder Lake having a much more competitive GPU, the difference between Rembrandt and Alder Lake in graphics will not be big enough to change the notebook market share status quo...

The Infinity Cache as V-Cache would have been a game changer...

jpiniero · Oct 17, 2021

Joe NYC said:
It seems that Rembrandt will be a nice enough entry, but with Alder Lake having a much more competitive GPU, the difference between Rembrandt and Alder Lake in graphics will not be big enough to change the notebook market share status quo...

Alder Lake, both desktop and laptop should be minimally faster at best than the equivalent Tiger/Rocket model. P/M does include an updated display engine but that's it. Rembrandt is going to be way faster.

Joe NYC · Oct 17, 2021

jpiniero said:
Alder Lake, both desktop and laptop should be minimally faster at best than the equivalent Tiger/Rocket model. P/M does include an updated display engine but that's it. Rembrandt is going to be way faster.

You are right. I got confused comparing wrong things (Alder Lake Desktop vs. Mobile) instead of Alder Lake Mobile vs. Tiger Lake Mobile.

Yeah, so Rembrandt should be quite a bit faster than Alder Lake mobile....

Shivansps · Oct 18, 2021

LightningZ71 said:
You're looking at about 80GB/sec in bandwidth with worse latency numbers than the Vega 8 APUs and no infinity cache, and I'm seeing people expecting better performance than dGPUs with over 100GB/sec of bandwidth.

That is overly optimistic.

Don't look at bandwidth numbers alone. The fact that the RX550 with 112GB/s can get beaten by the 3400G, 5600G and 5700G, which only have shared DDR4 memory, should tell you that. Not to mention that the 64-bit version of the RX550 is trash (it is AMD's 1030 DDR4), so the RX550 is actually using that bandwidth.

Logic tells me that RDNA2 in place of Vega 8, today, with DDR4, should be faster than what we have now due to better memory efficiency. That alone may be enough to get to RX560 levels WITH DDR4.

AtenRa · Oct 18, 2021

Has this been posted before ??

leoneazzurro · Oct 18, 2021

AtenRa said:
Has this been posted before ??

Yes, it was posted.

AtenRa · Oct 18, 2021

leoneazzurro said:
Yes, it was posted.

ok thx

TESKATLIPOKA · Oct 18, 2021

LightningZ71 said:
......
Approaching and maintaining 1050/560 performance should be doable for many games, but, it's going to be very situational and the exception, not the rule.
...

I don't know why you keep saying it will only sometimes perform at the level of a GTX 1050.
I already showed how a Rembrandt with only a 2GHz IGP + 4800MHz DDR5 should (could) perform, based on the Ryzen 7 5700G IGP's average performance in 17 games. I got GTX 1050 Ti level of performance on average.
Please provide us with some data that can back up what you say.

Joe NYC · Oct 18, 2021

Apple showed AMD today what Rembrandt should have been.

leoneazzurro · Oct 18, 2021

Apple wants to spend a lot of money on TSMC 5nm; Rembrandt will not be on that process. Not every customer wants to pay the M1 Pro/Max premium.

moinmoin · Oct 18, 2021

Joe NYC said:
Apple showed AMD today what Rembrandt should have been.

Apple showed the market how a premium APU in a slim laptop could look and perform. If AMD had had a comparable APU before Apple, manufacturers would likely still have included a dedicated GPU anyway to make it "more premium".

Joe NYC · Oct 18, 2021

leoneazzurro said:
Apple wants to spend a lot of money on TSMC 5nm; Rembrandt will not be on that process. Not every customer wants to pay the M1 Pro/Max premium.

Process node is one aspect that AMD cannot change on short notice.

But what AMD could have done is add multiple channels of high-bandwidth, low(er)-power, low(er)-latency, low(er)-cost memory in the MCM to feed the GPU and CPU as unified memory.

For that, AMD has all the IP. Even the cheap(ish) Steam Deck may have it.

Apple already showed this was the path for laptops, and both AMD and Intel are ignoring it.

leoneazzurro · Oct 18, 2021

Laptops costing $2500 base in a closed ecosystem. AMD can do such an APU, but what is the market for that when the competitor can put in a CPU + discrete graphics card for a lot less? Because huge chips on a top process node are not cheap.

moinmoin · Oct 18, 2021

Joe NYC said:
Apple already showed this was the path for laptops, and both AMD and Intel are ignoring it.

Are they? Intel is still fundamentally stuck in the "few cores for mobile" mindset it had for most of the last decade (that's what Apple is moving away from after all!), only now really beginning to break free of it. AMD doesn't have a premium APU line to begin with, focusing on fostering OEM relations with its more mainstream and low budget APUs first. Apple's premium laptops will help create the market that both of them can now decide to try to serve as well or not, in a couple of years.

Joe NYC · Oct 18, 2021

moinmoin said:
Apple showed the market how a premium APU in a slim laptop could look and perform. If AMD had had a comparable APU before Apple, manufacturers would likely still have included a dedicated GPU anyway to make it "more premium".

I don't entirely agree with this.

APU with multiple channels of memory in MCM is just going to outcompete on cost and power any dGPU + discrete memory with overlapping performance level.

At some point, you just can't compete offering same performance at higher cost and higher power consumption.

The only reason this continues to exist is Intel + AMD + OEMs operating like a cartel, preventing this solution from getting to the market.

But Apple is not controlled by this same cartel.

moinmoin · Oct 18, 2021

Joe NYC said:
APU with multiple channels of memory in MCM is just going to outcompete on cost and power any dGPU + discrete memory with overlapping performance level.

This doesn't exist. Also, this wouldn't come close to the efficiency of AMD's current APUs or of Apple's monolithic dies.

NTMBK · Oct 18, 2021

Joe NYC said:
I don't entirely agree with this.

APU with multiple channels of memory in MCM is just going to outcompete on cost and power any dGPU + discrete memory with overlapping performance level.

At some point, you just can't compete offering same performance at higher cost and higher power consumption.

The only reason this continues to exist is Intel + AMD + OEMs operating like a cartel, preventing this solution from getting to the market.

But Apple is not controlled by this same cartel.

AMD aren't the cartel, they're the ones who are hamstrung by dumb OEM decisions. Who keeps putting single channel memory alongside APUs? Who keeps putting in dGPUs that are worse than the integrated graphics? It's not AMD.

Apple control the whole stack, so they can push this through. They also only target the high-margin end of the market, so they can afford to spend billions of transistors on an enormous system-level cache and a memory bus wider than a 3080's. They have a huge die on a cutting-edge process, using advanced packaging techniques. This thing is going to be PRICEY. You can't churn out $500 HP craptops with those sorts of design choices.

eek2121 · Oct 18, 2021

moinmoin said:
Apple showed the market how a premium APU in a slim laptop could look and perform. If AMD had had a comparable APU before Apple, manufacturers would likely still have included a dedicated GPU anyway to make it "more premium".

I have Joe on ignore, because he likes to troll, but I peeked at his comment because I figured it was ridiculous. The thing is, he is comparing a year-old architecture on an inferior process to a brand new CPU on N5/N5P.

Based on 3 different rumors, AMD will likely have 35-45W Zen 4 “H” parts launch at some point, and they will likely be faster and more efficient than the M1. Rembrandt is a stopgap for OEMs.
