
Question Speculation: RDNA2 + CDNA Architectures thread


Glo.

Diamond Member
Apr 25, 2015
4,824
3,443
136
Yeah it's Rembrandt, 12 CU with 2 SA and no L3/IC.
Would Infinity Cache that is available to the CPU be visible in the GPU driver?

THAT is the key question to get answered.

I don't believe that Rembrandt has no IC and no L3 cache. That would make zero sense for it.
 

moinmoin

Platinum Member
Jun 1, 2017
2,676
3,485
136
Would Infinity Cache that is available to the CPU be visible in the GPU driver?
Does the GPU driver need to manage the IC? Honestly, I assumed all this time that it's managed by the hardware, transparently to the driver and OS; I never thought to check the Linux driver to see whether that's actually the case.
 

Gideon

Golden Member
Nov 27, 2007
1,429
2,875
136
Does the GPU driver need to manage the IC? Honestly, I assumed all this time that it's managed by the hardware, transparently to the driver and OS; I never thought to check the Linux driver to see whether that's actually the case.
At least the very first IC leaks were from Linux drivers where the size was listed.
 

Bigos

Member
Jun 2, 2019
38
61
61
I believe the only IC-related feature in the Linux AMD graphics drivers (other than listing its size, for API query purposes) is MALL (IIRC "Memory Access at Last Level"), a mechanism that lets the display controller source the framebuffer solely from the IC to improve power use.

The fact the L3 is not listed for Rembrandt in the Linux compute drivers suggests it doesn't exist. It would make no sense to hide it from the compute applications (which could size their working set based on the L3 size otherwise).

Also, "Infinity Cache" is a marketing term. The Zen CCX L3 cache != RDNA L3 cache. They are different on a couple fronts:
  • The Zen L3 seems to be a victim cache, i.e. it holds cache lines that have been evicted from a core's L2 cache. That means in order to fill an L3 cache line, the line must have been present in at least one core's L2 cache.
    On the other hand, RDNA L3 cache seems to be a transparent memory controller cache (given how it is sized based on the memory controller width), i.e. it caches requests to memory.
  • As mentioned before, the cache line size differs. It is 64 bytes on CPU and 128 bytes on GPU. It is still possible to respond to a GPU cache read using two CPU cache reads, but it makes the whole mechanism more complicated (how to handle partial hits?) and not something you want to have on a fast path (that's probably one of the reasons the fastest memory for GPU on an APU is so called "uncached memory", which is not cached by CPU at all).
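
To make the partial-hit problem concrete, here is a toy model (purely illustrative, not actual hardware behavior): a 128-byte GPU line spans exactly two 64-byte CPU lines, so a GPU-line-sized read against CPU-cached data can hit fully, hit partially, or miss entirely.

```python
# Toy model of the cache-line-size mismatch described above. All addresses
# and line sizes are illustrative; real coherence protocols are far richer.

CPU_LINE = 64
GPU_LINE = 128

def gpu_read_status(addr, cpu_cached_lines):
    """Classify a GPU-line-sized read against a set of cached 64-byte CPU line addresses."""
    base = addr - (addr % GPU_LINE)          # align to the 128-byte GPU line
    halves = [base, base + CPU_LINE]         # the two 64-byte CPU lines it covers
    hits = sum(h in cpu_cached_lines for h in halves)
    return {0: "miss", 1: "partial hit", 2: "full hit"}[hits]

cached = {0x1000, 0x1040, 0x2000}            # CPU holds these 64-byte lines
print(gpu_read_status(0x1000, cached))       # both halves cached -> "full hit"
print(gpu_read_status(0x2000, cached))       # only first half cached -> "partial hit"
print(gpu_read_status(0x3000, cached))       # neither half cached -> "miss"
```

The "partial hit" case is exactly the awkward one: the response has to be assembled from one cached half and one memory fetch, which is why it's unattractive on a fast path.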
However, if part of "Zen 3+" is a redesign of the L3 cache, we might see it used more often by the GPU. The GPU might be connected to it in a similar way to the CPUs, for example. The lack of any mention of it in amdkfd makes that fairly unlikely, however.
 

beginner99

Diamond Member
Jun 2, 2009
4,845
1,232
136
In terms of the 6800M, it seems Asus skimped on the RAM in the G15, making their flagship laptop, as reviewed by AnandTech, CPU-limited in some games.

In Shadow of the Tomb Raider the FPS goes from 119 (120 in the AnandTech bench) to 135 and above, beating some 3080 offerings. Even worse, performance also seems gimped when playing on the laptop's own screen (just 100 fps). So external screen + new RAM = 1/3 better performance. :confused_old:
 
  • Like
Reactions: Rigg and lightmanek

moinmoin

Platinum Member
Jun 1, 2017
2,676
3,485
136
In terms of the 6800M, it seems Asus skimped on the RAM in the G15, making their flagship laptop, as reviewed by AnandTech, CPU-limited in some games.

In Shadow of the Tomb Raider the FPS goes from 119 (120 in the AnandTech bench) to 135 and above, beating some 3080 offerings. Even worse, performance also seems gimped when playing on the laptop's own screen (just 100 fps). So external screen + new RAM = 1/3 better performance. :confused_old:
Looks like chips scarcity hit again, in this case better RAM simply isn't available in sufficient quantity. Not really a good look for the new "AMD Advantage" certification that it allows that though.
 

blckgrffn

Diamond Member
May 1, 2003
8,103
1,357
126
www.teamjuchems.com
Some RX 6700M reviews are out, but nothing for RX 6600M, that's infuriating.
Wasn't that part of the launch announcement? That the high- and low-end cards were shipping now and the mid-range shipping soon?

And with all the leeway they are giving the chassis folks, especially on the 6600M I expect performance to vary a lot from chassis to chassis.

Yay, laptop GPUs. /s

I mean, I want them, but their "configurability" makes it so hard to know what you are getting.
 

blckgrffn

Diamond Member
May 1, 2003
8,103
1,357
126
www.teamjuchems.com

Second to last paragraph. 6800M and 6600M are shipping now, 6700M is coming someday.

Ah I see what you mean.

I forgot that it was 6800M/6700M/6600M - I had it shifted one number. My bad. I thought there was a 6500M to go with the 6700M - the 6700M should be the last one to get benchmarks, though with that SKU results will be all over the map depending on TDP, memory speeds and so on.
 

GodisanAtheist

Diamond Member
Nov 16, 2006
3,284
1,831
136
In this environment, I hope some of the reviewers get out of their comfort zone and bench laptops against some desktop hardware.

It might not be an apples to apples comparison, but a $ to $ comparison in this market would still ultimately be useful.

If I needed a new computer right now, it would be nice to know how much performance I'd be getting out of, say, a $1000 gaming laptop vs the GPU landscape.
 
  • Like
Reactions: scineram

Glo.

Diamond Member
Apr 25, 2015
4,824
3,443
136
AMD just released Radeon Pro W6600.

10.4 TFLOPs, from 1792 ALUs.

That works out to a 2.9 GHz core clock.

From 100W GPU.

Let that sink in.
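
For anyone wanting to check the arithmetic: RDNA2 FP32 throughput is ALUs × 2 FLOPs per cycle (one FMA) × clock, so the implied boost clock falls straight out of the quoted TFLOPs figure. A quick sketch:

```python
# Implied boost clock from a quoted FP32 TFLOPs figure, assuming
# 2 FLOPs per ALU per cycle (one fused multiply-add), as on RDNA2.

def implied_clock_ghz(tflops, alus):
    flops = tflops * 1e12           # quoted peak FLOP/s
    return flops / (alus * 2) / 1e9 # per-ALU rate, converted to GHz

print(round(implied_clock_ghz(10.4, 1792), 2))  # W6600 -> 2.9
```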
 
  • Wow
Reactions: lightmanek

TESKATLIPOKA

Senior member
May 1, 2020
441
425
96
AMD just released Radeon Pro W6600.

10.4 TFLOPs, from 1792 ALUs.

That equals to 2.9 GHz core clock.

From 100W GPU.

Let that sink in.
I think there must be a mistake, even if it's on the official AMD pages and presentations.
Link Link
As you said, it would mean a 2.9 GHz (turbo) clock speed for this 28CU GPU at 100W TBP.
For comparison, the 40CU RX 6700XT at 2.58 GHz (turbo) has a 230W TBP; it makes no sense for it to need 2.3x more power.

Then we have the 28CU RX 6600M with 8.77 TFLOPs, i.e. 2.45 GHz (turbo) or 450 MHz lower, and its 100W figure is not TBP but just GPU power, which is the same. This doesn't make any sense either.

P.S. I already found an error in the memory interface for the Radeon Pro W6600: it should be 128-bit but they list 4096-bit.
 
Last edited:

Glo.

Diamond Member
Apr 25, 2015
4,824
3,443
136
I think there must be a mistake, even if it's on the official AMD pages and presentations.
Link Link
As you said, it would mean a 2.9 GHz (turbo) clock speed for this 28CU GPU at 100W TBP.
For comparison, the 40CU RX 6700XT at 2.58 GHz (turbo) has a 230W TBP; it makes no sense for it to need 2.3x more power.

Then we have the 28CU RX 6600M with 8.77 TFLOPs, i.e. 2.45 GHz (turbo) or 450 MHz lower, and its 100W figure is not TBP but just GPU power, which is the same. This doesn't make any sense either.

P.S. I already found an error in the memory interface for the Radeon Pro W6600: it should be 128-bit but they list 4096-bit.
The clocks for those GPUs are nowhere to be found on the spec sheets.

Do you believe AMD would be THAT incompetent, posting incorrect peak TFLOPs numbers for ALL the important workloads in its professional lineup?

Those clocks are unbelievable. But being wrong would require an incredible lack of competence from whoever runs the professional group at AMD, because they have to sign off on everything their teams do - including the marketing.
 

TESKATLIPOKA

Senior member
May 1, 2020
441
425
96
If you believe they can't be that incompetent, fine, but can you give me a reasonable answer why this GPU has a much higher turbo (~18%) or TFLOPs than the mobile version even though they have the same configuration, yet consumes less power per the official specs (GPU power 100W vs TBP 100W)?

For example:
Radeon PRO W6800 has 17.83 TFLOPs, which means a 2.3 GHz turbo at 250W TBP.
RX 6800 has 16.17 TFLOPs, which means a 2.1 GHz turbo at 250W TBP.
~10% higher turbo at the same TBP, though it's true the VRAM is doubled. A new, better revision maybe? Interesting.
 
Last edited:

Glo.

Diamond Member
Apr 25, 2015
4,824
3,443
136
If you believe they can't be that incompetent, fine, but can you give me a reasonable answer why this GPU has a much higher turbo or TFLOPs than the mobile version even though they have the same configuration, yet consumes less power per the official specs (GPU power 100W vs TBP 100W)?

For example:
AMD Radeon PRO W6800 has 17.83 TFLOPs, which means a 2.3 GHz turbo at 250W TBP. This is comparable to the desktop RX 6800.
It may mean that the mobile 6600 actually runs at a much lower thermal envelope, and that the "up to 100W" rating allows the GPU to clock way, way higher than advertised.

I'm not saying this is the case. But it could be one possible explanation: that AMD sandbags the mobile 6600M way below its potential.
 

TESKATLIPOKA

Senior member
May 1, 2020
441
425
96
I personally don't believe in a higher clock speed or lower power consumption, or that AMD is sandbagging us. The reason is N22 and its specs in comparison to N23.

BTW, no 32CU version is out yet. Is everything really going to Tesla?
 

TESKATLIPOKA

Senior member
May 1, 2020
441
425
96
Do you think Tesla also gets only the 28CU version?
Then where are the full 32CU versions? Unless N23 physically has only 28CU.
That would be hilarious.

If this boost turns out to be true, then even a 16CU N24 could perform better than the RX 5500XT, and if it could fit into the 75W limit, it would be a great GPU for an HTPC.
 
Last edited:
  • Like
Reactions: Tlh97

Glo.

Diamond Member
Apr 25, 2015
4,824
3,443
136
Do you think Tesla also gets only the 28CU version?
Then where are the full 32CU versions? Unless N23 physically has only 28CU.
That would be hilarious.

If this boost turns out to be true, then even a 16CU N24 could perform better than the RX 5500XT, and if it could fit into the 75W limit, it would be a great GPU for an HTPC.
Potentially 28 CUs because AMD has to use every single die possible, hence the 28 CU limit on every SKU.

Also this might mean that the 6600XT is actually 28 CUs.

Yeah, even a 1024 ALU Navi 24 clocked at 3 GHz (for giggles) would be 6 TFLOPs of compute power.
 

TESKATLIPOKA

Senior member
May 1, 2020
441
425
96
Potentially 28 CUs because AMD has to use every single die possible, hence the 28 CU limit on every SKU.

Also this might mean that the 6600XT is actually 28 CUs.
The weird thing is that we have the 6800M, which is a fully unlocked N22 chip, and a partially disabled 6700M, but the smaller N23 is limited to only 28CU.

Yeah, even a 1024 ALU Navi 24 clocked at 3 GHz (for giggles) would be 6 TFLOPs of compute power.
At 3 GHz it should be able to go up against the 3050 Ti. :cool:
 
Last edited:
  • Like
Reactions: Tlh97

Glo.

Diamond Member
Apr 25, 2015
4,824
3,443
136
The weird thing is that we have the 6800M, which is a fully unlocked N22 chip, and a partially disabled 6700M, but the smaller N23 is limited to only 28CU.


At 3 GHz it should be able to go up against the 3050 Ti. :cool:
Navi 14 with its full 24 CUs and 16 Gbps GDDR6 is already capable of going up against the 3050 Ti, let alone a GPU with more TFLOPs (1024 ALUs at 3 GHz is better for gaming than 1536 ALUs at 1.9 GHz).
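
The raw FLOPs behind that comparison, as a quick sketch (gaming performance depends on more than FLOPs - clocks also lift fixed-function hardware - so treat this as a ballpark only):

```python
# FP32 throughput for the two configurations discussed above,
# assuming 2 FLOPs per ALU per cycle (one fused multiply-add).

def tflops(alus, clock_ghz):
    return alus * 2 * clock_ghz / 1000  # GFLOP/s -> TFLOP/s

print(tflops(1024, 3.0))  # hypothetical Navi 24 at 3 GHz: ~6.1 TFLOPs
print(tflops(1536, 1.9))  # full Navi 14 at typical clocks: ~5.8 TFLOPs
```

Nearly identical raw compute, but the 1024-ALU part gets there at a much higher clock, which is the point being made.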
 
