DisEnchantment
Can you please move this Apple discussion somewhere, thanks.
You need to look at that DTR "laptop" market with unlocked CPUs. They are not about efficiency per se but desktop-like performance (including desktop-like power usage, really) in a laptop form factor, never mind wattage. Power usage at idle won't be of concern in these products.
IMHO that might be overly confident as TGL-8C is not that bad in comparison to Cézanne. While on the other hand Vermeer is really bad, efficiency-wise, wrt ST/light load.
Zen4 might have dual SDP/CCD links, with current SerDes that is going to be a problem.
I doubt it needs to reconcile that. Even with the overhead a Raphael-H will already be more efficient than the direct competition. And the direct competition so far has been DTR chips of the likes of 10980HK, 11980HK and likely 12980HK and 13980HK unless Intel changes the model names.
Considering the IOD barely changed at all between Zen 2 and 3 I do expect big changes for Zen 4. When AMD talked about going organic and MCM instead of an interposer they showed how cramped the floor plan for all the necessary links already is, so around doubling that for DDR5, PCIe5 and SerDes links does seem impossible. We may actually see a more significant change to the current IOD to CCD hierarchy.
Zen4 might have dual SDP/CCD links, with current SerDes that is going to be a problem.
Imagine this on 128 Core.
That would be 4x the interconnect energy consumption vs the 64 core part. Or 3x for the 96 core part.
Add to that additional 4 channels of DDR5.
There has to be something changed here.
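A rough sketch of where the 4x and 3x figures come from. The assumptions here are mine for illustration and not from the leak: 8 cores per CCD, one link per CCD on the current 64 core part, and interconnect energy scaling linearly with the number of IOD-to-CCD links. The function names are hypothetical.

```python
# Assumption (illustrative only): 8 cores per CCD, and interconnect energy
# proportional to the number of IOD-to-CCD links.
CORES_PER_CCD = 8


def link_count(cores: int, links_per_ccd: int) -> int:
    """Total number of IOD-to-CCD links for a part with the given core count."""
    ccds = cores // CORES_PER_CCD
    return ccds * links_per_ccd


def relative_interconnect_energy(cores: int, links_per_ccd: int,
                                 base_cores: int = 64,
                                 base_links_per_ccd: int = 1) -> float:
    """Link count relative to the baseline part (64 cores, one link per CCD)."""
    return link_count(cores, links_per_ccd) / link_count(base_cores, base_links_per_ccd)


print(relative_interconnect_energy(128, 2))  # 32 links vs 8 -> 4.0
print(relative_interconnect_energy(96, 2))   # 24 links vs 8 -> 3.0
```

If the links can be power gated or run at lower width when idle, the linear scaling assumption no longer holds and the real numbers would be lower.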
Oh, Cheese forgot to include it I think, but there were a couple of tables on peak I/O die power draw iirc.
Zen4 might have dual SDP/CCD links, with current SerDes that is going to be a problem.
Imagine this on 128 Core.
Details on the Gigabyte Leak
Recently, a ransomware group leaked data from Gigabyte in an attempt to extort payment. That’s been well covered by other outlets (please everyone, secure your networks), so here we’re … (chipsandcheese.com)
That would be 4x the interconnect energy consumption vs the 64 core part. Or 3x for the 96 core part.
Add to that additional 4 channels of DDR5.
There has to be something changed here.
No, because I can make my Zen3 in CB23 ST hold ~3.0 GHz while consuming 0.5 watts (on the core). I lose 33% performance at 12x reduced power. That's with TDP set to 9 watts. I have like 300 background processes including SQL, web frameworks etc., so in this situation I have more non-CB23 load on my device than not. If I really cared to game your metric I would do a clean install, turn off AV, indexing etc.
@eek2121
Should I maybe point out that I have never ever mentioned M1 Pro/Max specifically? I am talking about the Firestorm core of Apple Silicon in general, because for that I have hard numbers (AKA facts). Apple is 500% more efficient while 5nm brings a smaller 150% - end of story, no?
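For what it's worth, the "33% less performance at 12x less power" figure from the Zen3 CB23 ST test works out to roughly an 8x perf/W gain at that operating point. A quick sketch of the arithmetic, using only the two numbers from the post (the function name is made up for illustration):

```python
def perf_per_watt_gain(perf_loss_fraction: float, power_reduction_factor: float) -> float:
    """Perf/W at the reduced operating point relative to the stock one."""
    relative_perf = 1.0 - perf_loss_fraction        # e.g. 0.67 of stock performance
    relative_power = 1.0 / power_reduction_factor   # e.g. 1/12 of stock power
    return relative_perf / relative_power


gain = perf_per_watt_gain(0.33, 12.0)
print(f"{gain:.2f}x")  # 8.04x
```

This only compares the same core against itself at two clock/voltage points, so it says nothing by itself about any cross-vendor comparison.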
Could you try running it with Process Lasso holding the workload to a single core in particular? I would but I'm away from my PC for a bit.
No, because I can make my Zen3 in CB23 ST hold ~3.0 GHz while consuming 0.5 watts (on the core). I lose 33% performance at 12x reduced power. That's with TDP set to 9 watts. I have like 300 background processes including SQL, web frameworks etc., so in this situation I have more non-CB23 load on my device than not. If I really cared to game your metric I would do a clean install, turn off AV, indexing etc.
I am already doing that but just with core affinity, otherwise it moves around too fast to see anything.
Could you try running it with Process Lasso holding the workload to a single core in particular? I would but I'm away from my PC for a bit.
I noticed when I last tested this that CPPC2 and the whole switching between preferred cores thing actually did lead to much lower ST power recorded on average, so you would need to lock it to a single thread to get an accurate measurement.
(FYI when I did this on my 3800X about a year ago I got 3.6GHz sustained at 3.2W iirc).
Thankfully, the supported AVX-512 instructions will be using 64B data paths between cache and execution ports. It wasn't clear to me that that was going to be the case (it would have neutered performance otherwise). So, AMD is serious about high throughput AVX-512 compute.
Zen4 might have dual SDP/CCD links, with current SerDes that is going to be a problem.
This isn't an Apple thread. That number "500% more efficient" is wrong. The CPU has a larger power budget than Cezanne. A 5950X core beats the Firestorm core in ST performance in GB5, for example, while consuming under 20W for most tasks (including that GB5 bench). Also, Apple's chips are optimized for power, while AMD's chips are optimized for performance and expandability. You can't buy a Mac with 256 GB of RAM. You can't buy a Mac with 12TB of storage. You can't upgrade an M1 based Mac, period. Apple has a slight efficiency lead, most or all of which is down to a more efficient process (which Zen 4 will launch on), and not having to support DDR4/PCIe4. In addition, you are comparing an old architecture from AMD with a cutting edge architecture from Apple. If you want to continue this discussion about the M1, please take it to the appropriate thread. I am not going to respond further in this thread about it. We aren't interested in talking about Apple in every thread. This thread is about Zen 4.
@eek2121
Should I maybe point out that I have never ever mentioned M1 Pro/Max specifically? I am talking about the Firestorm core of Apple Silicon in general, because for that I have hard numbers (AKA facts). Apple is 500% more efficient while 5nm brings a smaller 150% - end of story, no?
@moinmoin
IMHO that might be overly confident as TGL-8C is not that bad in comparison to Cézanne. While on the other hand Vermeer is really bad, efficiency-wise, wrt ST/light load.
The IO die is rumored to be on N6. Also, IF speeds are rumored to be dynamic. AMD was rumored to be playing around with dynamic DDR5 speeds at some point. The current IO die design is also almost 4 years old. Even with Cezanne, AMD has made it clear there is much room for improvement.
Zen4 might have dual SDP/CCD links, with current SerDes that is going to be a problem.
See comments from @uzzi38. It looks like they've done something. More throughput, similar power on Genoa vs. Milan. Maybe those changes will also appear in desktop products along with DTR products like Raphael-H.
The SerDes is what also keeps me thinking. So maybe Zen4 could finally be the time they, at least on mobile, could use something like InFO-LSI (TSMC's version of EMIB, but better). This would improve interconnect power efficiency by around 1000% and would make chiplets in the mobile space viable. But the leaks suggested that they seem to stay on the organic package interconnect, at least on the desktop. So for me it is rather improbable that it was technically feasible with one and the same die. But I keep my fingers crossed.
We're talking about competition that can already hit 60W on battery. Most DTRs downclock on battery anyway, so being only competitive when receiving wall power wouldn't be entirely bad.
If you mean that AMD could stick a large GPU into a SoC and feed that with high bandwidth memory, then clearly yes. If you mean them to reach the same power efficiency, then definitely no. The margin is much too big for them to catch up within one generation.
Forget the PR for a moment and consider what market changes AMD must react to. In a "worst case" scenario, their main x86 competition will flame out on their 10nm process (10ESF) and not release anything compelling in significant volume due to wafer supply problems, or will at least be out of the game until said competition moves to their 20A process. And by "worst case" I mean "possible, though perhaps not probable; we'll see". In that scenario, AMD would need to shoulder the burden of the entire x86 world looking to them for chips on cutting-edge processes. Who will be competing with them from outside the x86 world?
Also why bring Apple PR in this thread, there are lots of such threads.
I will concede that, today, AMD doesn't really have DTR products that would compete directly against the most expensive (and heavy!) Mac Pro units announced recently. There are some boutique laptop manufacturers that are in the same segment using what is essentially a mashup of desktop and laptop hardware, but AMD doesn't address that segment. In the future, they may wish to change that. And if they do, then whabam, they are in competition. You must concede that a 16c Raphael-H would be a significant departure from their current mobile offerings. We are getting into DTR territory with hardware like that.
But with Zen 4 AMD won't have a direct answer to any M1 variant, and also doesn't need one, unlike Intel which wants to win back Apple as a customer.
In the article you quoted there is also the following passage: "‘Narrow mode’ may be a way to save even more power by disabling one of the IF links when high bandwidth isn’t required." So I guess this is how they try to tackle the consumption problem - being more dynamic depending on bandwidth demand.
Zen4 might have dual SDP/CCD links, with current SerDes that is going to be a problem.
It's a brand new I/O die on 6nm. But the real bottleneck, and albatross, are the SerDes links.
Considering the IOD barely changed at all between Zen 2 and 3 I do expect big changes for Zen 4. When AMD talked about going organic and MCM instead of an interposer they showed how cramped the floor plan for all the necessary links already is, so around doubling that for DDR5, PCIe5 and SerDes links does seem impossible. We may actually see a more significant change to the current IOD to CCD hierarchy.
He meant that as a concept.
Yes, except the consoles eat much more power - so, actually, no. It is an interesting idea though - certainly console SoCs would be much closer to being a pro grade workstation, if put into a laptop.
500% 🤣🤣🤣🤣
Badly. As stated in the Apple M1 Thread their efficiency margin is 500%. None of the things you stated will change that in a significant way.
But that's just wizardry. Hard facts please!! 🤣
No, because I can make my Zen3 in CB23 ST hold ~3.0 GHz while consuming 0.5 watts (on the core). I lose 33% performance at 12x reduced power. That's with TDP set to 9 watts. I have like 300 background processes including SQL, web frameworks etc., so in this situation I have more non-CB23 load on my device than not. If I really cared to game your metric I would do a clean install, turn off AV, indexing etc.
You know what else would "solve" the "SerDes problem"? Going monolith...
It seems to me that simple EMIB would solve 90% of the problems SerDes causes...
It is not a straight answer, there are pros and cons of increasing cache.
Also I think the cache was something planned, after all there is a lot of emphasis by all CPU makers on large caches for various reasons, not only for gaming.
AMD went with the CCD strategy in the beginning to scale out the number of cores where applicable, to use commodity dice across desktop, workstation, and server, and to improve yields. If AMD can get good yields on 16c parts and below on N5 then they could just go monolithic on their entire desktop and laptop lineup. The I/O die would be exclusive to EPYC and Threadripper of that generation. Not saying that's what they'll do, since it would violate the pattern from Zen -> Zen3 and force them to produce multiple monolithic dice for desktop/laptop (more masks = more time, more money). They would need an 8c monolithic and a 16c monolithic at least, and then fuse off cores for 12c and 6c parts.
You know what else would "solve" the "SerDes problem"? Going monolith...