That's with a 142W PL (which it actually hits now, unlike the 7800X3D) vs 88W. If you're looking at MT, it's a terrible extrapolation to the 9950X3D. In 1T it's still generally behind, and the same when power limits go out the window. It won't perform better in games than the 9800X3D without applying the same tweaks to prevent core migration that the plain 9950X uses. And if you're doing that anyway, why not have the more flexible approach with a lower latency, higher frequency CCD for the multitude of applications and some games that prefer it? At a lower cost too.
> Narrowing the clock speed deficit from 500 MHz (7800X3D vs 7700X) to 300 MHz (9800X3D vs 9700X) was enough to make the 9800X3D approximately equal to the 9700X in non-gaming benchmarks.
A dual-CCD V-cache part, even a 9900X3D with SMT off, has a better chance of beating the 9800X3D because more cache is available to all threads. Turning SMT off on the 9800X3D may tank performance in some games that rely on more than 8 threads.
> I think the reason people want the 2 V-Cache configuration is because it would be a no-brainer for gaming, would likely beat the 9800X3D in pretty much all cases (if the clock speed was just a little higher).
Maybe once they are done with AI...
> It would be nice if they came up with a more robust software solution than the MS Game Bar "collab".
Rather, game developers need to stop pretending that hybrid CPUs do not exist. Every OS-side solution will have some shortcomings. APO requires that Intel maintains it, so it will always lag behind. Likewise for Game Bar. If the games are CPU-aware they will be able to handle the scheduling correctly, as I think Windows already exposes all the relevant APIs.
> AMD needs to do something like Intel APO where they create specific profiles for games to pin their cache-hungry threads to the V-cache CCD at all times and ensure that no such thread is allowed to be scheduled on the non-V-cache CCD.
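For what it's worth, here is a rough sketch of what "CPU-aware" pinning could look like when the program does it itself. It is only an illustration under loud assumptions: it uses Linux (sysfs cache topology plus `os.sched_setaffinity`), not the Windows affinity APIs this thread is really about, and it guesses that the CCD with the largest L3 is the V-cache one. All function names are made up for this example.

```python
import os


def parse_cpu_list(text):
    """Turn a Linux cpulist string such as '0-7,16-23' into a set of CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def parse_cache_size_kib(text):
    """Turn a sysfs cache size such as '98304K' or '96M' into KiB."""
    text = text.strip().upper()
    if text.endswith("K"):
        return int(text[:-1])
    if text.endswith("M"):
        return int(text[:-1]) * 1024
    return int(text) // 1024  # assumption: a bare number means bytes


def largest_l3_domain(domains):
    """Given (size_kib, cpu_set) pairs, return the CPUs sharing the biggest L3."""
    size, cpus = max(domains, key=lambda d: d[0])
    return set(cpus)


def read_l3_domains(root="/sys/devices/system/cpu"):
    """Collect the distinct L3 domains that Linux reports in sysfs."""
    found = {}
    if not os.path.isdir(root):
        return []  # not Linux, or sysfs not mounted
    for name in os.listdir(root):
        if not (name.startswith("cpu") and name[3:].isdigit()):
            continue
        cache_dir = os.path.join(root, name, "cache")
        if not os.path.isdir(cache_dir):
            continue
        for index in os.listdir(cache_dir):
            base = os.path.join(cache_dir, index)
            try:
                with open(os.path.join(base, "level")) as f:
                    if f.read().strip() != "3":
                        continue  # only L3 domains are of interest here
                with open(os.path.join(base, "size")) as f:
                    size = parse_cache_size_kib(f.read())
                with open(os.path.join(base, "shared_cpu_list")) as f:
                    cpus = frozenset(parse_cpu_list(f.read()))
            except (OSError, ValueError):
                continue  # skip entries that are not cache index directories
            found[cpus] = size
    return [(size, cpus) for cpus, size in found.items()]


def pin_to_largest_l3(pid=0):
    """Restrict a process (0 = the caller) to the cores behind the biggest L3."""
    domains = read_l3_domains()
    if not domains:
        return None  # no cache topology found, leave affinity alone
    cpus = largest_l3_domain(domains)
    os.sched_setaffinity(pid, cpus)  # Linux-only call
    return cpus
```

On a chip where both CCDs report the same L3 size this simply picks one of them, so it is not a substitute for a real per-game profile; it just shows that the topology information needed for the decision is available to the application.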
I think AMD's Game Bar solution and Intel's APO are tacit admissions from both companies that they can't do anything about game developers' coding habits.
> Rather game developers need to stop pretending that hybrid CPUs do not exist.
But not everyone has CPUs with heterogeneous core configurations. Plus, do mobile developers need to worry about heterogeneous core configurations? Those have been omnipresent in the mobile space for years now. It seems like mobile ARM platforms handle such configurations better than Windows desktops do.
> Rather game developers need to stop pretending that hybrid CPUs do not exist.
It's great for work too! Some workloads LOVE cache, and it can increase perf way more than 5% extra frequency (which costs lots of extra power).
> I think the reason people want the 2 V-Cache configuration is because it would be a no-brainer for gaming, would likely beat the 9800X3D in pretty much all cases (if the clock speed was just a little higher).
They can, it's called money. I guess this is what NVIDIA is doing by offering software engineering support.
> I think AMD's game bar solution and Intel's APO are tacit admissions from both companies that they can't do anything about game developers' coding habits.
I would imagine they are using the more powerful cluster to run the games, and ignore the A5xx cores altogether. Unless A5xx is all there is, but I am not familiar enough with Android to offer anything but a guess.
> Plus do mobile developers need to worry about heterogeneous core configurations?
From what I understand they rely on software/firmware for scheduling, but I'm not really sure about it either.
> I would imagine they are using the more powerful cluster to run the games, and ignore the A5xx cores altogether. Unless A5xx is all there is, but I am not familiar enough with Android to offer anything but a guess.
1) There are indeed many HPC workloads which don't scale well, or even scale negatively, with SMT. But disabling SMT in the BIOS is not the correct answer to this. Instead, leave SMT on, determine the optimum program thread count of your workload on your hardware, and configure the application program accordingly. And, perhaps, either use Linux or give the Windows kernel some scheduling hints if it needs them.
> "For HPC workloads AMD recommends
> 1) disabling SMT
> 2) engaging the "high performance" power profile,
> 3) running in performance determinism mode
> 4) running the respective CPU at its maximum configurable TDP (cTDP) value
> 5) running with four NUMA nodes per socket (4 NPS)"
> #1-4 are obvious even without any guides (had them for a long time on older EPYCs), will give #5 a go once I get my hands on new stuff, it's been suggested on here too.
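The "leave SMT on and find the optimum thread count" advice above can be turned into a small sweep. This is a minimal sketch, not anybody's official tuning procedure: it assumes the workload is an external program that honours the OpenMP `OMP_NUM_THREADS` environment variable, and `time_command`, `sweep` and the `./my_solver` binary in the usage comment are hypothetical names for illustration.

```python
import os
import subprocess
import time


def time_command(cmd, threads):
    """Run cmd once with OMP_NUM_THREADS set and return wall time in seconds."""
    env = dict(os.environ, OMP_NUM_THREADS=str(threads))
    start = time.perf_counter()
    subprocess.run(cmd, env=env, check=True)
    return time.perf_counter() - start


def sweep(measure, candidates, repeats=3):
    """Call measure(n) for each candidate thread count and pick the fastest.

    The best of `repeats` runs is kept per candidate to damp noise.
    Ties go to the smaller thread count.
    """
    timings = {n: min(measure(n) for _ in range(repeats)) for n in candidates}
    best = min(sorted(timings), key=timings.get)
    return best, timings


# Hypothetical usage on a 16c/32t part (the binary name is made up):
# best, timings = sweep(lambda n: time_command(["./my_solver"], n), [8, 16, 24, 32])
```

If the sweep lands on 16 threads on a 16c/32t chip, the application is effectively running one thread per core without anyone having to touch the BIOS, provided the scheduler spreads those threads across physical cores.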
Yes, it generally is to some extent, depending on the outcome of the silicon lottery.
> https://www.phoronix.com/review/amd-epyc-9005-determinism/7
> Those results suggest power determinism is faster.
Others have already answered, but to put it in other words: in a variety of games, in several HPC/workstation use cases, and in several database use cases, it doesn't matter whether the CPU is clocking at 5.4 or at 5.7 GHz while it is waiting for memory requests to be fulfilled. Instead, it matters that more of those requests hit cache.
> It would be an inferior gaming part to even the 9800X3D. And inferior in 1T to the 9950X. And inferior in MT to the 9950X. But cost twice as much. So what's it for? And who is it for?
And that's just the core clock speed difference, not the application performance difference,
> If the 2nd non-3D chiplet got 5% faster clocks (5.7 vs 5.4) then overall that's like a 2% diff for the whole chip,
indeed!
> totally nothing for risks of bad thread management,
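To put rough numbers on the clock argument above, here is a back-of-the-envelope sketch. The 5.7 and 5.4 GHz figures are from the posts; everything in the runtime model (cycle count, access count, miss rates, 80 ns miss penalty) is invented purely for illustration, and the model itself (compute time plus memory stall time, no overlap) is a deliberate oversimplification.

```python
def clock_uplift(fast_ghz, slow_ghz):
    """Relative clock advantage of the faster CCD over the slower one."""
    return fast_ghz / slow_ghz - 1


def chip_average_uplift(fast_ghz, slow_ghz):
    """Average clock gain of a fast+slow CCD chip over a chip with two slow CCDs."""
    return ((fast_ghz + slow_ghz) / 2) / slow_ghz - 1


def runtime_ns(compute_cycles, clock_ghz, mem_accesses, miss_rate, miss_penalty_ns):
    """Toy model: time spent computing plus time spent stalled on cache misses."""
    compute_ns = compute_cycles / clock_ghz  # cycles / (cycles per ns)
    stall_ns = mem_accesses * miss_rate * miss_penalty_ns
    return compute_ns + stall_ns


# Clock side of the argument: about 5.6% on one CCD, under 3% across the chip.
one_ccd = clock_uplift(5.7, 5.4)
whole_chip = chip_average_uplift(5.7, 5.4)

# Cache side, with made-up workload numbers: the slower core with the lower
# miss rate finishes first because stall time dominates the difference.
high_clock = runtime_ns(1e9, 5.7, 1e7, miss_rate=0.10, miss_penalty_ns=80.0)
big_cache = runtime_ns(1e9, 5.4, 1e7, miss_rate=0.05, miss_penalty_ns=80.0)
```

With a workload that barely misses cache in the first place, the same model favours the higher clock, which is the whole disagreement in this thread in two lines of arithmetic.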
You are considering 1T workloads (or even few-T workloads) which scale very well with core clock speed and are well served by 1 MB L2$ + 32 MB L3$. Many workloads are like this. But this is a kind of workload which is not critical to the prospective buyer of a 16c/32t CPU with 16×1 MB L2$ and 2×96 MB L3$ and a corresponding price tag.
> It reduces the "uplift" in 1T from 1.14x (allegedly, more like 1.1x) to 1.07x.
Almost sounds as if you were talking about AMD EPYC 4484PX/4584PX.
> It would be denounced, rightfully, as a pointless money grab part. AMD shouldn't hurt their reputation in the only place they have a positive reception.
AMD served even tinier niches before. (I am not saying they want to, or even will, serve this one.)
> Tiny niche.
This increased heterogeneity would indeed be an improvement for matching heterogeneous workloads. (As long as they are scheduled correctly.)
> What would actually be an improvement while using the same amount of silicon would be taking both of those 3D caches and stacking them on a single die. But TSMC/AMD aren't doing that.
Except whenever it is the worse configuration. I am looking at high performance CPUs mostly from the technical computing angle, and I have two rhetorical questions to ask: Who cares about high CPU performance in areas in which high CPU performance is not critical? And in areas in which it is critical, who touches heterogeneous CPUs even with a ten-foot pole?
> Mixed is genuinely the better […] configuration.
Yep, 99.5% of the people asking for it (I verified your head count and arrived at the same figure ;-) are erring about their objective need for high performance CPUs.
> 99.5% people asking for it are wrong to even want that configuration.
Yes, they're wrong and AMD will show them by not supplying it. I'm sorry this keeps upsetting people who are wrong about what they want.
> Yep, 99.5% of the people asking for it (I verified your head count and arrived at the same figure ;-) are erring about their objective need for high performance CPUs.
The future isn't 2 V-cached CCDs, it's 2 CCDs over a single unifying V-cache.
> Yes, they're wrong and AMD will show them by not supplying it. I'm sorry this keeps upsetting people who are wrong about what they want.
> Consumer workloads are better off with one low latency CCD and one big cache CCD. Workstation might not be. But AMD has a mantra for you people: "go buy EPYC". Repeat until you stop imagining markets.
> AMD didn't increase core counts. AMD won't increase the number of cache CCDs until well after 9800X3D supply improves. You may not like it but it's a better configuration for the majority of users, including people who want a 9800X3D, and AMD too.
There is one answer to this, and as a former scalper myself (sorry, PS3 launch), I can confidently declare: NEVER pay above MSRP for tech. F***k 'em, don't do it. A little self control will put a world of hurt on scalpers. They can keep their stock until AMD restocks, and restocks again until I get MSRP or better. This is the way.
> Trying to get a 9800X3D reminds me of when I was trying to get a 3090 during COVID.
> They are being scalped like there's no tomorrow, with prices as high as almost 800 dollars.
> I even went to Micro Center to get one at the store, and right as I got there, they were all sold out.
> The guy said there was even a line waiting for tickets, as I think they are buying them and flipping them on eBay.
> Really annoying...
At that point it is one CCX over two CCDs, which seems unlikely. You cannot "extend" 2 private L3 caches onto a "unified" V-cache and keep it all as "L3".
> The future isnt 2 vcached CCDs, its 2 CCDs over a single unifying vcache.
Lord Jensen thanks those without impulse control. It's what allows for >$2,000 flagship GPU prices.
> Unfortunately, many lack self control in this "instant gratification" world.
Yep. Fortunately for AMD, Intel's products and behaviour are worse.
> If there aren't good day 1 sales on a tech item, it can torpedo the entire product. Look at how many people lost their minds over the Zen5 launch. Over time, the 9950X et al will probably rack up pretty good sales, despite the 9800X3D obviously cannibalizing a lot of those early sales via the Osborne effect. It still doesn't matter. People will declare Zen5 to have been a dud, and if AMD didn't have X3D processors lined up and ready to go a few months later, it might have shifted some influential thinking wrt whether it's worth it for AMD to continue to service the enthusiast DIY PC crowd.
[This has been discussed to death by now, but: A) What has been semi-valid for Zen 4 is no longer going to be valid for Zen 5. B) They are "better off" only in a fantasy world in which operating systems have an omniscient task scheduler. In the real world, heterogeneous CPUs are in some economic respects preferable to homogeneous CPUs, but from the technical perspective they are nothing but kludges.]
> Consumer workloads are better off with one low latency CCD and one big cache CCD.
The complete current wording of the mantra is "go buy EPYC, but not EPYC 4000". Earlier versions of the mantra also mentioned a "Threadripper" but this was a long time ago.
> Workstation might not be. But AMD has a mantra for you people: "go buy EPYC".
Understandable, since they can't charge too much for EPYC 4000 when it's just rebadged "consumer" stuff. Intel solved this problem in the past by removing support for ECC, but since Zen 4/5 support it (if you get the right BIOS and motherboard) it's hard for AMD to do that.
> "go buy EPYC, but not EPYC 4000"
Even the plain old 9950X is dependent on preferred CCD scheduling for peak performance. We all lament that, but it has been the real world for months now.
> [This has been discussed to death by now, but: A) What has been semi-valid for Zen 4 is no longer going to be valid for Zen 5. B) They are "better off" only in a fantasy world in which operating systems have an omniscient task scheduler. In the real world, heterogeneous CPUs are in some economic respects preferable to homogeneous CPUs, but from the technical perspective they are nothing but kludges.]
> The complete current wording of the mantra is "go buy EPYC, but not EPYC 4000". Earlier versions of the mantra also mentioned a "Threadripper" but this was a long time ago.