At the cost of a bit of latency?
> Wouldn't be the first time GB reported the wrong cache size. On the other hand, it could be correct and a Zen 5 update to their embedded platform.

All the other cache sizes seem to be right, so it's unlikely that GB made a mistake just for the L1D. And if the ST score is at about 1.4 GHz, it is aligned with the late-April GB submissions; that would translate to about a 23% better ST score than a 7950X.
> All the other cache sizes seem to be right, so it's unlikely that GB made a mistake just for the L1D. And if the ST score is at about 1.4 GHz, it is aligned with the late-April GB submissions; that would translate to about a 23% better ST score than a 7950X.

23% is bad.
Just on this, I noticed this thread this morning:

> Memory capacity is king. Strix Halo + 64GB of RAM will beat a 4090 in any AI task if you scale it up sufficiently. And for local LLMs, where capacity is by far the biggest bottleneck, it won't even be close.
>
> Strix Halo directly competes for the semi-pro market that is buying 4090s instead of RTX 6000s. Significantly slower for tasks that fit inside 24GB, but it can be used for tasks a 4090 can't touch.

https://www.reddit.com/r/LocalLLaMA/comments/1crwkia/why_people_buying_macs_instead_of_cuda_machines

From my understanding, Strix Point will also feature a souped-up NPU, so Windows folks might get some inferencing love if any of the partners field Strix Point laptops with 128GB of RAM.

> 23% is bad.

Only for train passengers.

> Just on this, I noticed this thread this morning:

Great points. I would also say the huge bandwidth helps as well. That's before Apple is using LP5X.
> Only for train passengers.

Nope, a brand-new uarch with major changes for only 23%? That's definitely not good.
> Nope, a brand-new uarch with major changes for only 23%? That's definitely not good.

You're trying to model perf off fuzzy data.
lol.. Maybe if I win the lottery.
> lol.. Maybe if I win the lottery.

You've asked for bigger and meaner, and APUs (so far) don't get bigger and meaner than that!
> There's necessarily some big variability. In a comparison with a Linux-driven 7800X3D, which is sure to be at 5 GHz, that could be 27-29% or so. At this point the GB submission gives no clue about the test platform; given that it displays 6.74GB of RAM, it could be single channel, or not.

For the April GB scores, when compared against a 14900K with DDR5-4800 JEDEC, it ended up being 20-22%. If that were the score, I think that'd actually be quite good, since a desktop chip with a larger L3$ would score even better than this. If you extrapolate the small IPC bump from a desktop chip having a fatter L3$ and the slight fmax increase over Zen 4, you would see a ~30% 1T perf increase.
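The extrapolation above can be sketched as simple arithmetic: single-thread performance scales roughly as IPC × fmax, so the two percentage gains combine multiplicatively rather than additively. The example figures below are illustrative, not measured.

```python
# Back-of-envelope model: 1T perf ~ IPC * fmax, so fractional gains
# in each multiply together. Input figures are illustrative.
def combined_uplift(ipc_gain: float, fmax_gain: float) -> float:
    """Combined 1T uplift from an IPC gain and an fmax gain,
    each expressed as a fraction (0.22 == +22%)."""
    return (1 + ipc_gain) * (1 + fmax_gain) - 1

# e.g. a ~22% IPC-driven score gain plus a ~6% clock bump
print(f"{combined_uplift(0.22, 0.06):.1%}")  # lands near +29%
```

This is why a 20-22% score on a cache-starved test platform is compatible with ~30% on a desktop part with a fatter L3$ and higher fmax.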
> Will be highly interesting to see how this works out and what strategy for switching the threads over they might employ. Will they switch on a thread level, on a core level (thread pairs), with what thresholds, etc.? The real question is just how castrated Z5LP really is.

If the previously discussed patent 10698472 is any indication, it would work at the opcode level, only implementing opcodes that can be churned through without any real computation (like JMP, or anything keeping I/O going, etc.), otherwise throwing an illegal-opcode exception to move the computation onto a fat core.

Going by the above patent, it could well be so castrated that it couldn't be used as a full core of its own. It would be little more than a fast shortcut for the simplest of all opcodes.
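A toy sketch of that trap-and-migrate scheme, as read from the patent discussion above: the LP core implements only a small opcode subset and faults on anything else, and the fault is the cue to move the thread to a fat core. The opcode list and names here are purely hypothetical illustration, not the actual Z5LP ISA subset.

```python
# Hypothetical sketch of opcode-level dispatch per the reading of
# patent 10698472 above. The supported-opcode set is made up.
LP_SUPPORTED = {"JMP", "NOP", "IN", "OUT", "MOV"}

class IllegalOpcode(Exception):
    """Raised by the LP core for opcodes it does not implement."""

def lp_execute(opcode: str) -> str:
    # LP core: only trivial, no-real-computation opcodes.
    if opcode not in LP_SUPPORTED:
        raise IllegalOpcode(opcode)
    return f"LP core executed {opcode}"

def run(opcode: str) -> str:
    try:
        return lp_execute(opcode)
    except IllegalOpcode:
        # Trap: hand the thread over to a fat core.
        return f"fat core executed {opcode}"

print(run("JMP"))   # stays on the LP core
print(run("FMUL"))  # traps, migrates to a fat core
```

Note the downside this makes obvious: any workload doing real computation immediately escapes the LP island, which is exactly the concern raised below.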
> If the previously discussed patent 10698472 is any indication, it would work at the opcode level, only implementing opcodes that can be churned through without any real computation (like JMP, or anything keeping I/O going, etc.), otherwise throwing an illegal-opcode exception to move the computation onto a fat core.
>
> Going by the above patent, it could well be so castrated that it couldn't be used as a full core of its own. It would be little more than a fast shortcut for the simplest of all opcodes.

I don't know, how would that work for an LP island?
> It gets day-1 ROCm too!

Did I, uh, miss something? ROCm under Windows? Google search results seem inconclusive.
> I don't know, how would that work for an LP island?

Yeah, the LP cores do need to perform proper computations to actually prevent the HP cores from powering up. Sounds like something that would fire up the fat cores for literally any workload.
> Weird hypothesis, but frankly Intel and AMD have always used the numbers very poorly. Out of 9 numbers, 4 are used: 3/5/7/9. It's a generally poor system because you don't want to have many tiers (it muddies the lineup), and you don't want to have numbers lower than 5. Everyone identifies i3/R3 as "bad", to the point that some noobs actually think a 2023 i3 is worse than a 2014 i7.
>
> Frankly, I think clothes did it pretty well: S/M/L/XL. Make 4 tiers:
>
> - Light
> - Medium
> - Heavy
> - Ultra
>
> But it's kind of problematic too, since it's a two-sided coin: you get Bigger Number Better just like you get Lower Number Worse.

This just replaces a number with a word; the meaning is the same. What actually should've happened is that each gen shifts the window of numbers. Example: Zen 1 would have Ryzen 1/2/3 tiers, Zen 2 would have Ryzen 2/3/4/5 tiers, Zen 3 would have Ryzen 3/4/5/6 tiers, and Zen 4 would have Ryzen 5/6/7 tiers. This way, you avoid the problem you are talking about. Someone will say "I have a Ryzen 4", and you instantly know it's a chip made between 2019 and 2022.
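The shifting-window idea can be sketched as a small lookup: each generation owns a range of tier numbers, so a bare tier number narrows down which generations it could belong to. The window boundaries below follow the example tiers given in the post and are illustrative only.

```python
# Hypothetical "shifting window" naming scheme: generation -> tier window,
# using the example tiers from the post (Zen 1: 1-3, Zen 2: 2-5,
# Zen 3: 3-6, Zen 4: 5-7).
WINDOWS = {1: range(1, 4), 2: range(2, 6), 3: range(3, 7), 4: range(5, 8)}

def generations_for_tier(tier: int) -> list[int]:
    """Which Zen generations could a given 'Ryzen N' tier belong to?"""
    return [gen for gen, window in WINDOWS.items() if tier in window]

print(generations_for_tier(4))  # 'Ryzen 4' -> [2, 3], i.e. Zen 2 or Zen 3
```

Under these assumed windows, "I have a Ryzen 4" pins the chip to Zen 2 or Zen 3, matching the 2019-2022 range claimed above.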
> This just replaces a number with a word; the meaning is the same.

Yes, that's the point.
> What actually should've happened is that each gen shifts the window of numbers. Example: Zen 1 would have Ryzen 1/2/3 tiers, Zen 2 would have Ryzen 2/3/4/5 tiers, Zen 3 would have Ryzen 3/4/5/6 tiers, and Zen 4 would have Ryzen 5/6/7 tiers. This way, you avoid the problem you are talking about.

You just create a huge new slew of problems.
> Someone will say "I have a Ryzen 4", and you instantly know it's a chip made between 2019 and 2022.

So, just the difference between a 3950X and a 7600.
> Just a smidge of 31% better MT perf on one, vs 30% better ST perf on the other.

Is it a lot better when one Ryzen 7 has 200% better ST and MT performance than another Ryzen 7 (1700X vs 7700X)? I don't think so. With Intel it's even bigger.