adroc_thurston
Diamond Member
- Jul 2, 2023
- 8,457
- 11,187
- 106
> Not much of a distinction relative to enterprise margins tho.

DIY margins are higher.
It's a tricky market to win, though.
> Well, possibly both.

Even if you don't NEED this type of performance via mostly AMX instructions, it'll impact workloads where part of the instructions are potentially performed by AMX on the CPU, or by the iGPU / NPU. So it may still affect perf by 10%, 30%, or whatever.
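The "10%, 30%, or whatever" framing is essentially Amdahl's law: only the fraction of runtime actually spent in matmul benefits from an AMX/iGPU/NPU offload. A minimal sketch (the fractions and speedups below are made-up illustrations, not measurements of any real chip):

```python
def end_to_end_speedup(gemm_fraction: float, gemm_speedup: float) -> float:
    """Amdahl's law: only the GEMM share of runtime is accelerated."""
    return 1.0 / ((1.0 - gemm_fraction) + gemm_fraction / gemm_speedup)

# If half the runtime is GEMM and the matrix engine does it 10x faster,
# the whole workload only gets ~1.8x faster.
print(round(end_to_end_speedup(0.5, 10.0), 2))  # 1.82
```

So even an enormous matmul speedup moves end-to-end numbers far less than the headline figure suggests, which is consistent with the modest percentages mentioned above.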
Anyone who works professionally and NEEDS this type of performance will absolutely have a discrete GPU.
For the average Joe (or Jill), the iGPU in Zen6 will likely run circles around a CPU + AMX solution.
I guess I just don't see the market here for AMX.
> Regarding the tiny peasant iGPU in Zen6 running circles around 24C/48T Zen6 with AMX (or 52C/52T NVL-S), what do you base that on? E.g. any benchmark, or just speculation?

Those gemm blobs are one or two per cluster.
> those gemm blobs are one or two per cluster.

So AMX on CPU will outrace the iGPU after all, on CPUs with peasant iGPUs such as the 9950X successor and friends?
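Unit count matters linearly here: a matrix engine's peak throughput is just units × MACs per unit per cycle × 2 ops × clock, so "one or two per cluster" directly scales the ceiling. A back-of-the-envelope sketch; the unit counts, MAC widths, and clocks below are hypothetical placeholders, not any real SKU's numbers:

```python
def peak_tops(units: int, macs_per_unit_per_cycle: int, ghz: float) -> float:
    """Peak dense-math throughput in TOPS (1 MAC = 2 ops: multiply + add)."""
    return units * macs_per_unit_per_cycle * 2 * ghz / 1000.0

# e.g. two hypothetical 4096-MAC blobs at 3 GHz:
print(peak_tops(2, 4096, 3.0))  # 49.152 TOPS
# versus a single blob at the same clock:
print(peak_tops(1, 4096, 3.0))  # 24.576 TOPS
```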
Their point isn't to outrace iGPUs, but to pump a higher Geekbench score.
> So AMX on CPU will outrace iGPU after all

Maybe?

> on CPUs with peasant iGPUs such as 9950X successor and friends?

Those things never run off their iGPUs.
> Those things never run off their iGPUs.

DIY DT ≠ All DT
> DIY DT ≠ All DT

It's the only DT that matters. The rest are no-margin povertyholes.
> Also, you have the mobile/laptop segment, where lots are without a dGPU.

Yeah, those tend to accelerate GEMM via iGPUs or dedicated GEMM blobs.
> It's the only DT that matters. The rest are no-margin povertyholes.

Again, margins do not matter in this case, since we're not talking about sales but technical feasibility.
> Yeah, those tend to accelerate GEMM via iGPUs or dedicated GEMM blobs.

Maybe better to use the NPU instead of the iGPU as the AMX-on-CPU alternative? Unless the NPU is occupied with AI stuff.
ACE/SME are for pumping out Geekbench scores.
> Again, margins do not matter in this case, since we're not talking about sales but technical feasibility.

Sales are only worth anything at a good margin. Otherwise, you're Intel.
> Maybe better to use the NPU instead of the iGPU as the AMX-on-CPU alternative? Unless the NPU is occupied with AI stuff.

GEMM blobs suck at modern ML.
> Same on DT, which will get an NPU too.

lmao
> Sales are only worth anything at a good margin. Otherwise, you're Intel.

Again, we're talking about technical feasibility, regardless of whether it's AMD, Intel, or someone else, and what their sales are.
> GEMM blobs suck at modern ML.

Translate that into links showing actual benchmarks of an NPU being worse than an iGPU at handling AMX-on-CPU-like tasks, given the same silicon area to play with.
> Translate that into links showing actual benchmarks of an NPU being worse than an iGPU at handling AMX-on-CPU-like tasks, given the same silicon area to play with.

You gotta wait for the M5 Pro/Max for that.
> Translate that into links showing actual benchmarks of an NPU being worse than an iGPU at handling AMX-on-CPU-like tasks, given the same silicon area to play with.

https://fastflowlm.com/. I will leave it to you to see how the iGPUs in Strix Point and Halo compare. The NPU, though, gives lower power draw. After all, there is a reason AMD's software folks prefer a hybrid approach at best on Strix Point. (Read up on their Lemonade server.)
> You gotta wait for the M5 Pro/Max for that.

So Apple leads the way and shows that the NPU is better than the iGPU (and CPU) for the tasks it's designed for?
Apple is the only OEM with both the NPU and GPU as first-class s/w citizens.
> So Apple leads the way and shows that the NPU is better than the iGPU (and CPU) for the tasks it's designed for?

The opposite: they added matmul piles to their GPU IP because the ANE sucks.
Assuming you can expect developers to actually target your GPU with their software.
> ANE is designed for low power background processing, like recognizing when you say "hey Siri"

That was the OG idea, but they've pumped the matmul rates there ever since.
This was their primary ML offering with first-class s/w support but no one liked it.
For a good reason, too.
> They haven't increased the size of the ANE, it has always been 16 core

The core matmul rates went up.
> Depends on what iGPU we're talking about. The one in e.g. the 9950X and friends is tiny. Will that still be sufficient?

Unlikely, since it's old RDNA2, so there's no lower-precision support, and it's tiny to begin with. But that CPU ain't got AMX anyway.
Bigger configs had dual-ANE iirc.
Again, this was their primary ML offering with first-class SW and **no one** liked it. So GPU it is.
> If you think Apple made such a blunder here, what about Qualcomm, Intel, and AMD, all of whom have a similar separate NPU for similar roles, have GPU AI capability, and are also adding SME / ACE? Sure seems like they all agree on this, so I'll trust the four of them rather than you "bro".

All those dumb VLIW blobs are goddamn useless for doing actual, real ML.
> I think it is kind of a lesson AMD is learning. When you "own" the market, like DIY desktop, suddenly a lot of money in revenue and profits starts pouring in.

They own the DIY market via pure accident. They shipped server scraps and just won.
> I think Lisa and Jean like it and will want to replicate it in other segments (such as mobile, dGPU).

Mobile is heavily commoditized, with margin pressure exerted from multiple directions.
> AMD is more used to trying to squeeze a drop of water from a dry rock, while other companies walk away with fat profits.

This was true 10 years ago.
> Even if you don't NEED this type of performance via mostly AMX instructions, it'll impact workloads where part of the instructions are potentially performed by AMX on the CPU, or by the iGPU / NPU. So it may still affect perf by 10%, 30%, or whatever.

Possibly a bad assumption, but an assumption just the same. GPUs and NPUs (pitiful or not) are architected to handle matrix loads. They generally handle them better for the same amount of die space, IMO.
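Part of why dedicated matrix hardware tends to win per mm² of die: GEMM has high arithmetic intensity, i.e. FLOPs per byte moved grows with matrix size, so wide MAC arrays can stay fed from local SRAM. A minimal sketch of that ratio, assuming fp16 operands and counting each operand matrix as touched once:

```python
def gemm_arithmetic_intensity(m: int, n: int, k: int, bytes_per_elem: int = 2) -> float:
    """FLOPs per byte for C[m,n] = A[m,k] @ B[k,n]."""
    flops = 2 * m * n * k                                   # one multiply + one add per MAC
    bytes_moved = (m * k + k * n + m * n) * bytes_per_elem  # read A and B, write C
    return flops / bytes_moved

print(round(gemm_arithmetic_intensity(1024, 1024, 1024), 1))  # 341.3 flops/byte
print(round(gemm_arithmetic_intensity(64, 64, 64), 1))        # 21.3 flops/byte
```

The intensity scales roughly with matrix dimension, which is why large-matrix workloads map so well onto matrix engines while small ones stay bandwidth-bound.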
> They own the DIY market via pure accident. They shipped server scraps and just won.

What a day it is when I agree with something logical adroc stated!
It happened but that was never, ever the intent.
