
Discussion AMD SoC Halo series GPU discussion

Uh no it doesn't.

Not to rehash an old point but yeah it is.

12CU RDNA3 is roughly 12-13% faster than 12CU RDNA2 in an APU, holding memory speeds and bandwidth constant. That is despite a node shrink from 6nm to 4nm and, as a result, the ability to run higher clocks. Not seeing much there that RDNA2 with a node shrink can't replicate.
 
I think some people are overestimating how much Strix Halo costs. It isn't that expensive. Plugging the die size for the GPU/IO die into a calculator gives 148 good dies per wafer. At $17,000 per wafer, that's $114.86 per die.


The CCDs are around $20 each. Even after the advanced packaging, the SKU costs less than $200 to manufacture.
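For anyone who wants to check the arithmetic above, here is a minimal sketch. The wafer cost, good-die count, and CCD cost are the figures quoted in the post; the two-CCD count is an assumption based on the 16-core top SKU, and packaging cost is left out because no number was given for it.

```python
# Figures quoted in the post above.
WAFER_COST_USD = 17_000
GOOD_DIES_PER_WAFER = 148
CCD_COST_USD = 20

# Assumption: the 16-core SKU uses two CCDs.
NUM_CCDS = 2


def cost_per_good_die(wafer_cost: float, good_dies: int) -> float:
    """Spread the wafer cost evenly over the good dies."""
    return wafer_cost / good_dies


gpu_io_die = cost_per_good_die(WAFER_COST_USD, GOOD_DIES_PER_WAFER)
silicon_total = gpu_io_die + NUM_CCDS * CCD_COST_USD

print(f"GPU/IO die:    ${gpu_io_die:.2f}")
print(f"Silicon total: ${silicon_total:.2f} (before packaging)")
```

Under those assumptions the silicon alone comes to about $155, which leaves roughly $45 of headroom for packaging before hitting the $200 figure.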

That's not taking into account the margins that AMD would want though. Plugging the same parameters for Strix Point gets us a bit less than $80/die. And from what we've seen on the market AMD isn't selling that for cheap. Strix Halo is a much lower volume chip that will necessitate higher margins than Strix Point for the math to work out...
 
That's not taking into account the margins that AMD would want though. Plugging the same parameters for Strix Point gets us a bit less than $80/die. And from what we've seen on the market AMD isn't selling that for cheap. Strix Halo is a much lower volume chip that will necessitate higher margins than Strix Point for the math to work out...
That $80 doesn't take into account the dGPU though. Strix Halo should perform roughly like a HX 370 + RTX 4060 mobile (stronger in CPU, but weaker in GPU). By my estimates, that combo is at most $30 less than an HX 395 to manufacture.

I agree margins are a big unknown. And there would be other costs that go into selling the chip.
 
That $80 doesn't take into account the dGPU though. Strix Halo should perform roughly like a HX 370 + RTX 4060 mobile (stronger in CPU, but weaker in GPU). By my estimates, that combo is at most $30 less than an HX 395 to manufacture.

I agree margins are a big unknown. And there would be other costs that go into selling the chip.
You can game on battery (sub-30 W) on Strix Halo. I think that is difficult with a discrete GPU.
 
That $80 doesn't take into account the dGPU though. Strix Halo should perform roughly like a HX 370 + RTX 4060 mobile (stronger in CPU, but weaker in GPU). By my estimates, that combo is at most $30 less than an HX 395 to manufacture.

I agree margins are a big unknown. And there would be other costs that go into selling the chip.
Did you count separate video memory and extra cost to assemble that as well, including cooling for the video chip?
 
You can game on battery (sub-30 W) on Strix Halo. I think that is difficult with a discrete GPU.
I can't see it being that much better than Strix Point by itself at those power levels though. Just running stuff through the IOD + CCDs is going to take considerably more power than a single die, which will matter at these power levels, not to mention the question of how well (or not) 16c Zen 5 + 40CU RDNA3.5 can scale down.
 
I can't see it being that much better than Strix Point by itself at those power levels though. Just running stuff through the IOD + CCDs is going to take considerably more power than a single die, which will matter at these power levels, not to mention the question of how well (or not) 16c Zen 5 + 40CU RDNA3.5 can scale down.
Wide & low is a recipe for GPU efficiency. C.f. Apple.
Still dubious at ~30W because of the multiple chips. Though really, how much data is being sent between them if they can share memory? It possibly depends on how quickly the interconnect can power down. But the crossover point with Strix Point is probably not too bad.
 
Wide & low is a recipe for GPU efficiency. C.f. Apple.
Still dubious at ~30W because of the multiple chips. Though really, how much data is being sent between them if they can share memory? It possibly depends on how quickly the interconnect can power down. But the crossover point with Strix Point is probably not too bad.
I'd imagine the main memory (RAM?) to be orders of magnitude higher latency and lower bandwidth than pinging stuff through the interconnect, and not advisable unless the data is not very latency-sensitive.

Given the interconnect links both CCDs and the GPU + IOD, that'd be pretty inadvisable to power down during a gaming session.
 
Given the interconnect links both CCDs and the GPU + IOD, that'd be pretty inadvisable to power down during a gaming session.
Oh, right. The IOD and GPU are the same die still.
I'd imagine the main memory (RAM?) to be orders of magnitude higher latency and lower bandwidth than pinging stuff through the interconnect, and not advisable unless the data is not very latency-sensitive.
Where are you putting the assets? You don't have to stream anything. Pass a pointer, done. And I guess the MC is on the GPU die because the GPU will account for the majority of memory bandwidth. The CPU, which will have to go through the interconnect, would use less bandwidth, and draw lists are small, so this presents ample opportunity for doing nothing on most of the links.
 
I think some people are overestimating how much Strix Halo costs. It isn't that expensive. Plugging the die size for the GPU/IO die into a calculator gives 148 good dies per wafer. At $17,000 per wafer, that's $114.86 per die.
It isn't the raw cost that's the problem, but volume. A new motherboard is required, and new packaging is required for the CPU, all at a much lower volume.

Sure, dGPUs require a separate PCB and all that, but that's on a mass-produced, already existing design that needs little modification to accommodate newer generations.

I again doubt these halo iGPUs will gain real traction.
Wide & low is a recipe for GPU efficiency. C.f. Apple.
It depends a lot on the V/F curve, meaning it can vary between different uarch, silicon, and power levels.

Going from 0.6V to 0.7V is a no-brainer, because you are going from, say, 300MHz to over 1GHz. So in this case, you can't win with a "wide and slow" strategy, because you need to make up for an over 3x difference in clocks.

At the very top of the curve, 1.1V might be 2.4GHz while 1.2V is 2.5GHz. Then you absolutely benefit from taking 1.1V instead and making the design 5-10% wider.

The scaling goes from superlinear, to linear, to sublinear. Wide & Slow only works for sublinear scaling.
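To put rough numbers on the two ends of the curve, here is a minimal sketch using the voltage and clock points from the post. It assumes the textbook dynamic-power model (power proportional to width × V² × f) and ignores leakage and any fixed power, so treat it as an illustration rather than a model of real silicon. The function names are made up for this example.

```python
def width_needed(f_low_ghz: float, f_high_ghz: float) -> float:
    """How much wider the low-clock design must be to match throughput
    (throughput assumed proportional to width * frequency)."""
    return f_high_ghz / f_low_ghz


def dynamic_power_ratio(v_low: float, v_high: float) -> float:
    """Dynamic power of the wide/low-voltage design relative to the
    narrow/high-voltage one at equal throughput. With power ~ width * V^2 * f
    and width * f held equal, only the V^2 term remains."""
    return (v_low / v_high) ** 2


# (v_low, f_low_ghz, v_high, f_high_ghz), taken from the example points above.
cases = {
    "low end": (0.6, 0.3, 0.7, 1.0),
    "high end": (1.1, 2.4, 1.2, 2.5),
}

results = {}
for name, (v_lo, f_lo, v_hi, f_hi) in cases.items():
    results[name] = (width_needed(f_lo, f_hi), dynamic_power_ratio(v_lo, v_hi))
    width, power = results[name]
    print(f"{name}: {width:.2f}x width for {power:.2f}x dynamic power")
```

Under this simplified model, the low end needs about 3.3x the width to save roughly a quarter of the dynamic power, while the high end needs only about 4% more width to save around 16%. That is the asymmetry the post describes: going wide only pays off where clocks scale sublinearly with voltage.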
 
I was wondering if this is possible in a Windows environment. This would be ideal, if possible.

Intel and AMD already have a form of this unified memory with the zero-copy buffers for their iGPUs. I suppose AMD will do some driver magic to let both the CPU and GPU access any part of the unified memory space, and the memory controller will ensure coherence.

UMA apparently was introduced back in DX11: https://learn.microsoft.com/en-us/windows/win32/direct3d11/unified-memory-architecture
 
Asus has some Strix Halo performance data on their page.

GPU Time Spy score seems to be ~4060 / low-power 4070. Not bad. Comparable to the 32 CU 7700S too, but at lower power (ofc 6 nm vs 4 nm). GPU efficiency seems similar to the 4060/4070 too, perhaps even slightly higher.

CPU R23 nT scores are also solid. Comparable at the top end to 16C Dragon Range from last gen. 60 W or so seems to be the sweet spot. Efficiency should be comparable to M4 Pro/Max/Strix Point for R23 nT specifically.
 