Discussion AMD SoC Halo series GPU discussion


insertcarehere

Senior member
Jan 17, 2013
712
701
136
Uh no it doesn't.

Not to rehash an old point but yeah it is.
1682865419223.png
1682865458938.png

12CU RDNA3 is basically c. 12-13% faster than 12CU RDNA2 in an APU, holding memory speeds and bandwidth constant. That is despite having a node shrink from 6nm to 4nm and as a result ability to run higher clocks. Not seeing much there that RDNA2 with a node shrink can't replicate.
 

gdansk

Diamond Member
Feb 8, 2011
4,092
6,772
136
I think some people are overestimating how much Strix Halo costs. It isn't that expensive.
I reserve the right to be pleasantly surprised but the only interesting devices so far (HP) look like they'll be very expensive.
 

insertcarehere

Senior member
Jan 17, 2013
712
701
136
I think some people are overestimating how much Strix Halo costs. It isn't that expensive. Plugging die size for the GPUIO die into a calculator gives 148 good dies. At $17000 per wafer, that's $114.86 per die.

View attachment 114424

The CCDs are around $20 each. Even after the advanced packaging, the SKU costs less than $200 to manufacture.
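The arithmetic behind those figures can be sketched as follows. This is only a rough check using the numbers quoted above ($17000 per wafer, 148 good dies, about $20 per CCD); the function name and the assumption of two CCDs (taken from the "16c Zen 5" and "both CCDs" mentions later in the thread) are illustrative, and packaging cost is left as whatever remains under the $200 ceiling rather than a known value.

```python
def cost_per_good_die(wafer_cost_usd: float, good_dies_per_wafer: int) -> float:
    """Spread the wafer price evenly over the dies that pass yield."""
    if good_dies_per_wafer <= 0:
        raise ValueError("good_dies_per_wafer must be positive")
    return wafer_cost_usd / good_dies_per_wafer


# Figures quoted in the post: $17000 per wafer, 148 good GPU/IO dies.
gpu_io_die = cost_per_good_die(17_000, 148)

# "Around $20 each" per the post; two CCDs is an assumption for the 16-core part.
ccd_cost = 20.0
ccd_count = 2
silicon_total = gpu_io_die + ccd_count * ccd_cost

print(f"GPU/IO die: ${gpu_io_die:.2f}")
print(f"Silicon total (GPU/IO die + CCDs): ${silicon_total:.2f}")
print(f"Left under $200 for packaging: ${200 - silicon_total:.2f}")
```

The same function can be reused for the Strix Point comparison if the wafer price and good-die count for that die are plugged in; those inputs are not given in the thread, so they are not filled in here.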

That's not taking into account the margins that AMD would want though. Plugging the same parameters for Strix Point gets us a bit less than $80/die. And from what we've seen on the market AMD isn't selling that for cheap. Strix Halo is a much lower volume chip that will necessitate higher margins than Strix Point for the math to work out...
 

GTracing

Senior member
Aug 6, 2021
478
1,112
106
That's not taking into account the margins that AMD would want though. Plugging the same parameters for Strix Point gets us a bit less than $80/die. And from what we've seen on the market AMD isn't selling that for cheap. Strix Halo is a much lower volume chip that will necessitate higher margins than Strix Point for the math to work out...
That $80 doesn't take into account the dGPU though. Strix Halo should perform roughly like a HX 370 + RTX 4060 mobile (stronger in CPU, but weaker in GPU). By my estimates, that combo is at most $30 less than an HX 395 to manufacture.

I agree margins are a big unknown. And there would be other costs that go into selling the chip.
 

marees

Golden Member
Apr 28, 2024
1,042
1,401
96
That $80 doesn't take into account the dGPU though. Strix Halo should perform roughly like a HX 370 + RTX 4060 mobile (stronger in CPU, but weaker in GPU). By my estimates, that combo is at most $30 less than an HX 395 to manufacture.

I agree margins are a big unknown. And there would be other costs that go into selling the chip.
You can game on battery (sub-30 W) on Strix Halo. I think that is difficult with a discrete GPU.
 

Joe NYC

Diamond Member
Jun 26, 2021
3,030
4,426
106
That $80 doesn't take into account the dGPU though. Strix Halo should perform roughly like a HX 370 + RTX 4060 mobile (stronger in CPU, but weaker in GPU). By my estimates, that combo is at most $30 less than an HX 395 to manufacture.

I agree margins are a big unknown. And there would be other costs that go into selling the chip.
Did you count separate video memory and extra cost to assemble that as well, including cooling for the video chip?
 

GTracing

Senior member
Aug 6, 2021
478
1,112
106
Did you count separate video memory and extra cost to assemble that as well, including cooling for the video chip?
No, that's just a rough estimate of the cost to manufacture the dies and package them on a substrate. It doesn't include shipping, RAM, motherboard, cooling, etc.
 

insertcarehere

Senior member
Jan 17, 2013
712
701
136
You can game on battery (sub-30 W) on Strix Halo. I think that is difficult with a discrete GPU.
I can't see it being that much better than Strix Point by itself at those power levels though. Just running stuff through the IOD + CCDs is going to take considerably more power than a single die, which will matter at these power levels, not to mention the question of how well (or not) 16c Zen 5 + 40CU RDNA3.5 can scale down.
 

gdansk

Diamond Member
Feb 8, 2011
4,092
6,772
136
I can't see it being that much better than Strix Point by itself at those power levels though. Just running stuff through the IOD + CCDs is going to take considerably more power than a single die, which will matter at these power levels, not to mention the question of how well (or not) 16c Zen 5 + 40CU RDNA3.5 can scale down.
Wide & low is a recipe for GPU efficiency. C.f. Apple.
Still dubious at ~30W because of the multiple chips. Though really, how much data is being sent between them if they can share memory? It possibly depends on how quickly the interconnect can power down. But the crossover point with Strix Point is probably not too bad.
 

insertcarehere

Senior member
Jan 17, 2013
712
701
136
Wide & low is a recipe for GPU efficiency. C.f. Apple.
Still dubious at ~30W because of the multiple chips. Though really, how much data is being sent between them if they can share memory? It possibly depends on how quickly the interconnect can power down. But the crossover point with Strix Point is probably not too bad.
I'd imagine the main memory (RAM?) to be orders of magnitude higher latency and lower bandwidth than pinging stuff through the interconnect, and not advisable unless the data is not very latency-sensitive.

Given the interconnect links both CCDs and the GPU + IOD, it'd be pretty inadvisable to power it down during a gaming session.
 

gdansk

Diamond Member
Feb 8, 2011
4,092
6,772
136
Given the interconnect links both CCDs and the GPU + IOD, it'd be pretty inadvisable to power it down during a gaming session.
Oh, right. The IOD and GPU are the same die still.
I'd imagine the main memory (RAM?) to be orders of magnitude higher latency and lower bandwidth than pinging stuff through the interconnect, and not advisable unless the data is not very latency-sensitive.
Where are you putting the assets? You don't have to stream anything. Pass a pointer, done. And I guess the MC is on the GPU die because the GPU will account for the majority of memory bandwidth. The CPU, which will have to go through the interconnect, would need less bandwidth, and draw lists are small, so this presents ample opportunity for doing nothing on most of the links.
 

DavidC1

Golden Member
Dec 29, 2023
1,452
2,361
96
I think some people are overestimating how much Strix Halo costs. It isn't that expensive. Plugging die size for the GPUIO die into a calculator gives 148 good dies. At $17000 per wafer, that's $114.86 per die.
It isn't the raw cost that's the problem, but volume. New motherboard required, new packaging required for the CPU on a much lower volume.

Sure, dGPUs require a separate PCB and all that, but that's on a mass-produced, already existing design that needs little modification to accommodate newer generations.

I again doubt these halo iGPUs will gain real traction.
Wide & low is a recipe for GPU efficiency. C.f. Apple.
It depends a lot on the V/F curve, meaning it can vary between different uarch, silicon, and power levels.

Going from 0.6V to 0.7V is a no brainer, because you are going from say 300MHz to over 1GHz. So in this case, you can't win with a "wide and slow strategy", because you need to make up for over 3x difference in clocks.

At the very end, 1.1V might be 2.4GHz while 1.2V is 2.5GHz. Then you absolutely benefit from taking 1.1V instead and making it 5-10% wider.

The scaling goes from superlinear, to linear, to sublinear. Wide & Slow only works for sublinear scaling.
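A rough way to put numbers on this, using the example voltage and clock points above: the sketch below assumes throughput scales with width × frequency and dynamic power with width × V² × f, and it ignores leakage and any scaling losses from going wider. Those are simplifying assumptions, not measured behaviour, and the function name is made up for illustration. It reports how much wider the low-voltage design must be to match the high-voltage one, and what that does to dynamic power.

```python
def iso_performance_tradeoff(v_low, f_low_mhz, v_high, f_high_mhz):
    """Compare a wide design at (v_low, f_low) with a narrow one at (v_high, f_high).

    Simplified model (assumption): throughput ~ width * f,
    dynamic power ~ width * V^2 * f, leakage ignored.
    Returns (width_ratio, power_ratio) for the wide design relative to the
    narrow one at equal throughput.
    """
    # Extra width needed to make up for the lower clock.
    width_ratio = f_high_mhz / f_low_mhz
    # Dynamic power of the wide design divided by that of the narrow design.
    power_ratio = (width_ratio * v_low**2 * f_low_mhz) / (v_high**2 * f_high_mhz)
    return width_ratio, power_ratio


# Steep part of the curve: 0.6 V / 300 MHz vs 0.7 V / 1 GHz.
steep = iso_performance_tradeoff(0.6, 300, 0.7, 1000)
# Flat part of the curve: 1.1 V / 2.4 GHz vs 1.2 V / 2.5 GHz.
flat = iso_performance_tradeoff(1.1, 2400, 1.2, 2500)

for name, (width, power) in (("steep", steep), ("flat", flat)):
    print(f"{name}: {width:.2f}x width for {power:.2f}x dynamic power")
```

Under this model the steep region needs over 3x the silicon to match performance, while the flat region needs only about 4% more width for a meaningful power saving, which is where going wider pays off. Leakage, which grows with width, would make the steep-region case look worse than the model shows.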
 
Jul 27, 2020
24,362
16,957
146
I was wondering if this is possible in a Windows environment. This would be ideal, if possible.

Intel and AMD already have a form of this unified memory with the zero-copy buffers for their iGPUs. I suppose AMD will do some driver magic to let both the CPU and GPU access any part of the unified memory space, and the memory controller will ensure coherence.

UMA apparently was introduced back in DX11: https://learn.microsoft.com/en-us/windows/win32/direct3d11/unified-memory-architecture
 

techjunkie123

Member
May 1, 2024
145
313
96
1000018451.jpg
1000018453.jpg
Asus has some Strix Halo performance data on their page.

GPU Time Spy score seems to be ~4060 / low-power 4070. Not bad. Comparable to the 32 CU 7700S too, but at lower power (ofc 6 nm vs 4 nm). GPU efficiency seems similar to the 4060/4070 too, perhaps even slightly higher.

CPU R23 nT scores are also solid. Comparable at the top end to 16C Dragon Range from last gen. 60 W or so seems to be the sweet spot. Efficiency should be comparable to M4 Pro/Max/Strix Point for R23 nT specifically.