AMD Carrizo APU Details Leaked


Homeles

Platinum Member
Dec 9, 2011
2,580
0
0
The problem is that if frequency is increased further, the TDP goes up dramatically on the latest nodes. So there's really not much headroom for increasing the CPU frequency and paying with some extra TDP.

See what happens to the power consumption when the frequency is increased beyond 3.5 GHz (or 3.9 GHz with Turbo):

[Image: clock speed versus power consumption for the 2600K and 3770K]

For reference, the image was created by IDC.
Yeah, and 14nm would look something like this:

[Image: projected Broadwell (14nm) power consumption vs. frequency curve]

Even a modest 10% frequency boost would put an average Broadwell back where good Sandy Bridge chips clocked (~5 GHz).
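To make the shape of those curves concrete, here's a minimal toy model (my own sketch with invented coefficients, not measured data): dynamic power scales roughly as P = C * V^2 * f, and clocking past the knee requires raising voltage, so power grows much faster than frequency.

Code:
# Toy model of CPU power vs. frequency -- illustrative only, not measurements.
# Dynamic power is roughly P = C * V^2 * f; reaching higher clocks requires
# raising the core voltage, so power climbs superlinearly past the knee.

C = 10.0        # effective switched-capacitance term (arbitrary units)
V_BASE = 1.0    # assumed core voltage at and below the knee
F_KNEE = 3.5    # GHz; assumed point where voltage has to start climbing
V_SLOPE = 0.15  # assumed extra volts needed per GHz beyond the knee

def power(f_ghz):
    """Estimated dynamic power at a given clock frequency (model units)."""
    v = V_BASE + V_SLOPE * max(0.0, f_ghz - F_KNEE)
    return C * v ** 2 * f_ghz

for f in (3.0, 3.5, 4.0, 4.5, 5.0):
    print(f"{f:.1f} GHz -> {power(f):5.1f} (model units)")

The exact numbers are made up; the point is that once voltage has to rise with frequency, the V^2 term makes power grow far faster than the clock, which is the knee both graphs show.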
 

Fjodor2001

Diamond Member
Feb 6, 2010
4,693
764
126
Yeah, and 14nm would look something like this:

[Image: projected Broadwell (14nm) power consumption vs. frequency curve]

Even a modest 10% frequency boost would put an average Broadwell back where good Sandy Bridge chips clocked (~5 GHz).

Wow, you've already got the Broadwell power consumption vs CPU Frequency graph ahead of everyone else. I'm impressed!

I'm even more impressed that your red Broadwell graph shows 0 W power consumption at ~1.7 GHz and below. Amazing power efficiency! :D
 
Last edited:

Homeles

Platinum Member
Dec 9, 2011
2,580
0
0
Wow, you've already got the Broadwell power consumption vs CPU Frequency graph ahead of everyone else. I'm impressed!

I'm even more impressed that your red Broadwell graph shows 0 W power consumption at ~1.7 GHz and below. Amazing power efficiency! :D
The lengths you go to avoid being proven wrong, and avoid learning anything... you never cease to amaze me.
 

Homeles

Platinum Member
Dec 9, 2011
2,580
0
0
I request thread title change.
The topic of conversation -- transistor frequency and power scaling on state-of-the-art process nodes -- applies to Kaveri as well. If you'd read the damn thread instead of taking conversations out of context, this forum would be so much better off.

This is what, the 5th time -- at least -- that you've done this?
 

PPB

Golden Member
Jul 5, 2013
1,118
168
106
The lengths you go to justify derailing a thread will never cease to amaze me.

Following your fallacy, you may as well say that you're actually on topic because you're discussing the capabilities of a CPU, and since Carrizo also happens to have a CPU in it...

Save us from your not-so-awesome paint skills and start discussing the topic at hand, please?
 

Erenhardt

Diamond Member
Dec 1, 2012
3,251
105
101
The topic of conversation -- transistor frequency and power scaling on state-of-the-art process nodes -- applies to Kaveri as well. If you'd read the damn thread instead of taking conversations out of context, this forum would be so much better off.

This is what, the 5th time -- at least -- that you've done this?

I'm doing this because I get infractions for it. Maybe I'm naive to think the same rules apply to others.
 

Homeles

Platinum Member
Dec 9, 2011
2,580
0
0
I'm doing this because I get infractions for it. Maybe I'm naive to think the same rules apply to others.
It's sad that I'm having to do this:
Original comment in question:
Regarding the assumed use of 28 nm instead of 20 nm for Carrizo, does it really matter that much on a desktop CPU?

Sure, the TDP will be a bit higher, but that is not so important on the desktop as long as we're staying within reasonable limits. Also, it should be cheaper to buy chips with X billion transistors on 28 nm than on 20 nm, right? If so, it seems an obvious choice to make until the price of 20 nm has come down (I know TSMC/GF 20 nm for big cores is not available yet, but it likely will be around the time Carrizo is released).
My reply:
AMD is severely lacking in performance. 20nm provides more performance. New nodes are just as big a deal as architecture updates, if not more so. 20nm is a bit of an outlier in this regard because of its somewhat higher cost, but the performance benefits are still there.

However, 20nm at GloFo is not a high-performance process. The only flavor of 20nm they offer is 20nm Low Power Mobility (LPM).
His rebuttal:
Is that really true nowadays? Do you mean it brings more performance because node shrinks allow higher clock frequencies? Because the latest node shrinks have not brought much of that.
Next:
I'm talking about power and price. That's all AMD's biggest customers care about.
Etc:
But previously you said node shrinks improved performance. So now you're shifting to them lowering power consumption and price instead, because you could not provide any evidence for what you claimed before?

Also, do they really lower the price? That assumes it is cheaper for AMD to buy a chip with X billion transistors on 20 nm than on 28 nm from GF/TSMC. I don't think that will be the case at the time Carrizo is expected to be released.

Power consumption is not that important on a desktop CPU/APU. On mobile it's another story, though.
More:
You've been here since 2010, and you still don't know what new nodes do?

So what can a new process node do? TSMC claims up to a 30% improvement in performance, or 25% lower power, for their 20nm node versus their 28nm node. When was the last time you saw AMD boost their IPC by 30%? GloFo claims a 42% increase in performance or a 61% power reduction with 20nm LPM vs. 28nm SLP (a testament to the benefits of going gate-last -- TSMC already saw those gains).

High performance CPUs won't fully realize those gains, but mobile does (or gets much closer to it). Mobile is where the money is right now, and that is the main reason why these new nodes are so critical to AMD's success.

But you were talking about desktop, of course. And yes, it's still a big deal there. Remember that 28nm was a crappy node for AMD because AMD dropped SOI. The traditional scaling improvements vs. 32nm were there, but they were hidden by the regression from dropping SOI.

20nm would have been a great node for AMD's desktop processors, because GloFo is moving from the lower-performing gate-first to the higher performing gate-last method. GloFo's not offering a high performance variant, but the same performance boost would apply if they theoretically ported their big cores to TSMC.
You're rambling, and you don't seem to know what node shrinks do nowadays. 45 nm -> 22 nm has brought only minor frequency increases for Intel desktop CPUs. There you have it. The performance improvement has come from uArch changes instead, i.e. node shrinks have not brought the performance improvements.
.
How many times have I been over this now? 22nm FinFETs had regressed performance at high voltages. At desktop operating voltages, there was next to zero performance benefit, so stock clocks sat still. At higher voltages, there was worse performance, hence the reduced maximum overclocks even when delidded.

This is FinFET-specific. It does not apply to AMD. It is also a one-time hit -- you only go from planar to FinFET once. In other words, Intel's 14nm won't have this issue.

Now, let's look at Lynnfield (LGA 1156) to Sandy Bridge (LGA 1155). The Nehalem-based i7-870 had a base clock of 2.93 GHz and a max turbo of 3.6 GHz. The Sandy Bridge i7-2600K had a base clock of 3.4 GHz and a max turbo of 3.8 GHz. That's an improvement of 16.0% and 5.5%, respectively (a quick check of this arithmetic follows at the end of this post).

If we're getting IPC gains of around 10% each generation today, a 10% boost in frequency is nothing to laugh at. As I said before, new nodes are just as important as architecture revisions.
.
No, there's just a penalty associated with moving to multigate devices. Once you've taken the hit, you can go back to the gains from traditional scaling.

http://download.intel.com/pressroom/pdf/kkuhn/Kuhn_22nm_Device.pdf

Slide 13.

These designs are power limited. The clock frequencies these devices reach are a function of power -- at X watts, Y frequency can be achieved.
The problem is that if frequency is increased further, the TDP goes up dramatically on the latest nodes. So there's really not much headroom for increasing the CPU frequency and paying with some extra TDP.

See what happens to the power consumption when the frequency is increased beyond 3.5 GHz (or 3.9 GHz with Turbo):

[Image: clock speed versus power consumption for the 2600K and 3770K]

For reference, the image was created by IDC.
Once again, conversations evolve. Once again, you're throwing a fit over that fact. Are people not allowed to have conversations on this forum?

I'd really appreciate it if, just for once, there could be a conversation in an AMD-related thread, with a brief mention of Intel, that you didn't try to get renamed/closed, etc.
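As an aside, the base/turbo percentages quoted above are easy to verify; here's a quick check (clock figures taken straight from the post):

Code:
# Verify the i7-870 (Lynnfield) vs. i7-2600K (Sandy Bridge) clock gains.
clocks = {
    "base":  (2.93, 3.40),  # GHz: (i7-870, i7-2600K)
    "turbo": (3.60, 3.80),
}
for kind, (old, new) in clocks.items():
    print(f"{kind}: {(new / old - 1) * 100:.1f}% improvement")
# Prints base: 16.0%, turbo: 5.6%
# (the post's 5.5% is the same 3.8/3.6 ratio rounded down).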
 
Last edited:

NostaSeronx

Diamond Member
Sep 18, 2011
3,815
1,294
136
Since we are this far into the thread:

Carrizo/Toronto is:
- Models 60h-6Fh of the Family 15h microarchitecture.
- Supports AVX2, BMI2, MOVBE, and RDRAND.
- GPU uses the Volcanic Islands ISA; unknown if the CU count increased or decreased.
- Memory controller supports DDR3 and DDR4, with certain models supporting ECC and registered DIMMs.
- FP4 and SP2 will possibly have an integrated southbridge, while it is disabled on FM2+.
- UMI and GPP will be using the PCIe 3.0 spec.
- Carrizo's server counterpart "Toronto" supports data poisoning with ECC.
- Carrizo/Toronto will be using GF28A instead of GF28SHP.
 

inf64

Diamond Member
Mar 11, 2011
3,884
4,692
136
Increasing the CU count (SPs) is pointless with current memory bandwidth capabilities. Carrizo would end up with more wasted die space for the iGPU if paired with dual-channel DDR3...
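For a rough sense of the ceiling being discussed, here's a back-of-the-envelope peak-bandwidth calculation (assuming DDR3-2133 on two 64-bit channels; sustained bandwidth is lower due to controller efficiency):

Code:
# Theoretical peak bandwidth of a dual-channel DDR3-2133 setup.
transfers_per_sec = 2133e6  # MT/s for DDR3-2133
bytes_per_transfer = 8      # one 64-bit channel moves 8 bytes per transfer
channels = 2

peak_gb_per_s = transfers_per_sec * bytes_per_transfer * channels / 1e9
print(f"Peak: {peak_gb_per_s:.1f} GB/s")  # ~34.1 GB/s

Even midrange discrete GPUs of this era have several times that, which is why extra CUs sitting behind the same two channels would mostly go idle.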
 

Erenhardt

Diamond Member
Dec 1, 2012
3,251
105
101
Increasing the CU count (SPs) is pointless with current memory bandwidth capabilities. Carrizo would end up with more wasted die space for the iGPU if paired with dual-channel DDR3...

What about a better memory controller? I think there is room for improvement.
Let's say they increase memory controller efficiency and improve memory bandwidth by 30%; does GPU performance (fps) increase by 30% as well, if there are enough CUs?
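One way to reason about that: if the iGPU is purely bandwidth-bound, fps scales roughly linearly with bandwidth until shader throughput becomes the limit. A toy roofline-style sketch (all numbers invented for illustration):

Code:
# Toy roofline-style model: fps is capped by whichever of memory bandwidth
# or shader throughput runs out first. All figures are invented.

def fps(bandwidth_gb_s, compute_fps_cap):
    BYTES_PER_FRAME_GB = 0.5  # assumed memory traffic per rendered frame
    bw_fps_cap = bandwidth_gb_s / BYTES_PER_FRAME_GB
    return min(bw_fps_cap, compute_fps_cap)

base = fps(30.0, 90.0)           # bandwidth-bound: 60 fps
boosted = fps(30.0 * 1.3, 90.0)  # +30% bandwidth: 78 fps
print(f"{(boosted / base - 1) * 100:.0f}% faster")  # 30%

So yes: while bandwidth is the binding constraint, a 30% bandwidth gain maps to roughly 30% more fps, but only until the compute roof takes over; with too few CUs the gain stops short of 30%.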
 

inf64

Diamond Member
Mar 11, 2011
3,884
4,692
136
What about a better memory controller? I think there is room for improvement.
Let's say they increase memory controller efficiency and improve memory bandwidth by 30%; does GPU performance (fps) increase by 30% as well, if there are enough CUs?
I doubt they can do that. If they could, why haven't they already increased IMC efficiency by a decent amount with SR?
There are only two ways to do it in the near future: go wider with the memory channels (won't happen with DDR3 and FM2+) or go with eDRAM (won't happen, as it would increase the cost drastically).

They can wait until 2016 and do 20nm parts with stacked eDRAM and DDR4 support, but then they would go up against the next(+) gen of Intel "APU" products, since Intel won't be standing still.
 

AtenRa

Lifer
Feb 2, 2009
14,003
3,362
136
Gaming performance is not the primary objective for AMD's APUs; compute is. A higher iGPU shader count will bring higher computational performance with OpenCL, and that is what AMD is after.
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
146
106
They can wait until 2016 and do 20nm parts with stacked eDRAM and DDR4 support, but then they would go up against the next(+) gen of Intel "APU" products, since Intel won't be standing still.

I doubt you'll see any CPU with stacked DRAM in 2016. More like 2018+. The cost will simply be prohibitive for IGPs and lower-end dGPUs. Stacked memory is also meant to replace DIMM memory.
 

NostaSeronx

Diamond Member
Sep 18, 2011
3,815
1,294
136
Why not have an "ambidextrous" cache that operates like an L3 cache for all units?
 
Last edited:

PPB

Golden Member
Jul 5, 2013
1,118
168
106
What people don't get is that stacked RAM, at its current capabilities and density, won't do any greater good than helping performance at lower resolutions/quality settings. Iris Pro performance tanks badly once you saturate that L4 cache. The longer but better road will always be to improve system RAM bandwidth. DDR4 is the best route for that.

The only niche stacked RAM would serve in the HSA scheme of things is as an ambidextrous, small L3-style cache with actual performance gains for the CPU part of the APU. My bet is Intel didn't gain CPU performance from their L4 in Iris Pro because they already had a well-performing L3. If pulled off right, AMD could save the traditional L3 die area, because the stacked RAM could fill that role.
 

norseamd

Lifer
Dec 13, 2013
13,990
180
106
Since we are this far into the thread:

Carrizo/Toronto is:
- Models 60h-6Fh of the Family 15h microarchitecture.
- Supports AVX2, BMI2, MOVBE, and RDRAND.
- GPU uses the Volcanic Islands ISA; unknown if the CU count increased or decreased.
- Memory controller supports DDR3 and DDR4, with certain models supporting ECC and registered DIMMs.
- FP4 and SP2 will possibly have an integrated southbridge, while it is disabled on FM2+.
- UMI and GPP will be using the PCIe 3.0 spec.
- Carrizo's server counterpart "Toronto" supports data poisoning with ECC.
- Carrizo/Toronto will be using GF28A instead of GF28SHP.

Will Carrizo have a higher clock rate than Kaveri?
 

norseamd

Lifer
Dec 13, 2013
13,990
180
106
You know, this question was actually on my mind:

What if you stacked an L4 cache on top of a processor and also used DDR4 normally?
 

OatisCampbell

Senior member
Jun 26, 2013
302
83
101
Gaming performance is not the primary objective for AMD's APUs; compute is. A higher iGPU shader count will bring higher computational performance with OpenCL, and that is what AMD is after.

As a general consumer who runs office apps and games, can this benefit me? :confused: