Will Bobcat be the home run AMD is looking for?

IntelUser2000 · Nov 18, 2010

CTho9305 said:
I think ISA still matters. Of course someone selling x86 chips will claim it doesn't... but if you actually look at the complexity involved in making x86 fast,

It matters, but not as much as before. Really. The single biggest difference between ARM chips and x86 chips is that the former have everything either on die or on package, which makes power management substantially easier.

It's like saying 2x cores cost 2x to manufacture. Not anymore, because cores themselves take up a much smaller proportion of the overall die nowadays. Caches, I/Os, and the routers take at least the other half.

ARM has been exclusively focusing on integrated smartphone chips while x86 vendors have been focusing on making performance as high as possible. The overlap hasn't happened yet. Let's see when they are both aimed at similar segments, i.e. similar performance levels.

Cogman · Nov 18, 2010

IntelUser2000 said:
It matters, but not as much as before. Really. The single biggest difference between ARM chips and x86 chips is that the former have everything either on die or on package, which makes power management substantially easier.

It's like saying 2x cores cost 2x to manufacture. Not anymore, because cores themselves take up a much smaller proportion of the overall die nowadays. Caches, I/Os, and the routers take at least the other half.

ARM has been exclusively focusing on integrated smartphone chips while x86 vendors have been focusing on making performance as high as possible. The overlap hasn't happened yet. Let's see when they are both aimed at similar segments, i.e. similar performance levels.

Honestly, though, wasn't that the goal of atom? Atom was originally intended to be a smartphone CPU. I would put that squarely in the position of being aimed at similar segments.

However, it is pretty interesting that OEMs decided that "Heck, we don't want this in our phones, but how about our laptops?" and proceeded to make laptops instead of phones with the atom. Not exactly what intel intended, but instead it created a new market they never thought of.

Idontcare · Nov 18, 2010

Cogman said:
Honestly, though, wasn't that the goal of atom? Atom was originally intended to be a smartphone CPU. I would put that squarely in the position of being aimed at similar segments.

However, it is pretty interesting that OEMs decided that "Heck, we don't want this in our phones, but how about our laptops?" and proceeded to make laptops instead of phones with the atom. Not exactly what intel intended, but instead it created a new market they never thought of.

Atom was/is too expensive and consumed/consumes too much power to ever be a contender in the mobile phone market to date.

It needs to consume <1W (preferably <500mW) and cost <$10 (preferably $3-$5).

OEMs put Atom to work where Intel guided them to put it; Intel talked netbooks and tablets from day one, IIRC.

Cogman · Nov 18, 2010

Idontcare said:
Atom was/is too expensive and consumed/consumes too much power to ever be a contender in the mobile phone market to date.

It needs to consume <1W (preferably <500mW) and cost <$10 (preferably $3-$5).

OEMs put Atom to work where Intel guided them to put it; Intel talked netbooks and tablets from day one, IIRC.

True. I guess I still remember this article, http://www.anandtech.com/show/2493/2, where Anand strongly suggested that the Atom would be perfect for smaller devices like GPS units and handheld gaming devices (though I agree with you, the pricing is way too high for those sorts of applications).

Voo · Nov 18, 2010

IntelUser2000 said:
It matters, but not as much as before. Really. The single biggest difference between ARM chips and x86 chips is that the former have everything either on die or on package, which makes power management substantially easier.

While I agree with the overall sentiment, x86 decoding still needs a lot of area compared to an ARM architecture, which also results in a higher energy consumption (because you can't really power down the decoding parts - although Intel has some neat ideas in SB to keep it low). That's usually put at around 5% of the overall energy consumption.

Vesku · Nov 18, 2010

I think Tech Report did a better job in regards to documenting Zacate performance.

http://techreport.com/articles.x/19981/1

Very similar to the Pentium SU4100 but with the beefier graphics component. Hyperthreading does help the Atom N550 outperform Zacate in demanding multithreaded encoding situations, something which is low on my priority list. I can see why AMD has a 4 core Bobcat on its late 2011 product forecast, as we may start to see more dramatic 3+ thread benefits in more everyday use software by then.

OCGuy · Nov 18, 2010

I'm going with no...

grimpr · Nov 18, 2010

Xpage said:
I wonder why AMD doesn't use SOI for these processors, a 10% increase in cost, is worth it if they can get an equal amount of CPU performance, so they can catch up to CULV processors intel has in CPU performance, since they can pass the cost onto the consumer. I'd pay an extra $5-10 for 300 more mhz CPU speed

There you have it.

http://fudzilla.com/notebooks/item/20888-amd-apple-deal-is-28nm-notebooks

Eug · Nov 18, 2010

OK. So maybe I'll buy a 28nm Zacate replacement in 2012 then.

I'm thinking such a part could do 1080p decode with CPU only, and with the GPU the CPU usage could be consistently under 20%, i.e. a very responsive netbook for all basic tasks.

Khato · Nov 18, 2010

grimpr said:
There you have it.

http://fudzilla.com/notebooks/item/20888-amd-apple-deal-is-28nm-notebooks

Interesting indeed. Though given how far out the products in question are, it's not very meaningful, especially when it comes to Apple. They may very well intend to use Fusion right now, but they might change their mind in a few months, and then they might change their mind again by the time they actually start system design at which point the decision might actually be final.

Really all this means is that Intel has some actual competition. Enough at least for Apple to put some more pressure on 'em in order to get what they want.

grimpr · Nov 18, 2010

Khato said:
Interesting indeed. Though given how far out the products in question are, it's not very meaningful, especially when it comes to Apple. They may very well intend to use Fusion right now, but they might change their mind in a few months, and then they might change their mind again by the time they actually start system design at which point the decision might actually be final.

Really all this means is that Intel has some actual competition. Enough at least for Apple to put some more pressure on 'em in order to get what they want.

Careful what you wish for. :biggrin:

CTho9305 · Nov 19, 2010

Voo said:
IntelUser2000 said:

It matters, but not as much as before. Really. The single biggest difference between ARM chips and x86 chips is that the former have everything either on die or on package, which makes power management substantially easier.

It's like saying 2x cores cost 2x to manufacture. Not anymore, because cores themselves take up a much smaller proportion of the overall die nowadays. Caches, I/Os, and the routers take at least the other half.

ARM has been exclusively focusing on integrated smartphone chips while x86 vendors have been focusing on making performance as high as possible. The overlap hasn't happened yet. Let's see when they are both aimed at similar segments, i.e. similar performance levels.

While I agree with the overall sentiment, x86 decoding still needs a lot of area compared to an ARM architecture, which also results in a higher energy consumption (because you can't really power down the decoding parts - although Intel has some neat ideas in SB to keep it low). That's usually put at around 5% of the overall energy consumption.

I guess my point is that there's impact beyond the decoder. For example, your load/store unit and fetch unit have to handle self-modifying code. Even without self modifying code, you can jump into the middle of an instruction (and e.g. skip a prefix), meaning you have to execute the same bytes with a different interpretation. Your integer datapath has to check if you perform a shift by 0 bits and set flags differently in that case versus the same instruction shifting by a nonzero amount. Your scheduler needs to handle special instructions that write results to two registers (e.g. MUL writing EDX:EAX for 32b*32b->64b). Your floating point unit needs to handle an 80-bit format nobody wants. Your address generation units have to pull in extra operands on one of the most important critical timing loops in a processor (load to use latency) or add additional complexity if you want to separate the segbase==0 case. A little area here, another gate on your critical path there, a little more energy per operation there... it adds up to a less optimal design. Sure, you could handle most of the ugliness with microcode (i.e. only add overhead to the decode), but if you want to make x86 go fast you end up dealing with it all over.
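Two of the quirks listed above are easy to poke at in software. Here's a rough Python sketch, not a model of any real pipeline: `mul32` and `shl32` are made-up helper names, and the flag handling is deliberately cut down to CF and ZF. It shows a 32-bit MUL producing two result registers (EDX:EAX) and a shift leaving the flags untouched when the masked count is 0.

```python
MASK32 = 0xFFFFFFFF

def mul32(eax, src):
    """Unsigned 32b*32b multiply; returns the (EDX, EAX) halves of the 64-bit product."""
    product = (eax & MASK32) * (src & MASK32)
    return (product >> 32) & MASK32, product & MASK32

def shl32(value, count, flags):
    """Shift left by count; returns (result, flags).

    flags is a dict with 'CF' and 'ZF' only (simplified; real x86 also
    updates SF, PF and, for 1-bit shifts, OF).
    """
    count &= 0x1F          # the count is masked to 5 bits for 32-bit operands
    value &= MASK32
    if count == 0:
        # Shift by 0: result unchanged AND flags keep their previous values.
        return value, dict(flags)
    result = (value << count) & MASK32
    carry = (value >> (32 - count)) & 1   # last bit shifted out
    return result, {"CF": carry, "ZF": int(result == 0)}

if __name__ == "__main__":
    edx, eax = mul32(0xFFFFFFFF, 0xFFFFFFFF)
    print(hex(edx), hex(eax))
    print(shl32(0x80000000, 1, {"CF": 0, "ZF": 0}))
    print(shl32(0x80000000, 0, {"CF": 0, "ZF": 0}))
```

In software the extra `if count == 0` branch is free. The point of the post is that in hardware the same special case turns into extra dependencies on the old flag values, and the two-register result turns into extra work for the scheduler.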

As processors turn into commodities where every option is basically "good enough" to accomplish most tasks and a customer can spend $1 on an ARM core or $1.05 on an x86 core (or $1 on the battery for the ARM-based system and $1.05 for the battery on the x86 system), wouldn't they save the 5%? x86 vendors took over from the old RISC vendors when the enormous volume (and good-enough margins) allowed them to out-invest the old RISC vendors, and the "small" designs weren't good enough for YouTube/Call of Duty. As the smaller, cheaper designs reach the "good enough" point, it seems like a lot of the x86 market could disappear. Game developers are already porting games to the PC from PowerPC on all 3 major consoles, and app developers are getting familiar with ARM for phones and tablets, so I see the porting/"legacy" argument getting weaker every year.

edit: Oh, and I see overlap between the A15, and possibly the high end of A9 and the low end of x86 (e.g. Atom) within the next year or two.

Idontcare · Nov 19, 2010

Excellent post CTho9305!

maddie · Nov 19, 2010

CTho9305 said:
I guess my point is that there's impact beyond the decoder. For example, your load/store unit and fetch unit have to handle self-modifying code. Even without self modifying code, you can jump into the middle of an instruction (and e.g. skip a prefix), meaning you have to execute the same bytes with a different interpretation. Your integer datapath has to check if you perform a shift by 0 bits and set flags differently in that case versus the same instruction shifting by a nonzero amount. Your scheduler needs to handle special instructions that write results to two registers (e.g. MUL writing EDX:EAX for 32b*32b->64b). Your floating point unit needs to handle an 80-bit format nobody wants. Your address generation units have to pull in extra operands on one of the most important critical timing loops in a processor (load to use latency) or add additional complexity if you want to separate the segbase==0 case. A little area here, another gate on your critical path there, a little more energy per operation there... it adds up to a less optimal design. Sure, you could handle most of the ugliness with microcode (i.e. only add overhead to the decode), but if you want to make x86 go fast you end up dealing with it all over.

As processors turn into commodities where every option is basically "good enough" to accomplish most tasks and a customer can spend $1 on an ARM core or $1.05 on an x86 core (or $1 on the battery for the ARM-based system and $1.05 for the battery on the x86 system), wouldn't they save the 5%? x86 vendors took over from the old RISC vendors when the enormous volume (and good-enough margins) allowed them to out-invest the old RISC vendors, and the "small" designs weren't good enough for YouTube/Call of Duty. As the smaller, cheaper designs reach the "good enough" point, it seems like a lot of the x86 market could disappear. Game developers are already porting games to the PC from PowerPC on all 3 major consoles, and app developers are getting familiar with ARM for phones and tablets, so I see the porting/"legacy" argument getting weaker every year.

edit: Oh, and I see overlap between the A15, and possibly the high end of A9 and the low end of x86 (e.g. Atom) within the next year or two.

This is probably going to sound very dumb, but might it be possible to improve x86 by removing the most unnecessary instructions over time?

Basically simplifying it and gradually making it more RISC-like, or is everything too intermixed?

Voo · Nov 19, 2010

maddie said:
This is probably going to sound very dumb, but might it be possible to improve x86 by removing the most unnecessary instructions over time?

Basically simplifying it and gradually making it more RISC-like, or is everything too intermixed?

And what do you propose to do when running old software that uses those "unnecessary" instructions? If that were possible Intel would've done it decades ago, but backwards compatibility is way too important in the market to do anything there. They do use µops for exactly that reason, though, i.e. to split complicated instructions into several smaller ones.

On the other hand I'm pretty confident that we won't get x86 in smartphones, because the main advantage of x86 over other ISAs (millions of existing software products) just doesn't exist in that form there (actually the contrary, although with Java/.NET becoming more important that may become obsolete in the future).

@CTho9305: Good post - the x86 architecture is complicated enough that you can find strange things everywhere you look (although op code prefixes and variable length opcodes up to 15 bytes [theoretically at least] still win in my book)

But I think you agree that you can handle most of it while decoding (and by making sure that modern compilers use only instructions that are heavily optimized and get around the compatibility overhead), and I think Intel handles most of it that way as well. And one thing's for sure: I wouldn't want to be the person having to verify their decoders.

I think the problem in the desktop/laptop space is this: which do you prefer, a 5% more expensive CPU that can run all the programs you already have, or a marginally cheaper CPU for which only a handful of programs are available? I, and I'm pretty sure the majority of people, would take the first choice. On the other hand, who'd write a program for an ARM CPU that doesn't target a smartphone?
With the advance of bytecode interpreters that problem should become smaller and smaller, but atm we're not there I think.
The reason why people can port games from consoles to PCs is that there's already a framework that handles most of the complexity - for other applications these don't exist.

cbn · Nov 19, 2010

Voo said:
While I agree with the overall sentiment, x86 decoding still needs a lot of area compared to an ARM architecture, which also results in a higher energy consumption (because you can't really power down the decoding parts - although Intel has some neat ideas in SB to keep it low). That's usually put at around 5% of the overall energy consumption.

Does anyone have information on how x86 decoding scales for smaller cores like atom and bobcat?

Would it be safe to assume less x86 decoder area is needed for a smaller cpu core?

Mr. Pedantic · Nov 19, 2010

No, I don't think so. GPU performance is good, very good. But for me, the CPU is far too weak. Atom is hardly a performance whiz, and claiming that your chip beats Atom isn't really very impressive at all, even if you manage it with cores that are half the size of Atom's. I would have very much liked to see bigger CPU cores with much better performance; in terms of die size, price, power, etc. I'm thinking it would still have been competitive with Intel's current ULV offerings, but it would have been a much better competitor on the performance front.

veri745 · Nov 19, 2010

Mr. Pedantic said:
No, I don't think so. GPU performance is good, very good. But for me, the CPU is far too weak. Atom is hardly a performance whiz, and claiming that your chip beats Atom isn't really very impressive at all, even if you manage it with cores that are half the size of Atom's. I would have very much liked to see bigger CPU cores with much better performance; in terms of die size, price, power, etc. I'm thinking it would still have been competitive with Intel's current ULV offerings, but it would have been a much better competitor on the performance front.

That's kind of the opposite of what bobcat was designed for. Low performance, low power, and SMALL! (read low cost).

Bigger cores imply higher costs, which means they would probably have to be sold in the same market (pricing-wise) as what CULV is hitting. That pretty much removes them from the $300-$500 laptop space.

cbn · Nov 19, 2010

veri745 said:
That's kind of the opposite of what bobcat was designed for. Low performance, low power, and SMALL! (read low cost).

Bigger cores imply higher costs, which means they would probably have to be sold in the same market (pricing-wise) as what CULV is hitting. That pretty much removes them from the $300-$500 laptop space.

I agree. These bobcat cores sound like they are sized just right. If they were any bigger they would no longer be able to compete with dual core netbook atom.

http://ark.intel.com/Product.aspx?id=50154&processor=N550&spec-codes=

9 watts of Ontario vs 8.5 watts of N550 atom? How much mileage did AMD get out of the extra .5 watt?

Idontcare · Nov 19, 2010

Computer Bottleneck said:
Does anyone have information on how x86 decoding scales for smaller cores like atom and bobcat?

Would it be safe to assume less x86 decoder area is needed for a smaller cpu core?

You have the closest thing to the world's most perfect resource for answering this question right here in this thread, and I can't emphasize that enough.

Take whatever scraps of credibility I have in this forum, call that X, and elevate that to a few powers, say X^6, and that is how serious I am in making this statement.

(let's see if he answers your post, he hasn't yet)

cbn · Nov 19, 2010

Opinions on AMD's Hudson Controller Hub/Chipset vs Intel's NM10 Express chipset?

Fusion Controller Hub description from Part 1 of the Anandtech Brazos Preview

Intel Ark listing of NM10 Express chipset
Intel website NM10 Express overview

Anyone want to speculate on power consumption differences? (NM10 Express is listed at 2.1 watts). How about differences in features?

soccerballtux · Nov 19, 2010

Idontcare said:
You have the closest thing to the world's most perfect resource for answering this question right here in this thread, and I can't emphasize that enough.

Take whatever scraps of credibility I have in this forum, call that X, and elevate that to a few powers, say X^6, and that is how serious I am in making this statement.

(let's see if he answers your post, he hasn't yet)

a scrap sounds like <1 credibilities, so that's exponential decay, not exponential growth

cbn · Nov 19, 2010

Idontcare said:
You have the closest thing to the world's most perfect resource for answering this question right here in this thread, and I can't emphasize that enough.

Take whatever scraps of credibility I have in this forum, call that X, and elevate that to a few powers, say X^6, and that is how serious I am in making this statement.

(let's see if he answers your post, he hasn't yet)

It would be great to get an answer on that.

I keep wondering why atom has such a decreased performance per watt compared to other laptop and desktop processors? IO power budget scaling differences vs. x86 decoder scaling differences vs. other differences?

extra · Nov 19, 2010

AMD is going to sell a *ton* of these.

cbn · Nov 19, 2010

http://techreport.com/articles.x/19937

More information on Hudson's power consumption.

What does Hudson look like? I don't have a sexy chip shot with a quarter for reference, but AMD's spec sheet paints a pretty good picture. The Hudson FCH is built on a 65-nm fab process and has a 23 x 23-mm, 605-ball BGA package—slightly larger than the APU it accompanies. Power consumption ranges from 2.7W to 4.7W for "typical configurations." Inside Hudson lurk the four PCIe Gen1 lanes required for the UMI interface, an extra four PCIe Gen2 lanes, six 6Gbps Serial ATA connections, 14 USB 2.0 connections, and built-in fan control logic.

So if we compare N550 atom vs Ontario we would have the following power consumption totals:

N550 Atom (8.5 watts) + 2.1 watts for NM10 Express chipset (built on 130nm process) = 10.6 watts

Ontario (9 watts) + 2.7 watts for Hudson at the low end of its range (built on 65nm process) = 11.7 watts

Total TDP difference = 1.1 watts
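The same sums as a quick Python sketch, in case anyone wants to plug in other numbers. The wattages are the ones quoted in this thread (TDP and "typical configuration" figures, not measured draw), and `platform_tdp` is just a helper name made up for the example. It also runs the 4.7W upper end of the Hudson range quoted above, which the totals above leave out.

```python
def platform_tdp(cpu_watts, chipset_watts):
    """Sum CPU/APU and chipset figures, rounded to 0.1 W to hide float noise."""
    return round(cpu_watts + chipset_watts, 1)

# Figures as quoted in the thread; treat them as rated numbers, not measurements.
atom_n550 = platform_tdp(8.5, 2.1)       # N550 + NM10 Express
ontario_low = platform_tdp(9.0, 2.7)     # Ontario + Hudson, low end of range
ontario_high = platform_tdp(9.0, 4.7)    # Ontario + Hudson, high end of range

gap_low = round(ontario_low - atom_n550, 1)
gap_high = round(ontario_high - atom_n550, 1)

print(f"N550 + NM10:      {atom_n550} W")
print(f"Ontario + Hudson: {ontario_low} W to {ontario_high} W")
print(f"Gap vs. Atom:     {gap_low} W to {gap_high} W")
```

So the 1.1 watt gap is the best case for Ontario; with Hudson at the top of its quoted range the gap is closer to 3 watts.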
