Yes, but it is far easier to develop a custom interconnect than to develop a custom ARM core. Even AMD could afford to buy an interconnect maker, and AMD's financial position is subpar.
AMD's SeaMicro will make a difference on highly complex or specific designs. For your everyday servers, the chip itself will matter a lot more than the interconnect.
So they are switching to cheap, vanilla ARM servers, most likely inferior to their competitors', which means that whatever gain there is comes from manufacturing the boxes themselves. I doubt that business will be able to muster enough margin to sustain the company in the long run.
I cannot but agree. MIPS beat them all to the punch by years in some ways, but not by making anything COTS. Even if there end up being enough markets (no single niche will be enough for more than one company, I don't think), AMD would only have a significant advantage vs. Calxeda and Baserock, who only support rather slow interconnects. For fighting thermal density with parallel workloads, I'd be willing to bet that the forthcoming Atoms will smash the upcoming ARM SoCs, and that AMD would have been better off making a RAS-heavy Jaguar and trying to stuff it into every niche they could find (and the same chip with those features turned off could be an OK mobile and AIO CPU).
JHH is at least working towards known-profitable areas, like HPC, but I'm not sure how well AMD will be able to hold out with other ARM vendors fighting over small margins, unless they have another trick up their sleeve (so far, they don't seem to).
Samsung is working on ARM-based server parts for release in 2014. At 32 bits, ARM is inadequate for large server applications, but with the v8 instruction set it becomes viable.
32-bit isn't viable for anything fancier than a SheevaPlug. Yes, it works, but 64-bit makes the world, as the CPU, OS, RAM-heavy services, and file-heavy services see it, orders of magnitude simpler once you get into using GBs of RAM and 100s of MBs of data sets. Wider registers are a nice bonus for small copies and DB work, too. It's the difference between admins getting pissed off and disabling the OOM killer, then recompiling the kernel, then getting pissed off all over again when they still run out of memory with plenty available...versus occasionally having to tweak a VM tunable.
For now, I don't think anybody is running LAMP on GPUs. That may change, but my understanding is that for an individual thread, a GPU is very slow, to the point where, if you have any latency restrictions at all, a GPU thread is not likely to meet your performance requirements. I've heard that for a number of server workloads, the current generation of low-power processor cores (from all major vendors) just don't have quite enough performance to get a foot in the door. Websites care very much about how long a page takes to load, and if you can't, e.g., serve the AnandTech forums in under 100 ms (a random, made-up number), it doesn't matter how cheap/low-power/etc. your chip is.
Pretty much. I could see OLAP going GPU, but (Apache/Nginx)+(MySQL/PostgreSQL)+(PHP/Python/Ruby), or anything like that, will benefit much more from big caches and fast memory access than from more processors. Not only that, but they benefit greatly from caching in more abstract ways than the memory cache: sharing chunks of files through the OS, sharing their own data structures in memory (Python will explicitly need a 3rd-party object cache, PHP can use APC with a few config changes), and implicitly sharing most of their VM contents (depending on server config, anyway; upstream defaults are not good for web apps). In addition, MySQL and PostgreSQL both scale very well to many cores, provided there are also many concurrent transactions.
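The APC bit really is only a couple of ini lines. A minimal sketch (the size here is illustrative, not a tuned recommendation):

```ini
; php.ini - turn on APC's shared-memory opcode + user cache
apc.enabled  = 1
apc.shm_size = 64M   ; illustrative; size to fit your code + user data
```

After that, the opcode cache is automatic, and apps can share data structures across requests via apc_store()/apc_fetch().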
So, even in cases where more threads, cores, etc., could be more useful than fast response times per thread, you'd really want an expandable NUMA system, where each card adds to the total CPUs and RAM (and with a kernel aware of the layout), rather than each card being a little ARM blade server. Also, a program counter can't reasonably be shared between threads, even when they're running the same binary, so a GPGPU's vector lanes would be unusable: scaling would be limited to the number of independent program counters the processor supports (vendors call vector data lanes "threads," so it's important to make the distinction).
I can see three possibly useful cases:
1. Shared hosting (standard and VPS), where everybody gets slow 'servers' anyway, and the hosting service would benefit from reduced thermal density and from not needing to share nearly as much cache and RAM between sets of sites.
2. Massive low-bandwidth parallel processing, where time to completion matters less than being able to have more jobs in flight. With a system like Hadoop, or Erlang, the scaling could be excellent. The idea here is to look at them as sets of networked DRAM banks that happen to have some CPUs attached.
3. Memcache servers. I'm not sure how they'd actually perform here relative to fewer, larger servers per U, but once 64-bit arrives, each node could have 8+ GB RAM, so you might be able to pack much more total cache per U than with standard servers.
Will v8 cores clock at 3 GHz? 4 GHz? Or am I missing something?
All you're missing is that it's new, and there's obviously some demand. But with no useful parts out, nobody knows exactly how it will shape up. So anybody who can wants to be ready over the next few years, in case it ends up being more than a few users totaling 1%. If they are good enough for enough users, and/or people find unexpected ways to exploit them, and it explodes, you don't want to be the guy who said it would be a waste of time to pursue, do you?