IBM unveils Power8 and OpenPower pincer attack on Intel’s x86 server monopoly

24601

Golden Member
Jun 10, 2007
1,683
40
86
IBM should really put an x86-64 decode engine in their chip designs to sweeten the deal and create a competitor to Intel for compatibility oriented markets.

The lack of x86-64 support basically limits them to the markets where performance/watt matter almost exclusively.

Maybe they could fleece 2 core POWER8 processors to consumers to get GLOFO a way to unload their wafers when they fail in true GloFo fashion.

More powerful per core performance would be extremely desirable in many Windows Server x64 applications like Game Server hosting, actual Gaming, any horribly threaded program that needs high per core performance in general.
 

DrMrLordX

Lifer
Apr 27, 2000
22,947
13,033
136
I completely mis-wrote that comment. I should have my BS in Physics revoked :$

Eh it's alright. We've all been there in some fashion or other.


I think that Broadwell will make a larger impact. The advantage Haswell brings over IVY won't be as big as Broadwell will bring over Haswell - mainly a 30% decrease in TDP at a given clock rate, which will lead to more significant savings in power bills. And, BW still has AVX2 and all the architectural improvements that HW has.

Broadwell should be a big deal for Intel, but it's gonna be a while unless Intel is in a big hurry to push past Haswell-EP/EX. And who knows, maybe they are?

Anyway, Broadwell will probably square off against die-shrunk POWER8, provided Samsung/Glofo do as well with their 14nm process as the press snippets indicate.

Some comparisons of Ivy Bridge and POWER8:

SPECint_rate2006
2.7 GHz Intel Xeon E5-2697 v2 - 24 cores - 934
3.52 GHz POWER8 - 24 cores - 1750

SPECfp_rate2006
2.7 GHz Intel Xeon E5-2697 v2 - 24 cores - 649
3.52 GHz POWER8 - 24 cores - 1370

Okay, so where are the comparisons of POWER8 vs E7-8895 v2? The E5-2697 v2 launched in 2013 on an older stepping.

i.e. Broadwell 18 core will still not match Today's 12 core POWER8

No surprises there. I don't think anyone really expects Xeon to beat POWER8 on a per-core basis. I think they expect it to beat POWER8 on a per watt basis.

POWER8 is not standing still in the meantime either with higher clocked POWER8 chips to be released as well as future POWER8+ shrink.

I would expect to see POWER8 on Samsung's 14nm process eventually. Depending on how much you buy the hype coming from licensees, that could be pretty soon.

Well stated @thunng8! What @DrMrLordX chooses to overlook is you don't compare a single x86 server to a single Power server. Most businesses with 1,000 employees or more will have an IT staff, maybe an operations staff and one or two data centers of varying sizes and sophistication. I'll accept we can run a Power server (it can be P7 or P8) at 90% utilization - let's just say 100% of TDP. The server could be a 1, 2, 4, 8 or 32 socket server depending on the environment. Because of the efficiency of the Power Hypervisor it is doing the work that a 12, 16, 24, 32, ... core x86 or SPARC server can do and do it with 1/4th or 1/8th or 1/20th (I am just picking ratios I have used in the past against these platforms) the compute resources. Now, run Oracle on that x86 server at 25% util on a 24 core server (which is 6 effective cores) it will require 12 licenses @ $47,500/license. That is roughly $600k + 22% maint per year starting with year 1. On P8, let's say only 3 cores are required (it's my story so I can tell it how I want :) ) that would be 3 Oracle licenses at $150K + 22% maint per year. Most x86 shops will deploy more x86 servers for each Oracle workload. For Power, they will just stack them on the same server. If we add a 2nd workload the x86 is another $600k totaling $1.2M + 22% maint whereas the Power is $300k + 22% maint. See how this scales? I don't want you to think I'm calling your baby ugly for no reason. The reality is, x86 vendors are positioning their servers to run these enterprise workloads where Power, SPARC, PA-RISC, Itanium, Alpha, MIPS and others have been for decades. Performance per core is key. When you have to buy server after server after server like x86 then perf/watt makes perfect sense. Just don't try to apply what is important to you to Power as it isn't relevant.

Uh, socket license issues have been there for years. That hasn't stopped a lot of server farms from picking up x86 machines in the past (which have required more sockets for the same level of computational power vs. other platforms), which is why Intel now controls about 95% of the server market. By your logic, nobody in their right mind would adopt microservers or blades for anything, and yet, that segment of the server market is on fire. I mean, who wants to pay socket licenses on a bunch of Atom or ARM-based servers when you have to pick up maybe 4-8 times as many sockets? Apparently someone does.

IBM should really put an x86-64 decode engine in their chip designs to sweeten the deal and create a competitor to Intel for compatibility oriented markets.

The lack of x86-64 support basically limits them to the markets where performance/watt matter almost exclusively.

It didn't work for Itanium. Who wants to buy a hot 190W chip and then run it in some kind of compatibility mode that will reduce performance?

Maybe they could fleece 2 core POWER8 processors to consumers to get GLOFO a way to unload their wafers when they fail in true GloFo fashion.

I hope IBM's official position on Globalfoundries is more positive than that, since they need someone else to make their chips for them now. Or do you expect TSMC to do it? Depending on whether or not you believe TechEye, GloFo is the one buying out IBM's fabs.

There's no telling what consumers will buy, but I have a feeling that dual-core POWER8 chips would not be a smash hit, especially if they spent 99% of their time decoding someone else's instruction set at a performance hit. With all the changes going on in the desktop and mobile space, it is more likely that people will be willing to jump on a different OS platform. Google already managed to get an enormous number of mobile users to switch to Linux (well, sort of). If that's possible, then nearly anything is possible.

So what would a dual-core POWER8 chip @ 3.52 ghz look like, anyway? If it scaled down perfectly, 32W TDP and 16 threads? That could make an okay console chip, though I think you'd be better off cutting the chip down to a single core and filling out the rest of the power budget with a CAPI-based device (GPU or what have you).

More powerful per core performance would be extremely desirable in many Windows Server x64 applications like Game Server hosting, actual Gaming, any horribly threaded program that needs high per core performance in general.

The real question here is: how attached are game server hosting firms to the Windows server platform? Typically, end-users are the ones most attached to their operating systems and legacy software.
 

Headfoot

Diamond Member
Feb 28, 2008
4,444
641
126
Race to Idle is highly irrelevant in the server space. Idle servers have gone the way of the dodo in the virtualized data center. And any data center worth talking about virtualizes. Maybe when microservers pick up and we can move back to physical cores, then we can talk about race to idle again
 

Phynaz

Lifer
Mar 13, 2006
10,140
819
126
Now, run Oracle on that x86 server at 25% util on a 24 core server (which is 6 effective cores) it will require 12 licenses @ $47,500/license. That is roughly $600k + 22% maint per year starting with year 1. On P8, let's say only 3 cores are required (it's my story so I can tell it how I want :) ) that would be 3 Oracle licenses at $150K + 22% maint per year. Most x86 shops will deploy more x86 servers for each Oracle workload. For Power, they will just stack them on the same server. If we add a 2nd workload the x86 is another $600k totaling $1.2M + 22% maint whereas the Power is $300k + 22% maint.

Your wall of text is pretty unreadable. Also, your Oracle license examples are way off, Oracle charges twice as much for a Power license as they do for x86.

And you forgot to include the price of AIX.
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
Your wall of text is pretty unreadable. Also, your Oracle license examples are way off, Oracle charges twice as much for a Power license as they do for x86.

And you forgot to include the price of AIX.

Hi Phynaz, I know you are an excellent resource for this kind of information based on your RL work experience, can you tell me if Oracle prices their licenses preferentially towards their Sparc-based hardware products versus the other architectures?

I would expect them to, but have zero insight into the pricing layers for Oracle software licenses. Hoping you can shed some light there.
 

Phynaz

Lifer
Mar 13, 2006
10,140
819
126
Hey IDC,
Oracle pricing for current generation Sparc CPUs is equivalent to their x86 pricing. This is lower than it used to be, in the past it was priced the same as the other RISC architecture chips.

Below I've linked the price factor table. You take your number of CPU cores and multiply it by the listed number, and then round up if needed to get to a whole number. You then multiply that result by your negotiated license price to get your total cost. There's all kinds of rules around virtual machines, but there's no reason to go that deep in this thread.

http://www.oracle.com/us/corporate/contracts/processor-core-factor-table-070634.pdf

Edit: Oh, and this is for production servers only. Development servers use a completely different licensing model.
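
For anyone who wants to play with the numbers, the rule described above (cores times the listed factor, rounded up, times your negotiated price) can be sketched roughly like this. The function names, the factors of 0.5 and 1.0, and the 10,000 per-license price are placeholders for illustration only; the real factors come from the linked table and the real price from your contract.

```python
import math

def licenses_required(cores: int, core_factor: float) -> int:
    """Cores times the core factor, rounded up to a whole license."""
    return math.ceil(cores * core_factor)

def license_cost(cores: int, core_factor: float, price_per_license: int) -> int:
    """Total license cost at a negotiated per-license price."""
    return licenses_required(cores, core_factor) * price_per_license

# Placeholder inputs (assumptions, not taken from any price list):
# core factors of 0.5 and 1.0 and a 10,000 per-license price.
print(licenses_required(24, 0.5))    # 24 * 0.5 = 12.0 -> 12 licenses
print(licenses_required(5, 0.5))     # 5 * 0.5 = 2.5 -> rounds up to 3
print(license_cost(3, 1.0, 10_000))  # 3 licenses at the placeholder price
```

This ignores the virtual machine and partitioning rules mentioned above, which can change how many cores count toward the total.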
 

jhu

Lifer
Oct 10, 1999
11,918
9
81
Hey IDC,
Oracle pricing for current generation Sparc CPUs is equivalent to their x86 pricing. This is lower than it used to be, in the past it was priced the same as the other RISC architecture chips.

Below I've linked the price factor table. You take your number of CPUs and multiply it by the listed number, and then round up if needed to get to a whole number. You then multiply that result by your negotiated license price to get your total cost. There's all kinds of rules around virtual machines, but there's no reason to go that deep in this thread.

http://www.oracle.com/us/corporate/contracts/processor-core-factor-table-070634.pdf

Edit: Oh, and this is for production servers only. Development servers use a completely different licensing model.

If going with POWER, why use Oracle and not DB2?
 

Phynaz

Lifer
Mar 13, 2006
10,140
819
126
Many people do use DB2, we do somewhat on Power, and we run it exclusively on Z.

It comes down to application support, resource availability and other factors. For example, many smaller applications just don't support DB2. Another example is that there are fewer skilled DB2 DBAs available to fulfill staffing needs compared to Oracle DBA availability.
 

thunng8

Member
Jan 8, 2013
167
72
101
Okay, so where are the comparisons of POWER8 vs E7-8895 v2? The E5-2697 v2 launched in 2013 on an older stepping.

Wouldn't be much different per core compared to the E5-2697 v2. BTW, the E5-2697 v2 only started shipping late in 2013, so it was available only ~6 months before POWER8. The E7 v2 only became available 2 months ago.


No surprises there. I don't think anyone really expects Xeon to beat POWER8 on a per-core basis. I think they expect it to beat POWER8 on a per watt basis.

You would be surprised. There are a lot of people who actually think Xeons are the top-performing chips. Just look at the start of this thread: not one mention that POWER8 is far superior in performance.

In reality, even today's top-bin Xeon is still slightly behind, per core, the POWER7 that shipped in early 2010.

SPECint_rate2006
2.7 GHz Intel Xeon E5-2697 v2 - 24 cores - 934
3.86 GHz POWER7 - 16 cores - 652

SPECfp_rate2006
2.7 GHz Intel Xeon E5-2697 v2 - 24 cores - 649
3.86 GHz POWER7 - 16 cores - 586

http://www.spec.org/cpu2006/results/res2010q1/cpu2006-20100208-09586.html
http://www.spec.org/cpu2006/results/res2010q1/cpu2006-20100208-09582.html
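
To make the per-core claim easy to check, here is a small sketch that divides the rate scores listed above by the core counts. It is only a rough normalization: rate scores also depend on memory, compilers and socket count, so dividing by cores is a simplification, and the dictionary layout and function name are just for illustration.

```python
# Scores and core counts as listed in this post.
results = {
    "Xeon E5-2697 v2 (2.7 GHz)": {"cores": 24, "int_rate": 934, "fp_rate": 649},
    "POWER7 (3.86 GHz)": {"cores": 16, "int_rate": 652, "fp_rate": 586},
}

def per_core(score: float, cores: int) -> float:
    """Rate score divided by core count (a crude per-core figure)."""
    return score / cores

for name, r in results.items():
    int_pc = per_core(r["int_rate"], r["cores"])
    fp_pc = per_core(r["fp_rate"], r["cores"])
    print(f"{name}: int/core {int_pc:.2f}, fp/core {fp_pc:.2f}")
```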
 

Ill_take_Power

Junior Member
Apr 29, 2014
7
0
0
Your wall of text is pretty unreadable. Also, your Oracle license examples are way off, Oracle charges twice as much for a Power license as they do for x86.

And you forgot to include the price of AIX.

My apologies for the readability. Still getting used to this forum. I'll work on my style :)

My Oracle licensing is spot on. If you would read the link you referenced in a subsequent post you would see that Oracle prices x86 with a licensing factor of 0.5.

I used a 24 core x86 example. This means 24 x .5 = 12. I said it would require 12 Oracle licenses. What did I misstate? I rounded the dollars but I thought I was amongst friends here and you would cut me some slack.

For the Power, I said I only required 3 cores. I can put those 3 cores in a Dedicated LPAR or in a Shared Processor Pool with 3 cores. 3 cores times an Oracle licensing factor of 1.0 = 3 Oracle licenses.

It is true that many x86 users do not understand there are alternatives with financial options. It is also true that many x86 sellers intentionally misstate the licensing requirements by saying that x86 has a licensing factor of .5 and Power has 1.0 thus Power is twice as much and thus by extension will cost more. Ah, but with the rest of the story that I have told, x86 users can see the full picture. There are lots of reasons for the 95% and I'm even saying I accept that number. However, I will say that whatever that number is, it is an opportunity for IBM to go after with Power8, whether running Little Endian Linux workloads, Oracle workloads, Open Source workloads like EnterpriseDB or Enterprise Commercial products like IBM's DB2 with BLU acceleration, WebSphere, Cognos, BigInsights, MQ, Portal and much more.
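
The arithmetic in my scenario can be sketched like this, without the rounding to $600k and $150K. It assumes the figures I used ($47,500 per license, 22% maintenance per year, 12 licenses per x86 workload, 3 per Power workload); whether 3 Power cores really cover the workload, and how a shared pool is licensed, are assumptions of the scenario and not something the code proves. The names are made up for the example.

```python
LIST_PRICE = 47_500   # per-license figure used in the scenario
MAINT_PERCENT = 22    # yearly maintenance, as a percent of license cost

def oracle_cost(licenses_per_workload: int, workloads: int = 1, years: int = 1) -> int:
    """License cost plus maintenance for the given number of years."""
    license_cost = licenses_per_workload * workloads * LIST_PRICE
    maintenance = license_cost * MAINT_PERCENT * years // 100
    return license_cost + maintenance

for workloads in (1, 2):
    x86 = oracle_cost(12, workloads)    # 24 cores * 0.5 factor
    power = oracle_cost(3, workloads)   # 3 cores * 1.0 factor (scenario assumption)
    print(f"{workloads} workload(s): x86 ${x86:,} vs Power ${power:,}")
```

With these inputs the gap grows linearly with each added workload, which is the scaling point I was trying to make.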
 

Phynaz

Lifer
Mar 13, 2006
10,140
819
126
You're new here so I'll be nice and let you know that we expect facts around here. Pulling some made up stuff that 3 power8 cores is the equivalent of 24 Intel cores is going to require some proof on your part.

I've spent years moving off power onto x86 because it's so much cheaper to run. As in no contest. And believe me I'm one of the biggest power fans around.

Let's take your example of the shared processor pool. If you do that you are soft partitioned and then must license the entire server, at twice the price per core as x86.

So if you're going to continue down this path, be prepared to back up your statements. I've done the TCO, and I've spent millions to move off power.

And btw, if you're paying $47K a core for Oracle I'm gonna quit my job and come be your rep. Man he's gotta be making a pile from you.
 

DrMrLordX

Lifer
Apr 27, 2000
22,947
13,033
136
Urk. The 2697 isn't Intel's "top-binned Xeon".

I went poking around for SPEC numbers on 8895 machines, but sadly, none were available. However, I found SPECInt_Rate2006 (sadly, no FP) for a 4P 8891 machine. The 4P 8891 gets a 1750.

Perf/watt for the 8891 machine (considering only CPU draw, which isn't an entirely accurate picture of total system power draw, but whatever): ~2.822 (1750/ (4 * 155))
Perf/watt for the 3.52 ghz POWER8: ~4.605 (1750 / (2 * 190))

Interestingly enough, I think you actually get better performance/watt out of the 2P 2697 machine above . . . the 2697 machine gets 3.592 (934 / 260). Still below POWER8.

Now see? That wasn't so hard. POWER8 wins on perf/watt, at least according to some SPECInt_rate2006 benchmarks. Again, this doesn't take total system power into account, and I would like to see some figures on how much power the rest of the POWER8 platform consumes (ditto for the 8891 machine). I think 4P 8895 would have put up a better fight, but I have no data to corroborate that.
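
If anyone wants to redo the math with other numbers, here is a quick sketch of the calculation above. It uses only the scores and per-socket TDPs from this post (130 W per socket for the 2697 is implied by the 260 W total), and it still ignores the rest of the system's power draw. The names are just for illustration.

```python
def perf_per_watt(score: float, sockets: int, tdp_watts: float) -> float:
    """Benchmark score divided by total CPU TDP (CPU draw only)."""
    return score / (sockets * tdp_watts)

# (SPECint_rate2006 score, sockets, TDP per socket in watts)
systems = {
    "4P Xeon 8891": (1750, 4, 155),
    "2P POWER8 3.52 GHz": (1750, 2, 190),
    "2P Xeon E5-2697 v2": (934, 2, 130),
}

for name, (score, sockets, tdp) in systems.items():
    # Printed values are rounded, so the last digit may differ from a
    # truncated figure.
    print(f"{name}: {perf_per_watt(score, sockets, tdp):.3f}")
```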

More benchmark data would paint a more-complete picture; sadly, I do not expect Anand to bust out an Ivy Bridge-EX (or Haswell-EX) vs. POWER8 article anytime soon. Maybe he will surprise me.
 

witeken

Diamond Member
Dec 25, 2013
3,899
193
106
Even if the Power CPUs consume more power, they would finish the task faster than Intel CPU & race to Idle faster.

It should even out in the end.

Race to idle really just means performance per watt. A bigger core might be 2x faster, but might draw 3x more power. Even if it finishes 2x sooner, it will have consumed 1.5x more energy, because it consumes 3x more energy per second while only being 2x faster.

The statistic that really matters is not performance/watt, but performance/joule.

That's the same. I'm actually not sure; it's quite confusing.

performance/watt = performance (FLOPS) / (energy (joules) / time (1 second)) = amount of performance a computer has per joule within a fixed amount of time
performance/joule = performance (GFLOPS) / energy (joules) = the amount of performance a computer has per joule it consumed during the task
 

NTMBK

Lifer
Nov 14, 2011
10,456
5,843
136
That's the same. I'm actually not sure; it's quite confusing.

performance/watt = performance (FLOPS) / (energy (joules) / time (1 second)) = amount of performance a computer has per joule within a fixed amount of time
performance/joule = performance (GFLOPS) / energy (joules) = the amount of performance a computer has per joule it consumed during the task

Yeah, we already covered how I screwed that one up :p

Performance/W == (Computation/s) / (Joules/s) == Computation / Joule
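
A tiny numeric check of that identity, using the "2x faster at 3x the power" example from earlier in the thread. The baseline figures (100 W, 10 s, one unit of work) are made-up numbers purely for illustration.

```python
import math

def energy_joules(power_watts: float, seconds: float) -> float:
    """Energy is power multiplied by time."""
    return power_watts * seconds

work = 1.0                              # one task, arbitrary units (hypothetical)
small_power, small_time = 100.0, 10.0   # hypothetical baseline core
big_power, big_time = 3 * small_power, small_time / 2  # 3x power, 2x faster

# The bigger core finishes sooner but uses 1.5x the energy for the task.
ratio = energy_joules(big_power, big_time) / energy_joules(small_power, small_time)
print(ratio)

# Performance per watt and work per joule are the same quantity.
perf_per_watt = (work / big_time) / big_power
work_per_joule = work / energy_joules(big_power, big_time)
print(math.isclose(perf_per_watt, work_per_joule))
```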
 

Fox5

Diamond Member
Jan 31, 2005
5,957
7
81
Race to idle really just means performance per watt. A bigger core might be 2x faster, but might draw 3x more power. Even if it finishes 2x sooner, it will have consumed 1.5x more energy, because it consumes 3x more energy per second while only being 2x faster.



That's the same. I'm actually not sure; it's quite confusing.

performance/watt = performance (FLOPS) / (energy (joules) / time (1 second)) = amount of performance a computer has per joule within a fixed amount of time
performance/joule = performance (GFLOPS) / energy (joules) = the amount of performance a computer has per joule it consumed during the task

Is race to idle really a "thing" on systems like this? If you're paying this much for a computer, I'd imagine you intend to have it running 100% all the time.
 

Headfoot

Diamond Member
Feb 28, 2008
4,444
641
126
Is race to idle really a "thing" on systems like this? If you're paying this much for a computer, I'd imagine you intend to have it running 100% all the time.

Like I've already said, it isn't. Race to idle is NOT relevant in the server environment. Everywhere, with a few exceptions for HPC etc., virtualizes servers these days if they are not running at a constant load (of some variety), specifically so that they are running at constant load. It's more efficient in literally every metric except maybe initial cost for a few software licenses. Everywhere virtualizes; it's not a new thing, it hit its stride years ago and now it's a pretty well-defined use case.

The only places that can rationally not virtualize in today's datacenters are very small servers (think small single company server) where not even an entire server's load worth is used, perhaps some varieties of cold storage systems (write-to-tape stuff), and HPC. And HPC is usually under constant load anyways
 

Phynaz

Lifer
Mar 13, 2006
10,140
819
126
Is race to idle really a "thing" on systems like this? If you're paying this much for a computer, I'd imagine you intend to have it running 100% all the time.

Not really. With the way VMs work, all the race does is free up the CPUs to get allocated to another VM.
 

TuxDave

Lifer
Oct 8, 2002
10,571
3
71
Like I've already said, it isn't. Race to idle is NOT relevant in the server environment.

From my experience, it's a race "to get people to shut up about adding more servers" more than anything else. :p

The faster servers get, the more work we'll find to put on them and the less careful we get about program runtime. So yeah, servers will always have work to do.
 

thunng8

Member
Jan 8, 2013
167
72
101
What performance does it get from the ability to run so many threads on 12 cores? Hyperthreading only sees ~20% gains.

According to IBM's performance estimates, the gain is about 100% on their internal rPerf benchmark (believed to be a composite of SPECint, SPECfp, SAP, Java and database performance).

24-core 3.52 GHz POWER8
st: 209.1
smt2: 303.2
smt4: 394.2
smt8: 421.8


Source:
http://www.ibm.com/common/ssi/fcgi-...lfid=POO03017USEN&attachment=POO03017USEN.PDF

Given that POWER8 is about 100% faster than Intel Xeon (Ivy Bridge) per core when running SMT8, that means that POWER8 running single thread is about the same performance as Xeon running hyperthreading.

If Hyperthreading gains about 20%, then POWER8 is about 20% faster at single thread.
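
Here is the same reasoning as a short sketch, in case anyone wants to plug in different assumptions. The rPerf figures are the ones listed above; the "2x per core at SMT8 versus a hyperthreaded Xeon" and "20% Hyperthreading gain" inputs are the rough assumptions from this post, not measured values, and the variable names are just for illustration.

```python
# rPerf estimates for the 24-core 3.52 GHz POWER8, as listed above.
rperf = {"st": 209.1, "smt2": 303.2, "smt4": 394.2, "smt8": 421.8}

# Throughput gain of each SMT mode over single-thread mode.
for mode, score in rperf.items():
    gain = score / rperf["st"] - 1
    print(f"{mode}: {gain:.1%} over st")

# Assumptions from the post (not measurements):
xeon_st = 1.0                   # Xeon single-thread per-core baseline
ht_gain = 0.20                  # assumed Hyperthreading gain
power8_vs_xeon_ht = 2.0         # assumed POWER8 SMT8 vs Xeon HT, per core

xeon_ht = xeon_st * (1 + ht_gain)
power8_smt8 = power8_vs_xeon_ht * xeon_ht
power8_st = power8_smt8 / (rperf["smt8"] / rperf["st"])
print(f"Estimated POWER8 single thread vs Xeon single thread: {power8_st:.2f}x")
```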
 

Headfoot

Diamond Member
Feb 28, 2008
4,444
641
126
It's important to note that hyperthreading scaling varies greatly based on the width of the architecture, so not all hyperthreading implementations are created equal. For example, Ivy Bridge i3 HT threads scale significantly worse than Haswell i3 HT threads due to the fact that Haswell is wider (among other things). I've got no idea how Power7, much less Power8, hyperthreading scales. Presumably well because of how heavily it is utilized, but I can't say for sure. I doubt IBM would hyperthread the cores so much if they didn't get decent scaling out of it.
 

thunng8

Member
Jan 8, 2013
167
72
101
It's important to note that hyperthreading scaling varies greatly based on the width of the architecture, so not all hyperthreading implementations are created equal. For example, Ivy Bridge i3 HT threads scale significantly worse than Haswell i3 HT threads due to the fact that Haswell is wider (among other things).

Do you mean i7? i3 does not do hyperthreading. Do you have any hard data?

I've got no idea how Power7, much less Power8 hyperthreading scales. Presumably well because of how heavily it is utilized, but I can't say for sure. I doubt IBM would hyperthread the cores so much if they didn't get decent scaling out of it

As per my previous post, IBM quotes 100% scaling from single thread to 8 threads.