How many Sandy-Bridges to run IBM's Watson?


Arkadrel

Diamond Member
Oct 19, 2010
3,681
2
0
From that picture?

There are only a few things it can be: either a Beckton cluster or a Magny-Cours on overdrive. :biggrin:

But I'm saying 256 SB cores should do it. 300 if you want to keep them at stock.

Because that's 256 cores but 512 threads with HT, keeping the faster QPI interlink on the current system with quad-channel DDR3, which is supposed to be out for LGA2011, vs. the Power7.

:biggrin:

So, assuming LGA2011 will be up to 10 cores / 20 MB cache per CPU... umm... that's 26 LGA2011 CPUs one would need, with a total of OMFG 460 MB of cache. (We're at almost half a gigabyte of cache.)

4 CPUs per planar board ~ 7 racks, or 280 LGA2011 cores with 560 working threads + 560 MB of cache. :p

And you wouldn't need 3-phase power to run 7 x 2011 systems. :p

Or am I majorly underestimating the power of a Power7?



Are you making a joke, or being serious?
 

mv2devnull

Golden Member
Apr 13, 2010
1,526
160
106
All that power and heat means that even more power is needed to cool the beast, and so on. How large is the factor describing that overhead (for a computing center) these days? Around 1.8?

Or power. I don't think you can get three phase power in your home from the utility company.
Depends. Our housing company has at least one outlet, and farms and similar small enterprises certainly have them too. Regulations might differ.
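To make the overhead factor concrete, here is a minimal sketch. The 1.8 is the guess from the post above (the figure is usually called PUE, power usage effectiveness); the 100 kW IT load and the function names are made up for illustration and are not Watson's real numbers.

```python
def facility_power_kw(it_load_kw, overhead_factor=1.8):
    """Total power drawn by the facility for a given IT (compute) load."""
    return it_load_kw * overhead_factor


def overhead_power_kw(it_load_kw, overhead_factor=1.8):
    """Power spent on cooling, power conversion, etc. on top of the IT load."""
    return it_load_kw * (overhead_factor - 1.0)


# Hypothetical IT load, chosen only to make the arithmetic easy to follow.
example_it_load_kw = 100.0

print(f"total facility power: {facility_power_kw(example_it_load_kw):.0f} kW")
print(f"cooling and overhead: {overhead_power_kw(example_it_load_kw):.0f} kW")
```

With a factor of 1.8, every kilowatt of compute drags along roughly another 0.8 kW of cooling and other overhead.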
 

Ben90

Platinum Member
Jun 14, 2009
2,866
3
0
Arkadrel said:
aigomorla said:
From that picture?

There are only a few things it can be: either a Beckton cluster or a Magny-Cours on overdrive.

But I'm saying 256 SB cores should do it. 300 if you want to keep them at stock.

Because that's 256 cores but 512 threads with HT, keeping the faster QPI interlink on the current system with quad-channel DDR3, which is supposed to be out for LGA2011, vs. the Power7.



So, assuming LGA2011 will be up to 10 cores / 20 MB cache per CPU... umm... that's 26 LGA2011 CPUs one would need, with a total of OMFG 460 MB of cache. (We're at almost half a gigabyte of cache.)

4 CPUs per planar board ~ 7 racks, or 280 LGA2011 cores with 560 working threads + 560 MB of cache.

And you wouldn't need 3-phase power to run 7 x 2011 systems.

Or am I majorly underestimating the power of a Power7?
Are you making a joke, or being serious?
Well, considering a Power7 "RAAAAAAAPPPES" anything Intel makes, it's only off by like an order of magnitude.

It's pretty much 2 Gulftown cores = 1 Power7 core, clock for clock.
 

Ben90

Platinum Member
Jun 14, 2009
2,866
3
0
Watson (90 Power 750s) should be able to run Linpack at around 78.5 TFLOPS.

Assuming 100% perfect scaling, it would take 937 2500Ks, judging from these results.

Please note that this comparison is EXTREMELY stacked in Intel's favor. Use a program that benefits from SMP and the Power7 will leave Sandy Bridge in the dust.
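
For anyone who wants to redo that division with their own Linpack score, here is a rough sketch. The 78.5 TFLOPS target and the 937-chip count come from the post above; the function name and the 82 GFLOPS sample score are only for illustration, and perfect scaling is assumed, which a real cluster will not deliver.

```python
import math

WATSON_TFLOPS = 78.5   # Linpack estimate for 90 Power 750s, from the post above
CHIPS_QUOTED = 937     # number of 2500Ks quoted in the post above


def chips_needed(target_tflops, gflops_per_chip):
    """Chips required to reach target_tflops, assuming 100% perfect scaling."""
    return math.ceil(target_tflops * 1000 / gflops_per_chip)


# Per-chip Linpack score implied by the two quoted figures.
implied_gflops = WATSON_TFLOPS * 1000 / CHIPS_QUOTED

print(f"implied score per 2500K: {implied_gflops:.1f} GFLOPS")
print("chips needed at 82 GFLOPS each:", chips_needed(WATSON_TFLOPS, 82.0))
```

A slightly lower per-chip score pushes the count up quickly, which is why "around 1000" is a fair way to round it.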
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
There will be an IBM supercomputer that uses 14,000 8-core Sandy Bridge-based server chips to achieve 3 petaflops. Comparing a desktop chip to a server one is... not so relevant.
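
As a back-of-the-envelope check on those figures, here is a small sketch. The 14,000 chips, 8 cores, and 3 petaflops are from the post above; the 78.5 TFLOPS Watson estimate is borrowed from earlier in the thread, and treating the 3 petaflops as evenly spread over the chips is an assumption.

```python
import math

CHIPS = 14_000          # 8-core Sandy Bridge server chips, from the post above
CORES_PER_CHIP = 8
TOTAL_PFLOPS = 3.0      # quoted system performance
WATSON_TFLOPS = 78.5    # Linpack estimate for Watson from earlier in the thread

# 1 PFLOPS = 1e6 GFLOPS
gflops_per_chip = TOTAL_PFLOPS * 1e6 / CHIPS
gflops_per_core = gflops_per_chip / CORES_PER_CHIP

# Server chips needed to match the Watson estimate, assuming perfect scaling.
watson_equivalent_chips = math.ceil(WATSON_TFLOPS * 1000 / gflops_per_chip)

print(f"per chip: {gflops_per_chip:.1f} GFLOPS")
print(f"per core: {gflops_per_core:.1f} GFLOPS")
print("chips to match the Watson estimate:", watson_equivalent_chips)
```

On those numbers, a few hundred 8-core server chips would land in Watson's range, compared with nearly a thousand desktop 2500Ks.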
 

aigomorla

CPU, Cases&Cooling Mod PC Gaming Mod Elite Member
Super Moderator
Sep 28, 2005
21,067
3,574
126
Watson (90 Power 750s) should be able to run Linpack at around 78.5 TFLOPS.

Assuming 100% perfect scaling, it would take 937 2500Ks, judging from these results.

Please note that this comparison is EXTREMELY stacked in Intel's favor. Use a program that benefits from SMP and the Power7 will leave Sandy Bridge in the dust.

So wait the bottom sandy says 80...
avxh.png


This is sandy... 1155 sandy, and not the 2011 sandy.

I'm still thinking 260 cores on 2011 Sandy, or roughly 26 Sandy Bridge LGA2011 CPUs.
In any case, I think you could pull it off without having to sell your soul to the power company to install a three-phase power system in your house.


As I said, if I'm majorly underestimating, please point out how.
I don't know the Power7 arch very well, but I at least know how to turn on the car, and tune it to some degree, on an Intel.
 

Rubycon

Madame President
Aug 10, 2005
17,768
485
126
Guys you can have all the flops in the world but if you don't have a super fast way to move the data amongst your cluster your computer will be a flop. :D
 

Tsavo

Platinum Member
Sep 29, 2009
2,645
37
91
Guys you can have all the flops in the world but if you don't have a super fast way to move the data amongst your cluster your computer will be a flop. :D

Floppies are a great way to move flops amongst computers. That's why they call them floppies! Duh!
 

aigomorla

CPU, Cases&Cooling Mod PC Gaming Mod Elite Member
Super Moderator
Sep 28, 2005
21,067
3,574
126
Floppies are a great way to move flops amongst computers. That's why they call them floppies! Duh!
floppy_disk_usb.jpg


My floppies are special, they got USB!

:O
 

IGemini

Platinum Member
Nov 5, 2010
2,472
2
81
So wait the bottom sandy says 80...
...
As I said, if I'm majorly underestimating, please point out how.
I don't know the Power7 arch very well, but I at least know how to turn on the car, and tune it to some degree, on an Intel.

The error is in your math. Linpack is giving you peak FLOP performance across all cores, not just one. Set Linpack to work on one thread and you'll see the "per core" FLOP performance.

You're saying it like this:
82 GFLOPS * 4 threads * 260 cores = ~85 TFLOPS

In reality, your number is this:
82 GFLOPS * 260 cores = ~21 TFLOPS

It would be closer to say you would need around 1000 Sandy Bridge 1155 2500K CPUs to equal the same performance level...not factoring in the extra threading ability of POWER7.

In either case, like Ruby said in so many ways, this discussion is purely academic.
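
The two lines of arithmetic above can be sketched like this. The 82 GFLOPS score, the factor of 4, the 260 count, and the 78.5 TFLOPS Watson estimate are all from the thread; the variable names are illustrative. The last figure is an extra reading that is not in the post: it assumes the 260 really means individual cores of a 4-core chip, in which case the total drops further.

```python
import math

GFLOPS_PER_CHIP = 82.0   # LinX result for one 2500K, all cores together
CORES_PER_CHIP = 4       # the 2500K is a 4-core part
COUNT = 260              # the "260 cores" figure under discussion
WATSON_TFLOPS = 78.5     # Linpack estimate for Watson from earlier in the thread

# Treating the whole-chip score as per-thread and multiplying by 4 again.
overcounted_tflops = GFLOPS_PER_CHIP * 4 * COUNT / 1000

# Taking the score once per counted unit, as in the corrected line.
corrected_tflops = GFLOPS_PER_CHIP * COUNT / 1000

# Assumption: if the 260 units are single cores, each gets 1/4 of the chip score.
per_core_tflops = GFLOPS_PER_CHIP / CORES_PER_CHIP * COUNT / 1000

# Whole 2500Ks needed to reach the Watson estimate with perfect scaling.
chips_for_watson = math.ceil(WATSON_TFLOPS * 1000 / GFLOPS_PER_CHIP)

print(f"overcounted:        {overcounted_tflops:.2f} TFLOPS")
print(f"corrected:          {corrected_tflops:.2f} TFLOPS")
print(f"per-core reading:   {per_core_tflops:.2f} TFLOPS")
print("2500Ks to match Watson:", chips_for_watson)
```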
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
It would be closer to say you would need around 1000 Sandy Bridge 1155 2500K CPUs to equal the same performance level...not factoring in the extra threading ability of POWER7.

In either case, like Ruby said in so many ways, this discussion is purely academic.

Aigo: Well, there are no 10-core Sandy Bridge chips, so there you go.

IGemini: Linpack doesn't benefit at all from threading. Again, Socket 2011 Sandy Bridge-based Xeon chips with 8 cores would do it.
 

mikedev10

Member
Dec 21, 2004
109
0
71
So wait the bottom sandy says 80...
avxh.png


This is sandy... 1155 sandy, and not the 2011 sandy.

I'm still thinking 260 cores on 2011 Sandy, or roughly 26 Sandy Bridge LGA2011 CPUs.
In any case, I think you could pull it off without having to sell your soul to the power company to install a three-phase power system in your house.


As I said, if I'm majorly underestimating, please point out how.
I don't know the Power7 arch very well, but I at least know how to turn on the car, and tune it to some degree, on an Intel.

What program are you running to calculate your GFLOPS? I'm using SiSoft Sandra; my chip is running faster but mah flops be lower. Are they not running the same algorithm?

http://forums.anandtech.com/showthread.php?t=2144884
 

IGemini

Platinum Member
Nov 5, 2010
2,472
2
81
IGemini: Linpack doesn't benefit at all from threading.

No?

Two non-concurrent LinX runs on my machine:

LinX1v4.jpg


Logical threading may or may not matter in Power7 architecture, but I won't claim to know one way or another.
 

mikedev10

Member
Dec 21, 2004
109
0
71
So I downloaded LinX and now I'm quite confused. What should I be getting? Why the discrepancy between these scores?

gflops.png
 

Ben90

Platinum Member
Jun 14, 2009
2,866
3
0
No?

Two non-concurrent LinX runs on my machine:
Logical threading may or may not matter in Power7 architecture, but I won't claim to know one way or another.
Well, yes they will run at the same time, but notice the extreme performance degradation of the second one. What happened is it finished the first one while kinda almost working on the second. When the first finished then the second was able to go full throttle.

Either way, I think he meant SMP which is a different discussion. Linpack is so heavy on the core that SMP will actually reduce performance.

I'm still thinking 260 cores on 2011 Sandy, or roughly 26 Sandy Bridge LGA2011 CPUs.
In any case, I think you could pull it off without having to sell your soul to the power company to install a three-phase power system in your house.
2011 must be pretty powerful if it's offering a 36x improvement over 1155.
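
The 36x presumably comes from dividing the 937 desktop chips estimated earlier by the 26 LGA2011 CPUs proposed in the quote. A quick sketch of that, with the 10-core assumption taken from the same proposal and the 4-core figure being the 2500K's core count:

```python
CHIPS_2500K = 937       # LGA1155 2500Ks from the Linpack estimate in this thread
CHIPS_LGA2011 = 26      # LGA2011 CPUs proposed in the quoted post
CORES_LGA2011 = 10      # core count assumed in the quoted post
CORES_2500K = 4         # the 2500K is a 4-core part

# How much faster each LGA2011 CPU would have to be than one 2500K.
required_chip_speedup = CHIPS_2500K / CHIPS_LGA2011

# What is left after accounting for the larger core count alone.
required_core_speedup = required_chip_speedup / (CORES_LGA2011 / CORES_2500K)

print(f"per-chip speedup needed: {required_chip_speedup:.1f}x")
print(f"per-core speedup needed: {required_core_speedup:.1f}x")
```

Even granting the extra cores, each core would still need to be more than ten times faster, which is the point of the sarcasm.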
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
Logical threading may or may not matter in Power7 architecture, but I won't claim to know one way or another

Yeah, I meant logical threads. The code that Linpack uses hits the FPU too hard for there to be any extra headroom.

http://www.power.org/events/Power7/Performance_Guide_for_HPC_Applications_on_Power_755-Rel_1.0.1.pdf

Page 65

aigomorla said:
you sure about that...

:hmm:

Yep, both R and B2 sockets have 8 cores. Sandy Bridge EX isn't a true EX part. It's a low-cost 4P part aimed at blades and HPC, with far fewer RAS and scalability features than Nehalem/Westmere EX.
 