How many Sandy-Bridges to run IBM's Watson?


IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
No?

Two non-concurrent LinX runs on my machine:

LinX1v4.jpg


Logical threading may or may not matter in Power7 architecture, but I won't claim to know one way or another.

Hmm, I didn't look at this pic. Well, you don't need to run multiple instances in LinX. Linpack benchmarks like LinX are fully multi-threaded.

Here's what your score should be.

3.6GHz x 4 cores x 4 DP FLOPs/cycle = 57.6GFlops

That's why logical threads won't help you. It uses the FP unit to the max.
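For anyone who wants to plug in their own numbers, the arithmetic above can be sketched in a few lines of Python. The function name `peak_gflops` is made up for this example, and the 4 DP FLOPs/cycle figure is simply the one quoted in the post. Note that the formula has no term for logical threads, which is the point being made.

```python
def peak_gflops(clock_ghz: float, cores: int, dp_flops_per_cycle: int) -> float:
    """Theoretical peak double-precision GFLOPS.

    Only physical cores count here: logical (SMT) threads share the same
    FP units, so they add nothing to the theoretical peak.
    """
    return clock_ghz * cores * dp_flops_per_cycle


# Numbers from the post: 3.6 GHz, 4 cores, 4 DP FLOPs per cycle per core.
sse_peak = peak_gflops(3.6, 4, 4)
print(f"Theoretical peak: {sse_peak:.1f} GFLOPS")
```

A real Linpack run will land somewhat below this figure, since no run keeps the FP units busy on every cycle.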
 

IGemini

Platinum Member
Nov 5, 2010
2,472
2
81
I ran separate instances solely to not require separate screenshots (see: non-concurrent).
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
My mistake... again. Oh well you can read the PDF file I linked about Power 7. SMT doesn't increase Linpack performance.
 

Edrick

Golden Member
Feb 18, 2010
1,939
230
106
So LinX takes advantage of AVX?

I have not run LinX since I updated to SP1 last week. Now I have to try it when I get home.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
I ran LinX on the 2600K after installing SP1 from Windows Update. Still getting 40-ish GFlops.

Edit: NVM, I got it working. 93GFlops.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
Edrick: Try turning off Hyperthreading. I can't get to AVX figures when Hyperthreading is on, no matter what I do. Changing it to 4 threads in the program or via Task Manager won't work; you need to disable it.

Linpack is a VERY FPU intensive benchmark. I figure it might be power limited. 93GFlops is the highest I got, though with Hyperthreading off I got 67-85GFlops depending on my luck. You guys have your chips overclocked. Coincidentally you guys can't activate AVX. Hmm... :p

Just for fun, if you want to make your computer lag so much you can't even move your mouse, go to settings and activate "Real Time". It'll be almost totally unresponsive for a minute or so.

Oh wait what, this was a Power 7 thread?!? Sorry, back to Power 7. 32nm Power 7+ chips should be out this year.
 

mikedev10

Member
Dec 21, 2004
109
0
71
why are you turning off hyperthreading? am i correct to assume that's only to get top calcs on this benchmark, whilst you normally leave HT on?

i'm rather confused as to whether i'm getting the most out of my cpu at the moment !
 

Accord99

Platinum Member
Jul 2, 2001
2,259
172
106
why are you turning off hyperthreading? am i correct to assume that's only to get top calcs on this benchmark, whilst you normally leave HT on?

i'm rather confused as to whether i'm getting the most out of my cpu at the moment !
Yeah, LinX is peculiar in this regard, with Hyperthreading causing significantly reduced and inconsistent scores. Most CPU-intensive applications do benefit from HT, and even for those that don't, the end score is roughly the same.

I found that I was able to get the best performance with a 2600K by setting LinX to use only 4 threads and, after it started, going into Task Manager and setting the affinities to use only logical cores #0, #2, #4 and #6. It went from 50-something GFLOPs to 114 GFLOPs at 4.7 GHz.
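The affinity trick can be written down as a bitmask, which is what tools that accept a hex affinity mask (for example cmd's `start /affinity`) expect. This is only a sketch: the helper names are made up, and it assumes the usual enumeration on a Hyper-Threaded quad core, where logical CPUs (0,1), (2,3), (4,5), (6,7) are sibling pairs on the same physical core. Check your own topology before relying on it.

```python
def one_thread_per_core(physical_cores: int, threads_per_core: int = 2) -> list[int]:
    """Pick the first logical CPU of each physical core.

    Assumes siblings are numbered consecutively (0,1), (2,3), ...
    which is an assumption about the OS enumeration, not a guarantee.
    """
    return list(range(0, physical_cores * threads_per_core, threads_per_core))


def affinity_mask(logical_cpus: list[int]) -> int:
    """Build a bitmask with one bit set per logical CPU index."""
    mask = 0
    for cpu in logical_cpus:
        mask |= 1 << cpu
    return mask


cpus = one_thread_per_core(4)   # logical CPUs 0, 2, 4, 6
mask = affinity_mask(cpus)
print(cpus, hex(mask))
```

With 4 threads pinned this way, each thread has a physical core's FP unit to itself instead of possibly sharing one with a sibling.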
 

PreferLinux

Senior member
Dec 29, 2010
420
0
0
I get 47/48 GFLOPS with LinX in Windows (Vista, so no AVX), but I get 80 GFLOPS in Linux (kernel version 2.6.31 or 2.6.32, not sure which, so with AVX). At stock.

Interesting observation I made with Linux: When run within KDE, I only got 77/78. When I went to runlevel 3 (no graphics), it went to 80, and was more consistent, too.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
why are you turning off hyperthreading? am i correct to assume that's only to get top calcs on this benchmark, whilst you normally leave HT on?

i'm rather confused as to whether i'm getting the most out of my cpu at the moment !

Didn't you read the whole post? Linpack pegs the FPU to full, so using Hyperthreading gives you nothing. You even have a chance of getting a lower score, since the threads will still split the workload up and cause contention.
 

zsdersw

Lifer
Oct 29, 2003
10,505
2
0
IIRC, activating the buzzer is Watson's least advantageous feature used during the Jeopardy challenge.

http://ibmresearchnews.blogspot.com/2010/12/how-watson-sees-hears-and-speaks-to.html

"When host Alex Trebek finishes stating a clue, a human operator (who works for Jeopardy!) turns on a “Buzzer Enable” light on stage to indicate that contestants can “buzz in” and answer. At exactly the moment the “Buzzer Enable” light is activated, Watson’s system receives a signal that the buzzer is open."

The best human contestants don’t wait for, but instead anticipate when Trebek will finish reading a clue. They time their “buzz” for the instant when the last word leaves Trebek’s mouth and the “Buzzer Enable” light turns on. Watson cannot anticipate. He can only react to the enable signal. While Watson reacts at an impressive speed, humans can and do buzz in faster than his best possible reaction time.
 

Idontcare

Elite Member
Oct 10, 1999
21,110
64
91
Didn't you read the whole post? Linpack pegs the FPU to full, so using Hyperthreading gives you nothing. You even have a chance of getting a lower score, since the threads will still split the workload up and cause contention.

Just to show some data to support these correctly stated assertions:

Corei79204GHzwithHT.png


(this is from the linpack thread in this forum where this topic has been beaten to death, repeatedly, ad nauseam, twice over and then thrice again...)

And data without the HT enabled:
LinxScalingNehalemDenebKentsfield.png
 

Edrick

Golden Member
Feb 18, 2010
1,939
230
106
This explains why I see 2500's scoring higher than my 2600 in LinX.

So this raises the question of whether a 2600 will perform worse than a 2500 when only 4 threads are used in other applications. There are not many times I need more than 4, so perhaps turning HT off would actually increase speed (and reduce heat).

Sorry, off topic. Back to Power 7.

So based on what we see here, it looks like using 2500s would give a better fight against Power7 when it comes to FP throughput.

Also, where did you see the information on Power7+?
 

MrTransistorm

Senior member
May 25, 2003
311
0
0
So this raises the question of whether a 2600 will perform worse than a 2500 when only 4 threads are used in other applications. There are not many times I need more than 4, so perhaps turning HT off would actually increase speed (and reduce heat).
But how many programs are only going to use the FPU? It seems to me that having HT on is very rarely a detriment to performance in real-world applications. I would assume that HT does help in most well-threaded programs.
 

IntelUser2000

Elite Member
Oct 14, 2003
8,686
3,787
136
Just to show some data to support these correctly stated assertions:

Thanks for the great data.

I know you have another link for Amdahl's Law. It was for Euler3D. See, while Euler3D behaves similarly for multiple cores, unlike Linpack it benefits significantly from Hyperthreading: http://techreport.com/articles.x/15818/13

So even if Euler3D scales pretty well with cores, it's nowhere near its theoretical capability, and that's why Hyperthreading shows a gain. Linpack is different: you can simply calculate your maximum DP Flops/cycle, and results will come out at 80-90% of that calculated value.

Core i7 2600 Sandy Bridge: 3.4GHz x 4 cores x 8 DP Flops/cycle (AVX) = 108.8GFlops
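To compare the LinX scores reported earlier in the thread against that AVX peak, here is a rough Python sketch. The function names are made up, and the peak uses the 3.4GHz base clock only, so it ignores Turbo Boost; treat the percentages as ballpark figures.

```python
def peak_gflops(clock_ghz: float, cores: int, dp_flops_per_cycle: int) -> float:
    """Theoretical peak double-precision GFLOPS (physical cores only)."""
    return clock_ghz * cores * dp_flops_per_cycle


def efficiency(measured_gflops: float, peak: float) -> float:
    """Fraction of the theoretical peak that a measured score reaches."""
    return measured_gflops / peak


# Core i7 2600 at base clock with AVX: 8 DP FLOPs per cycle per core.
avx_peak = peak_gflops(3.4, 4, 8)

# Scores reported earlier in this thread for a 2600K (GFLOPS).
for measured in (93.0, 85.0, 67.0):
    share = efficiency(measured, avx_peak)
    print(f"{measured:5.1f} GFLOPS is {share:.0%} of the {avx_peak:.1f} GFLOPS peak")
```

The best reported run sits in the 80-90% band, while the weaker runs fall below it, which fits the "depending on my luck" variance mentioned earlier.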

You shouldn't be basing it on Linpack, but on the applications you use. It's quite rare to find a multi-threaded program that hits the execution resources so hard that SMT is a guaranteed loss.
 

oRdchaos

Member
Nov 4, 2000
63
0
0
IIRC, activating the buzzer is Watson's least advantageous feature used during the Jeopardy challenge.

I felt like this was a bit of BS from the IBM'ers. If you buzz in too early there's a timeout window where you're no longer able to buzz in. Watson will never run into this issue. Any humans fighting to squeeze in ahead of Watson will hit it over and over again.

It looked like if the question took longer for Trebek to read, Watson had more than enough time to come up with his best answer, and was ready to buzz in immediately. At that point, the only lag in Watson's answering process was how long it took the electrical signal from the light to reach his circuits, and how long it took his 'thumb' to depress the buzzer. I would imagine that left a very small window of time for humans to buzz in.

It was clear that for a brief time Ken was trying to buzz in for every question regardless of his knowledge of the answer, and was having difficulty beating Watson on any of them. Ken's had more practice with the buzzer than any contestant in history, so that seemed a little off to me.

Watson certainly seemed to be at a disadvantage on the shorter questions, like the actor/director one, but that seemed to be processing latency rather than buzzer.