CPU beware: nVidia Tesla Linpack Numbers Analyzed

happy medium · May 19, 2010

Twice the power usage but 8x the performance?

I take it that Nvidia is gonna make some serious money from this?

http://www.brightsideofnews.com/news/2010/5/19/cpu-beware-nvidia-tesla-linpack-numbers-analyzed.aspx

JAG87 · May 19, 2010

Pretty crazy stuff... this is where nvidia cashes in with Fermi.

Genx87 · May 19, 2010

This is just the beginning. As Nvidia adds 100s of GFlops to their chips Intel and AMD will be adding 10s.

cbn · May 19, 2010

BSN article said:
The new opportunities given to the scientific community will usher us into a new age of responsible power management and radical performance increases. nVidia isn't so wrong when they say a million dollar budget will now get you 50 TFLOPS of computing power, rather than 10 if you would rely on CPUs alone. The company now has the Linpack crown and we can't stop but asking ourselves will that title make nVidia to bite even more or is the company overly satisfied with themselves.

Interesting points. It will be interesting to see what the next 5-10 years hold.

cbn · May 19, 2010

Genx87 said:
This is just the beginning. As Nvidia adds 100s of GFlops to their chips Intel and AMD will be adding 10s.

BSN Tesla article said:
With the figure of 328.05 GFLOPS, Tesla C2050 is taking the crown of world's most powerful single piece of silicon. While there are CPUs that can achieve higher numbers, they're consisted out of multiple dies. Until C2050 came along, world's most powerful piece of silicon belonged to IBM and their PowerXCell 8i, more known as "The Cell." Equipped with full eight SPE units, the PowerXCell 8i at 3.2GHz yields out 100 GFLOPS, as measured by the team of prof. Jack Dongarra, creator of Linpack.

CPU's won't be able to regain the title even with the POWER7 quad-core die, which can yield a theoretical maximum of 132.48 GFLOPS. The major difference between the CPU and a GPU is that CPUs generally tend to have very low difference between theoretical efficiency and the real world [in number crunching, don't get us started on memory bandwidth efficiency], while GPUs don't have as much cache to correctly predict everything coming their way - and pack lower efficiency, which still enables them to dominate the rankings.

Can someone explain the part bolded to me?

Are CUDA programs specially designed to work around the differences in cache?

sandorski · May 19, 2010

Genx87 said:
This is just the beginning. As Nvidia adds 100s of GFlops to their chips Intel and AMD will be adding 10s.

Given that neither Intel nor AMD have been the Leader in this measure in the past, I suspect this means little to them.

Idontcare · May 19, 2010

The Future is Fusion!

Genx87 · May 19, 2010

sandorski said:
Given that neither Intel nor AMD have been the Leader in this measure in the past, I suspect this means little to them.

How much are Tesla systems going to run? I suspect they will be competitively priced against x86 boxes. The previous champions in this cost quite a bit more than x86 offerings.

How many x86 boxes will need to be run, maintained, and powered to equal one Tesla box?

sandorski · May 19, 2010

Genx87 said:
How much are Tesla systems going to run? I suspect they will be competitively priced against x86 boxes. The previous champions in this cost quite a bit more than x86 offerings.

How many x86 boxes will need to be run, maintained, and powered to equal one Tesla box?

How many Tesla Boxes will be built? AMD/Intel don't care about such a small Market. They compete and dominate in the PC Market and will continue to do so.

Genx87 · May 19, 2010

sandorski said:
How many Tesla Boxes will be built? AMD/Intel don't care about such a small Market. They compete and dominate in the PC Market and will continue to do so.

x86 competes in the HPC market. It basically destroyed the RISC chips in the process. It is quite lucrative. If it weren't, Intel wouldn't have built Itanic.

happy medium · May 19, 2010

Genx87 said:
How much are Tesla systems going to run? I suspect they will be competitively priced against x86 boxes. The previous champions in this cost quite a bit more than x86 offerings.

How many x86 boxes will need to be run, maintained, and powered to equal one Tesla box?

A CPU box with dual Xeons and 48 GB of memory costs $7,000.
A CPU/GPU box with dual Teslas, dual Xeons and 48 GB of memory costs $11,000.
So the cards cost $1,500 apiece?

That's about 35% more cost for 800% more performance.
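As a quick sanity check on the arithmetic, here is a minimal Python sketch that takes the two box prices quoted above at face value. It assumes the whole price difference is due to the two Tesla cards, which the post does not confirm; the function names are made up for illustration.

```python
# Prices quoted in the post above (US dollars).
cpu_box = 7_000    # dual Xeon, 48 GB
gpu_box = 11_000   # same box plus two Tesla cards
num_cards = 2

def per_card_cost(base, upgraded, cards):
    """Price difference between the two boxes, split evenly across the cards.

    Assumption: the difference is entirely the cards (no extra PSU, chassis, etc.).
    """
    return (upgraded - base) / cards

def percent_more(base, upgraded):
    """Extra cost of the upgraded box as a percentage of the base box."""
    return (upgraded - base) / base * 100

card_cost = per_card_cost(cpu_box, gpu_box, num_cards)
extra_cost_pct = percent_more(cpu_box, gpu_box)

print(f"implied cost per card: ${card_cost:,.0f}")
print(f"extra cost over the CPU-only box: {extra_cost_pct:.0f}%")
```

Under those assumptions the implied price is $2,000 per card, and the GPU box costs roughly 57% more than the CPU-only box rather than 35%.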

ronnn · May 19, 2010

Idontcare said:
The Future is Fusion!

All this really excites me as I linpack all the time. PR guys are funny....

sandorski · May 19, 2010

Genx87 said:
x86 competes in the HPC market. It basically destroyed the RISC chips in the process. It is quite lucrative. If it weren't, Intel wouldn't have built Itanic.

Ah ok. Just read up on it a bit, seems I was in error. How Nvidia does remains to be seen. Heat/Power Consumption seem to be major factors in the HPC Market.

brybir · May 19, 2010

It gets more performance if the application is designed to benefit from Tesla-type cards. This has been talked about a lot and is widely expected... large parallel workloads are the playground of GPU-style processors.

It's not like the non-parallel workload world is going away overnight.

This market will get very interesting when MS gets DirectCompute more mainstream and both Intel and AMD start offering decent GPU performance on die with their processors. Should be interesting.

Scali · May 20, 2010

Computer Bottleneck said:
Can someone explain the part bolded to me?

Are CUDA programs specially designed to work around the differences in cache?

It basically means that performance drops off more quickly when you need memory access that isn't very predictable for the cache (or when you need to perform other types of operations that GPUs aren't as efficient at as CPUs are, such as a lot of branching).
E.g., if you have a card with 1 TFLOPS of theoretical power, with certain applications you should be happy to get 200 GFLOPS of actual throughput. That would be 20% efficiency, but still faster than any CPU.
With CPUs it's easier to get closer to the theoretical maximum, as their caches are much more granular and have smarter prediction algorithms (at the cost of speed), and the instruction sets are more flexible and powerful.
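To put numbers on that, here is a small Python sketch of the efficiency calculation. The GPU figures are the ones from the example above; the CPU peak and efficiency are made-up values, chosen only to illustrate the point and not taken from any benchmark.

```python
def efficiency(achieved_gflops, peak_gflops):
    """Fraction of the theoretical peak that is actually delivered."""
    return achieved_gflops / peak_gflops

def achieved(peak_gflops, eff):
    """Delivered throughput given a theoretical peak and an efficiency fraction."""
    return peak_gflops * eff

# GPU figures from the example above: 1 TFLOPS peak, 200 GFLOPS delivered.
gpu_eff = efficiency(200, 1000)

# Hypothetical CPU (assumed numbers): 100 GFLOPS peak at 90% efficiency.
cpu_achieved = achieved(100, 0.90)
gpu_achieved = achieved(1000, gpu_eff)

print(f"GPU efficiency: {gpu_eff:.0%}")
print(f"GPU delivers {gpu_achieved:.0f} GFLOPS, CPU delivers {cpu_achieved:.0f} GFLOPS")
```

Even at a fifth of its peak, the GPU in this sketch still delivers more than twice the throughput of the highly efficient but much smaller CPU.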

Lonyo · May 20, 2010

happy medium said:
A CPU box with dual Xeons and 48 GB of memory costs $7,000.
A CPU/GPU box with dual Teslas, dual Xeons and 48 GB of memory costs $11,000.
So the cards cost $1,500 apiece?

That's about 35% more cost for 800% more performance.

$2000 each.

Also raw performance is meaningless.
As are both other graphs.

A decent and more representative comparison would pit the CPU+GPU setup against a CPU-only setup of equivalent cost, or of equivalent power consumption, or one of each, or one that matches on both.
Giving raw performance numbers for two unequal setups is silly.
Giving performance-per-x numbers when you have thrown in a high fixed cost and then a couple of add-ons to said fixed cost gives a horribly misleading performance per $ and per watt figure. How much extra power and cost would getting a 4-CPU instead of a 2-CPU server end up giving you? Probably not twice the cost and twice the power, surprisingly.

But then it's not NV's fault. As someone has already said, PR people are hilarious, and this is pretty hilarious PR.
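The fixed-cost point can be illustrated with a rough Python sketch. Only the $7,000 and $11,000 box prices and the 8x performance claim come from the thread. The split into a $4,000 fixed cost plus $1,500 per CPU, and the assumption that a 4-CPU box is exactly twice as fast as a 2-CPU box, are invented purely for illustration.

```python
# From the thread: box prices and the claimed 8x speedup of the GPU box.
CPU_BOX_COST = 7_000
GPU_BOX_COST = 11_000
GPU_BOX_PERF = 8.0     # relative to the 2-CPU box, which is defined as 1.0

# Hypothetical split of the CPU box price (assumed, not from the thread).
FIXED_COST = 4_000     # chassis, memory, disks, PSU
COST_PER_CPU = 1_500   # 4_000 + 2 * 1_500 == 7_000

def perf_per_dollar(perf, cost):
    """Relative performance delivered per dollar spent."""
    return perf / cost

# Comparison as published: 2-CPU box versus the same box plus two GPUs.
two_cpu = perf_per_dollar(1.0, CPU_BOX_COST)
gpu_box = perf_per_dollar(GPU_BOX_PERF, GPU_BOX_COST)
ratio_vs_two_cpu = gpu_box / two_cpu

# Closer-to-equal-cost comparison: a hypothetical 4-CPU box, assumed 2x as fast.
four_cpu_cost = FIXED_COST + 4 * COST_PER_CPU
four_cpu = perf_per_dollar(2.0, four_cpu_cost)
ratio_vs_four_cpu = gpu_box / four_cpu

print(f"GPU box vs 2-CPU box: {ratio_vs_two_cpu:.2f}x performance per dollar")
print(f"GPU box vs 4-CPU box (${four_cpu_cost:,}): {ratio_vs_four_cpu:.2f}x performance per dollar")
```

With these made-up numbers the GPU box still comes out ahead, but its performance-per-dollar advantage shrinks from about 5.1x to about 3.6x once the fixed cost is spread over more CPUs, which is the distortion described above.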

Keysplayr · May 20, 2010

sandorski said:
How many Tesla Boxes will be built? AMD/Intel don't care about such a small Market. They compete and dominate in the PC Market and will continue to do so.

Small market, huh? OK. Let's just take one little item out of hundreds of thousands of applications: mapping the human genome. I chose this because I used to work in the IT dept. of one of the biggest laboratories in the world. For mapping the human genome alone (there are literally hundreds of other types of research going on), entire NOCs (network operations centers) were dedicated, with rows and rows of 40U racks filled top to bottom with hundreds of 1U dual-CPU rigs. 25 40U racks in one NOC alone for this one application.
Each populated 40U rack costing upwards of $125,000, and this was years ago.

This is just one lab, running in one dept of the lab, running one app in one dept of that lab.
Multiply that by however many labs around the world are collaborating on the same app with their own NOCs and server resources.

I highly doubt...... scratch that. I know you do not realize the sheer magnitude of the HPC market. Cancer research alone has thousands and thousands of departments focusing on different types of cancers and causes of cancer. It's limitless.

In that same dept that was Mapping the Human Genome, don't forget each scientist with his/her own office workstations.

It's MEGA HUGE, even in this teeny tiny segment of the overall HPC usage in the world today.

I know you think this is PR dude. Call it what you want, but it's the truth. I just gave you 1st hand facts on the littlest example from one dept of a lab.

Multiply that by millions for all other applications in the scientific community alone. And this is Academia alone. We haven't even gotten into Corporate usage yet.

It's really mind-blowing, the sheer size of it. It's no wonder it eludes people when they can't fathom the size or use a little example like this to put things into perspective.

Then there are the other numerous departments.
Alzheimer's research
Cancer research
Autism research

Too many to list, and each with their own number-crunching NOCs.

Still feel the same way now?

Scali · May 20, 2010

Thing is, nVidia compensates for the market size in their price anyway.
That's the main reason why a Quadro is considerably more expensive than a GeForce... or a Tesla considerably more expensive than a Quadro even.
Intel/AMD do the same with their server CPUs. Most of it is rehashed technology from the desktop/notebook market, but the margins are a lot higher.

There have been many companies that pretty much survived on HPC systems alone... such as SGI, Cray, Sun to a certain extent etc.
They may not have sold all that many units compared to consumer-oriented companies... but that didn't stop them from being highly successful and becoming large, profitable and very influential companies.

Tsavo · May 20, 2010

Keysplayr said:
Small market, huh? OK. Let's just take one little item out of hundreds of thousands of applications: mapping the human genome. I chose this because I used to work in the IT dept. of one of the biggest laboratories in the world. For mapping the human genome alone (there are literally hundreds of other types of research going on), entire NOCs (network operations centers) were dedicated, with rows and rows of 40U racks filled top to bottom with hundreds of 1U dual-CPU rigs. 25 40U racks in one NOC alone for this one application.
Each populated 40U rack costing upwards of $125,000, and this was years ago.

This is just one lab, running in one dept of the lab, running one app in one dept of that lab.
Multiply that by however many labs around the world are collaborating on the same app with their own NOCs and server resources.

I highly doubt...... scratch that. I know you do not realize the sheer magnitude of the HPC market. Cancer research alone has thousands and thousands of departments focusing on different types of cancers and causes of cancer. It's limitless.

In that same dept that was Mapping the Human Genome, don't forget each scientist with his/her own office workstations.

It's MEGA HUGE, even in this teeny tiny segment of the overall HPC usage in the world today.

I know you think this is PR dude. Call it what you want, but it's the truth. I just gave you 1st hand facts on the littlest example from one dept of a lab.

Multiply that by millions for all other applications in the scientific community alone. And this is Academia alone. We haven't even gotten into Corporate usage yet.

It's really mind-blowing, the sheer size of it. It's no wonder it eludes people when they can't fathom the size or use a little example like this to put things into perspective.

Then there are the other numerous departments.
Alzheimer's research
Cancer research
Autism research

Too many to list, and each with their own number-crunching NOCs.

Still feel the same way now?

That's pretty cool. Put Teslas in all the boxes and you'd be able to play Crysis at 1920x1080 at 60 fps with all the eye candy cranked to max.

Keysplayr · May 20, 2010

Tsavo said:
That's pretty cool. Put Teslas in all the boxes and you'd be able to play Crysis at 1920x1080 at 60 fps with all the eye candy cranked to max.

Or perhaps try to find a cause/cure for a debilitating disease.

Fox5 · May 20, 2010

Keysplayr said:
Or perhaps try to find a cause/cure for a debilitating disease.

http://www.youtube.com/watch?v=hevLZzodcmI

Fox5 · May 20, 2010

I really hope nvidia finally utilizes its 3dfx ip and brings these commercials back.

http://www.youtube.com/watch?v=E311nNuhy34

dug777 · May 20, 2010

Keysplayr said:
Or perhaps try to find a cause/cure for a debilitating disease.

I think he may have been cracking a little joke, Mr Serious Pants

Keysplayr · May 20, 2010

Well I'm sorry, but I'm just a little tired of thread derails lately. Intentional or not. I don't lack a sense of humor, just the patience this morning.

Genx87 · May 20, 2010

sandorski said:
Ah ok. Just read up on it a bit, seems I was in error. How Nvidia does remains to be seen. Heat/Power Consumption seem to be major factors in the HPC Market.

It is, and surprisingly the problem on the power consumption side comes down to how much a power plant can supply over the wire. I have heard about data centers that are tapped out from a power perspective. The only way for them to get more performance is to find a faster solution within the same power envelope or to build their own on-site power plant. If a data center had a problem like this, the Nvidia solution would provide 4 times the performance for the same power envelope.
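The 4x figure matches the thread's opening numbers (twice the power, 8x the performance). A minimal Python sketch of a power-capped data center follows. The 500 W per CPU box and the 100 kW budget are assumed values picked only to make the arithmetic concrete; only the 2x power and 8x performance ratios come from the thread.

```python
def total_performance(budget_watts, box_watts, box_perf):
    """Aggregate relative performance of as many boxes as fit in the power budget."""
    boxes = budget_watts // box_watts   # whole boxes only
    return boxes * box_perf

BUDGET_W = 100_000            # assumed data center power budget
CPU_BOX_W = 500               # assumed draw of one CPU-only box
GPU_BOX_W = 2 * CPU_BOX_W     # "twice the power usage" (from the thread)

cpu_total = total_performance(BUDGET_W, CPU_BOX_W, 1.0)
gpu_total = total_performance(BUDGET_W, GPU_BOX_W, 8.0)   # "8x the performance"
gain = gpu_total / cpu_total

print(f"CPU-only: {cpu_total:.0f} units, CPU+GPU: {gpu_total:.0f} units")
print(f"gain within the same power envelope: {gain:.0f}x")
```

Half as many boxes fit in the budget, each eight times as fast, so the capped site gets four times the throughput in this sketch.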
