Nvidia Teslas on an office server?

equazcion

Member
Feb 13, 2006
56
0
61
I was wondering if it's worth it to invest in nVidia Tesla cards for a Windows 2008 server. I'm reading that they're mainly intended for scientific calculations and simulations, but would they not help with ordinary office LAN services as well, like databases, XMPP, WSUS, file serving, and other usual Windows services?
 

hans007

Lifer
Feb 1, 2000
20,212
18
81
uh, Tesla cards have to have software written for them specifically, using CUDA / OpenCL etc.

None of those normal server tasks are written to those standards, and they probably wouldn't benefit from anyone porting them to CUDA/OpenCL, as they tend not to be highly parallelizable.
 

Soulkeeper

Diamond Member
Nov 23, 2001
6,725
151
106
Unless you have a full-time programmer working for your office who can write the OpenCL software you need, it's pointless. Some of the big companies do ...
Or if you have a particular piece of software that you rely on heavily and already know can take advantage of OpenCL/CUDA ...
 

Voo

Golden Member
Feb 27, 2009
1,684
0
76
Unless you have a full-time programmer working for your office who can write the OpenCL software you need, it's pointless. Some of the big companies do ...
Well, it's not as if a fileserver would ever profit from CUDA. Same goes for WSUS or messaging protocols. It's not as if the CPU were the bottleneck for those things (and I have no idea how you'd even write an efficient CUDA XMPP implementation, for example - I don't like saying it's impossible, but I don't see how).

Databases usually fall into that category too, although the term is vague enough that if we include data warehouses - well, possible, although the bottleneck should still be the memory.
 

Soulkeeper

Diamond Member
Nov 23, 2001
6,725
151
106
Well, it's not as if a fileserver would ever profit from CUDA. Same goes for WSUS or messaging protocols. It's not as if the CPU were the bottleneck for those things (and I have no idea how you'd even write an efficient CUDA XMPP implementation, for example - I don't like saying it's impossible, but I don't see how).

Databases usually fall into that category too, although the term is vague enough that if we include data warehouses - well, possible, although the bottleneck should still be the memory.

I'm not suggesting any of that, so I won't argue with you there.
Although there certainly are "office applications" that can benefit from hardware speed/efficiency increases, including via OpenCL/CUDA (though there are very few of them).

I think you quoted me because you misunderstood my position.
I am not suggesting the op purchase tesla for any of the needs he listed.

I should be more clear :)
 
Last edited:

Voo

Golden Member
Feb 27, 2009
1,684
0
76
I think you quoted me because you misunderstood my position.
I am not suggesting the op purchase tesla for any of the needs he listed.

I should be more clear :)
Sorry, as I understood your post, you meant that if someone had the programmers for the task, they could profit from using Tesla. Seems like I misunderstood you, but no harm done, I hope.

Too bad it doesn't seem like this topic will give us much controversy or food for talk ;-)
 

ydnas7

Member
Jun 13, 2010
160
0
0
I think these could be of benefit to databases, but since database costs are essentially high software $$$, the benefits would be very questionable.
 

Tuna-Fish

Golden Member
Mar 4, 2011
1,513
2,101
136
I think these could be of benefit to databases, but since database costs are essentially high software $$$, the benefits would be very questionable.

How would it help databases? Databases are all about memory access, and while GPUs have a lot of bandwidth, the memory latency is something truly appalling.
 

Dribble

Platinum Member
Aug 9, 2005
2,076
611
136
How would it help databases? Databases are all about memory access, and while GPUs have a lot of bandwidth, the memory latency is something truly appalling.

lol - you just answered the question and then tried to deny it for some reason. Having memory several times as fast matters a lot more than a little latency. Not that I'm convinced databases are "all about memory access" anyway - I'm sure there's a whole pile of hardware bottlenecks: fast processing for all the server-side work, very fast network access, etc.

Anyway, I'm sure a GPU could speed up servers, but that requires software that runs on the GPU, which is pretty unlikely for a while. Perhaps one day Oracle will bring out a GPU-accelerated version, but there's lots of complexity, as these things have to be 100% reliable and secure.
 

Voo

Golden Member
Feb 27, 2009
1,684
0
76
lol - you just answered the question and then tried to deny it for some reason. Having memory several times as fast matters a lot more than a little latency. Not that I'm convinced databases are "all about memory access" anyway - I'm sure there's a whole pile of hardware bottlenecks: fast processing for all the server-side work, very fast network access, etc.
If you're talking about a normal DB (and not data warehousing or something where you do computationally expensive things with the data), then I don't see what you're getting at. The current memory capacity of a GPU, together with how its caches work, limits the usefulness of a GPU for memory-limited things quite a lot.

Loading data from RAM and writing back is additional work that doesn't have to be done when doing the work on the CPU (and even if the data is already in GDDR, no GPU can talk to the IP stack [atm], so you still end up transferring data), and since there's not much work to be done, the GPU can't really excel. Sure, a full-table scan can be sped up, but an indexed search? Not really. And not many performance-relevant statements have to use full-table scans.
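
To make that concrete, here's a rough, untested sketch (made-up column layout) of why a full-table scan maps so nicely onto CUDA while an indexed lookup doesn't:

Code:
// Untested sketch: a predicate scan over one integer column, one thread
// per row. Every thread runs the same comparison and the addresses are
// perfectly predictable - ideal GPU work. The catch: the column has to
// be copied into GPU memory first, and the match flags copied back.
__global__ void scan_filter(const int *column, int num_rows, int key,
                            unsigned char *match)
{
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < num_rows)
        match[row] = (column[row] == key);
}
// An indexed search is the opposite: a few dependent pointer hops down
// a B-tree. Nothing to parallelize, so the CPU wins outright.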
 
Last edited:

Cerb

Elite Member
Aug 26, 2000
17,484
33
86
The kind of work where a high-latency compute engine (anything not directly connected to the CPU) could excel will also be the kind of work where more COTS servers can be easily added. The hardware and power bills will be more than made up for by not having to contract out new software for some proprietary hardware. The fancy GPU box would need to be able to basically run its own DB software, in tandem with the COTS server. Needing to work on large result sets would also be a prerequisite, currently, so that it can churn for a while on each query, to make latency less of an issue. There is promise, but we aren't nearly there yet.

Until low-latency access to parallel compute systems is common (direct access to the CPU's memory, with latency comparable to the CPU's own, will be a must), with a wide software support base, either as IGP-with-RAS or as something plugged into a spare CPU socket, we likely won't see much going on. For the near future, there will be more consumer/prosumer programs, where hardware cost and power efficiency matter, and more embedded programs, where spare resources just don't exist.

Anyway, I'm sure a GPU could speed up servers, but that requires software that runs on the GPU, which is pretty unlikely for a while. Perhaps one day Oracle will bring out a GPU-accelerated version, but there's lots of complexity, as these things have to be 100% reliable and secure.
But, why? They offer proprietary software and keep customers on an upgrade treadmill. The T-series gives them the ability to make that hardware proprietary, too. It is not just unlikely for a while; it is unlikely, period, unless Oracle buys a company that currently produces GPUs to make a new processor for Oracle software to run on. Ellison makes Jobs look humble and selfless.
 

Dribble

Platinum Member
Aug 9, 2005
2,076
611
136
Oracle will react pretty quickly if a competitor comes out that uses the GPU and runs at 10 times the speed.

Also, obviously it depends on the database, but there's more to most servers than the actual Oracle query. For example, the databases I work with will do a query and then return a load of data (i.e. a bunch of files). For the server, the most efficient way is to return that data as-is, but that is very heavy on network traffic. Often the most out-of-date bit of most companies' IT infrastructure is their networking.

It would be much more efficient to zip all those files up into one big compressed file and send that, but that's a very computationally expensive operation. Often you can go even further and extract only the sections of those files that are really required, and just send those zipped up - even more computationally heavy.

That is the sort of thing a GPU would be able to do quite efficiently. I'm sure there are loads of other examples that would also work well on a GPU - given new hardware you can always make use of it; it just takes a company with a bit of initiative to work out how.
 
Last edited:

Cerb

Elite Member
Aug 26, 2000
17,484
33
86
Oracle will react pretty quickly if a competitor comes out that uses the GPU and runs at 10 times the speed.
If that were plausible, maybe. As it stands, that kind of speedup is unreasonable, and will remain so until memory operations are fast, or until a reliable OS runs on Teslas, DB software gets made for it, and nVidia starts supporting decent amounts of memory. The T-series is still pretty much untouchable for the next several years.

Also, obviously it depends on the database, but there's more to most servers than the actual Oracle query. For example, the databases I work with will do a query and then return a load of data (i.e. a bunch of files). For the server, the most efficient way is to return that data as-is, but that is very heavy on network traffic. Often the most out-of-date bit of most companies' IT infrastructure is their networking.

It would be much more efficient to zip all those files up into one big compressed file and send that, but that's a very computationally expensive operation. Often you can go even further and extract only the sections of those files that are really required, and just send those zipped up - even more computationally heavy.

That is the sort of thing a GPU would be able to do quite efficiently. I'm sure there are loads of other examples that would also work well on a GPU - given new hardware you can always make use of it; it just takes a company with a bit of initiative to work out how.
The GPU would not be able to do that more efficiently. A GPU is slow at doing what your CPU generally does. Fermi can do some pointer chasing, but not well. Compression is not something a GPU will be able to do well; decompression, with the right compression algorithm, maybe. And if the network is the bottleneck, a programmable GPU won't fix that.

The [GP]GPU excels when you can apply the exact same instruction to a wealth of input data. For filtering, grouping, and sorting of large data sets, there should be room for it in the server room in the future. Right now, moving data around is just too slow, and developing software for a proprietary device poses significant risk to bean counters.
 
Last edited:

equazcion

Member
Feb 13, 2006
56
0
61
Thanks for the replies, interesting discussion.

http://www.tomshardware.com/news/Windows-Leopard-Cuda-Stream,7648.html

"Consider the GPU as a 'helper' now, offering up its higher-end processing areas to compute a portion of a task carried out by the CPU..."

...Makes it sound like it helps out with nearly anything the CPU happens to be doing.

"Nvidia's CUDA is compatible with many computational interfaces including PhysX, Java, Python, OpenCL, and DirectX Compute..."

...Python and Java might be of help on a server, assuming this means CUDA helps run those no matter how they're written.

Some of this article is just quoted claims from the manufacturer, but still worth pointing out.
 

Tuna-Fish

Golden Member
Mar 4, 2011
1,513
2,101
136
lol - you just answered the question and then tried to deny it for some reason. Having memory several times as fast matters a lot more than a little latency.

Only for anything without complex joins. The second you have to chase down data by foreign keys, the GPU starts being so slow it's not even funny. For single-threaded loads, modern-day GPUs are about as fast as a Pentium 1. That's the speed a complex query would run at.
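
In code, a foreign-key chase is basically this (untested sketch, toy layout):

Code:
// Untested sketch: chasing a chain of foreign keys. Each load depends
// on the previous result, so the loads issue one at a time - and each
// one is a full round trip to GDDR, hundreds of cycles. The GPU's only
// latency-hiding trick is to run other threads in the meantime, which
// does nothing for the latency of this one query.
__global__ void chase_keys(const int *foreign_key, int start_row,
                           int num_hops, int *result)
{
    int row = start_row;
    for (int hop = 0; hop < num_hops; ++hop)
        row = foreign_key[row];   // serialized dependent load
    *result = row;
}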
 

Cerb

Elite Member
Aug 26, 2000
17,484
33
86
Thanks for the replies, interesting discussion.

http://www.tomshardware.com/news/Windows-Leopard-Cuda-Stream,7648.html

"Consider the GPU as a 'helper' now, offering up its higher-end processing areas to compute a portion of a task carried out by the CPU..."

...Makes it sound like it helps out with nearly anything the CPU happens to be doing.
Compute is the key word. Most of the space and power of your CPU is dedicated to doing one thing, through several different mechanisms: accurately predicting what addresses in memory you will need over the next few hundred cycles, and trying to make sure the right data is on the CPU when you need it. The rest is useful, but no amount of ALUs, FPUs, dedicated vector engines, or any of that, will make up for a branch predictor that isn't up to snuff, or for the lack of prefetchers simultaneously fetching with several different algorithms.

The GP part of a GPGPU has most of its space and power dedicated to crunching numbers. To make it worthwhile, you need to have enough to crunch, without significant branching, and with highly predictable data loading over the next few hundred cycles. Basically, the kind of code that does not get such great benefits from all the speculation and cache on your CPU.
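
The textbook case is something like SAXPY. This sketch isn't meant as anything real, but it shows the shape of code the GPU is built for: the same instruction over a wealth of data, with trivially predictable addresses.

Code:
// Sketch: SAXPY, the archetype of GPU-friendly work. The same multiply-
// add is applied to every element, there is no branching that matters,
// and the address of element i is trivially computable, so loads can be
// issued far in advance - no branch predictors or prefetch heuristics
// required.
__global__ void saxpy(int n, float a, const float *x, float *y)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        y[i] = a * x[i] + y[i];
}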
 

Voo

Golden Member
Feb 27, 2009
1,684
0
76
"Nvidia's CUDA is compatible with many computational interfaces including PhysX, Java, Python, OpenCL, and DirectX Compute..."

...Python and Java might be of help on a server, assuming this means CUDA helps run those no matter how they're written.
No, that's not what it means (or maybe it's what the article meant - I haven't read it, but if so it's factually wrong). In reality there are some libraries for Java and Python - pretty much the same as for C/C++. And believe me, the last time I checked, the Python CUDA lib was... not pretty. It means writing kernels in the usual C++ variant, as strings. numpy could use CUDA for some stuff, which was as nice as usual, but writing CUDA programs yourself in Python? Horrible. Or at least no nicer than writing them in C++ to start with. And writing CUDA kernels efficiently is a topic in itself - the compiler is just plain incapable (when was the last time you had to unroll a loop yourself :p).
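
For anyone wondering what "kernels as strings" means: the kernel itself is still plain CUDA C, like the untested sketch below, except that in the Python libs that whole block of source sits inside a Python string and gets compiled at runtime - your editor thinks it's just text.

Code:
// Untested sketch: in the Python bindings I mean, this entire CUDA C
// source lives inside a quoted Python string that the library compiles
// at runtime - so you get C++'s verbosity with none of its tooling.
__global__ void multiply_them(float *dest, const float *a, const float *b)
{
    const int i = threadIdx.x;
    dest[i] = a[i] * b[i];
}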

But I think we're getting a bit OT..
 

Dribble

Platinum Member
Aug 9, 2005
2,076
611
136
Cerb + Tuna-Fish, thank you for the replies - you obviously know more about this than me :)