Run-of-the-mill x86 server CPUs can't handle HPC? Then tell me what is actually used for HPC, other than supercomputers?
Poorly phrased - I meant run-of-the-mill x86 servers. But it is true that most of the grunt for HPC comes from GPU accelerators.
Wrong. GPUs are used as accelerators, which means their function is to speed up specific kernels that constitute only a part of the overall computational task. They hang off clusters with hundreds of nodes containing thousands of x86 cores, which do the bulk of the heavy lifting.
I like Gustafson's law better. The big problem for GPUs and DP is latency and bandwidth, hence the reason top GPUs are moving to HBM memory (and faster interconnects to the CPU). Every component is superior in a particular problem domain. They don't hang all those GPUs off of HPC machines for nothing.
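On the latency/bandwidth point, a crude bandwidth-bound estimate shows why HBM matters for double precision. A minimal sketch, assuming made-up round numbers for bandwidth, bytes moved per FLOP, and DP peak (these are illustrative values, not vendor specs):

```python
# Illustrative roofline-style estimate: for a memory-bound DP kernel,
# attainable FLOP/s is capped by memory bandwidth / (bytes moved per FLOP).

def attainable_gflops(bandwidth_gbs, bytes_per_flop, peak_gflops):
    """Bandwidth-limited GFLOP/s, capped at the device's DP peak."""
    return min(peak_gflops, bandwidth_gbs / bytes_per_flop)

# Assumed, round numbers (not vendor specs): a stream-like DP kernel moving
# ~12 bytes per FLOP, on GDDR-class vs HBM-class memory bandwidth.
bytes_per_flop = 12.0
for name, bw, peak in [("GDDR-class GPU", 350.0, 3000.0),
                       ("HBM-class GPU", 900.0, 3000.0)]:
    print(f"{name}: ~{attainable_gflops(bw, bytes_per_flop, peak):.0f} GFLOP/s "
          f"of a {peak:.0f} GFLOP/s DP peak")
```

Under those assumptions the kernel sits far below DP peak either way, which is why faster memory and faster links to the CPU move the needle more than raw FLOPS.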
This depends on the domain. In the one workload I've seen up close where lots of GPUs were used - fluid dynamics for weather forecasting - the GPUs did practically all the work and the CPUs were there just to shuffle data around.
It depends on the fraction of overall code occupied by the linear solver - the higher the fraction, the better the speedup GPUs can achieve - see page 18 of this PDF. Obviously this is going to be domain-specific, like you say, but it is important to note that even when we expect GPUs to be ahead in a certain scenario, a multi-core CPU is not far behind.
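A quick way to see the "fraction occupied by the linear solver" effect is Amdahl's law with the solver as the accelerated portion. A minimal sketch; the solver fractions and the 8x GPU-solver speedup are made-up illustrative values, not figures from the linked PDF:

```python
# Illustrative Amdahl-style estimate: overall speedup when only the linear
# solver is offloaded to the GPU and everything else stays on the CPU.

def overall_speedup(solver_fraction, solver_speedup):
    """Amdahl's law: solver_fraction of runtime is sped up by solver_speedup."""
    return 1.0 / ((1.0 - solver_fraction) + solver_fraction / solver_speedup)

# Assume (purely for illustration) the GPU solver is 8x faster than the CPU one.
for frac in (0.3, 0.5, 0.7, 0.9):
    print(f"solver = {frac:.0%} of runtime -> overall speedup "
          f"{overall_speedup(frac, 8.0):.2f}x")
```

Which lines up with the point above: unless the solver dominates the runtime, a multi-core CPU handling the rest of the work is never far behind.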
Looks like AMD's argument that 2P systems will, in many cases, become redundant is actually a compelling one.
Phoronix: Initial Benchmarks Of The AMD EPYC 7601 On Ubuntu Linux
AMD EPYC 7601 (32C/64T) vs 2x Intel Xeon Gold 6138 (40C/80T)
http://www.phoronix.com/scan.php?page=article&item=epyc-7601-linux&num=4
Makes perfect sense really. How many 4-socket systems are there being spec'ed up these days?
How much of that is Intel's fault, though? Since the sockets were connected point-to-point over QPI and didn't require any major changes to the CPU, the huge upswing in cost and the near inability to fit in 2U made it an interesting setup. It cost more and you didn't get an increase in density. I don't know why someone wouldn't look at EPYC and say, well, we can get 32c or 64c in 2U, so why not get 64c? That one system could negate my company's complete virtualization setup.
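The density arithmetic behind that is easy to sanity-check. A minimal sketch, assuming every node fits in a 2U chassis and using the core counts quoted above (the chassis sizes and configurations are assumptions for illustration, not anyone's actual deployment):

```python
# Illustrative rack-density comparison: cores per rack unit for 1P/2P EPYC
# nodes versus a 2P Xeon Gold node, assuming a 2U chassis for each.

def cores_per_u(sockets, cores_per_socket, chassis_u):
    """Total cores in the node divided by the rack units it occupies."""
    return sockets * cores_per_socket / chassis_u

configs = [
    ("1P EPYC 7601, 2U", 1, 32, 2),
    ("2P EPYC 7601, 2U", 2, 32, 2),
    ("2P Xeon Gold 6138, 2U", 2, 20, 2),
]
for name, sockets, cores, u in configs:
    print(f"{name}: {cores_per_u(sockets, cores, u):.0f} cores/U")
```

On those assumptions even a single-socket EPYC box is close to the dual Gold 6138 node, and a dual-socket EPYC box clears it comfortably, which is the "why not get 64c in 2U" point.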