Architectural Direction of GPUs


BFG10K

Lifer
Aug 14, 2000
22,709
2,958
126
As for the ROPs, their speed depends on the memory clock: the higher the memory frequency, the higher the ROP performance.
On the GF100, the ROPs run at the core clock, just like they did with earlier architectures.
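
To put rough numbers on why the clock domain matters, here's a back-of-the-envelope sketch (the ROP count and clock below are illustrative figures for a GF100-class part, not quoted specs): peak pixel fill rate is ROPs × ROP clock, so on GF100 it tracks the core clock, while a higher memory clock only adds bandwidth.

```
// Back-of-the-envelope: peak fill rate follows the clock domain the ROPs sit in.
// The figures below are illustrative for a GF100-class part, not quoted specs.
#include <cstdio>

int main()
{
    const double rops     = 48.0;   // assumed ROP count
    const double core_mhz = 700.0;  // assumed core clock; on GF100 the ROPs run here
    const double gpix_s   = rops * core_mhz * 1e6 / 1e9;  // peak Gpixels/s

    // Raising the memory clock adds bandwidth but does not raise this peak,
    // because the ROPs are not in the memory clock domain on this part.
    printf("Peak pixel fill rate: %.1f Gpixels/s\n", gpix_s);
    return 0;
}
```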
 

Genx87

Lifer
Apr 8, 2002
41,095
513
126
Tianhe-1 is a hybrid, using both GPUs and CPUs; it's powered by 5,120 R700-class GPUs.

Oak Ridge and Nvidia had/have plans for a supercomputer using Fermi that will take the top spot on that list. I haven't heard anything about it since September last year, though. Does anyone have any news about that project?

The only thing I can find is the original release from Sept 30th. I suspect that, since Fermi was just released, the building of this supercomputer has just started. It will be interesting to see how it works out.
 

BenSkywalker

Diamond Member
Oct 9, 1999
9,140
67
91
We've been talking about NV vs ATI for a while, but if we are discussing architecture, why not talk about what is keeping Fermi from competing with Intel? I haven't tried programming with CUDA or DirectCompute yet, but I'm wondering exactly what functions are missing from these cards currently that keep them from replacing a CPU?

Branch-heavy or scalar code isn't going to perform well on Fermi, or on any other processor designed along comparable lines, relative to a general-purpose CPU. This is a design tradeoff, and not one that you can engineer around. Another issue, and a more common one, is that typical code can't be threaded nearly as heavily as you would need to make Fermi's architecture remotely competitive.

Currently the top consumer CPU from Intel supports 8 threads; code that only runs on one of them gives you ~25% of the performance of code that can run on all eight (some wiggle room in there, but obviously HT isn't directly equal to another physical core). On Fermi, something that can only run on a single core can at best offer ~2%, and that is only if it isn't branch heavy and isn't scalar (where the performance may be ~0.1% of ideal).

A GPGPU will not replace a CPU; or perhaps I should say that if you made a GPU that was decent at general code, it wouldn't be that good at what GPUs currently excel at.
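
Since the question was about CUDA/DirectCompute, here's a minimal CUDA sketch (hypothetical kernel names, illustrative only) of the branch problem above: lanes of a 32-thread warp that take different sides of a branch run one side after the other, so divergent code gives up much of the SIMD width the chip relies on, while the uniform kernel is the kind of wide, threadable code the hardware is actually built for.

```
// Minimal sketch of warp divergence vs. uniform execution (illustrative only).
#include <cuda_runtime.h>

// Divergent: odd and even lanes take different paths, so each warp runs both
// paths back to back and roughly half the lanes idle inside the branch.
__global__ void divergent_kernel(float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    if (i & 1)
        out[i] = sinf((float)i);
    else
        out[i] = cosf((float)i);
}

// Uniform: every lane does the same work, which is what Fermi-style
// hardware is designed around.
__global__ void uniform_kernel(float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = sinf((float)i) + cosf((float)i);
}

int main()
{
    const int n = 1 << 20;
    float *d_out = 0;
    cudaMalloc(&d_out, n * sizeof(float));

    const int threads = 256;
    const int blocks  = (n + threads - 1) / threads;
    divergent_kernel<<<blocks, threads>>>(d_out, n);
    uniform_kernel<<<blocks, threads>>>(d_out, n);
    cudaDeviceSynchronize();

    cudaFree(d_out);
    return 0;
}
```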

750MHz was before Nvidia started getting chips back.

750MHz was from an ATi slide. I never saw anything from nV indicating that was their target clock rate.

The first thing is: why suddenly go from a 3:1 ALU:TEX ratio to an 8:1 ratio?

I see this as a gamble on nVidia's part. If they win this gamble, their parts will be significantly faster than ATi's, and it could create a situation where ATi's top parts are directly comparable to nV's much lower offerings. This will rely on developer support, and we really won't know how it plays out for them until after the refresh of Fermi, more than likely.

On the subject of GPGPU, I can't help but feel that Nvidia's dominance of the HPC environment is as much marketing as fact.

If the total HPC market were simply the list of the fastest supercomputers in the world, nVidia wouldn't be spending anything on making that list. One of the biggest HPC applications in terms of volume, as a general example, is insurance companies analyzing statistics to figure out ideal rates (attract the best customers for them, discourage the others). These types of applications are done on a single board in a normal desktop PC, and that is the type of market nVidia would really like to grab on to. Video editing is obviously another, as is the typical lab workstation.

In terms of the big-gun supercomputers, pre-Fermi nV wasn't truly a viable option, as they didn't support ECC nor did they have the level of exception handling required for true DP support. The machines from ATi and nV on that list currently are very limited in their capabilities; that can work in isolated situations, but not for mass-market penetration. The new Oak Ridge machine using Fermi-based products is supposed to be significantly faster than any of the machines currently on that list and will be a far more viable option for more typical supercomputer uses (ECC/DP support make a big difference).

The way I see it, ATI took the "safe" route with a smaller and less complex architecture, and this strategy paid off very well for them.

We will have to wait and see how well it pays off for them for quite some time. To date ATi has sold ~6 million DX11 GPUs. By way of comparison, nV has pushed ~150 million DX10 GPUs. I'm not saying they are directly comparable numbers at all, mind you; just pointing out that what has shipped so far is probably only about 3% of all the DX11 GPUs that will eventually be sold. I think that they did a good job of executing their strategy; at this point the market will decide how well that is going to work out for them on a bottom-line basis.