[H]ardOcp : Just Cause 2 performance and image quality

Cookie Monster · May 4, 2010

Why did this thread turn into a massive flame war?

I think people fail to realise that Just Cause 2 is a DX10/10.1 game. What does this mean? No tessellation, since that's a DX11 feature. But because nVIDIA has CUDA, the devs were able to implement something similar via CUDA on DX10/10.1 nVIDIA hardware. (Note: this game is also part of the TWIMTBP program.)

Also, because this was done with CUDA, it shows the possibility of the same effects being done via OpenCL/DirectCompute (or tessellation).

I quite like the water effects and hope to see similar or better graphical improvements added to future games.

zerocool84 · May 4, 2010

Same thing as PhysX. Alienate half your user base and it won't take off. Can't get any more simple than that.

BenSkywalker · May 4, 2010

Cookie Monster said:
Why did this thread turn into a massive flame war?

I don't see people flaming each other here.

zerocool84 said:
Alienate half your user base and it won't take off.

You bring up a good point, much better to alienate 95% of your user base by requiring DX11, right?

ugaboga232 · May 4, 2010

Show me where it requires an incredible level of tessellation to do what CUDA does? That's just your educated guess.

T2k · May 5, 2010

Keysplayr said:
In which case Nvidia would perform better the higher the tess level gets.

Anything else?

Well, there's that mincy little thing that Fermi does not even have a dedicated tessellator; it's taken away from the shaders. In other words, the higher the load goes, the lower the tessellation performance falls.

So it's actually the other way around: in more and more advanced, modern games, NV will have less and less tessellation power versus ATI's dedicated performance.

T2k · May 5, 2010

BenSkywalker said:
You bring up a good point, much better to alienate 95% of your user base by requiring DX11, right?

Right, because Windows 7 in its first six months isn't already well above 10% market share and didn't already sell well over 100M copies...

...guess what, PhysX/CUDA are already well behind, and the more DX11 cards are sold, the more irrelevant they become.

It's pretty simple: Microsoft, while probably not giving a crap about CUDA vs OpenCL, plays its DX11 cards very well, and NV has no choice but to play along while hoping they can woo enough people away from x86 to their GPGPU stuff to survive the future.
Of course, if the GPGPU market heats up, then MS will immediately deploy a few thousand engineers and update DirectCompute or even upgrade it...

Sylvanas · May 5, 2010

T2k said:
Well, there's that mincy little thing that Fermi does not even have a dedicated tessellator; it's taken away from the shaders. In other words, the higher the load goes, the lower the tessellation performance falls.

So it's actually the other way around, in a modern game NV will have less and less tessellation power versus ATI's dedicated performance.

The shaders (CUDA cores) and the tessellator (PolyMorph engine) are two very separate things. The PolyMorph engine does not and cannot handle FP and Int calculations like a CUDA core does.

[Image: CUDA Core]

[Image: Polymorph Engine]

The tessellator is still a fixed-function unit in the geometry pipeline, but there are 16 of them, and they communicate through an L2 cache to coordinate out-of-order execution.

A higher tessellation load does not compromise CUDA core operations.

AdamK47 · May 5, 2010

Which in-game demo(s) did they run for benchmarking?

BenSkywalker · May 5, 2010

T2k said:
Right, because Windows 7 in its first six months isn't already well above 10% market share and didn't already sell well over 100M copies...

Have you posted on every forum that installing Windows 7 will turn any graphics card into a DX11 part? Actually, you should probably tell MS this; you clearly know more than ATi, nVidia and MS combined. Your genius must simply be beyond compare.

GaiaHunter · May 5, 2010

Sylvanas said:
The shaders (CUDA cores) and the tessellator (PolyMorph engine) are two very separate things. The PolyMorph engine does not and cannot handle FP and Int calculations like a CUDA core does.

[Image: CUDA Core]

[Image: Polymorph Engine]

The tessellator is still a fixed-function unit in the geometry pipeline, but there are 16 of them, and they communicate through an L2 cache to coordinate out-of-order execution.

A higher tessellation load does not compromise CUDA core operations.

But something has to apply textures and whatnot to all those extra triangles.

One thing I don't understand: I thought that tessellation was supposed to increase performance by adding detail on top of very simple textures, as opposed to the very complex textures games use atm. Instead we have very complex textures getting even more detail and taking a huge performance hit.

Sylvanas · May 5, 2010

GaiaHunter said:
But something has to apply textures and whatnot to all those extra triangles.

One thing I don't understand: I thought that tessellation was supposed to increase performance by adding detail on top of very simple textures, as opposed to the very complex textures games use atm. Instead we have very complex textures getting even more detail and taking a huge performance hit.

Yes, that's what texture units are for: 4 per SM on Fermi, running at half the shader clock. As I understand it (correct me if I'm wrong), when a surface of geometry is mapped, it does not matter how many triangles are within its boundary; they all get the same texture. (Modelworks has a good post that may touch on this.)

Here's a question: does an object that has been scaled up now have different depth buffer information from the non-tessellated object? I'd think it would have to, seeing as the raster engine is at the end of the pipeline before it gets to your screen, and therefore the way shadows are cast on the object may change now that it consists of more geometry.

In answer to your question though, I too wonder about this. Kind of like how hardware 'accelerated' physics has not 'accelerated' anything at all.

EDIT1: I just remembered reading this Beyond3D article where it says:

Moving onwards from the domain shader, we find that, on average, for 15% of the render time the pipeline is stalled by rasterisation (setup included here), meaning that the domain shader can output processed vertices and the primitive assembler can assemble them faster than they can be setup and rasterised. This is a consequence of having numerous small triangles in the scene (just look at something like the dragon's leg or the roof), and is one of the cases where upping setup rate beyond 1 triangle/clock could have helped (we're pretty sure the rasteriser itself isn't the one causing the stalls, given pixel/triangle ratios).

In this case the rasteriser is a bottleneck: it cannot set up all that additional geometry as fast as it's being output from the geometry pipeline. This may be why Nvidia has gone for 4 parallel raster engines, as the increased throughput from 16 PolyMorph blocks would be bottlenecked by a single raster engine like those in previous architectures.
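
To put rough numbers on that bottleneck argument, here is a minimal Python sketch. It treats the pipeline as a chain of stages and takes the slowest one as the sustained rate. The function names and every rate below are illustrative assumptions on my part, not measured figures for any real GPU.

```python
def sustained_triangle_rate(stage_rates):
    """Return (rate, stage_name) for the slowest stage in the pipeline.

    stage_rates maps a stage name to its peak throughput in
    triangles per clock (hypothetical numbers, not vendor specs).
    """
    limiting = min(stage_rates, key=stage_rates.get)
    return stage_rates[limiting], limiting


def stall_fraction(producer_rate, consumer_rate):
    """Fraction of time a faster producer sits idle waiting on its consumer."""
    if producer_rate <= consumer_rate:
        return 0.0
    return 1.0 - consumer_rate / producer_rate


# Hypothetical case: the geometry front end can emit 1.5 triangles/clock,
# but a single setup/raster unit accepts only 1 triangle/clock.
single = {"geometry": 1.5, "setup_raster": 1.0}
rate_single, limiter_single = sustained_triangle_rate(single)
stall_single = stall_fraction(single["geometry"], single["setup_raster"])

# Same front end feeding four parallel raster units at 1 triangle/clock each.
quad = {"geometry": 1.5, "setup_raster": 4 * 1.0}
rate_quad, limiter_quad = sustained_triangle_rate(quad)
stall_quad = stall_fraction(quad["geometry"], quad["setup_raster"])
```

With these made-up rates the single raster unit is the limiting stage and the front end stalls part of the time, while with four units the limit moves back to the geometry stage. Real hardware is messier (triangle size, culling, and load balancing all matter), so treat this only as a way to reason about where the bottleneck sits.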

happy medium · May 5, 2010

T2k said:
Well, there's that mincy little thing that Fermi does not even have a dedicated tessellator; it's taken away from the shaders. In other words, the higher the load goes, the lower the tessellation performance falls.

So it's actually the other way around: in more and more advanced, modern games, NV will have less and less tessellation power versus ATI's dedicated performance.

I thought Nvidia used its PolyMorph engine for dedicated tessellation?
I believe it was rumored to use its shader power before its launch.

http://www.anandtech.com/show/2977/...tx-470-6-months-late-was-it-worth-the-wait-/5

I think it's the other way around: the more tessellation advanced games use, the more the 5870 fails. It's known that the GTX 480 has 4 to 5 times the tessellation power of the 58xx series.

Keysplayr · May 5, 2010

T2k said:
Well, there's that mincy little thing that Fermi does not even have a dedicated tessellator; it's taken away from the shaders. In other words, the higher the load goes, the lower the tessellation performance falls.

So it's actually the other way around: in more and more advanced, modern games, NV will have less and less tessellation power versus ATI's dedicated performance.

As others have stated above, you are incorrect. Fortunately, they've explained it, and this has been a topic of discussion many times. Tessellation is handled in the PolyMorph engines, which are distinctly separate from the shaders. Now, with tessellation being done, there are more details to be calculated by the texture units and the shaders, so the end result is a higher load on the rest of the GPU. More detail, more power required to render it. This applies to both NV and AMD GPUs, the difference being that AMD GPUs have a single dedicated tessellation unit. AMD knows they need to beef up their tessellation performance. Rumor has it that SI will have double, and NI quadruple, the tessellation performance of the 5xxx series, which is a good thing.
