AMD Radeon HD 6970 already benchmarked? Enough to beat GTX480 in Tesselation?

blastingcap · Oct 29, 2010

Leadbox said:
That 2GB GDDR5, would it still have to go via a 256-bit bus or have they doubled it?

Not sure which rumors to believe anymore but IIRC, the slides I posted in the Barts megathread a while back seemed to indicate 256-bit + 1GB GDDR5 at >6000MHz (effective), whereas this one says 2GB GDDR5. I guess it could be a 512-bit part, but that would add costs and so I am skeptical. Maybe there will be two versions of Cayman (1GB and 2GB).

WhoBeDaPlaya · Oct 29, 2010

Genx87 said:
...months late with the X1800XT and the X1900XT was too little too late.

The X1900XT was great at the time. I jumped from a X800GTO to the X1900XT and then to the G80 awhile after.

busydude · Oct 29, 2010

blastingcap said:
Triangles are not all that matter. You need to process things like lighting, texture, etc. so things like the shaders can be the bottleneck, no matter how fast you can process polygons. Historically there was a surge in the demands placed on shaders, hence why people talk about how many SPs Cayman XT might have, etc. But NV seems hellbent on pushing more triangles onto the screen, because it has a geometry and tessellation advantage. I approve of this and hope AMD follows suit.

You have to bear with me here.. hardware is not my area of expertise, I just have some time to spare and learn something new I do not mean to pester you.

I know more SP's = greater performance.. but it does not mean they always scale perfectly.

Now, are there there two kinds of triangles being processed? One related to tessellation and one for Geometry processing... ? Just ignore if this is a stupid question.

Regarding 8x compared to GT200:

While programmable shading has allowed PC games to mimic film in per-pixel effects, geometric realism has lagged behind. The most advanced PC games today use one to two million polygons per frame. By contrast, a typical frame in a computer generated film uses hundreds of millions of polygons. This disparity can be partly traced to hardware-while the number of pixel shaders has grown from one to many hundreds, the triangle setup engine has remained a singular unit, greatly affecting the relative pixel versus geometry processing capabilities of today's GPUs. For example, the GeForce GTX 285 video card has more than 150× the shading horsepower of the GeForce FX, but less than 3× the geometry processing rate. The outcome is such that pixels are shaded meticulously, but geometric detail is comparatively modest.

So Nvidia just went back and forth in geometry processing?

blastingcap · Oct 29, 2010

busydude said:
You have to bear with me here.. hardware is not my area of expertise, I just have some time to spare and learn something new I do not mean to pester you.

I know more SP's = greater performance.. but it does not mean they always scale perfectly.

Now, are there there two kinds of triangles being processed? One related to tessellation and one for Geometry processing... ? Just ignore if this is a stupid question.

Regarding 8x compared to GT200:

So Nvidia just went back and forth in geometry processing?

I'm not a hardware guru either, but AFAIK, DX11 tessellation is different from triangle processing.

You need some baseline level of polygon (triangle) processing (this is ALL that is used in games that don't use tessellation). This is the old way of doing things, and apparently a GTX480 is eight times better at it than a GTX280:

"NV30 (GeForce FX 5800) and GT200 (GeForce GTX 280), the geometry performance of NVIDIA’s hardware only increases roughly 3x in performance. Meanwhile the shader performance of their cards increased by over 150x. Compared just to GT200, GF100 has 8x the geometry performance of GT200, and NVIDIA tells us this is something they have measured in their labs. This is where NVIDIA hopes to have the advantage over AMD, assuming game developers do scale up geometry and tessellation use as much as NVIDIA is counting on." - from: http://www.anandtech.com/show/2977/...tx-470-6-months-late-was-it-worth-the-wait-/3

Provided that the software supports it, a hardware tessellator allows you to cut up each polygon, set key points (think acupuncture), and then tell the GPU to "stretch" the cut-up polygon into a desired shape. The advantage of this is that the GPU's dedicated tessellator does the work of cutting up and stretching out the model, saving precious memory bandwidth compared to processing each and every triangle the old way.

I think this would be a great time for Scali to step in and give a brief history lesson of geometry bottlenecking vs. shader bottlenecking.

Scali · Oct 29, 2010

Zstream said:
You come off as it is an easy task to take what you have been doing for years into a specification that is needed for nvidia and or ATI. Have you worked in the industry where a simple code change turns out to ruin the entire 3d engine? Do you think it is a simple as a find and replace in wordpad?

Yes I have, have you?
Not sure what point you're trying to drive at.

Zstream said:
People have steps and a specific process when it comes to developing games. You have to give the code generated up top and let the developers code with it. I make one change and documentation, training etc. has to be worked out. It is a rather big undertaking.

You generally don't just jump from one algorithm to the next, but I don't see what that has to do with anything. This goes for all code, not just tessellation.
Point is that there are plenty of popular and very useful tesellation algorithms that you can efficiently implement with DX11.

Zstream said:
Did I ever say it wasn't very useful? PLEASE POINT ME TO WHERE I SAID IT.

Then what the heck was your point man?!
You were bringing up that the tessellator was allegedly not compatible with various algorithms, while at the same time neglecting to mention that it was compatible with various popular algorithms as well.

Zstream said:
So you are on a kick about not using textures in the first place? I could give you the best tessellation algorithm and video card to process it and regardless, a plain texture looks BAD with the 3d geometry.

Again, what is your point?
I countered your argument, and you just repeat it.

Zstream said:
I understand the undertaking in that approach and I am not debating that. I will however debate the better image quality.

If you want to debate it, use arguments. I don't see you presenting any.

Zstream said:
This is where we have our problem. You are saying it is easier for a company to place specific hardware on a card to provide Tessellation. Ok, fine... That is your approach to the problem.

My approach to the problem is that you can not have variations in Tessellation units. I think Nvidia's 480 is fine, when you move down to the 460 or below it starts to bother me. What I am expected to do is create a texture for the 3d mesh. I refuse to make three or four different textures to fit a 3d model. You scale down, meaning you make one high quality texture and scale it down.

It doesn't really work that way. Tessellation is done on-the-fly as I have already stated before, keep up.
So you design your geometry (which includes textures) so that the tessellation factor controls your level of detail. For lower-end hardware, you just limit your tessellation factor at a lower maximum.

Zstream said:
If I had three or four different levels of tessellation, I have to make a texture to fit those levels. I am surely not going to test the process on Nvidia and on ATI's hardware.

Only if you mess up your texture coordinates.

Zstream said:
A 3d mesh DOES look different with each variation level of tessellation. This causes me to create different textures for each level.

Only if you mess up your texture coordinates.
Aside from that, I don't see your argument. This is related to LOD in general, not tessellation. Conventional LOD algorithms have the same problem. To my knowledge, most games generally do not use different texture sets for different geometry levels. They use a single set of mipmaps for each texture which works for the entire range. Firstly, the differences would be very minor, and probably won't be noticed by most people... Secondly, the cost of doubling up the textures for each level of detail would be far too expensive.

Zstream said:
Not all GPU's have dedicated tessellation hardware that is sufficient enough to run a fully fledged tessellation styled game. I am referring to the mass audience. I would rather put a low level default tessellation mesh for the general public that can run fine on a dual core CPU. I will not put a low level tessellation mesh for the general public and have it run on fixed GPU hardware.

Well, I believe in a fully scalable approach.

Zstream said:
CPU's are not used enough in today's industry, why make it worse?

Because what you're proposing is utterly unrealistic.

Zstream said:
Once again you are referring to a large amount of tessellated objects.

Well, that's what a game with tessellation is. You tessellate every object.
Just like we have LOD on pretty much every object today.

Zstream said:
That is your opinion and yours to keep.

It's not an opinion, it is a fact. Software rendering and software T&L are dead... Software tessellation is dead by default.

Scali · Oct 29, 2010

maddie said:
Isn't that what AMD argued for?

Against Nvidia's advice of using extreme, unrealistic tessellation levels when testing cards.

No, AMD alleged that nVidia used extreme, unrealistic tessellation levels when testing cards.
AMD was incorrect.

Scali · Oct 29, 2010

jones377 said:
Each SM contains 32 cores (CUDA cores as they call it) and a 'polymorph engine'. The geometry (including tesselation) is handled by the polymorph engine, not the CUDA cores! Fermi contains 16 SMs (2 are disabled in GTX480). So no, using tesselation does NOT mean less performance for the shaders. Stop listening to Charlie D, he's the one spreading this false information.

Tessellation DOES use the unified shaders (aka Cuda cores on nVidia, Stream processors on AMD).
The pipeline is laid out like this:
Hull shader -> fixed function tessellator -> Domain shader

So you have a shader to set up the tessellation, then you have fixed hardware to do the actual subdivision, and then you have a shader to post-process the subdivided geometry, in a nutshell.

The hull and domain shader are standard SM5.0 shaders, and like all other shaders (vertex, geometry, pixel, compute), they are executed by the unified shaders.
This goes for both AMD and nVidia.

The difference is that AMD's tessellator bottlenecks the rest of the pipeline, so the vertex/pixelshaders will stall, and as such you won't see a drop when using heavier shaders... the units were sitting idle anyway, because the triangles couldn't be fed quickly enough.
Under normal conditions, you will always be limited by the shaders, because the GPU is designed in a way so that the fixed units will not bottleneck the GPU, and give the shaders maximum throughput.

Scali · Oct 29, 2010

blastingcap said:
Scali ought to be pleased that geometry is only 2x the previous generation and not up to 8x like Fermi was.

'Pleased' is not the right word. I would like both companies to develop the best hardware they possibly can. So I'm a bit disappointed that they've not quite caught up in this area yet. If they did, they'd have an absolute killer card. Better than nVidia's offering in every way (unless you happen to be of the Cuda/PhysX persuasion).

Having said that, it is good to know that common sense still seems to hold true (as I explained earlier, it's not that AMD can't build a tessellator like nVidia has done... but it seems that AMD has not bothered to try... and if they would make one, it would take them longer than the 6800/6900 series).

Scali · Oct 29, 2010

Skurge said:
Wait, Fermi was 8X compared to what? GT200?

And why was it so much, it doesn't seem to have made a huge difference.

Yea, compared to GT200, I believe ('last generation').
It was so much because of the PolyMorph engine.
Where they had a single triangle setup unit earlier, they now had up to 15 of these units.
So the maximum throughput was much larger, up to 8x. But this is mainly for tessellation obviously, since you have other bottlenecks if you render static geometry (which is why all previous hardware got away with the much lower triangle throughput).

Scali · Oct 29, 2010

busydude said:
LOL, then why did they name it Geometry Processing? They could have simply called it Triangle Processing.

Technically they're correct though.
You do not necessarily process triangles.
You can also process lines or quads.

Scali · Oct 29, 2010

blastingcap said:
I think this would be a great time for Scali to step in and give a brief history lesson of geometry bottlenecking vs. shader bottlenecking.

Well yea... funny thing is that we were never that geometry bottlenecked

In the old days, the CPU did the transform and lighting, and the 3D accelerator was just a rasterizer.
Since the CPU wasn't that fast at processing the triangles, and they had to be sent over the PCI-bus, this was generally your bottleneck.
So 3D cards were mainly used to render with better quality (texture filtering, mult-texturing, higher resolutions etc), but the polycount was still very low, just like full software rendering.

nVidia (yes, them again, can't help it) revolutionized this by putting the full T&L on the 3D accelerator. This is why nVidia started to call it a GPU.
The CPU could now upload the source geometry to the GPU at initialization, and for each frame, it only had to update some variables such as animation matrices and lights. The GPU would do the complete transform, lighting and rasterizing on its own, with no CPU intervention.
This meant that the CPU, system memory and PCI bus were no longer the bottleneck.

It now became a bit of a balancing act between hardware and software. Would games push detail via geometry, or via more advanced shading?
Developers generally chose the advanced shading option, using bumpmapping for adding detail to relatively lowpoly objects.
This meant that shader power was more important than triangle throughput, so shader power scaled up faster in the hardware.

It seems that the time has now come for triangle throughput (and thus geometry detail) to catch up with shader power.
The easiest way to think of tessellation is to think of it as the geometry equivalent of texture compression, I suppose.

blastingcap · Oct 29, 2010

Scali said:
Well yea... funny thing is that we were never that geometry bottlenecked
In the old days, the CPU did the transform and lighting, and the 3D accelerator was just a rasterizer.
Since the CPU wasn't that fast at processing the triangles, and they had to be sent over the PCI-bus, this was generally your bottleneck.
So 3D cards were mainly used to render with better quality (texture filtering, mult-texturing, higher resolutions etc), but the polycount was still very low, just like full software rendering.

nVidia (yes, them again, can't help it) revolutionized this by putting the full T&L on the 3D accelerator. This is why nVidia started to call it a GPU.
The CPU could now upload the source geometry to the GPU at initialization, and for each frame, it only had to update some variables such as animation matrices and lights. The GPU would do the complete transform, lighting and rasterizing on its own, with no CPU intervention.
This meant that the CPU, system memory and PCI bus were no longer the bottleneck.

It now became a bit of a balancing act between hardware and software. Would games push detail via geometry, or via more advanced shading?
Developers generally chose the advanced shading option, using bumpmapping for adding detail to relatively lowpoly objects.
This meant that shader power was more important than triangle throughput, so shader power scaled up faster in the hardware.

It seems that the time has now come for triangle throughput (and thus geometry detail) to catch up with shader power.
The easiest way to think of tessellation is to think of it as the geometry equivalent of texture compression, I suppose.

Thanks for the history lesson, I got a Voodoo 1 and soon was on non-gaming laptops for several years, so I missed out on a lot of developments from ~2000 to ~2006.

P.S. Seven posts in a row?! Haha.

Arkadrel · Oct 29, 2010

Still those demos that show factor scalable tessellation on a tea pot.. once you go past tessellation factor 5 or so, I cant notice any image improvements.

Lmao we ve come along way since final fantasy 7 times... where people where made up of square boxes more or less

Does anyone have a demo that shows the differnces on a more complex model? where you can see tessellation factor 1x -> 13x, and maybe from 13x->32?

check out how simple it is at start to how complex it becomes with higher factors of tessellation.
YET it doesnt become any better looking from haveing higher factors of tessellation.

:http://video.golem.de/games/2116/tessellation-demo-von-amd-auf-eigenem-directx-11-chip.html

busydude · Oct 29, 2010

Scali said:
Technically they're correct though. You do not necessarily process triangles. You can also process lines or quads.

That makes sense.

Scali said:
It now became a bit of a balancing act between hardware and software. Would games push detail via geometry, or via more advanced shading? Developers generally chose the advanced shading option, using bumpmapping for adding detail to relatively lowpoly objects. This meant that shader power was more important than triangle throughput, so shader power scaled up faster in the hardware. It seems that the time has now come for triangle throughput (and thus geometry detail) to catch up with shader power. The easiest way to think of tessellation is to think of it as the geometry equivalent of texture compression, I suppose.

Thanks for that post. Just a hypothetical scenario:

Consider there are two GPU manufacturers X and Y, X has more shader processing power but weak Geometry processing power and Y is just the opposite. As a developer, which GPU do you prefer in current situation?

Scali · Oct 30, 2010

blastingcap said:
P.S. Seven posts in a row?! Haha.

Well yea, happens when you're in a different timezone, I guess

This thread had about 2.5 pages of extra posts all of a sudden

Scali · Oct 30, 2010

busydude said:
Thanks for that post. Just a hypothetical scenario:

Consider there are two GPU manufacturers X and Y, X has more shader processing power but weak Geometry processing power and Y is just the opposite. As a developer, which GPU do you prefer in current situation?

Difficult to say, without knowing how much weaker one card is, compared to the other.
But where we are today, I'd say that the shader power has reached the 'overkill' level. High-end cards can easily run games at full detail with 4xAA or more, at least up to 1920x1080 resolutions. This is with very complex shading.
This is also exemplied by the fact that they're trying to apply the shader power in different ways now, like PhysX, post-processing filters, and other GPGPU-related things in games (AI etc?). And then there's other ways to burn your extra cycles, like with 3D Vision or 2+ monitor setups.
I think it's safe to say we have so much processing power, that we don't know what to do with it anymore, if it's just graphics on a 1920x1080 screen.

So I would definitely prefer better geometry processing power. This solves a bottleneck, unlike just having more shader power (the problem is feeding those shaders, you need a better geometry pipeline for that).
As a side-effect, because of using more geometry detail, you can use simpler shaders and less textures, because you no longer need to fake as much detail with bumpmapping and all that.
So you can actually get away with having slightly less shader power, since you work smarter, not harder.

AtenRa · Oct 30, 2010

A food for thought, my version of Cayman

CAYMAN

4-Way VLIW (Xt-Yt-Zt-W) where X,Y,Z share T (transcendentals) with W alone.

16 SIMDs
Each SIMD has 32 SPs with 4 or 6 TMUs
2048 Shaders with 64 or 96 TMUs

Dual Ultra-Threaded Dispatch Processors and Dual Tessellator units ??

T2k · Oct 30, 2010

blastingcap said:
This is just false,

No it's not.

although it would be true to say that in an actual gameplay situation (not just a benchmark), the bottleneck may be in the shaders rather than the tessellators. Heck the bottleneck could even be in the CPU!

NV has dedicated tessellation hardware. Period. See, e.g., http://www.anandtech.com/show/2918/2

No, they do NOT unless you have no clue what "dedicated" means.

T2k · Oct 30, 2010

jones377 said:
Each SM contains 32 cores (CUDA cores as they call it) and a 'polymorph engine'. The geometry (including tesselation) is handled by the polymorph engine, not the CUDA cores! Fermi contains 16 SMs (2 are disabled in GTX480). So no, using tesselation does NOT mean less performance for the shaders. Stop listening to Charlie D, he's the one spreading this false information.

Yes, it does go down, it's fairly obvious - they use the same resources. It's not Charlie but even their own papers - and please, spare me this "polymorph engine" bullshit PR, thanks.

Furthermore, GF100 can do 4 triangles/ clock compared with 1/clock for previous generation GPUs (both Nvidia and ATI) going back a long time. Not sure how much impact this has however, I would guess none for current gen game titles.

Just like this entire tessellation issue: it's nothing but the very typical NV-sponsored BS-fest, only to change the subject that NV has once again no competitive next-gen offer for a gamer.

No profanity in the tech sub forums

esquared
Anandtech Forum Director

jones377 · Oct 30, 2010

Scali explained it better than I did. Anyways, welcome to my ignore list

(See, I can do bold too!)

Gloomy · Oct 30, 2010

:sneaky:

T2k said:
bullshit

Sic em, mod bro.

Arkadrel · Oct 30, 2010

Gloomy in the context he said it, I dont really see why that would warrnt a mod "sic'ing em". >.> I didnt even notice the word when I first read it.

Also if someone can say type BS.. why cant they type the full word used in the right context?

T2k you could change your BS to "trivial, insincere, untruthful talk or writing, nonsense".
If people have a problem with the S word in it.

Phil1977 · Oct 30, 2010

Well I'm not convinced of tesselation (yet).

The few games that support it, to be honest I couldn't tell the difference. Ever wondered why all the tesselation demos use wireframes?

I also found that the performance impact is quite large, especially when other DX11 features are used (e.g. tesselation and advanced shadows in Alien vs. Predator)

Lonbjerg · Oct 30, 2010

I wonder if 2 x faster at tessellation only covers from 1-11x.
Or they fixed performance above (12-64x).

blastingcap · Oct 30, 2010

Lonbjerg said:
I wonder if 2 x faster at tessellation only covers from 1-11x.
Or they fixed performance above (12-64x).

2x better in geometry does not necessarily mean 2x better at tessellation. They are different things.

AMD Radeon HD 6970 already benchmarked? Enough to beat GTX480 in Tesselation?

Diamond Member

Diamond Member

Diamond Member

Diamond Member

Banned

Banned

Banned

Banned

Banned

Banned

Banned

Diamond Member

Diamond Member

Diamond Member

Banned

Banned

Lifer

Golden Member

Golden Member

Senior member

Golden Member

Diamond Member

Senior member

Diamond Member

Diamond Member