
Why can't we have more SGSSAA hacks in DX11 games?

Anarchist420

Diamond Member
Feb 13, 2010
Deferred rendering in DX9 games works fine with SGSSAA hacks. Is the issue that Microsoft isn't giving approval to driver teams or to most game developers? After all, I doubt Microsoft and AMD want DX11 games to have any hacks.

Did Microsoft make sure that Xbox One owners and AMD would be protected by disallowing more SGSS/MSAA bits in DX11 games?

I'm thinking it's because PC game development has been hindered by Microsoft and the Xbox 360, and now the Xbox One. However, I wanted your thoughts on this.
[Opinion: Fixed-function hardware rasterization, and even fixed-function video rendering, is a huge problem and is part of why we have to rely on hacks; we need programmable functions via general-purpose processors with a higher ratio of double-precision floating-point performance than is common in most GPUs marketed for "gaming".]

[Disclaimer: I'm not endorsing nvidia's drivers or its abuse of intellectual monopoly, but Microsoft, especially since 2005, has been a far greater evil. OpenGL for the win.]
 

BFG10K

Lifer
Aug 14, 2000
If you think Microsoft is so bad for PC gaming, why don't you just run along to Linux and stop posting these threads? It's free, it's open-source, it uses OpenGL, and there's no "intellectual monopoly".

Yet you keep going back to Windows, and you keep making these threads. Why? Do you do it for the attention, or for other reasons?

Opinion: Fixed-function hardware rasterization, and even fixed-function video rendering, is a huge problem and is part of why we have to rely on hacks; we need programmable functions via general-purpose processors with a higher ratio of double-precision floating-point performance than is common in most GPUs marketed for "gaming".
Larrabee did exactly what you asked for, and it was an utter failure for gaming performance. But again, why keep posting about it? You obviously feel this is the ideal GPU, so go and purchase one from Intel.
 

Anarchist420

Diamond Member
Feb 13, 2010
If you think Microsoft is so bad for PC gaming, why don't you just run along to Linux and stop posting these threads? It's free, it's open-source, it uses OpenGL, and there's no "intellectual monopoly".
Because of backwards compatibility issues and the fact that I am used to Windows. The Windows 7 GUI is very good and Win 7 x64 is adequate to good overall, and I guess I can't complain too much given that I didn't pay more than $35 for my Windows 7 x64 key. I'd also have to get another storage drive if I wanted Linux.

But I can't get Windows 8.x or 10, for sure; viruses and a terrible GUI (the Win 10 screenshot I saw looks ugly). When 7 is no longer supported I will definitely switch to Linux, assuming I am still alive and free, with a sound mind and body and my limbs, hands, and fingers all intact.

Also, nvidia's drivers for Linux are still not that good, I can be sure of that. Open-source drivers for nvidia would take a lot of time to get off the ground. For example, the lossy depth optimization bugs me.

Larrabee did exactly what you asked for, and it was an utter failure for gaming performance. But again, why keep posting about it? You obviously feel this is the ideal GPU, so go and purchase one from Intel.
It failed because it wasn't parallel enough, and because it used the x86 instruction set without enough appropriate extensions and without a cache architecture suited to graphics performance. And the iGPU is not good; the driver teams at 3dfx, and probably even Matrox, would be ashamed of us for putting up with nvidia's drivers combined with their prices and patents.

Also, it used hardware texturing, and if I am not mistaken it didn't have IEEE 754-compliant double-precision floating point. Integer performance had to have been weak too, given 1 GHz cores based on a Pentium 1 derivative. It had to have had poor IPC no matter how many cores there were, and less IPC doesn't allow as much data throughput.

If nvidia and ARM had had Intel's fab advantage for the past 5 years, then fixed-function rasterization would've been replaced with fully programmable rendering a year ago.

Software rendering is versatile; it's just that Intel has never been much of a graphics company, even when they had every opportunity to be one. Hell, had they made competent FP units in '98 and put two CPU sockets in mainstream motherboards, 3dfx would've been given a run for their money. But instead, Intel has lived off its x86 ISA since the '80s (thanks to patents and the licensing fees they could charge, they didn't even have to make breakthroughs with extensions) as well as their clever marketing (thanks to trademarks), and was too intimidated by the idea of two general-purpose processing units in one system for end users.

Intel isn't a terrible business by any stretch of the imagination (in fact, I think they're much more ethical than nvidia), but they focused on CPUs without making more functions general-purpose; they didn't care to make software or to offer a fully programmable alternative to the Voodoo2 in '98. The i740 (or whatever their graphics processor out in '98 was) was competent considering how terrible the Riva 128 and its drivers were.

But Intel either supported the red tape or never really cut through it, so they didn't make many original breakthrough products after their first four or so engineers made the first microprocessor a reality in the '70s or early '80s. You're good though. :)
 

BFG10K

Lifer
Aug 14, 2000
Because of backwards compatibility issues and the fact that I am used to Windows. The Windows 7 GUI is very good and Win 7 x64 is adequate to good overall, and I guess I can't complain too much given that I didn't pay more than $35 for my Windows 7 x64 key.
I thought you didn't believe in Windows, DirectX, patents, or IP?
I thought Microsoft was "crippling" PC gaming?
I thought OpenGL, free and open-source was the way to go (AKA Linux)?

There's a dictionary definition for what you do: http://www.merriam-webster.com/dictionary/hypocrite

hyp·o·crite (noun): a person who claims or pretends to have certain beliefs about what is right but who behaves in a way that disagrees with those beliefs

I'd also have to get another storage drive if I wanted Linux.
You're too destitute to afford a $50 HDD, but you can afford a 780 Ti?

Also, nvidia's drivers for Linux are still not that good, I can be sure of that. Open-source drivers for nvidia would take a lot of time to get off the ground. For example, the lossy depth optimization bugs me.
There is no "lossy depth optimization"; it's merely a figment of your imagination.

It failed because it wasn't parallel enough, and because it used the x86 instruction set without enough appropriate extensions and without a cache architecture suited to graphics performance.
Seriously, just stop it.
 

Anarchist420

Diamond Member
Feb 13, 2010
You're too destitute to afford a $50 HDD, but you can afford a 780 Ti?
I don't really have much room left in my case, and I don't have many SATA power connectors left; I already have 4 SATA drives. And I doubt I would get a $50 hard drive.
I thought you didn't believe in Windows, DirectX, patents, or IP? I thought Microsoft was "crippling" PC gaming? I thought OpenGL, free and open-source was the way to go (AKA Linux)?
It wouldn't be that way if patents had been eliminated 5 years ago.

There is no "lossy depth optimization"; it's merely a figment of your imagination.
Then why was there z-fighting in the Unigine benchmark? And how did nvidia keep increasing Kepler's performance without reducing the bandwidth used?

It has been applied so inconsistently, and they have caused enough deceit, that they won't offer a working option for depth calculation to be the same as it was in R275. It is just like how they took away the old anisotropic filtering method and the pre-render zero frames ahead setting; fortunately someone added the old AF back with some lines in the INF (at least I believe that's how he did it), but I can't switch it off for nGlide games where it doesn't work. I will have to look at that INF file and compare it to the driver I am now using.
Seriously, just stop it.
I can't help it; software is far more versatile, and hardware can't always be better.
 

Cerb

Elite Member
Aug 26, 2000
It failed because it wasn't parallel enough
Then nothing can be parallel enough. GPUs are not like CPUs, and no current GPU can match the parallelism of, say, a Xeon Phi or a Tilera SoC (if comparing apples to apples; they are not multicore processors and cannot do all the same things, though they share many traits with them).
and because it used the x86 instruction set without enough appropriate extensions and without a cache architecture suited to graphics performance.
It had the extensions. Cache was a problem, but AMD and nVidia are getting very close to the sort of cache rules used by ISAs like x86 (the difference is they aren't making multicore CPUs). Cache coherence, strong memory ordering rules, and virtual memory are, going into the future, necessities. Note as well that the current Phis are going with Atom-based cores, so going with the Pentium was surely not the best idea, though it probably seemed good at the time, given the area and power constraints.

Also, it used hardware texturing
Which will not go away until textures do.
and if I am not mistaken it didn't have IEEE 754-compliant double-precision floating point.
Intel's x86 FP has always, TMK, been the very definition of IEEE-754. For graphics, though, you would not want that, and instead would be better off with implementations lacking all the special case handling, like ARM usually does.
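As a rough illustration of what dropping that special-case handling looks like on the CPU side (just a sketch, not anything specific to Larrabee or ARM), x86 code can opt out of IEEE-754 denormal handling through the SSE control register:

```c
/* Sketch: opting out of IEEE-754 denormal handling on x86 -- the kind of
 * special-case shortcut graphics-oriented FP hardware typically takes by
 * default. Uses the SSE control/status register (MXCSR); needs SSE/SSE3. */
#include <xmmintrin.h>  /* _MM_SET_FLUSH_ZERO_MODE */
#include <pmmintrin.h>  /* _MM_SET_DENORMALS_ZERO_MODE */

static void use_fast_graphics_float_mode(void)
{
    /* Results that would be denormal are flushed to zero... */
    _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);
    /* ...and denormal inputs are treated as zero. Not IEEE-754 compliant,
     * but it avoids the slow paths that denormals can trigger. */
    _MM_SET_DENORMALS_ZERO_MODE(_MM_DENORMALS_ZERO_ON);
}
```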
Integer performance had to have been weak too, given 1 GHz cores based on a Pentium 1 derivative. It had to have had poor IPC no matter how many cores there were, and less IPC doesn't allow as much data throughput.
Low IPC still allows tons of data throughput, just not for purely scalar, single-threaded work. A VLIW5 Radeon HD, for instance, could process up to 80 pieces of input data per instruction, where a typical scalar instruction allows at most 2 (plus, that same instruction was being run on many more SIMD processing units as well, if the work wasn't done). Larrabee and its derivatives had, and have, wide SIMD units to do something similar.
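To make that concrete, here's a minimal sketch in C using AVX intrinsics (just an illustration on a present-day x86 CPU with AVX, not Larrabee's actual LRBni instruction set): one arithmetic instruction touches 8 floats at a time, so even a core with modest IPC moves a lot of data.

```c
/* One AVX instruction operates on 8 floats at once, so low instruction
 * throughput can still mean high data throughput. Assumes AVX support. */
#include <immintrin.h>
#include <stddef.h>

/* Scalar version: one multiply and one add per element. */
void scale_bias_scalar(float *dst, const float *src, float scale, float bias, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] = src[i] * scale + bias;
}

/* AVX version: each arithmetic instruction processes 8 floats. */
void scale_bias_avx(float *dst, const float *src, float scale, float bias, size_t n)
{
    __m256 vscale = _mm256_set1_ps(scale);
    __m256 vbias  = _mm256_set1_ps(bias);
    size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        __m256 v = _mm256_loadu_ps(src + i);
        v = _mm256_add_ps(_mm256_mul_ps(v, vscale), vbias);
        _mm256_storeu_ps(dst + i, v);
    }
    for (; i < n; ++i)                 /* scalar tail for leftover elements */
        dst[i] = src[i] * scale + bias;
}
```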

If nvidia and ARM had had Intel's fab advantage for the past 5 years, then fixed-function rasterization would've been replaced with fully programmable rendering a year ago.
No, they wouldn't. Intel's fabs give them maybe 50% better all-around performance at any given time, it seems, and more when it comes to density. To go to full software rendering again, you'd need that advantage to be closer to a factor of 100.
 

Cerb

Elite Member
Aug 26, 2000
Then why was there z-fighting in the Unigine benchmark? And how did nvidia keep increasing Kepler's performance without reducing the bandwidth used?
Caches, and lossless compression. Z-fighting is an engine thing, not a GPU thing, these days. In DX9 they could hack around it, but usually didn't. In DX11+ they can eliminate it, if they want to take the time.
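One common DX10/11-era technique for that, sketched below purely as an illustration (not a claim about what Unigine or any particular engine does), is a reversed-Z projection paired with a 32-bit float depth buffer, a GREATER depth test, and a depth clear value of 0.0:

```c
/* Sketch: a reversed-Z, infinite-far perspective projection.
 * Convention assumed: column vectors (clip = M * view), row-major storage,
 * clip-space z in [0,1], view space looking down +z. Pair with a 32-bit
 * float depth buffer, depth clear of 0.0, and a GREATER depth test. */
#include <math.h>

void reversed_z_infinite_projection(float m[4][4],
                                    float fovy_radians,
                                    float aspect,
                                    float znear)
{
    const float f = 1.0f / tanf(fovy_radians * 0.5f);

    for (int r = 0; r < 4; ++r)
        for (int c = 0; c < 4; ++c)
            m[r][c] = 0.0f;

    m[0][0] = f / aspect;   /* x scale */
    m[1][1] = f;            /* y scale */
    m[2][2] = 0.0f;         /* stored depth becomes znear / view_z ... */
    m[2][3] = znear;        /* ...so the near plane maps to 1.0 */
    m[3][2] = 1.0f;         /* w_clip = view-space z */
}
```

Stored that way, float depth precision is spread far more evenly over distance instead of being bunched up near the camera, which is what stops distant surfaces from flickering against each other.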

It has been applied so inconsistently, and they have caused enough deceit, that they won't offer a working option for depth calculation to be the same as it was in R275. It is just like how they took away the old anisotropic filtering method and the pre-render zero frames ahead setting
A pre-render limit of 0 should break games, or simply not get used by those it would break; those it wouldn't break likely do it already. TBH, I can't think of a way to use it reasonably on anything but ancient computers, where the code could actually change the video output while it was being drawn on screen.

Current AF is fine at high levels, and cheap at high levels. If they did it the same way as back then, it might still chew up 30% or more of your FPS. Same as how 4xMSAA today doesn't look as good as 4xMSAA from 10 years ago, but may as well be free. There are still higher options for better quality, though, so it's not that big of a deal (though SGSSAA should be a control panel option, not something that needs 3rd-party tools to enable).
 

Anarchist420

Diamond Member
Feb 13, 2010
Current AF is fine at high levels, and cheap at high levels. If they did it the same way as back then, it might still chew up 30% or more of your FPS. Same as how 4xMSAA today doesn't look as good as 4xMSAA from 10 years ago, but may as well be free. There are still higher options for better quality, though, so it's not that big of a deal (though SGSSAA should be a control panel option, not something that needs 3rd-party tools to enable).
Thanks, Cerb. :) But do you see the issue with nvidia not allowing the end user to choose? I mean, I couldn't give a flying d*** whether I am getting 60 fps or 120 fps in UT99.

Then nothing can be parallel enough. GPUs are not like CPUs, and no current GPU can match the parallelism of, say, a Xeon Phi or a Tilera SoC (if comparing apples to apples; they are not multicore processors and cannot do all the same things, though they share many traits with them).
Is Xeon Phi more parallel than Larrabee?
It had the extensions. Cache was a problem, but AMD and nVidia are getting very close to the sort of cache rules used by ISAs like x86 (the difference is they aren't making multicore CPUs). Cache coherence, strong memory ordering rules, and virtual memory are, going into the future, necessities. Note as well that the current Phis are going with Atom-based cores, so going with the Pentium was surely not the best idea, though it probably seemed good at the time, given the area and power constraints.
Well then they can always try again with better cache architecture.
Which will not go away until textures do.
Thank you for setting that straight.:)
Intel's x86 FP has always, TMK, been the very definition of IEEE-754. For graphics, though, you would not want that, and instead would be better off with implementations lacking all the special case handling, like ARM usually does.
I thought I read that Larrabee's FPUs couldn't do FP64. Perhaps I was mistaken.
Low IPC still allows tons of data throughput, just not for purely scalar, single-threaded work. A VLIW5 Radeon HD, for instance, could process up to 80 pieces of input data per instruction, where a typical scalar instruction allows at most 2 (plus, that same instruction was being run on many more SIMD processing units as well, if the work wasn't done). Larrabee and its derivatives had, and have, wide SIMD units to do something similar.
Thank you for setting me straight:)

No, they wouldn't. Intel's fabs give them maybe 50% better all-around performance at any given time, it seems, and more when it comes to density. To go to full software rendering again, you'd need that advantage to be closer to a factor of 100.
I don't know how you can quantify that. You could be right, but I don't see it, and I am sorry that I don't. And please remember, hardware rendering is a balance barely chosen by the end user (maybe 20%), somewhat chosen by the game devs (30%), but at least half chosen by the IHV (50%). Software rendering, however, in the long run definitely shifts the balance of power and choice to game programmers and end users. Given that devs are boxed in at least somewhat, and given all we pay nvidia and Intel for what they render to us, it's ridiculous that end users and software devs don't have the vast majority of the choice, and that potential is untapped.
iGPUs in PCs aren't up to snuff for many gamers and Maxwell doesn't really break a whole hell of a lot of new ground.

Software rendering can look as lousy and be as fast as anyone could wish for, or just the opposite, or anywhere in between. In fact, due to lack of fixed function, it could be faster with even lower quality.

I mean, we could have individuals like Richard Stallman do more for humanity and be the best software testers if we didn't have software patents; he would be a lot happier if we didn't have them. Doesn't DX make programming easy due to the structure? Don't we have more career programmers rather than the multi-specialists we used to have? Isn't it good to have more multi-specialists who can make the very best without structure? I thought Sega's programmers like Yu Suzuki made more cult classics and franchises than Shigeru Miyamoto did. Chaotic management is good, and neither Intel nor nvidia have that.
 

Cerb

Elite Member
Aug 26, 2000
Is Xeon Phi more parallel than Larrabee?
Yes, for what little difference it would make (typical GPUs have hundreds to thousands of SIMD units, while x86 is only now getting to around 50).
Well then they can always try again with better cache architecture.
Maybe, but not much better. The GPU guys are getting closer to the way CPUs have done it, rather than the other way around (in mobile, they've been there for ages, since it's all IGP). Ultimately, Intel went with better CPUs.

I thought I read that Larrabee's FPUs couldn't do FP64. Perhaps I was mistaken.
The SIMD units couldn't do 64-bit, I don't think. At the time, no GPUs could, either, TMK.

Software rendering, however, in the long run definitely shifts the balance of power and choice to game programmers and end users. Given that devs are boxed in at least somewhat, and given all we pay nvidia and Intel for what they render to us, it's ridiculous that end users and software devs don't have the vast majority of the choice, and that potential is untapped.
Processing textures, polygons, etc., with generic software methods, even using SIMD, takes massive amounts of resources compared to a dedicated processing unit. It's not a matter of flexibility when the difference is going to be orders of magnitude. A dedicated processing unit doesn't have to decode instructions, manage registers, manage pipelines, manage data ports, etc. It gets externally configured, and when it receives data, it does what it was told to do. These days, even weak CPUs have very little space taken up by the actual execution of instructions.
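To put rough numbers behind that, here is what a single bilinear texture fetch looks like in generic C (a minimal sketch assuming a simple row-major RGBA8 texture, no mipmaps or wrap modes); a GPU's texture unit does this, plus addressing, format decode, and its own caching, in fixed hardware:

```c
/* Sketch of one bilinear texture sample in plain C -- the work a texture
 * unit does in fixed hardware per fetch, essentially for free.
 * Assumes an RGBA8 texture stored row-major; no mipmaps, no wrap modes. */
#include <stdint.h>
#include <math.h>

typedef struct { const uint8_t *texels; int width, height; } Texture;

static void fetch_texel(const Texture *t, int x, int y, float rgba[4])
{
    if (x < 0) x = 0;                       /* clamp to the texture edges */
    if (x >= t->width)  x = t->width  - 1;
    if (y < 0) y = 0;
    if (y >= t->height) y = t->height - 1;
    const uint8_t *p = t->texels + 4 * (y * t->width + x);
    for (int i = 0; i < 4; ++i)
        rgba[i] = p[i] * (1.0f / 255.0f);   /* convert to normalized float */
}

/* u,v in [0,1]; out = bilinearly filtered RGBA. */
void sample_bilinear(const Texture *t, float u, float v, float out[4])
{
    float x = u * t->width  - 0.5f;
    float y = v * t->height - 0.5f;
    int   x0 = (int)floorf(x), y0 = (int)floorf(y);
    float fx = x - x0,          fy = y - y0;

    float c00[4], c10[4], c01[4], c11[4];
    fetch_texel(t, x0,     y0,     c00);
    fetch_texel(t, x0 + 1, y0,     c10);
    fetch_texel(t, x0,     y0 + 1, c01);
    fetch_texel(t, x0 + 1, y0 + 1, c11);

    for (int i = 0; i < 4; ++i) {           /* lerp across x, then across y */
        float top = c00[i] + fx * (c10[i] - c00[i]);
        float bot = c01[i] + fx * (c11[i] - c01[i]);
        out[i] = top + fy * (bot - top);
    }
}
```

And that's before mipmap selection, anisotropic filtering, or compressed-format decode, each of which multiplies the per-fetch work.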

Light and shadow might get a lot from many CPU cores, as good shadows are not simple, but w/ DX11, they seem to be working around that pretty well.

iGPUs in PCs aren't up to snuff for many gamers and Maxwell doesn't really break a whole hell of a lot of new ground.
Maxwell basically catches up to AMD in power management, while being pretty power-efficient on its own, and coming with nice amounts of VRAM, for the price. Ground-breaking it is not.

Software rendering can look as lousy and be as fast as anyone could wish for, or just the opposite, or anywhere in between. In fact, due to lack of fixed function, it could be faster with even lower quality.
If and only if you have a fast enough CPU. CPUs haven't kept up, and are now improving by maybe 15% per year, if not less, depending on what you do with them. Adding more of them turns the problem from one of per-core performance to one of memory performance in an embarrassingly parallel workload, and in the process necessarily lowers single-threaded performance. All the while, a CPU, having to execute generic code, is going to take a ton more space and power per pixel processed than dedicated hardware will.
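A back-of-the-envelope sketch, using assumed ballpark figures rather than measurements, shows the scale of the gap:

```c
/* Rough back-of-envelope with assumed figures (not measurements) for why
 * per-pixel software shading strains a CPU of this era. */
#include <stdio.h>

int main(void)
{
    const double pixels_per_sec  = 1920.0 * 1080.0 * 60.0; /* 1080p @ 60 Hz */
    const double flops_per_pixel = 2000.0;  /* assumed: modern-looking shading */
    const double overdraw        = 2.0;     /* assumed average overdraw factor */

    /* Assumed peak for a 4-core AVX2 CPU @ 3.5 GHz:
       4 cores * 2 FMA ports * 8 floats * 2 flops = 448 GFLOPS theoretical. */
    const double cpu_peak_gflops = 448.0;
    const double cpu_sustained   = cpu_peak_gflops * 0.3;  /* assumed efficiency */

    double needed_gflops = pixels_per_sec * flops_per_pixel * overdraw / 1e9;
    printf("Shading alone: ~%.0f GFLOPS needed vs ~%.0f GFLOPS sustained\n",
           needed_gflops, cpu_sustained);
    /* ...and this ignores rasterization, texturing, and memory bandwidth.
       A 780 Ti-class GPU is roughly 5000 GFLOPS, with fixed-function
       texture and raster hardware on top of that. */
    return 0;
}
```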
 

Anarchist420

Diamond Member
Feb 13, 2010
If and only if you have a fast enough CPU. CPUs haven't kept up, and are now improving by maybe 15% per year, if not less, depending on what you do with them. Adding more of them turns the problem from one of per-core performance to one of memory performance in an embarrassingly parallel workload, and in the process necessarily lowers single-threaded performance. All the while, a CPU, having to execute generic code, is going to take a ton more space and power per pixel processed than dedicated hardware will.
Programmable rendering is what I meant; perhaps all processing on one die is a bad idea. I am sorry for any confusion or if I seemed rude.
Yes, for what little difference it would make (typical GPUs have hundreds to thousands of SIMD units, while x86 is only now getting to around 50).
Well then, maybe CUDA or some type of GPU compute core could be invented, or changed and improved enough, to replace fixed-function elements.

I don't see how fixed-function hardware audio was done away with yet fixed-function graphics never could be. PhysX is all programmable.
 

NTMBK

Lifer
Nov 14, 2011
I don't see how fixed-function hardware audio was done away with yet fixed-function graphics never could be.

It's because audio is much less computationally demanding. It would still be more efficient to do audio in hardware, but nobody needs the extra performance, and it would only save a couple of watts.
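For a sense of scale (a minimal sketch with assumed figures), mixing a few dozen 48 kHz stereo voices in software costs on the order of a few million multiply-adds per second:

```c
/* Sketch: mixing N voices of 48 kHz stereo float audio in software.
 * 32 voices * 48,000 frames * 2 channels is roughly 3 million multiply-adds
 * per second -- noise compared with the billions of per-pixel operations a
 * second of 3D rendering needs, which is why fixed-function audio died. */
#include <stddef.h>

#define CHANNELS 2

void mix_voices(float *out, const float *const *voices, const float *gains,
                size_t num_voices, size_t num_frames)
{
    for (size_t i = 0; i < num_frames * CHANNELS; ++i) {
        float acc = 0.0f;
        for (size_t v = 0; v < num_voices; ++v)
            acc += gains[v] * voices[v][i];   /* one multiply-add per voice */
        out[i] = acc;
    }
}
```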
 

Rakehellion

Lifer
Jan 15, 2013
Anarchist420 makes yet another thread about SGSSAA and consoles.

Answer: It's slow as hell and isn't worth the effort. You don't even see it on desktops.
 

Anarchist420

Diamond Member
Feb 13, 2010
It's because audio is much less computationally demanding. It would still be more efficient to do audio in hardware, but nobody needs the extra performance, and it would only save a couple of watts.
Well, we don't use Intel CPUs with file decompression hardware. So do you honestly believe that we will never see full software rendering win out over fixed function?

Anarchist420 makes yet another thread about SGSSAA and consoles. Answer: It's slow as hell and isn't worth the effort. You don't even see it on desktops.
SGSSAA isn't "slow as hell", and I most certainly did see proper sparse-grid SSAA in Tomb Raider 2013 for PC. A huge problem with the consoles is the target resolution, the target frame rate, and the stability mandates for frame rate. Basically, the new consoles are trying for a smooth frame rate without adaptive sync (or G-Sync), without decent AA, without much more than 900p, and without much more than 30 fps; they basically give well less than half of what you get on the PC.
And that's understandable given the costs.

But then PCs could do even more if nvidia wasn't so concerned about getting Microsoft's approval. They ought to just team up with Linux and help each other violate Microsoft's patents so Linux and nvidia can win users over to Linux; it would have to be done rapidly, though, and not secretly. nvidia's prices could be a lot lower too, and they should be, given all the intellectual monopoly, waste recycling, and reforming they do.

All that's more than possible given the assets nvidia has. Look at how incompetent Microsoft's management is. I hope JHH can do better than that; if he can't, then he needs to retire for the good of the company he helped found.
 

nurturedhate

Golden Member
Aug 27, 2011
They ought to just team up with Linux and help each other violate Microsoft's patents so Linux and nvidia can win users over to Linux; it would have to be done rapidly, though, and not secretly. nvidia's prices could be a lot lower too, and they should be, given all the intellectual monopoly, waste recycling, and reforming they do.


Umm, what? Really?
 

f1sherman

Platinum Member
Apr 5, 2011
SGSSAA isn't "slow as hell", and I most certainly did see proper sparse-grid SSAA in Tomb Raider 2013 for PC.

You have better than that in Splinter Cell Blacklist - rotated grid

[Attached screenshot: screenshot_3.jpg]
 

bystander36

Diamond Member
Apr 1, 2013
When all else fails, there is always downsampling, which works with all games (assuming they support higher resolutions). The only problem I have is that it can add a bit of latency, but for the quality of AA, it is pretty efficient.
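For anyone curious what downsampling boils down to, here is a minimal sketch (assuming a 2x2 factor and an RGB float framebuffer; driver features such as DSR/VSR use fancier, gamma-aware filters): render at a multiple of the display resolution, then average each block of samples down to one output pixel.

```c
/* Sketch of 2x2 downsampling: average each 2x2 block of a supersampled
 * RGB float framebuffer into one output pixel (ordered-grid SSAA). */
#include <stddef.h>

void downsample_2x2(float *dst, const float *src, int dst_w, int dst_h)
{
    const int src_w = dst_w * 2;            /* source rendered at 2x width/height */
    for (int y = 0; y < dst_h; ++y) {
        for (int x = 0; x < dst_w; ++x) {
            const int sx = x * 2, sy = y * 2;
            for (int c = 0; c < 3; ++c) {   /* R, G, B */
                float sum =
                    src[3 * ((sy    ) * src_w + sx    ) + c] +
                    src[3 * ((sy    ) * src_w + sx + 1) + c] +
                    src[3 * ((sy + 1) * src_w + sx    ) + c] +
                    src[3 * ((sy + 1) * src_w + sx + 1) + c];
                dst[3 * (y * dst_w + x) + c] = sum * 0.25f;  /* box filter */
            }
        }
    }
}
```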