Leo DirectX forward plus rendering lighting

BFG10K

Lifer
Aug 14, 2000
22,709
3,005
126
This needs more fanfare. It allows the performance benefit of deferred rendering while retaining full compatibility with hardware MSAA. If game engines used this, we wouldn’t need post-filtering AA.
 

thilanliyan

Lifer
Jun 21, 2005
12,065
2,278
126
Would everything work properly with Nvidia cards?

If some of these features only work on AMD cards, I think devs would be less likely to use it.
 

Dark Shroud

Golden Member
Mar 26, 2010
1,576
1
0
Would everything work properly with Nvidia cards?

If some of these features only work on AMD cards, I think devs would be less likely to use it.

The main part uses the DX11 API for DirectCompute, so it should work on Nvidia hardware. How well it works on the GTX 600 series is a different matter.
 

Arkadrel

Diamond Member
Oct 19, 2010
3,681
2
0
From Revdarian @hardforums:

Well, there are two main methods to render 3D games. Forward Rendering was the original, old-school one; Deferred Rendering was the new kid on the block. Each one had its pros and cons, and the main pro of deferred rendering was that it took a much smaller performance hit when dealing with multiple light sources and was a relatively simple method; because of that, Deferred Rendering eventually became the "go-to" method for most major rendering engines.

The problem is that D.R. brings its own list of cons: 1) a heavier performance hit when handling multiple materials, and 2) because it usually discards the geometry data, it can't really apply proper multisample antialiasing. The solution for 1) chosen by most devs was "OK, we won't use multiple materials! I mean, texturing alone is good enough, right?" The solution for 2) was "we will create a special buffer, let's call it a G-buffer, and we will store geometry data and some other useful things there!" Problem is that the G-buffer eats memory like nothing, and if you try to use different materials, each material makes the G-buffer even bigger (on top of the performance hit).
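
To put a rough number on "eats memory like nothing", here's a back-of-the-envelope calculation for a hypothetical but fairly typical G-buffer layout; the exact layout varies per engine, so treat the figures as illustrative only.

# Rough G-buffer footprint at 1080p, assuming a hypothetical but typical layout of
# four RGBA16F render targets (albedo, normals, specular, misc) plus a 32-bit depth buffer.
width, height = 1920, 1080
render_targets = 4
bytes_per_texel = 4 * 2        # RGBA16F = 4 channels x 2 bytes
depth_bytes = 4                # e.g. D24S8

bytes_per_pixel = render_targets * bytes_per_texel + depth_bytes   # 36 bytes
gbuffer_mib = width * height * bytes_per_pixel / 2**20
print(round(gbuffer_mib), "MiB")                                   # ~71 MiB
print(round(4 * gbuffer_mib), "MiB with 4x MSAA (every sample stored)")  # ~285 MiB

Add another render target for extra material parameters and the figure grows again, which is the "each material makes the G-buffer even bigger" part.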

For AMD this D.R. became a royal pain in the ass, since their GPU designs were made with proper MSAA in mind, and this workaround has made them take significantly bigger performance hits than Nvidia's hardware does; so they have been hard at work until they arrived at this solution.

The solution they found is to run a compute shader to apply the lighting to the forward-rendered image, instead of the usual "render everything once for each light source in the scene!" approach. This way they save a great number of passes, they save memory by not needing the G-buffer (the geometry is always present in a forward renderer instead of being discarded), they can use the proper MSAA built into their original GPU design, and multiple materials can be used without the big performance & memory hit of D.R.; all it takes is compute time for the new shader.

So, to a game dev this demo should be interesting for the performance achieved with an FR engine while dealing with multiple light sources, the lack of a noticeable performance and memory usage hit from the multiple materials, and the use of "proper" MSAA.
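
Conceptually, the light-culling pass plus the Forward+ shading loop boil down to something like this toy CPU-side Python sketch. To be clear, this is just my own illustration, not AMD's actual DirectCompute shader: the tile size, the pre-projected screen-space lights and the linear falloff are all simplifying assumptions.

# Toy CPU-side sketch of tiled light culling plus the Forward+ shading loop.
# Lights are assumed to be already projected to screen space as
# (x, y, radius_in_pixels, (r, g, b)); the real version culls against each
# tile's view-space frustum in a compute shader.

TILE = 16  # pixels per tile side (illustrative)

def build_tile_light_lists(lights, width, height):
    tiles_x = (width + TILE - 1) // TILE
    tiles_y = (height + TILE - 1) // TILE
    tile_lights = [[] for _ in range(tiles_x * tiles_y)]
    for li, (lx, ly, radius, _colour) in enumerate(lights):
        # Conservative screen-space bounds of the light, clamped to the tile grid.
        x0 = max(int(lx - radius) // TILE, 0)
        x1 = min(int(lx + radius) // TILE, tiles_x - 1)
        y0 = max(int(ly - radius) // TILE, 0)
        y1 = min(int(ly + radius) // TILE, tiles_y - 1)
        for ty in range(y0, y1 + 1):
            for tx in range(x0, x1 + 1):
                tile_lights[ty * tiles_x + tx].append(li)
    return tile_lights, tiles_x

def shade_pixel(px, py, lights, tile_lights, tiles_x):
    # The forward shader only loops over the lights binned into this pixel's tile,
    # instead of over every light in the scene (or one full pass per light).
    tile = (py // TILE) * tiles_x + (px // TILE)
    r = g = b = 0.0
    for li in tile_lights[tile]:
        lx, ly, radius, (cr, cg, cb) = lights[li]
        d = ((px - lx) ** 2 + (py - ly) ** 2) ** 0.5
        falloff = max(0.0, 1.0 - d / radius)  # crude linear falloff, illustration only
        r += cr * falloff
        g += cg * falloff
        b += cb * falloff
    return (r, g, b)

lights = [(100, 120, 80, (1.0, 0.5, 0.2)), (900, 500, 150, (0.2, 0.4, 1.0))]
tile_lights, tiles_x = build_tile_light_lists(lights, 1280, 720)
print(shade_pixel(110, 130, lights, tile_lights, tiles_x))

The real version does the culling per tile on the GPU, but the shape of the algorithm is the same: bin the lights per tile once, then each pixel only loops over its tile's short list.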
----------------Deferred Shading (http://en.wikipedia.org/wiki/Deferred_shading):
Pros:
1) The decoupling of scene geometry from lighting. Only one geometry pass is required, and each light is only computed for those pixels that it actually affects. This gives the ability to render many lights in a scene without a significant performance hit. (A toy sketch of this idea follows the list below.)

Cons:
1) Inability to handle transparency within the algorithm, although this problem is a generic one in Z-buffered scenes and it tends to be handled by delaying and sorting the rendering of transparent portions of the scene. "Depth peeling" can be used to achieve order-independent transparency in deferred rendering, but at the cost of additional batches and G-buffer size. Modern hardware, supporting DirectX 10 and later, is often capable of performing batches fast enough to maintain interactive frame rates.

2) Difficulty with using multiple materials. It's possible to use many different materials, but it requires more data to be stored in the G-buffer, which is already quite large and eats up a large amount of the memory bandwidth.

3) Due to separating the lighting stage from the geometry stage, hardware anti-aliasing no longer produces correct results. One of the usual techniques to overcome this limitation is using edge detection on the final image and then applying blur over the edges, although recently more advanced post-process edge-smoothing techniques have been developed, such as MLAA, FXAA, SRAA, DLAA, and post MSAA.
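
To make that first pro concrete, and for contrast with the Forward+ sketch above, here is the same idea in toy form: rasterize the geometry once into a G-buffer, then accumulate lighting per light by reading only the G-buffer. The G-buffer fields and the attenuation are made-up simplifications; a real engine draws light bounding volumes so each light only touches the pixels it covers, which the distance check below stands in for.

# Toy deferred lighting pass: geometry was rasterized once into a G-buffer
# (per-pixel position, normal, albedo); lighting is then accumulated per light
# by reading only the G-buffer, never touching the scene geometry again.

def dot(a, b): return sum(x * y for x, y in zip(a, b))
def sub(a, b): return tuple(x - y for x, y in zip(a, b))
def length(v): return dot(v, v) ** 0.5
def normalize(v):
    l = length(v) or 1.0
    return tuple(x / l for x in v)

def deferred_lighting(gbuffer, lights):
    # gbuffer: one dict per pixel with "pos", "normal", "albedo" (made-up fields).
    out = [(0.0, 0.0, 0.0)] * len(gbuffer)
    for light_pos, light_colour, light_radius in lights:
        for i, px in enumerate(gbuffer):
            to_light = sub(light_pos, px["pos"])
            d = length(to_light)
            if d > light_radius:
                continue  # stand-in for drawing the light's bounding volume
            n_dot_l = max(0.0, dot(px["normal"], normalize(to_light)))
            att = 1.0 - d / light_radius
            out[i] = tuple(o + a * c * n_dot_l * att
                           for o, a, c in zip(out[i], px["albedo"], light_colour))
    return out

gbuffer = [{"pos": (0, 0, 0), "normal": (0, 1, 0), "albedo": (0.8, 0.8, 0.8)}]
lights = [((0, 2, 0), (1.0, 0.9, 0.7), 5.0)]
print(deferred_lighting(gbuffer, lights))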




It might be time for a comeback for forward rendering, if it means much higher performance while requiring less memory bandwidth than deferred shading does.

Memory bandwidth / memory size is always going to be an issue, and it's only going to grow.
It's easier to throw a compute shader (GPGPU) at a problem than it is to just magically get extra memory bandwidth / more memory.

This is especially true for mobile devices, like laptops that still come with a single stick of DDR3 memory and make use of IGPs.
Since people are preaching about "tablets/nettops/smartphones" being the future, it seems like it's time for a forward rendering comeback (maybe?).
 
Last edited:

Red Hawk

Diamond Member
Jan 1, 2011
3,266
169
106
Great demo, very informative. I hope these techniques catch on, but I fear they won't until the next console generation. Deferred rendering is the king of console graphics engines right now, and rebuilding a game's rendering engine from deferred to forward+ just for a PC port would take a lot of development resources. We might get it from a PC-exclusive strategy game or an AMD Gaming Evolved title that AMD really, really pushes for.

I do wonder though if any games already have a form of this rendering tech. Total War: Shogun 2 only supports MSAA in its DirectX 11 renderer, and it's both a Gaming Evolved title and a PC exclusive strategy game. Perhaps AMD helped to implement an early form of this tech there.
 

piesquared

Golden Member
Oct 16, 2006
1,651
473
136
Great demo, very informative. I hope these techniques catch on, but I fear they won't until the next console generation. Deferred rendering is the king of console graphics engines right now, and rebuilding a game's rendering engine from deferred to forward+ just for a PC port would take a lot of development resources. We might get it from a PC-exclusive strategy game or an AMD Gaming Evolved title that AMD really, really pushes for.

I do wonder though if any games already have a form of this rendering tech. Total War: Shogun 2 only supports MSAA in its DirectX 11 renderer, and it's both a Gaming Evolved title and a PC exclusive strategy game. Perhaps AMD helped to implement an early form of this tech there.

There are a few Gaming Evolved titles out that use Forward+ rendering techniques, like DiRT Showdown and Sniper Elite. There is a list on AMD's website.
 
Last edited:

PrincessFrosty

Platinum Member
Feb 13, 2008
2,300
68
91
www.frostyhacks.blogspot.com
Just seeing if I have this right: with the increased capacity for additional lighting, one possible real-world use is to sample multiple points in the scene where a spotlight is landing, and then use the colour there to create new light sources back into the scene, giving a subtle bounced-hue effect.
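
Something along those lines, yes. As a rough sketch of the idea (my own pseudocode-flavoured Python, nothing taken from the demo; the engine hooks and the numbers are purely hypothetical):

import random

# Rough sketch of the "bounce hue" idea: sample a few points inside the spotlight's
# footprint, read the surface colour there, and spawn dim point lights of that colour.
# trace_spotlight_sample() and surface_albedo_at() are hypothetical engine hooks,
# and the sample count / strength are made-up numbers.

def spawn_bounce_lights(spotlight, trace_spotlight_sample, surface_albedo_at,
                        num_samples=8, bounce_strength=0.15):
    bounce_lights = []
    for _ in range(num_samples):
        hit_point = trace_spotlight_sample(spotlight, random.random(), random.random())
        if hit_point is None:
            continue  # this sample ray left the scene
        albedo = surface_albedo_at(hit_point)
        colour = tuple(c * a * bounce_strength
                       for c, a in zip(spotlight["colour"], albedo))
        # Extra point lights are cheap under Forward+, so a handful per spotlight is fine.
        bounce_lights.append({"pos": hit_point, "colour": colour, "radius": 2.0})
    return bounce_lights

# Toy usage with stand-in hooks: the spotlight hits a reddish floor.
spot = {"colour": (1.0, 0.95, 0.8)}
trace = lambda s, u, v: (u * 4.0, 0.0, v * 4.0)
albedo_at = lambda p: (0.7, 0.3, 0.2)
print(spawn_bounce_lights(spot, trace, albedo_at))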

That's actually really cool. Hopefully we'll see some usable tech demos of this that work on both Nvidia and AMD cards; I'd love to see this in real time.

*edit*

Oh apparently v1.1 works on Nvidia hardware...downloading now and will test tonight.

http://developer.amd.com/samples/demos/pages/AMDRadeonHD7900SeriesGraphicsReal-TimeDemos.aspx
 
Last edited:

djsb

Member
Jun 14, 2011
81
0
61
It works on any DX11-capable card. I remember it looking good (but predictably running like poo) on the Radeon 6850 I had in my machine a few months ago. I admit I'm slightly disappointed by the educational mode in 1.1. I thought it would be something where you could play with dynamic lights rather than it just being a slideshow.
 

thilanliyan

Lifer
Jun 21, 2005
12,065
2,278
126
The main part uses the DX11 API for DirectCompute, so it should work on Nvidia hardware. How well it works on the GTX 600 series is a different matter.

That's good. Hopefully more devs start using it.

HOWEVER, if it is even a bit slower on Nvidia cards, I'm guessing they will not be behind it, making it less likely to take off.
 

Red Hawk

Diamond Member
Jan 1, 2011
3,266
169
106
That's good. Hopefully more devs start using it.

HOWEVER, if it is even a bit slower on Nvidia cards, I'm guessing they will not be behind it, making it less likely to take off.

Ah come on now, vendor-agnostic technology is good for everyone. *glares at PhysX :colbert:*
 

Pottuvoi

Senior member
Apr 16, 2012
416
2
81
There have been advancements in Forward+ techniques since the Leo demo was released.
One of the big advances is in the tiling of light sources; newer methods tile in depth as well as in the X and Y dimensions.
http://www.cse.chalmers.se/~olaolss/main_frame.php?contents=publication&id=clustered_shading

This allows better utilization of GPU power in hard cases, i.e. when a tile spans multiple depth ranges. This is easily visible in the Leo demonstration, where the edge tiles end up with huge numbers of lights.
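
The depth tiling in that clustered shading work usually slices the view frustum logarithmically between the near and far planes. A minimal sketch of how a pixel would map to a 3D cluster index (the tile size, slice count and clip planes below are example values, not the paper's exact setup):

import math

# Map a pixel (x, y) with view-space depth z to a 3D cluster index, using the
# logarithmic depth slicing common to the clustered shading work linked above.
# Tile size, slice count and clip planes are just example values.

TILE = 64          # pixels per cluster in x and y
NUM_SLICES = 16    # depth slices
NEAR, FAR = 0.1, 1000.0

def cluster_index(x, y, z, screen_w, screen_h):
    clusters_x = (screen_w + TILE - 1) // TILE
    clusters_y = (screen_h + TILE - 1) // TILE
    slice_z = int(math.log(z / NEAR) / math.log(FAR / NEAR) * NUM_SLICES)
    slice_z = min(max(slice_z, 0), NUM_SLICES - 1)
    cx, cy = x // TILE, y // TILE
    return (slice_z * clusters_y + cy) * clusters_x + cx

# A foreground pixel and the background right next to it land in different slices,
# so they no longer share one huge light list the way a 2D edge tile does.
print(cluster_index(640, 360, 1.0, 1920, 1080), cluster_index(640, 360, 500.0, 1920, 1080))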

One thing that Forward+ makes very easy compared to a deferred renderer is lighting of transparent surfaces.
http://www.cse.chalmers.se/~olaolss/main_frame.php?contents=publication&id=tiled_clustered_forward_talk

Other Forward+ links
http://aras-p.info/blog/2012/03/27/tiled-forward-shading-links/

Sniper Elite V2 uses Unreal Engine 3; that's about as deferred as they come.
UE3 is a forward renderer with deferred shadows and post-processing.
 
Last edited:

Olikan

Platinum Member
Sep 23, 2011
2,023
275
126
And this is exactly what I was thinking of, and why I hate PhysX so much and hope it remains marginalized.

Forward+ is not locked down like PhysX... it's just Kepler that sucks at GPGPU...

I won't be surprised if Fermi does well here... or the future Maxwell.

It's pretty much the same thing as with tessellation last year.
 

piesquared

Golden Member
Oct 16, 2006
1,651
473
136
There are more examples in Sleeping Dogs of how AMD and game developers are using GCN's compute power to produce cutting-edge effects. The lack of compute power in Kepler is taking its toll on the architecture in a big way.

http://blogs.amd.com/play/2012/08/16/sleeping-dogs-gaming-evolved-and-you/

"Kepler being better at gaming" is such a myth it's laughable. It may produce a couple of extra percent in a select few games, but it loses more than it wins, and by much wider margins. Not sure if it's just NV's bad drivers, but then NV's drivers are infallible, I've heard, so it must be the architecture. Realistically, it's a combination of both, but much more of a problem on the architecture side. Kepler just doesn't have the grunt to produce these leading-edge effects. Now that developers are starting to exploit all the amazing capabilities inside GCN, Kepler will fall further and further behind. There is simply no choice: Kepler doesn't have close to the compute power packed inside Tahiti and the rest of the GCN family. It's quite a chip; reviewers don't give nearly enough credit to what AMD has been doing in advancing gaming.

To me, Kepler seems like a chip with yesterday's features at today's performance, while GCN is a chip with today's and tomorrow's features at the same or better performance level than the competition.
 

Grooveriding

Diamond Member
Dec 25, 2008
9,147
1,330
126
I just picked up Sleeping Dogs and it is kicking my setup's ass. Trying to max it out gives me 20FPS...