"Tilers", the best bet?

BenSkywalker

Diamond Member
Oct 9, 1999
9,140
67
91
A question posed by Soccerman in Dave's current thread led me to post this topic.

"Tilers", such as the PowerVR/Kyro boards are much better at handling what has always been considered the most important aspects of 3D graphics cards. With 3dfx poised to launch their Gigapixel based part in the not too distant future(I'm assuming by the end of 01) I feel that it is fairly safe to say that they will likely dominate in the traditional performance standard, mainly effective fillrate.

I don't think it will matter very much.

Since the launch of the Voodoo1, graphics boards have been dealing with the CPU as their main counterpart, be it the limiting factor in some cases("Crusher" type situations) or a possible boost in others(SIMD in general).

In terms of the graphics boards themselves, memory bandwidth has increasingly been a limiting factor for rasterizers, particularly the current and more than likely upcoming nVidia parts(and likely the RadeonII and Rampage, though not enough is known to be sure).

Now we are at the point in time with AMD and Intel upping the ante on CPUs faster than any developer would have likely predicted eighteen months ago, and we are well into the territory of performance held by Crays and the like at the dawn of 3D PC gaming(circa 1996 with the Voodoo1). This, combined with the current generation of 3D accelerators offloading certain functions from the CPU, with more to follow in the upcoming generation, has rendered CPU speed pretty much a non-factor currently. I suspect that CPUs will continue to outpace gaming advancements, particularly with so many DX8 titles likely to target XBox level hardware.

In terms of the graphics boards themselves, memory bandwidth is definitely rearing its head. Unlike CPUs, which are having tasks offloaded with increasing frequency, memory bandwidth requirements are going up extremely fast, particularly when compared to the relative power increases of rasterizers. The next generation of parts(Rampage, NV20, RadeonII) I assume will all be using at least some sort of primitive HSR, saving them some effective bandwidth. This, combined with MSAA's reduced memory needs, has us very close to hitting a wall that moves very, very slowly... monitors.

Right now, you can buy a board that can push Quake3, still one of the most fillrate intensive games on the market, at 1600x1200 in 32bit color at nearly 60FPS. The next generation should have an edge in terms of actual bandwidth and should also utilize the available bandwidth more efficiently.
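
Just to put rough numbers on that(back-of-the-envelope math of my own, nothing vendor specific), here is what the framebuffer traffic alone looks like at that resolution, before textures or overdraw even enter into it:

/* Rough framebuffer bandwidth estimate -- illustrative only.
   Assumes one Z read, one Z write and one color write per pixel,
   with no overdraw and no texture fetches, which real games blow past easily. */
#include <stdio.h>

int main(void)
{
    const double width  = 1600.0;
    const double height = 1200.0;
    const double fps    = 60.0;
    const double bytes_per_access = 4.0;   /* 32bit color, 32bit Z */
    const double accesses = 3.0;           /* Z read + Z write + color write */

    double bytes_per_frame = width * height * bytes_per_access * accesses;
    double bytes_per_sec   = bytes_per_frame * fps;

    printf("Per frame : %.1f MB\n", bytes_per_frame / (1024.0 * 1024.0));
    printf("Per second: %.2f GB/s (before textures and overdraw)\n",
           bytes_per_sec / (1024.0 * 1024.0 * 1024.0));
    return 0;
}

Pile a few layers of overdraw and a couple of texture fetches per pixel on top of that and it is easy to see why parts with several GB/s of raw bandwidth still end up bandwidth bound.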

How far off are we from 16x12 with 4x FSAA in whatever games are current at the time? I'm sure the GP technology will give us that with plenty to spare, but what do we need more fillrate for?

The first answer to that is more advanced rendering techniques and increased texture passes. We have all heard about Doom3, and some rumblings have it using as many as ten texture passes at once; this will certainly require some serious fillrate, and bandwidth, but how much more than what the then-current offerings will have? I think it is safe to assume that the level of "HSR" and similar techniques(eDRAM) will have progressed by that point, so how much of an edge will an effective 3GTexels of fill be over 2GTexels? Even if we up that to 10GTexels, what good will it do us with 1600x1200 being the limit for the foreseeable future?

Increased FSAA samples? Of course this is definitely a possibility, but FSAA has very quickly diminishing returns once you pass 4x. Telling the difference between 4x and 9x is fairly easy(nothing like 2x and 4x though), 9x to 16x gets a bit tougher, particularly at higher resolutions. From 16x to 32x I am willing to bet you would need a trained eye, even when zooming in on a still, particularly if we are dealing with 1600x1200 resolution anyway.

Is this going to change? Without a major technological breakthrough in monitors it is extremely unlikely. We are going to hit the limits of monitors sooner than many think. Sure, you could go out and pick up a real high end Sony that offers 20xx+ resolution, but that certainly won't be what many, if any, gamers are going to want to do to improve visual quality.

Look to CGI. Gaming has been following several years behind CGI for some time now, and in that area increasing resolution and texture passes isn't the norm, not at all. Look at the difference between some fixed resolution DVDs as an example. Even using two now aging examples, Toy Story and A Bug's Life(you're d@mn straight Robo, what good would a thread be if I didn't bring up TS;)). Viewing them both on DVD at a set resolution, it is extremely clear which has the superior visuals, and neither of them uses ten pass rendering or anything else like it; they use procedural textures.

Procedural textures, a mathematical equation instead of a stored image for texture maps, are one direction that should be looked at. Why go through ten pass texturing when you can calculate all of your desired effects and handle everything in a single pass? Not only can this produce vastly superior visual results, it also saves *considerably* on bandwidth needs(which aren't likely to be too much of a concern by that point).
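
To make that concrete, here is a bare-bones sketch of what I mean(a toy example of my own, not how Renderman or any shipping engine actually does it): instead of fetching texels from a stored map, you evaluate a small function of the surface coordinates, so the "texture" costs math instead of memory bandwidth and never pixelates no matter how close you look.

/* Toy procedural texture: compute a color from (u,v) instead of sampling a bitmap.
   Purely illustrative -- real engines use far better noise and proper filtering. */
#include <math.h>

/* Cheap value-noise stand-in: a hashed sine, good enough for a demo. */
static float noise2d(float x, float y)
{
    float n = sinf(x * 12.9898f + y * 78.233f) * 43758.5453f;
    return n - floorf(n);                  /* fractional part, 0..1 */
}

/* "Marble": bands of sine warped by a few octaves of noise. Returns 0..1. */
float marble(float u, float v)
{
    float turbulence = 0.0f;
    float scale = 1.0f;
    int i;
    for (i = 0; i < 4; i++) {
        turbulence += noise2d(u * scale, v * scale) / scale;
        scale *= 2.0f;
    }
    return 0.5f + 0.5f * sinf(u * 20.0f + turbulence * 5.0f);
}

A single evaluation like that can fold what would otherwise be several blended texture layers into one pass, and the only "storage" is the handful of constants inside the function.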

Another area is a given: increase the d@mn poly counts. Not this little 15%-25% a year BS either. DX8/X-Box should give us some significantly improved game visuals due to target hardware vs development costs. Having a console to cover the dev costs should make developers *ignoring* the average eMachine user a bit more acceptable to the publishers. This is another area that needs to be improved upon significantly. Current T&L solutions have a lot more to offer than what we have today, but by the time we are dealing with GP and NV25 they will be far off the cutting edge. We need poly rates, and real world poly rates, in the hundreds of millions of polys range, and sooner rather than later. This is one area that I am wondering about with the GP technology, but Mr T;) and the guys I'm sure have this covered(and have stated they do, feel free to fill us in on any particulars Dave:p;)).
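
To put a rough figure on what "hundreds of millions of polys" implies(quick math of my own, with made-up but plausible vertex sizes), the geometry traffic alone becomes substantial:

/* Back-of-the-envelope geometry bandwidth at a high poly rate.
   Assumes roughly one vertex fetched per triangle (well stripped/indexed meshes)
   and a 32-byte vertex (position, normal, one set of texture coordinates) --
   both are my assumptions, not anybody's spec. */
#include <stdio.h>

int main(void)
{
    const double tris_per_sec   = 200e6;   /* "hundreds of millions" */
    const double verts_per_tri  = 1.0;
    const double bytes_per_vert = 32.0;

    printf("Geometry traffic: %.1f GB/s, before a single pixel is drawn\n",
           tris_per_sec * verts_per_tri * bytes_per_vert / 1e9);
    return 0;
}

So the poly rates I want are not just a transform problem, they are another real load on memory, which is exactly why I wonder how the GP approach handles heavy geometry.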

After that, we should be looking at real time RayTracing. A few months, even a few weeks ago, if anyone asked me about this I would have said, and did say in at least one thread, that this was several years off. Since then I have learned that at least one application is shipping *this month* with support for real time RayTracing in a 3D environment(I'll let you know how it goes as soon as I get my hands on it). That doesn't lead me to believe it will be reasonable in a game within the next six months, but I am cutting back on the amount of time I think it will take to get this up and running in hardware.
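
For anyone wondering what the per-pixel work actually is, the core of it fits in a dozen lines(textbook ray/sphere math, no relation to whatever that shipping application uses): fire a ray through each pixel and solve a small quadratic per object it might hit.

/* Minimal ray/sphere intersection -- the inner loop of a toy ray tracer.
   Textbook math only; no claim that this matches any shipping implementation. */
#include <math.h>

typedef struct { float x, y, z; } vec3;

static float dot(vec3 a, vec3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }
static vec3  sub(vec3 a, vec3 b) { vec3 r = { a.x-b.x, a.y-b.y, a.z-b.z }; return r; }

/* Returns the distance along the ray to the nearest hit, or -1 if it misses.
   Assumes dir is normalized, so the quadratic's leading coefficient is 1. */
float ray_sphere(vec3 origin, vec3 dir, vec3 center, float radius)
{
    vec3  oc   = sub(origin, center);
    float b    = 2.0f * dot(oc, dir);
    float c    = dot(oc, oc) - radius * radius;
    float disc = b * b - 4.0f * c;
    if (disc < 0.0f)
        return -1.0f;                      /* ray misses the sphere */
    float t = (-b - sqrtf(disc)) * 0.5f;
    return (t > 0.0f) ? t : -1.0f;
}

Do that for every pixel against every object your spatial structure can't reject, plus secondary rays for shadows and reflections, and it is obvious both why this has been offline-only until now and why raw calculation power matters more for it than fillrate.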

In summary, the GP will likely be the fillrate king by a wide margin, but will that be worth much by the time it is in production? nVidia has made it clear in many statements, and by certain hiring practices, where they are going. The Canucks seem to be following the market leader at this point(well, who knows what Matrox is doing), and I think they are more likely to move closer to the above direction than to stick with the fillrate is king mindset(the ArtX acquisition only reinforces this in my mind).

Next fall/winter should be very, very interesting. NV25, Gigapixel, G800 and ATi's offering based on GameCube technology will be fighting it out for supremacy.

The only question in my mind is where will developers go? I am absolute in terms of where I think graphics technology *should* go, and it isn't in keeping with the same fillrate is king mentality that has ruled the 3D graphics card market for so long.

BTW- I'm sure I have dozens of typos in this post... No, I don't feel like fixing them either:)
 

RoboTECH

Platinum Member
Jun 16, 2000
2,034
0
0
Even using two now aging examples, Toy Story and A Bug's Life(you're d@mn straight Robo, what good would a thread be if I didn't bring up TS;)

HA!!! Bastage! gotta have TS in EVERY post.

Having a console to cover the dev costs should make developers *ignoring* the average eMachine user a bit more acceptable to the publishers.

indeed, that is a very good point. I'm looking forward to a game that can totally push the limits of my hardware. :)

Without a major technological breakthrough in monitors it is extremely unlikely.

for the first time the other day, I saw into the future. I sat down (my wife was looking at refrigerators...<yawn>) in the electronics section at Sears, and they had this big ole', 5'x3' HDTV presentation going on.

OH
MY
GAWD

I will pay whatever it takes to play games on that thing. WHATEVER IT TAKES. I'll finance the damn thing, take out a Home Equity-based loan....hell, I'll sell the dogs (well, maybe not, but I'll rent them out for work, heh....)

seriously. I cannot imagine playing on something THAT AWESOME. Definitely the best asexual erection-producing concept out there, IMHO.



 

BenSkywalker

Diamond Member
Oct 9, 1999
9,140
67
91
Robo-

The highest end HDTV in normal production that I am aware of is 1080i, though if it was a demonstration it may have been the "1920" series(forget the exact name) monitors(which only support up to 1920x1080 maximum resolution). No major breakthrough when compared to current monitor technology, at least not in the sense that I was talking about:)
 

DaveB3D

Senior member
Sep 21, 2000
927
0
0
Ben, the biggest flaw in what you say is the talk of using embedded memory. Why is this a flaw? Because it is expensive and it increases your die size. Yes, you can use embedded memory but not without a price. Deferred rendering doesn't have to worry about this.
 

BenSkywalker

Diamond Member
Oct 9, 1999
9,140
67
91
"Ben, the biggest flaw in what you say is the talk of using embedded memory."

Perhaps I should have added a "?" after that. I'm not saying that that is what they "will" use, just that it is one possible route. I know the penalty; one of the other things that I was looking at is nVidia's upcoming support for memory interleaving on their mobos(I only bring NV up as they are the one company that I am aware of that is doing it). Another costly route to be sure. There is also QDR, but I don't think that will be widely available.

My main point is that I think, and please feel free to disagree, that we are going to hit the limits of the monitors before we run into a serious problem with memory technology. Again, feel free to disagree as I am honestly interested in hearing any and all(at least somewhat intelligent:p) conflicting views.
 

DaveB3D

Senior member
Sep 21, 2000
927
0
0
I think that is somewhat true. However, you are forgetting that there is more than just fill-rate... bandwidth too. For moving to higher color depths, etc., bandwidth is needed. 64-bit color is going to double bandwidth requirements if/when it gets implemented (it is an obvious progression). Memory bandwidth is always going to be what holds us back (well, that and cost). Deferred rendering mostly solves this problem.
 

BenSkywalker

Diamond Member
Oct 9, 1999
9,140
67
91
64bit color I can see, at least at some point in the future. Outside of that, which is an immediate ~doubling(I'm assuming that 64bit Z would be a waste in games, so that should be some savings), what bandwidth concerns do you think will outpace memory bandwidth advancements?

To date, it has been higher resolution and increased color depth; we are at the point now where both of those goals are pretty much reached(in terms of usability with today's monitor technology). FSAA for this generation has been extremely bandwidth intensive, but it will be less so with the upcoming boards.

My main question, I guess, is what do you see as being extremely bandwidth intensive on the horizon, short or long term? With improved "HSR" techniques being used by "traditionals" and the brick-wall-like status of monitor advancement, where do you see the additional bandwidth being used?

This is an honest question and I have been thinking about this for some time now. Moving forward, I mainly see things moving to procedural-based rendering, saving quite a bit of bandwidth; do you think that is unreasonable, or do you see things moving in a different direction?

Clearly bandwidth needs will continue to increase, but will they outpace the advancements in RAM technology now that we have "caught" displays?

With eDRAM, 1TSRAM(or whatever it is ArtX is using is called), QDR, MDRAM and the like on the horizon(not to mention ever faster SD/SG/DDR), what do you see going above and beyond the advancements?

I see what you are saying, "tiling" negates this concern, but is the concern going to be a serious issue if it is an issue at all?
 

DaveB3D

Senior member
Sep 21, 2000
927
0
0
Well Ben, I obviously can't talk about some of this stuff. One thing to consider though is overall depth complexity. With increased polygon counts, not only are we going to see more realistic objects but also more objects in general. That is going to be a big factor. Then as we advance even further down the road, object complexity itself will make depth complexity go up. Use TreeMark as an example. When we move to that level of complexity for entire worlds...
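
Rough numbers to illustrate(my own figures, not tied to any particular part): the pixel fill an immediate-mode renderer burns scales directly with average depth complexity, while a deferred renderer only shades what actually survives.

/* Fillrate needed as a function of average depth complexity (overdraw). */
#include <stdio.h>

int main(void)
{
    const double pixels = 1600.0 * 1200.0;
    const double fps    = 60.0;
    double depth;

    for (depth = 2.0; depth <= 10.0; depth += 2.0)
        printf("depth complexity %2.0f -> %6.0f Mpixels/s\n",
               depth, pixels * fps * depth / 1e6);
    return 0;
}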

I suggest reading some of SA's posts on B3D to get a firm grasp on where architectures are moving.. he has contributed some very interesting points.
 

Blackhawk2

Senior member
May 1, 2000
455
0
0
<<...With eDRAM, 1TSRAM(or whatever it is ArtX is using is called), QDR, MDRAM and the like on the horizon(not to mention ever faster SD/SG/DDR), what do you see going above and beyond the advancements?

I see what you are saying, "tiling" negates this concern, but is the concern going to be a serious issue if it is an issue at all?
>>

At this point embedded memory is unrealistic for PC graphics. Some graphics card companies may try to convince you otherwise, but until they can fit an entire 1600x1200 framebuffer at a color depth of at least 32 bits into embedded memory, they are pulling your chain; they will still be limited by the external memory bandwidth used to fetch the missing information.

As Dave stated before, a traditional renderer would need something like 64MB of embedded memory for 64-bit color with 4x FSAA at 1600x1200.
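
The rough arithmetic behind that figure(my numbers, with assumed byte sizes per sample): the color samples alone land right around that mark, and that is before you store a Z value per sample.

/* Size of a full multisampled framebuffer at the settings Dave quoted.
   Byte sizes per sample are my assumptions. */
#include <stdio.h>

int main(void)
{
    const double pixels      = 1600.0 * 1200.0;
    const double samples     = 4.0;        /* 4x FSAA */
    const double color_bytes = 8.0;        /* 64-bit color */
    const double z_bytes     = 4.0;        /* 32-bit Z per sample */

    double color = pixels * samples * color_bytes;
    double z     = pixels * samples * z_bytes;

    printf("Color samples: %.1f MB\n", color / 1e6);
    printf("Z samples    : %.1f MB\n", z / 1e6);
    printf("Total        : %.1f MB\n", (color + z) / 1e6);
    return 0;
}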
 

BenSkywalker

Diamond Member
Oct 9, 1999
9,140
67
91
I understand what you are saying Dave, and I have read SA's posts over at B3D(I read pretty much every post in every thread every day over there). I see what you are saying about the direction that at least whatever company he works for is going, but is that the only way, or the needed way?

Increasing depth complexity is of course a factor, but with the reduced bandwidth needs of HSR and more efficient rasterizers, combined with increasingly faster memory, will the needs outpace RAM technology? Clearly bandwidth needs are going to increase, but with resolution, color depth, and shortly FSAA "conquered", what is going to continue to push us at the rate we have seen for the last couple of years?

I see that you are looking at this from a hardware engineer's POV, of course:) Talking to software developers, though, I'm not hearing anything upcoming that is going to have bandwidth requirements shooting up any time soon. Doom3 is the first title that I have heard of that has me questioning whether the NV25 will have bandwidth to spare.

"Some graphics card companies may try to convince you otherwise, but until they can fit an entire 1600x1200 framebuffer at a color depth of at least 32 bits into embedded memory, they are pulling your chain; they will still be limited by the external memory bandwidth used to fetch the missing information."

One way to make it useful would be as a cache for Z data; this could speed up HSR operations quite a bit, and would save on main memory bandwidth without needing anywhere near a full framebuffer's worth. I don't know if we will see it soon or not, but there are ways to use a small amount of high speed RAM to increase the overall effectiveness of the main RAM.
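
A crude sketch of the kind of thing I mean(a toy of my own, not how any announced part works): keep a handful of Z-buffer tiles in fast on-chip RAM so that overdraw landing in the same screen area only pays for the external fetch once. Writeback of modified tiles is left out to keep it short.

/* Toy Z-tile cache: a few 8x8 Z-buffer tiles kept on chip so repeated depth
   tests in the same screen area don't hit external memory every time.
   Sizes and the direct-mapped policy are made up for illustration. */
#include <string.h>

#define TILE    8
#define NTILES  64                       /* on-chip capacity: 64 tiles = 16KB of Z */

typedef struct {
    int   valid;
    int   tx, ty;                        /* which screen tile this slot holds */
    float z[TILE * TILE];
} ZTile;

static ZTile    cache[NTILES];
static unsigned external_fetches;        /* stand-in for off-chip traffic */

/* Find (or fetch) the cached tile covering pixel (x,y). */
static ZTile *get_tile(int x, int y, const float *zbuffer, int pitch)
{
    int tx = x / TILE, ty = y / TILE;
    ZTile *t = &cache[(ty * 97 + tx) % NTILES];   /* trivial direct mapping */

    if (!t->valid || t->tx != tx || t->ty != ty) {
        int row;
        for (row = 0; row < TILE; row++)          /* miss: pull the tile from "external" Z */
            memcpy(&t->z[row * TILE],
                   &zbuffer[(ty * TILE + row) * pitch + tx * TILE],
                   TILE * sizeof(float));
        t->valid = 1; t->tx = tx; t->ty = ty;
        external_fetches++;
    }
    return t;
}

/* Depth test against the cached tile; returns 1 if the fragment is visible. */
int z_test(int x, int y, float frag_z, const float *zbuffer, int pitch)
{
    ZTile *t = get_tile(x, y, zbuffer, pitch);
    float *stored = &t->z[(y % TILE) * TILE + (x % TILE)];
    if (frag_z < *stored) { *stored = frag_z; return 1; }
    return 0;
}

Assuming reasonable locality in the rendering order, even a cache that small soaks up a good share of the repeat traffic from overdraw, which is the whole point: a real chunk of the benefit without anywhere near a full framebuffer of embedded RAM.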
 

Soccerman

Elite Member
Oct 9, 1999
6,378
0
0
"After that, we should be looking at real time RayTracing. A few months, even a few weeks ago, if anyone asked me about this I would have said, and did say in at least one thread, that this was several years off."

;) it's gonna happen I tells ya..

pretty nice post.. good overview of things to expect.

can you go into more detail with that equation representing a texture idea? I've often wondered about replacing things (not textures though) with textures, the potential would be pretty nice..

also, with increased polygon counts (say we get to the point where individual blades of grass are being displayed), you probably want a deeper Z-buffer, so that you can accurately describe the Z values of all these little polygons.

along with 4x + FSAA, will come the requirement for 64 bit colour (the more samples you have, the more options you'll need to give the card to find a good match for averages of samples). it's not bad at all yet, but when we're trying to get close to realism, you'll have to do it eventually.
 

DaveB3D

Senior member
Sep 21, 2000
927
0
0
A Z cache will help.. NV20 will have something like this, though it won't work like you'd expect.

Z caches are far from perfect as well... though I'm a bit too tired to think of the reasons why... ask me tomorrow. :)

However, when it comes down to it you can pull all of these tricks but at the end of the day, deferred rendering will still be on top in performance. And that is what counts, no? Deferred rendering has some things going for it too that I don't think I can talk about, which a traditional architecture does not.
 

Rigoletto

Banned
Aug 6, 2000
1,207
0
0
It just looks to me like the more power available, the more it is frittered away... like buying another rubbish bin and not emptying either until they are both full.
64 bit colour! (I still play in 16bit). 16x FSAA! And all this just to play games!
Honestly, I think 1024x768 at 2x FSAA provides all the quality anyone needs. This "progress" doesn't impress ME much. Ten passes a pixel is also getting silly.
Remember a TV is like 640x480 or 800x600. This resolution is passable. 1024x768 2xFSAA is quite enriched enough for anybody. Anyway, if it's the speed that counts then they are too busy fragging to notice poofy things like colour depths right?
 

BenSkywalker

Diamond Member
Oct 9, 1999
9,140
67
91
DaGoose-

That is still only ~3,000x3,000 and several years away from the consumer space. Still a good move in the right direction though.

Soccerman-

"can you go into more detail with that equation representing a texture idea? I've often wondered about replacing things (not textures though) with textures, the potential would be pretty nice.."

Procedural textures or volumetric textures? Procedural textures are still some time away, at least in the sense that I am talking about(the Unreal engine uses them sparingly here and there). Volumetric textures are standard DX8 fare; all the boards will have them. Which implementation are you interested in, and what are you looking to replace?

"also, with increased polygon counts (say we get to the point where individual blades of grass are being displayed), you probably want a deeper Z-buffer, so that you can accurately describe the Z values of all these little polygons."

32 bit Z will handle things just fine:) I work with scenes that go into the million poly range(compared to about 13,000 for Quake3) and 32bit Z has been serving me well. Increased levels of precision would help high end MCAD type applications, but only when precision beyond what is visible is required. Besides that, if we are dealing with real time RayTracing then Z-buffer accuracy is fairly moot anyway.
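
To put a number on that(standard perspective Z-buffer math; the near/far values are just ones I picked to be unkind): the smallest depth step you can resolve at a given distance falls straight out of the mapping, and 32 bits leaves plenty of headroom where 16 bits falls apart.

/* World-space depth resolution of an integer Z-buffer at a given eye distance,
   using the usual hyperbolic depth mapping. Near/far values are examples only. */
#include <stdio.h>
#include <math.h>

static double z_resolution(double z, double n, double f, int bits)
{
    /* smallest distinguishable step at eye depth z */
    return (f - n) * z * z / (f * n * pow(2.0, bits));
}

int main(void)
{
    const double n = 0.5, f = 5000.0;      /* deliberately nasty near/far ratio */
    double z;

    for (z = 10.0; z <= 5000.0; z *= 10.0)
        printf("at z=%6.0f   16-bit: %.4f units   32-bit: %.10f units\n",
               z, z_resolution(z, n, f, 16), z_resolution(z, n, f, 32));
    return 0;
}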

"along with 4x + FSAA, will come the requirement for 64 bit colour (the more samples you have, the more options you'll need to give the card to find a good match for averages of samples). it's not bad at all yet, but when we're trying to get close to realism, you'll have to do it eventually."

If we increase rendering passes significantly then I think 64bit color will help, but with procedural textures anything over 32bit is a waste. Also, that's 4x MSAA, not FSAA;)

Dave-

"However, when it comes down to it you can pull all of these tricks but at the end of the day, deferred rendering will still be on top in performance."

Will it though? I see that as being the case if memory bandwidth/fillrate remains the most important performance aspect, but will it? With massive amounts of geometry, and tasks such as procedural texturing/geometry(or HOS, or is upcoming HOS support going to be primitive procedural geometry?) and ray tracing on the horizon, will bandwidth/fillrate be the limiting performance criterion, or will the raw power of the "GPU" be? I imagine we are well under two years away from TFLOP GPUs, matching raw calculation performance with many current Crays, enough to handle the massive amounts of calculations needed for things such as procedural texturing.

"And that is what counts, no?"

But will it? If what we have been discussing in terms of graphics moving forward does not come about for much longer than anticipated, will performance matter anymore? This is an honest question. If we have one board that pushes 150FPS at 16x12, 4x, 32bit, everything cranked, and another that handles 250FPS, will anyone care? I know some will, but will it matter to gamers? Will anyone be able to tell the difference?

"Deferred rendering has some things going for it too that I don't think I can talk about, which a traditional architecture does not."

D@mnit Dave, you were a lot more fun when you weren't under such strict NDAs;) The only thing that I know of outside of what we have been talking about is the weakness in handling large geometric loads, which clearly you guys are taking care of. This leads me to believe that you are doing some things very differently and that only the basic premise behind current "tilers" will be implemented in the GP part. Of course, you can't comment on specifics. Can you answer whether this is something that could be figured out? Without having inside information?

Rigoletto-

You bring up TV, which is very much in line with what I am saying, sort of;) We are now near the point where upping resolutions and the like is going to do very little. Now is when we start moving towards real time CGI, or real time photorealistic images, for gaming. It won't make a poor game good, but it does increase the immersive aspect if applied to an already solid title.
 

Blackhawk2

Senior member
May 1, 2000
455
0
0
BenSkywalker, I agree with your points. In the end it will come down to cost of manufacturing and cost for the end consumer. If the same results as a traditional renderer can be achieved using a deferred renderer at much less cost to both manufacturer and consumer, is there really a point to having a traditional renderer like the NV20 which costs $500+ when the same performance can be achieved for half the cost or less with a deferred renderer?
 

DaveB3D

Senior member
Sep 21, 2000
927
0
0
Ben,

I get the very strong impression that you really don't seem to understand where the industry is headed. You talk about advancing things such as ray tracing. Well one of the guys from Pixar made some comments on that a while back. I don't remember where, but they were interesting.

Can't talk about the other stuff... sorry :)
 

OneOfTheseDays

Diamond Member
Jan 15, 2000
7,052
0
0
In the immediate future, we will definitely need the bandwidth saving techniques that deferred rendering can offer. Bandwidth is the only thing slowing us down these days. Video cards are getting faster and faster, but are being limited by the lack of bandwidth and slow memory. Nvidia is trying to get faster and faster memory to close this bottleneck, but in the end they will lose the battle if they do that. They have to start thinking about ways to save on bandwidth to lower costs and use the memory more efficiently.
 

BenSkywalker

Diamond Member
Oct 9, 1999
9,140
67
91
"I get the very strong impression that you really don't seem to understand where the industry is headed. You talk about advancing things such as ray tracing. Well one of the guys from Pixar made some comments on that a while back. I don't remember where, but they were interesting."

Real time ray tracing is shipping in an application this month. That one isn't where the 3D industry is headed; we are already there(I also have one game that uses real time ray tracing, though it is just a little Breakout style clone and it is used sparingly). Pixar's comments were completely out of line and related nothing at all to what they were supposed to. No surprise there; Steve Jobs is the CEO.

Pixar doesn't even use Ray Tracing at all, so I don't see why anyone would trust what they have to say on the matter, particularly considering how incredibly slow their render engines are compared to, say, Mental Ray. Do I trust Pixar on how good their software is? Absolutely not, never have. If SGI were to come out and publicly comment in the way that Pixar did, then it would carry much more weight with me, and with the majority of people. They brought up the amount of bandwidth and the amount of data that each frame contained(a PR guy talking, by the sounds of it), brought up effects that they don't even have(Renderman supports neither Ray Tracing nor Radiosity), and then compared the amount of time it took to render TS at 8000x8000 to a comment about heading in that general direction with a consumer graphics card, which they called a toy, and said it was absurd. A couple of months later there was a press release about them ordering a large quantity of workstations based on that same "toy" graphics subsystem that they had been complaining about.

If you are saying that the industry is not moving in the direction of CGI, and you honestly believe that, then it would clearly be a change in direction. The particular techniques they use may be very different than the ones I mention(much like using MSAA and texture filtering is a good substitute for FSAA). I know that Carmack has a lighting engine that rivals a Radiosity render engine working in real time. This is the type of thing I'm talking about: reaching CGI quality even if they don't use the exact methods that I mention. Do I honestly think anyone is going to use the type of procedural textures that Renderman uses? Of course not; they are far too slow. Will they come up with an alternative? The Unreal engine already uses them and has for years now, just not with the level of frequency I'm talking about.
 

BenSkywalker

Diamond Member
Oct 9, 1999
9,140
67
91
Blackhawk2-

Forgot to respond to you in my previous post-

"BenSkywalker, I agree with your points. In the end it will come down to cost of manufacturing and cost for the end consumer."

We heard the exact same thing about the V5 being too expensive to produce and therefore too expensive for the end consumer... it didn't happen. The big difference is that nVidia has money to burn and 3dfx doesn't.

"If the same results as a traditional renderer can be achieved using a deferred renderer at much less cost to both manufacturer and consumer, is there really a point to having a traditional renderer like the NV20 which costs $500+ when the same performance can be achieved for half the cost or less with a deferred renderer?"

Sounds like the PPC argument, and they lost. Brute force works. It may not be pretty, or the most attractive to an engineer, but it does work. Will the NV20 cost $500 or more? I highly doubt it. $400 maybe, and I can honestly say I wouldn't bat an eyelid at that price. I paid $320 for my DDR when it was new, I paid $350 for an 8MB ATi All In Wonder Pro when it was new; $400 is well within my acceptable price range. Dual SLI V2s cost quite a bit more than either of those when new, and were well over $400.

Economies of scale are a factor that seems to have been ignored in this respect. nVidia CAN charge a lot less for the NV20 and get away with it. They could not make a single cent, or even lose money on every board, and still come out ahead in the long run. 3dfx OTOH NEEDS to make money; they can't sustain continual losses(no company could). Without a viable OEM part to compete with ATi/nVidia, they must make their money in the retail market. nVidia is still making money off of the Riva128ZX, TNT, TNT2 and GF1 series of parts on top of the GF2/GF2MX/GF2U, and ATi has the RagePro, Rage128 and Rage128Pro besides the RadeonSDR/DDR/64MB/VIVO. 3dfx has the V4/V5 and V3, and even then almost exclusively retail. nVidia would stand to benefit in the long term by keeping 3dfx a marginal player or wiping them out completely if possible, and pricing on the NV20 could be used to try to do that(much as Intel tried with the Celeron against AMD, though clearly that doesn't always work;)).

The way I see it, and Dave's numerous comments reinforce this, 3dfx is most definitely not headed in the direction of nVidia. I have seen comments on nVidia not having people posting on many tech/gaming boards, which is very true; they do, however, pop up on visualization forums. The message that has been coming from them is moving in the type of direction that I have been talking about in this thread. They have been hiring 3D animators and modelers with a strong preference for people with no game based experience; they want people with CGI/Hollywood type experience. Why would that be?

I'm not convinced that a deferred renderer can handle the type of complexities used for advanced rendering techniques(read- way beyond what we have in games now). This isn't to say that GP tech won't be able to, but I haven't seen anything yet on how it will work.

Bandwidth and fillrate are not the be all and end all of 3D graphics; in fact they are just very basic needs. Is saving bandwidth and fillrate the most needed advancement in graphics a year from now? We will see. If it IS, then GP should do extremely well; if it isn't, then it probably won't.

Sudheer Anne

"In the immediate future, we will definitely need the bandwidth saving techniques that deferred rendering can offer. Bandwidth is the only thing slowing us down these days."

We are talking about ~a year from now though, not immediate.