Predicting framerate: A theoretical basis? A guide?

sgtroyer

Member
Feb 14, 2000
94
0
0
Reading reviews, it seems that gaming performance is the main goal for many a computer enthusiast. Whether you're into Quake or not, the true measure of how much a** your computer kicks is the framerate in Quake 3. Or UT2k3, or whatever. So what I want to know is, what factors contribute to that, and why. I've seen statements about being CPU limited, or memory bandwidth limited, or fillrate limited, etc., but I don't have a good feel for how this all fits together.

Under what gaming situations are what pieces of hardware stressed the most? Why is that?

I've read reviews here at AT, and they allude to these things, but I haven't read a really good primer that just breaks it down and explains the basics. Is it out there? Do you even know what I'm asking? Let me go into nauseating detail, then.

If framerate could be described by a function, it would be a wickedly complex function with about 20 variables. Let me see if I can list a few...

Things you can vary:
resolution
color depth
detail settings
FSAA settings
anisotropic filtering settings

Game dependent:
game complexity
AI sophistication
physics model complexity

Machine dependent:
processor model/speed
main memory size
main memory bandwidth/latency
AGP speed
video processor clock speed
video processor architecture (pixel pipes and vertex shaders and all that voodoo)
video memory size
video memory bandwidth/latency

Whew. I think I've listed all that I can think of. Still with me? Now. Which of these are important in which situations? Like, it seems that certain games are more CPU taxing, while others tax the video card. Running 32 bit color at high resolutions is, I think, more video memory intensive. Advanced AI and physics tax the processor, right? Really pretty looking textures make the video processor work hard, right? What happens if I turn on aniso filtering?

My impression is that memory size isn't a big bottleneck, both in terms of main and video memory. 512MB of main and 128MB of video, and you're pretty good for now. Adding more isn't going to help much. But what about the memory speed? Still pretty important, I assume, but why? And under what conditions?

I can read reviews and peruse benchmarks, and know what a specific system will do with a specific game, but it's tough to actually understand the results, let alone predict what a different system will do. We have a descriptive science, but not a theoretical science. I want the theory.

What I'm proposing, beyond what I hope is an entertaining discussion, is that someone write a guide. I looked in the FAQ and couldn't find anything. Maybe there is something like this out there somewhere. If so, could you give me a link? If not, anyone want to write one? I'm sure there are plenty of people capable.

p.s. This is the first time I've started a new thread. Hopefully this is the right forum for what I hope will end up being a highly technical discussion. I have a great deal of respect for the folks who show up here, and want to hear what they have to say.
 

tdowning

Member
May 29, 2003
69
0
0
One rule of thumb I have heard: if you have a lousy framerate and want to improve it, and you are running 1024x768 or lower, you need a better processor, because low resolutions don't really tax the video card. One reason is that less bandwidth is needed to get the data to the DAC. Horizontal pixels times vertical pixels times a bit-depth constant (256 colors = 1 byte, 65k = 2, 16M = 3, 16M plus an 8-bit alpha channel, aka 32-bit = 4) is how much video memory you need to draw the frame.

Multiply that by the refresh rate, and you have the bandwidth needed.

You also need leftover bandwidth to get new data into the video memory, and of course if there is a large amount of video memory not needed for the frame itself, it can store textures and things like that, so the card needs less data from the PC itself.

At higher resolutions, however, you need a better video card. Just run a few numbers through the above formula and you will see that the bandwidth and memory requirements increase dramatically as you move up the scale.
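To put rough numbers on that rule of thumb, here's a quick sketch of the formula (the 85 Hz refresh rate is just an illustrative assumption, and this counts only scanout, not rendering traffic):

```python
def framebuffer_bytes(width, height, bytes_per_pixel):
    """Memory needed to hold one frame, e.g. 4 bytes/pixel for 32-bit color."""
    return width * height * bytes_per_pixel

def scanout_bandwidth(width, height, bytes_per_pixel, refresh_hz):
    """Bandwidth just to feed the DAC, ignoring rendering traffic entirely."""
    return framebuffer_bytes(width, height, bytes_per_pixel) * refresh_hz

# Compare a low and a high resolution at 32-bit color, 85 Hz refresh
for w, h in [(1024, 768), (1600, 1200)]:
    mb = framebuffer_bytes(w, h, 4) / 2**20
    gbs = scanout_bandwidth(w, h, 4, 85) / 10**9
    print(f"{w}x{h}x32: {mb:.1f} MB per frame, {gbs:.2f} GB/s at 85 Hz")
```

Run the two resolutions through it and you can see why stepping up from 1024x768 to 1600x1200 shifts the load onto the video card.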
 

Shalmanese

Platinum Member
Sep 29, 2000
2,157
0
0
The thing is, it's far too complex to estimate reliably. The only real way of doing it is to do CPU/GPU scaling tests. It's also highly dependent on WHICH scene you choose. You could switch from being CPU dependent to GPU dependent to network dependent all within a few seconds of each other.
 

BenSkywalker

Diamond Member
Oct 9, 1999
9,140
67
91
What I'm proposing, beyond what I hope is an entertaining discussion, is that someone write a guide. I looked in the faq and couldn't find anything. Maybe there is something like this out there somewhere.

Your first thread and it's quite the complex topic :)

First off, you are correct in segmenting out what needs to be under which category, but the categories themselves should be altered a bit-

'CPU'(Platform)-

Processor Speed
Memory Speed/latency
Memory Amount

GPU-

Fillrate
Memory Speed
Memory Amount
Chip Layout*
Shader Speed

*Chip layout covers numerous issues including, but not limited to, how varying levels of anisotropic filtering or MSAA impact performance.

As an example, running Quake3 at lower resolutions it is common to say that the game is CPU limited, which it is for certain segments of the benchmark. It is also memory bandwidth limited under certain portions. Increasing either CPU speed or memory bandwidth will increase performance, as different segments of the bench are limited by different factors: at times the CPU can't get the information fast enough to process it, while at other times the bench has given the CPU enough data to keep it busy.

What type of code the CPU is dealing with is another factor of course, and can have a huge impact. Look to UnrealTournament2K3 for an excellent example. Using the 'Flyby' bench, nearly all platforms are GPU limited at even moderate settings, but switch to botmatch and your CPU will remain the limiting factor up to all but the most extreme settings. The AI and physics models used under actual gameplay make it so, limiting the host platform far more than the GPU.

This remains true for other titles based on the Unreal2 engine as well. Unreal2 itself is extremely CPU intensive, although it is also beastly on video cards at higher settings. The 'good' thing about U2 is that it is so imbalanced in terms of CPU/GPU intensity that you can crank the vid settings quite high without seeing a large drop in framerate, because it is so CPU intensive. You have to be pushing some fairly stunning visuals to make a current vid card choke to less than 20FPS, which the code for U2 is enough to do in certain situations. Even with the extreme CPU intensiveness of Unreal2, there are still instances where you will be vid card limited; they simply tend to occur in areas where the difference would be 60FPS vs 100FPS, not nearly as important as those FPS dips into the teens you see when your CPU is the limit. The physics engine utilized by the Unreal2 engine is such that its demands on processors are considerably steeper than what we are used to seeing.

For the Quake3-engined games, the flexibility of the engine means which title you run will greatly impact where your limits are. Quake3 itself was fairly ideal as a benchmark, as even to date it has held up fairly well, scaling at low settings with CPU speed/memory bandwidth and at higher resolutions with GPU speed/bandwidth. On the other side, JediKnightII and RtCW tend to be far more CPU limited than the original title for the engine. This is likely mainly due to the more complex AI and physics models those games use, not to mention more intensive code for sound, etc.

In summary, all of the above indicates that you first need to observe the game engine and the particular game to figure out how intensive it is at the CPU/memory level. Right now, despite the large focus on some of the killer visual enhancements looming on the horizon, significantly more accurate physics engines and AI are going to skyrocket the CPU load as well.
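One crude way to picture the CPU-limited vs GPU-limited distinction is a toy bottleneck model: each frame costs some CPU time (AI, physics, submitting draw calls) and some GPU time (fill, shading), and with the two working in parallel the slower side sets the framerate. This is my own simplification, not any real engine's scheduler, and all the millisecond figures are made up:

```python
def frame_time_ms(cpu_ms, gpu_ms):
    # With perfect pipelining, the slower stage dictates the frame time.
    return max(cpu_ms, gpu_ms)

def fps(cpu_ms, gpu_ms):
    return 1000.0 / frame_time_ms(cpu_ms, gpu_ms)

# Flyby-style scene: light AI, heavy fill -> GPU limited (~83 FPS here)
print(fps(cpu_ms=4.0, gpu_ms=12.0))

# Botmatch-style scene: heavy AI/physics -> CPU limited (50 FPS here)
print(fps(cpu_ms=20.0, gpu_ms=12.0))
```

Note how, in the second case, cranking up the video settings (raising `gpu_ms` toward 20) costs almost nothing, which is exactly the Unreal2 behavior described above.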

On the GPU end there are a slew of issues to cover, more so than on the CPU end, mainly because AMD and Intel do things relatively close to each other, and their performance will be relatively close when looking at chips from the same class. For GPUs, how each chip handles each workload is going to impact your performance differently. As a general example, if you were to take an R9800Pro 128MB versus a 5900 128MB running UT2K3 at 1600x1200x32 with 4xAA, the 5900 is going to show a considerably lower framerate relative to the R9800Pro than it would at 1280x1024x32 with 4xAA. The reason is that the NV3X boards have a post-process going on involving their AA: they utilize a front and back buffer both at the expanded AA resolution, which makes them exceed the onboard RAM on 128MB boards, while ATi scales its image back down to the displayed resolution when it flips buffers. Combined with UT2K3's high texture load, this is enough to force the NV3X line into reading textures over the AGP bus, obviously inferior to local RAM.

On the flip side of that, the NV30/35 boards have four pixel pipes dedicated to stencil ops that are not utilized under normal conditions in most current games. When a game that relies heavily on stencil fill is played, they will display a performance edge over the R3X0 core boards that they would not have in situations without heavy amounts of stencil fill. With most current games under normal situations, certain performance characteristics will not display themselves; you need specialized benches designed to showcase these isolated cases. That doesn't mean they won't be factors to end users, simply that run-of-the-mill tests won't give you any indication that they will happen.

Then we have the complexity of shaders and how they impact things. The R3X0 boards are regularly showing a sizeable edge running shader heavy code but there is quite a bit of confusion about how this will end up playing out in games.

I'll have to post part two a bit later, have to go for now :)
 

ScottMac

Moderator, Networking, Elite member
Mar 19, 2001
5,471
2
0
I disagree that it's all that hard to predict framerate.

According to what I've seen and heard around the forums:

Framerate = "What you want" minus 5-15FPS

or the alternate:

Framerate = "What the person says they get on the board" minus 25-50FPS

OR the second alternate:

Framerate = "What my friend gets on EXACTLY the same box" minus 7-20 FPS

Or the final alternate

"What you need" plus 100 FPS

See, Easy! :D

Please accept my apology now, it's been an ugly couple days ... no dumping on the "I want my F P S" crowd intended ... jus' poking a little fun atcha.

FWIW

Scott
 

BenSkywalker

Diamond Member
Oct 9, 1999
9,140
67
91
To continue-

For shader performance you have numerous different issues that can impact things in a fairly significant fashion, even when ending up with the exact same result. As an example, rescheduling shader operations can output the exact same results yet run ~50% or more faster. That ignores the possibility of differing levels of precision within the shaders, which can have a sizeable impact on performance, sometimes with the same end results and sometimes not.

For texture filtering operations, you have the core abilities of the chip and the impact of the texture cache. If there were a GPU that had, say, 64 texture sample units per pipe AND a large enough texture cache, then it could offer 64x AF for 'free'. Unfortunately, none of the GPUs out there have exact specifics listed, so we must resort to running a series of benches and trying to figure out when they exceed their limits.
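The "free AF" condition can be sketched as a one-liner: filtering is free only when the pipe can issue every required sample in a single clock, otherwise each pixel eats extra clocks. This is a deliberate oversimplification (it assumes one sample per AF tap and a texture cache that never misses), with hypothetical unit counts:

```python
import math

def clocks_per_pixel(af_level, sample_units_per_pipe):
    # Simplification: assume one texture sample per AF tap and a
    # cache that always feeds the sample units without stalling.
    samples_needed = af_level
    return math.ceil(samples_needed / sample_units_per_pipe)

print(clocks_per_pixel(64, 64))  # 1 clock: 64x AF is "free"
print(clocks_per_pixel(16, 4))   # 4 clocks: fill rate quartered at 16x AF
```

Since real chips don't publish these unit counts, benchmarking at increasing AF levels and watching where the fill rate falls off is how you'd infer them in practice.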

There is also the number of 'Z check' units a GPU supports, which impacts how effectively it can handle MSAA. Currently most chips are bandwidth limited most of the time, but it is possible we will see a chip that is raster limited, due to either extreme levels of bandwidth or an alternate method of rasterization (a TBR should have zero bandwidth penalty for MSAA, as an example).

Trying to figure out how a particular board will fare in a given upcoming application is quite a bit more complex now than it has been in the past. With all the information we have now, it appears likely that those wagering on Pixel Shader 2.0 performance being the dominant feature will want to look at the ATi line of products, while those who see stencil-limited situations becoming the norm will head in the NV direction.

On the CPU side, the fastest available chip is always a good way to go if you are really worried about performance and don't care about money ;) With x86 around, you can see relatively clearly which chips are going to be the fastest the majority of the time. There are way too many factors overall to say that platform X is going to perform at level Y in this group of games. You can likely gauge the relative performance of each of the boards or processors if you know enough about a game, but without a baseline example of the code you are going to be extremely hard pressed to nail it down to anything too close. In days gone by this was actually relatively simple, as long as you saw the direction things were headed in; there is too much flexibility, with all the power we have now, for it to be that way anymore.
 

AbsolutDealage

Platinum Member
Dec 20, 2002
2,675
0
0
Of course, you have to throw OS into the mix for the calculation. Not necessarily a huge factor, but there is a big difference between running an old broken down copy of 98 and running XP.

As for actually calculating framerates on a theoretical system, I think that you are going to have some real trouble. This is one of those things that just has way too many variables to be anywhere near accurate.
 

sgtroyer

Member
Feb 14, 2000
94
0
0
Wow, Ben. Now that's what I was looking for. Let me digest all of that and see if I have any other questions. Thanks.
 

wviperw

Senior member
Aug 5, 2000
824
0
76
On calculating a theoretical FPS number: I definitely think it's doable. I guess I can't see all that much reason for needing it, since we can just test it anyway. There may be a ton of variables, but all you have to do is group those variables into single numbers and it will make it much simpler.

So the equation could end up something like this:

T.F.P.S. = (a*GRAPHICS + b*CPU + c*MEMORY) * d

where:

GRAPHICS = e*FILL + f*T&L + g*POLYS...
CPU = h*ALU + i*FP...
MEMORY = j*...


Err... maybe it's a little more complicated than that, and also more complicated than what I thought. :)
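Taken literally, the weighted-sum idea could be coded up like this. Every coefficient here is invented purely for illustration; in reality you'd have to fit the weights per game from benchmark data (and a weighted sum can't capture the "slowest component wins" bottleneck behavior others have described):

```python
def graphics_score(fill, tnl, polys, e=0.5, f=0.3, g=0.2):
    # GRAPHICS = e*FILL + f*T&L + g*POLYS (weights are made up)
    return e * fill + f * tnl + g * polys

def cpu_score(alu, fp, h=0.6, i=0.4):
    # CPU = h*ALU + i*FP (weights are made up)
    return h * alu + i * fp

def predicted_fps(gfx, cpu, mem, a=0.4, b=0.4, c=0.2, d=1.0):
    # T.F.P.S. = (a*GRAPHICS + b*CPU + c*MEMORY) * d
    return (a * gfx + b * cpu + c * mem) * d

# Toy inputs on an arbitrary 0-100 scale
print(predicted_fps(graphics_score(100, 80, 60), cpu_score(90, 70), 50))
```

Fitting a, b, c, d per game against real benchmark numbers would be the hard part, which is more or less the objection raised in the rest of the thread.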


Anyway, here are some FS articles about 3D in general that might help:

http://firingsquad.gamers.com/hardware/fillratevscpu/default.asp
http://firingsquad.gamers.com/guides/3dbasics/default.asp
http://firingsquad.gamers.com/guides/3dbasics2/
http://firingsquad.gamers.com/guides/videobasics/default.asp
http://firingsquad.gamers.com/guides/videolightfilter/default.asp
http://firingsquad.gamers.com/guides/videopart3/default.asp


 

kpb

Senior member
Oct 18, 2001
252
0
0
I don't think it's possible to predict frame rates. There are too many variables involved, and a lot of them we can't even get sufficient info about. Video cards in particular don't have much technical info released about how they actually work. Just look at the debates about things like how many pipelines the GeForce FXs have. These are very complicated pieces of electronics, and the only ones who know what's going on in there are the companies that made them, maybe.