Interesting benchmarking methodology review by [THG]

bystander36

Diamond Member
Apr 1, 2013
5,154
132
106
PresentMon: Performance In DirectX, OpenGL, And Vulkan
http://www.tomshardware.com/reviews/presentmon-performance-directx-opengl-vulkan,4740.html

Please note, this isn't really an article comparing Nvidia vs. AMD. It's more about a new, seemingly superior method of benchmarking performance in terms of both FPS and smoothness. The example game they use is Hitman, which I'm pretty sure is an AMD-favored game, so don't get hung up on the results of the example. That said, it does seem to show that the AMD RX 480 performs better in DX11 than DX12, despite the FPS improvements in DX12, once you take frame variance into account.

I like the graphs at the bottom of this page a lot (the blue and gray graphs): http://www.tomshardware.com/reviews/presentmon-performance-directx-opengl-vulkan,4740-3.html

They are easy to read, and show you where the stuttering is, as well as the mildly uneven frame rates that may not be a big deal to people.
 
  • Like
Reactions: Headfoot

antihelten

Golden Member
Feb 2, 2012
1,764
274
126
PresentMon isn't really anything new; it's basically just FRAPS, except it captures its timestamps at a later stage of the rendering pipeline.

The only real reason PresentMon was made is that FRAPS is incompatible with DX12 games.

Other than the fact that PresentMon is compatible with DX12 (and Vulkan), there aren't really any significant improvements compared to FRAPS, so it doesn't really make sense to call it a superior method. Either way, both PresentMon and FRAPS remain inferior to FCAT as far as accuracy goes.
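Either way, once you have per-frame timestamps from any of these tools, the downstream math is identical. A minimal sketch (Python, with made-up timestamps purely for illustration):

```python
# Minimal sketch: FRAPS and PresentMon both boil down to one timestamp per
# presented frame; frametime, FPS and variance are all derived from that.
present_times_ms = [0.0, 8.4, 16.6, 25.1, 58.3, 66.5]  # hypothetical data

frametimes_ms = [b - a for a, b in zip(present_times_ms, present_times_ms[1:])]
fps = [1000.0 / ft for ft in frametimes_ms]

for ft, f in zip(frametimes_ms, fps):
    print(f"frametime {ft:5.1f} ms -> {f:6.1f} FPS")
# The single 33.2 ms gap shows up as a dip to ~30 FPS -- a stutter regardless
# of which tool recorded the timestamp.
```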
 

bystander36

Diamond Member
Apr 1, 2013
5,154
132
106
PresentMon isn't really anything new; it's basically just FRAPS, except it captures its timestamps at a later stage of the rendering pipeline.

The only real reason PresentMon was made is that FRAPS is incompatible with DX12 games.

Other than the fact that PresentMon is compatible with DX12 (and Vulkan), there aren't really any significant improvements compared to FRAPS, so it doesn't really make sense to call it a superior method. Either way, both PresentMon and FRAPS remain inferior to FCAT as far as accuracy goes.
I'm guessing you didn't go on to read the pages past the first one. The main focus was on how to use it to present much better information than you typically see. It's less about PresentMon itself, and more about their methodology of presenting the data in a way that shows real-world performance, not just min and max FPS.

Example:
[Image: 14-Uneveness-Min-CPU.png]
 

bystander36

Diamond Member
Apr 1, 2013
5,154
132
106
To read those graphs: the blue line is the FPS. The gray bars show where frame rates are a little uneven, but not enough to bother most people. The dashed line is the point where the gray bars and spikes become very noticeable stutters.
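If you want to play with something like this yourself, here's a rough sketch that reads a PresentMon log and flags the frames that would show up as spikes. The `MsBetweenPresents` column is what PresentMon writes to its CSV (at least in the builds I've seen), but the 2x-median threshold is just my own stand-in for that dashed line, not THG's actual cutoff:

```python
import csv
import statistics

def load_frametimes(path):
    """Read per-frame intervals (ms) from a PresentMon-style CSV.
    Assumes a 'MsBetweenPresents' column; adjust if your log differs."""
    with open(path, newline="") as f:
        return [float(row["MsBetweenPresents"]) for row in csv.DictReader(f)]

def flag_spikes(frametimes, factor=2.0):
    """Mark frames whose frametime exceeds factor x the median frametime --
    a crude stand-in for the 'very noticeable stutter' line."""
    median = statistics.median(frametimes)
    return [(i, ft) for i, ft in enumerate(frametimes) if ft > factor * median]

if __name__ == "__main__":
    frametimes = load_frametimes("hitman_dx12.csv")  # hypothetical log file
    print(f"average FPS: {1000.0 * len(frametimes) / sum(frametimes):.1f}")
    for i, ft in flag_spikes(frametimes):
        print(f"frame {i}: {ft:.1f} ms (potential stutter)")
```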
 

Madpacket

Platinum Member
Nov 15, 2005
2,068
326
126
Love these graphs; good on Tom's to make this easy to digest. The issue that's left to solve is measuring and presenting G-Sync/FreeSync frametimes. Many of us (and even non-enthusiasts) have adopted adaptive sync, so as good as these PresentMon graphs look, they become largely irrelevant once you enable adaptive sync. These charts are a good reference point for gamers holding out on adaptive sync, but for everyone who's splurged on a decent monitor it's more of an academic measurement.
 
  • Like
Reactions: Headfoot

bystander36

Diamond Member
Apr 1, 2013
5,154
132
106
I'm not sure they would be irrelevant with adaptive syncing tech, but I'd guess it lessens the effect of uneven frame delivery in some cases, though the spikes would still be a problem.
 

AtenRa

Lifer
Feb 2, 2009
14,001
3,357
136
I would sure like to see a Core i3 at higher clocks vs. a Core i5, so that people will understand that average FPS is not the whole story.
 
  • Like
Reactions: Headfoot

antihelten

Golden Member
Feb 2, 2012
1,764
274
126
I'm guessing you didn't go on to read the pages past the first one. The main focus was on how to use it to present much better information than you typically see. It's less about PresentMon itself, and more about their methodology of presenting the data in a way that shows real-world performance, not just min and max FPS.

Example:
[Image: 14-Uneveness-Min-CPU.png]

I did in fact read past the first page, but thanks for assuming things, I guess. My point still stands: there is practically nothing new in this article, since everything could be done just as easily with FRAPS (and has been done).

The only novel thing is that THG invented a new metric they call unevenness, but considering that they can't actually be bothered to explain how they arrive at their numbers, I would be loath to call that metric superior to anything.
 

bystander36

Diamond Member
Apr 1, 2013
5,154
132
106
The first page of the article explains why FRAPS cannot do it, as it doesn't work with all 3 primary APIs (DX11, DX12 and Vulkan). And who else has created such good graphs? I have never seen them.
 

antihelten

Golden Member
Feb 2, 2012
1,764
274
126
The first page of the article explains why FRAPS cannot do it, as it doesn't work with all 3 primary APIs (DX11, DX12 and Vulkan). And who else has created such good graphs? I have never seen them.

I'm perfectly aware that FRAPS doesn't work with the new APIs, which is why I wrote that in my first reply here. I assumed that since it had already been mentioned, it would be implicit, but I guess not.

And as far as the graphs being good, why do you think this? All I see is a completely normal FPS graph with THG's obscure and unexplained unevenness index overlaid. In other words, nothing special.
 

bystander36

Diamond Member
Apr 1, 2013
5,154
132
106
I'm perfectly aware that FRAPS doesn't work with the new APIs, which is why I wrote that in my first reply here. I assumed that since it had already been mentioned, it would be implicit, but I guess not.
Your first post is simply a thread crap, and not about what they were showcasing, which was their method of showing FPS and conveying the actual experience you get.

And as far as the graphs being good, why do you think this? All I see is a completely normal FPS graph with THG's obscure and unexplained unevenness index overlaid. In other words, nothing special.
Min, max and average FPS are not enough to show what the performance of a card is. With this method, a simple glance shows you what the FPS looks like over time. It also shows where there are mildly uneven frames, and where there are bad spikes. All of this is in a simple, easy-to-read graph.

Show me a better graph that can do all that.
 
  • Like
Reactions: crisium

antihelten

Golden Member
Feb 2, 2012
1,764
274
126
Your first post is simply a thread crap, and not about what they were showcasing, which was their method of showing FPS and conveying the actual experience you get.

My first post was a thread crap? Simply because it disagreed with your assessment that THG's method was superior? You're a little full of yourself, aren't you?

Also, I think it's amusing that you don't think THG's article was about PresentMon and APIs, when the title of the article is "PresentMon: Performance in DirectX, OpenGL, and Vulkan" and not "Unevenness index: A superior method of benchmarking". But I guess THG themselves are threadcrapping you as well :p

But by all means, if you think it was a thread crap, then report it and let the mods sort it out.

Min, max and average FPS are not enough to show what the performance of a card is. With this method, a simple glance shows you what the FPS looks like over time. It also shows where there are mildly uneven frames, and where there are bad spikes. All of this is in a simple, easy-to-read graph.

No kidding min, max and average FPS aren't good enough, which is why various sites out there have been doing FPS graphs for ages. And this is really my entire point: THG doesn't really bring anything new to the table (other than their rather questionable "unevenness index").

Show me a better graph that can do all that.

I never said there were better graphs out there; I simply said that other graphs were equivalent to it. You were the one claiming that THG's method was superior (and thus, by extension, that other methods are inferior). Every single FPS graph out there can do the same as THG's graph (i.e. show FPS over time, uneven frames and spikes), as long as the graph isn't excessively smoothed.

The only novel thing about THG's graph is the unevenness index, but as long as they don't explain how they arrive at the values in said index, that index remains of dubious value.
 
Last edited:

bystander36

Diamond Member
Apr 1, 2013
5,154
132
106
I don't know what to say to you. Your 1st post was crapping on how PresentMon was nothing new, as if the thread was pointless, ignoring that the post was about a new method of showing the performance of a game.

Your 2nd post said you saw the new parts, including the new methods, which leaves me confused here. Were you lying, or thread crapping in the 1st post?

Now you admit that you've not seen better graphs. I said they were seemingly superior. Why are you arguing if you are in agreement, or at the very least mostly in agreement, that they are as good as any out there, if not better (you did mention the new grey bars showing the mild unevenness as new and better)?

Do you really need to follow me around and continue this?
 
  • Like
Reactions: crisium

Headfoot

Diamond Member
Feb 28, 2008
4,444
641
126
FCAT is definitely still the gold standard, since it measures the actual output and not a timestamp at a stage in the software pipeline (which may still run into weird issues after that software stage, e.g. runt frames), but this is certainly miles ahead of what a FRAPS-based min/max/average analysis would get you.

I like the unevenness metric; I definitely want to see it turned towards CPU, RAM and CF/SLI testing to see where the benefits are. I really want to see if faster RAM = lower unevenness. The RAM + i3 connection should be explored more. It's been a non-issue for so many years, and to see it resurface with so little investigation done is very strange to me.
 
  • Like
Reactions: bystander36

crisium

Platinum Member
Aug 19, 2001
2,643
615
136
I like the charts; a good addition to what's already out there. This also backs up what others are saying. I recall reading that DX11 vs. DX12 was similar in BF1 and Deus Ex for AMD, at least at launch.
 

antihelten

Golden Member
Feb 2, 2012
1,764
274
126
I don't know what to say to you. Your 1st post was crapping on how PresentMon was nothing new, as if the thread was pointless, ignoring that the post was about a new method of showing the performance of a game.

And I really don't know what to say to you either, since you're clearly hallucinating now and seeing things that I never wrote. I never said that the thread was pointless, and I certainly didn't ignore that the article was about a new method, seeing as my post was actually focused on said method (PresentMon).

Your 2nd post said you saw the new parts, including the new methods, which leaves me confused here. Were you lying, or thread crapping in the 1st post?

I never said I didn't see the new part (the unevenness index), so I really don't see where I would be lying, but I guess you're just threadcrapping your own thread now by coming up with utterly baseless accusations.

My first post simply explained that PresentMon isn't anything special except for its ability to work with the new APIs; other than that, it is for all intents and purposes identical to FRAPS. My second post was about THG's unevenness index and the fact that they offer zero explanation as to how they actually arrive at their values for it. Seeing as the two posts covered two completely different things, I really don't understand how you could be so confused.

Now you admit that you've not seen better graphs. I said they were seemingly superior. Why are you arguing if you are in agreement, or at the very least mostly in agreement, that they are as good as any out there, if not better (you did mention the new grey bars showing the mild unevenness as new and better)?

I never said I hadn't seen better graphs (I have; THG does a lot of smoothing on their graph, which hides spikes, and then tries to compensate for this by overlaying their obscure unevenness index). I simply said that there isn't anything special about THG's graphs, and that they are pretty similar to your average FPS graph, graphs that are a dime a dozen.

Do you really need to follow me around and continue this?

Follow you around? A bit paranoid, aren't we? You made a thread, I commented on it, no more no less.

Seriously, if you can't handle people disagreeing with you, and immediately start accusing them of threadcrapping, lying and stalking, then maybe a debate forum isn't for you.
 

bystander36

Diamond Member
Apr 1, 2013
5,154
132
106
FCAT is definitely still the gold standard, since it measures the actual output and not a timestamp at a stage in the software pipeline (which may still run into weird issues after that software stage, e.g. runt frames), but this is certainly miles ahead of what a FRAPS-based min/max/average analysis would get you.

I like the unevenness metric; I definitely want to see it turned towards CPU, RAM and CF/SLI testing to see where the benefits are. I really want to see if faster RAM = lower unevenness. The RAM + i3 connection should be explored more. It's been a non-issue for so many years, and to see it resurface with so little investigation done is very strange to me.
FCAT is definitely great, and these types of graphs could be created with FCAT as the tool, but one thing you constantly hear from people is that FCAT can't be trusted due to it being made by Nvidia. I think that's bogus, as it measures the output, so it shouldn't matter what GPU was used, but that is a common complaint.
 

antihelten

Golden Member
Feb 2, 2012
1,764
274
126
I'm not going to read your post, nor argue with you. If you want to continue arguing, I'm not interested.

You're not interested, and yet you just can't help but reply.

Seriously, if you have that hard of a time handling someone disagreeing with you on a debate forum, then just put me on ignore. I promise you, it won't hurt my feelings.

FCAT is definitely great, and these types of graphs could be created with FCAT as the tool, but one thing you constantly hear from people is that FCAT can't be trusted due to it being made by Nvidia. I think that's bogus, as it measures the output, so it shouldn't matter what GPU was used, but that is a common complaint.

That might have been a common complaint back when it first came out, but I don't really think you see much of it nowadays.

I think the main reason why we don't see more sites doing FCAT, is simply that it's extremely cumbersome, both hardware wise and time wise.
 

beginner99

Diamond Member
Jun 2, 2009
5,210
1,580
136
Is it really surprising that when your FPS is higher, variance will be higher as well? If we take THG's conclusion to its logical end, you're better off gaming on an FX-8350 than a 6950X because of their new scoring method.

Rather, we perform a more complex calculation that reveals values most likely to affect your level of immersion.

It's hard to tell if they did it right without actually showing the formulas, but what is for sure is that these frame time variances must be relative, not absolute. If you use absolute values, higher FPS will of course always have higher "unevenness". A drop of 30 FPS from 60 to 30 is not the same as from 120 to 90. Since their graphs always show "Variance in ms", I'm pretty sure they went with the absolute method, which honestly makes no sense to me, as the slower system will always look better.

To drive this point home, imagine the game has some hard bottleneck (bug?) in a certain area where the CPU must go to RAM very often (streaming textures?), so neither the CPU nor the GPU is the bottleneck. In that area even the fastest CPU only manages 30 FPS. However, that fastest CPU will be faster than all the other CPUs in other areas of the game. But now it has a min of 30 FPS and the highest max of all of them, and hence obviously the worst "evenness" score.
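To put numbers on that scenario (my own illustrative figures, not from the article): both CPUs hit the same 30 FPS wall, but the faster one falls from a higher baseline, so its frametime jump is bigger in absolute milliseconds:

```python
def frametime_ms(fps):
    return 1000.0 / fps

# Hypothetical numbers: both CPUs hit the same 30 FPS wall in the
# RAM-bound area, but fall from different baselines elsewhere.
fast_cpu_fps, slow_cpu_fps, bottleneck_fps = 120, 60, 30

fast_jump = frametime_ms(bottleneck_fps) - frametime_ms(fast_cpu_fps)  # ~25.0 ms
slow_jump = frametime_ms(bottleneck_fps) - frametime_ms(slow_cpu_fps)  # ~16.7 ms

print(f"fast CPU: {fast_jump:.1f} ms frametime jump")
print(f"slow CPU: {slow_jump:.1f} ms frametime jump")
# An absolute (ms) variance score punishes the faster CPU harder, even though
# the bottlenecked area feels identical on both systems.
```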
 

bystander36

Diamond Member
Apr 1, 2013
5,154
132
106
Is it really surprising that when your FPS is higher, variance will be higher as well? If we take THG's conclusion to its logical end, you're better off gaming on an FX-8350 than a 6950X because of their new scoring method.

It's hard to tell if they did it right without actually showing the formulas, but what is for sure is that these frame time variances must be relative, not absolute. If you use absolute values, higher FPS will of course always have higher "unevenness". A drop of 30 FPS from 60 to 30 is not the same as from 120 to 90. Since their graphs always show "Variance in ms", I'm pretty sure they went with the absolute method, which honestly makes no sense to me, as the slower system will always look better.

To drive this point home, imagine the game has some hard bottleneck (bug?) in a certain area where the CPU must go to RAM very often (streaming textures?), so neither the CPU nor the GPU is the bottleneck. In that area even the fastest CPU only manages 30 FPS. However, that fastest CPU will be faster than all the other CPUs in other areas of the game. But now it has a min of 30 FPS and the highest max of all of them, and hence obviously the worst "evenness" score.
A stutter because the engine goes to RAM is a stutter no matter how you look at it. It might be the fault of the game engine, but it's still a stutter. That should show up, and in comparison to other GPUs, it will show the same thing for all. Cut scenes, on the other hand, are smoothed out of the equation.

Some of your concern was talked about here:
To prevent accidental misinterpretations, we use an intelligent filter that catches transitions between the cut scenes you often see in built-in benchmarks. If one sequence is not complex and runs faster, and the second is more challenging, slowing performance, this artifact is filtered out (frame preview, block-wise comparison); it's not real stuttering, but rather a scene change.

In this way, a fairly accurate forecast can be made whether stuttering or dropped frames are visually perceptible to the gamer. If the score for a single frame is higher (that means worse) than the base FPS value, the whole interval is marked with the higher/worse index value.
A stutter because the engine goes to RAM is a stutter, though.
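Just to illustrate what that kind of filter might look like (THG doesn't publish their formula, so this is purely a guess on my part, not their actual method): a scene change shows up as a sustained shift in frametime, while a stutter is an isolated outlier.

```python
def classify_jumps(frametimes_ms, window=10, factor=2.0):
    """Toy heuristic, not THG's filter: a sudden frametime jump followed by a
    similarly slow stretch looks like a scene change; an isolated jump that
    immediately recovers looks like a real stutter."""
    events = []
    for i in range(1, len(frametimes_ms) - window):
        prev, cur = frametimes_ms[i - 1], frametimes_ms[i]
        if cur > factor * prev:                        # sudden slowdown
            after = frametimes_ms[i + 1:i + 1 + window]
            sustained = sum(after) / len(after) > 0.8 * cur
            events.append((i, "scene change" if sustained else "stutter"))
    return events
```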
 

antihelten

Golden Member
Feb 2, 2012
1,764
274
126
It's hard to tell if they did it right without actually showing the formulas, but what is for sure is that these frame time variances must be relative, not absolute. If you use absolute values, higher FPS will of course always have higher "unevenness". A drop of 30 FPS from 60 to 30 is not the same as from 120 to 90. Since their graphs always show "Variance in ms", I'm pretty sure they went with the absolute method, which honestly makes no sense to me, as the slower system will always look better.

As you say, it's basically impossible to properly judge THG's numbers when they won't show their formulas, but it's worth noting that absolute values are only higher for the high-FPS scenario when you measure performance in FPS instead of frametime; if you use frametime, it's the other way around (the relative value will be higher).

For instance, if you are running at a high FPS of 120, that corresponds to a frametime of 8.33 ms. If you then have a variation of 4 ms (let's say an increase, just for the sake of argument), in relative terms that is a change of 48% (and would correspond to 81 FPS). For the "low" FPS of 60 (16.67 ms), the same 4 ms increase would correspond to a relative change of 24% (to 48 FPS). When measured in FPS, the 120 FPS scenario has an absolute change of 39 FPS, whereas the 60 FPS scenario has an absolute change of 12 FPS.
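The same numbers in code, just to double-check the arithmetic (purely illustrative):

```python
def fps_to_ms(fps):
    return 1000.0 / fps

def ms_to_fps(ms):
    return 1000.0 / ms

HITCH_MS = 4.0  # the hypothetical 4 ms frametime increase

for base_fps in (120, 60):
    base_ms = fps_to_ms(base_fps)            # 8.33 ms or 16.67 ms
    new_fps = ms_to_fps(base_ms + HITCH_MS)  # FPS after the hitch
    rel_change = HITCH_MS / base_ms * 100    # relative frametime change
    print(f"{base_fps} FPS: +4 ms = {rel_change:.0f}% relative, "
          f"-{base_fps - new_fps:.0f} FPS absolute (down to {new_fps:.0f} FPS)")
# 120 FPS: +4 ms = 48% relative, -39 FPS absolute (down to 81 FPS)
# 60 FPS:  +4 ms = 24% relative, -12 FPS absolute (down to 48 FPS)
```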

Unfortunately it's pretty much impossible to tell what THG actually uses based on their explanation; as far as I can tell they use both the FPS and the frametime, plus the frame-to-frame difference in frametimes. How exactly they go from those numbers to a value of 0 to 10 is anyone's guess.

A stutter because the engine goes to RAM is a stutter no matter how you look at it. It might be the fault of the game engine, but it's still a stutter. That should show up, and in comparison to other GPUs, it will show the same thing for all. Cut scenes, on the other hand, are smoothed out of the equation.

Some of your concern was talked about here:

A stutter because the engine goes to RAM is a stutter, though.

I don't think beginner99 is saying that RAM stutter or other similar non-GPU or non-CPU stutter isn't real; he's simply saying that the scoring of said stutter will depend on the general FPS value, with the faster machine being penalized harder even though the stutter is completely identical to the one on the slower machine.