What the numbers really mean
There are a few things in life that are constant: death, taxes, and hardware upgrades. Every time we experience a major shift in graphics fidelity, there's a requisite purchase of new hardware to keep up. Having now finished off the recent launches of Nvidia's GTX 980 Ti and AMD's Fury X, there's something we want to discuss a bit more that will affect future GPU reviews, or at least it will affect our presentation of data. That topic is minimum frame rates.
Unlike average frame rates, usually expressed as FPS (frames per second), minimum frame rates are prone to some wild fluctuations between benchmark runs. The problem is that depending on built-in benchmarks and their reported frame rates isn't always reliable. Some games appear to sweep a few low results under the table, others report the absolute minimum frame rate, and still others use a sort of "average minimum" value. All of these can be useful to varying degrees, but the inconsistency between benchmarks is a real concern. To understand why, we need to talk about why minimum FPS matters and then look at a few benchmarks as examples.
The short summary is that minimum frame rates matter because they can cause a game to stutter. Imagine, as a worst-case scenario, a game where one frame renders at 20fps and then the next three frames render at 180fps. While the average frame rate would be 60fps, on a typical 60Hz display with VSYNC disabled you would see the first frame for three screen updates followed by one update showing parts of the three fast frames. Or let's take an even more extreme case: imagine a game that renders at 60fps for 19 frames and then 10fps for a single frame. The game would feel smooth for those 19 frames and then there would be a big stutter on the last frame. The average frame rate is still a respectable 48fps, but the minimum frame rate indicates there's a serious problem somewhere.
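The arithmetic behind those two examples is easy to check. Here's a minimal sketch in Python (the function name and sample lists are ours, purely for illustration) showing that the average comes from total frames divided by total time, not from averaging the FPS values themselves:

```python
def average_fps(per_frame_fps):
    """Average FPS over a run: frames rendered divided by total elapsed time."""
    # Each frame takes 1/fps seconds, so sum those to get the elapsed time.
    total_seconds = sum(1.0 / fps for fps in per_frame_fps)
    return len(per_frame_fps) / total_seconds


# First example: one frame at 20fps, then three frames at 180fps.
example_one = [20] + [180] * 3
# Second example: 19 frames at 60fps, then a single frame at 10fps.
example_two = [60] * 19 + [10]

print(round(average_fps(example_one), 1))  # 60.0
print(round(average_fps(example_two), 1))  # 48.0
```

In both cases the average looks healthy while the slowest frame (20fps and 10fps respectively) tells a very different story.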
The difficulty is that minimum frame rates often don't come at regular intervals. The 19-to-1 ratio of the second example would be horrible if it happened, but it typically doesn't occur in normal gameplay. What's more likely is that you'll have games that run for hundreds or even thousands of frames at higher FPS values, but at times there's a scene transition or the GPU runs out of VRAM and you'll have some stuttering. Ideally, that's what we want to capture, but many games abstract the benchmark results into just minimum and average FPS. So let's look at a few examples.
First, Tomb Raider (2013) has a decent built-in benchmark. The test shows Lara Croft overlooking a scene of crashed boats, airplanes, etc., as the camera orbits around her. As the entire sequence consists of a single scene in the game, the loading of assets is done ahead of the benchmark run and the results are very consistent. If you run the test ten times, you might see a fluctuation of a few percent at most on both the average and minimum frame rates. Maximum frame rates may show greater variability, but few people are worried about the maximum FPS, so that's not a problem. The Tomb Raider benchmark at least is an example of a trustworthy minimum FPS result.
At the other end of the spectrum, the Unigine Heaven 4.0 benchmark has extremely unreliable minimum frame rates. The test consists of 26 scenes, and frame rates are still being captured during scene transitions. The first few scenes may have all their assets loaded into memory, for example, but at some point a scene has to load additional assets. When this happens, there might be a single instance where the frame rate drops to 30fps. If the 30fps result happened consistently, it might be meaningful, but if it only occurs during a camera/scene change and only for one or two frames out of thousands, it has little bearing on normal gameplay.
In between these extremes, there are other games where the built-in benchmark may have erratic minimum FPS results during the first few seconds of a benchmark. Shadow of Mordor is like this, as the level assets are still loading for the first few seconds. If you were actually playing Shadow of Mordor rather than just benchmarking it, you might have stuttering frame rates right as a saved game loads, but then for minutes or even hours afterward the frame rates would be higher and generally consistent. Run the built-in benchmark once and the minimum frame rate might show as 25fps. Run it three times in a row and you'll typically find that the second and third runs show significantly higher "minimums." But even running the test multiple times doesn't fully account for variations between runs.
Enter the 97 Percentile
The good news is that the problem with pure minimums is well known, and statisticians have had a solution for decades (or more): percentiles. The concept is easy enough: given a large enough set of numbers, sort them, and the 97 percentile is the number that is larger than 97 percent of the results. For minimum frame rates, we're going the other way and looking at the number that's smaller than 97 percent of results. Of course, you can make arguments for a different percentile (99 and 95 percentile are commonly used), but that's more debating semantics.
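As a rough illustration of the "going the other way" lookup, here's a short Python sketch. It assumes a simple nearest-rank rule, which is only one of several percentile conventions, and the function name and sample data are hypothetical:

```python
import math


def percentile_low_fps(fps_values, percent=97):
    """Return the FPS value that `percent` percent of frames meet or exceed.

    Uses the nearest-rank convention (an assumption made for this sketch).
    """
    ordered = sorted(fps_values, reverse=True)  # fastest frames first
    # 1-based rank of the frame sitting at the requested percentile.
    rank = math.ceil(percent * len(ordered) / 100)
    return ordered[rank - 1]


# Made-up run of 100 frames: 97 at 60fps plus three slow outliers.
sample = [60] * 97 + [30, 20, 10]
print(min(sample))                 # 10
print(percentile_low_fps(sample))  # 60
```

The absolute minimum of this made-up run is 10fps, but the 97 percentile result of 60fps is a far better description of how the run actually feels.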
Some games even report percentiles already, e.g., GTAV's built-in benchmark reports 90-99 percentiles for the five test scenes, along with 50, 75, 80, and 85 percentiles for good measure. GTAV also goes one step further and reports the number of frames under 60fps and 30fps for each test sequence. Let's quickly look at what this means for a specific GPU tested in GTAV, the GTX 980 Ti running at 1080p.
GTAV GTX 980 Ti 1080p Results - Pass 4
Min | Avg | % > 60fps | % > 30fps | 97% |
8.6 | 83.1 | 93.9 | 99.8 | 58.8 |
If we only look at the minimum frame rate, it might appear that GTAV stutters a lot for this specific test case: the average FPS is a rather high 83, but minimum FPS is only 8.6! But looking closer, that minimum frame rate is quite rare, likely occurring only when a bunch of data has to be loaded into memory. The threshold columns show that only 0.2 percent of frames were below 30fps, while 6.1 percent were below 60fps. And finally, the 97 percentile tells us that 97 percent of frames rendered at 58.8fps or faster.
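For anyone who wants to reproduce the threshold columns from their own logs, a minimal Python sketch follows. We don't know exactly how GTAV counts frames internally, so this assumes a simple per-frame count, and the data below is synthetic, chosen only so the shares match the table:

```python
def percent_above(fps_values, threshold):
    """Percentage of frames that rendered faster than `threshold` fps."""
    fast = sum(1 for fps in fps_values if fps > threshold)
    return 100.0 * fast / len(fps_values)


# Synthetic run of 1,000 frames (NOT real GTAV data):
# 939 fast frames, 59 middling frames, and 2 very slow frames.
frames = [83.0] * 939 + [45.0] * 59 + [8.6] * 2

print(round(percent_above(frames, 60), 1))        # 93.9
print(round(percent_above(frames, 30), 1))        # 99.8
print(round(100 - percent_above(frames, 60), 1))  # 6.1
```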
Of course, we could just present you with a complete graph of frame rates for the test, as shown above. The problem with this approach is that it makes comparing products difficult, especially for people who don't eat, breathe, and sleep statistics. We would need one chart per GPU per game, or perhaps we could do a few GPUs in each chart, but either way it quickly results in information overload. It also requires a lot more time to create all the charts, time which could be better spent in other endeavors. Using a 97 percentile result allows us to quickly get to the heart of the matter and provide a meaningful "typical minimum FPS" value.
Of course, totally ignoring all frame rates below the 97 percentile doesn't necessarily make sense either. Those slow frames are still present, and if there are enough of them, and if they're slow enough, they can dramatically impact the overall experience. Our solution is simple. Instead of looking at just the 97 percentile frame rate, we can find the average FPS for all frames that are slower than the 97 percentile. That way we don't miss out on the effect of a few very slow frames.
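To show why averaging the slow tail matters, here's a small Python sketch. It assumes frame times in milliseconds and simply takes the slowest three percent of frames by count, which is one reasonable reading of the idea rather than a definitive implementation; the names and data are hypothetical:

```python
def average_fps_of_slowest(frame_times_ms, percent=3):
    """Average FPS across the slowest `percent` percent of frames (at least one)."""
    count = max(1, len(frame_times_ms) * percent // 100)
    slowest = sorted(frame_times_ms, reverse=True)[:count]  # longest times first
    # Frames divided by elapsed seconds gives FPS for just this slow tail.
    return 1000.0 * len(slowest) / sum(slowest)


# Hypothetical run: 97 frames at 10 ms, two at 20 ms, and one 100 ms hitch.
times = [10.0] * 97 + [20.0, 20.0, 100.0]
print(round(average_fps_of_slowest(times), 1))  # 21.4
```

In this made-up run the single 100 ms hitch drags the tail average down to about 21fps, even though two of the three slowest frames ran at 50fps. A plain percentile cutoff would hide that hitch entirely.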
Check Out These Frame Rates
Here's where things get interesting. If we log frame times with FRAPS while running the built-in benchmarks from our GPU tests, we can calculate the average FPS as well as the average FPS of the slowest three percent of frames, which is what we're calling the "average 97 percentile." We can also look at the true instantaneous minimum FPS according to FRAPS. You might assume all of the games report the absolute minimum FPS, but it turns out they all vary in how they calculate minimums.
Some of the tests appear to "miss" counting certain frames while others seem to have some sort of percentile calculation in effect. The short summary is that most of the games show unwanted variations in this metric. But let's not jump ahead. In the charts below, we'll look at the reported "minimums" compared to our calculated "average 97 percentile" to show what's going on. We'll also provide separate commentary on each chart to discuss what we've noticed during testing.
Starting with Batman: Arkham Origins, we immediately see that the "minimum" FPS reported by the built-in benchmark already appears to be doing some form of percentile calculation. Our "average 97 percentile" in all cases is lower than the game's reported minimum, and it would appear Arkham Origins is using a 90 percentile or similar; we've also noticed a tendency for the built-in benchmark to report a lower than expected "minimum" that only occurs between scenes. Looking at AMD and Nvidia GPUs, we have two different results. On the Nvidia side, our 97 percentile and the game's reported minimums are relatively close at higher resolutions, while on AMD hardware we see a relatively large discrepancy. This is one of several titles where the built-in benchmark provides somewhat "misleading" (or at least, not entirely correct) minimum frame rate reporting.
Hitman: Absolution shows some of the same issues as Arkham Origins. The supposed minimum frame rates the built-in benchmark reports are either skipping some frames or using a percentile calculation. Interestingly, this appears to benefit AMD GPUs in particular at higher resolutions, where 4K on the 390X has a reported minimum that's 44 percent higher than our average 97 percentile calculation.
Shadow of Mordor reverses the trend, and our average 97 percentile results are universally higher than the reported minimum FPS. This time, however, AMD cards were being hurt by lower reported minimums. The reason is actually pretty straightforward: For the first few seconds after the benchmark begins, AMD GPUs in particular have a frame every half a second or so that takes longer to render. Once all the assets are loaded, things smooth out, but those early dips occur more frequently on Radeons. Also note that we were running this particular benchmark three times in succession at each resolution to try to stabilize the reported minimum FPS; if we had only run the test once, the reported minimums would be substantially lower on all GPUs.
Finally, in Tomb Raider we have a game where the reported minimum FPS closely matches our average 97 percentile calculations. It could be that Tomb Raider was already doing a percentile calculation, or more likely it's because the benchmark scene isn't very dynamic, so the typical minimum FPS occurs for a longer period of time. The game also appears to pre-cache as many assets as possible, so there aren't any unusual spikes in FPS. The net result is that at most resolutions we only see a 1-2fps difference, so the built-in benchmark gives meaningful and consistent results, at least for this particular game.
Of the five games/engines we're testing for this article, Unigine Heaven has the least reliable/useful minimum FPS reporting. At lower resolutions on Nvidia hardware, a few dips in frame rate during scene transitions skew the numbers, so our 97 percentile results are substantially higher. Where things get interesting is on AMD GPUs, where the engine actually appears to miss some of the low frame rates. This might be something a driver update could address, but a detailed graph comparing the 980 Ti and Fury X will help illustrate the current problem better:
We used two graphs because Heaven is a longer benchmark, and looking at the full 260-second chart doesn't clearly show what's happening. Zooming in on the first 20 seconds gives the proper view. Basically, while there are minor fluctuations in FPS on the GTX 980 Ti, the Fury X (along with other AMD GPUs, which show the same issue) has a pattern that repeats every seven frames: one slow-to-render frame, one fast-to-render frame (most likely a runt frame), and then five frames at a mostly consistent rate. Right now, this is occurring throughout the 260-second test sequence. The result is a stuttering frame rate that's noticeable to the user, though thankfully Unigine Heaven is more of a tech demo/benchmark than an actual game.
Putting It All Together
"Hey, you in the back... WAKE UP!" Okay, sorry for the boring math diatribe, but sometimes it's important to understand what's going on and what it really means. This isn't intended as any form of manifesto on frame rates, and in fact this is a topic that has come up before. Nvidia even helped create some hardware and software to better report on what is happening on the end-user screen, called FCAT (Frame Capture Analysis Tool), but frankly it can be a pain to use. Our reason for talking about this is merely to shed some light, once more, on the importance of consistent frame rates.
We've been collecting data for a little while now, and since we were already running FRAPS for certain tests, it makes sense to use it in other places where the data is helpful. We're planning to start using our 97 percentile "average minimum FPS" results for future GPU reviews, as it will help to eliminate some discrepancies in the reported frame rates from certain games (see above). It shouldn't radically alter our conclusions, but if there are driver issues (e.g., AMD clearly has something going wrong with Unigine Heaven 4.0 right now), looking at the reported minimum FPS along with the 97 percentile will raise a red flag.
If you read one of our GPU reviews in the near future and wonder why some of the minimum FPS results changed, this is why. It will also explain why some of our results won't fully line up with other results you might see reported-we're not willing to trust the reported minimums.
And if you really want to know how to calculate a similar number... fire up Excel, open your FRAPS frametimes CSV, and calculate the individual frame times in column C (e.g., C3 = B3-B2). Copy that formula down column C until the end of the collected data. Then the "average 97 percentile" is as follows, where "[C Data]" is the range of cells in column C containing individual frame times (e.g., "C3:C9289"):
=COUNTIF([C Data], ">"&PERCENTILE.INC([C Data], 0.97))/SUMIF([C Data], ">"&PERCENTILE.INC([C Data], 0.97))*1000
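If you'd rather skip the spreadsheet, the same calculation can be sketched in Python. This is a rough port under a few assumptions: the FRAPS file has one header row with cumulative times in milliseconds in the second column, and the percentile uses the linear interpolation that PERCENTILE.INC performs. The function names are ours, and the demo data is made up:

```python
import csv


def load_timestamps(path):
    """Read cumulative timestamps (ms) from the second column, skipping the header."""
    with open(path, newline="") as handle:
        rows = list(csv.reader(handle))
    return [float(row[1]) for row in rows[1:] if len(row) > 1]


def percentile_inc(values, p):
    """Linearly interpolated percentile, matching Excel's PERCENTILE.INC."""
    ordered = sorted(values)
    position = p * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (position - lower) * (ordered[upper] - ordered[lower])


def average_97_percentile(timestamps_ms, p=0.97):
    """Average FPS of frames whose frame time exceeds the p-th percentile."""
    # Column C in the spreadsheet: each timestamp minus the previous one.
    frame_times = [b - a for a, b in zip(timestamps_ms, timestamps_ms[1:])]
    cutoff = percentile_inc(frame_times, p)
    slow = [t for t in frame_times if t > cutoff]  # same ">" test as COUNTIF/SUMIF
    if not slow:
        # Every frame took the same time; Excel would show #DIV/0! here.
        raise ValueError("no frames are slower than the percentile cutoff")
    return len(slow) / sum(slow) * 1000


# Made-up demo: 97 frames at 10 ms, two at 20 ms, and one 100 ms hitch.
timestamps = [0.0]
for frame_time in [10.0] * 97 + [20.0, 20.0, 100.0]:
    timestamps.append(timestamps[-1] + frame_time)

print(round(average_97_percentile(timestamps), 1))  # 21.4
```

With a real log, something like average_97_percentile(load_timestamps("frametimes.csv")) would do the job, where the filename is a placeholder for whatever FRAPS wrote to disk.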