4K and SLI tested on Nvidia's high-end Maxwell card
Sometimes things don't go according to plan. Both AMD and Nvidia were supposed to have shifted to 20-nanometer parts by now. In theory, a smaller process gets you lower temperatures, higher clock speeds, and quieter operation. Due to circumstances largely out of its control, Nvidia has had to go ahead with a 28nm high-end Maxwell part instead, dubbed GM204. This is not a direct successor to the GTX 780, which has more transistors and texture mapping units, among other things. The 980 is actually the next step beyond the GTX 680, aka GK104, which launched in March 2012.
Despite that, our testing indicates that the GTX 980 can still be perceptibly faster than the GTX 780 and 780 Ti (and AMD's Radeon R9 290 and 290X, for that matter, though there are a couple of games better optimized for Radeon hardware). When 20nm processes become available sometime next year, we'll probably see the actual successor to the GTX 780. But right now, the GTX 980 is here, and comes in at $549. That seems high at first, but recall that the GTX 680, 580, and 480 all launched at $499. And keep in mind that it's a faster card than the 780 and 780 Ti, which currently cost more. (As we wrote this, AMD announced that it was dropping the base price of the R9 290X from $500 to $450, so that war rages on.) The GTX 970 at $329 may be a better deal, but we have not yet obtained one of those for testing.
In other news, Nvidia told us that they were dropping the price of the GTX 760 to $219, and the GTX 780 Ti, 780 and 770 are being officially discontinued. So if you need a second one of those for SLI, now is a good time.
Let's take a look at the specs:
| GTX 980 | GTX 970 | GTX 680 | GTX 780 | GTX 780 Ti | R9 290X |
Generation | GM204 | GM204 | GK104 | GK110 | GK110 | Hawaii |
Core Clock (MHz) | 1126 | 1050 | 1006 | 863 | 876 | "up to" 1GHz |
Boost Clock (MHz) | 1216 | 1178 | 1058 | 900 | 928 | N/A |
VRAM Clock (MHz) | 7000 | 7000 | 6000 | 6000 | 7000 | 5000 |
VRAM Amount | 4GB | 4GB | 2GB/4GB | 3GB/6GB | 3GB | 4GB |
Bus | 256-bit | 256-bit | 256-bit | 384-bit | 384-bit | 512-bit |
ROPs | 64 | 64 | 32 | 48 | 48 | 64 |
TMUs | 128 | 104 | 128 | 192 | 240 | 176 |
Shaders | 2048 | 1664 | 1536 | 2304 | 2880 | 2816 |
SMs | 16 | 13 | 8 | 12 | 15 | N/A |
TDP (watts) | 165 | 145 | 195 | 250 | 250 | 290 |
Launch Price | $549 | $329 | $499 | $649 | $699 | $549 |
On paper, the 980 and 970 don't look like much of a jump from the 680. In fact, the 980 has only 128 shaders (aka "CUDA cores") per streaming multiprocessor (SM), compared to 192 on the Kepler cards. Performance tends to increase with a higher number of shaders per SM, so how did the GTX 980 perform so well in our benches, despite having a worse ratio than all the other Nvidia cards? Well, Nvidia claims that it has improved the performance of each CUDA core by 40 percent. Provided that this calculation is accurate, the GTX 980 effectively has about as many CUDA cores as a 780 Ti. Add the GTX 980's higher clock speeds, and performance should be higher.
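To make that arithmetic concrete, here is a minimal Python sketch. The shader and SM counts come from our spec table, and the 40 percent figure is Nvidia's claim, not something we measured. The function names and the idea of a "Kepler-equivalent" core count are our own shorthand for illustration.

```python
# Shader and SM counts are taken from the spec table in this review.
CARDS = {
    "GTX 980": {"shaders": 2048, "sms": 16},
    "GTX 680": {"shaders": 1536, "sms": 8},
    "GTX 780": {"shaders": 2304, "sms": 12},
    "GTX 780 Ti": {"shaders": 2880, "sms": 15},
}

# Nvidia's claimed per-core improvement for Maxwell (a vendor claim, not a measurement).
MAXWELL_PER_CORE_GAIN = 0.40


def shaders_per_sm(card):
    """Return the number of shaders (CUDA cores) in each SM of the given card."""
    spec = CARDS[card]
    return spec["shaders"] // spec["sms"]


def kepler_equivalent_cores(card, gain=MAXWELL_PER_CORE_GAIN):
    """Scale a Maxwell card's core count by the claimed per-core gain."""
    return round(CARDS[card]["shaders"] * (1 + gain))


if __name__ == "__main__":
    for name in CARDS:
        print(f"{name}: {shaders_per_sm(name)} shaders per SM")
    print("GTX 980, Kepler-equivalent cores:", kepler_equivalent_cores("GTX 980"))
    print("GTX 780 Ti, actual cores:", CARDS["GTX 780 Ti"]["shaders"])
```

By this rough measure the 980 lands within a handful of cores of the 780 Ti, before clock speeds are taken into account.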
You probably also noticed the unusually low price for the GTX 970. The GTX 670 launched at $400 in May 2012, and the GTX 570 launched at $350 in December 2010. Those two earlier cards also had specs closer to those of their bigger brothers. For example, the GTX 570 had 480 CUDA cores, while the 580 had 512 cores. This is a difference of just 6.25 percent, although the memory bus was reduced from 384 bits to 320 bits. In contrast, the 970 gets nearly 20 percent fewer CUDA cores than the 980, though its memory bus remains unchanged. As we said, we haven't gotten a 970 in yet, but based on its specs, we doubt that we can close the gap with overclocking, as we've been able to do in the past with the GTX 670 and 760, and the Radeon R9 290.
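The core-count gaps quoted above are simple to verify. This is a small sketch using only the CUDA core counts mentioned in this review; the helper name is ours.

```python
def core_deficit_percent(cut_down_cores, full_cores):
    """Percentage of CUDA cores the cut-down card gives up versus its bigger sibling."""
    return (full_cores - cut_down_cores) / full_cores * 100


if __name__ == "__main__":
    # GTX 570 versus GTX 580
    print(f"570 vs 580: {core_deficit_percent(480, 512):.2f}% fewer cores")
    # GTX 970 versus GTX 980
    print(f"970 vs 980: {core_deficit_percent(1664, 2048):.2f}% fewer cores")
```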
Nvidia also says that the official boost clock on these new Maxwell cards is not set in stone. We witnessed our cards boosting up to 1,253MHz for extended periods of time (e.g., 20 seconds here, 30 seconds there). When the cards hit their thermal limit of 80 degrees Celsius, they would fall as low as 1,165MHz, but we never saw them throttle below the official base clock of 1,126MHz. In SLI, we also noted that the upper card would go up to 84 C. According to Nvidia, these cards have an upper boundary of 95 C, at which point they will throttle below the base clock to avoid going up in smoke. We were not inclined to test that theory, for now.
The company also says that its delta color compression (DCC) algorithms have reduced bandwidth requirements by about 25 percent on average (it varies from game to game). This extra headroom provides more space for increased frame rates. Since DCC directly affects pixels, this effect should scale with your resolution, becoming increasingly helpful as you crank your res higher.
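For a rough sense of what that means, here is a back-of-the-envelope Python sketch. It assumes the VRAM clocks in our spec table are effective data rates, so raw bandwidth is simply data rate times bus width in bytes. The "effective" figure treats Nvidia's 25 percent average as a flat reduction in memory traffic, which is our simplification; real gains vary by game.

```python
# Effective memory data rate (MHz) and bus width (bits), from the spec table.
MEMORY_SPECS = {
    "GTX 980": (7000, 256),
    "GTX 680": (6000, 256),
    "GTX 780": (6000, 384),
    "GTX 780 Ti": (7000, 384),
    "R9 290X": (5000, 512),
}


def raw_bandwidth_gbs(card):
    """Raw memory bandwidth in GB/s: transfers per second times bytes per transfer."""
    clock_mhz, bus_bits = MEMORY_SPECS[card]
    return clock_mhz * 1e6 * (bus_bits // 8) / 1e9


def effective_bandwidth_gbs(card, traffic_reduction=0.25):
    """Assumption: a flat cut in memory traffic acts like a proportionally wider pipe."""
    return raw_bandwidth_gbs(card) / (1 - traffic_reduction)


if __name__ == "__main__":
    for name in MEMORY_SPECS:
        print(f"{name}: {raw_bandwidth_gbs(name):.0f} GB/s raw")
    print(f"GTX 980 with DCC (assumed): {effective_bandwidth_gbs('GTX 980'):.0f} GB/s")
```

Even with that generous assumption, the 980 still trails the raw figures for the 384-bit and 512-bit cards.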
You can also combine these gains with Nvidia's new Multi-Frame Sampled Anti-Aliasing (MFAA). This technique rotates a pixel's sampling points from one frame to the next, so that two of these points can simulate the visual results of four sampling points whose locations remain static. The effect starts to shimmer at about 20FPS, whereupon it's automatically disabled. But when running well, Nvidia claims that it can be 30 percent faster, on average, than the visually equivalent level of Multi-Sample Anti-Aliasing (MSAA). Like TXAA (Temporal Anti-Aliasing), this technique won't be available on AMD cards (or if it is, it will be built by AMD from the ground up and called something else).
Unfortunately, MFAA was not available in the version 344.07 beta drivers given to us for testing, but Nvidia said it would be in the driver after this one. This means that the package will not be complete on launch day. Support will trickle down to the older Kepler cards later on. Nvidia hasn't given timelines for specific cards, but it sounded like the 750 and 750 Ti (also technically Maxwell cards) will not be invited to this party.
Another major upgrade is Voxel Global Illumination, or VXGI. Nvidia positions this as the next step beyond ambient occlusion. With VXGI, light bounces off of surfaces to illuminate nooks and crannies that would otherwise not be lit realistically, in real time. Ordinarily, light does not bounce around in a 3D game engine like it does in meatspace. It simply hits a surface, illuminates it, and that's the end. Sometimes the lighting effect is just painted onto the texture. So there's a lot more calculation going on with VXGI.
But Nvidia has not made specific performance claims because the effect is highly scalable. A developer can choose how many cones of light they want to use, and the degree of bounced light resolution (you can go for diffused/blurry spots of light, or a reflection that's nearly a mirror image of the bounced surface), and they balance this result against a performance target. Since this is something that has to be coded into the game engine, we won't see that effect right away by forcing it in the drivers, like Nvidia users can with ambient occlusion.
Nvidia is also investing more deeply in VR headsets with an initiative called VR Direct. Its main bullet point is a reduction in average latency from 50ms to 25ms, using a combination of code optimization, MFAA, and another new feature called Auto Asynchronous Warp (AAW). This displays frames at 60FPS even when performance drops below that. Since each eye is getting an independently rendered scene, your PC effectively needs to maintain 120FPS otherwise, which isn't going to be common with more demanding games. AAW takes care of the difference. However, we haven't had the opportunity to test the GTX 980 with VR-enabled games yet.
Speaking of which, Nvidia is also introducing another new feature called Auto Stereo. As its name implies, it forces stereoscopic rendering in games that were not built with VR headsets in mind. We look forward to testing VR Direct at a later date.
Lastly, we also noticed that GeForce Experience can now record at 4K. It was previously limited to 2560x1600.
Until we get our hands on MFAA and DSR (Dynamic Super Resolution), we have some general benchmarks to tide you over. We tested the GTX 980 in two-way SLI and by itself, at 2560x1600 and 3840x2160. We compared it to roughly equivalent cards that we've also run in solo and two-way configs.
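For a sense of how much more work each step up in resolution represents, the pixel counts can be compared directly. This is a small Python sketch; the dictionary keys are just labels we picked.

```python
# Width and height in pixels for the resolutions discussed in this review.
RESOLUTIONS = {
    "1080p": (1920, 1080),
    "2560x1600": (2560, 1600),
    "4K": (3840, 2160),
}


def pixel_count(name):
    """Total pixels per frame at the named resolution."""
    width, height = RESOLUTIONS[name]
    return width * height


def pixel_ratio(higher, lower):
    """How many times more pixels the first resolution has than the second."""
    return pixel_count(higher) / pixel_count(lower)


if __name__ == "__main__":
    print(f"2560x1600 vs 1080p: {pixel_ratio('2560x1600', '1080p'):.2f}x")
    print(f"4K vs 2560x1600:    {pixel_ratio('4K', '2560x1600'):.2f}x")
    print(f"4K vs 1080p:        {pixel_ratio('4K', '1080p'):.2f}x")
```

Frame rates rarely scale perfectly with pixel count, so treat these ratios as a ceiling on the expected difference, not a prediction.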
Here's the system that we've been using for all of our recent GPU benchmarks:
Part | Component |
CPU | Intel Core i7-3960X (at stock clock speeds; 3.3GHz base, 3.9GHz turbo) |
CPU Cooler | Corsair Hydro Series H100 |
Mobo | Asus Rampage IV Extreme |
RAM | 4x 4GB G.Skill Ripjaws X, 2133MHz CL9 |
Power Supply | Thermaltake Toughpower Grand (1,050 watts) |
SSD | 1TB Crucial M550 |
OS | Windows 8.1 Update 1 |
Case | NZXT Phantom 530 |
Now, let's take a look at our results at 2560x1600 with 4xMSAA. For reference, this is roughly twice as many pixels as 1920x1080. So gamers playing at 1080p on a similar PC can expect roughly twice the frame rate, if they use the same graphical settings. We customarily use the highest preset provided by the game itself; for example, Hitman: Absolution is benchmarked with the "Ultra" setting. 3DMark runs the Firestrike test at 1080p, however. We also enable TressFX in Tomb Raider, and PhysX in Metro: Last Light.
| GTX 980 | GTX 680 | GTX 780 | GTX 780 Ti | R9 290X |
Tomb Raider | 33 | 19 | 25 | 27 | 26 |
Metro: Last Light | 46 | 21 | 22 | 32 | 30 |
Batman: Arkham Origins | 75 | 51 | 65 | 78 | 65 |
Hitman: Absolution | 42 | 27 | 40 | 45 | 50 |
Unigine Valley | 45 | 30 | 43 | 48 | 41 |
Unigine Heaven | 39 | 64 | 35 | 39 | 34 |
3DMark Firestrike | 11,490 | 6,719 | 8,482 | 9,976 | 9,837 |
(best scores bolded)
To synthesize the results into a few sentences, we would say that the 980 is doing very well for its price. It's not leapfrogging over the 780 and 780 Ti, but Nvidia indicates that it's not supposed to anyway. It dominates the GTX 680, but that card is also two years old and discontinued, so the difference is not unexpected or likely to change buying habits. The R9 290X, meanwhile, is hitting $430, while the not-much-slower 290 can be had for as little as $340. And you can pick up a 780 Ti for $560. So the GTX 980's price at launch is going to be a bit of a hurdle for Nvidia.
Performance in Metro: Last Light has also vastly improved. (We run that benchmark with "Advanced PhysX" enabled, so the jump suggests that Nvidia has made some optimizations there. Further testing is needed.) Loyal Radeon fans will probably not be swayed to switch camps, at least on the basis of pure performance. Hitman in particular does not appear to favor the Green Team.
We were fortunate enough to obtain a second GTX 980, so we decided to set them up in SLI, at the same resolution of 2560x1600. Here, the differences are more distinct. We've honed the comparison down to the most competitive cards that we have SLI/CF benchmarks for. (Unfortunately, we do not have a second GTX 680 in hand at this time. But judging by its single-card performance, it's very unlikely to suddenly pull ahead.) For this special occasion, we brought in the Radeon R9 295X2, which has two 290X GPUs on one card and has been retailing lately for about a thousand bucks.
| GTX 980 | GTX 780 | GTX 780 Ti | R9 295X2 |
Tomb Raider | 66 | 45 | 56 | 50 |
Metro: Last Light | 70 | 52 | 53 | 48 |
Batman: Arkham Origins | 131 | 122 | 143 | 90 |
Hitman: Absolution | 77 | 74 | 79 | 79 |
Unigine Valley | 80 | 72 | 87 | 41 |
Unigine Heaven | 73 | 60 | 77 | 65 |
3DMark Firestrike | 17,490 | 14,336 | 16,830 | 15,656 |
(best scores bolded)
While a solo GTX 980 is already a respectable competitor for the price, its success is more pronounced when we add a second card, as is the gap between it and the 780 Ti. It continues to best the GTX 780, getting us over 60 FPS in each game with all visual effects cranked up. That's an ideal threshold. It also looks like Nvidia's claim of 40 percent improved CUDA core performance may not be happening consistently. Future driver releases should reveal if this is a matter of software optimization, or if it's a limitation in hardware. Or just a random cosmic anomaly.
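To put a number on how well the second card pays off, SLI scaling can be computed from our own GTX 980 results at 2560x1600. In this sketch, a factor of 2.0 means the second card doubled the frame rate. The figures are copied from the two tables above, and 3DMark is left out because it reports a score, not frames per second.

```python
# GTX 980 average frame rates at 2560x1600 with 4xMSAA, from our benchmark tables.
SINGLE_CARD_FPS = {
    "Tomb Raider": 33,
    "Metro: Last Light": 46,
    "Batman: Arkham Origins": 75,
    "Hitman: Absolution": 42,
    "Unigine Valley": 45,
    "Unigine Heaven": 39,
}
SLI_FPS = {
    "Tomb Raider": 66,
    "Metro: Last Light": 70,
    "Batman: Arkham Origins": 131,
    "Hitman: Absolution": 77,
    "Unigine Valley": 80,
    "Unigine Heaven": 73,
}


def sli_scaling(single, dual):
    """Return the two-card speedup per game (2.0 would be perfect scaling)."""
    return {game: dual[game] / single[game] for game in single}


if __name__ == "__main__":
    for game, factor in sli_scaling(SINGLE_CARD_FPS, SLI_FPS).items():
        print(f"{game}: {factor:.2f}x")
```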
So, what happens when we scale up to 3840x2160, also known as "4K"? Here we have about twice as many pixels as 2560x1600, and four times as many as 1080p. Can the GTX 980's 256-bit bus really handle this much bandwidth?
| GTX 980 | GTX 680 | GTX 780 | GTX 780 Ti | R9 290X |
Tomb Raider | 16 | 8.7* | 26 | 28 | 28 |
Metro: Last Light | 36 | 12 | 18 | 19 | 18 |
Batman: Arkham Origins | 35 | 25 | 43 | 44 | 38 |
Hitman: Absolution | 20 | 15 | 37 | 40 | 45 |
Unigine Valley | 19 | 15 | 30 | 30 | 26 |
Unigine Heaven | 19 | 11 | 23 | 23 | 18 |
(best scores bolded)
*TressFX disabled
The 980 is still scaling well, but the 384-bit 780 and 780 Ti are clearly scaling better, as is the 512-bit 290X. Metro: Last Light is the only clear success for the GTX 980. (We don't include 3DMark because it doesn't have an option to run above 2560x1600.) We had to disable TressFX when benchmarking the 680, because the test would crash otherwise, and it was operating at less than 1FPS anyway. At 4K, that card basically meets its match, and almost its maker.
Maybe the 980's story improves if we add a second card? Here's 4K SLI/CrossFire. All tests are still conducted at 4xMSAA, which is total overkill at 4K, but we want to see just how hard we can push these cards. (Ironically, we have most of the CrossFire results for the 290X here, but not for 2560x1600. That's a paddlin'.)
| GTX 980 | GTX 780 | GTX 780 Ti | R9 290X | R9 295X2 |
Tomb Raider | 33 | 41 | 44 | 52 | 53 |
Metro: Last Light | 43 | 21 | 27 | 29 | 26 |
Batman: Arkham Origins | 68 | 99 | 103 | 67 | 66 |
Hitman: Absolution | 42 | 63 | 75 | 75 | 75 |
Unigine Valley | 39 | 43 | 40 | 24 | 19 |
Unigine Heaven | 34 | 33 | 44 | 17 | 34 |
(best scores bolded)
It does appear that the raw memory bandwidth of the 780, 780 Ti, and 290X comes in handy at this resolution, despite the optimizations of Maxwell CUDA cores. That Metro: Last Light score remains pretty interesting. It's the only one we run with PhysX enabled (to balance out using TressFX in Tomb Raider). It really does look like Maxwell is much better at PhysX than any other GPU before it. That tech isn't quite common enough to change the game. But if the difference is as good as our testing indicates, more developers may pick it up.
Even a blisteringly fast card can be brought down by high noise levels or prodigious heat. Thankfully, this reference cooler is up to the task. Keep in mind that this card draws up to 165 watts, and its cooler is designed to handle cards that go up to 250W. But even with the fan spinning up to nearly 3,000rpm, it's not unpleasant. With the case side panels on, you can still hear the fan going like crazy, but we didn't find it distracting. These acoustics only happened in SLI, by the way. Without the primary card sucking in hot air from the card right below it, its fan behaved much more quietly. The GTX 980's cooling is nothing like the reference design of the Radeon R9 290 or 290X.
With a TDP of just 165W, a respectable 650-watt power supply should have no trouble powering two GTX 980s. Meanwhile, the 290-watt R9 290X really needs a nice 850-watt unit to have some breathing room, and even more power would not be unwelcome.
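Here is a rough Python sketch of the power budget behind those recommendations, assuming two cards in each case. The GPU figures are the TDPs from our spec table and the 130W CPU figure is the rated TDP of the Core i7-3960X in our test bench. The 75W allowance for the motherboard, drives, and fans is purely our guess, and TDP is not the same thing as measured draw at the wall.

```python
CPU_TDP_WATTS = 130        # Rated TDP of the Core i7-3960X in our test bench.
OTHER_PARTS_WATTS = 75     # Assumption: motherboard, RAM, drives, fans, and pump.


def estimated_load_watts(gpu_tdp, gpu_count,
                         cpu_tdp=CPU_TDP_WATTS, other=OTHER_PARTS_WATTS):
    """Sum of rated TDPs plus a guessed allowance for everything else."""
    return gpu_tdp * gpu_count + cpu_tdp + other


def headroom_watts(psu_watts, load_watts):
    """Capacity left over on the power supply at the estimated load."""
    return psu_watts - load_watts


if __name__ == "__main__":
    gtx_980_sli = estimated_load_watts(165, 2)
    r9_290x_cf = estimated_load_watts(290, 2)
    print(f"Two GTX 980s: ~{gtx_980_sli}W, {headroom_watts(650, gtx_980_sli)}W spare on a 650W unit")
    print(f"Two R9 290Xs: ~{r9_290x_cf}W, {headroom_watts(850, r9_290x_cf)}W spare on an 850W unit")
```

Under these assumptions, the 980 pair leaves more breathing room on a 650-watt unit than the 290X pair does on an 850-watt one.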
Since MFAA and DSR were not available in the driver that was supplied for testing, there's more story for us to tell over the coming weeks. And we still need to do some testing with VR. But as it stands right now, the GTX 980 is another impressive showing for Nvidia. Its 4K scaling isn't as good as we'd like, especially since Maxwell is currently the only tech that will have Dynamic Super Resolution. If you want to play at that level, it looks like the 290 and 290X are better choices, price-wise, while the overall performance crown at 4K still belongs to the 780 and 780 Ti.
For 2560x1600 or lower resolutions, the GTX 980 emerges as a compelling option, but we're not convinced that it's over $100 better than a 290X. Then again, you have MFAA, DSR, and VR Direct (and the overall GeForce Experience package that's a bit slicker than AMD's Gaming Evolved), which might work for some people, or for Nvidia loyalists who've been waiting for an upgrade from their 680 that's not quite as expensive as the 780 or 780 Ti.
Our amigo Wes Fenlon over at PC Gamer has a write-up of his own, so go check it out.