From Enumeration Comes Elucidation
Benchmarks are for the pros, right? Gordon Mah Ung disappears into the maw of the Maximum PC lab for days, and emerges to tell the world whether the next CPU or chipset matters. I'm often hunkered down in the basement lab at the Case House, running endless series of games and 3D tests on graphics cards to find that sweet card just right for your budget.
Every now and then, though, you need to check the performance of your system. Maybe it seems to be running sluggishly. Perhaps you just got a new graphics card, or doubled your installed DRAM. So you want to run some quick performance tests to see if your system is indeed more sluggish than before, or faster with that upgrade.
What you want to do is run the appropriate benchmark. Benchmarks are simply standardized methods for testing performance. They may be standalone applications specifically designed to test performance of a particular component (like a graphics card) or the entire system. Another type of benchmark uses an actual application as the test, but these often tell you only how your system or component behaves with that one application.
This isn't a comprehensive tutorial on how to run benchmarks for repeatable results; if you want to know the skinny on benchmarking methodologies, check out Gordon's article on that topic. Instead, we'll be diving into the world of quick and dirty benchmarking: testing your system as a quick way to see the impact of changes, or as a troubleshooting tool.
In addition, this is benchmarking on a budget. We'll be using benchmarks or applications available at the best possible price: free. In some cases, the free benchmark may be a stripped-down version of a more robust test; if so, we'll mention that. Our trip down benchmarking lane is also divided up by categories: CPU, graphics, storage, and system tests.
But before diving into the specifics of individual tests, let's take a look at the two key reasons an everyday user might want to run benchmarks. Let's begin, shall we?
Something Changed
Something in your system has changed. Maybe it was an intentional change on your part—you added more DRAM, dropped in a second graphics card, or finally plunked down some hard earned coin for a shiny new SSD.
Maybe you're tweaking your system, trying to overclock it to the maximum stable setting. In that case, you need to make one change at a time, and then run a couple of benchmarks. For maximum stability testing, you may want to run them for several hours.
Or perhaps the change is something you noticed. Your system suddenly takes longer to load applications—or is it just your imagination? Your frame rates go from butter smooth to chunky monkey—or maybe it's just that new game?
Whatever the change, it's time to run a few performance tests. The right benchmark may tell you if that new SSD really is faster than your old 10,000RPM Raptor RAID 0 setup. Or it might tell you that something's gone wrong with your graphics setup, if 3D benchmarks are suddenly in the tank compared to a few weeks back.
New System Baseline
You've just finished installing Windows 7, along with the 3,542 required updates. All the drivers are current, and your new system is humming along nicely. What's the first application you should install?
Why, a benchmark of course. Better yet, install several.
Some would say you should install some type of dedicated burn-in application. However, I've discovered over the years that a good benchmark, or sometimes multiple benchmarks run at the same time, makes an excellent burn-in app.
You also want to set a performance baseline. For that, you need to run a systems benchmark, a storage test and a 3D benchmark. Depending on what applications you run most often, it may also be worth running benchmarks that replicate how those apps work—game benchmarks if you're a PC gamer, or a benchmark like Cinebench if you're into 3D modeling and rendering.
You want to run those tests before you clutter your system up with applications, many of which may run background applets or pre-load DLLs during system boot. That way, you establish a baseline for the performance of your new system. So if you get to a point where the system seems to get sluggish, you can run those same tests and compare to the original results. My general rule of thumb is that if my heavily loaded system that's been running actively for six months or more is running less than 10% slower than the baseline, I'm still good. If the overall slowdown exceeds that, I begin to look at ways to declutter the system.
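That rule of thumb is simple arithmetic, and it's easy to script once you've written your scores down. Here's a minimal Python sketch, assuming higher scores mean better performance. The benchmark names and numbers are made up for illustration, and `slowdown_percent` and `needs_declutter` are hypothetical helper names, not part of any benchmark tool.

```python
def slowdown_percent(baseline, current):
    """Return how much slower (in percent) the current score is than the baseline.

    Assumes higher scores mean better performance. A negative result means
    the system is now faster than it was at baseline.
    """
    if baseline <= 0:
        raise ValueError("baseline score must be positive")
    return (baseline - current) / baseline * 100.0


def needs_declutter(baseline_scores, current_scores, threshold=10.0):
    """List the benchmarks whose slowdown exceeds the threshold (in percent)."""
    flagged = []
    for name, baseline in baseline_scores.items():
        if slowdown_percent(baseline, current_scores[name]) > threshold:
            flagged.append(name)
    return flagged


# Made-up scores, for illustration only.
baseline = {"system": 4000, "storage": 2000, "3d": 10000}
today = {"system": 3700, "storage": 1500, "3d": 9800}

for name in baseline:
    print(f"{name}: {slowdown_percent(baseline[name], today[name]):.1f}% slower")
print("Over threshold:", needs_declutter(baseline, today))
```

The default threshold matches the 10% rule of thumb; pass a different `threshold` if you're more or less tolerant of slowdown.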
Okay, you've got your baseline performance measured and recorded. But what benchmarks should you use for your baseline? Let's look at a few.
System Benchmarks
System benchmarks typically generate a score that's an aggregate of different performance metrics, including CPU, memory, storage, graphics, and some tests that reflect combinations of those individual subsystems, like video playback. There are numerous system benchmarks available, of varying pedigree and cost.
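How a given benchmark rolls its subtests into one number varies, and the details aren't covered here. As a rough sketch of one common approach (not any particular vendor's formula), you can divide each result by a reference system's result and take the geometric mean, so that no single subtest dominates the total. Every name and number below is hypothetical.

```python
import math


def aggregate_score(results, reference, scale=1000.0):
    """Combine subsystem results into one score using a geometric mean.

    Each result is divided by the reference system's result for the same
    test, so a machine identical to the reference scores exactly `scale`.
    Assumes higher is better for every test.
    """
    ratios = [results[name] / reference[name] for name in reference]
    log_mean = sum(math.log(r) for r in ratios) / len(ratios)
    return scale * math.exp(log_mean)


# Hypothetical subtest results, for illustration only.
reference = {"cpu": 100.0, "memory": 50.0, "storage": 200.0, "graphics": 80.0}
my_system = {"cpu": 120.0, "memory": 50.0, "storage": 400.0, "graphics": 80.0}

print(f"Aggregate score: {aggregate_score(my_system, reference):.0f}")
```

Notice that doubling storage performance alone moves the total far less than doubling everything would, which is one reason a single aggregate score can hide a big change in one subsystem.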
For general system benchmarks, I've gradually settled on Futuremark's PCMark. It's not perfect, but the free basic version of the most current release, PCMark 7, exercises your system pretty well. I've used other tests, like PassMark and NovaBench, but none of those seem to really thrash the system. PCMark can also be a great stability test: if PCMark crashes, it's because something is dodgy with your system, not the benchmark, provided the benchmark has been properly installed and updated.
PCMark 7. The free, basic version only runs the PCMark Suite and generates a single score.
PCMark is considered a synthetic benchmark, since it's not an actual application, but it is built with actual instructions recorded from using the built-in applications that ship with Windows.
If you want to upgrade to the advanced version, though, it costs $40. That's certainly cheaper than the $250 PC World charges for a single user version of PC WorldBench 6. The upside to spending the $40 is that you also gain access to one of the best applications-based disk benchmarks around. (The basic version doesn't give you access to the storage test.) The PCMark 7 storage test is built on Intel's original RankDisk benchmark, and uses recorded behavior from actual applications to hammer the drives.
Another useful (and free) system test is PC Wizard, from the same people that bring you CPU-Z, the popular CPU and system ID tool. PC Wizard also gives you a set of individual tests for different subsystems that can be useful.
PC Wizard generates a graphical result based on a reference system.
There are other free benchmarks available, as we noted, but most don't really exercise a Windows 7 system particularly well. Phoronix builds a benchmark that works with both Linux and Windows, but as with many similar open projects, it requires a certain amount of manual effort and script writing to get it to work.
CPU and Memory Benchmarks
Sometimes it's worth testing just the CPU. I once noticed that my system seemed to suddenly be running sluggishly. After I ran a couple of CPU tests and compared the results to my baseline, the CPU seemed to be running almost 50% slower. At first, I suspected heat throttling, but the core temperature was around 40 degrees C at idle. Then I discovered that the BIOS update I'd installed had reset the CPU to its lowest clock speed. This would never happen to a retail CPU, but the new BIOS didn't know how to handle the engineering sample CPU I was running, and set it to the clock frequency of the slowest product it knew.
There are several useful CPU-specific benchmarks worth using. One old standby is Prime95. Prime95 is a pretty geeky benchmark, spitting out a set of results for different sets of fast Fourier transforms, and running different thread and core counts before completion. It's probably more useful as a stress test. One torture test I often run when burning in a system is to run a combination of Prime95 in blend mode while simultaneously running the older 3DMark 2006 benchmark. If the entire system can run those two benchmarks in concert for a few hours, the system will almost certainly be stable for gaming.
Prime95 is pretty geeky, but useful for torture testing.
The processor test built into PC Wizard, mentioned earlier, is also a good, relatively quick test. It's not as useful for torture testing, but sometimes you just need some quick performance validation. I'm not a big believer in memory benchmarks. Those results often don't translate well into actual performance. But the memory and latency tests in PC Wizard, like the CPU tests, are good for validating what your memory is doing.
The PC Wizard CPU test is pretty synthetic, but can be useful.
Storage
Unless you're doing a lot of comparative tests of hard drives or SSDs, you probably won't be running drive benchmarks often. They're not that useful as diagnostic tools, either. Drive failures tend to be sudden, and running a benchmark on a drive that's ready to fail might just push it over the edge.
Benchmarks can be useful in helping you decide when it's time to de-clutter your SSD. As SSDs get close to being full, performance may radically fall off. If the result of a storage benchmark is much slower than your baseline test—20% slower or more—it's time to clean the crud off your drive.
Given that you're not going to run disk benchmarks frequently, a free benchmark sounds like a good idea. The problem: most free storage benchmarks aren't all that robust. One popular test, CrystalDiskMark, is a synthetic test, but has been updated to work more effectively with SSDs and will even run on the Windows 8 developer preview.
This test is often run at trade shows by hard drive and SSD manufacturers.
Graphics & Game Benchmarks
There are more graphics benchmarks than you can shake a stick at, and many of them are free. Most of these have only marginal utility, because they don't always tell you how well your system might behave in an actual game. What is cool is that benchmarks like the various versions of 3DMark make it easy for you to compare your system with thousands of other systems, since they give you the option of collecting data online.
Benchmarks like 3DMark and Unigine Heaven are designed to completely thrash your graphics card. The difference between these tests and a real game is like the difference between putting your car on a dynamometer to get a theoretical idea of how well it can perform and actually running it on a road course.
The free, basic version of 3DMark only gives you a score.
Heaven is similar in some ways to 3DMark, but based on an actual game engine.
On the other hand, no one game benchmark is perfect. They can only give you a rough idea of how well your system might do in a particular game or genre. So you really need to run game tests that reflect your tastes in gaming. Running first person shooter benchmarks won't help you determine how your system might run a real time strategy game, for example.
The other problem with game benchmarks is more subtle. Running one or two tests and looking at average frame rates doesn't always give you a good feel for how a game might play, as The Tech Report discovered. Game benchmarks that generate average frame rate results are useful for comparing different cards, but if you want to really dive in and see how a game runs, you may need to actually run that particular game.
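To see why an average can mislead, it helps to look at individual frame times instead. The Python sketch below is only an illustration with made-up numbers: two runs average the same frame rate, but one of them has occasional long frames you'd feel as stutter. The helper name `frame_summary` and the choice of a 99th percentile frame time are assumptions for this example, not a description of how any specific tool or site reports results.

```python
def frame_summary(frame_times_ms):
    """Summarize a run from its per-frame render times in milliseconds."""
    if not frame_times_ms:
        raise ValueError("need at least one frame time")
    count = len(frame_times_ms)
    total_seconds = sum(frame_times_ms) / 1000.0
    ordered = sorted(frame_times_ms)
    # Nearest-rank 99th percentile: the frame time that 99% of frames beat.
    rank = (99 * count + 99) // 100  # integer ceiling of 0.99 * count
    return {
        "avg_fps": count / total_seconds,
        "min_fps": 1000.0 / ordered[-1],  # slowest frame
        "max_fps": 1000.0 / ordered[0],   # fastest frame
        "p99_frame_ms": ordered[rank - 1],
    }


# Made-up runs: both take 2 seconds to draw 100 frames.
smooth = [20.0] * 100
spiky = [15.0] * 95 + [115.0] * 5

for label, run in (("smooth", smooth), ("spiky", spiky)):
    s = frame_summary(run)
    print(f"{label}: avg {s['avg_fps']:.1f} fps, min {s['min_fps']:.1f} fps, "
          f"99th percentile frame time {s['p99_frame_ms']:.1f} ms")
```

Both runs report the same average, yet the spiky one spends a handful of frames crawling along at under 10 frames per second.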
The other problem is that most games don't have in-game benchmarks. Those that do are often simplistic, running a pre-collected bunch of frames and giving you an average frame rate. One step up is when the game test shows you minimum, maximum, and average frame rates. The best game benchmarks give you both a front end to launch the benchmark and graphical results. One example is the benchmark applet that ships with Metro 2033.
I wish every game shipped with a benchmark launcher like this.
The results screen for Metro 2033 gives you lots of information on the run.
The real issue, though, for the casual benchmarker, is that you often have to own the game to run the benchmark. If all you want to do is run a few tests to get a rough idea of how your system might perform, dropping $50 on a game just to run performance tests seems a little extreme—especially when you consider the technology in games varies considerably from one title to the next.
There are free, standalone game benchmarks available, however. Here are several good ones.
- S.T.A.L.K.E.R.: Call of Pripyat. This is one of the early DX11 game benchmarks. Offers a nice launcher applet, too.
The Call of Pripyat test ships with a nice launcher and lets you easily tweak settings before you run it.
- Aliens vs. Predator. Another DX11 benchmark. Requires you to edit separate batch and config files to run.
- Dawn of War II. The demo for the original Dawn of War II is available on Steam. It has a built-in DX10 benchmark that's very CPU intensive, but does scale somewhat with graphics cards.
- BattleForge. BattleForge features a short, built-in benchmark. You need to download the game client, install it, and then download and install the high resolution pack if you want to run the benchmark in full DX11 mode.
We can't finish our discussion on game benchmarking without talking about Fraps. Fraps is a tool that lets you benchmark any game. You run Fraps, set up recording parameters, then play the game. The problem with Fraps is that you often need to run through the game a number of times to get a repeatable result. But it's a useful tool if you've got the patience.
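One way to judge whether your manual runs are repeatable is to check how much the average frame rate varies from run to run. Here's a small Python sketch, assuming you've typed in the average frame rate from each run by hand; it doesn't read any Fraps output files. The 3% cutoff and the helper names are arbitrary assumptions for illustration.

```python
import statistics


def run_spread(avg_fps_per_run):
    """Return (mean fps, run-to-run variation as a percent of the mean)."""
    if len(avg_fps_per_run) < 2:
        raise ValueError("need at least two runs to measure variation")
    mean = statistics.mean(avg_fps_per_run)
    stdev = statistics.stdev(avg_fps_per_run)  # sample standard deviation
    return mean, stdev / mean * 100.0


def is_repeatable(avg_fps_per_run, max_variation_percent=3.0):
    """True if the runs agree with each other closely enough to trust."""
    _, variation = run_spread(avg_fps_per_run)
    return variation <= max_variation_percent


# Made-up results from five play-throughs of the same level.
runs = [58.2, 60.1, 59.4, 61.0, 59.8]
mean, variation = run_spread(runs)
print(f"mean {mean:.1f} fps, variation {variation:.1f}%")
print("Repeatable enough:", is_repeatable(runs))
```

If the variation is too high, keep running the same stretch of the game the same way until the numbers settle down.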
Application Benchmarks
If you want to see how your system runs a particular application, you'll need to run that application. Many are available as demos, so you can often download them and see how well they run. The downside is that quite a few lack built-in benchmarks.
User communities are often helpful. If you're running a professional graphics application or high end image editing tool, user communities often have links to application tests you can run. Bear in mind that many of these take some work to run.
You can find a few benchmarks specifically designed for certain professional applications. SPEC, the Standard Performance Evaluation Corporation, offers a number of benchmarks for high end applications like 3ds Max, SolidWorks, and LightWave, but they all assume you actually have the application, and just supply scripts and data files. SPECviewperf is an exception, and can demonstrate how your GPU and system might run a set of these pro apps, but the scenarios are a little artificial.
Also, Cinebench can test how your CPU and GPU work with Maxon's Cinema 4D, but will also give you a clue as to how your system might perform with software 3D rendering in general.
The bottom line is that application benchmarks are complex and difficult, but they're the only way to test how your system might run an application critical to your work.
Final Thoughts
If you're not constantly comparing hardware, then benchmarks aren't something you'll run often. But there's a wealth of tools available to test the performance of your system. They're useful diagnostic tools, can help you hit a stable overclock, and can tell you when your system performance has deteriorated beyond acceptable parameters. Best of all, many of them are free as well as useful.