A deep dive into Intel's Haswell-E
After three long years of going hungry with six cores, red meat is finally back on the menu for enthusiasts. And not just any gamey slab full of gristle with shared cores, either. With its new eight-core Haswell-E CPU, Intel may have served up the most mouth-watering, beautifully seared piece of red meat in a long time.
And it's a good thing, too, because enthusiasts' stomachs have been growling. Devil's Canyon? That puny quad-core was just an appetizer. And that dual-core highly overclockable Pentium K CPU? It's the mint you grab on your way out of the steak house.
No, what enthusiasts have craved, ever since Intel's original clock-blocking job on the original Sandy Bridge-E, is a true, overclockable enthusiast chip with eight cores. So if you're ready for a belt-loosening bellyful of enthusiast-level prime rib, pass the horseradish, get that damned salad off our table, and read on to see if Intel's Haswell-E is everything we hoped it would be.
The Wait Is Over: Haswell-E
The first consumer Intel eight-core arrives at last
Being a member of the PC enthusiast class is not an easy path to follow. Sure, you get the most cores and priciest parts, but it also means you wait a hell of a long time between CPU upgrades. And with Intel's cadence the last few years, it also means you get the leftovers. It's been that way since Intel went with its two-socket strategy with the original LGA1366/LGA1156. Those who picked the big-boy socket always got the shaft.
The original Ivy Bridge in the LGA1155 socket, for example, hit the streets in April of 2012. Intel then rewarded the small-socket crowd with its Haswell in June of 2013. It wasn't until September of 2013 that big-boy socket users got Ivy Bridge-E for their LGA2011s. But with Haswell already tearing up the benchmarks, who the hell cared?
Well, that time has come with Haswell-E, Intel's first replacement for the aging LGA2011 platform since 2011. For the first time since the original Pentium 4 Extreme Edition, paying the premium price actually nets you more: namely, the company's first consumer eight-core CPU.
Meet the T-Rex of consumer CPUs: The Core i7-5960X
Overclocking to 4.5GHz and Beyond
We were a little leery of Haswell when it first launched last year. It was, after all, a chip seemingly tuned for the mobile/laptoppy world we were told was our post-PC apocalyptic future. Despite this, we recognized the chip as the CPU to have for new system builders. Clock for clock, its 22nm process tri-gate transistors put everything else to shame—even the six-core Core i7-3930K chip in many tasks. So it's no surprise that when Intel took a quad-core Haswell, put it in the Xerox machine, and hit the copy button, we'd be ecstatic. Eight cores are decidedly better than six or four cores when you need them.
The cores don't come without a cost though, and we don't mean the usual painful price Intel asks for its highest-end CPUs. It's no secret that more cores means more heat, which means lower clock speeds. That's one of the rationales Intel used with the original six-core Core i7-3960X. Although sold as a six-core, the original Sandy Bridge-E was built using an eight-core die on which Intel had switched off two cores. Intel said it wanted to balance the needs of the many versus the needs of the few—that is, by turning off two of the cores, the part could hit higher clock speeds. Indeed, the Core i7-3960X had a base clock of 3.3GHz and Turbo Boost of 3.9GHz, and most could overclock it to 5GHz. The same chip packaged as a Xeon with all eight cores working—the Xeon E5-2687W—was locked down at 3.1GHz and mostly buzzed along at 3.4GHz.
Fully Unlocked
With the new Core i7-5960X—the only eight-core of the bunch—the chip starts at a seemingly pedestrian 3GHz with a Turbo Boost of one core up to 3.5GHz. Those subsonic clock speeds won't impress against the Core i7-4790K, which starts at 4GHz.
You'll find more on how well Haswell-E performs against Haswell in our performance section, but that's the price to be paid, apparently, to get a chip with this many cores under the heat spreader. Regarding thermals, Intel has increased the TDP rating to 140 watts versus the 130 watts of Ivy Bridge-E and Sandy Bridge-E.
If the low clocks annoy you, the good news is the part is fully unlocked, so overclocking has been approved. For our test units, we had tight deadlines, so we didn't get far with our overclocking efforts. But talking with vendors, most seem pleased with the clock speeds they're seeing. One vendor told us overclocks of all cores at 4.5GHz are already obtainable, with newer microcode updates expected to improve that. With even the Devil's Canyon Core i7-4790K topping out at 4.7GHz to 4.8GHz, 4.5GHz is actually a healthy overclock for an eight-core CPU.
When you dive down into the actual cores though, much is the same, of course. It's based on a 22nm process. It has "3D" tri-gate transistors and integrated voltage regulation. Oh, and it's also the first CPU to feature an integrated DDR4 memory controller.
DDR4 Details
The memory stalemate is over
If you think Haswell-E has been a long wait, just think about DDR3, which made its debut as main memory in 2007 systems. Yes, 2007. The only component that's lasted seven years in most enthusiasts' systems might be the PSU, but even then it's rare to find anyone kicking a 500-watt PSU from 2007 these days. DDR4 has been in gestation seemingly as long, so why the delay? From what we can tell, resistance to yet another new memory standard when people thought the PC was dying has been the root cause of the delay. And it didn't help that no one wanted to stick their head out first. RAM makers didn't want to begin producing DDR4 in volume until AMD or Intel made chipsets for it, while AMD and Intel didn't want to support it because of the costs it would add to PCs. The stalemate finally ends with Haswell-E, which integrates a quad-channel DDR4 memory controller into its die.
Initial launch speeds of DDR4 clock in at DDR4/2133. For those already running DDR3 at 3GHz or higher, a 2,133 data rate is a snooze, but you should realize that anything over 2,133 is overclocked RAM. With DDR4, JEDEC (the body that sets RAM standards) already has target data rates of 3,200 on the map. RAM vendors we've talked to are already shopping DIMMs near that speed.
But the best part of DDR4 may be its density message. For years, consumer DDR3 has topped out at 8GB on a DIMM. With DDR4, we should see 16GB DIMMs almost immediately, and stacking of chips is built into the standard, so it's possible we'll see 32GB DIMMs over its lifetime. On a quad-channel, eight-DIMM motherboard, you should expect to be able to build systems with 128GB of RAM using non-ECC DIMMs almost immediately. DDR4 also brings power savings and other improvements, but the main highlights enthusiasts should expect are higher densities and higher clocks. Oh, and higher prices. RAM prices haven't been fun for anyone of late, but DDR4 will definitely be a premium part for some time. In fact, we couldn't even get exact pricing from memory vendors as we were going to press, so we're bracing for some really bad news.
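The density math is simple enough to sketch. The snippet below is only an illustration: the eight-slot board and the 8GB, 16GB, and 32GB DIMM sizes come from the paragraph above, while the function name is our own invention.

```python
# Capacity sketch: total RAM = number of DIMM slots x capacity per DIMM.
# Slot count and DIMM sizes are taken from the article; the helper name
# max_capacity_gb is hypothetical.

def max_capacity_gb(dimm_slots: int, dimm_size_gb: int) -> int:
    """Return total memory in GB when every slot holds the same size DIMM."""
    return dimm_slots * dimm_size_gb


DIMM_SLOTS = 8  # quad-channel board with two DIMMs per channel

for dimm_size_gb in (8, 16, 32):
    total = max_capacity_gb(DIMM_SLOTS, dimm_size_gb)
    print(f"{DIMM_SLOTS} x {dimm_size_gb}GB DIMMs = {total}GB")
```

With 16GB DIMMs that works out to the 128GB figure quoted above, and 32GB DIMMs would double it.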
PCIe Lanes
Now a feature to be blocked
Over the years, we've come to expect Intel to clock-block core counts, clock speeds, Hyper-Threading, and even cache for "market segmentation" purposes. What that means is Intel has to find ways to differentiate one CPU from another. Sometimes that's by turning off Hyper-Threading (witness Core i5 and Core i7) and sometimes it's by locking down clock speeds. With Haswell-E though, Intel has gone to new heights with its clock-blocking by actually turning off PCIe lanes on some Haswell-E parts.
At the top end, you have the 3GHz Core i7-5960X with eight cores. In the midrange you have the six-core 3.5GHz Core i7-5930K. And at the "low-end" you have the six-core 3.3GHz Core i7-5820K. The 5930K and the 5820K are virtually the same in specs except for one key difference: The PCIe lanes get blocked. Yes, while the Core i7-5960X and Core i7-5930K get 40 lanes of PCIe 3.0, the Core i7-5820K gets an odd 28 lanes of PCIe 3.0. That means those who hoped to build "budget" Haswell-E boxes with multiple GPUs may have to think hard about using the lowest-end Haswell-E chip. The good news is that for most people, it won't matter. Plenty of people run Haswell systems with SLI or CrossFire, and those CPUs are limited to 16 lanes. Boards with PLX switches even support four-way GPU setups.
Still, it's a brain bender to think that when you populate an X99 board with the lowest-end Haswell-E, the PCIe configuration will change. At least the slots will all work, just more slowly. Intel says it worked with board vendors to make sure all the slots will function with the budget part.
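To put the lane counts in perspective, here's a rough sketch of theoretical peak bandwidth per direction. It assumes the standard PCIe 3.0 signaling rate of 8 GT/s per lane with 128b/130b encoding and ignores protocol overhead, so real-world throughput will be lower. The lane counts come from the text above; the names in the code are ours.

```python
# Theoretical PCIe 3.0 bandwidth per direction for each CPU's lane count.
# Assumption: 8 GT/s per lane with 128b/130b encoding, no protocol overhead.

PCIE3_GT_PER_S = 8.0        # giga-transfers per second, per lane
PCIE3_ENCODING = 128 / 130  # usable fraction after line encoding


def pcie3_bandwidth_gbs(lanes: int) -> float:
    """Return peak one-direction bandwidth in GB/s for a given lane count."""
    return lanes * PCIE3_GT_PER_S * PCIE3_ENCODING / 8  # 8 bits per byte


CPU_LANES = {
    "Core i7-5960X": 40,
    "Core i7-5930K": 40,
    "Core i7-5820K": 28,
    "LGA1150 Haswell": 16,
}

for cpu, lanes in CPU_LANES.items():
    print(f"{cpu}: {lanes} lanes, about {pcie3_bandwidth_gbs(lanes):.1f}GB/s")
```

Even the 28-lane part keeps a healthy margin over a 16-lane Haswell chip on paper, which is why most multi-GPU builders shouldn't panic.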
X99
Finally, the chipset high-end enthusiasts want, sort of
You know what we won't miss? The X79 chipset. No offense to X79 owners, but while the Core i7-4960X can stick around for a bit, X79 can take its under-spec'ed butt out of our establishment. Think we're being too harsh? We don't.
X79 has no native USB 3.0 support. And its SATA 6Gb/s ports? Only two. It almost reads like a feature set from the last decade. Fortunately, Intel has gone hog wild in overcompensating for the weaknesses of X79.
X99 has eight USB 2.0 ports and six USB 3.0 ports. For SATA 6Gb/s, Intel adds 10 ports to X99. Yes, 10. That gazongo number, however, is balanced out by two glaring omissions: no official support for SATA Express or M.2, both of which came with Z97. Intel would only say motherboard vendors were free to implement them. We guess Intel left the features off as the firm is a stickler for testing new interfaces before adding official support. At this point, SATA Express has been a no-show. After all, motherboards with SATA Express became available in May with Z97, yet we still haven't seen any native SATA Express drives. We expect most vendors to simply add it through discrete controllers.
Intel overcompensated in SATA on X99 but oddly left SATA Express on the cutting-room floor.
One potential weakness of X99 is Intel's use of DMI 2.0. That offers roughly 2.5GB/s of transfer speed between the CPU and the south bridge or PCH, but with the board hanging 10 SATA devices, USB 3.0, Gigabit Ethernet, and 8 PCIe Gen 2.0 lanes off that link, there's potential for massive congestion in a worst-case scenario. But, says Intel, you can just hang devices off the plentiful PCIe Gen 3.0 from the CPU.
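A back-of-the-envelope sketch shows how lopsided that worst case could get. The device counts and the roughly 2.5GB/s link figure come from the text; the per-device peak rates are our assumptions based on each interface's nominal speed after 8b/10b encoding, and no real system would saturate every port at once.

```python
# Worst-case oversubscription of the CPU-to-PCH link.
# Per-device rates are assumed theoretical peaks, not measured numbers.

DMI_LINK_MBS = 2500  # "roughly 2.5GB/s", per the article

# name: (device count, assumed peak MB/s per device)
PCH_DEVICES = {
    "SATA 6Gb/s ports": (10, 600),
    "USB 3.0 ports": (6, 500),
    "Gigabit Ethernet": (1, 125),
    "PCIe 2.0 lanes": (8, 500),
}


def total_demand_mbs(devices: dict) -> int:
    """Sum the peak demand if every device ran flat out at the same time."""
    return sum(count * rate for count, rate in devices.values())


def oversubscription(devices: dict, link_mbs: int) -> float:
    """Return how many times the peak demand exceeds the link capacity."""
    return total_demand_mbs(devices) / link_mbs


demand = total_demand_mbs(PCH_DEVICES)
ratio = oversubscription(PCH_DEVICES, DMI_LINK_MBS)
print(f"Peak demand: {demand}MB/s against a {DMI_LINK_MBS}MB/s link")
print(f"Worst-case oversubscription: {ratio:.2f}x")
```

Under these assumptions the chipset could ask for about five times what the link can carry, which is why Intel's advice to put fast devices on the CPU's own PCIe 3.0 lanes makes sense.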
That does bring up our last point: the PCIe lanes. There will be confusion over the PCIe lane configuration on systems with Core i7-5820K parts. With only 28 lanes available, there's concern whole slots will be turned off. That won't happen, Intel says. Instead, if you go with the low-rent ride, you simply lose bandwidth, but will still have more than you can get from a normal LGA1150-based Core i7-4770K. It will be confusing, but we expect motherboard vendors to sort it out.
Haswell-E does bring one more interesting PCIe configuration—the ability to run five graphics cards in the PCIe slots at x8 speeds. What for? Intel didn't explain. Maybe mining configurations where miners are already running six GPUs. But mining doesn't seem to need the bandwidth that a x8 slot would provide. The other possibility is a five-way graphics card configuration being planned by Nvidia or AMD. At this point it's just conjecture, but one thing we know is that X99 is a welcome upgrade. Good riddance X79.
Core Competency
How many cores do you really need?
Like great technology philosopher Sir Mix-A-Lot says, we like big cores and we cannot lie. We want as many cores as legally available. But we recognize that not everyone rolls as hard as we do with a posse of threads. With Intel's first eight-core CPU, consumers can now pick from two cores all the way to eight on the Intel side of the aisle—and then there's Hyper-Threading to confuse you even more. So, how many cores do you need? We'll give you the quick-and-dirty lowdown.
Two Cores
Normally, we'd completely skip dual-cores without Hyper-Threading because the parts tend to be the very bottom end of the pool Celerons. Our asterisk is the new Intel Pentium G3258 Anniversary Edition, or "Pentium K," which is a real hoot of a chip. It easily overclocks and is dead cheap. It's not the fastest in content creation by a long shot, but if we were building an ultra-budget gaming rig and needed to steal from the CPU budget for a faster GPU, we'd recommend this one. Otherwise, we see dual-cores as purely ultra-budget parts today.
Two Cores with Hyper-Threading
For your parents who need a reliable, solid PC without overclocking (you really don't want to explain how to back down the core voltage in the BIOS to grandma, do you?), the dual-core Core i3 parts fulfill the needs of most people who only do content creation on occasion. Hyper-Threading adds value in multi-threaded and multi-tasking jobs. You can almost think of these chips with Hyper-Threading as three-core CPUs.
Four Cores
For anyone who does content creation such as video editing, encoding, or even photo editing with newer applications, a quad-core is usually our recommended part. Newer game consoles are also expected to push minimum specs for newer games to quad-cores or more as well, so for most people who carry an Enthusiast badge, a quad-core part is the place to start.
It's indeed a glorious thing to see a task manager with this many threads, but not everyone needs them.
Four Cores with Hyper-Threading
Hyper-Threading got a bad name early on from the Pentium 4 and existing software that actually saw it reduce performance when turned on. Those days are long behind us though, and Hyper-Threading offers a nice performance boost with its virtual cores. How much? A 3.5GHz Core i7 quad-core with Hyper-Threading generally offers the same performance on multi-threaded tasks as a Core i5 running at 4.5GHz. The Hyper-Threading helps with content creation and, we'd say, if content creation is 30 percent or less of your time, this is the place to be. It's the best fit for 90 percent of enthusiasts.
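If you want a rough number for that boost, you can back it out of the clock-speed comparison above. This sketch assumes multi-threaded performance scales linearly with clock speed and that the two chips are otherwise identical quad-cores, which is a simplification; the function name is ours.

```python
# Implied Hyper-Threading uplift from the equivalence in the text:
# a 3.5GHz quad-core with HT roughly matches a 4.5GHz quad-core without it.
# Assumption: performance scales linearly with clock speed.

def implied_ht_uplift(ht_clock_ghz: float, no_ht_clock_ghz: float) -> float:
    """Return the fractional boost HT must provide to close the clock gap."""
    return no_ht_clock_ghz / ht_clock_ghz - 1.0


uplift = implied_ht_uplift(3.5, 4.5)
print(f"Implied Hyper-Threading uplift: about {uplift:.0%}")
```

That works out to a boost in the high-20-percent range on well-threaded workloads. Lightly threaded software won't see anything like it.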
Six Cores with Hyper-Threading
Once you pass the quad-core mark, you are moving pixels professionally in video editing, 3D modeling, or other tasks that necessitate the costs of a six-core chip or more. We still think that for 90 percent of folks, a four-core CPU is plenty, but if losing time rendering a video costs you money (or you're ADD), pay for a six-core or more CPU. How to decide if you need six or eight cores? Read on.
Eight Cores with Hyper-Threading
Not everyone needs an eight-core processor. In fact, one way to save cash is to buy the midrange six-core chip. But, if time is money, an eight-core chip will pay for itself. For example, the eight-core Haswell-E is about 45 percent faster than the four-core Core i7-4790K chip. If your render job is three hours, that's more time working on other paying projects. The gap gets smaller between the six-core and the eight-core, so it becomes about how much your time is worth. To give you an idea, the 3.3GHz Core i7-5960X is about 20 percent faster than the Core i7-4960X running at 4GHz.
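To see what those percentages mean on the clock, here's a small sketch. It reads "45 percent faster" as 1.45 times the throughput, which is our interpretation, and uses the three-hour render from the example above. The function name is hypothetical.

```python
# Time saved on a fixed job when a chip is some percentage faster.
# Assumption: "X percent faster" means (1 + X) times the throughput.

def minutes_saved(job_minutes: float, speedup: float) -> float:
    """Return minutes saved on a job given a fractional speedup."""
    return job_minutes - job_minutes / (1.0 + speedup)


RENDER_MINUTES = 180  # the three-hour render job from the text

for label, speedup in (("vs. Core i7-4790K", 0.45), ("vs. Core i7-4960X", 0.20)):
    saved = minutes_saved(RENDER_MINUTES, speedup)
    print(f"Core i7-5960X {label}: saves about {saved:.0f} minutes")
```

On those assumptions, the eight-core saves close to an hour per three-hour render against the quad-core, and about half an hour against the older six-core. Multiply by your hourly rate to decide whether it pays for itself.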
Intel's Top Guns Compared
The LGA2011-based Core i7-4960X (left) and the LGA2011-v3-based Core i7-5960X (middle) dwarf the Core i7-4790K chip (right). Note the change in the heat spreader between the older 4960X and 5960X, which now has larger "wings" to make it easier to remove the CPU. The breather hole, which allows for curing of the thermal interface material, has also been moved. Finally, while the chips are the same size, they're keyed differently to stop you installing a newer Haswell-E into an older Ivy Bridge-E board.
Benchmarks
Performance junkies, rejoice! Haswell-E hits it out of the ballpark
For our testing, we set up three identical systems with the fastest available CPUs for each platform. Each system used an Nvidia GeForce GTX 780 with the same 340.52 drivers, Corsair 240GB Neutron GTX SSDs, and 64-bit Windows 8.1 Enterprise. Since we've had issues with clock speeds varying on cards that physically look the same, we also verified the clock speeds of each GPU manually and also recorded the multiplier, b-clock, and speeds the parts run at under single-threaded and multi-threaded loads.
So you know, the 3GHz Core i7-5960X would run at 3.5GHz on single-threaded tasks but usually sat at 3.33GHz on multithreaded tasks. The 3.6GHz Core i7-4960X ran everything at 4GHz, including multithreaded tasks. The 4GHz Core i7-4790K part sat at 4.4GHz on both single- and multithreaded loads.
For Z97, we used a Gigabyte Z97M-D3H mobo with a Core i7-4790K "Devil's Canyon" chip aboard. An Asus Sabertooth X79 did the duty for our Core i7-4960X "Ivy Bridge-E" chip. Finally, for our Core i7-5960X chip, we obtained an early Gigabyte X99-Gaming 5 motherboard. The board was pretty early but we feel comfortable with our performance numbers as Intel has claimed the Core i7-5960X was "45 percent" faster than a quad-core chip, and that's what we saw in some of our tests.
One thing to note: The RAM capacities were different but, in the grand scheme of things and for the tests we ran, that has no impact. The Sabertooth X79 had 16GB of DDR3/2133 in quad-channel mode, and the Z97M-D3H had 16GB of DDR3/2133 in dual-channel mode. Finally, the X99-Gaming 5 board had 32GB of Corsair DDR4/2133. All three CPUs will overclock, but we tested at stock speeds to get a good baseline feel.
For our benchmarks, we selected from a pile of real-world games, synthetic tests, as well as real-world applications across a wide gamut of disciplines. Our gaming tests were also run at very low resolutions and low-quality settings to take the graphics card out of the equation. We also acknowledge that people want to know what they can expect from the different CPUs at realistic settings and resolutions, so we also ran all of the games at their highest settings at 1920x1080 resolution, which is still the norm in PC gaming.
We used a Gigabyte X99 motherboard (without the final heatsinks for the voltage-regulation modules) for our testing.
The Results
We could get into a multi-sentence analysis of how it did and slowly break out with our verdict but in a society where people get impatient at the microwave, we'll give you the goods up front: Holy Frakking Smokes, this chip is fast! The Core i7-5960X is simply everything high-end enthusiasts have been dreaming about.
Just to give you an idea, we've been recording scores from $7,000 and $13,000 PCs in our custom Premiere Pro CS6 benchmark for a couple of years now. The fastest we've ever seen is the Digital Storm Aventum II that we reviewed in our January 2014 issue. The 3.3GHz Core i7-5960X was faster than the Aventum II's Core i7-4960X running at 4.7GHz. Again, at stock speeds, the Haswell-E was faster than the fastest Ivy Bridge-E machine we've ever seen.
It wasn't just Premiere Pro CS6 we saw that spread in either. In most of our tests that stress multi-threading, we saw roughly a 45 to 50 percent improvement going from the Haswell to the Haswell-E part. The scaling gets tighter when you're comparing the six-core Core i7-4960X, but it's still a nice, big number. We generally saw a 20 to 25 percent improvement in multithreaded tasks.
That's not even factoring in the clock differences between the parts. The Core i7-4790K buzzes along at 4.4GHz—1.1GHz faster than the Core i7-5960X in multithreaded tasks—yet it still got stomped by 45 to 50 percent. The Core i7-4960X had a nearly 700MHz clock advantage as well over the eight-core chip.
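One crude way to sanity-check those numbers is to multiply cores by the multithreaded clock speeds we recorded and compare the totals. This "core-GHz" figure is our own back-of-the-envelope metric, not a benchmark: it ignores Hyper-Threading, cache, memory bandwidth, and per-clock differences between architectures.

```python
# Crude "core-GHz" comparison using the observed multithreaded clocks.
# Assumption: throughput is proportional to cores x clock speed, which
# ignores architectural and memory differences between the chips.

CHIPS = {
    "Core i7-5960X": (8, 3.33),  # (cores, observed multithreaded GHz)
    "Core i7-4960X": (6, 4.0),
    "Core i7-4790K": (4, 4.4),
}


def core_ghz(chip: str) -> float:
    """Return cores multiplied by clock speed for the named chip."""
    cores, clock_ghz = CHIPS[chip]
    return cores * clock_ghz


def raw_ratio(chip_a: str, chip_b: str) -> float:
    """Return how much aggregate core-GHz chip_a has relative to chip_b."""
    return core_ghz(chip_a) / core_ghz(chip_b)


for rival in ("Core i7-4790K", "Core i7-4960X"):
    ratio = raw_ratio("Core i7-5960X", rival)
    print(f"Core i7-5960X vs. {rival}: {ratio:.2f}x the core-GHz")
```

Against the quad-core, the roughly 1.5x result lines up with the 45 to 50 percent gap we measured. Against the Ivy Bridge-E chip, core-GHz alone predicts only about 11 percent, so the rest of the 20 to 25 percent we saw would have to come from elsewhere, such as Haswell's per-clock improvements.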
The whole world isn't multi-threaded, though. Once we get to workloads that don't push all eight cores, the higher clock speeds of the other parts predictably take over. ProShow Producer 5.0, for example, has never pushed more than four threads and we saw the Core i7-5960X lose by 17 percent. The same happened in our custom Stitch.EFx 2.0 benchmark, too. In fact, in general, the Core i7-4790K will be faster thanks to its clock speed advantage. If you overclocked the Core i7-5960X to 4GHz or 4.4GHz on just four cores, the two should be on par in pure performance on light-duty workloads.
In gaming, we saw some results from our tests that are a little bewildering. At low-resolution and low-quality settings, where the graphics card was not the bottleneck, the Core i7-4790K had the same 10 to 20 percent advantage. When we ran the same tests at ultra and 1080p resolution, the Core i7-5960X actually had a slight advantage in some of the runs against the Core i7-4790K chip. We think that may be from the bandwidth advantage the 5960X has. Remember, we ran all of the RAM at 2,133, so it's not DDR4 versus DDR3. It's really quad-channel versus dual-channel.
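The channel math behind that hunch can be sketched quickly. The snippet assumes the standard 64-bit (8-byte) width of a DDR3 or DDR4 channel and computes theoretical peak bandwidth only; sustained real-world figures are lower. The function name is ours.

```python
# Theoretical peak memory bandwidth: data rate x bytes per transfer x channels.
# Assumption: each DDR3/DDR4 channel is 64 bits (8 bytes) wide.

BYTES_PER_TRANSFER = 8


def peak_bandwidth_gbs(data_rate_mts: int, channels: int) -> float:
    """Return theoretical peak bandwidth in GB/s."""
    return data_rate_mts * BYTES_PER_TRANSFER * channels / 1000


DATA_RATE_MTS = 2133  # all three test systems ran their RAM at 2,133

for platform, channels in (("Z97, dual-channel", 2), ("X99, quad-channel", 4)):
    bandwidth = peak_bandwidth_gbs(DATA_RATE_MTS, channels)
    print(f"{platform}: about {bandwidth:.1f}GB/s peak")
```

At the same data rate, quad-channel doubles the theoretical peak, from roughly 34GB/s to roughly 68GB/s. Whether games actually use that headroom is another question, so treat our explanation as a theory rather than a proven cause.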
We actually put a full breakdown of each of the benchmarks and detailed analysis on www.maximumpc.com if you really want to nerd out on the performance.
What You Should Buy
Let's say it again: The Core i7-5960X stands as the single fastest CPU we've seen to date. It's simply a monster for performance in multi-threaded tasks and we think that once you've overclocked it, it'll be as fast as all the others in tasks that aren't thread-heavy workloads.
But Is It for You?
That performance, however, doesn't mean everyone should start saving to buy a $1,000 CPU. No, for most people, the dynamic doesn't change. For the 80 percent of you who fall into the average Joe or Jane nerd category, a four-core with Hyper-Threading still offers the best bang for the buck. It won't be as fast as the eight-core, but unless you're really working your rig for a living, made of money, or hate for your Handbrake encodes to take that extra 25 minutes, you can slum it with the Core i7-4790K chip. You don't even have to heavily overclock it for the performance to be extremely peppy.
For the remaining 20 percent who actually do a lot of encoding, rendering, professional photo editing, or heavy multi-tasking, the Core i7-5960X stands as the must-have CPU. It's the chip you've been waiting for Intel to release. Just know that at purely stock speeds, you do give up performance to the Core i7-4790K part. But again, the good news is that with minor overclocking tweaks, it'll be the equal or better of the quad-core chip.
What's really nice here is that for the first time, Intel is giving buyers of its "Extreme" SKU something truly extra for the $999 they spend. Previous Core i7 Extreme parts have always been good overclockers, but a lot of people bypassed them for the midrange chips such as the Core i7-4930K, which gave you the same core counts and overclocking to boot. The only true differentiation Extreme CPU buyers got was bragging rights. With Haswell-E, the Extreme buyers are the only ones with eight-core parts.
The Upgrade Dilemma
Bang-for-the-buck buyers also get a treat from the six-core Core i7-5820K chip. At $389, it's slightly more expensive than the chip it replaces (the $323 Core i7-4820K), but the extra price nets you two more cores. Yes, you lose PCIe bandwidth but most people probably won't notice the difference. We didn't have a Core i7-5820K part to test, but we believe, based on our testing with the Core i7-5960X, that minor overclocking on the cheap Haswell-E would easily make it the equal of Intel's previous six-core chips that could never be had for less than $580.
And that, of course, brings us to the last point of discussion: Should you upgrade from your Core i7-4960X part? The easy answer to that is no. In pure CPU-on-CPU showdowns, the Core i7-4960X is about 20 percent slower in multi-threaded tasks, and in light-duty threads it's about the same, thanks to the clock-speed advantage the Core i7-4960X has. There are two reasons you might want to toss aside the older chip, though. The first is X79's pathetic count of SATA 6Gb/s ports, which, frankly, you actually need on a heavy-duty work machine. The second reason applies to folks for whom a 20 percent reduction in rendering time would actually be worth paying for.