Here I will present Haswell's features, along with my thoughts at the end of each section.
CPU:
-Better branch prediction
-Larger OoO window and more resources
-L2 TLB: size and associativity double to 1024 entries and 8 ways, shared between 4K and 2M pages
-Lower VT latencies
-Port 0 and 1 add the ability to do 256-bit FMA
-New Port 6 adds extra ALU and branch unit
-New Port 7 adds extra store address unit
-Doubled read and write bandwidth for the L1 cache
-Improved cache line latency and throughput
-L2 cache bandwidth doubled
-Same branch misprediction penalty and same cache latencies
-New integer instructions: indexing and hashing, cryptography, endian conversion (MOVBE)
-CRC, SHA-256, SHA-256 MultiBuffer, AES GCM, RSA-2K performance greatly improved over Ivy Bridge
-Per slice LLC access bandwidth improved
-Improved load balancing in the System Agent
-Improved DRAM write throughput
I wasn't expecting a big change until I saw Fudzilla's "at least 10% gain" claim. This is a comprehensive change. The addition of ports goes even further than what the Sandy Bridge architecture went through. Impressive. I expect that another 10-20% gain, like the one Sandy Bridge achieved over Nehalem/Westmere, is possible again.
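To put the 256-bit FMA on Port 0 and 1 in perspective, here is a rough back-of-the-envelope sketch of peak floating-point operations per cycle per core. The helper name `peak_flops_per_cycle` is made up for illustration, and the Ivy Bridge baseline (one 256-bit multiply port plus one 256-bit add port) is my assumption, not something from the slides. It only shows theoretical peak, not what real code will get.

```python
def peak_flops_per_cycle(vector_bits, element_bits, ports, flops_per_lane):
    """Theoretical peak floating-point ops per cycle for one core.

    Hypothetical helper for illustration only.
    """
    lanes = vector_bits // element_bits  # elements per vector register
    return lanes * ports * flops_per_lane


# Haswell: two 256-bit FMA ports (0 and 1); an FMA counts as 2 flops per lane.
haswell_sp = peak_flops_per_cycle(256, 32, ports=2, flops_per_lane=2)
haswell_dp = peak_flops_per_cycle(256, 64, ports=2, flops_per_lane=2)

# Assumed Ivy Bridge baseline: one 256-bit multiply port and one 256-bit
# add port, each doing 1 flop per lane.
ivb_sp = peak_flops_per_cycle(256, 32, ports=2, flops_per_lane=1)
ivb_dp = peak_flops_per_cycle(256, 64, ports=2, flops_per_lane=1)

print(f"Haswell peak: {haswell_sp} SP / {haswell_dp} DP flops per cycle")
print(f"Ivy Bridge (assumed) peak: {ivb_sp} SP / {ivb_dp} DP flops per cycle")
```

Under those assumptions the theoretical peak doubles, but only for code that can actually use FMA.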
GPU:
-DX11.1, OpenCL 1.2, OpenGL 4.0 support
-Adds a Resource Streamer to reduce driver overhead
-Performance of most fixed-function units doubled (for GT3)
-GT3 literally doubles the back end of the iGPU, which includes the Rasterizer, Z, Stencil, Color Blend, Samplers
-Up to 4x improvement in sampler performance over Ivy Bridge (which will be doubled again with GT3)
Good to see that it's not a direct scale-up of Ivy Bridge units, and a big improvement on its own as well. The one I like a lot is the Resource Streamer to reduce driver overhead.
Media:
-SVC codec
-MJPEG decode & MPEG2 encode
-Higher encoding quality
-Stand-alone VQE(Video Quality Engine)
-Frame rate conversion and image stabilization
-4Kx2K video playback
-Improved encoder quality and new video processing functions
Can't tell much about media units, but seems like a step forward. This area is best shown by real life examples.
Power management:
-Voltage Regulator integration
-Cores separated from LLC+Ring for frequency
-C-state transition times improved by 25%
-GT3 can power gate a slice
This is the biggest improvement for Haswell, by far. As outlined here: http://forums.anandtech.com/showthread.php?t=2241480&highlight=overview+power+management
Here's an interesting fact though. Intel is saying the 4th Generation Core "M" series will have 20% reduced idle power consumption. The "U" series, though, will have >20x reduced idle power consumption. They say they want to reach S0-style wake-up with S3-style power usage.
How significant is this? Datasheets show that the CPU-alone C7 package power for a 17W Ivy Bridge CPU is 2.2W. Haswell will be under 200mW, not just for the CPU but for the entire platform sans the display. The reason for such a big drop is that we move from a system where basically only the CPU scales its power usage to one where the entire system scales based on demand. We'll see (for Ultrabooks anyway) deep power states on SSDs, PCI Express, USB, and other things that go way below what they do today.
Idle power with the screen off is known to be 3-4W on an Ultrabook-style system. The screen adds another 2-3W at idle, for a total of 5-7W. Even if the screen isn't improved, screen-on idle power would drop from the current 5W to about 2.2W (roughly 2W for the screen plus under 200mW for the rest of the platform). Combine that with Haswell's support for Panel Self Refresh, which saves ~500mW, and by this calculation we end up with 1.7W screen-on idle with Haswell.
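The arithmetic above can be written out as a quick sketch. The function name `screen_on_idle_mw` is just for illustration, and I'm using the low end of each range (3W platform, 2W screen) and treating "under 200mW" as exactly 200mW, so this is an upper-ish estimate under my assumptions, not a measurement.

```python
def screen_on_idle_mw(platform_mw, screen_mw, psr_saving_mw=0):
    """Screen-on idle power in milliwatts (integers avoid float rounding)."""
    return platform_mw + screen_mw - psr_saving_mw


SCREEN_MW = 2000  # low end of the 2-3W screen figure

# Today's Ultrabook: low end of the 3-4W screen-off idle figure.
current_mw = screen_on_idle_mw(platform_mw=3000, screen_mw=SCREEN_MW)

# Haswell: whole platform sans display assumed at 200mW.
haswell_mw = screen_on_idle_mw(platform_mw=200, screen_mw=SCREEN_MW)

# Haswell with Panel Self Refresh saving ~500mW.
haswell_psr_mw = screen_on_idle_mw(200, SCREEN_MW, psr_saving_mw=500)

for label, mw in [("Current", current_mw),
                  ("Haswell", haswell_mw),
                  ("Haswell + PSR", haswell_psr_mw)]:
    print(f"{label}: {mw / 1000:.1f} W")
```

With these inputs the screen ends up dominating idle power, which is why Panel Self Refresh matters once the rest of the platform drops this low.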
To be more specific, the Ultrabook guidelines show a 2011-2012 platform with a 35WHr battery getting 5 hours of battery life, with bigger systems doing 7 hours. That's a fairly realistic figure, since the Macbook Air and Zenbook get similar figures with 35WHr. For the 2013 platform, though, they are aiming for double that, if not more, using a similar battery capacity!
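To see what those battery life figures imply, here is a small sketch that converts capacity and runtime into average platform power. The helper names are made up, and the 10-hour target is simply my reading of "double" the 5-hour figure on the same 35WHr battery.

```python
def average_power_w(battery_wh, hours):
    """Average platform power draw needed to last `hours` on `battery_wh`."""
    return battery_wh / hours


def battery_life_h(battery_wh, avg_power_w):
    """Runtime in hours at a given average power draw."""
    return battery_wh / avg_power_w


BATTERY_WH = 35.0

avg_2012 = average_power_w(BATTERY_WH, 5.0)   # 2011-2012 platform
avg_2013 = average_power_w(BATTERY_WH, 10.0)  # assumed target: double runtime

print(f"2011-2012: {avg_2012:.1f} W average over 5 hours")
print(f"2013 target: {avg_2013:.1f} W average over 10 hours")
print(f"Runtime at 7 W: {battery_life_h(BATTERY_WH, 7.0):.1f} hours")
```

So doubling battery life on the same capacity means halving the average draw of the whole platform, screen included.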
TDP-wise, there will be a <10W Ivy Bridge slated for early 2013. Haswell will bring a 15W ULT SKU and another one that goes even below the <10W Ivy Bridge. I'd expect the sub-10W SKU to aim for Ivy Bridge-like performance, with only the 15W SKU being the real performer.