New Branch Prediction side-channel attack discovered

tamz_msc · Mar 27, 2018

Original PDF.

From Ars Technica:

BranchScope and Spectre 2 both take advantage of different parts of the branch predictor. Spectre 2 relied on a part called the Branch Target Buffer (BTB)—the data structure within the processor that records the branch target. BranchScope, instead, leaks information using the direction of the prediction—whether it's likely to be taken or not—which is stored in the pattern history table (PHT).

The PHT keeps a kind of running score of recently executed branches to remember if those branches were taken or not. Typically, it's a two-bit counter with four states: strongly taken, weakly taken, weakly not taken, and strongly not taken. Each time a branch is taken, the counter's value is moved toward "strongly taken"; each time it's not taken, it's moved toward "strongly not taken." This design means that an occasional mispredict won't change the result of the prediction: a branch that's almost always taken will still predict as taken, even if it's occasionally not actually taken. Changing the prediction requires two back-to-back mispredicts. This design is proven to provide better results than a one-bit counter that simply predicts a branch based on what happened the last time it was executed.
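To make the counter behaviour concrete, here is a minimal Python sketch of a two-bit saturating counter compared with a one-bit "same as last time" predictor. The class and function names are made up for illustration, and the 0 to 3 state encoding is an assumption; real hardware indexes many such counters and may differ in detail.

```python
# States (assumed encoding): 0 = strongly not taken, 1 = weakly not taken,
#                            2 = weakly taken,       3 = strongly taken
class TwoBitCounter:
    def __init__(self, state=0):
        self.state = state

    def predict(self):
        # Predict "taken" in either of the two taken states.
        return self.state >= 2

    def update(self, taken):
        # Move one step toward the observed direction, saturating at 0 and 3.
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


def two_bit_mispredictions(outcomes, start_state=3):
    counter = TwoBitCounter(start_state)
    misses = 0
    for taken in outcomes:
        if counter.predict() != taken:
            misses += 1
        counter.update(taken)
    return misses


def one_bit_mispredictions(outcomes, last=True):
    # One-bit predictor: always predict whatever the branch did last time.
    misses = 0
    for taken in outcomes:
        if last != taken:
            misses += 1
        last = taken
    return misses


# A branch that is taken nine times and then not taken once, repeated ten
# times (roughly what a loop-exit branch looks like).
pattern = ([True] * 9 + [False]) * 10
two_bit = two_bit_mispredictions(pattern)
one_bit = one_bit_mispredictions(pattern)
print(two_bit, one_bit)
```

On this pattern the two-bit counter mispredicts only the single not-taken branch in each round, while the one-bit predictor also mispredicts the first taken branch after it.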

In the new attack, an attacker primes the PHT by running branch instructions so that the PHT will always assume a particular branch is taken or not taken. The victim code then runs and makes a branch, potentially disturbing the PHT. The attacker then runs more branch instructions of its own to detect that disturbance to the PHT; the attacker knows that some branches should be predicted in a particular direction and tests to see if the victim's code has changed that prediction.
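The prime, victim, probe sequence described above can be simulated in a few lines of Python. This is only a toy model under stated assumptions: the PHT is a small table of two-bit counters indexed by address modulo table size, the attacker can pick a colliding address, and the simulation reads the "mispredicted or not" result directly, whereas a real attacker would have to infer it from timing or performance counters. All names here are hypothetical.

```python
class PatternHistoryTable:
    """Toy PHT: two-bit counters indexed by (address mod table size)."""

    def __init__(self, entries=16):
        self.entries = entries
        self.counters = [1] * entries  # assumed start: weakly not taken

    def execute_branch(self, address, taken):
        """Execute one branch; return True if it was predicted correctly."""
        i = address % self.entries
        predicted_taken = self.counters[i] >= 2
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)
        return predicted_taken == taken


def leak_bit(pht, victim_address, secret_bit):
    # Assumption: attacker controls a branch that maps to the same PHT entry.
    attacker_address = victim_address + pht.entries

    # Prime: three not-taken branches force the entry to "strongly not taken".
    for _ in range(3):
        pht.execute_branch(attacker_address, taken=False)

    # Victim: one branch whose direction depends on the secret bit.
    pht.execute_branch(victim_address, taken=bool(secret_bit))

    # Probe: two taken branches. The first always mispredicts; the second is
    # predicted correctly only if the victim nudged the counter toward taken.
    pht.execute_branch(attacker_address, taken=True)
    second_predicted = pht.execute_branch(attacker_address, taken=True)
    return 1 if second_predicted else 0


secret = [1, 0, 1, 1, 0, 0, 1, 0]
pht = PatternHistoryTable()
recovered = [leak_bit(pht, 0x40, bit) for bit in secret]
print(recovered)
```

In this model the attacker recovers every bit because there is no noise; on real hardware other branches and the rest of the predictor would interfere.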

The researchers looked only at Intel processors, using the attacks to leak information protected using Intel's SGX (Software Guard Extensions), a feature found on certain chips to carve out small sections of encrypted code and data such that even the operating system (or virtualization software) cannot access it. They also described ways the attack could be used against address space layout randomization and to infer data in encryption and image libraries.

NTMBK · Mar 27, 2018

And they've used it to poke a hole in SGX. Nice.

VirtualLarry · Mar 27, 2018

Wow. When it rains, it pours, doesn't it. AMD's machine-intelligence global branch predictor seems better every day.

LTC8K6 · Mar 27, 2018

As with Spectre 2, it's not clear just how much software is truly vulnerable to BranchScope attacks. In both cases, attackers need the ability to run code on a victim system, so these attacks will never be used for initial entry into a system.

So this apparently applies to AMD and Intel, like Spectre, and it requires a total lack of system security in the first place.

NTMBK · Mar 27, 2018

LTC8K6 said:
So this apparently applies to AMD and Intel, like Spectre, and it requires a total lack of system security in the first place.

The big worry is for cloud providers. If code can "break out" of a VM and read memory from other VMs on the same host, it's a problem.

Panino Manino · Mar 27, 2018

When it rains... and Intel SGX went out without an umbrella.
These new concepts look like high-level voodoo to me; they look too hard to bother trying.

Kenmitch · Mar 27, 2018

LTC8K6 said:
So this apparently applies to AMD and Intel, like Spectre, and it requires a total lack of system security in the first place.

Intel yes. AMD wasn't tested.

William Gaatjes · Mar 27, 2018

SGX only applies to Intel.
It seems there are more issues with Intel's SGX security instructions.

https://www.theregister.co.uk/2017/...e_under_intels_door_sgx_can_leak_crypto_keys/

By Richard Chirgwin 7 Mar 2017 at 05:58

A researcher who in January helped highlight possible flaws in Intel's Software Guard Extensions' input-output protection is back, this time with malware running inside a protected SGX enclave.
Instead of protecting the system, Samuel Weiser and four collaborators of Austria's Graz University of Technology write that the proof-of-concept uses SGX to conceal the malware – and that within five minutes, he can grab RSA keys from SGX enclaves running on the same system.
It's the kind of thing SGX is explicitly designed to prevent. SGX is an isolation mechanism that's supposed to keep both code and data from prying eyes, even if a privileged user is malicious.
Weiser and his team created a side-channel attack they call “Prime+Probe”, and say it works in a native Intel environment, or across Docker containers.
The PoC is specifically designed to recover RSA keys in someone else's enclave in a complex three-step process: first, discovering the location of the victim's cache sets; second, watching the cache sets when the victim triggers an RSA signature computation; and finally, extracting the key.

As the paper puts it:

We developed the most accurate timing measurement technique currently known for Intel CPUs, perfectly tailored to the hardware. We combined DRAM and cache side channels, to build a novel approach that recovers physical address bits without assumptions on the page size. We attack the RSA implementation of mbedTLS that is used for instance in OpenVPN. The attack succeeds despite protection against sidechannel attacks using a constant-time multiplication primitive. We extract 96 % of a 4096-bit RSA private key from a single Prime+Probe trace and achieve full key recovery from only 11 traces within 5 minutes.
The attack even works across different Docker containers, because the Docker engine calls to the same SGX driver for both containers.

Docker containers share the same SGX driver

Timing: A cryptography side-channel attack needs a high resolution timer, something forbidden in SGX. Weiser and his collaborators combed Intel's specs, and settled on the inc and add instructions, because these have “a latency of 1 cycle and a throughput of 0.25 cycles/instruction when executed with a register as an operand”.

To emulate the forbidden timer, the researchers used these x86 instructions:

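mov &counter, %rcx   # rcx = address of the shared counter variable
1: inc %rax          # increment the tick count held in rax
mov %rax, (%rcx)     # publish the current tick count to memory
jmp 1b               # jump back to local label 1 and repeat forever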
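The same counting-thread idea can be shown in Python, purely as an illustration: a background thread spins on a counter and the measuring code reads it before and after the work it wants to time. This is nowhere near cycle-accurate (the interpreter and its scheduling add far too much jitter) and the class name is invented for this sketch; it only shows the mechanism of replacing a forbidden timer with a shared counter.

```python
import threading
import time


class CountingThreadTimer:
    """Emulates a timer with a thread that increments a shared counter."""

    def __init__(self):
        self.counter = 0
        self._running = False
        self._thread = threading.Thread(target=self._spin, daemon=True)

    def _spin(self):
        # Equivalent in spirit to the inc / mov / jmp loop above.
        while self._running:
            self.counter += 1

    def start(self):
        self._running = True
        self._thread.start()

    def stop(self):
        self._running = False
        self._thread.join()

    def now(self):
        return self.counter


def measure(timer, workload):
    # Elapsed "ticks" are the difference between two reads of the counter.
    start = timer.now()
    workload()
    return timer.now() - start


timer = CountingThreadTimer()
timer.start()
short_ticks = measure(timer, lambda: time.sleep(0.01))
long_ticks = measure(timer, lambda: time.sleep(0.2))
timer.stop()
print(short_ticks, long_ticks)
```

The absolute tick values mean nothing by themselves; only the comparison between two measurements carries information.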

"Eviction set" generation: This step is designed to discover virtual addresses "that map to the same cache set": "we scan memory sequentially for an address pair in physical proximity that causes a row conflict. As SGX enclave memory is allocated in a contiguous way we can perform this scan on virtual addresses."

With those two steps completed, Weiser et al worked out how to monitor vulnerable cache sets, looking for the characteristic signature of RSA key calculation.
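For readers who have not seen Prime+Probe before, here is a small simulated version in Python. It assumes a toy set-associative cache with true LRU replacement and a victim that touches one cache line only when a secret bit is 1; the cache geometry, addresses and function names are all invented for the sketch. The real attack measures probe latency with the counting thread instead of reading hit/miss directly, and real caches do not use plain LRU.

```python
from collections import OrderedDict


class SetAssociativeCache:
    """Toy cache with LRU replacement inside each set."""

    def __init__(self, num_sets=8, ways=4, line_size=64):
        self.num_sets = num_sets
        self.ways = ways
        self.line_size = line_size
        self.sets = [OrderedDict() for _ in range(num_sets)]

    def set_index(self, address):
        return (address // self.line_size) % self.num_sets

    def access(self, address):
        """Touch an address; return True on a hit, False on a miss."""
        line = address // self.line_size
        cache_set = self.sets[self.set_index(address)]
        if line in cache_set:
            cache_set.move_to_end(line)      # mark as most recently used
            return True
        if len(cache_set) >= self.ways:
            cache_set.popitem(last=False)    # evict least recently used line
        cache_set[line] = True
        return False


def eviction_set(cache, target_set, base=0x100000):
    # One address per way, all mapping to target_set.
    stride = cache.num_sets * cache.line_size
    return [base + target_set * cache.line_size + way * stride
            for way in range(cache.ways)]


def prime(cache, addresses):
    for address in addresses:
        cache.access(address)


def probe(cache, addresses):
    # Probe in reverse order so one eviction does not cascade under LRU.
    return sum(0 if cache.access(a) else 1 for a in reversed(addresses))


def run_attack(secret_bits, monitored_set=3):
    cache = SetAssociativeCache()
    attacker_lines = eviction_set(cache, monitored_set)
    victim_address = 0x200000 + monitored_set * cache.line_size
    trace = []
    for bit in secret_bits:
        prime(cache, attacker_lines)
        if bit:                              # secret-dependent memory access
            cache.access(victim_address)
        trace.append(1 if probe(cache, attacker_lines) > 0 else 0)
    return trace


secret_bits = [1, 0, 0, 1, 1, 0, 1, 0]
trace = run_attack(secret_bits)
print(trace)
```

A probe miss means the victim displaced one of the attacker's lines in the monitored set, so the trace mirrors the victim's secret-dependent accesses.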

This part of the attack has to happen offline – that is, separately to the cache monitoring that collects the data – because you end up with lots of data that has lots of noise in it (from timing errors, context switching, non-RSA-key activity in the victim's enclave, and CPU timing changes due to power management, and so on).

Key recovery comes in three steps. First, traces are preprocessed. Second, a partial key is extracted from each trace. Third, the partial keys are merged to recover the private key.
On an SGX-capable Lenovo ThinkPad T460s running Ubuntu 16.10, they found:

With 340 trials, their malware was able to find a vulnerable cache set from the 2048 cache sets available;

Capturing a trace from the vulnerable cache set took 72 seconds, on average;

A single cache trace provided access to 96 per cent of a 4096-bit RSA key, and with 11 traces, the full RSA key is available.
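The idea of merging partial keys from several noisy traces can be sketched with a per-bit majority vote. This is an assumption-laden simplification, not the paper's procedure: it presumes the traces are already aligned bit for bit, marks unknown positions as None, and uses a made-up 8-bit key instead of a 4096-bit one.

```python
from collections import Counter


def merge_partial_keys(partials):
    """Combine aligned partial keys by majority vote at each bit position.

    Each partial key is a list of 0, 1 or None (None = bit not recovered).
    """
    merged = []
    for position in range(len(partials[0])):
        votes = Counter(p[position] for p in partials
                        if p[position] is not None)
        if votes:
            merged.append(votes.most_common(1)[0][0])
        else:
            merged.append(None)              # no trace recovered this bit
    return merged


true_key = [1, 0, 1, 1, 0, 1, 0, 0]          # hypothetical toy key
partials = [
    [1, 0, None, 1, 0, 1, 0, 0],             # one bit missing
    [1, 0, 1, 1, 1, 1, 0, 0],                # one bit wrong (position 4)
    [1, None, 1, 1, 0, 1, 0, 0],             # one bit missing
]
merged = merge_partial_keys(partials)
print(merged)
```

Each individual trace is incomplete or wrong somewhere, but the combination recovers the whole toy key, which is the intuition behind needing 11 traces rather than one.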

The researchers say their attack can be blocked, but the fix will have to come from Intel, because modifications to operating systems risk weakening the SGX model.

Kenmitch · Mar 27, 2018

Luckily only gamers use Intel cpus.

William Gaatjes · Mar 27, 2018

It is a year old. I wonder if it is solved with a microcode patch or with generation 8 or not at all.
It seems SGX flaws apply to server software for cloud systems.

LTC8K6 · Mar 27, 2018

Yes, a year old at least

https://www.theregister.co.uk/2017/...e_under_intels_door_sgx_can_leak_crypto_keys/

Here's a paper on mitigation of such attacks from September:

https://arxiv.org/pdf/1709.09917.pdf

William Gaatjes · Mar 27, 2018

How did they mitigate the weakness of SGX?

I don't feel like reading it now. I think I am going to continue with Prey.

LTC8K6 · Mar 27, 2018

https://www.usenix.org/system/files/conference/usenixsecurity17/sec17-gruss.pdf

Another paper on mitigation from August.

Abstract
Cache-based side-channel attacks are a serious problem in multi-tenant environments, for example, modern cloud data centers. We address this problem with Cloak, a new technique that uses hardware transactional memory to prevent adversarial observation of cache misses on sensitive code and data. We show that Cloak provides strong protection against all known cache-based side-channel attacks with low performance overhead. We demonstrate the efficacy of our approach by retrofitting vulnerable code with Cloak and experimentally confirming immunity against state-of-the-art attacks. We also show that by applying Cloak to code running inside Intel SGX enclaves we can effectively block information leakage through cache side channels from enclaves, thus addressing one of the main weaknesses of SGX.

LTC8K6 · Mar 27, 2018

Abstract from the other paper.

Abstract—Recent research has demonstrated that Intel's SGX is vulnerable to various software-based side-channel attacks. In particular, attacks that monitor CPU caches shared between the victim enclave and untrusted software enable accurate leakage of secret enclave data. Known defenses assume developer assistance, require hardware changes, impose high overhead, or prevent only some of the known attacks. In this paper we propose data location randomization as a novel defensive approach to address the threat of side-channel attacks. Our main goal is to break the link between the cache observations by the privileged adversary and the actual data accesses by the victim. We design and implement a compiler-based tool called DR.SGX that instruments enclave code such that data locations are permuted at the granularity of cache lines. We realize the permutation with the CPU's cryptographic hardware-acceleration units providing secure randomization. To prevent correlation of repeated memory accesses we continuously re-randomize all enclave data during execution. Our solution effectively protects many (but not all) enclaves from cache attacks and provides a complementary enclave hardening technique that is especially useful against unpredictable information leakage.
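To give a feel for what "permuting data locations at cache-line granularity" means, here is a rough Python sketch. It is not DR.SGX: the class name is invented, Python's seeded `random.Random` stands in for the hardware-accelerated cryptographic permutation the abstract mentions, and re-randomization is triggered by hand rather than continuously.

```python
import random

LINE_SIZE = 64  # bytes per cache line (assumed)


class RandomizedBuffer:
    """Stores data with logical cache lines mapped to shuffled locations."""

    def __init__(self, num_lines, seed):
        self.num_lines = num_lines
        self.storage = [bytearray(LINE_SIZE) for _ in range(num_lines)]
        # Stand-in for a cryptographically secure permutation.
        self.rng = random.Random(seed)
        self.perm = list(range(num_lines))
        self.rng.shuffle(self.perm)

    def write(self, logical_address, value):
        line, offset = divmod(logical_address, LINE_SIZE)
        self.storage[self.perm[line]][offset] = value

    def read(self, logical_address):
        line, offset = divmod(logical_address, LINE_SIZE)
        return self.storage[self.perm[line]][offset]

    def rerandomize(self):
        # Draw a fresh permutation and move every line to its new location,
        # so an observer cannot correlate accesses across epochs.
        new_perm = list(range(self.num_lines))
        self.rng.shuffle(new_perm)
        new_storage = [None] * self.num_lines
        for logical in range(self.num_lines):
            new_storage[new_perm[logical]] = self.storage[self.perm[logical]]
        self.storage = new_storage
        self.perm = new_perm


buf = RandomizedBuffer(num_lines=16, seed=1234)
buf.write(130, 42)        # logical line 2, offset 2
before = buf.read(130)
buf.rerandomize()
after = buf.read(130)
print(before, after)
```

The program sees the same logical contents before and after re-randomization, while the line an attacker would observe in the cache changes, which is the link the paper wants to break.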

William Gaatjes · Mar 27, 2018

I wonder what the execution speed price is for these mitigations.

LTC8K6 · Mar 27, 2018

SGX didn't appear until Skylake. So if you waited to upgrade, you are okay.

If you have SGX, you can disable it in BIOS.

Apparently it often lacks BIOS support, so even if your chip supports it, it's often not working.

https://ark.intel.com/Search/FeatureFilter?productType=processors&SoftwareGuardExtensions=true

moinmoin · Mar 27, 2018

Why do you all focus on SGX? (Though SGX makes a good exploit example, Spectre allows one to access it as well.) The point of BranchScope is that, unlike Spectre 2, it doesn't exploit speculative execution through the BTB but through the PHT instead. The paper also points out that other predictor features may be open to abuse as well. I'm very interested in how much other manufacturers are affected. It may well turn out that the whole way speculative execution has been done up to now needs to go back to the drawing board.

tamz_msc · Mar 28, 2018

The thing that nobody seems to realize is that BranchScope works better (with lower failure rates) with better branch predictors, i.e. the faster your CPU, the more likely the exploit is to succeed. These flaws are only scratching the surface, it seems, and addressing them properly would require a complete rethink of the way speculative execution is implemented as more ingenious discoveries are made.

thecoolnessrune · Mar 29, 2018

How nice that they were given a nice long window prior to public notification. It's almost like that should be the norm.

Also curious to get a statement from AMD on these.

wahdangun · Mar 29, 2018

Maybe the solution is to add some randomness to OoO execution? That would make it harder to predict, but the downside is that it would hurt performance.

The quest to predict makes it more predictable. Haha.
