
Massive security hole in CPU's incoming? Official Meltdown/Spectre Discussion Thread

Page 13 - Seeking answers? Join the AnandTech community: where nearly half-a-million members share solutions and discuss the latest tech.

mattiasnyc

Senior member
Mar 30, 2017
356
337
136
Designs are a philosophy. Implementation is where bugs come in. Previously you seemed to claim it can't be a bug because of manufacturing? Manufacturing is no more "implementation" (of a design concept) than compiling a program is. Both can be perfect and the end products still contain bugs. Since Intel designed their CPUs to separate memory systems and failed to properly implement that design, it's a bug.

Frankly I don't think you understand the concept, and I think others are just here to try to control the narrative. Neither of which I can change; all I can do is add my 2c.
Ok, well I think that last line of yours is a bit insulting to be honest, but let's just do this then:

Tell me what design decision/intent Intel failed to properly implement.

Mind you,
I'm not talking about describing how this attack works, I'm talking about the deliberate intentional design of Intel, described by you, which isn't working properly, and just how it isn't working as designed.

(To use a poor analogy: From what I can tell it's as if Intel designed a car that can provide natural fresh air inside without air conditioning. It does this by not having the door-windows go up all the way, thus enabling air flow. When driving the car it works as designed. Someone then decides to slip a piece of string into parked cars and loop it around the locking mechanism thereby unlocking the door and stealing the car. That's not a "bug". That's a design decision that is flawed. A bug would have been if the decided intended design didn't work as planned because of an improper implementation (i.e. the window actually goes all the way up instead of stopping short, despite the design)).....
 

richaron

Golden Member
Mar 27, 2012
1,357
329
136
Tell me what design decision/intent Intel failed to properly implement.
Yes. This is the crux of the matter. And this is what I've tried to address in my posts to you.

As in all recent CPUs, code security (runlevels, privileges, or whatever it's called) has been a huge priority. A lot of effort has gone into making hardware that separates these different levels of security, and up until now everyone thought they had achieved their design intentions.

But now it appears Intel CPUs have a bug. They are intended to have this certain security, and they almost do, but they missed a spot, and this missed spot is the bug. Simply not having particular functionality is not in itself a bug, but missing functionality that should exist per design intentions and 'validation' is without a doubt a bug.
 

csbin

Senior member
Feb 4, 2013
834
339
136
https://lkml.org/lkml/2018/1/3/797

From Linus Torvalds
Date Wed, 3 Jan 2018 15:51:35 -0800
Subject Re: Avoid speculative indirect calls in kernel
On Wed, Jan 3, 2018 at 3:09 PM, Andi Kleen <andi@firstfloor.org> wrote:
> This is a fix for Variant 2 in
> https://googleprojectzero.blogspot.com/2018/01/reading-privileged-memory-with-side.html
>
> Any speculative indirect calls in the kernel can be tricked
> to execute any kernel code, which may allow side channel
> attacks that can leak arbitrary kernel data.

Why is this all done without any configuration options?

A *competent* CPU engineer would fix this by making sure speculation
doesn't happen across protection domains. Maybe even a L1 I$ that is
keyed by CPL.

I think somebody inside of Intel needs to really take a long hard look
at their CPU's, and actually admit that they have issues instead of
writing PR blurbs that say that everything works as designed.

.. and that really means that all these mitigation patches should be
written with "not all CPU's are crap" in mind.

Or is Intel basically saying "we are committed to selling you shit
forever and ever, and never fixing anything"?

Because if that's the case, maybe we should start looking towards the
ARM64 people more.

Please talk to management. Because I really see exactly two possibilities:

- Intel never intends to fix anything

OR

- these workarounds should have a way to disable them.

Which of the two is it?

Linus
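
To make the "speculation doesn't happen across protection domains" point concrete, here is a minimal toy sketch of Variant 2 (branch target injection). It is an assumption-laden model, not a description of any real predictor: the `ToyBTB` class, its 16-entry size, the addresses, and the `tag_by_privilege` switch are all hypothetical names invented for illustration. It only shows why a predictor shared between user and kernel code can be trained by one to steer the other, and why keying entries by privilege level removes that path in this model.

```python
# Toy model only: real branch predictors are far more complex and largely undocumented.
class ToyBTB:
    """Branch target buffer indexed by low address bits (hypothetical design)."""

    def __init__(self, entries=16, tag_by_privilege=False):
        self.entries = entries
        self.tag_by_privilege = tag_by_privilege
        self.table = {}

    def _key(self, branch_addr, privilege):
        index = branch_addr % self.entries
        # Keying by privilege level keeps user-mode training out of kernel predictions.
        return (index, privilege) if self.tag_by_privilege else index

    def train(self, branch_addr, target, privilege):
        self.table[self._key(branch_addr, privilege)] = target

    def predict(self, branch_addr, privilege):
        return self.table.get(self._key(branch_addr, privilege))


KERNEL_CALL_SITE = 0x1234   # hypothetical address of a kernel indirect call
GADGET = 0xBAD0             # hypothetical code the attacker wants run speculatively
USER, KERNEL = 3, 0         # x86 privilege levels (CPL 3 and CPL 0)


def speculated_kernel_target(btb):
    # Attacker, in user mode, trains a branch that aliases with the kernel call site.
    aliasing_user_branch = KERNEL_CALL_SITE + btb.entries
    btb.train(aliasing_user_branch, GADGET, USER)
    # The kernel later reaches its indirect call and asks the predictor where to go.
    return btb.predict(KERNEL_CALL_SITE, KERNEL)


shared = speculated_kernel_target(ToyBTB(tag_by_privilege=False))
tagged = speculated_kernel_target(ToyBTB(tag_by_privilege=True))
print(hex(shared) if shared is not None else None)  # attacker-chosen target
print(tagged)                                       # no prediction from user training
```

In the untagged case the kernel's prediction is whatever the attacker trained; in the tagged case the kernel simply gets no prediction from user-mode entries. The software workaround discussed in the quoted mail takes a different route and avoids speculative indirect calls altogether.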
 
Aug 11, 2008
10,451
641
126
I haven't asked them...

Meltdown exists because this intel CPU bug exists. I'll call the bug Meltdown then also. Makes sense to me, you can call it something else. The purpose of language is to convey an idea and my way works fine.
I agree with Phyanz. It is not a "bug" in the traditional sense of an incorrect output of data. It is a design "choice" or "flaw" if you wish that allows an exploit. By the definition of "bug" that a lot of posters are using, every security hole that has to be, or has been, patched in hardware or an OS is a "bug". That said, to the end user, it doesn't really matter. It has to be patched in either case, and the ultimate performance loss is the same.
 

itsmydamnation

Platinum Member
Feb 6, 2011
2,247
1,911
136
AMD's page gives the best information about what the actual vectors are, probably because they feel they have the least to be afraid of.....


https://www.amd.com/en/corporate/speculative-execution
Variant One
Bounds Check Bypass
Resolved by software / OS updates to be made available by system vendors and manufacturers. Negligible performance impact expected.

Variant Two
Branch Target Injection
Differences in AMD architecture mean there is a near zero risk of exploitation of this variant. Vulnerability to Variant 2 has not been demonstrated on AMD processors to date.

Variant Three
Rogue Data Cache Load
Zero AMD vulnerability due to AMD architecture differences.
 

richaron

Golden Member
Mar 27, 2012
1,357
329
136
I agree with Phyanz. It is not a "bug" in the traditional sense of an incorrect output of data. It is a design "choice" or "flaw" if you wish that allows an exploit. By the definition of "bug" that a lot of posters are using, every security hole that has to be, or has been, patched in hardware or an OS is a "bug". That said, to the end user, it doesn't really matter. It has to be patched in either case, and the ultimate performance loss is the same.
Of course you can think what you want... but the "Meltdown bug" does create incorrect output. And who said a bug only produces incorrect output?
 

goldstone77

Senior member
Dec 12, 2017
217
93
61
Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 page table isolation fixes from Thomas Gleixner:

"A couple of urgent fixes for PTI:

 - Fix a PTE mismatch between user and kernel visible mapping of the
   cpu entry area (differs vs. the GLB bit) and causes a TLB mismatch
   MCE on older AMD K8 machines

 - Fix the misplaced CR3 switch in the SYSCALL compat entry code which
   causes access to unmapped kernel memory resulting in double faults.

 - Fix the section mismatch of the cpu_tss_rw percpu storage caused by
   using a different mechanism for declaration and definition.

 - Two fixes for dumpstack which help to decode entry stack issues
   better

 - Enable PTI by default in Kconfig. We should have done that earlier,
   but it slipped through the cracks.

 - Exclude AMD from the PTI enforcement. Not necessarily a fix, but if
   AMD is so confident that they are not affected, then we should not
   burden users with the overhead"

* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/process: Define cpu_tss_rw in same section as declaration
  x86/pti: Switch to kernel CR3 at early in entry_SYSCALL_compat()
  x86/dumpstack: Print registers for first stack frame
  x86/dumpstack: Fix partial register dumps
  x86/pti: Make sure the user/kernel PTEs match
  x86/cpu, x86/pti: Do not enable PTI on AMD processors
  x86/pti: Enable PTI by default
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=00a5ae218d57741088068799b810416ac249a9ce&utm_source=anz

AMD is confident in their CPUs to the point that they are forgoing the performance-impacting patch Intel has to use.
 

Markfw

CPU Moderator, VC&G Moderator, Elite Member
Super Moderator
May 16, 2002
21,556
9,622
136
I agree with Phyanz. It is not a "bug" in the traditional sense of an incorrect output of data. It is a design "choice" or "flaw" if you wish that allows an exploit. By the definition of "bug" that a lot of posters are using, every security hole that has to be, or has been, patched in hardware or an OS is a "bug". That said, to the end user, it doesn't really matter. It has to be patched in either case, and the ultimate performance loss is the same.
Well, not sure who I agree with, and I still can't find the post I read today, or a linked document, but when you have processes at different priority (or security?) levels, they should not be allowed to cross priorities. Intel chose to allow this to speed execution, AMD did not. So Intel chose to violate the rules for speed, and now it has bitten them.

Somebody help me find that reference please.
 

Schmide

Diamond Member
Mar 7, 2002
5,373
292
126
Notice my sig that I've had for probably 10 years.

It takes a lot of setup and a lot of snooping to get a couple thousand bytes a second. That being said, the processor is loading and/or executing memory on behalf of a process that should not be allowed to. The fact that that process can further decode and reveal the data makes it a bug. The flaw was allowing the initial execution. However, IMO it became a bug when it was read.
 

formulav8

Diamond Member
Sep 18, 2000
7,003
522
126
So it's possible that they could release a new stepping then I guess.
I'm not sure a stepping (usually an upper layer IIRC) could fix it. It may take a full revision to fix it. (Usually a bottom/near base layer).
 
May 11, 2008
18,309
831
126
Well, not sure who I agree with, and I still can't find the post I read today, or a linked document, but when you have processes at different priority (or security?) levels, they should not be allowed to cross priorities. Intel chose to allow this to speed execution, AMD did not. So Intel chose to violate the rules for speed, and now it has bitten them.

Somebody help me find that reference please.
This perhaps?
https://www.wired.com/story/critical-intel-flaw-breaks-basic-security-for-most-computers/
VUSEC's Bosman confirmed that when Intel processors perform that speculative execution, they don't fully segregate processes that are meant to be low-privilege and untrusted from the highest-privilege memory in the computer's kernel. That means a hacker can trick the processor into allowing unprivileged code to peek into the kernel's memory with speculative execution.

"The processor basically runs too far ahead, executing instructions that it should not execute," says Daniel Gruss, one of the researchers from the Graz University of Technology who discovered the attacks.

Retrieving any data from that privileged peeking isn't simple, since once the processor stops its speculative execution and jumps back to the fork in its instructions, it throws out the results. But before it does, it stores them in its cache, a collection of temporary memory allotted to the processor to give it quick access to recent data. By carefully crafting requests to the processor and seeing how fast it responds, a hacker's code could figure out whether the requested data is in the cache or not. And with a series of speculative execution and cache probes, he or she can start to assemble parts of the computer's high privilege memory, including even sensitive personal information or passwords.

Many security researchers who spotted signs of developers working to fix that bug had speculated that the Intel flaw merely allowed hackers to defeat a security protection known as Kernel Address Space Layout Randomization, which makes it far more difficult for hackers to find the location of the kernel in memory before they use other tricks to attack it. But Bosman confirms theories that the bug is more serious: It allows malicious code to not only locate the kernel in memory, but steal that memory's contents, too.

"Out of the two things that were speculated, this is the worst outcome," Bosman says.

https://twitter.com/brainsmoke/status/948561799875502080
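
The cache-probing step the Wired excerpt describes ("seeing how fast it responds") can be sketched as a toy simulation. Everything here is assumed for illustration: the `ToyCache` class, the cycle counts, and the one-page-per-byte-value probe layout are made up, and no real timing is measured. It only shows the logic: a discarded speculative access still leaves one cache line warm, and timing reads over 256 candidate slots reveals which one.

```python
# Toy simulation: latencies are made-up numbers, not measurements.
CACHE_HIT_CYCLES = 40     # assumed latency for a cached line
CACHE_MISS_CYCLES = 300   # assumed latency for an uncached line
PAGE = 4096               # one probe slot per possible byte value


class ToyCache:
    def __init__(self):
        self.lines = set()

    def flush(self):
        self.lines.clear()

    def touch(self, address):
        # Any access, including one whose result is later discarded, fills the line.
        self.lines.add(address // PAGE)

    def timed_read(self, address):
        hit = address // PAGE in self.lines
        self.touch(address)
        return CACHE_HIT_CYCLES if hit else CACHE_MISS_CYCLES


def speculative_victim(cache, secret_byte):
    # Architectural results are thrown away; only the cache footprint survives.
    cache.touch(secret_byte * PAGE)


def recover_byte(cache):
    # Time a read of every slot; the fastest one was touched speculatively.
    timings = [cache.timed_read(value * PAGE) for value in range(256)]
    return min(range(256), key=timings.__getitem__)


def leak(secret):
    recovered = bytearray()
    cache = ToyCache()
    for byte in secret:
        cache.flush()
        speculative_victim(cache, byte)
        recovered.append(recover_byte(cache))
    return bytes(recovered)


print(leak(b"hunter2"))
```

On real hardware the timings are noisy, so an attacker has to repeat the probe many times per byte, which fits the slow leak rates mentioned earlier in the thread.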
 
May 11, 2008
18,309
831
126
I must be going bananas.
There is now a lot more information on this page than there was a few hours ago; I guess that was the NDA still being active:
https://googleprojectzero.blogspot.nl/2018/01/reading-privileged-memory-with-side.html

Theoretical explanation
The Intel Optimization Reference Manual says the following regarding Sandy Bridge (and later microarchitectural revisions) in section 2.3.2.3 ("Branch Prediction"):

Branch prediction predicts the branch target and enables the
processor to begin executing instructions long before the branch
true execution path is known.

In section 2.3.5.2 ("L1 DCache"):

Loads can:
[...]
  • Be carried out speculatively, before preceding branches are resolved.
  • Take cache misses out of order and in an overlapped manner.

Intel's Software Developer's Manual [7] states in Volume 3A, section 11.7 ("Implicit Caching (Pentium 4, Intel Xeon, and P6 family processors)"):

Implicit caching occurs when a memory element is made potentially cacheable, although the element may never have been accessed in the normal von Neumann sequence. Implicit caching occurs on the P6 and more recent processor families due to aggressive prefetching, branch prediction, and TLB miss handling. Implicit caching is an extension of the behavior of existing Intel386, Intel486, and Pentium processor systems, since software running on these processor families also has not been able to deterministically predict the behavior of instruction prefetch.
 

french toast

Senior member
Feb 22, 2017
988
824
136
I'm going to put my tin foil hat on: Intel is too big a company with too competent engineers to let something like this slip for all this time.
It's almost like it is a purposely designed "feature" or intended circumstance.
Why? Well, performance enhancement? The decision to allow/enable this hole/feature could have come way back when AMD was extremely competitive, circa 2002-2005... once added, it's too tempting to just leave it there, especially when Intel processor improvements have been minuscule from iteration to iteration since Sandy Bridge.

More crazy explanation...CIA forced exploit? :/
 
May 11, 2008
18,309
831
126
I'm going to put my tin foil hat on: Intel is too big a company with too competent engineers to let something like this slip for all this time.
It's almost like it is a purposely designed "feature" or intended circumstance.
Why? Well, performance enhancement? The decision to allow/enable this hole/feature could have come way back when AMD was extremely competitive, circa 2002-2005... once added, it's too tempting to just leave it there, especially when Intel processor improvements have been minuscule from iteration to iteration since Sandy Bridge.

More crazy explanation...CIA forced exploit? :/
It is not that extreme.
It kind of reminds me of what Itanium did: execute both branches of a conditional branch and throw away whichever branch turns out not to be taken.
 

Thala

Golden Member
Nov 12, 2014
1,264
575
136
This is the code snippet from the Meltdown paper:

1 ; rcx = kernel address
2 ; rbx = probe array
3 retry:
4 mov al, byte [rcx]
5 shl rax, 0xc
6 jz retry
7 mov rbx, qword [rbx + rax]
The central issue is that the secret value from kernel space is actually read into register al. Not only that, instructions 5-7 also need to be executed before the exception is raised.
The thing is, when the virtual address (in rcx) is looked up in the TLB, by the time the physical address is known it is also known that the access would be illegal. At least in parallel with the tag-address compare you can easily determine that the access is illegal. Any sane CPU architect would not allow the load to be performed at all, and even if the load is performed, its result should never be forwarded to the following instructions.
It would not surprise me if the attack does not work on ARM or even AMD, as it is such a big blunder.
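
The ordering argument above can be illustrated with a small toy model. This is a sketch under stated assumptions, not a model of any real microarchitecture: the `run_snippet` function, the `forward_before_check` flag, and the kernel address and secret value are hypothetical. It contrasts a core that forwards the loaded byte to dependent instructions before the permission check takes effect with one that gates the load on the check.

```python
# Toy model of the ordering issue, not of any real microarchitecture.
PAGE_SHIFT = 0xC  # matches "shl rax, 0xc" in the snippet


def run_snippet(kernel_memory, address, forward_before_check):
    """Model lines 4-7 of the snippet for an address the user may not read.

    Returns the probe-array offsets left in the cache when the fault retires.
    """
    cached_offsets = set()
    if forward_before_check:
        value = kernel_memory[address]      # line 4: the load forwards its data
        offset = value << PAGE_SHIFT        # line 5: shl rax, 0xc
        if offset != 0:                     # line 6: jz retry (zero is retried)
            cached_offsets.add(offset)      # line 7: dependent load fills the cache
    # Either way the access faults architecturally and registers are rolled back;
    # the cache contents are not.
    return cached_offsets


kernel_memory = {0xFFFF0000: 0x2A}  # hypothetical kernel address and secret byte

leaky = run_snippet(kernel_memory, 0xFFFF0000, forward_before_check=True)
gated = run_snippet(kernel_memory, 0xFFFF0000, forward_before_check=False)

print({offset >> PAGE_SHIFT for offset in leaky})  # {42}
print(gated)                                       # set()
```

In the gated case nothing dependent on the secret ever runs, so there is no cache footprint to probe, which is the behaviour the post argues a sane design should have.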
 

french toast

Senior member
Feb 22, 2017
988
824
136
http://uk.businessinsider.com/intel-ceo-krzanich-sold-shares-after-company-was-informed-of-chip-flaw-2018-1?r=US&IR=T
Intel was informed by Google back in JUNE... Intel CEO Krzanich sold his shares in October.
Another tidbit: if Intel knew in June and they still released Coffee Lake, supposing it still had the hardware flaw that would require a performance-gimping software fix, let alone knowing the CPU would be vulnerable for months until such a patch materialises...
Well that is ripe for class action lawsuits, investor lawsuits and potentially some prison time?
 

bryanW1995

Lifer
May 22, 2007
11,143
32
91
So Intel's response was to focus on one type of attack but not the more serious one affecting them so they're technically not lying?
They're not lying at all, they're just deliberately being obtuse. Standard corporate mumbo jumbo. I don't blame them for it as I'm sure that AMD, or NV, or ARM, or anybody else would have put out something similar if the roles were reversed. However, it only reinforces my determination to go with Ryzen now.
 
