[TheReg] Intel Management Engine: Is this why we have seen slower speed growth?


DrMrLordX

Lifer
Apr 27, 2000
21,582
10,785
136
It is upsetting that some types of malware cannot be removed by reinstalling the operating system. So this idea sounds really interesting.

I wonder how easy it would be to implement on Intel? On AMD?

As Shintai pointed out earlier in the thread, the greatest threat with respect to persistent malware is UEFI. UEFI rootkits already exist, they are in the wild, and they are quite dangerous to anyone not using the UEFI Secure Boot feature. Sadly, Secure Boot breaks some things (such as certain wifi adapters in Linux), so some folks have to disable that feature.
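
For anyone who wants to check their own box, here's a minimal sketch (assuming a Linux system with efivarfs mounted at its usual path; the GUID is the standard EFI global-variable GUID) that reads the SecureBoot variable:

Code:
# Minimal sketch: read the SecureBoot EFI variable on a Linux system.
# Assumes efivarfs is mounted at /sys/firmware/efi/efivars.
from pathlib import Path

VAR = Path("/sys/firmware/efi/efivars/"
           "SecureBoot-8be4df61-93ca-11d2-aa0d-00e098032b8c")

def secure_boot_enabled():
    if not VAR.exists():
        return None  # legacy BIOS boot, or efivarfs not mounted
    data = VAR.read_bytes()
    # efivarfs prepends 4 bytes of variable attributes; the payload follows.
    return data[4] == 1

state = secure_boot_enabled()
print("Secure Boot: " + {True: "enabled", False: "disabled",
                         None: "no EFI variables found"}[state])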
 

ShintaiDK

Lifer
Apr 22, 2012
20,378
145
106
IPC only really counts per core, so the comparison would be P4 to Core to Core 2.

From what I remember, Athlon 64 was a big increase over both P4 and Core, and Core 2 was a big increase again.

K8 performed on par with Dothan.
[Attached benchmark charts: 3277.png, 3273.png, q3.png, cons-1.png]


Yonah was faster than K8.
[Attached benchmark charts: fc.png, hl2.png, q4.png]
 

beginner99

Diamond Member
Jun 2, 2009
5,208
1,580
136
Some other interesting info found in the above article:

It is upsetting that some types of malware cannot be removed by reinstalling the operating system. So this idea sounds really interesting.

Not that new. Your hard drive's controller is nothing more than a microprocessor running firmware, and that firmware can be flashed. You can then run any program you want, and reinstalling the OS in particular won't help. Plus, because you control the HDD, you can manipulate all data to your liking; mostly it's about stealing.

http://www.wired.com/2015/02/nsa-firmware-hacking/
 

intangir

Member
Jun 13, 2005
113
0
76
Not that new. Your hard drive's controller is nothing more than a microprocessor running firmware, and that firmware can be flashed. You can then run any program you want, and reinstalling the OS in particular won't help. Plus, because you control the HDD, you can manipulate all data to your liking; mostly it's about stealing.

http://www.wired.com/2015/02/nsa-firmware-hacking/

Yup. It's not just your CPU/chipset firmware you have to worry about, it's your storage component firmware as well. The guys at libreboot and these security outfits that make futile recommendations are pretty much screwed, because no modern hardware sold since 2008 meets their requirements for openness.

http://libreboot.org/faq/#firmware-hddssd
HDDs and SSDs are quite complex, and these days contain quite complex hardware which is even capable of running an entire operating system (by this, we mean that the drive itself is capable of running its own embedded OS), even GNU/Linux or BusyBox/Linux.

Example attack that malicious firmware could do: substitute your SSH keys, allowing unauthorized remote access by an unknown adversary. Or maybe substitute your GPG keys. SATA drives can also have DMA (through the controller), which means that they could read from system memory; the drive can have its own hidden storage, theoretically, where it could read your LUKS keys and store them unencrypted for future retrieval by an adversary.

Viable free replacement firmware is currently unknown to exist.
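
To make the SSH-key-substitution scenario concrete, here's a rough sketch of the obvious (if imperfect) countermeasure: record a key's fingerprint on trusted offline media and diff it later. Both paths are hypothetical placeholders:

Code:
# Illustrative sketch only: detect a swapped SSH host key by comparing its
# SHA-256 fingerprint against a baseline recorded earlier on trusted media.
# Both paths below are hypothetical; substitute your own.
import hashlib
from pathlib import Path

KEY = Path("/etc/ssh/ssh_host_ed25519_key.pub")  # public key to check
BASELINE = Path("/mnt/trusted/key.sha256")       # fingerprint saved offline

current = hashlib.sha256(KEY.read_bytes()).hexdigest()
recorded = BASELINE.read_text().strip()

if current != recorded:
    print("WARNING: host key fingerprint changed -- possible substitution")
else:
    print("host key matches the recorded fingerprint")

Of course, the quote's whole point is that a compromised drive controls what you read back from it, so a check like this only means something when it's run from a known-clean system against the suspect disk.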
 

Deders

Platinum Member
Oct 14, 2012
2,401
1
91
Didn't Kaspersky uncover some NSA firmware that has been rife among HDDs for years?

http://www.theregister.co.uk/2015/02/17/kaspersky_labs_equation_group/

Not sure if this article is talking about the same specific code, but it has this to say:

"However, don’t rush to find your screwdriver – we don’t expect this ultimate infection ability to become mainstream. Even the Equation group itself probably only used it a few times, as HDD infector module is extremely rare on victim’s systems. For starters, hard drive reprogramming is much more complex than writing, let’s say, Windows software. Each hard drive model is unique and it is very expensive and painstaking to develop an alternative firmware. A hacker must obtain the hard drive vendor’s internal documentation (which is nearly impossible), purchase some drives of the exact same model, develop and test required functionality, and squeeze malicious routines into existing firmware, all while keeping its original functions. This is very high profile engineering which requires months of development and millions in investment. That’s why it’s not feasible to use this kind of stealth technologies in criminal malware or even most targeted attacks. In addition, firmware development is obviously a boutique approach which can’t be easily scaled. Many manufacturers release firmware for multiple drives each month, new models come out constantly, and hacking each one is something beyond the possibility (and need) for the Equation group – and anyone else."

 

intangir

Member
Jun 13, 2005
113
0
76
Doesn't anyone have anything else to say about the IME?

Not much, other than that if implemented correctly, the IME should have no impact at all on performance during normal operation. Really, it's like adding another independent watchdog that operates in parallel with the existing chip components. So the entire premise of the thread is fallacious.

You'll have to look elsewhere for an explanation of the tepid performance gains of the last few generations of x86 chips. In my estimation, it's pretty much down to the reasons frozentundra123456 summed up: it's hard to squeeze out more IPC gains at this point, Dennard scaling of transistors has come to an end, and the market is no longer as performance-driven as it used to be (existing chips are fast enough for the majority of customers).

These small IPC gains aren't recent. Here's where the large IPC gains happened:

1) 8086 to 80286
2) 80286 to 80386
3) 80386 to 80486
4) 80486 to Pentium
5) Pentium to Pentium Pro

IPC increases since Pentium Pro have been fairly small. So essentially we've had 20 years of small IPC gains.

Yes, exactly, and with clock speeds no longer increasing, that doesn't leave anywhere for performance improvements to come from.

Pentium 4 to Core 2
Core 2 to Nehalem
Nehalem to Sandy Bridge

All pretty big IPC increases.

You don't have a proper perspective on what IPC gains used to look like back in the 1980s and '90s. The gains you listed were in the neighborhood of ~50% for the Pentium 4 to Core transition and maaaybe at most 20% for the generations since then. And the P4 transition was a unique circumstance: switching from the NetBurst "speed-racer" microarchitecture that pushed clock speed to a "brainiac" one that emphasized more work per clock cycle.

Here, let me show you some numbers for the earlier x86 generations:

Code:
Year     Gen              MIPS   MHz   IPC     Gain
1978     8086             0.33     5   0.066    -
1982     286              1.5     10   0.15   +127%
1985     386             11.4     33   0.33   +120%
1989     486             40       50   0.80   +142%
1993     Pentium (P5)   126.5     75   1.69   +111%
1995     PPro (P6)      541      200   2.71   + 60%
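
If you want to check the arithmetic: IPC here is just MIPS divided by MHz, and each gain is the ratio to the previous generation. (Recomputing puts the 386 nearer 0.35 than 0.33, so that row's percentage wobbles a bit with rounding, but the shape holds.)

Code:
# Recompute the IPC column (MIPS / MHz) and gen-over-gen gains from the table.
chips = [("8086", 0.33, 5), ("286", 1.5, 10), ("386", 11.4, 33),
         ("486", 40.0, 50), ("Pentium (P5)", 126.5, 75),
         ("PPro (P6)", 541.0, 200)]

prev = None
for name, mips, mhz in chips:
    ipc = mips / mhz
    gain = "  -" if prev is None else f"+{(ipc / prev - 1) * 100:.0f}%"
    print(f"{name:14s} IPC={ipc:5.3f}  gain={gain}")
    prev = ipc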

As you can see, historically, any gain that doesn't at least double IPC isn't anything to get excited about. We've been inching along IPC-wise since the Pentium Pro came out 20 years ago. And things are unlikely to improve no matter how many resources Intel expends on the problem.

The 386 was a pretty straightforward 32-bit extension of the previously 16-bit x86 architecture. The 486 added pipelined scalar execution and an integrated floating-point unit, the P5 added superscalar execution (multiple pipelines) and branch prediction, and the P6 added out-of-order execution and micro-op decoding. Since then, with one five-year detour through NetBurst, Intel has essentially been making incremental tweaks to the base P6 design and bolting on special-purpose hardware like SSE units. There are just no big microarchitectural ideas left that can suddenly unlock new branches of IPC fruit to pick.

Moreover, Robert Colwell, one of the designers of the P6, observed that the vast majority of performance gains since 1980 have come from clock speed, not microarchitecture. In his estimation, it's about a 3500x improvement from clock speed (1 MHz to 3.5 GHz) and about 50x from architecture/microarchitecture (0.066 IPC to ~3.5). And now that transistor scaling seems to be running out of steam, we'll be looking at perhaps 10% gains from here on in.
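
His split is easy to sanity-check if you treat performance as roughly clock rate times IPC:

Code:
# Colwell's rough decomposition: performance ~ clock_rate * IPC.
clock_1980, clock_now = 1e6, 3.5e9   # 1 MHz -> 3.5 GHz
ipc_1980, ipc_now = 0.066, 3.5       # 8086-era IPC -> modern IPC

clock_gain = clock_now / clock_1980  # ~3500x from frequency alone
ipc_gain = ipc_now / ipc_1980        # ~53x from (micro)architecture
print(f"clock: {clock_gain:.0f}x, IPC: {ipc_gain:.0f}x, "
      f"combined: {clock_gain * ipc_gain:,.0f}x")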

You should listen to his keynote speech, "The Chip Design Game at the End of Moore's Law," from the Hot Chips 25 symposium in 2013. It's a really good, entertaining watch. At the time, he was Director of the Microprocessor Technology Office at DARPA.

"It's really important: don't confuse performance and transistors. Moore's Law is about transistors. It's [the industry's] job to turn that into performance, or whatever people want to pay money for."

"I don't think there's ever been a technology development exponential like this one. I also heard Ray Kurzweil in particular goes down a different path. He says 'no no no no, Moore's Law is just one of a set of exponentials over history. It's just the latest one; don't worry about it.' And I say, baloney, I don't agree with you at all. I think there haven't been five; I think there's been one, and this is it.

"The way I thought of a chip architect's job in context with Moore's Law was to stay out of the way. The idea was if nature's going to give you this bounty of lots more transistors, oh and they're all faster, oh and they're lower power, don't fight it! Find a way to design the machine so as to leverage that fact, rather than try to be clever and just blow that off.

"I'm not saying microarchitecture has no place. But the way I view it, the scorecard over the last, say, 35 years: I figure, [in] 1980, Bill and I were both at Bell Labs designing essentially a 1 megahertz 32-bit processor. Today, clocks are running about 3500 times faster. And so I think it's entertaining to sort of consider what did we architects bring to the table, in terms of pure architecture--microarchitecture ideas like pipelining, or superscalar, caches, all the stuff that we threw in--relative to just the plain clocks plus large number of transistors. And I think the score is, we came out way on the short end of that stick. I think the silicon gave us way more than the architects could have made up for.

"Why that matters, aside from getting yelled at from [your managers]-- Regina Dugan was the previous head of DARPA, and at some point I said much these same words to her, and she said, well then, you wasted your time professionally for many years, didn't you? And I went, I choose not to view it that way, ok? But aside from personal pride, the question is if the fundamental Energizer Bunny silicon engine stalls out, and we have to resort to being clever and only microarchitecture without additional transistors, how much runway is left? And I'm saying we're going to go down there, we're going to do the best we can, but don't expect 3500x. There's nothing like that on the [horizon]--I don't see that remaining.

"All right, so can we continue to crank out successful new chips? You could ask the question, well sure we can, if you can find enough goodies to bucket them all together and say, okay maybe Moore's Law left and I don't have as many transistors, but I'm a clever person and my new machine is 50% better than the old one in... performance, or power, or something. I would say 50% is pretty good. You could probably find a market for that. Uh, how about 20%? How about ten percent? How far down are you willing to go and still think that you've got something that you can sell? I think that's the future that we have to contemplate seriously and try to avoid, because I don't think the world's going to give a whole lot of extra money for a ten-percenter in general.

"So here's an example. I picked this off the Internet. Unfortunately, in DARPA's zeal to give everything the proper attribution down here, they replaced the person's name with the column, but you'll find it. This was one of those letters to the editor kind of things, a comment column. It's usually a lot of junky stuff down there, but this at least was crisp about what the attitude was. It says, 'Ultimately, I think Moore's Law will never stop. Computer builders will find other methods to make their computers faster.' And I think, well, that's at least--I'm happy for your optimism. I actually think that there's some truth to that for a short time.

"But the problem is that the low-hanging fruit has already been taken, and the amount of effort it's going to take to do anything beyond that is going to be substantial. And you cannot, in my personal opinion, you cannot make up for the lack of an underlying exponential. Those fixes will last us a few--we'll play all the little tricks that we still didn't get around to, we'll make better machines for a while, but you can't fix that exponential. A bunch of incremental tweaks isn't gonna hack it."

Performance doesn't always come from IPC*MHz; it can depend on the kind of code being run. For instance, having more SIMD paths is partly what set Sandy Bridge processors apart from Nehalem. A specific example: comparing my old i5-750 to an Ivy Bridge i5 system I built for my brother, using the same model GeForce 670 in both, PhysX performance in Arkham City was close to double on the Ivy Bridge chip, even though the i5-750 was clocked 400 MHz higher.

I think the reason we are seeing significantly better performance in games like GTA V with Skylake compared to Haswell is that games like these can make better use of the CPU's registers and resources than older, less demanding games.

I think what you're seeing here is an example of software taking advantage of special-purpose hardware additions, like QuickSync or AES encryption instructions, and those sorts of gains are workload-dependent and rely on the unpredictable implementation schedules of software writers. You can't depend on those for general performance improvements.
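
If you're curious whether your own chip exposes that special-purpose hardware, one quick way is to scan the flags line of /proc/cpuinfo (a Linux-only sketch; other OSes expose CPUID feature bits differently):

Code:
# Linux-only sketch: look for special-purpose instruction-set flags
# (AES-NI, AVX, ...) that software has to target explicitly.
flags = set()
with open("/proc/cpuinfo") as f:
    for line in f:
        if line.startswith("flags"):
            flags = set(line.split(":", 1)[1].split())
            break

for feature in ("aes", "avx", "avx2", "sse4_2"):
    print(f"{feature:7s} {'present' if feature in flags else 'absent'}")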
 

imported_ats

Senior member
Mar 21, 2008
422
63
86
Some other interesting info found in the above article:



It is upsetting that some types of malware cannot be removed by reinstalling the operating system. So this idea sounds really interesting.

I wonder how easy it would be to implement on Intel? On AMD?

Every bit of the ME blobs is encrypted and cryptographically signed with strong crypto. For a kit to persist past boot, it would need the private keys, which is highly unlikely. Very, very few people within Intel even have access to those keys.
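
For flavor, the general shape of signed-blob verification looks something like the sketch below. This is emphatically not Intel's actual scheme (the details aren't public), just the standard pattern, using the third-party Python cryptography package:

Code:
# Illustrative only: the generic pattern behind signed firmware blobs.
# NOT Intel's actual ME scheme; it just shows why persisting malware in
# a signed blob requires the vendor's private signing key.
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import padding

def blob_is_authentic(blob: bytes, signature: bytes, pubkey_pem: bytes) -> bool:
    public_key = serialization.load_pem_public_key(pubkey_pem)
    try:
        # Any flipped bit in the blob, or a signature made with the wrong
        # key, raises InvalidSignature.
        public_key.verify(signature, blob, padding.PKCS1v15(), hashes.SHA256())
        return True
    except InvalidSignature:
        return False

Without the matching private key, an attacker can't produce a modified blob that verification will accept.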