It seems only fitting that one of the two hardware-based exploits to rock the CPU world this week was named Meltdown. Because for the last 24 hours or so, it feels like I’ve been on the verge of one just trying to keep up with all of the new information that has come out on this and the also aptly named Spectre exploit. Suffice it to say, it’s the kind of week we haven’t seen for a long time in the technology industry. But I’m getting ahead of myself, so let’s start at the beginning.

Security researchers working for Google’s Project Zero group, along with other research groups and academic institutions, have discovered a series of far-ranging security risks involving speculative execution. Speculative execution is one of the cornerstones of high-performance execution on modern CPUs, and is found in essentially all CPU designs more performant than an embedded microcontroller. As a result, essentially every last high-performance CPU on the market or that has been produced in the last couple of decades is vulnerable to one or more of a few different exploit scenarios.

The immediate concern is an exploit being called Meltdown, which primarily affects Intel’s CPUs, but has also been confirmed to affect some ARM CPU designs. With Meltdown it is possible for malicious code to abuse Intel and ARM’s speculative execution implementations to get the processor to leak information from other processes – particularly the all-knowing operating system kernel. As a result, Meltdown can be readily used to spy on other processes and sneak out information that should be restricted to the kernel, other programs, or other virtual machines.

Meanwhile a second class of attacks is being called Spectre, and the number of processors at risk for exploitation is even wider. Essentially every high-performance processor ever made – Intel, AMD, ARM, and POWER – is thought to be vulnerable here. Like Meltdown, a Spectre attack abuses speculative execution in order to glean information that should be restricted. What makes Spectre different however is that it’s a less-straightforward but much more insidious attack; whereas Meltdown is based on abusing specific implementations of speculative execution, Spectre can be thought of as a (previously unknown) fundamental risk of speculative execution, one that can now be weaponized. Spectre requires more setup work to coerce a target application to leak information, but the fundamental nature of the risk means that Spectre is currently considered harder to mitigate, and in general is not as well understood.

Between Meltdown and Spectre, the end result is that prior to patching and mitigation efforts, virtually every PC and every mobile device is thought to be vulnerable to these attacks. And because the root causes are based in hardware architecture rather than software, these are not as easily fixed as software bugs. The good news is that it looks like the worst of these attacks can be mitigated with a combination of software and CPU microcode updates. However the scale of the problem – virtually all computers and mobile devices currently in use – means that it will take quite a bit of effort to mitigate.

An Early Release Leads To An Incomplete Picture

The release of the news about Meltdown and Spectre was unplanned, and as a result researchers, hardware vendors, software vendors, and the public at large have been trying to catch up on everything that is going on. We now know that the Meltdown and Spectre attacks were discovered last summer, and since then vendors and security researchers have been coordinating their response in order to better understand the ramifications of the exploits and to ensure everyone had time to develop the necessary fixes and guidance. Until yesterday, information on Meltdown and Spectre was not intended to be published until Tuesday, January 9th, which would have been the first Patch Tuesday of the year.

Instead, what happened is that as the mitigation patches for the exploits were committed to the Linux kernel repository, people quickly began piecing together the idea that something was wrong. With both correct and incorrect speculation quickly taking over, the vendors moved up their information and patch releases on the exploits to yesterday. The result, as you might expect from a premature publication, was somewhat haphazard – I was reading new advisories well into the evening – and even now not everyone has published advisories (or at least, not what would normally count as a full advisory).

As a result, I’m still piecing together information, and it’s likely that by the time I’ve finished writing this, something in this article will be out of date or wrong. But this is everything we know as of this morning.

DON’T PANIC

It is said that despite its many glaring (and occasionally fatal) inaccuracies, the Hitchhiker's Guide to the Galaxy itself has outsold the Encyclopedia Galactica because it is slightly cheaper, and because it has the words 'DON'T PANIC' in large, friendly letters on the cover.

Before diving into some of the deeper technical aspects of Meltdown and Spectre, I think it’s best to start with a high level overview of where things stand. This includes the risks posed by these exploits, how they’re being mitigated, and what individual users need to do (if anything) about the issue.

The good news is that unless you're a cloud service provider, the immediate risk from these attacks is very low. The bad news is that because these exploits are based on hardware vulnerabilities, they will take some time to fix. And there are a lot of devices running a lot of different OSes out there that need to be fixed.

Things We Know

These are local attacks: Both Meltdown and Spectre are local attacks that require executing malicious code on a target machine. This means that these attacks are not (directly) drive-by style remote code execution attacks – think Nimda or Code Red – and that systems cannot be attacked merely by being connected to a network. Conceptually, these are closer to privilege escalation attacks, a class of attacks that help you get deeper into a system you already have access to. With that said however, researchers have shown that they can perform Spectre-based attacks using JavaScript, so it is possible for a web browser to pull a malicious JavaScript file and then attack itself in that fashion.

These are read-only (information disclosure) attacks: Along with not directly being remotely exploitable, even if Meltdown and Spectre attacks are executed on a local system, the nature of the exploit is that these are read-only attacks. That is, they can only read information from a system. They cannot directly force code execution in the OS kernel, in other virtual machines, or other programs. These sorts of information disclosure attacks can still be devastating depending on what information is leaked – and there is always the risk of using that information to then chain it into a code execution attack – which is why they’re still concerning. But the real risk is in hostile parties using these attacks to steal information, not to control a system.

The principal threat is to shared hosting environments: Given the above, these attacks are most threatening to shared hosting environments, where multiple users are all capable of executing code on a single system. As a result, cloud service providers like Amazon and Microsoft have already deployed attack mitigation efforts to their services. Individual systems and devices, by comparison, have a much lower practical risk. An attacker still needs to get malicious code executing on an individual system before they can run an attack; and if an attacker can do that, then an individual system is already in a bad position to begin with.

Meltdown and Spectre can be mitigated in software: Because the root issues at the heart of Meltdown and Spectre are at the hardware level, ideally, that hardware needs to be replaced. As replacing 20 years of systems isn’t remotely practical however, like other CPU errata these issues can be mitigated with a combination of CPU microcode and operating system updates. Vendors like Microsoft, Apple, and the Linux distros are already in the process of rolling out some of these fixes, including an ultra-rare out-of-band security update from Microsoft Wednesday evening.

…but it will take some time: However because information on the exploit was released earlier than planned, not all mitigation efforts are ready. Full mitigation requires both software and microcode updates, and as Intel has noted in their own announcements, mitigation efforts will take days and weeks.

Mitigating Meltdown will have a variable performance impact: In a nutshell, the mitigation efforts for Meltdown involve better separating user space programs from the OS kernel. As a result, context switches between the user space and the kernel will get more expensive. However the actual performance impact of this process is going to vary with the workload and the CPU architecture.

It’s a bad week to be Intel: Intel has been the dominant CPU supplier for cloud service providers, who in turn are at the greatest risk from Meltdown and Spectre. Coupled with the fact that Intel is far more broadly affected by the more pressing Meltdown attack than any other vendor, this means that Intel’s high margin server customers are shouldering the brunt of the risk of these attacks. And that in turn reflects poorly on Intel.

Meanwhile, as for what CPUs are affected, Intel's most recent note confirms that everything Core architecture-based back to the first generation (Nehalem) is affected. Note that this doesn't exclude earlier processors either.

It’s a better week to be AMD (but not great): Conversely, AMD is having a much better week. As best as can be determined, their CPUs aren’t vulnerable to Meltdown attacks – the only vendor among the Big 3 not to be impacted by it. And with Meltdown being the more pressing risk, this means that AMD and its users aren’t nearly as exposed as Intel users. However AMD’s CPUs are still vulnerable to Spectre attacks, and in the long run there are still a lot of unknowns about how dangerous Spectre really is, and how well it can be mitigated.

Things We Don’t Know

When the full mitigation updates will be available for any one platform: As noted above, rolling out updates to mitigate the exploits will take days and weeks. On the PC side of matters, Microsoft’s update is just one piece of the puzzle – though arguably the most important one – as it can mitigate Meltdown without a microcode update. However we’re still waiting on microcode updates to better mitigate Spectre on Intel processors (and it looks that way for AMD processors as well).

Otherwise on the mobile side of matters, Google has announced that they’ve already rolled out Android updates with ARM’s recommended mitigations to supported Nexus and Pixel devices, but these updates don’t include all of the necessary upstream fixes from the Linux kernel.

The performance impact of these mitigations is unclear: Following up on the point about mitigating Meltdown, it’s not clear what the full performance impact of this will be. Individual operations and workloads could potentially be upwards of 30% slower, but it heavily depends on how often a task is context switching to the kernel. I expect the average real-world impact to be less, particularly for desktop users. However server users and their unique-but-narrowly-focused workloads could be much more affected if they're particularly unlucky.
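
To make that dependence on kernel transitions concrete, here is a back-of-the-envelope sketch in Python. The function name and every number in it (the added cost per transition and the transition rates) are illustrative assumptions rather than measurements; the only idea taken from the text above is that the overhead scales with how often a task crosses into the kernel.

```python
def estimated_overhead_fraction(transitions_per_sec: float, added_cost_us: float) -> float:
    """Return the fraction of each second spent on extra user/kernel transition cost.

    Both inputs are hypothetical: transitions_per_sec is how often the workload
    enters the kernel, and added_cost_us is the extra time (in microseconds)
    each round trip costs once the mitigation is in place.
    """
    lost_seconds = transitions_per_sec * added_cost_us / 1_000_000
    return min(lost_seconds, 1.0)  # a workload cannot lose more than the whole second


# Hypothetical workloads: (label, kernel transitions per second)
workloads = [
    ("desktop app, mostly compute", 5_000),
    ("I/O-heavy server", 500_000),
]

for label, rate in workloads:
    # 0.5 microseconds of added cost per transition is an assumed figure
    fraction = estimated_overhead_fraction(rate, 0.5)
    print(f"{label}: roughly {fraction * 100:.2f}% of CPU time lost")
```

Under these made-up inputs the compute-heavy desktop case loses a fraction of a percent while the I/O-heavy server loses a double-digit share, which is the shape of the difference described above even though the real figures will depend on the CPU and the workload.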

Meanwhile, the performance impact of Spectre mitigations is even less understood, in part because efforts to mitigate Spectre are ongoing. Based on what the hardware vendors have published, the impact should be minimal. But besides the need for empirical testing, that could change if Spectre requires more dramatic mitigation efforts.

On our end we’re still waiting on some microcode updates, and even then it’s a process of trying to figure out what an average case will even entail. It’s definitely something we intend to dig into in true AnandTech style once we have the requisite updates.

Finally, for their part, Microsoft's Azure group has published their own figures in their advisory to customers, noting that the performance impact they've found thus far is quite limited.

The majority of Azure customers should not see a noticeable performance impact with this update. We’ve worked to optimize the CPU and disk I/O path and are not seeing noticeable performance impact after the fix has been applied. A small set of customers may experience some networking performance impact.

It’s not clear just what the full security ramifications of Spectre are: While Meltdown is the more immediate threat, how it works and how to mitigate it are fairly well documented. Spectre however is a definite wildcard right now. There are multiple proof of concept attacks as it stands, but more broadly speaking, Spectre attacks are a new class of attacks not quite like anything vendors have seen before. As a result no one is completely confident that they understand the full security ramifications of the exploit. There is a risk that Spectre attacks can be used for more than what’s currently understood.

It’s also not clear just how well Spectre can be mitigated: The corollary to not fully understanding the attack surface of Spectre is that defending against it is not fully understood either. The researchers behind the attack for their part are not convinced that software or microcode updates are enough to fully resolve the problem, and are advising that they should be treated as stop-gap solutions for now. Specific types of Spectre attacks can be mitigated with care, but those protections may not help against other types of Spectre attacks. It’s an area where a lot more research needs to be done.

What Can Users Do Right Now?

Finally, there’s the question of what system and device owners should do about the Meltdown and Spectre attacks. The fundamental weakness that allows these speculative execution attacks is in the hardware itself, so short of replacing devices, the problem cannot be truly solved. The only thing that can be done right now is to mitigate the problem with software and microcode patches that attempt to work around the problem.

The solution then is a double-edged sword for users: there’s not much they can do, but there’s also not much they have to do. The software and microcode updates to mitigate these exploits will be distributed as software updates, so keeping your systems and mobile devices up-to-date with the latest OS version is the single most important step one can take. As mentioned earlier, everyone has rolled out, or is in the process of rolling out, the necessary software updates to help mitigate this. The flip side is that, short of a focused ion beam tool, there’s not much else for users to do. This is a problem whose resolution is going to be vendor-driven.
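
For Linux users who want to confirm that an update actually took effect, one option is to read the per-vulnerability status files that newer kernels expose under /sys/devices/system/cpu/vulnerabilities. The sketch below assumes that directory is present, which will not be the case on older or unpatched kernels or on other operating systems; the function name is my own, and the status strings are whatever the kernel chooses to report.

```python
from __future__ import annotations

from pathlib import Path

# Directory where newer Linux kernels report per-vulnerability status.
# Assumption: the running kernel is recent enough to provide it.
VULN_DIR = Path("/sys/devices/system/cpu/vulnerabilities")


def read_mitigation_status(vuln_dir: Path = VULN_DIR) -> dict[str, str]:
    """Map each reported vulnerability name to the kernel's one-line status."""
    if not vuln_dir.is_dir():
        return {}  # kernel too old to report anything, or not Linux at all
    return {
        entry.name: entry.read_text().strip()
        for entry in sorted(vuln_dir.iterdir())
        if entry.is_file()
    }


if __name__ == "__main__":
    status = read_mitigation_status()
    if not status:
        print("No vulnerability status reported by this kernel.")
    for name, line in status.items():
        print(f"{name}: {line}")
```

An empty result does not mean a system is safe; it only means the kernel is not reporting, in which case installing the latest OS updates remains the step that matters.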

More to Come

Be sure to check back later today for some additional information on these attacks. In particular, I’m looking to further explore the importance of speculative execution, and why an attack against it may have some significant ramifications for CPU designs down the line. Speculative execution is a rather important feature for boosting performance on modern processors, but as these attacks have shown, even when (seemingly) ideally implemented, it can have some security tradeoffs. Which in this current era of computing, may no longer be acceptable tradeoffs.

I’ll also be taking a deeper look at Meltdown and Spectre, and how although both of them are attacks on speculative execution, they are very different kinds of attacks. Meltdown is the biggest concern right now due to the relative ease of exploiting it. But as the authors of the Spectre paper note, Spectre is a more fundamental vulnerability that will be with us a lot longer. Just what it means for system security remains to be seen, but from a computer science point of view, I expect these latest discoveries are going to have a big impact on how future CPUs are designed.

210 Comments

  • Cumulus7 - Wednesday, January 10, 2018

    Exactly! Now we need another server-CPU-shootout. All old test data suddenly seems obsolete....
  • Round - Monday, January 8, 2018

    Sigh.. Too bad that the word "performant" gets used in an article intended for mass consumption.. It wasn't a word 10 years ago, and it still isn't a word today, irrespective of those that think it is....
  • twtech - Thursday, January 11, 2018

    The hack method central to both of these attacks is really pretty ingenious, and reminds me of something Sherlock Holmes might have come up with, if he worked in information security.

    This is essentially Meltdown in a nutshell:

    Accessing data from main memory is a (comparatively) really slow operation for a CPU. For that reason, the CPU has a small amount of memory on-chip called cache where it can store the most recently used values.

    A computer program is a set of sequential instructions that tell the CPU what to do, in order. So if the processor needs to load a value from memory, normally it would wait until it reached that instruction, issue a load request, and then wait for a significant amount of time for the data to come in over the bus.

    That's not very efficient, so a feature called "speculative execution" was added. With speculative execution, in addition to executing the current instruction, the processor will also essentially look ahead beyond that point to future instructions, and see what operations it might be able to get started early.

    Since normal execution is not actually there yet, it's not even guaranteed that the speculative work it's doing will actually be used. The instructions may contain a branch - which is essentially, "skip past these instructions", or "skip over these, but do those instead" - and the CPU doesn't know yet which way it will go. So rather than waiting to find out, the CPU just speculatively executes them anyway. If they turn out to be unused, the CPU will discard the unused values and everything continues on as if the speculative execution had never happened.

    Or so they thought. With Meltdown, the attacking software gets the CPU to speculatively load some data from a memory location the program isn't allowed to access, placed inside a branch that will never actually execute. So that causes the desired value to be loaded into cache, and since the program never actually executes the instructions with the invalid load, the program doesn't crash.

    But what good is that to the hack program if it can never actually inspect the value that it caused to be loaded into cache, right? That's where the ingenious trick comes in. In addition to loading in the value from a memory address it can't access, it further has the CPU speculatively load in a second value, from a memory address that the program is allowed to access. The memory location of that second variable however is chosen based on the contents of the first variable.

    Because we know that loading a value from cache is much faster than loading a value from main memory, the attacking program can then test and figure out which memory location was chosen to be the 2nd one loaded in, by timing how long the operation takes.

    Since that second memory location was chosen based on the value of the data in the first memory location, the program has now gained some information about what is stored in the first memory location that it's not supposed to have. And due to the magic of computers being extremely fast and predictable, the attacking program can repeat this process many times over to essentially indirectly read the contents of protected memory that should only be readable by the OS.
  • thuckabay - Friday, January 12, 2018

    This is NOT spam, but a legitimate comment AnandTech: I am not willing to sacrifice the kind of performance noted for my Windows 7 laptop running a Sandy Bridge i7 CPU. That is stupid, especially given that there really is NO threat. Now that so many systems are going to be updated, there is little reason for any scumbags to try to exploit these vulnerabilities, IMO. From my perspective, the cure is far worse than the disease, especially on older hardware / OS combinations. It just is not worth it. So, I believe Microsoft should make a way to have these patches be OPTIONAL and AVOIDABLE and UNINSTALLABLE. This is crap!
  • UtilityMax - Sunday, January 14, 2018

    So go sue Intel or turn off the updates and run your Windows 7 unpatched. doh
  • linuxgeex - Tuesday, January 23, 2018

    "More to Come
    Be sure to check back later today for some additional information on these attacks. In particular, I’m looking to further explore the importance of speculative execution, and why an attack against it may have some significant ramifications for CPU designs down the line."

    That was 18 days ago. You encouraged us to come back later the same day. Where's the update?
  • Rοb - Wednesday, January 24, 2018 - link

    @linuxgeex - I did some searching and the patches have been the subject of much criticism.

    There's been some systems rendered unbootable and some rebooting.

    Once successfully applied many benchmarks experience an insignificant change while some are over 20% slower.

    Many patches have been rolled back or withdrawn.

    Linus Torvalds has given Intel heck in a post over the weekend ...

    Apple claims they've had success ...

    ---

    One can only hope that the relative silence and possibly slower rollout of AMD's Zen 2 (Epyc 7 nm) is related to thoroughly testing and investigating that moving forward they will be immune and have the necessary isolation in place to run multiple processes with proper compartmentization.

    As the author stated at the beginning of his article, he's suffering a meltdown and haunted by spectres.

    🔥+👻=🌋🌀
  • HStewart - Monday, March 5, 2018

    According to eWeek (3/2/2018), Microsoft and I assume Intel has released official patches for Windows 10 for Meltdown and Spectre

    http://www.eweek.com/security/microsoft-resumes-is...

    Hopefully this is end of this problem.
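
The cache-timing trick that twtech describes in the comment above can be illustrated with a toy simulation. This is not an exploit and touches no real hardware: the "cache" is a Python set, the latencies are made-up constants, and every name in it (ToyCache, speculative_gadget, recover_byte, leak) is hypothetical. It only shows how a secret byte can be inferred from which of 256 probe slots has become fast to access.

```python
CACHE_HIT_NS = 10     # assumed latency for an access served from cache
CACHE_MISS_NS = 200   # assumed latency for a trip to main memory


class ToyCache:
    """A pretend CPU cache that remembers which probe slots were touched."""

    def __init__(self) -> None:
        self._lines = set()

    def flush(self) -> None:
        """Empty the cache so every slot is slow again."""
        self._lines.clear()

    def access(self, slot: int) -> int:
        """Touch a slot and return the simulated access time in nanoseconds."""
        latency = CACHE_HIT_NS if slot in self._lines else CACHE_MISS_NS
        self._lines.add(slot)
        return latency


def speculative_gadget(cache: ToyCache, secret_byte: int) -> None:
    """Stand-in for the speculated instructions: the first (forbidden) load
    yields secret_byte, and the second load's address depends on that value.
    The results are discarded, but the cache side effect remains."""
    cache.access(secret_byte)


def recover_byte(cache: ToyCache) -> int:
    """Time all 256 probe slots; the single fast one reveals the secret byte."""
    timings = {slot: cache.access(slot) for slot in range(256)}
    return min(timings, key=timings.get)


def leak(secret: bytes) -> bytes:
    """Recover the secret one byte at a time using only access timings."""
    cache = ToyCache()
    recovered = []
    for byte in secret:
        cache.flush()                     # start each round from a cold cache
        speculative_gadget(cache, byte)   # the attacker never reads `byte` directly
        recovered.append(recover_byte(cache))
    return bytes(recovered)


print(leak(b"kernel secret"))
```

Note that recover_byte never receives the secret; it sees only timings. Real attacks have to cope with noisy measurements and repeat each probe many times, which this sketch deliberately leaves out.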
