AMD Zen Microarchitecture: Dual Schedulers, Micro-Op Cache and Memory Hierarchy (anandtech.com)

190 points by dineshp2 9 years ago | 131 comments

geertj 9 years ago |

SEV (Secure Encrypted Virtualization, [1]) is a hugely interesting feature that will be available with Zen. Once it's mature and perfected, it would allow you to securely run a VM in the cloud that is protected against someone who controls the hypervisor. And you'd also be able to attest that indeed you're running in such a protected VM.

How do you protect against someone controlling the hypervisor? Read the paper. But the high level is to encrypt memory using keys that cannot leave the processor and are only available to a specific VM ASID (Address Space Identifier), assisted by a secure firmware similar to the Secure Enclave. Attestation uses an on-chip certificate signed by an AMD master key during fabrication.
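To make the per-ASID key idea concrete, here is a toy sketch in Python. It is not how SEV is implemented: the `ToyMemoryController` class, its method names, and the SHA-256-based keystream are all assumptions made up for illustration (the real hardware uses AES inside the memory controller). It only shows the shape of the idea: keys stay inside the "processor" object, each guest is tied to one ASID, and anything reading DRAM directly sees ciphertext.

```python
import hashlib
import secrets


def _keystream(key: bytes, addr: int, length: int) -> bytes:
    """Toy keystream from SHA-256 over (key, address, counter). NOT real AES."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        block = hashlib.sha256(
            key + addr.to_bytes(8, "little") + counter.to_bytes(4, "little")
        ).digest()
        out.extend(block)
        counter += 1
    return bytes(out[:length])


class ToyMemoryController:
    """Hypothetical model: keys live only inside this object, indexed by ASID."""

    def __init__(self) -> None:
        self._keys: dict[int, bytes] = {}  # never handed out to callers
        self.dram: dict[int, bytes] = {}   # what a hypervisor or DRAM probe sees

    def create_guest(self, asid: int) -> None:
        # Generate a fresh random key for this guest's ASID.
        self._keys[asid] = secrets.token_bytes(32)

    def write(self, asid: int, addr: int, data: bytes) -> None:
        # Encrypt on the way out to DRAM using the key for this ASID.
        ks = _keystream(self._keys[asid], addr, len(data))
        self.dram[addr] = bytes(a ^ b for a, b in zip(data, ks))

    def read(self, asid: int, addr: int) -> bytes:
        # Decrypt on the way in, again using the key for the requesting ASID.
        raw = self.dram[addr]
        ks = _keystream(self._keys[asid], addr, len(raw))
        return bytes(a ^ b for a, b in zip(raw, ks))


mc = ToyMemoryController()
mc.create_guest(asid=1)
mc.create_guest(asid=2)
mc.write(asid=1, addr=0x1000, data=b"guest secret")

own_view = mc.read(asid=1, addr=0x1000)    # decrypted with guest 1's key
other_view = mc.read(asid=2, addr=0x1000)  # wrong key, so the result is garbage
raw_view = mc.dram[0x1000]                 # ciphertext as stored in "DRAM"
```

In this model only the guest with the matching ASID recovers the plaintext; the other guest and the raw DRAM view do not. The leaks discussed on the mailing list are exactly the things a toy like this ignores.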

There were some discussions on this on the linux-kernel mailing list [2]. As I understand it, the current generation of SEV is still somewhat leaky, but there's no fundamental reason why those leaks cannot be closed.

[1] http://amd-dev.wpengine.netdna-cdn.com/wordpress/media/2013/... [2] http://www.mail-archive.com/linux-doc@vger.kernel.org/msg025...

AnthonyMouse 9 years ago | |

I still don't understand how this is ever supposed to work. Generally when someone finds a vulnerability, you take countermeasures or take the system offline until it can be patched (or apply the patch immediately).

With this, the party in control of the system is also in control of that, so every time a new vulnerability is found they can exploit it before patching it to retroactively get access to your data. Or never patch it at all and use the vulnerability itself to forge attestations that the vulnerability is patched.

sliverstorm 9 years ago | | |

It might not make your guest truly impervious, but it certainly raises the bar for your bad actor host.

Depending on how determined you imagine your bad actor host, you can probably never get around things like "zero day is discovered, host disconnects guest from internet preventing you from patching zero day, exploits guest".

Or are you talking about vulnerabilities in SEV itself?

urza 9 years ago | |

It reminds me of Permutation City, the scifi book by Greg Egan.. encrypted realities inside other realities.. :)

nitrogen 9 years ago | | |

See also Rainbows End by Vernor Vinge, where a secure encrypted sub-CPU plays a role.

spaceheeder 9 years ago | |

In addition to cloud VMs, I wonder what applications this might have on local systems. Systems that boot from encrypted partitions and can't have the keys recovered by cold boot attacks? Secure graphics acceleration of different guests in a Qubes system? etc.

geertj 9 years ago | | |

I believe that the enabling technology for SEV (SME - Secure Memory Encryption) would indeed protect against cold boot attacks. The SME keys are not stored in memory themselves and therefore once the CPU reboots and the SME keys are erased, the memory contents are lost forever.

fulafel 9 years ago | | |

DRM and keeping the user from rooting their PC. This is like MS/Intel Trusted Computing on steroids.

sangnoir 9 years ago | | |

> In addition to cloud VMs, I wonder what applications this might have on local systems

Here's one application for the red team: AV-resistant malware, rootkits and next generation APTs

lancemjoseph 9 years ago | |

I thought homomorphic encryption was supposed to fill the niche of allowing one to securely run VMs in a cloud environment. I hadn't heard of serious progress on this front the last time I went looking. Will we always require hardware with a "secure enclave"-like device to safely store keys in a public cloud? Is it possible to implement this scheme purely in software or is some "trusted" hardware always necessary?

MertsA 9 years ago | | |

Forewarning, I am by no means an expert on anything that follows.

Homomorphic encryption would allow for "true security" where the party doing the computation doesn't ever have the encryption keys necessary to see what data they're operating on. This is something more akin to a TPM. The key that can read all of the data is in the possession of the party doing the computation, but it's stored in the CPU and the CPU will not give that key to anyone. Theoretically the key could be read off of the CPU but in practice this would require either a flaw, sidechannel, or a lot of time with an electron microscope.

For practical purposes, I believe that all implementations of secure cloud computing are going to be like this where the key is just secured physically. It's possible with homomorphic encryption to have someone securely do computations on data that they can't see all in software, but I just don't see any major breakthroughs happening that would make this fast enough to be practical.

Dylan16807 9 years ago | | |

Homomorphic encryption is currently hilariously slow as I understand it, and even if you solve that it can't branch on data. All paths have to be evaluated and summed.
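A rough sketch of what "can't branch" means, using plain Python integers as stand-ins for ciphertexts (no real homomorphic encryption library is involved, and the `select` helper is a hypothetical name). The evaluator only gets to add and multiply, so an `if` turns into arithmetic where both arms are computed and then combined using the condition bit:

```python
def select(cond_bit: int, if_true: int, if_false: int) -> int:
    """Pick one of two values using only + and *, with no data-dependent branch.

    cond_bit must be 0 or 1. Under real homomorphic encryption it would be
    an encrypted bit that the evaluator cannot inspect.
    """
    return cond_bit * if_true + (1 - cond_bit) * if_false


x = 7
is_big = 1  # stands in for an encrypted comparison result

# Both arms are always evaluated; the condition only weights the sum.
doubled = x * 2
shifted = x + 100
y = select(is_big, doubled, shifted)
```

The cost is that the work for every path is always paid, which compounds quickly for nested conditions or data-dependent loops.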

api 9 years ago | |

> "it would allow you to securely run a VM in the cloud that is protected against someone who controls the hypervisor"

> Attestation uses an on-chip certificate signed by an AMD master key during fabrication.

This is absolutely fantastic for security in the cloud, but it is important to note that this will not protect against nation state level actors.

Rest assured that the USG will obtain the AMD master signing key with or without AMD's permission. Other nation states may do likewise. The rest will have to wait for a leak, and if that key is leaked this feature will become almost nonexistent.

tcoppi 9 years ago | |

This is interesting. The most compelling use case IMO is protection against cold boot attacks rather than virtualization, at least until SEV has been proven empirically to do what they claim. Virtualization security is hard to get right in general and adding another layer of complexity probably won't help in the short term.

emn13 9 years ago | | |

Even if it turns out to be leaky, it could still be a big deal: I think it's fair to say that the greatest cloud risk isn't actively and persistently hostile providers - mostly because defending against that sounds like an almost hopeless task. A more realistic risk is that, via a VM breakout or other hack, hostile code manages to run on the hypervisor or to at least indirectly influence the hypervisor and other VMs. And that kind of attack may well be harder to pull off against even slightly leaky encryption. A hacked hypervisor may not be entirely under the control of the hacker; or breaking the encryption may cause side effects (such as instability) that cause watchers to take note; or it may simply be quite complex and require case-by-case exploits that are generally impractical.

Even less-than-perfect protection from the hypervisor may still have some value.

I'd be more worried about the performance overhead, personally - I can't imagine using this if the impact is significant, and it seems like it almost has to be.

derefr 9 years ago | |

Sounds like something that would make the claims made by Denuvo copy protection (https://en.wikipedia.org/wiki/Denuvo) actually plausible.

mtgx 9 years ago | |

Is SME like Intel's SGX?

anonymousDan 9 years ago | | |

Yup. I'm not too up to date on the specifics, but one advantage I remember reading from the paper over SGX is that you can run programs with raised privileges, whereas in SGX enclaves can only have user privileges.

anonymousDan 9 years ago | |

Do you know if there is hardware available with SEV capabilities yet? Or is there even a roadmap/timeline for its release somewhere? I would be really interested to get my hands on it.

arcanus 9 years ago |

I've recently been reading reports from some of my banking friends (and actually chatted with some folks I know at AMD) because I'm curious about AMD's turnaround. Even just last year AMD looked to be in very dire straits, and it is still operating at a loss.

However, they seem to have a strong technical pipeline and they have historically punched above their weight-class. Does it look like they are going to make it?

snuxoll 9 years ago |

Not having a unified L3 cache is an interesting choice. I can see how it would significantly reduce the cost of the chip, and considering that many multi-threaded workloads operate on separate chunks of data, chances are it won't incur a noticeable performance penalty (especially in virtualization workloads; I'm interested to see what their 32-core server chip ends up looking like).

gpderetta 9 years ago | |

On the other hand, a unified (inclusive) L3 cache helps with maintaining cache coherency, which needs to be explicitly handled in a non-unified design.

I guess a big benefit of the separate caches is that if only half the cores are in use, you can power half of the cache down, saving power and TDP.

sliverstorm 9 years ago | | |

A unified L3 is expensive in a number of ways. It is large, which means it is geographically remote, as well as slow (for caches, big == slow). This costs lots of access latency.

It also has a bandwidth problem. If 64 threads are vying for access, you either build it with few access ports and it gets choked, or you build it with many access ports which is costly in area, power, & speed.

Two separate peer caches automatically have twice the bandwidth of one similar double-size cache, for the price of NUMA & cache coherency challenges.

There is no one right answer here. Bandwidth is far more important and coherency much easier in a small L1; as you go down the hierarchy, bandwidth needs shrink and coherency is more expensive.
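A back-of-the-envelope version of the bandwidth argument, as a small Python sketch. The 64 threads come from the example above; the port count, the one-access-per-port-per-cycle rule, and the even split of threads between the two caches are assumptions picked purely for illustration, and cross-cache coherency traffic is ignored.

```python
import math


def cycles_to_drain(requests: int, ports: int) -> int:
    """Cycles needed to serve `requests` simultaneous accesses,
    assuming each port handles one access per cycle."""
    return math.ceil(requests / ports)


THREADS = 64  # from the example above
PORTS = 4     # hypothetical number of access ports per cache

# One unified cache has to serve every thread through its own ports.
unified = cycles_to_drain(THREADS, PORTS)

# Two peer caches each serve half the threads, working in parallel.
split = cycles_to_drain(THREADS // 2, PORTS)
```

With the same ports per cache, the split design drains the same burst in half the cycles. Matching that with a unified cache means doubling its ports, which is where the area, power and speed cost comes in.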

tcoppi 9 years ago | | |

Cache coherency might not necessarily be an issue if they treat each Cores->L3 pair as a NUMA node. I don't think they are doing that since we probably would have already heard about it if they are, but AMD has done crazier things before, and they are pretty good at NUMA architectures.

tcoppi 9 years ago | |

It could even help with low threadcount workloads since the L3 will presumably be able to be fewer cycles away from each core than a unified last level cache would be.

tcoppi 9 years ago |

The timeframe is slightly disappointing since I think a lot of people were expecting Q3/Q4 2016.

The architecture itself sounds pretty much like what everyone was expecting: a traditional fat and wide core. Their power management and foundry process will probably make the difference as to whether final performance is impressive or not, and may also be the cause of the delay.

BlackMonday 9 years ago | |

AMD has been saying for a while (since March or so) that they may have small shipments in December, but the bulk of shipments will really only start in Q1 2017.

Anyway, the first benchmark is promising, and I hope Zen can also keep up with Broadwell performance in other benchmarks/workloads, as well as in power efficiency.

maerek 9 years ago |

As someone who is fascinated by articles like this one, but doesn't have a background in CE/EE, any recommendations for literature/classes I could take so that I can better understand the topics being discussed?

dr_zoidberg 9 years ago | |

Read this link: http://www.lighterra.com/papers/modernmicroprocessors/

It's a good mix between high-level and highly-detailed.

paulmd 9 years ago | |

I very much recommend Agner Fog's Microarchitecture. It's a rather ponderous tome, but it is quite simply the definitive resource on the actual design and performance of real-world x86 CPUs.

It does have a brief introduction on some of the basic execution fundamentals but then it jumps right in, so you will probably need some external introduction if you are not generally familiar with the topic.

http://www.agner.org/optimize/microarchitecture.pdf

mark-r 9 years ago |

I've been a big booster of AMD for a long time, but recently the performance/power is so much in Intel's favor that I've been forced to use Intel for my last couple of PCs. I hope Zen makes them competitive again.

Zardoz84 9 years ago |

The most important thing for me: do Zen cores have the AMD equivalent of Intel AMT? (I don't remember the name.)

If they do, I would avoid them like the plague and get an FX-8370 or 8350 to replace my now aging FX-4100. The last thing I want on my computer is a hidden, uncontrollable CPU doing things that could affect my privacy.

milcron 9 years ago | |

Unfortunately, it seems impossible to acquire a modern x86/x64 chip without such hidden firmware. The last Intel CPUs without it are from 2008, and the last AMD CPUs without it are from 2013.

If you can tolerate using a different CPU architecture, Raptor Engineering's Talos Secure Workstation looks very intriguing. https://www.raptorengineering.com/TALOS/prerelease.php

rasz_pl 9 years ago | | |

>last Intel CPUs without it are from 2008

and those have cpu wide ring0 escalation bug https://www.blackhat.com/docs/us-15/materials/us-15-Domas-Th...

imtringued 9 years ago | | |

Why is the POWER8 architecture not more widely used? The performance is competitive with Intel's Xeon series, and for memory- or IO-bound workloads a POWER8 CPU is often superior.

SXX 9 years ago | |

What you're talking about is the Platform Security Processor (PSP), and according to the libreboot website it's built into all AMD CPUs released since 2013.

snuxoll 9 years ago | | |

Libreboot is inaccurate on this one. As far as I can tell this is only on the Puma chips right now, but it's likely it will make its way to other chips as they come out. Newer Bulldozer-derived (Family 15h) cores made today, such as the current FX lineup, do not have the PSP.

snuxoll 9 years ago | |

AMD has the Platform Security Processor in the Puma chips (Mullins and Beema) which serves the same purpose as the Intel ME (Management Engine). I would not be surprised if this made its way into Zen.

bitL 9 years ago |

I just wish AMD made drivers for Win 7 as well - then I could switch from 4-core 4790k/32GB to 8-core ZEN/64GB ECC and keep using all the Adobe video editing stuff.

clevernickname 9 years ago | |

Citation needed. There would have been a huge stink if AMD had stopped supporting Windows 7 already.

bitL 9 years ago | | |

http://techreport.com/news/29611/win10-will-be-the-only-wind...

akerro 9 years ago |

Should I wait for Zen or buy i7 now?

toast0 9 years ago | |

i7 doesn't tell you anything useful for comparison with Zen. Ask instead whether you should wait for Zen or buy Skylake. Or whether you should wait for Zen, Kaby Lake, or Cannonlake.

i7 just says you're going to get the top of the performance (and price) list for a desktop/mobile processor.

RussianCow 9 years ago | |

Nobody knows if Zen will actually live up to expectations yet, so I would wait until reviews and benchmarks come out, and base your decision on that. I made the mistake of ordering a Bulldozer CPU as soon as they came out.

mtgx 9 years ago | |

Wait. It should be a good chip performance wise, and great value for the money, too. Also, the more we put pressure on Intel, the better it is for everyone in the long term.

bitL 9 years ago |

Am I the only person the Zen logo makes cringe? They shouldn't copy the Intel Inside logo there. Really bad taste...

evanriley 9 years ago | |

It's not copying the Intel Inside logo. Although I'll agree they look _similar_, it's most likely based on the Ensō[0], which is part of _Zen_ Buddhism.

[0] https://en.wikipedia.org/wiki/Ens%C5%8D