AMD Unveils Ryzen 9000 CPUs for Desktop, Zen 5(anandtech.com)

334 points by rx_tx 2 years ago | 292 comments

buildbot 2 years ago |

AVX512 in a single cycle vs 2 cycles is big if the clock speed can be maintained anywhere near 5GHz. Also, the doubling of L1 cache bandwidth is interesting! Possibly needed to actually feed an AVX512-rich instruction stream, I guess.

adrian_b 2 years ago | |

For most instructions, both Intel and AMD CPUs with AVX-512 support are able to do two 512-bit instructions per clock cycle. There is no difference between Intel and AMD Zen 4 for most 512-bit AVX-512 instructions.

I expect that this will remain true for Zen 5 and the next Intel CPUs.

The only important differences in throughput between Intel and AMD were for the 512-bit load and store instructions from the L1 cache and for the 512-bit fused multiply-add instructions, where Intel had double throughput in its more expensive models of server CPUs.

I interpret AMD's announcement as saying that Zen 5 now has double the transfer throughput between the 512-bit registers and the L1 cache, and also a doubled 512-bit FP multiplier, so it now matches Intel's AVX-512 throughput per clock cycle for all important instructions.

Aardwolf 2 years ago | | |

> There is no difference between Intel and AMD Zen 4 for most 512-bit AVX-512 instructions.

Except for the fact that Intel hasn't had any AVX-512 for years already in consumer CPUs, so there's nothing to compare against really in this target market

tempnow987 2 years ago | | |

The difference is that Intel chips that support AVX-512 run $1,300 - $11,000 with MUCH higher total system costs, whereas AMD actually DOES support AVX-512 on all its chips and you can get AVX-512 for dirt cheap. The whole Intel instruction support story feels garbage here. Weren't they the ones to introduce this whole 512 thing in the first place?

DEADMINCE 2 years ago | | |

> The only important differences in throughput between Intel and AMD

Not exactly related, but AMD also has a much better track record when it comes to speculative execution attacks.

xattt 2 years ago | | |

I see the discussion of instruction fusion for AVX512 in Intel chips. Can someone explain the clock speed drop?

camel-cdr 2 years ago | |

AVX512 was never over 2 cycles. In Zen 4 it used the 256-bit-wide execution units of AVX2 (except for shuffle), but there is more than one 256-bit-wide execution unit, so you still got your one-cycle throughput.

dzaima 2 years ago | | |

More importantly for the "2 cycles" question, Zen 4 can get one cycle latency for double-pumped 512-bit ops (for the ops where that's reasonable, i.e. basic integer/bitwise arith).

Having all 512-bit pipes would still be a massive throughput improvement over Zen 4 (as long as pipe count is less than halved), if that is what Zen 5 actually does; things don't stop at 1 op/cycle. Though a rather important question with that would be where that leaves AVX2 code.
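
The "less than halved" condition can be sketched with a toy throughput model. The pipe counts here are illustrative assumptions, not confirmed Zen 4 or Zen 5 figures, and `ops512_per_cycle` is a hypothetical helper:

```python
# Toy model: peak 512-bit ops per cycle for a set of identical SIMD pipes.
# Pipe counts are illustrative assumptions, not confirmed Zen 4 / Zen 5 numbers.

def ops512_per_cycle(pipes: int, pipe_width_bits: int) -> float:
    """Peak 512-bit ops per cycle, assuming every pipe is busy and a 512-bit
    op on a narrower pipe is split into 512 / width pieces."""
    pieces_per_op = max(1, 512 // pipe_width_bits)
    return pipes / pieces_per_op

double_pumped = ops512_per_cycle(pipes=4, pipe_width_bits=256)  # 256-bit pipes, ops split in two
full_width = ops512_per_cycle(pipes=4, pipe_width_bits=512)     # same pipe count, full width
halved_pipes = ops512_per_cycle(pipes=2, pipe_width_bits=512)   # half the pipes, full width

print(double_pumped, full_width, halved_pipes)
```

Under these assumptions, halving the pipe count while doubling the width only breaks even on 512-bit throughput; anything more than half the pipes is a net gain.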

SomeoneFromCA 2 years ago | |

I wish it had AVX512 Fp16.

api 2 years ago | |

At what point do these become competitive with GPUs for AI cost wise if GPUs retain their nutty price premium?

bloaf 2 years ago | | |

I’ve been running some LLMs on my 5600x and 5700g cpus, and the performance is… ok but not great. Token generation is about “reading out loud” pace for the 7&13 B models. I also encounter occasional system crashes that I haven’t diagnosed yet, possibly due to high RAM utilization, but also possibly just power/thermal management issues.

A 50% speed boost would probably make the CPU option a lot more viable for home chatbot, just due to how easy it is to make a system with 128gb RAM vs 128gb VRAM.

I personally am going to experiment with the 48gb modules in the not too distant future.

hajile 2 years ago | |

Does Zen5 do FP math in a single cycle?

dzaima 2 years ago | | |

Almost certainly Zen 5 won't have single-cycle FP latency (I haven't heard of anything doing such even for scalar at modern clock rates (though maybe that does exist somewhere); AMD, Intel, and Apple all currently have 3- or 4-cycle latency). And Zen 4 already has a throughput of 2 FP ops/cycle for up to 256-bit arguments.

The thing discussed is that Zen 4 does 512-bit SIMD ops via splitting them into two 256-bit ones, whereas Zen 5 supposedly will have hardware doing all 512 bits at a time.

dhx 2 years ago |

How are the 24x PCIe 5.0 lanes (~90GB/s) of the 9950X allocated?

The article makes it appear as:

* 16x PCIe 5.0 lanes for "graphics use" connected directly to the 9950X (~63GB/s).

* 1x PCIe 5.0 lane for an M.2 port connected directly to the 9950X (~4GB/s). Motherboard manufacturers seemingly could repurpose "graphics use" PCIe 5.0 lanes for additional M.2 ports.

* 7x PCIe 5.0 lanes connected to the X870E chipset (~28GB/s). Used as follows:

  * 4x USB 4.0 ports connected to the X870E chipset (~8GB/s).

  * 4x PCIe 4.0 ports connected to the X870E chipset (~8GB/s).

  * 4x PCIe 3.0 ports connected to the X870E chipset (~4GB/s).

  * 8x SATA 3.0 ports connected to the X870E chipset (some >~2.4GB/s part of ~8GB/s shared with WiFi 7).

  * WiFi 7 connected to the X870E chipset (some >~1GB/s part of ~8GB/s shared with 8x SATA 3.0 ports).
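
The bandwidth figures in the list above can be roughly reproduced from the per-lane signalling rate and the 128b/130b line encoding. This is a minimal sketch that ignores packet and protocol overhead; `pcie_gb_per_s` is a hypothetical helper name:

```python
# Approximate one-direction PCIe bandwidth from signalling rate and line encoding.
GT_PER_S = {3: 8.0, 4: 16.0, 5: 32.0}  # giga-transfers per second per lane
ENCODING = 128 / 130                    # 128b/130b encoding (PCIe 3.0 and later)

def pcie_gb_per_s(gen: int, lanes: int) -> float:
    """Approximate bandwidth in GB/s, ignoring packet/protocol overhead."""
    return GT_PER_S[gen] * ENCODING / 8 * lanes

for gen, lanes in [(5, 24), (5, 16), (5, 7), (5, 1), (4, 4), (3, 4)]:
    print(f"PCIe {gen}.0 x{lanes}: ~{pcie_gb_per_s(gen, lanes):.1f} GB/s")
```

With this model the x16, x7 and x1 PCIe 5.0 links come out near the ~63, ~28 and ~4 GB/s quoted above, and all 24 lanes together land at roughly 94.5 GB/s.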

irusensei 2 years ago |

I'll probably wait one or two years before getting into anything with DDR5. I blew some money on an AMD laptop in 2021. At the time it was a monster with decent expansion options: RX 6800M, Ryzen 9 5900HX. I've stuck it with the maximum 64GB DDR4 and 2x 4TB PCIe 3.0 NVMe. Runs Linux very well.

But now I'm seeing lots of things I'm locked out of. Faster ethernet standards, the fun that comes with tons of GPU memory (no USB4, can't add 10GbE either), faster and larger memory options, AV1 encoding. It's just sad that I bought a laptop right before those things were released.

Should have gone with a proper PC. Not making this mistake again.

isoos 2 years ago | |

It sounds like you need a desktop workstation with replaceable extension cards, and not a mostly immutable laptop, which has different strengths.

irusensei 2 years ago | | |

Agreed but it will need to wait for now.

worthless-trash 2 years ago | |

You will find that this is the cost of any laptop, any time you buy it there is always new tech around the corner and there isn't much you can do about it.

simcop2387 2 years ago | | |

(disclosure I own a 13in one)

Yea closest I see to being better about it is Frame.work laptops, and even then it's not as good a story as desktops, just the best story for upgrading a laptop right now. Other than that buying one and making sure you have at least two thunderbolt (or compatible) ports on separate busses is probably the best you can do since that'd mean two 40Gb/s links for expansion even if it's not portable, but would let you get things like 10GbE adapters or fast external storage and such without compromising too much on capability.

kmfrk 2 years ago | |

Waiting for CAMM2 to get wider adoption could be interesting:

https://x.com/msigaming/status/1793628162334621754

Hopefully won't be too long now.

Delmololo 2 years ago | |

Either it was a shitty investment from the beginning, or you actually use it very regularly and it would be worth it anyway to slowly start thinking about something new.

gautamcgoel 2 years ago |

Surprisingly not that much to be excited about IMO. AMD isn't using TSMC's latest node and the CPUs only officially support DDR5 speeds up to 5600MHz (yes, I know that you can use faster RAM). The CPUs are also using the previous-gen graphics architecture, RDNA2.

mmaniac 2 years ago |

AMD seem to be playing it safe with this desktop generation. Same node, similar frequencies, same core counts, same IOD, X3D chips only arriving later... IPC seems like the only noteworthy improvement here. 15% overall is good but nothing earth shattering.

The mobile APUs are way more interesting.

TacticalCoder 2 years ago | |

> 15% overall is good but nothing earth shattering

Interestingly though, the 9700X seems to be rated at 65W TDP (compared to a 105W TDP for the 7700X). I run my 7700X in "eco mode" where I lowered the TDP to max 95 W (IIRC, maybe it was 85 W: I should check in the BIOS).

So it looks like it's 15% more performance overall with less power consumption.

ArtTimeInvestor 2 years ago |

Do all CPUs and GPUs these days involve components made by TSMC?

If so, is this unique - that a whole industry relies on one company?

microtonal 2 years ago | |

Arguably that single company is ASML. There are more fabs (e.g. Intel), but AFAIK cutting-edge nodes all use ASML EUV chip fabrication machines?

Cu3PO42 2 years ago | |

Intel still fabs their own CPUs, their dedicated Xe graphics are made by TSMC, though.

Nvidia 30-series was fabbed by Samsung.

So there is some competition in the high-end space, but not much. All of these companies rely on buying lithography machines from ASML, though.

Wytwwww 2 years ago | | |

>Intel still fabs their own CPUs

Isn't Lunar Lake made by TSMC? Supposedly they have comparable efficiency to AMD/Apple/Qualcomm at the cost of making their fab business even less profitable

unwind 2 years ago | |

As far as I know, Intel is still very much a fab company.

pjc50 2 years ago | |

This is probably a lot more common than you might think. How much of an "entire industry", or indeed industry as a whole, relies on Microsoft?

icf80 2 years ago | |

wait till china invades

brokencode 2 years ago | | |

Good thing TSMC is getting billions of dollars of government subsidies to build fabs all around the world including in the US and Japan.

preisschild 2 years ago | | |

As long as Biden wins the election, they won't.

Because the US will defend Taiwan.

gattr 2 years ago |

I normally rebuild my workstation every ~4 years (recently more for fun than out of actual need for more processing power), might finally do it again (preferably a recent 8-/12-core Ryzen). My most recent major upgrade was in 2017 (Core i5 3570K -> Ryzen 7 1700X), with a minor fix in 2019 (Ryzen 7 2700, since 1700X was suffering from the random-segfaults-during-parallel-builds issue).

Night_Thastus 2 years ago | |

Same. I'm on a 10700k and thinking on an upgrade. I'll wait for X3D parts to come out (assuming they're doing that this gen, not sure if we've got confirmation), and compare vs 15th gen Intel once it's out in like September-ish.

tracker1 2 years ago | |

Might be worth considering a Ryzen 9 5900XT (just launched as well) for a drop in upgrade. Been running a 5950X since close to launch and still pretty happy with it.

mananaysiempre 2 years ago | | |

Would it really be smart to build an AM4 desktop at this point though?

ComputerGuru 2 years ago |

Interesting unveil; it seems everyone was expecting more significant architectural changes though they still managed to net a decent IPC improvement.

In light of the "very good but not incredible" generation-over-generation improvement, I guess we can now play the "can you get more performance for less dollars buying used last-gen HEDT or Epyc hardware or with the newest Zen 5 releases?" game (NB: not "value for your dollar" but "actually better performance").

tripdout 2 years ago |

7900X equivalent 9900X has 120W TDP as opposed to 170W. Is it that much more power efficient?

Voultapher 2 years ago | |

That 170W TDP was chasing the benchmark crown. The 7950X lost <5% MT perf when run at 120W.

canucker2016 2 years ago | | |

The all-core max clock speed costs a lot of watts for that last 10% - close to 50% of total CPU consumption in some cases.

That's why undervolting has become a thing to do (unless you're an Intel CPU marketer) - give up a few percent of your all-core max clock rate and cut your wattage used by a lot.
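
A toy illustration of why the last bit of clock is so expensive: dynamic power scales roughly with V^2 * f, and the top of the frequency curve needs extra voltage. The voltage/frequency points below are made-up values for illustration, not measurements of any real CPU:

```python
# Toy model of dynamic CPU power: P is roughly proportional to V^2 * f.
# The operating points are hypothetical, chosen only to show the shape of the trade-off.

def relative_power(voltage: float, freq_ghz: float,
                   base_voltage: float, base_freq_ghz: float) -> float:
    """Power at (voltage, freq) relative to the base operating point."""
    return (voltage / base_voltage) ** 2 * (freq_ghz / base_freq_ghz)

base = (1.00, 4.5)  # hypothetical efficient point: 1.00 V at 4.5 GHz
peak = (1.20, 5.0)  # hypothetical all-core peak: 1.20 V at 5.0 GHz

ratio = relative_power(peak[0], peak[1], base[0], base[1])
clock_gain = peak[1] / base[1] - 1
print(f"clock: +{clock_gain * 100:.0f}%, power: +{(ratio - 1) * 100:.0f}%")
```

With these assumed numbers, about 11% more clock costs about 60% more power, which is the same shape as the undervolting trade-off described above.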

GordonS 2 years ago | | |

I wonder then if the 9900X could be undervolted without losing more than a few percentage points of speed?

sylware 2 years ago |

Improving avx512 is really good, since this is the data unit sweet spot: a cache line is 512 bits.

But I am more interested in the cleanup of the GPU hardware interface (it should be astonishingly simple to program the GPU with its various ring buffers, as is rumored to be the case on the nvidia side) AND in the squashing of all hardware shader bugs: look at the errata in Valve's ACO compiler in Mesa; AMD's hardware shader support is a bug minefield. Hopefully GFX12 did fix ALL KNOWN SHADER HARDWARE BUGS (sorry, ACO is written in that horrible C++, I dunno what went through Valve's head, and no, Rust syntax is as complex as C++, so it is toxic too).

Sparkyte 2 years ago |

Huge improvement over my 5k series processor.

Pet_Ant 2 years ago |

Does anyone have metrics for some sort of benchmark per transistor? I wonder where the sweet spot is between something like these and SERV [1]. Obviously GPU wins on parallel math, but I wonder how much we are chasing in terms of diminishing returns.

[1] https://www.youtube.com/watch?v=GSHasXHvZaQ

adriancr 2 years ago |

Finally, was waiting for this for a new build and it seems like a decent upgrade.

nubinetwork 2 years ago |

I'd love to replace my 2950x, but most desktop motherboards still don't support 128gb of memory.

mkl 2 years ago | |

No, a lot support 128GB. Some support 192GB or even 256GB: https://skinflint.co.uk/?cat=mbam5&xf=317_X670E&asuch=&bpmin...

lldb 2 years ago | |

You can run 4x32 DDR5 on current gen AMD consumer platforms but don't expect speeds above 4400MHz in quad-channel regardless of what the module is rated for. I'd instead suggest dual channel 48GB DIMMs for 96GB at full speed.

IamFr0ssT 2 years ago | | |

4 DIMMs does not equal Quad-channel. I see in AMD's presentation Quad-channel is supported by chipsets, but I am not aware of a current AMD consumer/hedt chip with 4 memory channels.

erinnh 2 years ago | |

I just had a look and the comparison website I use says there are 113 motherboards that support 128GB or more memory for AM5. (in my local market. so the US probably has even more)

128GB isn't exactly a lot, so it would surprise me if it wasn't supported.

nubinetwork 2 years ago | | |

I think for a while there, the only way to be able to use 128GB was to go TR4 or TRX... I kind of stopped looking for a while, but 100+ boards is certainly a nice change.

DEADMINCE 2 years ago | | |

> I just had a look and the comparison website I use

Which website is that?

Tepix 2 years ago |

It's weird that CPUs tend to use more power every two years now instead of less. Looks like this time, the TDP didn't increase. Yay! I guess. Don't most people think their PCs are fast enough already?

It's as if our planet wasn't being destroyed at a frightening speed. We're headed towards a cliff, but instead of braking, we're accelerating.

mmaniac 2 years ago | |

The people who think PCs are already fast enough don't buy CPUs every year.

A 7950X in Eco mode is ridiculously capable for the power it pulls but that's less of a selling point.

hajile 2 years ago | | |

I think there's a market for a 7950E with a specific binning for lower power rather than highest possible clocks.

adham-omran 2 years ago | | |

I have a 7950X which I run in Eco mode with air cooling and it's amazingly quiet when not under load and rips when it needs to, I don't see a reason to upgrade any time soon.

dotnet00 2 years ago | |

The TDP increases slower than the efficiency, so for the same workload, a newer CPU consumes similar or less power. Especially nowadays, most of these chips only get anywhere near the stated TDP when being near fully maxed out.

jiggawatts 2 years ago |

I wonder when the PC world will transition to unified memory and more monolithic architectures as seen in the Apple M-series chips, but also in the NVIDIA GB200 "superchip", the AMD MI300 accelerator, the X-Box and Playstation consoles, and even in some mobile phone system-on-a-chip designs. It feels like the PC is the "last holdout" of discrete upgradeable components with relatively low bandwidth between them.

Sooner or later, AI will need to run on the edge, and that'll require RAM bandwidths measured in multiple terabytes per second, as well as "tensor" compute integrated closely with CPUs.

Sure, a lot of people see LLMs as "useless toys" or "overhyped" now, but people said that about the Internet too. What it took to make everything revolve around the Internet instead of it being just a fad is broadband. When everyone had fast always-on Internet at home and in their mobile devices, then nobody could argue that the Internet wasn't useful. Build it, and the products will come!

If every gaming PC had the same spec as a GB200 or MI300, then games could do real-time voice interaction with "intelligent" NPCs with low latency. You could talk to characters, and they could talk back. Not just talk, but argue, haggle, and debate!

"No, no, no, the dragon is too powerful! ... I don't care if your sword is a unique artefact, your arm is weak!"

I feel like this is the same kind of step-change as floppy drives to hard drives, or dialup to fibre. It'll take time. People will argue that "you don't need it" or "it's for enterprise use, not for consumers", but I have faster Internet going to my apartment than my entire continent had 30 years ago.

alberth 2 years ago |

> TSMC N4

Question: Am I understanding this correctly that AMD will be using a node from TSMC that’s 2 years old, but in a way it’s kind of older?

Because N4 was like a “N5+” (and the current gen is “N3+”).

EDIT: why the downvotes for a question?

Night_Thastus 2 years ago | |

Yes, AMD is not using the leading-edge node. The older node is cheaper/has better yield and more importantly has much larger supply than N3, which Apple has likely completely devoured.

I am personally very curious how it compares vs Intel's 15th gen, which is rumored to be on the Intel 20A process.

aurareturn 2 years ago |

15% IPC boost is mildly disappointing.

It will be significantly slower in ST than M4, and even more so against the M4 Pro/Max.

KeplerBoy 2 years ago | |

I wouldn't be sure about that. Isn't apple silicon performance mostly benchmarked using Geekbench?

AMD claims +35% IPC improvements in that specific benchmark, due to improvement in the AVX512 pipeline.

aurareturn 2 years ago | | |

AMD claims +35% IPC improvement in AES subtest of Geekbench 6. It's not the entire Geekbench 6 CPU suite. It's deceptive.

Overall GB6 improvement is likely around 10-15% only because that's how much IPC improved while clock speed remains the same.
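
The reasoning here is just performance = IPC x clock. A minimal sketch, where the 5% clock bump in the second call is a purely hypothetical input for comparison:

```python
# Single-thread performance modelled as IPC x clock frequency.

def speedup(ipc_gain: float, clock_gain: float) -> float:
    """Relative performance gain from fractional IPC and clock gains."""
    return (1 + ipc_gain) * (1 + clock_gain) - 1

same_clock = speedup(0.15, 0.0)     # 15% IPC uplift, unchanged clock
with_clock = speedup(0.15, 0.05)    # hypothetical: 15% IPC plus a 5% clock bump

print(f"{same_clock:.1%} vs {with_clock:.1%}")
```

With the clock unchanged, the overall gain equals the IPC gain, which is why a 10-15% IPC uplift caps the expected GB6 improvement at about the same figure.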

hajile 2 years ago | | |

Apple themselves have around 10% IPC uplift with M4 too.

The real issue is that most code people run doesn't use very much SIMD and even less uses AVX-512.

muxator 2 years ago | |

How is a 15% IPC increase generation over generation a disappointing result? There might be greener pastures, I agree, but a 15% increase year over year in the quality factor of a product is nothing to be disappointed about. It's good execution, even more so in a mature and competitive sector such as microelectronics.

aurareturn 2 years ago | | |

It's not year over year. It's 2 years.

It's disappointing because M4 is significantly ahead. I would expect Zen to make a bigger leap to catch up.

Also, this small leap opens up for Intel's Arrow Lake to take the lead.

papichulo2023 2 years ago |

Meh, I guess still 128 bits memory channel width. Guess they are not even trying anymore.

icf80 2 years ago | |

2 x 2 x 32bit

anything else will require newer SOCKET, MB AND RAM
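
For reference, the bus width and the resulting peak bandwidth work out as follows. This is a sketch assuming the officially supported DDR5-5600 speed mentioned elsewhere in the thread; `peak_bandwidth_gb_s` is a hypothetical helper:

```python
# Peak theoretical DRAM bandwidth from bus width and transfer rate.

def peak_bandwidth_gb_s(channels: int, subchannels: int,
                        bits_per_subchannel: int, mt_per_s: int) -> float:
    """Peak bandwidth in GB/s: bytes per transfer times mega-transfers per second."""
    bus_bits = channels * subchannels * bits_per_subchannel
    return bus_bits / 8 * mt_per_s / 1000

bus_bits = 2 * 2 * 32                               # 2 channels x 2 subchannels x 32 bit = 128 bit
ddr5_5600 = peak_bandwidth_gb_s(2, 2, 32, 5600)     # 16 bytes per transfer at 5600 MT/s

print(bus_bits, ddr5_5600)
```

That is 128 bits and a theoretical peak of about 89.6 GB/s at DDR5-5600; real-world throughput is lower.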

lmz 2 years ago | |

Is this even possible to change without moving to a new socket?

KingOfCoders 2 years ago | | |

No.

AMD Ryzen 9 7950X (16 core): 560.8
Apple M2 Ultra (24 cores): 501.82
Apple M3 Max (12 cores): 408.27
Apple M3 Pro: 226.46
Apple M3: 160.58