Intel’s Plans for 3DXP DIMMs Emerge

Intel’s Plans for 3DXP DIMMs Emerge (realworldtech.com)

126 points by dzaragozar 7 years ago | 73 comments

jzelinskie 7 years ago |

It seems obvious in retrospect, but persistent memory adds a pretty exciting new advantage for persistent data structures.

Another thought: As potentially paradigm changing technology like this becomes available will it ever make sense to redesign the OS?

caf 7 years ago | |

I used to think so - once you have persistent primary memory, you can install operating systems and applications directly into primary memory, so there's no longer any point in having a distinction between "install" and "run/boot".

However, iOS and Android have shown that it's possible to do away with this distinction even with a traditional OS running underneath. So I now tend to think that instead what will happen is more continual evolutionary changes at the OS level to work better in a "boot once" environment, rather than a revolution.

amluto 7 years ago | | |

> so there's no longer any point in having a distinction between "install" and "run/boot".

When Linux boots, the in-memory state changes quite a bit. Even the actual code gets modified during boot. The whole process takes well under a second. Linux does support “execute in place” (XIP), but it’s barely a win, and I don’t think it works on x86.

A more interesting idea is to put your OS installation on a DAX (direct access) filesystem.
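To make the DAX idea concrete, here is a minimal sketch of the programming model: map a file and update it with ordinary loads and stores. The function name `bump_counter` and the 8-byte counter layout are made up for illustration. On a regular filesystem this goes through the page cache; the assumption is that on a filesystem mounted with DAX enabled, the same mapping would point at the persistent memory itself.

```python
import mmap
import os
import struct


def bump_counter(path: str) -> int:
    """Increment an 8-byte little-endian counter kept in a memory-mapped file."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        if os.fstat(fd).st_size < 8:
            os.ftruncate(fd, 8)  # a new file starts as a zero-filled counter
        with mmap.mmap(fd, 8) as m:
            # Plain load and store on the mapping, no read()/write() calls.
            (value,) = struct.unpack_from("<Q", m, 0)
            struct.pack_into("<Q", m, 0, value + 1)
            m.flush()  # ask the OS to make the update durable
            return value + 1
    finally:
        os.close(fd)
```

Calling `bump_counter` twice on the same (hypothetical) path returns 1 and then 2, and the value survives process restarts because it lives in the file rather than on the heap.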

glangdale 7 years ago | |

Seriously, yes. If there was ever a time to rethink OS design, surely this is it.

That being said, operating systems like Linux tend to capture most of the value from these kinds of advances - often by dint of being able to simply 'get out of the way' if a sufficiently important user space process wants access to the device.

But one would suspect that things have changed sufficiently from the 1970s to warrant a ground-up rethink. Core counts, distributed systems (the Plan 9 folks already took a swing at this in the 90s), nearly ubiquitous graphics/GPGPU accelerators, persistent memory, nearly ubiquitous access to 64-bit address spaces (at least for desktop and most phones) - you'd think something would change about design. I don't work in the area so I don't know what that is...

dragontamer 7 years ago | | |

> Seriously, yes. If there was ever a time to rethink OS design, surely this is it.

Why?

Traditional servers are persistent: they never turn off. 500+ days of uptime is typical. And today, with VMs which at worst... hibernate... it seems like "never turning off" might be the norm.

mozumder 7 years ago | |

Probably the optimal use of this technology is databases, since they rely most on random access to large persistent data.

Any operating system that's designed around this technology is probably going to look like a database.

Basically, boot to Postgres and all "files" are now SQL tables, stored in NVDIMM. Indexes are in DRAM, and critical nodes are in cache.

All data (system and user) is organized and opinionated: All photos are in a photo database, with tables for IPTC metadata. All music. All executable files. If you're browsing the web, it'll probably cache data in local SQL tables. etc..

I can envision using SQL stored procedures as actual apps, perhaps with an API to access graphics hardware, network, sound, etc..

infogulch 7 years ago | | |

The entire information world is either a database or a cache (or communication between them), layered on top of each other over and over. Every new storage technology typically ends up being yet another layer as either database or cache (or both). This case is pretty unique in that it can actually serve to remove a layer: RAM (typically a cache) is not necessary if non-volatile storage is viable at the same speeds. But in general new storage tech just adds another layer, which the software world reacts to by rushing in as if to fill a void, creating new software to take advantage of it, which ends up being... another database or cache, often with similar tradeoffs to the layers of cache/database surrounding it. In the limit, I see more and more layers of cache/database until they merge into some kind of continuous data/cache field with a continuous tradeoff gradient between size and latency.

freeone3000 7 years ago | | |

Hey, that sounds like a mainframe... I wonder what it'll take to get zOS running on commodity.

WhitneyLand 7 years ago | |

Persistent data structures? Well yes, but that’s just the tip of the iceberg. There are other scenarios, like powerful real-time analytics, that could benefit from 100TB of RAM immediately.

100TB systems at RAM speeds are theoretically possible without this new memory. For example, 64-bit systems could easily provide enough address space.

The problem is that, practically speaking, server systems quite often limit the number of usable address bits. And other problems still remain, not the least of which is the crazy price for 100TB of DDR4, physical slots, etc. The price would be crazy even for most enterprise projects.

So yes, this new generation of memory will be disruptive, but also keep in mind that even though it’s faster than SSDs, that’s not nearly enough. I’m not positive, but IIRC it’s still 2 or 3 orders of magnitude slower than conventional memory.

Does that mean this new wave of persistent RAM is not useful and awesome? Not at all, I’ve already started using it.

But it does mean it’s still at the stage where you have to analyze your scenarios carefully, see if it’s a good fit for your architecture and environment, and benchmark your particular stack to verify assumptions and make sure it helps you the best way it can.

tgtweak 7 years ago | | |

There will, almost inevitably, be someone who needs 101TB of memory. Then you get back to the same place where you need to scale out instead of up. If you asked cloud architects for cheaper, lower latency network or faster, more expensive storage, you'd probably get the former most of the time.

Spark already works nicely with 100+TB datasets, and those can sit in memory across a thousand spot instances. Technology like TidalScale's hyperkernel can also merge together multiple systems into a single addressable memory space at the OS level so that you can run non-distributed applications across multiple commodity machines (like a reverse VM).

If 3D XPoint can offer price and speed competitive with traditional DRAM, then it will have a place in the market. Nobody has seen pricing or benchmarks for these yet. For Intel however, this could increase their component share from CPU/chipset/network/storage to also include memory. That is pretty compelling since it's a market they haven't monetized (not counting memory controllers) since the early days of Intel.

platform 7 years ago | |

I would also think that OSes that tend to treat their primary file systems as 'the distributed memory', like DragonFly BSD, would benefit significantly.

I am speculating, of course, but the whole HAMMER2 design of DragonFly BSD emphasizes a cross-machine 'database-like' file system, with built-in transparent state snapshots, state branches, etc. [1].

So with this new type of persistent storage, DragonFly's HAMMER2 could erase the difference between 'persistent state' and in-memory-only state.

That would eliminate the need for reconciliations, application-specific backups, and application-specific distributed architectures.

[1] http://apollo.backplane.com/DFlyMisc/hammer2.txt

spamizbad 7 years ago | |

> Another thought: As potentially paradigm changing technology like this becomes available will it ever make sense to redesign the OS?

Realistically, its implications are much bigger for applications that depend heavily on persistent storage, like databases. They make tons of assumptions about persisting to block storage, whereas 3DXP could enable them to function entirely "in memory", so all the block-storage-specific optimization they have is now working against them. I'm just generalizing here, though.

gravypod 7 years ago | |

Zero serialization. Imagine installing a program and always having it "running". It may be swapped out but literally everything in it is ready to go when you switch to its window.

robotresearcher 7 years ago | | |

We have that right now, do we not? You don't have to quit apps except on reboot.

Except that many apps are so buggy you have to restart them often in practice. NVM won't change that, sadly.

TeMPOraL 7 years ago | | |

I'd still want a way to kill it.

This is a huge PITA with mobile devices - I have no clue what code is, or isn't, being executed at any given time. Even if I force-kill an app, it has still most likely left some background service running, that will still use data, trigger GPS updates, wake the phone up, etc. What I wanted since the very day I first got my smartphone is to have PC-like control over applications.

In a perfect world of total ubiquity of wireless electricity, not to mention infinite CPU speeds and free and unlimited bandwidth, having everything running all the time in some way might be ok. As it is today, we still need the ability to kill software (and have it stay down), up to and including rebooting everything, to deal with obscure bugs in applications, OS and drivers. Not to mention being able to have some semblance of understanding of the device's state.

vbezhenar 7 years ago | | |

Leaking memory would be dangerous.

fh973 7 years ago |

While there are some analytics workloads that will benefit tremendously, the main use case will be improving server utilization.

Currently RAM is not a compressible resource like CPU. However, many applications don't have a fixed or easily predictable RAM footprint, so you have to overprovision. Swap has been there to solve that, but with its performance impact it often can't be used for server applications.

These DIMMs will blur the boundary between memory and swap and make swap again viable.

tpetry 7 years ago | |

I don't get your logic. CPUs can execute a finite number of instructions in a given timeframe; that's not compressible. Memory, on the other hand, can be compressed, which works great for storage. Sure, it'll be slower, but compressing seldom-used memory pages like macOS does is indeed possible.

oblio 7 years ago | | |

I think his point is that you can "run" a large number of applications at the same time on a CPU. It will execute everything, albeit slowly. This might not be acceptable for performance reasons, but it's doable.

He's not talking about actual data compression in RAM. Because even with compression, with current OSes, if you try to fit more than 20GB of data, let's say becoming 10GB compressed, into 5GB of RAM, it's not possible. You have to swap and at that point your performance is completely gone.

The performance gap between an overloaded CPU and swapping is humongous. One is annoying or slightly troublesome, the second is a death knell.

dmichulke 7 years ago | |

> and make swap again viable

You shouldn't work in political marketing :-)

grogers 7 years ago |

One interesting thing for databases is that as nonvolatile storage latency decreases, traditional btrees get more attractive relative to newer log-structured designs. Especially if the write endurance is increased as well over current SSDs.

wtallis 7 years ago | |

Or to put it another way: there's not a lot of reason to have two layers of log-structured storage. Your SSD already needs its own log-structured flash translation layer, and if that's tuned properly for your database workload, then another layer of the same kind of thing may not help much.

gravypod 7 years ago |

There's so much amazing stuff I could do with this. Imagine persistent redis? Huge huge pages? Booting from a DIMM?

The possibilities are endless.

oh_sigh 7 years ago |

Why can't we just put a battery onto DRAM that maintains state if the power goes out, and be done with it?

steve19 7 years ago | |

That is how storage worked on PDAs back in the day with volatile memory. Let the batteries completely die or change them incorrectly and you lost your data. Let's not go back to those days!

wmf 7 years ago | |

DRAM is not that dense; you can fit 128 GB of DRAM or 512 GB of XPoint on a DIMM. XPoint is also supposed to be cheaper than DRAM.

hetman 7 years ago | |

Because the power consumption to keep DRAM refreshed is fairly high so you'd need a pretty big battery, and because it would still be more expensive than 3DXP. It's just not practical for most use cases.

amelius 7 years ago | | |

I figured that, but can we have some numbers here?

ggm 7 years ago | |

We do on RAID cards. It has limits.

-G

jcoffland 7 years ago | |

Battery-backed DRAM has been around for a decade or more.

IshKebab 7 years ago | |

You mean like suspend-to-RAM?

QuadrupleA 7 years ago |

I guess in a theoretical NVM-only system you could pull the plug at any time, and instantly resume it when the power is back on? If I'm reading right though the latency of 3DXP is somewhere in the 10-20us ballpark, still 100-1000x slower than DRAM.

sp332 7 years ago | |

Yes but it's also cheaper than DRAM.

You could resume after pulling the plug as long as things are consistent. If you commit data in the wrong order you could have trouble!
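A rough sketch of what "the right order" can look like: write the payload first, make it durable, and only then set a commit flag. The record layout (1-byte flag, 4-byte length, payload) and the names `write_record` / `read_record` are assumptions for illustration, and `mmap.flush()` on an ordinary file stands in for whatever cache-flush and fence sequence real persistent memory would need.

```python
import mmap
import os
import struct
from typing import Optional

HEADER = struct.Struct("<BI")  # commit flag (1 byte), payload length (4 bytes)
SIZE = 4096                    # size of the mapped region


def write_record(path: str, payload: bytes) -> None:
    """Store payload so that a crash never exposes a half-written record."""
    if len(payload) > SIZE - HEADER.size:
        raise ValueError("payload too large")
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.ftruncate(fd, SIZE)
        with mmap.mmap(fd, SIZE) as m:
            # Step 1: clear the flag, so an interrupted write reads as "no record".
            HEADER.pack_into(m, 0, 0, 0)
            m.flush()
            # Step 2: write payload and length while the flag is still 0.
            m[HEADER.size:HEADER.size + len(payload)] = payload
            HEADER.pack_into(m, 0, 0, len(payload))
            m.flush()
            # Step 3: set the commit flag last, after the data is durable.
            HEADER.pack_into(m, 0, 1, len(payload))
            m.flush()
    finally:
        os.close(fd)


def read_record(path: str) -> Optional[bytes]:
    """Return the payload, or None if no committed record is present."""
    with open(path, "rb") as f:
        data = f.read(SIZE)
    if len(data) < HEADER.size:
        return None
    flag, length = HEADER.unpack_from(data, 0)
    if flag != 1:
        return None
    return data[HEADER.size:HEADER.size + length]
```

If steps 2 and 3 were swapped, a power cut between them could leave the flag set over garbage, which is the kind of trouble the ordering is meant to avoid.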

randyrand 7 years ago | |

The CPU has internal state as well that won’t be persisted - at the very least the registers.

You would need to save them when you notice power loss.

rasz 7 years ago | |

No, unless everything else in your system retains state; that includes all the registers in every single chipset/controller/processor.

ksec 7 years ago |

Price. Remember that current DRAM is 2.5x the price of what it was two to three years ago. So XP DIMMs being 4x cheaper than DRAM now isn't that much of a difference if DRAM dropped back to its median level.
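The arithmetic behind that, as a tiny sketch. It assumes the XPoint price stays fixed while DRAM falls back to its earlier level; the function name is made up for illustration.

```python
def advantage_after_dram_drop(advantage_now: float, dram_inflation: float) -> float:
    """How many times cheaper XPoint stays if DRAM returns to its old price.

    advantage_now:  how many times cheaper XPoint is than DRAM today (4x above)
    dram_inflation: how much DRAM has risen over its old price (2.5x above)
    """
    return advantage_now / dram_inflation


print(advantage_after_dram_drop(4.0, 2.5))  # prints 1.6
```

So under that assumption the 4x gap would shrink to roughly 1.6x.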

kristianp 7 years ago |

Is there any tech that's faster than DRAM and cheaper than SRAM? There's a need to fill that gap.

AtlasBarfed 7 years ago | |

L2/L3 cache?

kristianp 7 years ago | | |

That's SRAM usually.

Ocha 7 years ago |

Website times out. Also, this story hasn't been reported by any other news source. Is there some other site I can see the story at?

tarlinian 7 years ago | |

It's not a news story... mostly analysis/predictions. (If you're talking about generic announcements of XPoint DIMMs, that was in the news at the end of May: https://www.anandtech.com/show/12828/intel-launches-optane-d...)

dkanter 7 years ago | |

Website should be up, I just rebooted AWS :)

pvg 7 years ago | | |

Oh don't do that. We have other shit running on it.

stephengillie 7 years ago | | |

I have coworkers who believe AWS is "Amazon WorkSpace" and complain "my AWS is slow, please reboot my AWS."

kartD 7 years ago | | |

Unfortunately it’s still timing out :(

jaytaylor 7 years ago | |

Found a copy freshly archived today @ archive.org:

http://web.archive.org/web/20180723220131/https://www.realwo...

hughes 7 years ago | | |

Sadly this only archived the first of four pages.

vbezhenar 7 years ago | |

removed

dkanter 7 years ago | | |

Thank you for being a gentleman (or gentlewoman)!

inamberclad 7 years ago |

On a few occasions I've found myself in the presence of a senior engineer at a large defense company who would never stop talking about how persistent memory will change everything forever. Fair enough, but he'd go on about it in the weirdest ways. I think his impression is that the CPU registers would also be nonvolatile. I'm concerned that guy might be a few electrons short of a full orbital.

vbezhenar 7 years ago | |

Saving and restoring CPU registers and flushing caches into non-volatile memory doesn't require much time or energy.

monocasa 7 years ago | | |

Specifically, you can normally do something like that between when you notice the power dropping and when it drains too much and you have to shut down.

Milner08 7 years ago | |

This is perfectly possible. I used to work on something that did just this but at the time used a small battery and an SSD to quickly dump the volatile state before power loss. This meant we had to limit the amount of volatile data that could be stored in ram (due to the transfer rate and up time on the battery). We were eagerly awaiting 3DXP DIMMs so that we could remove that limit. It really will have a big impact on critical systems where any data loss is not acceptable.
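The limit described here is basically bandwidth times battery time. A back-of-the-envelope sketch, where every number and the safety margin are hypothetical and not taken from the actual system:

```python
def max_dumpable_bytes(write_bandwidth_bytes_per_s: float,
                       battery_seconds: float,
                       safety_margin: float = 0.5) -> float:
    """Upper bound on volatile state that can be dumped before the battery dies.

    safety_margin is an assumed derating factor covering battery aging and
    the time needed to notice the power loss in the first place.
    """
    return write_bandwidth_bytes_per_s * battery_seconds * safety_margin


# Hypothetical example: 500 MB/s SSD, 10 seconds of battery, 50% margin.
print(max_dumpable_bytes(500e6, 10.0) / 1e9)  # prints 2.5 (GB)
```

With memory that is already persistent, the dump step and its size cap go away, which is the limit being removed.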

zeusk 7 years ago | |

Shouldn't be hard, all they have to do is a light context switch to idle thread on power loss interrupt and make the idle thread externally re-entrable.

AtlasBarfed 7 years ago |

The initial release was really underwhelming, given the hype around this. So my personal (uninformed) expectation is just an incremental improvement over the initial product.

wtallis 7 years ago | |

Moving 3D XPoint memory from the peripheral IO bus to the memory bus is way more than an incremental improvement.

throwaway2048 7 years ago | | |

Says the company that claimed it would be 1000x faster than NAND flash, it isn't, and moving the location of the bus isn't going to change that.