The Pre-Scheme Restoration (prescheme.org)
I believe the idea is essentially to write C semantics in scheme notation. Variables get marked with 'u32' or similar instead of being implicit sum types of anything the language can represent, memory allocation is explicit instead of garbage collected. In itself that essentially means writing C syntax trees in prefix notation, which is probably an acquired taste.
However scheme also comes with the compile time macro layer and that lot runs just fine in pre-scheme, garbage collected and all, because it's burned off before runtime anyway. Specifically, it's wholly macro-expanded before compilation to C (or similar), which is the obvious lowering to use for execution.
Also scheme has tooling, so if you're careful, the type annotated Cish syntax trees execute correctly as scheme, so you can debug the thing there, unit test it from scheme and so forth.
I really like it as a path to writing lisp runtimes in something that isn't C since an alarming fraction of them turn out to have a C runtime library at the bottom of the stack. Also for writing other things that I tend to write in C, where it's really the semantics I want and the syntax getting in the way.
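To make the "C syntax trees in prefix notation" idea concrete, here is a minimal sketch in Python. The tuple-based form, the `C_TYPES` table, and the `emit_define` / `evaluate` helpers are all made up for illustration; this is not Pre-Scheme's actual syntax or compiler. It only shows how one type-annotated prefix tree can be both lowered to C text and run directly in the host language for testing.

```python
import operator

# Hypothetical mini-form (NOT real Pre-Scheme syntax):
#   ("define", name, return_type, [(param, type), ...], body_expr)
C_TYPES = {"u32": "uint32_t", "i32": "int32_t"}
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.floordiv}


def emit_expr(expr):
    """Lower a prefix expression (nested tuples) to a C expression string."""
    if isinstance(expr, (int, str)):
        return str(expr)
    op, *args = expr
    if op in OPS and len(args) == 2:
        left, right = (emit_expr(a) for a in args)
        return f"({left} {op} {right})"
    # Anything else is emitted as a plain C function call.
    return f"{op}({', '.join(emit_expr(a) for a in args)})"


def emit_define(form):
    """Lower a typed definition to a one-line C function."""
    _, name, ret, params, body = form
    c_params = ", ".join(f"{C_TYPES[t]} {p}" for p, t in params)
    return f"{C_TYPES[ret]} {name}({c_params}) {{ return {emit_expr(body)}; }}"


def evaluate(expr, env):
    """Run the same tree in the host; arithmetic only, no u32 wraparound."""
    if isinstance(expr, int):
        return expr
    if isinstance(expr, str):
        return env[expr]
    op, *args = expr
    return OPS[op](*(evaluate(a, env) for a in args))


body = ("*", "x", ("+", "k", 1))
form = ("define", "scale", "u32", [("x", "u32"), ("k", "u32")], body)
c_source = emit_define(form)
host_result = evaluate(body, {"x": 3, "k": 4})
```

Here `c_source` is the string `uint32_t scale(uint32_t x, uint32_t k) { return (x * (k + 1)); }` and `host_result` is 15. The host evaluator uses unbounded Python integers, so it would only agree with the C output while values stay within the declared width; that is the "if you're careful" part.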
There are two complete implementations
1. one that runs under a stock Python interpreter (which doesn't use static types)
2. one that's pure C++, translated from statically typed Python code, and from data structures generated in Python
In the second case, everything before main() is "burned off" at build time -- e.g. there is metaprogramming on lexers, flag parsers, dicts, etc. that gets run and then turned into static C data -- i.e. data that incurs zero startup cost
Comparison to Pre-Scheme: https://lobste.rs/s/tjiwrd/revival_pre_scheme_systems_progra... (types, compiler output, and GC)
Brief Descriptions of a Python to C++ Translator - https://www.oilshell.org/blog/2022/05/mycpp.html
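As a rough illustration of what "burned off at build time" can mean, here is a small Python sketch that turns a dict computed at build time into static C data. The `FLAGS` table, the `Entry` struct, and `emit_static_table` are hypothetical names for this example; this is not how Oils or mycpp actually generate code.

```python
# Hypothetical build-time data; in a real project this might be the result
# of arbitrary metaprogramming (lexer tables, flag specs, and so on).
FLAGS = {"verbose": 1, "errexit": 2, "nounset": 4}


def emit_static_table(name, table):
    """Render a dict as a constant C array, so nothing is built at startup."""
    lines = [
        "typedef struct { const char* key; int value; } Entry;",
        f"static const Entry {name}[] = {{",
    ]
    for key, value in sorted(table.items()):
        lines.append(f'    {{"{key}", {value}}},')
    lines.append("};")
    lines.append(f"static const int {name}_len = {len(table)};")
    return "\n".join(lines)


c_table = emit_static_table("kFlags", FLAGS)
```

Everything that computed `FLAGS` runs in the build step; the C program only sees a constant array that the linker places in read-only data.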
...
And related to your other point, I remember looking at Racket's implementation around the time it started the Chez Scheme conversion. For some reason, I was surprised that it was over 100K lines of hand-written C in the runtime -- it looked similar to CPython in many ways (which is at least 250K lines of C in the core).
But no more! It's so exciting that Andrew Whatson has begun reviving the project with such great enthusiasm and making it so that Pre-Scheme can run on top of a variety of Schemes. And it's wonderful that NLnet has recognized how important this effort is. I think Pre-Scheme could play an interesting role alongside Zig and Rust, and indeed I know that Andrew plans to incorporate many of the newer ideas explored in those languages on top of Pre-Scheme eventually.
Go Pre-Scheme revival... I'm cheering it on, and can't wait to use this stuff myself!
https://github.com/carp-lang/Carp/issues/1460#issuecomment-2...
Settled on Gerbil Scheme instead, lively community & been actively developed to this day for over 15 years now. Although fair warning, still GC'd and (for now) only type-annotated, not (100%) statically typed. But stdlib-wise and compilation-wise still way more "systems-bent" than most Schemes out there.
Can recommend Gerbil Scheme. Although fair warning, still GC'd and (for now) only type-annotated, not (100%) statically typed. But stdlib-wise and compilation-wise still way more "systems-bent" than most Schemes out there.
Gerbil Scheme – A Lisp for the 21st Century - https://news.ycombinator.com/item?id=39809323 - March 2024 (126 comments)
Gerbil – A meta-dialect of Scheme - https://news.ycombinator.com/item?id=20585637 - Aug 2019 (17 comments)
Gerbil Scheme - https://news.ycombinator.com/item?id=17707622 - Aug 2018 (9 comments)
Gerbil – An opinionated dialect of Scheme designed for systems programming - https://news.ycombinator.com/item?id=15394603 - Oct 2017 (78 comments)
But I also think it will exacerbate an existing problem with C, namely, macros. Low-level programming is all about knowing exactly what's going on, and since C has a preprocessor, that's more difficult than it otherwise would be. Just because something looks like a function call, doesn't mean it actually is one.
Schemes have a much better macro system, and that will simultaneously make the core issue both better, and worse. But it's very much worthwhile to try it, imho, and see if good tooling can ameliorate the downsides, while still enjoying the power, and freedom from tedium, which macros bring to the table.
Pre-scheme: A Scheme dialect for systems programming (1997) [pdf] - https://news.ycombinator.com/item?id=29725313 - Dec 2021 (12 comments)
(surprised there hasn't been more)
The Nix/OS folks might take exception. I'm guessing this is tongue-in-cheek but it belies the tone of the rest of the post.
In all seriousness, though, this is exciting from a modern, end-user's vantage point and fascinating from an historical perspective.
Nixpkgs also doesn't seem to require that all packages be built from source - which, if you're really looking for reproducibility, is a downside. I recognize that there are practical reasons for this, and it's part of why Nix has so many more packages available than Guix, but IMO it makes Guix a better foundation to build on if you want as much of your system as possible to be reproducible.
Genuine question: would there be any advantages in targeting LLVM IR, rather than transpiling to C? With C being notoriously implementation dependent (down to things like the sizes of integer types), it seems like a messy target for something intended to be a sane systems language.
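For anyone wanting to see the integer-size point on their own machine, here is a small Python sketch using the standard `ctypes` module. It compares the platform's native C types with the fixed-width types from C99's `<stdint.h>`, which a C-emitting compiler could choose to use. Whether Pre-Scheme's C output does this is not something this example claims.

```python
import ctypes

# Sizes of native C types depend on the platform ABI.
native = {
    "int": ctypes.sizeof(ctypes.c_int),
    "long": ctypes.sizeof(ctypes.c_long),
    "void*": ctypes.sizeof(ctypes.c_void_p),
}

# Fixed-width types have the same size everywhere they exist.
fixed = {
    "int32_t": ctypes.sizeof(ctypes.c_int32),
    "uint32_t": ctypes.sizeof(ctypes.c_uint32),
    "uint64_t": ctypes.sizeof(ctypes.c_uint64),
}

for name, size in {**native, **fixed}.items():
    print(f"{name:9s} {size} bytes")
```

On common 64-bit Linux and macOS targets `long` is 8 bytes, while on 64-bit Windows it is 4; the fixed-width rows do not change.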
Targeting LLVM IR has the drawback that it is not platform independent: Details of calling conventions must be modeled in the IR, so the compiler must know what ABI it is targeting and emit the appropriate code. Compiling to C doesn't have this problem, since the C compiler will handle calling conventions for you.
That said, LLVM would indeed have some advantages. Scheme has guaranteed tail call optimization, which you cannot guarantee with C. But LLVM does allow you to annotate calls as tail calls, and it can transform tail self-recursion into a loop for you.
Good point! You don't tend to see them too much in the wild, but they're available, which is good enough for present purposes.
> Targeting LLVM IR has the drawback that it is not platform independent: Details of calling conventions must be modeled in the IR, so the compiler must know what ABI it is targeting and emit the appropriate code. Compiling to C doesn't have this problem, since the C compiler will handle calling conventions for you.
Ooft, that would be a rough one. It kind of seems like there'd be some benefit to a low-level IR that's neither platform / implementation-specific, nor has the warts of C, but I appreciate that'd be well outside the scope of this project.
> That said, LLVM would indeed have some advantages. Scheme has guaranteed tail call optimization, which you cannot guarantee with C. But LLVM does allow you to annotate calls as tail calls, and it can transform tail self-recursion into a loop for you.
I suppose this will need to be handled manually in the Pre-Scheme transpiler itself. Losing TCO seems like it ought to be a non-starter for anything Scheme-like.
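A sketch of what "handled in the transpiler" could look like for the simplest case, tail self-recursion. This is hand-written Python showing the before and after shapes of the rewrite, not output from the Pre-Scheme compiler; the function names are made up for the example.

```python
def sum_to_tail(n, acc=0):
    """Tail-recursive form: the self-call is the last thing the function does."""
    if n == 0:
        return acc
    return sum_to_tail(n - 1, acc + n)


def sum_to_loop(n, acc=0):
    """Same function after the rewrite: rebind the parameters, jump to the top."""
    while True:
        if n == 0:
            return acc
        # Both new values are computed from the old ones before rebinding,
        # which mirrors how arguments are evaluated before the call.
        n, acc = n - 1, acc + n


small = (sum_to_tail(100), sum_to_loop(100))
large = sum_to_loop(1_000_000)
```

CPython does not eliminate tail calls, so `sum_to_tail` hits the recursion limit for large `n`, while the loop form runs in constant stack space. Emitting the loop form in generated C gives the same guarantee regardless of what the C compiler decides to optimize. Mutual recursion and tail calls to unknown functions need more machinery than this.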
This is pretty cool, and it's generous of NLnet to grant the project funding, but (and I'm not trying to be rude) I wonder why they chose to give a grant for Pre-Scheme specifically. This seems only loosely related to the goals of the NGI Zero Core program (linked in the article):
"The next generation internet initiative envisions the information age will be an era that brings out the best in all of us. We want to enable human potential, mobility and creativity at the largest possible scale – while dealing responsibly with our natural resources. In order to preserve and expand the European way of life, the programme helps shape a value-centric, human and inclusive Internet for all."
...
"We want a more resilient, trustworthy and open internet. We want to empower end-users. Given the speed at which the 'twin transition' is taking place, we need a greener internet and more sustainable services sooner rather than later. Neither will happen at global scale without protocol evolution, which — as the case of three decades of IPv6 introduction demonstrates — is extremely challenging. NGI0 Core is designed to push beyond the status quo and create a virtuous cycle of innovation through free and open source software, libre hardware and open standards. If we want everyone to use and benefit from the internet to its full potential without holding back, the internet must be built on strong and transparent technologies that allow for permissionless innovation and are equally accessible to all."
I dream of some day soon running Emacs/Guix/Hurd on an open RISC-V chip and not having it be some flossy novelty but a genuine spiritual successor to Genera and the Lisp Machines.
Also when it comes to macros, does that include `syntax-rules` or `syntax-case` style macros, where the latter are much more powerful?
While an embedded Scheme-like language is incredibly useful, at some point I feel as if you would simply have to include these features, and to that end it would just be Scheme reinvented.
NLnet being the operator of the call is no small thing though. Having been through the process, I found them very thoughtful, knowledgeable, and thorough in how they run things. They even run the software they fund and verify it's working and check that the overall ideas are sensible, which is something I can't say of many other grant programs I've interacted with. So NLnet does deserve thanks.
> Scheme syntax, with full support for macros,
you can read that not as saying that Scheme prefix syntax alone is a big selling point, but that it then supports Scheme macros (which are much better than in most other languages that support some kind of macros, partly due to the syntax making this easier).
Then you can read the rest of the sentence, for a bonus:
> and a compatibility library to run Pre-Scheme code in a Scheme interpreter.
Which means that you can do things like develop using this language within a normal Scheme development environment, possibly share code between developing for the PreScheme compiler target and non-PreScheme targets, etc.
Good for them.
> possibly share code between developing for the PreScheme compiler target and non-PreScheme targets
"possibly" is a strong word, seeing that Pre-Scheme is a statically typed, explicitly memory managed subset and all. There's a very large and coarse-grained semantic leap.
Then you can read the rest of <https://www.steveblackburn.org/pubs/papers/vmmagic-vee-2009....>, for a bonus.
It's a funny problem but because it's antithetical to the original project's spirit you won't hear about it from any official Guix sources and so it's relatively unknown.
Does Guix not have GHC (Glasgow Haskell Compiler) or did it somehow bootstrap GHC? Last time I checked bootstrapping GHC on today's hardware is effectively an unsolved problem. [1]
> NixOS does not have an equivalent to Guix's full-source bootstrap
While you are not wrong, there is nothing fundamentally stopping Nixpkgs from being bootstrapped in a similar way to Guix. emilytrau has already done a lot of the work. [2]
[1] https://elephly.net/posts/2017-01-09-bootstrapping-haskell-p...
[2] https://github.com/NixOS/nixpkgs/pull/227914
However, just to clarify for others, it's not the only thing there of course. There is free software in nonguix, maybe because it's a PITA to bootstrap, like for example Leiningen and other parts of the Clojure ecosystem, as well as everything and anything written using Electron. And of course notable free software things there are also the blobbed Linux kernel (probably obvious reasons), as well as Firefox, since Mozilla has some interesting trademark opinions, so you can't have it on the main Guix channel.
But it's helpful to have Guix itself aim for reproducibility even if nonguix exists, so you can install upstream Guix alone if you're looking for reproducibility.
You also forgo any compiler improvements.
I think you're right, it looks like they've gotten a little further now than in that post but there's still a gap in the bootstrap chain. So maybe not every package is fully bootstrapped, but they do seem to take it more seriously.
> While you are not wrong, there is nothing fundamentally stopping Nixpkgs from being bootstrapped in a similar way to Guix. emilytrau has already done a lot of the work.
Yes, I agree, and I hope they get there! I just also think that acknowledging the places where Guix is currently ahead isn't wrong. Nix isn't the only game in town anymore.
Can you say more, or provide any references? I would be interested in the state of the art here.
I had to look up exactly what this means, not being very familiar with the Haskell ecosystem myself. It looks like it's not the raw source form and is architecture-specific, but it's also not the compiled binary form. So that's not perfect, but better than relying on the compiled binaries I guess. (Unfortunate for me since my laptop is ARM and I'd like to be able to use git-annex, haha.) But this seems to work for older versions of GHC.
This post by Simon Tournier from last year describes the current situation near the bottom, and from what I can tell this is still correct: https://simon.tournier.info/posts/2023-10-01-bootstrapping.h...
> The bootstrapping problem for Haskell is not solved. And Ricardo works hard on it. Currently, from the older GHC around (4.08.2), which relies on gcc-2.95 – part of the Bootstrapping story above – it is possible to chain until version 6.10.4. Then versions 6.12.3 and 7.4.2 are not packaged yet for completing the Haskell chain from version 4.08.2 to modern version as 9.2.5; fully connecting the dots with bootstrap-seeds and dropping these 450MiB of binaries. The solution of this chicken-or-the-egg is not yet complete.
(I'm the author of the 2017 blog post. I had planned a follow-up but since I didn't have much to show I scrapped it.)