Ask HN: What open source project, in your opinion, has the highest code quality?

458 points by chefqual 7 years ago | 286 comments

akavel 7 years ago |

I hold the source code of Go standard library & base distribution (i.e. compiler, etc.) in very high regard. Especially the standard library is, in my opinion, stunningly easy to read, explore and understand, while at the same time being well thought through, easy to use (great and astonishingly well documented APIs), of very good performance, and with huge amounts of (also well readable!) tests. The compiler (including the runtime library) is noticeably harder to read and understand (especially because of sparse comments and somewhat idiosyncratic naming conventions; that's partly explained by it being constantly in flux). But still doable for a human being, and I guess probably significantly easier than in most modern compilers. (Though I'd love to be proven wrong on this account!)

At the same time, the apparent simplicity should not be mistaken for lack of effort; on the contrary, I feel every line oozes with purpose, practicality, and to-the-point-ness, like a well sharpened knife, or a great piece of art where it's not about that you cannot add more, but that you cannot remove more.

013a 7 years ago | |

This is one of the great things about many Go libraries; the language is so simple it's difficult to overcomplicate a Go project. This makes reading any Go source code, projects, libraries, the stdlib, a joy. The only times I've found Go libraries to be a PITA to read are when they get autogenerated from some other language (protobuf compilations, parts of the compiler that came from C, AWS/GoogleCloud/Azure libraries, etc), but that's to be expected in every language.

Kubernetes is another great example of a project that is so unbelievably complex in its function, it should be completely impenetrable to anyone who isn't a language expert. But go check it out; it's certainly complex and huge, but actually grokable.

ben_jones 7 years ago | | |

I would argue that while Kubernetes is a great piece of software, and it's definitely practical to go in with relatively little experience and tweak a single line or function, Kubernetes is not easy to grok or reproduce in its entirety. For example, it has its own implementation of generics and a custom type system [1].

[1]: https://medium.com/@arschles/go-experience-report-generics-i...

eptcyka 7 years ago | |

I'd agree, but only as far as aesthetics go. When you have to understand the time complexity and runtime characteristics of the standard library sorting algorithms, I think Go does a very bad job - the standard `sort.Sort(data sort.Interface)` will run poorly if the data is already mostly sorted. I expect these kinds of things to be documented properly.

kragen 7 years ago | | |

Golang's `sort.Sort(data sort.Interface)` will sort mostly-sorted data in nearly its fastest possible time, because it basically uses median-of-three quicksort, falling back to insertion sort for small partitions. Median-of-three on sorted or nearly-sorted data picks the optimal or nearly optimal partitioning element for quicksort. The code is simple, readable, and well-commented. Moreover, its average and worst-case complexity is documented in the godoc.

In short, your comment is wrong from beginning to end. What led you to believe that anything in it was true?

pjscott 7 years ago | | |

It's guaranteed to run in O(n log n) time. Currently it uses quicksort with heapsort as a fallback to prevent quicksort's quadratic worst-case time.

https://golang.org/src/sort/sort.go?s=5414:5439#L206

jehlakj 7 years ago | | |

I was pretty certain most libraries shuffle then quicksort. No need for documentation. Does Go not do this?

rusk 7 years ago | |

> standard library is, in my opinion, stunningly easy to read

Reading this brought to mind the JDK. All well structured, neatly formatted and well documented. I’ll often just click thru to the source to get the nitty-gritty on a function, I rarely need to consult the actual docs!

rick22 7 years ago | | |

Not the C++ STL. Very obscure.

unstuckdev 7 years ago | |

The fact that almost everyone uses the same style and standards through the Go tools has made learning easy. I can dip into the most advanced package and make sense of what's going on quickly.

abtinf 7 years ago | |

I think one of the things that makes Go library code so easy to read is the lack of generics. Everything you need to understand the code is right in front of you, without the barrier of having to learn new sets of complicated abstractions or worrying that some obscure code in some other file impacts/is invisibly called by the function. With large code bases written in other programming languages, I have to spend an inordinate amount of time studying the code base and object relations before making changes.

For me, code readability is such a high value that, on these grounds alone, I oppose the introduction of generics and hope the current proposals ultimately fail.

cube2222 7 years ago | |

As far as I know, parts of the compiler are still code automatically translated from C, so this may be part of the reason.

colek42 7 years ago | | |

It is not, the compiler has been pure Go since, I think, v1.4.

tzury 7 years ago |

SQLite.

and for this reason alone!

https://www.sqlite.org/testing.html

    As of version 3.23.0 (2018-04-02), the SQLite library consists of approximately 
    128.9 KSLOC of C code. (KSLOC means thousands of "Source Lines Of Code" or, in 
    other words, lines of code excluding blank lines and comments.) 

    By comparison, the project has 711 times as much test code and test scripts - 
    91772.0 KSLOC.

moviuro 7 years ago |

OpenBSD and Co. (OpenSSH, etc.)

* https://cvsweb.openbsd.org/src/

* https://cvsweb.openbsd.org/cgi-bin/cvsweb/src/usr.bin/ssh/

* https://www.libressl.org/

* https://cvsweb.openbsd.org/src/usr.sbin/ntpd/

kristopolous 7 years ago |

NetBSD.

Why? I was able to do substantial changes to the kernel when I was a teenager (late 90s), mostly on my first try. There was no giant wall of abstraction I had to climb over or some huge swath of mutually interacting code I had to comprehend. There was also nothing that required fancy code navigation and the creation of something like the ctags database in order to find out what on earth was happening.

No action at a distance or lasagna style dereferencing or mysterious type names that are just typedef'd and #define'd around dozens of times back to something basic like char. No fancy obscure GNU preprocessor extensions or exotic programming patterns.

Nothing had obtuse documentation that tried my patience or required much more than enthusiasm and basic C knowledge.

I did things like got a wireless card working from code written for one with a similar chipset and got various other things like the IrDA transmitter on my laptop at the time to do a slattach and thus work as a primitive wireless network - all in the late 90s.

I likely had no idea what, say, the difference between network byte order and host byte order was at the time or how the 802.11b protocol worked or what a radiotap header was or any of that. The separation of concerns was so good however, that none of that knowledge was actually needed.

Compare that to say, the Qualcomm compatible WWAN I just dealt with over the past few weeks where I needed to have in-depth knowledge of an exhaustive number of things (very specific chipset and network details) to get a basic ipv4 address working. Then I needed to read up on GNSS technology and NMEA data to debug codes over USBmon to get the GPS from the wwan working. Then after I had the qmi kernel modules doing what I wanted and the qmi userland toolsets, I had to write some python scripts to talk to dbus to get the data from the modemmanager that I needed in order to log the GPS. All the maintainers of these pieces were very nice and helpful and I have nothing negative to say. This is just how it usually is these days.

Back then, however, I wasn't a good programmer. I was likely pretty terrible, in fact, but with the NetBSD codebase I was able to knock out whatever I wanted every time, fast, on a 486.

I miss those days.

hyperman1 7 years ago |

It's older than some HN posters, but the GPLed DOOM source code was one I liked.

The performance reached by the game was considered impossible until Carmack showed us otherwise. So I expected lots of ASM and weird hacks, especially as compiler optimization wasn't as good as it is today.

Surprise, surprise, the thing was easy to read, easy to get going, easy to port, reasonably documented. It has shown me what a good balance between nice code and usable code is.

If you want to browse: https://github.com/id-Software/DOOM

NoSirRah 7 years ago | |

FYI, "Surprise, surprise" is generally meant sarcastically. I think you mean "Surprisingly".

https://www.merriam-webster.com/dictionary/surprise,%20surpr...

hyperman1 7 years ago | | |

  I can speak English!  I learn it from a book!

Though not always very good. Thanks for the bugfix.

jay_kyburz 7 years ago | | |

Actually, hyperman1 is using this correctly in my opinion. We all knew the code would be good; it was only him that doubted it. So it's not surprising to us, the readers, that the code is good.

iso-8859-1 7 years ago | |

You are reading cleaned up source code that only compiles and runs on Linux. That's why it looks nice.

  Many thanks to Bernd Kreimeier for taking the time to clean up the
  project and make sure that it actually works.  Projects tends to rot if
  you leave it alone for a few years, and it takes effort for someone to
  deal with it again.

hyperman1 7 years ago | | |

I didn't have internet at the time so I didn't check GitHub 20 years ago ;-)

On the more serious side, I wanted to say something about the TODOs as an example of the balance, but couldn't find any. I thought I was confusing it with Quake, but the cleanup might explain it better.

GNi33 7 years ago | |

These 666 forks can't be a coincidence, right?

thecatspaw 7 years ago | | |

It seems to be, judging by the fact that it's now 667 forks.

blackbeard334 7 years ago | | |

667

de_watcher 7 years ago | |

The DOOM code is so straightforward. You don't ever experience that feeling of having zero understanding of the code when you look into a file.

bsandert 7 years ago |

This is not necessarily about the code, but I've been really impressed for a while by the lodash project and its maintainer's dedication to constantly keep the number of open issues at 0. Any issues get dealt with at record speed, it's quite a sight to see.

https://github.com/lodash/lodash/issues

oldmanhorton 7 years ago | |

JDD, the maintainer, is also incredibly devoted and overall a nice guy to talk to. He has something like 5 years (and counting) of making a commit every single day, including weekends and holidays and sick days. They may not always be world-changing commits, but it still shows an incredible amount of dedication.

rootlocus 7 years ago | |

Not necessarily relevant, but 15% of the issues are labeled "wontfix".

adimitrov 7 years ago | | |

With such a big project, being quick to hand out wontfix isn't necessarily a bad thing. To be honest, seeing as this project is used by a huge part of the… rather diverse JS crowd, 15% wontfix is astoundingly low.

fergie 7 years ago | | |

As an open source maintainer myself, that seems like a pretty low percentage.

xfs 7 years ago | |

It's not always a good thing. In the haste of fixing things, introspection of root causes may be neglected...

dilipray 7 years ago | |

Never noticed this.

curtisz 7 years ago |

Strictly talking about code quality, I will nominate RCP100, which is a small, virtually unknown, now-abandoned routing software written in C [0]. I started programming with C way back in the 90s, and this is one of only two projects I can recall being immediately struck by the beauty of the code (Redis being the other). I know almost nothing about the author but he seems not to want to be known by name. You can browse the source on Github [1], which I uploaded myself, since you can only get a tarball from sourceforge. Anyway, as someone else mentions, C is usually a mess, but RCP100 struck me as beautiful.

[0] http://rcp100.sourceforge.net/

[1] https://github.com/curtiszimmerman/rcp100

carliton 7 years ago | |

Hi Curtisz,

Thanks for uploading RCP100. Your comment is a timely one. I wanted to learn how a router works and is built and was looking for a simpler implementation.

Can you recommend any resources from which I could learn more about network programming, so that I could understand RCP100 code better?

Thanks!

hansoolo 7 years ago | |

Maybe you just send the guys an email ;)

curtisz 7 years ago | | |

I actually did send fan mail to the author, heh, thinly-disguised as a courtesy to let them know that I mirrored their project on Github.

zubairlk 7 years ago |

I'm surprised no one has mentioned the Linux Kernel!

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...

It is quite clean when you consider the task that it accomplishes.

Being able to compile across multiple architectures, endianness and 32/64-bit, and to scale up/down from server/desktop/router/phone while accepting contributions from thousands of people...

aortega 7 years ago | |

It's not clean at all. Thousands of different styles, no single convention on function-naming, etc.

Want a clean kernel, go look at the BSDs.

elihu 7 years ago | |

One thing the Linux kernel has going for it is that there are a lot of books that describe how the various parts work and how to use the various internal interfaces. I can't think of any other open source project that has multiple books on how to contribute.

(Sadly, most of those good kernel books were written in the 90's and early 2000's. I don't know if there are any recent kernel hacking books.)

brentjanderson 7 years ago |

Going to throw Elixir Lang into the mix.

- The tooling is excellent.

- The code is well-documented and readable.

- The core team committed to never needing to introduce breaking changes.

The Elixir community tends to produce work that is actually considered "Done". An Elixir package is not stale when it hasn't seen a commit in a few months. Instead, the feeling is: "It's feature complete and only needs maintenance from here on out."

https://github.com/elixir-lang/elixir

flaque 7 years ago | |

> The core team committed to never needing to introduce breaking changes.

Is this why Elixir seems to have many different ways of doing the same thing though?

JamesUtah07 7 years ago | | |

I think that's one reason. The other is that classic Erlang (Elixir is built on top of the Erlang BEAM VM) sometimes does things one way while Elixir has a more elegant way of doing the same thing. However, in Elixir you can still call into Erlang libraries to achieve the same thing if that's more familiar to you.

binalpatel 7 years ago |

Scikit Learn comes to mind - not just because I can dig into the source code and immediately know what's happening, but also for the stellar documentation that goes above and beyond telling you what the functions do.

For example their Cross Validation documentation is amazing:

http://scikit-learn.org/stable/modules/cross_validation.html...

andygrunwald 7 years ago |

When we take the language into consideration, I would like to mention Redis.

Often codebases written in C are a mess to understand, a mess to read. The Redis source code is understandable even without deep knowledge of C.

christophilus 7 years ago | |

Yep. I was going to say Redis and SQLite. Both are really well commented. They almost read like a manual.

andoma 7 years ago | | |

Although still in beta, I'd like to add BearSSL to the mix of well written and documented C libraries. In particular compared to the OpenSSL "documentation". It's also nice to see a TLS implementation without any memory allocations at all.

travmatt 7 years ago | |

> Redis Source Code is understandable even without deep knowledge of C

Came here to say exactly this - Redis is very cleanly written.

maaaats 7 years ago |

I'd prefer if people said why they consider the code good, instead of throwing out a bunch of random projects.

miguendes 7 years ago |

Python: I really like requests, scikit-learn, the Path module from the standard library, Keras, Django.

C: Redis, SQLite, Lua.

Java: Joda Time, Guava

potta_coffee 7 years ago | |

+1 for requests, I forgot about that one but it's quite good.

chillydawg 7 years ago | |

Joda Time is one of my all time favourite libraries.

After struggling with JVM stdlib time nonsense, JodaTime was a breath of fresh air and actually made programming with time fun.

hellofunk 7 years ago | | |

Java 8 time module is now considered the replacement for JodaTime for new projects. It is separate from the older Java time libraries, and fixes many of the problems in Joda. Give it a try!

guidovranken 7 years ago |

ARM mbed TLS [1], Amazon S2N [2], nginx [3] have a super consistent code style throughout and are prime examples of how C application programming should be done (in my opinion).

[1] https://github.com/ARMmbed/mbedtls

[2] https://github.com/awslabs/s2n

[3] https://github.com/nginx/nginx

ac 7 years ago | |

+1 for s2n. It's one of the select few C codebases that is actually a pleasure to read.

_the_inflator 7 years ago |

There are too many in very different domains and languages.

However, I opt for jQuery here. It is one of the greatest examples of how constant refactoring and thoughtful usage of design patterns get you a very long way.

If you are designing JavaScript libraries, pls have a look at jQuery. So many great design decisions aka great code quality.

AmericanChopper 7 years ago | |

Pushing all dom manipulation through global evals seems like the exact opposite of thoughtful design to me. I have a long list of places where I want to implement strict CSPs, but can’t purely for minor use of jQuery.

jaequery 7 years ago |

Sequel, a database ORM for Ruby: https://github.com/jeremyevans/sequel

The quality of the code is amazing, it's simple to use and even simpler to look through the docs to reason about.

I also want to praise the author of the library (Jeremy Evans), his support through the IRC is second to none, you can talk directly with him pretty much on a daily basis.

And even after 8+ years, the project is still constantly being updated (last commit 4 days ago). I haven't seen too many projects of this calibre, especially when run mostly by a single person.

ChrisRackauckas 7 years ago |

Julia. Julia / Julialang is so pedantically tested and the names are pretty meticulously chosen. The algorithms in Base are almost all generic and handle a very wide variety of inputs without catering to them. If you want to learn Julia, along with good software engineering, looking at the Base library is quite recommended.

tlamponi 7 years ago | |

Did not look too much into it but at least a packager from Alpine Linux does not think Julia's compiler ecosystem is clean/easy to work with: http://lists.alpinelinux.org/alpine-devel/6248.html

But as said, I did not really check this claim for validity myself...

ChrisRackauckas 7 years ago | | |

Julia requires patched versions of things like LLVM in order for all tests to pass because upstreaming bugfixes takes time. This has given some Linux package managers an issue since they try to build using system LLVM/OpenBLAS/etc. with the known bugs. I agree this does cause some distribution problems, but as a scientist and mathematician I do like that the standard distribution of Julia uses the most numerically correct versions (as of current knowledge) of the dependencies as it can, and has a test to identify known potential issues. To me this is good practice.

But anyways, I was talking about the Julia Base library and its numerical routines. I just look at the Julia code and don't touch the build systems.

nazri1 7 years ago |

Does assembly count? Prince of Persia's source code (not really open source...): https://github.com/jmechner/Prince-of-Persia-Apple-II

One look at any of the assembly files and you can get a sense of how properly organized the source code is.

bovermyer 7 years ago | |

Thanks for that! I love looking at historical game code.

mpasternacki 7 years ago |

I'm a bit surprised nobody mentioned qmail yet: https://cr.yp.to/qmail.html

pvarangot 7 years ago | |

qmail cheats a bit because it's so simple that most people end up using something with messy code on top. Not that I don't think it's a sound engineering decision, but when comparing its code cleanliness with other SMTP stacks it needs to be mentioned.

harryh 7 years ago | |

Or djbdns!

djb is a legend

informatimago 7 years ago | |

I don’t know about qmail, but postfix sources are really nice.

tptacek 7 years ago | | |

They're better than most C software of the era, but not better than qmail --- qmail has a better vulnerability record than Postfix does (perhaps because it does less, but that's beside the point).

potta_coffee 7 years ago |

Granted I haven't read much open source code but when I was working in Flask, I found the source code to be awesomely clear and well-documented. I actually learned quite a bit about Python by reading Flask code. Also, no-one could explain "g" in a way that made sense, but the source code made it obvious. Would recommend reading it if you're into Python at all.

83457 7 years ago | |

jxub 7 years ago | | |

The global request object if I recall properly.

dredmorbius 7 years ago |

I'd like to suggest:

1. Don't simply list projects.

2. Give some notion of why you're nominating code.

3. A sense of what you consider to be quality.

Enough to spark discussion, inquiry, or comparison. Doesn't have to be much.

This is rudimentary, but affords purchase: https://news.ycombinator.com/item?id=18037815

This does not: https://news.ycombinator.com/item?id=18038047

(Both reference the same project.)

philliphaydon 7 years ago |

Both Vue and PostgreSQL. Both have great code bases, amazing documentation, and amazing communities.

chiefalchemist 7 years ago | |

Nice to see you include / mention docs and community. I believe a code-based product has a UX. That UX is the code (with comments), documentation and community. That UX is your (i.e., a dev / engineer) end to end experience with "the product." It's not simply the code.

Put another way, there's more to a product that's easy and sensible to work with than code quality.

graki 7 years ago |

I'm surprised nobody cited TeX from Knuth. It's an absolute standard in quality of implementation, documentation and computer science background. Perhaps unsurpassed.

jacques_chester 7 years ago |

I definitely admired PostgreSQL's code when I first looked at it.

Projects written in C require a fair amount of care and discipline to be scaled up to larger codebases and teams. PostgreSQL is such a codebase.

I've also seen various parts of Spring's codebase and found all of it to be consistently solid and careful. They take a lot of care to structure carefully and comment immaculately.

Disclosure: I work for Pivotal, which sponsors Spring. Which is why Spring is highly visible in my working life.

nojvek 7 years ago |

Typescript

Even though it’s a fairly complex transpiler, the authors did a good job modularizing and leaving lots of contextual comments on what each part does.

Also typescript baseline tests are a simple but very effective way to get lots of coverage on the compiler.

I’ve read source code for Babel, typescript, coffeescript and flow. Typescript architecture stands out.

Typescript not only does fascinating things like magical code completion abilities and great tooling for IDEs but their codebase has been an inspiration for me to build better front end code.

I may be a bit biased since I’ve worked at Microsoft before.

ioddly 7 years ago | |

I found the TypeScript type checker pretty hard to read through, though it may be my lack of, well, almost any knowledge about type theory. I didn't dig much into the other parts of the codebase however. What parts of it do you enjoy reading?

nojvek 7 years ago | | |

While submitting a PR, the parser, lexer and emitter were fairly easy to understand.

daniel-levin 7 years ago |

LLVM and associated projects such as clang. Bazel is good too. OkHttp and Retrofit by Square.

tom_mellior 7 years ago | |

I've been working with LLVM for a few years and I still find the code difficult to navigate and badly documented. And every single function's argument list is a random jumble of pointers and references (almost all arguments should be references, but many aren't).

anarazel 7 years ago | | |

Indeed. And it's not just medium to low-level stuff that's not well documented, it's the high-level stuff too. I personally don't mind that much if I have to spend a few minutes to understand something on a very local scope, but if the bigger picture is unclear, that's quite bad. For LLVM one largely has to grep for a bunch of other users and try to figure it out from that.

While I think it has some clear deficiencies, I found a lot of e.g. the optimization passes in GCC a lot easier to read. It's probably above par, but e.g. https://github.com/gcc-mirror/gcc/blob/master/gcc/gimple-ssa... is really well explained imo.

vnorilo 7 years ago | |

LLVM is remarkable; the domain is both difficult and critical. Still, the code is consistent enough that I can often guess how things work based on what I think would be reasonable!

glandium 7 years ago | | |

Don't look at how inline assembly is handled between clang and llvm.

xfs 7 years ago | |

Yet it contains monsters like this one https://github.com/llvm-mirror/llvm/blob/master/lib/Target/X...

McP 7 years ago | |

The coding standard for variables in LLVM drives me nuts. Both class names and variables names must be upper camel case so if you're lucky the code looks like this:

Analyzer TheAnalyzer;

but more commonly:

Analyzer A;

with A being utterly unhelpful to read many lines later.

helium 7 years ago |

The requests codebase is really well written and it has a beautiful API.

https://github.com/requests/requests

Drdrdrq 7 years ago | |

I like Kenneth's comments: https://github.com/requests/requests/blob/master/requests/pa...

kbr2000 7 years ago |

Tcl! See https://www.tcl.tk/doc/engManual.pdf to start to understand why. (For code written in Tcl itself there's also some proposed conventions: https://www.tcl.tk/doc/styleGuide.pdf)

a-saleh 7 years ago |

I really liked the clojure core, I read it quite a lot when learning the language.

I have heard good things about sqlite, and some day, I plan to read it :-)

unixhero 7 years ago |

Dolphin Emulator

https://dolphin-emu.org/

delroth 7 years ago | |

We try to keep up, but the truth is that it's a 15-year-old C++ codebase implementing some weird hardware in even weirder ways. We're far from where we'd want to be code quality wise -- close to no automated testing infrastructure, code is full of module-level globals, inconsistent conventions, etc.

swsieber 7 years ago | | |

How would you even test an emulator except manually? It seems like automated website testing, but even worse. I guess screenshots + scripted input?

That seems like it'd be terrible to try to get running reliably.

stevekemp 7 years ago | |

I've never installed it, or read the code, but their progress-report writeups are fascinating to me.

e.g. The most recent https://dolphin-emu.org/blog/2018/09/01/dolphin-progress-rep...

zengid 7 years ago |

The JUCE C++ library is very nice: https://github.com/WeAreROLI/JUCE

garyclarke27 7 years ago |

I’m no C expert so I’m somewhat guessing, but to me, PostgreSQL source looks remarkably clean, well structured and nicely commented.

itsoggy 7 years ago |

The Quake 3 source was fairly good...

charlchi 7 years ago | |

As a C beginner getting into writing larger projects, especially in that sort of context, the Quake source has been my reference on how to structure my code.

archi42 7 years ago | |

Oh, this +1. I ported it to another C dialect (test case for the compiler) and found those parts I touched well structured and easy to understand.

batteryhorse 7 years ago |

I was going to say the GNU version of /bin/false and /bin/true, but I actually took a look at the source and it is terrible.

panic 7 years ago | |

The original /bin/true is probably the highest quality code ever written, but unfortunately I don’t think the license is OSI approved: http://trillian.mit.edu/~jc/%3B-)/ATT_Copyright_true.html

hyperpallium 7 years ago | | |

Gosh, it seems like copyright lawyers will stop at nothing.

floren 7 years ago | |

The GNU coding style does not help with readability, in my opinion (he said, donning flame-proof underwear)

mixedbit 7 years ago |

Python core libraries have great code. You can open pretty much any module and be able to understand the source without much context.

akvadrako 7 years ago | |

I don't know how you can say this. The standard lib isn't even very pythonic, let alone "great" along other dimensions.

blattimwind 7 years ago | | |

Agreed. Almost every time I've looked deeply into stdlib code I was surprised by how hard to follow it is and how frequently antipatterns are employed. Doubly so for anything near a C module.

I consider the Python stdlib in a similar vein as the C++ stdlib or Boost: Yes, some useful bits in there, but (1) lots of rot (2) you don't want to have your code look anything like it.

falsedan 7 years ago | |

The only core library code I needed to look at was namedtuple, which is pretty incomprehensible even with context.

Walkman 7 years ago | |

You obviously did not dig in :D There are absolutely terrible parts, would not recommend!

xtreak29 7 years ago | |

Though core has some bad APIs due to maintaining backwards compatibility, a lot of the third party libraries like requests and Flask have great focus on API design and code quality.

The authors have good quality repos:

https://github.com/kennethreitz

https://github.com/mitsuhiko

dyeray 7 years ago | | |

I agree with Flask, much more readable code than Django for example. I would also add Django Rest Framework (and Tom Christie) to the list.

dyeray 7 years ago | |

Agreed with the rest, I've ended up reading pypy's implementation of some functions sometimes to see how it works after trying CPython first. From the few I've read I'd say pypy looks nice by the way (I'm talking about standard library).

noir_lord 7 years ago |

In PHP land where I spend time for work.

Hands down Symfony.

utahcon 7 years ago |

Golang and Kubernetes have been highly regarded as high quality. I particularly found the Golang code for Kubernetes to be well documented and well architected.

jimhefferon 7 years ago |

Knuth did a good job on TeX, and it has been closely examined for many years since so there are very few bugs.

informatimago 7 years ago | |

The TeX language itself, and the logs and error messages of TeX are so bad, that I would hardly believe it.

svat 7 years ago | | |

Are you sure you aren't thinking of LaTeX?

TeX (plain TeX, not LaTeX) has phenomenally good logging and error messages IMO — everything you need is there, each error message comes in a “formal” and “informal” form and points you to exactly the place the error happened, and TeX lets you fix things on-the-fly without restarting the program. All this of course assumes you use TeX the way it is described in the manual (The TeXbook). The experience is opposite with LaTeX, so I find it worth giving up all the convenience of LaTeX just for the wonderful experience with TeX.

As for “the TeX language”, there is no such thing. As Knuth has said many times, TeX is designed for typesetting, not programming. Sure it has macros to save some typing, but if you're writing elaborate programs in it (as is nearly inevitable if you're using LaTeX) you're doing something wrong. Knuth said:

> When I put in the calculation of prime numbers into the TeX manual I was not thinking of this as the way to use TeX. I was thinking, “Oh, by the way, look at this: dogs can stand on their hind legs and TeX can calculate prime numbers.”

But of course LaTeX does every such thing imaginable :-)

More on TeX not being a programming language: https://cstheory.stackexchange.com/a/40282/115

On the TeX error experience: https://news.ycombinator.com/item?id=15734980

begriffs 7 years ago |

OpenBSD

https://www.openbsd.org/goals.html

DanWaterworth 7 years ago |

SQLite

zevv 7 years ago |

Lua: https://www.lua.org/

cellover 7 years ago |

To me that would be Appleseed rendering engine.

https://github.com/appleseedhq/appleseed

Even though I can't code C++, I can read it here and understand most of it (besides the maths).

pa7ch 7 years ago |

I particularly like reading code from Upspin (upspin.io). It's probably partially because I think the project design is interesting and I write Go. Regardless, it's a great ground-up Go project by some of the original Go authors and contributors.

Very well organized code. It feels like they got the project off the ground, fixed bugs for a few months, and have now largely trailed off from maintaining it because it just works (I use it), which lends some credibility to their coding style. Of course, I'd like to see the project evolve conceptually, but right now it reliably does what it says it does, for a project that hasn't even cut a single release.

silur 7 years ago |

radare2 - https://github.com/radare/radare2/ More GNU than actual GNU sources, more UNIX than the linux kernel. Huge codebase but extremely easy to get involved with, orthogonal design with no compromise on speed. Best codebase I ever encountered

gameswithgo 7 years ago |

PostgreSQL and Quake3 are good candidates. Both are C codebases which are surprisingly readable even by relative novices.

gorb314 7 years ago |

I think musl libc [1] has good quality code. If anything their build system is great. It makes the code much easier to navigate.

[1] https://www.musl-libc.org

izabera 7 years ago | |

https://git.musl-libc.org/cgit/musl/tree/src/string/strcspn....

https://git.musl-libc.org/cgit/musl/tree/src/stdio/vfprintf....

ah yes, good quality code

AndyKelley 7 years ago | | |

I still think musl overall is quite readable, but my goodness, that switch statement in your second example. What a monster. I didn't think it was possible to be this confusing without the preprocessor.

stonogo 7 years ago | | |

What's the problem, out loud?

ckorhonen 7 years ago |

On the JavaScript side I've enjoyed reading the code for Backbone and Underscore, helped also by the awesome in-line documentation. Very easy to see what is going on.

Also big fan of Sidekiq for similar reasons.

tmilard 7 years ago |

BabylonJS has wonderfully clean code. It's made for writing 3D on the Web. https://www.babylonjs.com/

epynonymous 7 years ago |

most people are talking about clean code and good design constructs, but i feel that many are missing the point: we’re talking about code quality here. design is the grit and grind that all developers go through to develop great software, and certainly there are better designed software projects out there whose design leaves them more maintainable and less prone to bugs. but the fact of the matter is that for complicated code, designs go through many iterations and refactorings over time (e.g. the linux kernel), and all software projects have bugs, even well designed or well tested software. the significance of good testing and good processes is not being highlighted here: unit testing, code coverage, functional testing, end to end testing, scale testing, performance testing, code review, fault injection, debuggability, test automation, static code analysis, etc. i am shocked not to see lots of discussion on these things and on testing techniques (aside from the sqlite mention). probably a more developer friendly crowd here at hn, but testing is a significant and game changing part of what separates developers from great developers.

TekMol 7 years ago |

I like the Laravel framework. It has a clean style to it.

jedberg 7 years ago |

Postgres.

sgt 7 years ago | |

Definitely agree with this. Both the documentation and code are of excellent quality. Others that come to mind are sqlite and zeromq.

bsaul 7 years ago |

my first experience with high quality code was with the quake2 engine.

i was both amazed by the simplicity of the architecture (a huge single event loop), and the attention to code presentation and indentation.

SmellyGeekBoy 7 years ago | |

Interesting to see so many John Carmack projects in this thread. He's a good candidate for "best programmer of all time", if there were such a thing.

i_feel_great 7 years ago |

Gambit, Chicken, Racket, Chez and Guile Schemes

jMyles 7 years ago |

Twisted. Not only highly organized and sensibly delineated, but also a lot of fun to read - borderline comical at places.

iso-8859-1 7 years ago | |

How do you think the asyncio (formerly Tulip) sources compare?

jMyles 7 years ago | | |

asyncio is more modern, more stylish, and more concrete.

Twisted is more timeless, more patterned, and more self-aware.

I can imagine Twisted's asyncio reactor becoming its default (and the Twisted flow control slowly declining in importance), but Twisted's protocols, control structures, and execution models becoming more popular.

Twisted has undergone a great resurgence in quality engineering since asyncio became more viable - this was surprising to me, but is actually probably reasonably consistent with the historical influence of the standard library.

Overall, I think that Twisted is a great project; I almost always reach for it when my python codebase becomes mature enough to need more thoughtful abstractions around network I/O.

otakucode 7 years ago |

Does 'Physically Based Rendering' count? It's a book... which is also source. It was written as only the 2nd work of true 'Literate Programming' that I know of. I believe Knuth wrote a book about TeX which was the first example. But basically it is prose interleaved with source, readable as a book.

fapjacks 7 years ago |

Actually, I think early versions (like from pre-1.0 through maybe 1.5 or so) of Docker had some very high quality code and was also very pleasing to look at. It was very clean and super approachable and readable, and I felt sort of like how the NetBSD commenter felt as described in their comment.

mehrdadn 7 years ago |

"Highest" I don't know, but "code whose quality I look up to", then:

For C: Process Hacker and some similar code that is designed like and written around Windows kernel APIs: https://github.com/processhacker/processhacker/blob/master/p...

For C++: Some of the Boost code, and stuff like it, such as P-Stade Oven: https://github.com/himura/p-stade/blob/master/pstade/pstade/...

For others: (need to look later, I forget)

robbick 7 years ago |

Can't say I've seen enough to be confident on the best library but redux (https://github.com/reduxjs/redux) is just so simple, and has great, readable/understandable code.

icc97 7 years ago | |

In Dan Abramov's excellent egghead redux course [0] he implements `createStore` from scratch, which is the core of redux. It's simple enough to post here:

  const createStore = (reducer) => {
    let state;
    let listeners = [];

    const getState = () => state;

    const dispatch = (action) => {
        // compute the next state, then notify every subscriber
        state = reducer(state, action);
        listeners.forEach(listener => listener());
    };

    const subscribe = (listener) => {
        listeners.push(listener);
        // unsubscribe
        return () => {
            listeners = listeners.filter(l => l !== listener);
        };
    };

    // populate initial state
    dispatch({});

    return { getState, dispatch, subscribe };
  };

[0]: https://egghead.io/lessons/react-redux-implementing-store-fr...
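
To see how the three returned functions fit together, here is a minimal usage sketch. It repeats the store from the comment above so that it runs on its own; the `counter` reducer, its `INCREMENT`/`DECREMENT` action types, and the `seen` array are made up for illustration and are not taken from the course or from redux itself.

```javascript
// The store as posted above, repeated so this snippet is self-contained.
const createStore = (reducer) => {
  let state;
  let listeners = [];

  const getState = () => state;

  const dispatch = (action) => {
    state = reducer(state, action);
    listeners.forEach((listener) => listener());
  };

  const subscribe = (listener) => {
    listeners.push(listener);
    // The returned function removes this listener again.
    return () => {
      listeners = listeners.filter((l) => l !== listener);
    };
  };

  // Populate the initial state by dispatching a dummy action.
  dispatch({});

  return { getState, dispatch, subscribe };
};

// Hypothetical reducer: the default parameter supplies the initial state.
const counter = (state = 0, action) => {
  switch (action.type) {
    case "INCREMENT":
      return state + 1;
    case "DECREMENT":
      return state - 1;
    default:
      return state;
  }
};

const store = createStore(counter);

// Record every state the listener observes.
const seen = [];
const unsubscribe = store.subscribe(() => seen.push(store.getState()));

store.dispatch({ type: "INCREMENT" });
store.dispatch({ type: "INCREMENT" });
unsubscribe();
store.dispatch({ type: "DECREMENT" }); // listener is no longer called

console.log(store.getState(), seen);
```

The listener only learns that something changed and has to call `getState()` itself, which is what keeps the store this small.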

coldnose 7 years ago |

After spending about a month of concerted effort poring over the zlib sources, looking for vulnerabilities, I can say that zlib is the most astonishingly bug-free code I've ever seen. But in the conventional understanding of "code quality", it's pretty bad.

wrasee 7 years ago | |

Julian Storer (JUCE library) did a talk on code quality using zlib as an example. Might be interesting to you if you've not seen it already.

https://www.youtube.com/watch?v=SIAAvv1O7Gg

beefhash 7 years ago |

I'd have to go with Monocypher. It makes very tasteful use of comments, functions and macros to maximize readability and clarity.

https://github.com/LoupVaillant/Monocypher

rileyraver57 7 years ago |

Toybox by Landley (https://github.com/landley/toybox) is probably the best example of a modern C implementation I have ever seen. Surprised no one has mentioned it yet.

ezequiel-garzon 7 years ago |

I admit I don’t have the knowledge to make my own assessment, but I’ve read some downright poetic praise on djb’s work [1], and more than once.

[1] https://cr.yp.to/

Dowwie 7 years ago |

For Rust, many say that the regex crate sets a high standard for excellence: https://github.com/rust-lang/regex

markpapadakis 7 years ago |

I study codebases as a hobby. I highly recommend Seastar, Folly, Aeron and Disruptor, SQLite, PostgreSQL, LMDB, TensorFlow, Hashicorp’s Vault, and the Linux kernel as prime examples of high quality codebases.

chubot 7 years ago |

For clean C++, I like leveldb (a key-value DB library) and re2 (a regex engine). Random files from each of them:

https://github.com/google/leveldb/blob/master/table/table_bu...

https://github.com/google/re2/blob/master/re2/nfa.cc

macco 7 years ago |

I really like the source code of prosemirror:

https://github.com/ProseMirror/

It's not typical JS, but very good nonetheless.

Theodores 7 years ago |

The open source code I know from web development has to be fixed with various hacks - PHP and the frontend javascript that goes with it. Therefore the code I know is not 'highest code quality'. If it was 'highest code quality' then I would not know the code.

Therefore the highest code quality is likely to be in projects where I do not have to go under the hood, e.g. the Chromium project where all contributors are vastly more educated and capable than myself.

eloycoto 7 years ago |

Libuv is one of my favourites https://github.com/libuv/libuv

badminton1 7 years ago |

Good code bases that inspired larger projects: MINIX, KHTML

schaefer 7 years ago |

With respect to the C++ language: there was a book published in 1996, Large-Scale C++ Software Design by John Lakos. He's about to publish the second edition of the book while also expanding its reach to span two volumes.

Anyhow, while we await the publication of that book, John has been working at Bloomberg. Some of the code written there has been published to GitHub [1]. He's also done a five-hour lecture series [2], available on safari-online (paid service), that covers the topics of his book and introduces the open source Bloomberg repo as an example of code written in that style.

I can't offer you a review as I've just found this all myself, but I'll be eagerly studying it along with some of the other items mentioned here.

[1] https://github.com/bloomberg/bde

[2] https://www.safaribooksonline.com/videos/large-scale-c-livel...

epynonymous 7 years ago |

linux kernel, purely the reasoning being that it’s probably one of the most used pieces of software out there, along those lines, probably the kernel libraries and user libraries like libstdc that are a part of it. i dont know how the linux kernel is tested, but i know that production testing of the kernel on different platforms, at large scale is probably the most used open source in the market.

TangoTrotFox 7 years ago |

I would not judge things on aesthetic quality, but simply on results. In general, code faces difficulties that grow exponentially with time, size, and the number of contributors. Millions of lines of code, thousands of contributors, decades of development and it's still at the top of its game? In spite of its complete lack of aesthetic appeal, that's the Linux kernel.

sparkling 7 years ago |

High quality code and one of the best APIs I've ever used: https://github.com/requests/requests

Best source code layout, architecture, maintainability: https://github.com/rg3/youtube-dl

agentultra 7 years ago |

As far as C++ code goes, the Lean Prover is really well maintained: https://github.com/leanprover/lean

I'd also say GHC is quite good.

And Pandoc as well.

I don't think I can compute enough variables to consider the "highest" though... so the aforementioned are only examples of what I think are good.

nojvek 7 years ago |

Redis. I have to say antirez not only is an amazing engineer but from the way the code is written, you can see he is a very clear thinker.

I hold the Redis codebase up as an example of what good C code should be. The OpenCV codebase, on the other hand, is an example of what C code should not be. It is really inconsistent, with quite a bit of unreadable spaghetti.

moneysconcerned 7 years ago |

CVEdetails.com lists the number of (reported) vulnerabilities by year for software projects that have a CVE identifier.

Here's bitcoind: https://www.cvedetails.com/product/22744/Bitcoin-Bitcoind.ht...

rataata_jr 7 years ago |

XMonad window manager written in Haskell.

iso-8859-1 7 years ago | |

What do you think about the GHC sources in comparison?

dhuramas 7 years ago |

I am surprised no one mentioned ScyllaDB (https://github.com/scylladb/scylla). Redis and ScyllaDB have often been pointed out as examples of good codebases to look at for C/C++ devs.

Dawny33 7 years ago |

Gensim : https://github.com/RaRe-Technologies/gensim

[Can't speak for the 'highest' part of the qn, but Gensim upholds very high code quality standards]

kostarelo 7 years ago |

I like Spectrum for both their architecture and code quality. Node.js/JavaScript.

https://github.com/withspectrum/spectrum https://spectrum.chat

nunobrito 7 years ago |

Referred by thousands and available since 2004 without one single bug reported in the last decade: http://users.telenet.be/AphexSoft/

It is not yet on Github.

sv12l 7 years ago |

Pretty sure PostgreSQL will have a place at the top quarter of this page.

numeromancer 7 years ago |

Pari:

http://pari.math.u-bordeaux.fr/git/pari.git

I prefer the early versions, before it was softened up for the vulgo.

cantagi 7 years ago |

GTKmm. GTK uses GObject to implement inheritance between C structs and it's easy to go wrong when extending. GTKmm wraps GTK in C++. It's a joy to use and is safer.

SoylentOrange 7 years ago |

I like the design philosophy behind BoringSSL:

https://boringssl.googlesource.com/boringssl

If some portion of the library is overly complex, look into the use case and delete it wherever possible. It maintains a long-term bound on code complexity, which I quite like.

Edit: a nice explanation on the design philosophy here https://www.imperialviolet.org/2015/10/17/boringssl.html

anuraaga 7 years ago |

I am very lucky that there are too many great open source libraries out there to label one with the "highest" quality.

hysan 7 years ago |

Any React and React Native suggestions?

hmsync 7 years ago |

Spring Framework

1. Elegant structure 2. Strict code style 3. Project size is not too large 4. Have detailed documentation

jankotek 7 years ago |

For Java I would say H2 SQL DB. It is small, compact, packed with features and good abstraction.

tom-jh 7 years ago |

Nobody mentions Android. Any examples of good quality code on Android?

winkdinkerson 7 years ago |

I guess my vote for Matt's Script Archive is going nowhere..

rzvme 7 years ago |

I would suggest Laravel!

cmarschner 7 years ago |

Torch has the best code of DNN libraries I have seen so far.

ddtaylor 7 years ago |

A lot of the KDE source code is well written and maintained.

_pmf_ 7 years ago | |

Qt sources too (which has a lot of overlap in people and mindshare with KDE). Mostly.

praveenster 7 years ago |

zeromq. Both the code and documentation are very good.

halayli 7 years ago |

Postgresql, llvm, Python, sqlite are pretty up there.

anticensor 7 years ago |

Debian is the best with its rigid QA procedures.

aloukissas 7 years ago |

My nomination would go to the chromium project.

mikkelam 7 years ago |

Any iOS Swift/ObjC related projects?

joelbirchler 7 years ago |

Kubernetes is extremely well designed.

novaRom 7 years ago |

Python (official cpython)

vfinn 7 years ago | |

There's a nice introductory lecture series on CPython internals on YouTube that tries to cover how the interpreter works and how Python code maps to bytecode by going through the CPython source: https://www.youtube.com/watch?v=LhadeL7_EIU

unixhero 7 years ago |

LibreOffice

charlysl 7 years ago |

xv6

se7entime 7 years ago |

Linux

lowry 7 years ago |

Lua. It has everything a good C project should have: small size, a simple build system, portability through the simplest constructs rather than ifdefs, and a clear, well-defined scope that no one dares to overstep.

Quenty 7 years ago | |

You can see a mirror of Lua here! https://github.com/lua/lua

javaJake 7 years ago | |

When I used this library, I was impressed with how their design not only kept their own code clean, but made it incredibly intuitive and fun to write clean code on top of their API. Coworkers also looked at that code years later and went out of their way to give positive reviews of Lua.

edoo 7 years ago |

The Linux kernel of course. In userland I have to say the Qt library. I've used a lot of APIs and Qt is always a pleasure to work with.

SmellyGeekBoy 7 years ago | |

I'm a Linux fanboy myself but come on - we're talking about nearly 30 years' worth of commits from thousands (tens of thousands?) of developers.

The only thing I can say is that with this in mind it's actually a lot better than I'd expect - testament to Linus's iron fist, perhaps.

auslander 7 years ago |

openbsd

qualawhat 7 years ago |

Start by defining quality.

moneysconcerned 7 years ago | |

Software quality: https://en.wikipedia.org/wiki/Software_quality

Software metric: https://en.wikipedia.org/wiki/Software_metric

''' Common software measurements include:

- Balanced scorecard
- Bugs per line of code
- Code coverage
- Cohesion
- Comment density
- Connascent software components
- Constructive Cost Model
- Coupling
- Cyclomatic complexity (McCabe's complexity)
- DSQI (design structure quality index)
- Function Points and Automated Function Points, an Object Management Group standard
- Halstead Complexity
- Instruction path length
- Maintainability index
- Number of classes and interfaces
- Number of lines of code
- Number of lines of customer requirements
- Program execution time
- Program load time
- Program size (binary)
- Weighted Micro Function Points
- CISQ automated quality characteristics measures
'''

Category:Software metrics https://en.wikipedia.org/wiki/Category:Software_metrics

greg7mdp 7 years ago | |

It is like porn, you know it when you see it.

gupi 7 years ago |

well, if trolling is permitted, I would say that "Hello World" example has the most exquisite code.

in most cases "Hello World" is open-source, but I still don't know if it can be called a "project"

theboywho 7 years ago |

It's funny to see nobody is even questioning the question.

What does it even hold as a value to be the project of the highest code quality in the world ? How can it exist as a consensus if we can't even agree on best practices ?

If it's for learning purposes, why even look for the ONE project with the HIGHEST quality ? Just go by any GOOD ENOUGH project.

I see this all the time: what's the best editor, the best color scheme, the best font, etc.

How about we just start saying: what's a good enough X for my purpose ?

erpellan 7 years ago | |

Sometimes you need a recipe book, other times you want to lose yourself in a masterpiece.

dredmorbius 7 years ago | |

Popular opinion is a poor test of truth. The rationales offered can be illuminating, however.

I'd actually considered making a similar comment on seeing the question.

hyperpallium 7 years ago |

Just wanted to mention some bias in successful open source projects: they are often structured as a number of similar plug-in pieces, like youtube-dl for different video publishers.

This is great for open source, because you can easily discover and navigate to the part you want, and change it. You might need to understand the plugin interface - or you might not. This flat architecture makes it easy for people to contribute, an important aspect of a successful open source project.

But it's not the ideal architecture for every project. In some cases, a cleverer, harder to understand approach is more elegant, shorter, more efficient, simpler.

Of course... one might argue that ease of understanding is more important than anything else.
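
The flat plug-in layout described above can be sketched roughly as follows. This is only an illustration of the idea, not how youtube-dl is actually written; the `extractors` list, the `matches`/`extract` interface, the `extractInfo` function, and the example hostnames are all hypothetical.

```javascript
// Each plug-in is a small, independent object with the same two-method shape.
// A contributor can add or fix one without reading any of the others.
const extractors = [
  {
    name: "examplevideo",
    matches: (url) => url.hostname === "video.example.com",
    // The id is the path without its leading slash.
    extract: (url) => ({ site: "examplevideo", id: url.pathname.slice(1) }),
  },
  {
    name: "sampletube",
    matches: (url) => url.hostname === "sampletube.example.org",
    // The id is carried in the "v" query parameter.
    extract: (url) => ({ site: "sampletube", id: url.searchParams.get("v") }),
  },
];

// The core only knows the plug-in interface, never the site-specific details.
const extractInfo = (rawUrl) => {
  const url = new URL(rawUrl);
  const plugin = extractors.find((e) => e.matches(url));
  if (!plugin) {
    throw new Error(`no extractor for ${url.hostname}`);
  }
  return plugin.extract(url);
};

console.log(extractInfo("https://video.example.com/abc123"));
console.log(extractInfo("https://sampletube.example.org/watch?v=xyz"));
```

Supporting a new site means appending one object to the list, which is the kind of low barrier to contribution the comment is pointing at.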

leetcrew 7 years ago | |

the only thing more important than understanding is shipping on time. but how are you going to ship on time if you can't understand it?