Project Gutenberg – keeps getting better (gutenberg.org)

1207 points by JSeiko 2 days ago | 275 comments

JSeiko 2 days ago |

Hi! I'm one of the programmers at Gutenberg. We've been improving the site a lot over the past few months (and more is coming!). If you haven't visited the page recently, it's worth checking out again: https://www.gutenberg.org/

svat 2 days ago | |

Have you considered having a detailed version history for each book (etext)? The process of submitting fixes to typos etc in books involves sending an email (https://www.gutenberg.org/help/errata.html) and although the last time I did this (2011) the fixes did get applied reasonably quickly (couple of days), it all felt a bit opaque. The version history could also include the project (usually PGDP correct?) the etext originated from; that way one would be able to compare against the actual page scans.

I have very mixed feelings about Standard Ebooks and would much prefer being able to use Project Gutenberg directly, but one good thing Standard Ebooks does is that every book has an associated git repository (on GitHub), so it's (in principle) possible to see a history of fixes to the text over time.

gluejar 2 days ago | | |

We're using git repos internally to keep history for each book. They existed on GitHub for a while, but our implementation was awkward and too big a project for the volunteer dev team. But it's likely that we'll evolve towards that.

marcprux 2 days ago | | |

> I have very mixed feelings about Standard Ebooks[…]

Why?

JSeiko 2 days ago | | |

I believe our new-ish CEO Eric Hellman actually did some work on something very similar

JSeiko 2 days ago | | |

That's an interesting idea. not a small feat to accomplish though ...

jefurii 2 days ago | |

When I thought about Project Gutenberg I remembered that original brutalist non-design. The current site has been very tastefully updated but looks like it's still very accessible if you turn styles off. Great job!

JSeiko 2 days ago | | |

sadly HN doesn't have a "heart" emoji I could use :D

fsckboy 1 day ago | | |

>When I thought about Project Gutenberg I remembered that original brutalist non-design.

I suppose a printed book, black ink on paper, is "brutalist" and unpleasant to look at?

The text of a book shouldn't be encrusted with formatting; your reader or browser should supply the presentation that you want to see, find appealing, or need (accessibility).

eulerpoolapi 1 day ago | |

The biggest lever: make the reading experience great. https://www.gutenberg.org/cache/epub/245/pg245-images.html is still hard to read: lines are tooo long (macbook), no great way for pagination/remembering where I was, notes

tangledhelix 1 day ago | | |

The ebook editions are very good for this. Most of the e-reader software provides all the amenities (bookmarks, highlighting, notes, control of margins, etc).

SwampertX 1 day ago | | |

Firefox's reader mode works amazingly for these situations.

elch 1 day ago | | |

Lines aren't too long. They look great on all my devices.

Use ⌘ + + until you get the line length you like.

lucb1e 2 days ago | |

Huh that's interesting: 4.5 seconds for the TCP handshake and an additional 9.2 seconds for the TLS handshake. Is this some kind of captcha, since most bots would disconnect before that, so if you complete it once then it knows you're good? (Until the bots catch on of course, but so long as it works it's relatively unintrusive and not discriminatory against uncommon client software (that is, non-Chrome/ium).) The rest of the requests were lightning fast

Edit: welcome to your first comment after 9 years on HN btw, nice to have you here!
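For anyone who wants to reproduce the handshake timings above, here is a minimal sketch that times the TCP connect and the TLS handshake separately. The helper names (`timed`, `measure_handshakes`) are made up for illustration, and the TCP figure as measured here also includes DNS resolution, so treat the numbers as rough.

```python
import socket
import ssl
import time


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def measure_handshakes(host, port=443, timeout=30.0):
    """Return (tcp_seconds, tls_seconds) for one fresh connection.

    Hypothetical helper: tcp_seconds covers DNS lookup plus the TCP connect,
    tls_seconds covers only the TLS handshake on the connected socket.
    """
    raw, tcp_seconds = timed(socket.create_connection, (host, port), timeout=timeout)
    context = ssl.create_default_context()
    try:
        # wrap_socket performs the handshake immediately on a connected socket.
        tls, tls_seconds = timed(context.wrap_socket, raw, server_hostname=host)
    except Exception:
        raw.close()
        raise
    tls.close()
    return tcp_seconds, tls_seconds


if __name__ == "__main__":
    tcp_s, tls_s = measure_handshakes("www.gutenberg.org")
    print(f"TCP connect: {tcp_s:.2f}s, TLS handshake: {tls_s:.2f}s")
```

Running it a few times in a row would show whether the delay only hits the first connection or every one.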

codys 2 days ago | | |

I think their site is just slow, potentially because more people than they are used to are trying to view it.

I was unable to load it initially (got an error from firefox) and had to re-attempt. Still slow if one forces a reload (shift-r, etc, to not use local cache).

JSeiko 2 days ago | | |

we are having occasional lows in page speed performance due to LARGE amounts of bot traffic. full disclosure - we've not really been able to resolve this fully/well. Let us know if you have a good idea for how to deal with it

gluejar 1 day ago | | |

Traffic yesterday was ~20% above the recent average: 4,971,601 sessions; 177 robots; 863,462 robot files; 3,390,115 user files; 20.30% robot files (robots id'd based on requests per IP address). We run 5 Apache servers for static content and 1 CherryPy server for dynamic content, hosted at iBiblio.
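The 20.30% figure appears to be robot files as a share of all files served (robot plus user). Below is a quick check of that arithmetic, followed by a sketch of what identifying robots by requests per IP address could look like. The `flag_robots` function and its threshold are purely hypothetical; the thread does not say how PG actually does it.

```python
from collections import Counter

# Figures quoted in the comment above.
robot_files = 863_462
user_files = 3_390_115

robot_share = robot_files / (robot_files + user_files)
print(f"{robot_share:.2%}")  # prints 20.30%


def flag_robots(request_ips, max_requests_per_ip=1000):
    """Hypothetical heuristic: treat any IP above the request threshold as a robot.

    request_ips is an iterable with one IP string per request seen in a log.
    """
    counts = Counter(request_ips)
    return {ip for ip, n in counts.items() if n > max_requests_per_ip}
```

A fixed threshold like this is easy to evade by spreading requests over many addresses, which may be part of why the problem is hard to solve fully.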

0x0203 2 days ago | |

As long as you're taking suggestions, since many of the books are quite old, adding a publication date or date range to the search functionality might be nice. I personally would find it very useful since I have a tendency to look for things that are older than year _x_ when researching various things.

Thanks for all the effort put into the site!

gluejar 1 day ago | | |

only 20% of our books have original publication data in the db. We have a project to add another 40% or so from another database, let us know if you want to help.

Guestmodinfo 2 days ago | |

Hi, for the past 20 years I have known about Project Gutenberg and I used to read a lot from it. One of the obstacles that I face is that there is no way to arrange the books in the order of their original publication. Do you know of any such way? Surely we can arrange the books by their release date on Gutenberg, but it has long baffled me as it feels to me the most useless way of sorting the books. Thank you for Project Gutenberg.

gluejar 1 day ago | | |

only 20% of our books have original publication data in the db. We have a project to add another 40% or so from another database, let us know if you want to help.

Falimonda 2 days ago | |

The book list elements on the front page render as both horizontally and vertically scrollable divs on mobile - seems like an opportunity for improvement.

Keep up the good work!

JSeiko 2 days ago | | |

good feedback thanks! Doing an iteration on the homepage design is actually pretty high on the priority list. will keep your feedback in mind!

xrd 2 days ago | |

Thank you for your work. This site is an international treasure.

windowliker 1 day ago | |

FWIW I absolutely love how 'no-frills' PG is compared to so much of the bloated, over-engineered, script-riddled web these days. Please don't ever change that!

excitednumber 2 days ago | |

Thank you for being one of the best places on the internet

zamadatix 2 days ago | |

Thanks for the free work! Project Gutenberg is nice to have :).

On the site I noticed the library boxes have roughly a single extra line, causing a scrollbar to appear and the last line to be chopped off (https://i.imgur.com/PQ8T0qc.png). Is there an issues/bug portal to properly submit these kinds of things?

JSeiko 2 days ago | | |

you can open an Issue at https://github.com/gutenbergtools/gutenbergsite

smallnix 2 days ago | |

There's a minor bug with Chrome on Android where the menu will not close when you tap outside the menu or on the menu link/button

JSeiko 2 days ago | | |

I've messaged the guy who's best suited to fixing this. He'll be on it this weekend

JSeiko 2 days ago | | |

will open an "Issue" for it

ExtremisAndy 2 days ago | |

Oh, my! This does look nice. Thank you for your hard work!

JSeiko 2 days ago | | |

Thanks! We're currently working on a design update for the individual book pages. Should be online soon (next 1-2 weeks or so)

freedomben 2 days ago | |

I can't say for project Gutenberg specifically, but in general a huge issue I see is OCR errors. What do you all do to address OCR?

gluejar 2 days ago | | |

Check out Distributed Proofreaders: https://pgdp.net

lapetitejort 2 days ago | | |

I uploaded a PDF to archive.org that auto-OCRs with plenty of mistakes. I have found no way of updating the entire stack of documents produced. I wonder if Project Gutenberg is similar

shuvrojit 2 days ago | |

Great Work. Thank you. I'm also a programmer. If you are ever short on help, let me know. I would love to contribute.

JSeiko 2 days ago | | |

https://github.com/gutenbergtools

autocat3 and gutenbergsite are repos responsible for generating gutenberg.org

8bitsrule 2 days ago | |

Great project. Are many of the books in a format that can easily be converted into audio? Is there a way to search for them, and information on what software your readers find useful for this purpose?

(Note: A lot of print media these days has switched to far-too-small font sizes. Less of a problem for (zoomable) digital media, but for many that's still a barrier.)

tangledhelix 1 day ago | | |

There are many books available as audio, some are human-read, some were automated. You can see lists here:

human-read: https://www.gutenberg.org/browse/categories/1

computer-generated: https://www.gutenberg.org/browse/categories/2

IIRC many of the human-generated ones come from LibriVox, many of the computer-generated ones came from a collaboration with Microsoft.

OfflineSergio 1 day ago | | |

For the Audio part, I suggest https://desktop.with.audio

TimorousBestie 2 days ago | |

Wanna let you know you’re doing great work and you have my dream job, thanks to the team for everything!

JSeiko 2 days ago | | |

it's not my day job. PG is open-source. I'm "just" a contributor

BiraIgnacio 2 days ago | |

Thanks so much for the work you and your team do!

Jiro 1 day ago | |

I don't know what the status of this is today, but a number of years ago my biggest complaint about Gutenberg was that a lot of books had images added back when low resolution images were the standard, so you have a ton of books with image resolutions from the year 2000.

samwho 1 day ago | |

Looking really good! Great work.

shevy-java 1 day ago | |

There should be more books at Gutenberg.

Also by the way I just searched for 3d printing and found nothing. Either there are no books, or the search query makes things too complicated, IMO.

robin_reala 1 day ago | | |

Gutenberg is nearly all books that have lapsed into the US public domain by dint of being published 95+ years in the past. Which broadly explains why you hit nothing for 3d printing.

tangledhelix 1 day ago | | |

As another commenter said PG is almost all books from 95+ years in the past due to copyright law in the US. We partner with a sister organization, the World Library Foundation, who have a self-publishing portal for modern works by authors who wish to put their own work in the public domain. You might want to look there for more modern material. https://self.gutenberg.org

samcollins 2 days ago | |

Very cool! Do you have a recommended way for an agent to see an index of the books and epub links?

(I can’t quite tell if that’s an egregious abuse of the site or you’re perfectly fine to share without human eye balls hitting your www?)

jzs 2 days ago | | |

Now, I'm not associated with Gutenberg in any form, but they do have a page for offline consumption:

https://www.gutenberg.org/ebooks/offline_catalogs.html

Perhaps you can find the information you are looking for there.

However, if you plan on scraping or otherwise hitting them with a ton of traffic, at least consider donating a good amount for the traffic you cause them. It ain't free after all.

samcollins 2 days ago | | |

Thanks for the answers! Found it:

> All Project Gutenberg metadata are available digitally in the XML/RDF format. This is updated daily (other than the legacy format mentioned below). Please use one of these files as input to a database or other tools you may be developing, instead of crawling or roboting the website.

And strongly consider a donation! (My addition)

https://www.gutenberg.org/ebooks/offline_catalogs.html#the-p...

kay_o 2 days ago | | |

Check out https://www.gutenberg.org/ebooks/offline_catalogs.html

Don't hit the site with an agent. The section furthest toward the bottom is machine readable.

gluejar 2 days ago | | |

if what you want is all the text, please use the tarball or data files at https://www.gutenberg.org/cache/epub/feeds
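As a rough sketch of working from a downloaded catalog instead of crawling: the code below assumes you have saved a CSV catalog from the feeds directory locally and that it has `Text#`, `Type`, `Title`, and `Language` columns. Those column names are an assumption on my part, so check them against the real file. The sample rows are invented for illustration, and the landing-page URL just follows the `/ebooks/<id>` pattern seen elsewhere in this thread.

```python
import csv
import io


def books_in_language(catalog_csv_text, language="en"):
    """Yield (book_id, title, landing_page_url) for text entries in one language.

    Assumes columns named Text#, Type, Title, and Language (unverified).
    """
    reader = csv.DictReader(io.StringIO(catalog_csv_text))
    for row in reader:
        if row.get("Type") == "Text" and row.get("Language") == language:
            book_id = row["Text#"]
            yield book_id, row["Title"], f"https://www.gutenberg.org/ebooks/{book_id}"


# Invented sample rows, only to show the shape of the input.
sample = (
    "Text#,Type,Title,Language\n"
    "1,Text,Hypothetical English Title,en\n"
    "2,Text,Titre hypothétique,fr\n"
    "3,Sound,Hypothetical Audio Title,en\n"
)

for book_id, title, url in books_in_language(sample):
    print(book_id, title, url)
```

With a real catalog you would read the file from disk once and pass its contents in, so the site only sees a single download.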

JSeiko 2 days ago | | |

not yet, but that's not a bad idea imo. Dealing with AI crawler traffic is definitely a challenge if that's what you were referring to.

dredmorbius 2 days ago | | |

Possibly the ZIMs are of interest: <https://ebookfoundation.org/openzim.html> (via: <https://news.ycombinator.com/item?id=48152200>).

throw0101c 2 days ago |

While PG has probably gotten a lot of use and growth with the growth/mainstreaming of the Internet since the 1990s, (TIL) it started back in 1971:

> Michael S. Hart began Project Gutenberg in 1971 with the digitization of the United States Declaration of Independence. Hart, a student at the University of Illinois, obtained access to a Xerox Sigma V mainframe computer in the university's Materials Research Lab. […] This computer was one of the 15 nodes on ARPANET, the computer network that would become the Internet. Hart believed one day the general public would be able to access computers and decided to make works of literature available in electronic form for free. […]

* https://en.wikipedia.org/wiki/Project_Gutenberg

aksss 2 days ago | |

"Project Gutenberg began in 1971 when Michael Hart was given an operator’s account with $100,000,000 of computer time in it by the operators of the Xerox Sigma V mainframe at the Materials Research Lab at the University of Illinois."

https://www.gutenberg.org/about/background/history_and_philo...

gluejar 2 days ago | |

wikipedians, please help update this article.

svat 2 days ago | | |

In what way? And from what sources? (Wikipedia as a tertiary source is supposed to be a summary of information present in reliable secondary sources — see for instance https://en.wikipedia.org/wiki/Wikipedia:Based_upon. So if the information on the Wikipedia article is incomplete or out of date, where is the correct information available?)

mcdonje 2 days ago | |

Prescient

drummojg 2 days ago |

The best thing I ever did for my father was to buy him a kindle and an access point and show him how to use Project Gutenberg to get books. He loved the old writings (he being a GED holder who was in the Navy during Korea yet had read the entire Harvard Classics). He had a special rolled up towel he used to prop it on his lap in his favorite chair and he read and read and read. When he passed he was reading "Legends of the Jews" from 1931.

I had some small e-correspondence with Michael S. Hart back in the 90's as well, and made a few modest contributions to the project, which made my English major undergraduate heart swell with pride and joy.

I guess this is only to say that PG is special to me for these reasons, and I am glad to see it still thriving. <3

JSeiko 2 days ago | |

this is so great to hear! Distributed Proofreaders (the org that actually does transcriptions) is still looking for volunteers should you feel the urge/inclination :) https://www.pgdp.net

j_bum 2 days ago | |

This was very touching, thanks for sharing. Sorry for your loss.

Someone1234 2 days ago |

I'm surprised no eBook Reader vendor has a Project Gutenberg "Store." Where you can just browse Gutenberg, find a book, and just grab it down to the reader. Instead, they either are actively hostile (Kindle), or require the use of Calibre (which itself is good, it is just the friction).

cosmos0072 2 days ago |

From Italy, https://www.gutenberg.org/ gives a 404 error and https://gutenberg.org/ opens a very official-looking page stating "police notice. This site is under judicial seizure" and references a sentence number: "criminal proceedings 52127/20 R.N.R.I. tribunal of Rome"

Any idea what's happening? I thought PG published public domain books...

cosmos0072 2 days ago | |

Found: it's a sentence from 2020, and PG decided not to appeal (!?)

Full story (in Italian) at https://www.wired.it/internet/web/2020/06/30/progetto-gutenb...

charonn0 2 days ago | | |

Seems like a case for HTTP 451 (Unavailable for Legal Reasons) rather than 404.

johndough 2 days ago | | |

It looks like the issue was that, in Italy, copyright expires 70 years after the death of the author or the first translator of a work.

gluejar 1 day ago | |

A silly legal tribunal confused PG with pirate sites. We sent the tribunal a letter pointing out their error but it was ignored. The block was served on local dns providers so many Italian users evade the block by using DNS from Google or Cloudflare.

dgellow 2 days ago | |

It was also blocked in Germany for a while due to a court order https://cand.pglaf.org/germany/index.html

tangledhelix 2 days ago | | |

The Alfred Döblin books are still blocked in Germany (for a couple more years).

JSeiko 2 days ago | |

I asked Claude to research the background story: "In May 2020, the Court of Rome ordered Italian ISPs to seize/block a list of domains as part of a criminal case (the 52127/20 R.N.R. you're seeing) targeting sites and Telegram channels distributing pirated newspapers and magazines. 28 domains were on the list, and Project Gutenberg got thrown in alongside the actual pirate sites."

apparently this situation hasn't been resolved yet

gluejar 2 days ago |

Nice to see so much appreciation for what we do. (I'm the new-ish executive director.) Any wikipedians reading this, the article about PG is... aging. Last I looked, it said we offered Plucker files. @Jseiko has done some nice work.

britta 1 day ago | |

FYI, I took Plucker out of the lead in November, after a PG volunteer recommended that update on the article talk page. Plucker is currently only mentioned in a sentence about formats offered in 2009.

Happy to make other updates! Writing specific notes on the talk page is helpful.

ssgodderidge 2 days ago |

Looks like the top downloaded book yesterday[0] was Concrete Construction: Methods and Costs by Gillette and Hill.[1] Beat out Moby Dick, Count of Monte Cristo, Frankenstein, Romeo and Juliet, and others.

> 23644 downloads in the last 30 days.

I wonder if this is bot behavior? 23k downloads feels like a lot?

[0] https://www.gutenberg.org/browse/scores/top

[1] https://www.gutenberg.org/ebooks/24855

sovietswag 2 days ago | |

Haha well there is an exciting movie about concrete coming out, “The History of Concrete” by John Wilson. Surely the superfans are studying up

tmoertel 2 days ago | |

For context, here is the first paragraph of the book's preface:

How best to perform construction work and what it will cost for materials, labor, plant and general expenses are matters of vital interest to engineers and contractors. This book is a treatise on the methods and cost of concrete construction. No attempt has been made to present the subject of cement testing which is already covered by Mr. W. Purves Taylor's excellent book, nor to discuss the physical properties of cements and concrete, as they are discussed by Falk and by Sabin, nor to consider reinforced concrete design as do Turneaure and Maurer or Buel and Hill, nor to present a general treatise on cements, mortars and concrete construction like that of Reid or of Taylor and Thompson. On the contrary, the authors have handled the subject of concrete construction solely from the viewpoint of the builder of concrete structures. By doing this they have been able to crowd a great amount of detailed information on methods and costs of concrete construction into a volume of moderate size.

nout 1 day ago | | |

I ... now want to read this book.

JSeiko 1 day ago | | |

exciting :)

JSeiko 2 days ago | |

bot traffic would be my guess too. I doubt there was a sudden global spike in interest in "Concrete Construction Methods" :D

why_at 2 days ago | |

It's got better reviews on Goodreads than Moby Dick too. I know what I'm reading next

thangalin 2 days ago |

Project Gutenberg is a treasure trove, though many technical details defy automatic typesetting of its books. Standard Ebooks takes consistency to an unbelievable level. My post compares various sources of public domain books with an eye on typesetting:

https://dave.autonoma.ca/blog/2020/04/11/project-gutenberg-p...

fmajid 2 days ago |

Worth mentioning the Project Gutenberg ZIMs. You can download the entire English Gutenberg corpus for about 60GB (English Wikipedia ZIM complete with images is ~120GB):

https://ebookfoundation.org/openzim.html

cxr 1 day ago | |

Like the Project Gutenberg collection on archive.org, the ZIMs are only current up to 2018.

kreyenborgi 2 days ago |

Gutenberg is awesome. There is also

https://www.fadedpage.com/ from Canada I think

https://runeberg.org/ from Sweden

Arcorann 2 days ago | |

Don't forget Wikisource! https://en.wikisource.org/wiki/Main_Page

carlosjobim 2 days ago |

Their feeds of new books are a goldmine:

https://www.gutenberg.org/ebooks/feeds.html

Every day you'll get much more than you're bargaining for, right into your feed or inbox. Easily download the books you're interested in and put them on your Kindle.

WillAdams 2 days ago | |

I used to use the Online Books Page new books listing similarly:

https://onlinebooks.library.upenn.edu/new.html

smilespray 2 days ago |

I remember printing out project Gutenberg books in the mid-90s, four regular pages to an A4 page, double-sided on my inkjet. I had a background in typography, so I made it work.

And yes, the text needed a lot of processing to make it right.

Now, in my early fifties and with declining eyesight, that's out of reach.

Thanks for sticking with the project!

JSeiko 2 days ago | |

that's cool! one of my "pet-ideas" is actually to make an AI-agent that does all that typographical work for any PG book to make it nicely printable without any manual labor whatsoever. Maybe that's doable now ...

smilespray 2 days ago | | |

That is doable. Most of my work was regexp and repetitive stuff. And the typography stuff is achievable with the current state of the art models. Not that I remember what I did, it was 30 years ago.
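To give a flavor of the "regexp and repetitive stuff", here is a minimal sketch of two typical cleanup passes on a plain-text book: joining hard-wrapped lines back into paragraphs, and turning `_underscore_` emphasis into italic markup. Both function names are made up, and the underscore convention is an assumption about how a given text marks italics; this is not a reconstruction of what was actually done back then.

```python
import re


def unwrap_paragraphs(text):
    """Join hard-wrapped lines; one or more blank lines separate paragraphs."""
    paragraphs = re.split(r"\n\s*\n", text.strip())
    return [" ".join(line.strip() for line in p.splitlines()) for p in paragraphs]


def mark_italics(paragraph):
    """Assumed convention: _word_ marks italics in the plain text; emit <i> tags."""
    return re.sub(r"_(.+?)_", r"<i>\1</i>", paragraph)


raw = "It was the best\nof times,\n\nit was the _worst_\nof times."
for paragraph in unwrap_paragraphs(raw):
    print(mark_italics(paragraph))
```

Real books need many more special cases (verse, tables, chapter headings), which is where the repetitive part comes in.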

ndr42 2 days ago |

The project was geo-blocked in Germany for a long time: https://news.ycombinator.com/item?id=29024039

tangledhelix 1 day ago | |

One author remains blocked in Germany (but only for a couple more years)...

JSeiko 2 days ago | |

very glad this has been resolved (I'm from Germany myself)

debo_ 2 days ago | |

Project Gesperrtberg

litreads 1 day ago |

Deeply grateful for Project Gutenberg & LibriVox! I've been using the text to force-align LibriVox recordings to produce word-by-word synced audiobooks; first stage of this project is a YouTube channel but I could definitely turn this into a mobile reader app if there's interest: https://www.youtube.com/@LitReadsEditions

aymenfurter 1 day ago |

PG is proof that the best things on the internet are still built by people who just care about the mission.

GeorgeTirebiter 1 day ago | |

Paul Graham? ;-)

JKCalhoun 2 days ago |

Project Gutenberg had (has?) a tendency toward plaintext that always put me off. (And it has been over a decade I'm sure since I explored the site—so I am no doubt now misinformed.)

I like a styled formatted book—would prefer PDFs. (I know, not a popular format apparently.)

I like the idea of Project Gutenberg but guess I found book scans on archive.org my preference.

My go-to example is Lewis Carroll's "Through the Looking Glass" with the fantastic art of John Tenniel and Carroll's sometimes creative formatting of the prose…

I see they (Project Gutenberg) have ePub now, which can be good if well done.

(If not well done it can be a kind of mess. Re-flowable "HTML", paginated… Anyone ever try to print a long web page and did you enjoy the result? Perhaps that is as much on the ePub reader though.)

cold_tom 2 days ago |

Project Gutenberg feels like the opposite of modern internet design philosophy. Quiet, useful, accessible, and built to last.

tomjen3 2 days ago | |

Project Gutenberg is awesome and amazing.

I was visiting the ruins of a monastery the other day, and one of the texts listed that it had a library of 320ish books.

I chuckled because I have almost 200 books in my personal Kindle library, but I was wrong. I actually have 75000+ books, thanks to Project Gutenberg.

I just haven't downloaded them all yet.

RattlesnakeJake 2 days ago |

As a Kindle user, I still miss the old version of the site. The new one looks great on normal desktop, but the old one was simple enough to load and directly download books on the device's built-in browser.

JSeiko 2 days ago | |

That's interesting. What about the new design prevents you from doing it? Genuinely asking here. We may fix it if it's actionable

RattlesnakeJake 2 days ago | | |

And now it's time to put my foot in my mouth. I haven't used it in a while because it was frustrating, but you guys seem to have already fixed it :)

The previous version of the site had two major flaws:

1. The search bar had been removed from the top of the page, and hidden behind a "Click here to search" (or similar) link partway down the page

2. Once you opened that page, the coloring of the site was so washed out on e-ink that the text input was hard to find.

Thanks for fixing it!

bitigchi 2 days ago | | |

Maybe include a "Lite" version that only displays text/links? No to minimal styling would be great!

graemep 2 days ago | |

Is that a Kindle issue?

You can download books in most browsers. I know Amazon have done things to make life difficult for other stores in the past.

RattlesnakeJake 2 days ago | | |

I'd call it one of those middle-ground things:

• On the one hand, E Ink devices have a fairly known set of limitations, and it would be ridiculous for me to expect them to render the whole web well.

• On the other hand, it's good for website designs to consider the kind of devices employed by their users. Using a Kindle to access Gutenberg is likely less of an edge case than it would be for other sites, so it's worth the extra design work.

(Keep in mind that -- given my sibling comment -- this is all theoretical. The latest iteration of Gutenberg's site is much better than the previous version)

vwkd 1 day ago |

Not sure if this is the right place, but the new layout of the German Projekt Gutenberg is missing any download links. For example

https://projekt-gutenberg.org/authors/johann-wolfgang-von-go...

seizethecheese 2 days ago |

A big pet peeve of mine with Project Gutenberg was the lack of mobile styling. Looks like it’s been fixed! Awesome.

JSeiko 2 days ago | |

good to hear - that was a lot of work!

mowmiatlas 2 days ago |

Made an app that allows reading PG books as audiobooks on iPhone https://loudreader.io/

JSeiko 2 days ago | |

that's cool!

OfflineSergio 1 day ago | |

if the data doesn't leave my phone, why is it a subscription?

aronhegedus 2 days ago |

Recently downloaded Moby Dick from here:) very easy to use

JSeiko 2 days ago | |

Moby Dick is consistently one of the Top Downloads

autoexec 2 days ago |

I love how usable the site is even with JS disabled!

oidar 2 days ago |

I'm slightly curious how PG handles heavily illustrated books. I've downloaded some years ago, and the quality of the illustrations was always pretty poor. Has it been improved lately? What's the QA like for illustrations?

gluejar 2 days ago | |

Nowadays we depend on scans from Internet Archive, Hathitrust, and other sources. Some scans are better than others. Bear in mind that our illustrations need to be in the public domain and usually from the same edition as the text. https://www.gutenberg.org/help/errata.html

Myzel394 2 days ago |

I wonder if the people behind project Gutenberg use Anna's Archive or mam for books that can't be put on Gutenberg.

AndrewStephens 2 days ago |

PG remains one of the best things on the internet. The amount of fascinating material almost beggars belief.

JSeiko 2 days ago | |

the amount of weird/interesting stuff that one would find nowhere else is possibly the coolest aspect of PG imo

alexdesouza 1 day ago |

Project Gutenberg is the best. Kudos to the team and to the 1000s of years of humans who developed it!

kgwxd 2 days ago |

How did "Concrete Construction: Methods and Costs" come to be the #1 download?

JSeiko 2 days ago | |

good question. first thought - maybe some bot has downloaded it often for whatever reasons and our systems didn't detect it as bot traffic. just a guess.

elias1233 2 days ago |

I thought this was for the Wordpress Gutenberg Editor for a second

gluejar 1 day ago | |

I should hit Matt up for a donation.

timonoko 1 day ago |

Needs "translate" buttons. Right now it's a little too cumbersome for most:

https://www-gutenberg-org.translate.goog/cache/epub/64099/pg...

zahirbmirza 1 day ago |

Is Project Gutenberg ever going to add PDF download options?

gluejar 1 day ago | |

later this year

zahirbmirza 1 day ago | | |

Amazing!!! As ereaders get faster and with colour, this could make books from the Project even more attractive. I love the work of your team. Thank you.

greenie_beans 2 days ago |

my first ever coding project was making a chrome extension that made the typography better on the html formats: https://github.com/smcalilly/gutenberg-typography

JSeiko 2 days ago | |

nice!

jwpapi 2 days ago |

Please give me some book recommendations :)

JSeiko 2 days ago | |

Flatland: https://www.gutenberg.org/ebooks/search/?query=flatland

I've heard good things. Also - Sherlock Holmes :)

klondike_klive 2 days ago | |

Not a recommendation per se but I used to use Amphetype on Gutenberg texts to practise touch-typing. There's something about writing out a book that hits differently to reading it. You skip less, odd parts stick with you. I think the last one I tried was The Island of Dr Moreau.

jwpapi 2 days ago | | |

Ulnar Nerve Entrapment :/

BaseBaal 2 days ago | |

From the newest releases page I stumbled into "Some Nigerian fertility cults" by Percy Amaury Talbot & am enjoying it so far.

https://www.gutenberg.org/ebooks/78684

bryankaplan 2 days ago |

I find it interesting that the context of this comments page apparently overrides the normal definition of “PG” on HN.

JSeiko 2 days ago | | |

personally I'm a fan of the other "PG" as well.

marcellocurto 1 day ago |

one of the last good websites on the web...

1vuio0pswjnm7 1 day ago |

Text files are still the best

Good job

jdthedisciple 1 day ago |

I wonder how extensive the overlap is with sacred-texts.com

cpill 1 day ago |

I love PG... but the covers stink. Should have a public competition to have new ones made and voted on. I'm willing to vibe code a website to make it happen if you're willing...

oxag3n 2 days ago |

Is there a plan to extend search to book content?

tangledhelix 1 day ago | |

Since the books are available on the site as text and HTML the search engines index them already for you. Try searching for the below; it should take you to the book you expect as the first result:

site:gutenberg.org "it was the best of times"

JSeiko 2 days ago | |

not that I know of ...

shevy-java 1 day ago |

All the books should be there. I understand that current society has restrictions, what with near infinite copyright and other shenanigans - but I don't see any of these as reason to hide information from mankind. Eventually we'll free all the information. Remuneration will have to occur in other ways than the current status quo.

JSeiko 1 day ago | |

hopefully!

benj111 1 day ago |

Project Gutenberg was my first introduction to the FOSS ethos. Well I suppose there was Wikipedia, but Project Gutenberg really spoke to me. This was probably around 2003? So I'm glad to see it still going strong.

I just looked at the history (https://www.gutenberg.org/cache/epub/60600/pg60600-images.ht...) and it dates back to the 70s. There was me thinking it was some newfangled web thing.

monegator 2 days ago |

I keep getting PR_CONNECT_RESET_ERROR

JSeiko 2 days ago | |

just heard back that the server provider has been doing a security update. Maybe you were one of the users that got unlucky as a result... maybe try later if still interested

JSeiko 2 days ago | |

I've reported it.

mentalgear 2 days ago |

Keep up the awesome work !

taubek 2 days ago |

Thank you for reminding me about this project. Didn’t visit it in a long time.

gwerbret 2 days ago |

I love Project Gutenberg, don't get me wrong... but frankly, Anna's is better.

TFNA 1 day ago | |

I came here to post something similar. PG is perhaps still important as an archive of proofread OCRed public-domain material, but for ordinary people, the shadow libraries have vastly more stuff. After all, readers don’t want their reading to be limited to what was published before a copyright cutoff date many decades ago.

JSeiko 2 days ago | |

in which way? (genuine question)

gwerbret 2 days ago | | |

Well, mainly in the fact that Anna's has several orders of magnitude more books, and includes research publications and more, ah, contemporary materials to boot.

solarity_studio 2 days ago |

Awesome

derekhdawson 2 days ago |

If you like Project Gutenberg, the closest analog for music is IMSLP, the Petrucci Music Library (imslp.org) — over 855,000 public-domain scores maintained by volunteers, with the same labor-of-love energy and the same perpetual scan-quality and copyright-jurisdiction headaches. Same ethos of "the works belong to humanity, not a storefront." Worth a bookmark for the musicians on HN.

brcmthrowaway 2 days ago |

I can't read anymore due to fear of not being productive with AI

JSeiko 2 days ago | |

maybe there's a way to read more productively using AI: https://x.com/karpathy/status/1990577951671509438

could be a trick to ease that fear :D

zozbot234 2 days ago | | |

I've found that the larger open-weight AI models do a great job of explaining the old non-fiction content on PG, particularly magazine articles which are a good size for the AI to handle. It breaks down the long wall-of-text paragraphs for you and explains all the historically relevant background that would've been assumed to be known back in the day.

If you ask it to assess the relevance of the text in the present day it will also do that very nicely, highlighting the places where the text shows old-fashioned viewpoints that would be sharply criticized today.