State of Text Rendering 2024 (behdad.org)
Did anyone stop to consider if this is really necessary? The author makes it sound like he has used his influence steadily over the years to make fonts more complicated. “In year X, I proposed that fonts be able to do Y, because why not?” I get that text shaping is so complex that, in terms of open source, there is just HarfBuzz. I’m not an expert in this area. But I don’t think it’s a good thing if “font standards” are constantly getting new features, like web standards, and font renderers are like mini browser engines, where the sheer scope and number of features and rate of new features keeps everyone using the same codebases.
But the real kicker was emojis, which threw a real spanner in the works. Prior to this, text rendering had been universally monochrome, but we really had to add color then.
It's really about being inclusive. Writing (historically) was always something very analog, varying wildly between people, with all sorts of unbelievably arcane rules. Tech is just finally catching up with 5,000 years of history.
Will FreeType and HarfBuzz remain supported as C/C++ projects long-term, I wonder? Asking as someone who depends on these and doesn't want to introduce a dependency on a Rust compiler :)
Anecdotally, I notice a lot of game developers avoid FreeType and HarfBuzz entirely, instead opting for much worse text rendering in the form of stb_truetype.h only (Dear ImGui uses this, for example) - which 'is nice because it is a single header C file' but sucks with international languages; many people use SDFs for similar reasons.
I think the proposed move to WASM fonts, if done right, could make it easier to reduce the amount of code people need to render fonts (if the WASM font does the heavy lifting, a small C program could render it) and alleviate this trend of people not using a good text rendering stack.
I have been hacking on a Rust program recently and I am using FreeType and HarfBuzz via FFI because the Rust packages he names don't appear to be mature yet.
You can link against pre-built .dll/.so/.dylib from your C++ code base.
Interpreting OpenType is very complex.
Have you tried Observable? Their online notebook has live team editing built in, and the option to comment on cells, as well as fork and merge notebooks with suggested edits. However, it uses markdown instead of a WYSIWYG editor (although I did create some tagged template wrappers for djot and markdeep as possible alternatives). On the plus side it's really easy to write interactive demos!
They're kind of pivoting their product at the moment though, so I'm not sure how easy it is to get into the notebook part these days.
[0] https://observablehq.com/@jobleonard/djot
[1] https://observablehq.com/@jobleonard/wrapping-markdeep-into-...
I really hope they succeed because I'm a big fan of both Observable notebooks and the new Observable Framework (and their Observable Plot library is pretty good too. Really, they have tons of good stuff).
But basically, they now have two products, as mentioned in the "overview" section of their docs[2].
[0] https://observablehq.com/framework/
> Finally, I proposed that the future of font shaping and drawing/ painting will involve a fully-programmable paradigm.
> Two tongue-in-cheek mis-uses of the HarfBuzz Wasm shaper appeared recently: the Bad Apple font, and llama.ttf. Check them out!
See... the thing about solving problems is that eventually you realize that any kind of problem can be solved by simply making a platform that can execute arbitrary code.
...and you see it again and again.
Declarative compile definition in make? Why not just make it code and use cmake?
Declarative infra in terraform? Why not just make it code and use pulumi?
Declarative ui? Why not just make it code and use javascript?
Database that holds data? Why not just use stored procedures and do it in code?
The list just goes on and on, and every time you have to roll back to a few important questions:
- Is the domain so complicated you really need arbitrary code execution?
- Are you inventing a new programming language for it?
- How will people program and test for the target?
- How will you manage the security sandbox around it?
It's just such a persuasive idea; can't figure out how to deal with the complexity in configuration? Make it code and push the problem to the user instead!
Sometimes it makes sense. Sometimes it works out.
To be fair, I get it, yes, font layouts are harder than they look, and yes, using WASM as a target solves some of those issues, but I look at llama.ttf and I really have to pause and go...
...really? Does my font need to be able to implement an LLM?
I don't really think it does.
...and even if the problem is really so complex that rendering glyphs requires arbitrary code (I'm not convinced, but I can see the argument), I think you should be doing this in shaders, which already exist, already have a security sandbox and already have an ecosystem of tooling around it.
I think inventing a new programmable font thing is basically inventing a new less functional form of shaders, and it's a decision that everyone involved will come to regret...
Maybe I'll put RiscOS on a Raspberry Pi at the same time, which (IIRC) had one of the first antialiased font rendering engines ever.
(I do have some old Macs running currently, and weirdly enough still prefer some of the old "blurry" font renderings to a lot of modern ones, at least on regular displays)
As the main author of rustybuzz I'm surprised to hear this. If you need a text shaper, rustybuzz is mostly a drop-in replacement for harfbuzz.
Text shaping and TrueType parsing are hard problems, but Rust does not make them more complicated. Quite the opposite. In fact, rustybuzz is written in 100% safe Rust. I would even go further and say that Rust is the only sane option for solving text-related tasks.
But I think the power of legacy and a bigger community is not to be underestimated.
> An experimental engine Servo originally launched by Mozilla as a successor to Gecko, is implemented in Rust. It was eventually abandoned by Mozilla when Mozilla Corporation announced laying off a quarter of its staff in 2020 and transferred to The Linux Foundation, then in 2023 to TLF Europe. While still experimental, it has been under active development again since 2023. Servo currently uses Rust bindings to Harfbuzz.
I'd say Servo is going pretty well. The git repository is fairly active and the monthly updates on their blog paint a positive picture of the rate of progress. I try the engine out roughly monthly after the blog posts drop, and when I last did a few days ago I was impressed to see a lot of my most used websites being displayed correctly. At this rate, I think it could become viable much sooner than we think. However, the project is still critically underfunded, currently only getting US$2,229 a month according to their website.
> [It is ironic indeed, that a text about text rendering, is presented in such an inaccessible and badly-typed environment. This is a Google Docs Preview page. I am still yet to find a solution that provides the same features (collaboration, commenting, live edits) and is presented better. Suggestions are appreciated.]
PS: That is in Firefox. In Chrome it uses what appears to be a bitmap font, which is much worse.
I got annoyed that the page hijacked right click and key navigation, so I wanted to print to PDF — which didn't work. Chrome printed a single blank page. Firefox managed to print, but also only a single page, and when zoomed in the font got interpolated (= blurry), instead of being more readable.
It’s very easy to see when FreeType is used because it just looks off in a few but significant ways. I’ve used it with and without HarfBuzz. DirectWrite has been a joy by comparison.
Here's a quick picture from a few days ago: https://imgur.com/a/GLohlj1
I can zoom in somewhat only if I press Ctrl and + quickly enough, many times in succession, but then it is easy to overshoot.
I can't count how many times I've seen simple code get turned into a hideously complex declarative language that has serious gaps.
Simple UI library code? Turn it into a custom declarative syntax that is now limited!
Simple build system that works and is debuggable? Turn it into a declarative syntax that can't be debugged and can't handle all the edge cases!
And so on and so forth.
I will admit that the idea of a font programming language sounds genuinely awful to me. So I don't really disagree with your premise. But I'm increasingly annoyed with declarative systems when vanilla code is often simpler, more flexible, and more powerful (by necessity). :)
Explain to me exactly why, other than 'I guess someone already implemented some kind of basic version of it', you would need custom CPU code rendering glyphs instead of a shader rendering SDFs, like literally everyone does with shaders already?
It's not a good solution. It's a bad, easy solution.
We have a solution for running arbitrary GPU accelerated graphics instructions; it has a cross-platform version with WebGPU.
This font thing... looks a lot like 'not invented here' syndrome to me, as an uninvolved spectator.
Why would you choose not to use GPU acceleration to render your glyphs?
What 'arbitrary code' does a font need to run that couldn't be implemented in a shader?
Maybe the horse has already bolted, yes, I understand programmable fonts already exist... but geez, it's incomprehensible to me, at least from what I can see.
And you could add backdoors to hacked fonts that are activated by magic spells. Isn't it great.
Shaping is different from rendering the glyphs themselves. SDF renderers (and other GPU text renderers like Slug) still do shaping on the CPU, not in shaders. Maybe some experiments have been done in this area, but I doubt anyone shapes text directly on the GPU in practice.
Think of it like a function that takes text as input, and returns positions as output. Shaders don't really know anything about text. Sure you could probably implement it if you wanted to, but why would you? I think it would add complexity for no benefit (not even performance).
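To make the "text in, positions out" framing concrete, here is a minimal sketch of a toy shaper. Everything in it is hypothetical: the glyph names, advance widths, ligature table and kerning table are made up for illustration, and a real shaper such as HarfBuzz works on glyph IDs and OpenType tables and handles far more (scripts, bidi, mark positioning).

```python
from dataclasses import dataclass

# Hypothetical toy "font": all names and numbers are invented for illustration.
ADVANCES = {"f": 300, "i": 250, "fi": 520, "A": 650, "V": 640, "a": 500}
LIGATURES = {("f", "i"): "fi"}   # replace two glyphs with a single one
KERNING = {("A", "V"): -80}      # tighten specific neighbouring pairs


@dataclass
class PositionedGlyph:
    glyph: str    # glyph name (a real shaper returns glyph IDs)
    x: int        # pen position of the glyph, in font units
    advance: int  # how far the pen moves after this glyph


def shape(text: str) -> list[PositionedGlyph]:
    # Step 1: map characters to glyphs, applying ligature substitution.
    glyphs = []
    i = 0
    while i < len(text):
        pair = (text[i], text[i + 1]) if i + 1 < len(text) else None
        if pair in LIGATURES:
            glyphs.append(LIGATURES[pair])
            i += 2
        else:
            glyphs.append(text[i])
            i += 1

    # Step 2: walk the glyphs in order, accumulating the pen position.
    # Each position depends on every glyph before it.
    out = []
    pen = 0
    for idx, g in enumerate(glyphs):
        advance = ADVANCES[g]
        if idx + 1 < len(glyphs):
            advance += KERNING.get((g, glyphs[idx + 1]), 0)
        out.append(PositionedGlyph(g, pen, advance))
        pen += advance
    return out
```

Calling `shape("fiAVa")` yields four glyphs rather than five, because "f" and "i" merge into one ligature glyph, and the "V" sits 80 units closer to the "A" than the plain advances would put it. Both steps are context-dependent and sequential, which is part of why this work usually stays on the CPU.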
> kerning, ligature replacement, combining diacritical mark placement, and character composition. Slug also supports a number of OpenType features that include stylistic alternates, small caps, oldstyle figures, subscripts, superscripts, case-sensitive punctuation, and fractions.
Probably still uses the CPU.
1) Because SDFs suck badly (and don't cover the whole field) when you want to render sharp text. SDFs are fine when used in a game where everything is mapped to textures and is in motion at weird angles. SDFs are not fine in a static document which is rendered precisely in 2D.
2) Because GPUs handle "conditional" anything like crap. GPUs can apply a zillion computations as long as those computations apply to everything. The moment you want some of those computations to only apply to these things, GPUs fall over in a heap. Every "if" statement wipes out half your throughput.
3) Because "text rendering" is multiple problems all smashed together. Text rendering is vector graphics--taking outlines and rendering them to a pixmap. Text rendering is shaping--taking text and a font and generating outlines. Text rendering is interactive--taking text and putting a selection or caret on it. None of these things parallelize well except maybe vector rendering.
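Point 1 above can be illustrated with a small numeric sketch. It assumes a plain single-channel SDF sampled at texel centers and reconstructed with bilinear filtering; the shape (an infinite corner) and the function names are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
import math


def corner_sdf(x: float, y: float) -> float:
    """Exact signed distance to a shape filling the quadrant x <= 0, y <= 0.

    Negative inside, positive outside; the sharp corner sits at (0, 0).
    """
    if x > 0 and y > 0:
        return math.hypot(x, y)  # nearest boundary point is the corner itself
    return max(x, y)             # nearest boundary point is on one of the edges


def bilinear(s00, s10, s01, s11, tx, ty):
    """Bilinear interpolation between four neighbouring texel samples."""
    top = s00 * (1 - tx) + s10 * tx
    bottom = s01 * (1 - tx) + s11 * tx
    return top * (1 - ty) + bottom * ty


def reconstructed_at_corner(texel: float = 1.0) -> float:
    """Distance a bilinear-filtered SDF texture reports at the true corner.

    The four surrounding texel centers are half a texel away on each axis.
    """
    h = texel / 2
    return bilinear(
        corner_sdf(-h, -h), corner_sdf(h, -h),
        corner_sdf(-h, h), corner_sdf(h, h),
        0.5, 0.5,
    )
```

The exact distance at the corner is 0, but the filtered texture reports roughly 0.3 texels "outside", so a renderer that thresholds at 0 cuts the corner off. The error shrinks with the texel size, which is why this matters less for textured game text than for a static document rendered precisely in 2D.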
Be really specific.
What exactly is it that you can't do in a shader, that you can do in a CPU based sandbox, better and faster?
(There are things, sure, like IO, networking, shared memory but I'm struggling to see why you would want any of them in this context)
I'll accept the answer 'well, maybe you want to render fonts on a toaster with no GPU'; sure... but that having a GPU isn't good enough for you? Yeah... nah. I'm not buying that.
But again, this has nothing to do with HarfBuzz or wasm.
Even if you already have a GPU renderer for glyphs and any other vector data, you still want to know where to actually position the glyphs. And since this is highly dependent on the text itself and your application state (that lies on the CPU), it would actually be pretty difficult to do it directly on the GPU. The shader that you would want should emit positions, but the code to do that won't be easily ported to the GPU. Working with text is not really what shaders are meant for.
No part of the rasterizer or renderer is configurable here. As mentioned above, the rasterizer is already programmable with up to two different bespoke stack bytecode languages, but that has nothing to do with shaping through wasm.
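For readers unfamiliar with what a "stack bytecode language" for glyphs looks like, here is a toy sketch. The operator names (`rmoveto`, `rlineto`, `endchar`) are borrowed from Type 2 charstrings, but the token format is a simplification invented for this example: real charstrings are binary-encoded and have many more operators, and this is not a parser for any real font format.

```python
def run_glyph_program(tokens):
    """Interpret a toy stack-based outline program.

    Numbers are pushed onto a stack; each operator consumes the whole stack.
    Returns a list of contours, each a list of absolute (x, y) points.
    """
    stack = []
    x = y = 0
    contours = []
    for tok in tokens:
        if isinstance(tok, (int, float)):
            stack.append(tok)
        elif tok == "rmoveto":
            # Relative move: starts a new contour at the new pen position.
            dx, dy = stack
            x, y = x + dx, y + dy
            contours.append([(x, y)])
            stack.clear()
        elif tok == "rlineto":
            # One or more relative line segments, given as (dx, dy) pairs.
            for dx, dy in zip(stack[0::2], stack[1::2]):
                x, y = x + dx, y + dy
                contours[-1].append((x, y))
            stack.clear()
        elif tok == "endchar":
            break
        else:
            raise ValueError(f"unknown operator: {tok!r}")
    return contours


# A rectangle 300 units wide and 500 tall, starting at x = 100.
program = [100, 0, "rmoveto", 0, 500, 300, 0, 0, -500, "rlineto", "endchar"]
outline = run_glyph_program(program)
```

Here the font data is a small program that the rasterizer executes to obtain the outline, which is separate from shaping, where the question is which glyphs to use and where to put them.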
However, the article clearly states there are intentions to move towards much more than just shaping in wasm:
> I proposed that the future of font shaping and drawing/ painting will involve a fully-programmable paradigm.
> Bad Apple will become much easier and faster when we introduce the draw API in Wasm.
> Drawing and painting API will eventually come to HarfBuzz, probably in 2025.
Slug builds acceleration structures ahead of time. The structures are overfit to the algorithm in a way that TTF should be but which is economical for video games. That doesn't seem like an interesting concern, and nothing about it is specific to the GPU.