"Switch to haskell".
If not serious: it’s still low effort, but while it is framed as a zinger, it’s not funny at all. I don’t even understand what the humor might be, maybe it’s serious after all.
Switch to a language/runtime that's not only had virtual threads for decades, but also a saner synchronisation model (transactions) rather than synchronized blocks.
It's better to hide the locking primitives and let the runtime handle it for you safely.
Other languages adding virtual threads later in life don't have the same ability to feel preemptive. Although I think someone said Java has a nice trick or two?
Anyway, if all the virtual threads seem preemptive, you won't have the case that your limited number of actual threads are waiting on locks and not yielding --- all Erlang processes yield eventually; usually in a fairly short time frame.
Contrast this with languages like Java where every object is a potential concurrency problem. Or the 10+ years of trying to make Python async (see Twisted).
Also, this is more of a user error, than a fundamental issue.
I personally ran into this using the built-in com.sun webserver with a virtual thread executor. My VPS only has two CPUs, which means the FJP that virtual threads run on only has 2 active threads at a time. I ran into this hang when some of the connections hung, blocking any further requests from being processed.
Those who run into this issue and are unable or unwilling to do the work to avoid it (replacing synchronized with j.u.c locks) as explained in the adoption guide [1] may want to wait until the issue is resolved in the JDK.
I would strongly recommend that anyone adopting virtual threads read the adoption guide.
[1]: https://docs.oracle.com/en/java/javase/21/core/virtual-threa...
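As a rough illustration of the workaround mentioned above (replacing synchronized with a j.u.c lock), here is a minimal sketch. The `GuardedCounter` class and its methods are hypothetical names made up for this example; they are not taken from the adoption guide.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical example class, used only to show the shape of the change.
class GuardedCounter {
    private final ReentrantLock lock = new ReentrantLock();
    private long value;

    // Before: synchronized (this) { return ++value; }
    long increment() {
        lock.lock();           // acquire before entering the try block
        try {
            return ++value;
        } finally {
            lock.unlock();     // released even if the body throws
        }
    }

    long get() {
        lock.lock();
        try {
            return value;
        } finally {
            lock.unlock();
        }
    }
}
```

A virtual thread that has to wait for a ReentrantLock parks instead of holding on to its carrier, which is the reason the guide suggests this swap for frequently used or long-held locks.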
The problem is that it's rare to write code which uses no third-party libraries, and these third-party libraries (most written before Java virtual threads ever existed) have a good chance of using "synchronized" instead of other kinds of locks; and "synchronized" can be more robust than other kinds of locks (no risk of forgetting to release the lock, and on older JVMs, no risk of an out-of-memory while within the lock implementation breaking things), so people can prefer to use it whenever possible.
To me, this is a deal breaker; it makes it too risky to use virtual threads in most cases. It's better to wait for a newer Java LTS which can unmount virtual threads on "synchronized" blocks before starting to use it.
Let's ship this with a foot gun, but let's not mention in the JEP that it may hang - let them figure it out.
Virtual threads are nice for unblocking legacy code, but they aren't without issues. There are better options for new code on the JVM with fewer trade-offs as well. I've recently been experimenting with jasync-postgresql (there's a MySQL variant as well) as an alternative to JDBC in Kotlin. It's a nice library. It does have some limitations and is a bit on the primitive side. But it appears to be somewhat widely used in various database frameworks for Scala, Java, and Kotlin.
Databases and database frameworks are an area on the JVM where there is a huge amount of legacy code built on threads and blocking IO. That's probably one of the reasons Oracle worked on virtual threads, as migrating away from these frameworks is unlikely to ever happen in a lot of code bases. So, waving a magic wand and making all that code non-blocking is very attractive. But of course that magic has some hard limitations, and synchronized blocks are one of those. I imagine they are working on improving that further.
The designers of Project Loom would say the exact opposite. The whole push behind Project Loom and similar models (Go's oft-praised "goroutines" runtimes being another one) is motivated by Threads being a much better fit for async behavior in a fundamentally procedural language like Java or Go than promise-based frameworks like async/await.
The whole motivation of Project Loom is to make the simple thing (spawning threads to handle blocking IO) the fast thing as well (by actually replacing the blocking IO with efficient async IO OS calls and managing the threads internally). Project Loom will be considered a full success if the next generation Java web server does something akin to "new Thread(() -> { executeHandlerFunc(conn); }).start();" for each incoming connection, just like the Go built-in web server.
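For what it's worth, the thread-per-connection style described above can be sketched with the Java 21 API roughly like this. `PerTaskDemo`, `handle` and `serve` are hypothetical names for illustration, and the sleep merely stands in for blocking IO.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class PerTaskDemo {
    // Hypothetical stand-in for a blocking request handler.
    static String handle(int connectionId) {
        try {
            Thread.sleep(10);  // simulated blocking IO; a virtual thread unmounts here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "handled " + connectionId;
    }

    // Starts one virtual thread per "connection" and collects the results.
    static List<String> serve(int connections) {
        List<String> results = new CopyOnWriteArrayList<>();
        // close() at the end of the try block waits for all submitted tasks
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < connections; i++) {
                final int id = i;
                executor.submit(() -> { results.add(handle(id)); });
            }
        }
        return results;
    }
}
```

Calling `PerTaskDemo.serve(1000)` starts a thousand virtual threads without any pooling, which is the usage the comment has in mind.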
The system property jdk.tracePinnedThreads triggers a stack trace when a thread blocks while pinned. Running with -Djdk.tracePinnedThreads=full prints a complete stack trace when a thread blocks while pinned, highlighting native frames and frames holding monitors. Running with -Djdk.tracePinnedThreads=short limits the output to just the problematic frames.
https://github.com/jasync-sql/jasync-sql/wiki/API-Overview
From project WIKI (https://github.com/jasync-sql/jasync-sql/wiki)
Virtual threads aren't necessarily faster; you still have just as many sockets and network connections as before. You can easily spawn 5000 platform threads, and if that's not enough, there are quite a few user-space implementations of fibers/coroutines/async etc. on the JVM that can deal with many outstanding requests (Cats/ZIO in Scala, Kotlin coroutines, the Play framework, concurrent.Future, etc.)
"Don't replace platform/native threads with virtual ones, replace tasks (without further explanation) instead"?!
Combine that with the fact that they chose to implement the scheduler in Java instead of C(++) and you're set for performance problems.
Remember that NIO took from 1.5 to 1.7 to be usable/performant and that was native!
Edit: Finally figured out why: https://news.ycombinator.com/item?id=39010648
The JDK has historically used some native implementations in its stdlib (zip, imageio and others), back when the runtime wasn't as fast as it is today. But today's runtime would often be faster in Java than those native implementations.
Ah yes, the argument from the 1990s. It would make sense to understand where the JVM and its compiler are these days before making incorrect statements about performance.
From your link:
> Blocking network TCP IO needs a synchronized block to work
This is utterly false.
I have always had to do synchronized(something) { socketInputStream.read(); }
And the dude himself says that reading from a socket is a problem if you listen to the interview.
> There are two scenarios in which a virtual thread cannot be unmounted during blocking operations because it is pinned to its carrier:
> When it executes code inside a synchronized block or method
Isn't 'synchronized' effectively sugar for taking a kind of lock? Why can't it be treated uniformly by the scheduler?
The thread is not unmounted only when one calls a blocking operation from within synchronized, e.g. `synchronized (...) {blockingQueue.take()}`. Note that this is not a sane coding practice: it calls a potentially long operation from within synchronized. The blockingQueue.take() does not need to be wrapped in synchronized; it has synchronization inside and plays well with virtual threads. Only when it is wrapped in synchronized can the current implementation not unmount the virtual thread.
The JDK team works to remove quirks like pinning in the future versions.
However, it's built directly into the JVM specification, so it's difficult to change while keeping compatibility, whereas j.u.c.Locks is just a library. In other words, they can't change the semantics of synchronized, so they created j.u.c.Locks as a replacement.
Actually, it is trivial to change. Just embed a ReentrantLock into every object and rewrite all calls to "synchronized"/"Object.wait" to use that lock.
Unfortunately, this would result in a bit of performance regression (increasing per-object memory footprint). To solve that would require turning ReentrantLock into a magical intrinsic, fully integrated with lock bytes in the object header. Which is actually not that hard either — other runtimes like Golang or Android VM solve problems like this on daily basis. Oracle, however…
There are numerous recommendations such as
https://learn.microsoft.com/en-us/archive/msdn-magazine/2015...
The final phrase is "I hope these techniques will help you adopt async into your existing applications in a way that works best for you."
The summary merely stated that Java virtual threads are great. I expected a summary of the problem and solution, for example something like:
When using Java 21 virtual threads, you can end up starved of carrier threads because all carrier threads are waiting on an exhausted resource pool, with no thread available to free such resources. The solution is to wrap those resources in a virtual-thread-aware object. In our case, we solved our problem by wrapping connections in semaphores.
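To make the semaphore idea concrete, here is a minimal sketch of how access to a limited resource might be guarded. `BoundedResource` and `withPermit` are hypothetical names, the permit count is arbitrary, and this is not the article's actual code.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Hypothetical wrapper: limits how many callers use a resource at once.
class BoundedResource {
    private final Semaphore permits;

    BoundedResource(int maxConcurrent) {
        this.permits = new Semaphore(maxConcurrent);
    }

    // Runs the action while holding one permit.
    <T> T withPermit(Supplier<T> action) {
        // A virtual thread waiting here parks and frees its carrier,
        // rather than blocking inside library code that uses synchronized.
        permits.acquireUninterruptibly();
        try {
            return action.get();
        } finally {
            permits.release();
        }
    }

    int available() {
        return permits.availablePermits();
    }
}
```

The assumption is that the permit count is kept at or below the size of the underlying pool, so no virtual thread ever reaches the pool's own synchronized wait while the pool is exhausted.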
TLS (especially mutual TLS) and Oauth also join this club.
public void syncMethod() {
    synchronized(lock) {
        // some code
    }
}
they could translate to
public void syncMethod() {
    await reentrantLockAsync.lockAsync();
    try {
        await somecodeAsync();
    } finally {
        await reentrantLockAsync.unlockAsync();
    }
}
The second issue is they're not completely equivalent. In the second case, you'd need extra memory for the `reentrantLock`, while `synchronized` works with any object. Furthermore, if you need to use `wait/notify`, then there needs to be an extra `Condition` object to use in combination with the `ReentrantLock`. For sure, developers can rewrite most `synchronized` to use `ReentrantLock` and `Condition`, but javac won't do it automatically for you.
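A manual rewrite of `wait/notify` along those lines might look like the following sketch. `Mailbox` is a hypothetical single-slot holder invented for this example; it uses `awaitUninterruptibly()` only to keep the sketch short, where real code would more likely use `await()` and handle interruption.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical single-slot mailbox using Lock + Condition instead of a monitor.
class Mailbox<T> {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition notEmpty = lock.newCondition();  // the extra object mentioned above
    private T item;

    void put(T value) {
        lock.lock();
        try {
            item = value;
            notEmpty.signalAll();              // takes the place of notifyAll()
        } finally {
            lock.unlock();
        }
    }

    T take() {
        lock.lock();
        try {
            while (item == null) {             // loop guards against spurious wakeups
                notEmpty.awaitUninterruptibly();  // takes the place of wait()
            }
            T result = item;
            item = null;
            return result;
        } finally {
            lock.unlock();
        }
    }
}
```

The extra `lock` and `notEmpty` fields are exactly the per-object memory cost the comment refers to.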
C# introduced:
await foreach (int item in RangeAsync(10, 3))
    Console.Write(item + " "); // Prints 10 11 12
So you don't have to type:
IAsyncEnumerator<int> e = RangeAsync(10, 3).GetAsyncEnumerator();
try {
    while (await e.MoveNextAsync()) Console.Write(e.Current + " ");
} finally {
    if (e != null) await e.DisposeAsync();
}
- jdk.virtualThreadScheduler.maxPoolSize=10
But of course, when you have thousands of Virtual Threads all deliberately pinning the carrier thread, you quickly run out.
Anyway it's pretty simple really. A generic thread is a bunch of stack frames (with their associated local variables). A standard OS thread is under the control of the kernel scheduler, which decides whether the thread runs and makes progress or not. The VirtualThread in Java is just a thread which is not directly mapped to the OS thread scheduler but exists as a user space object that can be scheduled by a (Java implemented) scheduler. It's basically just a call stack with its local variables, but one that only steps forward when an OS thread of the scheduler decides to step it.
Usually threads were also used for long-running tasks that are IO-bound rather than CPU-intensive. It's now recommended to use virtual threads for such scenarios.
> Why two sentences? Maybe you should ask ChatGPT if you want explanations with specific length requirements.
As you can see, people can do it better. I put a limit on it because I didn't want an explanation of what threads are, just of the difference.
> we present a case study on how we encountered a deadlock with virtual threads in TPC-C for PostgreSQL, even without the dining philosophers problem.
I guess it was a clunky non-example!
(I was hoping to see a virtual thread solution to compare to:
https://www.adit.io/posts/2013-05-15-Locks,-Actors,-And-STM-In-Pictures.html
https://www.youtube.com/watch?v=aQXgW55f7cg
https://hackage.haskell.org/package/stm
)
The deadlock was a usage error.
A better title would be: Naively switching to Java virtual threads caused a deadlock in TPC-C for PostgreSQL.
I enjoyed this writeup by Michael Lynch on finding an illustrator [1], for their blog. In doing some of my own writing, I've really found it enlightening how much secondary work goes into publishing your own work. I often think its so nice to be able to _just_ plug in what I want on a site and get a (more or less) free illustration. But as someone selling their own work / time, it feels wrong. I'd rather pay a real human and build a relationship and have something more quality. On the other hand, though, it can be expensive, time consuming, and I've been screwed over. Often it seems like a bigger risk than its worth.
So idk, you're trading some hardship and risk for an ethical dilemma but ease of use.
Also, a thumbnail tip: square thumbnails are bad. If you have to use a square 1024x1024 AI generation, crop it to something like 1024x575, which incidentally can make things difficult if using AI generation since figuring out what to crop requires human intervention.
I don’t know how good they are, but people have trained models on that problem. Googling “autocrop tool” gives me multiple options.
To each their own, though, of course.
At least they could try and generate something where I can't see malformed bodies within seconds. Or create a nice diagram that actually adds something to the text.
No.
Generated or hand drawn, they're kinda a wasted effort on a technical post.
Stock images used to hide this "quality" better than I thought.
Maybe they have no artistic ability of their own? Maybe they just aren't good at finding the kinds of images (that can be freely used without infringing on anyone's copyright) that they need?
If it were me, and the guidance was "never use AI generated images in your blog post", I would probably just not use any images at all. Which I guess for some people would probably be best. But personally I prefer walls of text to be broken up by... something.
"to the trained eye you can already see that every single ai generated image is a picture of the same thing"
`synchronized` pins the thread only when, from within the `synchronized` block, the program calls a blocking operation that would normally unmount the virtual thread, like blockingQueue.take() or similar (which is not a sane coding practice). That's because unmounting, as it's implemented today, does not work well with synchronized.
It's better if people read JEP 444 than rely on forum comments, to avoid being misinformed.
Speaking of long-running: even without synchronized, long-running code keeps the native thread occupied until some blocking operation is called. So an endless loop that does not call a virtual-thread-ready blocking operation will occupy the native thread forever.
Java virtual threads are a kind of cooperative multithreading: another virtual thread only gets a chance to kick in when the current virtual thread reaches specific blocking operations. This is in contrast to preemptive multithreading with native threads.
So I agree with your conclusion. Virtual threads can not (yet?) be blindly used as a drop-in replacement of native threads for existing code. And the new code needs to take their specifics into account.
BTW, another method I discovered to block the native carrier thread that executes a virtual thread is to call blocking reading through FileInputStream, for example reading from the console. The FileInputStream does not implement virtual thread parking at all (yet?).
Go started without preemption and added it later. The Java team has indicated a similar path, so we might see that tackled in the future. I think they could do that using safe points or JEP 312‘s handshakes, so it’s not infeasible.
For file IO they wanted to explore io_uring, and they might need to add a Loom-friendly resolver for JEP 418. There is just so much left, like scalable timers, that I think it’s going to be a long time until VTs will be a good default choice.
To get the whole context, so virtual threads are unusable?
What holds a monitor by default and is there a workaround?
Found more:
A virtual thread cannot be unmounted during blocking operations when it is pinned to its carrier. A virtual thread is pinned in the following situations:
The virtual thread runs code inside a synchronized block or method
The virtual thread runs a native method or a foreign function (see Foreign Function and Memory API)
For those that don't know what this means: Blocking network TCP IO needs a synchronized block to work = you can't use virtual threads for networking. I wish they had formulated it like that from the start! At least now we know what they meant by "don't use virtual threads for anything but tasks" <- not blocking IO with synchronization!
So for now manual NIO is still the king of the hill.
We are reaching peak humanity levels of complexity!
That’s not true — blocking TCP IO is not implemented as blocking under the hood - that’s the whole point of virtual threads, so your conclusion is faulty.
A monitor will pin the VT to the carrier thread. That can have surprising incompatibility in the current jdk. Soon these footguns will be fixed and you can use them worry free.
https://www.reddit.com/r/java/comments/1512xuo/virtual_threa...
https://mail.openjdk.org/pipermail/loom-dev/2023-July/005993...
And they are mistaken to call this situation a "pinning"
JEP 444:
> The vast majority of blocking operations in the JDK will unmount the virtual thread, freeing its carrier and the underlying OS thread to take on new work. However, some blocking operations in the JDK do not unmount the virtual thread, and thus block both its carrier and the underlying OS thread. This is because of limitations at either the OS level (e.g., many filesystem operations) or the JDK level (e.g., Object.wait()). The implementations of these blocking operations compensate for the capture of the OS thread by temporarily expanding the parallelism of the scheduler. Consequently, the number of platform threads in the scheduler's ForkJoinPool may temporarily exceed the number of available processors. The maximum number of platform threads available to the scheduler can be tuned with the system property jdk.virtualThreadScheduler.maxPoolSize.
(In my testing the default ForkJoinPool limit was 256)
So theoretically they could have extended the jdk.virtualThreadScheduler.maxPoolSize to a number sufficient for the use case. Although their workaround with semaphores is probably more reliable - no need to guess the sufficient number.
The situation with Object.wait() is not what JEP 444 calls "pinning". The "pinning" happens, for example, when one calls `synchronized(....) {blockingQueue.take()}`, which is not sane coding, BTW. In this case the native thread is blocked and is not compensated by another thread, which is much worse than the Object.wait() case. The number of native threads that run virtual threads is equal to the number of CPUs by default, so "pinning" immediately makes one CPU unavailable to the virtual threads of the application.
All those issues are temporary, as I understand it. The JDK team is working to fix Object.wait(), synchronized, etc.
To call Object.wait() you need to own the object's monitor, which would imply that your code would actually look like `synchronized(....) {Object.wait()}`, in which case you would indeed be pinned.
If it doesn't spawn threads when all of them are blocked, that seems kinda dumb. And a severe change in semantics. It can be conservative and try running unpinned ones on fewer threads and shuffle them around and slowly spawn more to ensure eventual progress, which would mean a possibly significant optimization problem, but a hard cap impacts correctness.
If you have lots of blocking I/O (meaning: waiting for things happening on other threads or processes, which offers scheduling opportunities), use virtual threads. If you compute or call native code, keep using platform threads.
The issue with synchronized is eventually going to be resolved. But long-running computations (sorting, parsing, number crunching, etc) or native calls must also in the future be offloaded to an ExecutorService with platform threads.
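A minimal sketch of that split, assuming a hypothetical `CpuOffload` class: the request runs on a virtual thread, while the computation is handed to a fixed pool of platform threads. The pool is created per call only to keep the example self-contained; a real application would presumably share one long-lived pool.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class CpuOffload {
    // Stand-in for a long-running computation (hypothetical).
    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += (long) i * i;
        }
        return total;
    }

    static long computeViaPlatformPool(int n) {
        try (ExecutorService cpuPool =
                 Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
             ExecutorService virtual = Executors.newVirtualThreadPerTaskExecutor()) {
            // CPU-bound work runs on a platform thread from the fixed pool.
            Callable<Long> cpuTask = () -> sumOfSquares(n);
            // The virtual thread only waits on the Future, so it can unmount.
            Callable<Long> request = () -> cpuPool.submit(cpuTask).get();
            return virtual.submit(request).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }
}
```

While the platform thread computes, the virtual thread is parked in `Future.get()`, so its carrier stays free for other virtual threads.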
You've got some virtual threads that encounter this code,
synchronized(foo) {
    foo.wait();
}
And some other virtual threads that are in charge of awaking the waiters,
synchronized(foo) {
    operation();
    foo.notify();
}
This is a classic approach to the producer/consumer pattern in Java.
If operation() can do a virtual thread suspend, then it's possible for the producer to be suspended and relinquish the platform thread, which the scheduler then reuses for a consumer that gets blocked on Object.wait. If this happens enough, you can end up with all the platform threads blocked, and no threads available to make progress on the producer.
The problem is that Object.wait doesn't unmount the virtual thread, which is a pretty major foot gun that I think the JDK team would have liked to avoid, but it was too hard to implement correctly in the current JDK's codebase.
"Please use the original title, unless it is misleading or linkbait; don't editorialize." - https://news.ycombinator.com/newsguidelines.html
My understanding is there is work to make synchronized not pin the carrier thread, but that's some pretty complex and important code to change.
https://docs.oracle.com/en/java/javase/21/core/virtual-threa...
> The problem is that this synchronized code might be deeply embedded within the libraries you use. In our case, it was within the c3p0 library. So, the fix is straightforward: we simply wrapped the connection with a java.util.concurrent.Semaphore.
I bet if you just checked out connections and slept a random amount of time you’d have the same problem.
Virtual threads changed the contract a little bit. Now one virtual thread running certain code can prevent a different virtual thread from ever getting any cycles even though they are not dependent on each other in the Java code. It’s a side effect of the current Java implementation.
The rules changed, and it tripped up c3p0. Unless they explicitly said somewhere that they were completely ready for virtual threads I’m not sure anyone is at fault here.
As you indicate, the complexity lies in not burning too many bridges with existing users and use cases. This is something that Android regularly does and which Go never really had to do due to its shorter history and up-front design.
Synchronized in this context is pretty nonsensical.
But then again, why couldn't scheduler detect a deadlock? Go has a system in place that, in case of total program deadlock, prints out an error message with all goroutines' stack traces, and stops the program. Perhaps Virtual Thread Scheduler could do the same thing?
But then again, Java also allows for native threads to run in parallel to Virtual Threads, which makes it impossible to detect whether there's a deadlock, and not just virtual threads waiting on a native thread.
I suppose this is a very good example why simple is better than complex.
Moving monitors into Java is not a good solution, like the long solution they are working on.
Java should be the API not the implementation!
In the past I’ve used lots of screenshots which seems to work well.
Where I have used images I have cut and pasted and used things like canva but nothing has ever really ended up as I would have liked it.
Edit: why are some commenters on HN so literal minded anyway? This is free form chat not code specs. You could have read "2 sentences" as "concise", you know...
> Does your page design improve when you replace every image with William Howard Taft?
> If so, then, maybe all those images aren’t adding a lot to your article. At the very least, leave Taft there! You just admitted it looks better.
That's different from blocking functions, described in the quote, that does not even try to unmount virtual thread. Like Object.wait().
Pinning is worse than those functions, because the functions compensate for a blocked native thread by adding one more native thread to the pool.
I haven't professionally written Java in years; however, from what I remember, synchronized was considered evil from day one. You can't forget to release it, but you'd better go out of your way to allocate an internal object just for locking, because you have no control over who else might synchronize on your object, and at that point you are only a bit of syntactic sugar away from a lock.lock(); try { ... } finally { lock.unlock(); }.
There's an additional benefit to using the built in monitors, and that has to do with heap allocation. The data structure for managing it is allocated lazily, only when contention is actually encountered. This means that "synchronized" can be used as a relatively low cost defensive coding practice in case an object which isn't intended to be used by multiple threads actually is.
I guess I might have preferred if both Java and .NET had chosen to use a dedicated mutex object instead of hanging the whole thing off of just any old instance of Object. But that would have its own downsides, and the designers might have good reason to decide that they were worse. Not being able to just reuse an existing object, for example, would increase heap allocations and the number of pointers to juggle, which might seriously limit the performance of multithreaded code that uses a very fine-grained locking scheme.
> Pinning does not make an application incorrect, but it might hinder its scalability.
The documentation is wrong.
In the end it's just 2x 40kb
If there's no work-stealing from pinned carriers (or they're low-finite and normal threads are effectively infinite): yes that'd be a HUGE issue. I would be shocked if they released anything with that limitation though, that would violate some of the core expectations of mutexes and threads - independent ones need to make progress or nearly all patterns can't guarantee progress.
So yeah I can see that starving rather quickly, particularly with benchmarking-like workloads. Synchronized is very very common, 256 concurrent calls really doesn't seem all that abnormal.
If that were raised to like max-int32 would things be fine, semantically? That'd mimic real threads limits (no jvm limit at all afaict).
Correct you can't steal the carrier thread from an Object.wait() waiting virtual thread. This is apparently in the pipeline but it is a pretty major limitation.
Most cases of synchronized/notify/wait should probably use concurrent collections instead (as message queues) so in greenfield code it's not that big of a deal. Virtual threads make writing consumers/producers using collections way easier too.
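As a sketch of what that can look like, here is a producer and a consumer running on virtual threads and handing items over through a `BlockingQueue`. `QueueHandoff` and the queue capacity of 4 are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class QueueHandoff {
    // Sends 0..count-1 from a producer to a consumer and returns what arrived.
    static List<Integer> run(int count) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(4);
        List<Integer> received = new ArrayList<>();  // touched only by the consumer

        Thread producer = Thread.ofVirtual().start(() -> {
            try {
                for (int i = 0; i < count; i++) {
                    queue.put(i);               // parks the virtual thread when full
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread consumer = Thread.ofVirtual().start(() -> {
            try {
                for (int i = 0; i < count; i++) {
                    received.add(queue.take()); // parks the virtual thread when empty
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        try {
            producer.join();
            consumer.join();                    // join makes 'received' safe to read
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return received;
    }
}
```

No monitor is held while waiting, so neither thread stays pinned to a carrier when the queue is full or empty.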
Sadly, most Java projects are not greenfield projects.
I mean stealing other virtual threads from the pinned carrier thread (except for the one pinning it) so they can make progress. Normal work-stealing stuff - the queue(thread) is blocked(pinned), so process that task(virtual thread) in a different queue(thread).
It makes sense that a pinned thread remains pinned with the virtual thread that pinned it.
The 256 default carrier thread limit is going to frequently be a problem though, yeah. That's more than enough to cause all this, and it's a pretty crazy default imo.
The default thumbnails in lieu of your own aren't good.
This doesn't mean you need a giant hero header, or an AI generated image, or even any images in your posts at all.
"Making clicks more likely" is a terrible measure of genuine value.
There are lots of images which will make people click, even if once they see your page they click 'Back' a second later. Our metrics are broken if we continue to attribute that click as 'success'.
Genuine value, to who? For the author, getting more clicks is probably of "genuine value", depending on their goals for their writing. But seems most people are not writing and publishing stuff today because they think it provides value to others, but because they think it'll provide value to themselves somehow.
The handful of languages I know either do not have a top level object class that supports a randomized set of features ( C++ ) or prioritize a completely different way of concurrent execution ( Python, JavaScript ).
It really wasn't. There were people on here, including Oracle employees, claiming that the virtual thread implementation was a drop-in replacement that would work (not necessarily perform better, but work) in all cases.
I don't know if those Oracle employees actually did outright say -- or even imply -- "in all cases" as the GP asserted, but if they did, then "only" working in "common usage scenarios" would definitely be overselling the feature.
This was a known and advertised failure case with the new threads runtime, exacerbated by limitations in the implementation that cause certain blocking operations to block the current OS thread instead of blocking the virtual thread and allowing another virtual thread to reuse the underlying OS thread.
We make scalable graphics rendering servers to stream things like videogames across the web. When we started the project to switch to virtual threads we had that as number one on the big board. "Rewrite for reentrant locks."
Maybe we have more fastidious engineers than a normal company would since we are in the medical space? But even the juniors were reading and familiarizing themselves on how to properly lock in loom's infancy.
All that only to point out that, yes, they had communicated the proper use of reentrant locks long ago.
I do understand what you're saying from an engineering management perspective though. That effort cost a fortune. Especially when you have the FDA to deal with.
It was more than worth it though! In the world of cloud providers, efficiency is money.
1 - Use virtual threads with reentrant locks if you need to do "true heavy" scaling.
2 - Kind of implied, but since you gave the opportunity to make it explicit with your comment =D, there is no need to waste your life on earning no money in videogames when the medical industry is right there willing to pay you 10x as much for the same skills. (Provided your skill is in the hard backend engine and physics work. They pay more for the ML too, if I'm being honest.)
https://docs.oracle.com/en/java/javase/21/core/virtual-threa...
In Virtual Threads: An Adoption Guide part there is:
When using virtual threads, if you want to limit the concurrency of accessing some service, you should use a construct designed specifically for that purpose: the Semaphore class.
This seems like it's at least vaguely headed in the direction of that famous scene from early in The Hitchhiker's Guide to the Galaxy:
“But the plans were on display…”
“On display? I eventually had to go down to the cellar to find them.”
“That’s the display department.”
“With a flashlight.”
“Ah, well, the lights had probably gone.”
“So had the stairs.”
“But look, you found the notice, didn’t you?”
“Yes,” said Arthur, “yes I did. It was on display in the bottom of a locked filing cabinet stuck in a disused lavatory with a sign on the door saying ‘Beware of the Leopard.”
It’s not like multithreaded computing wasn’t full of footguns anyway.
It mostly works fine and it's an impressive bit of engineering. But it has some really ugly failure modes in combination with hacky legacy code designed for real threads. So, you can't blindly assume things to just work. Hence the deadlocks.
Many Java servers already work the way you outline. It's just that they are a bit tedious to use with the traditional Java frameworks. Which is one reason I like using Spring's webflux with Kotlin instead. Just way nicer when it's all exposed via co-routines.
You could say the second choice, the specific API, was made, at least to some extent, for backwards compatibility reasons. I wouldn't agree, but I think there is at least some argument to be made. Here is one designer's explanation [0]:
> We also realized that implementing the existing thread API, so turning it into an abstraction with two different implementations won't add any runtime overhead. I also found that when talking about Java's new user mode threads back when this feature was in development, and back when we still called them fibers, every time I talked about them at conferences, I kept repeating myself and explaining that fibers are just like threads. After trying a few early access releases of the JDK with a fiber API, and then a thread API, we decided to go with the thread API.
However, the choice of adding a new concurrency primitive to Java in the form of green threads instead of others was very very clearly not done for backwards compatibility's sake. Ron Pressler (who is active here as 'pron') has several talks on the advantages of green threads over async/await that you can look at [0][1]. The designers of Go also had the same belief, and also chose to add green threads as the fundamental built-in concurrency primitive in Go, obviously not for backwards compatibility reasons in their case.
[0] https://www.infoq.com/presentations/virtual-threads-lightwei...
Sure, but then again the designers of circa 2000-2010 J2EE also thought the verbosity and over-engineering was a good idea.
My understanding is that the highest-performance webserver is nginx. And it uses async internally.
IMO, virtual threads are a better general-purpose language feature because they avoid function coloring and are generally easier to reason about, but they may not result in the highest-performance Java webserver.
The purpose of project Loom is to abstract that away from Java application code. The runtime can use the most efficient IO for the given platform (ideally io_uring on Linux or IOCP on Windows, for example) even if the application code calls the old blocking File.Write(). The application can then use simple APIs and code patterns, but still get massive performance.
With Loom, you can easily have 20,000 virtual threads servicing 20,000 concurrent HTTP requests and each "blocked" in IO, while only using, say, 100 OS threads that are polling an IOCP. A normal Linux box can typically only handle around maybe 1000 threads across all running processes.
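A rough sketch of what that looks like from application code, assuming JDK 21 or later. Thread.sleep stands in for blocking IO here (a sleeping virtual thread is unmounted from its carrier), and the class and method names are made up for illustration:

```java
import java.time.Duration;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch (JDK 21+): start many virtual threads that all block at the same time.
// ManyBlockedVirtualThreads and runBlockingTasks are hypothetical names.
class ManyBlockedVirtualThreads {

    /** Starts count virtual threads that each block for blockFor; returns how many finished. */
    static int runBlockingTasks(int count, Duration blockFor) {
        AtomicInteger finished = new AtomicInteger();
        Thread[] threads = new Thread[count];
        for (int i = 0; i < count; i++) {
            threads[i] = Thread.ofVirtual().start(() -> {
                try {
                    Thread.sleep(blockFor); // stand-in for a blocking IO call
                    finished.incrementAndGet();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return finished.get();
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        int done = runBlockingTasks(20_000, Duration.ofMillis(200));
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(done + " tasks finished in " + elapsedMs + " ms");
    }
}
```

If the tasks were run one after another, the total would be 20,000 times the sleep duration; how close the concurrent version gets to a single sleep duration depends on the machine, so no timing is claimed here.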
It definitely leaves room to optimize by not pinning that thread, which would be great, but that shouldn't change semantics at all. Or is there something actually screwed up in the implementation of virtual threads that makes this a much bigger issue?
Been away from Java land for a while. How did something like that even get into release? That’s like a pretty big loaded shotgun to leave lying around with lots of kids playing, no?
Also, this is a benchmark. It's not surprising that they managed to produce a situation where more than n_cores virtual threads would actually start waiting.
So I don’t see the big fuss about it - don’t spawn a million virtual threads that all just spam synchronized?
If they're, like, limiting to CPU cores * 2 threads: yeah that would be Bad™. Unambiguously. I haven't been able to find anything conclusive about this though.
The issue is that Object.wait doesn't suspend virtual threads, so you get deadlocks. The answer is to reimplement usages of the wait/notify pattern to use locks or concurrent collections (for example, using a concurrent message queue for the producer/consumer pattern, which is a common use case for synchronized/wait/notify).
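A minimal sketch of the producer/consumer case using a concurrent queue instead of synchronized/wait/notify. QueueHandoff, transfer, and the DONE sentinel are made-up names, and the example assumes the payload never contains the sentinel value:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ThreadFactory;

// Sketch: hand items from a producer to a consumer through a BlockingQueue,
// with no synchronized block and no Object.wait/notify.
class QueueHandoff {
    private static final int DONE = -1; // hypothetical end-of-stream marker

    /** Sends 0..items-1 through a bounded queue and returns what the consumer received. */
    static List<Integer> transfer(int items, ThreadFactory factory) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(16);
        List<Integer> received = new ArrayList<>(); // touched only by the consumer until join()

        Thread producer = factory.newThread(() -> {
            try {
                for (int i = 0; i < items; i++) {
                    queue.put(i); // blocks while the queue is full
                }
                queue.put(DONE);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread consumer = factory.newThread(() -> {
            try {
                for (int v = queue.take(); v != DONE; v = queue.take()) {
                    received.add(v); // take() blocks while the queue is empty
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start();
        consumer.start();
        try {
            producer.join();
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return received;
    }
}
```

On JDK 21+ this can be called as QueueHandoff.transfer(1000, Thread.ofVirtual().factory()); passing Thread::new runs the same code on platform threads.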
The trick is to include the green threads from the start, so there are no libraries that depend on real threading. That's why Go and Erlang are so successful.
That doesn't invalidate your point; more than 20 years of Java practice has focused on making things work well for platform threads.
For example, https://pkg.go.dev/runtime#LockOSThread
It pins the goroutine until it is explicitly released, ensuring that multiple native calls will remain on the same platform thread and nothing else is going to use it. This is critical for namespace manipulation on Linux.
Java only pins for the duration of a native call or a synchronized block.
It looks like Java does not offer an equivalent API? For now it could be achieved with synchronized, but if synchronized is changed in the future to not pin, that would break.
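One way to get a similar guarantee without relying on synchronized pinning is to route the thread-affine calls through a dedicated platform thread. This is a sketch of that substitute technique, not an equivalent of LockOSThread: AffineWorker and call are made-up names, and it assumes the work only needs to share one OS thread with other work submitted to the same worker.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

// Sketch: run all thread-affine work on one dedicated platform thread.
// AffineWorker is a hypothetical helper, not a JDK API.
class AffineWorker implements AutoCloseable {
    // A single-thread executor runs its tasks sequentially on one worker thread.
    private final ExecutorService executor = Executors.newSingleThreadExecutor();

    /** Runs work on the dedicated thread and waits for its result. */
    <T> T call(Supplier<T> work) {
        Callable<T> task = work::get;
        try {
            return executor.submit(task).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }

    @Override
    public void close() {
        executor.shutdown(); // lets the dedicated thread exit once queued work is done
    }

    public static void main(String[] args) {
        try (AffineWorker worker = new AffineWorker()) {
            String first = worker.call(() -> Thread.currentThread().getName());
            String second = worker.call(() -> Thread.currentThread().getName());
            System.out.println(first.equals(second)); // both calls ran on the same thread
        }
    }
}
```

A virtual thread calling call() just blocks on the Future while the platform thread does the work, so nothing depends on how synchronized behaves in later JDKs.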
For example, this program:
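import java.io.IOException;
import java.nio.file.FileSystems;
import java.nio.file.Files;

public class Test {
    public static void main(String[] args) throws InterruptedException {
        var threads = new Thread[20000];
        // Each virtual thread copies abc.txt to stdout.
        for (int i = 0; i < threads.length; i++) {
            threads[i] = Thread.ofVirtual().start(() -> {
                try {
                    Files.copy(FileSystems.getDefault().getPath("abc.txt"), System.out);
                } catch (IOException e) {
                    System.err.println("Error writing file");
                    e.printStackTrace();
                }
            });
        }
        // Wait for every copy to finish.
        for (var thread : threads) {
            thread.join();
        }
    }
}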
Run as `java Test > ./cde.txt`, this takes about 4.5s on my WSL2 system with 2 cores, writing a 2 GB file (abc.txt is 100 KB). Even this would be within an HTTP timeout, though users would certainly not be happy. I'm pretty sure a native Linux system on a machine beefy enough to be used as a web server would have no problem serving even larger files over a network like this.
As mentioned in another comment: jdk.virtualThreadScheduler.maxPoolSize
That would indeed be a problem if it's not similarly unlimited by default. Configurable makes perfect sense, as does attempting to be conservative, but small hard-capped defaults are very obviously going to cause problems, especially while synchronized locks the carrier.
>The maximum number of platform threads available to the scheduler. It defaults to 256.
Yeah, that's pretty small. >256 simultaneous synchronized calls doesn't seem particularly extreme, given how common its use is.
Tho now I wonder if you can just set this to max-int and resume like normal, or if giant values do awful things internally...
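For anyone who wants to check what their own setup uses, here is a small sketch that reads the two scheduler system properties. The fallback logic (parallelism defaulting to the CPU count, the pool limit defaulting to the larger of parallelism and 256) is my understanding of the JDK 21 behavior and should be treated as an assumption; SchedulerSettings and effectiveMaxPoolSize are made-up names. It says nothing about what very large values do internally.

```java
// Sketch: report the virtual thread scheduler settings a JVM was started with.
// The default values below are assumptions about JDK 21, not read from the runtime.
class SchedulerSettings {
    static final String PARALLELISM = "jdk.virtualThreadScheduler.parallelism";
    static final String MAX_POOL_SIZE = "jdk.virtualThreadScheduler.maxPoolSize";

    /** Returns the configured pool limit, or the assumed default when the property is unset. */
    static int effectiveMaxPoolSize(String maxPoolSizeProperty, int parallelism) {
        if (maxPoolSizeProperty != null) {
            return Integer.parseInt(maxPoolSizeProperty);
        }
        return Math.max(parallelism, 256); // assumed JDK 21 default
    }

    public static void main(String[] args) {
        String parallelismProperty = System.getProperty(PARALLELISM);
        int parallelism = parallelismProperty != null
                ? Integer.parseInt(parallelismProperty)
                : Runtime.getRuntime().availableProcessors();
        int maxPoolSize = effectiveMaxPoolSize(System.getProperty(MAX_POOL_SIZE), parallelism);
        System.out.println(PARALLELISM + " = " + parallelism);
        System.out.println(MAX_POOL_SIZE + " = " + maxPoolSize);
    }
}
```

Run it with something like `java -Djdk.virtualThreadScheduler.maxPoolSize=1024 SchedulerSettings.java` to see the override take effect.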
2. You did not compare against fewer threads to see if threads are actually the bottleneck rather than IO. Also, all your threads are competing for stdout.