How Akka Cluster Works: Actors Living in a Cluster

How Akka Cluster Works: Actors Living in a Cluster (lightbend.com)

109 points by _lbaq 5 years ago | 62 comments

I use Akka Cluster extensively with Persistence. It's an amazing piece of technology.

Before I went this route, I tried to make Akka Cluster work with RabbitMQ. However, I realized (like another poster here) that you're essentially duplicating concerns, since Akka itself is a message queue. There's also a ton of logistics with Rabbit around binding queues, architecting your route patterns, etc. that add extra cognitive overhead.

I'm creating a highly distributed chat application where each user has their own persistent actor and each chatroom has its own persistent actor. At this point, it doesn't matter where the user or chatroom is in the cluster; it literally "just works".

All I need to do is emit a message to the cluster from a user to a chatroom or vice versa, even in a cluster of hundreds of nodes, and things just work. Now, there's some extra care you need to take at the edge (split-brain via multi-AZ, multi-datacenter), but those are things you worry about at scale.

Akka is the real fucking deal and it's one of the most pleasurable application frameworks I've ever had the pleasure of using in my career.

edit: The only reason I'd ever want to use Rabbit again is if I had external clients that needed to hook up to our message bus. If you're creating an entirely internal system, Akka Cluster is absolutely the way to go.
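
A minimal sketch of the location transparency described above, written as a plain-Python toy model rather than with Akka's API. The sender names only the entity (a user or a chatroom) and the cluster works out which node hosts it. The class `ToyCluster`, the shard count, and the modulo allocation rule are all assumptions made for illustration; a real cluster also rebalances entities when nodes join or leave, which this sketch ignores.

```python
import hashlib


class ToyCluster:
    """Toy model: entity ids hash to shards, and shards map to nodes."""

    def __init__(self, nodes, num_shards=16):
        self.nodes = list(nodes)
        self.num_shards = num_shards
        self.inboxes = {}  # entity_id -> messages delivered so far

    def shard_of(self, entity_id):
        # A stable hash, so every sender computes the same shard.
        digest = hashlib.sha256(entity_id.encode()).digest()
        return int.from_bytes(digest[:4], "big") % self.num_shards

    def node_of(self, entity_id):
        # Hypothetical allocation rule: shard index modulo node count.
        return self.nodes[self.shard_of(entity_id) % len(self.nodes)]

    def tell(self, entity_id, message):
        # The caller never mentions a node; the cluster resolves it.
        node = self.node_of(entity_id)
        self.inboxes.setdefault(entity_id, []).append(message)
        return node


cluster = ToyCluster(["node-a", "node-b", "node-c"])
first = cluster.tell("chatroom-42", "hello from user-7")
second = cluster.tell("chatroom-42", "hello again")
```

Both calls resolve to the same node because the routing depends only on the entity id, which is what lets the calling code stay unaware of cluster topology.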

acjohnson55 5 years ago | |

I'd echo this.

I worked for a company that made a real-time auction system, based on Akka. It's been frustrating in the years since to program on less powerful foundations.

If I were building a system that combined interactive and autonomous processes, I would absolutely reach for Akka again. The one thing I'd love to see is if they could build something like https://temporal.io on top of Akka. I think it would be complementary to the state machine style model of typed actors and the pipeline model of Akka Streams.

playing_colours 5 years ago | |

I remember using Akka in the domain of IoT. A persistent actor represented a sensor: state and history of readings. There was a great feature in Akka Persistence: if an actor went idle for a while - no new sensor data, it would be offloaded from memory to the storage (Cassandra). As soon as the sensor started to send new signals again or the sensor state was queried by a user the actor got loaded back to memory.
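
The offload-and-reload behaviour described above can be sketched as a toy model in plain Python (this is not Akka Persistence code). `SensorRegistry`, its `idle_timeout`, and the dict standing in for Cassandra are hypothetical names. One simplification to note: an event-sourced actor would normally persist each reading as it arrives, whereas this sketch writes state only when the actor is evicted, to keep it short.

```python
class SensorActor:
    def __init__(self, sensor_id, readings=None):
        self.sensor_id = sensor_id
        self.readings = list(readings or [])


class SensorRegistry:
    def __init__(self, idle_timeout):
        self.idle_timeout = idle_timeout
        self.in_memory = {}  # sensor_id -> [actor, last_seen]
        self.storage = {}    # stand-in for the durable store

    def _load(self, sensor_id, now):
        # Bring the actor back into memory on first use after eviction.
        if sensor_id not in self.in_memory:
            saved = self.storage.get(sensor_id, [])
            self.in_memory[sensor_id] = [SensorActor(sensor_id, saved), now]
        entry = self.in_memory[sensor_id]
        entry[1] = now  # any message or query counts as activity
        return entry[0]

    def record(self, sensor_id, value, now):
        self._load(sensor_id, now).readings.append(value)

    def query(self, sensor_id, now):
        return list(self._load(sensor_id, now).readings)

    def passivate_idle(self, now):
        # Evict actors that have been idle for at least idle_timeout.
        for sensor_id, (actor, last_seen) in list(self.in_memory.items()):
            if now - last_seen >= self.idle_timeout:
                self.storage[sensor_id] = list(actor.readings)
                del self.in_memory[sensor_id]
```

Memory use then tracks the number of recently active sensors rather than the total number of sensors ever seen.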

vinay_ys 5 years ago | | |

How much sensor data is kept in memory in the actor object? What's the cost tradeoff between memory and an SSD? I wonder if an SSD-based solution would still be cheaper and more scalable than one based on live in-memory objects.

amelius 5 years ago | |

Could one implement a distributed filesystem using Akka?

halfmatthalfcat 5 years ago | | |

You could but I don't think it would be the best tool for it. When I think of Akka, I'm using it because I don't want to worry about which node in my cluster any given actor is on, I just want to be able to scale horizontally and messages are routed appropriately.

Akka embraces the "let it fail" mentality: as nodes go down (just as pods go down in Kubernetes), you don't have to worry about where your processes are running... they just are... somewhere.

hugofirth 5 years ago |

To offer a slightly dissenting opinion, we’ve had many issues with Akka over the years:

- if you roll your cluster membership a lot, the dotted version vectors created by Akka distributed data grow unbounded. Eventually they will make gossip messages exceed the default maximum size (a few kB IIRC) and fail to send.

- in the presence of heavy GC Akka cluster has a really bad time. Members will flip flop in marking each other unavailable. Eventually this will render the leader unable to perform its duties and you will struggle to (for example) allow a previously downed member to rejoin the cluster.

- orderly actor system shutdown will also fail under high GC, which is problematic as sometimes you need to restart your actor system.

- split-brain resolution is really really hard to get right. The Akka team have recently made theirs open source I believe which is good, but back when we were building with Akka cluster it required a Lightbend subscription.

- If you aren’t all in on Actors, the integration point between Akka and the rest of your codebase can be a little odd. You often feel like you should reach for `Patterns.ask` (a way of sending a message to an actor and then getting a Future back which will complete on a particular response) but then people tell you that’s an anti-pattern.
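
For readers unfamiliar with the ask pattern mentioned in the last point, here is a minimal sketch of the idea using Python's `asyncio` instead of Akka. The names `counter_actor` and `ask` are hypothetical. The actor handles one message at a time from its mailbox, and `ask` pairs a message with a future that the actor completes with its reply.

```python
import asyncio


async def counter_actor(mailbox):
    """Handles one message at a time; replies by completing the caller's future."""
    total = 0
    while True:
        message, reply_to = await mailbox.get()
        if message == "stop":
            break
        total += message
        if reply_to is not None:
            reply_to.set_result(total)


async def ask(mailbox, message, timeout=1.0):
    # Attach a future to the message and wait (with a timeout) for the reply.
    reply_to = asyncio.get_running_loop().create_future()
    await mailbox.put((message, reply_to))
    return await asyncio.wait_for(reply_to, timeout)


async def main():
    mailbox = asyncio.Queue()
    actor = asyncio.create_task(counter_actor(mailbox))
    await mailbox.put((5, None))    # fire-and-forget "tell"
    total = await ask(mailbox, 10)  # request-response "ask"
    await mailbox.put(("stop", None))
    await actor
    return total


result = asyncio.run(main())
```

The timeout is the awkward part: the caller has to pick one, because nothing guarantees that a reply ever arrives.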

————

Having said all the above, if you’re able to go all in on the Actor pattern and you’re unlikely to hit high GC then you should give Akka cluster a try. The problems it tackles are genuinely hard and you should build on their hard work if you can. In particular they offer (in distributed-data) the most robust/complete set of CRDTs I’ve yet come across. Many other CRDT libraries expect you to bring your own gossip protocol and transport layer.
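
To give a flavour of what a CRDT is, here is a sketch of a grow-only counter, one of the simplest examples, in plain Python. It is a toy and not the distributed-data API; the class name `GCounter` and the node ids are illustrative. Each node increments only its own slot, and replicas merge by taking the per-node maximum, so merges can be applied in any order and any number of times.

```python
class GCounter:
    """Grow-only counter: one slot per node, merged by per-node maximum."""

    def __init__(self, counts=None):
        self.counts = dict(counts or {})

    def increment(self, node_id, amount=1):
        # Each node only ever bumps its own slot.
        updated = dict(self.counts)
        updated[node_id] = updated.get(node_id, 0) + amount
        return GCounter(updated)

    def merge(self, other):
        keys = self.counts.keys() | other.counts.keys()
        return GCounter(
            {k: max(self.counts.get(k, 0), other.counts.get(k, 0)) for k in keys}
        )

    @property
    def value(self):
        return sum(self.counts.values())


a = GCounter().increment("node-a", 2)
b = GCounter().increment("node-b", 3)
merged_ab = a.merge(b)
merged_ba = b.merge(a)
```

The state keeps one entry per node that has ever written to it, so structures with this kind of per-node bookkeeping grow as nodes come and go unless old entries are pruned.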

cutemonster 5 years ago | |

How would you compare Akka with Erlang? Or is that a weird thing to ask here?

What made you choose Akka?

hugofirth 5 years ago | | |

We're a big Java/Scala shop already, so Akka was easier to integrate :)

kdps 5 years ago | |

Regarding garbage collection: do you think ZGC or Shenandoah could reduce the problems you mentioned?

hugofirth 5 years ago | | |

I believe it would have alleviated some of the issues yes. In general I’m excited for the benefits “pauseless” GC can bring to soft real-time systems on the JVM. Unfortunately, for now we have to continue supporting G1 and friends.
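
To illustrate why long GC pauses can make members mark each other unavailable, here is a simplified sketch of a phi accrual style failure detector, the approach Akka's documentation describes for its cluster failure detector. This is not Akka's implementation: the normal-distribution model, the `0.1` second floor on the standard deviation, and the threshold value are assumptions chosen for the example.

```python
import math
import statistics


def phi(time_since_last, intervals):
    """Suspicion level: -log10 of the chance that the heartbeat is merely late."""
    mean = statistics.mean(intervals)
    std = max(statistics.pstdev(intervals), 0.1)  # floor avoids division by zero
    # P(interval > time_since_last) under a normal model of past intervals.
    p_later = 0.5 * math.erfc((time_since_last - mean) / (std * math.sqrt(2)))
    # Clamp so that a vanishing probability does not hit log10(0).
    return -math.log10(max(p_later, 1e-300))


history = [1.0, 1.1, 0.9, 1.0, 1.05]  # seconds between recent heartbeats
THRESHOLD = 8.0                       # illustrative suspicion threshold

normal_gap = phi(1.2, history)    # slightly late heartbeat
gc_pause_gap = phi(5.0, history)  # heartbeat delayed by a long pause
```

With regular one-second heartbeats, a gap of a few seconds is enough to push the suspicion level far past the threshold, and a stop-the-world pause on either the sending or the receiving side produces exactly such a gap.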

tunesmith 5 years ago |

In scala-land, a lot of people like to scoff at Akka because they prefer other pure fp concepts, but I don't think they've found a replacement for Akka Cluster - where you need objects that have both state and behavior, meaning they need to exist in memory, and where there are too many to exist on one server.

If you don't need behavior, you can use things like distributed databases or caches, and if you don't need to scale out, there are other pure fp solutions. But for this kind of distributed behavior, it still seems to me that Akka Cluster is the killer app.

darksaints 5 years ago | |

I don't think the scala crowd scoffs at Akka because they are beholden to pure functional programming. The pure FP crowd is actually a minority...significant enough to acknowledge, but not enough to make or break anything about the community.

The real problem with Akka is that, at least until very recently, you had to abandon any semblance of type safety if you used it. That was very frustrating to work with. I can take or leave pure FP, but you can pry my strong static typing from my cold dead hands.

AzzieElbab 5 years ago | | |

I wouldn’t say all type safety. It just couldn’t prevent you from sending unhandled messages to actors.

alextheparrot 5 years ago | |

I wonder what it would look like trying to implement something like the IO monad backed by Akka actors. Can't recall anything off the top of my head that would make that untenable aside from the aforementioned scoffing.

thelittlenag 5 years ago | | |

Take a look at ZIO Actors (https://zio.github.io/zio-actors/).

AzzieElbab 5 years ago | | |

I have seen several such implementations. They are fairly robust, although they feel a bit awkward when compared to streams. FS2 and ZIO Streams are just too good once you figure them out. Neither provides clustering, of course.

tormeh 5 years ago |

The actor model is brilliant, but I fear it will never get the adoption it deserves because it's too much of a break from convention. I guess it's maybe a bit too much of a leap? It replaces both Kubernetes and messaging queues, so once you've made software with it it's kinda hard to back out to a vanilla programming model.

There's of course the legit downside of needing to use one language and one framework for all actors, which is a problem Kubernetes and message queues don't have.

halfmatthalfcat 5 years ago | |

It doesn’t replace Kubernetes, it is the ultimate companion.

Kubernetes deals with the OS/node level failures, the actor system deals with the application level failures.

It’s actually amazing how complementary they are.

lostcolony 5 years ago | |

'legit downside of needing to use one language and one framework for all actors' - which is why a lot of Erlang users use it for coordination of messages, and delegate their handling to other services as needed.

That said, most work places I've been at, leadership has -wanted- to use one language. Even with containers and other decoupling technologies. So I don't know how much of a negative effect that downside has.

blandflakes 5 years ago | | |

I've sort of come around in my career to aggressively simplifying the stack where possible... it's handy to be able to script stuff but I don't actually enjoy running a polyglot team. We get a lot more done with one language, one build tool, etc. I tend to save other languages for niche applications (e.g. Lua for nginx scripting).

dragonwriter 5 years ago | |

> There's of course the legit downside of needing to use one language and one framework for all actors

That's not inherent in the actor model, it's a potential artifact of some implementations. Though I think one VM is more common. E.g., BEAM instead of just Erlang, or JVM for Akka.

aaronmill1 5 years ago | |

How would the actor model replace Kubernetes?

acjohnson55 5 years ago | | |

The way I see it, Kubernetes (with some type of message bus) is in some ways a system for deploying processes that will carry out work, kind of like actors. It doesn't use the same formalisms, but functionally, it's quite similar. Or, you could say that Akka Cluster is like Kubernetes, except that every process is contained within an Actor and has to be written in Java or Scala.

jpcooper 5 years ago |

Carl Hewitt, the inventor of the actor model, wrote this paper: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3418003, which he posted a link to on a recent post here on Erlang.

In it, he claims that Godel's Incompleteness Theorem is not true, and that the actor model is more general than the Turing machine. I am open to entertaining the idea.

I've seen that his ideas have been discredited elsewhere on HN. I would be interested to know people's opinions on this, as a lot of the paper went over my head.

hilbertseries 5 years ago | |

> he claims that Godel's Incompleteness Theorem is not true, and that the actor model is more general than the Turing machine

He is certainly wrong about the incompleteness theorems. And it’s entirely possible to create a model of computation that is more general than Turing’s. The question is whether it better represents what’s computable; the abstract mentions computations involving an “infinite number of computations” between steps...

jpcooper 5 years ago | | |

How is he wrong about GIT?

What more general systems of computation are there than Turing? Are any in use? Are they really more general?

The claims of the generality of Actors seem to rely on continuous time and non-determinism. Actors, determinism, non-determinism, concurrency and the completeness axiom are models which we can use to express computation and our surroundings, and nothing more.

One man says lambda, another says actor. Given our models of physics; given Planck and Heisenberg, are they really different? If so, how? Measure theory rests on the completeness axiom, but it is just a very useful axiom.

Am I missing something?

macintux 5 years ago | |

Dr. Hewitt often pops up to discuss Erlang and the actor model here, so you might have the opportunity to ask him for more details.

29athrowaway 5 years ago |

I used Akka cluster many years ago.

One of the problems I encountered was finding actors in remote actor systems, i.e. you have a unique actor responsible for X living somewhere in your cluster, and you need to know its name as well as the IP address of the actor system where it is running.

A message queue solves this problem, but that was not the approach I took.

My solution was to implement an actor discovery system on top of ZooKeeper. Using that, I could have cluster-wide unique actors.

halfmatthalfcat 5 years ago | |

They have Cluster Singletons that alleviate this concern now. You don't need to know their physical location in the cluster. All you need to do is ask the cluster for a pointer to the Singleton and you can message it directly.
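
A toy model of that idea in plain Python (not the Akka API): callers hold a proxy and never an address, and the proxy resolves the current host on every send. Akka's documentation describes the singleton as running on the oldest cluster member, which is the rule modelled here. `ToyMembership`, `SingletonProxy`, and the IP addresses are hypothetical names for illustration.

```python
class ToyMembership:
    def __init__(self):
        self.members = []  # kept in join order, oldest first

    def join(self, node):
        self.members.append(node)

    def leave(self, node):
        self.members.remove(node)

    def oldest(self):
        return self.members[0]


class SingletonProxy:
    """Callers hold the proxy; it resolves the current host on every send."""

    def __init__(self, membership):
        self.membership = membership
        self.delivered = []  # (host, message) pairs, kept for inspection

    def tell(self, message):
        host = self.membership.oldest()
        self.delivered.append((host, message))
        return host


membership = ToyMembership()
for node in ["10.0.0.1", "10.0.0.2", "10.0.0.3"]:
    membership.join(node)

proxy = SingletonProxy(membership)
before = proxy.tell("job-1")
membership.leave("10.0.0.1")  # the hosting node goes away
after = proxy.tell("job-2")   # same call site, different host
```

The calling code is identical before and after the host changes, which is the part that a hand-rolled name-plus-IP lookup makes hard.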

29athrowaway 5 years ago | | |

That's cool. I wish I had this in 2014.

playing_colours 5 years ago |

The actor model may be a handy tool for building neural networks. I have not tried it yet in Akka or Erlang, but there is a whole book about it: https://www.springer.com/gp/book/9781461444626

t-writescode 5 years ago |

I've written a high-availability service with Akka.NET and RabbitMQ and I remember when I was working with that infrastructure, my biggest question around Akka Cluster was "why would I use this when I already have a message queue infrastructure?"

Maybe real Akka is better than Akka.NET when it comes to Akka Cluster?

valenterry 5 years ago | |

Akka Cluster works in-memory, RabbitMQ doesn't.

Say you want to have multiple actors (one per user / customer or whatever), and you get HTTP requests and want exactly that user's actor to handle them (to guarantee consistency). Then you can't really do this with RabbitMQ.

I mean, you can make the machine that receives the request push it to the queue and keep the HTTP connection alive, have the machine that is responsible for the user read it from a queue and then somehow tell the first machine how to respond to the HTTP request... but then you've pretty much re-implemented Akka Cluster in a worse way.

Persistent queues and Akka Cluster solve different use cases.

t-writescode 5 years ago | | |

At that point, you're still operating on a single machine, though, and you don't need Akka Cluster for that.

29athrowaway 5 years ago | |

Queues overlap with the messaging aspects of actors but not the supervision aspect of actors.
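
A minimal sketch of what the supervision aspect means, as a plain-Python toy rather than Akka code. `FlakyWorker` and `Supervisor` are hypothetical names, and the restart-with-fresh-state strategy shown is only one of several a supervisor might apply.

```python
class FlakyWorker:
    def __init__(self):
        self.processed = []

    def handle(self, message):
        if message == "boom":
            raise ValueError("bad message")
        self.processed.append(message)


class Supervisor:
    """Restart strategy: on failure, replace the child with a fresh instance."""

    def __init__(self, child_factory):
        self.child_factory = child_factory
        self.child = child_factory()
        self.restarts = 0

    def deliver(self, message):
        try:
            self.child.handle(message)
        except Exception:
            # The failing message is dropped and the child starts over.
            self.child = self.child_factory()
            self.restarts += 1


supervisor = Supervisor(FlakyWorker)
for message in ["a", "boom", "b"]:
    supervisor.deliver(message)
```

A queue on its own would deliver the three messages but would have no opinion about what happens to the consumer after the failure.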

killingtime74 5 years ago | | |

For that we have cluster management like Kubernetes.

d3ntb3ev1l 5 years ago |

Wait, what, are you saying Akka Cluster “works”?