Avoiding the Distributed Monolith (simplethread.com)
Agree that this headline is presumptive and so is the article. Not every piece of engineering is a "slap it together in 3 weeks" build and many systems are designed with independent parts that use different technologies and scale at different rates.
I think what the author was trying to say is: "If you don't have enough experience to derive an architecture that isn't just some battle of buzzwords or copying off blog posts, you're in for a rude awakening when you have to maintain, scale, and refactor your work. Also, creating and maintaining a continuous delivery architecture with tons of moving parts is a lot of work".
At large companies (and small) devops, engineers, architects, CTOs, and many other players collaborate to develop an architecture which evolves over time and includes legacy systems, greenfield projects, duct tape, off-the-shelf bits, vendor-specific bits, open source bits, and other concerns that all have to work together. These organizations already have massive teams (or small and really good teams) taking care of making sure everything works together.
If you have a small team, little funding, and/or little experience you are probably getting in over your head trying to architect and orchestrate tons of moving pieces. It really depends on what you are building, your budget, and your team as to how you should build. Some industries and products require complexity, scale, and proper function from the beginning while others are accepting of bugs, scaling issues, and long delays.
TLDR: There's not a simple playbook that defines how everything should be engineered. Also, don't read some poorly written blog post which provides zero insight on the complexities of approaching an engineering challenge and decide to choose X or Y approach.
Content marketing - it is a thing and this is how it is done. If we're reading it, so is some decision maker. It only has to "look right" to someone with a minimal set of information in order for it to make the publisher look like an "expiator".
The key that was not mentioned was the interfaces. In my experience, the key is to carefully define the functions and scope of each separate component, then do a lot of up-front work building a clean interface for sending data and/or messages between the components.
Once this is done, the team on Component A should be able to make changes, roll out, and/or move to different hardware, and/or add/subtract hardware at will without the team on Component B even noticing. All teams should be able to keep their paws 100% out of each others' code and data structures.
Then, you have independent components that can actually be immediately scaled to meet unexpected load changes by throwing HW at them, thus buying time to streamline the code. You also have a structure that you can upgrade and maintain with much greater freedom.
Without the clean and stable interfaces, he's right, you have only a distributed monolith.
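To make the "clean interface" idea concrete, here is a minimal sketch in Python. Every name in it (OrderPlaced, OrderSink, InMemorySink, place_order) is made up for illustration and comes from nowhere in the article. The assumption is that the two teams agree only on a message shape and a one-method contract, and nothing else.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class OrderPlaced:
    """Hypothetical message contract; the only thing both teams share."""
    order_id: str
    total_cents: int


class OrderSink(Protocol):
    """What component A requires from whatever receives its messages."""
    def publish(self, message: OrderPlaced) -> None: ...


class InMemorySink:
    """One possible implementation owned by team B; a queue client could replace it."""
    def __init__(self) -> None:
        self.received = []

    def publish(self, message: OrderPlaced) -> None:
        self.received.append(message)


def place_order(sink: OrderSink, order_id: str, total_cents: int) -> OrderPlaced:
    """Component A code: depends on the interface, never on B's internals."""
    message = OrderPlaced(order_id=order_id, total_cents=total_cents)
    sink.publish(message)
    return message
```

Team B can swap InMemorySink for something backed by a real broker and place_order does not change, which is roughly the "without even noticing" property described above.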
I'd recommend starting the first version with a quick-to-build near-monolith throw-away, get some experience with the actual data flow, then decide on the actual components and interfaces.
In my experience, microservices get perilous when you have a few junior level devs seduced by the mistaken idea that they only need to think hard about a single component of a complex system, and the rest can just sort of take care of itself later.
1) Authentication - JWT and OAuth provider related, can be used with multiple clients or installed per client
2) Pre-signing of S3 uploads (very tiny file to do this one)
3) Ecommerce API w/ shipping using FedEx and Stripe
4) Pluggable CMS with JS dropped on a page used for UI and auth, and S3 used as content store with an API sitting in between
I guess some of them could be considered just an API? But if you can reuse it generally and the scope is fairly narrow, that is more of what a microservice should be. There is also the tradeoff where you may be thinking if this is so tiny, why even do it?...which is probably a good thing to consider before adding more complexity.
I have never fundamentally disagreed with a well-known domain expert as much as I have with Martin Fowler, and I have had a fairly successful career despite my attitude of "do things the way M.F. wouldn't".
On one hand I find it frustrating how people take his ideas as gospel. On the other hand I'm humbled to think there isn't necessarily "one right way" and other schools of thought can find success.
- Proxies that upload/retrieve assets to multiple cloud providers (i.e. upload files to / retrieve from GCS and S3 in case one is down)
- A service that screens/transforms attachments/uploads for security before allowing them to reach other services
- An API for sending mail/SMS/other contacts via multiple providers to deal with outages to one or more providers.
Often, these are built as libraries and imported into multiple projects, which is the wrong approach. Offering up an API for these instead can help decouple.
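Whether it ends up in a library or behind an API, the multi-provider failover in the mail/SMS example might look roughly like this. It is only a sketch: the providers are assumed to be plain callables that raise on failure, and send_with_failover and AllProvidersFailed are hypothetical names, not any real provider's SDK.

```python
from typing import Callable, Sequence, Tuple

# Assumption: a provider is any callable that sends a message or raises on failure.
Provider = Callable[[str, str], None]


class AllProvidersFailed(Exception):
    """Raised when every configured provider raised an error."""


def send_with_failover(
    providers: Sequence[Tuple[str, Provider]], to: str, body: str
) -> str:
    """Try each provider in order; return the name of the first that succeeds."""
    errors = []
    for name, send in providers:
        try:
            send(to, body)
            return name
        except Exception as exc:  # a real service would catch narrower errors
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))
```

Callers see one function and never learn which provider was down, so providers can be added or removed without touching the consumers.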
However, the author probably doesn't understand versioning and deprecates or makes breaking changes to APIs and then has to update a bunch of consumers. If you want a decoupled system, you have to not break the system. This is why legacy stuff exists at older companies. Once API v1 of the mail sending service is done and working, there's no reason you need to break it, or add new features, or take it down. Keep it running and also run v2 so that people can use the new features. The author is probably running v1 and v2 out of the same codebase and overwriting the v1 history so they can't maintain it. That's just bad software project management.
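A rough sketch of "keep v1 running and also run v2", with made-up paths and handler names (/v1/mail, /v2/mail, send_mail_v1, send_mail_v2). It assumes a simple path-to-handler table rather than any particular web framework.

```python
def send_mail_v1(payload: dict) -> dict:
    """Frozen v1 contract: reads 'to' only and never changes shape."""
    return {"status": "queued", "to": payload["to"]}


def send_mail_v2(payload: dict) -> dict:
    """v2 adds an optional 'cc' list without touching v1."""
    return {"status": "queued", "to": payload["to"], "cc": payload.get("cc", [])}


# Both versions stay registered; shipping v2 never removes v1.
ROUTES = {
    "/v1/mail": send_mail_v1,
    "/v2/mail": send_mail_v2,
}


def handle(path: str, payload: dict) -> dict:
    """Dispatch a request to the handler for the version named in the path."""
    handler = ROUTES.get(path)
    if handler is None:
        return {"status": "not_found"}
    return handler(payload)
```

Existing consumers of /v1/mail keep getting the same response shape, and new features only show up for callers who opt into /v2/mail.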
Maybe the title should switch to: "coordinating a complex architecture is hard, I only build easy stuff"
This assertion - that it is a wrong approach - is false. And an API call that involves RPC is not necessarily more decoupled than one that doesn't. Coupling is another word for correlation of changes over time. RPC has no necessary implication of reducing this.
What makes the library approach awkward is usually persistence, error recovery and asynchronicity, not coupling. The need for things like retries that survive restarts, which means they need persistent queues, even if that's just files in a directory. But none of these things necessarily require service implementation rather than library implementation, not even sandboxed attachment screening - that's a subprocess (potentially farmed out to another box container style, which is a service), not a service in itself.
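For what "just files in a directory" could mean in practice, here is a minimal sketch that works as plain library code. The function names (enqueue, drain) and the one-JSON-file-per-job layout are assumptions for illustration. It gives no ordering guarantee and does no locking between concurrent workers.

```python
import json
import os
import uuid
from pathlib import Path
from typing import Callable


def enqueue(queue_dir: Path, job: dict) -> Path:
    """Persist a job as one JSON file; write a temp file, then rename it into place."""
    queue_dir.mkdir(parents=True, exist_ok=True)
    final = queue_dir / f"{uuid.uuid4().hex}.json"
    tmp = final.with_suffix(".tmp")
    tmp.write_text(json.dumps(job))
    os.replace(tmp, final)  # readers never see a half-written .json file
    return final


def drain(queue_dir: Path, handler: Callable[[dict], None]) -> int:
    """Run the handler on each pending job; delete a job file only after success."""
    done = 0
    for path in sorted(queue_dir.glob("*.json")):
        job = json.loads(path.read_text())
        try:
            handler(job)
        except Exception:
            continue  # file stays on disk, so the job is retried on the next drain
        path.unlink()
        done += 1
    return done
```

Because a job file is removed only after its handler returns, a crash or restart mid-run leaves the unfinished jobs on disk for the next drain call. Handlers therefore need to tolerate running more than once.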
People have been doing versioning with libraries for decades; there are a lot of different ways to crack that nut, and ignorance isn't an argument either way, because it cuts both ways.
When you have the same job that needs the same kind of queuing, same error recovery, same resource management across different applications or areas of a single big application, that's when it makes more sense to try and package it into a service. Something that is stateful and long running, and not just some code at the end of an RPC. Something that might need very different resource consumption requirements than the calling application - e.g. a CPU intensive operation in an otherwise lightweight app. Something that needs to be scalable independently of other components in the system. That kind of thing.
Coordinating a complex architecture IS hard. Microservices are often used as a way to wish away all that hard coordinating because... decoupling or something... and stuff.
Meanwhile what actually gets built is a tangled mess of tightly coupled (but separate) libraries as "services", with many-to-one relations between those services and developers, unnecessarily complex deployments, and... a big, fat, crappy HTTP layer slathered on top, just to add insult to injury. The next step is usually to pour on generous helpings of graphql.
If this had been the actual title of this submission, I would have been much more receptive to it. Let's expand on that: "Coordinating a complex architecture when you do not need it is needlessly expensive, so I recommend not doing so." This blog post coming from a contract software firm tells me that they're willing to sacrifice billable hours to instead write just enough software to solve the client's problem.
Some key differences:
You generally can't horizontally scale two libraries in the same process independently of each other, but you also don't usually have to worry about library availability at runtime.
If you don't need to scale a particular dependency separately, and you can't do useful work if the dependency is unavailable, then it should probably be a library.
Perhaps someone can explain to me the motivation for microservices when there still exists a lot of runway for scaling up without scaling out.
We all look back on our previous work and see room for improvement, that’s a good thing. However, spending too many cycles trying to achieve perfection where it’s not desired will prevent you from moving on with your life to bigger and better pursuits.
Be really happy that you have a software project which survived for 7 years. Many fail in far shorter time and provide no value.
Are there cases where you can scale one library independently without the network (IO and latency) becoming the bottleneck? The way most microservices get used, as RPC mechanisms, you hit the network bottleneck very quickly.
You either have to repeat yourself by implementing your data models and validation routines in all the languages your service consumers might use, or instrument some language agnostic way to specify your data models and possibly generate code for them and their validation routines/client libraries, for a myriad of languages.
At that point, you're basically reinventing something akin to SQL and/or database schemas and constraints without ACID guarantees. You can deploy something like Apache Thrift or Google's Protocol Buffers, and maybe throw in some swagger.io (which are all great things). But they are yet more layers that, for many levels of scale, increase complexity rather than reduce it.
Or you can just trust all the services to never have bugs or breaking changes for consumer applications...
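As a toy illustration of the "language agnostic way to specify your data models" option: the schema below is plain JSON, so consumers written in other languages could in principle load the same definition. The schema format, USER_SCHEMA, and validate are all invented for this sketch and are far simpler than what Thrift, Protocol Buffers, or Swagger provide.

```python
import json

# Assumption: the schema is shared as plain JSON so any language can parse it.
USER_SCHEMA = json.loads("""
{
  "name": {"type": "string",  "required": true},
  "age":  {"type": "integer", "required": false}
}
""")

# Mapping from schema type names to Python types; each consumer language
# would need its own equivalent of this table.
TYPES = {"string": str, "integer": int}


def validate(record: dict, schema: dict) -> list:
    """Return a list of human-readable errors; an empty list means valid."""
    errors = []
    for field, rule in schema.items():
        if field not in record:
            if rule["required"]:
                errors.append(f"missing field: {field}")
            continue
        if not isinstance(record[field], TYPES[rule["type"]]):
            errors.append(f"wrong type for {field}: expected {rule['type']}")
    for field in record:
        if field not in schema:
            errors.append(f"unknown field: {field}")
    return errors
```

Even this small version shows the cost being described: the validator and the type table have to be rewritten or generated for every consumer language, which is the extra layer the comment is complaining about.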