The Curse of Systems Thinkers(blog.relyabilit.ie) |
This landed so truly for me, it felt like a punch in the stomach.
I wouldn’t dare count the number of times I’ve been told the technical details of why something is the way it is, without anyone ever saying the reason why we actually wanted it to be this way. My thesis was usually: we don’t.
In my career I feel like I have seen hundreds of examples of me saying the systems equivalent of “let’s put the dining table indoors?” to be told that the dining table is outside because the original budget meant the front door could only be yay wide so we had to leave the table in the yard and put a tent over it. And I’m just left standing there agape at how we eat in a cold wet tent every night instead of fixing it.
Except it’s usually more like: why do we have to spend $9k on a commercial dishwasher repair contract? Because we have a commercial dishwasher … to get the rust off the silverware … because we eat outdoors every night … because the front door was too small to get the dining table in the house.
Somehow, when the real examples of this stuff are clever engineering around build / docker / polyrepo / release / feature flags / third party bugs, the cleverness makes people think the existence of the workaround should be tolerated. It’s infuriating to join a new team held hostage by years and years of band aids because they never suffer the bigger picture consequences.
The whole article was fantastic. I hope the author has the engineering leadership role they deserve. We need more people like this.
If you spend all your time refactoring, cleaning up after legacy design constraints, fixing ossified errors, then you run out of time and fail to write actual meaningful income generating features. Conversely if you make none of those improvements, eventually the weight of bad architecture slows all progress to a halt and no new income generating features get delivered.
One of the hardest parts about advancing as a developer, in my opinion, is being able to tell when you should refactor versus just leaving the old working mess alone.
With your example, it's like if it turns out that the dining table is embedded into the ground with concrete because it kept blowing over, and moving it indoors would require getting a carpenter to create new legs. And also that because the dining table has been there for so long, someone decided to run electricity cables through it, so rerouting it requires an electrician and will shut down the factory at the bottom of the garden for half a day. We could buy a separate table for indoors, to try and slowly migrate to the new table, but then we'd have two tables to maintain and we all know how that usually goes.
At a certain point, you look at it and go well the commercial dishwasher is just $9k and we can focus our efforts on building that loft conversion for now.
The first group writes the initial version and all iterations (extra features) up to the point where the expected returns from quickly pushing out future iterations is less than the amount of required effort.
The second group then comes in and does a complete refactor, without changing the look or feel of anything that the customer actually wants. Meanwhile, the first group moves onto "the next big thing".
I have, too. And then I usually haven't managed to put the dining table indoors. And then new people came in and asked the same question you ask, and by then I was one of the people who tried to put the dining table indoors, and explained how it wouldn't fit through the front door, and how I tried to get it in through the window. And then the new people try to put the table indoors and fail and next thing you see they're either leaving the house or explaining to the newcomers why the table is outdoors.
Ultimately, I've realized that talk like this is cheap, unless you can actually improve things. That requires leadership skills and some political capital in your organization. I don't think the author of the article deserves an engineering leadership role simply for complaining about things. (They might still deserve an engineering leadership role for other reasons, what do I know...)
With apologies to Antoine de St. Exupery; if you want to build a better system, don't drum up Jira tickets to gather user stories, make sprints and divide the work and give orders. Instead, teach them to yearn for a system that's not total bullshit.
Simply complaining is tiresome. Writing a well-reasoned internal blog post that explains the faults, gets traction for improving things, and gets people excited for your brave new world, even though it's not arrived yet; that blog post is what engineering leadership looks like.
"The Boeing 737 MAX: Lessons for Engineering Ethics (2020)"
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7351545/
It's an example of a workaround that should not have been tolerated:
> "The Maneuvering Characteristics Augmentation System (MCAS) software was intended to compensate for changes in the size and placement of the engines on the MAX as compared to prior versions of the 737."
Rather shockingly this wasn't even an engineering problem workaround; it does seem that it was solely designed to avoid an aeronautical reclassification of the aircraft that would have required pilots to undergo an expensive retraining program on flight simulators, which might have caused lost orders.
This does look like a systems-level failure, but one at an organizational level: the system went from a state where engineering took priority, to a state where financialization took priority. In systems thinking, this could be called a state transition: a fluctuation takes place, and afterwards the system settles down to a new (apparently) stable state quite different from the old state:
> "One factor in Boeing’s apparent reluctance to heed such warnings may be attributed to the seeming transformation of the company’s engineering and safety culture over time to a finance orientation beginning with Boeing’s merger with McDonnell–Douglas in 1997 (Tkacik 2019; Useem 2019). Critical changes after the merger included replacing many in Boeing’s top management, historically engineers, with business executives from McDonnell–Douglas and moving the corporate headquarters to Chicago, while leaving the engineering staff in Seattle (Useem 2019). According to Tkacik (2019), the new management even went so far as “maligning and marginalizing engineers as a class”."
What’s even more infuriating is seeing new engineers join that team, question why the hell something insane is insane, and then slowly grow used to the insane thing. Only for the cycle to repeat when the next new person joins.
I make a huge point of confirming to them, that yes we do acknowledge it’s insane. Even if management doesn’t care.
Oh wow, that hits home. To be fair, the historical context for the decision can be valuable information, the problem is the next step. Even if you can't fix it right now, you might make steps towards that. Or you might say: Now that we have this heavy table outside, why not attach more things to it?
It’s just like re-factoring heavily interdependent code, except without the advantage of the dependencies being written down
obviously this is more or less possible given specific organizations and personalities at said organizations. my point is more that the magnitude of the task of migrating the ossified organization to a better architecture, even with a fully pliant staff totally on board with changing, should not be underestimated.
Don’t ask for permission, just fix the stuff so that it works for you and maybe your small team and then announce it.
There is no documentation and no planning? Just start writing documentation, just start planning. If you need permission, I grant this to you. I‘ve seen too many internal projects not even having a README, so this is now something I start whenever I have to debug something and wished to have documentation.
Someone needs you to do something? Ok, I‘ll share my screen, start asking questions, and write all the important things down.
If it were up to hindsight me, we'd have carefully designed and orchestrated every past decision. We'd do full design qualification and change control on projects and purchases, researching carefully to never make a mistake. We'd sit as a team and brainstorm every possible implication. We'd write, execute and document tests for everything to prove ourselves. Any sniff of inefficiency and we would stop everything and fix it, no matter the cost. We'd take time to document, investigate and follow through. It would be a glorious cavalcade of plans and CAPAs, qualifications and tests. And reports! Binders and binders of wonderful validation reports everywhere!
If it were entirely up to hindsight me, we'd run our little widget company like we're building a space shuttle. Of course, we'd never make any money. But we'd be doing it right, by gum!
Most companies need to be somewhere in the middle. No hindsight and you end up a tangled mess of short-sighted kludges, all hindsight and you can't move forward. Either way you risk ending up a lead balloon.
If you find yourself in the kludge company territory, then here's some advice from a talk I recently attended[1]:
Start by training yourself and your team in Root Cause Analysis. Empower them to start thinking deeply and critically about what's really causing the problems and inefficiencies you encounter. Understanding root causes naturally translates into solutions that aren't just bandaids. Use these skills in your day to day, and you'll start building a culture of quality around your systems.
[1] Steve Gompertz
In my experience, at the time the decision was made, folks did want it that way. The organization has lost that context as to why and has only documented the technical design.
A curse shared by less effective engineers I've worked with is to rage at legacy decisions unable to convince the organization to revise them. They lack the ability to understand the various stakeholders involved and to come up with a plausible plan. A systems engineer (as referenced in the blog) would understand the various sub-systems that make up an organization and be able to drive the change they desired (the conclusion that it's irreparably broken or you lack the expertise to fix it would be fine too).
Love your post, though!
The thing is. If your manager is not already inclined to actually read the post, they’re never going to be receptive to that kind of change anyway.
I'm reminded of back when I was studying pattern recognition for a system that would become an Expert System (this was before that term was used). I would read many articles saying what techniques would work. I had the urge to ask "But this doesn't show how you got there. Show me your discarded solutions that didn't work." I would like to see your wastebasket.
Similarly, I am inclined to ask someone who acts as an expert to tell me five ways that won't work.
The only logical reason to do this is because it has no impact on the business. Or at least, smaller impact than a total rewrite/refactor would have.
If an engineer presented a case where fixing an underlying issue resulted in better business outcomes vs. a short term band-aid, then I don’t know anyone that would tell them no. Businesses want to succeed. They want to make money. If you can help me make more money I’ll let you do whatever you want (within the confines of the law and civil society).
I think, it's liberating to work in such a company as an experienced developer. You get difficult tasks sometimes, where you need to be the System Engineer and understand the system fully. But you're free to approach the problem in any way you see fit. This requires a lot of trust from the company and your team, but I believe that's a good thing.
It's also nice for customers, who sometimes have crazy requirements, but they still get results in a reasonable amount of time.
On the other hand, this approach completely destroys newcomers. It's nearly impossible for them to approach a System which they can't fully grasp and is in constant flux. We mentor them, give them easy and introductory tasks, review their code, but still... It takes a massive amount of time. And I think, it's one of the reasons a company like this has a hard time growing.
And there's the risk of the 'not so competent programmer', who knows how to fix stuff, without thinking far enough.
I don't know, if I'm a System Engineer, but I think it aligns with the description in the article. However, I fear the day, we all agree to overhaul our processes, after which we need to plan and document and review everything.
It seems to be a necessary evil for a company to grow, but at the same time we would also destroy a lot of the liberty we currently have and I don't know which side is worse.
[0] technical debt is when you have a plan to pay it down. A technical naked short option is when your plan is to hope the subsystem goes away before its deficiencies have catastrophic effects.
"Systems" are a mixed blessing, but system thinking is almost always good and useful. We can add value but not bask in it. As someone invested in the _idea_ here are a few of my favourite quotes that illustrate:
- People don't like systems. Especially new ones.
- Systems ossify and become the problem themselves.
- The ideal system exists only in the mind of its designer.
- The ideal systems designer is invisible and can never take credit.
"I mistrust all systematizers and avoid them. The will to a system is a
lack of integrity." -- Nietzsche
"The English have a system, which is *no system*, which is also a
system, only better." - (?? British political philosopher c 1900 -
does anyone know this one?)
"A complex system that works is invariably found to have evolved
from a simple system that worked. A complex system designed from
scratch never works and cannot be patched up to make it work. You
have to start over with a working simple system" -- Gall
Overall, I think the thing is that systems are brilliant, until you
try to actually build them and encounter _people_, who have other
ideas. Neither the force of the better argument, nor punishment,
reward, bribery, or flattery will move things. This is neither the
fault of systems thinkers nor people but the misunderstanding that
(outside the immediacy of war) systems can be imposed. Working
systems evolve and are, if the individuals are mentally healthy and
motivated by good attitude, generally such that people are doing the
thing they would naturally be doing anyway were a formal system not
there.
A good system is like a cat that falls off a tall building and by luck lands on its feet in a box of wool, and licks itself as if to say - sure I meant to do that.
Overall, I think software and ‘physical’ engineers should swap experience. Physical engineering could use a tech-injection, and software could use a ‘structure’-injection.
"What you gotta understand is that in the beginning of Electronics, people were pretty much trying to put two materials together they thought worked and then tried modelling it. It was pretty much trial and error, experimentation..." and then he hits me with the most "holy s*" moment of my academic life: "... much like programming and software engineering is today. You write some code, run it, see if it works. Works, ok, go ahead, make sense of it, explain in the documentation, next task".
I had NEVER thought of software like this, it just hit me like an atomic bomb in the head, I felt like I understood where in the history of software engineering we are right now. Structure is coming, slowly but surely.
The reality is that we’ve been programming (we, the wider public) for a few decades of our millions of years of consciousness. Of course it doesn’t have the scientific rigour of something we’ve been doing for millennia, like construction.
With that said though, I think programming and computers in general is the closest we will ever come to ‘magic’ and wizardry. The fact that we can now affect reality with a few keystrokes is magical.
"Give yourself permission to let the organisation fail".
I agree. As someone in a similar situation, the job is to let the "design of system" be heard, be debated and be implemented once green-lit. You cannot convince the decision making body (be it the CEO, management, or a design committee) that your idea is the right one for 3 reasons IMO:
- idea could be wrong
- People won't know what you are talking about unless they had the first hand experience. (A la you don't know what it is like to be a bat)
- Your meritocracy is limited to a small group of decision making body
If you see the "system" broad enough, you will see a market. It is essentially preaching your "system" to a wider audience. Your system may be wrong, for which your start up will die or you devise another system. Or you are reaching a whole bunch of people who understand you (the initial niche), and your living does not depend on a single decision body but a market.
Convincing one body is hard, but broadcasting whatever you believe, some will respond eventually. This is why start up is great.
Anyone know any good books covering this history?
I initially watched the lectures because of an interest in aerospace and it is a fascinating historical series in its own right with some incredible speakers. The lessons for systems engineers are numerous too.
I mean, the system I started working on 2 years ago was using a business text field for processing decisions. That caused changes to the system when the business wants to make changes. If they do make changes, the reporting queries have to be modified to look for all the historical versions of that text for audit reasons. If the team asking that change forgets to tell one of the numerous other systems that also uses that field, then there are errors. I proposed we add a field with a code that represents this field so that the business can change the display text without affecting the systems that currently use it. It's been at least 18 months, and nothing has changed. You would think that this is a basic design best practice that should have been implemented from the beginning...
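To make the proposal above concrete, here is a minimal sketch of keying decisions on a stable code while the display text lives in a separate lookup. Every name in it (`DISPLAY_TEXT`, `Application`, `MANUAL_REVIEW`, the labels) is hypothetical and invented for illustration; nothing here comes from the actual system described.

```python
from dataclasses import dataclass

# Hypothetical codes and labels, invented for this sketch.
# The business owns the labels; systems only ever look at the codes.
DISPLAY_TEXT = {
    "APPROVED": "Approved",
    "MANUAL_REVIEW": "Needs manual review",
}


@dataclass
class Application:
    applicant: str
    status_code: str  # stable, machine-readable code stored with the record


def needs_human(app: Application) -> bool:
    """Processing decision keyed on the code, never on the display text."""
    return app.status_code == "MANUAL_REVIEW"


def label(app: Application) -> str:
    """Human-facing text, looked up at display time."""
    return DISPLAY_TEXT[app.status_code]


app = Application("Alice", "MANUAL_REVIEW")
before = needs_human(app)

# The business rewords the label; no processing or reporting logic changes.
DISPLAY_TEXT["MANUAL_REVIEW"] = "Referred to an underwriter"
after = needs_human(app)

print(label(app), before, after)
```

With this shape, audit queries would filter on `status_code` and would not need to know every historical wording of the label.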
Unfortunately our main contract was developing a component of a larger system being developed by our parent organisation who didn’t have a concept of systems engineering. We tried for years to educate them and I watched aghast as their program costs and schedule continued to spiral out of control.
Basically a bomb burst of engineers all doing what they thought was the right thing but no one owning the system design and saying no to good ideas.
In the words of their chief engineer “its like 10 different black boxes, I don’t know what I’m getting and I don’t know when it’ll be finished!”
> we define a stable interface
> reproduce behaviour reliably
> Predictability is great
Within being believed and valued for protecting engineers and organizations, so much progress applying these principles relies on individuals’ and teams’ readiness to adopt and advocate for them. Hope to see your experience with that in part II.
I've spent my entire career working for a medium sized organization and in the past few years it has tried to become "agile". Most of this push is predicated on the idea that we will be able to go faster this way. As a result we have deconstructed ourselves. What's weird is that now we aren't actually even "faster to say we are finished." No, now we are only faster to say that we are going to be faster to be finished. We still end up taking a long time but as long as the slide deck says we are going to be done in a short amount of time, then all is believed to be running smoothly. It's very strange/depressing/unsettling.
Most systems are described in a simple way: box and line charts which are just voodoo.
We need to develop an algebraic notation for how to represent the various configurations/states of an organization, along with its operations.
With a notation, you can describe what you “feel” and share the knowledge and improve on it. You can also define the organization's desired state and track your progress. You can plop a new manager in it and they will know what to do. You can also do basic engineering like pick a configuration that has the desired cost, throughput and latency that you need.
I'm not sure if I understood it correctly or there was a typo, I am surprised because my experience is almost opposite.
my experience with Swedish and German multinationals is that it's all about consensus, to a point where the best decisions would be rejected if it lacks support, or where the off-ramp would be no decision (which is also a decision). this I found in stark contrast to French, Italian, Spanish, UK/US/Australian organizations, that tend to value the ego driven hero who saves the day. Also these former locations seem to do better when dealing with chaos as they know how to "think on their feet".
I'm curious from your experience would these companies be start-ups or medium sized firms, or could there be other reasons I'm missing that our experience is so different?
«Cassandra was a Trojan priestess of Apollo in Greek mythology cursed to utter true prophecies, but never to be believed», from Wikipedia
Like the trade-off between creativity and efficiency
I would actually claim that the majority of non-seasoned programmers fall into this category.
It's rare to see an engineer who would actively set out to destroy or cripple the system they are working on - but it is incredibly common to see one fix the problem they have at hand, using what happens to be available, and make the overall system just a tiny bit less pleasant to work with. Or understand. Let alone maintain.
Repeat the above a hundred times and your codebase would make Lovecraft take note. If each such modification reduces the quality just 0.5%, after 100 rounds you're looking at an aggregate damage of 40%.
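The 40% figure checks out if you assume each change removes 0.5% of whatever quality is left at that point (a compounding model, which is my assumption about how the loss accumulates). A quick sketch, with `remaining_quality` as a made-up helper name:

```python
def remaining_quality(loss_per_change: float, changes: int) -> float:
    """Fraction of the original quality left after repeated proportional losses."""
    return (1.0 - loss_per_change) ** changes


remaining = remaining_quality(0.005, 100)
damage = 1.0 - remaining

# Compounding gives a bit over 39% damage, close to the quoted 40%.
print(f"remaining: {remaining:.1%}, damage: {damage:.1%}")
```

If the losses were simply additive instead, 100 rounds at 0.5% would be 50%, so the compounding reading is the one that matches the number in the comment.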
Yep, the needs of a company change when and if it grows. Early days, a small team "getting shit done" is probably what you want to find your place in the market, make early customers happy, etc. But as you scale on all axes: time, headcount, headcount turnover, customer count, feature count and so on, that early pile of shit that got done can start to bog you down. Even worse if no one recognises the need to change and continues adding to the pile.
I've seen a few really brilliant engineers struggle with this in various ways: 1. Start hating their job but being confused because they love the company -- not realising that the company has become a very different beast to the early days. 2. Struggling to get out of the "get shit done" mentality and just floundering. Even becoming a net negative contributor in some cases.
Worth remembering this I think in an industry where rapidly changing companies are so common. Your company might be a good fit _now_ but that does not mean it will stay that way!
That hasn’t been my experience (in general; had some bad companies for sure).
I don't think, that this is necessarily bad. I even think, that this is a must for a large team. But it changes the way you code.
Right now I have the time and opportunity to 'form' the code the way I want. I can try different approaches, rip out or rewrite bad code and add features as I see fit. That's what I call liberty. I don't feel like a raw coder. Instead I feel like an artist in my own universe where I'm in charge.
This feeling can get lost, when you need to request, plan, control and review every single change in coordination with multiple team members. You're not in charge anymore. You're not the artist, but a cog in the wheel.
I think some people like one way or the other. But it's hard to transition.
You could run a super high valued company on just a handful of people, and society is just not ready for that. It's a case of technology advancing faster than human organisation and economic system, and the multiples are just too large, it's a big challenge for society to handle the impact.
So it's better for the organisations to be less efficient and chaotic, and to use dumbed down tools, overcomplicated solutions etc. This will fill the void created by the advancement in technology, so that the organisation can continue to function by at least resembling the traditional model.
You would need to significantly shorten the working hours in order to enable more efficient companies. Otherwise you'd get even more hyper concentration of wealth in a very short time.
Here is the hidden truth. So much of the current information sector is just daycare for grownups. Then there's a secret nucleus of people who actually do the real work. The secret to happy employment rests on being able to determine who's actually cutting lumber versus who's just playing dress-up.
If we want to gradually move towards a system of universal basic income, maybe we could help sell it by funneling larger sections of society into IT, and just give them a bullshit job where they can fingerpaint all day to get their paycheck. Eventually you can let them stop fingerpainting and just give them the paycheck.
> The factory of the future will have only two employees, a man and a dog. The man will be there to feed the dog. The dog will be there to keep the man from touching the equipment.
My version of this is that those handful of people become critical to the organization, which must be avoided at all costs. Every place I've been that's weighed down by huge IT staffs, undervalues tech, absolutely hates to give coders raises, and sees tech as a cost center, even while preaching how important technology is.
Sadly, this seems to be the case so often that the only companies that really succeed with good tech have builders in the founder's chair who _still code_, at least until the model is firmly established (but even after is good too). Pretty much everybody else resents software devs and thinks they're overpaid.
Weirdly a smaller, better paid team can often be resented _more_ (because of course, you can't see all the money that's being _saved_ by having a small highly-powered team).
It’s just that such an organisation is unpleasant to work for.
What does «called» mean in this analogy? That the subsystem doesn’t go away?
So your experience is that: Stuff tends to stick around? That’s my experience. I have never encountered a throw-away MVP that was actually thrown away if the product continued in the same direction (and didn’t pivot).
This is only true if the complexity comes from the underlying problem that must be solved or goal that must be achieved.
Complexity that exists for historical reasons, derived from technology or platform choices, solves a problem you don’t really have or don’t need to have, relates to the structure of the organisation, etc. etc. can of course all be made to go away completely.
In my experience a small fraction of the complexity in most organisations and software is truly, fundamentally unavoidable.
There are always processes. Sometimes they are explicit and well-understood; sometimes they are hidden, and not open to improvement because nobody knows what they are. Implicit processes tend to disadvantage newcomers, as well as reinforcing social assumptions and conventions (that tend to disadvantage those that are already disadvantaged).
Explicit processes don't have to be heavyweight and bureaucratic.
There was a good essay on the problems caused by informal/implicit processes, from about 20 years ago. I've spent 45 minutes searching for it, but I'm afraid I can't find it.
I don't, but that quote was used in a comment [1] about five or six weeks ago, and the commenter's relevant bit was:
Nietzsche said it best:
I mistrust all systematizers and avoid them. The will to a system is a lack of integrity.
Or maybe (I think Sidgwick):
The English system is "No system", Which is also a system, only better.
https://news.ycombinator.com/item?id=30598863
[Edit to add: I've been dumb there. You were that commenter, so this won't have helped - sorry.]
The problem here is not including people in your “system”. System design needs to be holistic in order to be effective.
Absolutely. All too common. And once you do include them each is not merely a new variable capable of assuming wildly different values, but a whole system in itself capable of interacting with every other such system within your system. That's why reductionists like to try factoring them out as interchangeable cogs. Pretty much the entire edifice of modern industrial economics since Adam Smith and Henry Ford is built on that model simplification/efficiency.
It's only useful and adds value if your idea actually gets used. Otherwise, it's still pretty crushing.
I like how this can be read in two ways and both make sense.
People don't like new systems. But also: New people often don't like (the pre-existing) systems.
I worked for an organization full of such people. Intelligent, competent, and worked hard. And yet... the system, ehh...
(I've read the Citadelle on a long bus ride many years ago. It was exactly what I needed to read back then, I enjoyed it very much. Thank you for reminding me of it.)
The problem is that the organisational equivalent of ‘later’ is ‘never’. Therefore, if something needs to actually be done, the only time is now.
Yeah but finding, or rather estimating, this optimal is a lot of work, and requires you to have one foot in both camps, and some kind of process/authority to make a decision, and some incentive to make a short term sacrifice for long term gain. That's just not going to happen in a weekly sprint planning, in a company that's aiming for the next quarterly report.
Then in 3 months time the choice is the change will happen now, or it will happen never.
You can schedule it for the future, but the choice will always be "do it now" or "don't do it now".
It's frustrating, but as far as search terms go (in the typical tf/idf indexing model), the most unique word is, um, system ... and that doesn't help.
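For anyone wondering why "system" is a useless search term under tf/idf, here is a rough sketch using the classic inverse document frequency, log(N / df). The three-line corpus is a toy I made up from the quotes in this thread, and real search engines typically use smoothed variants, so treat it as an illustration only.

```python
import math

# Toy corpus, invented for illustration only.
docs = [
    "the english system is no system",
    "a complex system evolves from a simple system",
    "the will to a system is a lack of integrity",
]


def idf(term: str, documents: list[str]) -> float:
    """Unsmoothed inverse document frequency: log(N / docs containing the term).

    Returns 0.0 for a term that appears nowhere, which is a convention
    chosen here to avoid dividing by zero.
    """
    containing = sum(1 for d in documents if term in d.split())
    return math.log(len(documents) / containing) if containing else 0.0


# A word present in every document carries no weight at all,
# while a word present in only one document scores highest.
print(idf("system", docs))
print(idf("integrity", docs))
```

The word that best describes the quote is also the word every candidate document contains, so it contributes nothing to ranking.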
I know only two philosophy people, and I am going to ask them - and get them to ask their friends. Yeah, I shouldn't let stuff like this bug me, but it just does :)
https://github.com/Microsoft/VisualStudioUninstaller/release...
If the code has turned to spaghetti then how do you manage to change code quickly (due to e.g. Corona rules) so you can follow where the market went and not get competed out of business?
When the company is in startup mode and has no customers, it's easy to just throw more mud on the wall.
But when you have an existing business based on 1M lines of code and you want to keep being in the market when the market changes quickly, then spaghetti code can be death. Being ready means having cleaned up code beforehand so it is easy to change it.
Unless you're saying that you don't have to do refactoring at all in the organisation, but the only way to do that surely is always get it right the first time, which isn't hugely practical. You may sometimes encounter a situation where the quickest way to build a feature is to fix some old ugly code, but that's certainly not the case every time.
I'm saying you can fix problems without dropping everything and redoing work. You're allowed to problem solve and work with people to create a third option. And you can prevent new ones by learning and strategizing.
I'm just using simple analogies for the sake of explanation, but it is nearly always the case that expanding the scope of work to fix previous architectural decisions that were either flawed or no longer relevant will take considerably longer than just fixing the problem at hand.
There may be the odd time, particularly in a large, well defined piece of work, where you can say actually tidying up this other stuff will save time overall. Or perhaps you can batch a bunch of improvements in the same system together into a larger, more thoughtful architectural improvement. All of that is great if you can do it, but it's often not possible.
As far as preventing future architectural issues by learning and strategizing, I feel like that's what we spend our entire career trying to get better at doing ;). But alas I, and everyone else, seem to continue making decisions that don't pan out long term. Even if you did make a perfect decision at the time, often the world/business/third party dependency changes, and what was an excellent decision in the past becomes a pain point a few years later.
It used to be the case that we tried to design infinitely extensible software so future requirements could always be incorporated, but that makes the software unmaintainable. So the pendulum swung to YAGNI and only designing for exactly what was right in front of you, but that leads to major architectural overhauls every few months. True answer is somewhere in the middle, but learning where is something that only seems to come with decades of experience.
Unfortunately older programmers all seem to be forced out of developing and into management or other careers for some reason.
Sure, writing code is the stimulating creative part. And purifying code, making it elegant, polishing the creation to a brilliant sheen is highly rewarding. By contrast, documenting what's been done is tedious, boring, a gigantic pain, a total drag. Yet if it's not adequately documented all that creative effort will inevitably amount to nothing but a reason to hate its creator.
Furthermore, is it really possible to write superb documentation for lousy code? The effort to document the work is a kind of quality check on the work.
Can't say how many times it's come back to bite me where it hurts. Looking at stuff I finished 6 months, or God forbid, 6 years ago, poorly documented programs invariably prompt a shameful recognition that I have no idea what the hell I did. At least when someone else wrote it I can justify feeling indignant at the "mess" I have to figure out. But when it's my own, that's just sad.
Of course the work we leave behind (at the end of a job, retirement, etc.) will usually be resumed by someone. Excellent documentation is our finest legacy. Consider how it supports moving a project forward when the code's author isn't there to "explain" how it works.
And if we're honest with ourselves, writing cogent documentation is hard, often much harder than writing code which is why it's such a brutal task. OTOH doing something hard is a good reason to tell oneself "A big accomplishment! Good for me!" Something to feel good about!
A rule I aim to live by: the workday isn't over until I record what I just did and the reasons for doing it the way it was done.
But, but: it's not actually your universe, and you aren't actually in charge.
In practice, rewriting bad code and adding features "as you see fit" is roughly the same as going rogue. If the bad code works, then whether to rewrite it (and when) comes down to cost/benefit, and some engineer exercising their "liberty" is unlikely to get that right. The features some engineer decides to implement may be features that neither the organisation nor its customers need; but someone will still have to be designated to maintain (or remove) them.
If you want liberty, start your own organisation.
Over the years, I’ve discovered that what I really value is fully grasping a problem domain, deciding what customer problems are worth solving, and then aligning a team on a common vision of appropriate systems architecture to solve those problems. As a result I’ve ended up doing exactly this: I tend to move to organizations where I am at liberty to do this while still having the role of an individual contributor.
A well run series of meetings is like a bubble sort.
Google famously suffers from this: the PM who launches a product gets promoted; the person who adds a feature gets some credit in the employee review; and the person who fixes bugs is judged to have wasted their time.
I suspect Apple has this problem too (they definitely prefer reimplementation rather than evolution in many cases) but their processes are more opaque.
Who would want to be on the cleanup team when the glory goes to the path breakers?
This and the second system syndrome are both organizational issues. Why couldn't an organization simply freeze the customer facing portion of the application (so no UX or added features) and tell a group of developers responsible for the refactoring that they will be judged on a set of achievable metrics, such as decreased infrastructure costs or better performance?
This would fall squarely within common managerial frameworks (it's basically Tuckman's group development model, or what you see at many startups that launch an MVP), except that the initial application development is handled by a different group of 'high performing' developers.
These initiatives have to be top-down priorities in the organization, with agreed upon importance.
2) Beware of Second System Syndrome https://en.wikipedia.org/wiki/Second-system_effect where everyone tries to put in every feature that was missing from the first system, simply because there is no urgency around the second system, because the first system is already running.
It's hard to argue the nitty gritty without examples so here's a real world one from quite a long time ago, in a company that went bust after the death of the owner.
--
We had a system that had a significant quantity of code written in a custom language that would be compiled by an internally written compiler. This compiler was in some ways a work of genius, written in the 80s, but it had a lot of very deep architectural flaws in the optimiser that meant certain patterns of code would generate invalid output. We didn't write much new code in this language but had a pretty large body of code that needed to continue running.
So during a server hardware refresh, we found that almost everything was crashing. Turns out, a compiler optimiser flaw meant that any time a loop had a number of iterations that wasn't a multiple of the number of CPUs, generated programs would segfault.
We investigated what it would take to fix the underlying issue but it would have been a week or more of work just to understand why it was happening. Porting all the old code would have taken even longer.
Instead what we did was, using a pre-existing AST manipulation library we had written, add a prebuild script that hacked all of the files to include a CPU count check then pad out the number of iterations with NOPs. Took a few hours and unblocked the server upgrade.
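To make the padding step concrete, here is a rough sketch of just the arithmetic, in Python rather than the custom language. The function name `padded_iterations` is hypothetical, and this only shows the rounding up to a multiple of the CPU count, not the AST rewriting the prebuild script actually did.

```python
import os


def padded_iterations(n, cpus=None):
    """Round n up to the nearest multiple of the CPU count.

    Hypothetical helper: the extra iterations would be filled with NOPs.
    """
    if cpus is None:
        cpus = os.cpu_count() or 1  # os.cpu_count() may return None
    # Ceiling division without floats, then scale back up.
    return -(-n // cpus) * cpus


# A 10-iteration loop on a 4-CPU box gets 2 no-op iterations appended.
extra = padded_iterations(10, cpus=4) - 10
```

The point is that the original loop body is untouched; only the trip count changes, which is why it could be done mechanically across every file.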
--
Another, perhaps less esoteric and more recent example:
A third party open source library we use had an issue where a particular function call would sometimes get stuck in an infinite loop due to incorrect network code in the library interacting badly with our network hardware.
We submitted a bug report and fix, but the maintainer wouldn't accept a fix unless we also changed a bunch of other related code, added a bunch of tests etc., which we didn't have time to do. We considered a fork but that would involve keeping it up to date, rebuilding packages and so on.
We worked around the issue by running it in a different process and monitoring CPU usage. If CPU usage goes beyond a certain threshold, we kill the process and try again.
Workaround was quick and has been working fine for over a year now. Contributed patch is still languishing in an open PR with various +1s from other users.
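For anyone wondering what that kind of watchdog looks like, a minimal sketch in Python. Assumptions: it uses a wall-clock timeout as a simpler stand-in for the CPU-usage check described above, and `run_with_watchdog`, `timeout_s` and `max_attempts` are hypothetical names, not anything from the library in question.

```python
import subprocess
import sys


def run_with_watchdog(cmd, timeout_s=1.0, max_attempts=3):
    """Run cmd in a child process; kill it and retry if it appears stuck.

    Returns the child's stdout on success, or None if every attempt failed.
    """
    for _ in range(max_attempts):
        try:
            # subprocess.run kills the child itself when the timeout expires.
            result = subprocess.run(
                cmd, capture_output=True, text=True, timeout=timeout_s
            )
        except subprocess.TimeoutExpired:
            continue  # presumed stuck in the loop: try again
        if result.returncode == 0:
            return result.stdout
    return None


# Example: a child that finishes normally well inside the limit.
output = run_with_watchdog(
    [sys.executable, "-c", "print('done')"], timeout_s=10.0
)
```

Isolating the flaky call in its own process is what makes this safe: killing the child cannot leave the parent's state half-updated, so a retry starts clean.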