TDD Doesn't Work (blog.cleancoder.com)
I have never let my teams go full TDD. The reason is that in all my experience, TDD sacrifices a lot of velocity for the sake of automated tests. When i hear about the reduction in total bugs injected, it is a "duh" moment. The fastest way to make a team inject 30% fewer bugs is to have them write 30% less code. That isn't snarky, it's true.
Automated testing is one of the many tools available to software engineers. And it is a valuable one. Unfortunately, TDD is too much of a good thing. It relies so heavily on automated testing that it ventures far into the realm of diminishing returns.
Once, in an argument about TDD, i said it was akin to having someone build a shed. But upon checking in on them, you saw they were using a hammer to smash screws into boards. When you ask them what they are doing, they tell you it is Hammer Drive Construction. It is perhaps overly harsh, but it reinforces the point: tools have a place. Automated tests really shine on mission critical logic that does not get rewritten often. Use it where it makes sense. I wouldn't recommend using it ubiquitously.
Then again, i also recommend having fun coding. So i suppose the actual message here is: do what makes you successful, not what comments or studies say.
Where they fall down is when they're more expensive to build than the code under test and they produce false positives/negatives.
I think you need self-discipline to keep limiting yourself to an ever-evolving subset that gives the optimal ROI. This means removing tests over time that no longer add much value, rewriting others, and so on. Human nature, though, is that test suites keep growing endlessly and become hard to manage, just like any other part of software.
If someone on your team is most successful with TDD, do you still not allow it?
re: writing 30% less code - I've found TDD can reduce my percentage of lines of code, as you suggested. Adherence to the "refactoring" part encourages that you reduce duplication, which in my experience has been easier to do with good test coverage.
I would say that the most successful teams i have been a part of focus not on automated testing but instead on other collective practices: informal code reviews, diff analysis of every commit, group discussion of database changes and collective manual testing of other's code. Many people point to the refactoring (or initial code organization) as a benefit of TDD. I find these other practices tend to inspire a more collective ownership of the system. Additionally, and more importantly, they spur a lot of conversation around how and why to organize code certain ways. These learning opportunities are probably the most valuable among young and growing teams.
Having said that, unit tests in the wild have a tendency to be abominably written, so I'm not surprised a lot of people get frustrated changing the tests.
So if a change to your production code causes large changes to your test code, then one, or the other, or both are poorly designed. You have neglected the design. You have allowed couplings to proliferate.
It's a powerful tool, but I think any belief that sufficient test coverage (in most common cases) actively proves correctness is misguided. In the general case, even full test coverage proves only that you've tested for the conditions you expect - but does nothing to verify the correctness of behavior in conditions you didn't expect.[1]
To me the benefits of TDD are three-fold:
1. It makes you think of what you're building in more detail before you build it.
2. The methodology puts heavy emphasis on short test-code cycles.
3. (Applies to any methodology that emphasizes coverage) You end up with an acceptable-to-great regression suite[2], and anecdotally it seems people do a better job of at least ensuring tests exist when required to by the methodology.
All of these things are equally possible without TDD. Short iterative cycles and additional forethought are perfectly possible without TDD, but they do require more discipline - it is harder to remember to stop after completing a small set of changes without a forcing mechanism.
[1] rust is a possible exception here, still wrapping my head around it.
[2] the value of this regression suite varies greatly from project to project. A hint that tests are of low value / potentially high cost can be seen when you're finding that minor internal changes either break a large number of tests or reduce coverage to a noticeable degree. Particularly in the absence of functional changes.
It's a given that software needs to be tested. The processes around that are a classical "it depends" question.
Most likely any software you deploy will never be "correct" (whatever that means). The quality of that software depends on many variables and it's up to you to try and tweak them while optimizing for things like cost, time etc. Whether it's worthwhile to write the tests ahead of time or after the fact or to do them manually or automatically or any other permutation is just not a question that can be answered in a way that applies to all situations.
which meant that changes to the code might result in a failed mock, but didn't say anything about coverage or correctness. i can't imagine a more useless testing strategy.
is that what mockist TDD is commonly understood to be?
In my programming experience I've found that I prefer to write tests AFTER I do development of a new feature. Oftentimes the implementation is in such flux that continually updating the test as I go along is tedious and kills the creative flow.
However, when it comes to fixing bugs in existing software, I find it more helpful to write a test that duplicates the bug FIRST, then code the solution.
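A minimal sketch of that bug-first workflow, in Python. The `slugify` function and the double-space bug are hypothetical, invented purely for illustration: the test is written while the bug still exists, watched to fail, and only then is the fix made.

```python
import re


def slugify(title: str) -> str:
    """Hypothetical function under repair: turn a title into a URL slug."""
    # The fix: collapse every run of non-alphanumeric characters into one hyphen.
    # (The imagined buggy version replaced each space one-for-one, so a double
    # space produced a double hyphen.)
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def test_double_space_regression():
    # Written FIRST, against the buggy version, to reproduce the bug report.
    # It stays in the suite afterwards as a regression test.
    assert slugify("TDD  Doesn't Work") == "tdd-doesn-t-work"


if __name__ == "__main__":
    test_double_space_regression()
```

The point of watching it fail first is that the test is then known to actually exercise the bug, rather than passing for some unrelated reason.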
If anything, the reason to recommend TDD is simply to enforce writing tests to begin with. It's so easy to get a feature working and gloss over testing it.
EDIT: What's up with liquidise's statement about commenting on TDD stories being bad practice? Do the TDD fanatics downvote to hell everything anti-TDD?
I didn't mean to criticize either stance with the statement. I said that because i find most TDD threads on HN get very heated, with commenters being highly polarized and entrenched in their opinions. I've avoided commenting on them in the past because of this. But i am happy the discussions under this story are a great deal more civil and informative.
I could never understand why the Ruby on Rails tutorial insisted on walking newbies through TDD, and I skipped all the chapters where they start talking about tests when I started learning Rails. I still think it's a bad idea to make newbies do all the weird TDD stuff when they don't even know how to build something.
I'm so opinionated about this that most people around me know this. And in most cases it works without needing to write any tests. And even if something fails, I can quickly patch it. As long as I wrote the app in a nicely modular way, I've not had much problem.
That said, right now I'm working on writing a JS library. And believe it or not, I AM doing TDD right now. I can't believe it myself.
I think in cases where the logic involves a lot of intricate details, it's impossible for me to write something without writing tests. I'm not talking about simple web apps. I'm talking about stuff like: template engine, parser, etc.
My current setup: I write a test and document it before I write a function. That way I don't get carried away while implementing and know exactly what I'm trying to build. Then I write another function that utilizes that function I just wrote, and so forth. This way, when the next function doesn't work for some reason, I know exactly where something went wrong. Instead of going back and debugging every single function used along the way, I know it's the most recent one that's causing the problem.
So my conclusion: you probably don't need to write tests for all your stuff, but there are indeed cases where you will NOT be able to proceed without writing tests.
The thing is, most large companies have a QA team so this fear is not super tangible to many developers. And small startups are more focused on building stuff quickly (which they should be).
I think this is why this topic has been polarizing. Some people feel the need and some people don't, depending on which role you're playing in your organization.
Nowadays interestingly, even the large companies are moving towards more testing because they can cut QA costs that way.
To clarify the linked study is attempting to replicate https://dl.acm.org/citation.cfm?id=1070834, THE seminal study in Test Driven Development. Well to be more precise it was replicating an existing replication of that study which failed to replicate the original results. They were trying to modify the design so as to account for issues in the experimental design that may have led to the replicated study being inconclusive.
This is significant because if you were not aware of the failed replication, and believed that TDD was supported scientifically as more productive because of that original study, then you SHOULD be reconsidering its place in your development process. If that isn't the case, your opinion should be unchanged by these particular results (even in the article inspiring this one the author admits that their opinion was already based on a much more thorough analysis, see: http://neverworkintheory.org/2016/10/05/test-driven-developm...).
Now what I want to know is why people insist on writing articles in this awful conversation format. It wastes a lot of words to make a simple argument poorly.
I guess if all code written could be seen as an API, TDD would be great, but that's not the world I live in.
Design remains design, there is no quick implementation trick that makes it simple.
That part about it being good to work in small chunks isn't there.
It is possible to advocate for the former, while thinking the latter is consultantware snake oil. (my position, fwiw)
I mean, sometimes I like writing an acceptance test to begin with and work inwards.
People sometimes use "TDD" and "testing" interchangeably. Also, just because you do TDD doesn't mean your code will be great. I've seen people assert pointless things, and the unit has gotten so small that people now define units as methods in classes, which I also think can lead to a test suite that is crazy to maintain.
If you're interested, this was a fun discussion about TDD between two professionals: https://www.youtube.com/watch?v=KtHQGs3zFAM Jim Coplien and Uncle Bob.
A developer who is writing unit tests must have a good idea of the purpose of the target of the tests, so she is thinking about requirements. Furthermore, if she is writing unit tests for small components (which will often be the case on account of everything being done in short cycles) then a lot of that purpose is contingent on other aspects of the design and how it is all supposed to work together: in other words, she is thinking about design.
If you don't spend some time thinking ahead about big-picture requirements and design issues, you are in danger of going a long way down a dead end.
You CAN indeed cover all your branches with tests afterwards. You can even give that a fancier name, like "Exploratory Testing". Of course it may be more boring or tedious, but is a perfectly valid way to ensure coverage when needed.
TDD was great for popularizing writing test first; However I much prefer the methodology called CABWT - Cover All Branches With Tests. Let the devs choose the way to do it, because not everyone likes these pretend games.
TDD workflow is fine; it's not thinking about the pink elephant (the source code) idea that bugs me.
The author thinks that TDD is preferable because it helps you maintain discipline.
I personally think it's worthwhile besides that because it means you design the API before implementing, meaning it is cheaper to fix API design mistakes. IIRC this aspect wasn't actually tested in the studies (API signatures were given up front).
Might be better than nothing, but if you are designing a reusable API, you'll be better using it on some real code.
It's a bug if someone needs to change code and they, at any moment, see code they don't understand. Stumbled into the wrong place? Bug filed for better notes on organization. The code you need to touch not understood? Understand what you see before you make a single change. If you change code and don't update docs, or documentation and code out of sync? It's a bug, and changing one to match the other _without detailed understanding_ is a bug too!
Now, that seems reasonable. And if a study comes out and says people can't make program changes faster, on average, when participants are given identical code but with more (accurate and non-trivial) comments, that doesn't mean UDD doesn't work. The study doesn't test it on real, full-size applications. The code was the same, even though clarity of code is one of the goals of UDD -- one of the core claims is that UDD gets you better code to begin with. The study focuses on a tiny test of something not necessarily core to the UDD mindset.
But it's evidence that at least one claim I've made is false. In fact, that study would be enough for me to throw that idea set into the garbage.
Balance is key.
After I know the organization of the source, I write out each functional unit of the code one at a time. As I go, I write each bit of test code for my source. After this I integrate every function unit.
If a change is needed, I go back to the drawing board and find a better overall organization. This happens often due to either performance constraints or the need to abstract a section further.
After this I'd consider embedding a unit test suite.
Works great for small to medium projects.
In practice there are 'test-heavy' devs who use factory data and the test suite to run skeleton code with crashpoints, and switch actively between test and imp files.
This has tests & implementation being written in parallel vs strict TDD which has us finishing tests before writing program logic.
Most test suites depend not just on functional requirements but also on implementation details, so it seems obvious that tests-before-logic development is inefficient.
I don't really know what to think of the situation? Is this how it has always been? Do most software engineers really have no idea what they're doing?
Please read the article before commenting.
It's not totally clear from reading some of the comments that people have actually read the article.
It's a good one, please do.
Did you notice the bias in that question?
Test first isolates that a given test doesn't already pass (without any additional code).
Test after (but before committing) also seems to require a more thorough critical analysis.
And then someone finally fuzzes the code.
Or you may be using a different definition of "mostly bugless" than the rest of us.
I do gamedev. The ability to patch post-release is not a given, even today, for all platforms. Crashes, corruption, progress blockers, etc. are all VERY BAD in this environment.
I see below you're writing network code in C. I don't suppose you've done any fuzz testing? Run with address sanitizer? Static analysis? We live in a world of exploitable 1-byte buffer overflows. Maybe not such a big deal for a throwaway blog server, but perhaps a bit scarier if you might be facing HIPAA fines, or running industrial equipment.
A very important note here: Mostly bugless as far as you're aware and mostly bugless in actuality are two very different things. Without testing, I'm not sure how you can have any confidence that you're in the latter camp.
For example, I'm currently writing a TCP/IP stack for embedded systems [1]. While it's not quite complete yet (it is missing some essential code like fragmentation and congestion control), I'm very confident that it has (and will have when complete) far fewer bugs than the related portions of lwIP; see for yourself all the bugs I've found in lwIP [2].
Again feel free to find bugs in my code. I very much appreciate people pointing out bugs, as it helps me make even fewer bugs :)
> We live in a world of exploitable 1-byte buffer overflows.
Indeed. But buffer overflows are so easy to avoid, just don't write over the end of the buffer. I doubt I've done a buffer overflow in years. The bugs that I do make, are much more complex.
[1] https://github.com/ambrop72/aprinter/tree/ipstack/aprinter/i...
[2] https://savannah.nongnu.org/bugs/index.php?go_report=Apply&g...
You always have to be removing and refactoring tests if you are changing your production code. Changed a sorting algorithm? Good, go and have a look at your tests to see whether there's anything that doesn't need to be tested anymore, or edge cases that need to be tested now that the algorithm has changed.
Red -> Green -> Refactor
How do you keep people from doing TDD though? How do you even know they are doing it?
In what's often termed the "classic" approach, you instead lean toward writing more coarse-grained tests, and you don't shy away from integration tests. You don't avoid mocks, but you tend to prefer saving them for situations where it really is hard to force a collaborator to behave in a certain way. (I also try to stay on guard for the possibility that those situations are code smells indicating that your implementation is getting to be too complex and is due for a refactor.)
IMO, the main argument against classical TDD is that you tend to get a suite of tests that runs more slowly and has more dependencies on external resources such as the database.
IMO, the main arguments against mockist TDD are that you end up with test-induced design damage, and a brittle suite of tests that makes your codebase resistant to refactoring.
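To make the contrast concrete, here is a rough Python sketch of the same behavior tested both ways. `register` and `InMemoryUserRepo` are hypothetical names made up for this example; only `unittest.mock.Mock` and its `assert_called_once_with` method are real library API.

```python
from unittest.mock import Mock


class InMemoryUserRepo:
    """Hypothetical in-memory collaborator used by the classic-style test."""

    def __init__(self):
        self.users = {}

    def save(self, email, name):
        self.users[email] = name

    def find(self, email):
        return self.users.get(email)


def register(repo, email, name):
    """Hypothetical code under test: normalise the email, then store the user."""
    repo.save(email.strip().lower(), name)


def test_register_mockist():
    # Mockist style: assert on the interaction with the collaborator.
    repo = Mock()
    register(repo, " Ann@Example.com ", "Ann")
    repo.save.assert_called_once_with("ann@example.com", "Ann")


def test_register_classic():
    # Classic style: use a working collaborator and assert on resulting state.
    repo = InMemoryUserRepo()
    register(repo, " Ann@Example.com ", "Ann")
    assert repo.find("ann@example.com") == "Ann"


if __name__ == "__main__":
    test_register_mockist()
    test_register_classic()
```

The mockist test is pinned to the exact call `save(email, name)`, so an internal change to how `register` talks to the repository can break it even when the observable behavior is unchanged. The classic test only cares that the user can be found afterwards, at the cost of needing a usable collaborator.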
Interoperability and interchangeability of parts mean that it's possible to validate an implementation to at least some degree.
The best example that I can think of off the top of my head is the OpenGL 4.4/4.5 work that is nearing conformance for the Mesa3D project ( https://en.wikipedia.org/wiki/Mesa_(computer_graphics) ); while the functional coverage for the main modern drivers is 'complete' the official conformance testing has already resulted in some bug fixes and additional areas to focus on improving.
That real life case study is yet another example of how an API and conformance tests built around that API result in better code and in a more consistent experience that isn't dependent upon a mono-culture implementation.
If not an "Application Programming Interface", isn't all code an Interface? There's input and there's output.
With Object Oriented programming, that there is an interface is more explicit (even if all you're doing is implementing objects that are already tested). There are function call argument (type) specifications (interfaces) whether it's functional or OO.
Yes. What you need to do to ensure the correctness of flight software for a jet fighter is entirely different than what you need to do to ensure the correctness of an internal tool at your company that automates a task for which you already have a manual process.
This is generally why I take the top down approach to coding:
1) Write ultra high level test
2) Write code that implements it.
3) In that code if I need something lower level, write the API that I want
4) Write a lower level test that mirrors what I just wrote in real code.
5) Implement the code that passes that test & use it at the higher level.
etc.
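The steps above might look something like this in Python. This is only a sketch of the idea: `report` and `tokenize` are hypothetical names, and the word-count task is an invented stand-in for whatever the real feature is.

```python
def tokenize(text):
    """Steps 3 and 5: the lower-level helper whose API report() wished for."""
    words = (w.strip(".,!?").lower() for w in text.split())
    return [w for w in words if w]


def report(text):
    """Step 2: the high-level code, written against the tokenize() API it wants."""
    counts = {}
    for word in tokenize(text):
        counts[word] = counts.get(word, 0) + 1
    return counts


def test_report():
    # Step 1: the ultra-high-level test, written before any implementation.
    assert report("Red, green. Red!") == {"red": 2, "green": 1}


def test_tokenize():
    # Step 4: a lower-level test mirroring how report() actually calls tokenize().
    assert tokenize("Red, green.") == ["red", "green"]


if __name__ == "__main__":
    test_report()
    test_tokenize()
```

Because the lower-level test is derived from real usage in `report`, the helper's API is shaped by a caller that already exists rather than by guesswork.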
Agreed, this is why tests should be driven from the outside in.
But at the unit level, tests are often the first contact with production code. Tests are dumb and setting up complicated stateful worlds is painful and tedious.
It becomes easier to simplify the production code to simplify the test code, than to just write the first thing that comes to mind.
Small startups that focus on quickly building stuff have to decide whether to take on more technical debt in order to get something small out the door quickly, or settle on a maintainable velocity over time. Some begin at the former and move quickly to the latter once their MVP is out the door. Some never get to move because they sink under the weight.
Not sure which companies you are referring to here. Google and Amazon do not have QA teams for most of their dev teams.
I worked at Amazon and now work at Google.
Just because Google and Amazon don't have QA teams doesn't mean the occupation doesn't exist.
Sometimes correctness isn't that valuable compared to other criteria as faults can be quickly corrected and have minimal impact. But I think you need clarity about the tradeoffs you make.
EDIT: in some situations things can be corrected quickly.
Another limitation is that because the tests are (usually?) written by humans, the tests could also be wrong.
So you think your application works, but it turns out that both your application code and your test code were wrong and you didn't catch a bug at all.
When you write tests for what you're about to code, or what you just finished coding, it's a challenge to write a test that is not flawed in the same way the code is - because you don't know the flaw is there in order to test for it.
By thinking things out first you can decrease the number of these (and enforcing that discipline is a major plus for TDD ) but for typical non-trivial application without a lot of control over its inputs, it's close to impossible to do.
Tests are often reactive as well - because the changes that require them are reactive (bugs, new reqs, arch changes, etc). That doesn't detract from the value of them, but it's a limitation that explains well why even 100% coverage never stops the bugs from showing up.
I think I'd be happier with TDD if fewer people presented it as if it solved all the problems. TDD is a powerful tool, but tools are only as omniscient as the people who use them.
Has the benefit of proving correctness of your assumptions, which makes it easier to debug once you insert it in system and things inevitably are not 100% right. It gives you a way to reason about what your code does, what might be different, and then allows you to revise your assumptions and get your new solution in place and tested without the often long wait times to do manual testing on deeply embedded hardware.
Sometimes traditional TDD is the answer, sometimes simulations are the answer, and sometimes you need to just get out of your chair and test it out. It is a tool!
https://www.quora.com/Why-does-Kent-Beck-refer-to-the-redisc...
The original description of TDD was in an ancient book about programming. It said you take the input tape, manually type in the output tape you expect, then program until the actual output tape matches the expected output.
https://en.wikipedia.org/wiki/Scientific_method
https://en.wikipedia.org/wiki/Hypothesis
Test first isolates out a null hypothesis (that the test already passed); but not that it passes/fails because of some other chance variation (e.g. hash randomization and unordered maps).
TDD requires you to draw your target first, then hit or miss it with the code, like in science: hypotheses -> confirmation/rejection via experiments -> working theory.
But in practice, a lot of coders hit a point first, then draw the target around that point, like in fake science: we toss a coin 100 times, the distribution is 60/40, our hypothesis is that a random coin flip has a 60 to 40 ratio, our hypothesis is confirmed by experiment, huge savings, hooray!
You use the two biggest edge cases as examples (Amazon and Google). Most other "less techy" companies don't have the luxury of not running QA. Good for you that Google and Amazon don't have QA, but those are the exception, not the rule. Just go to glassdoor and search for QA and you'll see tons of QA job positions at large companies.
Since you mention Amazon, for example, WalMart has QA engineer positions.
I do not disagree with your examples, but they are orthogonal to my point #2 #3. Let's go back to your original comment:
""" The thing is, most large companies have a QA team so this fear is not super tangible to many developers. And small startups are more focused on building stuff quickly (which they should be). """
The comment was responding to: """ Once you release a product that will be used by many customers and developed by many people throughout its lifecycle, which come and go as the time passes, you won't be able to maintain/extend it without a proper testing suite. It's not only about complexity, but also about maintainability. Some tests will also rot in time. """
My understanding is that you meant to say that the fear that a lack of testing harms maintainability is not so relevant to developers in big companies, because they have a dedicated QA team (to write tests for them).
My comment says that, in general, big companies do not have dedicated QA teams for dev teams, so there is no dedicated QA team to write tests for the dev team. My examples are 2 of the largest software companies in the world. Among them, Google redefined how people access information, and Amazon reinvented how developers access computing resources.
I think my examples support my intention to prove that your statement and what it implied, in general, is not true.
If I don't use it because it doesn't pass my peer review process, whatever that might be, then my competitor will use it and come up with a faster/cheaper solution than mine.
- Mostly raise issues about nonfunctional aspects of the code. Often these are about the author knowingly violating coding guidelines, which they found stupid (at least in specific cases). Guidelines may conflict with the techniques that the author uses to achieve bugless code.
- Improve the skill of the reviewer much more than the author, as the reviewer observes high quality code and novel techniques.
> I found bugs
Please do let me know about bugs!
What if the computer encoded your knowledge?
> Don't call it done until you haven't proven to yourself that the code has no bugs.
What happens when the software changes? Do you repeat every single desk-checking exercise to ensure nothing has broken?
Do you even remember every click, every experimental input?
Can you prove that you do?
> This is an informal process of self-code-review but which involves quite rigorous thinking about the behavior of the code.
I trust that smarter developers than me are smarter developers than me.
But I am dumb. I assume that the code is smart and that my mental simulation of the code, which my brain helpfully and invisibly patches on the fly, is correct.
But my mental simulation is frequently wrong. So I wrap myself in explicit statements of what I think the code does. Then I make those explicit statements executable. And then I run them frequently.
And frequently, I realise again that I am dumb and I should leave the flawless coding to others.
> Do you even remember every click, every experimental input?
> Can you prove that you do?
I agree some tests are a good idea depending on the project. Doesn't mean I have to like writing them!
> I assume that the code is smart and that my mental simulation of the code, which my brain helpfully and invisibly patches on the fly, is correct.
I try not to assume things until I've constructed associated proofs in my mind (and sometimes written them into comments). In fact, keeping in mind what you've established (proven) and what you haven't is a very important thing. Most of the bugs I've made happened because I simply forgot to think about / prove something.
It's a completely different way of programming!
> Then I make those explicit statements executable. And then I run them frequently.
But I prefer to write down these explicit statements in the code itself, often as assertions. I can then prove them right on the spot!
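For readers unfamiliar with the style, here is one way "explicit statements in the code itself" can look, sketched in Python rather than the C being discussed. The binary search is just an illustrative stand-in, not code from the project mentioned above.

```python
def binary_search(items, target):
    """Return an index of target in the sorted list items, or -1 if absent."""
    # Precondition, stated where it is relied on (an O(n) check, for illustration).
    assert all(a <= b for a, b in zip(items, items[1:])), "items must be sorted"
    lo, hi = 0, len(items)
    while lo < hi:
        # Loop invariant: if target is present, its index lies in [lo, hi).
        assert 0 <= lo < hi <= len(items)
        mid = (lo + hi) // 2
        if items[mid] < target:
            lo = mid + 1
        elif items[mid] > target:
            hi = mid
        else:
            return mid
    return -1
```

Each assertion is a claim the author has reasoned through at that exact spot. Unlike a test, it is checked on every real call, but only for the inputs that actually occur, and Python skips `assert` statements entirely when run with `-O`.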
That's the most sophomore thing I have read in a long time!!!
Probably works OK if you are working alone on a simple product whose whole code mostly fits inside a single brain. Try doing that as part of a team of dozens that makes daily changes to a code base of millions of lines and you will very soon earn the title of the most infamous person in the office.
You win twice, or more.
You get proof of the assertions you're making to your peers, you get regression tests to cover your code when it's being refactored, you also get to strongly document the intent of your code so that others can know it deeply, relatively quickly.
See devs like yourself who claim they aren't writing tests, but in essence they are, the only difference is they're not persisting their tests, and losing their value beyond initial validation of the correctness of the code.
If you ask me to review a piece of your code written like that, I will immediately ask you to write test cases, because the team will not be able to improve the code base without test coverage, and to add meaningful documentation, so that we spend our time on work instead of solving mental puzzles. In a team, your freedom ends where the freedom of other members begins.
If you think you are an extremely good developer, then finish your projects, with your team, in a fraction of the time other teams take; otherwise you are an extremely good programmer, but not an extremely good developer.
> Please do let me know about bugs!
There are no obvious bugs in the C code. There are a few tiny problems in the bash scripts (e.g. error messages are not printed to STDERR and are not helpful). There are a lot of problems with style and documentation, which will affect team velocity and maintainability. Peer review is not a bug review. I just need to assure myself that new code will not create problems for us when committed, and the easiest way to do that in less than 10 minutes is to look at the test cases, so write test cases and documentation first. When they are OK, I may say that your code is OK. ;-)
Again I didn't say tests are a bad idea. In industry projects they are usually a very good idea. My point was that people should be able to write working bug-less code without using tests. When a test catches a real bug, it is to be understood as a failure of the developer(s).
Moreover, test cases save a massive amount of development time, so why waste time on manual tests and debugging sessions when I can have the computer do that for me? With regular practice, test cases are written in tens of seconds. For example, if I am working on a calculator, I will write something like assertEquals(calculate("2+2"), 4, "Result of arithmetic addition is wrong."); Then I will work on the calculator until this test passes. Only then may I start to write unit tests. It's easy and fast, so why not?
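A runnable Python take on that calculator example, as a sketch only. The `calculate` implementation here is an assumption (the comment gives just the test), and it leans on the standard `ast` module to parse the expression instead of a hand-written parser.

```python
import ast
import operator

# Supported binary operators; anything else is rejected.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def calculate(expression):
    """Hypothetical calculator: evaluate +, -, *, / over numeric literals."""

    def evaluate(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](evaluate(node.left), evaluate(node.right))
        raise ValueError("unsupported expression")

    return evaluate(ast.parse(expression, mode="eval").body)


def test_addition():
    # The acceptance test from the comment, written before calculate() existed.
    assert calculate("2+2") == 4, "Result of arithmetic addition is wrong."


if __name__ == "__main__":
    test_addition()
```

The test took one line to write; everything above it is what gets worked on "until this test passes".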
Actually, I do! My work code does have more comments and tests (for my hobby projects, I do less of that and instead write other code, getting more done faster).
> There are no obvious bugs in the C code.
Indeed you still haven't pointed out a single bug, so you have not yet invalidated my assertion that I can write (mostly) bugless code :)