How I review code (engineering.tumblr.com)

234 points by peterstensmyr 8 years ago | 138 comments

taylodl 8 years ago |

"Senior engineers sometimes need to be reminded that highly performant, abstract, or clever code is often difficult to read and understand later, which usually means asking them for more inline comments and documentation."

Ha! That's not a senior engineer. Senior engineers write the most simple-looking code that just works. In every rainy day scenario imaginable. The clever code writers aren't there yet.

ot 8 years ago | |

Reminds me of the "Evolution of a Haskell programmer":

https://www.willamette.edu/~fruehr/haskell/evolution.html

Make sure you don't miss the punchline, "Tenured professor".

It's the same with the progression of engineering seniority: increasing levels of cleverness and unnecessary sophistication, until you reach a point where you don't have anything to prove anymore, and you can feel comfortable writing the simplest and most elegant solution.

52-6F-62 8 years ago | | |

That was great— though not being versed in Haskell, I tried to follow their link to the original version, but it had died.

Here's the original:

http://www.ariel.com.au/jokes/The_Evolution_of_a_Programmer....

The punchline got me pretty good.

jrs95 8 years ago | | |

It's kind of amusing to me that a lot of engineering interviews also seem to be focused on doing things that you generally won't and shouldn't be doing for the position. What's the point of seeing if someone knows data structures and algorithms they won't be using if they can't write a web application that interfaces with a SQL database without doing queries in a for loop?
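To make the "queries in a for loop" point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The users/orders schema, the sample rows and the function names are all made up for illustration; none of them come from the article or this thread.

```python
import sqlite3


def make_db():
    """Build a tiny in-memory database (hypothetical schema, illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL);
        INSERT INTO users VALUES (1, 'ann'), (2, 'bob');
        INSERT INTO orders VALUES (1, 1, 10.0), (2, 1, 5.0), (3, 2, 7.5);
    """)
    return conn


def totals_query_per_user(conn):
    """The pattern being complained about: one extra query per user."""
    totals = {}
    users = conn.execute("SELECT id, name FROM users").fetchall()
    for user_id, name in users:
        row = conn.execute(
            "SELECT COALESCE(SUM(total), 0) FROM orders WHERE user_id = ?",
            (user_id,),
        ).fetchone()
        totals[name] = row[0]
    return totals


def totals_single_query(conn):
    """Same result in one round trip: let the database do the join."""
    rows = conn.execute("""
        SELECT u.name, COALESCE(SUM(o.total), 0)
        FROM users u LEFT JOIN orders o ON o.user_id = u.id
        GROUP BY u.id, u.name
    """)
    return dict(rows)
```

Both functions return the same mapping; the difference is that the first issues one query per user, which is what hurts once the table has more than a handful of rows.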

madsohm 8 years ago | | |

I've been doing a lot of Haskell on Codewars.com recently. This is exactly what I see. I write up a long solution that uses the basics like pattern matching, heads of lists and such. The solutions with the most "Best practice" up-votes usually involve importing Control.Monad and other similar stuff.

scalesolved 8 years ago | |

Ha it does make me think of this tweet https://twitter.com/KevlinHenney/status/381021802941906944

You nailed it, really: senior engineers' code is the simplest-looking, as generally they've picked the right abstraction for the problem.

weego 8 years ago | | |

Is this a "senior engineers are literally superheroes" meme I've missed?

Everyone is capable of making poor decisions and straight-up logic errors. Senior devs sometimes more so, because we tend to get entrenched in a particular issue solo for longer periods.

pqh 8 years ago | | |

Text of the tweet so people don't have to click:

A common fallacy is to assume authors of incomprehensible code will somehow be able to express themselves lucidly and clearly in comments.

dboreham 8 years ago | | |

Senior engineers are probably in meetings, not writing code.

mpweiher 8 years ago | | |

There is a conundrum here: architectural mismatch.

Often, the "right abstraction" for a problem is not call/return based. So you get to choose between having the right abstraction, and code that is simple when viewed with that abstraction in mind, but "weird". Or alternatively choose the wrong but better-supported abstraction and have code that is needlessly complex but "straightforward".

erroneousfunk 8 years ago | | |

> they've picked the right abstraction for the problem

I hate to agree with you, but "picking the right abstraction for the problem" elegantly expresses what I've been trying to tell people for the last few years now, but far less succinctly (I usually ramble on about "underlying data models" and "what this actually is in the real world, not just how we view it in our application")

The reason I hate to agree with you is that I just became very disappointed that this is a difficult skill to master. "Just use the right model and the code is easy, duh. Why aren't you doing this?" Well. Now I just feel like a jerk. I thought they just gave me the "senior" title because I was old.

It's unfortunate that one of the most important skills in the industry is so intangible and difficult to quantify. Even more difficult to teach.

cousin_it 8 years ago | |

Ha! That's good engineers. Senior engineers write an ambitious unusable framework, get promoted and move on to the next project. So the article's advice is ironically spot on.

hinkley 8 years ago | | |

Tell me more about getting them to move on. I’m asking for a friend.

hinkley 8 years ago | |

We call people senior after five years. Mastery can take a lot longer than that (and mastery doesn’t mean the end of learning).

Sooner or later someone will have to debug this code late at night. Don’t make it require brain cells.

bluntfang 8 years ago | |

Is this a No True Scotsman?

jrs95 8 years ago | |

That might be true in theory, but in practice most people I know with that title (& most with titles above it) do this sort of thing more often than junior developers.

yeukhon 8 years ago |

I echo the author's point in "Review the code with its author in mind".

Without comments, sometimes it's really really difficult to navigate the code. I have been adding more comments than ever: don't assume every line is obvious, write a comment to explain what the next few lines really do.

    def largest_square_plot(width, height):
        # base case: stop dividing when we find the largest square.
        if width == height:
            return width, height
        else:
            # otherwise, we know that we can break the land into several
            # pieces, some are already square-sized, but there must be
            # left over, which is the difference of the width and height,
            # and can be divided again.

            remain = abs(width - height)
            return largest_square_plot(remain, min(width, height))

^ This code is only for me to read so I didn't really care much about grammar... but a year from now I shouldn't have trouble understanding the code in a minute or two.

Some of my functions/methods have pretty long docstrings, which may include an explanation of the rationale, and perhaps even some ASCII diagrams. If you have trouble understanding a piece of code after a few passes, that's bad code. Also, use more newlines...

> I check every Github email I get; I make sure that I don’t get notified for everything that happens in the repo

Not sure about others, but I am tired of reading PR notifications in my mailbox. I don't know how kernel developers can live with this.

I have been thinking about just building a bot that:

* receives PUSH from GitHub

* adds events to a queue

* notifies based on priority

* pings me once in a while to remind me that I have outstanding PRs to review (as reviewer and as author of the patch).

If someone needs me to review right away, he/she can reach out to me directly in chat.
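The bot described above could start out as something like the sketch below. It is only a sketch under assumptions: the event dictionaries, the priority numbers and the notify callback are invented here, and it does not talk to GitHub at all (a real version would sit behind a webhook endpoint and send the reminders on a timer).

```python
import heapq
import itertools


class ReviewQueue:
    """Toy in-memory queue of review events; lowest priority number goes first."""

    def __init__(self):
        self._heap = []
        # Tie-breaker so equal priorities come out in insertion order
        # and the event payloads themselves are never compared.
        self._counter = itertools.count()

    def receive(self, event, priority):
        """Add an incoming event (e.g. a push notification) to the queue."""
        heapq.heappush(self._heap, (priority, next(self._counter), event))

    def notify_next(self, notify):
        """Pop the most urgent event and hand it to `notify`; None if empty."""
        if not self._heap:
            return None
        _, _, event = heapq.heappop(self._heap)
        notify(event)
        return event

    def outstanding(self):
        """Events still waiting, most urgent first (for the periodic reminder)."""
        return [event for _, _, event in sorted(self._heap)]


# Hypothetical usage: collect notifications in a list instead of pinging chat.
queue = ReviewQueue()
queue.receive({"pr": "docs typo"}, priority=5)
queue.receive({"pr": "hotfix"}, priority=1)
sent = []
queue.notify_next(sent.append)
```

After this runs, `sent` holds the hotfix event and `queue.outstanding()` still lists the docs typo, which is what the "pings me once in a while" step would report.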

anameaname 8 years ago |

A conundrum for me is how to get other people to code review the way I want to be code reviewed. In particular, I noticed code reviewers on my team are pretty pedantic, obsessed with correctness, and need to have it explained why each change is okay. These are people who regularly write good-quality code themselves, but there is a high level of distrust. Why doesn't a team of talented programmers trust each other?

(in case it needs to be stated, to date, I have reviewed about as much code as I have written, upwards of 100k lines, as have most of the other people on my team. We aren't amateurs, but it often seems like we're babies.)

pmcollins 8 years ago |

> We have repositories for the PHP backend, our database schemas, our iOS (Swift/Obj-C) and Android (Java/Kotlin) mobile apps, infrastructure projects written in Go, C/C++, Lua, Ruby, Perl, and many other projects written in Scala, Node.js, Python, and more

Why do organizations allow this? I realize that some platforms require their own languages (iOS, Android), but outside of that, just pick one or two and hold the line.

skate22 8 years ago |

My code is well commented. //eslint-disable-line all over the place.

lsadam0 8 years ago | |

Invoking eslint-disable needs to have a comment of its own justifying why the line in question is being skipped. What use is the linter if we just disable it every time it complains?

skate22 8 years ago | | |

I don't have permission to remove lint rules, but some of them are absurd.

I refuse to remove 'extraneous' parentheses that make the code more readable to junior devs who may not know the language-specific order of evaluation in a logical expression.

nimbix 8 years ago |

For me the most important part of reviewing any nontrivial changes is to actually check out the branch and test every change I see. This keeps a lot of issues from reaching the QA team and catches issues they could have missed, since they don't actually go through the code.

ryanianian 8 years ago | |

This is an ideal but is hardly scalable if you're doing 3-4+ PRs per day and are expected to do your own coding as well (plus attend bureaucracy).

You can effectively do this by checking that every nontrivial change has sufficient automated test coverage. This saves you from having to test changes yourself and saves future devs from having to go through your thought-process when they touch that code next.

bwest87 8 years ago |

Something the author doesn't bring up, but that we started doing about 9 months ago at my company, is synchronous reviews, meaning the committer is on the phone or in person with the reviewer. It's great. We don't do it for all PRs, but we do for anything medium-sized or above, or even small ones if they involve critical logic. The way we usually do it is the committer walks through the changes with the reviewer. Often the committer will realize their own ways of improving the code. And with the added context, the reviewer can often provide better feedback. Plus there's the X factor of just two people talking who come up with ideas, improvements, etc. And half our team is remote, so this wasn't a natural outgrowth. We have to make it happen, but I think it's worth it.

uremog 8 years ago | |

That sounds very similar to Rubber Duck Debugging.

soneca 8 years ago |

I am a junior developer and my latest feedback was that one of the main skills I should develop is to make better, more thought-out, more critical and deeper code reviews (including of PRs from more senior developers).

Any tips on how to improve this?

Would a checklist help? Have a clear process on what to review first?

wiredfool 8 years ago | |

A checklist can help. (he says, and then does a mental one because there isn't one nearby).

What I look for is:

1) What's the problem being solved? Does this look like a reasonable approach? Is the code pythonic (Obv: for python)?

2) What edge cases are there? Does this handle the important ones? Does it punt properly on the less important ones?

3) Look for a short list of bug classes that have come up in the project before and that have led to emergency patches. E.g. Decrefing, checking mallocs, any exec sorts of things. (This is a clear application for a checklist)

4) Are there tests/documentation/other required fixtures and stuff?

5) Does the code generally match the style of the project?

1000) Code formatting and whitespace and line wrapping and all that bikeshedding stuff.

Feel free to short circuit anywhere once it becomes clear that there's more work required.

kripke 8 years ago | | |

1000 should really be handled by automated tools. That takes a useless burden off the reviewer, and it's emotionally easier for both sides too.

bcbrown 8 years ago | |

Ask for examples of good reviews to emulate. Look at the PR without comments first, and give your own review. Then compare with the original review, and see what areas you emphasized more, and emphasized less than the original review. Talk with the original reviewer, and ask about mindset behind why they asked for the changes they did.

Also, read a lot of code reviews. Just like reading a lot of code is helpful for becoming a better developer, reading a lot of reviews is helpful for becoming a better reviewer.

driusan 8 years ago | |

Unless your current reviews are really superficial, probably not. You can only review at your level of understanding, so if they're asking for more depth in your code reviews you probably need to develop a deeper understanding of the architecture and design of your codebase. A checklist would do the opposite of that in most cases.

chriswarbo 8 years ago |

> I look for code that is well-documented (both inline and externally), and code that is clear rather than clever. I’d rather read ten lines of verbose-but-understandable code than someone’s ninja-tastic one-liner that involves four nested ternaries.

"Clear" and "clever" aren't in opposition, and likewise "verbose" and "understandable" aren't correlated.

I think this characterisation, and especially the example, shows a lowest-common-denominator straw man of "clever one-liners" which seems to miss the reason that some people like them. In particular, it seems to be bikeshedding about how to write branches. The author doesn't say what those "ten lines of verbose-but-understandable code" would be, but given the context I took it to mean "exactly the same solution, but written with intermediate variables or if/else blocks instead".

This seems like an analogous situation to https://wiki.haskell.org/Wadler's_Law where little thought is given to what the code means, more thought is given to how that meaning is encoded (e.g. ternaries vs branches) and religious crusades are dedicated to how those encodings are written down (tabs vs spaces, braces on same/new lines, etc.).

Note that even in this simple example there lurks a slightly more important issue which the author could have mentioned instead: nested ternaries involve boolean expressions; every boolean expression can be rewritten in a number of ways; some of those expressions are more clear and meaningful to a human than others. For example, `loggedIn && !isAdmin` seems pretty clear to me; playing around with truth tables, I found that `!(loggedIn -> isAdmin)` is apparently equivalent, but it seems rather cryptic to me. This is more obvious if intermediate variables are used, since they're easier to name if they're meaningful.
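The equivalence claimed above is easy to check mechanically instead of by hand. A small sketch in Python (the helper names `implies` and `equivalent` are mine, and Python has no `->` operator, so implication is spelled out as a function):

```python
from itertools import product


def implies(p, q):
    """Material implication: p -> q is false only when p is true and q is false."""
    return (not p) or q


def equivalent(f, g, arity):
    """True if f and g agree on every combination of boolean inputs."""
    return all(
        f(*args) == g(*args)
        for args in product([False, True], repeat=arity)
    )


def clear(logged_in, is_admin):
    # The readable form: logged in, but not an admin.
    return logged_in and not is_admin


def cryptic(logged_in, is_admin):
    # The truth-table-derived form: not (logged_in -> is_admin).
    return not implies(logged_in, is_admin)
```

`equivalent(clear, cryptic, 2)` walks all four rows of the truth table and confirms the two forms agree, which says nothing about which one a reviewer would rather read.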

In any case, compressing code by encoding the same thing with different symbols doesn't make something "clever". It's a purely mechanical process which doesn't involve any insights into the domain.

To me, code is "clever" if it works by exploiting some non-obvious structure/pattern in the domain or system. For example, code which calculates a particular index/offset in a non-obvious way, based on knowledge about invariants in the data model. Another example would be using a language construct in a way which is unusual to a human, but has the perfect semantics for the desired behaviour (e.g. Duff's device, exceptions for control flow, etc.).

Such "clever" code is often more terse than a "straightforward" alternative, but that's a side-effect of the "cleverness" (finding an existing thing which behaves like the thing we want) rather than the goal.

If the alternative to some "clever" code is "10 lines of verbose but understandable code" then it's probably not that clever; so it's probably a safe bet to go with the latter. The real issues with clever code are:

- Whether the pattern it relies on is robust or subject to change. Would it end up coupling components together, or complicate implementation changes in unrelated modules?

- How hard it is to understand. Even if it's non-obvious, can it be understood after a moment's pondering; or does it require working through a textbook and several research papers?

- Whether the insights it relies on are enlightening or incidental, i.e. the payoff gained from figuring it out. This is more important if it's harder to understand. Enlightening insights can change the way we understand the system/domain, which may have many benefits going forward. Incidental insights are one-off tricks that won't help us in the future.

- How difficult it would be to replace; or whether it's possible to replace at all.

This last point is what annoys me in naive "clever vs verbose" debates, and prompted this rant, since it's often assumed that the only difference is line count. To me, the best "clever" code isn't that which reduces its own line count; it's the code which removes problems entirely; i.e. where the alternative has caveats like "before calling, make sure to...", "you must manually free the resulting...", "watch out for race conditions with...", etc.

One example which comes to mind is some JavaScript I wrote to estimate prices based on user-provided sliders and tick-boxes, and some formulas and constants which sales could edit in our CMS (basically, I had to implement a spreadsheet engine).

Recalculating after user input was pretty gnarly, since formulas could depend on each other in arbitrary ways, resulting in infinite loops and undefined variables when I tried to do it in a "straightforward" way. The "clever" solution I came up with was to evaluate formulas and values lazily: wrapping everything in thunks and using a memo table to turn exponential calculations into linear ones. It was small, simple and heavily-commented; but the team's unfamiliarity with concepts like lazy evaluation and memoising made it hard to get through code review.
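For readers unfamiliar with the idea, here is a rough Python sketch of lazy, memoised formula evaluation. It is not the original JavaScript: the `make_evaluator` name, the formula names and the explicit cycle check are assumptions made for this illustration.

```python
def make_evaluator(formulas):
    """Return a `get(name)` function for a dict of formulas.

    `formulas` maps a name to a function (a thunk) that takes `get` and uses
    it to look up other names. Nothing is computed until it is asked for
    (lazy), and each value is computed at most once (memoised).
    """
    memo = {}
    in_progress = set()  # names currently being evaluated, to detect cycles

    def get(name):
        if name in memo:
            return memo[name]
        if name not in formulas:
            raise KeyError(f"undefined variable: {name}")
        if name in in_progress:
            raise ValueError(f"circular dependency involving: {name}")
        in_progress.add(name)
        try:
            memo[name] = formulas[name](get)
        finally:
            in_progress.discard(name)
        return memo[name]

    return get


# Hypothetical formulas; `calls` records how often "base" is really computed.
calls = []


def base(get):
    calls.append("base")
    return 100


formulas = {
    "base": base,
    "discount": lambda get: get("base") // 10,
    "price": lambda get: get("base") - get("discount"),
}
get = make_evaluator(formulas)
price = get("price")
```

Here "price" needs "base" twice (directly and via "discount"), but the memo table means `base` runs once. Without it, every shared dependency is recomputed along every path that reaches it, which is where the exponential blow-up comes from. A fresh evaluator would be built after each user input so stale values are not reused.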

Also, regarding "straightforward" or "verbose" code being "readable": it's certainly the case that any particular part of such code can be read and understood locally, but it can make the overall behaviour harder to understand. Just look at machine code: it's very verbose and straightforward: 'load address X into register A then add the value of register B', simple! Yet it's very hard to understand the "big picture" of what's going on. Making code more concise, either by simplifying it or at least abstracting away low-level, nitty-gritty details into well-named functions, can help with this.

When used well, "clever" code can reframe problems into a form which has very concise solutions; not because they've been code-golfed, but because there's so little left to say. This can mean the difference between a comprehensible system and a sprawling tangle of interfering patches. This may harm local reasoning in the short term, since it requires the reader to view things from that new perspective, when they may be expecting something else.

When used poorly, it results in things like nested ternaries, chasing conciseness without offering any deeper understanding of anything.

thetruthseeker1 8 years ago |

I don’t know how much time he spends code reviewing. But if at the end of the day he wants anybody to get the complete context of what the change entails by looking at the PR... I would think the code review process is more elaborate and time consuming than many companies can afford.

agentgt 8 years ago |

> clever code is often difficult to read and understand later

I have seen this many times and they are actually usually talented developers that are just not used to working in groups....

but what I have seen more often (back when I did code reviews)... is lazy copy and pasting or something analogous.

DarkVador 8 years ago |

I don't understand why you need to know the programmer behind the code. You need to be totally impartial when you judge something.

So I think it's the wrong way to review the code. You don't need the WHO but the WHY.

dingo_bat 8 years ago |

> I’d rather read ten lines of verbose-but-understandable code than someone’s ninja-tastic one-liner that involves four nested ternaries.

One of my pet peeves is the inability to solve a K-Map for 6 variables and do stuff based on the resulting boolean expression. Just because it is utterly unreadable. I've tried this with many reviewers but nope.

    def largest_square_plot(width, height):
        """Computes the largest square to tile a plot of the given width and height."""
        # because we want the grid of squares to fit exactly
        # the size of the squares needs to divide both width and height
        # to get the largest square, we use the greatest common divisor
        from math import gcd
        square_size = gcd(width, height)
        return (square_size, square_size)

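    // Newton's (Babylonian) method for the square root; assumes n > 0.
    // t is the convergence tolerance (left undefined in the original).
    private static double squareRootApproximation(double n, double t) {
        double r = n / 2;                      // initial guess
        while (Math.abs(r - (n / r)) > t) {    // stop once r and n/r agree within t
            r = 0.5 * (r + (n / r));
        }
        return r;
    }

    // Example call; the input 2.0 and tolerance 1e-9 are illustrative values.
    System.out.println("r = " + squareRootApproximation(2.0, 1e-9));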