Improving Code Review at Datto: A Look at Our Nitpicks

Nitpicks, or nits, in code review are ubiquitous and unavoidable. But what if we could have code reviews with fewer nits, and ensure that the nits that did exist were truly valuable? A few months ago, I presented some data internally on Datto’s code review nitpicks. The discussion that followed was lively and is shaping how we at Datto review code. I thought I’d take the opportunity to share the nitpick findings here in a shameless plug to entice readers of today to become the Datto engineers of tomorrow. It is a joy to work with talented engineers who are willing to engage and grow; come join our journey.


Improving Code Reviews

The reason for looking into nits stems from a desire to improve the code review process on my team. The ideal code review process goes something like this:

  1. Engineers work to deliver business value and commit the code that flawlessly delivers that value the very first time.
  2. When colleagues go to review the code, they are edified in its reading and quickly approve the merge request.
  3. There is no need for feedback because the code was perfect.
  4. The business then benefits from the code expeditiously.

That’s the ideal, but as great as our Datto engineers are, we are also human. Our code falls prey to the common failings of most engineers: it is initially imperfect. The reviews and comments of our fellow engineers aim to perfect it. It’s within this review process that the nits appear.

Reviewers have a lot to say during code review, so why am I focusing on nits? The word nit carries a lot of connotations in software review, and it is one of the few labels that reviewers proactively apply to their own comments. My working definition of a nit is this: a comment labeled nit is less substantive than one without the label. When a reviewer applies the label, they intend for the engineer to give the comment less weight than if the label were missing.


12 Buckets of Nits

Gathering the reviews was fairly trivial with GitLab’s API. Loop through each of our projects, search for “nit” with a scope of notes, and you’ve got some nits! The results were on the order of a few thousand. And as committed as I am to this analysis, I didn’t particularly want to slog through reading thousands of nits. Initially I tried to apply some text analysis tools (the most fruitful being the YAKE algorithm, which helped with some initial bucket identification), but that only covered a fraction of the nits. The solution was sampling. Using a 95% confidence level with a 4% margin of error brought me down to just several hundred nits to read through: something I found much more manageable.
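The collection and sampling steps above can be sketched roughly like this. The GitLab host and project IDs are placeholders, not Datto’s real values, and the sample-size formula is the standard one for a proportion (conservative p = 0.5) with a finite-population correction:

```python
import math
from urllib.parse import urlencode

# Hypothetical GitLab host and project IDs -- placeholders only.
GITLAB_HOST = "https://gitlab.example.com"
PROJECT_IDS = [101, 102, 103]


def nit_search_url(project_id: int, term: str = "nit") -> str:
    """Build the GitLab search-API URL that finds review notes mentioning `term`."""
    query = urlencode({"scope": "notes", "search": term})
    return f"{GITLAB_HOST}/api/v4/projects/{project_id}/search?{query}"


def sample_size(population: int, margin: float = 0.04, z: float = 1.96) -> int:
    """Sample size at ~95% confidence (z = 1.96) and the given margin of error,
    using p = 0.5 (most conservative) and a finite-population correction."""
    n0 = (z ** 2) * 0.25 / margin ** 2      # infinite-population sample size
    n = n0 / (1 + (n0 - 1) / population)    # finite-population correction
    return math.ceil(n)


# A few thousand nits shrink to several hundred to read by hand.
print(sample_size(3000))  # prints 501
```

With a population of a few thousand notes, the correction matters: the required sample drops from about 600 to roughly 500, which matches the “several hundred” read through by hand.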

The nits I reviewed fell into 12 different buckets:

  1. New Line
  2. Spacing
  3. Quotation
  4. Case
  5. General Formatting
  6. Spelling
  7. Descriptions
  8. Duplication
  9. Renaming
  10. Project Consistency
  11. Self Effacing
  12. Voice of the Engineer

The first five buckets constituted about 24% of all nits. These are the types of feedback that we can, and should, automate through our continuous integration pipelines with linters and formatters. This kind of trivial feedback can be given by robots rather than humans. Robot feedback has the added benefit of arriving much more quickly, acting as a quality gate.
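As a concrete illustration, a quality gate for those first five buckets could be a single CI job along these lines. The stage name, image, and tool choices here are assumptions for the sketch, not our actual pipeline:

```yaml
# .gitlab-ci.yml -- a hypothetical lint/format gate
lint:
  stage: test
  image: python:3.11
  script:
    - pip install black flake8
    - black --check .   # fails the job if formatting is off
    - flake8 .          # catches spacing, newline, and style nits
```

Because the job fails the pipeline before any human reviews the merge request, the formatting conversation never has to happen in comments at all.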


Buckets six through ten should, I believe, stay; they cannot be automated away. Spelling seems deceptively like something a spellcheck robot could manage, but our code often has perfectly valid variable names that most spellcheckers would struggle with. (I’m looking at you, BypassedVcenterException.)

The last two buckets were, in my mind, the most interesting. Self Effacing reviews were comments that had the “nit” label on them but were, in fact, quite substantive and beneficial. Substantive feedback should remain, even if the reviewer is applying the nit label to make the feedback more palatable to the engineer.

The final bucket, Voice of the Engineer, contained nits that boil down to “this code is technically fine, but the reviewer would have chosen to write it differently.” Given the wide range of ways we can solve problems with code, this is no surprise. Engineers often prioritize differently when it comes to CPU efficiency, memory efficiency, network efficiency, ease of reading, fewer lines, project consistency, and the ever-moving target of whatever “best practices” means today. Roughly 20% of our nits fell into this category.


2 Ways to Improve Code Review

With a basic understanding of our nits, how can we now improve our code review process?

  1. Strive to be intentional, focused, and sparing in our reviews. This step isn’t technological in nature, but human. Each code review triggers emails and notifications. Reviewers should help the engineer deliver the best code as quickly as possible. Sometimes this looks like not leaving a review at all.
  2. Automate the minutiae. The first five buckets (New Line, Spacing, Quotation, Case, General Formatting) can and should be handled by linters and quality bots. If a newline at the end of a file matters, we should do the work to automate that check. Our teams are now making changes to our continuous integration flow so that today’s nits become automated checks. Engineers can take this automation one step further by using tooling such as the pre-commit library to apply linting and formatting fixes before commits are ever pushed.
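For example, a minimal .pre-commit-config.yaml along these lines would fix trailing whitespace and missing end-of-file newlines before a commit ever reaches review (the specific hooks and revision are illustrative):

```yaml
# .pre-commit-config.yaml -- illustrative hooks only
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.5.0
    hooks:
      - id: end-of-file-fixer      # the "New Line" bucket
      - id: trailing-whitespace    # the "Spacing" bucket
      - id: mixed-line-ending
```

After running pre-commit install once in a clone, these hooks run on every git commit and fix the offending files in place, so the nit never makes it into the merge request.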

While the research and analysis were a fun experience, the collaborative discussion among the 80 or so engineers in attendance afterwards was the real gem in this project. The group shared past experiences with nits, tooling, and visions for the future of code review and delivery. We were, unsurprisingly, split on whether tooling should automatically change code, whether the slowdown in delivery is worth the edification engineers gain from suggested alternatives, and whether tooling should fix code outright or merely inform the engineer of its inadequacies.

I have no doubt that we will continue to iterate on our code review and delivery processes. In my one-year tenure at Datto, the process has already improved: we have added CI linters, and developers tell me they have withheld unnecessary nits as a result of our discussion. I am excited to see what my fellow engineers come up with next.


About the Author

Matt Crowson
