Hi. Gee whiz, do I like continuous integration (CI). Though, hmm, the wiki entry for it has a very different definition from what I mean. I mean CI as in automated testing of your code whenever you merge into a GitHub or GitLab repo.

Anyway, one of the coolest things I learned from working a little bit on open source was how to run a super-tight ship. Given that these big projects like pandas need to operate with a la carte contributions, a lot of the management of those contributions is automated: automatic testing, yes, but even fancier stuff, like automatic documentation, automatic coverage and super-opinionated formatting.

I just integrated a few of these tools into my own work, and I wanted to talk about them.

Automatic test coverage with coverage

Ned Batchelder has a great PyCon talk on testing in general: Getting started with testing (PyCon 2014). My main takeaway from this talk was: respect the dignity of tests 🙇. They do not have to suck. They can be as beautiful and programmy as the rest of your code.

But anyway, Ned also has a great package, coverage, which can integrate with tools like pytest (beloved pytest) to tell you which functions - nay, even lines - of your code are missed by tests.

I used coverage recently to auto-fail any merge request that lets test coverage slip, by putting this in my CI config file:

coverage run --source=my_cool_code/ -m pytest
coverage report --fail-under 95

This basically says: your code must have at least 95% test coverage to pass CI. I like this because I often add new features and then de-prioritize the tests for them - thinking, well, I'll get to those eventually. No longer!
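To make "which lines are missed by tests" a bit more concrete, here is a toy sketch of line coverage built on the standard library's sys.settrace. It is not how coverage itself is implemented, and classify, executed_lines, body_lines, percent_covered, and gate are all hypothetical names I made up for illustration. The idea is just: record which lines run, divide by the lines that could have run, and fail if the number is under the threshold.

```python
import dis
import sys


def classify(n):
    if n < 0:
        return "negative"
    return "non-negative"


def executed_lines(func, *args):
    """Return the line numbers inside `func` that run during one call."""
    code = func.__code__
    seen = set()

    def tracer(frame, event, arg):
        # Only follow the function under test; ignore every other frame.
        if frame.f_code is not code:
            return None
        if event == "line":
            seen.add(frame.f_lineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(previous)  # restore whatever tracer was active before
    return seen


def body_lines(func):
    """Return every line number in the function body that holds code."""
    code = func.__code__
    lines = {line for _, line in dis.findlinestarts(code) if line is not None}
    lines.discard(code.co_firstlineno)  # the `def` line is not a body line
    return lines


def percent_covered(func, calls):
    """Percentage of body lines hit by the given argument tuples."""
    hit = set()
    for args in calls:
        hit |= executed_lines(func, *args)
    total = body_lines(func)
    return 100.0 * len(hit & total) / len(total)


def gate(percent, fail_under=95):
    """Mimic a fail-under check: 0 means pass, non-zero fails the build.

    The exact exit code here is illustrative, not taken from coverage.
    """
    return 0 if percent >= fail_under else 1
```

If the only "test" calls classify(5), the return "negative" line never runs, so percent_covered(classify, [(5,)]) lands well under 95 and gate fails. Add a call with a negative number and every body line is hit.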

Automatic docstring coverage with interrogate

Lynn Root has a super cool beta project called interrogate that basically does the same thing as coverage, but for docstrings. Once again, you can easily integrate it into your CI with the following:

interrogate --quiet --fail-under 95 my_cool_code/

As above, I'm setting CI up to fail whenever docstring coverage falls under 95%. Yes, I'm strict! No more cryptic functions.
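For a feel of what "docstring coverage" means, here is a rough sketch using only the standard library's ast module. This is an assumption-laden toy, not interrogate's actual logic: docstring_coverage and SAMPLE are hypothetical names, and I'm simply counting the module plus every function and class, then checking which of them have a docstring.

```python
import ast

# A tiny made-up module: one documented function, one cryptic one.
SAMPLE = '''
"""Module docstring."""

def documented():
    """Say hi."""
    return "hi"

def cryptic(x):
    return x * 2
'''


def docstring_coverage(source: str) -> float:
    """Return the percentage of documentable nodes that have a docstring."""
    tree = ast.parse(source)
    definitions = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)
    # Count the module itself plus every function and class inside it.
    nodes = [tree] + [n for n in ast.walk(tree) if isinstance(n, definitions)]
    documented = sum(1 for n in nodes if ast.get_docstring(n) is not None)
    return 100.0 * documented / len(nodes)
```

On SAMPLE, the module and documented() have docstrings but cryptic() does not, so two out of three nodes are covered. That would fail a 95% threshold, which is exactly the kind of nudge I want.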

Automatic code formatting with black

black is a quite opinionated formatter that will, if run on your code, automatically re-format it according to both the PEP 8 standard and the opinions of its maintainers. It's like me when someone asks me to read their email for the content, not copyediting. I CANNOT NOT COPYEDIT, OKAY. That is how black is.

Anyway, given that it lives under the Python Software Foundation's GitHub organization - and is seemingly blessed by it - I feel comfortable setting it as a CI requirement as well: all MRs must have "blackened" Python code to be accepted. One format = better readability. Hooray.

This is the line in my CI config:

black --check my_cool_code/

With --check, black doesn't touch any files. It just exits with an error if any file would be reformatted, so CI bounces the MR until the code has been blackened.
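The "check, don't modify" pattern is simple enough to sketch. In the toy below, toy_format is a stand-in formatter that only strips trailing whitespace (it is NOT black, and both function names are hypothetical). The point is the shape of the check: format in memory, compare against the original, and return a non-zero code if anything would change.

```python
def toy_format(source: str) -> str:
    """Stand-in formatter (NOT black): strip trailing whitespace from each
    line and end the file with exactly one newline."""
    lines = [line.rstrip() for line in source.splitlines()]
    return "\n".join(lines).rstrip("\n") + "\n"


def check(sources: dict[str, str]) -> int:
    """Return 1 if any file would be reformatted, else 0. Nothing is written."""
    would_change = [
        name for name, text in sources.items() if toy_format(text) != text
    ]
    for name in would_change:
        print(f"would reformat {name}")
    return 1 if would_change else 0
```

So a file containing "y = 2" followed by stray spaces makes check return 1, while already-clean files return 0. CI only cares about that return code.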


🎉 Summary 🎉

All this to say: now MRs into my project will be more onerous, because my CI has become quite strict indeed. But the marginal added work when merging will, I hope, reap great rewards in the long run - clean, robust code that fails less often, and is easier to read and understand! 😭