In defense of doctests

Many of the open source projects I maintain make heavy use of doctests. Doctests afford a luxury and elegance not offered by unit tests, namely:

  • canonical and representative examples are described alongside publishable guidance, reducing duplication of work and opportunities for inconsistency.
  • the documented behavior is guaranteed to be correct (assuming tests are run).
  • the syntax is necessarily standardized.
  • the instructions can be readily parsed by machine.
  • the examples can be mechanically re-written for consistent style.
  • behavior and tests can often be consolidated in a single code block for simple patches and easy review.
  • doctests encourage stateless functions with simple inputs and outputs.
  • doctests contribute to code coverage and often are sufficient to capture the full range of supported use-cases.
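To make the points above concrete, here is a minimal sketch of a function whose docstring serves as both its documentation and its tests. The function `snake_to_camel` is a hypothetical example invented for illustration; it does not come from any of the projects mentioned here.

```python
def snake_to_camel(name):
    """
    Convert a snake_case identifier to camelCase.

    The examples below are both the documentation and the tests.

    >>> snake_to_camel('hello_world')
    'helloWorld'
    >>> snake_to_camel('single')
    'single'
    >>> snake_to_camel('')
    ''
    """
    # Keep the first word as-is and capitalize each subsequent word.
    first, *rest = name.split('_')
    return first + ''.join(word.capitalize() for word in rest)
```

Assuming the function is saved in a module, the examples can be checked with the standard library (`python -m doctest -v yourmodule.py`) or with pytest (`pytest --doctest-modules`). The behavior and its tests fit in one small block, which is what keeps a patch like this easy to review.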

Therefore, lines in docstrings that begin with >>> are executed, and the test fails if the output does not match what’s written. I acknowledge that this form of test is often discouraged and is suboptimal in many cases (see Soapbox). In particular, if a doctest demands a large share of the reader’s attention merely to facilitate testing, it’s probably better migrated to a unit test. With judicious use of fixtures and aggressive migration to unit tests when appropriate, doctests can be quite valuable and intuitive. That’s why all skeleton-based projects run doctests as a matter of course.
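The pass/fail behavior can be sketched with the standard library's `doctest` module alone. The names `double`, `STALE_DOC`, and `check` are hypothetical, chosen for illustration; this is not how any particular project wires up its test run, only a demonstration that an example whose written output has drifted from the real behavior is reported as a failure.

```python
import doctest


def double(x):
    """
    Return ``x`` repeated or multiplied by two.

    >>> double(2)
    4
    >>> double('ab')
    'abab'
    """
    return x * 2


# Documentation that has drifted from the real behavior of double().
STALE_DOC = """
>>> double(2)
5
"""


def check(text, globs):
    """Run the >>> examples found in text; return (failed, attempted)."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(text, dict(globs), 'example', None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    report = []  # collect failure messages instead of printing them
    results = runner.run(test, out=report.append)
    return results.failed, results.attempted
```

Here `check(double.__doc__, {'double': double})` attempts two examples with no failures, while `check(STALE_DOC, {'double': double})` reports one failure out of one attempt, because the written output `5` does not match the actual result.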

Where these doctests exist in upstream code, it’s preferable to exercise those tests as the upstream project does instead of maintaining a port of those doctests to unit tests.

Some projects (e.g. jaraco.text, jaraco.context) rely entirely on doctests. More sophisticated projects strike a balance between doctests and unit tests (jaraco.functools, pip-run, irc), and others have almost no doctests (cherrypy).

Written on January 24, 2023