Day 4 — Hyrum's Law
13 August 2020 · recurse-center
Today I worked on some open issues and pull requests for camelot and excalibur. Last month, pdfminer.six (one of camelot's dependencies) broke backwards compatibility by renaming the PDFTextExtractionNotAllowed exception. camelot raises it while getting the page layout if the page is not extractable. I'd added it back in 2016 after looking at the basic usage section of the pdfminer docs (now unmaintained). The usage of this exception has been abstracted away in the pdfminer.six docs (the fork that is now maintained by pdfminer contributors).
Soon after this rename was published on PyPI, someone raised an issue on the camelot issue tracker. Running import camelot started failing for a lot of users because camelot only sets a minimum version for each of its dependencies (including pdfminer.six) with >=, so fresh installs picked up the breaking release. I reported it on the pdfminer.six Gitter room, and a contributor who was facing the same issue made a fix which was merged and then released after 6 days. Meanwhile, I pinned the minimum version for pdfminer.six to >=20200726 on both PyPI and conda-forge.
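A downstream library can also soften a rename like this by looking up the name defensively instead of importing it directly. The following is only a minimal sketch of that idea, not what camelot does: first_available is a hypothetical helper, and it is demonstrated on the standard library's json module with a made-up name ("JSONDecodeErrorRenamed") standing in for a renamed exception.

```python
import importlib


def first_available(module_name, candidate_names):
    """Return the first attribute from candidate_names that the module defines.

    Hypothetical helper: candidates are tried in order, so list the
    preferred (usually newest) name first and the older ones after it.
    """
    module = importlib.import_module(module_name)
    for name in candidate_names:
        if hasattr(module, name):
            return getattr(module, name)
    raise ImportError(f"none of {candidate_names!r} found in {module_name!r}")


# "JSONDecodeErrorRenamed" does not exist; it stands in for a new name
# that only some releases of a dependency would provide.
DecodeError = first_available("json", ["JSONDecodeErrorRenamed", "JSONDecodeError"])
print(DecodeError.__name__)
```

The trade-off is that every name you shim this way becomes one more thing to clean up later, so it works best as a stopgap alongside a version constraint.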
This reminded me of an issue someone raised on the excalibur issue tracker in February, when Werkzeug broke backwards compatibility by changing how you import the secure_filename function.
Source: Twitter
Once you put out an interface, the more time it spends in the wild (accumulating users), the more difficult it becomes to change. This holds even when the change touches something that you now consider to be a private interface. Today I learned that this observation has a name: Hyrum's Law.
Source: xkcd
The sad fact of life is that no matter how careful you are, the more popular your library is the more likely it is that any change is going to break someone. — Versioning Software
You have a library with some incidental, undocumented and unspecified behavior that you consider to be obviously not part of the public interface. You change it to solve what seems like a bug to you, and make a patch release, only to find that you have angry hordes at the gate who, thanks to Hyrum's Law, depend on the old behavior. — Version numbers: how to use them?
How do you handle Hyrum's Law as creators and users?
As a creator, make sure you test all old interfaces so that your CI breaks when they change unexpectedly. When you want to remove an old interface, raise a deprecation warning whenever that interface is called, and give your users enough time to migrate before you remove it. scikit-learn does this by raising a deprecation warning and waiting for two minor versions before removing an old interface. Armin Ronacher said that he used to give users well above a year to migrate. These are good places to start when thinking about deprecation windows.
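As a rough illustration of the deprecation step, here is a minimal sketch that keeps an old name alive as a thin wrapper which warns on every call. All names here (deprecated_alias, extract_tables, read_tables, the "0.9" removal version) are made up for the example and are not taken from camelot, scikit-learn, or any other library.

```python
import functools
import warnings


def deprecated_alias(new_func, old_name, removed_in):
    """Build a wrapper that behaves like new_func but warns about old_name."""

    @functools.wraps(new_func)
    def wrapper(*args, **kwargs):
        warnings.warn(
            f"{old_name}() is deprecated and will be removed in {removed_in}; "
            f"use {new_func.__name__}() instead",
            DeprecationWarning,
            stacklevel=2,  # attribute the warning to the caller's line
        )
        return new_func(*args, **kwargs)

    wrapper.__name__ = old_name
    return wrapper


def extract_tables(path):
    """The new, preferred interface (hypothetical)."""
    return f"tables from {path}"


# The old name keeps working during the deprecation window.
read_tables = deprecated_alias(extract_tables, "read_tables", removed_in="0.9")
```

A test that calls read_tables and asserts both the return value and the warning covers the "test all old interfaces" part as well: CI fails if the alias disappears before the window is over.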
How do you find out if an old interface is used widely? Use GitHub global search.
Source: Twitter
As a user,
Rely on CI, potentially on a cron job, to detect when a project breaks for you instead of leaving it up to the project to try and make that call. — Why I don't like SemVer anymore
You can run your test suite as a cron job (on GitHub Actions!) to detect breakages. By default, pytest will display DeprecationWarning and PendingDeprecationWarning warnings from user code and third-party libraries, as recommended by PEP 565. There should be a way for pytest to instead raise an error whenever it catches a deprecation warning. I'll update it here when I find it.
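One candidate is Python's own warning filters, which pytest exposes through its -W command-line option and its filterwarnings ini setting (for example, -W error::DeprecationWarning). The sketch below shows the underlying mechanism with only the standard library; old_api and call_strictly are hypothetical names used for illustration.

```python
import warnings


def old_api():
    """A hypothetical function from a dependency that has been deprecated."""
    warnings.warn(
        "old_api() is deprecated; use new_api()", DeprecationWarning, stacklevel=2
    )
    return 42


def call_strictly(func):
    """Call func with deprecation warnings escalated to exceptions."""
    with warnings.catch_warnings():
        # The two categories are unrelated classes, so each needs its own filter.
        warnings.simplefilter("error", DeprecationWarning)
        warnings.simplefilter("error", PendingDeprecationWarning)
        return func()


try:
    call_strictly(old_api)
except DeprecationWarning as exc:
    caught = str(exc)
else:
    caught = None

print(caught)
```

Escalating warnings only inside the scheduled CI run, rather than for every local test run, is one way to get an early signal without blocking day-to-day work.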
Can we agree on calling this "VerOps"?