Day 38 — What's inside a Python wheel?
04 October 2020 · recurse-center
Today I read PEP 427 (and the other PEPs mentioned in there) to learn more about the Python wheel format! There are two types of distribution formats in Python: a source distribution and a built distribution. The wheel format belongs to the second category, in which things are already built and ready to be installed on a user's system. The installer just needs to unpack these things and put them in site-packages! Yes, unpack. A wheel is just a ZIP archive with a special filename and a .whl extension!
A wheel filename follows the format {distribution}-{version}(-{build tag})?-{python tag}-{abi tag}-{platform tag}.whl, where:
- distribution - Distribution name. For example: django, pyramid.
- version - Distribution version. For example: 1.0.
- build tag - Optional build number. Must start with a digit. A tie breaker if two wheels have the same version.
- python tag - Python implementation and version required by a distribution. Major implementations have abbreviated codes. For example: py for Generic Python, cp for CPython, ip for IronPython, pp for PyPy, and jy for Jython.
- abi tag - Python ABI required by any included extension modules. For example: cp33d for the CPython 3.3 ABI with debugging, abi3 for the CPython stable ABI, or simply none if the distribution is pure-Python, or as a way of saying "don't know".
- platform tag - distutils.util.get_platform().replace('-', '_').replace('.', '_'). For example: win32, linux_i386, linux_x86_64, or any.
The last three components of the filename (before the extension) are called "compatibility tags", as defined in PEP 425.
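To make the filename format concrete, here is a minimal sketch that splits a wheel filename with a regular expression. The pattern, the group names, and the parse_wheel_filename helper are my own illustration and are not taken from PEP 427; it assumes none of the components contain a hyphen, which holds for the filenames shown here.

```python
import re

# Pattern mirroring the filename layout described above. The group names are
# illustrative choices, not part of any packaging API.
WHEEL_FILENAME = re.compile(
    r"^(?P<distribution>[^-]+)"
    r"-(?P<version>[^-]+)"
    r"(?:-(?P<build>\d[^-]*))?"  # optional build tag, must start with a digit
    r"-(?P<python>[^-]+)"
    r"-(?P<abi>[^-]+)"
    r"-(?P<platform>[^-]+)"
    r"\.whl$"
)


def parse_wheel_filename(filename):
    """Split a wheel filename into its components (hypothetical helper)."""
    match = WHEEL_FILENAME.match(filename)
    if match is None:
        raise ValueError(f"not a valid wheel filename: {filename!r}")
    return match.groupdict()


parts = parse_wheel_filename("conference_radar-0.8.0-py3-none-any.whl")
print(parts["distribution"], parts["version"], parts["build"])
print(parts["python"], parts["abi"], parts["platform"])
```

For real tooling, the packaging library's packaging.utils.parse_wheel_filename is probably a better choice than a hand-rolled pattern like this one.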
Let's look at the wheel for conrad!
$ unzip -l conference_radar-0.8.0-py3-none-any.whl
Archive: conference_radar-0.8.0-py3-none-any.whl
Length Date Time Name
--------- ---------- ----- ----
264 2020-08-01 11:46 conrad/__init__.py
...
273 2020-07-31 00:14 crawlers/__init__.py
...
92 2020-08-01 12:06 conference_radar-0.8.0.dist-info/WHEEL
5735 2020-08-01 12:06 conference_radar-0.8.0.dist-info/METADATA
2204 2020-08-01 12:06 conference_radar-0.8.0.dist-info/RECORD
11343 2020-08-01 12:06 conference_radar-0.8.0.dist-info/LICENSE
49 2020-08-01 12:06 conference_radar-0.8.0.dist-info/entry_points.txt
16 2020-08-01 12:06 conference_radar-0.8.0.dist-info/top_level.txt
--------- -------
60310 27 files
The root of the archive contains all files to be installed in the site-packages. The .dist-info directory includes WHEEL, METADATA, and RECORD at a minimum. Its structure is defined in PEP 376.
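Since a wheel is a ZIP archive, the standard library's zipfile module can look inside it too. The sketch below builds a tiny stand-in archive in memory (so it runs without the real conrad wheel) and picks out the .dist-info members; dist_info_files is a hypothetical helper name, and the stand-in file contents are placeholders.

```python
import io
import zipfile


def dist_info_files(wheel_file):
    """Return the names of the .dist-info members of a wheel archive."""
    with zipfile.ZipFile(wheel_file) as archive:
        return [
            name
            for name in archive.namelist()
            if name.split("/")[0].endswith(".dist-info")
        ]


# Stand-in archive with placeholder contents, not the real wheel.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("conrad/__init__.py", "")
    archive.writestr("conference_radar-0.8.0.dist-info/WHEEL", "Wheel-Version: 1.0\n")
    archive.writestr("conference_radar-0.8.0.dist-info/METADATA", "Metadata-Version: 2.1\n")
    archive.writestr("conference_radar-0.8.0.dist-info/RECORD", "")

buffer.seek(0)
for name in dist_info_files(buffer):
    print(name)
```

Passing a path such as "conference_radar-0.8.0-py3-none-any.whl" instead of the buffer should work the same way, assuming the file is in the current directory.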
- WHEEL is the wheel metadata specific to a build of the package.
- METADATA is package metadata version 1.1 or greater, as defined in PEPs 241, 314, 345, and 566.
- RECORD is a list of all the files in the wheel with their secure hashes, as defined in PEP 376. According to PEP 427 the hash algorithm must be sha256 or better; specifically, md5 and sha1 are not permitted, as signed wheel files rely on the strong hashes in RECORD to validate the integrity of the archive.
Installation
Installation for a wheel named conference_radar-0.8.0-py3-none-any.whl happens in two phases:
- Unpack.
  - Parse the wheel metadata in conference_radar-0.8.0.dist-info/WHEEL.
  - Check if the installer is compatible with Wheel-Version.
  - Unpack the archive into site-packages based on the Root-Is-Purelib key.
- Spread.
  - Put scripts in the bin directory (other data elsewhere), and update scripts starting with #!python (if applicable) to point to the correct interpreter.
  - Update the unpacked conference_radar-0.8.0.dist-info/RECORD with the paths of all installed files. This can be used to remove all the files when you uninstall a package!
  - Compile any installed .py to .pyc. (Wheels do not contain .pyc files because they are intended to work across multiple versions of Python!)
Also, wheels do not contain a setup.py or setup.cfg!
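The Unpack phase above can be sketched roughly as follows. This is not how pip or any real installer is written; plan_unpack and SUPPORTED are hypothetical names. It assumes the WHEEL file can be read with the standard library's email parser, and it follows my reading of PEP 427: abort if the major Wheel-Version is greater than what the installer supports, warn if only the minor version is greater.

```python
import sysconfig
import warnings
from email.parser import Parser

SUPPORTED = (1, 0)  # assumption: this sketch understands Wheel-Version 1.0


def plan_unpack(wheel_text):
    """Decide which site-packages the wheel root goes to (hypothetical helper)."""
    wheel = Parser().parsestr(wheel_text)

    major, minor = (int(part) for part in wheel["Wheel-Version"].split("."))
    if major > SUPPORTED[0]:
        raise RuntimeError(f"unsupported Wheel-Version: {major}.{minor}")
    elif (major, minor) > SUPPORTED:
        warnings.warn(f"Wheel-Version {major}.{minor} is newer than this installer")

    # Root-Is-Purelib picks between the pure-Python and platform-specific
    # site-packages directories reported by sysconfig.
    is_purelib = wheel["Root-Is-Purelib"].strip().lower() == "true"
    key = "purelib" if is_purelib else "platlib"
    return key, sysconfig.get_paths()[key]


WHEEL_TEXT = """\
Wheel-Version: 1.0
Generator: bdist_wheel (0.34.2)
Root-Is-Purelib: true
Tag: py3-none-any
"""

key, target = plan_unpack(WHEEL_TEXT)
print(key, target)
```

On many systems purelib and platlib point to the same directory, so the flag only changes the destination on platforms that keep them apart.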
WHEEL
The contents of conference_radar-0.8.0.dist-info/WHEEL look like this:
Wheel-Version: 1.0
Generator: bdist_wheel (0.34.2)
Root-Is-Purelib: true
Tag: py3-none-any
- Wheel-Version is the version number of the Wheel specification.
- Generator is the name and optionally the version of the software that produced the archive.
- Root-Is-Purelib is a flag to let the installer know "which site-packages" the files need to be installed in.
- Tag is the wheel's compatibility tags.
METADATA
The contents of conference_radar-0.8.0.dist-info/METADATA look like this:
Metadata-Version: 2.1
Name: conference-radar
Version: 0.8.0
Summary: Track conferences and meetups on your terminal.
Home-page: https://github.com/vinayak-mehta/conrad
Author: Vinayak Mehta
Author-email: vmehta94@gmail.com
License: Apache 2.0
Platform: UNKNOWN
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Description-Content-Type: text/markdown
Requires-Dist: Click (>=7.0)
...
Provides-Extra: dev
Requires-Dist: Sphinx (>=2.2.1) ; extra == 'dev'
...
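METADATA is a series of Key: value lines in the style of email headers, so one way to read it is the standard library's email parser. The sketch below uses a shortened copy of the file above (the elided lines are simply left out) and is only an illustration, not the way any particular tool does it.

```python
from email.parser import Parser

# Shortened copy of the METADATA shown above.
METADATA_TEXT = """\
Metadata-Version: 2.1
Name: conference-radar
Version: 0.8.0
Requires-Dist: Click (>=7.0)
Provides-Extra: dev
Requires-Dist: Sphinx (>=2.2.1) ; extra == 'dev'
"""

metadata = Parser().parsestr(METADATA_TEXT)

print(metadata["Name"], metadata["Version"])

# Fields such as Requires-Dist can repeat, so get_all() collects every value
# instead of only the first one.
requirements = metadata.get_all("Requires-Dist")
for requirement in requirements:
    print(requirement)
```

For an installed package, importlib.metadata.metadata("conference-radar") in Python 3.8+ should give a similar message-like object without parsing the file by hand.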
Package metadata has evolved a lot over the years. You can see the evolution if you go through the following PEPs:
- PEP 241 - Metadata-Version: 1.0
- PEP 314 - Metadata-Version: 1.1
- PEP 345 - Metadata-Version: 1.2
- PEP 566 - Metadata-Version: 2.1
Each PEP shows a summary of differences from the last one. Between PEPs 345 and 566, there was also PEP 426 which defined the Metadata-Version: 2.0.
It was deferred from December 2013 through to March 2017, until it was withdrawn in February 2018 in favour of PEP 566. During those four years, distutils-sig worked through a number of major changes (including the wheel format) which provided additional perspective on which metadata format changes were really needed and which changes could be omitted.
More recently, PEP 639 has been proposed to add support for SPDX license identifiers to the metadata! At the time of writing, its draft defines Metadata-Version: 2.2.
RECORD
The contents of conference_radar-0.8.0.dist-info/RECORD look like this:
conrad/__init__.py,sha256=9CLWqIDZ3zdQaWPSE8_MkeYDrkBiMiTRtwvjgH6PMlg,264
...
crawlers/__init__.py,sha256=kjZ6-dgMlTllgh-dgF9oNaSj16zqrkHepkLbRWQnxL4,273
...
conference_radar-0.8.0.dist-info/WHEEL,sha256=g4nMs7d-Xl9-xC9XovUrsDHGXt-FT0E17Yqo92DEfvY,92
conference_radar-0.8.0.dist-info/METADATA,sha256=U3nNq16oYEGW4QXWOEqieH8sLGUHyOi_1wtH36LXHZQ,5735
conference_radar-0.8.0.dist-info/LICENSE,sha256=Bz1pUCrLkNY-AJBPeFhvBV-nTInGGHl5Omfru8Sfs1M,11343
conference_radar-0.8.0.dist-info/entry_points.txt,sha256=ksgimi9VMCvimad9n9R7Yd0uSRX1jcd5kIGxylMd7v4,49
conference_radar-0.8.0.dist-info/top_level.txt,sha256=fKXJ9FYqwcB8otyOU2HQErWf3TcLK3Dqc6rH1qK-Lp0,16
conference_radar-0.8.0.dist-info/RECORD,,
The RECORD file holds a list of all the installed files, which allows the implementation of an uninstall command! It is a CSV, composed of records, one line per installed file. The csv module is used to read the file, with these options:
- field delimiter: ,
- quoting char: "
- line terminator: os.linesep (so \r\n or \n)
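Here is a small sketch of reading RECORD with the csv module and the options listed above. It uses two lines from the file shown earlier as a stand-in for the real thing. One caveat: csv.reader ignores the lineterminator option and recognises both \r and \n on its own, so only the delimiter and quote character are passed here.

```python
import csv
import io

# Two lines copied from the RECORD shown above.
RECORD_TEXT = (
    "conrad/__init__.py,sha256=9CLWqIDZ3zdQaWPSE8_MkeYDrkBiMiTRtwvjgH6PMlg,264\n"
    "conference_radar-0.8.0.dist-info/RECORD,,\n"
)

# newline="" leaves line endings untouched so the csv module can handle them.
reader = csv.reader(io.StringIO(RECORD_TEXT, newline=""), delimiter=",", quotechar='"')
rows = list(reader)

for path, file_hash, size in rows:
    print(path, file_hash or "(no hash)", size or "(no size)")
```

The entry for RECORD itself has an empty hash and size, since the file cannot contain its own hash.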
Each record is composed of three elements:
- the file's path
- a hash of the file's contents. The hash is either the empty string or the hash algorithm as named in hashlib.algorithms_guaranteed, followed by the equals character =, followed by the urlsafe-base64-nopad encoding of the digest (base64.urlsafe_b64encode(digest) with trailing = removed).
- the file's size in bytes
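The hash field described above can be reproduced in a few lines. record_hash and record_entry are hypothetical helper names, and the sample bytes are made up for the example; the encoding steps follow the description of the three elements.

```python
import base64
import hashlib


def record_hash(data, algorithm="sha256"):
    """Format a digest the way RECORD does: <algorithm>=<urlsafe-base64-nopad>."""
    digest = hashlib.new(algorithm, data).digest()
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"{algorithm}={encoded}"


def record_entry(path, data):
    """Build the three elements of one record: path, hash, size in bytes."""
    return (path, record_hash(data), str(len(data)))


# Made-up file contents, only here to exercise the helpers.
entry = record_entry("conrad/__init__.py", b'__version__ = "0.8.0"\n')
print(",".join(entry))
```

A verifier could recompute record_hash for every file in the archive and compare the result against the second column of RECORD.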
There's also PEP 491, which defines version 1.9 of the wheel format. It is deferred for now, as Python packaging improvements are focusing on the package build process rather than on expanding the wheel format to cover additional use cases.