Day 38 — What's inside a Python wheel?
04 October 2020 · recurse-center
Today I read PEP 427 (and the other PEPs it mentions) to learn more about the Python wheel format! Python has two types of distribution formats: source distributions and built distributions. The wheel format belongs to the second category, in which everything is already built and ready to be installed on a user's system. The installer just needs to unpack the files and put them in site-packages! Yes, unpack. A wheel is just a ZIP archive with a special filename and a .whl extension!
A wheel filename follows the format {distribution}-{version}(-{build tag})?-{python tag}-{abi tag}-{platform tag}.whl, where:
- distribution: Distribution name. For example: django, pyramid.
- version: Distribution version. For example: 1.0.
- build tag: Optional build number. Must start with a digit. Acts as a tie breaker if two wheels have the same version.
- python tag: Python implementation and version required by a distribution. Major implementations have abbreviated codes. For example: py for generic Python, cp for CPython, ip for IronPython, pp for PyPy, and jy for Jython.
- abi tag: Python ABI required by any included extension modules. For example: cp33d for the CPython 3.3 ABI with debugging, abi3 for the CPython stable ABI, or simply none if the distribution is pure-Python, or as a way of saying "don't know".
- platform tag: distutils.util.get_platform().replace('-', '_').replace('.', '_'). For example: win32, linux_i386, linux_x86_64, or any.
The last three components of the filename (before the extension) are called "compatibility tags", as defined in PEP 425.
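To make the filename format concrete, here's a minimal sketch that splits a wheel filename into its components with a regular expression. The pattern and the parse_wheel_filename helper are my own illustration, not something PEP 427 ships, and they assume that no component contains a hyphen.

```python
import re

# {distribution}-{version}(-{build tag})?-{python tag}-{abi tag}-{platform tag}.whl
# Assumption: components never contain "-", and the build tag starts with a digit.
WHEEL_FILENAME = re.compile(
    r"^(?P<distribution>[^-]+)-(?P<version>[^-]+)"
    r"(-(?P<build>\d[^-]*))?"
    r"-(?P<python>[^-]+)-(?P<abi>[^-]+)-(?P<platform>[^-]+)\.whl$"
)


def parse_wheel_filename(filename):
    """Return the filename components as a dict (hypothetical helper)."""
    match = WHEEL_FILENAME.match(filename)
    if match is None:
        raise ValueError(f"not a valid wheel filename: {filename!r}")
    return match.groupdict()


parts = parse_wheel_filename("conference_radar-0.8.0-py3-none-any.whl")
print(parts["python"], parts["abi"], parts["platform"])  # the compatibility tags
```

Because the build tag is optional, it comes back as None for this wheel.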
Let's look at the wheel for conrad!
$ unzip -l conference_radar-0.8.0-py3-none-any.whl
Archive: conference_radar-0.8.0-py3-none-any.whl
Length Date Time Name
--------- ---------- ----- ----
264 2020-08-01 11:46 conrad/__init__.py
...
273 2020-07-31 00:14 crawlers/__init__.py
...
92 2020-08-01 12:06 conference_radar-0.8.0.dist-info/WHEEL
5735 2020-08-01 12:06 conference_radar-0.8.0.dist-info/METADATA
2204 2020-08-01 12:06 conference_radar-0.8.0.dist-info/RECORD
11343 2020-08-01 12:06 conference_radar-0.8.0.dist-info/LICENSE
49 2020-08-01 12:06 conference_radar-0.8.0.dist-info/entry_points.txt
16 2020-08-01 12:06 conference_radar-0.8.0.dist-info/top_level.txt
--------- -------
60310 27 files
The root of the archive contains all files to be installed in site-packages. The .dist-info directory includes WHEEL, METADATA, and RECORD at a minimum. Its structure is defined in PEP 376.
- WHEEL is the wheel metadata specific to a build of the package.
- METADATA is package metadata version 1.1 or greater, as defined in PEPs 241, 314, 345, and 566.
- RECORD is a list of all the files in the wheel with their secure hashes, as defined in PEP 376. According to PEP 427, the hash algorithm must be sha256 or better; specifically, md5 and sha1 are not permitted, as signed wheel files rely on the strong hashes in RECORD to validate the integrity of the archive.
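Since a wheel is a ZIP archive, the standard library's zipfile module can check for those three files. The sketch below builds a tiny stand-in wheel in memory so that it runs without downloading anything. The demo package and the missing_dist_info_files helper are made up for illustration.

```python
import io
import zipfile

REQUIRED = ("WHEEL", "METADATA", "RECORD")


def missing_dist_info_files(wheel_file, dist_info_dir):
    """Return the required .dist-info files that are absent from the archive."""
    with zipfile.ZipFile(wheel_file) as archive:
        names = set(archive.namelist())
    return [name for name in REQUIRED if f"{dist_info_dir}/{name}" not in names]


# Build a stand-in wheel in memory (hypothetical "demo" package, RECORD left out).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("demo/__init__.py", "")
    archive.writestr("demo-1.0.dist-info/WHEEL", "Wheel-Version: 1.0\n")
    archive.writestr("demo-1.0.dist-info/METADATA", "Metadata-Version: 2.1\n")

buffer.seek(0)
missing = missing_dist_info_files(buffer, "demo-1.0.dist-info")
print(missing)
```

Passing the path of a real .whl file instead of the in-memory buffer should work the same way.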
Installation
Installation for a wheel named conference_radar-0.8.0-py3-none-any.whl happens in two phases:
- Unpack.
  - Parse the wheel metadata in conference_radar-0.8.0.dist-info/WHEEL.
  - Check if the installer is compatible with Wheel-Version.
  - Unpack the archive into site-packages based on the Root-Is-Purelib key.
- Spread.
  - Put scripts in the bin directory (other data elsewhere), and rewrite any script that starts with #!python (if applicable) to point to the correct interpreter.
  - Update the unpacked conference_radar-0.8.0.dist-info/RECORD with the paths of all installed files. This can be used to remove all the files when you uninstall a package!
  - Compile any installed .py to .pyc. (Wheels do not contain .pyc files because they are intended to work across multiple versions of Python!)
Also, wheels do not contain a setup.py or setup.cfg!
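The two decisions in the unpack phase can be sketched in a few lines. PEP 427 says an installer should warn if Wheel-Version is greater than the version it supports and must fail if the major version is greater, and that the archive goes into purelib when Root-Is-Purelib is true and platlib otherwise. The function names and the SUPPORTED_WHEEL_VERSION constant below are my own assumptions, not part of any real installer.

```python
import sysconfig

# Assumption: this toy installer implements version 1.0 of the wheel spec.
SUPPORTED_WHEEL_VERSION = (1, 0)


def check_wheel_version(wheel_version):
    """Return "ok" or "warn"; raise if the major version is too new."""
    major, minor = (int(part) for part in wheel_version.split("."))
    if major > SUPPORTED_WHEEL_VERSION[0]:
        raise RuntimeError(f"unsupported Wheel-Version: {wheel_version}")
    if (major, minor) > SUPPORTED_WHEEL_VERSION:
        return "warn"
    return "ok"


def target_directory(root_is_purelib):
    """Pick the site-packages directory from the Root-Is-Purelib value."""
    key = "purelib" if root_is_purelib.strip().lower() == "true" else "platlib"
    return sysconfig.get_paths()[key]


print(check_wheel_version("1.0"), target_directory("true"))
```

On many systems purelib and platlib point to the same directory, so the distinction only matters where the two are kept apart.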
WHEEL
The contents of conference_radar-0.8.0.dist-info/WHEEL look like this:
Wheel-Version: 1.0
Generator: bdist_wheel (0.34.2)
Root-Is-Purelib: true
Tag: py3-none-any
- Wheel-Version is the version number of the Wheel specification.
- Generator is the name and optionally the version of the software that produced the archive.
- Root-Is-Purelib is a flag that tells the installer which site-packages directory the files need to be installed in: purelib when it is true, platlib otherwise.
- Tag is the wheel's compatibility tags.
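The WHEEL file uses a simple key: value layout, so one way to read it is with the email parser from the standard library. This is just a sketch using the file contents shown above; a real installer may parse it differently.

```python
from email.parser import Parser

WHEEL_TEXT = """\
Wheel-Version: 1.0
Generator: bdist_wheel (0.34.2)
Root-Is-Purelib: true
Tag: py3-none-any
"""

message = Parser().parsestr(WHEEL_TEXT)
wheel_version = message["Wheel-Version"]
root_is_purelib = message["Root-Is-Purelib"] == "true"
tags = message.get_all("Tag")  # Tag can appear more than once, so collect a list

print(wheel_version, root_is_purelib, tags)
```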
METADATA
The contents of conference_radar-0.8.0.dist-info/METADATA look like this:
Metadata-Version: 2.1
Name: conference-radar
Version: 0.8.0
Summary: Track conferences and meetups on your terminal.
Home-page: https://github.com/vinayak-mehta/conrad
Author: Vinayak Mehta
Author-email: vmehta94@gmail.com
License: Apache 2.0
Platform: UNKNOWN
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Description-Content-Type: text/markdown
Requires-Dist: Click (>=7.0)
...
Provides-Extra: dev
Requires-Dist: Sphinx (>=2.2.1) ; extra == 'dev'
...
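METADATA is a set of RFC 822 style headers, so repeated fields such as Requires-Dist can be collected with the standard library's email parser. The sketch below uses only a few of the lines shown above, and the split on ";" to find requirements that belong to an extra is a naive assumption for illustration; a real tool would parse the environment markers properly.

```python
from email.parser import Parser

METADATA_TEXT = """\
Metadata-Version: 2.1
Name: conference-radar
Version: 0.8.0
Requires-Dist: Click (>=7.0)
Provides-Extra: dev
Requires-Dist: Sphinx (>=2.2.1) ; extra == 'dev'
"""

metadata = Parser().parsestr(METADATA_TEXT)
name = metadata["Name"]
requirements = metadata.get_all("Requires-Dist")

# Naive split: anything with a ";" marker is treated as conditional.
base_requirements = [req for req in requirements if ";" not in req]

print(name, base_requirements)
```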
Package metadata has evolved a lot over the years. You can see the evolution if you go through the following PEPs:
- PEP 241 - Metadata-Version: 1.0
- PEP 314 - Metadata-Version: 1.1
- PEP 345 - Metadata-Version: 1.2
- PEP 566 - Metadata-Version: 2.1
Each PEP shows a summary of differences from the last one. Between PEPs 345 and 566, there was also PEP 426, which defined Metadata-Version: 2.0.
PEP 426 was deferred from December 2013 through March 2017, and was then withdrawn in February 2018 in favour of PEP 566. During those years, distutils-sig worked through a number of major changes (including the wheel format), which provided additional perspective on which metadata format changes were really needed and which could be omitted.
More recently, PEP 639 has been proposed to add support for SPDX license identifiers to the metadata! At the time of writing, its draft defines Metadata-Version: 2.2.
RECORD
The contents of conference_radar-0.8.0.dist-info/RECORD look like this:
conrad/__init__.py,sha256=9CLWqIDZ3zdQaWPSE8_MkeYDrkBiMiTRtwvjgH6PMlg,264
...
crawlers/__init__.py,sha256=kjZ6-dgMlTllgh-dgF9oNaSj16zqrkHepkLbRWQnxL4,273
...
conference_radar-0.8.0.dist-info/WHEEL,sha256=g4nMs7d-Xl9-xC9XovUrsDHGXt-FT0E17Yqo92DEfvY,92
conference_radar-0.8.0.dist-info/METADATA,sha256=U3nNq16oYEGW4QXWOEqieH8sLGUHyOi_1wtH36LXHZQ,5735
conference_radar-0.8.0.dist-info/LICENSE,sha256=Bz1pUCrLkNY-AJBPeFhvBV-nTInGGHl5Omfru8Sfs1M,11343
conference_radar-0.8.0.dist-info/entry_points.txt,sha256=ksgimi9VMCvimad9n9R7Yd0uSRX1jcd5kIGxylMd7v4,49
conference_radar-0.8.0.dist-info/top_level.txt,sha256=fKXJ9FYqwcB8otyOU2HQErWf3TcLK3Dqc6rH1qK-Lp0,16
conference_radar-0.8.0.dist-info/RECORD,,
The RECORD file holds a list of all the installed files, which allows the implementation of an uninstall command! It is a CSV file with one record (line) per installed file. The csv module is used to read the file, with these options:
- field delimiter: ,
- quoting char: "
- line terminator: os.linesep (so \r\n or \n)
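Here's a small sketch of reading RECORD with those options, using two of the lines shown above. It reads from an in-memory string so that it runs on its own; with a real file you would open RECORD with newline="" and hand the file object to csv.reader.

```python
import csv
import io

RECORD_TEXT = (
    "conrad/__init__.py,sha256=9CLWqIDZ3zdQaWPSE8_MkeYDrkBiMiTRtwvjgH6PMlg,264\n"
    "conference_radar-0.8.0.dist-info/RECORD,,\n"
)

reader = csv.reader(io.StringIO(RECORD_TEXT), delimiter=",", quotechar='"')
rows = list(reader)

for path, file_hash, size in rows:
    print(path, file_hash or "(no hash)", size or "(no size)")
```

The entry for RECORD itself has an empty hash and size, since the file cannot contain its own hash.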
Each record is composed of three elements:
- the file's path
- a hash of the file's contents. The hash is either the empty string or the hash algorithm as named in hashlib.algorithms_guaranteed, followed by the equals character =, followed by the urlsafe-base64-nopad encoding of the digest (base64.urlsafe_b64encode(digest) with trailing = removed).
- the file's size in bytes
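Putting the three elements together, a RECORD line for a file can be computed like this. It's a minimal sketch that assumes sha256 and a path with no commas or quotes (otherwise the line would need CSV quoting); record_entry and the demo path are names I made up.

```python
import base64
import hashlib


def record_entry(path, data):
    """Build one RECORD line: path, algorithm=urlsafe-base64-nopad digest, size."""
    digest = hashlib.sha256(data).digest()
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"{path},sha256={encoded},{len(data)}"


# An empty __init__.py in a hypothetical "demo" package.
entry = record_entry("demo/__init__.py", b"")
print(entry)
```

Running the same function over a file you unpacked from a wheel and comparing the result with the matching line in RECORD is a quick way to check the archive's integrity.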
There's also PEP 491, which defines version 1.9 of the wheel format, but it is currently deferred, as Python packaging improvements are focusing on the package build process rather than expanding the wheel format to cover additional use cases.