Day 45 — I have Linux and macOS wheels!
13 October 2020 · recurse-center
Today I continued my quest of packaging my Python C extension for multiple OSes. Yesterday, while doing a packaging "test run" with the curses "hello world" program, I found that curses is not supported on Windows (it should work with WSL, but not with the "default" Windows terminal I guess).
So the first thing I wanted to ensure today was that poppler and some of its dependencies that I need (freetype, fontconfig, libpng, and libjpeg) are supported on Windows. Ilia had suggested that I could use cygwin to do Linux-y things on Windows, so I borrowed my mom's laptop, which runs Windows 7, and installed poppler and its dependencies, along with git, gcc/g++, make, and cmake.
I was able to clone and build poppler from source! It still required the jpeg library to generate the Makefiles, even though I set ENABLE_LIBOPENJPEG=none. I'll have to debug that at some point so that I can remove this poppler dependency.
> cmake -D ENABLE_QT5=OFF -D ENABLE_LIBOPENJPEG=none -D ENABLE_CPP=OFF ..
> make poppler
After the test, I wanted to transfer some files from the Windows laptop to my laptop using python -m http.server, but didn't know what the Windows equivalent of ifconfig was (to get the Windows laptop's IP address that I could access on my laptop), and also didn't want to spend time finding and installing ifconfig on cygwin. I found this nice alternative!
>>> import socket
>>> s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
>>> s.connect(("8.8.8.8", 80))
>>> print(s.getsockname()[0])
192.168.1.4
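Why this works, as far as I understand it: calling connect() on a UDP socket doesn't send any packets. It only asks the OS to pick a route (and therefore a local interface) for that destination, and getsockname() then reports that interface's address. Below is a minimal sketch that wraps the trick in a function. The name local_ip and the loopback fallback are additions for illustration and are not part of the snippet above.

```python
import socket


def local_ip(probe_host="8.8.8.8", probe_port=80):
    """Return the address of the interface the OS would use to reach probe_host.

    Hypothetical helper: the fallback to 127.0.0.1 is an assumption about
    what is useful when no route exists (for example, with no network).
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        try:
            # For UDP, connect() only records the destination and selects
            # a route; nothing is transmitted.
            s.connect((probe_host, probe_port))
            return s.getsockname()[0]
        except OSError:
            return "127.0.0.1"


if __name__ == "__main__":
    print(local_ip())
```

The address printed is the one to type into the other laptop's browser, followed by the port that python -m http.server reports.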
And after that, I went back to building wheels for Linux and macOS, and started setting up the packaging pipeline (from yesterday) for poppler-utils. The only changes I had to make to the setup.py were:
- Include all of the poppler and pybind11 header files:
ext_includes = [
    "lib/poppler",
    "lib/poppler/fofi",
    "lib/poppler/goo",
    "lib/poppler/poppler",
    "lib/poppler/build",
    "lib/poppler/build/poppler",
    "lib/poppler/utils",
    "lib/poppler/build/utils",
    pybind11.get_include(),
]
- Declare the module name, and pass in the path to the C++ source:
ext_modules = [
    Extension(
        "poppler_utils.pdftopng",
        # Sort input source files to ensure bit-for-bit reproducible builds
        # (https://github.com/pybind/python_example/pull/53)
        sorted(["src/poppler_utils/pdftopng.cpp"]),
        include_dirs=ext_includes,
        language="c++",
    ),
]
When I installed the package using pip, it called setuptools (which internally uses distutils, I think) to build the extension using g++, and link it to the relevant shared libraries using ld:
$ pip install -v -e .
I also had to make changes to the cibuildwheel GitHub workflow:
- Add bash scripts for Linux and macOS to install external dependencies and build poppler. The macOS build script looks like this, and the Linux one is similar:
#!/bin/bash
brew install freetype fontconfig libpng jpeg
cd lib/poppler
mkdir build && cd build
cmake -D ENABLE_QT5=OFF -D ENABLE_LIBOPENJPEG=none -D ENABLE_CPP=OFF ..
make poppler
- Add the poppler build directory to LD_LIBRARY_PATH before calling the auditwheel repair command, because otherwise the build would fail saying that auditwheel wasn't able to locate the poppler shared library. And correspondingly, DYLD_LIBRARY_PATH for delocate on macOS.
CIBW_REPAIR_WHEEL_COMMAND_LINUX: "LD_LIBRARY_PATH=$(pwd)/lib/poppler/build/:$LD_LIBRARY_PATH auditwheel repair -w {dest_dir} {wheel}"
CIBW_REPAIR_WHEEL_COMMAND_MACOS: "DYLD_LIBRARY_PATH=$(pwd)/lib/poppler/build:$DYLD_LIBRARY_PATH delocate-listdeps {wheel} && delocate-wheel -w {dest_dir} -v {wheel}"
There was also this bug I faced where delocate wasn't copying all required shared libraries into the built wheel on macOS (fastmac helped me debug again!) because it doesn't look at top-level extension modules. After searching for a fix in the open issues, I found it in this PR!
And after renaming the extension from pdftopng to poppler_utils.pdftopng (and making it "not a top-level" module), delocate started copying all the required shared libraries into the wheel!
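To make the "top-level" distinction concrete, here is a small sketch that flags compiled modules sitting at the root of a wheel's file listing. It does not reproduce delocate's own logic. The helper name, the suffix list, and the example file names (including the 0.1.0 version) are all hypothetical.

```python
from pathlib import PurePosixPath

# Suffixes commonly used by compiled modules (an assumed, non-exhaustive list).
EXT_SUFFIXES = (".so", ".pyd", ".dylib")


def top_level_extensions(wheel_names):
    """Return compiled modules that sit at the root of the wheel archive."""
    found = []
    for name in wheel_names:
        path = PurePosixPath(name)
        # A single path component means the file is not inside any package.
        if len(path.parts) == 1 and path.suffix in EXT_SUFFIXES:
            found.append(name)
    return found


# Hypothetical listings before and after the rename described above.
before = ["pdftopng.cpython-38-darwin.so", "pdftopng-0.1.0.dist-info/METADATA"]
after = [
    "poppler_utils/__init__.py",
    "poppler_utils/pdftopng.cpython-38-darwin.so",
]

print(top_level_extensions(before))
print(top_level_extensions(after))
```

With the extension moved inside the poppler_utils package, the second listing has nothing at the root.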
I finally have Linux and macOS wheels! Now I just need to figure out the Windows ones.
I also looked at Windows wheels for some existing projects (numpy and arrow) to see all the libraries they bundle. Both of those wheels have pyd files, which were new to me. I learned that a pyd file is the same as a dll file on Windows, but with some Python-specific things in it.
If you have a DLL named foo.pyd, then it must have a function PyInit_foo(). You can then write Python "import foo", and Python will search for foo.pyd (as well as foo.py, foo.pyc) and if it finds it, will attempt to call PyInit_foo() to initialize it.
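A quick way to see this from Python itself: importlib.machinery.EXTENSION_SUFFIXES lists the file suffixes the interpreter accepts for extension modules on the current platform (on Windows these end in .pyd). The helper init_function_name below is hypothetical and only illustrates the naming rule for ASCII module names, where the init function is named after the last dotted component.

```python
import importlib.machinery
import sys


def init_function_name(module_name):
    """Name of the C init function CPython looks up for an extension module.

    Illustrative helper, assuming an ASCII module name. For a dotted name
    such as "poppler_utils.pdftopng", only the last component is used.
    """
    return "PyInit_" + module_name.rpartition(".")[2]


# Suffixes differ per platform and Python version, so no output is shown.
print(sys.platform, importlib.machinery.EXTENSION_SUFFIXES)

print(init_function_name("foo"))                     # PyInit_foo
print(init_function_name("poppler_utils.pdftopng"))  # PyInit_pdftopng
```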
- The Windows wheel for numpy contains only one dll (BLAS) with a lot of pyd, pyx (Cython files to be converted to C/C++), pxd (Cython equivalent of a C/C++ header), and c files!
$ unzip -l numpy-1.19.2-cp38-cp38-win_amd64.whl | grep dll
32939993 2020-09-10 01:30 numpy/.libs/libopenblas.NOIJJG62EMASZI6NYURL6JBKM4EVBGM7.gfortran-win_amd64.dll
- The Windows wheel for arrow contains multiple dlls! (Along with all the other types of files mentioned above.)
$ unzip -l pyarrow-1.0.1-cp38-cp38-win_amd64.whl | grep dll
8459264 2020-08-17 19:35 pyarrow/arrow.dll
910336 2020-08-17 19:35 pyarrow/arrow_dataset.dll
2610176 2020-08-17 19:35 pyarrow/arrow_flight.dll
1264640 2020-08-17 19:35 pyarrow/arrow_python.dll
91648 2020-08-17 19:35 pyarrow/arrow_python_flight.dll
81920 2020-08-17 19:35 pyarrow/cares.dll
3249664 2020-08-17 19:35 pyarrow/libcrypto-1_1-x64.dll
2661888 2020-08-17 19:35 pyarrow/libprotobuf.dll
651264 2020-08-17 19:35 pyarrow/libssl-1_1-x64.dll
2204672 2020-08-17 19:35 pyarrow/parquet.dll
89600 2020-08-17 19:35 pyarrow/zlib.dll
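A wheel is just a zip archive, so the same listing can be done without unzip and grep. The sketch below uses only the standard library. The function name bundled_libraries and the in-memory demo archive are made up for illustration; pointing the function at a real .whl path should work the same way.

```python
import io
import zipfile


def bundled_libraries(wheel, suffixes=(".dll", ".pyd")):
    """Return (size, name) pairs for archive members ending in one of suffixes.

    `wheel` can be a path to a .whl file or a file-like object.
    """
    with zipfile.ZipFile(wheel) as zf:
        return [
            (info.file_size, info.filename)
            for info in zf.infolist()
            if info.filename.lower().endswith(suffixes)
        ]


# Build a tiny fake wheel in memory so the example runs anywhere.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("demo/__init__.py", "")
    zf.writestr("demo/_core.pyd", b"\0" * 16)
    zf.writestr("demo/helper.dll", b"\0" * 32)
buf.seek(0)

for size, name in bundled_libraries(buf):
    print(size, name)
```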
I also did a mock interview (which was really helpful!) with Vaibhav (who is super awesome!). And I also paired with Ilia to look at his C++ & WebAssembly project. We implemented rendering a multi-line string in a canvas inside the browser using C++! Pointers are fun! I need to learn how to write pointer code fluently like Ilia.
Since half of the second half of my batch is over (only 3 weeks left now!), I did a "things" check-in. I haven't included the things I dropped in the check-in I did at the batch midpoint:
- Remove ghostscript and opencv as camelot dependencies to make installation easy for users - Almost done here. I wasn't able to "remove" the dependencies, but I think I got to an acceptable solution. Will put this in the background now.
- Learn Rust and WebAssembly - I wanted to do this to help with the first item, but in the end it didn't lead to anything as I ended up going the Python C extension route. But I'm excited to get into these because I'm bamboozled by the fact that I can write something in a low-level language like Rust (which has an awesome ecosystem of tools), and have it run in my browser!
- Make new open-source tools! - I've worked on present, itslit, opep, python-doc, python-peps-graph, and pdftopng. I have more ideas but I'm not sure if I'll be able to work on them.
- Go deep into operating systems, learn how Linux and containers work, and implement a shell! - Dropping the first half but planning to implement a shell using C and/or Rust!
- Write blog posts about anything! (1 per week) - This is happening! I'll continue doing this.
- Prepare for job interviews - Might not need this in the near term.
I've condensed these down to the following:
- Write more Rust and C. Work on a large-ish project. Implement a snek game, a shell, ...!
- (Background processes) Continue packaging the extension, and work on OSS issues.
- Continue writing 1 blog post for each day.