Day 45 — I have Linux and macOS wheels!

Today I continued my quest to package my Python C extension for multiple OSes. Yesterday, while doing a packaging "test run" with the curses "hello world" program, I found that Python's curses module is not supported on Windows (it should work under WSL, but not in the "default" Windows terminal, I guess).

So the first thing I wanted to ensure today was that poppler and the dependencies I need (freetype, fontconfig, libpng, and libjpeg) are supported on Windows. Ilia had suggested that I could use cygwin to do Linux-y things on Windows, so I borrowed my mom's laptop, which runs Windows 7, and installed poppler and its dependencies, along with git, gcc/g++, make, and cmake:

[Image: packages]

I was able to clone and build poppler from source! It still required the jpeg library to generate the Makefiles, even though I set ENABLE_LIBOPENJPEG=none. I'll have to debug that at some point so that I can remove this poppler dependency.


  > cmake -D ENABLE_QT5=OFF -D ENABLE_LIBOPENJPEG=none -D ENABLE_CPP=OFF ..
  > make poppler

After the test, I wanted to transfer some files from the Windows laptop to my laptop using python -m http.server. For that I needed the Windows laptop's IP address, but I didn't know what the Windows equivalent of ifconfig was, and I didn't want to spend time finding and installing ifconfig on cygwin. I found this nice alternative!


  >>> import socket
  >>> s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
  >>> s.connect(("8.8.8.8", 80))
  >>> print(s.getsockname()[0])
  192.168.1.4
  >>> s.close()

And after that, I went back to building wheels for Linux and macOS, and started setting up the packaging pipeline (from yesterday) for poppler-utils. The only changes I had to make to the setup.py were:


  ext_includes = [
      "lib/poppler",
      "lib/poppler/fofi",
      "lib/poppler/goo",
      "lib/poppler/poppler",
      "lib/poppler/build",
      "lib/poppler/build/poppler",
      "lib/poppler/utils",
      "lib/poppler/build/utils",
      pybind11.get_include(),
  ]


  ext_modules = [
      Extension(
          "poppler_utils.pdftopng",
          # Sort input source files to ensure bit-for-bit reproducible builds
          # (https://github.com/pybind/python_example/pull/53)
          sorted(["src/poppler_utils/pdftopng.cpp"]),
          include_dirs=ext_includes,
          language="c++",
      ),
  ]

When I installed the package using pip, it called setuptools (which internally uses distutils, I think) to build the extension using g++, and link it to the relevant shared libraries using ld:


  $ pip install -v -e .

I also had to make changes to the cibuildwheel GitHub workflow:


  #!/bin/bash

  brew install freetype fontconfig libpng jpeg

  cd lib/poppler
  mkdir build && cd build
  cmake -D ENABLE_QT5=OFF -D ENABLE_LIBOPENJPEG=none -D ENABLE_CPP=OFF ..
  make poppler


  CIBW_REPAIR_WHEEL_COMMAND_LINUX: "LD_LIBRARY_PATH=$(pwd)/lib/poppler/build/:$LD_LIBRARY_PATH auditwheel repair -w {dest_dir} {wheel}"
  CIBW_REPAIR_WHEEL_COMMAND_MACOS: "DYLD_LIBRARY_PATH=$(pwd)/lib/poppler/build:$DYLD_LIBRARY_PATH delocate-listdeps {wheel} && delocate-wheel -w {dest_dir} -v {wheel}"

I also hit a bug where delocate wasn't copying all the required shared libraries into the built wheel on macOS (fastmac helped me debug again!), because it doesn't look at top-level extension modules. After searching the open issues for a fix, I found it in this PR!

And after renaming the extension from pdftopng to poppler_utils.pdftopng (and making it "not a top-level" module), delocate started copying all the required shared libraries into the wheel!
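
A quick way to spot this situation before running delocate is to look for extension modules sitting at the root of the wheel archive. The following is only a minimal sketch using the standard zipfile module; the function name top_level_extensions and the suffix list are assumptions made for illustration, not part of delocate or cibuildwheel.

```python
import zipfile

# Suffixes that usually mark a compiled extension module inside a wheel
# (an assumption for this sketch; adjust for your platform).
COMPILED_SUFFIXES = (".so", ".pyd")


def top_level_extensions(wheel_path):
    """Return extension modules that sit at the root of the wheel archive.

    wheel_path may be a filename or a file-like object, since a wheel
    is just a zip archive.
    """
    with zipfile.ZipFile(wheel_path) as wheel:
        return [
            name
            for name in wheel.namelist()
            # Archive names use "/" as the separator, so a name without
            # one is not inside any package directory.
            if name.endswith(COMPILED_SUFFIXES) and "/" not in name
        ]
```

An empty list means every extension module lives inside a package directory, which is the layout that worked here after the rename to poppler_utils.pdftopng.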

I finally have Linux and macOS wheels! Now I just need to figure out the Windows ones.


I also looked at Windows wheels for some existing projects (numpy and arrow) to see all the libraries they bundle. Both of those wheels have pyd files, which were new to me. I learned that a pyd file is the same as a dll file on Windows, but with some Python-specific things in it.

If you have a DLL named foo.pyd, then it must have a function PyInit_foo(). You can then write Python "import foo", and Python will search for foo.pyd (as well as foo.py, foo.pyc) and if it finds it, will attempt to call PyInit_foo() to initialize it.
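
The naming rule above can be sketched in a few lines of Python. This is an illustration rather than how the interpreter is implemented: init_symbol is a hypothetical helper, it assumes an ASCII module name, and the example filename is made up. importlib.machinery.EXTENSION_SUFFIXES is a real attribute that lists the extension-module suffixes the running interpreter accepts.

```python
import importlib.machinery
from pathlib import PurePath


def init_symbol(filename):
    """Return the init function name expected for an extension module file.

    Hypothetical helper; assumes an ASCII module name.
    """
    # "foo.cp38-win_amd64.pyd" -> "foo": everything after the first dot
    # in the file name is treated as the suffix.
    module_name = PurePath(filename).name.split(".", 1)[0]
    return "PyInit_" + module_name


# Suffixes the running interpreter recognises for extension modules
# (the list differs between Windows, Linux, and macOS).
print(importlib.machinery.EXTENSION_SUFFIXES)

print(init_symbol("pdftopng.cp38-win_amd64.pyd"))  # PyInit_pdftopng
```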


  $ unzip -l numpy-1.19.2-cp38-cp38-win_amd64.whl | grep dll
   32939993  2020-09-10 01:30   numpy/.libs/libopenblas.NOIJJG62EMASZI6NYURL6JBKM4EVBGM7.gfortran-win_amd64.dll


  $ unzip -l pyarrow-1.0.1-cp38-cp38-win_amd64.whl | grep dll
    8459264  2020-08-17 19:35   pyarrow/arrow.dll
     910336  2020-08-17 19:35   pyarrow/arrow_dataset.dll
    2610176  2020-08-17 19:35   pyarrow/arrow_flight.dll
    1264640  2020-08-17 19:35   pyarrow/arrow_python.dll
      91648  2020-08-17 19:35   pyarrow/arrow_python_flight.dll
      81920  2020-08-17 19:35   pyarrow/cares.dll
    3249664  2020-08-17 19:35   pyarrow/libcrypto-1_1-x64.dll
    2661888  2020-08-17 19:35   pyarrow/libprotobuf.dll
     651264  2020-08-17 19:35   pyarrow/libssl-1_1-x64.dll
    2204672  2020-08-17 19:35   pyarrow/parquet.dll
      89600  2020-08-17 19:35   pyarrow/zlib.dll
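
Since a wheel is just a zip archive, the same listing can be produced without unzip and grep, which is handy on a Windows machine that doesn't have them. This is a rough sketch using the standard zipfile module. The function name bundled_libraries is an assumption for illustration, and unlike the grep above it also matches pyd files by default.

```python
import zipfile


def bundled_libraries(wheel_path, suffixes=(".dll", ".pyd")):
    """Return (size_in_bytes, archive_name) pairs for compiled files in a wheel."""
    with zipfile.ZipFile(wheel_path) as wheel:
        return [
            (info.file_size, info.filename)
            for info in wheel.infolist()
            # Compare case-insensitively, since Windows file names
            # may use upper-case extensions.
            if info.filename.lower().endswith(suffixes)
        ]
```

Calling bundled_libraries("pyarrow-1.0.1-cp38-cp38-win_amd64.whl") on a downloaded wheel returns the sizes and names, in archive order, ready to print or sort.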


I also did a mock interview (which was really helpful!) with Vaibhav (who is super awesome!). And I also paired with Ilia to look at his C++ & WebAssembly project. We implemented rendering a multi-line string in a canvas inside the browser using C++! Pointers are fun! I need to learn how to write pointer code fluently like Ilia.


Since half of the second half of my batch is over (only 3 weeks left now!), I did a "things" check-in. I've left out the things I already dropped in the check-in I did at the batch midpoint:

  1. Remove ghostscript and opencv as camelot dependencies to make installation easy for users - Almost done here. I wasn't able to "remove" the dependencies, but I think I got to an acceptable solution. Will put this in the background now.
  2. Learn Rust and WebAssembly - I wanted to do this to help with 1, but in the end it didn't lead to anything, as I ended up going the Python C extension route. But I'm excited to get into these because I'm bamboozled by the fact that I can write something in a low-level language like Rust (which has an awesome ecosystem of tools), and have it run in my browser!
  3. Make new open-source tools! - I've worked on present, itslit, opep, python-doc, python-peps-graph, and pdftopng. I have more ideas but not sure if I'll be able to work on them.
  4. Go deep into operating systems, learn how Linux and containers work, and implement a shell! - Dropping the first half but planning to implement a shell using C and/or Rust!
  5. Write blog posts about anything! (1 per week) - This is happening! I'll continue doing this.
  6. Prepare for job interviews - Might not need this in the near term.

I've condensed these down to the following:

  1. Write more Rust and C. Work on a large-ish project. Implement a snek game, a shell, ...!
  2. (Background processes) Continue packaging the extension, and work on OSS issues.
  3. Continue writing 1 blog post for each day.