Day 45b — How to (almost) build a C extension wheel on Windows (with external dependencies)

I looked into how to build C extension wheels on Windows over the weekend. Since there isn't a fastmac equivalent to get a Windows machine for debugging, I booted up Windows on my laptop after a really long time! I need to find a fastwin or winfast!

Installation

Visual Studio 2019 Community Edition

The Python packaging docs mentioned that I needed to install Visual Studio Community Edition, 2015 or later, for Python 3.5+. Visual Studio 2015, 2017, and 2019 are binary compatible with each other, so any of them works! I installed Visual Studio 2019 and selected the Python native development tools checkbox in the setup application.

The setup put cl.exe and link.exe (Windows equivalent of cc and ld), with some other tools, in my Program Files! Looking into the Program Files brought back very old memories of fiddling with a game's files inside this directory to make everything work :)

Git and Python

I also installed git (which installed a lot of Unix tools too!) and Python 3.8 using the installers from their websites. All of these installers seemed to modify my PATH variable automatically, because their executables were available in PowerShell right after, and I could run cl.exe, link.exe, git, and python in the PowerShell terminal.
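
A quick way to double-check which tools are actually reachable is to ask Python where it finds them. This is only a sketch: the tool list mirrors the four commands mentioned above, and `locate_tools` is a name made up for this example.

```python
import shutil

# The four commands mentioned above; on Windows, shutil.which also tries
# the extensions listed in PATHEXT, so "cl" can resolve to cl.exe.
TOOLS = ["cl", "link", "git", "python"]


def locate_tools(names):
    """Map each tool name to the first match on PATH, or None if absent."""
    return {name: shutil.which(name) for name in names}


for name, path in locate_tools(TOOLS).items():
    print(f"{name:8} {path or 'not found on PATH'}")
```

Since only the first match on PATH is reported, this also shows which copy wins when more than one directory provides a tool with the same name.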

Building a "hello world" C program

As a test, I tried to build the ncurses "hello world" program. But since ncurses is not supported on Windows (there's PDCurses though), I commented out all of the ncurses functions and replaced the printw with a printf, which basically made it a "hello world" C program!


  (venv) > python -m pip install .
  💥

BOOM! I got my first error which said "fatal error LNK1112: module machine type 'x86' conflicts with target machine type 'x64'". I was using x86 tools to build something for my x64 system.

Somehow, PowerShell was configured to only use the x86 toolchain, and I wasn't sure how to make it use the x64 one. At this point, I found the x64 Native Tools Command Prompt, which gets installed with Visual Studio and has everything configured correctly. So I jumped onto the "native tools command prompt" submarine from the PowerShell ship!
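
An extension has to match the architecture of the interpreter that will import it, so it helps to confirm what the interpreter itself is. The snippet below is a minimal sketch using only the standard library; the function names are made up for this example.

```python
import struct
import sysconfig


def interpreter_bits() -> int:
    """Pointer size of the running interpreter in bits (32 or 64)."""
    return struct.calcsize("P") * 8


def describe_target() -> str:
    """Platform string used for builds, plus the interpreter's bitness."""
    return f"{sysconfig.get_platform()} ({interpreter_bits()}-bit Python)"


print(describe_target())
```

A 64-bit interpreter needs the x64 toolchain, which is what the x64 native tools command prompt sets up.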

After the switch, I was able to build and install the "hello world" C extension!


  (venv) > python -m pip install .
  Processing c:\users\vinayak mehta\desktop\development\onix
    Installing build dependencies ... done
    Getting requirements to build wheel ... done
      Preparing wheel metadata ... done
  Requirement already satisfied: Click>=7.0 in c:\users\vinayak mehta\appdata\local\programs\python\python38\lib\site-packages (from onix==0.1.0) (7.1.2)
  Building wheels for collected packages: onix
    Building wheel for onix (PEP 517) ... done
    Created wheel for onix: filename=onix-0.1.0-cp38-cp38-win_amd64.whl size=47069 sha256=8681e069d73e567d865f601cb212429f0ef335a320d031c188576078ef3f1eba
    Stored in directory: C:\Users\Vinayak Mehta\AppData\Local\Temp\pip-ephem-wheel-cache-jgv05vv3\wheels\33\5f\a8\63d76ba35c8c629936b3485a15ffe5ccb25fe1304159ebc9d8
  Successfully built onix
  Installing collected packages: onix
  Successfully installed onix-0.1.0
  WARNING: You are using pip version 20.2.1; however, version 20.2.3 is available.
  You should consider upgrading via the 'C:\Users\Vinayak Mehta\AppData\Local\Programs\Python\Python38\python.exe -m pip install --upgrade pip' command.

And call the executable!


  (venv) > onix.exe
  Hello, snek!
  (venv) >
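
The wheel filename in the pip output above already says what was built: the name follows the `{distribution}-{version}(-{build})-{python tag}-{abi tag}-{platform tag}.whl` convention from the wheel specification. Here is a small sketch that splits it apart; `parse_wheel_filename` is a helper written for this post, not something pip provides.

```python
from typing import NamedTuple, Optional


class WheelName(NamedTuple):
    distribution: str
    version: str
    build: Optional[str]
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelName:
    """Split a wheel filename into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        dist, version, py, abi, plat = parts
        build = None
    elif len(parts) == 6:
        # The optional build tag sits between the version and the Python tag.
        dist, version, build, py, abi, plat = parts
    else:
        raise ValueError(f"unexpected number of fields in {filename!r}")
    return WheelName(dist, version, build, py, abi, plat)


info = parse_wheel_filename("onix-0.1.0-cp38-cp38-win_amd64.whl")
print(info.python_tag, info.abi_tag, info.platform_tag)
```

The `win_amd64` platform tag is the part that confirms the x64 toolchain was used.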

vcpkg and external dependencies

After that I moved on to the slightly more complex C extension, which has some external dependencies. I wasn't sure if there was a way for Cygwin to work with the native tools command prompt (I'm sure there is). I also wasn't sure if the wheels built on Cygwin with gcc/g++ would play well on Windows, so I started looking for a way to install external dependencies.

I'd heard of how choco is this new and shiny package manager for Windows, but couldn't find the packages I required to build poppler (freetype, fontconfig, libpng, and libjpeg) on their repository. But I found vcpkg! (A C/C++ library manager for Windows, Linux, and macOS released by Microsoft)

Installing vcpkg was easy. I just followed the quickstart from the README, and was able to install the dependencies I needed after that!


  (venv) c:\dev>.\vcpkg\vcpkg.exe install freetype fontconfig libpng libjpeg-turbo

Building poppler

Once you install libraries using vcpkg, you can use them with cmake by adding -DCMAKE_TOOLCHAIN_FILE=C:/path/to/vcpkg.cmake to your cmake command.


  > cmake -DCMAKE_TOOLCHAIN_FILE=C:/dev/vcpkg/scripts/buildsystems/vcpkg.cmake -D ENABLE_QT5=OFF -D ENABLE_LIBOPENJPEG=none -D ENABLE_CPP=OFF ..

But that didn't do the trick for me! cmake wasn't able to find the libraries I installed. I had to add the directory where vcpkg installed all the dependencies to my PATH:


  > set PATH=%PATH%;C:\dev\vcpkg\installed\x86-windows\bin

After which the previous cmake command succeeded! I thought the poppler build would succeed after that:


  > cmake -DCMAKE_TOOLCHAIN_FILE=C:/dev/vcpkg/scripts/buildsystems/vcpkg.cmake -D ENABLE_QT5=OFF -D ENABLE_LIBOPENJPEG=none -D ENABLE_CPP=OFF ..
  > cmake --build . --target poppler --config Release
  💥

BOOM! I got a lot of "unresolved external symbol" errors! Turns out vcpkg installs the x86 version of libraries by default, and I was building poppler for my x64 target on the x64 native tools command prompt! I needed the 64-bit version of each dependency:


  c:\dev>.\vcpkg\vcpkg.exe install freetype:x64-windows fontconfig:x64-windows libpng:x64-windows libjpeg-turbo:x64-windows
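
The `name:triplet` suffix gets repetitive, so here is a sketch of building the same command from a package list. The helper name and its defaults are made up for this example; only the `install` subcommand and the `package:triplet` syntax come from the command above.

```python
def vcpkg_install_args(packages, triplet="x64-windows",
                       vcpkg_exe=r".\vcpkg\vcpkg.exe"):
    """Build the argument list for `vcpkg install`, pinning every package
    to one triplet so x86 and x64 libraries don't get mixed."""
    return [vcpkg_exe, "install"] + [f"{name}:{triplet}" for name in packages]


args = vcpkg_install_args(["freetype", "fontconfig", "libpng", "libjpeg-turbo"])
print(" ".join(args))
```

The resulting list can be handed to `subprocess.run(args, check=True)` from the directory that contains the vcpkg checkout.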

I also removed the earlier x86 vcpkg directory from my PATH and added the new x64 one instead:


  > set PATH=%PATH%;C:\dev\vcpkg\installed\x64-windows\bin
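
Note that the `set` command above only appends; dropping the old x86 entry is a separate step. A rough sketch of doing both on a PATH string is below. The `swap_path_entry` helper is hypothetical, and it compares entries case-insensitively on the assumption that Windows paths are not case sensitive.

```python
def swap_path_entry(path_value, old_dir, new_dir, sep=";"):
    """Return path_value without old_dir and with new_dir appended once."""
    def key(entry):
        # Ignore case and a trailing backslash when comparing entries.
        return entry.rstrip("\\").lower()

    drop = {key(old_dir), key(new_dir)}
    kept = [e for e in path_value.split(sep) if e and key(e) not in drop]
    kept.append(new_dir)
    return sep.join(kept)


X86_BIN = r"C:\dev\vcpkg\installed\x86-windows\bin"
X64_BIN = r"C:\dev\vcpkg\installed\x64-windows\bin"

before = sep_joined = ";".join([r"C:\Windows", X86_BIN])
after = swap_path_entry(before, X86_BIN, X64_BIN)
print(after)
```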

Finally poppler was built successfully!


  > cmake -DCMAKE_TOOLCHAIN_FILE=C:/dev/vcpkg/scripts/buildsystems/vcpkg.cmake -D ENABLE_QT5=OFF -D ENABLE_LIBOPENJPEG=none -D ENABLE_CPP=OFF ..
  > cmake --build . --config Release --target poppler --verbose

I'm not sure if I even need the vcpkg toolchain file at all, since the required libraries are available on my PATH. I'll need to get back to Windows to find out.

Building the pdftopng C extension

After that, I moved to building and installing the C extension:


  (venv) > python -m pip install -v -e .
   Creating library build\temp.win-amd64-3.8\Release\src\poppler_utils\pdftopng.cp38-win_amd64.lib and object build\temp.win-amd64-3.8\Release\src\poppler_utils\pdftopng.cp38-win_amd64.exp
  poppler.lib(GlobalParams.obj) : error LNK2001: unresolved external symbol __imp_RegCloseKey
  ...
  poppler.lib(JpegWriter.obj) : error LNK2001: unresolved external symbol jpeg_std_error
  ...
  poppler.lib(PNGWriter.obj) : error LNK2001: unresolved external symbol png_create_write_struct
  ...
  poppler.lib(DCTStream.obj) : error LNK2001: unresolved external symbol jpeg_CreateDecompress
  ...
  poppler.lib(SplashFTFontEngine.obj) : error LNK2001: unresolved external symbol FT_Init_FreeType
  ...
  poppler.lib(SplashFTFont.obj) : error LNK2001: unresolved external symbol FT_Set_Pixel_Sizes
  ...
  build\lib.win-amd64-3.8\poppler_utils\pdftopng.cp38-win_amd64.pyd : fatal error LNK1120: 51 unresolved externals
  error: command 'C:\\Program Files (x86)\\Microsoft Visual Studio\\2019\\Community\\VC\\Tools\\MSVC\\14.27.29110\\bin\\HostX86\\x64\\link.exe' failed with exit status 1120

BOOM! More "unresolved external symbol" errors! I was able to see the compiler and linker commands in the verbose -v output, so I copied them, tried to understand all the options, and then ran the commands manually.
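
With 51 unresolved externals, grouping the symbols by object file makes the pattern easier to spot. The sketch below parses LNK2001 lines in the format shown in the output above; the regular expression is my own guess at that format, not something taken from the linker docs, and the sample is a few lines copied from the log.

```python
import re
from collections import defaultdict

# Matches lines like:
#   poppler.lib(PNGWriter.obj) : error LNK2001: unresolved external symbol png_...
UNRESOLVED = re.compile(
    r"^(?P<lib>[^()\s]+)\((?P<obj>[^)]+)\) : error LNK2001: "
    r"unresolved external symbol (?P<symbol>\S+)"
)


def group_unresolved(log_text):
    """Group unresolved symbol names by the object file that needs them."""
    by_object = defaultdict(list)
    for line in log_text.splitlines():
        match = UNRESOLVED.match(line.strip())
        if match:
            by_object[match["obj"]].append(match["symbol"])
    return dict(by_object)


SAMPLE = """\
poppler.lib(GlobalParams.obj) : error LNK2001: unresolved external symbol __imp_RegCloseKey
poppler.lib(JpegWriter.obj) : error LNK2001: unresolved external symbol jpeg_std_error
poppler.lib(PNGWriter.obj) : error LNK2001: unresolved external symbol png_create_write_struct
poppler.lib(SplashFTFont.obj) : error LNK2001: unresolved external symbol FT_Set_Pixel_Sizes
"""

grouped = group_unresolved(SAMPLE)
for obj, symbols in grouped.items():
    print(obj, "->", ", ".join(symbols))
```

The symbol prefixes (`jpeg_`, `png_`, `FT_`) are a decent hint about which library each object file is missing.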

The include directories were being passed to cl.exe correctly, as specified in the setup.py, and the compile command was exiting successfully.


  (venv) > cl.exe /c /nologo /Ox /W3 /GL /DNDEBUG /MD -DVERSION_INFO="0.1.0"
  "-Ic:\Users\Vinayak Mehta\Desktop\development\poppler-utils\lib\poppler"
  "-Ic:\Users\Vinayak Mehta\Desktop\development\poppler-utils\lib\poppler\fofi"
  "-Ic:\Users\Vinayak Mehta\Desktop\development\poppler-utils\lib\poppler\goo"
  "-Ic:\Users\Vinayak Mehta\Desktop\development\poppler-utils\lib\poppler\poppler"
  "-Ic:\Users\Vinayak Mehta\Desktop\development\poppler-utils\lib\poppler\build"
  "-Ic:\Users\Vinayak Mehta\Desktop\development\poppler-utils\lib\poppler\build\poppler"
  "-Ic:\Users\Vinayak Mehta\Desktop\development\poppler-utils\lib\poppler\utils"
  "-Ic:\Users\Vinayak Mehta\Desktop\development\poppler-utils\lib\poppler\build\utils"
  "-IC:\Users\Vinayak Mehta\AppData\Local\Programs\Python\Python38\lib\site-packages\pybind11\include"
  "-IC:\Users\Vinayak Mehta\AppData\Local\Programs\Python\Python38\include"
  ...
  /EHsc /Tpsrc\poppler_utils\pdftopng.cpp /Fobuild\temp.win-amd64-3.8\Release\src\poppler_utils\pdftopng.obj /EHsc /std:c++14

The errors were being raised at the linking stage with link.exe:


  (venv) > link.exe /nologo /INCREMENTAL:NO /LTCG /DLL /MANIFEST:EMBED,ID=2 /MANIFESTUAC:NO
  ...
  /EXPORT:PyInit_pdftopng
  build\temp.win-amd64-3.8\Release\src\poppler_utils\pdftopng.obj
  /OUT:build\lib.win-amd64-3.8\poppler_utils\pdftopng.cp38-win_amd64.pyd
  /IMPLIB:build\temp.win-amd64-3.8\Release\src\poppler_utils\pdftopng.cp38-win_amd64.lib
  "c:\Users\Vinayak Mehta\Desktop\development\poppler-utils\lib\poppler\build\Release\poppler.lib"

I'd assumed that link.exe would be able to find all the required libraries, since I'd put that directory on the PATH, right? (Just like LD_LIBRARY_PATH!) Turns out that assumption was incorrect: link.exe doesn't look on the PATH for .lib files, and I needed to specify all the libraries explicitly!


  (venv) > link.exe /nologo /INCREMENTAL:NO /LTCG /DLL /MANIFEST:EMBED,ID=2 /MANIFESTUAC:NO
  ...
  /EXPORT:PyInit_pdftopng
  build\temp.win-amd64-3.8\Release\src\poppler_utils\pdftopng.obj
  /OUT:build\lib.win-amd64-3.8\poppler_utils\pdftopng.cp38-win_amd64.pyd
  /IMPLIB:build\temp.win-amd64-3.8\Release\src\poppler_utils\pdftopng.cp38-win_amd64.lib
  "c:\Users\Vinayak Mehta\Desktop\development\poppler-utils\lib\poppler\build\Release\poppler.lib"
  "c:\dev\vcpkg\installed\x64-windows\lib\freetype.lib"
  "c:\dev\vcpkg\installed\x64-windows\lib\fontconfig.lib"
  "c:\dev\vcpkg\installed\x64-windows\lib\libpng16.lib"
  "c:\dev\vcpkg\installed\x64-windows\lib\jpeg.lib"

The "unresolved external symbol" errors started going away as I added those libraries one by one! There were still three unresolved symbols though:


  poppler.lib(GlobalParams.obj) : error LNK2001: unresolved external symbol __imp_RegCloseKey
  poppler.lib(GlobalParams.obj) : error LNK2001: unresolved external symbol __imp_RegEnumValueA
  poppler.lib(GlobalParams.obj) : error LNK2001: unresolved external symbol __imp_RegOpenKeyExA

To resolve this, I had to add advapi32.lib, a Windows system library, to the link.exe arguments! The linker found it even though I didn't specify the full path.

There's also the /LIBPATH option, which tells the linker where to look for libraries, so that you don't have to specify the full path to each library:


  (venv) > link.exe /nologo /INCREMENTAL:NO /LTCG /DLL /MANIFEST:EMBED,ID=2 /MANIFESTUAC:NO
  "/LIBPATH:C:\dev\vcpkg\installed\x64-windows\lib"
  ...
  /EXPORT:PyInit_pdftopng
  build\temp.win-amd64-3.8\Release\src\poppler_utils\pdftopng.obj
  /OUT:build\lib.win-amd64-3.8\poppler_utils\pdftopng.cp38-win_amd64.pyd
  /IMPLIB:build\temp.win-amd64-3.8\Release\src\poppler_utils\pdftopng.cp38-win_amd64.lib
  "c:\Users\Vinayak Mehta\Desktop\development\poppler-utils\lib\poppler\build\Release\poppler.lib"
  freetype.lib fontconfig.lib libpng16.lib jpeg.lib advapi32.lib
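
As far as I understand it, link.exe resolves a bare library name by looking in the /LIBPATH directories first and then in the directories listed in the LIB environment variable, which the native tools command prompt sets up. That would explain why advapi32.lib was found without a full path. The sketch below mimics that lookup order; `find_library` is a made-up helper and does not call the linker.

```python
import os


def find_library(name, libpath_dirs=(), lib_env=None):
    """Return the first existing path for `name`, searching the /LIBPATH
    directories first and then the semicolon-separated LIB directories."""
    if lib_env is None:
        lib_env = os.environ.get("LIB", "")
    search_dirs = list(libpath_dirs) + [d for d in lib_env.split(";") if d]
    for directory in search_dirs:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None


# In a native tools command prompt this should print a Windows SDK path;
# anywhere else LIB is probably unset and the result is None.
print(find_library("advapi32.lib"))
```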

But as it turns out, all of this can be done programmatically using the library_dirs and libraries keyword arguments for setuptools.Extension!

You can also specify the libraries to link against when building your extension, and the directories to search for those libraries. The libraries option is a list of libraries to link against, library_dirs is a list of directories to search for libraries at link-time, and runtime_library_dirs is a list of directories to search for shared (dynamically loaded) libraries at run-time. (Again, this sort of non-portable construct should be avoided if you intend to distribute your code.) — Distutils documentation

So I modified the setup.py to use these keyword arguments (I'll have to remove the hard-coded path to vcpkg_dir):


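  import os
  import sys

  from setuptools import Extension

  # ext_includes (the include directories) is defined elsewhere in setup.py
  library_dirs = []
  libraries = []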

  if sys.platform == "win32":
      vcpkg_dir = os.path.join("C:\\", "dev", "vcpkg", "installed", "x64-windows", "lib")
      build_dir = os.path.join(os.getcwd(), "lib", "poppler", "build", "Release")
      library_dirs.extend([vcpkg_dir, build_dir])
      libraries.extend(
          ["freetype", "fontconfig", "libpng16", "jpeg", "advapi32", "poppler"]
      )

  ext_modules = [
      Extension(
          "poppler_utils.pdftopng",
          # Sort input source files to ensure bit-for-bit reproducible builds
          # (https://github.com/pybind/python_example/pull/53)
          sorted([os.path.join("src", "poppler_utils", "pdftopng.cpp")]),
          include_dirs=ext_includes,
          library_dirs=library_dirs,
          libraries=libraries,
          language="c++",
      ),
  ]
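
One possible way to get rid of the hard-coded path is to read the vcpkg location from the environment. This is only a sketch: it assumes the `VCPKG_ROOT` and `VCPKG_DEFAULT_TRIPLET` environment variables are set on the build machine, falls back to the paths used in this post otherwise, and `vcpkg_lib_dir` is a hypothetical helper name.

```python
import os
from pathlib import PureWindowsPath


def vcpkg_lib_dir(environ=os.environ,
                  default_root=r"C:\dev\vcpkg",
                  default_triplet="x64-windows"):
    """Directory holding the import libraries for the chosen triplet.

    Assumes VCPKG_ROOT / VCPKG_DEFAULT_TRIPLET are set; otherwise falls
    back to the location used in this post.
    """
    root = environ.get("VCPKG_ROOT", default_root)
    triplet = environ.get("VCPKG_DEFAULT_TRIPLET", default_triplet)
    # PureWindowsPath keeps the backslash layout even when run elsewhere.
    return str(PureWindowsPath(root) / "installed" / triplet / "lib")


print(vcpkg_lib_dir())
```

In the setup.py above, `vcpkg_dir = vcpkg_lib_dir()` would then replace the `os.path.join("C:\\", ...)` line.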

And after that change, the extension was built and installed successfully! There was a pyd file in the src/poppler_utils directory (as I installed the extension in editable mode)!


  (venv) > python
  >>> from poppler_utils import pdftopng
  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
  ImportError: DLL load failed while importing pdftopng: The specified module could not be found.
  >>>

BOOM! ImportError though :(

The error “The specified module could not be found” is a bit misleading on Windows because it means either the DLL you are trying to load or any of its dependencies cannot be located. — SO answer

Dependencies cannot be located? I thought that putting the vcpkg bin directory on my PATH earlier was supposed to solve that, but it seems like it didn't. I found this nice tool which printed all the libraries that my pyd file wanted:

[Image: dll_missing_imports]

The usual suspects! After copying all the DLLs from vcpkg's bin directory to the pyd file's directory, everything worked! At last!


  (venv) > cp C:\dev\vcpkg\installed\x64-windows\bin\*.dll src\poppler_utils
  (venv) > python
  >>> from poppler_utils import pdftopng
  >>> pdftopng.convert(pdf_path="foo.pdf", png_path="foo")
  >>>

I also found this doc about the dynamic link library search order, but I'll check that out later.
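
One likely reason PATH didn't help: starting with Python 3.8 on Windows, PATH and the current working directory are no longer searched for the dependencies of extension modules. Only the system paths, the directory containing the pyd, and directories registered with `os.add_dll_directory()` are used. An alternative to copying the DLLs could look like the sketch below; the `dll_directory` wrapper is a made-up name, and I haven't tried this against the pdftopng build.

```python
import os
import sys
from contextlib import contextmanager


@contextmanager
def dll_directory(path):
    """Temporarily register `path` for resolving extension module DLLs.

    Yields True when the directory was registered, and False on platforms
    or Python versions that don't have os.add_dll_directory.
    """
    if sys.platform == "win32" and hasattr(os, "add_dll_directory"):
        # add_dll_directory needs an absolute path to an existing directory.
        handle = os.add_dll_directory(os.path.abspath(path))
        try:
            yield True
        finally:
            handle.close()
    else:
        yield False


# Hypothetical usage, with the vcpkg path taken from this post:
# with dll_directory(r"C:\dev\vcpkg\installed\x64-windows\bin"):
#     from poppler_utils import pdftopng

with dll_directory(os.getcwd()) as registered:
    print("DLL directory registered:", registered)
```

This is fine for local development, but it doesn't help anyone who installs the wheel, since they won't have the vcpkg directory.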

Questions

So it looks like I just need to bundle all those DLLs into the wheel somehow. This is how these projects seem to do it:


  $ unzip -l numpy-1.19.2-cp38-cp38-win_amd64.whl | grep dll
    32939993  2020-09-10 01:30   numpy/.libs/libopenblas.NOIJJG62EMASZI6NYURL6JBKM4EVBGM7.gfortran-win_amd64.dll


  $ unzip -l pyarrow-1.0.1-cp38-cp38-win_amd64.whl | grep dll
    8459264  2020-08-17 19:35   pyarrow/arrow.dll
    910336   2020-08-17 19:35   pyarrow/arrow_dataset.dll
    2610176  2020-08-17 19:35   pyarrow/arrow_flight.dll
    1264640  2020-08-17 19:35   pyarrow/arrow_python.dll
    91648    2020-08-17 19:35   pyarrow/arrow_python_flight.dll
    81920    2020-08-17 19:35   pyarrow/cares.dll
    3249664  2020-08-17 19:35   pyarrow/libcrypto-1_1-x64.dll
    2661888  2020-08-17 19:35   pyarrow/libprotobuf.dll
    651264   2020-08-17 19:35   pyarrow/libssl-1_1-x64.dll
    2204672  2020-08-17 19:35   pyarrow/parquet.dll
    89600    2020-08-17 19:35   pyarrow/zlib.dll

I also found some questions about an auditwheel / delocate-like tool for Windows, but there's nothing out there yet.
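
The `unzip -l ... | grep dll` check above can also be done from Python, which is handy on a machine without those Unix tools. A wheel is a zip archive, so the standard zipfile module is enough. This is a sketch: `list_dlls` is a made-up helper, and the demo archive is a stand-in so the example runs without downloading a real wheel.

```python
import os
import tempfile
import zipfile


def list_dlls(wheel_path):
    """Return (name, size in bytes) for every .dll stored in a wheel."""
    with zipfile.ZipFile(wheel_path) as wheel:
        return [
            (info.filename, info.file_size)
            for info in wheel.infolist()
            if info.filename.lower().endswith(".dll")
        ]


def make_demo_wheel(directory):
    """Create a tiny stand-in archive (not a valid wheel) for the demo."""
    path = os.path.join(directory, "demo-0.1.0-cp38-cp38-win_amd64.whl")
    with zipfile.ZipFile(path, "w") as wheel:
        wheel.writestr("demo/__init__.py", "")
        wheel.writestr("demo/example.dll", b"\x00" * 16)
    return path


with tempfile.TemporaryDirectory() as tmp:
    demo_dlls = list_dlls(make_demo_wheel(tmp))
    for name, size in demo_dlls:
        print(f"{size:>10}  {name}")
```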

Resources