Day 40 — Playing with poppler utils

Today I replaced some ghostscript code in camelot with a pdftoppm subprocess call just to see if the tests pass. They did! I think poppler could be the ghostscript alternative that I'm looking for!

  pdftoppm -r 300 -png -singlefile foo.pdf foo
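
For reference, here is a minimal sketch of how the same call could look from Python. This is not the actual camelot code; the function names build_pdftoppm_command and pdf_to_png are made up for illustration, and it assumes pdftoppm is on the PATH.

```python
import subprocess


def build_pdftoppm_command(pdf_path, out_prefix, resolution=300):
    """Return the argv list for the pdftoppm call shown above (hypothetical helper)."""
    return [
        "pdftoppm",
        "-r", str(resolution),  # resolution in DPI
        "-png",                 # write PNG instead of PPM
        "-singlefile",          # first page only, no page-number suffix
        pdf_path,
        out_prefix,
    ]


def pdf_to_png(pdf_path, out_prefix, resolution=300):
    """Run pdftoppm and return the path of the PNG it should have written."""
    # check=True raises CalledProcessError if pdftoppm exits with an error.
    subprocess.run(build_pdftoppm_command(pdf_path, out_prefix, resolution), check=True)
    return out_prefix + ".png"
```

Calling pdf_to_png("foo.pdf", "foo") would then run the exact command above and hand back "foo.png".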

That got me all excited to look into the C++ code for poppler. I had to install libopenjp2-7-dev, libjpeg-dev, libtiff-dev, and libnss3-dev before I could even run cmake to generate all the Makefiles! Later I found out that there were some flags I could set to disable the cmake check for these libraries :(

  mkdir build && cd build && cmake .. && make

Running this built the shared library and all the utils! But since I needed just pdftoppm, I tried to compile its source file in the repo against the shared library that was generated in the last step.

I did the following:

  1. g++
  2. Get an "undefined symbol" error.
  3. Guess the header file in which the symbol might be defined.
  4. Run 1 with the -I include for the header file path.
  5. Go to 2.

At the end I had included almost every header file path I could see in the repo, but I was still getting an "undefined symbol" error for a symbol that was supposed to be present in an already included header file! Does the order in which you add these -I flags matter?

After struggling with this for a bit, I learned that I could just pass VERBOSE=1 to make, which makes it print every g++ command it runs! It was awesome to see how a large C++ project is built. First, make built object files out of all the C++ files, and then it linked them all together into a shared library!

I also found the g++ command it was running to build the pdftoppm executable.

First, it created an object file:

  $ g++ -I/home/vinayak/dev/poppler -I/home/vinayak/dev/poppler/fofi \
    -I/home/vinayak/dev/poppler/goo -I/home/vinayak/dev/poppler/poppler \
    -I/home/vinayak/dev/poppler/build -I/home/vinayak/dev/poppler/build/poppler \
    -I/home/vinayak/dev/poppler/utils -I/home/vinayak/dev/poppler/build/utils \
    -isystem /usr/include/freetype2 -isystem /usr/include/openjpeg-2.3 -isystem /usr/include/cairo \
    -Wall -Wextra -Wpedantic -Wno-unused-parameter -Wcast-align -Wformat-security \
    -Wframe-larger-than=65536 -Wlogical-op -Wmissing-format-attribute -Wnon-virtual-dtor \
    -Woverloaded-virtual -Wmissing-declarations -Wundef -Wzero-as-null-pointer-constant -Wshadow -Wsuggest-override \
    -fno-exceptions -fno-check-new -fno-common -D_DEFAULT_SOURCE \
    -O2 -g -std=c++14 -o -c

And then linked it to the shared library:

  $ g++ -Wall -Wextra -Wpedantic -Wno-unused-parameter -Wcast-align -Wformat-security \
    -Wframe-larger-than=65536 -Wlogical-op -Wmissing-format-attribute -Wnon-virtual-dtor \
    -Woverloaded-virtual -Wmissing-declarations -Wundef -Wzero-as-null-pointer-constant \
    -Wshadow -Wsuggest-override -fno-exceptions -fno-check-new -fno-common -D_DEFAULT_SOURCE \
    -O2 -g -Wl,--as-needed utils/CMakeFiles/pdftoppm.dir/ utils/CMakeFiles/pdftoppm.dir/ \
    -o pdftoppm -Wl,-rpath,/home/vinayak/dev/poppler/build:

I ran these steps manually and the executable seemed to work!

  $ pdftoppm --help

Now I need to figure out what all the -W flags are being used for, and whether the object files under utils/CMakeFiles/pdftoppm.dir/ on the link line are all really needed. Why aren't these objects in the shared library already? Is it because they are specific to the command-line executable?

I just know that -Wl,-rpath,/home/vinayak/dev/poppler/build is being used to specify the RPATH for the executable so that the dynamic linker looks for a shared library at this path first, instead of one of the ldconfig paths!
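
To double-check what actually ended up in the executable, readelf -d prints the dynamic section, which includes an RPATH or RUNPATH entry. Below is a rough sketch that pulls the search paths out of that output. The helper names parse_rpath and read_rpath are made up, and it assumes GNU readelf is installed. One caveat I'm fairly sure of: some linkers write RUNPATH instead of RPATH for the same flag, and RUNPATH is searched after LD_LIBRARY_PATH rather than before it.

```python
import re
import subprocess


def parse_rpath(readelf_output):
    """Extract RPATH/RUNPATH directories from `readelf -d` output (hypothetical helper)."""
    paths = []
    for line in readelf_output.splitlines():
        # Lines of interest look like:
        #   0x... (RUNPATH)   Library runpath: [/some/dir:/other/dir]
        match = re.search(r"\((RPATH|RUNPATH)\).*\[(.*)\]", line)
        if match:
            # Entries are colon-separated; skip empty ones (e.g. a trailing ':').
            paths.extend(p for p in match.group(2).split(":") if p)
    return paths


def read_rpath(binary_path):
    """Run readelf -d on a binary and return its RPATH/RUNPATH directories."""
    result = subprocess.run(
        ["readelf", "-d", binary_path],
        check=True, capture_output=True, text=True,
    )
    return parse_rpath(result.stdout)
```

Running read_rpath("pdftoppm") in the build directory should then list the build path from the link command.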

I also want to search for a Python dir() equivalent for C/C++ shared libraries, so that I can look at all the functions that a shared library exports! readelf, objdump, and nm could be used for that I guess, but it would be awesome if the output is simple, just like Python's dir()!
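
Until I find something better, here is a rough sketch of a dir()-like helper built on nm. It assumes GNU nm from binutils is installed; the names parse_nm_output and dir_shared_library are made up for illustration. The flags are -D for the dynamic symbol table, --defined-only to skip symbols the library merely imports, and -C to demangle C++ names.

```python
import subprocess


def parse_nm_output(text):
    """Return sorted, de-duplicated symbol names from nm output (hypothetical helper)."""
    names = set()
    for line in text.splitlines():
        # Defined symbols look like "<address> <type> <name>"; a demangled
        # name may itself contain spaces, so split at most twice.
        parts = line.split(None, 2)
        if len(parts) == 3:
            names.add(parts[2])
    return sorted(names)


def dir_shared_library(path):
    """List the symbols a shared library exports, a bit like dir() for a module."""
    result = subprocess.run(
        ["nm", "-D", "--defined-only", "-C", path],
        check=True, capture_output=True, text=True,
    )
    return parse_nm_output(result.stdout)
```

Something like dir_shared_library("libpoppler.so") should then give a plain list of names to scroll through, though it lists exported data symbols as well as functions.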