Day 40 — Playing with poppler utils

Today I replaced some ghostscript code in camelot with a pdftoppm subprocess call just to see if the tests pass. They did! I think poppler could be the ghostscript alternative that I'm looking for!


  pdftoppm -r 300 -png -singlefile foo.pdf foo
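
On the Python side, the subprocess call could look roughly like the sketch below. This is not the actual camelot code: the helper names build_pdftoppm_cmd and pdf_to_png are made up for illustration, and it assumes pdftoppm is available on the PATH.

```python
import subprocess


def build_pdftoppm_cmd(pdf_path, out_prefix, resolution=300):
    """Build the pdftoppm command line shown above (hypothetical helper)."""
    return [
        "pdftoppm",
        "-r", str(resolution),  # render at this many DPI
        "-png",                 # write PNG instead of the default PPM
        "-singlefile",          # first page only, no page-number suffix
        pdf_path,
        out_prefix,             # the image ends up at <out_prefix>.png
    ]


def pdf_to_png(pdf_path, out_prefix, resolution=300):
    """Run pdftoppm and return the path of the PNG (hypothetical helper)."""
    cmd = build_pdftoppm_cmd(pdf_path, out_prefix, resolution)
    # check=True raises CalledProcessError if pdftoppm exits non-zero
    subprocess.run(cmd, check=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    return out_prefix + ".png"
```

Calling pdf_to_png("foo.pdf", "foo") runs the same command as above and returns "foo.png".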

That got me all excited to look into the C++ code for poppler. I had to install libopenjp2-7-dev, libjpeg-dev, libtiff-dev, and libnss3-dev before I could even run cmake to generate all the Makefiles! Later I found out that there were some flags I could set to disable the cmake check for these libraries :(


  mkdir build && cd build && cmake .. && make

Running this built libpoppler.so and all the utils! But since I needed just pdftoppm, I tried to compile the pdftoppm.cc file in the repo with the libpoppler.so shared library that was generated in the last step.

I did the following:

  1. g++ pdftoppm.cc libpoppler.so
  2. Get an "undefined symbol" error.
  3. Guess the header file in which the symbol might be defined.
  4. Run 1 with the -I include for the header file path.
  5. Go to 2.

In the end I had included almost every header file path I could see in the repo, but I was still getting an "undefined symbol" error for a symbol that was supposed to be declared in a header file I had already included! Does the order in which you add these -I flags matter?

After struggling with this for a bit, I learned that I could just pass VERBOSE=1 to make (as in make VERBOSE=1) to have it print every g++ command it runs! It was awesome to see how a large C++ project is built. First, make compiled each C++ file into an object file, and then it linked them all together into a shared library!

I also found the g++ command it was running to build the pdftoppm executable.

First, it created an object file pdftoppm.cc.o:


  $ g++ -I/home/vinayak/dev/poppler -I/home/vinayak/dev/poppler/fofi \
    -I/home/vinayak/dev/poppler/goo -I/home/vinayak/dev/poppler/poppler \
    -I/home/vinayak/dev/poppler/build -I/home/vinayak/dev/poppler/build/poppler \
    -I/home/vinayak/dev/poppler/utils -I/home/vinayak/dev/poppler/build/utils \
    -isystem /usr/include/freetype2 -isystem /usr/include/openjpeg-2.3 -isystem /usr/include/cairo \
    -Wall -Wextra -Wpedantic -Wno-unused-parameter -Wcast-align -Wformat-security \
    -Wframe-larger-than=65536 -Wlogical-op -Wmissing-format-attribute -Wnon-virtual-dtor \
    -Woverloaded-virtual -Wmissing-declarations -Wundef -Wzero-as-null-pointer-constant -Wshadow -Wsuggest-override \
    -fno-exceptions -fno-check-new -fno-common -D_DEFAULT_SOURCE \
    -O2 -g -std=c++14 -o pdftoppm.cc.o -c pdftoppm.cc

And then linked it to the shared library libpoppler.so:


  $ g++ -Wall -Wextra -Wpedantic -Wno-unused-parameter -Wcast-align -Wformat-security \
    -Wframe-larger-than=65536 -Wlogical-op -Wmissing-format-attribute -Wnon-virtual-dtor \
    -Woverloaded-virtual -Wmissing-declarations -Wundef -Wzero-as-null-pointer-constant \
    -Wshadow -Wsuggest-override -fno-exceptions -fno-check-new -fno-common -D_DEFAULT_SOURCE \
    -O2 -g -Wl,--as-needed utils/CMakeFiles/pdftoppm.dir/parseargs.cc.o utils/CMakeFiles/pdftoppm.dir/Win32Console.cc.o \
    pdftoppm.cc.o -o pdftoppm -Wl,-rpath,/home/vinayak/dev/poppler/build: libpoppler.so.103.0.0

I ran these steps manually and the executable seemed to work!


  $ ./pdftoppm --help

Now I need to figure out what all the -W flags are being used for, and if parseargs.cc.o and Win32Console.cc.o are really needed. Why aren't these objects in the shared library already? Is it because they are specific to the command-line executable?
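
One way to check whether the extra object files are needed is to compare the symbols that pdftoppm.cc.o leaves undefined with the symbols that libpoppler.so exports. Here is a minimal sketch that wraps nm from binutils; the function names are made up for illustration, and it assumes nm is installed.

```python
import subprocess


def run_nm(*args):
    """Run nm with the given arguments and return its stdout as text."""
    return subprocess.run(
        ["nm", *args], check=True, stdout=subprocess.PIPE, universal_newlines=True
    ).stdout


def parse_undefined(nm_output):
    """Names from `nm -u` output; those lines look like '   U name'."""
    names = set()
    for line in nm_output.splitlines():
        parts = line.split()
        # U = undefined, w/v = weak undefined (no address column)
        if len(parts) == 2 and parts[0] in ("U", "w", "v"):
            names.add(parts[1])
    return names


def parse_defined(nm_output):
    """Names from `nm -D --defined-only` output: 'address type name'."""
    names = set()
    for line in nm_output.splitlines():
        parts = line.split()
        if len(parts) == 3:
            names.add(parts[2])
    return names


def unresolved(object_file, library):
    """Symbols the object file needs that the library does not export."""
    wanted = parse_undefined(run_nm("-u", object_file))
    provided = parse_defined(run_nm("-D", "--defined-only", library))
    # The result also contains libc and libstdc++ symbols, which g++
    # links in automatically, so those can be ignored when reading it.
    return sorted(wanted - provided)
```

If unresolved("pdftoppm.cc.o", "libpoppler.so") still lists poppler-looking names, they have to come from somewhere else, presumably the extra object files.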

So far I only know that -Wl,-rpath,/home/vinayak/dev/poppler/build is being used to set the RPATH of the executable, so that the dynamic linker looks for the shared library in this directory before falling back to the ldconfig cache and the default library paths!
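
To confirm what actually got recorded in the executable, the dynamic section can be read with readelf -d. Below is a small sketch that pulls out the search path; the function names are made up for illustration, and it assumes readelf from binutils is installed. Depending on the linker's defaults the entry shows up as either RPATH or RUNPATH, and the two differ slightly: RUNPATH is consulted after LD_LIBRARY_PATH, RPATH before it.

```python
import re
import subprocess

# Matches lines such as:
#  0x000000000000001d (RUNPATH)   Library runpath: [/some/dir:/other/dir]
_RPATH_RE = re.compile(r"\((RPATH|RUNPATH)\)\s+Library (?:rpath|runpath): \[(.*)\]")


def parse_rpath(readelf_output):
    """Return a list of (tag, [directories]) found in `readelf -d` output."""
    entries = []
    for line in readelf_output.splitlines():
        match = _RPATH_RE.search(line)
        if match:
            # Directories are colon-separated; drop empty entries such as
            # the one produced by a trailing colon.
            dirs = [d for d in match.group(2).split(":") if d]
            entries.append((match.group(1), dirs))
    return entries


def rpath_of(binary):
    """Run readelf -d on a binary and return its RPATH/RUNPATH entries."""
    out = subprocess.run(
        ["readelf", "-d", binary],
        check=True,
        stdout=subprocess.PIPE,
        universal_newlines=True,
    ).stdout
    return parse_rpath(out)
```

Running rpath_of("./pdftoppm") on the executable built above should report the build directory under one of the two tags.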

I also want to search for a Python dir() equivalent for C/C++ shared libraries, so that I can look at all the functions that a shared library exports! readelf, objdump, and nm could be used for that, I guess, but it would be awesome if the output were simple, just like Python's dir()!
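
Until something nicer turns up, a thin wrapper around nm gets fairly close. This is a rough sketch, not an existing tool: the name lib_dir is made up, it assumes nm from binutils is installed, and it treats symbol types T and W (global and weak code symbols) as "functions".

```python
import subprocess


def parse_exports(nm_output, functions_only=True):
    """Turn `nm -D -C --defined-only` output into a sorted list of names."""
    names = []
    for line in nm_output.splitlines():
        # Each line is 'address type name'; demangled C++ names may
        # contain spaces, so split at most twice.
        parts = line.strip().split(None, 2)
        if len(parts) != 3:
            continue
        _address, sym_type, name = parts
        if functions_only and sym_type not in ("T", "W"):
            continue
        names.append(name)
    return sorted(names)


def lib_dir(library, functions_only=True):
    """A dir()-like listing of what a shared library exports."""
    out = subprocess.run(
        # -D: dynamic symbols, -C: demangle C++ names
        ["nm", "-D", "-C", "--defined-only", library],
        check=True,
        stdout=subprocess.PIPE,
        universal_newlines=True,
    ).stdout
    return parse_exports(out, functions_only)
```

With this, lib_dir("libpoppler.so") returns a plain list of demangled names that can be filtered like any other Python list.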