CMake and vcpkg My Way, Part 1

Intro

I’ve been meaning to start writing blog posts about developing scientific software with C++ for ages. What’s finally tipped me over the edge is being asked about how I use vcpkg on both the #includecpp and Eigen Discords. For these blog posts I’m aiming to keep a Question-Answer-Discussion format, because people seem to like brevity these days but I have a deep tendency to ramble. However, I realised that in order to do this topic justice and stick to that format, I was going to have to split it into two parts. This first part will be a brief introduction to using vcpkg. The second part will discuss how I combine CMake and vcpkg to unlock something really powerful - properly using C++ Sanitizers. If you’re the really impatient type, you can skip to part two now.

If you want to see my projects that use this stuff, then head over to either https://github.com/spinicist/riesling or https://github.com/spinicist/quit. We will be using Riesling as an example throughout these posts.

Question - How do I Manage Dependencies in C++?

At some point, if you are a sane coder, you will want to rely on some code written by someone else. To facilitate these dependencies, most languages have a package manager. Python has pip, Rust has cargo, and let’s not talk about npm right now. A package manager takes responsibility for downloading your dependencies and making them available to your code - the idea is no messing around with include and library paths (or the equivalent), the package manager makes all that just work.

C++, being a middle-aged language that was invented before ubiquitous Internet connections, does not have an official package manager. Over the decade I have been using C++ heavily, the common wisdom has been a mix of listing required dependencies in a README, vendoring (i.e. copying) sources directly into your repo, or using git submodule if you were feeling sophisticated. These solutions are a mix of error-prone, annoying, and just plain terrible. Thankfully things are changing.

Answer - I use vcpkg

There are now several package managers available for C++, of which the ones I see discussed are conan, hunter and vcpkg. I never got round to trying hunter, and at the time I investigated conan, the two libraries I most depended on were not available in the index. I also have a philosophical issue with conan in that it’s written in Python - my problem is not that Python is bad, far from it, but that if C++ as a general purpose language can’t write its own package manager, then C++ has problems and I should probably stop using it.

Thankfully, I came across vcpkg, which not only is written in C++, but has had every library I’ve ever gone looking for in the index. vcpkg is fast and fairly intuitive. However, if I’m being honest, it has two drawbacks. The first is that it is not well documented. That’s why I’m writing this blog. The second is that it’s not finished and is under heavy development. This is only a problem in that things are liable to change - vcpkg is definitely usable in its current state.

vcpkg is open-source, but has a significant advantage in that Microsoft have dedicated a whole team of engineers to it, and best of all they hang out in the #vcpkg channel on the #includecpp Discord. I’ve had quick answers to every question I’ve had about it. Whether that tight-knit developer/user relationship continues to scale remains to be seen, but for now it’s a definite plus.

Discussion - Basic Use of vcpkg

As mentioned in the intro, to fully explain how I use vcpkg I need to also explain a bit about how I use cmake, which will come in part two. Here I will describe the basics of using vcpkg in manifest mode as a git submodule of your repo, which in my opinion is how you should use it (at least at the moment). It might feel weird adding a package manager as a submodule of your repo, I know it did for me the first time I did it, but I promise it works beautifully.

To get started, we add vcpkg as a submodule, and then run the bootstrap script:

git submodule add https://github.com/microsoft/vcpkg.git
git submodule update --init
cd vcpkg
./bootstrap-vcpkg.sh -useSystemBinaries

This will build the vcpkg binary, assuming you are on Linux or Mac. If you’re on Windows I assume you need bootstrap-vcpkg.bat but I can’t really help. You don’t necessarily need the -useSystemBinaries flag - all that does is use the system CMake and Ninja binaries if they are versions that vcpkg can work with. If not, vcpkg will download its own versions and use them. At this point, there will be a vcpkg binary sitting in the vcpkg directory. You can use this a bit like pip, e.g. type ./vcpkg search boost and you will get back a rather long list of all the boost modules.

An interesting feature of vcpkg is that the package index (the list of all available packages) does not live on a server somewhere. Somebody kindly set up https://vcpkg.info to also list all the packages, but that is not an official site and the binary does not talk to it to get a list of the packages. Instead, the package index lives inside the repo at vcpkg/ports. Every directory in there is a package, which vcpkg decided to call a port. Don’t ask me. Inside each port directory is a portfile.cmake file which defines how to compile it. This is very powerful, as we shall see later.
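To get a feel for that layout, here is a minimal shell sketch that lists the contents of one port directory. It assumes vcpkg was added as a submodule at ./vcpkg as in the steps above; show_port is a hypothetical helper name of my own, not a vcpkg command.

```shell
# Show the files that make up a single port.
# Assumption: the first argument is the vcpkg checkout (e.g. ./vcpkg) and
# every port lives in <vcpkg root>/ports/<port>/ alongside a portfile.cmake.
show_port() {
    vcpkg_root="$1"
    port="$2"
    port_dir="$vcpkg_root/ports/$port"
    if [ ! -f "$port_dir/portfile.cmake" ]; then
        echo "no such port: $port" >&2
        return 1
    fi
    ls "$port_dir"
}

# Example, run from the top of your repo:
#   show_port vcpkg eigen3
```

Any patches a port applies sit in the same directory as its portfile.cmake, so this is also a quick way to see how much a port deviates from upstream.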

We’re going to use vcpkg in manifest mode. To do this, cd .. back into your repo and add a vcpkg.json file. Here is the one in my Riesling project:

{
    "name": "riesling",
    "version-string": "0.1",
    "dependencies": [
        "args",
        "catch2",
        "eigen3",
        "fmt",
        "itk",
        {
            "name": "fftw3",
            "features": [ "threads", "avx2" ]
        }
    ]
}

You can see that this defines my project name and version (not really needed but might as well put it there). The important bit is the dependencies list, which contains all the libraries the project needs. For the majority of these, I have only specified the name as I want to use them with the default options. However, for the last library, FFTW, I have specified the particular features I want the library compiled with. Different libraries support different features (most don’t have any). If a library supports features, vcpkg search lists them on an extra line, with each feature in [] brackets.

At this point, we have a vcpkg binary, we have a package index, and we have a list of dependencies. The next step is to tell our project about the packages inside CMakeLists.txt. Here is the relevant chunk for Riesling:

find_package(args REQUIRED)
find_package(Catch2 REQUIRED)
find_package(Eigen3 REQUIRED)
find_package(fmt REQUIRED)
find_package(FFTW3f REQUIRED)
find_package(ITK 5.0.0 REQUIRED
              COMPONENTS
                ITKCommon
                ITKIOImageBase
                ITKIONIFTI
            )
include( ${ITK_USE_FILE} )

This is standard CMake - it would likely look the same even if your project didn’t use vcpkg. The only things to note here are that I’m using the float precision version of FFTW, hence the trailing f, and that ITK has a components system I should probably write a blog post about.

We also need to tell our specific CMake targets about the libraries. Again, here is a relevant block of CMakeLists.txt:

target_link_libraries(vineyard PUBLIC
    Eigen3::Eigen
    fmt::fmt
    FFTW3::fftw3f
    ${ITK_LIBRARIES}
)

This step can be a bit of a “gotcha”. Note that the calls to find_package above used the same names, bar capitalization, as we did in vcpkg.json. Here, the actual CMake target names are namespaced, i.e. package::target, and also potentially capitalized. The correct response to this fact is that capital letters are annoying and should never have been invented.

This can be particularly annoying in vcpkg because it is difficult to look up the target names, particularly in manifest mode. They aren’t listed on vcpkg.info, and they aren’t listed when you vcpkg search. The one way you can get them currently is when you do a vcpkg install, but you can’t do that in manifest mode, and even then it’s only printed when you first install the package (unless you force it by specifying a triplet - but I tried to write this whole blog post without mentioning triplets). At this point in time, your best bet is to first guess, second to look in the portfile.cmake, and third to ask on the Discord. Microsoft are working on a way to get these target names printed, but last time I asked it hadn’t landed.
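One further option, once a configure has succeeded at least once, is to search the installed CMake config files for the targets they declare. The sketch below is an assumption-laden workaround rather than an official vcpkg feature: it assumes that manifest mode installs packages under vcpkg_installed/<triplet>/share inside your build directory, and that each package declares its targets with add_library(... IMPORTED). list_imported_targets is a hypothetical helper name.

```shell
# Print the imported CMake targets (e.g. fmt::fmt) declared by installed packages.
# Assumption: the argument is the share/ directory of the installed tree, and
# the packages' *.cmake files declare targets as
#   add_library(<name> <STATIC|SHARED|INTERFACE|...> IMPORTED ...)
list_imported_targets() {
    share_dir="$1"
    grep -rhoE --include='*.cmake' \
        'add_library\([A-Za-z0-9_:.+-]+ [A-Z_]+ IMPORTED' "$share_dir" \
        | sed -E 's/^add_library\(([^ ]+) .*/\1/' \
        | sort -u
}

# Example (hypothetical path, adjust the build directory and triplet to yours):
#   list_imported_targets build/vcpkg_installed/x64-linux/share
```

Packages that build their target names from variables, or that only set old-style variables such as ITK_LIBRARIES, will not show up here, so treat the output as a starting point.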

At this point we have a vcpkg binary, we have a package index, we have a list of dependencies, and we have told CMake about those dependencies. The final step is to put it all together. This, in my opinion, is surprisingly straightforward. All we do is call CMake with one extra argument:

cmake -B build/ -S . -GNinja -DCMAKE_TOOLCHAIN_FILE=vcpkg/scripts/buildsystems/vcpkg.cmake

That’s it. During the configure step, CMake will call vcpkg for us, and vcpkg will go off and build all our dependencies. At the moment, in manifest mode, not much feedback is printed to the terminal during this step, but if you have large dependencies you can follow the progress by looking in the vcpkg-manifest-install.log in your build directory. Once vcpkg has finished, the CMake configure will finish, and you can build your project as normal. Every time you configure, vcpkg will check if you have changed anything in vcpkg.json or the port files have changed, and rebuild if necessary. If it doesn’t have to do anything, it proudly tells you how long the check took (usually microseconds if you believe vcpkg).

That’s pretty much it for the basics, but earlier I mentioned that having the package index in your repo was powerful, so I should explain why. Having the package index locally means you can edit it. In the extreme case, if vcpkg is missing a package you want, you can probably add it with no great fuss, but I confess I have not actually tried this yet. What I have done though, is tweak packages to my liking, in particular Eigen3 and ITK.

For Eigen3, the edit was a simple case of pinning a recent version. Eigen is a brilliant library, but it’s been a long time since the 3.3 branch was tagged, and I’ve started using features on the main branch. vcpkg does have an option to use the git HEAD of a repo, but when I last checked, it didn’t work in manifest mode. Also, in my experience you rarely want to work at the actual head of a library like Eigen, because things do occasionally break. Better to pick a known good, but recent, commit and pin that one. Again, in my opinion, this is really simple. All I needed to do was edit the following lines in portfile.cmake:

vcpkg_from_gitlab(
    GITLAB_URL https://gitlab.com
    OUT_SOURCE_PATH SOURCE_PATH
    REPO libeigen/eigen
    REF 5b9bfc892a39ad2e95ad7dc9131e777d8f38f587
    SHA512 41aab9da52f01bbefa64ee4420feefa66c08ed164f31178ba2803349da479bf02ccd1884663bfa4d6b6ccc866177760a4796514337ab2cc6f638494865081a93
    HEAD_REF master
    # PATCHES fix-cuda-error.patch # issue https://gitlab.com/libeigen/eigen/-/issues/1526
)

Note the lines REF and SHA512. These describe the particular Git commit id and the corresponding hash of the download. I grabbed the REF I wanted from the Eigen repo website and edited that line, then ran a configure. At this point vcpkg complained that the hashes didn’t match, but helpfully printed out the actual and expected hashes, so I could copy and paste the actual hash into the SHA512 line. Obviously if you are concerned about man-in-the-middle attacks or the trustworthiness of a remote repo, don’t do this.
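If you would rather check the hash yourself than copy it from the error message, one option is to hash a copy of the source archive that you obtained separately. A minimal sketch, assuming either sha512sum (typical on Linux) or shasum (typical on Mac) is installed; file_sha512 is a hypothetical helper name and the archive filename in the example is a placeholder.

```shell
# Print the SHA512 of a file as 128 lowercase hex characters, which is the
# form that appears on the SHA512 line of the portfile.
file_sha512() {
    if command -v sha512sum >/dev/null 2>&1; then
        sha512sum "$1" | cut -d ' ' -f 1
    else
        # Fallback for systems without sha512sum
        shasum -a 512 "$1" | cut -d ' ' -f 1
    fi
}

# Example (placeholder filename for an archive you downloaded yourself):
#   file_sha512 eigen-snapshot.tar.gz
```

This only helps if the archive you hash comes from a source you trust more than the one vcpkg downloaded from; hashing the same download twice proves nothing.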

The other edits I made are to the ITK library. Some of these are to do with image file formats and default build options for ITK, which are a bit too complicated to go into here. But the other one is much more straightforward. The ITK port has its own vcpkg.json, because ITK has dependencies (it’s a big library). Somehow, the ICU library, which is a very large Unicode library that takes a long time to compile, has snuck into them, so I deleted it. I really should open a pull-request to the vcpkg repo to fix this, when I get the time, but I hope it illustrates that once you get over the weirdness of having a full copy of the package index checked into a submodule of your own repo, it really is a powerful and beautiful thing.