The Lone C++ Coder's Blog


The continued diary of an experienced C++ programmer. Thoughts on C++ and other languages I play with, Emacs, functional, non-functional and sometimes non-functioning programming.

Timo Geusch


Every reasonably sized C++ project these days uses some third-party libraries. Some, like Boost, are viewed as extensions of the standard library that no sane developer would want to be without. Then there is whatever GUI toolkit your project uses, possibly another toolkit to deal with data access, the ACE libraries, and so on. You get the picture.

Somehow, these third-party libraries have to be integrated into your development and build process in such a way that they don’t become major stumbling blocks. I’ll discuss a few approaches I have encountered across the multitude of projects I’ve been part of, along with the advantages and problems of each.

All the developers download and install the libraries themselves

This is what I call the “good luck with that” approach. You’ll probably end up documenting on the internal wiki which versions of which library were used in which release, and as long as everybody can still download the appropriate versions, everything works. Kinda.

The problems start to rear their ugly heads when someone has to build an older version of your project and can’t find a copy of the right library version anymore, when someone forgets to update the CI server, or - my favourite - when the “bleeding edge” member of the team starts tracking the latest releases and randomly checks in “fixes” needed to build with newer versions of the library that nobody else is using. Oh, and someone else missed the standup and the discussion that everybody needs to update libgrmblfx to a newer, but not current, version and is now having a hard time figuring out why their build is broken.

Whichever way you look at it, this approach is an exercise in controlled chaos. It works most of the time, and you can usually get away with it in smaller and/or short-term projects, but you’re always teetering on the edge of the Abyss Of Massive Headaches.

What’s the problem? Just check third party libraries into your version control repository!

This is the tried and tested approach. It works well if you are using a centralized VCS/CM system that just checks out a copy of the source. Think CVS, Subversion, Perforce and the like. Most of these systems are able to handle binaries well in addition to “just” managing source code. You can easily check in pre-built versions of your third party libraries. Yes, the checkouts may be a little on the slow side when a library is updated but in most cases, that’s an event that occurs every few months. In a lot of teams I used to work in, the libraries would be updated in the main development branches after every release and then kept stable until the next release unless extremely important fixes required further updates. This model works well overall and generally keeps things reasonably stable, which is what you want for a productive team because you don’t want to fight your tools. Third party libraries are tools - never forget that.
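To make that concrete, here’s a minimal sketch of what vendoring a pre-built library into Subversion might look like. The repository layout, paths and version numbers are made up for illustration:

```
# Create a shared thirdparty area and import the pre-built library
# (--parents creates the intermediate directories).
svn mkdir --parents ^/thirdparty/boost/1.61.0 -m "Add area for boost 1.61.0"
svn import ./boost-1.61.0-prebuilt ^/thirdparty/boost/1.61.0 \
    -m "Import pre-built boost 1.61.0"

# Pin the project to that exact version with an svn:externals property
# (new-style syntax: URL first, local directory second).
svn propset svn:externals "^/thirdparty/boost/1.61.0 boost" trunk/libs
svn commit trunk/libs -m "Pin boost 1.61.0"
```

After that, a plain checkout of trunk pulls the matching binaries along with the source.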

The big downside to this approach shows up when you are using a DVCS like git or Mercurial. Both will happily ingest large “source” trees containing pre-built third-party libraries, but these things can be pretty big even when compressed. A pre-built current Boost takes up several gigabytes of disk space depending on your build configurations and whether you’re building 32-bit and 64-bit versions at the same time. Assuming a fairly agile release frequency, you’re not going to skip many releases, so you’ll be adding those several gigabytes to the repository every six months or so. Over the course of a few years, you will end up with a repository so large that cloning it takes your local developers half an hour to an hour. Your remote developers will either have to mirror the repository - which has its own set of challenges if it has to be a two-way mirror - or find themselves resorting to overnight clones and hoping nothing breaks partway through. Yes, there are workarounds like Mercurial’s Largefiles extension and git-annex, and they’re certainly workable if you plan for them from the beginning.
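For reference, the Largefiles route is just a bit of configuration plus a different add command; the archive name below is made up:

```
# In .hg/hgrc or ~/.hgrc - largefiles ships with Mercurial:
#   [extensions]
#   largefiles =

# Track the pre-built archive as a largefile: only a small stand-in
# with a hash lands in history, the blob is fetched on demand.
hg add --large thirdparty/boost-1.61.0-msvc-x64.7z
hg commit -m "Add pre-built boost 1.61.0 as a largefile"
```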

The one big upside of this approach is that it is extremely easy to reproduce the exact combination of source code and third-party libraries that went into each and every release, provided an appropriate release branching or tagging strategy is used. You also don’t need to maintain multiple repositories of different types like you do in the approach I’ll discuss next.
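With the libraries vendored, reproducing a release boils down to checking out the right tag. A quick Subversion sketch, with made-up URLs and version numbers:

```
# Tag the release; the tag captures source and vendored binaries alike.
svn copy ^/trunk ^/tags/release-2.4.0 -m "Tag release 2.4.0"

# Months or years later, rebuild exactly what shipped:
svn checkout https://svn.example.com/repo/tags/release-2.4.0 release-2.4.0
```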

Handle third party libraries using a separate package management tool

I admit I’m hugely biased towards this approach when working with a team that is using a DVCS. It keeps the large binaries out of the source code repository and puts them into a repository managed by a tool designed for the express purpose of managing binary packages. Typical package management tools are NuGet, Ivy and the like; what they all have in common is that they use a repository format optimized for storing large binary packages, usually compressed. They also make it easy to pull a specific version of a package out of the repository and put it into an appropriate place in your source tree, or anywhere else on your hard drive.

Instead of containing the whole third-party library, your source control system contains a configuration file or two specifying which versions of which third-party libraries are needed to build a given version of your project. You obviously need to hook these tools into your build process to ensure that the correct third-party libraries get pulled in at build time.
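As a sketch, here is roughly what such a configuration file looks like in NuGet’s packages.config format; the package ids and versions are purely illustrative:

```
<?xml version="1.0" encoding="utf-8"?>
<!-- Checked into source control; the binaries themselves live in the
     package repository, not in the VCS. -->
<packages>
  <package id="boost" version="1.61.0" targetFramework="native" />
  <package id="libgrmblfx" version="1.2.0" targetFramework="native" />
</packages>
```

A restore step such as nuget restore early in the build then pulls exactly these versions out of the package repository before compilation starts.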

The downside of these tools is that you get to maintain and back up yet another repository, one that needs to be treated as having an immutable history much like a regular VCS/DVCS. This requires additional discipline to ensure nobody touches a package once it has become part of the overall build process - if you need to put in a patch, the correct way is to rev the package so you are able to reproduce the correct state of the source tree and its third-party libraries at any given time.
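A sketch of what revving a package might look like with NuGet - the package name (borrowing libgrmblfx from earlier) and the feed URL are made up:

```
# Never overwrite 1.2.0 in the package repository; publish a new
# revision instead.
nuget pack libgrmblfx.nuspec -Version 1.2.1
nuget push libgrmblfx.1.2.1.nupkg -Source https://packages.example.com/internal

# Then pin 1.2.1 in packages.config and commit that change alongside
# the source that needs the patched build.
```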

TL;DR - how should I manage my third party libraries?

If you’re using a centralized version control system, checking the binaries into the VCS is fine. Yes, the admin might yell at you for taking up precious space, but it’s easy, reasonably fast, and it simplifies release management.

If you are using a DVCS, use a separate package management tool, either a third-party one or one you roll yourself. Just make sure you keep the third-party libraries in your own internal repository so you’re not at somebody else’s mercy should they suddenly decide to delete one of the libraries you’re using.
