homcc - A traffic-efficient distributed C/C++ compiler


Large C++ projects usually take a long time to compile locally on developer machines, which reduces engineering productivity. For very large code bases, like our internal C++ database, a clean compile can take well over 10 minutes. Distributed compilers come to the rescue: they offload the compilation processes to an arbitrary number of remote machines so that more files can be compiled simultaneously using the additional resources. Several distributed compilers are available for C++, for example distcc, icecc, or goma.

Except for goma, all of these publicly available distributed compilers share one issue: they take no measures to reduce the required upload bandwidth. When the COVID-19 pandemic started and working from home became common practice, this limitation became critical: at home, engineers typically have only a few Mbit/s of upload bandwidth, in contrast to the Gigabit Ethernet link in the office. This poses a bottleneck for distributed compilation and makes working from home less efficient. The one distributed compiler that is able to cache dependencies, goma, is a rather large and complex project that is not open-sourced in its entirety (another version of goma is kept internal) and is hard to tailor to specific needs due to its complexity.

This is why we developed our own distributed compiler, called homcc (pronounced həʊm siː siː), from scratch. In contrast to other available distributed compilers, homcc explicitly caches dependencies (such as header or source files) on the server so that they do not have to be sent from the client to the server repeatedly. Therefore, once the cache is warmed up, almost no upload bandwidth is required.

Comparing homcc with distcc shows that this leads to significantly reduced compilation times in environments where there is less than a Gigabit Ethernet connection available:

[Figure: compilation times of homcc vs. distcc at different upload bandwidth limits]

The benchmarks were performed on a code base resulting in over 1000 compilation units. For building, clang++-14, CMake, and Ninja were utilized with 60 concurrent compilation jobs. wondershaper was used to limit the bandwidth during the benchmarks, simulating a working-from-home scenario. One can see that distcc fails to offload the compilation jobs in a timely manner when the available upload bandwidth drops below 10 Mbit/s. In these scenarios, it is effectively slower than a local build, which takes about 25 minutes on our developer machines. The homcc compilation times, on the other hand, stay roughly constant regardless of the available bandwidth.

To further reduce the bandwidth needed to offload compilations, homcc currently offers two compression algorithms (LZMA, LZO). But how does offloading C++ compilations even work, and how exactly does homcc minimize the traffic?
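To give a rough feel for why compression pays off here, the following sketch compresses a made-up stand-in for a preprocessed C++ source file using Python's standard-library LZMA support (LZO, homcc's second option, would need a third-party binding; the payload and sizes are illustrative, not homcc internals):

```python
import lzma

# Stand-in payload: preprocessed C++ sources are large and highly repetitive,
# so general-purpose compressors shrink them substantially.
payload = ("#include <vector>\nint add(int a, int b) { return a + b; }\n" * 200).encode()

# LZMA ships with the Python standard library; LZO would require an
# external binding such as python-lzo.
compressed = lzma.compress(payload)

print(f"{len(payload)} bytes raw -> {len(compressed)} bytes compressed")
```

On repetitive input like this, the compressed payload is a small fraction of the original, which directly translates into less upload traffic per offloaded compilation.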

How does homcc work?

In C++, every compilation unit (a source file, .cpp) can be compiled independently. This is perfect for offloading, as a distributed compiler can distribute each of the project's compilation units to the servers and compile them there independently. As an example, say you want to compile a single source file of your project (foo.cpp), whose dependencies lie in the include folder, with the clang compiler:

clang++ -Iinclude example/src/foo.cpp

Offloading the compilation with homcc is as simple as executing:

homcc clang++ -Iinclude example/src/foo.cpp

As you can see, homcc essentially acts as a wrapper around the C/C++ compiler. In this case, your local homcc client runs the preprocessor for the foo.cpp source file and sends the hashes of every file needed for its compilation to the homcc server. The dependencies of a source file are a) the source file itself and b) its (transitive) includes. The homcc server then checks the sent hashes against its cache to see whether everything is already available on the server, and subsequently requests only the missing files from the client, thus avoiding repeated transmission of files from the client to the server.

After all dependencies have been gathered on the server, the compilation starts and the server sends the result, an object file, back to the homcc client. When compiling a large project with homcc, dozens of these offloadings run in parallel in the background. Once all compilation units of a project have been compiled, the linking step happens locally.

While the whole process of compilation offloading may sound simple, compiling on a remote machine is not trivial, and multiple edge cases have to be considered. For example, one has to mimic the client's environment (e.g. file paths) on the server as closely as possible. Furthermore, specific macros such as __FILE__ (which expands to the path of the current source file) have to be translated on the server, and the resulting object file sent back to the client must contain only paths that are valid on the client side. These are not the only pitfalls; it took us a while to get homcc to produce an (almost) identical compilation outcome to a local build.
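The path-mapping problem can be illustrated with a minimal round-trip sketch. The directory names and helper functions below are hypothetical and only model the idea: client paths are rewritten into a server-side sandbox before compiling, and every path that ends up in the output must be mapped back losslessly.

```python
# Hypothetical locations; homcc's real sandbox layout may differ.
CLIENT_ROOT = "/home/alice/project"        # assumed client working directory
SERVER_ROOT = "/var/cache/homcc/sandbox1"  # assumed per-job sandbox on the server

def to_server(path: str) -> str:
    # Rewrite a client path for use inside the server sandbox.
    return path.replace(CLIENT_ROOT, SERVER_ROOT, 1)

def to_client(path: str) -> str:
    # Inverse mapping, applied to paths embedded in diagnostics or debug info.
    return path.replace(SERVER_ROOT, CLIENT_ROOT, 1)

p = "/home/alice/project/example/src/foo.cpp"
assert to_client(to_server(p)) == p  # the round trip must be lossless
print(to_server(p))
```

Compilers also expose related knobs for this class of problem, e.g. GCC and Clang's -ffile-prefix-map option, which remaps path prefixes embedded in the compilation output.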

Blog contributors include:

  • Celonis