C++ builds often take a long time, especially when working with large codebases.
Because builds are highly parallel, the standard way to speed them up is to use a CPU with a higher core count, which reduces the overall build time.
At first glance, distributed grid systems do the same thing: they let you use the processing power of other machines, simulating a high core count machine with many lower core count machines. On closer inspection, however, there are nuances and new possibilities for optimizing your builds further when you use a distributed grid to build your projects.
First, a disclaimer: each codebase differs in size, dependency architecture, and build system performance and logic, so there is no “one solution fits all” trick. There are, however, some guidelines that, in our experience with many customers’ builds, usually yield faster builds.
It should be noted that we are talking just about the build portion here, and not code fetching or publishing / deploying, which deserve a detailed analysis of their own.
In this post we will look at how your choice of build hardware can affect your build times.
Builds, just like other compute-intensive workloads, can have several bottlenecks. The way to accelerate a given workload is to always improve the slowest resource first. Once that is done, the next bottleneck can be addressed. This process is iterative and should be repeated until you are using the best available resources and further tweaks yield only marginal gains.
CPU Core count:
As mentioned, a high core count is needed to run many compilations and other tasks in parallel. The number of cores you aim for should not exceed the parallelization capacity of your build: if your project is only capable of running 200 compilations (and other tasks) in parallel, there would usually not be much benefit in having 250 cores in the grid, unless you are running many builds on the grid simultaneously.
This is not to say you must have a grid with as many cores as the maximum number of parallel tasks your build can produce. To determine where diminishing returns kick in, some experimentation and test builds are required for your codebase. A good rule of thumb is to have about 10% fewer cores than the maximum parallelization your build can sustain over a substantial portion of the build.
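As a rough illustration of this rule of thumb, the Python sketch below takes parallel task counts sampled at regular intervals during a build and suggests a core count about 10% below the level the build sustains. The function names, the sample data, and the reading of "a substantial portion" as "at least half of the samples" are assumptions made for this example, not output or settings from any real build system.

```python
import math


def sustained_parallelism(samples, fraction=0.5):
    """Return the highest task count that is met or exceeded in at
    least `fraction` of the samples (assumed definition of 'sustained')."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ordered = sorted(samples, reverse=True)
    # The k-th largest sample is met or exceeded by at least k samples.
    k = max(1, math.ceil(fraction * len(ordered)))
    return ordered[k - 1]


def suggest_core_count(samples, fraction=0.5, margin_percent=10):
    """Suggest a grid core count `margin_percent` below the sustained level."""
    level = sustained_parallelism(samples, fraction)
    # Integer arithmetic avoids floating-point rounding surprises.
    return max(1, level * (100 - margin_percent) // 100)


# Hypothetical samples: runnable tasks observed at equal intervals in one build.
samples = [20, 200, 210, 200, 190, 205, 40, 8]
print("sustained parallelism:", sustained_parallelism(samples))
print("suggested core count:", suggest_core_count(samples))
```

Treat the result only as a starting point for the test builds described above. The point where diminishing returns begin still has to be measured on your own codebase.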
CPU core speed:
This important factor is commonly overlooked in both developer machines and CI systems, or sacrificed for the benefits of high core count CPUs.
Since grid solutions solve the “number of cores” problem, you can focus your CPU choice on a higher-clocked, newer-generation CPU, which will accelerate the parts of your build that are inherently serialized, such as links or final packaging.
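To see why core speed matters once the core count is covered, here is a simplified Amdahl-style estimate in Python. It assumes a build splits cleanly into serial work (for example links and packaging) and perfectly parallel work, and it ignores IO and network effects. The function name and all figures are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def estimated_build_seconds(serial_work, parallel_work, cores, core_speed=1.0):
    """Estimate build time under a simplified model.

    serial_work and parallel_work are seconds of work on one baseline
    core (core_speed == 1.0). Parallel work is assumed to scale
    perfectly across cores, which real builds rarely achieve.
    """
    if cores < 1 or core_speed <= 0:
        raise ValueError("cores must be >= 1 and core_speed must be > 0")
    serial_time = serial_work / core_speed
    parallel_time = parallel_work / (cores * core_speed)
    return serial_time + parallel_time


# Hypothetical build: 240 s of serial work, 24,000 core-seconds of parallel work.
baseline = estimated_build_seconds(240, 24_000, cores=200)
more_cores = estimated_build_seconds(240, 24_000, cores=250)
faster_cores = estimated_build_seconds(240, 24_000, cores=200, core_speed=1.25)

print(f"200 cores, baseline speed: {baseline:.0f} s")
print(f"250 cores, baseline speed: {more_cores:.0f} s")
print(f"200 cores, 25% faster:     {faster_cores:.0f} s")
```

In this made-up scenario, 25% faster cores shorten the build more than 25% more cores do, because only the faster cores reduce the serial portion. Real results depend on how much of your build is actually serialized.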
RAM:
The amount of RAM is rather simple: your machine should have at least as much RAM as the peak usage of your build, to avoid swapping, which can be very costly.
For CI machines that build similar codebases time after time, more RAM can be beneficial, since the “unused” portion of the RAM will be used by the OS (both Linux and Windows) to cache files in memory, reducing IO for subsequent builds that access the same files.
Disk:
For many builds, simply using an SSD will improve build performance compared to building your codebase on an HDD.
For highly parallel distributed builds, we commonly see IO become the bottleneck once the CPU core bottleneck has been removed. If you are working on a cloud instance, make sure to compare the IOPS your disk is allowed, and on servers, opt for NVMe disks.
It should be noted that the speeds of the disks where your system temp, source, intermediate and output folders reside all play a role, but usually to different degrees.
For some projects the difference can be quite substantial, while others might see little difference.
Network:
For regular standalone builds, the network of a machine is not a factor. This changes when using a distributed grid, since data needs to be sent to remote machines and downloaded from those machines after processing. It is recommended to use the best network available, preferably a 1 Gbps connection or higher, with as low a latency as possible.
Some more niche scenarios where the network bandwidth is severely limited, for instance working from home when the helper machines are in the cloud (or the office), require some optimization, such as disabling precompiled headers (which can get quite large in binary form) to reduce the amount of data sent through the limited network.
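A back-of-the-envelope calculation shows why large files hurt on a limited link. The Python sketch below estimates raw transfer time from payload size and bandwidth. It ignores latency, compression, and protocol overhead, and the 300 MB precompiled header, the 50 Mbps uplink, and the `copies` parameter (how many helpers receive the file) are hypothetical values for illustration only.

```python
def transfer_seconds(payload_megabytes, link_megabits_per_second, copies=1):
    """Rough time to push `copies` of a payload through a link.

    Ignores latency, compression and protocol overhead, so real
    transfers will differ.
    """
    if link_megabits_per_second <= 0 or copies < 1:
        raise ValueError("link speed must be > 0 and copies must be >= 1")
    megabits = payload_megabytes * 8 * copies  # 1 byte = 8 bits
    return megabits / link_megabits_per_second


# Hypothetical 300 MB precompiled header sent to a single helper.
home_uplink = transfer_seconds(300, 50)    # assumed 50 Mbps home uplink
office_lan = transfer_seconds(300, 1000)   # 1 Gbps connection

print(f"50 Mbps uplink: {home_uplink:.1f} s")
print(f"1 Gbps LAN:     {office_lan:.1f} s")
```

Under these assumptions the same file takes 20 times longer over the home uplink, which is the kind of gap that can make it worth reducing the data sent.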
All of the factors above apply first and foremost to the initiator machine (the machine that starts the build, such as a developer machine or a CI agent), since it is usually the limiting factor in a distributed build.
RAM and CPU requirements apply to helper machines that lend their resources as well. However, changes to these will usually not have as dramatic an effect on the overall build time as changes to the initiator.
If you have free choice of hardware, it’s always better to change just one factor at a time and see how it affects your results, rather than changing whole configurations, to avoid confusion and eventual overspending.
If you cannot obtain testing hardware, you can always do some of the testing in the cloud, where changing your configuration is as easy as selecting a new instance type or a different kind of disk.
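One way to keep such tests disciplined is to generate the list of configurations up front, so that each one differs from your baseline in exactly one factor. The Python sketch below does this. The factor names and values (`cores`, `ram_gb`, `disk`) are placeholders for illustration, not recommendations.

```python
def one_factor_variants(baseline, alternatives):
    """Return (changed_factor, config) pairs, where each config differs
    from `baseline` in exactly one factor."""
    variants = []
    for factor, values in alternatives.items():
        if factor not in baseline:
            raise KeyError(f"unknown factor: {factor}")
        for value in values:
            if value == baseline[factor]:
                continue  # identical to the baseline, nothing new to test
            config = dict(baseline)  # copy so the baseline stays untouched
            config[factor] = value
            variants.append((factor, config))
    return variants


# Hypothetical baseline machine and the alternatives to try.
baseline = {"cores": 16, "ram_gb": 64, "disk": "sata_ssd"}
alternatives = {"ram_gb": [64, 128], "disk": ["nvme"]}

for factor, config in one_factor_variants(baseline, alternatives):
    print(f"changed {factor}: {config}")
```

Running the same build on the baseline and on each variant then tells you which single change is worth paying for.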
While there are many variables in selecting the right hardware for a particular project, and no single truth for all cases, I hope this gives you a good starting point on how to look at hardware when aiming for the fastest builds possible.
Happy building, and I would love to hear your feedback!