Accelerating software configuration space study through incremental build


The transition from software code to a working, runnable software artefact goes through a build, which makes the build a crucial step in software development. Different types of builds are available, such as incremental builds, which intrigue researchers from the DiverSE team at the IRISA research laboratory (CNRS/Univ. Rennes 1). Their latest work explains how incremental builds can offer significant speed-ups and resource savings, and has been selected for the prestigious ICSE 2022 conference.

Most software exists in multiple versions at a time, combining many elements and configuration options. The build assembles these parts and modules of code, then compiles, verifies and tests them against different possible scenarios. "Build" refers to both the act of transforming source code into a working, runnable software artefact and the result of this action.

Builds thus play an essential role in software development, and are an important topic of the International Conference on Software Engineering (ICSE), considered the world's leading software engineering conference since 1975. The 44th edition was held throughout the month of May, by videoconference as well as in person in Pittsburgh, Pennsylvania. The only French publication selected this year, in the main conference program (technical track), is the result of work by the DiverSE team from IRISA.

This publication focuses on the build of configurable software, i.e. software that can be adapted to the user's needs or to the particularities of the machine on which it will be deployed. The various possibilities lead to a potentially colossal number of configurations. However, all of them, or at least a large representative sample, must be taken into account to ensure that future software evolutions are consistent. Hence, the more configurations there are, the more expensive it is to test a change on them.

"An operating system such as Linux offers a multitude of options," says Djamel Eddine Khelladi, a CNRS researcher at IRISA. "If we consider that Linux offers, say, ten thousand options, then there would be 2^10,000 different possible versions: a number that includes more than 3000 digits!"

With an average build time of ten minutes per configuration, testing "only" 50,000 configurations would take about a year.
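
The two orders of magnitude quoted above can be checked with a few lines of arithmetic. This is only a sketch: it assumes that each of the ten thousand options is an independent on/off switch and that builds run strictly one after another, neither of which the article states explicitly.

```python
import math

# Assumption: every option is an independent on/off switch, so the
# configuration space holds 2**10_000 elements.
num_options = 10_000
digits = math.floor(num_options * math.log10(2)) + 1

# Assumption: clean builds run sequentially, ten minutes each.
configs_to_test = 50_000
minutes_per_build = 10
total_days = configs_to_test * minutes_per_build / (60 * 24)

print(f"2**{num_options} has {digits} digits")
print(f"{configs_to_test} sequential builds take about {total_days:.0f} days")
```

Under these assumptions the count has 3011 digits, and the 50,000 builds take roughly 347 days, which matches the "about a year" estimate.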

To deal with these problems, several types of builds exist. The clean build is the classic approach, which starts from scratch for each configuration. The incremental build reuses the results of previous builds and rebuilds only the files that change from one configuration to the next. In theory, this is faster, especially if the configurations are built in an optimal order where each build helps the next one as much as possible.
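
To make the idea of a favourable build order concrete, here is a minimal sketch. It is not the method used by the DiverSE team. It rests on an illustrative assumption: that the cost of an incremental build grows with the number of options that differ from the previously built configuration. The option names (`NET`, `USB`, `DEBUG`) and both functions are hypothetical.

```python
from typing import Dict, List

Config = Dict[str, bool]


def option_distance(a: Config, b: Config) -> int:
    """Count the options whose values differ between two configurations."""
    keys = set(a) | set(b)
    return sum(1 for k in keys if a.get(k, False) != b.get(k, False))


def greedy_build_order(configs: List[Config]) -> List[int]:
    """Greedy nearest-neighbour order: always build next the configuration
    closest to the one just built (ties broken by lowest index)."""
    if not configs:
        return []
    order = [0]  # assumption: the first configuration gets the clean build
    remaining = set(range(1, len(configs)))
    while remaining:
        last = configs[order[-1]]
        nxt = min(remaining, key=lambda i: (option_distance(last, configs[i]), i))
        order.append(nxt)
        remaining.remove(nxt)
    return order


def total_distance(configs: List[Config], order: List[int]) -> int:
    """Sum of option differences between consecutive builds in an order."""
    return sum(
        option_distance(configs[a], configs[b]) for a, b in zip(order, order[1:])
    )


# Hypothetical configurations, for illustration only.
configs = [
    {"NET": True, "USB": True, "DEBUG": False},
    {"NET": False, "USB": False, "DEBUG": True},
    {"NET": True, "USB": True, "DEBUG": True},
    {"NET": False, "USB": True, "DEBUG": True},
]

order = greedy_build_order(configs)
print(order, total_distance(configs, order))
print(total_distance(configs, list(range(len(configs)))))
```

On this toy input, the greedy order changes three options in total between consecutive builds, against six for the original order. A greedy heuristic like this one is not guaranteed to find the optimal order, and real build costs depend on which files each option touches rather than on a simple count of differing options.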

However, incremental builds cannot always be used with configurable software, and they had not yet been shown to be more efficient than clean builds. The researchers from the DiverSE team at IRISA therefore hypothesized that, when they can be used, incremental builds would offer significant time and resource savings.

"In our article, we were able to confirm a large part of our initial hypotheses," says Djamel Eddine Khelladi. "We were able to use incremental builds in 78 to 100% of the use cases studied, and they sped up the build time for 88% of the configurations. The maximum speed gain was 11.76%. Based on these results, we also proved that it is possible to establish, in advance, an order of incremental builds that improves their performance even more."

We have yet to confirm this, but we believe that the rules could be better written to facilitate the implementation of incremental builds.

The researchers also confirmed that the main reason incremental builds cannot always be used is conflicts and inconsistencies between the build rules. These can cause an incremental build to crash and stop the build process. With all these elements in mind, the DiverSE team at IRISA sees many directions in which to pursue this work.

"This publication was built around a doctoral student who is only in his first year, but with whom we have already obtained promising results," emphasises Djamel Eddine Khelladi. "First, we will try to replicate these results on more build systems and programming languages. We also want to check whether incremental builds reduce energy consumption, and to better understand the cases where they do not work. Finally, we hope to find heuristics to automatically infer the optimal scheduling and ordering of incremental builds for a given number of variants to build."




Djamel Eddine Khelladi
CNRS researcher at IRISA