Automating the Build

Computers help us automate repetitive actions. As a developer, one of the most repetitive actions is to compile or build our code. Therefore automating the build process of our code is one of the first steps we should take. Nowadays most build processes are at least somewhat automated, but it's important to understand the nuances of the process to ensure that our automation works in our favor.

Throughout this chapter, keep in mind that automating the build has two purposes: make it easy for your peer developers to build the component, and have the build invoked by your CI (Continuous Integration) system. In a later chapter we go into detail about Continuous Integration, but for now, we will only focus on the building part.

In larger projects, developers will only execute some of the tasks locally, while the CI system will execute all of them. In smaller projects, especially projects that have only a single developer, all of the tasks will be executed by the developer from their local machine.

Why is This Important?

In the modern world, the productivity of developers is paramount. As Software Developers, our time as well as the time of our teammates is the most valuable resource. Manually building our codebase over and over again provides no enduring value. Activities that must be done manually over and over again and provide no enduring value are sometimes referred to as "toil". Removing toil respects the time of our fellow developers. Improving developer productivity has always been relevant, but it has gained steam lately as part of the nascent discipline of "Developer Productivity Engineering" or "Developer Experience Engineering".

In their day to day, developers will run the build process multiple times, to verify the changes that they are incrementally making. Automating this process is key not only for developer productivity but also for developer satisfaction.

Build Tools

To automate the build, we always leverage a tool, from a simple bash script to a complex build tool with its own ecosystem. In this chapter, we cover build automation tools, which are distinct from the CI tools we cover in Chapter 14.

Normally, we need our build tool to:

  • Fetch and manage dependencies
  • Build the code
  • Package the code
  • Run tests
  • Generate reports

As we can see from this short list, our "build tool" needs to do a lot more than just build the code.

There are a lot of different build tools to choose from. A big factor in choosing your build tool is the language your component is written in. In fact, some languages don't require the code to be "built" at all; instead, we can leverage tools that manage our dependencies and orchestrate other operations of the software development lifecycle.

For example, in Python pip is traditionally used to install dependencies. In Node npm is used to manage dependencies as well as orchestrate other actions. While JavaScript code normally doesn't have to be compiled, TypeScript is normally compiled into JavaScript. If we're using TypeScript, npm can be configured to compile the code.
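As a minimal sketch (the project name and versions are illustrative), npm can be configured to compile TypeScript by declaring the compiler as a development dependency and exposing a build script:

{
  "name": "example-service",
  "version": "1.0.0",
  "scripts": {
    "build": "tsc"
  },
  "devDependencies": {
    "typescript": "^5.0.0"
  }
}

With this in place, running npm run build invokes the TypeScript compiler, typically using the settings defined in the project's tsconfig.json.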

For compiled languages (C, C#, Java, Go, etc.), you have a lot more choices, since the build is a more involved process that many organizations and individuals have tried to improve in one way or another.

Some of the choices are strongly linked to a particular ecosystem. For example, both Apache Maven and Gradle are tightly coupled with the Java ecosystem. These tools can be extended to handle languages outside of their main ecosystem, but it's not as well supported.

Some tools are designed independently from any ecosystem and can support multiple languages. Arguably the most widespread general-purpose build utility is [make](https://www.gnu.org/software/make/), primarily due to its early inclusion in Unix. Newer tools are also available, such as [Bazel](https://bazel.build/). Bazel is based on the internal Google build tool called Blaze and is a popular option when implementing monorepos.

Generally, we want to select the tool that will allow us to build our artifacts most simply. In parallel, we also want to select the tool that will allow us to run the unit tests for our software most simply.

Tools like Maven or Gradle will allow us to do both: build the code and run unit tests. If we're using npm, it can run our tests, and it can also compile our code if we're using TypeScript. In the Python ecosystem, there is generally no single build tool; instead, projects rely on virtualenv to manage virtual environments and pip to manage dependencies. In these cases, tests are normally executed directly from the command line using the python executable.
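For illustration, these are the kinds of single commands we want each ecosystem to expose (assuming standard project layouts, and tests written with the standard unittest module in the Python case):

# Gradle (via the wrapper): compile the code and run the unit tests
./gradlew build

# Maven: compile the code and run the unit tests
mvn test

# npm: run the "test" script defined in package.json
npm test

# Python: discover and run unit tests with the standard library runner
python -m unittest discover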

During the execution of a build, different types of artifacts will be generated:

  • Binaries
  • Container images
  • Deployment descriptors
  • Libraries
  • Documentation
  • Testing results
  • Code quality results

Decision Point

  • What artifacts do we need to build?
  • What tool are we going to use to build the artifacts?

Build Process

Setting up the whole build process can be broken into seven steps, which we'll review in detail in subsequent sections:

  • Component metadata
  • Component version
  • Toolchain setup
  • Dependencies
  • Code compilation
  • Tests
  • Code quality

Component Metadata

All software (beyond trivial scripts) should have some kind of metadata that identifies it. The minimum information is normally:

  • Component name or id: The name of the component, which doesn't have to be globally unique, but should be unique within the group it belongs to.
  • Group name or id: The unique base name of the company or team that created the project.
  • Version: Version of the component. More information on how to select a version can be found in the component version section.

The metadata can also include other useful information such as the description, links to documentation, the authors of the component, etc.

This data is normally stored in a specially named file, depending on the build tool that will be used.

  • In Java using Gradle: [build.gradle or build.gradle.kts](https://docs.gradle.org/current/userguide/tutorial_using_tasks.html)
  • In Java using Maven: pom.xml
  • In Node using npm: package.json
  • In Python: pyproject.toml

Most components will also specify any needed dependencies, either in the same file that holds the metadata or in a separate file (for example requirements.txt in Python). In this same file, we can normally specify any values that must be overridden to alter the default behavior of the build tool. For example, if our project has a non-standard structure and the source code is located in an unexpected directory, we must ensure the toolchain is aware of this setup. The metadata is used by the build system to properly configure the toolchain, execute the build, and create the relevant artifacts.
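As a sketch, here is what this metadata could look like in a Maven pom.xml (the group, name, version, description, and dependency below are all illustrative):

<project>
  <modelVersion>4.0.0</modelVersion>

  <!-- group name or id -->
  <groupId>com.example.payments</groupId>
  <!-- component name or id -->
  <artifactId>payment-service</artifactId>
  <!-- component version -->
  <version>1.2.0</version>

  <description>Handles payment processing for the example store</description>

  <dependencies>
    <dependency>
      <groupId>com.fasterxml.jackson.core</groupId>
      <artifactId>jackson-databind</artifactId>
      <version>2.15.2</version>
    </dependency>
  </dependencies>
</project>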

Most of the metadata will be defined manually and persisted in a file. However, in some cases, some of the metadata will be determined dynamically during the build process. For example, the build process may want to record the current time and date, or the git branch and revision that is being built.

Decision Point

  • What is the component name or id?
  • What is the group name or id?
  • What is the version of our component?
  • Do we need to override any defaults for our build tool?
  • Is there any other useful metadata that should be added?

Component Version

Software versioning is important because it helps both users and developers track the different releases of the software. Users rely on software developers to keep software components up to date, and they expect a methodical way of understanding when and what updates are released. Developers need a methodical way to identify what release (or releases) of a software component have a particular feature or bug.

Therefore we must use a sensible versioning scheme. There are two main versioning schemes that we will be exploring in detail:

  • Semantic Versioning: A set of rules for how version numbers should be assigned and incremented. More details in the Semantic Versioning section below.
  • Calendar Versioning: A set of practices to create version numbers based on the release calendar of your software component. More details in the Calendar Versioning section below.

To choose the right versioning scheme for your application, it's vital to understand how your users consume your application or component. Are your users more interested in knowing when they should upgrade the application, or are they more interested in what changed? The frequency and consistency of your releases will also help you determine if a semantic or calendar versioning scheme is better for your team and your users.

As always keep context in mind. This might already be defined in your organization.

Decision Point

  • What versioning scheme will our component use?

Semantic Versioning

Semantic Versioning (also referred to as SemVer) provides a simple set of rules and requirements that dictate how version numbers are assigned and incremented. Semantic Versioning assigns a particular meaning to each segment of the version string. In Semantic Versioning, versions are specified in the following format: MAJOR.MINOR.PATCH.

Each one of those components has a particular meaning:

  • MAJOR: The major version must be increased when incompatible API changes are introduced.
  • MINOR: The minor version must be increased when we add functionality in a backward-compatible manner.
  • PATCH: The patch version must be increased when we release backward-compatible bug fixes.
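As an illustration (the starting version is arbitrary), here is how a component at version 2.3.1 would be incremented:

2.3.1 → 2.3.2   backward-compatible bug fix (PATCH increment)
2.3.1 → 2.4.0   new backward-compatible functionality (MINOR increment, PATCH resets to 0)
2.3.1 → 3.0.0   incompatible API change (MAJOR increment, MINOR and PATCH reset to 0)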

Given the prevalence of semantic versioning in the software world, if a version string looks like SemVer, make sure it behaves like SemVer. We must keep in mind that versioning can also be a marketing concern. For example, version 2.0.0 sounds more exciting than 1.1.0. Therefore there might be external forces driving the versioning.

Software development is never really done. There's always something else we want to add or improve. Many developers are therefore hesitant to release a 1.0.0 version. However, if your software is being used in production, if users depend on your API, or if you're already worrying about backward compatibility, your software should already be at least at version 1.0.0.

More information can be found on the Semantic Versioning website: https://semver.org/.

Calendar Versioning

Calendar Versioning (also referred to as CalVer) is a scheme based on the release calendar of your software, instead of arbitrary numbers. There are multiple calendar versioning schemes. CalVer does not provide a single scheme, but rather a set of practices that can be leveraged to fit the requirements of different users and organizations.

CalVer makes it easier for users to understand at a glance how up-to-date their software component is. The scheme is especially useful for applications that require manual updates. The information contained in the version can make it easier for users to gauge how important it is for them to update the application. CalVer makes it easy for a user to understand how many months (or years) behind the latest release they are.

Central to CalVer is a set of fields that can be used:

  • MAJOR: The first number in the version; in CalVer it is the segment most commonly derived from the calendar (for example, the year).
  • MINOR: The second number in the version.
  • MICRO: The third and usually final number. Sometimes referred to as the "patch" segment.
  • MODIFIER: An optional text tag, such as "dev", "alpha", "beta", "rc1", and so on.

There are many examples of popular software packages that leverage CalVer:

  • Ubuntu: 22.04 LTS
  • youtube-dl: 2021.05.16
  • IntelliJ IDEA: 2023.1.1
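To illustrate how such versions are read (the dates come directly from the version strings):

Ubuntu 22.04            →  released in April 2022 (a YY.0M scheme)
youtube-dl 2021.05.16   →  released on May 16, 2021 (a YYYY.0M.0D scheme)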

More information can be found on the Calendar Versioning website: https://calver.org/.

Toolchain Setup

Part of setting up the build is ensuring that the right tools are available. This is often called the "toolchain". The toolchain is a set of programming tools that are used to perform the set of tasks required to create a software product. The chain part comes in because in most cases the output of a tool is used as the input of the next tool in the chain.

The toolchain, in the broadest sense, is composed of multiple components. For example:

  • The build tool itself
  • Preprocessors
  • Compilers and linkers
  • Required runtimes

As part of our role as senior software developers, we must ensure all of the developers have access to the toolchain, easily and consistently.

There are many options to achieve this. The right option will depend on the individual circumstances of the development team. For example, making a toolchain available to a team that is part of a large organization with a well-established I.T. support department could be very different from making the toolchain available to a team at a fledgling startup where developers are using their own personal laptops for work.

Some of the options to distribute the toolchain include:

  • Have developers install things manually on their computers.
  • Have the I.T. support department preinstall the tools on the computers that are distributed to developers.
  • Use a wrapper to download the toolchain.
  • Use containers to make the toolchain available, in particular via dev containers.

For the first option, where developers install their toolchain manually, good documentation is critical to achieve any kind of success. It's a workable solution when other options are not available, but is very manual and error-prone.

With the second option, we can leverage the work that the I.T. support department has already done to manage the software installed on organization-owned devices. Most large I.T. support departments have sophisticated tooling that makes installing the toolchain possible, but working with the I.T. department might add an extra layer of complexity and reduce flexibility. Due to corporate policies, this might be the only way of installing any piece of software (including our toolchain) onto organization-owned devices.

The third and fourth options are explored in more detail in the following sections.

Decision Point

  • How will the required toolchain be distributed to fellow developers?

Wrappers

A wrapper is a very simple script that invokes a declared version of a tool. If the tool is not available locally, the wrapper will download it beforehand. Wrappers are very popular in the JVM ecosystem, in particular for Maven and Gradle.

The wrappers for both Maven and Gradle require the right JVM (Java Virtual Machine) to be installed already. This requires some mechanism to install the JVM in the developer workstations, but once the JVM is installed, it's trivial to ensure all of the developers are using the right version of Maven or Gradle.

To use a wrapper, a very small executable jar is committed into the source code repository. There is normally also a shell or batch script that is used to make it easier to invoke the wrapper. Committing binary files like a jar file is normally discouraged, but this is an example where an exception makes sense to ensure the proper toolchain is installed when needed.

The wrapper is configured via a file, which contains information about which version of the tool must be executed. The configuration file also contains information regarding how the build tool will be downloaded, for example, any proxies that should be used, or custom locations to fetch the files from.

For the Gradle wrapper, the configuration is located in a file called gradle-wrapper.properties inside the gradle/wrapper/ directory. For the Maven wrapper, the configuration is located in a file called maven-wrapper.properties inside the .mvn/wrapper/ directory.
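As a sketch, a gradle-wrapper.properties file typically looks like the following (the Gradle version is illustrative):

distributionBase=GRADLE_USER_HOME
distributionPath=wrapper/dists
distributionUrl=https\://services.gradle.org/distributions/gradle-8.5-bin.zip
zipStoreBase=GRADLE_USER_HOME
zipStorePath=wrapper/dists

Developers then invoke the build through the committed script, for example ./gradlew build (or gradlew.bat build on Windows), rather than through a locally installed Gradle.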

When the wrapper is invoked, the wrapper will verify the configuration file to determine what version of the tool is to be invoked. If the particular version of the tool is not locally available, it will be downloaded. The wrapper will then execute the tool, by passing through any options that were specified on the command line and are not wrapper-specific.

The build tool will download other dependencies and perform the required build steps.

Dev Containers

Development Containers (also called Dev Containers for short) is a specification that allows using containers as a full-featured development environment, including providing the full toolchain.

To use Dev Containers, the toolchain is defined as a set of containers. The containers are configured in a file called devcontainer.json.

A sample devcontainer.json file is shown below:

{
  "image": "mcr.microsoft.com/devcontainers/typescript-node",

  "customizations": {
    "vscode": {
      "extensions": ["streetsidesoftware.code-spell-checker"]
    }
  },
  "forwardPorts": [3000]
}

The full configuration reference can be found in the Dev Containers specification at https://containers.dev/.

Once the container is configured in the devcontainer.json file, your IDE will manage a running instance of the container to provide the required toolchain.

Dependencies

Build time dependencies are another area worth automating. In particular, automation is important to ensure that builds are reproducible, as when we release software we want to be sure that we have control over what is released.

A reproducible build is a build in which we can recreate our output artifacts to the bit level given the same inputs: the same source code, build environment, and build instructions. All of these inputs must be versioned in a source control system (such as Git).

In some cases, a development team might be comfortable with builds that are not fully reproducible. Especially during active development, the team might want access to the latest version of libraries that are being developed in parallel. For example, using non-deterministic library versions makes it easier to integrate changes from other teams before said libraries are finalized and released.

In such cases, when a bug is caused by a library, the problem might be solved by just rebuilding the application, without making any changes to the inputs stored in source control. This happens because rebuilding the application will pull the latest version of the library. Of course, the flipside of this is that rebuilding the application could result in the introduction of a bug or incompatibility.

If non-deterministic library versions are used when building during development, it's vital to ensure that a deterministic build is done when building release artifacts. The exact mechanisms vary from language to language. In the following sections, we explore some of the mechanisms for Java, NodeJS, and Python.

Java Library Versioning

In Java, when using Maven or Gradle, library versions can be specified as a SNAPSHOT. The SNAPSHOT version precedes the actual release version of the library. For example 1.0-SNAPSHOT is 1.0 under development.
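A sketch of what this looks like as a Maven dependency declaration (the group and artifact are illustrative):

<dependency>
  <groupId>com.example.payments</groupId>
  <artifactId>payments-client</artifactId>
  <version>1.0-SNAPSHOT</version>
</dependency>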

The reference to SNAPSHOT dependencies should only exist during development. It is normally recommended not to rely on SNAPSHOT dependencies not developed by your team/organization, as the release cadence of a third-party library might be hard to align with the release of your software.

Any reference to a SNAPSHOT dependency can cause the build to be "not reproducible". With this in mind, remove SNAPSHOT references as soon as possible.

Both Maven and Gradle support range versions for dependencies, although their use is much less common than using exact versions and SNAPSHOT libraries. More documentation about range versions can be found in the respective documentation for Maven and Gradle. Gradle also provides a mechanism to lock range versions to provide reproducible builds while maintaining some flexibility.

NodeJS Library Versioning

In the NodeJS ecosystem, npm has the concept of library version resolution based on ranges and "compatible" versions. NodeJS libraries are expected to be in SemVer format and are specified in the dependencies section of the package.json file. Within the SemVer format, many different requirements can be expressed:

  • version: Must match version exactly
  • >version: Must be greater than version
  • >=version: Must be greater than or equal to version
  • <version: Must be less than version
  • <=version: Must be less than or equal to version
  • ~version: Accept only patch updates
  • ^version: Accept minor and patch updates
  • latest: Always get the latest
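As an illustrative sketch (package names and versions chosen arbitrarily), a dependencies section combining several of these styles could look like this:

"dependencies": {
  "express": "^4.18.0",
  "lodash": "~4.17.21",
  "uuid": "9.0.0"
}

Here ^4.18.0 accepts any 4.x release at or above 4.18.0, ~4.17.21 accepts only 4.17.x patch releases at or above 4.17.21, and the exact version pins uuid to a single release.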

More information can be found on the specification of the package.json file.

To be able to support reproducible builds, npm maintains the package-lock.json file. The lock file maintains exact versions of the dependencies that have been installed. While package.json is meant to be edited by developers, the lock file is maintained by npm after most commands that manipulate the dependencies. For example, adding a new dependency with npm install will update the package-lock.json file. Running npm install without adding any new dependencies will update the package-lock.json file with the latest versions that are available within the ranges defined in the package.json file.

To perform a "reproducible build", for example in the Continuous Integration server, the npm dependencies should be installed with npm ci. This command will install the exact versions defined in the package-lock.json file. For this reason, it is recommended to persist the package-lock.json file in source control.

Python Library Versioning

For Python applications, pip allows using ranges (both inclusive and exclusive) and "compatible" versions.

For example:

  • ~= 2.2: Any 2.X version greater than or equal to 2.2
  • ~= 1.4.5: Any 1.4.X version greater than or equal to 1.4.5
  • > 1.4.5: Any version greater than 1.4.5 (exclusive range)
  • >= 1.4.5: Any version greater than or equal to 1.4.5 (inclusive range)

The pip version specifiers are described as part of the Python Packaging User Guide.

Reproducible builds can be achieved by "freezing" the requirements file, using pip freeze > requirements.txt. The resulting requirements.txt file will only contain exact versions as installed in the current environment. Installing from a frozen requirements file will result in a deterministic set of libraries. In the CI environment, a frozen requirements file should be used to ensure the build is reproducible. More information about the freeze command can be found in the pip documentation.
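A sketch of this workflow (the package and versions are illustrative):

# during development: install loosely pinned dependencies
pip install "requests~=2.31"

# capture the exact versions present in the current environment
pip freeze > requirements.txt

# later, for a reproducible build (for example in CI): install exactly those versions
pip install -r requirements.txt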

Code Compilation

For compiled languages, code compilation is the key step where the source code the developer writes gets turned into a binary format that can be executed. This applies to languages like Java and C/C++, but normally not to interpreted languages like NodeJS or Python.

There are some exceptions among interpreted languages; for example, TypeScript is compiled into JavaScript.

In the compilation step, our role as senior developers is to ensure that the build can compile the code without any manual input. This requires that all of the inputs are defined and stored in source control:

  • Source files
  • Flags
  • Preprocessors

Most of the flags and preprocessors used are the same regardless of which environment you are building in, but there might be some flags that will vary from environment to environment, due to different Operating Systems or different hardware architectures. This is more significant for languages that produce native binaries (such as C/C++, Go, Rust) as opposed to languages that compile into bytecode that is executed by a runtime (such as Java and other languages that compile to JVM bytecode).

For languages that compile native binaries, there's language-specific tooling that helps determine the proper set of flags that are required depending on the current environment. One prominent example is autoconf.
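As a minimal sketch using make (the compiler flags and the USE_EPOLL define are purely illustrative), the point is that the flags live in a file under source control rather than in someone's shell history:

CC ?= cc
CFLAGS := -O2 -Wall

# Illustrative per-platform flag handling; larger projects typically
# delegate this to tools such as autoconf or CMake.
UNAME_S := $(shell uname -s)
ifeq ($(UNAME_S),Linux)
    CFLAGS += -DUSE_EPOLL
endif

app: main.c
	$(CC) $(CFLAGS) -o app main.c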

The actual compilation process depends on each language.

Decision Point

  • What commands and parameters do we need to persist to source control to ensure compilation can be executed automatically?
  • Do we need to use some mechanism to account for compilation in different operating systems or architectures?

Tests

Testing is critical to producing high-quality software. Code must be tested, and the testing must be ingrained in the build process to reduce friction and minimize the likelihood of tests becoming stale. There are many types of tests, but in this section, we focus on "functional tests", a type of testing that seeks to establish whether each application feature works as per the software requirements. In general, functional tests are divided into two main types:

  • Unit tests: Tests that can run quickly and depend on few if any external systems.
  • Integration tests: Tests that require significant external systems or large infrastructure.

Unit tests are normally implemented near the code they are testing, while integration tests might be implemented as a separate project. To provide the most value, tests must be easy to run. And just like with compilation, all tests that don't require large infrastructure or external systems should be automated through our build system so that they can be run with a single command.

Integration tests should also be automated, but given their complexity and external dependencies, they tend to be run outside of the regular build. Instead, they are automated to be run by the CI system.

In later chapters, we go into much more detail about Unit Tests and Integration Tests.

Decision Point

  • How will unit tests run?
  • How will integration tests run?

Code Quality

Writing code is hard, but maintaining it tends to be even harder. High-quality code should do more than just run. We need to ensure that our code is reliable, secure, easy to understand, and easy to maintain. To ensure we're producing high-quality code, we can leverage tools that will perform automated checks to provide objective metrics regarding the quality of the code and any areas that need attention.

Keeping track of the quality of our code using objective metrics allows us to keep technical debt from creeping in.

In Chapter 8 we go into detail on how to use code quality tools to improve the quality of our code.

Decision Point

  • What code quality tools will we integrate?
  • How will these code quality tools be executed?

Developer Experience

Automating the build ties back to the idea of Developer Experience, because the time of developers is extremely valuable. While automating the build is generally a prerequisite to improving the productivity of developers, it is only the starting point.

As a senior developer, it is part of our role to ensure that the tools and processes are maintaining a positive developer experience. Developer experience should be an ongoing concern.

Part of this implies ensuring the build automation supports the experience of the developers. For example, we want to ensure that there are checks that prevent developers from getting a broken build. Guardrails should be set up to prevent merging breaking changes, for example, changes that won't compile or won't pass the automated tests. Source control systems can be configured to prevent merges of changes that haven't passed the automated tests run in the CI.

Developers should also have good visibility into the CI system to be able to get information about builds. Nothing kills productivity and morale like debugging a broken build without good visibility.

As part of the automation of the build, it's important to gather metrics to prevent the developer experience from degrading. These metrics will allow us to detect if builds are getting slower, or failing more often. Detecting these kinds of regressions is the first step to be able to resolve issues that can creep in and degrade the developer experience.

If issues are detected early, it is easier to identify the root cause. Root causes for slower builds are normally related to a change. For example:

  • A large dependency being added which must be downloaded and is not properly cached
  • Caches not working as expected
  • Upgrades to a part of the toolchain

In other cases, an external system outside of our control can be the culprit of the slowness. To identify the issue, good metrics are vital. Metrics should be granular enough to measure the latency of individual tasks within our build process.

To resolve slow builds, many techniques can be used to speed up the build and improve the developer experience. The actual techniques depend on the specific build tool being used, but some examples include:

  • Using caches (both local and remote) to speed up some of the build steps
  • Limit the components that are built locally for each project or module, downloading already-built modules from artifact registries instead
  • Limit the code that is tested locally for each project or module
  • Offload some of the tasks to remote executors that can provide more computing power or more parallelization

Decision Point

  • What metrics concerning the build will be collected?
  • Where will those metrics be collected? (only in the CI server or also as the developer builds locally)

Tools Referenced

Videos

Automating the Build

Automating the Build, continued