## bzlmod Modules

##### January 24, 2022
bazel tech writing

Recently, Bazel 5 was released. Hidden behind a flag is the new bzlmod tool. This is effectively a package manager for Bazel rulesets that’s baked into Bazel itself, the goal being to replace complicated WORKSPACE stanzas with a simple and purely declarative model. Ultimately, there should be no need for a workspace file at all.

At work, we maintain a relatively complicated suite of choreographed rulesets, and simplifying their maintenance seemed like too good an opportunity to miss, so I dove in to figure out how to get the most from bzlmod. Here’s what I learnt.

### Quick Overview

Although bzlmod is actually baked into Bazel itself and not a standalone tool, I still refer to it by a separate name because that’s how my brain works. If that confuses you, I apologise. If it helps, you can think of bzlmod as the tool that does dependency resolution for the rulesets you’re using before it hands off the build to Bazel “proper”.

bzlmod reads a MODULE.bazel file. To begin with, this is relatively simple, consisting of an opening call to a module function, and then a series of calls to bazel_dep to declare a dependency on another ruleset.
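As a rough sketch of what that looks like, here is a minimal MODULE.bazel. The module name my_project and all of the version numbers are placeholders I've made up for illustration, so substitute whatever the registry you're using actually offers:

```starlark
# MODULE.bazel
module(
    name = "my_project",  # hypothetical module name
    version = "0.1.0",  # hypothetical version
)

# One bazel_dep per ruleset this project uses directly.
# The versions below are illustrative, not recommendations.
bazel_dep(name = "bazel_skylib", version = "1.1.1")
bazel_dep(name = "rules_jvm_external", version = "5.0.0")
```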

At resolution time, bzlmod looks up each dependency declared in the MODULE.bazel file in the Bazel Central Registry and resolves versions using minimal version selection (much like Go does): for each module it picks the highest version that anything in the dependency graph asks for, never something newer just because it exists. The nice thing here is that each module (including your project!) need only declare its first order dependencies. That’s different from the approach taken in regular WORKSPACE-based projects, where you’re responsible for ensuring that the transitive deps of the rulesets you use are also loaded.

If you’re on a corporate network or don’t want to depend on the regular central registry, you can override the location of the registry by using the --registry flag. This takes a URL as its argument, and that URL can be a file:// URL.

Just as regular rulesets can declare repository rules, bzlmod modules can declare “extensions”. These are regular bzl files, written in Starlark, that contain a combination of “tags” and module_extensions. They’re loaded using a call to use_extension. More on this later!

One consequence of this design is that a ruleset can be both a regular “workspace”-based ruleset, as well as a module.

I’m calling the process of converting a ruleset to be a bzlmod module “modularisation” (that’s with a “z” in the middle if you’re using US English 😀)

### Preparing for bzlmod

By default, bzlmod isn’t enabled. To opt into using it, the following needs to be added to a project’s .bazelrc:

    common --experimental_enable_bzlmod


And, of course, you should pin the repo to using Bazel 5 or above. If you’re using bazelisk this is as easy as echo 5.0.0 >.bazelversion. If you’re reading this in the future (one hopes you are), then just use the version of Bazel that’s current at the moment.

### The Wrong Module Development Workflow

My original attempt to work with modules involved making a local clone of the Bazel Central Registry, and adding the ruleset I wanted to modularise by running the //tools:add_module.py script (by hand: there’s no Bazel build file here, and you may need to install some Python dependencies to get the thing to work).

Once the module was added, I created a fork of the ruleset I wanted to modularise, created a branch in that and pushed it to a private GitHub repo. That’s because I’d used the branch’s URL as the location of the module when calling add_module.py.

Within the Central Registry clone, there’s an integrity code. This is a base64 encoded sha256, and every update to the ruleset needs to also be matched with an update to that integrity code.

My workflow was therefore:

1. Make a local change to my fork.
2. Commit the change and push to the GitHub repo.
3. Update the integrity code in the Central Registry clone
4. Kill the running bazel instance in the project that uses the module I’m working on because Bazel stores the resolution.
5. Attempt to use the change, find a typo, go back to step 1.

Needless to say, this process was slow and very tedious.

### The Right Module Development Workflow

What I should have done is modify the MODULE.bazel file of the project that used the ruleset I was modularising, adding a stanza like this:

    # I was working on rules_jvm_external. This version hasn't been
    # released yet!
    bazel_dep(name = "rules_jvm_external", version = "5.0.0")

    # And then later
    local_path_override(
        module_name = "rules_jvm_external",  # matches the name of the bazel_dep
        path = "../path/to/my/clone/of/rules_jvm_external",
    )


Now, every time I made a change in the ruleset I was modularising (rules_jvm_external in this case), it was picked up automatically, without needing to restart the bazel daemon. This sped up development an awful lot.

The only caveat with this approach is that local_path_override only works in the “top level” project. That is, the override takes effect in the project that’s importing the modularised ruleset, but if one of that project’s dependencies made a similar call in its own MODULE.bazel, it would be ignored.

### Tags are Strongly Typed Macros

Let’s take an example from rules_jvm_external in a workspace-based project:

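    load("@rules_jvm_external//:defs.bzl", "maven_install")
    load("@rules_jvm_external//:specs.bzl", "maven")

    maven_install(
        artifacts = [
            maven.artifact(
                group = "com.google.guava",  # maven.artifact needs the group too
                artifact = "guava",
                version = "27.0-android",
                exclusions = [],
            ),
            "junit:junit:4.12",
        ],
        repositories = [
            "https://repo1.maven.org/maven2",
        ],
    )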


The way to handle this in a “MODULE.bazel” is to use tags. These are like stripped-down rules, in that they have no implementation function, but they do have a set of attrs, each of which is defined as being one of the entries in the attr module.

One limitation of the MODULE.bazel file is that you’re not allowed to use functions or load an external resource. This means that you can’t use macros in the way that we’re used to, and this caused me some serious head-scratching. Fortunately, after a conversation with Xudong Yang, it became clear there was another way to think about this problem.

The trick is that bzlmod will aggregate all the tags used by a module and, transitively, by its dependencies, and the module extension’s implementation function can iterate over them. That means that the above stanza in a MODULE.bazel could be written as:

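    # Bring the extension into scope (assuming it lives in //:extensions.bzl)
    maven = use_extension("@rules_jvm_external//:extensions.bzl", "maven")

    maven.install(
        artifacts = [
            "junit:junit:4.12",
        ],
        repositories = [
            "https://repo1.maven.org/maven2",
        ],
    )

    maven.artifact(
        group = "com.google.guava",
        artifact = "guava",
        version = "27.0-android",
        exclusions = [],
    )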


There are two things to note here:

1. maven is the extension created by module_extension (a MODULE.bazel file gets hold of it by calling use_extension), and install and artifact are both tag classes.
2. The module extension’s implementation function aggregates the data from both of these into a single data structure, which is then resolved.
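
To make that concrete, here's a minimal sketch of what the extension side might look like. This is not the real rules_jvm_external implementation: the attribute sets are trimmed right down, I've assumed the artifact tag carries a group attribute (Maven coordinates need one), and the helper name _collect_artifacts is made up for illustration. It only shows the shape: two tag classes, and an implementation function that walks every module's tags and folds them into one structure.

```starlark
# extensions.bzl (illustrative sketch, not the actual rules_jvm_external code)

def _collect_artifacts(module_ctx):
    """Gathers coordinates and repositories from every module's tags."""
    artifacts = []
    repositories = []
    for mod in module_ctx.modules:
        # Plain "group:artifact:version" strings from maven.install(...)
        for install in mod.tags.install:
            artifacts.extend(install.artifacts)
            for repo in install.repositories:
                if repo not in repositories:
                    repositories.append(repo)

        # Structured entries from maven.artifact(...)
        for tag in mod.tags.artifact:
            artifacts.append("%s:%s:%s" % (tag.group, tag.artifact, tag.version))

    return struct(artifacts = artifacts, repositories = repositories)

def _maven_impl(module_ctx):
    resolved = _collect_artifacts(module_ctx)

    # A real extension would now hand `resolved` to its repository rules.
    print("artifacts:", resolved.artifacts)

maven = module_extension(
    implementation = _maven_impl,
    tag_classes = {
        "install": tag_class(attrs = {
            "artifacts": attr.string_list(),
            "repositories": attr.string_list(),
        }),
        "artifact": tag_class(attrs = {
            "group": attr.string(),
            "artifact": attr.string(mandatory = True),
            "version": attr.string(),
            "exclusions": attr.string_list(),
        }),
    },
)
```

Because the tags are just typed bags of attributes, all of the "macro" logic ends up living in the implementation function rather than in MODULE.bazel.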

While this allows rules_jvm_external to express what needs to be said, it’s pretty clear that if the artifact tag needed a macro itself, we’d rapidly be in a whole world of pain. Fortunately, in my case, we don’t, so that’s fine 😀

### Module Implementation Functions Replace Workspace Stanzas

A module_extension’s implementation function gets access to a pretty anaemic module_ctx. This would be a problem, but the implementation function is free to call as many repository_rules as it wants to. You can also rely on the rulesets declared as a bazel_dep to be present.

This allows the module implementation function to effectively contain the bulk of what would normally be in the stanzas of code that get added to a workspace file.
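
As a sketch of that idea, the extension below turns each tag into a call to http_file, the same repository rule you'd otherwise invoke from a workspace file. The extension name files, the file name file_deps.bzl, and the fetch tag with its attributes are all hypothetical; only http_file and the module_extension / tag_class machinery come from Bazel.

```starlark
# file_deps.bzl (hypothetical extension, for illustration only)
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_file")

def _files_impl(module_ctx):
    for mod in module_ctx.modules:
        for fetch in mod.tags.fetch:
            # Each tag becomes one repository, much as a WORKSPACE stanza would.
            http_file(
                name = fetch.name,
                urls = [fetch.url],
                sha256 = fetch.sha256,
            )

files = module_extension(
    implementation = _files_impl,
    tag_classes = {
        "fetch": tag_class(attrs = {
            "name": attr.string(mandatory = True),
            "url": attr.string(mandatory = True),
            "sha256": attr.string(),
        }),
    },
)
```

A MODULE.bazel in the same project could then use it along these lines (the repository name and URL are placeholders):

```starlark
files = use_extension("//:file_deps.bzl", "files")
files.fetch(name = "some_file", url = "https://example.com/some_file.txt")
use_repo(files, "some_file")
```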

Care must be taken to avoid the need to call load in the implementation function: although the implementation is a lot like a subset of a workspace file, it’s not exactly the same. In the case of what I wanted to do for rules_jvm_external, this means that the lock file gets parsed at least twice: once so that I can generate a series of http_file dependencies, and once so that the actual @maven workspace can be set up.
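
Here's roughly what the first of those two passes could look like. It assumes module_ctx.read and json.decode are available in your Bazel version. The lock_file attribute and the lock file layout (a top-level "artifacts" list of objects with "name", "url" and "sha256" keys) are invented for this sketch and are not the real rules_jvm_external format.

```starlark
# Sketch only: the lock file layout used below is invented for illustration.
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_file")

def _locked_impl(module_ctx):
    for mod in module_ctx.modules:
        for install in mod.tags.install:
            # Parse the lock file inside the extension itself, rather than
            # relying on a load() of something the extension generates.
            lock = json.decode(module_ctx.read(install.lock_file))
            for entry in lock["artifacts"]:
                http_file(
                    name = entry["name"],
                    urls = [entry["url"]],
                    sha256 = entry["sha256"],
                )

locked = module_extension(
    implementation = _locked_impl,
    tag_classes = {
        "install": tag_class(attrs = {
            "lock_file": attr.label(),
        }),
    },
)
```

The second pass, setting up the actual @maven repository, would read the same file again from within its own repository rule.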

### bzlmod Lock Files

At some point in the future, bzlmod will have its own lock file. I’m not entirely sure what this will look like, but my belief is that this will be structured in such a way that your module implementation function should only be called when one of its inputs (including attributes of tags) has changed. In the case of rules_jvm_external, this will most likely be when the artifacts being imported into your project change.

This will be incredibly useful. One of the painful things when working with workspace files is waiting for all the transitive deps to download and be set up so Bazel can figure out which bits of them it needs to use. On larger projects, this can take a long time. Being able to start building faster can only be a Good Thing, and I look forward to it!

### Managing Your MODULE.bazel

Right now, you can’t. Because there’s no way of calling load in a MODULE.bazel file, there’s no way to segment the thing in a meaningful way. I’ve raised an issue to do with this, and I know it’s something the Bazel developers are aware of, so I hope that once bzlmod is no longer hidden behind a flag, this will be possible.

### Final Thoughts

Overall, after kicking the tyres and trying it out, I think that I like bzlmod, and it’ll be fun to see how it grows and changes, especially as rulesets migrate to using it.

Right now, it’s usable, but there are some corner cases where it’s not quite there yet (notably when a module declares a dependency on a repository via a generated build file). Having seen how quickly the Bazel team have leaped on the issues I’ve filed, I’m very confident that problem will be resolved.

My advice? Try migrating your ruleset to bzlmod, and see what works for you and what doesn’t. I suspect there’s enough there for it to work just as you’d expect in many cases.

My thanks to the Bazel developers, Xudong Yang and Alex Eagle for their help as I delved into bzlmod. Alex’s blog post gave me the incentive to start digging into bzlmod and provided enough scaffolding for me to get started. It proved invaluable!
