Recently, Bazel 5 was released. Hidden behind a flag is the
new bzlmod tool. This is effectively a package manager for Bazel
rulesets that’s baked into Bazel itself, the goal being to replace
complicated WORKSPACE stanzas with a simple and purely declarative
model. Ultimately, there should be no need for a workspace file at
all.
At work, we maintain a relatively complicated suite of choreographed
rulesets, and simplifying their maintenance seemed like too good an
opportunity to miss, so I dove in to figure out how to get the most
from bzlmod. Here’s what I learnt.
Quick Overview
Although bzlmod is actually baked into Bazel itself and not a
standalone tool, I still refer to it by a separate name because that’s
how my brain works. If that confuses you, I apologise. If it helps,
you can think of bzlmod as the tool that does dependency resolution
for the rulesets you’re using before it hands off the build to Bazel
“proper”.
bzlmod reads a MODULE.bazel file. To begin with, this is
relatively simple, consisting of an opening call to a module
function, and then a series of calls to bazel_dep to declare a
dependency on another ruleset.
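As a rough illustration, a minimal MODULE.bazel might look like the sketch below. The module name my_project and the version numbers are placeholders picked for this example, so check the registry for versions that actually exist before copying it.

```starlark
# MODULE.bazel
# "my_project" and every version number here are illustrative placeholders.
module(
    name = "my_project",
    version = "0.1.0",
)

# Each bazel_dep names a ruleset and the version this module asks for.
bazel_dep(name = "bazel_skylib", version = "1.1.1")
bazel_dep(name = "rules_jvm_external", version = "5.0.0")
```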
At resolution time, bzlmod will look in the Bazel Central Registry
and pick the lowest version of each dependency that satisfies every
requirement declared in the MODULE.bazel files involved (much like Go
does). In practice, that is the highest version any module in the
dependency graph asks for, and nothing newer. The nice thing here is
that each module (including your project!) need only declare its first
order dependencies. That’s different from the approach taken in the
regular WORKSPACE-based projects, where you’re responsible for ensuring
that the transitive deps of the rulesets you use are also loaded.
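To make the version selection concrete, here is a small worked example. The modules my_project, rules_foo and rules_bar are hypothetical, and the sketch assumes that all the rules_foo versions mentioned are compatible with one another.

```starlark
# MODULE.bazel of the root project ("my_project" is hypothetical).
module(name = "my_project", version = "0.1.0")

# Only first-order dependencies are listed here.
bazel_dep(name = "rules_foo", version = "1.2")  # hypothetical ruleset
bazel_dep(name = "rules_bar", version = "2.0")  # hypothetical ruleset

# Suppose the MODULE.bazel of rules_bar 2.0 contains:
#
#     bazel_dep(name = "rules_foo", version = "1.4")
#
# Both requirements have to hold, so the lowest version that satisfies
# them is selected: rules_foo 1.4. Nothing newer is picked up, even if
# the registry also offers rules_foo 1.5.
```

Note that the root project never has to mention whatever rules_foo itself depends on; those requirements come from the MODULE.bazel of rules_foo.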
If you’re on a corporate network or don’t want to depend on the
regular central registry, you can override the location of the
registry by using the --registry flag. This takes a URL as its
argument, and that URL can be a file:// URL.
Just as regular rulesets can declare repository rules, bzlmod
modules can declare “extensions”. These are regular bzl files,
written in Starlark, that contain a combination of “tags” and
module_extensions. They’re loaded using a call to
use_extension. More on this later!
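For orientation, loading and using an extension from a MODULE.bazel looks roughly like the sketch below. The label //:extensions.bzl, the exported name maven and the generated repository name maven are assumptions made for this example; check the ruleset you are using for the real ones.

```starlark
# MODULE.bazel
bazel_dep(name = "rules_jvm_external", version = "5.0.0")

# Fetch the extension object exported by the ruleset. The file label and
# the name "maven" are assumptions for this sketch.
maven = use_extension("@rules_jvm_external//:extensions.bzl", "maven")

# Tags are called as if they were methods on the extension object.
maven.install(
    artifacts = ["junit:junit:4.12"],
    repositories = ["https://repo1.maven.org/maven2"],
)

# Make a repository generated by the extension visible to this module.
use_repo(maven, "maven")
```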
One consequence of this design is that a ruleset can be both a regular “workspace”-based ruleset and a module.
I’m calling the process of converting a ruleset to be a bzlmod
module “modularisation” (that’s with a “z” in the middle if you’re
using US English 😀).
Preparing for bzlmod
By default, bzlmod isn’t enabled. To opt into using it, the
following needs to be added to a project’s .bazelrc:
common --experimental_enable_bzlmod
And, of course, you should pin the repo to using Bazel 5 or above. If
you’re using bazelisk this is as easy as:
echo 5.0.0 >.bazelversion
If you’re reading this in the future (one hopes you
are), then just use the version of Bazel that’s current at the moment.
The Wrong Module Development Workflow
My original attempt to work with modules involved making a local clone
of the Bazel Central Registry, and adding the ruleset I wanted
to modularise by running the //tools:add_module.py script (by hand:
there’s no Bazel build file here, and you may need to install some
Python dependencies to get the thing to work).
Once the module was added, I created a fork of the ruleset I wanted to
modularise, created a branch in that and pushed to a private GitHub
repo. That’s because I’d used the branch’s URL as the location of the
module when calling add_module.py.
Within the Central Registry clone, there’s an integrity code. This is a base64 encoded sha256, and every update to the ruleset needs to also be matched with an update to that integrity code.
My workflow was therefore:
- Make a local change to my fork.
- Commit the change and push to the GitHub repo.
- Update the integrity code in the Central Registry clone.
- Kill the running bazel instance in the project that uses the module I’m working on, because Bazel stores the resolution.
- Attempt to use the change, find a typo, go back to step 2.
Needless to say, this process was slow and very tedious.
The Right Module Development Workflow
What I should have done is modify the MODULE.bazel file in the
project I was working on that used the ruleset I was modularising by
adding a stanza like this:
# I was working on `rules_jvm_external`. This version hasn't been
# released yet!
bazel_dep(name = "rules_jvm_external", version = "5.0.0")
# And then later
local_path_override(
    module_name = "rules_jvm_external", # matches the name of the `bazel_dep`
    path = "../path/to/my/clone/of/rules_jvm_external",
)
Now, every time I made a change in the ruleset I was modularising
(rules_jvm_external in this case), it was picked up automatically,
without needing to restart the bazel daemon. This sped up development
an awful lot.
The only caveat with this approach is that the local_path_override only works in the “top level” project. That is, while the module override works in the project that’s importing the modularised ruleset, a similar call in the MODULE.bazel of the ruleset itself (or of any other dependency) would be ignored.
Tags are Strongly Typed Macros
Let’s take an example from rules_jvm_external in a workspace-based project:
maven_install(
    artifacts = [
        maven.artifact(
            group = "com.google.guava",
            artifact = "guava",
            version = "27.0-android",
            exclusions = [],
        ),
        "junit:junit:4.12",
    ],
    repositories = [
        "https://repo1.maven.org/maven2",
    ],
)
The way to handle this in a MODULE.bazel file is to use
tags. These are like stripped down rules, in that they
have no implementation function, but they do have a set of attrs,
each of which is defined as being one of the entries in the attr
module.
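A minimal sketch of what declaring such tags could look like is below. The file name, the attribute lists and the empty implementation function are assumptions made for illustration, not the actual rules_jvm_external source.

```starlark
# extensions.bzl (hypothetical file name)

# A tag class is declared with attrs only; it has no implementation function.
_install = tag_class(
    attrs = {
        "artifacts": attr.string_list(),
        "repositories": attr.string_list(),
    },
)

_artifact = tag_class(
    attrs = {
        "group": attr.string(mandatory = True),
        "artifact": attr.string(mandatory = True),
        "version": attr.string(),
        "exclusions": attr.string_list(),
    },
)

def _maven_impl(module_ctx):
    # Left empty in this sketch; reading the tags would happen here.
    pass

# The keys of tag_classes become the tag names usable in MODULE.bazel.
maven = module_extension(
    implementation = _maven_impl,
    tag_classes = {
        "install": _install,
        "artifact": _artifact,
    },
)
```

Because every attribute has a declared type, a misspelt or mistyped argument in a MODULE.bazel is rejected when the tag is evaluated, which is where the “strongly typed” part comes from.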
One limitation of the MODULE.bazel file is that you’re not allowed
to use functions or load an external resource. This means that you
can’t use macros in the way that we’re used to, and this caused me
some serious head-scratching. Fortunately, after a conversation with
Xudong Yang, it became clear there was another way to think about this
problem.
The trick is that bzlmod will aggregate all the tags defined
transitively in a module, and the module extension’s implementation
function can iterate over them. That means that the above stanza in a
MODULE.bazel could be written as:
maven.install(
    artifacts = [
        "junit:junit:4.12",
    ],
    repositories = [
        "https://repo1.maven.org/maven2",
    ],
)
maven.artifact(
    group = "com.google.guava",
    artifact = "guava",
    version = "27.0-android",
    exclusions = [],
)
There are two things to note here:
- maven is the value returned by module_extension, and install and artifact are both tag classes.
- The module extension’s implementation function aggregates the data from both of these into a single data structure, which is then resolved.
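The aggregation described in the second point can be sketched as follows. This is a simplified illustration under my own assumptions (the attribute names mirror the example above, and exclusions are ignored), not the real rules_jvm_external implementation.

```starlark
# extensions.bzl (hypothetical sketch)

def _maven_impl(module_ctx):
    artifacts = []
    repositories = []

    # module_ctx.modules lists the modules in the dependency graph that
    # use this extension; each one carries the tags it declared.
    for mod in module_ctx.modules:
        for install in mod.tags.install:
            artifacts.extend(install.artifacts)
            for repo in install.repositories:
                if repo not in repositories:
                    repositories.append(repo)

        for art in mod.tags.artifact:
            # Fold the structured tag into the same coordinate string form.
            # Exclusions are dropped to keep the sketch short.
            artifacts.append("%s:%s:%s" % (art.group, art.artifact, art.version))

    # A real extension would now resolve `artifacts` against `repositories`
    # and call repository rules; that part is omitted here.

maven = module_extension(
    implementation = _maven_impl,
    tag_classes = {
        "install": tag_class(attrs = {
            "artifacts": attr.string_list(),
            "repositories": attr.string_list(),
        }),
        "artifact": tag_class(attrs = {
            "group": attr.string(),
            "artifact": attr.string(),
            "version": attr.string(),
            "exclusions": attr.string_list(),
        }),
    },
)
```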
While this allows rules_jvm_external to express what needs to be
said, it’s pretty clear that if the artifact tag needed a macro
itself, we’d rapidly be in a whole world of pain. Fortunately, in my
case, we don’t, so that’s fine 😀
Module Implementation Functions Replace Workspace Stanzas
A module_extension’s implementation function gets access to a pretty anaemic module_ctx. This would be a problem, but the implementation function is free to call as many repository_rules as it wants to. You can also rely on the rulesets declared as a bazel_dep to be present.
This allows the module implementation function to effectively contain the bulk of what would normally be in the stanzas of code that get added to a workspace file.
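As a hedged sketch of that idea, the extension below turns each tag into an http_file repository, which is roughly what a hand-written workspace stanza would have done. The extension name files, the tag name download and its attributes are all invented for this example.

```starlark
# extensions.bzl (hypothetical sketch)
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_file")

def _files_impl(module_ctx):
    for mod in module_ctx.modules:
        for download in mod.tags.download:
            # Each tag becomes one repository, much as a WORKSPACE stanza would.
            http_file(
                name = download.repo_name,
                urls = [download.url],
                sha256 = download.sha256,
            )

files = module_extension(
    implementation = _files_impl,
    tag_classes = {
        "download": tag_class(attrs = {
            "repo_name": attr.string(mandatory = True),
            "url": attr.string(mandatory = True),
            "sha256": attr.string(mandatory = True),
        }),
    },
)
```

The load at the top of the bzl file is fine because it is resolved before the implementation function runs; what cannot be done is loading something from inside the function itself.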
Care must be taken to avoid the need to call load in the
implementation function: although the implementation is a lot like a
subset of a workspace file, it’s not exactly the same. In the case of
what I wanted to do for rules_jvm_external, this means that the lock
file gets parsed at least twice: once so that I can generate a series
of http_file dependencies, and once so that the actual @maven
workspace can be set up.
bzlmod Lock Files
At some point in the future, bzlmod will have its own lock
file. I’m not entirely sure what this will look like, but
my belief is that this will be structured in such a way that your
module implementation function should only be called when one of its
inputs (including attributes of tags) has changed. In the case of
rules_jvm_external, this will most likely be when the artifacts
being imported into your project change.
This will be incredibly useful. One of the painful things when working with workspace files is waiting for all the transitive deps to download and be set up so Bazel can figure out which bits of them it needs to use. On larger projects, this can take a long time. Being able to start building faster can only be a Good Thing, and I look forward to it!
Managing Your MODULE.bazel
Right now, you can’t. Because there’s no way of calling load in a
MODULE.bazel file, there’s no way to segment the thing in a
meaningful way. I’ve raised an issue to do with this, and I
know it’s something the Bazel developers are aware of, so I hope that
once bzlmod is no longer hidden behind a flag, this will be
possible.
Final Thoughts
Overall, after kicking the tyres and trying it out, I think that I
like bzlmod, and it’ll be fun to see how it grows and changes,
especially as rulesets migrate to using it.
Right now, it’s usable, but there are some corner cases where it’s not quite there yet (notably when a module declares a dependency on a repository via a generated build file). Having seen how quickly the Bazel team have leaped on the issues I’ve filed, I’m very confident that problem will be resolved.
My advice? Try migrating your ruleset to bzlmod, and see what works
for you and what doesn’t. I suspect there’s enough there for it to
work just as you’d expect in many cases.
My thanks to the Bazel developers, Xudong Yang and Alex Eagle for
their help as I delved into bzlmod. Alex’s blog post
gave me the incentive to start digging into bzlmod and provided
enough scaffolding for me to get started. It proved invaluable!