up code to have a wildcard import from an internal module:
mod thing;
pub use thing::*;
However, there’s another common exception where wildcard imports make sense.
Some crates have a convention that common items for the crate are re-exported from
a prelude module, which is explicitly intended to be wildcard imported:
use thing::prelude::*;
Although in theory the same concerns apply in this case, in practice such a prelude
module is likely to be carefully curated, and higher convenience may outweigh a
small risk of future problems.
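For crate authors, the prelude convention can be sketched like this (the module and type names here are invented for illustration):

```rust
// Hypothetical crate layout: the author curates a small `prelude` module
// that re-exports the handful of items most users will want.
mod geometry {
    #[derive(Debug, PartialEq)]
    pub struct Point {
        pub x: f64,
        pub y: f64,
    }
}

pub mod prelude {
    // Deliberately small, curated list of re-exports.
    pub use crate::geometry::Point;
}

// A user of the crate opts in with a single wildcard import.
use prelude::*;

fn main() {
    let p = Point { x: 1.0, y: 2.0 };
    assert_eq!(p, Point { x: 1.0, y: 2.0 });
}
```

Because the prelude is curated by the crate author, the set of names it injects is small and stable, which is what makes the wildcard tolerable here.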
Finally, if you don’t follow the advice in this Item, consider pinning dependencies that
you wildcard import to a precise version, so that minor version upgrades of the dependency aren't automatically allowed.
Item 23: Avoid wildcard imports | 187
Item 24: Re-export dependencies whose types appear in your API
The title of this Item is a little convoluted, but working through an example will make
things clearer. Item 25 describes how Cargo supports different versions of the same library crate
being linked into a single binary, in a transparent manner. Consider a binary that uses
the rand crate—more specifically, one that uses some 0.8 version of the crate:
# Cargo.toml file for a top-level binary crate.
[dependencies]
# The binary depends on the `rand` crate from crates.io.
rand = "=0.8.5"
# It also depends on some other crate (`dep-lib`).
dep-lib = "0.1.0"
// Source code:
let mut rng = rand::thread_rng(); // rand 0.8
let max: usize = rng.gen_range(5..10);
let choice = dep_lib::pick_number(max);
The final line of code also uses a notional dep-lib crate as another dependency. This
crate might be another crate from crates.io, or it could be a local crate that is
located via a Cargo path dependency.
This dep-lib crate internally uses a 0.7 version of the rand crate:
# Cargo.toml file for the `dep-lib` library crate.
[dependencies]
# The library depends on the `rand` crate from crates.io.
rand = "=0.7.3"
// Source code:
//! The `dep-lib` crate provides number-picking functionality.
use rand::Rng;
/// Pick a number between 0 and n (exclusive).
pub fn pick_number(n: usize) -> usize {
rand::thread_rng().gen_range(0, n)
}
An eagle-eyed reader might notice a difference between the two code examples:
• In version 0.7.x of rand (as used by the dep-lib library crate), the gen_range
method takes two parameters, low and high.
• In version 0.8.x of rand (as used by the binary crate), the gen_range
method takes a single range parameter.
This is not a back-compatible change, and so rand has increased its leftmost version
component accordingly, as required by semantic versioning. Nevertheless,
the binary that combines the two incompatible versions works just fine—cargo sorts
everything out.
However, things get a lot more awkward if the dep-lib library crate’s API exposes a
type from its dependency, making that dependency a public dependency.
For example, suppose that the dep-lib entrypoint involves an Rng item—but specifi‐
cally a version-0.7 Rng item:
/// Pick a number between 0 and n (exclusive) using
/// the provided `Rng` instance.
pub fn pick_number_with<R: Rng>(rng: &mut R, n: usize) -> usize {
rng.gen_range(0, n) // Method from the 0.7.x version of Rng
}
As an aside, think carefully before using another crate's types in your API: it intimately
ties your crate to that of the dependency. For example, a major version bump for the
dependency will automatically require a major version bump for your crate
too.
In this case, rand is a semi-standard crate that is widely used and pulls in only a small
number of dependencies, so including its types in the crate API is probably fine on
balance.
Returning to the example, an attempt to use this entrypoint from the top-level binary
fails:
DOES NOT COMPILE
let mut rng = rand::thread_rng();
let max: usize = rng.gen_range(5..10);
let choice = dep_lib::pick_number_with(&mut rng, max);
Unusually for Rust, the compiler error message isn't as helpful as you might hope:
error[E0277]: the trait bound `ThreadRng: rand_core::RngCore` is not satisfied
--> src/main.rs:22:44
|
22 | let choice = dep_lib::pick_number_with(&mut rng, max);
| ------------------------- ^^^^^^^^ the trait
| | `rand_core::RngCore` is not
| | implemented for `ThreadRng`
| |
| required by a bound introduced by this call
|
= help: the following other types implement trait `rand_core::RngCore`:
&'a mut R
Investigating the types involved leads to confusion because the relevant traits do
appear to be implemented—but the caller actually implements a (notional)
RngCore_v0_8_5 and the library is expecting an implementation of RngCore_v0_7_3.
Once you’ve finally deciphered the error message and realized that the version clash
is the underlying cause, how can you fix it? The key observation is to realize that
while the binary can’t directly use two different versions of the same crate, it can do so
indirectly (as in the original example shown previously).
From the perspective of the binary author, the problem could be worked around by
adding an intermediate wrapper crate that hides the naked use of rand v0.7 types. A
wrapper crate is distinct from the binary crate and so is allowed to depend on rand
v0.7 separately from the binary crate’s dependency on rand v0.8.
This is awkward, and a much better approach is available to the author of the library
crate. It can make life easier for its users by explicitly re-exporting either of the following:
• The types involved in the API
• The entire dependency crate
For this example, the latter approach works best: as well as making the version 0.7 Rng
and RngCore types available, it also makes available the methods (like thread_rng())
that construct instances of the type:
// Re-export the version of `rand` used in this crate's API.
pub use rand;
The calling code now has a different way to directly refer to version 0.7 of rand, as
dep_lib::rand:
let mut prev_rng = dep_lib::rand::thread_rng(); // v0.7 Rng instance
let choice = dep_lib::pick_number_with(&mut prev_rng, max);
With this example in mind, the advice given in the title of the Item should now be a
little less obscure: re-export dependencies whose types appear in your API.
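The other option, re-exporting just the types involved, can be sketched as follows. (This is an illustrative stand-in: the `dep` module plays the role of the real dependency crate, and the trait is invented.)

```rust
// Stand-in for the dependency crate whose trait appears in our API.
mod dep {
    pub trait Rng {
        fn next(&mut self) -> u32;
    }
}

// Re-export only the type that appears in this crate's public API, so
// callers can name it as `<this crate>::Rng` without adding their own
// (possibly version-clashing) dependency.
pub use dep::Rng;

pub fn pick<R: Rng>(rng: &mut R) -> u32 {
    rng.next()
}

// A fixed "generator", purely for demonstration purposes.
struct Fixed(u32);
impl Rng for Fixed {
    fn next(&mut self) -> u32 {
        self.0
    }
}

fn main() {
    assert_eq!(pick(&mut Fixed(7)), 7);
}
```

This keeps the public surface smaller than re-exporting the whole dependency, at the cost of not exposing helper functions like thread_rng().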
4 This kind of error can even appear when the dependency graph includes two alternatives for a crate with the same version, when something in the build graph uses the crate via a local directory instead of a crates.io location.
Item 25: Manage your dependency graph
Like most modern programming languages, Rust makes it easy to pull in external
libraries, in the form of crates. Most nontrivial Rust programs use external crates, and
those crates may themselves have additional dependencies, forming a dependency
graph for the program as a whole.
By default, Cargo will download any crates named in the [dependencies] section of
your Cargo.toml file from crates.io, and find versions of those crates that match the requirements configured in Cargo.toml.
A few subtleties lurk underneath this simple statement. The first thing to notice is
that crate names from crates.io form a single flat namespace—and this global
namespace also overlaps with the names of features in a crate (see Item 26).
If you’re planning on publishing a crate on crates.io, be aware that names are gener‐
ally allocated on a first-come, first-served basis; so you may find that your preferred
name for a public crate is already taken. However, name-squatting—reserving a crate
name by preregistering an empty crate—is frowned upon, unless you really are going
to release code in the near future.
As a minor wrinkle, there’s also a slight difference between what’s allowed as a crate
name in the crates namespace and what’s allowed as an identifier in code: a crate can
be named some-crate, but it will appear in code as some_crate (with an underscore).
To put it another way: if you see some_crate in code, the corresponding crate name
may be either some-crate or some_crate.
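For instance, with a hypothetical dependency:

```toml
# Hypothetical dependency entry: the crates.io name uses a hyphen,
# but in Rust source the same crate is referenced with an underscore,
# e.g. `use some_crate::SomeType;`.
[dependencies]
some-crate = "1"
```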
The second subtlety to understand is that Cargo allows multiple semver-incompatible
versions of the same crate to be present in the build. This can seem surprising to
begin with, because each Cargo.toml file can have only a single version of any given
dependency, but the situation frequently arises with indirect dependencies: your crate
depends on some-crate version 3.x but also depends on older-crate, which in turn
depends on some-crate version 1.x.
This can lead to confusion if the dependency is exposed in some way rather than just
being used internally: the compiler will treat the two versions as being distinct
crates, but its error messages won't necessarily make that clear.
Allowing multiple versions of a crate can also go wrong if the crate includes C/C++
code accessed via Rust's FFI mechanisms. The Rust toolchain can internally
disambiguate distinct versions of Rust code, but any included C/C++ code is subject
5 It's also possible to configure Cargo to use alternate registries of crates (for example, an internal corporate registry). Each dependency entry in Cargo.toml can then use the registry key to indicate which registry a dependency should be sourced from.
to the one definition rule: there can be only a single version of any function, constant, or global variable.
There are restrictions on Cargo's multiple-version support. Cargo does not allow
multiple versions of the same crate within a semver-compatible range:
• some-crate 1.2 and some-crate 3.1 can coexist
• some-crate 1.2 and some-crate 1.3 cannot
Cargo also extends the semantic versioning rules for pre-1.0 crates so that the first
non-zero subversion counts like a major version, so a similar constraint applies:
• other-crate 0.1.2 and other-crate 0.2.0 can coexist
• other-crate 0.1.2 and other-crate 0.1.4 cannot
Cargo's resolution algorithm does the job of figuring out what versions to include. Each Cargo.toml dependency line specifies an acceptable range of versions,
according to semantic versioning rules, and Cargo takes this into account when the
same crate appears in multiple places in the dependency graph. If the acceptable
ranges overlap and are semver-compatible, then Cargo will (by default) pick the most
recent version of the crate within the overlap. If there is no semver-compatible over‐
lap, then Cargo will build multiple copies of the dependency at different versions.
Once Cargo has picked acceptable versions for all dependencies, its choices are recor‐
ded in the Cargo.lock file. Subsequent builds will then reuse the choices encoded in
Cargo.lock so that the build is stable and no new downloads are needed.
This leaves you with a choice: should you commit your Cargo.lock files into version
control? The advice is as follows:
• Things that produce a final product, namely applications and binaries, should
commit Cargo.lock to ensure a deterministic build.
• Library crates should not commit a Cargo.lock file, because it's irrelevant to any
downstream consumers of the library—they will have their own Cargo.lock file,
and the Cargo.lock file for a library crate is ignored by library users.
Even for a library crate, it can be helpful to have a checked-in Cargo.lock file to ensure
that regular builds and CI don't have a moving target. Although the promises of
semantic versioning should prevent failures in theory, mistakes happen in practice,
and it's frustrating to have builds that fail because someone
somewhere recently changed a dependency of a dependency.
However, if you version-control Cargo.lock, set up a process to handle upgrades (such as GitHub's Dependabot). If you don't, your dependencies will stay pinned to versions that get older, outdated, and potentially insecure.
Pinning versions with a checked-in Cargo.lock file doesn’t avoid the pain of handling
dependency upgrades, but it does mean that you can handle them at a time of your
own choosing, rather than immediately when the upstream crate changes. There’s
also some fraction of dependency-upgrade problems that go away on their own: a
crate that’s released with a problem often gets a second, fixed, version released in a
short space of time, and a batched upgrade process might see only the latter version.
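A batched upgrade might then look something like the following (the package name is a placeholder; cargo update and its flags are standard Cargo):

```shell
# Update everything in Cargo.lock to the newest semver-compatible versions.
cargo update

# Or update just a single dependency, optionally to a precise version.
cargo update -p some-crate
cargo update -p some-crate --precise 1.2.4
```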
The third subtlety of Cargo’s resolution process to be aware of is feature unification:
the features that get activated for a dependent crate are the union of the features
selected by different places in the dependency graph; see Item 26 for more details.
Version Specification
The version specification clause for a dependency defines a range of allowed
versions, according to semantic versioning rules:
Avoid a too-specific version dependency
Pinning to a specific version ("=1.2.3") is usually a bad idea: you don't see newer
versions (potentially including security fixes), and you dramatically narrow the
potential overlap range with other crates in the graph that rely on the same
dependency (recall that Cargo allows only a single version of a crate to be used
within a semver-compatible range). If you want to ensure that your builds use a
consistent set of dependencies, the Cargo.lock file is the tool for the job.
Avoid a too-general version dependency
It’s possible to specify a version dependency ("*") that allows any version of the dependency to be used, but it’s a bad idea. If the dependency releases a new major
version of the crate that completely changes every aspect of its API, it’s unlikely
that your code will still work after a cargo update pulls in the new version.
The most common Goldilocks specification—not too precise, not too vague—is to
allow semver-compatible versions ("1") of a crate, possibly with a specific minimum
version that includes a feature or fix that you require ("1.4.23"). Both of these ver‐
sion specifications make use of Cargo’s default behavior, which is to allow versions
that are semver-compatible with the specified version. You can make this more
explicit by adding a caret:
• A version of "1" is equivalent to "^1", which allows all 1.x versions (and so is also equivalent to "1.*").
• A version of "1.4.23" is equivalent to "^1.4.23", which allows any 1.x versions
that are larger than 1.4.23.
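To summarize the options (the crate names here are placeholders):

```toml
[dependencies]
crate-a = "1"        # any 1.x version; equivalent to "^1" and "1.*"
crate-b = "1.4.23"   # any 1.x version >= 1.4.23; equivalent to "^1.4.23"
crate-c = "=1.2.3"   # exactly 1.2.3 -- usually too specific
crate-d = "*"        # any version at all -- too general
```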
Solving Problems with Tooling
When managing dependencies, take advantage of the range of tools that are available
within the Rust ecosystem. This section describes some dependency graph problems
where tools can help.
The compiler will tell you pretty quickly if you use a dependency in your code but
don’t include that dependency in Cargo.toml. But what about the other way around? If
there’s a dependency in Cargo.toml that you don’t use in your code—or more likely, no
longer use in your code—then Cargo will go on with its business. The cargo-udeps
tool is designed to solve exactly this problem: it warns you when your Cargo.toml
includes an unused dependency (“udep”).
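Typical usage looks like the following; note that, at the time of writing, cargo-udeps needs a nightly toolchain to run:

```shell
# Install the external cargo-udeps tool, then scan for unused dependencies.
cargo install cargo-udeps
cargo +nightly udeps
```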
A more versatile tool is cargo-deny, which analyzes your dependency graph to detect a variety of potential problems across the full set of transitive dependencies:
• Dependencies that have known security problems in the included version
• Dependencies that are covered by an unacceptable license
• Dependencies that are just unacceptable
• Dependencies that are included in multiple different versions across the depend‐
ency tree
Each of these features can be configured and can have exceptions specified. The
exception mechanism is usually needed for larger projects, particularly the multiple-
version warning: as the dependency graph grows, so does the chance of transitively
depending on different versions of the same crate. It’s worth trying to reduce these
duplicates where possible—for binary-size and compilation-time reasons if nothing
else—but sometimes there is no possible combination of dependency versions that
can avoid a duplicate.
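A cargo-deny configuration lives in a deny.toml file; a minimal sketch might look like this (the exact keys evolve over time, so check the tool's current documentation before relying on them):

```toml
# Sketch of a deny.toml configuration for cargo-deny.
[licenses]
# Only these licenses are acceptable for dependencies.
allow = ["MIT", "Apache-2.0"]

[bans]
# Warn (rather than fail) when a crate appears at multiple versions.
multiple-versions = "warn"
```

The checks are then run with cargo deny check.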
These tools can be run as a one-off, but it’s better to ensure they’re executed regularly
and reliably by including them in your CI system. This helps to catch newly
introduced problems—including problems that may have been introduced outside of
your code, in an upstream dependency (for example, a newly reported vulnerability).
If one of these tools does report a problem, it can be difficult to figure out exactly
where in the dependency graph the problem arises. The cargo tree command that's included with cargo helps here, as it shows the dependency graph as a tree structure:
dep-graph v0.1.0
├── dep-lib v0.1.0
│ └── rand v0.7.3
│ ├── getrandom v0.1.16
│ │ ├── cfg-if v1.0.0
│ │ └── libc v0.2.94
│ ├── libc v0.2.94
│ ├── rand_chacha v0.2.2
│ │ ├── ppv-lite86 v0.2.10
│ │ └── rand_core v0.5.1
│ │ └── getrandom v0.1.16 (*)
│ └── rand_core v0.5.1 (*)
└── rand v0.8.3
├── libc v0.2.94
├── rand_chacha v0.3.0
│ ├── ppv-lite86 v0.2.10
│ └── rand_core v0.6.2
│ └── getrandom v0.2.3
│ ├── cfg-if v1.0.0
│ └── libc v0.2.94
└── rand_core v0.6.2 (*)
cargo tree includes a variety of options that can help to solve specific problems,
such as these:
--invert
Shows what depends on a specific package, helping you to focus on a particular
problematic dependency
--edges features
Shows what crate features are activated by a dependency link, which helps you
figure out what's going on with feature unification (Item 26)
--duplicates
Shows crates that have multiple versions present in the dependency graph
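For example (the package spec syntax accepted by --invert may vary slightly between cargo versions):

```shell
# Who pulls rand v0.7.3 into the build graph?
cargo tree -i rand@0.7.3

# Which features does each dependency edge activate?
cargo tree --edges features

# Which crates appear at more than one version?
cargo tree --duplicates
```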
What to Depend On
The previous sections have covered the more mechanical aspect of working with
dependencies, but there’s a more philosophical (and therefore harder-to-answer)
question: when should you take on a dependency?
Most of the time, there's not much of a decision involved: if you need the
functionality of a crate, you need that functionality, and the only alternative would
be to write it yourself.
But every new dependency has a cost, partly in terms of longer builds and bigger
binaries but mostly in terms of the developer effort involved in fixing problems with
dependencies when they arise.
6 If you are targeting a no_std environment, this choice may be made for you: many crates are not compatible with no_std, particularly if alloc is also unavailable.
The bigger your dependency graph, the more likely you are to be exposed to these
kinds of problems. The Rust crate ecosystem is just as vulnerable to accidental
dependency problems as other package ecosystems, where history has shown that
the removal of a single package by an individual or a team can have widespread
knock-on effects.
More worrying still are supply chain attacks, where a malicious actor deliberately
tries to subvert commonly used dependencies, whether via typosquatting or more
sophisticated attacks.
This kind of attack doesn’t just affect your compiled code—be aware that a depend‐
ency can run arbitrary code at build time, via build scripts or procedural macros.
That means that a compromised dependency could end up running a cryp‐
tocurrency miner as part of your CI system!
So for dependencies that are more “cosmetic,” it’s sometimes worth considering
whether adding the dependency is worth the cost.
The answer is usually “yes,” though; in the end, the amount of time spent dealing with
dependency problems ends up being much less than the time it would take to write
equivalent functionality from scratch.
Things to Remember
• Crate names on crates.io form a single flat namespace (which is shared with
feature names).
• Crate names can include a hyphen, but it will appear as an underscore in code.
• Cargo supports multiple versions of the same crate in the dependency graph, but
only if they are of different semver-incompatible versions. This can go wrong for
crates that include FFI code.
• Prefer to allow semver-compatible versions of dependencies ("1", or "1.4.23" to
include a minimum version).
• Use Cargo.lock files to ensure your builds are repeatable, but remember that the
Cargo.lock file does not ship with a published crate.
• Use tooling (cargo tree, cargo deny, cargo udep, …) to help find and fix
dependency problems.
• Understand that pulling in dependencies saves you writing code but doesn’t come
for free.
Item 26: Be wary of feature creep
Rust allows the same codebase to support a variety of different configurations via
Cargo’s feature mechanism, which is built on top of a lower-level mechanism for con‐
ditional compilation. However, the feature mechanism has a few subtleties to be
aware of, which this Item explores.
Conditional Compilation
Rust provides support for conditional compilation, which is controlled by cfg (and
cfg_attr) attributes. These attributes govern whether the thing—function, line, block,
etc.—that they are attached to is included in the compiled source code or not
(which is in contrast to C/C++’s line-based preprocessor). The conditional inclusion
is controlled by configuration options that are either plain names (e.g., test) or pairs
of names and values (e.g., panic = "abort").
Note that the name/value variants of config options are multivalued—it’s possible to
set more than one value for the same name:
// Build with `RUSTFLAGS` set to:
// '--cfg myname="a" --cfg myname="b"'
#[cfg(myname = "a")]
println!("cfg(myname = 'a') is set");
#[cfg(myname = "b")]
println!("cfg(myname = 'b') is set");
cfg(myname = 'a') is set
cfg(myname = 'b') is set
Other than the feature values described in this section, the most commonly used
config values are those that the toolchain populates automatically, with values that
describe the target environment for the build. These include the OS (target_os),
pointer width (target_pointer_width), and endianness (target_endian). This allows for code portability, where features that are specific to some particular target are compiled in only when building for that target.
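A minimal sketch of target-based conditional compilation (the function is invented for illustration; target_os and target_pointer_width are standard toolchain-populated options):

```rust
// Exactly one of the two versions below is compiled in, depending on the
// target operating system.
#[cfg(target_os = "linux")]
fn platform_name() -> &'static str {
    "linux"
}

#[cfg(not(target_os = "linux"))]
fn platform_name() -> &'static str {
    "not linux"
}

fn main() {
    println!("Built for: {}", platform_name());

    // Statements can also be conditionally included.
    #[cfg(target_pointer_width = "64")]
    println!("This target has 64-bit pointers");
}
```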
The standard target_has_atomic option provides an example of the multivalued
nature of config values: both #[cfg(target_has_atomic = "32")] and
#[cfg(target_has_atomic = "64")] will be set for targets that support both 32-bit
and 64-bit atomic operations. (For more information on atomics, see Chapter 2 of
Mara Bos's Rust Atomics and Locks [O'Reilly].)
Features
The cargo package manager builds on this base cfg name/value mechanism to
provide the concept of features: named aspects of a crate that can be enabled when
building the crate. Cargo ensures that the feature option is
populated with each of the configured values for each crate that it compiles, and the
values are crate-specific.
This is Cargo-specific functionality: to the Rust compiler, feature is just another
configuration option.
At the time of writing, the most reliable way to determine what features are available
for a crate is to examine the crate’s manifest file. For example, the following chunk of a manifest file includes six features:
[features]
default = ["featureA"]
featureA = []
featureB = []
# Enabling `featureAB` also enables `featureA` and `featureB`.
featureAB = ["featureA", "featureB"]
schema = []
[dependencies]
rand = { version = "^0.8", optional = true }
hex = "^0.4"
Given that there are only five entries in the [features] stanza, there are clearly a
couple of subtleties to watch out for.
The first is that the default line in the [features] stanza is a special feature name,
used to indicate to cargo which of the features should be enabled by default. These
features can still be disabled by passing the --no-default-features flag to the build
command, and a consumer of the crate can encode this in their Cargo.toml file like so:
[dependencies]
somecrate = { version = "^0.3", default-features = false }
However, default still counts as a feature name, which can be tested in code:
#[cfg(feature = "default")]
println!("This crate was built with the \" default\" feature enabled.");
#[cfg(not(feature = "default"))]
println!("This crate was built with the \" default\" feature disabled."); The second subtlety of feature definitions is hidden in the [dependencies] section of
the original Cargo.toml example: the rand crate is a dependency that is marked as
optional = true, and that effectively makes "rand" into the name of a feature. If the
crate is compiled with --features rand, then that dependency is activated:
#[cfg(feature = "rand")]
pub fn pick_a_number() -> u8 {
rand::random::<u8>()
}
#[cfg(not(feature = "rand"))]
pub fn pick_a_number() -> u8 {
4 // chosen by fair dice roll.
}
This also means that crate names and feature names share a namespace, even though
one is typically global (and usually governed by crates.io), and one is local to the
crate in question. Consequently, choose feature names carefully to avoid clashes with
the names of any crates that might be relevant as potential dependencies. It is possible
to work around a clash, because Cargo supports renaming dependencies (via the
package key), but it's easier not to have to.
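The workaround looks something like this (the alias name is invented for illustration):

```toml
[dependencies]
# In code this dependency is referred to as `random_crate`, leaving the
# name `rand` free for use as a feature of this crate.
random_crate = { package = "rand", version = "^0.8" }
```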
So you can determine a crate’s features by examining [features] as well as optional
[dependencies] in the crate’s Cargo.toml file. To turn on a feature of a dependency,
add the features option to the relevant line in the [dependencies] stanza of your
own manifest file:
[dependencies]
somecrate = { version = "^0.3", features = ["featureA", "rand" ] }
This line ensures that somecrate will be built with both the featureA and the rand
feature enabled. However, that might not be the only features that are enabled; other
features may also be enabled, due to a process known as feature unification, which
means that a crate will get built with the union of all of the features that are requested
by anything in the build graph. In other words, if some other dependency in the build
graph also relies on somecrate, but with just featureB enabled, then the crate will be
built with all of featureA, featureB, and rand enabled, to satisfy everyone. The same
consideration applies to default features: if your crate sets default-features =
false for a dependency but some other place in the build graph leaves the default fea‐
tures enabled, then enabled they will be.
7 This default behavior can be disabled by using a "dep:<crate>" reference elsewhere in the features stanza; see the Cargo documentation for details.
8 The cargo tree --edges features command can help with determining which features are enabled for
which crates, and why.
Feature unification means that features should be additive; it’s a bad idea to have mutually incompatible features because there’s nothing to prevent the incompatible
features being simultaneously enabled by different users.
For example, if a crate exposes a struct and its fields publicly, it’s a bad idea to make
the fields feature-dependent:
UNDESIRED BEHAVIOR
/// A structure whose contents are public, so external users can construct
/// instances of it.
#[derive(Debug)]
pub struct ExposedStruct {
pub data: Vec<u8>,
/// Additional data that is required only when the `schema` feature
/// is enabled.
#[cfg(feature = "schema")]
pub schema: String,
}
A user of the crate that tries to build an instance of the struct has a quandary:
should they fill in the schema field or not? One way to try to solve this is to add a
corresponding feature in the user’s Cargo.toml:
[features]
# The `use-schema` feature here turns on the `schema` feature of `somecrate`.
# (This example uses different feature names for clarity; real code is more
# likely to reuse the feature names across both places.)
use-schema = ["somecrate/schema"]
and to make the struct construction depend on this feature:
UNDESIRED BEHAVIOR
let s = somecrate::ExposedStruct {
data: vec![0x82, 0x01, 0x01],
// Only populate the field if we've requested
// activation of `somecrate/schema`.
#[cfg(feature = "use-schema")]
schema: "[int int]".to_string(),
};
However, this doesn’t cover all eventualities: the code will fail to compile if this code
doesn’t activate somecrate/schema but some other transitive dependency does. The
core of the problem is that only the crate that has the feature can check the feature;
there’s no way for the user of the crate to determine whether Cargo has turned on
somecrate/schema or not. As a result, you should avoid feature-gating public fields in
structures.
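One additive alternative (a sketch, not taken from the original example) is to keep the struct layout identical in every configuration and feature-gate extra functionality instead:

```rust
/// The struct has the same fields whether or not any feature is enabled,
/// so external users can always construct it the same way.
#[derive(Debug)]
pub struct ExposedStruct {
    pub data: Vec<u8>,
}

// Extra *behavior* gated behind the hypothetical `schema` feature; users
// who don't enable the feature simply don't see this method.
#[cfg(feature = "schema")]
impl ExposedStruct {
    pub fn schema(&self) -> String {
        format!("[bytes x {}]", self.data.len())
    }
}

fn main() {
    // Construction is unaffected by feature selection.
    let s = ExposedStruct { data: vec![1, 2, 3] };
    println!("{s:?}");
}
```

Because enabling the feature only adds methods, the crate remains usable under any combination of features that the rest of the build graph selects.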
A similar consideration applies to public traits, intended to be used outside the crate
they’re defined in. Consider a trait that includes a feature gate on one of its methods:
UNDESIRED BEHAVIOR
/// Trait for items that support CBOR serialization.
pub trait AsCbor: Sized {
/// Convert the item into CBOR-serialized data.
fn serialize(&self) -> Result<Vec<u8>, Error>;
/// Create an instance of the item from CBOR-serialized data.
fn deserialize(data: &[u8]) -> Result<Self, Error>;
/// Return the schema corresponding to this item.
#[cfg(feature = "schema")]
fn cddl(&self) -> String;
}
External trait implementors again have a quandary: should they implement the
cddl(&self) method or not? The external code that tries to implement the trait
doesn’t know—and can’t tell—whether to implement the feature-gated method or not.
So the net result is that you should avoid feature-gating methods on public traits. A
trait method with a default implementation might be a partial exception to this—but
only if it never makes sense for external code to override the default.
Feature unification also means that if your crate has N independent features, then
all of the 2^N possible build combinations can occur in practice. To avoid unpleasant
surprises, it's a good idea to ensure that your CI system tests all of these 2^N
combinations, in all of the available test variants.
However, the use of optional features is very helpful in controlling exposure to an
expanded dependency graph. This is particularly useful in low-level crates
that are capable of being used in a no_std environment—it's common to
have a std or alloc feature that turns on functionality that relies on those libraries.
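A common shape for this in a crate's manifest (the std/alloc feature names are conventional rather than enforced):

```toml
[features]
default = ["std"]
# The `std` feature implies `alloc`.
std = ["alloc"]
alloc = []
```

This is typically paired with a conditional crate attribute such as #![cfg_attr(not(feature = "std"), no_std)] at the top of lib.rs.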
9 Features can force other features to be enabled; in the original example, the featureAB feature forces both featureA and featureB to be enabled.
Things to Remember
• Feature names overlap with dependency names.
• Feature names should be carefully chosen so they don’t clash with potential
dependency names.
• Features should be additive.
• Avoid feature gates on public struct fields or trait methods.
• Having lots of independent features potentially leads to a combinatorial explo‐
sion of different build configurations.
CHAPTER 5
Tooling
Titus Winters (Google’s C++ library lead) describes software engineering as program‐
ming integrated over time, or sometimes as programming integrated over time and
people. Over longer timescales, and a wider team, there’s more to a codebase than just
the code held within it.
Modern languages, including Rust, are aware of this and come with an ecosystem of
tooling that goes way beyond just converting the program into executable binary
code (the compiler).
This chapter explores the Rust tooling ecosystem, with a general recommendation to
make use of all of this infrastructure. Obviously, doing so needs to be proportionate—
setting up CI, documentation builds, and six types of test would be overkill for a
throwaway program that is run only twice. But for most of the things described in
this chapter, there’s lots of “bang for the buck”: a little bit of investment into tooling
integration will yield worthwhile benefits.
Item 27: Document public interfaces
If your crate is going to be used by other programmers, then it’s a good idea to add
documentation for its contents, particularly its public API. If your crate is more than
just ephemeral, throwaway code, then that “other programmer” includes the you-of-
the-future, when you have forgotten the details of your current code.
This is not advice that's specific to Rust, nor is it new advice—for example,
Effective Java Item 44 is "Write doc comments for all exposed API elements."
The particulars of Rust's documentation comment format—Markdown-based,
delimited with /// or //!—are covered in the rustdoc documentation; for example:
/// Calculate the [`BoundingBox`] that exactly encompasses a pair
/// of [`BoundingBox`] objects.
pub fn union(a: & BoundingBox, b: & BoundingBox) -> BoundingBox {
// ...
}
However, there are some specific details about the format that are worth highlighting:
Use a code font for code
For anything that would be typed into source code as is, surround it with back-
quotes to ensure that the resulting documentation is in a fixed-width font, mak‐
ing the distinction between code and text clear.
Add copious cross-references
Add a Markdown link for anything that might provide context for someone read‐
ing the documentation. In particular, cross-reference identifiers with the conve‐
nient [`SomeThing`] syntax—if SomeThing is in scope, then the resulting
documentation will hyperlink to the right place.
Consider including example code
If it’s not trivially obvious how to use an entrypoint, adding an # Examples section
with sample code can be helpful. Note that sample code in doc comments
gets compiled and executed when you run cargo test, which helps
it stay in sync with the code it’s demonstrating.
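For instance, a sketch of such a section (the function and the crate name `mycrate` are invented for illustration, not taken from the surrounding text):

```rust
/// Doubles the given dimension value.
///
/// # Examples
///
/// ```
/// // This code is compiled and run as a doc-test by `cargo test`.
/// // (`mycrate` stands in for the real crate name.)
/// assert_eq!(mycrate::double(4), 8);
/// ```
pub fn double(x: u32) -> u32 {
    x * 2
}
```

Because the fenced code inside the comment runs as a doc-test, a stale example fails the build rather than silently rotting.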
Document panics and unsafe constraints
If there are inputs that cause a function to panic, document (in a # Panics section)
the preconditions that are required to avoid the panic!. Similarly, document
(in a # Safety section) any requirements for unsafe code.
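As a sketch of the # Panics convention (the `ratio` function here is hypothetical):

```rust
/// Returns the ratio of `numerator` to `denominator`.
///
/// # Panics
///
/// Panics if `denominator` is zero.
pub fn ratio(numerator: f64, denominator: f64) -> f64 {
    // Enforce the documented precondition explicitly.
    assert!(denominator != 0.0, "denominator must be nonzero");
    numerator / denominator
}
```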
The documentation for Rust’s standard library provides an excellent example to emulate for all of these details.
Tooling
The Markdown format that’s used for documentation comments results in elegant
output, but this also means that there is an explicit conversion step (cargo doc). This
in turn raises the possibility that something goes wrong along the way.
The simplest advice for this is just to read the rendered documentation after writing it,
by running cargo doc --open (or cargo doc --no-deps --open to restrict the
generated documentation to just the current crate).
204 | Chapter 5: Tooling
You could also check that all the generated hyperlinks are valid, but that’s a job more
suited to a machine—via the broken_intra_doc_links crate attribute:1
UNDESIRED BEHAVIOR
#![deny(broken_intra_doc_links)]
/// The bounding box for a [`Polygone`].
#[derive(Clone, Debug)]
pub struct BoundingBox {
// ...
}
With this attribute enabled, cargo doc will detect invalid links:
error: unresolved link to `Polygone`
 --> docs/src/main.rs:4:30
  |
4 | /// The bounding box for a [`Polygone`].
  |                             ^^^^^^^^ no item named `Polygone` in scope
  |
You can also require documentation, by enabling the #![warn(missing_docs)]
attribute for the crate. When this is enabled, the compiler will emit a warning for
every undocumented public item. However, there’s a risk that enabling this option
will lead to poor-quality documentation comments that are rushed out just to get the
compiler to shut up—more on this to come.
As ever, any tooling that detects potential problems should form a part of your CI
system, to catch any regressions that creep in.
Additional Documentation Locations
The output from cargo doc is the primary place where your crate is documented, but
it’s not the only place—other parts of a Cargo project can help users figure out how to
use your code.
The examples/ subdirectory of a Cargo project can hold the code for standalone
binaries that make use of your crate. These programs are built and run very similarly
to integration tests but are specifically intended to hold example code that
illustrates the correct use of your crate’s interface.
1 Historically, this option used to be called intra_doc_link_resolution_failure.
On a related note, bear in mind that the integration tests under the tests/ subdirectory
can also serve as examples for the confused user, even though their primary
purpose is to test the crate’s external interface.
Published Crate Documentation
If you publish your crate to crates.io, the documentation for your project will be
visible at docs.rs, a Rust project that builds and hosts documentation for published crates.
Note that crates.io and docs.rs are intended for slightly different audiences:
crates.io is aimed at people who are choosing what crate to use, whereas docs.rs is
intended for people figuring out how to use a crate they’ve already included
(although there’s obviously considerable overlap between the two).
As a result, the home page for a crate shows different content in each location:
docs.rs
Shows the top-level page from the output of cargo doc, as generated from //!
comments in the top-level src/lib.rs file.
crates.io
Shows the content of any top-level README.md file that’s included in the
project’s repository.2
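Concretely, that docs.rs landing page is generated from //! comments like the following sketch (the crate contents are invented; in a real crate the //! lines sit at the very top of src/lib.rs, but they are wrapped in a module here so the snippet is self-contained):

```rust
mod mycrate {
    //! A hypothetical geometry crate.
    //!
    //! Provides primitives such as bounding boxes, along with
    //! operations like union and intersection.

    /// Placeholder item so the sketch compiles as-is.
    pub fn version() -> &'static str {
        "0.1.0"
    }
}
```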
What Not to Document
When a project requires that documentation be included for all public items (as
mentioned in the first section), it’s very easy to fall into the trap of having documentation
that’s a pointless waste of valuable pixels. Having the compiler warn about missing
doc comments is only a proxy for what you really want—useful documentation—and
is likely to incentivize programmers to do the minimum needed to silence the
warning.
Good doc comments are a boon that helps users understand the code they’re using;
bad doc comments impose a maintenance burden and increase the chance of user
confusion when they get out of sync with the code. So how to distinguish between the
two?
The primary advice is to avoid repeating in text something that’s clear from the code.
An earlier Item exhorted you to encode as much semantics as possible into Rust’s type system;
once you’ve done that, allow the type system to document those semantics. Assume
2 The default behavior of automatically including README.md can be overridden with the
readme field in Cargo.toml.
that the reader is familiar with Rust—possibly because they’ve read a helpful collection
of Items describing effective use of the language—and don’t repeat things that are
clear from the signatures and types involved.
Returning to the previous example, an overly verbose documentation comment
might be as follows:
UNDESIRED BEHAVIOR
/// Return a new [`BoundingBox`] object that exactly encompasses a pair
/// of [`BoundingBox`] objects.
///
/// Parameters:
/// - `a`: an immutable reference to a `BoundingBox`
/// - `b`: an immutable reference to a `BoundingBox`
/// Returns: new `BoundingBox` object.
pub fn union(a: &BoundingBox, b: &BoundingBox) -> BoundingBox {
This comment repeats many details that are clear from the function signature, to no
benefit.
Worse, consider what’s likely to happen if the code gets refactored to store the result
in one of the original arguments. No compiler or tool complains that the comment
isn’t updated to match, so it’s easy to end up with an out-of-sync comment:
UNDESIRED BEHAVIOR
/// Return a new [`BoundingBox`] object that exactly encompasses a pair
/// of [`BoundingBox`] objects.
///
/// Parameters:
/// - `a`: an immutable reference to a `BoundingBox`
/// - `b`: an immutable reference to a `BoundingBox`
/// Returns: new `BoundingBox` object.
pub fn union(a: &mut BoundingBox, b: &BoundingBox) {
In contrast, the original comment survives the refactoring unscathed, because its text
describes behavior, not syntactic details:
/// Calculate the [`BoundingBox`] that exactly encompasses a pair
/// of [`BoundingBox`] objects.
pub fn union(a: &mut BoundingBox, b: &BoundingBox) {
The mirror image of the preceding advice also helps improve documentation: include
in text anything that’s not clear from the code. This includes preconditions, invariants,
panics, error conditions, and anything else that might surprise a user; if your code
can’t avoid surprises altogether, make sure the surprises are documented so you can at least say, “I told you so.”
Another common failure mode is when doc comments describe how some other code
uses a method, rather than what the method does:
/// Return the intersection of two [`BoundingBox`] objects, returning `None`
/// if there is no intersection. The collision detection code in `hits.rs`
/// uses this to do an initial check to see whether two objects might overlap,
/// before performing the more expensive pixel-by-pixel check in
/// `objects_overlap`.
pub fn intersection(
    a: &BoundingBox,
    b: &BoundingBox,
) -> Option<BoundingBox> {
Comments like this are almost guaranteed to get out of sync: when the using code
(here, hits.rs) changes, the comment that describes the behavior is nowhere nearby.
Rewording the comment to focus more on the why makes it more robust to future
changes:
/// Return the intersection of two [`BoundingBox`] objects, returning `None`
/// if there is no intersection. Note that intersection of bounding boxes
/// is necessary but not sufficient for object collision -- pixel-by-pixel
/// checks are still required on overlap.
pub fn intersection(
    a: &BoundingBox,
    b: &BoundingBox,
) -> Option<BoundingBox> {
When writing software, it’s good advice to “program in the future tense”:3 structure
the code to accommodate future changes. The same principle is true for documentation:
focusing on the semantics, the whys and the why nots, gives text that is more
likely to remain helpful in the long run.
Things to Remember
• Add doc comments for public API items.
• Describe aspects of the code—such as panics and safety criteria—that aren’t obvious
from the code itself.
• Don’t describe things that are obvious from the code itself.
• Make navigation clearer by providing cross-references and by making identifiers
stand out.
3 Scott Meyers, More Effective C++ (Addison-Wesley), Item 32.
Item 28: Use macros judiciously
In some cases it’s easy to decide to write a macro instead of a function, because only a macro
can do what’s needed.
—Paul Graham, On Lisp (Prentice Hall)
Rust’s macro systems allow you to perform metaprogramming: to write code that
emits code into your project. This is most valuable when there are chunks of “boilerplate”
code that are deterministic and repetitive and that would otherwise need to be
kept in sync manually.
Programmers coming to Rust may have previously encountered the macros provided
by C/C++’s preprocessor, which perform textual substitution on the tokens of the
input text. Rust’s macros are a different beast, because they work on either the parsed
tokens of the program or on the abstract syntax tree (AST) of the program, rather
than just its textual content.
This means Rust macros can be aware of code structure and can consequently avoid
entire classes of macro-related footguns. In particular, we see in the following section
that Rust’s declarative macros are hygienic—they cannot accidentally refer to (“capture”) local variables in the surrounding code.
One way to think about macros is to see them as a different level of abstraction in the
code. A simple form of abstraction is a function: it abstracts away the differences
between different values of the same type, with implementation code that can use any
of the features and methods of that type, regardless of the current value being oper‐
ated on. A generic is a different level of abstraction: it abstracts away the difference
between different types that satisfy a trait bound, with implementation code that can
use any of the methods provided by the trait bounds, regardless of the current type
being operated on.
A macro abstracts away the difference between different fragments of the program
that play the same role (type, identifier, expression, etc.); the implementation can
then include any code that makes use of those fragments in the same role.
Rust provides two ways to define macros:
• Declarative macros, also known as “macros by example,” allow the insertion of
arbitrary Rust code into the program, based on the input parameters to the
macro (which are categorized according to their role in the AST).
• Procedural macros allow the insertion of arbitrary Rust code into the program,
based on the parsed tokens of the source code. This is most commonly used for
derive macros, which can generate code based on the contents of data structure
definitions.
Declarative Macros
Although this Item isn’t a complete reference for declarative macros, a few reminders of details to watch out for are in order.
First, be aware that the scoping rules for using a declarative macro are different than
for other Rust items. If a declarative macro is defined in a source code file, only the
code after the macro definition can make use of it:
DOES NOT COMPILE
fn before() {
println!("[before] square {} is {}", 2, square!(2));
}
/// Macro that squares its argument.
macro_rules! square {
{ $e:expr } => { $e * $e }
}
fn after() {
println!("[after] square {} is {}", 2, square!(2));
}
error: cannot find macro `square` in this scope
 --> src/main.rs:4:45
  |
4 |     println!("[before] square {} is {}", 2, square!(2));
  |                                             ^^^^^^
  |
  = help: have you added the `#[macro_use]` on the module/import?
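Moving the definition above the first use (or applying #[macro_use] as the help text suggests) resolves the error; a minimal sketch of the reordered version:

```rust
/// Macro that squares its argument; defined before any use,
/// so it is in scope for the code below.
macro_rules! square {
    { $e:expr } => { $e * $e };
}

fn after() -> i32 {
    // Compiles now that the definition precedes the use.
    square!(2)
}
```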
The #[macro_export] attribute makes a macro more widely visible, but this also has
an oddity: a macro appears at the top level of a crate, even if it’s defined in a module:
mod submod {
#[macro_export]
macro_rules! cube {
{ $e:expr } => { $e * $e * $e }
}
}
mod user {
pub fn use_macro() {
// Note: *not* `crate::submod::cube!`
let cubed = crate::cube!(3);
println!("cube {} is {}", 3, cubed);
}
}
210 | Chapter 5: Tooling
Rust’s declarative macros are what’s known as hygienic: the expanded code in the body
of the macro is not allowed to make use of local variable bindings. For example, a
macro that assumes that some variable x exists:
// Create a macro that assumes the existence of a local `x`.
macro_rules! increment_x {
{} => { x += 1; };
}
will trigger a compilation failure when it is used:
DOES NOT COMPILE
let mut x = 2;
increment_x!();
println!("x = {}", x);
error[E0425]: cannot find value `x` in this scope
  --> src/main.rs:55:13
   |
55 |     {} => { x += 1; };
   |             ^ not found in this scope
...
314 |     increment_x!();
   |     -------------- in this macro invocation
   |
   = note: this error originates in the macro `increment_x`
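The usual workaround is to pass the binding into the macro explicitly as an ident fragment, so the expansion operates on the caller’s variable; a sketch (the macro name is illustrative):

```rust
// Pass the variable to be modified in as an identifier argument.
macro_rules! increment {
    { $x:ident } => { $x += 1; };
}

fn demo() -> i32 {
    let mut x = 2;
    increment!(x); // expands to `x += 1;` using the caller's `x`
    x
}
```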
This hygienic property means that Rust’s macros are safer than C preprocessor macros.
However, there are still a couple of minor gotchas to be aware of when using
them.
The first is to realize that even if a macro invocation looks like a function invocation,
it’s not. A macro generates code at the point of invocation, and that generated code
can perform manipulations of its arguments:
macro_rules! inc_item {
{ $x:ident } => { $x.contents += 1; }
}
This means that the normal intuition about whether parameters are moved or
&-referred-to doesn’t apply:
let mut x = Item { contents: 42 }; // type is not `Copy`
// Item is *not* moved, despite the (x) syntax,
// but the body of the macro *can* modify `x`.
inc_item!(x);
println!("x is {x:?}");
x is Item { contents: 43 }
This becomes clear if we remember that the macro inserts code at the point of invocation—in
this case, adding a line of code that increments x.contents. Here’s what the
compiler sees, after macro expansion:
let mut x = Item { contents: 42 };
x.contents += 1;
{
    ::std::io::_print(format_args!("x is {0:?}\n", x));
};
The expanded code includes the modification in place, via the owner of the item, not
a reference. (It’s also interesting to see the expanded version of println!, which relies
on the format_args! macro, to be discussed shortly.4)
So the exclamation mark serves as a warning: the expanded code for the macro may
do arbitrary things to or with its arguments.
The expanded code can also include control flow operations that aren’t visible in the
calling code, whether they be loops, conditionals, return statements, or use of the ?
operator. Obviously, this is likely to violate the principle of least astonishment, so prefer macros whose behavior aligns with normal Rust where possible and appropriate.
(On the other hand, if the purpose of the macro is to allow weird control flow, go for
it! But help out your users by making sure the control flow behavior is clearly
documented.)
For example, consider a macro (for checking HTTP status codes) that silently
includes a return in its body:
/// Check that an HTTP status is successful; exit function if not.
macro_rules! check_successful {
{ $e:expr } => {
if $e.group() != Group::Successful {
return Err(MyError("HTTP operation failed"));
}
}
}
Code that uses this macro to check the result of some kind of HTTP operation can
end up with control flow that’s somewhat obscure:
let rc = perform_http_operation();
check_successful!(rc); // may silently exit the function
// ...
4 An eagle-eyed reader might notice that format_args! still looks like a macro invocation, even after macros have been expanded. That’s because it’s a special macro that’s built into the compiler.
An alternative version of the macro that generates code that emits a Result:
/// Convert an HTTP status into a `Result<(), MyError>` indicating success.
macro_rules! check_success {
{ $e:expr } => {
match $e.group() {
Group::Successful => Ok(()),
_ => Err(MyError("HTTP operation failed")),
}
}
}
gives code that’s easier to follow:
let rc = perform_http_operation();
check_success!(rc)?; // error flow is visible via `?`
// ...
The second thing to watch out for with declarative macros is a problem shared with
the C preprocessor: if the argument to a macro is an expression with side effects,
beware of repeated use of the argument in the macro. The square! macro defined
earlier takes an arbitrary expression as an argument and then uses that argument
twice, which can lead to surprises:
UNDESIRED BEHAVIOR
let mut x = 1;
let y = square!({
x += 1;
x
});
println!("x = {x}, y = {y}");
// output: x = 3, y = 6
Assuming that this behavior isn’t intended, one way to fix it is simply to evaluate the
expression once and assign the result to a local variable:
macro_rules! square_once {
{ $e:expr } => {
{
let x = $e;
x*x // Note: there's a detail here to be explained later...
}
}
}
// output now: x = 2, y = 4
The other alternative is not to allow an arbitrary expression as input to the macro. If
the macro’s pattern instead uses the ident fragment specifier, then the macro will only accept identifiers as inputs, and the attempt to feed it an arbitrary
expression will no longer compile.
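A sketch of that restriction, using a hypothetical square_ident! variant:

```rust
// Only identifiers are accepted; an invocation such as
// `square_ident!({ x += 1; x })` fails to compile, so the
// double-evaluation surprise cannot arise.
macro_rules! square_ident {
    { $i:ident } => { $i * $i };
}

fn demo() -> i32 {
    let x = 3;
    square_ident!(x)
}
```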
Formatting Values
One common style of declarative macro involves assembling a message that includes
various values from the current state of the code. For example, the standard library
includes format! for assembling a String, println! for printing to standard output,
eprintln! for printing to standard error, and so on. The std::fmt module documentation describes the syntax of the formatting directives, which are roughly equivalent to C’s printf statement. However, the format arguments are type safe and checked at compile time, and
the implementations of the macro use the Display and Debug traits described in an
earlier Item.
You can (and should) use the same formatting syntax for any macros of your own that
perform a similar function. For example, the logging macros provided by the log
crate use the same syntax as format!. To do this, use format_args! for macros that perform argument formatting rather than attempting to reinvent the wheel:
/// Log an error including code location, with `format!`-like arguments.
/// Real code would probably use the `log` crate.
macro_rules! my_log {
{ $($arg:tt)+ } => {
eprintln!("{}:{}: {}", file!(), line!(), format_args!($($arg)+));
}
}
let x = 10u8;
// Format specifiers:
// - `x` says print as hex
// - `#` says prefix with '0x'
// - `04` says add leading zeroes so width is at least 4
// (this includes the '0x' prefix).
my_log!("x = {:#04x}", x);
src/main.rs:331: x = 0x0a
5 The formatting traits are used when displaying data in particular formats. For example, the x format specifier indicates that lowercase hexadecimal output is required.
Procedural Macros
Rust also supports procedural macros, often known as proc macros. Like a declarative
macro, a procedural macro has the ability to insert arbitrary Rust code into the program’s source code. However, the inputs to the macro are no longer just the specific
arguments passed to it; instead, a procedural macro has access to the parsed tokens
corresponding to some chunk of the original source code. This gives a level of expressive
power that approaches the flexibility of dynamic languages such as Lisp—but still
with compile-time guarantees. It also helps mitigate the limitations of reflection in
Rust, as discussed in an earlier Item.
Procedural macros must be defined in a separate crate (of crate type proc-macro)
from where they are used, and that crate will almost certainly need to depend on
proc-macro (provided by the standard toolchain) or proc-macro2 (provided by David Tolnay) as a support library, to make it possible to work with the input tokens.
There are three distinct types of procedural macro:
Function-like macros
Invoked with an argument
Attribute macros
Attached to some chunk of syntax in the program
Derive macros
Attached to the definition of a data structure
Function-like macros
Function-like procedural macros are invoked with an argument, and the macro definition
has access to the parsed tokens that make up the argument, and emits arbitrary
tokens as a result. Note that the previous sentence says “argument,” singular—even if
a function-like macro is invoked with what looks like multiple arguments:
my_func_macro!(15, x + y, f32::consts::PI);
the macro itself receives a single argument, which is a stream of parsed tokens. A
macro implementation that just prints (at compile time) the contents of the stream:
use proc_macro::TokenStream;
// Function-like macro that just prints (at compile time) its input stream.
#[proc_macro]
pub fn my_func_macro(args: TokenStream) -> TokenStream {
println!("Input TokenStream is:");
for tt in args {
println!(" {tt:?}");
}
// Return an empty token stream to replace the macro invocation with.
TokenStream::new()
}
Compiling a program that uses this macro shows the stream corresponding to the input:
Input TokenStream is:
Literal { kind: Integer, symbol: "15", suffix: None,
span: #0 bytes(10976..10978) }
Punct { ch: ',', spacing: Alone, span: #0 bytes(10978..10979) }
Ident { ident: "x", span: #0 bytes(10980..10981) }
Punct { ch: '+', spacing: Alone, span: #0 bytes(10982..10983) }
Ident { ident: "y", span: #0 bytes(10984..10985) }
Punct { ch: ',', spacing: Alone, span: #0 bytes(10985..10986) }
Ident { ident: "f32", span: #0 bytes(10987..10990) }
Punct { ch: ':', spacing: Joint, span: #0 bytes(10990..10991) }
Punct { ch: ':', spacing: Alone, span: #0 bytes(10991..10992) }
Ident { ident: "consts", span: #0 bytes(10992..10998) }
Punct { ch: ':', spacing: Joint, span: #0 bytes(10998..10999) }
Punct { ch: ':', spacing: Alone, span: #0 bytes(10999..11000) }
Ident { ident: "PI", span: #0 bytes(11000..11002) }
The low-level nature of this input stream means that the macro implementation has
to do its own parsing. For example, separating out what appear to be separate arguments
to the macro involves looking for TokenTree::Punct tokens that hold the
commas dividing the arguments. The syn crate (from David Tolnay) provides a parsing library that can help with this, as a later section describes.
Because of this, it’s usually easier to use a declarative macro than a function-like
procedural macro, because the expected structure of the macro’s inputs can be expressed
in the matching pattern.
The flip side of this need for manual processing is that function-like proc macros
have the flexibility to accept inputs that don’t parse as normal Rust code. That’s not
often needed (or sensible), so function-like macros are comparatively rare as a result.
Attribute macros
Attribute macros are invoked by placing them before some item in the program, and
the parsed tokens for that item are the input to the macro. The macro can again emit
arbitrary tokens as output, but the output is typically some transformation of the
input.
For example, an attribute macro can be used to wrap the body of a function:
#[log_invocation]
fn add_three(x: u32) -> u32 {
x + 3
}
so that invocations of the function are logged:
let x = 2;
let y = add_three(x);
println!("add_three({x}) = {y}");
log: calling function 'add_three'
log: called function 'add_three' => 5
add_three(2) = 5
The implementation of this macro is too large to include here, because the code needs
to check the structure of the input tokens and to build up the new output tokens, but
the syn crate can again help with this processing.
Derive macros
The final type of procedural macro is the derive macro, which allows generated code
to be automatically attached to a data structure definition (a struct, enum, or union).
This is similar to an attribute macro but there are a few derive-specific aspects to be
aware of.
The first is that derive macros add to the input tokens, instead of replacing them
altogether. This means that the data structure definition is left intact but the macro
has the opportunity to append related code.
The second is that a derive macro can declare associated helper attributes, which can
then be used to mark parts of the data structure that need special processing. For
example, serde’s derive macro has a serde helper attribute that can provide metadata to guide the deserialization process:
fn generate_value() -> String {
"unknown".to_string()
}
#[derive(Debug, Deserialize)]
struct MyData {
// If `value` is missing when deserializing, invoke
// `generate_value()` to populate the field instead.
#[serde(default = "generate_value")]
value: String,
}
The final aspect of derive macros to be aware of is that the syn crate can take care of much of the heavy lifting involved in parsing the input tokens into the equivalent
nodes in the AST: the syn::parse_macro_input! macro converts the tokens into a
DeriveInput data structure that describes the content of the item, and DeriveInput is much easier to deal with than a raw stream of tokens.
In practice, derive macros are the most commonly encountered type of procedural
macro—the ability to generate field-by-field (for structs) or variant-by-variant (for
enums) implementations allows for a lot of functionality to be provided with little
effort from the programmer—for example, by adding a single line like
#[derive(Debug, Clone, PartialEq, Eq)].
Because the derived implementations are auto-generated, it also means that the
implementations automatically stay in sync with the data structure definition. For
example, if you were to add a new field to a struct, a manual implementation of
Debug would need to be manually updated, whereas an automatically derived version
would display the new field with no additional effort (or would fail to compile if that
wasn’t possible).
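A small sketch of this effect with the standard Debug derive (the Point type is invented for illustration):

```rust
// The derived implementation tracks the struct definition automatically.
#[derive(Debug, Clone, PartialEq)]
struct Point {
    x: i32,
    y: i32,
    // A field added here would appear in the derived `Debug` output
    // with no further work; a handwritten `impl Debug` would silently
    // omit it until manually updated.
}

fn debug_repr() -> String {
    format!("{:?}", Point { x: 1, y: 2 })
}
```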
When to Use Macros
The primary reason to use macros is to avoid repetitive code—especially repetitive
code that would otherwise have to be manually kept in sync with other parts of the
code. In this respect, writing a macro is just an extension of the same kind of generalization
process that normally forms part of programming:
• If you repeat exactly the same code for multiple values of a specific type, encapsulate
that code into a common function and call the function from all of the
repeated places.
• If you repeat exactly the same code for multiple types, encapsulate that code into
a generic with a trait bound and use the generic from all of the repeated places.
• If you repeat the same structure of code in multiple places, encapsulate that code
into a macro and use the macro from all of the repeated places.
For example, avoiding repetition for code that works on different enum variants can
be done only by a macro:
enum Multi {
Byte(u8),
Int(i32),
Str(String),
}
/// Extract copies of all the values of a specific enum variant.
#[macro_export]
macro_rules! values_of_type {
{ $values:expr, $variant:ident } => {
{
let mut result = Vec::new();
for val in $values {
if let Multi::$variant(v) = val {
result.push(v.clone());
}
}
result
}
}
}
fn main() {
let values = vec![
Multi::Byte(1),
Multi::Int(1000),
Multi::Str("a string".to_string()),
Multi::Byte(2),
];
let ints = values_of_type!(&values, Int);
println!("Integer values: {ints:?}");
let bytes = values_of_type!(&values, Byte);
println!("Byte values: {bytes:?}");
// Output:
// Integer values: [1000]
// Byte values: [1, 2]
}
Another scenario where macros help avoid manual repetition is when information
about a collection of data values would otherwise be spread out across different areas
of the code.
For example, consider a data structure that encodes information about HTTP status
codes; a macro can help keep all of the related information together:
// http.rs module
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Group {
Informational, // 1xx
Successful, // 2xx
Redirection, // 3xx
ClientError, // 4xx
ServerError, // 5xx
}
// Information about HTTP response codes.
http_codes! {
Continue => (100, Informational, "Continue"),
SwitchingProtocols => (101, Informational, "Switching Protocols"),
// ...
Ok => (200, Successful, "Ok"),
Created => (201, Successful, "Created"),
// ...
}
The macro invocation holds all the related information—numeric value, group,
description—for each HTTP status code, acting as a kind of domain-specific language
(DSL) holding the source of truth for the data.
The macro definition then describes the generated code; each line of the form
$( ... )+ expands to multiple lines in the generated code, one per argument to the
macro:
macro_rules! http_codes {
{ $( $name:ident => ($val:literal, $group:ident, $text:literal), )+ } => {
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
#[repr(i32)]
enum Status {
$( $name = $val, )+
}
impl Status {
fn group(&self) -> Group {
match self {
$( Self::$name => Group::$group, )+
}
}
fn text(&self) -> &'static str {
match self {
$( Self::$name => $text, )+
}
}
}
impl core::convert::TryFrom<i32> for Status {
type Error = ();
fn try_from(v: i32) -> Result<Self, Self::Error> {
match v {
$( $val => Ok(Self::$name), )+
_ => Err(())
}
}
}
}
}
As a result, the overall output from the macro takes care of generating all of the code
that derives from the source-of-truth values:
• The definition of an enum holding all the variants
• The definition of a group() method, which indicates which group an HTTP
status belongs to
• The definition of a text() method, which maps a status to a text description
• An implementation of TryFrom<i32> to convert numbers to status enum values
If an extra value needs to be added later, all that’s needed is a single additional line:
ImATeapot => (418, ClientError, "I'm a teapot"),
Without the macro, four different places would have to be manually updated. The
compiler would point out some of them (because match expressions need to cover all
cases) but not all—TryFrom<i32> could easily be forgotten.
Because macros are expanded in place in the invoking code, they can also be used to
automatically emit additional diagnostic information—in particular, by using the
standard library’s file! and line! macros, which emit source code location information:
macro_rules! log_failure {
{ $e:expr } => {
{
let result = $e;
if let Err(err) = &result {
eprintln!("{}:{}: operation '{}' failed: {:?}",
file!(),
line!(),
stringify!($e),
err);
}
result
}
}
}
When failures occur, the log file then automatically includes details of what failed and
where:
use std::convert::TryInto;
let x: Result<u8, _> = log_failure!(512.try_into()); // too big for `u8`
let y = log_failure!(std::str::from_utf8(b"\xc3\x28")); // invalid UTF-8
src/main.rs:340: operation '512.try_into()' failed: TryFromIntError(())
src/main.rs:341: operation 'std::str::from_utf8(b"\xc3\x28")' failed:
Utf8Error { valid_up_to: 0, error_len: Some(1) }
Disadvantages of Macros
The primary disadvantage of using a macro is the impact that it has on code readability
and maintainability. As described earlier, macros allow
you to create a DSL to concisely express key features of your code and data. However,
this means that anyone reading or maintaining the code now has to understand this
DSL—and its implementation in macro definitions—in addition to understanding
Rust. For example, the http_codes! example in the previous section creates a Rust
enum named Status, but it’s not visible in the DSL used for the macro invocation.
This potential impenetrability of macro-based code extends beyond other engineers:
various tools that analyze and interact with Rust code may treat the code as opaque,
because it no longer follows the syntactical conventions of Rust code. The
square_once! macro shown earlier provided one trivial example of this: the body of
the macro has not been formatted according to the normal rustfmt rules:
{
let x = $e;
// The `rustfmt` tool doesn't really cope with code in
// macros, so this has not been reformatted to `x * x`.
x*x
}
Another example is the earlier http_codes! macro, where the DSL uses Group enum
variant names like Informational with neither a Group:: prefix nor a use statement,
which may confuse some code navigation tools.
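To make this concrete, here is a hypothetical sketch (with invented details, not the exact `http_codes!` definition discussed in the text) of a declarative macro whose DSL never mentions the enum it generates:

```rust
// Hypothetical sketch with invented details -- not the book's exact
// `http_codes!` definition. The DSL below never names the `Status`
// enum that the macro generates.
macro_rules! http_codes {
    { $( $group:ident => { $( $name:ident = $code:literal, )* } )* } => {
        #[derive(Debug, Clone, Copy, PartialEq, Eq)]
        enum Status {
            $( $( $name = $code, )* )*
        }
    };
}

http_codes! {
    Informational => { Continue = 100, }
    Successful => { Ok = 200, }
    ClientError => { NotFound = 404, }
}

fn main() {
    // `Status` exists even though the invocation above never mentions it.
    let s = Status::NotFound;
    println!("{:?} = {}", s, s as i32); // prints "NotFound = 404"
}
```

A reader (or a code navigation tool) that searches the invocation for `Status` finds nothing, which is exactly the impenetrability the text describes.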
Even the compiler itself is less helpful: its error messages don’t always follow the chain of macro use and definition. (However, there are parts of the tooling ecosystem that can help with this, such as David Tolnay’s cargo-expand, used earlier.)
Another possible downside for macro use is the possibility of code bloat—a single
line of macro invocation can result in hundreds of lines of generated code, which will
be invisible to a cursory survey of the code. This is less likely to be a problem when
the code is first written, because at that point the code is needed and saves the
humans involved from having to write it themselves. However, if the code subse‐
quently stops being necessary, it’s not so obvious that there are large amounts of code
that could be deleted.
Advice
Although the previous section listed some downsides of macros, they are still funda‐
mentally the right tool for the job when there are different chunks of code that need
to be kept consistent but that cannot be coalesced any other way: use a macro when‐
ever it’s the only way to ensure that disparate code stays in sync.
Macros are also the tool to reach for when there’s boilerplate code to be squashed: use
a macro for repeated boilerplate code that can’t be coalesced into a function or a
generic.
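As an illustration of that second case, here is a hedged sketch (all names invented for the example): three trait implementations that differ only in the type and its label cannot be merged into one generic `impl`, but a macro squashes the repetition.

```rust
// Hypothetical sketch with invented names: repeated impls that can't
// be coalesced into one generic impl, squashed with a macro.
trait Describe {
    fn describe(&self) -> String;
}

macro_rules! impl_describe {
    ($($t:ty => $label:literal),* $(,)?) => {
        $(
            impl Describe for $t {
                fn describe(&self) -> String {
                    // `concat!` builds the format string at compile time.
                    format!(concat!($label, ": {}"), self)
                }
            }
        )*
    };
}

impl_describe!(u32 => "u32", i64 => "i64", f64 => "f64");

fn main() {
    println!("{}", 7u32.describe()); // prints "u32: 7"
    println!("{}", (-3i64).describe()); // prints "i64: -3"
}
```

A single generic `impl<T: Display> Describe for T` could not embed a different label per type, so the macro is the tool that keeps the three impls in sync.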
To reduce the impact on readability, try to avoid syntax in your macros that clashes
with Rust’s normal syntax rules; either make the macro invocation look like normal
code or make it look sufficiently different so that no one could confuse the two. In
particular, follow these guidelines:
• Avoid macro expansions that insert references where possible—a macro invocation
like my_macro!(&list) aligns better with normal Rust code than my_macro!(list) would.
• Prefer to avoid nonlocal control flow operations in macros so that anyone reading
the code is able to follow the flow without needing to know the details of the
macro.
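For example (a hypothetical sketch with invented names), a macro that hides an early return makes control flow impossible to follow at the call site without reading the macro definition:

```rust
use std::collections::HashMap;

// Hypothetical sketch with invented names: `check!` performs nonlocal
// control flow -- it can return from the *enclosing* function, which
// is invisible at the invocation site.
macro_rules! check {
    ($e:expr) => {
        match $e {
            Some(v) => v,
            None => return Err("missing value"), // hidden early return
        }
    };
}

fn lookup(map: &HashMap<&str, i32>, key: &str) -> Result<i32, &'static str> {
    // Reads like a plain expression, but may return from `lookup`.
    let v = check!(map.get(key).copied());
    Ok(v * 2)
}

fn main() {
    let mut map = HashMap::new();
    map.insert("a", 21);
    println!("{:?}", lookup(&map, "a")); // prints Ok(42)
    println!("{:?}", lookup(&map, "b")); // prints Err("missing value")
}
```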
This preference for Rust-like readability sometimes affects the choice between declarative macros and procedural macros. If you need to emit code for each field of a structure, or each variant of an enum, prefer a derive macro to a procedural macro that emits a type (despite the example shown earlier); it’s more idiomatic and makes the code easier to read.
However, if you’re adding a derive macro with functionality that’s not specific to your project, check whether an external crate already provides what you need. For example, the problem of converting integer values into the appropriate variant of a C-like enum is one that existing crates already solve.
Item 29: Listen to Clippy
It looks like you’re writing a letter. Would you like help?
—Microsoft Clippit
Another Item in this chapter describes the ecosystem of helpful tools available in the Rust toolbox, but one tool is sufficiently helpful and important to be promoted to an Item of its very own: Clippy.
Clippy is an additional component for Cargo (cargo clippy) that emits warnings
about your Rust usage, across a variety of categories:
Correctness
Warns about common programming errors
Idiom
Warns about code constructs that aren’t quite in standard Rust style
Concision
Points out variations on the code that are more compact
Performance
Suggests alternatives that avoid unnecessary processing or allocation
Readability
Describes alterations to the code that would make it easier for humans to read
and understand
For example, the following code builds fine:
pub fn circle_area(radius: f64) -> f64 {
    let pi = 3.14;
    pi * radius * radius
}
but Clippy points out that the local approximation to π is unnecessary and inaccu‐
rate:
error: approximate value of `f{32, 64}::consts::PI` found
--> src/main.rs:5:18
|
5 | let pi = 3.14;
| ^^^^
|
= help: consider using the constant directly
= help: for further information visit
https://rust-lang.github.io/rust-clippy/master/index.html#approx_constant
= note: `#[deny(clippy::approx_constant)]` on by default
The linked webpage explains the problem and points the way to a suitable modifica‐
tion of the code:
pub fn circle_area(radius: f64) -> f64 {
    std::f64::consts::PI * radius * radius
}
As shown previously, each Clippy warning comes with a link to a webpage describing
the error, which explains why the code is considered bad. This is vital, because it
allows you to decide whether those reasons apply to your code or whether there is
some particular reason why the lint check isn’t relevant. In some cases, the text also
describes known problems with the lint, which might explain an otherwise confusing
false positive.
If you decide that a lint warning isn’t relevant for your code, you can disable it either for that particular item (#[allow(clippy::some_lint)]) or for the entire crate (#![allow(clippy::some_lint)], with an extra !, at the top level). However, it’s usually
better to take the cost of a minor refactoring of the code than to waste time and
energy arguing about whether the warning is a genuine false positive.
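As a sketch of the item-level form, using the `approx_constant` lint from the earlier example:

```rust
// Sketch: silencing one Clippy lint for a single item; the lint stays
// active everywhere else in the crate. (Plain `rustc` also accepts the
// `clippy::` tool-lint attribute, so this compiles without Clippy.)
#[allow(clippy::approx_constant)]
fn rough_circle_area(radius: f64) -> f64 {
    let pi = 3.14; // deliberately approximate; the lint above is silenced
    pi * radius * radius
}

fn main() {
    println!("{}", rough_circle_area(1.0)); // prints 3.14
}
```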
Whether you choose to fix or disable the warnings, you should make your code Clippy-warning free. That way, when new warnings appear—whether because the code has been changed or because Clippy has been upgraded to include new checks—they will be obvious. Clippy should also be enabled in your CI system.
Clippy’s warnings are particularly helpful when you’re learning Rust, because they
reveal gotchas you might not have noticed and help you become familiar with Rust
idiom.
Many of the Items in this book also have corresponding Clippy warnings, when it’s
possible to mechanically check the relevant concern:
• Using richer types than plain bools is suggested elsewhere, and Clippy will also point out the use of multiple bools in function parameters and in structures.
• Manipulations of Option and Result types are covered elsewhere, and Clippy points out a few possible redundancies, such as the following:
—
—
• Errors should be returned to the caller where possible;
Clippy
• suggests implementing From rather than Into
• ult) warnings for the
following:
—
—
—
—
—
—
• Fat pointer types are described elsewhere, and various Clippy lints point out scenarios where there are unnecessary extra pointer indirections:
—
—
—
• describes the myriad ways to manipulate Iterator instances; Clippy
.
• describes Rust’s standard traits and included some implementation
requirements that Clippy checks:
— .
—
— .
— .
• suggests or rela, which Clippy also detects.
• Importing a wildcard version of a crate isn’t sensible; Clippy’s wildcard_dependencies lint checks for this.
• Avoiding wildcard imports is likewise covered by a Clippy lint (wildcard_imports).
• I touch on the fact that multiple versions of the same crate can appear in your dependency graph.
• Cargo features should be additive, and Clippy includes a warning about negative feature names (e.g., "no_std") that are likely to indicate a feature that falls foul of this.
• A crate’s optional dependencies form part of its feature set; Clippy warns about features that could just make use of this instead.
• There are conventions for documentation comments, and Clippy will also point out the following:
—
—
As the size of this list should make clear, it can be a valuable learning experience to read the full list of Clippy’s lint checks—including the checks that are disabled by default because they are overly pedantic or because they have a high rate of false positives.
Even though you’re unlikely to want to enable these warnings for your code, under‐
standing the reasons why they were written in the first place will improve your
understanding of Rust and its idiom.
Item 30: Write more than unit tests
All companies have test environments.
The lucky ones have production environments separate from the test environment.
—
Like most other modern languages, Rust includes features that make it easy to write tests that live alongside your code and that give you confidence that the code is working correctly.
This isn’t the place to expound on the importance of tests; suffice it to say that if code
isn’t tested, it probably doesn’t work the way you think it does. So this Item assumes
that you’re already signed up to write tests for your code.
Unit tests and integration tests, described in the next two sections, are the key forms
of tests. However, the Rust toolchain, and extensions to the toolchain, allow for vari‐
ous other types of tests. This Item describes their distinct logistics and rationales.
Unit Tests
The most common form of test for Rust code is a unit test, which might look some‐
thing like this:
// ... (code defining `nat_subtract*` functions for natural
// number subtraction)

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_nat_subtract() {
        assert_eq!(nat_subtract(4, 3).unwrap(), 1);
        assert_eq!(nat_subtract(4, 5), None);
    }

    #[should_panic]
    #[test]
    fn test_something_that_panics() {
        nat_subtract_unchecked(4, 5);
    }
}
Some aspects of this example will appear in every unit test:
• A collection of unit test functions.
• Each test function is marked with the #[test] attribute.
• The module holding the test functions is annotated with a #[cfg(test)]
attribute, so the code gets built only in test configurations.
Other aspects of this example illustrate things that are optional and may be relevant
only for particular tests:
• The test code here is held in a separate module, conventionally called tests or
test. This module may be inline (as here) or held in a separate tests.rs file. Using
a separate file for the test module has the advantage that it’s easier to spot
whether code that uses a function is test code or “real” code.
• The test module might have a wildcard use super::* to pull in everything from
the parent module under test. This makes it more convenient to add tests (and is