up code to have a wildcard import from an internal module:
mod thing;
pub use thing::*;
However, there’s another common exception where wildcard imports make sense.
Some crates have a convention that common items for the crate are re-exported from
a prelude module, which is explicitly intended to be wildcard imported:
use thing::prelude::*;
Although in theory the same concerns apply in this case, in practice such a prelude
module is likely to be carefully curated, and higher convenience may outweigh a
small risk of future problems.
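For crate authors, the prelude convention can be sketched like this (the module and type names here are invented for illustration):

```rust
// Hypothetical crate layout: the author curates a small `prelude` module
// that re-exports the handful of items most users will want.
mod geometry {
    #[derive(Debug, PartialEq)]
    pub struct Point {
        pub x: f64,
        pub y: f64,
    }
}

pub mod prelude {
    // Deliberately small, curated list of re-exports.
    pub use crate::geometry::Point;
}

// A user of the crate opts in with a single wildcard import.
use prelude::*;

fn main() {
    let p = Point { x: 1.0, y: 2.0 };
    assert_eq!(p, Point { x: 1.0, y: 2.0 });
}
```

Because the prelude is curated by the crate author, the set of names it injects is small and stable, which is what makes the wildcard tolerable here.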
Finally, if you don’t follow the advice in this Item, consider pinning dependencies that
you wildcard import to a precise version, so that minor version upgrades of the dependency aren't automatically allowed.
Item 23: Avoid wildcard imports | 187
Item 24: Re-export dependencies whose types appear in your API
The title of this Item is a little convoluted, but working through an example will make
things clearer. Item 25 describes how Cargo supports different versions of the same library crate
being linked into a single binary, in a transparent manner. Consider a binary that uses
the rand crate—more specifically, one that uses some 0.8 version of the crate:
# Cargo.toml file for a top-level binary crate.
[dependencies]
# The binary depends on the `rand` crate from crates.io.
rand = "=0.8.5"
# It also depends on some other crate (`dep-lib`).
dep-lib = "0.1.0"
// Source code:
let mut rng = rand::thread_rng(); // rand 0.8
let max: usize = rng.gen_range(5..10);
let choice = dep_lib::pick_number(max);
The final line of code also uses a notional dep-lib crate as another dependency. This
crate might be another crate from crates.io, or it could be a local crate that is
located via a Cargo path dependency.
This dep-lib crate internally uses a 0.7 version of the rand crate:
# Cargo.toml file for the `dep-lib` library crate.
[dependencies]
# The library depends on the `rand` crate from crates.io.
rand = "=0.7.3"
// Source code:
//! The `dep-lib` crate provides number-picking functionality.
use rand::Rng;
/// Pick a number between 0 and n (exclusive).
pub fn pick_number(n: usize) -> usize {
rand::thread_rng().gen_range(0, n)
}
An eagle-eyed reader might notice a difference between the two code examples:
• In version 0.7.x of rand (as used by the dep-lib library crate), the gen_range
method takes two parameters, low and high.
• In version 0.8.x of rand (as used by the binary crate), the gen_range
method takes a single range parameter.
This is not a back-compatible change, and so rand has increased its leftmost version
component accordingly, as required by semantic versioning. Nevertheless,
the binary that combines the two incompatible versions works just fine—cargo sorts
everything out.
However, things get a lot more awkward if the dep-lib library crate’s API exposes a
type from its dependency, making that dependency a public dependency.
For example, suppose that the dep-lib entrypoint involves an Rng item—but specifi‐
cally a version-0.7 Rng item:
/// Pick a number between 0 and n (exclusive) using
/// the provided `Rng` instance.
pub fn pick_number_with<R: Rng>(rng: &mut R, n: usize) -> usize {
rng.gen_range(0, n) // Method from the 0.7.x version of Rng
}
As an aside, think carefully before using another crate's types in your API: it intimately
ties your crate to that of the dependency. For example, a major version bump for the
dependency will automatically require a major version bump for your crate
too.
In this case, rand is a semi-standard crate that is widely used and pulls in only a small
number of dependencies, so including its types in the crate API is probably fine on
balance.
Returning to the example, an attempt to use this entrypoint from the top-level binary
fails:
DOES NOT COMPILE
let mut rng = rand::thread_rng();
let max: usize = rng.gen_range(5..10);
let choice = dep_lib::pick_number_with(&mut rng, max);
Unusually for Rust, the compiler error message isn't as helpful as you might hope:
error[E0277]: the trait bound `ThreadRng: rand_core::RngCore` is not satisfied
--> src/main.rs:22:44
|
22 | let choice = dep_lib::pick_number_with(&mut rng, max);
| ------------------------- ^^^^^^^^ the trait
| | `rand_core::RngCore` is not
| | implemented for `ThreadRng`
| |
| required by a bound introduced by this call
|
= help: the following other types implement trait `rand_core::RngCore`:
&'a mut R
Investigating the types involved leads to confusion because the relevant traits do
appear to be implemented—but the caller actually implements a (notional)
RngCore_v0_8_5 and the library is expecting an implementation of RngCore_v0_7_3.
Once you’ve finally deciphered the error message and realized that the version clash
is the underlying cause, how can you fix it? The key observation is to realize that
while the binary can’t directly use two different versions of the same crate, it can do so
indirectly (as in the original example shown previously).
From the perspective of the binary author, the problem could be worked around by
adding an intermediate wrapper crate that hides the naked use of rand v0.7 types. A
wrapper crate is distinct from the binary crate and so is allowed to depend on rand
v0.7 separately from the binary crate’s dependency on rand v0.8.
This is awkward, and a much better approach is available to the author of the library
crate. It can make life easier for its users by explicitly re-exporting either of the following:
• The types involved in the API
• The entire dependency crate
For this example, the latter approach works best: as well as making the version 0.7 Rng
and RngCore types available, it also makes available the methods (like thread_rng())
that construct instances of the type:
// Re-export the version of `rand` used in this crate's API.
pub use rand;
The calling code now has a different way to directly refer to version 0.7 of rand, as
dep_lib::rand:
let mut prev_rng = dep_lib::rand::thread_rng(); // v0.7 Rng instance
let choice = dep_lib::pick_number_with(&mut prev_rng, max);
With this example in mind, the advice given in the title of the Item should now be a
little less obscure: re-export dependencies whose types appear in your API.
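The other option, re-exporting just the types involved, can be sketched as follows. (This is an illustrative stand-in: the `dep` module plays the role of the real dependency crate, and the trait is invented.)

```rust
// Stand-in for the dependency crate whose trait appears in our API.
mod dep {
    pub trait Rng {
        fn next(&mut self) -> u32;
    }
}

// Re-export only the type that appears in this crate's public API, so
// callers can name it as `<this crate>::Rng` without adding their own
// (possibly version-clashing) dependency.
pub use dep::Rng;

pub fn pick<R: Rng>(rng: &mut R) -> u32 {
    rng.next()
}

// A fixed "generator", purely for demonstration purposes.
struct Fixed(u32);
impl Rng for Fixed {
    fn next(&mut self) -> u32 {
        self.0
    }
}

fn main() {
    assert_eq!(pick(&mut Fixed(7)), 7);
}
```

This keeps the public surface smaller than re-exporting the whole dependency, at the cost of not exposing helper functions like thread_rng().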
4 This kind of error can even appear when the dependency graph includes two alternatives for a crate with the same version, when something in the build graph uses the crate via a local directory instead of a crates.io location.
Item 25: Manage your dependency graph
Like most modern programming languages, Rust makes it easy to pull in external
libraries, in the form of crates. Most nontrivial Rust programs use external crates, and
those crates may themselves have additional dependencies, forming a dependency
graph for the program as a whole.
By default, Cargo will download any crates named in the [dependencies] section of
your Cargo.toml file from crates.io, and find versions of those crates that match the requirements configured in Cargo.toml.
A few subtleties lurk underneath this simple statement. The first thing to notice is
that crate names from crates.io form a single flat namespace—and this global
namespace also overlaps with the names of features in a crate (see Item 26).
If you’re planning on publishing a crate on crates.io, be aware that names are gener‐
ally allocated on a first-come, first-served basis; so you may find that your preferred
name for a public crate is already taken. However, name-squatting—reserving a crate
name by preregistering an empty crate—is frowned upon, unless you really are going
to release code in the near future.
As a minor wrinkle, there’s also a slight difference between what’s allowed as a crate
name in the crates namespace and what’s allowed as an identifier in code: a crate can
be named some-crate, but it will appear in code as some_crate (with an underscore).
To put it another way: if you see some_crate in code, the corresponding crate name
may be either some-crate or some_crate.
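For instance, with a hypothetical dependency:

```toml
# Hypothetical dependency entry: the crates.io name uses a hyphen,
# but in Rust source the same crate is referenced with an underscore,
# e.g. `use some_crate::SomeType;`.
[dependencies]
some-crate = "1"
```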
The second subtlety to understand is that Cargo allows multiple semver-incompatible
versions of the same crate to be present in the build. This can seem surprising to
begin with, because each Cargo.toml file can have only a single version of any given
dependency, but the situation frequently arises with indirect dependencies: your crate
depends on some-crate version 3.x but also depends on older-crate, which in turn
depends on some-crate version 1.x.
This can lead to confusion if the dependency is exposed in some way rather than just
being used internally: the compiler will treat the two versions as being distinct
crates, but its error messages won't necessarily make that clear.
Allowing multiple versions of a crate can also go wrong if the crate includes C/C++
code accessed via Rust's FFI mechanisms. The Rust toolchain can internally
disambiguate distinct versions of Rust code, but any included C/C++ code is subject
5 It's also possible to configure Cargo to use alternate registries of crates (for example, an internal corporate registry). Each dependency entry in Cargo.toml can then use the registry key to indicate which registry a dependency should be sourced from.
to the one definition rule: there can be only a single version of any function, constant, or global variable.
There are restrictions on Cargo's multiple-version support. Cargo does not allow
multiple versions of the same crate within a semver-compatible range:
• some-crate 1.2 and some-crate 3.1 can coexist
• some-crate 1.2 and some-crate 1.3 cannot
Cargo also extends the semantic versioning rules for pre-1.0 crates so that the first
non-zero subversion counts like a major version, so a similar constraint applies:
• other-crate 0.1.2 and other-crate 0.2.0 can coexist
• other-crate 0.1.2 and other-crate 0.1.4 cannot
Cargo's resolution algorithm does the job of figuring out what versions to include. Each Cargo.toml dependency line specifies an acceptable range of versions,
according to semantic versioning rules, and Cargo takes this into account when the
same crate appears in multiple places in the dependency graph. If the acceptable
ranges overlap and are semver-compatible, then Cargo will (by default) pick the most
recent version of the crate within the overlap. If there is no semver-compatible over‐
lap, then Cargo will build multiple copies of the dependency at different versions.
Once Cargo has picked acceptable versions for all dependencies, its choices are recor‐
ded in the Cargo.lock file. Subsequent builds will then reuse the choices encoded in
Cargo.lock so that the build is stable and no new downloads are needed.
This leaves you with a choice: should you commit your Cargo.lock files into version
control? The advice is as follows:
• Things that produce a final product, namely applications and binaries, should
commit Cargo.lock to ensure a deterministic build.
• Library crates should not commit a Cargo.lock file, because it's irrelevant to any
downstream consumers of the library—they will have their own Cargo.lock file,
and the Cargo.lock file for a library crate is ignored by library users.
Even for a library crate, it can be helpful to have a checked-in Cargo.lock file to ensure
that regular builds and CI don't have a moving target. Although the promises of
semantic versioning should prevent failures in theory, mistakes happen in practice,
and it's frustrating to have builds that fail because someone
somewhere recently changed a dependency of a dependency.
However, if you version-control Cargo.lock, set up a process to handle upgrades (such as GitHub's Dependabot). If you don't, your dependencies will stay pinned to versions that get older, outdated, and potentially insecure.
Pinning versions with a checked-in Cargo.lock file doesn’t avoid the pain of handling
dependency upgrades, but it does mean that you can handle them at a time of your
own choosing, rather than immediately when the upstream crate changes. There’s
also some fraction of dependency-upgrade problems that go away on their own: a
crate that’s released with a problem often gets a second, fixed, version released in a
short space of time, and a batched upgrade process might see only the latter version.
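A batched upgrade might then look something like the following (the package name is a placeholder; cargo update and its flags are standard Cargo):

```shell
# Update everything in Cargo.lock to the newest semver-compatible versions.
cargo update

# Or update just a single dependency, optionally to a precise version.
cargo update -p some-crate
cargo update -p some-crate --precise 1.2.4
```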
The third subtlety of Cargo’s resolution process to be aware of is feature unification:
the features that get activated for a dependent crate are the union of the features
selected by different places in the dependency graph; see Item 26 for more details.
Version Specification
The version specification clause for a dependency defines a range of allowed
versions, according to semantic versioning rules:
Avoid a too-specific version dependency
Pinning to a specific version ("=1.2.3") is usually a bad idea: you don't see newer
versions (potentially including security fixes), and you dramatically narrow the
potential overlap range with other crates in the graph that rely on the same
dependency (recall that Cargo allows only a single version of a crate to be used
within a semver-compatible range). If you want to ensure that your builds use a
consistent set of dependencies, the Cargo.lock file is the tool for the job.
Avoid a too-general version dependency
It’s possible to specify a version dependency ("*") that allows any version of the dependency to be used, but it’s a bad idea. If the dependency releases a new major
version of the crate that completely changes every aspect of its API, it’s unlikely
that your code will still work after a cargo update pulls in the new version.
The most common Goldilocks specification—not too precise, not too vague—is to
allow semver-compatible versions ("1") of a crate, possibly with a specific minimum
version that includes a feature or fix that you require ("1.4.23"). Both of these ver‐
sion specifications make use of Cargo’s default behavior, which is to allow versions
that are semver-compatible with the specified version. You can make this more
explicit by adding a caret:
• A version of "1" is equivalent to "^1", which allows all 1.x versions (and so is also equivalent to "1.*").
• A version of "1.4.23" is equivalent to "^1.4.23", which allows any 1.x versions
that are larger than 1.4.23.
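To summarize the options (the crate names here are placeholders):

```toml
[dependencies]
crate-a = "1"        # any 1.x version; equivalent to "^1" and "1.*"
crate-b = "1.4.23"   # any 1.x version >= 1.4.23; equivalent to "^1.4.23"
crate-c = "=1.2.3"   # exactly 1.2.3 -- usually too specific
crate-d = "*"        # any version at all -- too general
```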
Solving Problems with Tooling
When managing dependencies, take advantage of the range of tools that are available
within the Rust ecosystem. This section describes some dependency graph problems
where tools can help.
The compiler will tell you pretty quickly if you use a dependency in your code but
don’t include that dependency in Cargo.toml. But what about the other way around? If
there’s a dependency in Cargo.toml that you don’t use in your code—or more likely, no
longer use in your code—then Cargo will go on with its business. The cargo-udeps
tool is designed to solve exactly this problem: it warns you when your Cargo.toml
includes an unused dependency (“udep”).
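Typical usage looks like the following; note that, at the time of writing, cargo-udeps needs a nightly toolchain to run:

```shell
# Install the external cargo-udeps tool, then scan for unused dependencies.
cargo install cargo-udeps
cargo +nightly udeps
```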
A more versatile tool is cargo-deny, which analyzes your dependency graph to detect a variety of potential problems across the full set of transitive dependencies:
• Dependencies that have known security problems in the included version
• Dependencies that are covered by an unacceptable license
• Dependencies that are just unacceptable
• Dependencies that are included in multiple different versions across the depend‐
ency tree
Each of these features can be configured and can have exceptions specified. The
exception mechanism is usually needed for larger projects, particularly the multiple-
version warning: as the dependency graph grows, so does the chance of transitively
depending on different versions of the same crate. It’s worth trying to reduce these
duplicates where possible—for binary-size and compilation-time reasons if nothing
else—but sometimes there is no possible combination of dependency versions that
can avoid a duplicate.
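A cargo-deny configuration lives in a deny.toml file; a minimal sketch might look like this (the exact keys evolve over time, so check the tool's current documentation before relying on them):

```toml
# Sketch of a deny.toml configuration for cargo-deny.
[licenses]
# Only these licenses are acceptable for dependencies.
allow = ["MIT", "Apache-2.0"]

[bans]
# Warn (rather than fail) when a crate appears at multiple versions.
multiple-versions = "warn"
```

The checks are then run with cargo deny check.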
These tools can be run as a one-off, but it’s better to ensure they’re executed regularly
and reliably by including them in your CI system. This helps to catch newly
introduced problems—including problems that may have been introduced outside of
your code, in an upstream dependency (for example, a newly reported vulnerability).
If one of these tools does report a problem, it can be difficult to figure out exactly
where in the dependency graph the problem arises. The cargo tree command that's included with cargo helps here, as it shows the dependency graph as a tree structure:
dep-graph v0.1.0
├── dep-lib v0.1.0
│ └── rand v0.7.3
│ ├── getrandom v0.1.16
│ │ ├── cfg-if v1.0.0
│ │ └── libc v0.2.94
│ ├── libc v0.2.94
│ ├── rand_chacha v0.2.2
│ │ ├── ppv-lite86 v0.2.10
│ │ └── rand_core v0.5.1
│ │ └── getrandom v0.1.16 (*)
│ └── rand_core v0.5.1 (*)
└── rand v0.8.3
├── libc v0.2.94
├── rand_chacha v0.3.0
│ ├── ppv-lite86 v0.2.10
│ └── rand_core v0.6.2
│ └── getrandom v0.2.3
│ ├── cfg-if v1.0.0
│ └── libc v0.2.94
└── rand_core v0.6.2 (*)
cargo tree includes a variety of options that can help to solve specific problems,
such as these:
--invert
Shows what depends on a specific package, helping you to focus on a particular
problematic dependency
--edges features
Shows what crate features are activated by a dependency link, which helps you
figure out what's going on with feature unification (Item 26)
--duplicates
Shows crates that have multiple versions present in the dependency graph
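For example (the package spec syntax accepted by --invert may vary slightly between cargo versions):

```shell
# Who pulls rand v0.7.3 into the build graph?
cargo tree -i rand@0.7.3

# Which features does each dependency edge activate?
cargo tree --edges features

# Which crates appear at more than one version?
cargo tree --duplicates
```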
What to Depend On
The previous sections have covered the more mechanical aspect of working with
dependencies, but there’s a more philosophical (and therefore harder-to-answer)
question: when should you take on a dependency?
Most of the time, there's not much of a decision involved: if you need the
functionality of a crate, you need that functionality, and the only alternative would
be to write it yourself.
But every new dependency has a cost, partly in terms of longer builds and bigger
binaries but mostly in terms of the developer effort involved in fixing problems with
dependencies when they arise.
6 If you are targeting a no_std environment, this choice may be made for you: many crates are not compatible with no_std, particularly if alloc is also unavailable.
The bigger your dependency graph, the more likely you are to be exposed to these
kinds of problems. The Rust crate ecosystem is just as vulnerable to accidental
dependency problems as other package ecosystems, where history has shown that
the removal of a single package by an individual or a team can have widespread
knock-on effects.
More worrying still are supply chain attacks, where a malicious actor deliberately
tries to subvert commonly used dependencies, whether via typosquatting or more
sophisticated attacks.
This kind of attack doesn’t just affect your compiled code—be aware that a depend‐
ency can run arbitrary code at build time, via build scripts or procedural macros.
That means that a compromised dependency could end up running a cryp‐
tocurrency miner as part of your CI system!
So for dependencies that are more “cosmetic,” it’s sometimes worth considering
whether adding the dependency is worth the cost.
The answer is usually “yes,” though; in the end, the amount of time spent dealing with
dependency problems ends up being much less than the time it would take to write
equivalent functionality from scratch.
Things to Remember
• Crate names on crates.io form a single flat namespace (which is shared with
feature names).
• Crate names can include a hyphen, but it will appear as an underscore in code.
• Cargo supports multiple versions of the same crate in the dependency graph, but
only if they are of different semver-incompatible versions. This can go wrong for
crates that include FFI code.
• Prefer to allow semver-compatible versions of dependencies ("1", or "1.4.23" to
include a minimum version).
• Use Cargo.lock files to ensure your builds are repeatable, but remember that the
Cargo.lock file does not ship with a published crate.
• Use tooling (cargo tree, cargo deny, cargo udep, …) to help find and fix
dependency problems.
• Understand that pulling in dependencies saves you writing code but doesn’t come
for free.
Item 26: Be wary of feature creep
Rust allows the same codebase to support a variety of different configurations via
Cargo’s feature mechanism, which is built on top of a lower-level mechanism for con‐
ditional compilation. However, the feature mechanism has a few subtleties to be
aware of, which this Item explores.
Conditional Compilation
Rust provides support for conditional compilation, which is controlled by cfg (and
cfg_attr) attributes. These attributes govern whether the thing—function, line, block,
etc.—that they are attached to is included in the compiled source code or not
(which is in contrast to C/C++’s line-based preprocessor). The conditional inclusion
is controlled by configuration options that are either plain names (e.g., test) or pairs
of names and values (e.g., panic = "abort").
Note that the name/value variants of config options are multivalued—it’s possible to
set more than one value for the same name:
// Build with `RUSTFLAGS` set to:
// '--cfg myname="a" --cfg myname="b"'
#[cfg(myname = "a")]
println!("cfg(myname = 'a') is set");
#[cfg(myname = "b")]
println!("cfg(myname = 'b') is set");
cfg(myname = 'a') is set
cfg(myname = 'b') is set
Other than the feature values described in this section, the most commonly used
config values are those that the toolchain populates automatically, with values that
describe the target environment for the build. These include the OS (target_os),
pointer width (target_pointer_width), and endianness (target_endian). This allows for code portability, where features that are specific to some particular target are compiled in only when building for that target.
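A minimal sketch of target-based conditional compilation (the function is invented for illustration; target_os and target_pointer_width are standard toolchain-populated options):

```rust
// Exactly one of the two versions below is compiled in, depending on the
// target operating system.
#[cfg(target_os = "linux")]
fn platform_name() -> &'static str {
    "linux"
}

#[cfg(not(target_os = "linux"))]
fn platform_name() -> &'static str {
    "not linux"
}

fn main() {
    println!("Built for: {}", platform_name());

    // Statements can also be conditionally included.
    #[cfg(target_pointer_width = "64")]
    println!("This target has 64-bit pointers");
}
```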
The standard target_has_atomic option provides an example of the multivalued
nature of config values: both #[cfg(target_has_atomic = "32")] and
#[cfg(target_has_atomic = "64")] will be set for targets that support both 32-bit
and 64-bit atomic operations. (For more information on atomics, see Chapter 2 of
Mara Bos's Rust Atomics and Locks [O'Reilly].)
Features
The cargo package manager builds on this base cfg name/value mechanism to
provide the concept of features: named aspects of a crate that can be enabled when
building the crate. Cargo ensures that the feature option is
populated with each of the configured values for each crate that it compiles, and the
values are crate-specific.
This is Cargo-specific functionality: to the Rust compiler, feature is just another
configuration option.
At the time of writing, the most reliable way to determine what features are available
for a crate is to examine the crate’s manifest file. For example, the following chunk of a manifest file includes six features:
[features]
default = ["featureA"]
featureA = []
featureB = []
# Enabling `featureAB` also enables `featureA` and `featureB`.
featureAB = ["featureA", "featureB"]
schema = []
[dependencies]
rand = { version = "^0.8", optional = true }
hex = "^0.4"
Given that there are only five entries in the [features] stanza, there are clearly a
couple of subtleties to watch out for.
The first is that the default line in the [features] stanza is a special feature name,
used to indicate to cargo which of the features should be enabled by default. These
features can still be disabled by passing the --no-default-features flag to the build
command, and a consumer of the crate can encode this in their Cargo.toml file like so:
[dependencies]
somecrate = { version = "^0.3", default-features = false }
However, default still counts as a feature name, which can be tested in code:
#[cfg(feature = "default")]
println!("This crate was built with the \" default\" feature enabled.");
#[cfg(not(feature = "default"))]
println!("This crate was built with the \" default\" feature disabled."); The second subtlety of feature definitions is hidden in the [dependencies] section of
the original Cargo.toml example: the rand crate is a dependency that is marked as
optional = true, and that effectively makes "rand" into the name of a feature. If the
crate is compiled with --features rand, then that dependency is activated:
#[cfg(feature = "rand")]
pub fn pick_a_number() -> u8 {
rand::random::<u8>()
}
#[cfg(not(feature = "rand"))]
pub fn pick_a_number() -> u8 {
4 // chosen by fair dice roll.
}
This also means that crate names and feature names share a namespace, even though
one is typically global (and usually governed by crates.io), and one is local to the
crate in question. Consequently, choose feature names carefully to avoid clashes with
the names of any crates that might be relevant as potential dependencies. It is possible
to work around a clash, because Cargo supports renaming dependencies (via the
package key), but it's easier not to have to.
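The workaround looks something like this (the alias name is invented for illustration):

```toml
[dependencies]
# In code this dependency is referred to as `random_crate`, leaving the
# name `rand` free for use as a feature of this crate.
random_crate = { package = "rand", version = "^0.8" }
```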
So you can determine a crate’s features by examining [features] as well as optional
[dependencies] in the crate’s Cargo.toml file. To turn on a feature of a dependency,
add the features option to the relevant line in the [dependencies] stanza of your
own manifest file:
[dependencies]
somecrate = { version = "^0.3", features = ["featureA", "rand" ] }
This line ensures that somecrate will be built with both the featureA and the rand
feature enabled. However, that might not be the only features that are enabled; other
features may also be enabled, due to a process known as feature unification, which
means that a crate will get built with the union of all of the features that are requested
by anything in the build graph. In other words, if some other dependency in the build
graph also relies on somecrate, but with just featureB enabled, then the crate will be
built with all of featureA, featureB, and rand enabled, to satisfy everyone. The same
consideration applies to default features: if your crate sets default-features =
false for a dependency but some other place in the build graph leaves the default fea‐
tures enabled, then enabled they will be.
7 This default behavior can be disabled by using a "dep:<crate>" reference elsewhere in the features stanza; see the Cargo documentation for details.
8 The cargo tree --edges features command can help with determining which features are enabled for
which crates, and why.
Feature unification means that features should be additive; it’s a bad idea to have mutually incompatible features because there’s nothing to prevent the incompatible
features being simultaneously enabled by different users.
For example, if a crate exposes a struct and its fields publicly, it’s a bad idea to make
the fields feature-dependent:
UNDESIRED BEHAVIOR
/// A structure whose contents are public, so external users can construct
/// instances of it.
#[derive(Debug)]
pub struct ExposedStruct {
pub data: Vec<u8>,
/// Additional data that is required only when the `schema` feature
/// is enabled.
#[cfg(feature = "schema")]
pub schema: String,
}
A user of the crate that tries to build an instance of the struct has a quandary:
should they fill in the schema field or not? One way to try to solve this is to add a
corresponding feature in the user’s Cargo.toml:
[features]
# The `use-schema` feature here turns on the `schema` feature of `somecrate`.
# (This example uses different feature names for clarity; real code is more
# likely to reuse the feature names across both places.)
use-schema = ["somecrate/schema"]
and to make the struct construction depend on this feature:
UNDESIRED BEHAVIOR
let s = somecrate::ExposedStruct {
data: vec![0x82, 0x01, 0x01],
// Only populate the field if we've requested
// activation of `somecrate/schema`.
#[cfg(feature = "use-schema")]
schema: "[int int]".to_string(),
};
However, this doesn’t cover all eventualities: the code will fail to compile if this code
doesn’t activate somecrate/schema but some other transitive dependency does. The
core of the problem is that only the crate that has the feature can check the feature;
there’s no way for the user of the crate to determine whether Cargo has turned on
somecrate/schema or not. As a result, you should avoid feature-gating public fields in
structures.
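One additive alternative (a sketch, not taken from the original example) is to keep the struct layout identical in every configuration and feature-gate extra functionality instead:

```rust
/// The struct has the same fields whether or not any feature is enabled,
/// so external users can always construct it the same way.
#[derive(Debug)]
pub struct ExposedStruct {
    pub data: Vec<u8>,
}

// Extra *behavior* gated behind the hypothetical `schema` feature; users
// who don't enable the feature simply don't see this method.
#[cfg(feature = "schema")]
impl ExposedStruct {
    pub fn schema(&self) -> String {
        format!("[bytes x {}]", self.data.len())
    }
}

fn main() {
    // Construction is unaffected by feature selection.
    let s = ExposedStruct { data: vec![1, 2, 3] };
    println!("{s:?}");
}
```

Because enabling the feature only adds methods, the crate remains usable under any combination of features that the rest of the build graph selects.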
A similar consideration applies to public traits, intended to be used outside the crate
they’re defined in. Consider a trait that includes a feature gate on one of its methods:
UNDESIRED BEHAVIOR
/// Trait for items that support CBOR serialization.
pub trait AsCbor: Sized {
/// Convert the item into CBOR-serialized data.
fn serialize(&self) -> Result<Vec<u8>, Error>;
/// Create an instance of the item from CBOR-serialized data.
fn deserialize(data: &[u8]) -> Result<Self, Error>;
/// Return the schema corresponding to this item.
#[cfg(feature = "schema")]
fn cddl(&self) -> String;
}
External trait implementors again have a quandary: should they implement the
cddl(&self) method or not? The external code that tries to implement the trait
doesn’t know—and can’t tell—whether to implement the feature-gated method or not.
So the net result is that you should avoid feature-gating methods on public traits. A
trait method with a default implementation might be a partial exception to this—but
only if it never makes sense for external code to override the default.
Feature unification also means that if your crate has N independent features, then
all of the 2^N possible build combinations can occur in practice. To avoid unpleasant
surprises, it's a good idea to ensure that your CI system tests all of these 2^N
combinations, in all of the available test variants.
However, the use of optional features is very helpful in controlling exposure to an
expanded dependency graph. This is particularly useful in low-level crates
that are capable of being used in a no_std environment—it's common to
have a std or alloc feature that turns on functionality that relies on those libraries.
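A common shape for this in a crate's manifest (the std/alloc feature names are conventional rather than enforced):

```toml
[features]
default = ["std"]
# The `std` feature implies `alloc`.
std = ["alloc"]
alloc = []
```

This is typically paired with a conditional crate attribute such as #![cfg_attr(not(feature = "std"), no_std)] at the top of lib.rs.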
9 Features can force other features to be enabled; in the original example, the featureAB feature forces both featureA and featureB to be enabled.
Things to Remember
• Feature names overlap with dependency names.
• Feature names should be carefully chosen so they don’t clash with potential
dependency names.
• Features should be additive.
• Avoid feature gates on public struct fields or trait methods.
• Having lots of independent features potentially leads to a combinatorial explo‐
sion of different build configurations.
CHAPTER 5
Tooling
Titus Winters (Google’s C++ library lead) describes software engineering as program‐
ming integrated over time, or sometimes as programming integrated over time and
people. Over longer timescales, and a wider team, there’s more to a codebase than just
the code held within it.
Modern languages, including Rust, are aware of this and come with an ecosystem of
tooling that goes way beyond just converting the program into executable binary
code (the compiler).
This chapter explores the Rust tooling ecosystem, with a general recommendation to
make use of all of this infrastructure. Obviously, doing so needs to be proportionate—
setting up CI, documentation builds, and six types of test would be overkill for a
throwaway program that is run only twice. But for most of the things described in
this chapter, there’s lots of “bang for the buck”: a little bit of investment into tooling
integration will yield worthwhile benefits.
Item 27: Document public interfaces
If your crate is going to be used by other programmers, then it’s a good idea to add
documentation for its contents, particularly its public API. If your crate is more than
just ephemeral, throwaway code, then that “other programmer” includes the you-of-
the-future, when you have forgotten the details of your current code.
This is not advice that's specific to Rust, nor is it new advice—for example,
Effective Java Item 44 is "Write doc comments for all exposed API elements."
The particulars of Rust's documentation comment format—Markdown-based,
delimited with /// or //!—are covered in the rustdoc documentation; for example:
/// Calculate the [`BoundingBox`] that exactly encompasses a pair
/// of [`BoundingBox`] objects.
pub fn union(a: & BoundingBox, b: & BoundingBox) -> BoundingBox {
// ...
}
However, there are some specific details about the format that are worth highlighting:
Use a code font for code
For anything that would be typed into source code as is, surround it with back-
quotes to ensure that the resulting documentation is in a fixed-width font, mak‐
ing the distinction between code and text clear.
Add copious cross-references
Add a Markdown link for anything that might provide context for someone read‐
ing the documentation. In particular, cross-reference identifiers with the conve‐
nient [`SomeThing`] syntax—if SomeThing is in scope, then the resulting
documentation will hyperlink to the right place.
Consider including example code
If it’s not trivially obvious how to use an entrypoint, adding an # Examples section
with sample code can be helpful. Note that sample code in doc comments
gets compiled and executed when you run cargo test, which helps
it stay in sync with the code it’s demonstrating.
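For instance, a sketch of such a section (the function and the crate name `mycrate` are invented for illustration, not taken from the surrounding text):

```rust
/// Doubles the given dimension value.
///
/// # Examples
///
/// ```
/// // This code is compiled and run as a doc-test by `cargo test`.
/// // (`mycrate` stands in for the real crate name.)
/// assert_eq!(mycrate::double(4), 8);
/// ```
pub fn double(x: u32) -> u32 {
    x * 2
}
```

Because the fenced code inside the comment runs as a doc-test, a stale example fails the build rather than silently rotting.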
Document panics and unsafe constraints
If there are inputs that cause a function to panic, document (in a # Panics section)
the preconditions that are required to avoid the panic!. Similarly, document
(in a # Safety section) any requirements for unsafe code.
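As a sketch of the # Panics convention (the `ratio` function here is hypothetical):

```rust
/// Returns the ratio of `numerator` to `denominator`.
///
/// # Panics
///
/// Panics if `denominator` is zero.
pub fn ratio(numerator: f64, denominator: f64) -> f64 {
    // Enforce the documented precondition explicitly.
    assert!(denominator != 0.0, "denominator must be nonzero");
    numerator / denominator
}
```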
The documentation for Rust’s standard library provides an excellent example to emulate for all of these details.
Tooling
The Markdown format that’s used for documentation comments results in elegant
output, but this also means that there is an explicit conversion step (cargo doc). This
in turn raises the possibility that something goes wrong along the way.
The simplest advice for this is just to read the rendered documentation after writing it,
by running cargo doc --open (or cargo doc --no-deps --open to restrict the
generated documentation to just the current crate).
204 | Chapter 5: Tooling
You could also check that all the generated hyperlinks are valid, but that’s a job more
suited to a machine—via the broken_intra_doc_links crate attribute:1
UNDESIRED BEHAVIOR
#![deny(broken_intra_doc_links)]
/// The bounding box for a [`Polygone`].
#[derive(Clone, Debug)]
pub struct BoundingBox {
// ...
}
With this attribute enabled, cargo doc will detect invalid links:
error: unresolved link to `Polygone`
 --> docs/src/main.rs:4:30
  |
4 | /// The bounding box for a [`Polygone`].
  |                             ^^^^^^^^ no item named `Polygone` in scope
  |
You can also require documentation, by enabling the #![warn(missing_docs)]
attribute for the crate. When this is enabled, the compiler will emit a warning for
every undocumented public item. However, there’s a risk that enabling this option
will lead to poor-quality documentation comments that are rushed out just to get the
compiler to shut up—more on this to come.
As ever, any tooling that detects potential problems should form a part of your CI
system, to catch any regressions that creep in.
Additional Documentation Locations
The output from cargo doc is the primary place where your crate is documented, but
it’s not the only place—other parts of a Cargo project can help users figure out how to
use your code.
The examples/ subdirectory of a Cargo project can hold the code for standalone
binaries that make use of your crate. These programs are built and run very similarly
to integration tests but are specifically intended to hold example code that
illustrates the correct use of your crate’s interface.
1 Historically, this option used to be called intra_doc_link_resolution_failure.
On a related note, bear in mind that the integration tests under the tests/ subdirectory
can also serve as examples for the confused user, even though their primary
purpose is to test the crate’s external interface.
Published Crate Documentation
If you publish your crate to crates.io, the documentation for your project will be
visible at docs.rs, a Rust project that builds and hosts documentation for published crates.
Note that crates.io and docs.rs are intended for slightly different audiences:
crates.io is aimed at people who are choosing what crate to use, whereas docs.rs is
intended for people figuring out how to use a crate they’ve already included
(although there’s obviously considerable overlap between the two).
As a result, the home page for a crate shows different content in each location:
docs.rs
Shows the top-level page from the output of cargo doc, as generated from //!
comments in the top-level src/lib.rs file.
crates.io
Shows the content of any top-level README.md file that’s included in the
project’s repository.2
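Concretely, that docs.rs landing page is generated from //! comments like the following sketch (the crate contents are invented; in a real crate the //! lines sit at the very top of src/lib.rs, but they are wrapped in a module here so the snippet is self-contained):

```rust
mod mycrate {
    //! A hypothetical geometry crate.
    //!
    //! Provides primitives such as bounding boxes, along with
    //! operations like union and intersection.

    /// Placeholder item so the sketch compiles as-is.
    pub fn version() -> &'static str {
        "0.1.0"
    }
}
```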
What Not to Document
When a project requires that documentation be included for all public items (as
mentioned in the first section), it’s very easy to fall into the trap of having documentation
that’s a pointless waste of valuable pixels. Having the compiler warn about missing
doc comments is only a proxy for what you really want—useful documentation—and
is likely to incentivize programmers to do the minimum needed to silence the
warning.
Good doc comments are a boon that helps users understand the code they’re using;
bad doc comments impose a maintenance burden and increase the chance of user
confusion when they get out of sync with the code. So how to distinguish between the
two?
The primary advice is to avoid repeating in text something that’s clear from the code.
An earlier Item exhorted you to encode as much semantics as possible into Rust’s type system;
once you’ve done that, allow the type system to document those semantics. Assume
2 The default behavior of automatically including README.md can be overridden with the
readme field in Cargo.toml.
that the reader is familiar with Rust—possibly because they’ve read a helpful collection
of Items describing effective use of the language—and don’t repeat things that are
clear from the signatures and types involved.
Returning to the previous example, an overly verbose documentation comment
might be as follows:
UNDESIRED BEHAVIOR
/// Return a new [`BoundingBox`] object that exactly encompasses a pair
/// of [`BoundingBox`] objects.
///
/// Parameters:
/// - `a`: an immutable reference to a `BoundingBox`
/// - `b`: an immutable reference to a `BoundingBox`
/// Returns: new `BoundingBox` object.
pub fn union(a: &BoundingBox, b: &BoundingBox) -> BoundingBox {
This comment repeats many details that are clear from the function signature, to no
benefit.
Worse, consider what’s likely to happen if the code gets refactored to store the result
in one of the original arguments. No compiler or tool complains that the comment
isn’t updated to match, so it’s easy to end up with an out-of-sync comment:
UNDESIRED BEHAVIOR
/// Return a new [`BoundingBox`] object that exactly encompasses a pair
/// of [`BoundingBox`] objects.
///
/// Parameters:
/// - `a`: an immutable reference to a `BoundingBox`
/// - `b`: an immutable reference to a `BoundingBox`
/// Returns: new `BoundingBox` object.
pub fn union(a: &mut BoundingBox, b: &BoundingBox) {
In contrast, the original comment survives the refactoring unscathed, because its text
describes behavior, not syntactic details:
/// Calculate the [`BoundingBox`] that exactly encompasses a pair
/// of [`BoundingBox`] objects.
pub fn union(a: &mut BoundingBox, b: &BoundingBox) {
The mirror image of the preceding advice also helps improve documentation: include
in text anything that’s not clear from the code. This includes preconditions, invariants,
panics, error conditions, and anything else that might surprise a user; if your code
can’t avoid surprises altogether, make sure the surprises are documented so you can at least say, “I told you so.”
Another common failure mode is when doc comments describe how some other code
uses a method, rather than what the method does:
/// Return the intersection of two [`BoundingBox`] objects, returning `None`
/// if there is no intersection. The collision detection code in `hits.rs`
/// uses this to do an initial check to see whether two objects might overlap,
/// before performing the more expensive pixel-by-pixel check in
/// `objects_overlap`.
pub fn intersection(
    a: &BoundingBox,
    b: &BoundingBox,
) -> Option<BoundingBox> {
Comments like this are almost guaranteed to get out of sync: when the using code
(here, hits.rs) changes, the comment that describes the behavior is nowhere nearby.
Rewording the comment to focus more on the why makes it more robust to future
changes:
/// Return the intersection of two [`BoundingBox`] objects, returning `None`
/// if there is no intersection. Note that intersection of bounding boxes
/// is necessary but not sufficient for object collision -- pixel-by-pixel
/// checks are still required on overlap.
pub fn intersection(
    a: &BoundingBox,
    b: &BoundingBox,
) -> Option<BoundingBox> {
When writing software, it’s good advice to “program in the future tense”:3 structure
the code to accommodate future changes. The same principle is true for documentation:
focusing on the semantics, the whys and the why nots, gives text that is more
likely to remain helpful in the long run.
Things to Remember
• Add doc comments for public API items.
• Describe aspects of the code—such as panics and safety criteria—that aren’t obvious
from the code itself.
• Don’t describe things that are obvious from the code itself.
• Make navigation clearer by providing cross-references and by making identifiers
stand out.
3 Scott Meyers, More Effective C++ (Addison-Wesley), Item 32.
Item 28: Use macros judiciously
In some cases it’s easy to decide to write a macro instead of a function, because only a macro
can do what’s needed.
—Paul Graham, On Lisp (Prentice Hall)
Rust’s macro systems allow you to perform metaprogramming: to write code that
emits code into your project. This is most valuable when there are chunks of “boilerplate”
code that are deterministic and repetitive and that would otherwise need to be
kept in sync manually.
Programmers coming to Rust may have previously encountered the macros provided
by C/C++’s preprocessor, which perform textual substitution on the tokens of the
input text. Rust’s macros are a different beast, because they work on either the parsed
tokens of the program or on the abstract syntax tree (AST) of the program, rather
than just its textual content.
This means Rust macros can be aware of code structure and can consequently avoid
entire classes of macro-related footguns. In particular, we see in the following section
that Rust’s declarative macros are hygienic—they cannot accidentally refer to (“capture”) local variables in the surrounding code.
One way to think about macros is to see them as a different level of abstraction in the
code. A simple form of abstraction is a function: it abstracts away the differences
between different values of the same type, with implementation code that can use any
of the features and methods of that type, regardless of the current value being oper‐
ated on. A generic is a different level of abstraction: it abstracts away the difference
between different types that satisfy a trait bound, with implementation code that can
use any of the methods provided by the trait bounds, regardless of the current type
being operated on.
A macro abstracts away the difference between different fragments of the program
that play the same role (type, identifier, expression, etc.); the implementation can
then include any code that makes use of those fragments in the same role.
Rust provides two ways to define macros:
• Declarative macros, also known as “macros by example,” allow the insertion of
arbitrary Rust code into the program, based on the input parameters to the
macro (which are categorized according to their role in the AST).
• Procedural macros allow the insertion of arbitrary Rust code into the program,
based on the parsed tokens of the source code. This is most commonly used for
derive macros, which can generate code based on the contents of data structure
definitions.
Declarative Macros
Although this Item isn’t a complete reference for declarative macros, a few reminders of details to watch out for are in order.
First, be aware that the scoping rules for using a declarative macro are different than
for other Rust items. If a declarative macro is defined in a source code file, only the
code after the macro definition can make use of it:
DOES NOT COMPILE
fn before() {
println!("[before] square {} is {}", 2, square!(2));
}
/// Macro that squares its argument.
macro_rules! square {
{ $e:expr } => { $e * $e }
}
fn after() {
println!("[after] square {} is {}", 2, square!(2));
}
error: cannot find macro `square` in this scope
 --> src/main.rs:4:45
  |
4 |     println!("[before] square {} is {}", 2, square!(2));
  |                                             ^^^^^^
  |
  = help: have you added the `#[macro_use]` on the module/import?
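Moving the definition above the first use (or applying #[macro_use] as the help text suggests) resolves the error; a minimal sketch of the reordered version:

```rust
/// Macro that squares its argument; defined before any use,
/// so it is in scope for the code below.
macro_rules! square {
    { $e:expr } => { $e * $e };
}

fn after() -> i32 {
    // Compiles now that the definition precedes the use.
    square!(2)
}
```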
The #[macro_export] attribute makes a macro more widely visible, but this also has
an oddity: a macro appears at the top level of a crate, even if it’s defined in a module:
mod submod {
#[macro_export]
macro_rules! cube {
{ $e:expr } => { $e * $e * $e }
}
}
mod user {
pub fn use_macro() {
// Note: *not* `crate::submod::cube!`
let cubed = crate::cube!(3);
println!("cube {} is {}", 3, cubed);
}
}
210 | Chapter 5: Tooling
Rust’s declarative macros are what’s known as hygienic: the expanded code in the body
of the macro is not allowed to make use of local variable bindings. For example, a
macro that assumes that some variable x exists:
// Create a macro that assumes the existence of a local `x`.
macro_rules! increment_x {
{} => { x += 1; };
}
will trigger a compilation failure when it is used:
DOES NOT COMPILE
let mut x = 2;
increment_x!();
println!("x = {}", x);
error[E0425]: cannot find value `x` in this scope
  --> src/main.rs:55:13
   |
55 |     {} => { x += 1; };
   |             ^ not found in this scope
...
314 |     increment_x!();
   |     -------------- in this macro invocation
   |
   = note: this error originates in the macro `increment_x`
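The usual workaround is to pass the binding into the macro explicitly as an ident fragment, so the expansion operates on the caller’s variable; a sketch (the macro name is illustrative):

```rust
// Pass the variable to be modified in as an identifier argument.
macro_rules! increment {
    { $x:ident } => { $x += 1; };
}

fn demo() -> i32 {
    let mut x = 2;
    increment!(x); // expands to `x += 1;` using the caller's `x`
    x
}
```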
This hygienic property means that Rust’s macros are safer than C preprocessor macros.
However, there are still a couple of minor gotchas to be aware of when using
them.
The first is to realize that even if a macro invocation looks like a function invocation,
it’s not. A macro generates code at the point of invocation, and that generated code
can perform manipulations of its arguments:
macro_rules! inc_item {
{ $x:ident } => { $x.contents += 1; }
}
This means that the normal intuition about whether parameters are moved or
&-referred-to doesn’t apply:
let mut x = Item { contents: 42 }; // type is not `Copy`
// Item is *not* moved, despite the (x) syntax,
// but the body of the macro *can* modify `x`.
inc_item!(x);
println!("x is {x:?}");
x is Item { contents: 43 }
This becomes clear if we remember that the macro inserts code at the point of invocation—in
this case, adding a line of code that increments x.contents. Here’s what the
compiler sees, after macro expansion:
let mut x = Item { contents: 42 };
x.contents += 1;
{
    ::std::io::_print(format_args!("x is {0:?}\n", x));
};
The expanded code includes the modification in place, via the owner of the item, not
a reference. (It’s also interesting to see the expanded version of println!, which relies
on the format_args! macro, to be discussed shortly.4)
So the exclamation mark serves as a warning: the expanded code for the macro may
do arbitrary things to or with its arguments.
The expanded code can also include control flow operations that aren’t visible in the
calling code, whether they be loops, conditionals, return statements, or use of the ?
operator. Obviously, this is likely to violate the principle of least astonishment, so prefer macros whose behavior aligns with normal Rust where possible and appropriate.
(On the other hand, if the purpose of the macro is to allow weird control flow, go for
it! But help out your users by making sure the control flow behavior is clearly
documented.)
For example, consider a macro (for checking HTTP status codes) that silently
includes a return in its body:
/// Check that an HTTP status is successful; exit function if not.
macro_rules! check_successful {
{ $e:expr } => {
if $e.group() != Group::Successful {
return Err(MyError("HTTP operation failed"));
}
}
}
Code that uses this macro to check the result of some kind of HTTP operation can
end up with control flow that’s somewhat obscure:
let rc = perform_http_operation();
check_successful!(rc); // may silently exit the function
// ...
4 An eagle-eyed reader might notice that format_args! still looks like a macro invocation, even after macros have been expanded. That’s because it’s a special macro that’s built into the compiler.
An alternative version of the macro that generates code that emits a Result:
/// Convert an HTTP status into a `Result<(), MyError>` indicating success.
macro_rules! check_success {
{ $e:expr } => {
match $e.group() {
Group::Successful => Ok(()),
_ => Err(MyError("HTTP operation failed")),
}
}
}
gives code that’s easier to follow:
let rc = perform_http_operation();
check_success!(rc)?; // error flow is visible via `?`
// ...
The second thing to watch out for with declarative macros is a problem shared with
the C preprocessor: if the argument to a macro is an expression with side effects,
beware of repeated use of the argument in the macro. The square! macro defined
earlier takes an arbitrary expression as an argument and then uses that argument
twice, which can lead to surprises:
UNDESIRED BEHAVIOR
let mut x = 1;
let y = square!({
x += 1;
x
});
println!("x = {x}, y = {y}");
// output: x = 3, y = 6
Assuming that this behavior isn’t intended, one way to fix it is simply to evaluate the
expression once and assign the result to a local variable:
macro_rules! square_once {
{ $e:expr } => {
{
let x = $e;
x*x // Note: there's a detail here to be explained later...
}
}
}
// output now: x = 2, y = 4
The other alternative is not to allow an arbitrary expression as input to the macro. If
the macro’s pattern instead uses the ident fragment specifier, then the macro will only accept identifiers as inputs, and the attempt to feed it an arbitrary
expression will no longer compile.
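A sketch of that restriction, using a hypothetical square_ident! variant:

```rust
// Only identifiers are accepted; an invocation such as
// `square_ident!({ x += 1; x })` fails to compile, so the
// double-evaluation surprise cannot arise.
macro_rules! square_ident {
    { $i:ident } => { $i * $i };
}

fn demo() -> i32 {
    let x = 3;
    square_ident!(x)
}
```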
Formatting Values
One common style of declarative macro involves assembling a message that includes
various values from the current state of the code. For example, the standard library
includes format! for assembling a String, println! for printing to standard output,
eprintln! for printing to standard error, and so on. The std::fmt module documentation describes the syntax of the formatting directives, which are roughly equivalent to C’s printf statement. However, the format arguments are type safe and checked at compile time, and
the implementations of the macro use the Display and Debug traits described in an
earlier Item.
You can (and should) use the same formatting syntax for any macros of your own that
perform a similar function. For example, the logging macros provided by the log
crate use the same syntax as format!. To do this, use format_args! for macros that perform argument formatting rather than attempting to reinvent the wheel:
/// Log an error including code location, with `format!`-like arguments.
/// Real code would probably use the `log` crate.
macro_rules! my_log {
{ $($arg:tt)+ } => {
eprintln!("{}:{}: {}", file!(), line!(), format_args!($($arg)+));
}
}
let x = 10u8;
// Format specifiers:
// - `x` says print as hex
// - `#` says prefix with '0x'
// - `04` says add leading zeroes so width is at least 4
// (this includes the '0x' prefix).
my_log!("x = {:#04x}", x);
src/main.rs:331: x = 0x0a
5 The formatting traits are used when displaying data in particular formats. For example, the x format specifier indicates that lowercase hexadecimal output is required.
Procedural Macros
Rust also supports procedural macros, often known as proc macros. Like a declarative
macro, a procedural macro has the ability to insert arbitrary Rust code into the program’s source code. However, the inputs to the macro are no longer just the specific
arguments passed to it; instead, a procedural macro has access to the parsed tokens
corresponding to some chunk of the original source code. This gives a level of expressive
power that approaches the flexibility of dynamic languages such as Lisp—but still
with compile-time guarantees. It also helps mitigate the limitations of reflection in
Rust, as discussed in an earlier Item.
Procedural macros must be defined in a separate crate (of crate type proc-macro)
from where they are used, and that crate will almost certainly need to depend on
proc-macro (provided by the standard toolchain) or proc-macro2 (provided by David Tolnay) as a support library, to make it possible to work with the input tokens.
There are three distinct types of procedural macro:
Function-like macros
Invoked with an argument
Attribute macros
Attached to some chunk of syntax in the program
Derive macros
Attached to the definition of a data structure
Function-like macros
Function-like procedural macros are invoked with an argument, and the macro definition
has access to the parsed tokens that make up the argument, and emits arbitrary
tokens as a result. Note that the previous sentence says “argument,” singular—even if
a function-like macro is invoked with what looks like multiple arguments:
my_func_macro!(15, x + y, f32::consts::PI);
the macro itself receives a single argument, which is a stream of parsed tokens. A
macro implementation that just prints (at compile time) the contents of the stream:
use proc_macro::TokenStream;
// Function-like macro that just prints (at compile time) its input stream.
#[proc_macro]
pub fn my_func_macro(args: TokenStream) -> TokenStream {
println!("Input TokenStream is:");
for tt in args {
println!(" {tt:?}");
}
// Return an empty token stream to replace the macro invocation with.
TokenStream::new()
}
Compiling a program that uses this macro shows the stream corresponding to the input:
Input TokenStream is:
Literal { kind: Integer, symbol: "15", suffix: None,
span: #0 bytes(10976..10978) }
Punct { ch: ',', spacing: Alone, span: #0 bytes(10978..10979) }
Ident { ident: "x", span: #0 bytes(10980..10981) }
Punct { ch: '+', spacing: Alone, span: #0 bytes(10982..10983) }
Ident { ident: "y", span: #0 bytes(10984..10985) }
Punct { ch: ',', spacing: Alone, span: #0 bytes(10985..10986) }
Ident { ident: "f32", span: #0 bytes(10987..10990) }
Punct { ch: ':', spacing: Joint, span: #0 bytes(10990..10991) }
Punct { ch: ':', spacing: Alone, span: #0 bytes(10991..10992) }
Ident { ident: "consts", span: #0 bytes(10992..10998) }
Punct { ch: ':', spacing: Joint, span: #0 bytes(10998..10999) }
Punct { ch: ':', spacing: Alone, span: #0 bytes(10999..11000) }
Ident { ident: "PI", span: #0 bytes(11000..11002) }
The low-level nature of this input stream means that the macro implementation has
to do its own parsing. For example, separating out what appear to be separate arguments
to the macro involves looking for TokenTree::Punct tokens that hold the
commas dividing the arguments. The syn crate (from David Tolnay) provides a parsing library that can help with this, as a later section describes.
Because of this, it’s usually easier to use a declarative macro than a function-like
procedural macro, because the expected structure of the macro’s inputs can be expressed
in the matching pattern.
The flip side of this need for manual processing is that function-like proc macros
have the flexibility to accept inputs that don’t parse as normal Rust code. That’s not
often needed (or sensible), so function-like macros are comparatively rare as a result.
Attribute macros
Attribute macros are invoked by placing them before some item in the program, and
the parsed tokens for that item are the input to the macro. The macro can again emit
arbitrary tokens as output, but the output is typically some transformation of the
input.
For example, an attribute macro can be used to wrap the body of a function:
#[log_invocation]
fn add_three(x: u32) -> u32 {
x + 3
}
so that invocations of the function are logged:
let x = 2;
let y = add_three(x);
println!("add_three({x}) = {y}");
log: calling function 'add_three'
log: called function 'add_three' => 5
add_three(2) = 5
The implementation of this macro is too large to include here, because the code needs
to check the structure of the input tokens and to build up the new output tokens, but
the syn crate can again help with this processing.
Derive macros
The final type of procedural macro is the derive macro, which allows generated code
to be automatically attached to a data structure definition (a struct, enum, or union).
This is similar to an attribute macro but there are a few derive-specific aspects to be
aware of.
The first is that derive macros add to the input tokens, instead of replacing them
altogether. This means that the data structure definition is left intact but the macro
has the opportunity to append related code.
The second is that a derive macro can declare associated helper attributes, which can
then be used to mark parts of the data structure that need special processing. For
example, serde’s derive macro has a serde helper attribute that can provide metadata to guide the deserialization process:
fn generate_value() -> String {
"unknown".to_string()
}
#[derive(Debug, Deserialize)]
struct MyData {
// If `value` is missing when deserializing, invoke
// `generate_value()` to populate the field instead.
#[serde(default = "generate_value")]
value: String,
}
The final aspect of derive macros to be aware of is that the syn crate can take care of much of the heavy lifting involved in parsing the input tokens into the equivalent
nodes in the AST: the syn::parse_macro_input! macro converts the tokens into a
DeriveInput data structure that describes the content of the item, and DeriveInput is much easier to deal with than a raw stream of tokens.
In practice, derive macros are the most commonly encountered type of procedural
macro—the ability to generate field-by-field (for structs) or variant-by-variant (for
enums) implementations allows for a lot of functionality to be provided with little
effort from the programmer—for example, by adding a single line like
#[derive(Debug, Clone, PartialEq, Eq)].
Because the derived implementations are auto-generated, it also means that the
implementations automatically stay in sync with the data structure definition. For
example, if you were to add a new field to a struct, a manual implementation of
Debug would need to be manually updated, whereas an automatically derived version
would display the new field with no additional effort (or would fail to compile if that
wasn’t possible).
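A small sketch of this effect with the standard Debug derive (the Point type is invented for illustration):

```rust
// The derived implementation tracks the struct definition automatically.
#[derive(Debug, Clone, PartialEq)]
struct Point {
    x: i32,
    y: i32,
    // A field added here would appear in the derived `Debug` output
    // with no further work; a handwritten `impl Debug` would silently
    // omit it until manually updated.
}

fn debug_repr() -> String {
    format!("{:?}", Point { x: 1, y: 2 })
}
```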
When to Use Macros
The primary reason to use macros is to avoid repetitive code—especially repetitive
code that would otherwise have to be manually kept in sync with other parts of the
code. In this respect, writing a macro is just an extension of the same kind of generalization
process that normally forms part of programming:
• If you repeat exactly the same code for multiple values of a specific type, encapsulate
that code into a common function and call the function from all of the
repeated places.
• If you repeat exactly the same code for multiple types, encapsulate that code into
a generic with a trait bound and use the generic from all of the repeated places.
• If you repeat the same structure of code in multiple places, encapsulate that code
into a macro and use the macro from all of the repeated places.
For example, avoiding repetition for code that works on different enum variants can
be done only by a macro:
enum Multi {
Byte(u8),
Int(i32),
Str(String),
}
/// Extract copies of all the values of a specific enum variant.
#[macro_export]
macro_rules! values_of_type {
{ $values:expr, $variant:ident } => {
{
let mut result = Vec::new();
for val in $values {
if let Multi::$variant(v) = val {
result.push(v.clone());
}
}
result
}
}
}
fn main() {
let values = vec![
Multi::Byte(1),
Multi::Int(1000),
Multi::Str("a string".to_string()),
Multi::Byte(2),
];
let ints = values_of_type!(&values, Int);
println!("Integer values: {ints:?}");
let bytes = values_of_type!(&values, Byte);
println!("Byte values: {bytes:?}");
// Output:
// Integer values: [1000]
// Byte values: [1, 2]
}
Another scenario where macros help avoid manual repetition is when information
about a collection of data values would otherwise be spread out across different areas
of the code.
For example, consider a data structure that encodes information about HTTP status
codes; a macro can help keep all of the related information together:
// http.rs module
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Group {
Informational, // 1xx
Successful, // 2xx
Redirection, // 3xx
ClientError, // 4xx
ServerError, // 5xx
}
// Information about HTTP response codes.
http_codes! {
Continue => (100, Informational, "Continue"),
SwitchingProtocols => (101, Informational, "Switching Protocols"),
// ...
Ok => (200, Successful, "Ok"),
Created => (201, Successful, "Created"),
// ...
}
The macro invocation holds all the related information—numeric value, group,
description—for each HTTP status code, acting as a kind of domain-specific language
(DSL) holding the source of truth for the data.
The macro definition then describes the generated code; each line of the form
$( ... )+ expands to multiple lines in the generated code, one per argument to the
macro:
macro_rules! http_codes {
{ $( $name:ident => ($val:literal, $group:ident, $text:literal), )+ } => {
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
#[repr(i32)]
enum Status {
$( $name = $val, )+
}
impl Status {
fn group(&self) -> Group {
match self {
$( Self::$name => Group::$group, )+
}
}
fn text(&self) -> &'static str {
match self {
$( Self::$name => $text, )+
}
}
}
impl core::convert::TryFrom<i32> for Status {
type Error = ();
fn try_from(v: i32) -> Result<Self, Self::Error> {
match v {
$( $val => Ok(Self::$name), )+
_ => Err(())
}
}
}
}
}
As a result, the overall output from the macro takes care of generating all of the code
that derives from the source-of-truth values:
• The definition of an enum holding all the variants
• The definition of a group() method, which indicates which group an HTTP
status belongs to
• The definition of a text() method, which maps a status to a text description
• An implementation of TryFrom<i32> to convert numbers to status enum values
If an extra value needs to be added later, all that’s needed is a single additional line:
ImATeapot => (418, ClientError, "I'm a teapot"),
Without the macro, four different places would have to be manually updated. The
compiler would point out some of them (because match expressions need to cover all
cases) but not all—TryFrom<i32> could easily be forgotten.
Because macros are expanded in place in the invoking code, they can also be used to
automatically emit additional diagnostic information—in particular, by using the
standard library’s file! and line! macros, which emit source code location information:
macro_rules! log_failure {
{ $e:expr } => {
{
let result = $e;
if let Err(err) = &result {
eprintln!("{}:{}: operation '{}' failed: {:?}",
file!(),
line!(),
stringify!($e),
err);
}
result
}
}
}
When failures occur, the log file then automatically includes details of what failed and
where:
use std::convert::TryInto;
let x: Result<u8, _> = log_failure!(512.try_into()); // too big for `u8`
let y = log_failure!(std::str::from_utf8(b"\xc3\x28")); // invalid UTF-8
src/main.rs:340: operation '512.try_into()' failed: TryFromIntError(())
src/main.rs:341: operation 'std::str::from_utf8(b"\xc3\x28")' failed:
Utf8Error { valid_up_to: 0, error_len: Some(1) }
Disadvantages of Macros
The primary disadvantage of using a macro is the impact that it has on code readability
and maintainability. As described earlier, macros allow
you to create a DSL to concisely express key features of your code and data. However,
this means that anyone reading or maintaining the code now has to understand this
DSL—and its implementation in macro definitions—in addition to understanding
Rust. For example, the http_codes! example in the previous section creates a Rust
enum named Status, but it’s not visible in the DSL used for the macro invocation.
This potential impenetrability of macro-based code extends beyond other engineers:
various tools that analyze and interact with Rust code may treat the code as opaque,
because it no longer follows the syntactical conventions of Rust code. The
square_once! macro shown earlier provided one trivial example of this: the body of
the macro has not been formatted according to the normal rustfmt rules:
{
let x = $e;
// The `rustfmt` tool doesn't really cope with code in
// macros, so this has not been reformatted to `x * x`.
x*x
}
Another example is the earlier http_codes! macro, where the DSL uses Group enum
variant names like Informational with neither a Group:: prefix nor a use statement,
which may confuse some code navigation tools.
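To make this concrete, here is a hypothetical sketch (with invented details, not the exact `http_codes!` definition discussed in the text) of a declarative macro whose DSL never mentions the enum it generates:

```rust
// Hypothetical sketch with invented details -- not the book's exact
// `http_codes!` definition. The DSL below never names the `Status`
// enum that the macro generates.
macro_rules! http_codes {
    { $( $group:ident => { $( $name:ident = $code:literal, )* } )* } => {
        #[derive(Debug, Clone, Copy, PartialEq, Eq)]
        enum Status {
            $( $( $name = $code, )* )*
        }
    };
}

http_codes! {
    Informational => { Continue = 100, }
    Successful => { Ok = 200, }
    ClientError => { NotFound = 404, }
}

fn main() {
    // `Status` exists even though the invocation above never mentions it.
    let s = Status::NotFound;
    println!("{:?} = {}", s, s as i32); // prints "NotFound = 404"
}
```

A reader (or a code navigation tool) that searches the invocation for `Status` finds nothing, which is exactly the impenetrability the text describes.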
Even the compiler itself is less helpful: its error messages don’t always follow the chain of macro use and definition. (However, there are parts of the tooling ecosystem that can help with this, such as David Tolnay’s cargo-expand, used earlier.)
Another possible downside for macro use is the possibility of code bloat—a single
line of macro invocation can result in hundreds of lines of generated code, which will
be invisible to a cursory survey of the code. This is less likely to be a problem when
the code is first written, because at that point the code is needed and saves the
humans involved from having to write it themselves. However, if the code subse‐
quently stops being necessary, it’s not so obvious that there are large amounts of code
that could be deleted.
Advice
Although the previous section listed some downsides of macros, they are still funda‐
mentally the right tool for the job when there are different chunks of code that need
to be kept consistent but that cannot be coalesced any other way: use a macro when‐
ever it’s the only way to ensure that disparate code stays in sync.
Macros are also the tool to reach for when there’s boilerplate code to be squashed: use
a macro for repeated boilerplate code that can’t be coalesced into a function or a
generic.
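As an illustration of that second case, here is a hedged sketch (all names invented for the example): three trait implementations that differ only in the type and its label cannot be merged into one generic `impl`, but a macro squashes the repetition.

```rust
// Hypothetical sketch with invented names: repeated impls that can't
// be coalesced into one generic impl, squashed with a macro.
trait Describe {
    fn describe(&self) -> String;
}

macro_rules! impl_describe {
    ($($t:ty => $label:literal),* $(,)?) => {
        $(
            impl Describe for $t {
                fn describe(&self) -> String {
                    // `concat!` builds the format string at compile time.
                    format!(concat!($label, ": {}"), self)
                }
            }
        )*
    };
}

impl_describe!(u32 => "u32", i64 => "i64", f64 => "f64");

fn main() {
    println!("{}", 7u32.describe()); // prints "u32: 7"
    println!("{}", (-3i64).describe()); // prints "i64: -3"
}
```

A single generic `impl<T: Display> Describe for T` could not embed a different label per type, so the macro is the tool that keeps the three impls in sync.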
To reduce the impact on readability, try to avoid syntax in your macros that clashes
with Rust’s normal syntax rules; either make the macro invocation look like normal
code or make it look sufficiently different so that no one could confuse the two. In
particular, follow these guidelines:
• Avoid macro expansions that insert references where possible—a macro invocation
like my_macro!(&list) aligns better with normal Rust code than my_macro!(list) would.
• Prefer to avoid nonlocal control flow operations in macros so that anyone reading
the code is able to follow the flow without needing to know the details of the
macro.
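For example (a hypothetical sketch with invented names), a macro that hides an early return makes control flow impossible to follow at the call site without reading the macro definition:

```rust
use std::collections::HashMap;

// Hypothetical sketch with invented names: `check!` performs nonlocal
// control flow -- it can return from the *enclosing* function, which
// is invisible at the invocation site.
macro_rules! check {
    ($e:expr) => {
        match $e {
            Some(v) => v,
            None => return Err("missing value"), // hidden early return
        }
    };
}

fn lookup(map: &HashMap<&str, i32>, key: &str) -> Result<i32, &'static str> {
    // Reads like a plain expression, but may return from `lookup`.
    let v = check!(map.get(key).copied());
    Ok(v * 2)
}

fn main() {
    let mut map = HashMap::new();
    map.insert("a", 21);
    println!("{:?}", lookup(&map, "a")); // prints Ok(42)
    println!("{:?}", lookup(&map, "b")); // prints Err("missing value")
}
```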
This preference for Rust-like readability sometimes affects the choice between declarative macros and procedural macros. If you need to emit code for each field of a structure, or each variant of an enum, prefer a derive macro to a procedural macro that emits a type (despite the example shown earlier); it’s more idiomatic and makes the code easier to read.
However, if you’re adding a derive macro with functionality that’s not specific to your project, check whether an external crate already provides what you need. For example, the problem of converting integer values into the appropriate variant of a C-like enum is one that existing crates already solve.
Item 29: Listen to Clippy
It looks like you’re writing a letter. Would you like help?
—Microsoft Clippit
Another Item in this chapter describes the ecosystem of helpful tools available in the Rust toolbox, but one tool is sufficiently helpful and important to be promoted to an Item of its very own: Clippy.
Clippy is an additional component for Cargo (cargo clippy) that emits warnings
about your Rust usage, across a variety of categories:
Correctness
Warns about common programming errors
Idiom
Warns about code constructs that aren’t quite in standard Rust style
Concision
Points out variations on the code that are more compact
Performance
Suggests alternatives that avoid unnecessary processing or allocation
Readability
Describes alterations to the code that would make it easier for humans to read
and understand
For example, the following code builds fine:
pub fn circle_area(radius: f64) -> f64 {
    let pi = 3.14;
    pi * radius * radius
}
but Clippy points out that the local approximation to π is unnecessary and inaccu‐
rate:
error: approximate value of `f{32, 64}::consts::PI` found
--> src/main.rs:5:18
|
5 | let pi = 3.14;
| ^^^^
|
= help: consider using the constant directly
= help: for further information visit
https://rust-lang.github.io/rust-clippy/master/index.html#approx_constant
= note: `#[deny(clippy::approx_constant)]` on by default
The linked webpage explains the problem and points the way to a suitable modifica‐
tion of the code:
pub fn circle_area(radius: f64) -> f64 {
    std::f64::consts::PI * radius * radius
}
As shown previously, each Clippy warning comes with a link to a webpage describing
the error, which explains why the code is considered bad. This is vital, because it
allows you to decide whether those reasons apply to your code or whether there is
some particular reason why the lint check isn’t relevant. In some cases, the text also
describes known problems with the lint, which might explain an otherwise confusing
false positive.
If you decide that a lint warning isn’t relevant for your code, you can disable it either for that particular item (#[allow(clippy::some_lint)]) or for the entire crate (#![allow(clippy::some_lint)], with an extra !, at the top level). However, it’s usually
better to take the cost of a minor refactoring of the code than to waste time and
energy arguing about whether the warning is a genuine false positive.
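As a sketch of the item-level form, using the `approx_constant` lint from the earlier example:

```rust
// Sketch: silencing one Clippy lint for a single item; the lint stays
// active everywhere else in the crate. (Plain `rustc` also accepts the
// `clippy::` tool-lint attribute, so this compiles without Clippy.)
#[allow(clippy::approx_constant)]
fn rough_circle_area(radius: f64) -> f64 {
    let pi = 3.14; // deliberately approximate; the lint above is silenced
    pi * radius * radius
}

fn main() {
    println!("{}", rough_circle_area(1.0)); // prints 3.14
}
```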
Whether you choose to fix or disable the warnings, you should make your code Clippy-warning free. That way, when new warnings appear—whether because the code has been changed or because Clippy has been upgraded to include new checks—they will be obvious. Clippy should also be enabled in your CI system.
Clippy’s warnings are particularly helpful when you’re learning Rust, because they
reveal gotchas you might not have noticed and help you become familiar with Rust
idiom.
Many of the Items in this book also have corresponding Clippy warnings, when it’s
possible to mechanically check the relevant concern:
• Using richer types than plain bools is suggested elsewhere, and Clippy will also point out the use of multiple bools in function parameters and in structures.
• Manipulations of Option and Result types are covered elsewhere, and Clippy points out a few possible redundancies, such as the following:
—
—
• Errors should be returned to the caller where possible;
Clippy
• suggests implementing From rather than Into
• ult) warnings for the
following:
—
—
—
—
—
—
• Fat pointer types are described elsewhere, and various Clippy lints point out scenarios where there are unnecessary extra pointer indirections:
—
—
—
• describes the myriad ways to manipulate Iterator instances; Clippy
.
• describes Rust’s standard traits and included some implementation
requirements that Clippy checks:
— .
—
— .
— .
• suggests or rela, which Clippy also detects.
• Importing a wildcard version of a crate isn’t sensible; Clippy’s wildcard_dependencies lint checks for this.
• Avoiding wildcard imports is likewise covered by a Clippy lint (wildcard_imports).
• I touch on the fact that multiple versions of the same crate can appear in your dependency graph.
• Cargo features should be additive, and Clippy includes a warning about negative feature names (e.g., "no_std") that are likely to indicate a feature that falls foul of this.
• A crate’s optional dependencies form part of its feature set; Clippy warns about features that could just make use of this instead.
• There are conventions for documentation comments, and Clippy will also point out the following:
—
—
As the size of this list should make clear, it can be a valuable learning experience to read the full list of Clippy’s lint checks—including the checks that are disabled by default because they are overly pedantic or because they have a high rate of false positives.
Even though you’re unlikely to want to enable these warnings for your code, under‐
standing the reasons why they were written in the first place will improve your
understanding of Rust and its idiom.
Item 30: Write more than unit tests
All companies have test environments.
The lucky ones have production environments separate from the test environment.
—
Like most other modern languages, Rust includes features that make it easy to write tests that live alongside your code and that give you confidence that the code is working correctly.
This isn’t the place to expound on the importance of tests; suffice it to say that if code
isn’t tested, it probably doesn’t work the way you think it does. So this Item assumes
that you’re already signed up to write tests for your code.
Unit tests and integration tests, described in the next two sections, are the key forms
of tests. However, the Rust toolchain, and extensions to the toolchain, allow for vari‐
ous other types of tests. This Item describes their distinct logistics and rationales.
Unit Tests
The most common form of test for Rust code is a unit test, which might look some‐
thing like this:
// ... (code defining `nat_subtract*` functions for natural
// number subtraction)

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_nat_subtract() {
        assert_eq!(nat_subtract(4, 3).unwrap(), 1);
        assert_eq!(nat_subtract(4, 5), None);
    }

    #[should_panic]
    #[test]
    fn test_something_that_panics() {
        nat_subtract_unchecked(4, 5);
    }
}
Some aspects of this example will appear in every unit test:
• A collection of unit test functions.
• Each test function is marked with the #[test] attribute.
• The module holding the test functions is annotated with a #[cfg(test)]
attribute, so the code gets built only in test configurations.
Other aspects of this example illustrate things that are optional and may be relevant
only for particular tests:
• The test code here is held in a separate module, conventionally called tests or
test. This module may be inline (as here) or held in a separate tests.rs file. Using
a separate file for the test module has the advantage that it’s easier to spot
whether code that uses a function is test code or “real” code.
• The test module might have a wildcard use super::* to pull in everything from
the parent module under test. This makes it more convenient to add tests (and is