Libraries Versus Applications - Effective Rust - David Drysdale - RutLib.com

Book: Effective Rust


}

error[E0119]: conflicting implementations of trait `From<WrappedError>` for
              type `WrappedError`
   --> src/main.rs:279:5
    |
279 | impl<E: 'static + Error> From<E> for WrappedError {
    | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |
    = note: conflicting implementation in crate `core`:
            - impl<T> From<T> for T;

32 | Chapter 1: Types

David Tolnay’s anyhow is a crate that has already solved these problems (by adding an extra level of indirection via Box) and that adds other helpful features (such as stack traces) besides. As a result, it is rapidly becoming the standard recommendation for error handling, a recommendation seconded here: consider using the anyhow crate for error handling in applications.

Libraries Versus Applications

The final advice from the previous section included the qualification “…for error

handling in applications.” That’s because there’s often a distinction between code that’s

written for reuse in a library and code that forms a top-level application.

Code that’s written for a library can’t predict the environment in which the code is

used, so it’s preferable to emit concrete, detailed error information and leave the caller

to figure out how to use that information. This leans toward the enum-style nested

errors described previously (and also avoids a dependency on anyhow in the public

API of the library).

However, application code typically needs to concentrate more on how to present

errors to the user. It also potentially has to cope with all of the different error types

emitted by all of the libraries that are present in its dependency graph. As such, a more dynamic error type (such as anyhow::Error) makes error handling simpler and more consistent across the application.
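The split between the two styles can be sketched with the standard library alone. In this minimal illustration, `ConfigError`, `parse_port`, and `describe_port` are hypothetical names invented for the example, and `Box<dyn Error>` stands in for a dynamic error type such as the one anyhow provides; it is not anyhow's API.

```rust
use std::error::Error;
use std::fmt;

// Library side: a concrete enum that callers can match on.

/// Hypothetical error type for an imaginary config-parsing library.
#[derive(Debug)]
pub enum ConfigError {
    MissingKey(String),
    BadNumber(std::num::ParseIntError),
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ConfigError::MissingKey(k) => write!(f, "missing key '{k}'"),
            ConfigError::BadNumber(e) => write!(f, "bad number: {e}"),
        }
    }
}

impl Error for ConfigError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            ConfigError::MissingKey(_) => None,
            ConfigError::BadNumber(e) => Some(e),
        }
    }
}

/// Library function: extract the port from a `port=<number>` line.
pub fn parse_port(line: &str) -> Result<u16, ConfigError> {
    let value = line
        .strip_prefix("port=")
        .ok_or_else(|| ConfigError::MissingKey("port".to_owned()))?;
    value.trim().parse::<u16>().map_err(ConfigError::BadNumber)
}

// Application side: one dynamic error type for everything.

/// Application function: `?` converts any `Error` into the boxed trait object.
pub fn describe_port(line: &str) -> Result<String, Box<dyn Error>> {
    let port = parse_port(line)?;
    Ok(format!("listening on port {port}"))
}
```

The library keeps the detail (a caller can match on `ConfigError::BadNumber`), while the application only needs something it can print.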

Things to Remember

• The standard Error trait requires little of you, so prefer to implement it for your

error types.

• When dealing with heterogeneous underlying error types, decide whether it’s

necessary to preserve those types.

— If not, consider using anyhow to wrap suberrors in application code.

— If so, encode them in an enum and provide conversions. Consider using

thiserror to help with this.

• Consider using the anyhow crate for convenient idiomatic error handling in

application code.

• It’s your decision, but whatever you decide, encode it in the type system.

11 This section is inspired by

Item 4: Prefer idiomatic Error types | 33

Item 5: Understand type conversions

Rust type conversions fall into three categories:

Manual

User-defined type conversions provided by implementing the From and Into

traits

Semi-automatic

Explicit casts between values using the as keyword

Automatic

Implicit coercion into a new type

The majority of this Item focuses on the first of these, manual conversions of types,

because the latter two mostly don’t apply to conversions of user-defined types. There

are a couple of exceptions to this, so sections at the end of the Item discuss casting

and coercion—including how they can apply to a user-defined type.

Note that in contrast to many older languages, Rust does not perform automatic conversion between numeric types. This even applies to “safe” transformations of integral types:

DOES NOT COMPILE

let x: u32 = 2;
let y: u64 = x;

error[E0308]: mismatched types
  --> src/main.rs:70:18
   |
70 |     let y: u64 = x;
   |            ---   ^ expected `u64`, found `u32`
   |            |
   |            expected due to this
   |
help: you can convert a `u32` to a `u64`
   |
70 |     let y: u64 = x.into();
   |                   +++++++
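A version that does compile has to spell the widening out. The helper name `widen` below is made up for this sketch; the two conversions inside it are the standard `Into` and `From` forms.

```rust
/// Widen a `u32` to a `u64` in two equivalent, explicit ways.
pub fn widen(x: u32) -> (u64, u64) {
    // `Into<u64>` is available for `u32` because `u64: From<u32>`.
    let a: u64 = x.into();
    // The same conversion, spelled via `From`.
    let b = u64::from(x);
    (a, b)
}
```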

User-Defined Type Conversions

As with other features of the language, the ability to perform conversions between values of different user-defined types is encapsulated as a standard trait, or rather, as a set of related generic traits.


The four relevant traits that express the ability to convert values of a type are as follows:

From<T>

Items of this type can be built from items of type T, and the conversion always succeeds.

TryFrom<T>

Items of this type can be built from items of type T, but the conversion might not succeed.

Into<T>

Items of this type can be converted into items of type T, and the conversion always succeeds.

TryInto<T>

Items of this type can be converted into items of type T, but the conversion might not succeed.

Given the earlier discussion about expressing things in the type system, it’s no surprise to discover that the difference with the Try... variants is that the sole trait method returns a Result rather than a guaranteed new item. The Try... trait definitions also require an associated type that gives the type of the error E emitted for failure situations.

The first piece of advice is therefore to implement (just) the Try... trait if it’s possible

for a conversion to fail. The alternative is to ignore the possibility of error (e.g., with .unwrap()), but that needs to be a deliberate choice, and in most

cases it’s best to leave that choice to the caller.
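As a minimal sketch of this advice, the following implements only `TryFrom` for a conversion that can fail. The `Percentage` and `OutOfRange` types and the `clamp_to_percentage` helper are hypothetical names invented for the illustration.

```rust
use std::convert::TryFrom;

/// Hypothetical newtype holding a percentage in the range 0..=100.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Percentage(u8);

impl Percentage {
    /// Return the validated inner value.
    pub fn get(self) -> u8 {
        self.0
    }
}

/// Hypothetical error carrying the rejected input.
#[derive(Debug, PartialEq, Eq)]
pub struct OutOfRange(pub u8);

impl TryFrom<u8> for Percentage {
    type Error = OutOfRange;

    fn try_from(value: u8) -> Result<Self, Self::Error> {
        if value <= 100 {
            Ok(Percentage(value))
        } else {
            Err(OutOfRange(value))
        }
    }
}

/// The caller, not the conversion, decides what a failure means:
/// here an out-of-range input is clamped to 100.
pub fn clamp_to_percentage(value: u8) -> Percentage {
    Percentage::try_from(value).unwrap_or(Percentage(100))
}
```

A different caller could just as well propagate `OutOfRange` with `?`, which is the point of leaving the choice outside the conversion.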

The type conversion traits have an obvious symmetry: if a type T can be transformed

into a type U (via Into<U>), isn’t that the same as it being possible to create an item of

type U by transforming from an item of type T (via From<T>)?

This is indeed the case, and it leads to the second piece of advice: implement the From trait for conversions. The Rust standard library had to pick just one of the two possibilities, in order to prevent the system from spiraling around in dizzy circles, and it came down on the side of automatically providing Into from a From implementation.

If you’re consuming one of these two traits, as a trait bound on a new generic of your

own, then the advice is reversed: use the Into trait for trait bounds. That way, the

bound will be satisfied both by things that directly implement Into and by things that

only directly implement From.

12 More properly known as the trait coherence rules.


This automatic conversion is highlighted by the documentation for From and Into,

but it’s worth reading the relevant part of the standard library code too, which is a

blanket trait implementation:

impl<T, U> Into<U> for T
where
    U: From<T>,
{
    fn into(self) -> U {
        U::from(self)
    }
}

Translating a trait specification into words can help with understanding more complex trait bounds. In this case, it’s fairly simple: “I can implement Into<U> for a type T whenever U already implements From<T>.”

The standard library also includes various implementations of these conversion traits for standard library types. As you’d expect, there are From implementations for integral conversions where the destination type includes all possible values of the source type (From<u32> for u64), and TryFrom implementations when the source might not fit in the destination (TryFrom<u64> for u32).

There are also various other blanket trait implementations in addition to the Into version previously shown; these are mostly for smart pointer types, allowing the smart pointer to be automatically constructed from an instance of the type that it holds. This means that generic methods that accept smart pointer parameters can also be called with plain items.

The TryFrom trait also has a blanket implementation for any type that already implements the Into trait in the opposite direction, which automatically includes (as shown previously) any type that implements From in the same direction. In other words, if you can infallibly convert a T into a U, you can also fallibly obtain a U from a T; as this conversion will always succeed, the associated error type is the helpfully named Infallible.
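A small sketch of that blanket implementation in action: because `u64: From<u32>`, calling `try_from` in the same direction is accepted and reports `std::convert::Infallible` as its error type. The function name `widen_fallibly` is invented for the example.

```rust
use std::convert::{Infallible, TryFrom};

/// Uses the blanket `TryFrom` implementation that exists because
/// `u32: Into<u64>`; the error type documents that failure is impossible.
pub fn widen_fallibly(x: u32) -> Result<u64, Infallible> {
    u64::try_from(x)
}
```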

There’s also one very specific generic implementation of From that sticks out, the

reflexive implementation:

impl<T> From<T> for T {
    fn from(t: T) -> T {
        t
    }
}

13 For now; this is likely to be replaced with the ! (“never”) type in a future version of Rust.


Translated into words, this just says that “given a T, I can get a T.” That’s such an obvious “well, duh” that it’s worth stopping to understand why this is useful.

Consider a simple newtype struct and a function that operates on it (ignoring that this function would be better expressed as a method):

/// Integer value from an IANA-controlled range.

#[derive(Clone, Copy, Debug)]

pub struct IanaAllocated(pub u64);

/// Indicate whether value is reserved.

pub fn is_iana_reserved(s: IanaAllocated) -> bool {

s.0 == 0 || s.0 == 65535

}

This function can be invoked with instances of the struct:

let s = IanaAllocated(1);

println!("{:?} reserved? {}", s, is_iana_reserved(s));

// output: "IanaAllocated(1) reserved? false"

but even if From<u64> is implemented for the newtype wrapper:

impl From<u64> for IanaAllocated {
    fn from(v: u64) -> Self {
        Self(v)
    }
}

the function can’t be directly invoked for u64 values:

DOES NOT COMPILE

if is_iana_reserved(42) {
    // ...
}

error[E0308]: mismatched types
  --> src/main.rs:77:25
   |
77 |     if is_iana_reserved(42) {
   |        ---------------- ^^ expected `IanaAllocated`, found integer
   |        |
   |        arguments to this function are incorrect
   |
note: function defined here
  --> src/main.rs:7:8
   |
7  | pub fn is_iana_reserved(s: IanaAllocated) -> bool {
   |        ^^^^^^^^^^^^^^^^ ----------------
help: try wrapping the expression in `IanaAllocated`
   |
77 |     if is_iana_reserved(IanaAllocated(42)) {
   |                         ++++++++++++++  +

However, a generic version of the function that accepts (and explicitly converts) anything satisfying Into<IanaAllocated>:

pub fn is_iana_reserved<T>(s: T) -> bool

where

T: Into<IanaAllocated>,

{

let s = s.into();

s.0 == 0 || s.0 == 65535

}

allows this use:

if is_iana_reserved(42) {

// ...

}

With this trait bound in place, the reflexive trait implementation of From<T> makes

more sense: it means that the generic function copes with items that are already

IanaAllocated instances, no conversion needed.
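Putting the pieces from this section together gives a self-contained sketch in which the same generic function accepts both a raw `u64` (via the `From<u64>` implementation) and an existing `IanaAllocated` (via the reflexive implementation). The explicit `u64` suffixes in the checks are a choice made here to keep type inference obvious.

```rust
/// Integer value from an IANA-controlled range.
#[derive(Clone, Copy, Debug)]
pub struct IanaAllocated(pub u64);

impl From<u64> for IanaAllocated {
    fn from(v: u64) -> Self {
        Self(v)
    }
}

/// Indicate whether value is reserved; accepts anything convertible
/// into an `IanaAllocated`.
pub fn is_iana_reserved<T>(s: T) -> bool
where
    T: Into<IanaAllocated>,
{
    let s = s.into();
    s.0 == 0 || s.0 == 65535
}
```

Both `is_iana_reserved(42u64)` and `is_iana_reserved(IanaAllocated(42))` are valid calls with this signature.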

This pattern also explains why (and how) Rust code sometimes appears to be doing implicit casts between types: the combination of From<T> implementations and Into<T> trait bounds leads to code that appears to magically convert at the call site (but is still doing safe, explicit, conversions under the covers). This pattern becomes even more powerful when combined with reference types and their related conversion traits.

Casts

Rust includes the as keyword to perform explicit casts between some pairs of types.

The pairs of types that can be converted in this way constitute a fairly limited set, and

the only user-defined types it includes are “C-like” enums (those that have just an

associated integer value). General integral conversions are included, though, giving

an alternative to into():

let x: u32 = 9;

let y = x as u64;

let z: u64 = x.into();

The as version also allows lossy conversions:

14 Allowing lossy conversions in Rust was probably a mistake, and there have been discussions around trying to remove this behavior.


let x: u32 = 9;

let y = x as u16;

which would be rejected by the from/into versions:

error[E0277]: the trait bound `u16: From<u32>` is not satisfied
   --> src/main.rs:136:20
    |
136 |     let y: u16 = x.into();
    |                    ^^^^ the trait `From<u32>` is not implemented for `u16`
    |
    = help: the following other types implement trait `From<T>`:
    = note: required for `u32` to implement `Into<u16>`

For consistency and safety, you should prefer from/into conversions over as casts, unless you understand and need the precise casting semantics (e.g., for C interoperability). This advice can be reinforced by Clippy, which includes several lints about as conversions; however, these lints are disabled by default.
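To make the difference concrete, this sketch contrasts a lossy `as` cast with the checked `TryFrom` conversion for a value that does not fit in a `u16`. The helper names `truncate` and `checked` are invented for the example.

```rust
use std::convert::TryFrom;

/// Lossy: `as` keeps only the low 16 bits and never reports a problem.
pub fn truncate(x: u32) -> u16 {
    x as u16
}

/// Checked: the conversion fails when the value does not fit.
pub fn checked(x: u32) -> Option<u16> {
    u16::try_from(x).ok()
}
```

For an input of 70,000 the cast silently yields 4464 (70,000 modulo 65,536), whereas the checked version returns `None`.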

Coercion

The explicit as casts described in the previous section are a superset of the implicit coercions that the compiler will silently perform: any coercion can be forced with an explicit as, but the converse is not true. In particular, the integral conversions performed in the previous section are not coercions and so will always require as.

Most coercions involve silent conversions of pointer and reference types in ways that

are sensible and convenient for the programmer, such as converting the following:

• A mutable reference to an immutable reference (so you can use a &mut T as the

argument to a function that takes a &T)

• A reference to a raw pointer (this isn’t unsafe—the unsafety happens at the point

where you’re foolish enough to dereference a raw pointer)

• A closure that happens to not capture any variables into a bare function pointer

• An array to a slice

• A concrete item to a trait object, for a trait that the concrete item implements

• An item lifetime to a “shorter” one
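Three of the coercions in this list can be seen in a few lines. This is only an illustrative sketch; `read`, `apply`, and `coercion_demo` are names made up for it.

```rust
/// Takes a shared reference.
fn read(v: &i32) -> i32 {
    *v
}

/// Takes a bare function pointer.
fn apply(f: fn(i32) -> i32, x: i32) -> i32 {
    f(x)
}

pub fn coercion_demo() -> (i32, bool, i32) {
    let mut n = 5;
    let m: &mut i32 = &mut n;
    // `&mut i32` is silently coerced to `&i32` at the call site.
    let a = read(m);
    // A reference is silently coerced to a raw pointer; creating it is safe.
    let p: *const i32 = &n;
    let b = !p.is_null();
    // A closure that captures nothing is coerced to a `fn` pointer.
    let c = apply(|x| x + 1, 41);
    (a, b, c)
}
```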

15 Rust refers to these conversions as “subtyping,” but it’s quite different from the definition of “subtyping” used in object-oriented languages.


There are only two coercions whose behavior can be affected by user-defined types. The first happens when a user-defined type implements the Deref or the DerefMut trait. These traits indicate that the user-defined type is acting as a smart pointer of some sort, and in this case the compiler will coerce a reference to the smart pointer item into being a reference to an item of the type that the smart pointer contains (indicated by its Target).

The second coercion of a user-defined type happens when a concrete item is converted to a trait object. This operation builds a fat pointer to the item; this pointer is fat because it includes both a pointer to the item’s location in memory and a pointer to the vtable for the concrete type’s implementation of the trait.
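Both user-influenced coercions can be sketched briefly, assuming the smart pointer trait in question is `std::ops::Deref`. The `Wrapper` type and the two demo functions are hypothetical names for the illustration.

```rust
use std::fmt::Display;
use std::ops::Deref;

/// Hypothetical minimal smart pointer that owns a `String`.
pub struct Wrapper(pub String);

impl Deref for Wrapper {
    type Target = String;

    fn deref(&self) -> &String {
        &self.0
    }
}

fn length(s: &str) -> usize {
    s.len()
}

/// `&Wrapper` is coerced to `&String` (our `Deref`) and then to `&str`
/// (the standard library's `Deref` for `String`).
pub fn deref_demo() -> usize {
    let w = Wrapper("hello".to_owned());
    length(&w)
}

/// `&i32` is coerced to the fat pointer `&dyn Display`.
pub fn trait_object_demo() -> String {
    let n: i32 = 7;
    let d: &dyn Display = &n;
    d.to_string()
}
```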

Item 6: Embrace the newtype pattern

An earlier Item described tuple structs, where the fields of a struct have no names and are instead referred to by number (self.0). This Item focuses on tuple structs that have a

single entry of some existing type, thus creating a new type that can hold exactly the

same range of values as the enclosed type. This pattern is sufficiently pervasive in

Rust that it deserves its own Item and has its own name: the newtype pattern.

The simplest use of the newtype pattern is to indicate additional semantics for a type, over and above its normal behavior. To illustrate this, imagine a project that’s going to send a satellite to Mars. It’s a big project, so different groups have built different parts of the project. One group has handled the code for the rocket engines:

/// Fire the thrusters. Returns generated impulse in pound-force seconds.

pub fn thruster_impulse(direction: Direction) -> f64 {

// ...

return 42.0;

}

while a different group handles the inertial guidance system:

/// Update trajectory model for impulse, provided in Newton seconds.

pub fn update_trajectory(force: f64) {

// ...

}

Eventually these different parts need to be joined together:

let thruster_force: f64 = thruster_impulse(direction);

let new_direction = update_trajectory(thruster_force);

Ruh-roh.

16 Specifically, the .

17 See use of failure.


Rust includes a type alias feature, which allows the different groups to make their intentions clearer:

/// Units for force.

pub type PoundForceSeconds = f64;

/// Fire the thrusters. Returns generated impulse.

pub fn thruster_impulse(direction: Direction) -> PoundForceSeconds {

// ...

return 42.0;

}

/// Units for force.

pub type NewtonSeconds = f64;

/// Update trajectory model for impulse.

pub fn update_trajectory(force: NewtonSeconds) {

// ...

}

However, the type aliases are effectively just documentation; they’re a stronger hint than the doc comments of the previous version, but nothing stops a PoundForceSeconds value being used where a NewtonSeconds value is expected:

let thruster_force: PoundForceSeconds = thruster_impulse(direction);

let new_direction = update_trajectory(thruster_force);

Ruh-roh once more.

This is the point where the newtype pattern helps:

/// Units for force.

pub struct PoundForceSeconds(pub f64);

/// Fire the thrusters. Returns generated impulse.

pub fn thruster_impulse(direction: Direction) -> PoundForceSeconds {

// ...

return PoundForceSeconds(42.0);

}

/// Units for force.

pub struct NewtonSeconds(pub f64);

/// Update trajectory model for impulse.

pub fn update_trajectory(force: NewtonSeconds) {

// ...

}

As the name implies, a newtype is a new type, and as such the compiler objects when

there’s a mismatch of types—here attempting to pass a PoundForceSeconds value to

something that expects a NewtonSeconds value:


DOES NOT COMPILE

let thruster_force: PoundForceSeconds = thruster_impulse(direction);
let new_direction = update_trajectory(thruster_force);

error[E0308]: mismatched types
  --> src/main.rs:76:43
   |
76 |     let new_direction = update_trajectory(thruster_force);
   |                         ----------------- ^^^^^^^^^^^^^^ expected
   |                         |        `NewtonSeconds`, found `PoundForceSeconds`
   |                         |
   |                         arguments to this function are incorrect
   |
note: function defined here
  --> src/main.rs:66:8
   |
66 | pub fn update_trajectory(force: NewtonSeconds) {
   |        ^^^^^^^^^^^^^^^^^ --------------------
help: call `Into::into` on this expression to convert `PoundForceSeconds` into
      `NewtonSeconds`
   |
76 |     let new_direction = update_trajectory(thruster_force.into());
   |                                                         +++++++

As described in Item 5, adding an implementation of the standard From trait:

impl From<PoundForceSeconds> for NewtonSeconds {
    fn from(val: PoundForceSeconds) -> NewtonSeconds {
        NewtonSeconds(4.448222 * val.0)
    }
}

allows the necessary unit—and type—conversion to be performed with .into():

let thruster_force: PoundForceSeconds = thruster_impulse(direction);

let new_direction = update_trajectory(thruster_force.into());
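The whole unit-safe pipeline fits in one compilable sketch. Two simplifications are assumptions made here and not part of the original code: the `Direction` parameter is dropped, and `update_trajectory` returns the impulse it received so that the conversion can be observed.

```rust
/// Impulse in pound-force seconds.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct PoundForceSeconds(pub f64);

/// Impulse in Newton seconds.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct NewtonSeconds(pub f64);

impl From<PoundForceSeconds> for NewtonSeconds {
    fn from(val: PoundForceSeconds) -> NewtonSeconds {
        NewtonSeconds(4.448222 * val.0)
    }
}

/// Fire the thrusters. Returns generated impulse.
pub fn thruster_impulse() -> PoundForceSeconds {
    PoundForceSeconds(42.0)
}

/// Update trajectory model for impulse. For illustration only, this
/// simplified version just reports the value it was given.
pub fn update_trajectory(force: NewtonSeconds) -> f64 {
    force.0
}

/// The unit conversion happens explicitly, and only, at `.into()`.
pub fn fire_and_update() -> f64 {
    update_trajectory(thruster_impulse().into())
}
```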

The same pattern of using a newtype to mark additional “unit” semantics for a type can also help to make purely Boolean arguments less ambiguous. Revisiting an earlier example, using newtypes makes the meaning of arguments clear:

struct DoubleSided(pub bool);

struct ColorOutput(pub bool);

fn print_page(sides: DoubleSided, color: ColorOutput) {

// ...

}

print_page(DoubleSided(true), ColorOutput(false));


If size efficiency or binary compatibility is a concern, then the #[repr(transparent)] attribute ensures that a newtype has the same representation in memory as the inner type.
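Assuming the attribute meant here is `#[repr(transparent)]`, a quick sketch can check the layout claim for one of the Boolean newtypes above. The helper `same_layout_as_bool` is a name invented for the example.

```rust
use std::mem::{align_of, size_of};

/// Newtype guaranteed to share the layout of its single field.
#[repr(transparent)]
pub struct ColorOutput(pub bool);

/// Compare size and alignment of the newtype with the inner type.
pub fn same_layout_as_bool() -> bool {
    size_of::<ColorOutput>() == size_of::<bool>()
        && align_of::<ColorOutput>() == align_of::<bool>()
}
```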

That’s the simple use of newtype, and it’s a specific example of encoding semantics into the type system, so that the compiler takes care of policing those semantics.

Bypassing the Orphan Rule for Traits

The other scenario that requires the newtype pattern revolves around Rust’s orphan rule. Roughly speaking, this says that a crate can

implement a trait for a type only if one of the following conditions holds:

• The crate has defined the trait

• The crate has defined the type

Attempting to implement a foreign trait for a foreign type:

DOES NOT COMPILE

use std::fmt;

impl fmt::Display for rand::rngs::StdRng {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> Result<(), fmt::Error> {
        write!(f, "<StdRng instance>")
    }
}

leads to a compiler error (which in turn points the way back to newtypes):

error[E0117]: only traits defined in the current crate can be implemented for
              types defined outside of the crate
   --> src/main.rs:146:1
    |
146 | impl fmt::Display for rand::rngs::StdRng {
    | ^^^^^^^^^^^^^^^^^^^^^^------------------
    | |                     |
    | |                     `StdRng` is not defined in the current crate
    | impl doesn't use only types from inside the current crate
    |
    = note: define and implement a trait or new type instead

The reason for this restriction is the risk of ambiguity: if two different crates in the dependency graph were both to (say) impl std::fmt::Display for rand::rngs::StdRng, then the compiler/linker has no way to choose between them.


This can frequently lead to frustration: for example, if you’re trying to serialize data

that includes a type from another crate, the orphan rule prevents you from writing

impl serde::Serialize for somecrate::SomeType

But the newtype pattern means that you’re defining a new type, which is part of the

current crate, and so the second part of the orphan trait rule applies. Implementing a

foreign trait is now possible:

struct MyRng(rand::rngs::StdRng);

impl fmt::Display for MyRng {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> Result<(), fmt::Error> {
        write!(f, "<MyRng instance>")
    }
}
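The same workaround can be tried without any external crate, because `Vec<String>` and `Display` are both foreign to the current crate as well. In this sketch the `Tags` newtype is a hypothetical name.

```rust
use std::fmt;

/// Local newtype around a foreign type (`Vec<String>` from the standard
/// library), so that a foreign trait can be implemented for it.
pub struct Tags(pub Vec<String>);

impl fmt::Display for Tags {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Forward to the inner value for the actual content.
        write!(f, "[{}]", self.0.join(", "))
    }
}
```

Writing `impl fmt::Display for Vec<String>` directly would be rejected with the same E0117 error shown earlier.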

Newtype Limitations

The newtype pattern solves these two classes of problems (preventing unit conversions and bypassing the orphan rule), but it does come with some awkwardness: every operation that involves the newtype needs to forward to the inner type.

On a trivial level, that means that the code has to use thing.0 throughout, rather

than just thing, but that’s easy, and the compiler will tell you where it’s needed.

The more significant awkwardness is that any trait implementations on the inner type

are lost: because the newtype is a new type, the existing inner implementation doesn’t

apply.

For derivable traits, this just means that the newtype declaration ends up with lots of

derives:

#[derive(Debug, Copy, Clone, Eq, PartialEq, Ord, PartialOrd)]

pub struct NewType(InnerType);

However, for more sophisticated traits, some forwarding boilerplate is needed to

recover the inner type’s implementation, for example:

use std::fmt;

impl fmt::Display for NewType {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> Result<(), fmt::Error> {
        self.0.fmt(f)
    }
}

18 This is a sufficiently common problem for serde tha


Item 7: Use builders for complex types

This Item describes the builder pattern, where complex data structures have an associated builder type that makes it easier for users to create instances of the data structure.

Rust insists that all fields in a struct must be filled in when a new instance of that struct is created. This keeps the code safe by ensuring that there are never any uninitialized values but does lead to more verbose boilerplate code than is ideal.

For example, any optional fields have to be explicitly marked as absent with None:

/// Phone number in E164 format.

#[derive(Debug, Clone)]

pub struct PhoneNumberE164(pub String);

#[derive(Debug, Default)]

pub struct Details {

pub given_name: String,

pub preferred_name: Option<String>,

pub middle_name: Option<String>,

pub family_name: String,

pub mobile_phone: Option<PhoneNumberE164>,

}

// ...

let dizzy = Details {

given_name: "Dizzy".to_owned(),

preferred_name: None,

middle_name: None,

family_name: "Mixer".to_owned(),

mobile_phone: None,

};

This boilerplate code is also brittle, in the sense that a future change that adds a new

field to the struct requires an update to every place that builds the structure.

The boilerplate can be significantly reduced by implementing and using the Default trait:

let dizzy = Details {

given_name: "Dizzy".to_owned(),

family_name: "Mixer".to_owned(),

..Default::default()

};

Using Default also helps reduce the changes needed when a new field is added, provided that the new field is itself of a type that implements Default.
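A compilable sketch of that resilience follows. The `nickname` field is a hypothetical later addition to the struct, invented for this illustration; the construction site does not mention it and still compiles because `Option<String>` implements `Default`.

```rust
#[derive(Debug, Default, Clone, PartialEq)]
pub struct Details {
    pub given_name: String,
    pub preferred_name: Option<String>,
    pub middle_name: Option<String>,
    pub family_name: String,
    /// Hypothetical field added after `dizzy()` was written.
    pub nickname: Option<String>,
}

/// Only the interesting fields are named; the rest come from `Default`.
pub fn dizzy() -> Details {
    Details {
        given_name: "Dizzy".to_owned(),
        family_name: "Mixer".to_owned(),
        ..Default::default()
    }
}
```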


That’s a more general concern: the automatically derived implementation of Default

works only if all of the field types implement the Default trait. If there’s a field that

doesn’t play along, the derive step doesn’t work:

DOES NOT COMPILE

#[derive(Debug, Default)]
pub struct Details {
    pub given_name: String,
    pub preferred_name: Option<String>,
    pub middle_name: Option<String>,
    pub family_name: String,
    pub mobile_phone: Option<PhoneNumberE164>,
    pub date_of_birth: time::Date,
    pub last_seen: Option<time::OffsetDateTime>,
}

error[E0277]: the trait bound `Date: Default` is not satisfied
  --> src/main.rs:48:9
   |
41 |     #[derive(Debug, Default)]
   |                     ------- in this derive macro expansion
...
48 |         pub date_of_birth: time::Date,
   |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ the trait `Default` is not
   |                                       implemented for `Date`
   |
   = note: this error originates in the derive macro `Default`

The code can’t implement Default for time::Date because of the orphan rule; but even if it could, it wouldn’t be helpful: using a default value for date of birth is going to be wrong almost all of the time.

The absence of Default means that all of the fields have to be filled out manually:

let bob = Details {

given_name: "Robert".to_owned(),

preferred_name: Some("Bob".to_owned()),

middle_name: Some("the".to_owned()),

family_name: "Builder".to_owned(),

mobile_phone: None,

date_of_birth: time::Date::from_calendar_date(

1998,

time::Month::November,

28,

)

.unwrap(),

last_seen: None,

};


These ergonomics can be improved if you implement the builder pattern for complex

data structures.

The simplest variant of the builder pattern is a separate struct that holds the information needed to construct the item. For simplicity, the example will hold an instance of the item itself:

pub struct DetailsBuilder(Details);

impl DetailsBuilder {

/// Start building a new [`Details`] object.

pub fn new(

given_name: &str,

family_name: &str,

date_of_birth: time::Date,

) -> Self {

DetailsBuilder(Details {

given_name: given_name.to_owned(),

preferred_name: None,

middle_name: None,

family_name: family_name.to_owned(),

mobile_phone: None,

date_of_birth,

last_seen: None,

})

}

The builder type can then be equipped with helper methods that fill out the nascent item’s fields. Each such method consumes self but emits a new Self, allowing different construction methods to be chained:

/// Set the preferred name.

pub fn preferred_name(mut self, preferred_name: &str) -> Self {

self.0.preferred_name = Some(preferred_name.to_owned());

self

}

/// Set the middle name.

pub fn middle_name(mut self, middle_name: &str) -> Self {

self.0.middle_name = Some(middle_name.to_owned());

self

}

These helper methods can be more helpful than just simple setters:

/// Update the `last_seen` field to the current date/time.

pub fn just_seen(mut self) -> Self {

self.0.last_seen = Some(time::OffsetDateTime::now_utc());

self

}


The final method to be invoked for the builder consumes the builder and emits the

built item:

/// Consume the builder object and return a fully built [`Details`]

/// object.

pub fn build(self) -> Details {

self.0

}

Overall, this allows clients of the builder to have a more ergonomic building

experience:

let also_bob = DetailsBuilder::new(

"Robert",

"Builder",

time::Date::from_calendar_date(1998, time::Month::November, 28)

.unwrap(),

)

.middle_name("the")

.preferred_name("Bob")

.just_seen()

.build();

The all-consuming nature of this style of builder leads to a couple of wrinkles. The

first is that separating out stages of the build process can’t be done on its own:

DOES NOT COMPILE

let builder = DetailsBuilder::new(

"Robert",

"Builder",

time::Date::from_calendar_date(1998, time::Month::November, 28)

.unwrap(),

);

if informal {

builder.preferred_name("Bob");

}

let bob = builder.build();

error[E0382]: use of moved value: `builder`
   --> src/main.rs:256:15
    |
247 |     let builder = DetailsBuilder::new(
    |         ------- move occurs because `builder` has type `DetailsBuilder`,
    |                 which does not implement the `Copy` trait
...
254 |         builder.preferred_name("Bob");
    |                 --------------------- `builder` moved due to this method
    |                                       call
255 |     }
256 |     let bob = builder.build();
    |               ^^^^^^^ value used here after move
    |
note: `DetailsBuilder::preferred_name` takes ownership of the receiver `self`,
      which moves `builder`
   --> src/main.rs:60:35
    |
27  |     pub fn preferred_name(mut self, preferred_name: &str) -> Self {
    |                               ^^^^

This can be worked around by assigning the consumed builder back to the same

variable:

let mut builder = DetailsBuilder::new(

"Robert",

"Builder",

time::Date::from_calendar_date(1998, time::Month::November, 28)

.unwrap(),

);

if informal {

builder = builder.preferred_name("Bob");

}

let bob = builder.build();
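For readers who want to compile the consuming builder without the `time` crate, here is a reduced sketch. The `Date` struct is a hypothetical stand-in for `time::Date` (deliberately without `Default`), the `mobile_phone` and `last_seen` fields are omitted, and `build_bob` is a helper invented to show the reassignment workaround.

```rust
/// Hypothetical stand-in for `time::Date`; it has no `Default`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Date {
    pub year: i32,
    pub month: u8,
    pub day: u8,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Details {
    pub given_name: String,
    pub preferred_name: Option<String>,
    pub middle_name: Option<String>,
    pub family_name: String,
    pub date_of_birth: Date,
}

pub struct DetailsBuilder(Details);

impl DetailsBuilder {
    /// Start building a new `Details` object from the mandatory fields.
    pub fn new(given_name: &str, family_name: &str, date_of_birth: Date) -> Self {
        DetailsBuilder(Details {
            given_name: given_name.to_owned(),
            preferred_name: None,
            middle_name: None,
            family_name: family_name.to_owned(),
            date_of_birth,
        })
    }

    /// Set the preferred name.
    pub fn preferred_name(mut self, preferred_name: &str) -> Self {
        self.0.preferred_name = Some(preferred_name.to_owned());
        self
    }

    /// Set the middle name.
    pub fn middle_name(mut self, middle_name: &str) -> Self {
        self.0.middle_name = Some(middle_name.to_owned());
        self
    }

    /// Consume the builder and return the built item.
    pub fn build(self) -> Details {
        self.0
    }
}

/// Chained construction plus a conditional stage that reassigns the builder.
pub fn build_bob(informal: bool) -> Details {
    let dob = Date { year: 1998, month: 11, day: 28 };
    let mut builder = DetailsBuilder::new("Robert", "Builder", dob).middle_name("the");
    if informal {
        builder = builder.preferred_name("Bob");
    }
    builder.build()
}
```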

The other downside to the all-consuming nature of this builder is that only one item

can be built; trying to create multiple instances by repeatedly calling build() on the

same builder falls foul of the compiler, as you’d expect:

DOES NOT COMPILE

let smithy = DetailsBuilder::new(

"Agent",

"Smith",

time::Date::from_calendar_date(1999, time::Month::June, 11).unwrap(),

);

let clones = vec![smithy.build(), smithy.build(), smithy.build()];

error[E0382]: use of moved value: `smithy`
   --> src/main.rs:159:39
    |
154 |   let smithy = DetailsBuilder::new(
    |       ------ move occurs because `smithy` has type `base::DetailsBuilder`,
    |              which does not implement the `Copy` trait
...
159 |   let clones = vec![smithy.build(), smithy.build(), smithy.build()];
    |                            -------  ^^^^^^ value used here after move
    |                            |
    |                            `smithy` moved due to this method call

An alternative approach is for the builder’s methods to take a &mut self and emit a

&mut Self:

/// Update the `last_seen` field to the current date/time.

pub fn just_seen(&mut self) -> &mut Self {

self.0.last_seen = Some(time::OffsetDateTime::now_utc());

self

}

This removes the need for self-assignment in separate build stages:

let mut builder = DetailsBuilder::new(

"Robert",

"Builder",

time::Date::from_calendar_date(1998, time::Month::November, 28)

.unwrap(),

);

if informal {

builder.preferred_name("Bob"); // no `builder = ...`

}

let bob = builder.build();

However, this version makes it impossible to chain the construction of the builder

together with invocation of its setter methods:

DOES NOT COMPILE

let builder = DetailsBuilder::new(

"Robert",

"Builder",

time::Date::from_calendar_date(1998, time::Month::November, 28)

.unwrap(),

)

.middle_name("the")

.just_seen();

let bob = builder.build();

error[E0716]: temporary value dropped while borrowed
   --> src/main.rs:265:19
    |
265 |       let builder = DetailsBuilder::new(
    |  ___________________^
266 | |         "Robert",
267 | |         "Builder",
268 | |         time::Date::from_calendar_date(1998, time::Month::November, 28)
269 | |             .unwrap(),
270 | |     )
    | |_____^ creates a temporary value which is freed while still in use
271 |       .middle_name("the")
272 |       .just_seen();
    |                   - temporary value is freed at the end of this statement
273 |       let bob = builder.build();
    |                 --------------- borrow later used here
    |
    = note: consider using a `let` binding to create a longer lived value

As indicated by the compiler error, you can work around this by letting the builder

item have a name:

let mut builder = DetailsBuilder::new(

"Robert",

"Builder",

time::Date::from_calendar_date(1998, time::Month::November, 28)

.unwrap(),

);

builder.middle_name("the").just_seen();

if informal {

builder.preferred_name("Bob");

}

let bob = builder.build();

This mutating builder variant also allows for building multiple items. The signature

of the build() method has to not consume self and so must be as follows:

/// Construct a fully built [`Details`] object.

pub fn build(&self) -> Details {

// ...

}

The implementation of this repeatable build() method then has to construct a fresh

item on each invocation. If the underlying item implements Clone, this is easy—the

builder can hold a template and clone() it for each build. If the underlying item

doesn’t implement Clone, then the builder needs to have enough state to be able to

manually construct an instance of the underlying item on each call to build().
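The clone-a-template approach might look like the following sketch. The `Agent` type, its builder, and `three_smiths` are hypothetical names chosen for the illustration; the point is only that `build(&self)` can be called repeatedly.

```rust
/// Hypothetical item type; `Clone` is what makes repeated builds easy.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Agent {
    pub name: String,
    pub codename: Option<String>,
}

/// Mutating builder that holds a template of the item.
pub struct AgentBuilder {
    template: Agent,
}

impl AgentBuilder {
    pub fn new(name: &str) -> Self {
        AgentBuilder {
            template: Agent { name: name.to_owned(), codename: None },
        }
    }

    /// Setter in the `&mut self` style: no reassignment needed by callers.
    pub fn codename(&mut self, codename: &str) -> &mut Self {
        self.template.codename = Some(codename.to_owned());
        self
    }

    /// Does not consume the builder; each call clones the template.
    pub fn build(&self) -> Agent {
        self.template.clone()
    }
}

/// Build several items from the same builder.
pub fn three_smiths() -> Vec<Agent> {
    let mut builder = AgentBuilder::new("Smith");
    builder.codename("Agent");
    vec![builder.build(), builder.build(), builder.build()]
}
```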

With any style of builder pattern, the boilerplate code is now confined to one place—

the builder—rather than being needed at every place that uses the underlying type.

The boilerplate that remains can potentially be reduced still further by use of a macro, but if you go down this road, you should also check whether there’s an existing crate (such as the derive_builder crate, in particular) that provides what’s needed, assuming that you’re happy to take a dependency on it.

Item 8: Familiarize yourself with reference

and pointer types

For programming in general, a reference is a way to indirectly access some data structure, separately from whatever variable owns that data structure. In practice, this is usually implemented as a pointer: a number whose value is the address in memory of the data structure.


A modern CPU will typically police a few constraints on pointers—the memory

address should be in a valid range of memory (whether virtual or physical) and may

need to be aligned (e.g., a 4-byte integer value might be accessible only if its address is

a multiple of 4).

However, higher-level programming languages usually encode more information

about pointers in their type systems. In C-derived languages, including Rust, pointers

have a type that indicates what kind of data structure is expected to be present at the

pointed-to memory address. This allows the code to interpret the contents of mem‐

ory at that address and in the memory following that address.

This basic level of pointer information—putative memory location and expected data

structure layout—is represented in Rust as a raw pointer. However, safe Rust code

does not use raw pointers, because Rust provides richer reference and pointer types

that provide additional safety guarantees and constraints. These reference and

pointer types are the subject of this Item; raw pointers are relegated to a later Item (which discusses unsafe code).

Rust References

The most ubiquitous pointer-like type in Rust is the reference, with a type that is writ‐

ten as &T for some type T. Although this is a pointer value under the covers, the com‐

piler ensures that various rules around its use are observed: it must always point to a

valid, correctly aligned instance of the relevant type T whose lifetime

extends beyond its use, and it must satisfy the borrow checking rules. These

additional constraints are always implied by the term reference in Rust, and so the

bare term pointer is generally rare.

The constraint that a Rust reference must point to a valid, correctly aligned item is

shared by C++’s reference types. However, C++ has no concept of lifetimes and so

allows footguns with dangling references:

UNDESIRED BEHAVIOR

// C++

const int& dangle() {

int x = 32; // on the stack, overwritten later

return x; // return reference to stack variable!

}

19 Albeit with a warning from modern compilers.


Rust’s borrowing and lifetime checks mean that the equivalent code doesn’t even

compile:

DOES NOT COMPILE

fn dangle() -> &'static i64 {

    let x: i64 = 32; // on the stack

    &x

}

error[E0515]: cannot return reference to local variable `x`

--> src/main.rs:477:5

477 | &x

| ^^ returns a reference to data owned by the current function

A Rust reference &T allows read-only access to the underlying item (roughly equiva‐

lent to C++’s const T&). A mutable reference that also allows the underlying item to

be modified is written as &mut T and is also subject to the borrow checking rules.

This naming pattern reflects a slightly different mindset between

Rust and C++:

• In Rust, the default variant is read-only, and writable types are marked specially

(with mut).

• In C++, the default variant is writable, and read-only types are marked specially

(with const).

The compiler converts Rust code that uses references into machine code that uses

simple pointers, which are eight bytes in size on a 64-bit platform (which this Item

assumes throughout). For example, a pair of local variables together with references

to them:

pub struct Point {

pub x: u32,

pub y: u32,

}

let pt = Point { x: 1, y: 2 };

let x = 0u64;

let ref_x = &x;

let ref_pt = &pt;

might be laid out on the stack as shown in Figure 1-2.

Figure 1-2. Stack layout with pointers to local variables

A Rust reference can refer to items that are located either on the stack or on the heap.

Rust allocates items on the stack by default, but the Box<T> pointer type (roughly

equivalent to C++’s std::unique_ptr<T>) forces allocation to occur on the heap,

which in turn means that the allocated item can outlive the scope of the current

block. Under the covers, Box<T> is also a simple eight-byte pointer value:

let box_pt = Box::new(Point { x: 10, y: 20 });

Figure 1-3. Stack Box pointer to struct on heap
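The size claims can be checked directly. This is a small sketch (the helper name `pointer_sizes` is made up for illustration) that compares against `usize` rather than hard-coding eight bytes, so it also holds on non-64-bit targets:

```rust
use std::mem::size_of;

pub struct Point {
    pub x: u32,
    pub y: u32,
}

/// Returns the sizes of `&Point`, `Box<Point>` and `usize` on the current target.
pub fn pointer_sizes() -> (usize, usize, usize) {
    (size_of::<&Point>(), size_of::<Box<Point>>(), size_of::<usize>())
}
```

On a 64-bit platform all three values are 8.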

Pointer Traits

A method that expects a reference argument like &Point can also be fed a

&Box<Point>:

fn show(pt: &Point) {

println!("({}, {})", pt.x, pt.y);

}


show(ref_pt);

show(&box_pt);

(1, 2)

(10, 20)

This is possible because Box<T> implements the Deref trait, with Target = T. An implementation of this trait for some type means that the trait’s deref() method can be used to create a reference to the Target type. There’s also an equivalent DerefMut

trait, which emits a mutable reference to the Target type.

The Deref/DerefMut traits are somewhat special, because the Rust compiler has spe‐

cific behavior when dealing with types that implement them. When the compiler

encounters a dereferencing expression (e.g., *x), it looks for and uses an implementation of one of these traits, depending on whether the dereference requires mutable

access or not. This Deref coercion allows various smart pointer types to behave like

normal references and is one of the few mechanisms that allow implicit type conversion

in Rust.
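To make the coercion concrete, here is a sketch with a hypothetical `Wrapper<T>` type (not part of the standard library) that implements both traits. A `&Wrapper<String>` is then accepted where a `&str` is expected, and mutating methods of the inner type can be called directly:

```rust
use std::ops::{Deref, DerefMut};

/// Hypothetical wrapper type, used only to illustrate `Deref`.
pub struct Wrapper<T>(pub T);

impl<T> Deref for Wrapper<T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.0
    }
}

impl<T> DerefMut for Wrapper<T> {
    fn deref_mut(&mut self) -> &mut T {
        &mut self.0
    }
}

fn len_of(s: &str) -> usize {
    s.len()
}

pub fn deref_demo() -> (usize, String) {
    let mut w = Wrapper(String::from("abc"));
    // `&Wrapper<String>` coerces to `&String` and then to `&str`.
    let n = len_of(&w);
    // The method call reaches `String::push` through `DerefMut`.
    w.push('d');
    (n, w.0)
}
```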

As a technical aside, it’s worth understanding why the Deref traits can’t be generic

(Deref<Target>) for the destination type. If they were, then it would be possible for

some type ConfusedPtr to implement both Deref<TypeA> and Deref<TypeB>, and

that would leave the compiler unable to deduce a single unique type for an expression

like *x. So instead, the destination type is encoded as the associated type named

Target.

This technical aside provides a contrast to two other standard pointer traits, the

AsRef and AsMut traits. These traits don’t induce special behavior in the compiler but allow conversions to a reference or mutable reference via an explicit call to their trait

functions (as_ref() and as_mut(), respectively). The destination type for these conversions is encoded as a type parameter (e.g., AsRef<Point>), which means that a single

container type can support multiple destinations.

For example, the standard String type implements the Deref trait with Target =

str, meaning that an expression like &my_string can be coerced to type &str. But it

also implements the following:

• AsRef<[u8]>, allowing conversion to a byte slice &[u8]

• AsRef<OsStr>, allowing conversion to an OS string

• AsRef<Path>, allowing conversion to a filesystem path

• AsRef<str>, allowing conversion to a string slice &str (as with Deref)
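A short sketch of how these implementations get used: the two helper functions below (names invented for this example) are generic over `AsRef`, so the same `String` or `&str` argument can be viewed as a path or as bytes.

```rust
use std::path::Path;

/// Accepts anything that can be viewed as a `Path`: `&str`, `String`, `PathBuf`, ...
pub fn extension_of<P: AsRef<Path>>(p: P) -> Option<String> {
    p.as_ref()
        .extension()
        .map(|ext| ext.to_string_lossy().into_owned())
}

/// Accepts anything that can be viewed as a byte slice.
pub fn byte_len<B: AsRef<[u8]>>(b: B) -> usize {
    b.as_ref().len()
}
```

Unlike Deref coercion, the conversion here is always an explicit `as_ref()` call inside the function.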


Fat Pointer Types

Rust has two built-in fat pointer types: slices and trait objects. These are types that act

as pointers but hold additional information about the thing they are pointing to.

Slices

The first fat pointer type is the slice: a reference to a subset of some contiguous collec‐

tion of values. It’s built from a (non-owning) simple pointer, together with a length

field, making it twice the size of a simple pointer (16 bytes on a 64-bit platform). The

type of a slice is written as &[T]—a reference to [T], which is the notional type for a

contiguous collection of values of type T.

The notional type [T] can’t be instantiated, but there are two common containers that

embody it. The first is the array: a contiguous collection of values having a size that is

known at compile time—an array with five values will always have five values. A slice

can therefore refer to a subset of an array, as shown in Figure 1-4:

let array: [u64; 5] = [0, 1, 2, 3, 4];

let slice = &array[1..3];

Figure 1-4. Stack slice pointing into a stack array


The other common container for contiguous values is a Vec<T>. This holds a contigu‐

ous collection of values like an array, but unlike an array, the number of values in the

Vec can grow (e.g., with push(value)) or shrink (e.g., with pop()).

The contents of the Vec are kept on the heap (which allows for this variation in size)

but are always contiguous, and so a slice can refer to a subset of a vector, as shown in Figure 1-5:

let mut vector = Vec::<u64>::with_capacity(8);

for i in 0..5 {

vector.push(i);

}

let vslice = &vector[1..3];

Figure 1-5. Stack slice pointing into Vec contents on the heap

There’s quite a lot going on under the covers for the expression &vector[1..3], so it’s

worth breaking it down into its components:

• The 1..3 part is a range expression; the compiler converts this into an instance of the

Range<usize> type, which holds an inclusive lower bound and an exclusive upper bound.

• The Range type implements the SliceIndex<T> trait, which describes indexing

operations on slices of an arbitrary type T (so the Output type is [T]).

• The vector[ ] part is an indexing expression; the compiler converts this into an invocation of the Index trait’s index method on vector, together with a dereference (i.e., *vector.index( )).

• vector[1..3] therefore invokes Vec<T>’s implementation of Index<I>, which requires I to be an instance of SliceIndex<[u64]>. This works because

Range<usize> implements SliceIndex<[T]> for any T, including u64.

• &vector[1..3] undoes the dereference, resulting in a final expression type of

&[u64].
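The breakdown above can be written out by hand. The sketch below (function names are illustrative) calls `Index::index` explicitly and checks that it matches the sugared form; it also checks that a slice reference occupies two machine words.

```rust
use std::mem::size_of;
use std::ops::Index;

pub fn slice_via_index() -> Vec<u64> {
    let vector: Vec<u64> = vec![0, 1, 2, 3, 4];
    // Sugared form.
    let sugared: &[u64] = &vector[1..3];
    // Roughly what the compiler does: call `Index::index`, dereference, re-borrow.
    let explicit: &[u64] = &*Index::index(&vector, 1usize..3);
    assert_eq!(sugared, explicit);
    explicit.to_vec()
}

/// A slice reference holds a pointer plus a length.
pub fn slice_ref_is_two_words() -> bool {
    size_of::<&[u64]>() == 2 * size_of::<usize>()
}
```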

Trait objects

The second built-in fat pointer type is a trait object: a reference to some item that

implements a particular trait. It’s built from a simple pointer to the item, together

with an internal pointer to the type’s vtable, giving a size of 16 bytes (on a 64-bit platform).

The vtable for a type’s implementation of a trait holds function pointers for each of the method implementations, allowing dynamic dispatch at runtime.

So a simple trait:

trait Calculate {

fn add(&self, l: u64, r: u64) -> u64;

fn mul(&self, l: u64, r: u64) -> u64;

}

with a struct that implements it:

struct Modulo(pub u64);

impl Calculate for Modulo {

fn add(&self, l: u64, r: u64) -> u64 {

(l + r) % self.0

}

fn mul(&self, l: u64, r: u64) -> u64 {

(l * r) % self.0

}

}

let mod3 = Modulo(3);

20 The equivalent trait for mutable expressions is IndexMut.

21 This is somewhat simplified; a full vtable also includes information about the size and alignment of the type, together with a drop() function pointer so that the underlying object can be safely dropped.


can be converted to a trait object of type &dyn Trait. The dyn keyword highlights the fact that dynamic dispatch is involved:

// Need an explicit type to force dynamic dispatch.

let tobj: &dyn Calculate = &mod3;

let result = tobj.add(2, 2);

assert_eq!(result, 1);

The equivalent memory layout is shown in Figure 1-6.

Figure 1-6. Trait object with pointers to concrete item and vtable

Code that holds a trait object can invoke the methods of the trait via the function

pointers in the vtable, passing in the item pointer as the &self parameter.
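A sketch of dynamic dispatch in action, using a cut-down version of the `Calculate` trait and adding a second, hypothetical implementor (`Plain`) so that a single call site ends up invoking different code at runtime:

```rust
use std::mem::size_of;

trait Calculate {
    fn add(&self, l: u64, r: u64) -> u64;
}

struct Modulo(u64);
struct Plain; // hypothetical second implementation, added for this sketch

impl Calculate for Modulo {
    fn add(&self, l: u64, r: u64) -> u64 {
        (l + r) % self.0
    }
}

impl Calculate for Plain {
    fn add(&self, l: u64, r: u64) -> u64 {
        l + r
    }
}

/// One call site; which `add` runs is decided at runtime through the vtable.
pub fn add_all() -> Vec<u64> {
    let calcs: Vec<Box<dyn Calculate>> = vec![Box::new(Modulo(3)), Box::new(Plain)];
    calcs.iter().map(|c| c.add(2, 2)).collect()
}

/// A trait object reference holds an item pointer plus a vtable pointer.
pub fn trait_object_is_two_words() -> bool {
    size_of::<&dyn Calculate>() == 2 * size_of::<usize>()
}
```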

More Pointer Traits

A previous section described two pairs of traits (Deref/DerefMut, AsRef/

AsMut) that are used when dealing with types that can be easily converted into refer‐

ences. There are a few more standard traits that can also come into play when work‐

ing with pointer-like types, whether from the standard library or user defined.

The simplest of these is the Pointer trait, which formats a pointer value for output.

This can be helpful for low-level debugging, and the compiler will reach for this trait

automatically when it encounters the {:p} format specifier.

More interesting are the Borrow and BorrowMut traits, which each have a single

method (borrow() and borrow_mut(), respectively). This method has the same signature as the equivalent AsRef/AsMut trait methods.


The key difference in intents between these traits is visible via the blanket implemen‐

tations that the standard library provides. Given an arbitrary Rust reference &T, there

is a blanket implementation of both AsRef and Borrow; likewise, for a mutable refer‐

ence &mut T, there’s a blanket implementation of both AsMut and BorrowMut.

However, Borrow also has a blanket implementation for (non-reference) types:

impl<T> Borrow<T> for T.

This means that a method accepting the Borrow trait can cope equally with instances

of T as well as references-to-T:

fn add_four<T: std::borrow::Borrow<i32>>(v: T) -> i32 {

v.borrow() + 4

}

assert_eq!(add_four(&2), 6);

assert_eq!(add_four(2), 6);

The standard library’s container types have more realistic uses of Borrow. For example,

HashMap::get uses Borrow to allow convenient retrieval of entries whether keyed by value or by reference.
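As a quick illustration of that lookup flexibility (the map contents and the `lookup` helper are made up for this sketch), a map keyed by `String` can be queried with either a `&str` or a `&String`:

```rust
use std::collections::HashMap;

pub fn lookup() -> (Option<i32>, Option<i32>) {
    let mut ages: HashMap<String, i32> = HashMap::new();
    ages.insert("alice".to_string(), 30);
    // Keys are `String`, but `String: Borrow<str>` lets `get` accept a `&str`...
    let by_str = ages.get("alice").copied();
    // ...as well as a reference to an owned `String`.
    let key = String::from("alice");
    let by_string = ages.get(&key).copied();
    (by_str, by_string)
}
```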

The ToOwned trait builds on the Borrow trait, adding a to_owned() method that produces a new owned item of the underlying type. This is a generalization of the Clone

trait: where Clone specifically requires a Rust reference &T, ToOwned instead copes

with things that implement Borrow.

This gives a couple of possibilities for handling both references and moved items in a

unified way:

• A function that operates on references to some type can accept Borrow so that it

can also be called with moved items as well as references.

• A function that operates on owned items of some type can accept ToOwned so that

it can also be called with references to items as well as moved items; any refer‐

ences passed to it will be replicated into a locally owned item.

Although it’s not a pointer type, the Cow type is worth mentioning at this point, because it provides an alternative way of dealing with the same kind of situation. Cow

is an enum that can hold either owned data or a reference to borrowed data. The pecu‐

liar name stands for “clone-on-write”: a Cow input can remain as borrowed data right

up to the point where it needs to be modified, but it becomes an owned copy at the

point where the data needs to be altered.
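A minimal sketch of the clone-on-write idea, using an invented `normalize` function: the input is returned as borrowed data when no change is needed, and an owned `String` is allocated only when it is.

```rust
use std::borrow::Cow;

/// Replaces spaces with underscores, allocating only if a change is needed.
pub fn normalize(input: &str) -> Cow<'_, str> {
    if input.contains(' ') {
        Cow::Owned(input.replace(' ', "_"))
    } else {
        Cow::Borrowed(input)
    }
}
```

Callers can treat the result as a `&str` in both cases, because `Cow<str>` implements `Deref<Target = str>`.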

Smart Pointer Types

The Rust standard library includes a variety of types that act like pointers to some

degree or another, mediated by the standard library traits previously described. These


smart pointer types each come with some particular semantics and guarantees, which

has the advantage that the right combination of them can give fine-grained control

over the pointer’s behavior, but has the disadvantage that the resulting types can seem

overwhelming at first (Rc<RefCell<Vec<T>>>, anyone?).

The first smart pointer type is Rc<T>, which is a reference-counted pointer to an item (roughly analogous to C++’s std::shared_ptr<T>). It implements all of the pointer-related traits and so acts like a Box<T> in many ways.

This is useful for data structures where the same item can be reached in different

ways, but it removes one of Rust’s core rules around ownership—that each item has

only one owner. Relaxing this rule means that it is now possible to leak data: if item A

has an Rc pointer to item B, and item B has an Rc pointer to A, then the pair will

never be dropped. To put it another way: you need Rc to support cyclical data struc‐

tures, but the downside is that there are now cycles in your data structures.

The risk of leaks can be ameliorated in some cases by the related Weak<T> type, which holds a non-owning reference to the underlying item (roughly analogous to C++’s

std::weak_ptr<T>). Holding a weak reference doesn’t prevent the underlying item from being dropped (when all strong references are removed), so making use of the

Weak<T> involves an upgrade to an Rc<T>, which can fail.

Under the hood, Rc is (currently) implemented as a pair of reference counts together

with the referenced item, all stored on the heap (as depicted in Figure 1-7):

use std::rc::Rc;

let rc1: Rc<u64> = Rc::new(42);

let rc2 = rc1.clone();

let wk = Rc::downgrade(&rc1);

Figure 1-7. Rc and Weak pointers all referring to the same heap item

22 Note that this doesn’t affect Rust’s memory safety guarantees: the items are still safe, just inaccessible.


The underlying item is dropped when the strong reference count drops to zero, but

the bookkeeping structure is dropped only when the weak reference count also drops

to zero.
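The counts and the fallible upgrade can be observed directly. This sketch (the function name is illustrative) extends the snippet above: it reads both counts, upgrades the weak pointer while the item is alive, and then shows the upgrade failing once every strong reference has gone.

```rust
use std::rc::Rc;

pub fn weak_lifecycle() -> (usize, usize, Option<u64>, Option<u64>) {
    let rc1: Rc<u64> = Rc::new(42);
    let rc2 = Rc::clone(&rc1);
    let wk = Rc::downgrade(&rc1);
    let strong = Rc::strong_count(&rc1);
    let weak = Rc::weak_count(&rc1);
    // Upgrade succeeds while a strong reference exists.
    let before = wk.upgrade().map(|rc| *rc);
    drop(rc1);
    drop(rc2);
    // All strong references are gone, so the item has been dropped.
    let after = wk.upgrade().map(|rc| *rc);
    (strong, weak, before, after)
}
```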

An Rc on its own gives you the ability to reach an item in different ways, but when

you reach that item, you can modify it (via get_mut) only if there are no other ways to reach the item; that is, there are no other extant Rc or Weak references to the same

item. That’s hard to arrange, so Rc is often combined with RefCell.

The next smart pointer type, RefCell<T>, relaxes the rule that an item can be mutated only by its owner or by code that holds the (only) mutable reference to

the item. This interior mutability allows for greater flexibility—for example, allowing

trait implementations that mutate internals even when the method signature allows

only &self. However, it also incurs costs: as well as the extra storage overhead (an

extra isize to track current borrows, as shown in Figure 1-8), the normal borrow

checks are moved from compile time to runtime:

use std::cell::RefCell;

let rc: RefCell<u64> = RefCell::new(42);

let b1 = rc.borrow();

let b2 = rc.borrow();

Figure 1-8. Ref borrows referring to a RefCell container


The runtime nature of these checks means that the RefCell user has to choose

between two options, neither pleasant:

• Accept that borrowing is an operation that might fail, and cope with Result val‐

ues from try_borrow[_mut]

• Use the allegedly infallible borrowing methods borrow[_mut], and accept the risk

of a panic! at runtime if the borrow rules have not been complied with

In either case, this runtime checking means that RefCell itself implements none of

the standard pointer traits; instead, its access operations return a Ref<T> or RefMut<T>

smart pointer type that does implement those traits.
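The runtime checks can be seen with the fallible variants. In this sketch (the function name is illustrative), a mutable borrow is refused while a shared borrow is live and allowed once it has been dropped:

```rust
use std::cell::RefCell;

pub fn refcell_demo() -> (bool, bool, u64) {
    let cell = RefCell::new(42u64);
    let b1 = cell.borrow();
    // A shared borrow is active, so a mutable borrow is refused (no panic).
    let refused = cell.try_borrow_mut().is_err();
    drop(b1);
    // With no outstanding borrows, a mutable borrow is allowed.
    let allowed = cell.try_borrow_mut().is_ok();
    *cell.borrow_mut() += 1;
    let value = *cell.borrow();
    (refused, allowed, value)
}
```

Calling `cell.borrow_mut()` while `b1` was still alive would have panicked instead of returning an error.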

If the underlying type T implements the Copy trait (indicating that a fast bit-for-bit

copy produces a valid item), then the Cell<T> type allows interior mutation with less overhead: the get(&self) method copies out the current value, and

the set(&self, val) method copies in a new value. The Cell type is used internally

by both the Rc and RefCell implementations, for shared tracking of counters that can

be mutated without a &mut self.
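As a sketch of that counter-tracking use, the hypothetical `Tracked` type below records how many times it has been read, even though its accessor only takes `&self`:

```rust
use std::cell::Cell;

/// Hypothetical value holder that counts reads through `&self`.
pub struct Tracked {
    value: u32,
    reads: Cell<u32>,
}

impl Tracked {
    pub fn new(value: u32) -> Self {
        Self { value, reads: Cell::new(0) }
    }

    /// Returns the value and bumps the read counter without needing `&mut self`.
    pub fn get(&self) -> u32 {
        self.reads.set(self.reads.get() + 1);
        self.value
    }

    pub fn reads(&self) -> u32 {
        self.reads.get()
    }
}
```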

The smart pointer types described so far are suitable only for single-threaded use;

their implementations assume that there is no concurrent access to their internals. If

this is not the case, then smart pointers that include additional synchronization over‐

head are needed.

The thread-safe equivalent of Rc<T> is Arc<T>, which uses atomic counters to ensure that the reference counts remain accurate. Like Rc, Arc implements all of the various

pointer-related traits.

However, Arc on its own does not allow any kind of mutable access to the underlying

item. This is covered by the Mutex type, which ensures that only one thread has access (whether mutably or immutably) to the underlying item. As with RefCell,

Mutex itself does not implement any pointer traits, but its lock() operation returns a

value of a type that does: MutexGuard, which implements Deref[Mut].

If there are likely to be more readers than writers, the RwLock type is preferable, as it allows multiple readers access to the underlying item in parallel, provided that there

isn’t currently a (single) writer.
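A minimal sketch of the usual `Arc<Mutex<T>>` combination, with an invented `parallel_count` helper: each thread gets its own clone of the `Arc`, and every update goes through the guard returned by `lock()`.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

pub fn parallel_count(threads: usize, per_thread: u64) -> u64 {
    let counter = Arc::new(Mutex::new(0u64));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            // Each thread owns its own `Arc` pointing at the shared `Mutex`.
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    // `lock()` returns a `MutexGuard`, which implements `DerefMut`.
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let total = *counter.lock().unwrap();
    total
}
```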

In either case, Rust’s borrowing and threading rules force the use of one of these syn‐

chronization containers in multithreaded code (but this guards against only some of

the problems of shared-state concurrency).

The same strategy—see what the compiler rejects and what it suggests instead—can

sometimes be applied with the other smart pointer types. However, it’s faster and less


frustrating to understand what the behavior of the different smart pointers implies.

To borrow (pun intended) an example:

• Rc<RefCell<Vec<T>>> holds a vector (Vec) with shared ownership (Rc), where

the vector can be mutated—but only as a whole vector.

• Rc<Vec<RefCell<T>>> also holds a vector with shared ownership, but here each

individual entry in the vector can be mutated independently of the others.

The types involved precisely describe these behaviors.
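The difference between the two nestings shows up in what each allows through a shared handle. In this sketch (the function name is illustrative), the first form can push a new entry, while the second can only overwrite existing entries:

```rust
use std::cell::RefCell;
use std::rc::Rc;

pub fn whole_vs_entries() -> (Vec<u32>, Vec<u32>) {
    // Whole vector is mutable: entries and the length can both change.
    let whole: Rc<RefCell<Vec<u32>>> = Rc::new(RefCell::new(vec![1, 2]));
    let alias = Rc::clone(&whole);
    alias.borrow_mut().push(3);

    // Only entries are mutable: the length is fixed while the vector is shared.
    let entries: Rc<Vec<RefCell<u32>>> = Rc::new(vec![RefCell::new(1), RefCell::new(2)]);
    let alias2 = Rc::clone(&entries);
    *alias2[0].borrow_mut() = 10;

    let a: Vec<u32> = whole.borrow().to_vec();
    let b: Vec<u32> = entries.iter().map(|c| *c.borrow()).collect();
    (a, b)
}
```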

Item 9: Consider using iterator transforms instead of explicit loops

The humble loop has had a long journey of increasing convenience and increasing

abstraction. The B language (the precursor to C) had only while (condition)

{ ... }, but with the arrival of C, the common scenario of iterating through indexes

of an array became more convenient with the addition of the for loop:

// C code

int i;

for (i = 0; i < len; i++) {

Item item = collection[i];

// body

}

The early versions of C++ further improved convenience and scoping by allowing the

loop variable declaration to be embedded in the for statement (this was also adopted

by C in C99):

// C++98 code

for (int i = 0; i < len; i++) {

Item item = collection[i];

// ...

}

Most modern languages abstract the idea of the loop further: the core function of a

loop is often to move to the next item of some container. Tracking the logistics that

are required to reach that item (index++ or ++it) is mostly an irrelevant detail. This

realization produced two core concepts:

Iterators

A type whose purpose is to repeatedly emit the next item of a container, until

exhausted

23 In fact, the iterator can be more general—the idea of emitting next items until completion need not be associated with a container.


For-each loops

A compact loop expression for iterating over all of the items in a container, bind‐

ing a loop variable to the item rather than to the details of reaching that item

These concepts allow for loop code that’s shorter and (more importantly) clearer

about what’s intended:

// C++11 code

for (Item& item : collection) {

// ...

}

Once these concepts were available, they were so obviously powerful that they were

quickly retrofitted to those languages that didn’t already have them (e.g., for-each

loops were added to Java 1.5 and C++11).

Rust includes iterators and for-each–style loops, but it also includes the next step in

abstraction: allowing the whole loop to be expressed as an iterator transform (sometimes

also referred to as an iterator adaptor). As with the earlier discussion of Option and Result, this Item will attempt to show how these iterator transforms can be used

instead of explicit loops, and will give guidance as to when it’s a good idea. In particu‐

lar, iterator transforms can be more efficient than an explicit loop, because the com‐

piler can skip the bounds checks it might otherwise need to perform.

By the end of this Item, a C-like explicit loop to sum the squares of the first five even

items of a vector:

let values: Vec<u64> = vec![1, 1, 2, 3, 5 /* ... */];

let mut even_sum_squares = 0;

let mut even_count = 0;

for i in 0..values.len() {

if values[i] % 2 != 0 {

continue;

}

even_sum_squares += values[i] * values[i];

even_count += 1;

if even_count == 5 {

break;

}

}

should start to feel more natural expressed as a functional-style expression:

let even_sum_squares: u64 = values

.iter()

.filter(|x| *x % 2 == 0)

.take(5)

.map(|x| x * x)

.sum();
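To check that the two forms really are equivalent, the sketch below wraps each in a function (the function names are invented for this example) so that they can be run against the same input:

```rust
/// Explicit index-based loop, as in the first version.
pub fn loop_version(values: &[u64]) -> u64 {
    let mut even_sum_squares = 0;
    let mut even_count = 0;
    for i in 0..values.len() {
        if values[i] % 2 != 0 {
            continue;
        }
        even_sum_squares += values[i] * values[i];
        even_count += 1;
        if even_count == 5 {
            break;
        }
    }
    even_sum_squares
}

/// Iterator transform version of the same computation.
pub fn iterator_version(values: &[u64]) -> u64 {
    values
        .iter()
        .filter(|x| *x % 2 == 0)
        .take(5)
        .map(|x| x * x)
        .sum()
}
```

Both functions stop after the fifth even value, so later even entries in the input do not contribute.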


Iterator transformation expressions like this can roughly be broken down into three

parts:

• An initial source iterator, from an instance of a type that implements one of

Rust’s iterator traits

• A sequence of iterator transforms

• A final consumer method to combine the results of the iteration into a final value

The first two of these parts effectively move functionality out of the loop body and

into the for expression; the last removes the need for the for statement altogether.

Iterator Traits

The core Iterator trait has a very simple interface: a single method next that yields Some items until it doesn’t (None). The type of the emitted items is given by the trait’s

associated Item type.
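Implementing the trait for a new type only requires the `next` method and the `Item` type. A minimal sketch, using a hypothetical `Countdown` type:

```rust
/// Hypothetical iterator that counts down from its starting value to 1.
pub struct Countdown(pub u32);

impl Iterator for Countdown {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        if self.0 == 0 {
            None
        } else {
            let current = self.0;
            self.0 -= 1;
            Some(current)
        }
    }
}
```

Everything else on `Iterator` (such as `map`, `filter`, and `collect`) is then available through the trait's default method implementations.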

Collections that allow iteration over their contents (what would be called iterables in

other languages) implement the IntoIterator trait; the into_iter method of this trait consumes Self and emits an Iterator in its stead. The compiler will automatically use this trait for expressions of the form:

for item in collection {

// body

}

effectively converting them to code roughly like:

let mut iter = collection.into_iter();

loop {

let item: Thing = match iter.next() {

Some(item) => item,

None => break,

};

// body

}

or more succinctly and more idiomatically:

let mut iter = collection.into_iter();

while let Some(item) = iter.next() {

// body

}

To keep things running smoothly, there’s also an implementation of IntoIterator for

any Iterator, which just returns self; after all, it’s easy to convert an Iterator into

an Iterator!

This initial form is a consuming iterator, using up the collection as it’s created:


let collection = vec![Thing(0), Thing(1), Thing(2), Thing(3)];

for item in collection {

println!("Consumed item {item:?}");

}

Any attempt to use the collection after it’s been iterated over fails:

println!("Collection = {collection:?}");

error[E0382]: borrow of moved value: `collection`

--> src/main.rs:171:28

163 | let collection = vec![Thing(0), Thing(1), Thing(2), Thing(3)];

| ---------- move occurs because `collection` has type `Vec<Thing>`,

| which does not implement the `Copy` trait

164 | for item in collection {

| ---------- `collection` moved due to this implicit call to

| `.into_iter()`

...

171 | println!("Collection = {collection:?}");

| ^^^^^^^^^^^^^^ value borrowed here after move

note: `into_iter` takes ownership of the receiver `self`, which moves

`collection`

While simple to understand, this all-consuming behavior is often undesired; some

kind of borrow of the iterated items is needed.

To ensure that behavior is clear, the examples here use a Thing type that does not

implement Copy, as that would hide questions of ownership (the compiler would silently make copies everywhere):

// Deliberately not `Copy`

#[derive(Clone, Debug, Eq, PartialEq)]

struct Thing(u64);

let collection = vec![Thing(0), Thing(1), Thing(2), Thing(3)];

If the collection being iterated over is prefixed with &:

for item in &collection {

println!("{}", item.0);

}

println!("collection still around {collection:?}");

then the Rust compiler will look for an implementation of IntoIterator for the type

&Collection. Properly designed collection types will provide such an implementa‐

tion; this implementation will still consume Self, but now Self is &Collection

rather than Collection, and the associated Item type will be a reference &Thing.

This leaves the collection intact after iteration, and the equivalent expanded code is as

follows:


let mut iter = (&collection).into_iter();

while let Some(item) = iter.next() {

println!("{}", item.0);

}

If it makes sense to provide iteration over mutable references, then a similar pattern applies for for item in &mut collection: the compiler looks for and uses an implementation of IntoIterator for &mut Collection, with each Item being of type &mut

Thing.
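For a user-defined collection, these implementations have to be written by hand. A sketch with a hypothetical `Shelf` type that simply delegates to the iterators of its inner `Vec`:

```rust
/// Hypothetical collection type, used only for illustration.
pub struct Shelf {
    pub items: Vec<String>,
}

// `for item in &shelf` works because `&Shelf` implements `IntoIterator`.
impl<'a> IntoIterator for &'a Shelf {
    type Item = &'a String;
    type IntoIter = std::slice::Iter<'a, String>;

    fn into_iter(self) -> Self::IntoIter {
        self.items.iter()
    }
}

// `for item in shelf` consumes the collection.
impl IntoIterator for Shelf {
    type Item = String;
    type IntoIter = std::vec::IntoIter<String>;

    fn into_iter(self) -> Self::IntoIter {
        self.items.into_iter()
    }
}

/// Iterates by reference, so the caller keeps ownership of the shelf.
pub fn total_len(shelf: &Shelf) -> usize {
    let mut total = 0;
    for item in shelf {
        total += item.len();
    }
    total
}
```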

By convention, standard containers also provide an iter() method that returns an

iterator over references to the underlying item, and an equivalent iter_mut()

method, if appropriate, with the same behavior as just described. These methods can

be used in for loops but have a more obvious benefit when used as the start of an

iterator transformation:

let result: u64 = (&collection).into_iter().map(|thing| thing.0).sum();

becomes:

let result: u64 = collection.iter().map(|thing| thing.0).sum();

Iterator Transforms

The Iterator trait has a single required method (next) but also provides default implementations of a large number of other methods that perform transformations on an iterator.

Some of these transformations affect the overall iteration process:

take(n)

Restricts an iterator to emitting at most n items.

skip(n)

Skips over the first n elements of the iterator.

step_by(n)

Converts an iterator so it emits only every nth item.

chain(other)

Glues together two iterators, to build a combined iterator that moves through

one then the other.

24 This method can’t be provided if a mutation to the item might invalidate the container’s internal guarantees.

For example, changing the item’s contents in a way that alters its Hash value would invalidate the internal data structures of a HashMap.

cycle()

Converts an iterator that terminates into one that repeats forever, starting at the

beginning again whenever it reaches the end. (The iterator must support Clone

to allow this.)

rev()

Reverses the direction of an iterator. (The iterator must implement the DoubleEndedIterator trait.)
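A few of the standard library's iteration-shaping methods (`skip`, `take`, `step_by`, `chain`, `rev`, and `cycle`) are sketched below on small integer ranges; the function name and inputs are chosen purely for illustration.

```rust
pub fn shaping() -> (Vec<u32>, Vec<u32>, Vec<u32>, Vec<u32>) {
    // Skip the first two items, then emit at most three.
    let taken: Vec<u32> = (1..=10).skip(2).take(3).collect();
    // Emit every fourth item, starting with the first.
    let stepped: Vec<u32> = (0..10).step_by(4).collect();
    // Glue two ranges together, then walk the result backward.
    let chained_rev: Vec<u32> = (1..3).chain(7..9).rev().collect();
    // Repeat forever; `take` is needed to make the result finite.
    let cycled: Vec<u32> = [1, 2].iter().copied().cycle().take(5).collect();
    (taken, stepped, chained_rev, cycled)
}
```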

Other transformations affect the nature of the Item that’s the subject of the Iterator:

map(|item| {...})

Repeatedly applies a closure to transform each item in turn. This is the most general

version; several of the following entries in this list are convenience variants

that could be equivalently implemented as a map.

cloned()

Produces a clone of all of the items in the original iterator; this is particularly

useful with iterators over &Item references. (This obviously requires the underlying

Item type to implement Clone.)

copied()

Produces a copy of all of the items in the original iterator; this is particularly useful

with iterators over &Item references. (This obviously requires the underlying

Item type to implement Copy, but it is likely to be faster than cloned(), if that’s

the case.)

enumerate()

Converts an iterator over items to be an iterator over (usize, Item) pairs, providing

an index to the items in the iterator.

zip(it)

Joins an iterator with a second iterator, to produce a combined iterator that emits

pairs of items, one from each of the original iterators, until the shorter of the two

iterators is finished.
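The item-reshaping methods `cloned`, `enumerate`, `map`, and `zip` can be combined as in this sketch (the function name and data are illustrative only):

```rust
pub fn reshaping() -> (Vec<(usize, String)>, Vec<(u32, char)>) {
    let names = vec!["ann".to_string(), "bob".to_string()];
    // `cloned()` turns `&String` items into owned `String`s; `enumerate()` adds an index.
    let indexed: Vec<(usize, String)> = names.iter().cloned().enumerate().collect();
    // `map` transforms each item; `zip` stops when the shorter iterator ends.
    let zipped: Vec<(u32, char)> = (1..10u32).map(|n| n * n).zip("ab".chars()).collect();
    (indexed, zipped)
}
```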

Yet other transformations perform filtering on the Items being emitted by the

Iterator:

filter(|item| {...})

Applies a bool-returning closure to each item reference to determine whether it

should be passed through.

take_while()

Emits an initial subrange of the iterator, based on a predicate. Mirror image of

skip_while.

skip_while()

Emits a final subrange of the iterator, based on a predicate. Mirror image of

take_while.

The flatten() method deals with an iterator whose items are themselves iterators, flattening the result. On its own, this doesn’t seem that helpful, but it becomes much

more useful when combined with the observation that both Option and Result act as iterators: they produce either zero (for None, Err(e)) or one (for Some(v), Ok(v))

items. This means that flattening a stream of Option/Result values is a simple way

to extract just the valid values, ignoring the rest.
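As a sketch of that idea, the hypothetical helper below parses a list of strings and keeps only the ones that parse successfully, silently discarding the failures:

```rust
pub fn valid_numbers(inputs: &[&str]) -> Vec<i32> {
    inputs
        .iter()
        .map(|s| s.parse::<i32>()) // iterator over Result<i32, ParseIntError>
        .flatten()                 // Ok(v) yields v; Err(_) yields nothing
        .collect()
}
```

Discarding errors this way is only appropriate when invalid entries genuinely don't matter to the caller.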

Taken as a whole, these methods allow iterators to be transformed so that they pro‐

duce exactly the sequence of elements that are needed for most situations.

Iterator Consumers

The previous two sections described how to obtain an iterator and how to transform

it into exactly the right shape for precise iteration. This precisely targeted iteration

could happen as an explicit for-each loop:

let mut even_sum_squares = 0;

for value in values.iter().filter(|x| *x % 2 == 0).take(5) {

even_sum_squares += value * value;

}

However, the large collection of Iterator methods includes many that allow an iteration to be consumed in a single method call, removing the need for an explicit for

loop.

The most general of these methods is for_each(|item| {...}), which runs a closure

for each item produced by the Iterator. This can do most of the things that an

explicit for loop can do (the exceptions are described in a later section), but its gen‐

erality also makes it a little awkward to use—the closure needs to use mutable refer‐

ences to external state in order to emit anything:

let mut even_sum_squares = 0;

values

.iter()

.filter(|x| *x % 2 == 0)

.take(5)

.for_each(|value| {

// closure needs a mutable reference to state elsewhere

even_sum_squares += value * value;

});

However, if the body of the for loop matches one of a number of common patterns,

there are more specific iterator-consuming methods that are clearer, shorter, and

more idiomatic.


These patterns include shortcuts for building a single value out of the collection:

sum(): Sums a collection of numeric values (integers or floats).

product(): Multiplies a collection of numeric values.

min(): Finds the minimum value of a collection, relative to the Item’s Ord implementation.

max(): Finds the maximum value of a collection, relative to the Item’s Ord implementation.

min_by(f): Finds the minimum value of a collection, relative to a user-specified comparison function f.

max_by(f): Finds the maximum value of a collection, relative to a user-specified comparison function f.

reduce(f): Builds an accumulated value of the Item type by running a closure at each step that takes the value accumulated so far and the current item. This is a more general operation that encompasses the previous methods.

fold(f): Builds an accumulated value of an arbitrary type (not just the Iterator::Item type) by running a closure at each step that takes the value accumulated so far and the current item. This is a generalization of reduce.

scan(init, f): Builds an accumulated value of an arbitrary type by running a closure at each step that takes a mutable reference to some internal state and the current item. This is a slightly different generalization of reduce.
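As a minimal sketch of how these value-building methods replace an explicit loop, the following functions rework the running even_sum_squares example. The function names are hypothetical, chosen only for illustration, and the u64 and f64 element types are assumptions.

```rust
/// Sum of the squares of the first five even values, using `sum`.
fn even_sum_squares(values: &[u64]) -> u64 {
    values
        .iter()
        .filter(|x| *x % 2 == 0)
        .take(5)
        .map(|x| x * x)
        .sum()
}

/// The same total using `fold`, which makes the accumulator explicit.
fn even_sum_squares_fold(values: &[u64]) -> u64 {
    values
        .iter()
        .filter(|x| *x % 2 == 0)
        .take(5)
        .fold(0, |acc, x| acc + x * x)
}

/// Largest value using `reduce`; `None` for an empty slice.
fn largest(values: &[u64]) -> Option<u64> {
    values.iter().copied().reduce(|a, b| a.max(b))
}

/// Smallest float using `min_by`; `f64` is not `Ord`, so a comparison
/// function (here `f64::total_cmp`) has to be supplied.
fn smallest_float(values: &[f64]) -> Option<f64> {
    values.iter().copied().min_by(|a, b| a.total_cmp(b))
}

/// Running totals using `scan`, which emits one item per step
/// rather than a single final value.
fn running_totals(values: &[u64]) -> Vec<u64> {
    values
        .iter()
        .scan(0u64, |total, x| {
            *total += x;
            Some(*total)
        })
        .collect()
}
```

Note that reduce and min_by return an Option, because an empty iterator has no item to start from, whereas sum and fold always have an initial value to fall back on.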

There are also methods for selecting a single value out of the collection:

find(p): Finds the first item that satisfies a predicate.

position(p): Also finds the first item satisfying a predicate, but this time it returns the index of the item.


nth(n): Returns the nth element of the iterator, if available.
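The selection methods can be sketched as follows, assuming a slice of i32 values; the helper names are hypothetical and exist only for this illustration.

```rust
/// First value strictly greater than `limit`, if any (`find`).
fn first_above(values: &[i32], limit: i32) -> Option<i32> {
    values.iter().copied().find(|v| *v > limit)
}

/// Index of the first value strictly greater than `limit`, if any (`position`).
fn index_of_first_above(values: &[i32], limit: i32) -> Option<usize> {
    values.iter().position(|v| *v > limit)
}

/// Third even value, if there are that many; `nth` counts from zero.
fn third_even(values: &[i32]) -> Option<i32> {
    values.iter().copied().filter(|v| v % 2 == 0).nth(2)
}
```

All three return an Option, so the "not found" case has to be handled by the caller rather than being signalled by a sentinel value.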

There are methods for testing against every item in the collection:

any(p): Indicates whether a predicate is true for any item in the collection.

all(p): Indicates whether a predicate is true for all items in the collection.

In either case, iteration will terminate early if the relevant counterexample is found.
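The early termination can be observed with a small sketch. The function names here are hypothetical, and the counter exists only to show how many items the any call actually inspected.

```rust
/// Reports whether any value is negative, together with the number
/// of items inspected before `any` returned.
fn has_negative(values: &[i32]) -> (bool, usize) {
    let mut inspected = 0;
    let found = values.iter().any(|v| {
        inspected += 1;
        *v < 0
    });
    (found, inspected)
}

/// Reports whether every value is strictly positive (`all`).
fn all_positive(values: &[i32]) -> bool {
    values.iter().all(|v| *v > 0)
}
```

For the input [1, -2, 3, 4], has_negative stops after the second item. For an empty slice, all_positive returns true, because there is no counterexample to find.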

There are methods that allow for the possibility of failure in the closures used with each item. In each case, if a closure returns a failure for an item, the iteration is terminated and the operation as a whole returns the first failure:

try_for_each(f): Behaves like for_each, but the closure can fail

try_fold(f): Behaves like fold, but the closure can fail

try_find(f): Behaves like find, but the closure can fail
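A minimal sketch of the fallible variants of for_each and fold follows. The function names, the String error type, and the choice of Option for the overflow case are assumptions made for illustration.

```rust
/// Sums the values with overflow checking (`try_fold`); the first
/// overflow stops the iteration and yields `None`.
fn checked_sum(values: &[u8]) -> Option<u8> {
    values.iter().try_fold(0u8, |acc, v| acc.checked_add(*v))
}

/// Checks that every value is non-negative (`try_for_each`),
/// reporting only the first offending value.
fn check_non_negative(values: &[i32]) -> Result<(), String> {
    values.iter().try_for_each(|v| {
        if *v < 0 {
            Err(format!("negative value: {v}"))
        } else {
            Ok(())
        }
    })
}
```

Both Option and Result work as the closure's return type here, so the same methods cover "no answer" and "failed with a reason" styles of error.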

Finally, there are methods that accumulate all of the iterated items into a new collection. The most important of these is collect(), which can be used to build a new instance of any collection type that implements the FromIterator trait.

Because the FromIterator trait is implemented for all of the standard library collection types, you often have to use explicit types; otherwise the compiler can’t figure out whether you’re trying to assemble (say) a Vec<i32> or HashSet<i32>:

use std::collections::HashSet;

// Build collections of even numbers. Type must be specified, because
// the expression is the same for either type.
let myvec: Vec<i32> = (0..10).into_iter().filter(|x| x % 2 == 0).collect();
let h: HashSet<i32> = (0..10).into_iter().filter(|x| x % 2 == 0).collect();

This example also illustrates the use of range expressions to generate the initial data to be iterated over.
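The target type does not have to be given on a let binding. As a sketch under the assumption that either form suits the surrounding code, the type can also be supplied with the "turbofish" syntax on collect() itself, or inferred from a function's return type. The function names below are hypothetical.

```rust
use std::collections::BTreeMap;

/// Counts the even numbers below ten; the collection type is named
/// on the `collect` call rather than on a variable.
fn count_evens() -> usize {
    let evens = (0..10).filter(|x| x % 2 == 0).collect::<Vec<i32>>();
    evens.len()
}

/// Maps each value to its square; the collection type is inferred
/// from the function's return type.
fn squares_by_value() -> BTreeMap<i32, i32> {
    (0..4).map(|x| (x, x * x)).collect()
}
```

An iterator of pairs can be collected into a map type because the standard map types implement FromIterator for (key, value) tuples.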

Other (more obscure) collection-producing methods include the following:

unzip(): Divides an iterator of pairs into two collections


partition(p): Splits an iterator into two collections based on a predicate that is applied to each item
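These two methods can be sketched as follows; the function names and the (name, score) pair shape are hypothetical, picked only to make the example concrete.

```rust
/// Splits (name, score) pairs into two separate vectors with `unzip`.
fn split_pairs(pairs: &[(&str, u32)]) -> (Vec<String>, Vec<u32>) {
    pairs
        .iter()
        .map(|(name, score)| (name.to_string(), *score))
        .unzip()
}

/// Separates values into (even, odd) vectors with `partition`;
/// items for which the predicate is true go into the first collection.
fn evens_and_odds(values: &[i32]) -> (Vec<i32>, Vec<i32>) {
    values.iter().copied().partition(|v| v % 2 == 0)
}
```

As with collect(), the output collection types are inferred here from the functions' return types.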

This Item has touched on a wide selection of Iterator methods, but this is only a subset of the methods available; for more information, consult the Iterator trait documentation in the standard library, which covers the full set of possibilities.

This rich collection of iterator transformations is there to be used. It produces code

that is more idiomatic, more compact, and has clearer intent.

Expressing loops as iterator transformations can also produce code that is more efficient. In the interests of safety, Rust performs bounds checking on access to contiguous containers such as vectors and slices; an attempt to access a value beyond the bounds of the collection triggers a panic rather than an access to invalid data. An old-style loop that accesses container values (e.g., values[i]) might be subject to these runtime checks, whereas an iterator that produces one value after another is already known to be within range.

However, it’s also the case that an old-style loop might not be subject to additional bounds checks compared to the equivalent iterator transformation. The Rust compiler and optimizer are very good at analyzing the code surrounding a slice access to determine whether it’s safe to skip the bounds checks; Sergey “Shnatsel” Davidoff’s article on the subject explores the subtleties involved.
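To make the comparison concrete, here is a sketch of the two styles side by side, with hypothetical function names. Whether the indexed version actually pays for bounds checks depends on what the optimizer can prove, so this shows only the shape of the code, not a performance claim.

```rust
/// Old-style indexed loop: each `values[i]` access is notionally
/// bounds-checked, although the optimizer may be able to remove
/// the check when it can see that `i < values.len()`.
fn sum_indexed(values: &[u64]) -> u64 {
    let mut total = 0;
    for i in 0..values.len() {
        total += values[i];
    }
    total
}

/// Iterator version: there is no index, so there is no per-access
/// bounds check to perform or to optimize away.
fn sum_iter(values: &[u64]) -> u64 {
    values.iter().sum()
}
```

Both functions return the same result for any input; the difference lies in how much the compiler has to prove before it can drop the checks.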

Building Collections from Result Values

The previous section described the use of collect() to build collections from iterators, but collect() also has a particularly helpful feature when dealing with Result values.

Consider an attempt to convert a vector of i64 values into bytes (u8), with the optimistic expectation that they will all fit:
