Effective Rust - David Drysdale


Effective Rust

Rust’s popularity is growing, due in part to features like memory safety, type safety, and thread safety. But these same elements can also make learning Rust a challenge, even for experienced programmers. This practical guide helps you make the transition to writing idiomatic Rust. In the process, you’ll also make full use of Rust’s type system, safety guarantees, and burgeoning ecosystem.

If you’re a software engineer who has experience with an existing compiled language, or if you’ve struggled to convert a basic understanding of Rust syntax into working programs, this book is for you. Effective Rust focuses on the conceptual differences between Rust and other compiled languages, and provides specific recommendations that programmers can easily follow. Author David Drysdale will soon have you writing fluent Rust, rather than badly translated C++.

Effective Rust will help you:

• Understand the structure of Rust’s type system

• Learn Rust idioms for error handling, iteration, and more

• Discover how to work with Rust’s crate ecosystem

• Use Rust’s type system to express your design

• Win fights with the borrow checker

• Build a robust project that takes full advantage of the Rust tooling ecosystem

“Effective Rust is an excellent collection of real-world Rust knowledge beyond the basics. The advice in this book will help you become a confident and well-rounded Rustacean.”

Carol Nichols, coauthor of The Rust Programming Language

“Effective Rust dives deep into most of the recommendations I give people on how to improve their projects. It’s a great resource to level up your Rust.”

Pietro Albini, former member of the Rust Core Team

David Drysdale is a Google staff software engineer who has worked in Rust since 2019, primarily

in security-related areas. He led the Rust rewrite of Android’s hardware cryptography subsystem

and authored the Rust port of the Tink cryptography library. He has also worked in C/C++ and Go and

on projects as diverse as the Linux kernel and video conferencing mobile apps.

RUST PROGRAMMING


US $59.99 CAN $74.99

ISBN: 978-1-098-15140-9


Effective Rust

35 Specific Ways to Improve Your Rust Code

David Drysdale

Beijing Boston Farnham Sebastopol Tokyo

Effective Rust

by David Drysdale

Printed in the United States of America.

Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.

O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles. For more information, contact our corporate/institutional sales department: 800-998-9938 or corporate@oreilly.com.

Acquisitions Editor: Brian Guerin

Indexer: Ellen Troutman-Zaig

Development Editor: Jeff Bleiel

Interior Designer: David Futato

Production Editor: Katherine Tozer

Cover Designer: Karen Montgomery

Copyeditor: Piper Editorial Consulting, LLC

Illustrator: Kate Dullea

Proofreader: Dwight Ramsey

April 2024:

First Edition

Revision History for the First Edition

2024-04-01: First Release

See for release details.

The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. Effective Rust, the cover image, and related trade dress are trademarks of O’Reilly Media, Inc.

The views expressed in this work are those of the author and do not represent the publisher’s views. While

the publisher and the author have used good faith efforts to ensure that the information and instructions

contained in this work are accurate, the publisher and the author disclaim all responsibility for errors or omissions, including without limitation responsibility for damages resulting from the use of or reliance

on this work. Use of the information and instructions contained in this work is at your own risk. If any

code samples or other technology this work contains or describes is subject to open source licenses or the

intellectual property rights of others, it is your responsibility to ensure that your use thereof complies

with such licenses and/or rights.



Preface

The code is more what you’d call guidelines than actual rules.

—Hector Barbossa

In the crowded landscape of modern programming languages, Rust is different. Rust

offers the speed of a compiled language, the efficiency of a non-garbage-collected language, and the type safety of a functional language, as well as a unique solution to memory safety problems.

The strength and consistency of Rust’s type system means that if a Rust program compiles, there is already a decent chance that it will work, a phenomenon previously observed only with more academic, less accessible languages such as Haskell. If a Rust program compiles, it will also work safely.

This safety—both type safety and memory safety—does come with a cost, though.

Despite the quality of the basic documentation, Rust has a reputation for having a

steep on-ramp, where newcomers have to go through the initiation rituals of fighting

the borrow checker, redesigning their data structures, and being befuddled by lifetimes. A Rust program that compiles may have a good chance of working the first

time, but the struggle to get it to compile is real—even with the Rust compiler’s

remarkably helpful error diagnostics.

Who This Book Is For

This book tries to help with these areas where programmers struggle, even if they already have experience with an existing compiled language like C++. As such (and in common with other Effective <Language> books), this book is intended to be the second book that a newcomer to Rust might need, after they have already encountered the basics elsewhere: for example, in The Rust Programming Language (Steve Klabnik and Carol Nichols, No Starch Press) or Programming Rust (Jim Blandy et al., O’Reilly).

However, Rust’s safety leads to a slightly different slant to the Items here, particularly

when compared to Scott Meyers’s original Effective C++ series. The C++ language

was (and is) full of footguns, so Effective C++ focused on a collection of advice for

avoiding those footguns, based on real-world experience creating software in C++.

Significantly, it contained guidelines not rules, because guidelines have exceptions—

providing the detailed rationale for a guideline allows readers to decide for themselves whether their particular scenario warranted breaking the rule.

The general style of giving advice together with the reasons for that advice is preserved here. However, since Rust is remarkably free of footguns, the Items here concentrate more on the concepts that Rust introduces. Many Items have titles like “Understand…” and “Familiarize yourself with…”, and help on the journey toward writing fluent, idiomatic Rust.

Rust’s safety also leads to a complete absence of Items titled “Never…” . If you really

should never do something, the compiler will generally prevent you from doing it.

Rust Version

Rust’s back-compatibility promises mean that any later edition of Rust, including the 2021 edition, will still support code written for an earlier edition, even if that later edition introduces breaking changes. Rust is now also stable enough that the differences between the 2018 and 2021 editions are minor; none of the code in the book needs altering to be 2021-edition compliant (although one Item includes an exception in which a later version of Rust allows new behavior that wasn’t previously possible).

The Items here do not cover any aspects of Rust’s async functionality, as this involves more advanced concepts and less stable toolchain support; there’s already enough ground to cover with synchronous Rust. Perhaps an Effective Async Rust will emerge in the future…

The specific rustc version used for code fragments and error messages is 1.70. The

code fragments are unlikely to need changes for later versions, but the error messages

may vary with your particular compiler version. The error messages included in the

text have also been manually edited to fit within the width constraints of the book but

are otherwise as produced by the compiler.


The text has a number of references to and comparisons with other statically typed

languages, such as Java, Go, and C++, to help readers with experience in those languages orient themselves. (C++ is probably the closest equivalent language, particularly when C++11’s move semantics come into play.)

Navigating This Book

The Items that make up the book are divided into six chapters:

Chapter 1, “Types”

Suggestions that revolve around Rust’s core type system

Chapter 2, “Traits”

Suggestions for working with Rust’s traits

Chapter 3, “Concepts”

Core ideas that form the design of Rust

Chapter 4, “Dependencies”

Advice for working with Rust’s package ecosystem

Chapter 5, “Tooling”

Suggestions for improving your codebase by going beyond just the Rust compiler

Chapter 6, “Beyond Standard Rust”

Suggestions for when you have to work beyond Rust’s standard, safe environment

Although the “Concepts” chapter is arguably more fundamental than the “Types” and

“Traits” chapters, it is deliberately placed later in the book so that readers who are

reading from beginning to end can build up some confidence first.

Conventions Used in This Book

The following typographical conventions are used in this book:

Italic

Indicates new terms, URLs, email addresses, filenames, and file extensions.

Constant width

Used for program listings, as well as within paragraphs to refer to program elements such as variable or function names, databases, data types, environment variables, statements, and keywords.


DOES NOT COMPILE

// Marks code samples that do not compile

UNDESIRED BEHAVIOR

// Marks code samples that exhibit undesired behavior


Acknowledgments

My thanks go to the people who helped make this book possible:

• The technical reviewers who gave expert and detailed feedback on all aspects of

the text: Pietro Albini, Jess Males, Mike Capp, and especially Carol Nichols.

• My editors at O’Reilly: Jeff Bleiel, Brian Guerin, and Katie Tozer.

• Tiziano Santoro, from whom I originally learned many things about Rust.

• Danny Elfanbaum, who provided vital technical assistance for dealing with the

AsciiDoc formatting of the book.

• Diligent readers of the original web version of the book, in particular:

— Julian Rosse, who spotted dozens of typos and other errors in the online text.

— Martin Disch, who pointed out potential improvements and inaccuracies in

several Items.

— Chris Fleetwood, Sergey Kaunov, Clifford Matthews, Remo Senekowitsch,

Kirill Zaborsky, and an anonymous Proton Mail user, who pointed out mistakes in the text.

• My family, who coped with many weekends when I was distracted by writing.


CHAPTER 1

Types

The first chapter of this book covers advice that revolves around Rust’s type system. This type system is more expressive than that of other mainstream languages; it has more in common with “academic” languages such as Haskell.

One core part of this is Rust’s enum type, which is considerably more expressive than the enumeration types in other languages and which allows for algebraic data types.

The Items in this chapter cover the fundamental types that the language provides and

how to combine them into data structures that precisely express the semantics of

your program. This concept of encoding behavior into the type system helps to

reduce the amount of checking and error path code that’s required, because invalid

states are rejected by the toolchain at compile time rather than by the program at runtime.

This chapter also describes some of the ubiquitous data structures that are provided

by Rust’s standard library: Options, Results, Errors and Iterators. Familiarity with

these standard tools will help you write idiomatic Rust that is efficient and compact—

in particular, they allow use of Rust’s question mark operator, which supports error

handling that is unobtrusive but still type-safe.

Note that Items that involve Rust traits are covered in the following chapter, but there

is necessarily a degree of overlap with the Items in this chapter, because traits describe

the behavior of types.

Item 1: Use the type system to express your data structures

who called them programers and not type writers

This Item provides a quick tour of Rust’s type system, starting with the fundamental

types that the compiler makes available, then moving on to the various ways that values can be combined into data structures.

Rust’s enum type then takes a starring role. Although the basic version is equivalent to

what other languages provide, the ability to combine enum variants with data fields

allows for enhanced flexibility and expressivity.

Fundamental Types

The basics of Rust’s type system are pretty familiar to anyone coming from another statically typed programming language (such as C++, Go, or Java). There’s a collection of integer types with specific sizes, both signed (i8, i16, i32, i64, i128) and unsigned (u8, u16, u32, u64, u128).

There are also signed (isize) and unsigned (usize) integers whose sizes match the pointer size on the target system. You won’t be doing much in the way of converting between pointers and integers with Rust, so that size equivalence isn’t really relevant. However, standard collections return their size as a usize (from .len()), so collection indexing means that usize values are quite common. This is fine from a capacity perspective, as there can’t be more items in an in-memory collection than there are memory addresses on the system.

The integral types do give us the first hint that Rust is a stricter world than C++. In

Rust, attempting to put a larger integer type (i32) into a smaller integer type (i16)

generates a compile-time error:

DOES NOT COMPILE

let x: i32 = 42;
let y: i16 = x;

error[E0308]: mismatched types
  --> src/main.rs:18:18
   |
18 |     let y: i16 = x;
   |            ---   ^ expected `i16`, found `i32`
   |            |
   |            expected due to this
   |
help: you can convert an `i32` to an `i16` and panic if the converted value
      doesn't fit
   |
18 |     let y: i16 = x.try_into().unwrap();
   |                   ++++++++++++++++++++

This is reassuring: Rust is not going to sit there quietly while the programmer does

things that are risky. Although we can see that the values involved in this particular

conversion would be just fine, the compiler has to allow for the possibility of values

where the conversion is not fine:

DOES NOT COMPILE

let x: i32 = 66_000;
let y: i16 = x; // What would this value be?

The error output also gives an early indication that while Rust has stronger rules, it also has helpful compiler messages that point the way to how to comply with the rules. The suggested solution raises the question of how to handle situations where the conversion would have to alter the value to fit, and we’ll have more to say on both error handling and using panic! later.

Rust also doesn’t allow some things that might appear “safe,” such as putting a value

from a smaller integer type into a larger integer type:

DOES NOT COMPILE

let x = 42i32; // Integer literal with type suffix
let y: i64 = x;

error[E0308]: mismatched types
  --> src/main.rs:36:18
   |
36 |     let y: i64 = x;
   |            ---   ^ expected `i64`, found `i32`
   |            |
   |            expected due to this
   |
help: you can convert an `i32` to an `i64`
   |
36 |     let y: i64 = x.into();
   |                   +++++++

Here, the suggested solution doesn’t raise the specter of error handling, but the conversion does still need to be explicit. We’ll discuss type conversions in more detail later.
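The three kinds of integer conversion discussed so far can be put side by side. The following is a minimal sketch; the helper names `narrow`, `widen`, and `truncate` are hypothetical and exist only for illustration:

```rust
use std::convert::TryFrom; // Needed before the 2021 edition; harmless afterward.

/// Narrowing conversion that reports failure instead of panicking.
/// (Hypothetical helper name.)
fn narrow(x: i32) -> Option<i16> {
    i16::try_from(x).ok()
}

/// Widening conversion: cannot fail, but still has to be written explicitly.
fn widen(x: i32) -> i64 {
    i64::from(x)
}

/// Explicit `as` cast: silently keeps only the low 16 bits of the value.
fn truncate(x: i32) -> i16 {
    x as i16
}
```

With these definitions, `narrow(66_000)` yields `None`, whereas `truncate(66_000)` quietly produces 464. That difference is a reasonable argument for preferring `try_from`/`try_into` whenever the value might not fit.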


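Continuing with the unsurprising primitive types, Rust has a bool type, floating point types (f32, f64), and a unit type () (like C’s void).

More interesting is the char character type, which holds a Unicode value (similar to Go’s rune type). Although this is stored as four bytes internally, there are again no silent conversions to or from a 32-bit integer.

This precision in the type system forces you to be explicit about what you’re trying to express: a u32 value is different from a char, which in turn is different from a sequence of UTF-8 bytes, which in turn is different from a sequence of arbitrary bytes, and it’s up to you to specify exactly which you mean. Joel Spolsky’s well-known blog post on Unicode and character sets can help you understand which you need.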

Of course, there are helper methods that allow you to convert between these different types, but their signatures force you to handle (or explicitly ignore) the possibility of failure. For example, a Unicode code point can always be represented in 32 bits, so 'a' as u32 is allowed, but the other direction is trickier (as there are some u32 values that are not valid Unicode code points):

char::from_u32

Returns an Option<char>, forcing the caller to handle the failure case.

char::from_u32_unchecked

Makes the assumption of validity but has the potential to result in undefined behavior if that assumption turns out not to be true. The function is marked unsafe as a result, forcing the caller to use unsafe too.
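As a small illustration of the safe direction, the sketch below wraps `char::from_u32` in a hypothetical helper, `to_char_lossy`, that substitutes the Unicode replacement character for invalid input. The helper name and the fallback policy are assumptions made for this example, not something the text prescribes:

```rust
/// Convert a raw 32-bit value to a `char`, falling back to U+FFFD
/// (the Unicode replacement character) when the value is not valid.
/// (Hypothetical helper; the fallback policy is just one possible choice.)
fn to_char_lossy(value: u32) -> char {
    char::from_u32(value).unwrap_or(char::REPLACEMENT_CHARACTER)
}
```

The caller never needs `unsafe` here: the `Option<char>` returned by `char::from_u32` makes the failure case visible, and `unwrap_or` states explicitly what to do about it.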

Aggregate Types

Moving on to aggregate types, Rust has a variety of ways to combine related values.

Most of these are familiar equivalents to the aggregation mechanisms available in

other languages:

Arrays

Hold multiple instances of a single type, where the number of instances is known at compile time. For example, [u32; 4] is four 4-byte integers in a row.

Tuples

Hold instances of multiple heterogeneous types, where the number of elements and their types are known at compile time, for example, (WidgetOffset, WidgetSize, WidgetColor). If the types in the tuple aren’t distinctive (for example, (i32, i32, &'static str, bool)), it’s better to give each element a name and use a struct.

Structs

Also hold instances of heterogeneous types known at compile time but allow both the overall type and the individual fields to be referred to by name.

1 The situation gets muddier still if the filesystem is involved, since filenames on popular platforms are somewhere in between arbitrary bytes and UTF-8 sequences: see the std::ffi::OsString documentation.

2 Technically, a Unicode scalar value rather than a code point.

Rust also includes the tuple struct, which is a crossbreed of a struct and a tuple:

there’s a name for the overall type but no names for the individual fields—they are

referred to by number instead: s.0, s.1, and so on:

/// Struct with two unnamed fields.

struct TextMatch(usize, String);

// Construct by providing the contents in order.

let m = TextMatch(12, "needle".to_owned());

// Access by field number.

assert_eq!(m.0, 12);
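To tie the aggregate types together, here is a brief sketch showing an array, an indistinct tuple, and the named struct that could replace it. The `Widget` type and its field names are hypothetical, chosen only to show how names improve on positional access:

```rust
/// Hypothetical struct that gives names to the elements of an
/// indistinct `(i32, i32, &'static str, bool)` tuple.
#[derive(Debug, PartialEq)]
pub struct Widget {
    pub x: i32,
    pub y: i32,
    pub label: &'static str,
    pub visible: bool,
}

/// Size in bytes of an array of four 4-byte integers.
pub fn array_bytes() -> usize {
    std::mem::size_of::<[u32; 4]>()
}

/// Convert the positional tuple form into the named struct form.
pub fn from_tuple(t: (i32, i32, &'static str, bool)) -> Widget {
    let (x, y, label, visible) = t;
    Widget { x, y, label, visible }
}
```

Reading `w.visible` at a call site says what the value means, whereas `t.3` forces the reader back to the definition.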

enums

This brings us to the jewel in the crown of Rust’s type system, the enum. With the basic

form of an enum, it’s hard to see what there is to get excited about. As with other languages, the enum allows you to specify a set of mutually exclusive values, possibly with

a numeric value attached:

enum HttpResultCode {

Ok = 200,

NotFound = 404,

Teapot = 418,

}

let code = HttpResultCode::NotFound;

assert_eq!(code as i32, 404);

Because each enum definition creates a distinct type, this can be used to improve readability and maintainability of functions that take bool arguments. Instead of:

print_page( /* both_sides= */ true, /* color= */ false);

a version that uses a pair of enums:

pub enum Sides {
    Both,
    Single,
}

pub enum Output {
    BlackAndWhite,
    Color,
}

pub fn print_page(sides: Sides, color: Output) {
    // ...
}

is more type-safe and easier to read at the point of invocation:

print_page(Sides::Both, Output::BlackAndWhite);

Unlike the bool version, if a library user were to accidentally flip the order of the

arguments, the compiler would immediately complain:

error[E0308]: arguments to this function are incorrect
   --> src/main.rs:104:9
    |
104 |         print_page(Output::BlackAndWhite, Sides::Single);
    |         ^^^^^^^^^^ ---------------------  ------------- expected `enums::Output`,
    |                    |                                    found `enums::Sides`
    |                    |
    |                    expected `enums::Sides`, found `enums::Output`
    |
note: function defined here
   --> src/main.rs:145:12
    |
145 |     pub fn print_page(sides: Sides, color: Output) {
    |            ^^^^^^^^^^ ------------  -------------
help: swap these arguments
    |
104 |         print_page(Sides::Single, Output::BlackAndWhite);
    |                   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Using the newtype pattern to wrap a bool also achieves type safety and maintainability; it’s generally best to use the newtype pattern if the semantics will always be Boolean, and to use an enum if there’s a chance that a new alternative (e.g., Sides::BothAlternateOrientation) could arise in the future.
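For comparison, a newtype-based version of the same function might look like the following sketch. The wrapper names `DoubleSided` and `ColorOutput` are hypothetical, and the function returns a `String` purely so that the example has something observable to check:

```rust
/// Hypothetical newtype wrappers around `bool`.
pub struct DoubleSided(pub bool);
pub struct ColorOutput(pub bool);

/// Variant of `print_page` that takes newtypes rather than bare `bool`s.
/// (Returns a description instead of printing, for illustration only.)
pub fn print_page(sides: DoubleSided, color: ColorOutput) -> String {
    format!("double_sided={}, color={}", sides.0, color.0)
}
```

A call such as `print_page(DoubleSided(true), ColorOutput(false))` reads clearly at the point of invocation, and swapping the two arguments is a compile-time type error just as it is with the enum version.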

The type safety of Rust’s enums continues with the match expression:

DOES NOT COMPILE

let msg = match code {
    HttpResultCode::Ok => "Ok",
    HttpResultCode::NotFound => "Not found",
    // forgot to deal with the all-important "I'm a teapot" code
};

error[E0004]: non-exhaustive patterns: `HttpResultCode::Teapot` not covered
  --> src/main.rs:44:21
   |
44 |     let msg = match code {
   |                     ^^^^ pattern `HttpResultCode::Teapot` not covered
   |
note: `HttpResultCode` defined here
  --> src/main.rs:10:5
   |
7  | enum HttpResultCode {
   |      --------------
...
10 |     Teapot = 418,
   |     ^^^^^^ not covered
   = note: the matched value is of type `HttpResultCode`
help: ensure that all possible cases are being handled by adding a match arm
      with a wildcard pattern or an explicit pattern as shown
   |
46 ~         HttpResultCode::NotFound => "Not found",
47 ~         HttpResultCode::Teapot => todo!(),
   |

The compiler forces the programmer to consider all of the possibilities that are represented by the enum, even if the result is just to add a default arm _ => {}. (Note that modern C++ compilers can and do warn about missing switch arms for enums as well.)
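For reference, a version of the match that does compile simply covers every variant. This sketch repeats the enum definition so that it stands alone; the function name `describe` and the text used for the teapot arm are assumptions:

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum HttpResultCode {
    Ok = 200,
    NotFound = 404,
    Teapot = 418,
}

/// Every variant has an arm, so the compiler accepts the `match`.
/// (Hypothetical helper name and message text.)
fn describe(code: HttpResultCode) -> &'static str {
    match code {
        HttpResultCode::Ok => "Ok",
        HttpResultCode::NotFound => "Not found",
        HttpResultCode::Teapot => "I'm a teapot",
    }
}
```

Listing the variants explicitly, rather than using a wildcard arm, has the side benefit that adding a new variant later produces a compile error at every `match` that needs updating.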

enums with Fields

The true power of Rust’s enum feature comes from the fact that each variant can have data that comes along with it, making it an aggregate type that acts as an algebraic data type (ADT). In C/C++ terms, it’s like a combination of an enum with a union, only type-safe.

This means that the invariants of the program’s data structures can be encoded into

Rust’s type system; states that don’t comply with those invariants won’t even compile.

A well-designed enum makes the creator’s intent clear to humans as well as to the

compiler:

use std::collections::{HashMap, HashSet};

pub enum SchedulerState {

Inert,

Pending(HashSet<Job>),

Running(HashMap<CpuId, Vec<Job>>),

}

3 The need to consider all possibilities also means that adding a new variant to an existing enum in a library is a breaking change: library clients will need to change their code to cope with the new variant. If an enum is really just a C-like list of related numerical values, this behavior can be avoided by marking it as a non_exhaustive enum.

Just from the type definition, it’s reasonable to guess that Jobs get queued up in the

Pending state until the scheduler is fully active, at which point they’re assigned to

some per-CPU pool.
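To make this concrete, the sketch below adds a method that counts jobs in each state. `Job` and `CpuId` are not defined in the text, so the definitions here are hypothetical stand-ins, as is the `job_count` method:

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical stand-ins for types the text leaves undefined.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct Job(pub u32);

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct CpuId(pub u8);

pub enum SchedulerState {
    Inert,
    Pending(HashSet<Job>),
    Running(HashMap<CpuId, Vec<Job>>),
}

impl SchedulerState {
    /// Total number of jobs known to the scheduler in its current state.
    pub fn job_count(&self) -> usize {
        match self {
            SchedulerState::Inert => 0,
            SchedulerState::Pending(jobs) => jobs.len(),
            SchedulerState::Running(pools) => pools.values().map(Vec::len).sum(),
        }
    }
}
```

Each arm only has access to the data that exists in that state, so there is no way to consult a per-CPU pool while the scheduler is still `Pending`.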

This highlights the central theme of this Item, which is to use Rust’s type system to

express the concepts that are associated with the design of your software.

A dead giveaway for when this is not happening is a comment that explains when

some field or parameter is valid:

UNDESIRED BEHAVIOR

pub struct DisplayProps {
    pub x: u32,
    pub y: u32,
    pub monochrome: bool,
    // `fg_color` must be (0, 0, 0) if `monochrome` is true.
    pub fg_color: RgbColor,
}

This is a prime candidate for replacement with an enum holding data:

pub enum Color {

Monochrome,

Foreground(RgbColor),

}

pub struct DisplayProps {

pub x: u32,

pub y: u32,

pub color: Color,

}

This small example illustrates a key piece of advice: make invalid states inexpressible in

your types. Types that support only valid combinations of values mean that whole

classes of errors are rejected by the compiler, leading to smaller and safer code.
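A short sketch shows how the enum-based design is consumed. `RgbColor` is not defined in the text, so the tuple struct below is a hypothetical stand-in, and the `fg_color` accessor is an assumption added for illustration:

```rust
/// Hypothetical stand-in for the `RgbColor` type used in the text.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct RgbColor(pub u8, pub u8, pub u8);

pub enum Color {
    Monochrome,
    Foreground(RgbColor),
}

pub struct DisplayProps {
    pub x: u32,
    pub y: u32,
    pub color: Color,
}

impl DisplayProps {
    /// Effective foreground color; black when the display is monochrome.
    pub fn fg_color(&self) -> RgbColor {
        match self.color {
            Color::Monochrome => RgbColor(0, 0, 0),
            Color::Foreground(rgb) => rgb,
        }
    }
}
```

The rule that used to live in a comment is now enforced by construction: there is no way to build a monochrome `DisplayProps` that also carries a non-black foreground.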

Ubiquitous enum Types

Returning to the power of the enum, there are two concepts that are so common that

Rust’s standard library includes built-in enum types to express them; these types are

ubiquitous in Rust code.


Option<T>

The first concept is that of an Option: either there’s a value of a particular type (Some(T)) or there isn’t (None). Always use Option for values that can be absent; never

fall back to using sentinel values (-1, nullptr, …) to try to express the same concept

in-band.

There is one subtle point to consider, though. If you’re dealing with a collection of

things, you need to decide whether having zero things in the collection is the same as

not having a collection. For most situations, the distinction doesn’t arise and you can

go ahead and use (say) Vec<Thing>: a count of zero things implies an absence of

things.

However, there are definitely other rare scenarios where the two cases need to be distinguished with Option<Vec<Thing>>. For example, a cryptographic system might need to distinguish between a payload that is absent altogether and “empty payload provided.” (This is related to the debates around the NULL marker for columns in SQL.)

Similarly, what’s the best choice for a String that might be absent? Does "" or None

make more sense to indicate the absence of a value? Either way works, but

Option<String> clearly communicates the possibility that this value may be absent.
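The following sketch illustrates both points: returning `Option` rather than a sentinel, and using `Option<Vec<u8>>` when an absent value differs from an empty one. The `find_byte` and `describe` functions and the `Message` type are hypothetical names invented for this example:

```rust
/// Position of the first occurrence of `needle`, or `None` if absent.
/// (No `-1` sentinel is needed.)
pub fn find_byte(haystack: &[u8], needle: u8) -> Option<usize> {
    haystack.iter().position(|&b| b == needle)
}

/// Hypothetical message type where "no payload" differs from "empty payload".
pub struct Message {
    pub payload: Option<Vec<u8>>,
}

pub fn describe(msg: &Message) -> &'static str {
    match &msg.payload {
        None => "no payload",
        Some(p) if p.is_empty() => "empty payload provided",
        Some(_) => "payload provided",
    }
}
```

If the first two cases never need to be told apart, a plain `Vec<u8>` would be the simpler choice.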

Result<T, E>

The second common concept arises from error processing: if a function fails, how

should that failure be reported? Historically, special sentinel values (e.g., -errno

return values from Linux system calls) or global variables (errno for POSIX systems)

were used. More recently, languages that support multiple or tuple return values from functions (such as Go) may have a convention of returning a (result, error)

pair, assuming the existence of some suitable “zero” value for the result when the

error is non-“zero.”

In Rust, there’s an enum for just this purpose: always encode the result of an operation that might fail as a Result<T, E>. The T type holds the successful result (in the Ok variant), and the E type holds error details (in the Err variant) on failure.

Using the standard type makes the intent of the design clear. It also allows the use of standard transformations and error processing, which in turn makes it possible to streamline error processing with the ? operator as well.
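As a minimal sketch of how `Result` and the `?` operator fit together, consider parsing a pair of port numbers. The function names `parse_port` and `parse_range` are hypothetical; the standard library pieces used (`str::parse` and `std::num::ParseIntError`) are real:

```rust
use std::num::ParseIntError;

/// Parse a single port number, reporting failure through `Result`.
pub fn parse_port(text: &str) -> Result<u16, ParseIntError> {
    text.trim().parse::<u16>()
}

/// Parse two ports; `?` returns early with the error if either parse fails.
pub fn parse_range(lo: &str, hi: &str) -> Result<(u16, u16), ParseIntError> {
    let lo = parse_port(lo)?;
    let hi = parse_port(hi)?;
    Ok((lo, hi))
}
```

Neither function needs a "zero" value to return on failure: the `Err` variant carries the error details, and the `Ok` variant exists only when there is a real result.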


Item 2: Use the type system to express common behavior

Item 1 discussed how to express data structures in the type system; this Item moves

on to discuss the encoding of behavior in Rust’s type system.

The mechanisms described in this Item will generally feel familiar, as they all have

direct analogs in other languages:

Functions

The universal mechanism for associating a chunk of code with a name and a

parameter list.

Methods

Functions that are associated with an instance of a particular data structure.

Methods are common in programming languages created after object-orientation

arose as a programming paradigm.

Function pointers

Supported by most languages in the C family, including C++ and Go, as a mechanism that allows an extra level of indirection when invoking other code.

Closures

Originally most common in the Lisp family of languages but have been retrofitted to many popular programming languages, including C++ (since C++11) and

Java (since Java 8).

Traits

Describe collections of related functionality that all apply to the same underlying

item. Traits have rough equivalents in many other languages, including abstract

classes in C++ and interfaces in Go and Java.

Of course, all of these mechanisms have Rust-specific details that this Item will cover.

Of the preceding list, traits have the most significance for this book, as they describe

so much of the behavior provided by the Rust compiler and standard library.

The next chapter focuses on Items that give advice on designing and implementing traits, but

their pervasiveness means that they crop up frequently in the other Items in this

chapter too.

Functions and Methods

As with every other programming language, Rust uses functions to organize code into

named chunks for reuse, with inputs to the code expressed as parameters. As with

every other statically typed language, the types of the parameters and the return value

are explicitly specified:


/// Return `x` divided by `y`.

fn div(x: f64, y: f64) -> f64 {

if y == 0.0 {

// Terminate the function and return a value.

return f64::NAN;

}

// The last expression in the function body is implicitly returned.

x / y

}

/// Function called just for its side effects, with no return value.

/// Can also write the return value as `-> ()`.

fn show(x: f64) {

println!("x = {x}");

}

If a function is intimately involved with a particular data structure, it is expressed as a

method. A method acts on an item of that type, identified by self, and is included

within an impl DataStructure block. This encapsulates related data and code

together in an object-oriented way that’s similar to other languages; however, in Rust,

methods can be added to enum types as well as to struct types, in keeping with the

pervasive nature of Rust’s enum:

enum Shape {
    Rectangle { width: f64, height: f64 },
    Circle { radius: f64 },
}

impl Shape {
    pub fn area(&self) -> f64 {
        match self {
            Shape::Rectangle { width, height } => width * height,
            Shape::Circle { radius } => std::f64::consts::PI * radius * radius,
        }
    }
}

The name of a method creates a label for the behavior it encodes, and the method

signature gives type information for its inputs and outputs. The first input for a

method will be some variant of self, indicating what the method might do to the

data structure:

• A &self parameter indicates that the contents of the data structure may be read

from but will not be modified.

• A &mut self parameter indicates that the method might modify the contents of

the data structure.

• A self parameter indicates that the method consumes the data structure.
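The three receiver forms can be seen together in a small sketch. The `Counter` type and its method names are hypothetical, chosen only to show one method of each kind:

```rust
/// Hypothetical type used to demonstrate the three kinds of `self` parameter.
#[derive(Debug, Default)]
pub struct Counter {
    count: u32,
}

impl Counter {
    /// `&self`: reads the contents without modifying them.
    pub fn get(&self) -> u32 {
        self.count
    }

    /// `&mut self`: may modify the contents.
    pub fn increment(&mut self) {
        self.count += 1;
    }

    /// `self`: consumes the `Counter`; it cannot be used afterward.
    pub fn into_total(self) -> u32 {
        self.count
    }
}
```

After a call to `into_total`, any further use of the same `Counter` value is rejected by the compiler, because ownership has moved into the method.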


Function Pointers

The previous section described how to associate a name (and a parameter list) with

some code. However, invoking a function always results in the same code being executed; all that changes from invocation to invocation is the data that the function

operates on. That covers a lot of possible scenarios, but what if the code needs to vary

at runtime?

The simplest behavioral abstraction that allows this is the function pointer: a pointer to (just) some code, with a type that reflects the signature of the function:

fn sum(x: i32, y: i32) -> i32 {

x + y

}

// Explicit coercion to `fn` type is required...

let op: fn(i32, i32) -> i32 = sum;

The type is checked at compile time, so by the time the program runs, the value is just

the size of a pointer. Function pointers have no other data associated with them, so

they can be treated as values in various ways:

// `fn` types implement `Copy`

let op1 = op;

let op2 = op;

// `fn` types implement `Eq`
assert!(op1 == op2);
// `fn` implements `std::fmt::Pointer`, used by the {:p} format specifier.

println!("op = {:p}", op);

// Example output: "op = 0x101e9aeb0"

One technical detail to watch out for: explicit coercion to a fn type is needed, because

just using the name of a function doesn’t give you something of fn type:

D O E S N O T C O M P I L E

let op1 = sum;

let op2 = sum;

// Both op1 and op2 are of a type that cannot be named in user code,

// and this internal type does not implement `Eq`.

assert!(op1 == op2);

error[E0369]: binary operation `==` cannot be applied to type `fn(i32, i32) -> i32 {main::sum}`

--> src/main.rs:102:17

102 | assert!(op1 == op2);

| --- ^^ --- fn(i32, i32) -> i32 {main::sum}

| |

| fn(i32, i32) -> i32 {main::sum}

help: use parentheses to call these

102 | assert!(op1(/* i32 */, /* i32 */) == op2(/* i32 */, /* i32 */));

| ++++++++++++++++++++++ ++++++++++++++++++++++

Instead, the compiler error indicates that the type is something like fn(i32, i32) ->

i32 {main::sum}, a type that’s entirely internal to the compiler (i.e., could not be

written in user code) and that identifies the specific function as well as its signature.

To put it another way, the type of sum encodes both the function’s signature and its location. This type can be automatically coerced to a fn type.
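To illustrate code that varies at runtime, here is a small sketch in which a function pointer is selected from a string. The product function and the pick_op helper are hypothetical names introduced for this example only:

```rust
fn sum(x: i32, y: i32) -> i32 {
    x + y
}

fn product(x: i32, y: i32) -> i32 {
    x * y
}

/// Hypothetical helper: choose an operation based on a runtime value.
/// Each function item is coerced to the common `fn(i32, i32) -> i32` type.
fn pick_op(name: &str) -> Option<fn(i32, i32) -> i32> {
    match name {
        "sum" => Some(sum),
        "product" => Some(product),
        _ => None,
    }
}

fn main() {
    let op = pick_op("product").expect("known operation");
    assert_eq!(op(3, 4), 12);
    assert!(pick_op("divide").is_none());
}
```

Because both functions share a signature, the caller can hold either one in the same variable and invoke it without knowing which was chosen.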

Closures

Bare function pointers are limiting, because the only inputs available to the

invoked function are those that are explicitly passed as parameter values. For exam‐

ple, consider some code that modifies every element of a slice using a function

pointer:

// In real code, an `Iterator` method would be more appropriate.
pub fn modify_all(data: &mut [u32], mutator: fn(u32) -> u32) {
    for value in data {
        *value = mutator(*value);
    }
}

This works for a simple mutation of the slice:

fn add2(v: u32) -> u32 {

v + 2

}

let mut data = vec![1, 2, 3];

modify_all(&mut data, add2);

assert_eq!(data, vec![3, 4, 5]);

However, if the modification relies on any additional state, it’s not possible to implic‐

itly pass that into the function pointer:

D O E S N O T C O M P I L E

let amount_to_add = 3;

fn add_n(v: u32) -> u32 {

v + amount_to_add

}

let mut data = vec![1, 2, 3];

modify_all(&mut data, add_n);

assert_eq!(data, vec![4, 5, 6]);

error[E0434]: can't capture dynamic environment in a fn item

--> src/main.rs:125:13

125 | v + amount_to_add

| ^^^^^^^^^^^^^

= help: use the `|| { ... }` closure form instead

The error message points to the right tool for the job: a closure. A closure is a chunk

of code that looks like the body of a function definition (a lambda expression), except

for the following:

• It can be built as part of an expression, and so it need not have a name associated

with it.

• The input parameters are given in vertical bars |param1, param2| (their associ‐

ated types can usually be automatically deduced by the compiler).

• It can capture parts of the environment around it:

let amount_to_add = 3;

let add_n = |y| {

// a closure capturing `amount_to_add`

y + amount_to_add

};

let z = add_n(5);

assert_eq!(z, 8);

To (roughly) understand how the capture works, imagine that the compiler creates a

one-off, internal type that holds all of the parts of the environment that get men‐

tioned in the lambda expression. When the closure is created, an instance of this

ephemeral type is created to hold the relevant values, and when the closure is

invoked, that instance is used as additional context:

let amount_to_add = 3;

// *Rough* equivalent to a capturing closure.
struct InternalContext<'a> {
    // references to captured variables
    amount_to_add: &'a u32,
}

impl<'a> InternalContext<'a> {
    fn internal_op(&self, y: u32) -> u32 {
        // body of the lambda expression
        y + *self.amount_to_add
    }
}

let add_n = InternalContext {
    amount_to_add: &amount_to_add,
};
let z = add_n.internal_op(5);
assert_eq!(z, 8);

The values that are held in this notional context are often references,

but they can also be mutable references to things in the environment, or values that

are moved out of the environment altogether (by using the move keyword before the

input parameters).
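The three capture modes can be sketched as follows. The helper function names here are hypothetical and exist only to make each case separately checkable:

```rust
/// Capture by shared reference: the closure only reads `amount`,
/// which remains usable after the closure has been called.
fn add_with_shared_capture(y: u32) -> u32 {
    let amount = 3;
    let add = |v: u32| v + amount;
    let result = add(y);
    assert_eq!(amount, 3);
    result
}

/// Capture by mutable reference: the closure updates `total` in place.
fn sum_with_mut_capture(values: &[u32]) -> u32 {
    let mut total = 0;
    let mut accumulate = |v: u32| total += v;
    for v in values {
        accumulate(*v);
    }
    total
}

/// Capture by move: `name` is moved into the returned closure, which
/// therefore owns it and can outlive this function call.
fn make_greeter(name: String) -> impl Fn(&str) -> String {
    move |greeting| format!("{greeting}, {name}")
}

fn main() {
    assert_eq!(add_with_shared_capture(5), 8);
    assert_eq!(sum_with_mut_capture(&[1, 2, 3]), 6);
    let greet = make_greeter(String::from("Rust"));
    assert_eq!(greet("Hello"), "Hello, Rust");
}
```

Without the move keyword, the last closure would hold a reference to a local variable and could not be returned from the function.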

Returning to the modify_all example, a closure can’t be used where a function

pointer is expected:

error[E0308]: mismatched types

--> src/main.rs:199:31

199 | modify_all(&mut data, |y| y + amount_to_add);

| ---------- ^^^^^^^^^^^^^^^^^^^^^ expected fn pointer,

| | found closure

| |

| arguments to this function are incorrect

= note: expected fn pointer `fn(u32) -> u32`

found closure `[closure@src/main.rs:199:31: 199:34]`

note: closures can only be coerced to `fn` types if they do not capture any

variables

--> src/main.rs:199:39

199 | modify_all(&mut data, |y| y + amount_to_add);

| ^^^^^^^^^^^^^ `amount_to_add`

| captured here

note: function defined here

--> src/main.rs:60:12

60 | pub fn modify_all(data: &mut [u32], mutator: fn(u32) -> u32) {

| ^^^^^^^^^^ -----------------------

Instead, the code that receives the closure has to accept an instance of one of the Fn*

traits:

pub fn modify_all<F>(data: &mut [u32], mut mutator: F)
where
    F: FnMut(u32) -> u32,
{
    for value in data {
        *value = mutator(*value);
    }
}

Rust has three different Fn* traits, which between them express some distinctions

around this environment-capturing behavior:

FnOnce

Describes a closure that can be called only once. If some part of the environment is moved into the closure’s context, and the closure’s body subsequently moves it out of the closure’s context, then those moves can happen only once (there’s no other copy of the source item to move from), and so the closure can be invoked only once.

FnMut

Describes a closure that can be called repeatedly and that can make changes to its environment because it mutably borrows from the environment.

Fn

Describes a closure that can be called repeatedly and that only borrows values from the environment immutably.

The compiler automatically implements the appropriate subset of these Fn* traits for any lambda expression in the code; it’s not possible to manually implement any of these traits (unlike C++’s operator() overload).

Returning to the preceding rough mental model of closures, which of the traits the

compiler auto-implements roughly corresponds to whether the captured environ‐

mental context has these elements:

FnOnce

Any moved values

FnMut

Any mutable references to values (&mut T)

Fn

Only normal references to values (&T)

The latter two traits in this list each have a trait bound of the preceding trait, which

makes sense when you consider the things that use the closures:

• If something expects to call a closure only once (indicated by receiving a FnOnce),

it’s OK to pass it a closure that’s capable of being repeatedly called (FnMut).

• If something expects to repeatedly call a closure that might mutate its environ‐

ment (indicated by receiving a FnMut), it’s OK to pass it a closure that doesn’t need

to mutate its environment (Fn).

The bare function pointer type fn also notionally belongs at the end of this list; any

(not-unsafe) fn type automatically implements all of the Fn* traits, because it bor‐

rows nothing from the environment.

4 At least not in stable Rust at the time of writing. Experimental features may change this in the future.

As a result, when writing code that accepts closures, use the most general Fn* trait that works, to allow the greatest flexibility for callers—for example, accept FnOnce for closures that are used only once. The same reasoning also leads to advice to prefer Fn*

trait bounds over bare function pointers (fn).
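The following sketch shows how this flexibility plays out for callers. It reuses the generic modify_all and add2 from earlier; the call_once helper is a hypothetical addition for illustration:

```rust
/// Hypothetical helper that calls its argument exactly once, so it asks
/// only for `FnOnce`, the most general of the three traits.
fn call_once<F: FnOnce() -> String>(f: F) -> String {
    f()
}

/// Generic version of `modify_all`, as shown in the text.
pub fn modify_all<F>(data: &mut [u32], mut mutator: F)
where
    F: FnMut(u32) -> u32,
{
    for value in data {
        *value = mutator(*value);
    }
}

fn add2(v: u32) -> u32 {
    v + 2
}

fn main() {
    let mut data = vec![1, 2, 3];

    // A plain function implements all of the `Fn*` traits.
    modify_all(&mut data, add2);
    assert_eq!(data, vec![3, 4, 5]);

    // A closure that only reads its environment (`Fn`) is also accepted.
    let amount_to_add = 3;
    modify_all(&mut data, |y| y + amount_to_add);
    assert_eq!(data, vec![6, 7, 8]);

    // A closure that mutates its environment (`FnMut`).
    let mut calls = 0;
    modify_all(&mut data, |y| {
        calls += 1;
        y
    });
    assert_eq!(calls, 3);

    // A closure that moves a captured value out of itself is `FnOnce` only.
    let s = String::from("moved");
    assert_eq!(call_once(move || s), "moved");
}
```

Had modify_all demanded Fn rather than FnMut, the call-counting closure would have been rejected even though the function body never needed that restriction.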

Traits

The Fn* traits are more flexible than bare function pointers, but they can still describe

only the behavior of a single function, and even then only in terms of the function’s

signature.

However, they are themselves examples of another mechanism for describing behav‐

ior in Rust’s type system, the trait. A trait defines a set of related functions that some

underlying item makes publicly available; moreover, the functions are typically (but

don’t have to be) methods, taking some variant of self as their first argument.

Each function in a trait also has a name, providing a label that allows the compiler to

disambiguate functions with the same signature, and more importantly, that allows

programmers to deduce the intent of the function.

A Rust trait is roughly analogous to an “interface” in Go and Java, or to an “abstract

class” (all virtual methods, no data members) in C++. Implementations of the trait

must provide all the functions (but note that the trait definition can include a default implementation) and can also have associated data that those implementations make use of. This means that code and data get encapsulated together in a common abstraction, in a somewhat object-oriented (OO) manner.

Code that accepts a struct and calls functions on it is constrained to only ever work

with that specific type. If there are multiple types that implement common behavior,

then it is more flexible to define a trait that encapsulates that common behavior, and

have the code make use of the trait’s functions rather than functions involving a spe‐

cific struct.

This leads to the same kind of advice that turns up for other OO-influenced languages: prefer accepting trait types over concrete types if future flexibility is anticipated.
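As a sketch of what that flexibility buys, the following defines a hypothetical Sink trait with two implementations; the report function works with either because it depends only on the trait. All names here are assumptions made for the example:

```rust
/// Hypothetical trait: somewhere that lines of text can be sent.
pub trait Sink {
    fn write_line(&mut self, line: &str);
}

/// One implementation keeps every line in memory.
impl Sink for Vec<String> {
    fn write_line(&mut self, line: &str) {
        self.push(line.to_string());
    }
}

/// Another implementation only counts the lines it receives.
#[derive(Default)]
pub struct LineCounter {
    pub lines: usize,
}

impl Sink for LineCounter {
    fn write_line(&mut self, _line: &str) {
        self.lines += 1;
    }
}

/// Accepts any `Sink`, rather than one specific struct.
pub fn report<S: Sink>(sink: &mut S, values: &[u32]) {
    for v in values {
        sink.write_line(&format!("value = {v}"));
    }
}

fn main() {
    let mut lines: Vec<String> = Vec::new();
    report(&mut lines, &[1, 2]);
    assert_eq!(lines, vec!["value = 1", "value = 2"]);

    let mut counter = LineCounter::default();
    report(&mut counter, &[1, 2, 3]);
    assert_eq!(counter.lines, 3);
}
```

A version of report that took &mut Vec<String> directly would have to be rewritten, or duplicated, to support the counting case.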

Sometimes, there is some behavior that you want to distinguish in the type system,

but it cannot be expressed as some specific function signature in a trait definition. For

example, consider a Sort trait for sorting collections; an implementation might be

stable (elements that compare the same will appear in the same order before and after

the sort), but there’s no way to express this in the sort method arguments.

5 For example, Joshua Bloch’s Effective Java (Addison-Wesley) includes Item 64: Refer to objects by their interfaces.

In this case, it’s still worth using the type system to track this requirement, using a

marker trait:

pub trait Sort {

/// Rearrange contents into sorted order.

fn sort(&mut self);

}

/// Marker trait to indicate that a [`Sort`] sorts stably.

pub trait StableSort: Sort {}

A marker trait has no functions, but an implementation still has to declare that it is

implementing the trait—which acts as a promise from the implementer: “I solemnly

swear that my implementation sorts stably.” Code that relies on a stable sort can then

specify the StableSort trait bound, relying on the honor system to preserve its invar‐

iants. Use marker traits to distinguish behaviors that cannot be expressed in the trait

function signatures.
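A sketch of how the marker trait might be used in practice follows. The Tasks type and the sort_preserving_ties function are hypothetical; the Sort and StableSort traits are repeated from above so that the example is self-contained:

```rust
pub trait Sort {
    /// Rearrange contents into sorted order.
    fn sort(&mut self);
}

/// Marker trait to indicate that a [`Sort`] sorts stably.
pub trait StableSort: Sort {}

/// Hypothetical collection of (priority, label) pairs, ordered by priority only.
pub struct Tasks(pub Vec<(u32, &'static str)>);

impl Sort for Tasks {
    fn sort(&mut self) {
        // The standard library documents `sort_by_key` on slices as stable.
        self.0.sort_by_key(|task| task.0);
    }
}

// The promise from the implementer; the compiler cannot verify it.
impl StableSort for Tasks {}

/// Only accepts collections whose `Sort` implementation is declared stable.
pub fn sort_preserving_ties<T: StableSort>(collection: &mut T) {
    collection.sort();
}

fn main() {
    let mut tasks = Tasks(vec![(2, "a"), (1, "b"), (2, "c"), (1, "d")]);
    sort_preserving_ties(&mut tasks);
    // Entries with equal priority keep their original relative order.
    assert_eq!(tasks.0, vec![(1, "b"), (1, "d"), (2, "a"), (2, "c")]);
}
```

A type that implemented Sort with an unstable algorithm would simply omit the StableSort implementation, and calls to sort_preserving_ties with it would fail to compile.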

Once behavior has been encapsulated into Rust’s type system as a trait, it can be used

in two ways:

• As a trait bound, which constrains what types are acceptable for a generic data

type or function at compile time

• As a trait object, which constrains what types can be stored or passed to a func‐

tion at runtime

The following sections describe these two possibilities; a later Item gives more detail about the trade-offs between them.

Trait bounds

A trait bound indicates that generic code that is parameterized by some type T can be

used only when that type T implements some specific trait. The presence of the trait

bound means that the implementation of the generic can use the functions from that

trait, secure in the knowledge that the compiler will ensure that any T that compiles

does indeed have those functions. This check happens at compile time, when the

generic is monomorphized—converted from the generic code that deals with an arbi‐

trary type T into specific code that deals with one particular SomeType (what C++

would call template instantiation).

This restriction on the target type T is explicit, encoded in the trait bounds: the trait

can be implemented only by types that satisfy the trait bounds. This contrasts with

the equivalent situation in C++, where the constraints on the type T used in a template<typename T> are implicit: C++ template code still compiles only if all of the referenced functions are available at compile time, but the checks are purely based on function name and signature. (This can lead to confusion; a C++ template that uses t.pop() might compile for a T type parameter of either Stack or Balloon, which is unlikely to be desired behavior.)

The need for explicit trait bounds also means that a large fraction of generics use trait

bounds. To see why this is, turn the observation around and consider what can be

done with a struct Thing<T> where there are no trait bounds on T. Without a trait

bound, the Thing can perform only operations that apply to any type T—basically just

moving or dropping the value. This in turn allows for generic containers, collections,

and smart pointers, but not much else. Anything that uses the type T is going to need

a trait bound:

pub fn dump_sorted<T>(mut collection: T)
where
    T: Sort + IntoIterator,
    T::Item: std::fmt::Debug,
{
    // Next line requires `T: Sort` trait bound.
    collection.sort();
    // Next line requires `T: IntoIterator` trait bound.
    for item in collection {
        // Next line requires `T::Item : Debug` trait bound
        println!("{:?}", item);
    }
}

So the advice here is to use trait bounds to express requirements on the types used in

generics, but it’s easy advice to follow—the compiler will force you to comply with it

regardless.
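To see these bounds satisfied by a concrete type, here is a self-contained sketch. The Numbers wrapper is hypothetical, and sorted_debug is a variant of dump_sorted that returns the formatted items instead of printing them, purely so that the result can be checked:

```rust
pub trait Sort {
    /// Rearrange contents into sorted order.
    fn sort(&mut self);
}

/// Hypothetical collection type that satisfies all of the bounds below.
pub struct Numbers(pub Vec<i32>);

impl Sort for Numbers {
    fn sort(&mut self) {
        // Delegates to the standard slice sort.
        self.0.sort();
    }
}

impl IntoIterator for Numbers {
    type Item = i32;
    type IntoIter = std::vec::IntoIter<i32>;

    fn into_iter(self) -> Self::IntoIter {
        self.0.into_iter()
    }
}

/// Variant of `dump_sorted` that collects the `Debug` output of each item.
pub fn sorted_debug<T>(mut collection: T) -> Vec<String>
where
    T: Sort + IntoIterator,
    T::Item: std::fmt::Debug,
{
    collection.sort();
    collection
        .into_iter()
        .map(|item| format!("{:?}", item))
        .collect()
}

fn main() {
    let output = sorted_debug(Numbers(vec![3, 1, 2]));
    assert_eq!(output, vec!["1", "2", "3"]);
}
```

Removing any one of the three bounds from the where clause makes the corresponding line in the body fail to compile.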

Trait objects

A trait object is the other way to make use of the encapsulation defined by a trait, but

here, different possible implementations of the trait are chosen at runtime rather than

compile time. This dynamic dispatch is analogous to using virtual functions in C++,

and under the covers, Rust has “vtable” objects that are roughly analogous to those in

C++.

6 Concepts in C++20 allow explicit specification of constraints on template types, but the checks are still performed only when the template is instantiated, not when it is declared.

This dynamic aspect of trait objects also means that they always have to be handled indirectly, via a reference (e.g., &dyn Trait) or a pointer (e.g., Box<dyn Trait>) of some kind. The reason is that the size of the object implementing the trait isn’t known at compile time (it could be a giant struct or a tiny enum), so there’s no way to allocate the right amount of space for a bare trait object.
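Both forms of indirection can be sketched as follows, using a hypothetical HasArea trait with two implementing types of different sizes (all names are assumptions for illustration):

```rust
/// Hypothetical trait used only for this illustration.
pub trait HasArea {
    fn area(&self) -> f64;
}

pub struct Square(pub f64);
pub struct Rect(pub f64, pub f64);

impl HasArea for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

impl HasArea for Rect {
    fn area(&self) -> f64 {
        self.0 * self.1
    }
}

/// Takes a borrowed trait object; the implementation is chosen at runtime.
pub fn describe(shape: &dyn HasArea) -> String {
    format!("area = {}", shape.area())
}

/// Owned trait objects of different concrete types can share one collection,
/// because each `Box` is the same size regardless of what it points to.
pub fn total_area(shapes: &[Box<dyn HasArea>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

fn main() {
    let shapes: Vec<Box<dyn HasArea>> =
        vec![Box::new(Square(2.0)), Box::new(Rect(1.0, 3.0))];
    assert_eq!(total_area(&shapes), 7.0);
    assert_eq!(describe(&Square(3.0)), "area = 9");
}
```

A Vec<dyn HasArea> would not compile, for exactly the reason given above: the element size is unknown.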

Not knowing the size of the concrete object also means that traits used as trait objects

cannot have functions that return the Self type or arguments (other than the

receiver—the object on which the method is being invoked) that use Self. The reason

is that the compiled-in-advance code that uses the trait object would have no idea

how big that Self might be.

A trait that has a generic function fn some_fn<T>(t:T) allows for the possibility of

an infinite number of implemented functions, for all of the different types T that

might exist. This is fine for a trait used as a trait bound, because the infinite set of

possibly invoked generic functions becomes a finite set of actual y invoked generic

functions at compile time. The same is not true for a trait object: the code available at

compile time has to cope with all possible Ts that might arrive at runtime.

These two restrictions (no use of Self and no generic functions) are combined in the concept of object safety: only object-safe traits can be used as trait objects.

Item 3: Prefer Option and Result transforms over explicit match expressions

An earlier Item covered enum and showed how match expressions force the programmer to take all possibilities into account. That Item also introduced the two ubiquitous enums that the Rust standard library provides:

Option<T>

To express that a value (of type T) may or may not be present

Result<T, E>

For when an operation to return a value (of type T) may not succeed and may instead return an error (of type E)

This Item explores situations where you should try to avoid explicit match expres‐

sions for these particular enums, preferring instead to use various transformation

methods that the standard library provides for these types. Using these transforma‐

tion methods (which are typically themselves implemented as match expressions

under the covers) leads to code that is more compact and idiomatic and has clearer

intent.

The first situation where a match is unnecessary is when only the value is relevant and

the absence of value (and any associated error) can just be ignored:

struct S {
    field: Option<i32>,
}

let s = S { field: Some(42) };
match &s.field {
    Some(i) => println!("field is {i}"),
    None => {}
}

For this situation, an if let expression is one line shorter and, more importantly, clearer:

if let Some(i) = &s.field {

println!("field is {i}");

}

However, most of the time the programmer needs to provide the corresponding else

arm: the absence of a value (Option::None), possibly with an associated error

(Result::Err(e)), is something that the programmer needs to deal with. Designing

software to cope with failure paths is hard, and most of that is essential complexity

that no amount of syntactic support can help with—specifically, deciding what should

happen if an operation fails.

In some situations, the right decision is to perform an ostrich maneuver—put our

heads in the sand and explicitly not cope with failure. You can’t completely ignore the

error arm, because Rust requires that the code deal with both variants of the Result

enum, but you can choose to treat a failure as fatal. Performing a panic! on failure

means that the program terminates, but the rest of the code can then be written with

the assumption of success. Doing this with an explicit match would be needlessly

verbose:

let result = std::fs::File::open("/etc/passwd");

let f = match result {

Ok(f) => f,

Err(_e) => panic!("Failed to open /etc/passwd!"),

};

// Assume `f` is a valid `std::fs::File` from here onward.

Both Option and Result provide a pair of methods that extract their inner value and

panic! if it’s absent: unwrap and expect. The latter allows the error message on failure to be personalized, but in either case, the resulting code is shorter and simpler; error handling is delegated to the .unwrap() suffix (but is still present):

let f = std::fs::File::open("/etc/passwd").unwrap();

Be clear, though: these helper functions still panic!, so choosing to use them is the same as choosing to panic! yourself.
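As a brief sketch of the two helpers side by side, consider these hypothetical functions (the names and the panic message are assumptions for illustration):

```rust
/// Hypothetical helper: parse a port number, treating bad input as fatal.
/// `.expect()` panics with the supplied message if the `Result` is an `Err`.
pub fn parse_port(text: &str) -> u16 {
    text.parse::<u16>()
        .expect("port must be an integer between 0 and 65535")
}

/// Hypothetical helper: the caller promises a non-empty slice.
/// `.unwrap()` panics with a generic message if the `Option` is `None`.
pub fn largest(values: &[i32]) -> i32 {
    values.iter().copied().max().unwrap()
}

fn main() {
    assert_eq!(parse_port("8080"), 8080);
    assert_eq!(largest(&[3, 9, 4]), 9);
    // Calling `parse_port("http")` or `largest(&[])` would panic.
}
```

Both functions push the responsibility for valid input onto the caller, which is reasonable only when a failure genuinely cannot be handled.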

However, in many situations, the right decision for error handling is to defer the decision to somebody else. This is particularly true when writing a library, where the code may be used in all sorts of different environments that can’t be foreseen by the library author. To make that somebody else’s job easier, prefer Result to Option for expressing errors, even though this may involve conversions between different error types.

Of course, this opens up the question, What counts as an error? In this example, fail‐

ing to open a file is definitely an error, and the details of that error (no such file? per‐

mission denied?) can help the user decide what to do next. On the other hand, failing

to retrieve an element of a slice because that slice is empty isn’t really an error, and so it is expressed as an Option return type in the standard library. Choosing between the two possibilities requires judgment, but lean toward Result if an

error might communicate anything useful.
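The distinction can be sketched with two hypothetical functions over the same data: one where absence is unremarkable, and one where the caller benefits from an explanation. The function names and the error text are assumptions for this example:

```rust
/// Absence isn't an error here, so `Option` is enough.
pub fn first_score(scores: &[u32]) -> Option<u32> {
    scores.first().copied()
}

/// Here the caller needs at least one score, and a message explaining the
/// failure is useful, so the `Option` is converted into a `Result`.
pub fn required_first_score(scores: &[u32]) -> Result<u32, String> {
    scores
        .first()
        .copied()
        .ok_or_else(|| String::from("no scores supplied"))
}

fn main() {
    assert_eq!(first_score(&[]), None);
    assert_eq!(required_first_score(&[7, 8]), Ok(7));
    assert_eq!(
        required_first_score(&[]),
        Err(String::from("no scores supplied"))
    );
}
```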

Result also has a #[must_use] attribute to nudge library users in the right direction: if the code using the returned Result ignores it, the compiler will generate

a warning:

warning: unused `Result` that must be used

--> src/main.rs:63:5

63 | f.set_len(0); // Truncate the file

| ^^^^^^^^^^^^

= note: this `Result` may be an `Err` variant, which should be handled

= note: `#[warn(unused_must_use)]` on by default

help: use `let _ = ...` to ignore the resulting value

63 | let _ = f.set_len(0); // Truncate the file

| +++++++

Explicitly using a match allows an error to propagate, but at the cost of some visible boilerplate:

pub fn find_user(username: &str) -> Result<UserId, std::io::Error> {

let f = match std::fs::File::open("/etc/passwd") {

Ok(f) => f,

Err(e) => return Err(From::from(e)),

};

// ...

}

The key ingredient for reducing boilerplate is Rust’s question mark operator, ?. This piece of syntactic sugar takes care of matching the Err arm, transforming the error

type if necessary, and building the return Err(...) expression, all in a single

character:

pub fn find_user(username: &str) -> Result<UserId, std::io::Error> {

let f = std::fs::File::open("/etc/passwd")?;

// ...

}

Newcomers to Rust sometimes find this disconcerting: the question mark can be hard

to spot on first glance, leading to disquiet as to how the code can possibly work.

However, even with a single character, the type system is still at work, ensuring that

all of the possibilities expressed in the relevant types are covered, leaving

the programmer to focus on the mainline code path without distractions.
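For a version that can be run without touching the filesystem, here is a small sketch of the same operator applied to number parsing. The sum_strs function is a hypothetical example, not part of the find_user code:

```rust
use std::num::ParseIntError;

/// Hypothetical helper: parse two integers and add them.
/// Each `?` returns early with the `Err` value if parsing fails.
pub fn sum_strs(a: &str, b: &str) -> Result<i32, ParseIntError> {
    let x = a.trim().parse::<i32>()?;
    let y = b.trim().parse::<i32>()?;
    Ok(x + y)
}

fn main() {
    assert_eq!(sum_strs("2", " 40 "), Ok(42));
    // The second argument fails to parse, so the error is propagated.
    assert!(sum_strs("2", "forty").is_err());
}
```

The mainline path reads as three straightforward statements, with the two failure paths reduced to a single character each.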

What’s more, there’s generally no cost to these apparent method invocations: they are

all generic functions marked as #[inline], so the generated code will typically compile to machine code that’s identical to the manual version.

These two factors taken together mean that you should prefer Option and Result

transforms over explicit match expressions.

In the previous example, the error types lined up: both the inner and outer methods

expressed errors as std::io::Error. That’s often not the case: one function may accumulate errors from a variety of different sublibraries, each of which uses different

error types.

Error mapping in general is discussed in Item 4; for now, just be aware that a manual mapping:

pub fn find_user(username: &str) -> Result<UserId, String> {

let f = match std::fs::File::open("/etc/passwd") {

Ok(f) => f,

Err(e) => {

return Err(format!("Failed to open password file: {:?}", e))

}

};

// ...

}

could be more succinctly and idiomatically expressed with the following .map_err() transformation:

pub fn find_user(username: &str) -> Result<UserId, String> {

let f = std::fs::File::open("/etc/passwd")

.map_err(|e| format!("Failed to open password file: {:?}", e))?;

// ...

}

Better still, even this may not be necessary—if the outer error type can be created

from the inner error type via an implementation of the From standard trait,

then the compiler will automatically perform the conversion without the need for a

call to .map_err().

These kinds of transformations generalize more widely. The question mark operator

is a big hammer; use transformation methods on Option and Result types to maneu‐

ver them into a position where they can be a nail.

The standard library provides a wide variety of these transformation methods to

make this possible. Figure 1-1 shows some of the most common methods (light rectangles) that transform between the relevant types (dark rectangles). Methods that can panic! are marked with an asterisk.

Figure 1-1. Option and Result transformations
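A few of these transformations chained together might look like the following sketch. The parse_positive function and its error strings are hypothetical; the methods used (ok_or, map, and_then, map_err) are standard library methods on Option and Result:

```rust
/// Hypothetical helper: turn an optional piece of text into a number.
pub fn parse_positive(input: Option<&str>) -> Result<u32, String> {
    input
        // Option<&str> -> Result<&str, &str>
        .ok_or("no input")
        // Transform the success value, leaving any error untouched.
        .map(str::trim)
        // Chain another fallible step, aligning its error type first.
        .and_then(|s| s.parse::<u32>().map_err(|_| "not a number"))
        // Result<u32, &str> -> Result<u32, String>
        .map_err(String::from)
}

fn main() {
    assert_eq!(parse_positive(Some(" 42 ")), Ok(42));
    assert_eq!(parse_positive(None), Err(String::from("no input")));
    assert_eq!(parse_positive(Some("x")), Err(String::from("not a number")));
}
```

Each step either passes the value along or short-circuits with an error, and the final result is in a form where a caller can apply the question mark operator.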

One common situation the diagram doesn’t cover deals with references. For example,

consider a structure that optionally holds some data:

struct InputData {

payload: Option<Vec<u8>>,

}

A method on this struct that tries to pass the payload to an encryption function with

signature (&[u8]) -> Vec<u8> fails if there’s a naive attempt to take a reference:

D O E S N O T C O M P I L E

impl InputData {
    pub fn encrypted(&self) -> Vec<u8> {
        encrypt(&self.payload.unwrap_or(vec![]))
    }
}

7 The t documentation.

error[E0507]: cannot move out of `self.payload` which is behind a shared

reference

--> src/main.rs:15:18

15 | encrypt(&self.payload.unwrap_or(vec![]))

| ^^^^^^^^^^^^ move occurs because `self.payload` has type

| `Option<Vec<u8>>`, which does not implement the

| `Copy` trait

The right tool for this is the as_ref() method on Option. This method converts a reference-to-an-Option into an Option-of-a-reference:

pub fn encrypted(&self) -> Vec<u8> {

encrypt(self.payload.as_ref().unwrap_or(&vec![]))

}
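To make this runnable, the sketch below supplies a stand-in encrypt function (an assumption: it merely inverts each byte and is not real encryption). It also shows as_deref(), a related standard library method that may be a convenient alternative here:

```rust
/// Stand-in for a real encryption function (an assumption for this sketch):
/// it just inverts every bit so that the example is self-contained.
fn encrypt(data: &[u8]) -> Vec<u8> {
    data.iter().map(|b| !*b).collect()
}

pub struct InputData {
    pub payload: Option<Vec<u8>>,
}

impl InputData {
    /// Version from the text: `as_ref()` turns `&Option<Vec<u8>>` into
    /// `Option<&Vec<u8>>`, so nothing is moved out of `self`.
    pub fn encrypted(&self) -> Vec<u8> {
        encrypt(self.payload.as_ref().unwrap_or(&vec![]))
    }

    /// Alternative using `as_deref()`, which yields `Option<&[u8]>` directly
    /// and so does not need a temporary empty `Vec`.
    pub fn encrypted_deref(&self) -> Vec<u8> {
        encrypt(self.payload.as_deref().unwrap_or(&[]))
    }
}

fn main() {
    let some = InputData { payload: Some(vec![0x00, 0x0F]) };
    assert_eq!(some.encrypted(), vec![0xFF, 0xF0]);
    assert_eq!(some.encrypted_deref(), vec![0xFF, 0xF0]);

    let none = InputData { payload: None };
    assert!(none.encrypted().is_empty());
    // `some.payload` is still intact, because only references were taken.
    assert_eq!(some.payload, Some(vec![0x00, 0x0F]));
}
```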

Things to Remember

• Get used to the transformations of Option and Result, and prefer Result to

Option. Use .as_ref() as needed when transformations involve references.

• Use these transformations in preference to explicit match operations on Option

and Result.

• In particular, use these transformations to convert result types into a form where

the ? operator applies.

Item 4: Prefer idiomatic Error types

Item 3 described how to use the transformations that the standard library provides

for the Option and Result types to allow concise, idiomatic handling of result types

using the ? operator. It stopped short of discussing how best to handle the variety of

different error types E that arise as the second type argument of a Result<T, E>;

that’s the subject of this Item.

This is relevant only when there are a variety of different error types in play. If all of

the different errors that a function encounters are already of the same type, it can just

return that type. When there are errors of different types, there’s a decision to make

about whether the suberror type information should be preserved.

8 Note that this method is separate from the AsRef trait, even though the method name is the same.

The Error Trait

It’s always good to understand what the standard traits involve, and the relevant trait here is std::error::Error. The E type parameter for a Result doesn’t have to be a type that implements Error, but it’s a common convention that allows

wrappers to express appropriate trait bounds—so prefer to implement Error for your

error types.

The first thing to notice is that the only hard requirement for Error types is the trait

bounds: any type that implements Error also has to implement the following traits:

• The Display trait, meaning that it can be format!ed with {}

• The Debug trait, meaning that it can be format!ed with {:?}

In other words, it should be possible to display Error types to both the user and the

programmer.

The only method in the trait is source(), which allows an Error type to expose an inner, nested error. This method is optional: it comes with a default implementation returning None, indicating that inner error information isn’t available.

One final thing to note: if you’re writing code for a no_std environment, it may not be possible to implement Error, because the Error trait is currently implemented in std, not core, and so is not available.

Minimal Errors

If nested error information isn’t needed, then an implementation of the Error type

need not be much more than a String—one rare occasion where a “stringly typed”

variable might be appropriate. It does need to be a little more than a String though;

while it’s possible to use String as the E type parameter:

pub fn find_user(username: &str) -> Result<UserId, String> {

let f = std::fs::File::open("/etc/passwd")

.map_err(|e| format!("Failed to open password file: {:?}", e))?;

// ...

}

a String doesn’t implement Error, which we’d prefer so that other areas of code can

deal with Errors. It’s not possible to impl Error for String, because neither the trait

nor the type belong to us (the so-called orphan rule):

9 Or at least the only nondeprecated, stable method.

10 At the time of writing, Error has been moved to core but is not yet available in stable Rust.

D O E S N O T C O M P I L E

impl std::error::Error for String {}

error[E0117]: only traits defined in the current crate can be implemented for

types defined outside of the crate

--> src/main.rs:18:5

18 | impl std::error::Error for String {}

| ^^^^^^^^^^^^^^^^^^^^^^^^^^^------

| | |

| | `String` is not defined in the current crate

| impl doesn't use only types from inside the current crate

= note: define and implement a trait or new type instead

A type alias doesn’t help either, because it doesn’t create a new type and so doesn’t change the error message:

D O E S N O T C O M P I L E

pub type MyError = String;

impl std::error::Error for MyError {}

error[E0117]: only traits defined in the current crate can be implemented for

types defined outside of the crate

--> src/main.rs:41:5

41 | impl std::error::Error for MyError {}

| ^^^^^^^^^^^^^^^^^^^^^^^^^^^-------

| | |

| | `String` is not defined in the current crate

| impl doesn't use only types from inside the current crate

= note: define and implement a trait or new type instead

As usual, the compiler error message gives a hint to solving the problem. Defining a

tuple struct that wraps the String type (the “newtype pattern”) allows the

Error trait to be implemented, provided that Debug and Display are implemented

too:

#[derive(Debug)]

pub struct MyError(String);

impl std::fmt::Display for MyError {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        write!(f, "{}", self.0)
    }
}

impl std::error::Error for MyError {}

pub fn find_user(username: &str) -> Result<UserId, MyError> {
    let f = std::fs::File::open("/etc/passwd").map_err(|e| {
        MyError(format!("Failed to open password file: {:?}", e))
    })?;
    // ...
}

For convenience, it may make sense to implement the From<String> trait to allow

string values to be easily converted into MyError:

impl From<String> for MyError {
    fn from(msg: String) -> Self {
        Self(msg)
    }
}

When it encounters the question mark operator (?), the compiler will automatically

apply any relevant From trait implementations that are needed to reach the destina‐

tion error return type. This allows further minimization:

pub fn find_user(username: &str) -> Result<UserId, MyError> {

let f = std::fs::File::open("/etc/passwd")

.map_err(|e| format!("Failed to open password file: {:?}", e))?;

// ...

}

The error path here covers the following steps:

• File::open returns an error of type std::io::Error.

• format! converts this to a String, using the Debug implementation of

std::io::Error.

• ? makes the compiler look for and use a From implementation that can take it

from String to MyError.
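The same sequence of steps can be exercised without a password file. The sketch below assembles the MyError pieces from this section and applies them to a hypothetical parse_user_id function, which is an assumption standing in for find_user:

```rust
use std::fmt;

#[derive(Debug)]
pub struct MyError(String);

impl fmt::Display for MyError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}

impl std::error::Error for MyError {}

impl From<String> for MyError {
    fn from(msg: String) -> Self {
        Self(msg)
    }
}

/// Hypothetical stand-in for `find_user` that needs no filesystem access.
pub fn parse_user_id(text: &str) -> Result<u32, MyError> {
    let id = text
        .parse::<u32>()
        // Step 1: `parse` returns a `ParseIntError`.
        // Step 2: `format!` converts it to a `String`.
        .map_err(|e| format!("Invalid user ID {text:?}: {e}"))?;
    // Step 3: `?` used `From<String>` to build the `MyError` on failure.
    Ok(id)
}

fn main() {
    assert_eq!(parse_user_id("42").unwrap(), 42);
    let err = parse_user_id("x").unwrap_err();
    println!("{err}");
}
```

Because MyError implements Error, a failure from parse_user_id can also be passed to any code that accepts a &dyn std::error::Error.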

Nested Errors

The alternative scenario is where the content of nested errors is important enough

that it should be preserved and made available to the caller.

Consider a library function that attempts to return the first line of a file as a string, as

long as the line is not too long. A moment’s thought reveals (at least) three distinct

types of failure that could occur:

• The file might not exist or might be inaccessible for reading.

• The file might contain data that isn’t valid UTF-8 and so can’t be converted into a

String.

• The file might have a first line that is too long.

You can use the type system to express and encompass all of these possibilities as an enum:

#[derive(Debug)]

pub enum MyError {

Io(std::io::Error),

Utf8(std::string::FromUtf8Error),

General(String),

}

This enum definition includes a derive(Debug), but to satisfy the Error trait, a

Display implementation is also needed:

impl std::fmt::Display for MyError {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        match self {
            MyError::Io(e) => write!(f, "IO error: {}", e),
            MyError::Utf8(e) => write!(f, "UTF-8 error: {}", e),
            MyError::General(s) => write!(f, "General error: {}", s),
        }
    }
}

It also makes sense to override the default source() implementation for easy access

to nested errors:

use std::error::Error;

impl Error for MyError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            MyError::Io(e) => Some(e),
            MyError::Utf8(e) => Some(e),
            MyError::General(_) => None,
        }
    }
}

The use of an enum allows the error handling to be concise while still preserving all of

the type information across different classes of error:

use std::io::BufRead; // for `.read_until()`

/// Maximum supported line length.
const MAX_LEN: usize = 1024;

/// Return the first line of the given file.
pub fn first_line(filename: &str) -> Result<String, MyError> {
    let file = std::fs::File::open(filename).map_err(MyError::Io)?;
    let mut reader = std::io::BufReader::new(file);

    // (A real implementation could just use `reader.read_line()`)
    let mut buf = vec![];
    let len = reader.read_until(b'\n', &mut buf).map_err(MyError::Io)?;
    let result = String::from_utf8(buf).map_err(MyError::Utf8)?;
    if result.len() > MAX_LEN {
        return Err(MyError::General(format!("Line too long: {}", len)));
    }
    Ok(result)
}
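Preserving the type information pays off on the caller's side, where code can react differently to each class of failure. The sketch below illustrates this under stated assumptions: the helper names is_missing_file and recover_bytes are hypothetical and introduced only for this example, and the enum is repeated so that the block is self-contained.

```rust
#[derive(Debug)]
pub enum MyError {
    Io(std::io::Error),
    Utf8(std::string::FromUtf8Error),
    General(String),
}

/// Hypothetical caller-side check: did the operation fail because the
/// file does not exist? This needs the nested `std::io::Error` itself,
/// not just a formatted string.
pub fn is_missing_file(err: &MyError) -> bool {
    matches!(err, MyError::Io(e) if e.kind() == std::io::ErrorKind::NotFound)
}

/// Hypothetical caller-side recovery: if the contents were not valid
/// UTF-8, take back the raw bytes that failed to convert.
pub fn recover_bytes(err: MyError) -> Option<Vec<u8>> {
    match err {
        MyError::Utf8(e) => Some(e.into_bytes()),
        _ => None,
    }
}
```

Neither helper would be possible if the nested errors had been flattened into a String, because the `kind()` and `into_bytes()` methods belong to the original error types.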

It’s also a good idea to implement the From trait for all of the suberror types:

impl From<std::io::Error> for MyError {
    fn from(e: std::io::Error) -> Self {
        Self::Io(e)
    }
}

impl From<std::string::FromUtf8Error> for MyError {
    fn from(e: std::string::FromUtf8Error) -> Self {
        Self::Utf8(e)
    }
}

This prevents library users from suffering under the orphan rules themselves: they aren’t allowed to implement From on MyError, because both the trait and the type are external to them.

Better still, implementing From allows for even more concision, because the question mark operator will automatically perform any necessary From conversions, removing the need for .map_err():

use std::io::BufRead; // for `.read_until()`

/// Maximum supported line length.
pub const MAX_LEN: usize = 1024;

/// Return the first line of the given file.
pub fn first_line(filename: &str) -> Result<String, MyError> {
    let file = std::fs::File::open(filename)?; // `From<std::io::Error>`
    let mut reader = std::io::BufReader::new(file);
    let mut buf = vec![];
    let len = reader.read_until(b'\n', &mut buf)?; // `From<std::io::Error>`
    let result = String::from_utf8(buf)?; // `From<string::FromUtf8Error>`
    if result.len() > MAX_LEN {
        return Err(MyError::General(format!("Line too long: {}", len)));
    }
    Ok(result)
}
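To make the automatic conversion more concrete, the following sketch places a function that uses the question mark operator next to a hand-written approximation of what the operator does. The expanded form is a simplification (the compiler's real desugaring goes through additional traits), and the function names decode_with_question_mark and decode_expanded are hypothetical, chosen only for this illustration.

```rust
#[derive(Debug)]
pub enum MyError {
    Io(std::io::Error),
    Utf8(std::string::FromUtf8Error),
    General(String),
}

impl From<std::io::Error> for MyError {
    fn from(e: std::io::Error) -> Self {
        Self::Io(e)
    }
}

impl From<std::string::FromUtf8Error> for MyError {
    fn from(e: std::string::FromUtf8Error) -> Self {
        Self::Utf8(e)
    }
}

/// Version that relies on `?` to find and apply the `From` conversion.
pub fn decode_with_question_mark(bytes: Vec<u8>) -> Result<String, MyError> {
    let text = String::from_utf8(bytes)?;
    Ok(text)
}

/// Approximate manual equivalent: on failure, convert the error with
/// `From` and return early; on success, unwrap the value and continue.
pub fn decode_expanded(bytes: Vec<u8>) -> Result<String, MyError> {
    let text = match String::from_utf8(bytes) {
        Ok(value) => value,
        Err(e) => return Err(MyError::from(e)),
    };
    Ok(text)
}
```

Both functions return `MyError::Utf8` for invalid input; the first simply lets the compiler pick the `From<std::string::FromUtf8Error>` implementation.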

Writing a complete error type can involve a fair amount of boilerplate, which makes it a good candidate for automation via a derive macro. However, there’s no need to write such a macro yourself: consider using the thiserror crate from David Tolnay, which provides a high-quality, widely used implementation of just such a macro. The code generated by thiserror is also careful to avoid making any thiserror types visible in the generated API, which in turn means that the concerns associated with exposing a dependency’s types in a public API don’t apply.

Trait Objects

The first approach to nested errors threw away all of the suberror detail, just preserving some string output (format!("{:?}", err)). The second approach preserved the full type information for all possible suberrors but required a full enumeration of all possible types of suberror.

This raises the question: is there a middle ground between these two approaches, preserving suberror information without needing to manually include every possible error type?

Encoding the suberror information as a trait object avoids the need for an enum variant for every possibility but erases the details of the specific underlying error types.

The receiver of such an object would have access to the methods of the Error trait

and its trait bounds—source(), Display::fmt(), and Debug::fmt(), in turn—but

wouldn’t know the original static type of the suberror:

UNDESIRED BEHAVIOR

use std::error::Error;

#[derive(Debug)]
pub enum WrappedError {
    Wrapped(Box<dyn Error>),
    General(String),
}

impl std::fmt::Display for WrappedError {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        match self {
            Self::Wrapped(e) => write!(f, "Inner error: {}", e),
            Self::General(s) => write!(f, "{}", s),
        }
    }
}

It turns out that this is possible, but it’s surprisingly subtle. Part of the difficulty comes from the object safety constraints, but Rust’s coherence rules also come into play, which (roughly) say that there can be at most one implementation of a trait for a type.

A putative WrappedError type would naively be expected to implement both of the

following:

• The Error trait, because it is an error itself.

• The From<Error> trait, to allow suberrors to be easily wrapped.

That means that a WrappedError can be created from an inner WrappedError, as WrappedError implements Error, and that clashes with the blanket reflexive implementation of From:

DOES NOT COMPILE

impl Error for WrappedError {}

impl<E: 'static + Error> From<E> for WrappedError {
    fn from(e: E) -> Self {
        Self::Wrapped(Box::new(e))
    }
}
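One possible way to sidestep the clash, sketched here under the assumption that the wrapper does not itself need to satisfy Error, is to keep the blanket From implementation and leave out the Error implementation. With no `impl Error for WrappedError`, the type can never stand in for `E`, so the overlap with the reflexive `From<T> for T` disappears. This is an illustration of the coherence rule rather than a recommended design, and parse_number is a hypothetical function added only to exercise the conversion.

```rust
use std::error::Error;
use std::fmt;

/// Wrapper that deliberately does NOT implement `Error`.
#[derive(Debug)]
pub enum WrappedError {
    Wrapped(Box<dyn Error>),
    General(String),
}

impl fmt::Display for WrappedError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::Wrapped(e) => write!(f, "Inner error: {}", e),
            Self::General(s) => write!(f, "{}", s),
        }
    }
}

// Because `WrappedError` is not an `Error`, this blanket implementation
// cannot overlap with the reflexive `impl<T> From<T> for T`.
impl<E: 'static + Error> From<E> for WrappedError {
    fn from(e: E) -> Self {
        Self::Wrapped(Box::new(e))
    }
}

/// Hypothetical helper: parse a non-negative decimal number, wrapping
/// any parse failure as a trait object.
pub fn parse_number(text: &str) -> Result<i32, WrappedError> {
    let value: i32 = text.trim().parse()?; // `ParseIntError` -> `WrappedError`
    if value < 0 {
        return Err(WrappedError::General(format!("negative value: {}", value)));
    }
    Ok(value)
}
```

The cost of this trade-off is that a WrappedError can no longer be passed to code that expects something implementing Error, including being wrapped inside another WrappedError.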
