Effective Rust
Rust’s popularity is growing, due in part to features like memory safety, type safety, and
thread safety. But these same elements can also make learning Rust a challenge, even for
experienced programmers. This practical guide helps you make the transition to writing
idiomatic Rust. In the process, you’ll also make full use of Rust’s type system, safety
guarantees, and burgeoning ecosystem.
If you’re a software engineer who has experience with an existing compiled language, or if
you’ve struggled to convert a basic understanding of Rust syntax into working programs, this
book is for you. Effective Rust focuses on the conceptual differences between Rust and other
compiled languages, and provides specific recommendations that programmers can easily follow.
Author David Drysdale will soon have you writing fluent Rust, rather than badly translated C++.
Effective Rust will help you:
• Understand the structure of Rust’s type system
• Learn Rust idioms for error handling, iteration, and more
• Discover how to work with Rust’s crate ecosystem
• Use Rust’s type system to express your design
• Win fights with the borrow checker
• Build a robust project that takes full advantage of the Rust tooling ecosystem
“Effective Rust is an excellent collection of real-world Rust knowledge beyond the basics.
The advice in this book will help you become a confident and well-rounded Rustacean.”
Carol Nichols, coauthor of The Rust Programming Language
“Effective Rust dives deep into most of the recommendations I give people on how to improve
their projects. It’s a great resource to level up your Rust.”
Pietro Albini, former member of the Rust Core Team
David Drysdale is a Google staff software engineer who has worked in Rust since 2019, primarily
in security-related areas. He led the Rust rewrite of Android’s hardware cryptography subsystem
and authored the Rust port of the Tink cryptography library. He has also worked in C/C++ and Go and
on projects as diverse as the Linux kernel and video conferencing mobile apps.
RUST PROGRAMMING
US $59.99 CAN $74.99
ISBN: 978-1-098-15140-9
Effective Rust
35 Specific Ways to Improve Your Rust Code
David Drysdale
Beijing Boston Farnham Sebastopol Tokyo
Effective Rust
by David Drysdale
Copyright © 2024 Galloglass Consulting Limited. All rights reserved.
Printed in the United States of America.
Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.
O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions are
also available for most titles. For more information, contact our corporate/institutional sales department:
800-998-9938 or corporate@oreilly.com.
Acquisitions Editor: Brian Guerin
Indexer: Ellen Troutman-Zaig
Development Editor: Jeff Bleiel
Interior Designer: David Futato
Production Editor: Katherine Tozer
Cover Designer: Karen Montgomery
Copyeditor: Piper Editorial Consulting, LLC
Illustrator: Kate Dullea
Proofreader: Dwight Ramsey
April 2024:
First Edition
Revision History for the First Edition
2024-04-01: First Release
The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. Effective Rust, the cover image, and related trade dress are trademarks of O’Reilly Media, Inc.
The views expressed in this work are those of the author and do not represent the publisher’s views. While
the publisher and the author have used good faith efforts to ensure that the information and instructions
contained in this work are accurate, the publisher and the author disclaim all responsibility for errors or omissions, including without limitation responsibility for damages resulting from the use of or reliance
on this work. Use of the information and instructions contained in this work is at your own risk. If any
code samples or other technology this work contains or describes is subject to open source licenses or the
intellectual property rights of others, it is your responsibility to ensure that your use thereof complies
with such licenses and/or rights.
Preface
The code is more what you’d call guidelines than actual rules.
—Hector Barbossa
In the crowded landscape of modern programming languages, Rust is different. Rust
offers the speed of a compiled language, the efficiency of a non-garbage-collected
language, and the type safety of a functional language, as well as a unique solution to
memory safety problems.
The strength and consistency of Rust’s type system means that if a Rust program
compiles, there is already a decent chance that it will work—a phenomenon previ‐
ously observed only with more academic, less accessible languages such as Haskell. If
a Rust program compiles, it will also work safely.
This safety—both type safety and memory safety—does come with a cost, though.
Despite the quality of the basic documentation, Rust has a reputation for having a
steep on-ramp, where newcomers have to go through the initiation rituals of fighting
the borrow checker, redesigning their data structures, and being befuddled by life‐
times. A Rust program that compiles may have a good chance of working the first
time, but the struggle to get it to compile is real—even with the Rust compiler’s
remarkably helpful error diagnostics.
Who This Book Is For
This book tries to help with these areas where programmers struggle, even if they
already have experience with an existing compiled language like C++. As such, and
in common with other Effective <Language> books, this book is intended to be the
second book that a newcomer to Rust might need, after they have already encountered
the basics elsewhere, for example in The Rust Programming Language (Steve Klabnik and
Carol Nichols, No Starch Press) or Programming Rust (Jim Blandy et al., O’Reilly).
However, Rust’s safety leads to a slightly different slant to the Items here, particularly
when compared to Scott Meyers’s original Effective C++ series. The C++ language
was (and is) full of footguns, so Effective C++ focused on a collection of advice for
avoiding those footguns, based on real-world experience creating software in C++.
Significantly, it contained guidelines not rules, because guidelines have exceptions—
providing the detailed rationale for a guideline allows readers to decide for them‐
selves whether their particular scenario warranted breaking the rule.
The general style of giving advice together with the reasons for that advice is
preserved here. However, since Rust is remarkably free of footguns, the Items here
concentrate more on the concepts that Rust introduces. Many Items have titles like
“Understand…” and “Familiarize yourself with…”, and help on the journey toward
writing fluent, idiomatic Rust.
Rust’s safety also leads to a complete absence of Items titled “Never…”. If you really
should never do something, the compiler will generally prevent you from doing it.
Rust Version
Rust’s back-compatibility promises mean that any later edition of Rust, including the
2021 edition, still supports code written for an earlier edition, even if that later
edition introduces breaking changes. Rust is now also stable enough that the differences
between the 2018 and 2021 editions are minor; none of the code in the book needs
altering to be 2021-edition compliant (with one exception, in which a
later version of Rust allows new behavior that wasn’t previously possible).
The Items here do not cover any aspects of Rust’s async functionality, as this involves
more advanced concepts and less stable toolchain support; there’s already enough
ground to cover with synchronous Rust. Perhaps an Effective Async Rust will emerge
in the future…
The specific rustc version used for code fragments and error messages is 1.70. The
code fragments are unlikely to need changes for later versions, but the error messages
may vary with your particular compiler version. The error messages included in the
text have also been manually edited to fit within the width constraints of the book but
are otherwise as produced by the compiler.
The text has a number of references to and comparisons with other statically typed
languages, such as Java, Go, and C++, to help readers with experience in those lan‐
guages orient themselves. (C++ is probably the closest equivalent language, particu‐
larly when C++11’s move semantics come into play.)
Navigating This Book
The Items that make up the book are divided into six chapters:
• Suggestions that revolve around Rust’s core type system
• Suggestions for working with Rust’s traits
• Core ideas that form the design of Rust
• Advice for working with Rust’s package ecosystem
• Suggestions for improving your codebase by going beyond just the Rust compiler
• Suggestions for when you have to work beyond Rust’s standard, safe environment
Although the “Concepts” chapter is arguably more fundamental than the “Types” and
“Traits” chapters, it is deliberately placed later in the book so that readers who are
reading from beginning to end can build up some confidence first.
Conventions Used in This Book
The following typographical conventions are used in this book:
Italic
Indicates new terms, URLs, email addresses, filenames, and file extensions.
Constant width
Used for program listings, as well as within paragraphs to refer to program ele‐
ments such as variable or function names, databases, data types, environment
variables, statements, and keywords.
DOES NOT COMPILE
// Marks code samples that do not compile
UNDESIRED BEHAVIOR
// Marks code samples that exhibit undesired behavior
How to Contact Us
Please address comments and questions concerning this book to the publisher:
O’Reilly Media, Inc.
1005 Gravenstein Highway North
Sebastopol, CA 95472
800-889-8969 (in the United States or Canada)
707-827-7019 (international or local)
707-829-0104 (fax)
We have a web page for this book, where we list errata, examples, and any additional
information.
Acknowledgments
My thanks go to the people who helped make this book possible:
• The technical reviewers who gave expert and detailed feedback on all aspects of
the text: Pietro Albini, Jess Males, Mike Capp, and especially Carol Nichols.
• My editors at O’Reilly: Jeff Bleiel, Brian Guerin, and Katie Tozer.
• Tiziano Santoro, from whom I originally learned many things about Rust.
• Danny Elfanbaum, who provided vital technical assistance for dealing with the
AsciiDoc formatting of the book.
• Diligent readers of the original web version of the book, in particular:
— Julian Rosse, who spotted dozens of typos and other errors in the online text.
— Martin Disch, who pointed out potential improvements and inaccuracies in
several Items.
— Chris Fleetwood, Sergey Kaunov, Clifford Matthews, Remo Senekowitsch,
Kirill Zaborsky, and an anonymous Proton Mail user, who pointed out mis‐
takes in the text.
• My family, who coped with many weekends when I was distracted by writing.
CHAPTER 1
Types
The first chapter of this book covers advice that revolves around Rust’s type system.
This type system is more expressive than that of other mainstream languages; it has
more in common with “academic” languages such as Haskell.
One core part of this is Rust’s enum type, which is considerably more expressive than
the enumeration types in other languages and which allows for algebraic data types.
The Items in this chapter cover the fundamental types that the language provides and
how to combine them into data structures that precisely express the semantics of
your program. This concept of encoding behavior into the type system helps to
reduce the amount of checking and error path code that’s required, because invalid
states are rejected by the toolchain at compile time rather than by the program at runtime.
This chapter also describes some of the ubiquitous data structures that are provided
by Rust’s standard library: Options, Results, Errors and Iterators. Familiarity with
these standard tools will help you write idiomatic Rust that is efficient and compact—
in particular, they allow use of Rust’s question mark operator, which supports error
handling that is unobtrusive but still type-safe.
Note that Items that involve Rust traits are covered in the following chapter, but there
is necessarily a degree of overlap with the Items in this chapter, because traits describe
the behavior of types.
Item 1: Use the type system to express your data structures
who called them programers and not type writers
This Item provides a quick tour of Rust’s type system, starting with the fundamental
types that the compiler makes available, then moving on to the various ways that val‐
ues can be combined into data structures.
Rust’s enum type then takes a starring role. Although the basic version is equivalent to
what other languages provide, the ability to combine enum variants with data fields
allows for enhanced flexibility and expressivity.
Fundamental Types
The basics of Rust’s type system are pretty familiar to anyone coming from another
statically typed programming language (such as C++, Go, or Java). There’s a collection
of integer types with specific sizes, both signed (i8, i16, i32, i64, i128) and unsigned
(u8, u16, u32, u64, u128).
There are also signed (isize) and unsigned (usize) integers whose sizes match the pointer
size on the target system. You won’t be doing much in the way of converting between
pointers and integers with Rust, so that size equivalence isn’t really relevant. However,
standard collections return their size as a usize (from .len()), so collection indexing
means that usize values are quite common. That is fine from a capacity perspective, as
there can’t be more items in an in-memory collection than there are memory addresses on
the system.
The integral types do give us the first hint that Rust is a stricter world than C++. In
Rust, attempting to put a larger integer type (i32) into a smaller integer type (i16)
generates a compile-time error:
DOES NOT COMPILE
let x: i32 = 42;
let y: i16 = x;
error[E0308]: mismatched types
  --> src/main.rs:18:18
   |
18 |     let y: i16 = x;
   |            ---   ^ expected `i16`, found `i32`
   |            |
   |            expected due to this
   |
help: you can convert an `i32` to an `i16` and panic if the converted value
      doesn't fit
   |
18 |     let y: i16 = x.try_into().unwrap();
   |                   ++++++++++++++++++++
This is reassuring: Rust is not going to sit there quietly while the programmer does
things that are risky. Although we can see that the values involved in this particular
conversion would be just fine, the compiler has to allow for the possibility of values
where the conversion is not fine:
DOES NOT COMPILE
let x: i32 = 66_000;
let y: i16 = x; // What would this value be?
The error output also gives an early indication that while Rust has stronger rules, it
also has helpful compiler messages that point the way to how to comply with the
rules. The suggested solution raises the question of how to handle situations where
the conversion would have to alter the value to fit, and we’ll have more to say on both
error handling and using panic! later.
Rust also doesn’t allow some things that might appear “safe,” such as putting a value
from a smaller integer type into a larger integer type:
DOES NOT COMPILE
let x = 42i32; // Integer literal with type suffix
let y: i64 = x;
error[E0308]: mismatched types
  --> src/main.rs:36:18
   |
36 |     let y: i64 = x;
   |            ---   ^ expected `i64`, found `i32`
   |            |
   |            expected due to this
   |
help: you can convert an `i32` to an `i64`
   |
36 |     let y: i64 = x.into();
   |                   +++++++
Here, the suggested solution doesn’t raise the specter of error handling, but the
conversion does still need to be explicit. We’ll discuss type conversions in more detail
later.
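As a minimal sketch of the two explicit conversions discussed above, the following uses the standard library's `TryFrom` and `From` traits. The helper names `narrow` and `widen` are hypothetical, chosen only for illustration; the explicit `use` line is only needed on editions before 2021.

```rust
// `TryFrom` is in the prelude from the 2021 edition onward; the explicit
// import keeps this sketch valid on the 2018 edition too.
use std::convert::TryFrom;

/// Narrowing conversion (hypothetical helper): it can fail, and the
/// possibility of failure is visible in the return type.
fn narrow(x: i32) -> Option<i16> {
    // `i16::try_from` returns an `Err` when `x` is out of range for `i16`.
    i16::try_from(x).ok()
}

/// Widening conversion (hypothetical helper): it always succeeds, but it
/// still has to be written out explicitly.
fn widen(x: i32) -> i64 {
    i64::from(x)
}
```

Returning an `Option` here is one possible design choice; code that wants to report why the conversion failed could return the `Result` from `try_from` directly instead.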
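Continuing with the unsurprising primitive types, Rust has a bool type, floating point
types (f32, f64), and a unit type () (like C’s void).
More interesting is the char character type, which holds a Unicode value (similar to Go’s
rune type). Although this is stored as four bytes internally, there are again no silent
conversions to or from a 32-bit integer.
This precision in the type system forces you to be explicit about what you’re trying to
express: a u32 value is different from a char, which in turn is different from a
sequence of UTF-8 bytes, which in turn is different from a sequence of arbitrary
bytes, and it’s up to you to specify exactly which you mean. Joel Spolsky’s famous blog
post on Unicode and character sets can help you understand which you need.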
Of course, there are helper methods that allow you to convert between these different
types, but their signatures force you to handle (or explicitly ignore) the possibility of
failure. For example, a Unicode code point can always be represented in 32 bits, so
'a' as u32 is allowed, but the other direction is trickier (as there are some u32 values
that are not valid Unicode code points):
char::from_u32
Returns an Option<char>, forcing the caller to handle the failure case.
char::from_u32_unchecked
Makes the assumption of validity but has the potential to result in undefined
behavior if that assumption turns out not to be true. The function is marked
unsafe as a result, forcing the caller to use unsafe too.
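The safe direction of this conversion can be sketched as follows. The helper name `to_char_lossy` is hypothetical, and falling back to the Unicode replacement character is just one possible way to handle the failure case, not something the text prescribes.

```rust
/// Convert a `u32` into a `char` (hypothetical helper), substituting
/// U+FFFD REPLACEMENT CHARACTER when the value is not a valid `char`.
fn to_char_lossy(value: u32) -> char {
    // `char::from_u32` returns `None` for surrogates (0xD800..=0xDFFF)
    // and for values above 0x10FFFF.
    char::from_u32(value).unwrap_or(char::REPLACEMENT_CHARACTER)
}
```

The caller has to decide what an invalid value means; the `Option` return type makes it impossible to forget that decision.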
Aggregate Types
Moving on to aggregate types, Rust has a variety of ways to combine related values.
Most of these are familiar equivalents to the aggregation mechanisms available in
other languages:
Arrays
Hold multiple instances of a single type, where the number of instances is known
at compile time. For example, [u32; 4] is four 4-byte integers in a row.
Tuples
Hold instances of multiple heterogeneous types, where the number of elements
and their types are known at compile time, for example, (WidgetOffset, WidgetSize,
WidgetColor). If the types in the tuple aren’t distinctive (for example,
(i32, i32, &'static str, bool)), it’s better to give each element a name and
use a struct.
Structs
Also hold instances of heterogeneous types known at compile time but allow
both the overall type and the individual fields to be referred to by name.
1 The situation gets muddier still if the filesystem is involved, since filenames on popular platforms are somewhere in between arbitrary bytes and UTF-8 sequences: see the std::ffi::OsString documentation.
2 Technically, a Unicode scalar value rather than a code point.
Rust also includes the tuple struct, which is a crossbreed of a struct and a tuple:
there’s a name for the overall type but no names for the individual fields—they are
referred to by number instead: s.0, s.1, and so on:
/// Struct with two unnamed fields.
struct TextMatch(usize, String);
// Construct by providing the contents in order.
let m = TextMatch(12, "needle".to_owned());
// Access by field number.
assert_eq!(m.0, 12);
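To compare the aggregate types described in this section side by side, here is a small sketch. The `WidgetBounds` struct and the functions `describe` and `total` are hypothetical names used only for illustration.

```rust
/// Struct: a named type with named fields (hypothetical example type).
#[derive(Debug, PartialEq)]
struct WidgetBounds {
    x: i32,
    y: i32,
    label: &'static str,
    visible: bool,
}

/// Build the same data twice: once as an anonymous tuple, where the reader
/// has to remember what each position means, and once as a struct.
fn describe() -> ((i32, i32, &'static str, bool), WidgetBounds) {
    let as_tuple = (10, 20, "ok", true);
    let as_struct = WidgetBounds { x: 10, y: 20, label: "ok", visible: true };
    (as_tuple, as_struct)
}

/// Array: the length is part of the type, so only four-element arrays
/// are accepted here.
fn total(values: [u32; 4]) -> u32 {
    values.iter().sum()
}
```

Tuple fields are reached by position (`t.0`, `t.2`), while struct fields are reached by name (`s.x`, `s.label`), which is usually easier to read once there are more than two or three fields.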
enums
This brings us to the jewel in the crown of Rust’s type system, the enum. With the basic
form of an enum, it’s hard to see what there is to get excited about. As with other lan‐
guages, the enum allows you to specify a set of mutually exclusive values, possibly with
a numeric value attached:
enum HttpResultCode {
    Ok = 200,
    NotFound = 404,
    Teapot = 418,
}
let code = HttpResultCode::NotFound;
assert_eq!(code as i32, 404);
Because each enum definition creates a distinct type, this can be used to improve read‐
ability and maintainability of functions that take bool arguments. Instead of:
print_page(/* both_sides= */ true, /* color= */ false);
a version that uses a pair of enums:
pub enum Sides {
    Both,
    Single,
}
pub enum Output {
    BlackAndWhite,
    Color,
}
pub fn print_page(sides: Sides, color: Output) {
    // ...
}
is more type-safe and easier to read at the point of invocation:
print_page(Sides::Both, Output::BlackAndWhite);
Unlike the bool version, if a library user were to accidentally flip the order of the
arguments, the compiler would immediately complain:
error[E0308]: arguments to this function are incorrect
   --> src/main.rs:104:9
    |
104 |     print_page(Output::BlackAndWhite, Sides::Single);
    |     ^^^^^^^^^^ ---------------------  ------------- expected `enums::Output`,
    |                |                                    found `enums::Sides`
    |                |
    |                expected `enums::Sides`, found `enums::Output`
    |
note: function defined here
   --> src/main.rs:145:12
    |
145 | pub fn print_page(sides: Sides, color: Output) {
    |        ^^^^^^^^^^ ------------  -------------
help: swap these arguments
    |
104 |     print_page(Sides::Single, Output::BlackAndWhite);
    |               ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Using the newtype pattern to wrap a bool also achieves type safety and
maintainability; it’s generally best to use the newtype pattern if the semantics will
always be Boolean, and to use an enum if there’s a chance that a new alternative (e.g.,
Sides::BothAlternateOrientation) could arise in the future.
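For comparison, a newtype-based version might look like the sketch below. The wrapper names `DoubleSided` and `ColorOutput` are hypothetical, and this `print_page` returns a description string instead of printing anything, purely so the behavior is easy to observe.

```rust
/// Newtype wrapper around `bool` (hypothetical name). It is a distinct type,
/// so it cannot be confused with another Boolean argument.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct DoubleSided(pub bool);

/// A second, unrelated Boolean newtype (hypothetical name).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ColorOutput(pub bool);

/// Hypothetical variant of `print_page` that describes the requested job.
pub fn print_page(sides: DoubleSided, color: ColorOutput) -> String {
    let sides = if sides.0 { "both sides" } else { "single side" };
    let color = if color.0 { "color" } else { "black and white" };
    format!("{sides}, {color}")
}
```

A call such as `print_page(DoubleSided(true), ColorOutput(false))` reads clearly at the call site, and swapping the two arguments is a compile-time type error, just as with the enum version.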
The type safety of Rust’s enums continues with the match expression:
DOES NOT COMPILE
let msg = match code {
    HttpResultCode::Ok => "Ok",
    HttpResultCode::NotFound => "Not found",
    // forgot to deal with the all-important "I'm a teapot" code
};
error[E0004]: non-exhaustive patterns: `HttpResultCode::Teapot` not covered
  --> src/main.rs:44:21
   |
44 |     let msg = match code {
   |                     ^^^^ pattern `HttpResultCode::Teapot` not covered
   |
note: `HttpResultCode` defined here
  --> src/main.rs:10:5
   |
7  | enum HttpResultCode {
   |      --------------
...
10 |     Teapot = 418,
   |     ^^^^^^ not covered
   = note: the matched value is of type `HttpResultCode`
help: ensure that all possible cases are being handled by adding a match arm
      with a wildcard pattern or an explicit pattern as shown
   |
46 ~         HttpResultCode::NotFound => "Not found",
47 ~         HttpResultCode::Teapot => todo!(),
   |
The compiler forces the programmer to consider all of the possibilities that are
represented by the enum, even if the result is just to add a default arm _ => {}. (Note that
modern C++ compilers can and do warn about missing switch arms for enums as well.)
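A version of the match that covers every variant could look like the following sketch. The function name `describe` and the message text for the teapot case are illustrative choices, not taken from the text.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum HttpResultCode {
    Ok = 200,
    NotFound = 404,
    Teapot = 418,
}

/// Map each result code to a message (hypothetical helper). Every variant
/// has an explicit arm, so adding a new variant later makes this function
/// fail to compile until the new case is handled.
fn describe(code: HttpResultCode) -> &'static str {
    match code {
        HttpResultCode::Ok => "Ok",
        HttpResultCode::NotFound => "Not found",
        HttpResultCode::Teapot => "I'm a teapot",
    }
}
```

Listing the variants explicitly, rather than using a `_` wildcard arm, is what lets the compiler point out every match that needs updating when the enum grows.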
enums with Fields
The true power of Rust’s enum feature comes from the fact that each variant can have
data that comes along with it, making it an aggregate type that acts as an algebraic data
type (ADT). In C/C++ terms, it’s like a combination of an enum with a union, only type-safe.
This means that the invariants of the program’s data structures can be encoded into
Rust’s type system; states that don’t comply with those invariants won’t even compile.
A well-designed enum makes the creator’s intent clear to humans as well as to the
compiler:
use std::collections::{HashMap, HashSet};
pub enum SchedulerState {
    Inert,
    Pending(HashSet<Job>),
    Running(HashMap<CpuId, Vec<Job>>),
}
3 The need to consider all possibilities also means that adding a new variant to an existing enum in a library is a breaking change: library clients will need to change their code to cope with the new variant. If an enum is really just a C-like list of related numerical values, this behavior can be avoided by marking it as a non_exhaustive enum.
Just from the type definition, it’s reasonable to guess that Jobs get queued up in the
Pending state until the scheduler is fully active, at which point they’re assigned to
some per-CPU pool.
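To show how code consumes an enum like this, here is a sketch with stand-in definitions. The text does not define `Job` or `CpuId`, so the versions below are assumptions made up for illustration, as is the `job_count` function.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-in for the `Job` type, which the text leaves undefined.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct Job(pub u32);

/// Hypothetical stand-in for the `CpuId` type.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct CpuId(pub u8);

pub enum SchedulerState {
    Inert,
    Pending(HashSet<Job>),
    Running(HashMap<CpuId, Vec<Job>>),
}

/// Count the jobs known to the scheduler, whatever state it is in.
pub fn job_count(state: &SchedulerState) -> usize {
    match state {
        // No data is attached to this variant, so there is nothing to count.
        SchedulerState::Inert => 0,
        // The queued jobs are only reachable in the `Pending` state.
        SchedulerState::Pending(jobs) => jobs.len(),
        // The per-CPU pools are only reachable in the `Running` state.
        SchedulerState::Running(per_cpu) => per_cpu.values().map(Vec::len).sum(),
    }
}
```

Each match arm can only see the data that belongs to its own variant, so there is no way to read the per-CPU pools of a scheduler that is still pending.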
This highlights the central theme of this Item, which is to use Rust’s type system to
express the concepts that are associated with the design of your software.
A dead giveaway for when this is not happening is a comment that explains when
some field or parameter is valid:
UNDESIRED BEHAVIOR
pub struct DisplayProps {
    pub x: u32,
    pub y: u32,
    pub monochrome: bool,
    // `fg_color` must be (0, 0, 0) if `monochrome` is true.
    pub fg_color: RgbColor,
}
This is a prime candidate for replacement with an enum holding data:
pub enum Color {
    Monochrome,
    Foreground(RgbColor),
}
pub struct DisplayProps {
    pub x: u32,
    pub y: u32,
    pub color: Color,
}
This small example illustrates a key piece of advice: make invalid states inexpressible in
your types. Types that support only valid combinations of values mean that whole
classes of errors are rejected by the compiler, leading to smaller and safer code.
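A sketch of how client code might use the enum-based design follows. The `RgbColor` tuple struct is a hypothetical definition (the text does not show one), and `foreground` is an illustrative helper.

```rust
/// Hypothetical definition of `RgbColor`, assumed here to be three 8-bit
/// channels.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct RgbColor(pub u8, pub u8, pub u8);

pub enum Color {
    Monochrome,
    Foreground(RgbColor),
}

/// Return the effective foreground color (hypothetical helper).
pub fn foreground(color: &Color) -> RgbColor {
    match color {
        // The rule that used to live in a comment ("must be (0, 0, 0) if
        // monochrome") is now applied in exactly one place.
        Color::Monochrome => RgbColor(0, 0, 0),
        Color::Foreground(rgb) => *rgb,
    }
}
```

With this shape there is no way to construct a value that is both monochrome and carries a non-black foreground color, so no runtime check for that combination is needed.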
Ubiquitous enum Types
Returning to the power of the enum, there are two concepts that are so common that
Rust’s standard library includes built-in enum types to express them; these types are
ubiquitous in Rust code.
Option<T>
The first concept is that of an Option: either there’s a value of a particular type
(Some(T)) or there isn’t (None). Always use Option for values that can be absent; never
fall back to using sentinel values (-1, nullptr, …) to try to express the same concept
in-band.
There is one subtle point to consider, though. If you’re dealing with a collection of
things, you need to decide whether having zero things in the collection is the same as
not having a collection. For most situations, the distinction doesn’t arise and you can
go ahead and use (say) Vec<Thing>: a count of zero things implies an absence of
things.
However, there are definitely other rare scenarios where the two cases need to be
distinguished with Option<Vec<Thing>>. For example, a cryptographic system might
need to distinguish between an absent payload and “empty payload provided.” (This is
related to the debates around NULL for columns in SQL.)
Similarly, what’s the best choice for a String that might be absent? Does "" or None
make more sense to indicate the absence of a value? Either way works, but
Option<String> clearly communicates the possibility that this value may be absent.
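Both points can be sketched in a few lines. The function names `find_index` and `describe_payload` are hypothetical; the first replaces a sentinel return value with an `Option`, and the second keeps "absent" distinct from "present but empty".

```rust
/// Find the position of `needle` (hypothetical helper). Absence is reported
/// as `None` rather than with a sentinel such as -1.
fn find_index(haystack: &[u32], needle: u32) -> Option<usize> {
    haystack.iter().position(|&v| v == needle)
}

/// Describe a payload (hypothetical helper), treating a missing payload and
/// an empty payload as different cases.
fn describe_payload(payload: Option<&[u8]>) -> String {
    match payload {
        None => "no payload".to_string(),
        Some(bytes) if bytes.is_empty() => "empty payload".to_string(),
        Some(bytes) => format!("{} byte payload", bytes.len()),
    }
}
```

If the distinction between the first two arms of `describe_payload` never matters in a particular design, a plain slice or `Vec` without the `Option` wrapper is the simpler choice.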
Result<T, E>
The second common concept arises from error processing: if a function fails, how
should that failure be reported? Historically, special sentinel values (e.g., -errno
return values from Linux system calls) or global variables (errno for POSIX systems)
were used. More recently, languages that support multiple or tuple return values from
functions (such as Go) may have a convention of returning a (result, error)
pair, assuming the existence of some suitable “zero” value for the result when the
error is non-“zero.”
In Rust, there’s an enum for just this purpose: always encode the result of an operation
that might fail as a Result<T, E>. The T type holds the successful result (in the Ok
variant), and the E type holds error details (in the Err variant) on failure.
Using the standard type makes the intent of the design clear. It also allows the use of
standard transformations and error processing, which in turn makes it possible to
streamline error processing with the ? operator as well.
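As a small sketch of a fallible operation expressed with `Result` and the `?` operator, consider the following. The function name `parse_and_add` is hypothetical; `str::parse` and `std::num::ParseIntError` are standard library items.

```rust
use std::num::ParseIntError;

/// Parse two decimal strings and add them (hypothetical helper).
fn parse_and_add(a: &str, b: &str) -> Result<i32, ParseIntError> {
    // `?` returns early with the `Err` value if parsing fails; otherwise it
    // unwraps the `Ok` value.
    let x: i32 = a.trim().parse()?;
    let y: i32 = b.trim().parse()?;
    Ok(x + y)
}
```

The signature alone tells the caller that the operation can fail and what kind of error to expect, with no need for a sentinel value or a separate error variable.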
Item 2: Use the type system to express common behavior
Item 1 discussed how to express data structures in the type system; this Item moves
on to discuss the encoding of behavior in Rust’s type system.
The mechanisms described in this Item will generally feel familiar, as they all have
direct analogs in other languages:
Functions
The universal mechanism for associating a chunk of code with a name and a
parameter list.
Methods
Functions that are associated with an instance of a particular data structure.
Methods are common in programming languages created after object-orientation
arose as a programming paradigm.
Function pointers
Supported by most languages in the C family, including C++ and Go, as a mechanism
that allows an extra level of indirection when invoking other code.
Closures
Originally most common in the Lisp family of languages but have been retrofitted
to many popular programming languages, including C++ (since C++11) and
Java (since Java 8).
Traits
Describe collections of related functionality that all apply to the same underlying
item. Traits have rough equivalents in many other languages, including abstract
classes in C++ and interfaces in Go and Java.
Of course, all of these mechanisms have Rust-specific details that this Item will cover.
Of the preceding list, traits have the most significance for this book, as they describe
so much of the behavior provided by the Rust compiler and standard library.
The following chapter focuses on Items that give advice on designing and implementing
traits, but their pervasiveness means that they crop up frequently in the other Items in
this chapter too.
Functions and Methods
As with every other programming language, Rust uses functions to organize code into
named chunks for reuse, with inputs to the code expressed as parameters. As with
every other statically typed language, the types of the parameters and the return value
are explicitly specified:
/// Return `x` divided by `y`.
fn div(x: f64, y: f64) -> f64 {
    if y == 0.0 {
        // Terminate the function and return a value.
        return f64::NAN;
    }
    // The last expression in the function body is implicitly returned.
    x / y
}
/// Function called just for its side effects, with no return value.
/// Can also write the return value as `-> ()`.
fn show(x: f64) {
    println!("x = {x}");
}
If a function is intimately involved with a particular data structure, it is expressed as a
method. A method acts on an item of that type, identified by self, and is included
within an impl DataStructure block. This encapsulates related data and code
together in an object-oriented way that’s similar to other languages; however, in Rust,
methods can be added to enum types as well as to struct types, in keeping with the
pervasive nature of Rust’s enum:
enum Shape {
    Rectangle { width: f64, height: f64 },
    Circle { radius: f64 },
}
impl Shape {
    pub fn area(&self) -> f64 {
        match self {
            Shape::Rectangle { width, height } => width * height,
            Shape::Circle { radius } => std::f64::consts::PI * radius * radius,
        }
    }
}
The name of a method creates a label for the behavior it encodes, and the method
signature gives type information for its inputs and outputs. The first input for a
method will be some variant of self, indicating what the method might do to the
data structure:
• A &self parameter indicates that the contents of the data structure may be read
from but will not be modified.
• A &mut self parameter indicates that the method might modify the contents of
the data structure.
• A self parameter indicates that the method consumes the data structure.
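The three receiver forms listed above can be sketched on a single type. The `Counter` struct and its method names are hypothetical, chosen only to contrast the forms.

```rust
/// Hypothetical counter type used to contrast the three receiver forms.
#[derive(Debug)]
pub struct Counter {
    count: u64,
}

impl Counter {
    /// No `self` at all: an associated function, used here as a constructor.
    pub fn new() -> Self {
        Counter { count: 0 }
    }

    /// `&self`: reads the contents without modifying them.
    pub fn get(&self) -> u64 {
        self.count
    }

    /// `&mut self`: may modify the contents.
    pub fn increment(&mut self) {
        self.count += 1;
    }

    /// `self`: consumes the counter, which cannot be used afterward.
    pub fn into_inner(self) -> u64 {
        self.count
    }
}
```

After a call to `into_inner`, any further use of the same `Counter` value is rejected at compile time, because ownership has moved into the method.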
Function Pointers
The previous section described how to associate a name (and a parameter list) with
some code. However, invoking a function always results in the same code being exe‐
cuted; all that changes from invocation to invocation is the data that the function
operates on. That covers a lot of possible scenarios, but what if the code needs to vary
at runtime?
The simplest behavioral abstraction that allows this is the function pointer: a pointer
to (just) some code, with a type that reflects the signature of the function:
fn sum(x: i32, y: i32) -> i32 {
    x + y
}
// Explicit coercion to `fn` type is required...
let op: fn(i32, i32) -> i32 = sum;
The type is checked at compile time, so by the time the program runs, the value is just
the size of a pointer. Function pointers have no other data associated with them, so
they can be treated as values in various ways:
// `fn` types implement `Copy`
let op1 = op;
let op2 = op;
// `fn` types implement `Eq`
assert!(op1 == op2);
// `fn` implements `std::fmt::Pointer`, used by the {:p} format specifier.
println!("op = {:p}", op);
// Example output: "op = 0x101e9aeb0"
One technical detail to watch out for: explicit coercion to a fn type is needed, because
just using the name of a function doesn’t give you something of fn type:
D O E S N O T C O M P I L E
let op1 = sum;
let op2 = sum;
// Both op1 and op2 are of a type that cannot be named in user code,
// and this internal type does not implement Èq`.
assert!(op1 == op2);
error[E0369]: binary operation `==` cannot be applied to typèfn(i32, i32) -> i32 {main::sum}`
--> src/main.rs:102:17
|
102 | assert!(op1 == op2);
| --- ^^ --- fn(i32, i32) -> i32 {main::sum}
| |
| fn(i32, i32) -> i32 {main::sum}
|
help: use parentheses to call these
|
102 | assert!(op1(/* i32 */, /* i32 */) == op2(/* i32 */, /* i32 */));
| ++++++++++++++++++++++ ++++++++++++++++++++++
Instead, the compiler error indicates that the type is something like fn(i32, i32) ->
i32 {main::sum}, a type that’s entirely internal to the compiler (i.e., could not be
written in user code) and that identifies the specific function as well as its signature.
To put it another way, the type of sum encodes both the function’s signature and its
location; this type can be automatically coerced to a fn type.
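As a minimal sketch of code that varies at runtime, the following selects one of two functions by name and returns it as a function pointer. The names BinaryOp, product, and select_op are hypothetical, introduced only for this illustration.

```rust
/// Signature shared by all of the selectable operations.
type BinaryOp = fn(i32, i32) -> i32;

fn sum(x: i32, y: i32) -> i32 {
    x + y
}

fn product(x: i32, y: i32) -> i32 {
    x * y
}

/// Choose an operation at runtime. Each arm converts a distinct function
/// item type into the common `fn` pointer type.
/// (`select_op` and `product` are hypothetical names for illustration.)
fn select_op(name: &str) -> Option<BinaryOp> {
    match name {
        "sum" => Some(sum as BinaryOp),
        "product" => Some(product as BinaryOp),
        _ => None,
    }
}
```

The caller holds only a pointer-sized value and does not need to know which function it refers to; invoking the result of select_op("sum") with (2, 3) gives 5.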
Closures
The bare function pointers are limiting, because the only inputs available to the
invoked function are those that are explicitly passed as parameter values. For exam‐
ple, consider some code that modifies every element of a slice using a function
pointer:
// In real code, an `Iterator` method would be more appropriate.
pub fn modify_all(data: &mut [u32], mutator: fn(u32) -> u32) {
    for value in data {
        *value = mutator(*value);
    }
}
This works for a simple mutation of the slice:
fn add2(v: u32) -> u32 {
    v + 2
}
let mut data = vec![1, 2, 3];
modify_all(&mut data, add2);
assert_eq!(data, vec![3, 4, 5]);
However, if the modification relies on any additional state, it’s not possible to implic‐
itly pass that into the function pointer:
DOES NOT COMPILE
let amount_to_add = 3;
fn add_n(v: u32) -> u32 {
    v + amount_to_add
}
let mut data = vec![1, 2, 3];
modify_all(&mut data, add_n);
assert_eq!(data, vec![4, 5, 6]);
error[E0434]: can't capture dynamic environment in a fn item
--> src/main.rs:125:13
|
125 | v + amount_to_add
| ^^^^^^^^^^^^^
|
= help: use the `|| { ... }` closure form instead
The error message points to the right tool for the job: a closure. A closure is a chunk
of code that looks like the body of a function definition (a lambda expression), except
for the following:
• It can be built as part of an expression, and so it need not have a name associated
with it.
• The input parameters are given in vertical bars |param1, param2| (their associ‐
ated types can usually be automatically deduced by the compiler).
• It can capture parts of the environment around it:
let amount_to_add = 3;
let add_n = |y| {
    // a closure capturing `amount_to_add`
    y + amount_to_add
};
let z = add_n(5);
assert_eq!(z, 8);
To (roughly) understand how the capture works, imagine that the compiler creates a
one-off, internal type that holds all of the parts of the environment that get men‐
tioned in the lambda expression. When the closure is created, an instance of this
ephemeral type is created to hold the relevant values, and when the closure is
invoked, that instance is used as additional context:
let amount_to_add = 3;
// *Rough* equivalent to a capturing closure.
struct InternalContext<'a> {
    // references to captured variables
    amount_to_add: &'a u32,
}
impl<'a> InternalContext<'a> {
    fn internal_op(&self, y: u32) -> u32 {
        // body of the lambda expression
        y + *self.amount_to_add
    }
}
let add_n = InternalContext {
    amount_to_add: &amount_to_add,
};
let z = add_n.internal_op(5);
assert_eq!(z, 8);
The values that are held in this notional context are often references, as here,
but they can also be mutable references to things in the environment, or values that
are moved out of the environment altogether (by using the move keyword before the
input parameters).
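The other two capture modes can be sketched as follows. The function names make_adder and sum_with_closure are hypothetical, chosen for this illustration: the first moves a value into the closure, and the second captures a local variable by mutable reference.

```rust
/// Returns a closure that owns its captured value; `move` transfers
/// `amount` into the closure so it can outlive this function call.
fn make_adder(amount: u32) -> impl Fn(u32) -> u32 {
    move |y| y + amount
}

/// Uses a closure that captures `total` by mutable reference.
fn sum_with_closure(values: &[u32]) -> u32 {
    let mut total = 0;
    // The closure borrows `total` mutably for as long as it is in use.
    let mut add_to_total = |v: u32| total += v;
    for v in values {
        add_to_total(*v);
    }
    // The closure is not used again, so the mutable borrow has ended.
    total
}
```

Without the move keyword, make_adder would not compile, because the closure would hold a reference to a parameter that is dropped when the function returns.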
Returning to the modify_all example, a closure can’t be used where a function
pointer is expected:
error[E0308]: mismatched types
--> src/main.rs:199:31
|
199 | modify_all(&mut data, |y| y + amount_to_add);
| ---------- ^^^^^^^^^^^^^^^^^^^^^ expected fn pointer,
| | found closure
| |
| arguments to this function are incorrect
|
= note: expected fn pointer `fn(u32) -> u32`
found closure `[closure@src/main.rs:199:31: 199:34]`
note: closures can only be coerced to `fn` types if they do not capture any
variables
--> src/main.rs:199:39
|
199 | modify_all(&mut data, |y| y + amount_to_add);
| ^^^^^^^^^^^^^ `amount_to_add`
| captured here
note: function defined here
--> src/main.rs:60:12
|
60 | pub fn modify_all(data: &mut [u32], mutator: fn(u32) -> u32) {
| ^^^^^^^^^^ -----------------------
Instead, the code that receives the closure has to accept an instance of one of the Fn*
traits:
pub fn modify_all<F>(data: &mut [u32], mut mutator: F)
where
    F: FnMut(u32) -> u32,
{
    for value in data {
        *value = mutator(*value);
    }
}
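As a rough illustration of what the generic version accepts, the sketch below repeats modify_all and calls it with a bare function, a closure that reads captured state, and a closure that mutates captured state. The demo function is a hypothetical wrapper added only so that the statements have somewhere to live.

```rust
pub fn modify_all<F>(data: &mut [u32], mut mutator: F)
where
    F: FnMut(u32) -> u32,
{
    for value in data {
        *value = mutator(*value);
    }
}

fn add2(v: u32) -> u32 {
    v + 2
}

/// Exercises `modify_all` with three kinds of callable; returns the
/// final data and the number of calls seen by the last closure.
fn demo() -> (Vec<u32>, u32) {
    let mut data = vec![1, 2, 3];
    // A bare function still works, because `fn` items implement `FnMut`.
    modify_all(&mut data, add2); // data is now [3, 4, 5]
    // A closure that reads captured state.
    let amount_to_add = 3;
    modify_all(&mut data, |v| v + amount_to_add); // now [6, 7, 8]
    // A closure that mutates captured state.
    let mut calls = 0;
    modify_all(&mut data, |v| {
        calls += 1;
        v * 2
    }); // now [12, 14, 16]
    (data, calls)
}
```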
Rust has three different Fn* traits, which between them express some distinctions
around this environment-capturing behavior:
FnOnce
Describes a closure that can be called only once. If some part of the environment
is moved into the closure’s context, and the closure’s body subsequently moves it
out of the closure’s context, then those moves can happen only once—there’s no
other copy of the source item to move from—and so the closure can be invoked
only once.
FnMut
Describes a closure that can be called repeatedly and that can make changes to its
environment because it mutably borrows from the environment.
Fn
Describes a closure that can be called repeatedly and that only borrows values
from the environment immutably.
The compiler automatically implements the appropriate subset of these Fn* traits for
any lambda expression in the code; it’s not possible to manually implement any of
these traits (unlike C++’s operator() overload).
Returning to the preceding rough mental model of closures, which of the traits the
compiler auto-implements roughly corresponds to whether the captured environ‐
mental context has these elements:
FnOnce
Any moved values
FnMut
Any mutable references to values (&mut T)
Fn
Only normal references to values (&T)
The latter two traits in this list each have a trait bound of the preceding trait, which
makes sense when you consider the things that use the closures:
• If something expects to call a closure only once (indicated by receiving a FnOnce),
it’s OK to pass it a closure that’s capable of being repeatedly called (FnMut).
• If something expects to repeatedly call a closure that might mutate its environ‐
ment (indicated by receiving a FnMut), it’s OK to pass it a closure that doesn’t need
to mutate its environment (Fn).
The bare function pointer type fn also notionally belongs at the end of this list; any
(not-unsafe) fn type automatically implements all of the Fn* traits, because it bor‐
rows nothing from the environment.
4 At least not in stable Rust at the time of writing; experimental features may change this in the future.
As a result, when writing code that accepts closures, use the most general Fn* trait that works, to allow the greatest flexibility for callers—for example, accept FnOnce for closures that are used only once. The same reasoning also leads to advice to prefer Fn*
trait bounds over bare function pointers (fn).
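To make the advice concrete, here is a small sketch of a function that invokes its argument exactly once and so asks only for FnOnce. The names run_once, hello, and demo_run_once are hypothetical and exist only for this example.

```rust
/// Calls the closure exactly once, so `FnOnce` is the most general bound.
fn run_once<F>(f: F) -> String
where
    F: FnOnce() -> String,
{
    f()
}

fn hello() -> String {
    "hello".to_string()
}

fn demo_run_once() -> Vec<String> {
    let owned = String::from("moved");
    let borrowed = String::from("borrowed");
    vec![
        // Moves `owned` out of its context: implements only `FnOnce`.
        run_once(move || owned),
        // Only reads `borrowed`: implements `Fn`, `FnMut`, and `FnOnce`.
        run_once(|| borrowed.clone()),
        // A plain function also implements all three traits.
        run_once(hello),
    ]
}
```

Had run_once demanded Fn instead, the first call would be rejected even though the function never needs to invoke its argument more than once.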
Traits
The Fn* traits are more flexible than bare function pointers, but they can still describe
only the behavior of a single function, and even then only in terms of the function’s
signature.
However, they are themselves examples of another mechanism for describing behav‐
ior in Rust’s type system, the trait. A trait defines a set of related functions that some
underlying item makes publicly available; moreover, the functions are typically (but
don’t have to be) methods, taking some variant of self as their first argument.
Each function in a trait also has a name, providing a label that allows the compiler to
disambiguate functions with the same signature, and more importantly, that allows
programmers to deduce the intent of the function.
A Rust trait is roughly analogous to an “interface” in Go and Java, or to an “abstract
class” (all virtual methods, no data members) in C++. Implementations of the trait
must provide all the functions (but note that the trait definition can include a default
implementation) and can also have associated data that those implementations
make use of. This means that code and data get encapsulated together in a
common abstraction, in a somewhat object-oriented (OO) manner.
Code that accepts a struct and calls functions on it is constrained to only ever work
with that specific type. If there are multiple types that implement common behavior,
then it is more flexible to define a trait that encapsulates that common behavior, and
have the code make use of the trait’s functions rather than functions involving a spe‐
cific struct.
This leads to the same kind of advice that turns up for other OO-influenced languages:
prefer accepting trait types over concrete types if future flexibility is anticipated.
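A minimal sketch of this advice, using an invented Animal trait (all of the names here are hypothetical): the trait has one required function and one with a default implementation, and the introduce function accepts anything that implements the trait rather than one specific struct.

```rust
/// Hypothetical trait with one required and one provided method.
pub trait Animal {
    /// Required: every implementation must supply this.
    fn name(&self) -> String;

    /// Provided: a default that implementations may override.
    fn greeting(&self) -> String {
        format!("{} makes a sound", self.name())
    }
}

pub struct Dog;
pub struct Fish;

impl Animal for Dog {
    fn name(&self) -> String {
        "dog".to_string()
    }
    // Overrides the default implementation.
    fn greeting(&self) -> String {
        "dog says woof".to_string()
    }
}

impl Animal for Fish {
    fn name(&self) -> String {
        "fish".to_string()
    }
    // Relies on the default `greeting()`.
}

/// Works with any type that implements `Animal`, present or future.
pub fn introduce(animal: &impl Animal) -> String {
    animal.greeting()
}
```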
Sometimes, there is some behavior that you want to distinguish in the type system,
but it cannot be expressed as some specific function signature in a trait definition. For
example, consider a Sort trait for sorting collections; an implementation might be
stable (elements that compare the same will appear in the same order before and after
the sort), but there’s no way to express this in the sort method arguments.
5 For example, Joshua Bloch’s Effective Java (3rd edition, Addison-Wesley) includes Item 64: Refer to objects by their interfaces.
In this case, it’s still worth using the type system to track this requirement, using a
marker trait:
pub trait Sort {
    /// Rearrange contents into sorted order.
    fn sort(&mut self);
}
/// Marker trait to indicate that a [`Sort`] sorts stably.
pub trait StableSort: Sort {}
A marker trait has no functions, but an implementation still has to declare that it is
implementing the trait—which acts as a promise from the implementer: “I solemnly
swear that my implementation sorts stably.” Code that relies on a stable sort can then
specify the StableSort trait bound, relying on the honor system to preserve its invar‐
iants. Use marker traits to distinguish behaviors that cannot be expressed in the trait
function signatures.
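The sketch below shows how an implementer might make that promise and how calling code might rely on it. The Records type and the sort_preserving_ties function are hypothetical additions for illustration; the two traits are repeated so that the example stands alone.

```rust
pub trait Sort {
    /// Rearrange contents into sorted order.
    fn sort(&mut self);
}

/// Marker trait to indicate that a [`Sort`] sorts stably.
pub trait StableSort: Sort {}

/// Hypothetical collection of (key, label) pairs, ordered by key only.
pub struct Records(pub Vec<(u32, &'static str)>);

impl Sort for Records {
    fn sort(&mut self) {
        // The standard library documents `sort_by_key` as a stable sort.
        self.0.sort_by_key(|(key, _label)| *key);
    }
}

// The promise: the compiler checks nothing here, the implementer vouches.
impl StableSort for Records {}

/// Only accepts collections whose sort is declared to be stable.
pub fn sort_preserving_ties<T: StableSort>(collection: &mut T) {
    collection.sort();
}
```

A type that implemented Sort with an unstable algorithm would simply omit the StableSort implementation, and any attempt to pass it to sort_preserving_ties would be rejected at compile time.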
Once behavior has been encapsulated into Rust’s type system as a trait, it can be used
in two ways:
• As a trait bound, which constrains what types are acceptable for a generic data
type or function at compile time
• As a trait object, which constrains what types can be stored or passed to a func‐
tion at runtime
The following sections describe these two possibilities, and a later Item gives more detail
about the trade-offs between them.
Trait bounds
A trait bound indicates that generic code that is parameterized by some type T can be
used only when that type T implements some specific trait. The presence of the trait
bound means that the implementation of the generic can use the functions from that
trait, secure in the knowledge that the compiler will ensure that any T that compiles
does indeed have those functions. This check happens at compile time, when the
generic is monomorphized—converted from the generic code that deals with an arbi‐
trary type T into specific code that deals with one particular SomeType (what C++
would call template instantiation).
This restriction on the target type T is explicit, encoded in the trait bounds: the trait
can be implemented only by types that satisfy the trait bounds. This contrasts with
the equivalent situation in C++, where the constraints on the type T used in a
template<typename T> are implicit: C++ template code still compiles only if all of the
referenced functions are available at compile time, but the checks are purely based
on function name and signature. (This means that a C++ template that uses t.pop()
might compile for a T type parameter of either Stack or Balloon—which is unlikely
to be desired behavior.)
The need for explicit trait bounds also means that a large fraction of generics use trait
bounds. To see why this is, turn the observation around and consider what can be
done with a struct Thing<T> where there are no trait bounds on T. Without a trait
bound, the Thing can perform only operations that apply to any type T—basically just
moving or dropping the value. This in turn allows for generic containers, collections,
and smart pointers, but not much else. Anything that uses the type T is going to need
a trait bound:
pub fn dump_sorted<T>(mut collection: T)
where
    T: Sort + IntoIterator,
    T::Item: std::fmt::Debug,
{
    // Next line requires `T: Sort` trait bound.
    collection.sort();
    // Next line requires `T: IntoIterator` trait bound.
    for item in collection {
        // Next line requires `T::Item : Debug` trait bound
        println!("{:?}", item);
    }
}
So the advice here is to use trait bounds to express requirements on the types used in
generics, but it’s easy advice to follow—the compiler will force you to comply with it
regardless.
Trait objects
A trait object is the other way to make use of the encapsulation defined by a trait, but
here, different possible implementations of the trait are chosen at runtime rather than
compile time. This dynamic dispatch is analogous to using virtual functions in C++,
and under the covers, Rust has “vtable” objects that are roughly analogous to those in
C++.
This dynamic aspect of trait objects also means that they always have to be handled
indirectly, via a reference (e.g., &dyn Trait) or a pointer (e.g., Box<dyn Trait>) of
some kind. The reason is that the size of the object implementing the trait isn’t known
6 The addition of concepts in C++20 allows explicit specification of constraints on template types, but the checks are still performed only when the template is instantiated, not when it is declared.
at compile time—it could be a giant struct or a tiny enum—so there’s no way to allo‐
cate the right amount of space for a bare trait object.
Not knowing the size of the concrete object also means that traits used as trait objects
cannot have functions that return the Self type or arguments (other than the
receiver—the object on which the method is being invoked) that use Self. The reason
is that the compiled-in-advance code that uses the trait object would have no idea
how big that Self might be.
A trait that has a generic function fn some_fn<T>(t:T) allows for the possibility of
an infinite number of implemented functions, for all of the different types T that
might exist. This is fine for a trait used as a trait bound, because the infinite set of
possibly invoked generic functions becomes a finite set of actually invoked generic
functions at compile time. The same is not true for a trait object: the code available at
compile time has to cope with all possible Ts that might arrive at runtime.
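One way these restrictions show up in practice is sketched below, using a hypothetical Shape trait. The area method can be called through a trait object, while the scaled method returns Self and is therefore marked with a Self: Sized bound, which keeps it out of the trait object's interface but leaves it available on concrete types.

```rust
pub trait Shape {
    fn area(&self) -> f64;

    /// Returns `Self`, so it is excluded from trait objects by the
    /// `Self: Sized` bound; it remains usable on concrete types.
    fn scaled(&self, factor: f64) -> Self
    where
        Self: Sized;
}

pub struct Square(pub f64);
pub struct Circle(pub f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
    fn scaled(&self, factor: f64) -> Self {
        Square(self.0 * factor)
    }
}

impl Shape for Circle {
    fn area(&self) -> f64 {
        std::f64::consts::PI * self.0 * self.0
    }
    fn scaled(&self, factor: f64) -> Self {
        Circle(self.0 * factor)
    }
}

/// The concrete type behind each `Box<dyn Shape>` is chosen at runtime,
/// and each call to `area()` is dispatched dynamically.
pub fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}
```

The boxes all have the same pointer-based size regardless of what they hold, which is what allows differently sized shapes to share one slice.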
These two restrictions—no use of Self and no generic functions—are combined in
the concept of object safety.
Item 3: Prefer Option and Result transforms
over explicit match expressions
An earlier Item described enum and showed how match expressions force the
programmer to take all possibilities into account. It also introduced the two
ubiquitous enums that the Rust standard library provides:
Option<T>
To express that a value (of type T) may or may not be present
Result<T, E>
For when an operation that returns a value (of type T) may not succeed and may
instead return an error (of type E)
This Item explores situations where you should try to avoid explicit match expres‐
sions for these particular enums, preferring instead to use various transformation
methods that the standard library provides for these types. Using these transforma‐
tion methods (which are typically themselves implemented as match expressions
under the covers) leads to code that is more compact and idiomatic and has clearer
intent.
The first situation where a match is unnecessary is when only the value is relevant and
the absence of value (and any associated error) can just be ignored:
struct S {
    field: Option<i32>,
}
let s = S { field: Some(42) };
match &s.field {
    Some(i) => println!("field is {i}"),
    None => {}
}
For this situation, an if let expression is one line shorter and, more importantly, clearer:
if let Some(i) = &s.field {
    println!("field is {i}");
}
However, most of the time the programmer needs to provide the corresponding else
arm: the absence of a value (Option::None), possibly with an associated error
(Result::Err(e)), is something that the programmer needs to deal with. Designing
software to cope with failure paths is hard, and most of that is essential complexity
that no amount of syntactic support can help with—specifically, deciding what should
happen if an operation fails.
In some situations, the right decision is to perform an ostrich maneuver—put our
heads in the sand and explicitly not cope with failure. You can’t completely ignore the
error arm, because Rust requires that the code deal with both variants of the Result
enum, but you can choose to treat a failure as fatal. Performing a panic! on failure
means that the program terminates, but the rest of the code can then be written with
the assumption of success. Doing this with an explicit match would be needlessly
verbose:
let result = std::fs::File::open("/etc/passwd");
let f = match result {
    Ok(f) => f,
    Err(_e) => panic!("Failed to open /etc/passwd!"),
};
// Assume `f` is a valid `std::fs::File` from here onward.
Both Option and Result provide a pair of methods that extract their inner value and
panic! if it’s absent: unwrap and expect. The latter allows the error message on failure to be personalized, but in either case, the resulting code is shorter and simpler—
error handling is delegated to the .unwrap() suffix (but is still present):
let f = std::fs::File::open("/etc/passwd").unwrap();
Be clear, though: these helper functions still panic!, so choosing to use them is the
same as choosing to panic!.
However, in many situations, the right decision for error handling is to defer the deci‐
sion to somebody else. This is particularly true when writing a library, where the code
may be used in all sorts of different environments that can’t be foreseen by the library
Item 3: Prefer Option and Result transforms over explicit match expressions | 21
author. To make that somebody else’s job easier, prefer Result to Option for expressing errors, even though this may involve conversions between different error types (Item 4).
Of course, this opens up the question, What counts as an error? In this example, fail‐
ing to open a file is definitely an error, and the details of that error (no such file? per‐
mission denied?) can help the user decide what to do next. On the other hand, failing
to retrieve the first element of a slice because that slice is empty isn’t really an error, and so it is expressed as an Option return type in the standard library. Choosing between the two possibilities requires judgment, but lean toward Result if an
error might communicate anything useful.
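When an absent value does count as an error for the caller, an Option can be converted into a Result at the boundary. The following is a minimal sketch under that assumption; the LookupError type and the first_number function are hypothetical names for this illustration.

```rust
/// Hypothetical error type that explains what went wrong.
#[derive(Debug, PartialEq)]
pub enum LookupError {
    Empty,
    NotANumber(String),
}

/// Parse the first entry of `fields` as a number.
pub fn first_number(fields: &[&str]) -> Result<i64, LookupError> {
    // `first()` yields an `Option`; `ok_or` attaches an error for `None`.
    let text = fields.first().ok_or(LookupError::Empty)?;
    text.parse::<i64>()
        .map_err(|_| LookupError::NotANumber(text.to_string()))
}
```

The two failure modes are now distinguishable by the caller, which a bare None could not convey.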
Result also has a #[must_use] attribute to nudge library users in the right direction—if the code using the returned Result ignores it, the compiler will generate
a warning:
warning: unused `Result` that must be used
--> src/main.rs:63:5
|
63 | f.set_len(0); // Truncate the file
| ^^^^^^^^^^^^
|
= note: this `Result` may be an `Err` variant, which should be handled
= note: `#[warn(unused_must_use)]` on by default
help: use `let _ = ...` to ignore the resulting value
|
63 | let _ = f.set_len(0); // Truncate the file
| +++++++
Explicitly using a match allows an error to propagate, but at the cost of some visible
boilerplate:
pub fn find_user(username: &str) -> Result<UserId, std::io::Error> {
    let f = match std::fs::File::open("/etc/passwd") {
        Ok(f) => f,
        Err(e) => return Err(From::from(e)),
    };
    // ...
}
The key ingredient for reducing boilerplate is Rust’s question mark operator, ?. This piece of syntactic sugar takes care of matching the Err arm, transforming the error
type if necessary, and building the return Err(...) expression, all in a single
character:
pub fn find_user(username: &str) -> Result<UserId, std::io::Error> {
    let f = std::fs::File::open("/etc/passwd")?;
    // ...
}
Newcomers to Rust sometimes find this disconcerting: the question mark can be hard
to spot on first glance, leading to disquiet as to how the code can possibly work.
However, even with a single character, the type system is still at work, ensuring that
all of the possibilities expressed in the relevant types are covered—leaving
the programmer to focus on the mainline code path without distractions.
What’s more, there’s generally no cost to these apparent method invocations: they are
all generic functions marked as #[inline], so the generated code will typically compile to machine code that’s identical to the manual version.
These two factors taken together mean that you should prefer Option and Result
transforms over explicit match expressions.
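For a version of the comparison that can be compiled without a UserId type or a password file, here is a small sketch using string parsing. The function names add_verbose and add_concise are hypothetical; the two functions are intended to behave identically.

```rust
use std::num::ParseIntError;

/// Explicit `match` version: each fallible step needs its own arms.
pub fn add_verbose(a: &str, b: &str) -> Result<i32, ParseIntError> {
    let x = match a.parse::<i32>() {
        Ok(x) => x,
        Err(e) => return Err(e),
    };
    let y = match b.parse::<i32>() {
        Ok(y) => y,
        Err(e) => return Err(e),
    };
    Ok(x + y)
}

/// Question mark version: same behavior, with error paths handled by `?`.
pub fn add_concise(a: &str, b: &str) -> Result<i32, ParseIntError> {
    Ok(a.parse::<i32>()? + b.parse::<i32>()?)
}
```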
In the previous example, the error types lined up: both the inner and outer methods
expressed errors as std::io::Error. That’s often not the case: one function may accumulate errors from a variety of different sublibraries, each of which uses different
error types.
Error mapping in general is discussed in Item 4; for now, just be aware that a
manual mapping:
pub fn find_user(username: &str) -> Result<UserId, String> {
    let f = match std::fs::File::open("/etc/passwd") {
        Ok(f) => f,
        Err(e) => {
            return Err(format!("Failed to open password file: {:?}", e))
        }
    };
    // ...
}
could be more succinctly and idiomatically expressed with the following .map_err()
transformation:
pub fn find_user(username: &str) -> Result<UserId, String> {
    let f = std::fs::File::open("/etc/passwd")
        .map_err(|e| format!("Failed to open password file: {:?}", e))?;
    // ...
}
Better still, even this may not be necessary—if the outer error type can be created
from the inner error type via an implementation of the From standard trait,
then the compiler will automatically perform the conversion without the need for a
call to .map_err().
These kinds of transformations generalize more widely. The question mark operator
is a big hammer; use transformation methods on Option and Result types to maneu‐
ver them into a position where they can be a nail.
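As an illustration of that maneuvering, the sketch below chains several transformation methods so that ? can do the propagation. It assumes a made-up configuration format of key=value lines; port_for and port_or_default are hypothetical names.

```rust
/// Look up `key` in `key=value` lines and parse its value as a port.
pub fn port_for(config: &str, key: &str) -> Result<u16, String> {
    let value = config
        .lines()
        // `split_once` returns `Option<(&str, &str)>`; skip other lines.
        .filter_map(|line| line.split_once('='))
        .find(|(k, _)| k.trim() == key)
        .map(|(_, v)| v.trim())
        // Option -> Result, so that `?` can propagate the failure.
        .ok_or_else(|| format!("no entry for {key}"))?;
    value
        .parse::<u16>()
        // Convert the error type to match the function's return type.
        .map_err(|e| format!("bad value {value:?} for {key}: {e}"))
}

/// Falls back to a default instead of propagating the failure.
pub fn port_or_default(config: &str, key: &str) -> u16 {
    port_for(config, key).unwrap_or(8080)
}
```

Neither function contains a match expression, yet every absent value and every parse failure is still accounted for.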
The standard library provides a wide variety of these transformation methods to
make this possible. Figure 1-1 shows some of the most common methods (light rectangles)
that transform between the relevant types (dark rectangles). Methods that can
panic! are marked with an asterisk.
Figure 1-1. Option and Result transformations
One common situation the diagram doesn’t cover deals with references. For example,
consider a structure that optionally holds some data:
struct InputData {
    payload: Option<Vec<u8>>,
}
A method on this struct that tries to pass the payload to an encryption function with
signature (&[u8]) -> Vec<u8> fails if there’s a naive attempt to take a reference:
DOES NOT COMPILE
impl InputData {
    pub fn encrypted(&self) -> Vec<u8> {
        encrypt(&self.payload.unwrap_or(vec![]))
    }
}
7 The t documentation.
error[E0507]: cannot move out of `self.payload` which is behind a shared
reference
--> src/main.rs:15:18
|
15 | encrypt(&self.payload.unwrap_or(vec![]))
| ^^^^^^^^^^^^ move occurs because `self.payload` has type
| `Option<Vec<u8>>`, which does not implement the
| `Copy` trait
The right tool for this is the as_ref() method on Option. This method converts a reference-to-an-Option into an Option-of-a-reference:
pub fn encrypted(&self) -> Vec<u8> {
    encrypt(self.payload.as_ref().unwrap_or(&vec![]))
}
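A self-contained sketch of this fix follows. The encrypt function here is a hypothetical stand-in (a fixed XOR, not real encryption), and the second method shows a related standard library method, as_deref(), as one possible alternative that produces an Option<&[u8]> directly.

```rust
/// Stand-in for a real encryption function (hypothetical: it just
/// XORs each byte with a fixed value).
fn encrypt(data: &[u8]) -> Vec<u8> {
    data.iter().map(|b| b ^ 0x5a).collect()
}

pub struct InputData {
    pub payload: Option<Vec<u8>>,
}

impl InputData {
    /// `as_ref()` turns `&Option<Vec<u8>>` into `Option<&Vec<u8>>`.
    pub fn encrypted(&self) -> Vec<u8> {
        encrypt(self.payload.as_ref().unwrap_or(&vec![]))
    }

    /// `as_deref()` goes one step further, to `Option<&[u8]>`, which
    /// avoids building a temporary `Vec` for the default.
    pub fn encrypted_deref(&self) -> Vec<u8> {
        encrypt(self.payload.as_deref().unwrap_or(&[]))
    }
}
```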
Things to Remember
• Get used to the transformations of Option and Result, and prefer Result to
Option. Use .as_ref() as needed when transformations involve references.
• Use these transformations in preference to explicit match operations on Option
and Result.
• In particular, use these transformations to convert result types into a form where
the ? operator applies.
Item 4: Prefer idiomatic Error types
Item 3 described how to use the transformations that the standard library provides
for the Option and Result types to allow concise, idiomatic handling of result types
using the ? operator. It stopped short of discussing how best to handle the variety of
different error types E that arise as the second type argument of a Result<T, E>;
that’s the subject of this Item.
This is relevant only when there are a variety of different error types in play. If all of
the different errors that a function encounters are already of the same type, it can just
return that type. When there are errors of different types, there’s a decision to make
about whether the suberror type information should be preserved.
8 Note that this method is separate from the AsRef trait, even though the method name is the same.
Item 4: Prefer idiomatic Error types | 25
The Error Trait
It’s always good to understand what the standard traits involve, and the relevant
trait here is std::error::Error. The E type parameter for a Result doesn’t have to be a type that implements Error, but it’s a common convention that allows
wrappers to express appropriate trait bounds—so prefer to implement Error for your
error types.
The first thing to notice is that the only hard requirement for Error types is the trait
bounds: any type that implements Error also has to implement the following traits:
• The Display trait, meaning that it can be format!ed with {}
• The Debug trait, meaning that it can be format!ed with {:?}
In other words, it should be possible to display Error types to both the user and the
programmer.
The only method in the trait is source(), which allows an Error type to expose an inner, nested error. This method is optional—it comes with a default implementation
returning None, indicating that inner error information isn’t available.
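To show what source() makes possible, here is a minimal sketch of an error type that exposes a nested error, together with a helper that walks the chain. ConfigError and error_chain are hypothetical names for this illustration.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical error type that wraps a lower-level parse failure.
#[derive(Debug)]
pub struct ConfigError {
    pub inner: std::num::ParseIntError,
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "invalid configuration value")
    }
}

impl Error for ConfigError {
    // Override the default (which returns `None`) to expose the cause.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.inner)
    }
}

/// Render an error and all of its nested sources, outermost first.
pub fn error_chain(err: &dyn Error) -> Vec<String> {
    let mut messages = vec![err.to_string()];
    let mut current = err.source();
    while let Some(inner) = current {
        messages.push(inner.to_string());
        current = inner.source();
    }
    messages
}
```

Because error_chain relies only on the Error trait and its Display bound, it works for any error type that follows the convention.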
One final thing to note: if you’re writing code for a no_std environment, it
may not be possible to implement Error—the Error trait is currently implemented in
std, not core, and so is not available.
Minimal Errors
If nested error information isn’t needed, then an implementation of the Error type
need not be much more than a String—one rare occasion where a “stringly typed”
variable might be appropriate. It does need to be a little more than a String though;
while it’s possible to use String as the E type parameter:
pub fn find_user(username: &str) -> Result<UserId, String> {
    let f = std::fs::File::open("/etc/passwd")
        .map_err(|e| format!("Failed to open password file: {:?}", e))?;
    // ...
}
a String doesn’t implement Error, which we’d prefer so that other areas of code can
deal with Errors. It’s not possible to impl Error for String, because neither the trait
nor the type belong to us (the so-called orphan rule):
9 Or at least the only nondeprecated, stable method.
10 At the time of writing, Error has been moved to core but is not yet available in stable Rust.
DOES NOT COMPILE
impl std::error::Error for String {}
error[E0117]: only traits defined in the current crate can be implemented for
types defined outside of the crate
--> src/main.rs:18:5
|
18 | impl std::error::Error for String {}
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^------
| | |
| | `String` is not defined in the current crate
| impl doesn't use only types from inside the current crate
|
= note: define and implement a trait or new type instead
A type alias doesn’t help either, because it doesn’t create a new type and so doesn’t change the error message:
DOES NOT COMPILE
pub type MyError = String;
impl std::error::Error for MyError {}
error[E0117]: only traits defined in the current crate can be implemented for
types defined outside of the crate
--> src/main.rs:41:5
|
41 | impl std::error::Error for MyError {}
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^-------
| | |
| | `String` is not defined in the current crate
| impl doesn't use only types from inside the current crate
|
= note: define and implement a trait or new type instead
As usual, the compiler error message gives a hint to solving the problem. Defining a
tuple struct that wraps the String type (the “newtype pattern”) allows the
Error trait to be implemented, provided that Debug and Display are implemented
too:
#[derive(Debug)]
pub struct MyError(String);
impl std::fmt::Display for MyError {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        write!(f, "{}", self.0)
    }
}
impl std::error::Error for MyError {}
pub fn find_user(username: &str) -> Result<UserId, MyError> {
    let f = std::fs::File::open("/etc/passwd").map_err(|e| {
        MyError(format!("Failed to open password file: {:?}", e))
    })?;
    // ...
}
For convenience, it may make sense to implement the From<String> trait to allow
string values to be easily converted into MyError:
impl From<String> for MyError {
    fn from(msg: String) -> Self {
        Self(msg)
    }
}
When it encounters the question mark operator (?), the compiler will automatically
apply any relevant From trait implementations that are needed to reach the destina‐
tion error return type. This allows further minimization:
pub fn find_user(username: &str) -> Result<UserId, MyError> {
    let f = std::fs::File::open("/etc/passwd")
        .map_err(|e| format!("Failed to open password file: {:?}", e))?;
    // ...
}
The error path here covers the following steps:
• File::open returns an error of type std::io::Error.
• format! converts this to a String, using the Debug implementation of
std::io::Error.
• ? makes the compiler look for and use a From implementation that can take it
from String to MyError.
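The same steps can be exercised end to end without a UserId type or a file on disk. In this sketch, parse_age and is_adult are hypothetical functions: the first produces a plain String error, and the second relies on ? to convert that String into a MyError.

```rust
#[derive(Debug)]
pub struct MyError(String);

impl std::fmt::Display for MyError {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        write!(f, "{}", self.0)
    }
}

impl std::error::Error for MyError {}

impl From<String> for MyError {
    fn from(msg: String) -> Self {
        Self(msg)
    }
}

/// Hypothetical helper whose error type is a plain `String`.
fn parse_age(text: &str) -> Result<u8, String> {
    text.trim()
        .parse::<u8>()
        .map_err(|e| format!("bad age {text:?}: {e}"))
}

pub fn is_adult(text: &str) -> Result<bool, MyError> {
    // `?` converts the `String` error via `From<String> for MyError`.
    let age = parse_age(text)?;
    Ok(age >= 18)
}
```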
Nested Errors
The alternative scenario is where the content of nested errors is important enough
that it should be preserved and made available to the caller.
Consider a library function that attempts to return the first line of a file as a string, as
long as the line is not too long. A moment’s thought reveals (at least) three distinct
types of failure that could occur:
• The file might not exist or might be inaccessible for reading.
• The file might contain data that isn’t valid UTF-8 and so can’t be converted into a
String.
• The file might have a first line that is too long.
The type system can express and encompass all of these
possibilities as an enum:
#[derive(Debug)]
pub enum MyError {
    Io(std::io::Error),
    Utf8(std::string::FromUtf8Error),
    General(String),
}
This enum definition includes a derive(Debug), but to satisfy the Error trait, a
Display implementation is also needed:
impl std::fmt::Display for MyError {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        match self {
            MyError::Io(e) => write!(f, "IO error: {}", e),
            MyError::Utf8(e) => write!(f, "UTF-8 error: {}", e),
            MyError::General(s) => write!(f, "General error: {}", s),
        }
    }
}
It also makes sense to override the default source() implementation for easy access
to nested errors:
use std::error::Error;
impl Error for MyError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            MyError::Io(e) => Some(e),
            MyError::Utf8(e) => Some(e),
            MyError::General(_) => None,
        }
    }
}
The use of an enum allows the error handling to be concise while still preserving all of
the type information across different classes of error:
use std::io::BufRead; // for `.read_until()`
/// Maximum supported line length.
const MAX_LEN: usize = 1024;
/// Return the first line of the given file.
pub fn first_line(filename: &str) -> Result<String, MyError> {
    let file = std::fs::File::open(filename).map_err(MyError::Io)?;
    let mut reader = std::io::BufReader::new(file);
    // (A real implementation could just use `reader.read_line()`)
    let mut buf = vec![];
    let len = reader.read_until(b'\n', &mut buf).map_err(MyError::Io)?;
    let result = String::from_utf8(buf).map_err(MyError::Utf8)?;
    if result.len() > MAX_LEN {
        return Err(MyError::General(format!("Line too long: {}", len)));
    }
    Ok(result)
}
It’s also a good idea to implement the From trait for all of the suberror types:
impl From<std::io::Error> for MyError {
    fn from(e: std::io::Error) -> Self {
        Self::Io(e)
    }
}
impl From<std::string::FromUtf8Error> for MyError {
    fn from(e: std::string::FromUtf8Error) -> Self {
        Self::Utf8(e)
    }
}
This prevents library users from suffering under the orphan rules themselves: they
aren’t allowed to implement From on MyError, because both the trait and the struct
are external to them.
Better still, implementing From allows for even more concision, because the question mark
operator will automatically perform any necessary From conversions, removing the need for .map_err():
use std::io::BufRead; // for `.read_until()`
/// Maximum supported line length.
pub const MAX_LEN: usize = 1024;
/// Return the first line of the given file.
pub fn first_line(filename: &str) -> Result<String, MyError> {
    let file = std::fs::File::open(filename)?; // `From<std::io::Error>`
    let mut reader = std::io::BufReader::new(file);
    let mut buf = vec![];
    let len = reader.read_until(b'\n', &mut buf)?; // `From<std::io::Error>`
    let result = String::from_utf8(buf)?; // `From<string::FromUtf8Error>`
    if result.len() > MAX_LEN {
        return Err(MyError::General(format!("Line too long: {}", len)));
    }
    Ok(result)
}
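To make the conversion step visible, the sketch below writes out by hand roughly what the question mark operator does with an error value. The ParseError type and both function names are hypothetical, and the hand-written match is only an approximation of the compiler's actual expansion.

```rust
use std::num::ParseIntError;

/// Hypothetical error type with a single wrapped suberror.
#[derive(Debug)]
pub enum ParseError {
    Int(ParseIntError),
}

impl From<ParseIntError> for ParseError {
    fn from(e: ParseIntError) -> Self {
        ParseError::Int(e)
    }
}

/// Version using the question mark operator.
pub fn parse_with_question_mark(s: &str) -> Result<i32, ParseError> {
    let n = s.trim().parse::<i32>()?; // `From<ParseIntError>` applied here
    Ok(n)
}

/// Approximately what the `?` above does, written out by hand.
pub fn parse_by_hand(s: &str) -> Result<i32, ParseError> {
    let n = match s.trim().parse::<i32>() {
        Ok(v) => v,
        // The error is converted via `From` before the early return.
        Err(e) => return Err(From::from(e)),
    };
    Ok(n)
}
```

Both functions behave the same way for callers; the first simply lets the compiler insert the From::from call.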
Writing a complete error type can involve a fair amount of boilerplate, which makes it
a good candidate for automation via a derive macro. However, there’s no need to write
such a macro yourself: consider using the thiserror crate from David Tolnay, which
provides a high-quality, widely used implementation of just such a macro. The code
generated by thiserror is also careful to avoid making any thiserror types visible in
the generated API, which in turn means that the concerns associated with exposing a
dependency’s types in your API don’t apply.
Trait Objects
The first approach to nested errors threw away all of the suberror detail, just
preserving some string output (format!("{:?}", err)). The second approach preserved the
full type information for all possible suberrors but required a full enumeration of all
possible types of suberror.
This raises the question: is there a middle ground between these two approaches,
preserving suberror information without needing to manually include every possible
error type?
Encoding the suberror information as a trait object avoids the need for an enum variant
for every possibility but erases the details of the specific underlying error types.
The receiver of such an object would have access to the methods of the Error trait
and its trait bounds—source(), Display::fmt(), and Debug::fmt(), in turn—but
wouldn’t know the original static type of the suberror:
UNDESIRED BEHAVIOR
#[derive(Debug)]
pub enum WrappedError {
    Wrapped(Box<dyn Error>),
    General(String),
}
impl std::fmt::Display for WrappedError {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        match self {
            Self::Wrapped(e) => write!(f, "Inner error: {}", e),
            Self::General(s) => write!(f, "{}", s),
        }
    }
}
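To make the trade-off concrete: an error held as a trait object can still be displayed, but recovering its concrete type takes an explicit runtime check. The sketch below illustrates this with the standard library's downcast_ref() method; the describe helper and its output format are hypothetical.

```rust
use std::error::Error;

/// Hypothetical helper that accepts any type-erased error.
pub fn describe(err: &(dyn Error + 'static)) -> String {
    // `Display` is available because it is a trait bound of `Error`...
    let text = err.to_string();
    // ...but the static type is gone, so finding out whether this is an
    // I/O error needs a runtime downcast that may fail.
    match err.downcast_ref::<std::io::Error>() {
        Some(io) => format!("I/O error of kind {:?}: {}", io.kind(), text),
        None => format!("some other error: {}", text),
    }
}
```

The caller has to guess which concrete types to try, whereas the enum approach lets the compiler check that every possibility is handled.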
It turns out that holding suberrors as trait objects is possible, but it’s surprisingly
subtle. Part of the difficulty comes from the object safety constraints on trait
objects, but Rust’s coherence rules also come into play, which (roughly) say that there
can be at most one implementation of a trait for a type.
A putative WrappedError type would naively be expected to implement both of the
following:
• The Error trait, because it is an error itself.
• The From<Error> trait, to allow suberrors to be easily wrapped.
That means that a WrappedError can be created from an inner WrappedError, as
WrappedError implements Error, and that clashes with the blanket reflexive
implementation of From:
DOES NOT COMPILE
impl Error for WrappedError {}
impl<E: 'static + Error> From<E> for WrappedError {
    fn from(e: E) -> Self {
        Self::Wrapped(Box::new(e))
    }