4 Design document

This chapter covers

The most common myths around the design document
Defining antigoals for an even sharper focus on core objectives
Drafting a design document based on the information available
Reviewing a design document
The evolution of a design document

Once you have defined the problem your system should solve, as well as a list of stakeholders, and have a rough understanding of what technologies and solutions would be most appropriate for the product, as described in chapter 3, it is time to prepare a design document.

It is worth noting here that there is no set-in-stone order of actions at the early stage of creating a machine learning (ML) system. You can start preparing a design document as soon as you’ve identified the problem and goals (especially if you work in a startup, where the speed of delivery is often more important than following processes). But since this book is presented as a checklist, the list of actions is also displayed in a traditional sequence.

As one of the authors’ managers once said, no fancy recommendation algorithm can beat a customer with a shopping list. These people have a goal and a plan for achieving it. Nothing can stop them.

If you think about it, writing code is just providing a specific set of instructions to achieve a particular goal. In a sense, a design document is a meta-algorithm set to accomplish a specific goal, with the involvement of many subalgorithms. Still, the design document is regularly dismissed by many as either one of the horsemen of bureaucratization or a relic kept alive by inertia.

In this chapter, we will examine the most common myths around the design document. We will introduce and define the concept of antigoals as additional guidelines that lead you toward the project’s objectives, and we will start the practical part of this book, represented by two design documents based on close-to-real-life scenarios.

4.1 Common myths surrounding the design document

Over the years, the design document has seen a number of false assumptions and misinterpretations that may prevent you from putting together a well-organized paper appropriate for your project. Next we’ll examine the most common misconceptions and explain why you shouldn’t dwell on them.

4.1.1 Myth #1. Design documents work only for big companies but not startups

You could argue that dedicating part of your workload to preparing design documents would only make sense for big companies. There is a fair premise in this counterpoint: mature organizations need to invest more time and resources into writing design documents compared to a startup with a dozen employees. It doesn’t mean, though, that small companies should prepare no design documents at all: as a well-known quote says, “Plan is nothing; planning is everything.” The beauty of writing a design document lies in revealing blind spots in your vision, on both the product side and the technical side, which will save you a lot in the midterm, especially if you cut off irrelevant data. For the latter, we recommend applying the method we call “antigoals,” which we will dedicate a separate section to later.

When the book was in early access, there was a common comment shared by our early readers: “Well, this is good, but that is not how it works in startups.” While agreeing that startups’ delivery cadence is different, we still stick to the idea that the design phase is necessary. It is true that cofounders and early engineers can find their consensus during a coffee break, whereas a massive corporation would waste 6 months on the same scope. We also agree that writing formal docs may be inefficient, but that is not what we advocate for. A simple note with a short description can be enough as soon as you are sure it gets all collaborators on the same page. Ignoring good practices of software and ML engineering is fine while you’re hunting for a prize at a hackathon, but the hackathon style barely works at longer distances.

4.1.2 Myth #2. Design documents are efficient only for complex projects

There’s a grain of truth to this statement if you look at a design document in its classic sense: a large, labor-intensive effort involving every detail of the final product, from general scope to risk validation upon deployment. After all, the compilation of such a document alone can take more time than the lifetime of the project itself!

Typically, such an argument can come either from a person with a lack of flexibility or from an ardent opponent of the design document who is eager to use any argument in their favor.

Practice shows that even for smaller projects, a well-structured design document ensures early identification of potential risks, serves as a reference for future enhancements if the project eventually expands, and, most importantly, helps prevent scope creep when every other stakeholder is tempted to add just one more feature.

Even simple initiatives can benefit from a design document with a proportional level of detail.

4.1.3 Myth #3. Every design document should be based on a template

Many companies, especially well-established enterprises, maintain their recommended templates with a strict, rigid structure, and it can be useful, considering the scale of their businesses. However, we recommend avoiding setting design docs templates in stone. Based on our experience, the template should never be a sacred dogma. Such templates may try to serve too many goals all at once, thus getting bloated and discouraging people from preparing and studying those documents. That is why we recommend keeping the core template minimalistic and extending parts here and there depending on system-specific requirements and context.

At first glance, the process of creating a design document may seem straightforward and simple. In reality, right from the start, you will encounter a whole load of factors that will interfere with the process and set you several steps back if ignored.

Remember: your task is not to create a draft document and convince everyone of its purity and correctness. Your task is to find as many weak points as possible (including motivating your stakeholders to find them) so that eventually, after a number of iterations, you have a document that allows you to start developing your ML system.

4.1.4 Myth #4. Every design document should lead to a deployed system

If you are an engineer and need to build a machine, you need to start with a blueprint. Other engineers will review it and provide feedback, which will probably lead to another iteration of blueprints—and another and another until your design is finally ready to be brought to life.

The same principle applies to designing ML systems. An ML system is a highly complex machine of interconnected domains that requires thorough preparation, with your design document undergoing multiple iterations before implementation. Still, more often than not, a good design document leads to no ML project at all.

This might sound absurd, but let’s imagine you’re set to choose between two options:

Spending 6 months working relentlessly on models, features, loss functions, and datasets, only to put your project on the shelf (where most ML projects end up finding themselves)
Spending 2 to 4 weeks trying to describe
- Why are we doing this project?
- How do we do it?
- Do we have everything we need?
- Can we have a less efficient solution with less effort?
- Is the desired result achievable?

Realizing that 90% of results can be derived from two IF statements can be frustrating, but it is still much better than the first of the two options.

4.2 Goals and antigoals

One of the goals of a design document is to reduce uncertainty about a problem by setting cornerstones and boundaries. Before the document is drafted, the level of understanding of both the problem and the solution is low and inconsistent among all involved parties. A technique that can help address such a problem is using antigoals—inverse statements that can help us narrow down both the problem space and the solution space.

Each part of a design document can be viewed as an answer to multiple questions: what are the goals of a potential system, what are the key success criteria, what tech aspects should we focus on, how do we solve a given subproblem, etc. A rookie mistake would be to miss tradeoffs and enumerate endless goals for the system: for example, it should do X, Y, and Z; have high performance; be precise; be easy to maintain and cheap to develop; and be intuitively understandable. Obviously, it’s impossible to successfully fit all the good properties into one system, and you will require an approach to counterbalance this possible excessiveness.

Setting antigoals allows us to strike out the aspects we don’t really care about that much and additionally highlight those we see as crucial. Let’s say we’re building a system that will be used internally, and the output artifacts are various reports to be read by the executive team and analysts. We can assume right off the bat that processing time won’t be critical for such a system; it only needs to work fast enough for the reports to be ready by morning. Thus, “processing time” will be the first to join the list of antigoals so that we don’t bother ourselves with this parameter. Or imagine building a recommendation engine for a boutique store: you surely won’t need to support millions of items if the current number of goods has only three digits (see figure 4.1), meaning excessive scalability is a no-go for the end solution.

Figure 4.1 A shop with <1,000 goods for sale and low traffic should not aim for scalability when building a recommendation system, as almost any tech solution can handle its load these days.

Antigoals like this help us focus only on important aspects and drop the ones that have no positive effect on reaching the main goal of the system.

The following example suggests what the lists of goals and antigoals would look like for a boutique store’s recommendation engine:

Goals:
- Increased conversion from View to Add to Basket steps
- Diverse recommendations for users
- Low latency for users
Antigoals:
- Scalability in terms of the number of goods processed
- Scalability in terms of concurrent users
- Support for new goods categories

A similar logic is applicable to other blocks of a design document. If you formed an idea on implementation and later realized it had an intrinsic critical flaw, it would make sense to mention this issue in the document as a counterexample. Imagine you are designing a scalable system and considered using cloud infrastructure intensively until you learned that the biggest potential customer strictly requires the use of its own hardware for privacy reasons. In this case, a single sentence like “Cloud solution X could be a good option for data storage, but it is not applicable in this case because of Y’s cloud privacy restrictions” can set important limitations and may spark ideas on alternative tech implementations: “If X is fine from the technical perspective, are there open source X alternatives that can be installed on our own servers?”

Antigoals should not be considered the main source of information in your design document but can become a spice that adds a missing flavor, growing into an essential part of the document’s structure.

Questionable goals and their effect on the end result

We have two stories to highlight how unclear goal setting can affect the development of an ML system.

In 2016, Valerii worked in the collection department of a large bank. By that time, the bank’s management had decided to introduce ML into its daily routine and lean on algorithmic support instead of operating with a set of rigid rules and gut feelings. One of Valerii’s first tasks was to create a model picking the next user the bank must reach to maximize output: a user who can be activated by an incentive (a promise to pay, fee waiver, discount). The existing process involved a lot of manual work, yielding a conversion ratio of around 50%. The new process, which relied on a pretty basic nonlinear model built on a set of around 100 engineered features, was tested over the next 2 months and delivered an astonishingly better conversion of 80%, while the old process was still providing 50%.

The team was happy and excited to present the results to the senior vice president. The second after we finished the presentation, she said, “What’s so special about these clients? I want to know their motivation.” Addressing such a question in 2016 with a nonlinear model on 100 features was not an easy task, not to mention that what people do and why they do it are two completely different things. It turned out that, from the very beginning, the goal of the senior vice president was to understand why, whereas the goal of the business was to understand who. With both goals known up front, the team would have designed the system and model completely differently, aiming to answer both questions, even if that were less efficient than answering just one. Thus, a bad (or improper) goal at the very beginning set the team 3 months back.

The second example covers the pricing algorithm we discussed in chapter 2. At the very beginning, our goal was to maximize gross merchandise volume based on turnover while keeping the margin at a given level.

At some point, the model found an ingenious way to achieve the goal. There was a boombox speaker in the product catalog, which the model started selling at a price lower than the purchase price. As a result, more speakers were sold in 24 hours than in the previous 90 days. To be fair, this was still within the margin limit because we didn’t mind the margin being negative as part of the task.

However, you can imagine it was completely different from what we really wanted (the proper goal would be to increase the revenue while maintaining margin, affecting X% categories with Y% of SKUs in them with cannibalization no higher than Z). Sure, the revenue went up, and margins stayed within given limits, but in the end, everyone just ran to buy exactly that one model. No other speakers were bought.

Luckily, that was a test launch with a small number of items under dynamic pricing, showing that the initial goal was badly designed and we needed to develop a more thorough approach to goal setting. Fortunately, the overall design was decoupled and easy to adjust.
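One way to read this story is that the margin constraint was checked only in aggregate, so a single item could be priced below cost as long as the rest of the catalog compensated. The following is a minimal sketch of that reading, not the actual pricing system: the catalog, prices, costs, unit counts, and the 25% margin threshold are all hypothetical, and the per-item floor is just one possible way to tighten the goal.

```python
from dataclasses import dataclass


@dataclass
class Line:
    name: str
    price: float   # selling price per unit (hypothetical)
    cost: float    # purchase price per unit (hypothetical)
    units: int     # units sold in the period (hypothetical)

    @property
    def revenue(self) -> float:
        return self.price * self.units

    @property
    def profit(self) -> float:
        return (self.price - self.cost) * self.units


def overall_margin(lines: list[Line]) -> float:
    """Total profit divided by total revenue across all lines."""
    return sum(l.profit for l in lines) / sum(l.revenue for l in lines)


def satisfies_aggregate_only(lines: list[Line], min_margin: float) -> bool:
    """Loose goal: only the catalog-wide margin is constrained."""
    return overall_margin(lines) >= min_margin


def satisfies_with_item_floor(lines: list[Line], min_margin: float,
                              item_floor: float = 0.0) -> bool:
    """Tighter goal: each item must also clear its own margin floor."""
    per_item_ok = all((l.price - l.cost) / l.price >= item_floor for l in lines)
    return per_item_ok and overall_margin(lines) >= min_margin


catalog = [
    Line("boombox", price=45.0, cost=50.0, units=100),      # sold below cost
    Line("headphones", price=100.0, cost=60.0, units=200),
]

print(satisfies_aggregate_only(catalog, min_margin=0.25))   # True
print(satisfies_with_item_floor(catalog, min_margin=0.25))  # False
```

The loose check passes because the second item carries the overall margin, while the tighter check rejects the same price list. A real goal would likely add further terms, such as the cannibalization limit mentioned in the story.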

4.3 Design document structure

In this section, we could have focused on theoretical information about the contents and structure of the classic design document, but the truth is, a design document you prepare for an ML system will hardly rely on practices applied in traditional software development. On top of that, its structure may vary from company to company, so we do not think it makes sense to dwell on layout nuances. Instead, we recommend focusing more on what items need to be covered. Plus, our goal is to showcase the design document as an entity within ML system design. For that reason, starting with this section and for the rest of the book, at the end of each chapter there will be a large practical block representing a part of a design document that incorporates the main message from the given chapter. We see it as a crucial component of this book, which will go side by side with theory and campfire stories while offering an example of applying real-life solutions to problems.

In what follows, we will introduce you to two fictional cases, each with its own specifics, features, problems, and context. These two cases will form the basis of two different design documents, which will gradually grow and evolve from chapter to chapter, adding more depth and complexity. Eventually we will have two fully formed documents at our disposal.

In this section, we start to outline a design document for a project as it might have been written in real life. For this purpose, we introduce a fictional company, Supermegaretail, a retail company with a demand forecast project to launch.

In what follows, we give a very brief example of what the first chapter of a design document can look like. We will include only major topics; otherwise, it would not fit into a single book.

NOTE Any text in the body of the design doc written in italics contains our supporting comments and is not part of the document itself.

Design document: Supermegaretail

Problem definition

i. Origin

Supermegaretail is a retail chain operating through a network of thousands of stores across different countries in various regions. The chain’s customers buy a wide range of goods, including groceries, household essentials, personal care products, sports supplements, and much more.

To sell these goods, Supermegaretail must purchase or produce them before delivering them to a store’s location. The number of purchased goods is the key figure that needs to be defined, and there are different possible scenarios here.

For easier calculations, we assume that Supermegaretail bought 1,000 units of item A for the specific store:

Supermegaretail bought 1,000 units and sold 999 before the next delivery. This is an optimal situation. With only 0.1% left over, the retailer is close to the optimal revenue and margin.
Supermegaretail bought 1,000 units and sold 100 before the next delivery. This is usually an awful situation for an obvious reason. Supermegaretail wants to sell almost as many units as it purchased without going out of stock. The more significant the gap, the larger Supermegaretail’s losses.
Supermegaretail bought 1,000 units and sold 1,000. This should be considered a terrible situation because we don’t know how many units people would buy had they had the opportunity. It could be 1,001, 2,000, or 10,000. An out-of-stock situation like that obscures our understanding of the world. Even worse—it drives customers from Supermegaretail to its competitors, where they can buy the goods with no shortages.
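The three scenarios above can be sketched as a small classification rule. This is only an illustration: the function name, the labels, and the 5% leftover tolerance are our assumptions, not figures from Supermegaretail.

```python
from dataclasses import dataclass


@dataclass
class StockOutcome:
    leftover_share: float  # fraction of purchased units still unsold
    stocked_out: bool      # True when every purchased unit was sold
    verdict: str


def classify(bought: int, sold: int, max_leftover_share: float = 0.05) -> StockOutcome:
    """Label one delivery cycle for one item in one store.

    max_leftover_share is a hypothetical tolerance chosen for illustration.
    """
    if bought <= 0 or not 0 <= sold <= bought:
        raise ValueError("expected bought > 0 and 0 <= sold <= bought")
    leftover_share = (bought - sold) / bought
    if sold == bought:
        # Sales are capped by stock, so the true demand is unknown (censored).
        return StockOutcome(leftover_share, True, "out of stock")
    if leftover_share <= max_leftover_share:
        return StockOutcome(leftover_share, False, "near optimal")
    return StockOutcome(leftover_share, False, "overstock")


for sold in (999, 100, 1000):
    print(sold, classify(1000, sold))
```

The out-of-stock branch is checked first on purpose: a zero leftover looks ideal numerically, but it hides the demand that was never served.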

An additional constraint is that we have a lot of perishable foods that can’t stay on store shelves for long: they’re either sold or wasted.

The project goal is to reduce the gap between delivered and sold items, making it as narrow as possible, while avoiding an out-of-stock situation with a specific service-level agreement to be specified further. To do that, we plan to forecast the demand for a specific item in a specific store during a particular period with the help of an ML system.

ii. Relevance and reasons

This section highlights the problem’s relevance, backed by exploratory data analysis.

A. Existing flow analysis

What is the current way of ordering, delivering, and selling goods in Supermegaretail?

For Supermegaretail, the possible list might be as follows:

Planning horizon for making a deal with goods manufacturers:
- It’s a 1-year deal with the opportunity to adjust 90 days ahead within the first 9 months.
Additional discount with an increased volume:
- It’s an extra 2% off for every additional $20 million.
The number of distribution centers serving as logistics hubs between manufacturers and stores:
- There are 47 distribution centers around the country, making them a point of presence and aggregated entity for the forecast.
Delivery cadence between distribution centers and stores:
- Usually, every 2 days, there is a truck connecting the distribution center and the store.
Presence or absence of in-store warehouses:
- There are no warehouses in most of the stores. However, the loading bay zone can be (and is) effectively used to store offloaded items for 2 to 3 days.
Who and at what stage decides what and where to deliver?
- There’s a delivery plan coming down from the distribution center. A store’s manager can override and adjust it.
Forecast horizon:
- The primary forecast horizons are one week and one month. However, a 1-year horizon is needed when dealing with goods manufacturers.
Business owner of the process:
- Logistics department
- Procurement department
- Operational department (store managers)

B. How much does Supermegaretail lose on the gap between forecasted and factual demand?

Although it is relatively easy to calculate the loss due to overstock and expired items, it is much harder to calculate the loss due to out-of-stock situations. The latter can be estimated through either a series of A/B tests or an expert opinion, which is usually much quicker and cheaper than running those tests.

The overall loss can be approximated by summing up the two, providing an estimate of the gain with an ideal and nonachievable solution.

The initial calculation showed the loss to be around $800 million during the last year.
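As a rough illustration of how the two components add up, here is a minimal sketch. The function names and every number in it are hypothetical and describe a single item in a single store; they are not Supermegaretail data. The out-of-stock part in particular depends on an estimated number of lost units, which is exactly the figure that would come from A/B tests or expert opinion.

```python
def overstock_loss(unsold_units: int, unit_cost: float, salvage_value: float = 0.0) -> float:
    """Money lost on units that expired or were written off."""
    return unsold_units * (unit_cost - salvage_value)


def stockout_loss(estimated_lost_units: float, unit_margin: float) -> float:
    """Margin missed on sales that never happened.

    estimated_lost_units is an estimate, not an observed quantity.
    """
    return estimated_lost_units * unit_margin


# Hypothetical figures for one item in one store over one period.
over = overstock_loss(unsold_units=120, unit_cost=2.5, salvage_value=0.5)
out = stockout_loss(estimated_lost_units=80, unit_margin=1.0)
total = over + out
print(f"overstock={over:.2f} stockout={out:.2f} total={total:.2f}")
```

Summing such totals over all items, stores, and periods gives the upper bound on the gain described above, the one an ideal forecast would deliver.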

Starting from the following section of the design document (but only for this chapter), we sketch questions rather than full answers to keep the document from becoming too voluminous. Answering these questions will help you decide on further actions, and the answers are revealed in the later chapters as we go through the different stages of the system.

C. Other reasons

Can other teams use our solution, making development more appealing and reasonable?
Perhaps we can sell demand forecast solutions to other retail companies (obviously not to direct competitors).

iii. Previous work

This section covers whether this is an entirely new problem or something has been done before. Usually, it is a list of questions you ask to avoid doing double work or repeating previous mistakes.

What if Supermegaretail was aware of this issue and had already implemented some demand forecast approach? It has various stores in different locations; its demand forecast is probably already pretty efficient. How does the company do it?
- Rolling window?
- Experts committee?
- Rule of thumb + extra quick delivery?
- Do we have some limitations to consider that we can’t avoid, like minimum or maximum order size?
Can we quickly improve the existing solution, or do we need an entirely new one?
What if Supermegaretail’s current forecasting is good enough for some categories and useless for others? In other words, can we use a hybrid approach here, at least in the very beginning, and start from the least successful categories, where the existing gap between predictions and actual sales is the widest?
If our approach unintentionally breaks something, it is not that dangerous. We are testing it for categories where we always had problems while not touching categories where everything is good.
In other words, we need to run an extensive and fresh exploratory data analysis of the existing solution.
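To make the rolling-window question and the idea of starting from the weakest categories more concrete, here is a minimal sketch. It assumes the existing solution is a plain rolling mean, which is only one of the options listed above, and it uses made-up sales histories for two hypothetical categories. The gap metric (sum of absolute errors divided by total sales) is our choice for illustration.

```python
def rolling_mean_forecast(sales: list[float], window: int) -> list[float]:
    """Forecast each period as the mean of the previous `window` periods."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [sum(sales[t - window:t]) / window for t in range(window, len(sales))]


def gap_by_category(history: dict[str, list[float]], window: int) -> dict[str, float]:
    """Sum of absolute forecast errors divided by total actual sales."""
    gaps = {}
    for category, sales in history.items():
        forecast = rolling_mean_forecast(sales, window)
        actual = sales[window:]
        total_actual = sum(actual)
        if total_actual == 0:
            raise ValueError(f"no sales to evaluate for {category!r}")
        abs_error = sum(abs(a - f) for a, f in zip(actual, forecast))
        gaps[category] = abs_error / total_actual
    return gaps


# Hypothetical weekly sales per category.
history = {
    "dairy": [100, 102, 98, 101, 99, 100],
    "seasonal": [10, 40, 5, 60, 8, 70],
}

gaps = gap_by_category(history, window=3)
# Widest gap first: these are the candidates for the first ML rollout.
ranked = sorted(gaps, key=gaps.get, reverse=True)
print(ranked, gaps)
```

Categories at the top of the ranking are where a new approach has the most room to help and the least to break.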

iv. Other issues and risks

Do we have a required infrastructure, or do we need to build it?
If we pick something sophisticated, it can go crazy. What necessary checks and balances do we need to implement to avoid a disaster? Do we have a fallback in case something is broken?
How sure are we that we can significantly improve the quality and reduce the manual load? Can we really solve this?
What is the price of a mistake? Out-of-stock and overstock most likely have different costs of errors.
If we deal with an out-of-stock situation, can we handle increased traffic?
How often and on what granularity do we need to perform predictions?
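The point about different error costs can be sketched as an asymmetric cost function. The per-unit costs below are hypothetical. The second function uses the standard newsvendor result: when the cost is linear on both sides, the expected cost is minimized by forecasting a specific quantile of demand rather than its mean.

```python
def forecast_cost(forecast: float, demand: float,
                  overstock_cost: float, stockout_cost: float) -> float:
    """Cost of delivering `forecast` units when `demand` units were wanted."""
    if forecast >= demand:
        return overstock_cost * (forecast - demand)
    return stockout_cost * (demand - forecast)


def target_quantile(overstock_cost: float, stockout_cost: float) -> float:
    """Demand quantile that minimizes expected cost under linear costs."""
    return stockout_cost / (stockout_cost + overstock_cost)


# Hypothetical per-unit costs: a missed sale hurts three times more
# than an unsold unit.
OVER, OUT = 1.0, 3.0

print(forecast_cost(110, 100, OVER, OUT))  # 10 units too many
print(forecast_cost(90, 100, OVER, OUT))   # 10 units too few
print(target_quantile(OVER, OUT))
```

With these costs, the same 10-unit error is three times more expensive on the out-of-stock side, so the forecast should lean toward the 75th percentile of demand instead of the median.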

As you can see, even a brief overview of the problem to solve and research using the previously gathered data can easily force us to write a 10-page doc. This draft will help us decide if we need to go further or if it is better to stop right now and avoid a complicated ML solution.

The next section of this chapter is no less important: it gives a practical example of how to review a design document. If you’re new to ML system design, you probably haven’t reached the stage of your career where you have enough experience and credibility to be involved in this kind of working routine. However, stepping up to review your first design doc is just a matter of time, so it’s better to be prepared beforehand, and you will see some practical advice on the reviewing basics.

4.4 Reviewing a design document

Audi alteram partem [Let the other side be heard as well]
—Latin proverb

So far, we haven’t seen a draft design doc written by a single person that would be complete enough to implement right from the start. However, we’ve come across some really decent drafts, which is more than enough after the first iteration.

This fact is essential and quite easily explained. Complex systems require input from many people with diverse expertise and backgrounds. As a design document author, part of your job is to make it more manageable for all the involved parties to navigate. Outlining your doc with chapters and subchapters will help domain experts see where to go from the beginning. Otherwise, the natural reaction for most people when they see a 10+ page doc is to close it and forget it.

Here come the first two critical points: the design doc must be accessible and visible to as many people as possible and easy to navigate for all participants.

As soon as people start reviewing any kind of content, they begin to criticize and offer alternatives. As an author, you want to encourage this type of behavior. After all, what are the chances you had the best and most appropriate design after the first iteration?

Try to derive an explanation for each proposed change or fix, as they could emerge from different conditions:

Reviewers have used this tool before and think it is the best tool for everything.
There are limitations in the current infrastructure. For example, we can’t provide real-time support but can do batch jobs every 60 seconds. Does it affect the flow?
We have people who can maintain technology A, but there’s nobody to maintain technology B. Thus, it is better to move from technology B, mentioned in the design, to technology A.
Reviewers want to boast about their knowledge of technologies and demonstrate this knowledge to a broader audience.
Reviewers see another way to solve a given task and offer an alternative solution.

Try to understand the reasoning behind every input and solicit additional information until you fully understand the reasons. From our personal experience, the least helpful input (on the first iteration) would sound like “looks good to me.” Try to find a part that looks the most questionable to you and ask the reviewer about it, expressing your concerns. A generally good practice would be to have a list of concerns, including things you are not sure of, to target reviewers’ attention and facilitate requests.

A popular failure mode for design documents is writing too generically. That is a huge drawback for a design doc, and often it is caused by the fact that a single person may not have enough context to fill in all the gaps. As an initial author, you need to facilitate the others’ inputs—for example, highlight some problematic areas with a lack of required information and encourage the reviewers to add missing parts of the puzzle.

We discussed how to create a design doc and what to expect from reviewers, but because the title of this section is “Reviewing a design document,” let’s try to reverse our suggestions and apply them from the reviewer’s standpoint:

Take a look at the design doc and try to navigate through the outline. Which chapters do you feel most confident about?
If the outline does not exist, check if there are open questions/things to consider at the end of the doc.
Ask the design doc owner to provide those if they don’t exist.
When adding a comment, try to answer for yourself what value you are adding and what you want to achieve with it.
If you’re tempted to write “Looks good to me,” think twice. Are you doing that because it indeed looks good to you or because you just want to save time or rely on others’ opinions? If so, maybe it makes sense not to comment at all.

4.4.1 Design document review example

The case we’ve chosen for our second example design document is the stock photo company we mentioned in chapter 3. Meet PhotoStock Inc., where we’ve been hired to build a modern search tool that will be able to find the most relevant shots for customer text queries while providing excellent performance and displaying the most relevant images in stock.

The business is effectively a marketplace: photographers join the platform and upload their shots; customers who are looking for specific images for illustrative purposes (editors, designers, ad professionals) purchase rights for these photos. The marketplace makes money through commission from sales. The company is highly interested in making an effective search system on its website.

We provide part of a raw and poorly written design document based on what we discussed in the previous chapters and comment on it as if we were reviewing the document. This time, text highlighted in italics represents reviewers’ comments.

Design document: PhotoStock Inc.

I. Problem

Ninety percent of PhotoStock Inc. users find images via the search bar on our website. It makes the search bar a core component of the user experience.

Currently, the search engine is based on a fuzzy search algorithm powered by Elasticsearch, with its index updated automatically every Monday night. We assume it processes synonyms poorly. In addition, users can apply additional filters from presets that are manually created by the product team.

Many users are not happy with the search quality, which is proven by customer interviews and analysis based on clickstream. Only a small portion of search sessions leads to a purchase.

Reviewer: How many users exactly? Please add links to existing reports and dashboards for more context.

Reviewer: The search-to-purchase conversion is a function of many variables, and the relevancy of search results is just one of the factors. I suggest decomposing the problem further so we can estimate the missed revenue caused by poor search results more efficiently.

Reviewer: Please provide more information on the current search solution, as it’s not clear how it works and interacts with other systems. What are the main failure modes?

Reviewer: How do we measure user happiness? Please add specific criteria.

II. Goals

Increase the search-to-purchase conversion rate by 100%.

Reviewer: Why by 100%? Are there any reasons for this exact level of increase?

Reviewer: As mentioned before, the search-to-purchase funnel is not only determined by search quality. Let’s narrow down the goals.

Reviewer: Are there any important nonfunctional requirements like latency?

Reviewer: How do we currently measure conversion? How can we measure that it has been increased by this effort?

Reviewer: Have you defined antigoals to highlight the zones we don’t need to focus on?

III. Risks

We can lose many loyal existing customers as they can’t follow their current behavior patterns.

If we release defective software, we can lose a significant source of revenue.

Reviewer: Interesting point on behavior patterns. Are there any examples of how users have to adapt to dysfunctions of the search engine?

Reviewer: With our infrastructure of blue-green deployment and A/B tests platform, we should be able to roll out the new system gradually; we should use it to mitigate such risks.

IV. References

[Link to YourPowerfulSearch, an enterprise-grade search system for marketplaces]
[Link to an academic paper from the Bing Search Relevancy team]
[Link to a Google Analytics dashboard showing various metrics related to a PhotoStock search]

Reviewer: Please add more internal search-related artifacts, such as PhotoStock BI dashboard and UX research.

Reviewer: I believe that YourPowerfulSearch is not the only relevant solution in the market; can we get a wider overview? The same applies to the paper from Bing.

Reviewer: How can we estimate the influence of search relevancy on our commercial metrics? This can affect a possible budget a lot.

You can see some patterns in the comments, such as

Raising legitimate questions as early as possible
Suggesting missing parts either as questions or statements

Early feedback at the design review stage can save a lot of time in the later stages. Questions should initiate and facilitate a healthy discussion and unlock better solutions and should never be aggressive or toxic.

4.5 A design doc is a living thing

This section was initially planned to be myth #5 in the list from the beginning of this chapter, but we believe this point is important enough to have its own spot as a separate section.

So why should there be no fear or hesitation in editing or criticizing a design doc at any stage? The answer is that a design doc is truly a living thing.

Usually, the evolution of design docs looks like this:

First iteration
Feedback from peers
Rewrite 60% of the doc
Feedback from peers
Rewrite 30% of the doc
Feedback from peers
Rewrite 10% of the doc
Start implementing the system
(Three months later) input from the real world
Rewrite 30% of the doc

With an evolution like that, the earliest point at which you could call the design doc complete is when the system is fully implemented, and even that is not guaranteed.
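To get a feeling for what these percentages add up to, here is a rough back-of-the-envelope sketch. It rests on a simplifying assumption of ours that the chapter does not make: each rewrite touches a random share of the document, independent of what earlier rewrites changed. The function name is hypothetical.

```python
def surviving_share(rewrite_fractions):
    """Fraction of the first draft still untouched after a series of rewrites.

    Assumes each rewrite replaces a random share of the document,
    independent of earlier rewrites (a simplification).
    """
    share = 1.0
    for fraction in rewrite_fractions:
        if not 0.0 <= fraction <= 1.0:
            raise ValueError("each rewrite fraction must be between 0 and 1")
        share *= 1.0 - fraction
    return share


if __name__ == "__main__":
    before_launch = [0.60, 0.30, 0.10]        # three review rounds
    after_launch = before_launch + [0.30]     # plus input from the real world
    print(f"left at implementation start: {surviving_share(before_launch):.1%}")
    print(f"left three months later:      {surviving_share(after_launch):.1%}")
```

Under this assumption, roughly a quarter of the first draft survives until implementation starts, and less than a fifth remains after the first round of real-world input.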

As soon as your system is implemented, life will expose its flaws, which you will have to address; or product managers will decide they need new features, and the system will have to be extended; or the government will issue a new piece of legislation, which you will have to take into account; or there will be an infrastructure migration or a new use case. You name it. To make these changes, engineers need to understand the system, which means reading the design document. By that time, a new pattern or technology may have emerged that fits the system perfectly.

Even if that is not the case, new features and refactoring need to be reflected in the design doc, which brings us back to the design doc evolution described earlier.

That is why a design document is never finished. It remains a living thing for as long as the service it describes exists. Even if you leave the company, others need to take up the banner if they don’t want to end up with an unmaintainable system.

Rewriting a solid share of the design doc may seem discouraging, but it is something you can benefit from in the long run. For complex systems, it even makes sense to practice the “design it twice” approach: admit that your first design is likely not the best one, and design it twice, taking two radically different approaches. As practice shows, this approach can reveal hidden problems and opportunities. Let us quote A Philosophy of Software Design (Yaknyam Press, 2018) by John Ousterhout (a nice book we recommend to feel the spirit of a good design):

I have noticed that the design-it-twice principle is sometimes hard for really smart people to embrace. When they are growing up, smart people discover that their first quick idea about any problem is sufficient for a good grade; there is no need to consider a second or third possibility. This makes it easy to develop bad work habits. However, as these people get older, they get promoted into environments with harder and harder problems. Eventually, everyone reaches a point where your first ideas are no longer good enough; if you want to get really great results, you have to consider a second possibility, or perhaps a third, no matter how smart you are. The design of large software systems falls in this category: no-one is good enough to get it right with their first try.

A good design (and, by extension, a good design doc) should reduce the complexity of the system, whether in understanding, building, modifying, or maintaining it. And if the system promises to be complex from the very beginning, spending additional time to reduce this complexity in advance via multiple iterations is often a good investment.

People who prefer building over thinking may feel irritated by this point: “Come on, first you suggest writing docs instead of writing code, and now you suggest doing it over and over again?” Well, it makes little sense to run many iterations once you no longer receive new information, and sometimes you can’t improve the design before some proof of concept is written. However, designing things twice is often a fair tradeoff between agility and preparedness.

Summary

Just as correctly set goals are the benchmark for your system, antigoals mark the areas to avoid. Keep them in mind and state them explicitly in the design document.
A design document that is well put together will help you understand whether you need an ML system at all.
Involve all your stakeholders in reviewing the draft design document.
When you see a “looks good to me” answer, always contact your colleague once again for clearer, more precise feedback. Vagueness at the draft stage will very likely lead to change requests at later stages.
If you’re initiating a review for your design document, encourage people to criticize the current ideas and suggest alternatives.
Consider inviting reviewers with various backgrounds and experiences to gather varied feedback.
While reviewing a design document, ask clarifying questions to point out weak or unneeded parts.
Don’t be afraid of multiple iterations, as no first draft will ever reach the final stage without edits.
Remember, a design doc is a living thing and will be subject to edits even after the launch of your system.