On Estimating

Michael Siliski
8 min read · Jan 20, 2017


The other day, I was in a meeting with my team leads working on project planning. I asked the product manager and tech lead of a specific component, “How far along do you think we’ll be in 4 months?” The answer was, “We don’t know.” I responded, “Well, what’s your best guess, given everything you know right now?” They told me it was impossible to say, and they’d need a week to do a thorough workup before quoting any numbers.

Not having any information to go on was frustrating, but I realized afterwards that we’ve never had a real conversation about how to come up with useful estimates. So I decided to write up how I approach this.

Decision making

Why is estimation important? Because we constantly make decisions, and those decisions depend on some information, and that information is nearly always complex, incomplete, and/or at least partially incorrect. Estimating is the art of rendering a quantitative judgment from uncertain information. The estimate can be factored into our decision making, whereas the information it was based on is often too unwieldy to be directly used or efficiently communicated.

Estimation is critical in product management. “When will we be ready to launch?” “How many people will use this feature?” “How much revenue will this generate?” These questions all demand quantitative answers, and the answers can’t be calculated with certainty.

Estimation is also important in daily life. “How far can we go on 1/8 tank of gas?” “How much do we need to budget for Christmas presents this year?” “How likely are we to still live in this house in 5 years?” The estimates in all these cases directly inform decisions, and the quality of the decisions (i.e. the likelihood you’ll be happy with them after the fact) depends on the quality of the estimates.

So, if we want to improve our decisions, how do we make better estimates?

Reducing uncertainty

An obvious place to look is reducing the uncertainty itself. This can sometimes be done efficiently. Broadly speaking, there are two approaches:

  1. Directly gather more information. If I want to know how far I can go on 1/8 tank of gas, I'll have a much more accurate estimate if I can look up my car's tank size and MPG (see the sketch after this list).
  2. Involve more people. The more people at the table, the lower the risk of missing some essential piece of information. Have each person produce their estimate independently, then encourage active discussion about why the estimates differ. This avoids groupthink and gets hidden assumptions on the table. The estimates should converge, and you should be able to explain any remaining differences based on differing assumptions. You can then collapse these into a single estimate.
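For the gas example, the arithmetic is simple enough to jot down. Here's a back-of-envelope sketch with hypothetical numbers (a 14-gallon tank and 30 MPG are assumptions, not facts about any particular car):

```python
# Back-of-envelope range estimate. Tank size and MPG are hypothetical.
tank_gallons = 14
mpg = 30
fraction_left = 1 / 8

range_miles = tank_gallons * fraction_left * mpg
print(f"~{range_miles:.0f} miles left on 1/8 tank")  # ~52 miles
```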

Both of these methods have real costs in time and effort. It’s important to account for those costs and make an explicit decision to incur them, as the costs can sometimes outweigh the decision-making value added. For example, if I’m sure I can go 10 miles on 1/8 tank of gas, and my destination is only 5 miles away, I have all the information I need. I’ve seen many cases where a 5-minute conversation with an experienced engineering lead yielded an estimate as accurate as a week’s worth of thorough project planning by a whole team.

While uncertainty can be reduced, it can rarely be eliminated, so the next question is how to produce optimal estimates given the uncertainty. Note that there is no level of uncertainty that makes it impossible to produce any estimate at all. Low-confidence information can still be valuable, as long as the uncertainty is clearly represented and communicated.

Generating point estimates

So given some amount of information, how do you actually come up with an estimate? The estimate should have two components: a point estimate representing your best guess and a measurement of uncertainty.

My process for generating point estimates is pretty simple:

  1. Generate the 5-second gut feel number. This is your initial best guess.
  2. Pick a very low number, one so low it's unimaginable that the real number falls below it, and a correspondingly high number for the upper limit.
  3. Now ask yourself, “If I had to bet that the real number was lower or higher than my best guess, which would I choose?” Apply information as needed to help you decide.
  4. If you pick lower, set your new best guess halfway between your lower limit and your previous best guess; the previous best guess becomes your new upper limit. (If you pick higher, mirror this: halfway to your upper limit, with the previous best guess becoming the new lower limit.)
  5. Repeat steps 3 and 4 until you truly can't decide in step 3. Then you're done.

This gives you an over/under: a value you think is as likely to be too high as too low. A good test of "can't decide": imagine you had to bet a month's income on higher vs lower… would you truly be indifferent?
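In code, this procedure is just a bisection loop. Here's a minimal sketch, with the human judgment in step 3 modeled as a hypothetical ask_lower_or_higher callback that returns "lower", "higher", or None once you're genuinely indifferent:

```python
# A sketch of the point-estimate procedure above, not a definitive
# implementation. ask_lower_or_higher is a hypothetical stand-in for
# the human judgment call in step 3.
def refine_point_estimate(low, high, guess, ask_lower_or_higher):
    """Bisect toward the over/under between two hard limits."""
    while True:
        answer = ask_lower_or_higher(guess)
        if answer is None:              # truly can't decide: done
            return guess
        if answer == "lower":
            high = guess                # old guess becomes the new upper limit
            guess = (low + guess) / 2   # halfway to the lower limit
        else:                           # "higher" is the mirror case
            low = guess
            guess = (guess + high) / 2
```

Starting from, say, low=1, high=26, and a gut-feel guess of 8 weeks, each round of betting halves the space you're searching.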

Representing uncertainty

So now we’ve got our estimated, best-guess value, but we’ve also made a lot of arbitrary-feeling decisions and intentionally set aside our low confidence. We might even feel that there is so much uncertainty latent in our estimate that the whole thing is worthless. So how do we make sense of, and then communicate, that uncertainty?

This is critical, and far too often ignored. If you tell me a bug is going to be fixed by next Wednesday, but you really have no idea if it’ll be today, tomorrow, or any other day in the next month, your Wednesday estimate is essentially worthless to me (and potentially worse than no information).

The solution I recommend is confidence intervals. Very few people can interpret standard deviations, and “low confidence” means nothing in particular, but confidence intervals are both specific and easy to understand. In fact, I’d go so far as to say that an explicit confidence interval should always be included in any estimate.

A simple and effective form of a confidence interval is providing best and worst cases along with the expected case. I prefer putting explicit odds on the best and worst cases to ensure everyone is speaking the same language, since there is a huge difference between a 1% chance and a 20% chance, and it’s important for people to interpret the uncertainty the same way. For the same reasons, it’s also useful to contextualize your confidence interval by articulating the scenarios that fall inside and outside the range.

Going back to our bug fix example: If you tell me the bug is most likely going to be fixed next Wednesday but might take until Thursday, that's very different from telling me it's most likely going to be fixed next Wednesday but there's a chance we hit an issue that takes a couple of weeks to resolve. I'm going to make different decisions. Adding a confidence interval to your estimate is as simple as saying, "Likely Wednesday, and I'm 90% sure it'll be done by Thursday." Usually, confidence intervals are skewed towards the pessimistic case. Even when everything is likely to go to plan, there are more ways for things to go worse than expected than better!
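If it helps to make the shape of such an estimate concrete, here's a small sketch of one written down as data. The dates and odds are hypothetical, mirroring the Wednesday/Thursday example:

```python
# A sketch of one way to record an estimate with its confidence interval.
# Dates and odds are hypothetical, matching the example above.
from dataclasses import dataclass
from datetime import date

@dataclass
class Estimate:
    likely: date             # the over/under: as likely early as late
    bound: date              # pessimistic bound
    bound_confidence: float  # odds the work lands by the bound

fix = Estimate(
    likely=date(2017, 1, 25),   # "likely Wednesday"
    bound=date(2017, 1, 26),    # "done by Thursday"
    bound_confidence=0.9,       # "90% sure"
)
print(f"Likely {fix.likely:%A}, {fix.bound_confidence:.0%} sure by {fix.bound:%A}")
# -> Likely Wednesday, 90% sure by Thursday
```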

In addition to quantifying your uncertainty, confidence intervals are also a useful tool to extract estimates from the reticent. People sometimes feel a minimum confidence level is required before they’ll share any estimate. A colleague of mine likes to say estimates have a way of sticking, so people will shy away from giving them if they feel they might get burned later. If you tell them they can pick a number but also give the range that covers 90% of cases, they’ll feel much less pressure about the number, and will give you a lot of information about how much uncertainty they actually perceive. Again, the goal is not to be 100% accurate, but rather to be able to summarize and communicate the uncertain information we do have.

Visualizing probability distributions

One tool that I find useful to increase my accuracy in generating both point estimates and confidence intervals is drawing a cumulative probability chart. This is simply a chart with the y-axis showing probability (0% to 100%) and the x-axis showing values (time, cost, etc.). The probability increases monotonically as you move out from the origin. Last month, I drew one on a whiteboard to help a team make an estimate.

I didn’t add much information myself other than to ask the team questions and turn the answers into the chart, but seeing the result was pretty eye opening for them, particularly because it forced them to actively think through the worst case scenarios and factor in unknown unknowns.

To generate the chart, I simply try to pick a few representative values on the x axis, estimate the corresponding probability of each, and then draw a smooth line through all the point estimates. Using the resulting chart, you can directly read off the point of 50% confidence, 80% confidence, etc., as well as get an intuitive feeling of how tightly the expected, best, and worst case scenarios are clustered.
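Here's a minimal sketch of that exercise in code, with made-up elicited points for a hypothetical schedule estimate in weeks (the specific weeks and probabilities are assumptions for illustration):

```python
# Sketch of the whiteboard exercise: elicit a few (value, probability)
# points, draw a curve through them, and read off confidence levels.
import numpy as np
import matplotlib.pyplot as plt

weeks = np.array([2, 4, 6, 8, 12, 16])                  # representative x-axis values
cum_prob = np.array([0.05, 0.2, 0.5, 0.7, 0.9, 0.98])   # elicited P(done by then)

# Smooth line through the elicited points (linear interpolation here;
# a hand-drawn curve serves the same purpose on a whiteboard).
grid = np.linspace(weeks.min(), weeks.max(), 200)
curve = np.interp(grid, weeks, cum_prob)

# Read off the 50% and 80% confidence points by inverting the curve.
median_weeks = np.interp(0.5, cum_prob, weeks)
p80_weeks = np.interp(0.8, cum_prob, weeks)
print(f"50% confidence: {median_weeks:.1f} weeks, 80% confidence: {p80_weeks:.1f} weeks")

plt.plot(grid, curve)
plt.scatter(weeks, cum_prob)
plt.xlabel("Weeks until done")
plt.ylabel("Cumulative probability")
plt.show()
```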

Traps and biases

Through all of this, it's useful to remind yourself that the human brain uses an elaborate set of shortcuts to make judgments, and these can sometimes lead you astray. There are many biases that you'll want to actively guard against — a great source on this is Daniel Kahneman's Thinking, Fast and Slow. A few examples that are particularly relevant:

  • Anchoring bias: Whatever number is in your head, whether relevant or not, can significantly affect your estimates.
  • Availability bias: Easy-to-recall events can overly impact your judgments. For example, recency bias could lead you to make an overly optimistic estimate if you just finished a project that shipped on time.
  • Affect heuristic: Your feelings can impact your reasoning ("I like this product, therefore it will probably generate lots of revenue").
  • Halo effect: This handsome and confident engineer will probably get that bug fixed quickly!
  • Overconfidence effect: Your own subjective confidence in your judgments exceeds the objective reliability of those judgments.

Another common trap is underestimating the probability of relatively unlikely events. (See: 2016 US Presidential Election.) My rule of thumb is that people tend to treat any event with less than a 20% probability as if it had a 0% probability. I try to remedy this in two ways:

  1. Make the effort to list out various cases that don’t seem obvious. Once they’re named, it’s easier to force yourself to assign them odds.
  2. Imagine the scenario playing out 10 times, and count how many times various events occur. For example, living in this house for more than 5 more years may be hard for me to imagine right now, but if we played out the next 5 years 10 times, how many times would I still be here? One? Two? Four? This kind of thinking is often more intuitive to people than going straight to percentages.
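The second trick is easy to mimic with a toy simulation. This sketch assumes a hypothetical 15% chance of moving out in any given year, then plays the next 5 years out 10 times:

```python
# A toy frequency-framing simulation. The 15% annual chance of moving
# is a hypothetical number, purely for illustration.
import random

random.seed(7)  # fixed seed so the 10 "plays" are repeatable
runs = 10
still_here = 0
for _ in range(runs):
    moved = any(random.random() < 0.15 for _ in range(5))  # moved in any year?
    if not moved:
        still_here += 1
print(f"Still in the house after 5 years in {still_here} of {runs} runs")
```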

Improving your estimation skills

If you want to get better at estimating, the good news is that practice is simple. Anytime you're about to look something up or count something — How high is the Golden Gate Bridge? How old is Beyoncé? How many marbles are in that jar? — just challenge yourself to estimate the answer first. There's nothing like direct feedback to refine your skills. If you practice this often, you'll become more comfortable thinking about probability and expressing uncertainty, which will enable you to clearly communicate estimates, assumptions, and risks.
