Tag Archives: estimating

Feeling Sick or Unlucky

Let’s play doctor.

Let’s say you have a patient who shows signs of a disease that’s tricky to diagnose.  In fact, of the people who show these symptoms, only 1 in 100 have the disease.  There is a test, but it successfully detects the disease only 90% of the time.  The test can also fail by incorrectly indicating a “false positive” (i.e., test results show you have the disease when, in fact, you do not) 9% of the time.

How do you feel about them odds?

Since this case study is appearing on this blog, you are correct in thinking there is a trick.  In the real world, physicians are confronted with these types of odds all the time.  To make matters worse, the percentages are even murkier, for example, with overlapping or contradicting studies.  The neuroscientist Daniel Levitin, in his book ‘A Field Guide to Lies’, cites a study indicating that 90% of physicians make this error when interpreting the test.
What’s the error?  The odds are that about 9 in 10 positive results are actually false positives.  If the test shows that your patient has the disease, you are nine times more likely to be wrong than right.  This test is, therefore, useless, or worse than useless.
One thing I have learned is that keeping track of one or two numbers and the relationship between them is relatively easy.  When you have to deal with three numbers, even if the math is easy, things get hard really fast.  To do the above math in your head, you have to do a few things.
  1. Track the “patient does not have the disease” part of the equation.  Using the numbers from above, 99 of 100 people do not have the disease, and a 9% false-positive rate means about 9 of them will test positive anyway.
  2. Compare that to the “correct positive”: the 1 person who has the disease and gets a positive result.  (Strictly, 90% of one person; let’s round up and say it’s one person.)

Nine false positives to one correct positive.  Feeling lucky?
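The head-math above is just Bayes’ rule in disguise.  Here is a minimal sketch of the same calculation, using only the numbers from the post (1-in-100 prevalence, 90% detection rate, 9% false-positive rate); the variable names are mine.

```python
# Base-rate calculation for the diagnostic example above.
prevalence = 0.01       # P(disease): 1 in 100
sensitivity = 0.90      # P(positive | disease): test catches 90% of real cases
false_positive = 0.09   # P(positive | no disease): 9% false-positive rate

true_positives = prevalence * sensitivity            # 0.009 of the population
false_positives = (1 - prevalence) * false_positive  # 0.0891 of the population

# Probability the patient actually has the disease, given a positive test
p_disease_given_positive = true_positives / (true_positives + false_positives)
print(round(p_disease_given_positive, 3))  # 0.092 -- about 1 correct in 10 positives
```

The punchline falls right out of the arithmetic: roughly 9% of positive results are real, which is the “nine false positives to one correct positive” ratio above.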


Guessing how many jelly beans in a jar

How many jelly beans in a jar?  Or candy pieces?  Or odd shaped Lego pieces of differing size?  What if the container is oddly-shaped?  Or we have multiple jars?

I always enjoy a good estimation challenge and find it difficult to pass up a chance to “make a guess and win it all!”.  So what are some good strategies?

You can always spend a lot of time and try to count as many as you can see, then apply some math.  This works well if you are dealing with a well-defined shape, like jelly beans, and know something about the container (you can stuff about 900-950 “regular” sized jelly beans in a gallon jar).  What happens if you have oddly shaped containers, or you have some other objects mixed in?  What happens if you are dealing with an assortment of different candies and chocolates?  (Which, in my opinion, are better winnings.)
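The “count what you can see, then apply some math” approach boils down to container volume times a packing fraction, divided by the volume of one piece.  A minimal sketch, where the per-bean volume and packing fraction are my assumptions for illustration (the post’s only hard number is the 900-950 beans-per-gallon rule of thumb):

```python
# Count-by-volume sketch: jar volume * packing fraction / volume per bean.
# The bean volume (~2.6 mL) and packing fraction (~0.63, typical for
# roundish objects) are assumptions, not figures from the post.
GALLON_ML = 3785.0
bean_volume_ml = 2.6      # assumed volume of one "regular" jelly bean
packing_fraction = 0.63   # fraction of jar volume actually occupied by beans

estimate = GALLON_ML * packing_fraction / bean_volume_ml
print(round(estimate))  # 917 -- inside the post's 900-950 rule of thumb
```

The same skeleton works for odd candies: you only need a rough per-piece volume and a (lower) packing fraction for irregular mixes.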

The following are some strategies for increasing your odds.  I believe there are some applications to business and management, as estimating is something we do more often than we think.

1.  Do NOT look at others’ guesses before making a guess.  Doing so may inadvertently set an “anchor” for your guess.  In any kind of estimation, you want to avoid “bad” anchors.  Chances are, you are not dealing with pros anyway.

2.  Plan on spending a little bit of time.  Not “take out a pencil and calculate the definite integral of 3D space” time, but more than 5 seconds.  Oftentimes, we cheat ourselves out of using our information and instinct (both important factors in estimating) and give up too fast.  What I have seen people do is this: they try some method, like counting the whole pieces they can see.  Then they realize that the method does NOT lead to a good estimate.  “If I can count ~50 whole pieces, and about ~40 half pieces, how am I going to use that info?”  Then they give up.

An important lesson is this: sometimes we think we need very specific data (how many whole pieces can we count?), but when we actually have that data, we don’t know how to use it or what it really means.  Data is always important, but don’t be fooled into thinking that data = answer.

3a.  Having said that, do some counting.  Cut up the jar into sections, like in halves or thirds.  Take special care if the jar is tapered.  Just get to some manageable subpart, then see what you can count.  Do a gut-feel estimate based on how you think things are arranged inside.  Pick up the jar, feel the weight.

3b.  Then start over.  This is the mistake that many people make in estimating: they only do it once, and only using one method.  We bias ourselves into thinking whatever method we first start with is the superior method.  This is what clouds our judgment for stuff like this.  So, take another “fresh” estimate, maybe working from counting what you can see from the bottom.  Maybe give yourself a break between the two estimates.  Whatever you do, you MUST not bias your second estimate with your first.  Do NOT look at others’ guesses while you come up with this second estimate.

4.  Then come up with something between these two numbers.  Remember that number.  Then walk away.  That’s right.  Unless the time is running out, just walk away.  Come back later.

5.  When it’s time for the guessing to be finished, come back and THEN take a look at the others’ guesses.  Here’s the trick: tweak your guess by using the others’ guesses to be near the midpoint between two adjacent guesses.  For example, if you are thinking 1200, and someone has 1150 and another person has 1210, change your guess to be closer to 1150, such as 1180.
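The tweak in step 5 can be sketched as a tiny function: find the existing guesses just below and just above yours, and move to the midpoint of that gap.  The function name and the fallback behavior when your guess sits outside everyone else’s range are my choices, not the post’s.

```python
# Step 5 sketch: re-center your guess at the midpoint of the gap between
# the two adjacent existing guesses.
def tweak_guess(my_guess, other_guesses):
    others = sorted(other_guesses)
    below = max((g for g in others if g <= my_guess), default=None)
    above = min((g for g in others if g > my_guess), default=None)
    if below is None or above is None:
        return my_guess  # no gap to center in; keep your own estimate
    return (below + above) / 2

print(tweak_guess(1200, [1150, 1210, 980]))  # 1180.0, per the post's example
```

With the post’s numbers (you think 1200, neighbors have 1150 and 1210), this lands on 1180, which matches the example.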

6.  When they reveal the answer, make a note of whether you guessed too high or too low.  Then think about what influenced your bias.  Were you surprised at how heavy the jar was, and then guessed too high a number?  Did you actually think about how awful it would be to eat the whole jar, and then guessed too low a number?

7.  Look forward to the next guessing opportunity to get better at this.

It’s not a surprise to those who have read other posts on this site, but the two common themes are:
1. Try to unbias yourself as much as possible, or in reality, since you can’t completely unbias yourself, at least understand what biases are at play.
2. Refine, learn and get better.

Why Your Construction Project Estimates are Wrong

According to WSJ’s The Numbers Guy, most infrastructure projects end up with very large overruns.   Surprised?   Probably not.  This is one of those casual conversation topics where many of us can point to a project close to home that has “busted the budget”.   New Jersey’s rail tunnel to New York City, Boston’s Big Dig, the Sydney Opera House, your local sports stadium, the highway project on the other side of town… what you have been suspecting (that these big projects become more expensive) is correct, at least according to people who study these things.

So why is this the case?   Like many complex situations, there are several elements at play.  Estimating is always a tricky task.  Even experts cannot estimate as well as they think they should.  In fact, having expertise often affects your estimates in two (bad) ways.  According to Professors Magne Jørgensen and Dale Griffin, there is a link between a forward-looking perspective and irrational optimism.  You are an expert, you are asked to provide an estimate, you feel optimistic about the future, and give a favorable estimate.  According to Nassim Nicholas Taleb, author of The Black Swan, expertise gives you an unwarranted sense of certainty in situations where such certainty cannot exist.

There are other factors at play.  For example, the larger the project, the more likely that there will be more stakeholders and more ideas.  The public may want better aesthetics.  Public projects expand as other smaller projects are folded into the original project.  Also, the longer the project, the larger the risk of labor and material cost increases.  Some projects also end up paying for “externalities cost” that may not have been originally planned.

If there are other bids involved, the winning project was more likely to be the most optimistic.  This is the so-called winner’s curse.  Yippee, we won the project… And we’ve also convinced the public, officials (and ourselves!) that our estimates are correct.

Finally, there could be misrepresentation and outright lying.  The article acknowledges this but does not go into research findings.  I suspect, as the article implies, that there is very little systemic large-scale lying in the industry.  It’s most likely the other factors described above.

Big estimates are big numbers, and we do not deal well with big numbers, experts included.  We remember numbers incorrectly.  For example, we may remember that a project was supposed to be $100MM, but forget that with the additional approved budget, the project eventually was set at $126MM.  It’s easier to remember something like “$100MM”.   Also, we don’t like to deal with ranges and uncertainty, and this is exacerbated with the large numbers we are dealing with.

So, is it wrong to be optimistic?   Are we just lying to ourselves?  Is it possible to have expertise and accuracy?

Tainted Eggs and Sticky Accelerator Pedals

We’ve had hundreds of millions of eggs recalled in the last several weeks.  That’s more eggs than there are people in the United States.  According to CNN.com, there were 1,953 cases of Salmonella enteritidis reported in a 3-month period.  Salmonella hits you hard. It can leave you sick for a week with cramps, chills, headaches, vomiting, diarrhea.  And that’s if you are healthy.  The elderly, young or folks with weaker immunity can suffer much worse.

1,953 reported cases.  Even after the recall, there could be more cases since the symptoms can hit several days after consumption of tainted eggs.  That’s a lot of sick people.

Or is it?

Another recent recall involved Toyota vehicles and the problem of accelerator pedals.  Cars accelerated out of control.  People died.  There were multiple stories carried by the media in quick succession.  Police were interviewed.  Congressional hearings were held.  A company’s reputation was at stake.

Take a look at the graph published by the Wall Street Journal that shows the “daily number of complaints about vehicle speed and accelerator pedal control” and the dates of some key events.  I am not sure what the “normal average daily complaint rate” should be, but before the warning from Toyota in late September 2009, it seems like there were fewer than 10 complaints per day.  There’s a small spike after the September warning.  The complaints seem to show a temporary peak about 6 weeks after this.  In late November, Toyota announces a recall, accompanied by another spike in the days following.  Finally, in late January and early February 2010, there are calls to investigate the possibility of faulty electronics.  Around the time regulators officially expand the probe, the complaints spike, reaching a height of over 150 on a single day.

It’s difficult for Toyota to claim that either the drivers were becoming less careful or that the complaints were unjustified.  We have seen such PR blunders before from companies.  When a company makes such a mistake, no amount of science, facts, statistics or promises can fix the PR damage.

Back to the tainted eggs.  According to the CDC, from May to July, we would expect about 700 cases of Salmonella instead of 1,953.  Clearly, there is a spike associated with the eggs.  And it’s also likely that not all cases relating to the eggs have been reported.

What do you think?

Can recalls “cause” complaints?  Should companies (and organizations) revise the way recalls are done?  How should we use such statistics in setting the communications or policies regarding recalls?

How Much Oil is Leaking in the Gulf?

The oil continues to leak, and we are now starting to see the oil hit the shores.  We have live images from the leak 5,000 feet below the water surface.  But how much oil is actually leaking from the wellhead?

Initial estimates (Day 5) from the US Coast Guard and BP placed the leak at 1,000 barrels/day.  Last week, the “official” government group revised the estimate to 20,000 – 40,000 barrels/day.  Less than a week later, the estimate is now 35,000 – 60,000 barrels/day.

When the late May estimates came out, there was an interesting quote from Ira Leifer, University of California, Santa Barbara: “It would be irresponsible and unscientific to claim an upper bound, …it’s safe to say that the total amount is significantly larger.”  He wants to make sure the estimate has an asterisk because he wants “to stand up for academic integrity.”  In fact, there’s a whole document available from the university that explains how the scientists came up with their estimate.  (from WSJ article here)

But I suspect that most of us will not care too much for the actual method or the details.  Maybe a table like the one below is helpful, since it summarizes the “growth” of the estimates over time.  By doing so, I am (on purpose? inadvertently?) suggesting a story and a conclusion.  What do you read from it?

Report Date | Barrels/Day | Source / Reported by | Method | Link
April 24 (Day 5) | 1,000 | USCG, BP, thru cbc.ca | Info from ROVs (remote operating vehicles) and surface oil slick | Link
April 28 (Day 9) | 5,000 | NOAA | Satellite pictures | Link
May 12 (Day 23) | 70,000 | Steven Wereley, Purdue, for an NPR story | Particle image velocimetry, based on videotape | Link
May 12 (Day 23) | 20,000 – 100,000 | Eugene Chang, UC-Berkeley, for an NPR story | “Pencil and paper” based on pipe diameter (from video) | Link
May 27 (Day 34) | 12,000 – 19,000 or 12,000 – 25,000 (depending on source) | Flow Rate Technical Group (NOAA, USCG, Minerals Mgt) | Based on multiple methods (blog entry author’s guess) | Link 1, Link 2
Jun 10 (Day 52) | 20,000 – 40,000 or 25,000 – 30,000 (depending on source) | Plume Modeling Team of the Flow Rate Technical Group | Revised from earlier, based on additional video from BP | Link
Jun 15 (Day 57) | 35,000 – 60,000 | Deepwater Horizon Incident Joint Information Center, reported by CNN | “Based on updated information and scientific assessments” | Link

So what can we learn from this?  We all (think we) want lots of data.  It’s helpful when it’s summarized in a way that seems to make sense.  But when we are confronted with data that we are not used to seeing (how many of us deal in BARRELS of oil, or work with flow rates?) we need some anchor, some comparisons, something that helps us make sense of numbers.

No matter how you count it, this is a lot of oil.  But does it really matter whether it’s 15,000 or 60,000 barrels/day?  If you are part of the cleanup or doing planning for the collection, it may help you with the planning.  But you’re also going to want to know some other info, such as how long it will flow, how the flow has changed over time and the related “total leakage”.  Even with this last bit of info, you’re more interested in the amount that ends up on the shore or the amount that is actually possible to reclaim.
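To get a rough handle on “total leakage”, simple arithmetic gives bounds, if we assume (unrealistically) a constant flow rate over the whole period.  The rates span the estimates quoted in the post, and the 57-day window comes from the Jun 15 (Day 57) report; the constant-rate assumption is mine.

```python
# Rough bounds on total leakage under a constant-flow assumption.
days = 57                              # through the Jun 15 (Day 57) estimate
low_rate, high_rate = 15_000, 60_000   # barrels/day, spanning the estimates

print(days * low_rate)   # 855000 barrels at the low end
print(days * high_rate)  # 3420000 barrels at the high end
```

Even the crude low-end figure is getting close to a million barrels, which illustrates why the exact daily rate matters less to most of us than the order of magnitude.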

For most of us, the accuracy of the flow rates does not matter so much.  It’s a lot of oil, and we need some way to get a handle on it.  Most of us will not remember the actual number (or in this case, the changing range of numbers).

Besides, no one will really know the true amount that has spilled.

Are you good at estimating?

We all estimate. Whenever we say, “I’ll be home in about 30 minutes” or “I need about 50 inches of tape”, we are estimating. Some of us even estimate as part of our jobs. Project managers, sales reps, executives, coders… whether we estimate lines of code, weeks of effort, new customers, revenue and profit, we make educated guesses based on our experience, observations and other sources.

But how good are we at estimating?

Here is a little exercise. On a sheet of paper, write down 1 through 10 on the left side of the page. Next to each number, draw two blanks, so that you can provide two answers for each number. Like this:

1. _________ _________
2. _________ _________
3. _________ _________
4. _________ _________

and so on to “10”.

Your job is to provide a “90% certainty” estimate for the questions below. You don’t have to get the answer correct, just provide a range of numbers–write your “low estimate” on the first blank and your “high estimate” on the second blank on each line.

  1. What was the production cost of “Gone with the Wind”?
  2. How old was Alexander the Great when he died?
  3. Wikipedia lists Burj Khalifa in Dubai as the tallest building. How tall is it in feet (or meters)?
  4. If you walk at the average speed of 3 miles/hour, how long in months would it take to walk the distance of Earth’s equator?
  5. How many times can Earth fit inside Jupiter?
  6. How many people signed the US Declaration of Independence?
  7. How many countries are there in South America?
  8. In what year did the world’s population surpass 2 billion people?
  9. How many pairs of legs does a common house centipede (Scutigera coleoptrata) have?
  10. What is the “as the crow flies” distance (in miles or km) between Beijing, China and Amsterdam, Netherlands?

For answers and the second part of this post, see comments.  But don’t scroll down or click on link before you take the quiz!
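Once you have the answers, scoring is simple: a well-calibrated “90% certainty” estimator should capture the true answer in about 9 of the 10 ranges.  A minimal scoring sketch (the function name is mine, and the example ranges and answer values are made up for illustration, not the quiz’s real answers):

```python
# Calibration scoring sketch: count how many true answers fall inside
# your (low, high) ranges.  9 or 10 hits out of 10 means your "90%"
# intervals really are 90% intervals.
def calibration_score(ranges, answers):
    """ranges: list of (low, high) tuples; answers: the true values."""
    return sum(low <= ans <= high for (low, high), ans in zip(ranges, answers))

# hypothetical example with made-up ranges and made-up "answers"
print(calibration_score([(30, 40), (100, 200)], [35, 250]))  # 1 hit of 2
```

Most people score far fewer hits than 9, which is the point of the exercise: our intuitive ranges are usually much too narrow.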