The Black Art of Software Estimation

Monday, August 18, 2008

Well, we can chalk up my month-long hiatus from posting to that blackest of software arts, time estimation. In the understatement of the century, Wikipedia notes:

The ability to accurately estimate the time/cost taken for a project to come in to its successful conclusion is a serious problem for software engineers.

In order to deal with this problem, I've tried everything from function point analysis to tasseography...

Tasseography

...and although this won't be a popular statement, let me tell you what I think: none of it really works.

Not really.

No matter how careful we are, and regardless of what statistical methods we apply, we'll find that software suffers from a sort of Heisenberg Uncertainty principle. It's almost impossible to observe a piece of software or to accurately predict how long it will take it to travel a certain distance on the development timeline. The best we can do is get a ballpark figure, and rely on the fact that our programmers will make a valiant effort to manhandle the project workload into the project schedule.

The problem here is one of loose wiring, a term I first came across reading Caro's Book of Poker Tells.

You see, your poker opponents are volatile beings. They can be impressionable, irritable, playful, capricious, and more. You don't know when they're going to short out, cross-circuit, or do the silliest or the most brilliant things. This goes for all poker players, from the weakest beginners to the most seasoned pros. The deal is that even when opponents are playing a disciplined game of poker, so many of their decisions are borderline that what they're going to do is anybody's guess.

The idea is that given the exact same player, the exact same opponents, the exact same set of cards, the same amount of money, player actions will vary wildly just based on the current price of garbage in Moscow, or an increase in the butterfly population of Madrid. There's no telling what people will actually do, even if you present them with identical situations. Instead we have to contend with built-in unpredictability, a sort of inherent randomness to the world.

Chaos Theory, by Vicky Brago-Mitchell

This is true not only in poker, but in life, and especially in programming.

The same programmer, given the exact same workload and the same circumstances, will take anywhere from 50% to 150% as long to finish his work, based on nothing at all. Mood. Whim. Luck. The bad Chicken Marsala he ate last night. There's no way to predict this, no way to quantify it. The best you can do is resort to statistical methods to try to tame the beast. And even those don't work as well as we'd like, because the numbers we build are like houses of cards, prone to tumbling as soon as the assumptions behind them are proven to be hopelessly naive or, at the other end of the spectrum, ridiculously conservative.
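As a rough illustration of that 50%-to-150% spread, here is a minimal Python sketch. The function name `estimate_range` and its parameters are made up for this example; the multipliers are simply the two figures quoted above, not a calibrated model.

```python
def estimate_range(nominal_hours, low=0.5, high=1.5):
    """Return the (best case, worst case) span around a nominal estimate.

    The 0.5 and 1.5 multipliers are the 50% and 150% figures from the
    text above; they are illustrative, not measured.
    """
    return nominal_hours * low, nominal_hours * high


# A task nominally estimated at 24 hours.
best, worst = estimate_range(24)
print(f"{best:g} to {worst:g} hours")
```

Under these assumptions a nominal 24-hour estimate really means anything from 12 to 36 hours.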

So for all these reasons and more, I've been hammered for the past three weeks playing catch-up on a side project, and unable to post. Cue the violins.

That project is finally concluded and for those of you who've asked whether the botting series will continue, the answer is absolutely. I hope you'll stay tuned as we continue discussing the mechanical aspects of poker botting, and as we start looking at the poker strategy and A.I. side of things more closely. We'll also touch on some other topics that I think you'll find quite interesting, if not shocking.

Expect the next post within 24 hours, give or take.

(But as I say that it occurs to me that estimating future publication dates is a bit of a black art, too, and even more difficult, in its own way, than estimating software completion times.)

Luckily this content is already written.

(But as I say that it occurs to me that content, like software, is never really complete. It can always be improved, honed, refactored.)

In the words of a poet:

What a tangled web of unrealistic schedules and intractable time-based dependencies we create, when first we start to estimate...

So, expect the next post within 12 to 36 hours, by the 50/150 rule stated above, and in the meantime: have you ever come across a software time-estimation method that actually works?

How good (or bad) are your software estimation skills?

Are they as bad as mine? As in need of constant oversight and refactoring? Or are you that rarest of creatures, the Bigfoot of Software Engineering...

Bigfoot

A time estimation virtuoso?

Tags: software estimation, Law of Loose Wiring

37 comment(s)

I for one definitely fall into the non-virtuoso category. Moving to Agile has helped some, but somehow I always seem to find myself staring at a long list of hour estimates and realizing that none of them really mean anything, the numbers are fudged beyond recognition, and that the only way software schedules are ever met is by somehow cramming the work to be done in however many hours they give me.

Thanks James, looking forward to the next botting post. So far as estimation goes, I'm a virtuoso. So much of a virtuoso that I once took 6 months to complete a project I initially scoped out at 3 weeks.

Bigfoot... lol. Actually I followed that Bigfoot link and then did a Google search and those guys are actually claiming to have a frozen Bigfoot carcass.. too funny. There was even something on CNN (online) about it.

it's true there DO seem to be people who are uncannily good at estimating. maybe it's because they're just accurate, maybe it's because they over-estimate, but i think the main thing is that programmers can turn on the afterburners when they need to. maybe the real question is not 'how good at estimating?' but 'how good at developing software quickly?'.

i might be a terrible estimator but if i can code like a bat out of hell (i cant, but if i could) it doesn't matter, i'll just code what i need to meet whatever unrealistic deadline...

I'm horrible at estimating time as well. Don't feel like you have to explain anything to us - we understand! I am ecstatic that you're back though!

I'm not a time estimation virtuoso, but my time estimates have become much more accurate than they used to be. The first step toward making better time estimates is to observe your personal sources of variance, and try to whittle away at them. Here are some of the sources of variance I've found and how I've worked on them.

Design. A lot of variance crept in when the design in my head overlooked several key components of the project. Abstracting out components is important to make a comprehensible design, but it's easy to abstract out so much complexity that estimates of the implementation complexity are junk. I try to flesh out my designs as much as possible on paper now. I find this makes for fewer errors and oversights in the design, saving me a lot of time in the long run and dramatically decreasing my variance.

Debugging. Sometimes I would get stuck while debugging, possibly wasting several days to figure out a single bug. In these long debugging sessions, I would conjecture a hypothesis as to what might be causing the symptoms of the bug, stare at the related code, see something that looked wrong and fix it, and discover that the bug was still there. Sometimes I was actually fixing some other bug. Other times I was staring at the code a little too hard and introducing a new bug. Eventually I might actually fix the bug that I was trying to fix. More recently, when I find myself applying this whack-a-mole approach to debugging, I step back and look for a systematic way to do a binary search for the cause of the bug. Typically, I write some kind of verifier() function that will check whether something has gone wrong. Even if the verifier() function is moderately complex and requires its own data structures, writing it is still faster than banging my head against the wall repeatedly. Then I insert the function into the code in a couple of places to narrow down the cause of the error. Rinse, wash, repeat, until I have the exact line of the problem.
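The verifier-plus-binary-search idea can be sketched roughly as follows in Python. This is only an illustration under stated assumptions, not the commenter's actual code: `first_bad_step`, `deposit`, and the toy "balance" state are all hypothetical, and the search assumes the state is valid before any step runs, invalid after all of them, and stays invalid once corrupted.

```python
def first_bad_step(steps, state, verifier):
    """Return the index of the first step after which verifier(state) fails.

    Assumes: the starting state passes the verifier, the final state
    fails it, and a corrupted state never becomes valid again.
    """
    def state_after(n):
        # Replay the first n steps on a fresh (shallow) copy of the state.
        current = dict(state)
        for step in steps[:n]:
            step(current)
        return current

    lo, hi = 0, len(steps)  # valid after lo steps, invalid after hi steps
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if verifier(state_after(mid)):
            lo = mid
        else:
            hi = mid
    return hi - 1


def deposit(amount):
    """Build a step that adds amount (possibly negative) to the balance."""
    def step(s):
        s["balance"] += amount
    return step


# Hypothetical program: the third step (index 2) is the buggy one.
steps = [deposit(50), deposit(-30), deposit(-500), deposit(10)]
verifier = lambda s: s["balance"] >= 0  # the invariant being checked

print(first_bad_step(steps, {"balance": 100}, verifier))
```

Each round halves the suspect range, so the number of verifier calls grows with the logarithm of the number of steps, which is the payoff over guessing at causes one by one.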

Feasibility. This one I'm still working on. Sometimes I try to tackle a project that I believe is feasible, but actually isn't (or isn't feasible with the overall architecture I've selected). This can be a huge timesink as I try to force the architecture to solve the problem it was designed for, when really it's just never going to work well. I think that I need to start integrating a "feasibility analysis" step where I throw together a simplified version of the project and see whether it obliterates the problem or not. If it doesn't obliterate the problem, it's probably not worth following through with a complete version.

I was once told that the proper way to estimate software time values is to take your best shot, then double it and move it up to the next time level.

So - 2 hours becomes 4 days, 1 hour becomes 2 days etc.

This works as well as anything.
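For anyone who wants that rule spelled out, a tongue-in-cheek Python sketch follows. The function name `padded_estimate` and the ladder of units are assumptions made for this example; the comment above only says to double the number and move up one time level.

```python
# Assumed ladder of time units, smallest to largest.
UNITS = ["minutes", "hours", "days", "weeks", "months"]


def padded_estimate(amount, unit):
    """Double the amount and move it up to the next larger unit.

    The largest unit has nowhere to go, so it stays where it is.
    """
    i = UNITS.index(unit)
    next_unit = UNITS[min(i + 1, len(UNITS) - 1)]
    return amount * 2, next_unit


print(padded_estimate(2, "hours"))  # (4, 'days')
print(padded_estimate(1, "hours"))  # (2, 'days')
```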

i worked in a code shop where when we started on a new R&D project, the estimating model was to take the hours given by the programmer and multiply by four.

as the project slipped, due to all the reasons like function creep, bugs, etc, we moved to multiplying by eight.

they never really got to multiply again, b/c after 3+ years, they folded up this R&D site and moved on with the pieces that were salvageable!

Barry Greenstein's piece on chaos theory in Ace On The River is pretty spot on for coders too.

glad to see you're back!

It depends on the project... big, small? And it also depends on what the project is all about. If you have done 10 projects like this before, the estimate will be easier to make. The best estimates I've made came when I detailed all the functions of a project and then estimated each function separately. I try to make each function at least 1 day long, even if I think that it will take 1 hour. Also, depending on the business model / contract that you have, "buffing" an estimate might be good or bad... maybe that's another thing to consider in all the factors?
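The per-function approach with a one-day floor might look something like this in Python. It is a minimal sketch: `bottom_up_estimate` and the feature names are hypothetical, and the one-day minimum is the only number taken from the comment above.

```python
def bottom_up_estimate(function_estimates_days, floor_days=1.0):
    """Sum per-function estimates, counting each as at least floor_days."""
    return sum(max(floor_days, days) for days in function_estimates_days.values())


# Hypothetical breakdown, in days. 0.125 days is roughly one hour of an 8-hour day.
estimates = {"login": 0.125, "search": 2.0, "export": 0.5}

print(bottom_up_estimate(estimates))  # 1 + 2 + 1 = 4.0 days
```

The floor acts as built-in padding: the small tasks are the ones most likely to hide surprises, so they are never allowed to shrink the total.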

Oh, question - who the heck did that painting, and where can I get a wallpaper version of it, please?

I'm perfect at time estimation, because I do all original development. When the time allotted has run out, I call the module complete and let the maintenance engineers worry about the rest. :-D

Thanks for what you've given so far, and for the motivation. The first seven steps really do seem like enough, but I admit I am anxious to see the eighth.

Oh I forgot to mention, I am very glad to see you back!

I like the Scotty from Star Trek approach. Always multiply your estimate by 4x and gain a reputation as a miracle worker.

The reason why developers always underestimate is that they just think about the coding bit of a project. There's an awful lot more to completing a project than just coding it. You've got to design it, code it, test it, debug it and finally tell people how to use it (documentation? why do they need that?!).

The net result is that you've thought about the coding effort, which if all things are equal is 1/5 of the total. Of course, all things are never equal - if you skimp on one stage, you have to expand a later one: inadequate design results in code being reworked, retested, and quality problems, which consumes more time than you saved initially.

Add to this the fact that developers (me included!) like to use new technologies which they don't yet fully understand (otherwise they wouldn't be new), so if they aren't careful they have to spend time learning the technology first. How can you accurately estimate the time it will take you to learn something you don't know?

This is no simple project. I've struggled with design and redesign, and it being my first major project ever, I've run into a lot of pitfalls that David Grooves mentioned.

As for documentation, I think you can over-document too. Sometimes I have a problem, so I try to work it out by documenting it with UML, and I haven't solved the problem at all, I've just restated it.

I think models that capture the bigger picture are the most helpful. The one in part #3 is invaluable.

I'm going to point out the white elephant in the room. "How I Built a Working..." Does it win?

So far I have been highly accurate by telling people it will be longer than they think, and way longer than they want.

I'm reminded of the good accountant joke:

Customer: What's 2 + 2?

Accountant 1: 5. (He's bad at math)

Accountant 2: 4. (No good either)

Accountant 3: What do you want it to be? (Someone good at math and has imagination, YES!)

HAHAHA! Yes, I love this project so far... even though it's been difficult and possibly currently beyond my feasibility range. It tickles the math/imagination side so much.

I can't help but feel like one of the Joker's minions from The Dark Knight, hehe. This whole series has a delightful anarchist's bouquet to it.

People will hate us for it, but AI is the future. AI programmers will be stealing people's jobs in the public's eye. What they don't realize is that they now will finally have time to write that all American novel, or go back to school and learn a new field, or ... but this coming tech revolution is going to be very difficult.

or watch Spongebob, lol

J.D. for President!!!

I read this blog, then i went to the poker bot erector set comments and saw this:

">Whens the new part coming?

Apologies for the delay guys. The post should be out later today. It's a handful of new posts, not just one. We'll be at part 8 or 9 in the series within 24 hours.. knock on wood.

James Devlin on 7/16/2008 7:31:02 AM (34 days ago)"

Blog time estimation obviously suffers from the same problems as software time estimation, and more :P.

thanks for continuing work on this. i have been checking this every day for updates, like many others.

keep it up!

Great article again James. Good to have you back after this month long hiatus. And even better that you are going to carry on the Poker bot series. I didn't really have much of an interest in programming but your constant incorporation with poker keeps me stuck to this site.

Keep up the good work!

Seriously, no need to apologize! :-)

Here is Bigfoot. Let's get on with Botting!

http://upload.wikimedia.org/wikipedia/commons/4/4a/ShravanbelgolaGomateshvarafeet2.jpg

Jeff Atwood's post is also worth reading:

http://www.codinghorror.com/blog/archives/001161.html

In my experience all formulaic time estimates are likely to be wrong. I agree with the posters who said you have to analyze the work effort very carefully. I find that most of my time estimates are off because I've eyeballed the task rather than performing due diligence on it. I find that when I really design things out in a detailed way, my estimates improve noticeably.

I think laziness has a lot to do with it.

My rss says you have a new post but I can't seem to read it. What's happened man!

Time estimation tends to converge as the end of the project comes.

I like art!

Coding the Wheel has appeared on the New York Times' Freakonomics blog, Jeff Atwood's Coding Horror, and the front pages of Reddit, Slashdot, and Digg.
