Knowing Your Average Is Powerful!
What is an average?
Yes, you know to take a list of numbers, add them up, and then divide by the total count of the numbers in the list.
But what does it mean to your software project, especially when your project management tools do it for you?
Let’s say that it takes ten days on average to fix a software defect.
Today, software testing reported 25 defects.
If “on average” it takes ten days to fix a defect, how many defects will probably be fixed ten days from now?
A. All of them because 10 days have passed and the typical time to fix a defect is 10 days
B. Half of them because 10 days have passed and an average is generally just a mid point
C. You can’t know because every defect is different as some are hard and some are easy
D. You are wasting my time with guessing games when I have 25 defects to fix
E. It doesn’t matter because they all need to be fixed by Friday since that is what we told the customer
F. Impossible to know as you didn’t tell me the standard deviation nor tested that the fix rate distribution is normal
I’ve heard all these answers during real projects, so let’s look at each of them.
A. All of them because 10 days have passed and the typical time to fix a defect is 10 days. I’m not kidding, an awful lot of people will say this. It is usually in the form of saying something like “it takes us 3 days to resolve a defect, so we should have them all resolved by Friday.” If you then ask the person what an average means, you see the cogs spin in their head and then they say “well, ok, it is an average, but we still think everything will be done by then ….”
B. Half of them because 10 days have passed and an average is generally just a mid point. This is a pretty good answer. The key is to think “half” when talking about an average. However, what you are really asking is “when will ALL these defects be fixed?” This is the question you and more senior management will ask all the time. This is the question you will work hard trying to answer. This is why you will collect data on how long things take to fix. I’ll address how to answer the “all fixed” question below.
C. You can’t know because every defect is different as some are hard and some are easy. This is a good answer when someone asks you how long it will take to fix a particular defect which was just reported. However, if you have a dozen or more defects and someone is asking how long for all of them then you can indeed have a good answer. More on this below.
D. You are wasting my time with guessing games when I have 25 defects to fix! Too many managers resort to this reply for what is really an important question – one that you should be able to answer. Yes, even when you don’t know anything yet about those 25 new defects, you should be able to give a rock solid answer to this question – an answer that will be accurate the vast majority of the time. No, it does not mean making a very large guesstimate. You want to provide an estimate that will most probably be the case. Yes, you can provide that estimate. We’ll talk about that below.
E. It doesn’t matter because they all need to be fixed by Friday since that is what we told the customer. Here the problem is that you, as a project manager, have not educated your account manager about how long things take and what they should be telling the customer. Educating your customer-facing team is also one of the biggest challenges. The chances are that they’ve been telling their customer roughly the same thing for years about how fast you can respond. When you do educate the account team on how long things take, the real number may well be twice what they have normally been telling the customer. So instead of “by the end of the week” it becomes standard to say “by the end of next week.” Once you start to deliver on those promises, your account team will come to trust you. Then the customer will come to trust the account team and may, just may, no longer demand everything be fixed by tomorrow.
F. Impossible to know as you didn’t tell me the standard deviation nor tested that the fix rate distribution is normal. Too many Six Sigma Black Belt wannabes reply with this kind of answer. The key for you as a project manager is to track and understand how well your teams perform critical tasks. By “living” with the numbers every day, you will understand them better than anyone trained in any level of statistics who does not have this experience. This gut-level feel for the numbers is what you are ultimately trying to achieve by knowing your performance. Note that if you are not collecting and tracking this kind of performance data, you will probably not learn these numbers just by working with the process every day. In one company, our best development managers would routinely say something like “it takes us about 3 days to resolve a defect,” but when the data was compiled, the average was closer to 10 days. Many defects were indeed resolved in just 3 days (the time-to-repair curve was skewed), but both the median (the middle point) and the average came out 2 to 3 times longer. To really know, you have to measure and track routinely over the life of the project.
So, what do you do?
If you have a critical process that happens regularly, you want to measure that process. I recommend you do it yourself to start, so you understand the numbers. Some examples:
1. If you are regularly fixing defects, keep track of how long it takes to fix them. Track them from when they get reported to when they get fixed in a product build. As the project manager, you want to know the end-to-end time it is taking. The development team might only want to count how long it takes them to fix the defect once they finally receive it or once they finally understand the problem.
2. If your test team is running test cases, keep track of how long it takes to run the entire test suite. Once again, they might want to exclude test cases that didn’t work right or times when the testing environment had problems. You, as the project manager, want to keep track of how long it is currently taking, including all the typical things that go wrong. You always want to be able to answer the question: how long will this process take on average?
Anything that you repeat over and over is a candidate for this kind of tracking and measuring. Focus on the critical processes that happen regularly. Documentation updates, for example, may not be on the critical path. Be able to answer the question “how long is this going to take?”
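As a rough sketch of that tracking, here is how the end-to-end numbers could be computed from a pair of dates per defect. The dates below are made up purely for illustration, and the name `repair_days` is my own; any spreadsheet can do the same thing.

```python
from datetime import date
from statistics import mean, median

# Hypothetical sample data: (date reported, date fixed in a build) per defect.
defects = [
    (date(2009, 6, 1), date(2009, 6, 3)),
    (date(2009, 6, 1), date(2009, 6, 4)),
    (date(2009, 6, 2), date(2009, 6, 5)),
    (date(2009, 6, 3), date(2009, 6, 10)),
    (date(2009, 6, 4), date(2009, 6, 29)),
]


def repair_days(records):
    """End-to-end calendar days from report to fix, one value per defect."""
    return [(fixed - reported).days for reported, fixed in records]


days = repair_days(defects)
print("days to repair:", days)
print("mean:", mean(days))
print("median:", median(days))
```

Even in this tiny invented sample, most defects are fixed in two or three days, yet one slow defect pulls the average well above that. That is the same gap between gut feel and measured data described above.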
You know you have good estimates when half the folks ask “why is it now taking so long?” and the other half say “it is about time we started using realistic estimates!”
OK, but the question “how long will it take” is more often phrased as “how long will it take to fix all the problems?”
Let’s take a real world example.
At one company, the average time to resolve a defect was about seven days.
What you want to know, as I mentioned above, is generally how long it is going to take to fix the current bumper crop of defects that just got reported.
It turns out that the standard deviation for these defects was also seven days. So the average of 7 plus 2 times the standard deviation of 7 gives 7 + 2*7 = 21 days. (For the statistically minded, yes, there are some problems here, but the results are close enough, much closer than what had been used, and this is quick and easy to calculate in Excel, Google Docs, OpenOffice Calc, etc.)
So when the question comes up of “when will all these defects get fixed,” a good answer is 21 days. Using our 25 defects that got reported, this works out to a statement that we expect all but one or two of the hardest defects to be fixed by the end of 3 weeks. Yes, one or two will probably be left over. This is because adding two standard deviations to the mean tells us when about 95% of the defects will get resolved (.95 * 25 = 23.75 defects resolved).
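The arithmetic is small enough to do in your head, but here it is as a minimal sketch using the numbers from this example. The function name `all_fixed_estimate` is just an illustrative label, not part of any tool.

```python
mean_days = 7      # average time to resolve a defect, from the example
std_dev_days = 7   # standard deviation, from the example
reported = 25      # defects reported today


def all_fixed_estimate(mean, std_dev):
    """Rule of thumb from the text: mean plus two standard deviations."""
    return mean + 2 * std_dev


estimate = all_fixed_estimate(mean_days, std_dev_days)
expected_fixed = 0.95 * reported          # about 95% resolved by the estimate
left_over = reported - expected_fixed     # the "one or two" hardest defects

print(f"'just about all' fixed by day {estimate}")
print(f"expected fixed: {expected_fixed:.2f} of {reported}")
print(f"expected still open: {left_over:.2f}")
```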
Once you know the average and the 95th percentile, you have a project management tool that puts you in control and allows you to speak in an authoritative manner. You will know how your project is doing and how things will turn out in the near term. You will be much more accurate than those folks who are still estimating without the benefit of knowing the numbers.
Comments
9 Responses to “Knowing Your Average Is Powerful!”
I understand the math, but the conclusion makes no sense to me. If the average (mean) time to complete one task is 7 days, and the standard deviation says 95% of tasks can be completed in 21 days, then my conclusions differ from yours:
1) One task can be completed with 95% probability in 21 days.
2) 25 tasks can be completed with very high probability in 25*7=175 days.
3) If you have 5 teams working in parallel, 25 tasks can be completed on average in 5*7=35 days. The standard deviation suggests there’s a good possibility it might take 50 days or whatever, but I don’t know how to calculate this.
How do you figure 21?
Alan,
Possibly the best way to see this is to graph your defects into a how-many-days-to-repair vertical bar chart: (number of defects y-axis, days to fix x-axis):
|
| *
| ***
| *** **
| ******* *
|*********** * *
-------------------------
You will generally see a rise, peak, and fall of defect repair times, with the peak sitting toward the left and a tail stretching out to the right. So the fastest defects will be fixed in a day, the majority of the defects will show up being repaired in 5 to 7 days (as I said, the peak sat a bit to the left in the data we experienced), not as many will be repaired up through 14 days, and the rest tail off through 21 days.
What this says is that actual history shows 95% of all defects being repaired in 21 days or under. A few, about 5%, will take longer (see the note below in the P.S. on computing these numbers).
So you want to use your real data. Chart it and see what percentage of defects is falling in what interval of days. Chart it and count it. Don’t worry about the math or the notion of probability. These often only serve to get in the way of seeing what is really happening (I talk about this in the paragraph on “diet dilemma” – http://pmtoolsthatwork.com/get-schedule-right/).
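If a spreadsheet is not handy, the charting and counting can be sketched in a few lines. The repair times below are invented for illustration only (they are not our project data), and `weekly_buckets` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical repair times in days, one entry per defect (made up for illustration).
repair_days = [1, 2, 3, 4, 5, 5, 6, 6, 6, 7, 7, 7, 8, 9, 10, 12, 14, 17, 21, 30]


def weekly_buckets(days):
    """Count defects per 7-day interval: bucket 0 is days 1-7, bucket 1 is days 8-14, ..."""
    return Counter((d - 1) // 7 for d in days)


buckets = weekly_buckets(repair_days)
total = len(repair_days)
running = 0
for bucket in sorted(buckets):
    count = buckets[bucket]
    running += count
    first, last = bucket * 7 + 1, bucket * 7 + 7
    # One star per defect, plus the cumulative share fixed by the end of the interval.
    print(f"days {first:2d}-{last:2d} | {'*' * count:<12} {running / total:4.0%} fixed so far")
```

The running percentage is the number to watch: the interval where it crosses roughly 95% is your “just about all fixed” point.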
Send me a pointer to a spreadsheet of your defect repair times and I’ll show you what I mean. All I need is a pair of dates for each defect: date reported and date fixed. Dates are best because repair capacity and performance will often vary over time (which is also useful to see). Often lumping in all defects found and fixed over, for example, six months will not be as useful as showing the defect repair rates for each month over the last six months.
Bruce
P.S. I use mean + 2 X standard deviation to find the point where about 95% of all defects are finally fixed. My Six Sigma Black Belt buddies cringe at this and say “but defect repair times are skewed and not normally distributed!” True, and so I chart out the times and count the actual number of days that account for 95% of all the defects being repaired. Somewhat humorously and to the chagrin of the BBs it turns out in our case that the mean plus 2 sigma (std dev) falls nicely around 90-95%. This was true for our data for the almost 10 years we did this. So often instead of saying 95% I’ll say something like mean + 2 X std dev tells us when “just about all” or the “vast majority” of defects will be repaired.
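Here is a minimal sketch of that cross-check, again on made-up repair times rather than our real data. I am assuming the sample standard deviation (what a spreadsheet STDEV function gives); `share_fixed_within` and `days_to_cover` are hypothetical helper names.

```python
from statistics import mean, stdev

# Hypothetical repair times in days, one entry per defect (made up for illustration).
repair_days = [1, 2, 3, 4, 5, 5, 6, 6, 6, 7, 7, 7, 8, 9, 10, 12, 14, 17, 21, 30]


def share_fixed_within(days, limit):
    """Fraction of defects whose repair time was at or under `limit` days."""
    return sum(1 for d in days if d <= limit) / len(days)


def days_to_cover(days, percent):
    """Smallest observed repair time by which at least `percent`% of defects were fixed."""
    ordered = sorted(days)
    needed = (percent * len(ordered) + 99) // 100  # integer ceiling of percent * n / 100
    return ordered[needed - 1]


limit = mean(repair_days) + 2 * stdev(repair_days)  # the mean + 2 X std dev shortcut
print(f"mean + 2 X std dev = {limit:.1f} days")
print(f"share actually fixed within that limit: {share_fixed_within(repair_days, limit):.0%}")
print(f"counted directly, 95% were fixed within {days_to_cover(repair_days, 95)} days")
```

If the shortcut and the direct count land close together on your own data, as they did on ours, the shortcut is good enough for day-to-day use. If they do not, trust the count.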
I don’t have a spreadsheet of times, but your site is inspiring me to build one. If you want to illustrate, maybe you can give me your data as an example.
Meanwhile, I think you’re missing my point (or I am misunderstanding yours). You seem to be saying that if I have 1000 defects, I can complete 950 in 21 days and only have 50 left to repair. This is absurd, so you must be saying something else.
Alan,
“I have 1000 defects” is, I believe, the sticking point.
Did you get all of those today? Probably not. Are they all currently being worked on and are at various stages of being repaired? Hopefully!
At a Fortune 50 company where I used this technique, it was not unusual at peak testing to get 2000 defects reported over the span of a *month*. Two thirds of those would ultimately be known errors (duplicates) or configuration/test issues. That would leave about 660+ as real issues that needed code changes.
Time to resolve an issue (i.e., have a working solution) during these peak times would average 7 days. It would take 7 more days on average to get this fix integrated, tested and in an official product build. So this was 14 days on average to a complete repair (which is what I think you are referring to).
Ninety-five percent of the defects achieved completed repair within 28 days. This is taken from the actual repair history. This is *not* an estimate or probability. It is a statement of the characteristic of the total population of defects repaired for the period.
The fact we will use this number to anticipate the near term repair rate makes it an estimate but only for the defects that have not yet achieved repair completion. From my experience it can be a *very* good estimate.
This is 28 days from the *day* the defects *arrived*. If 60 came in today, then 20 would end up being things we needed to fix. Four weeks from today about 19 would have complete fixes and about one would still be outstanding.
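The same reasoning as a small sketch, using the rates from our history: about one in three reports needing a code change, and 95% of those fully repaired within 28 days. The function name `forecast` and its parameters are illustrative assumptions; plug in the rates from your own data.

```python
def forecast(arrived, real_one_in=3, fixed_percent=95):
    """Split today's arrivals into real issues, repairs completed by the cutoff, and leftovers."""
    real = round(arrived / real_one_in)          # the rest are duplicates or config/test issues
    fixed = round(real * fixed_percent / 100)    # completed within the measured cutoff (28 days)
    return real, fixed, real - fixed


real, fixed, outstanding = forecast(60)
print(f"{real} real issues, about {fixed} fixed in four weeks, about {outstanding} outstanding")
```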
So “1000 defects”, in an ongoing development environment, means 1000 defects reported over time and in various states of being repaired: from new today to having their fix completed today. We never got 1000 defects in a single day or even a week.
What you want to do is chart the numbers for your environment and see what your rate is. What someone else can do or is doing is not too relevant except as an example (we had big products containing up to a million lines of code, and upwards of 450 software engineers working per month).
I’m thinking of doing an e-book/white paper on the technique with complete examples that folks could download from this site.
Bruce
Okay so this is the source of misunderstanding.
You are talking about many people working in parallel, and I’m talking about one team where defect fixing is one-at-a-time.
If we have a set of defects that take 30 days to finish, the question of how many are fixed after 10 days or 25 is a function of how we order them. If we do the hard defects first, everything else moves back, and your measure would suggest that the mean time to completion is 20 days. But if we do the easy ones first (and there are more of those), mean time to completion is 10 days.
If one defect came today, it would take 1 day to fix (on average). If 2 came today, one would be fixed today, and one tomorrow. If 10 came today, half would be fixed at the end of the week. But if there are more difficult problems, one tough bug might push back the entire schedule.
So it seems to me your projections simply don’t apply to my context.
Alan,
Yes, this technique especially helps to manage efforts where many engineers are repairing defects in parallel.
However, I’ve used it successfully in a small team of fewer than 10 engineers. We were still repairing defects in parallel and then integrating them together. Typically a release would have around 20 fixes in it.
I’ve never seen anyone not finally go “wow” after collecting the data and seeing how long repairs were taking. That includes folks who insisted they really did know how long things take and folks who said they couldn’t see how this could be useful information. The second biggest reaction is “no way, it can’t be taking that long!” This second reaction usually comes from teams that have a history of delivering software late.
Good luck and thanks.
Bruce