Wednesday, 16 October 2019

How to choose a useful measure of incremental progress for your team

Recently I had an interesting call with a senior QA leader. He reached out to me because he wanted to get a better sense of how his people are doing, both as a functional unit and individually. Primarily, I suspect he wanted to be proactive, and have some kind of a numerical early warning system in place, which he could cross-reference with common sense and qualitative input he got elsewhere.

As we spoke, he kept using the term "velocity"; however, it became clear that he meant velocity in a much looser sense than the typical iterative scrum/agile sense. Velocity in the strict sense doesn't really work for what he wanted to achieve.

Here's what I mean:

Core metrics to baseline progress iteratively

What is velocity anyway?

Velocity itself is first and foremost a team output metric, not an individual one. It is a measure of story points completed over a unit of elapsed time.

It gives visibility on whether the product development team is functioning effectively--as a system for generating new features. In this context, new features are what the customer is expected to value most, so we track only that. It is not an efficiency measure, and shouldn't be confused for one. Traditionally this approach came from a software development environment, but can be applied anywhere there is significant complexity and thought required. Knowledge work.

These story points are the primary "raw material" to generate estimates relative to a goal or target date. Once you have a sense of:

  • who you're building it for and why
  • what you want to build, i.e. the actual stories defined
  • and you have estimated the stories using story points

then the dance around the iron triangle begins.

When the product or project work starts, you keep track of how many story points are completed over time. You use this to improve future planning. Usually this works in "sprints", which are predetermined lengths of time, as a way to plan and track progress. For example, in the popular flavor of agile called scrum, these will typically last 1-4 weeks.

Realized velocity

Let's use 2 weeks as an example. The newly formed team has started working on a new product or project. The backlog of items is defined and estimated for the absolute "must have" features.

At this point, if you're being completely transparent, you don't know how fast the team will actually go. You can also negotiate what exactly is "must have" to help reduce the time required (less work, done faster). And ideally you'll also all agree on a quality standard that everyone is ok with--which will also have schedule implications (higher bar takes more time per feature on average). So your initial realized velocity/sprint is 0, and you have a guess as to what the expected velocity will be.

You agree (with the team) which stories will be accomplished in the first sprint. And after 2 weeks, you sit down with the team, and compare what actually happened with what you'd hoped would happen. At this early stage, there are likely to be a lot of learning outcomes in general, as it's a new effort. But among other things, you can add up the story points completed by the team. This is your first realized velocity.

Expected velocity

After 3 sprints, you should start to see some kind of a trend emerge in terms of an average velocity. Sometimes it's worth giving the team the benefit of the doubt, as they might pick up the pace once they get their collective heads around what needs to be done.

Usually this number will be significantly different than your expected velocity for the dates you'd like to hit. To see the gap, calculate the total story points needed for the "must have" initial release, and divide it by the realized velocity so far. To simplify the thought process, assume the velocity will stay fixed.

This gives you a sense of how many sprints of work will be needed to hit that final date. Usually, there will be a gap between what's happening vs. what's expected. It's best to know this as early as possible. In fact, this transparency is one of agile's strengths. It's difficult to sugarcoat reality, if you see what is being delivered. Moreover, you also see how many initially estimated story points of cognitive effort were realized.
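Here's a minimal sketch of that arithmetic in Python. The sprint totals and the size of the "must have" backlog are made-up numbers, and the function names are just for illustration. It assumes, as above, that velocity stays fixed.

```python
import math

def realized_velocity(completed_per_sprint):
    """Average story points completed per sprint so far."""
    return sum(completed_per_sprint) / len(completed_per_sprint)

def sprints_needed(total_points, velocity):
    """Sprints required for the whole release, assuming velocity stays fixed."""
    # Round up: a partially used sprint still takes up a whole sprint.
    return math.ceil(total_points / velocity)

# Hypothetical story points completed in the first three sprints.
completed = [18, 22, 20]

# Hypothetical total story points for the "must have" initial release.
must_have_points = 240

velocity = realized_velocity(completed)
total_sprints = sprints_needed(must_have_points, velocity)

print(f"realized velocity: {velocity:.1f} points per sprint")
print(f"sprints needed for the release: {total_sprints}")
```

With 2 week sprints, you'd multiply the number of sprints by 2 to get a rough number of weeks, and compare that to the date you'd like to hit.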

Warning: This type of analysis can cause some healthy consternation and discussion. This is intended. Using this performance data, you can re-prioritize, change resourcing levels, change scope, or whatever else you think might help the team at that stage.

Expected velocity is the ideal pace you'd like to keep, in order to hit your business goals. Often, in more traditional environments, this will be expressed in terms of a target release date. But it can also be in other forms, depending on what's actually important to the business as a whole.

The core difference between realized and expected velocities is their time orientation. The former measures the velocity trend in the recent past. The latter is more of a business requirement, translated into a number. Expected velocity is a practical way to "have a relationship with your target date". This is a metric which translates longer term expectations into an early warning system checked regularly. When compared to your realized velocity, you'll know whether or not your teams are going too slow to hit your dates.
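One way to turn a target date into a number is sketched below. Note the assumption: I'm treating expected velocity as the remaining story points divided by the sprints left before the target date, which is just one reasonable reading of "the ideal pace you'd like to keep". All figures and names are hypothetical.

```python
def expected_velocity(remaining_points, sprints_until_target):
    """Pace needed per sprint to finish the remaining work by the target date."""
    return remaining_points / sprints_until_target

def velocity_gap(realized, expected):
    """Negative means the team is going slower than the target date requires."""
    return realized - expected

# Hypothetical figures: 180 points left, 6 sprints until the target date,
# and a realized velocity of 20 points per sprint so far.
remaining_points = 180
sprints_until_target = 6
realized = 20.0

expected = expected_velocity(remaining_points, sprints_until_target)
gap = velocity_gap(realized, expected)
behind_schedule = gap < 0

print(f"expected: {expected:.1f}, realized: {realized:.1f}, gap: {gap:+.1f}")
if behind_schedule:
    print("Early warning: at this pace the target date will be missed.")
```

Checked every sprint, the sign of the gap is the early warning. What you do about it (scope, resourcing, date) is still a conversation.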

Cycle time

Cycle time comes from a lean background. It's a measure of how long it takes to build one unit of output. In practical terms, it's a measurement of the elapsed time from the start to the end of your production process.

cycle time = time(end of process) - time(start of process)

It includes both the actual time spent working by the team and all of the wait time in between steps of the process.

Unlike story points, the unit of measurement is time. This is probably cycle time's greatest strength. Time can be subject to arithmetic, statistics like mean and standard deviation, even compared across various aggregations (e.g. among QA team members). It's also less subjective, as there is no estimation required up front. It's just measured continuously. It gives you a sense of what's been happening. And how healthy your process is.
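As a rough illustration, here's how cycle time and its basic statistics could be computed in Python from start and end timestamps. The timestamps are invented. In practice they'd come from whatever ticketing system you use.

```python
from datetime import datetime
from statistics import mean, stdev

def cycle_time_days(start, end):
    """Elapsed time from start to end of the process, wait time included."""
    return (end - start).total_seconds() / 86400  # 86400 seconds in a day

# Hypothetical (start, end) timestamps for three completed work items.
items = [
    (datetime(2019, 10, 1, 9, 0), datetime(2019, 10, 3, 9, 0)),
    (datetime(2019, 10, 2, 9, 0), datetime(2019, 10, 6, 9, 0)),
    (datetime(2019, 10, 4, 9, 0), datetime(2019, 10, 7, 9, 0)),
]

cycle_times = [cycle_time_days(start, end) for start, end in items]
average_days = mean(cycle_times)
spread_days = stdev(cycle_times)  # sample standard deviation

print(f"cycle times (days): {cycle_times}")
print(f"mean: {average_days:.1f} days, standard deviation: {spread_days:.1f} days")
```

Note that this measures calendar time, weekends and waiting included, which is the point: it reflects the whole process, not just hands-on effort.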

Now for the downsides. Cycle time implicitly assumes:

  • that the units of output are pretty standard, uniform, and therefore of similar size
  • when aggregated, that there is no difference between types of work. In reality, building new features and fixing bugs in already built features don't take the same amount of time.
  • that there is no goal. It only measures efficiency, not effectiveness

Cycle time works well, as a metric, in software for two scenarios:

  • When stories aren't estimated but just all broken down to be a maximum expected length of 2 days per story for example.
  • When working on maintenance development, where general process monitoring is needed so that extremes can be investigated but where time pressures tend to be issue & person specific and not team-wide

Takt Time

Takt time operates within a similar framework to that of cycle time. However, instead of measuring what has been happening, it's used to quantify expectations so that they can be continuously monitored.

In a nutshell, takt time measures the slowest expected rate at which you need to complete production processes in order to meet customer demand. It's calculated as

takt time = net production time / total output needed by customer

There are a few numerical examples over here, if you want to take a peek.

Anyhoo, there are a number of really helpful attributes of takt time. It expresses expectations numerically, in terms of how much time should be spent on each item in order to hit a target output. For example, if takt time is 10 minutes, every 10 minutes you should be pushing out another unit. If you are faster, great! If not, you need to troubleshoot and improve your production process, resources, or context.

The "total output needed by customer" can be measured in just units, e.g. number of stories. This way you don't need estimation and estimation won't introduce subjective bias.
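To make the formula concrete, here's a small Python sketch that counts output in stories rather than story points. The numbers are hypothetical, and comparing takt time against the observed pace (days elapsed per story finished) is my own simplification of "are we fast enough".

```python
def takt_time(net_production_time, units_needed):
    """Slowest acceptable time per unit in order to meet demand."""
    return net_production_time / units_needed

# Hypothetical: 40 working days available, 20 stories needed by the customer.
net_working_days = 40
stories_needed = 20
takt = takt_time(net_working_days, stories_needed)

# Hypothetical progress check: 10 working days in, 4 stories finished.
days_elapsed = 10
stories_done = 4
observed_pace = days_elapsed / stories_done

on_pace = observed_pace <= takt

print(f"takt time: {takt:.1f} days per story")
print(f"observed pace: {observed_pace:.1f} days per story")
print("on pace" if on_pace else "too slow: time to troubleshoot")
```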

Like expected velocity, it gives the team a number to help establish an operational relationship with a longer term goal or target (that has business meaning). In the moment.

Isn't this all a bit abstract and self-referential?

Yes. It is.

The primary measure of progress in an agile framework is "working software". Or to be more general, demonstrably completed work. It's demoed for everyone to see and comment, and should be done in a generic way so that anyone can participate (i.e. not only people with PhDs in Computer Science). Anyone should be able to see the new features working.

That said, not everything is software. And not all software has a user interface. So it's a bit harder to apply this, particularly in the early days of a new product.

In that case, you can use these metrics to monitor effectiveness and efficiency. You can hold both yourself and the team accountable. You have a numerical framework to deliberate with stakeholders, one that can be checked at any given moment, where you don't need to "check with the team" every time someone wants an update. And like the senior QA manager above, you can use this as a proactive early warning system. If one of a number of efforts is going off the rails, and you oversee a number of them, you'd naturally want some way of knowing that something is off.

So that's the menu. Which one to choose?

It depends where you are in your efforts, how much time you want to spend on estimation itself, and how much you need to make comparisons.

Where you are in your efforts:

Early on in a project, you have a lot of unknowns. They tend to be interdependent. For example, in order to give a date estimate, you need to agree on what you're building, and how you're building it. That might depend on the market segmentation or main business goals you want to achieve, which also might need to be negotiated. And if you tweak any one of these, all the rest are also affected.

At this point, if you add technical estimation with story points for granular tasks to the mix, you introduce even more uncertainty into the whole thing. You might be better off delaying story point estimation. And just use cycle time until you have a clearer picture. This way, you maximize the team's time on delivering actual work, rather than on estimation under conditions of high uncertainty, and both business and technical complexity.

Once you get to a stable team and vision and roughly stable scope, it might be worth doing some estimation and prioritization of the bigger epics. Follow this with the breakdown (into stories) and estimation of the highest priority epic or two. If your initial scope is very large, you'll spend a lot of time estimating something you don't really understand very well yet (yet another reason to be deliberate and precise with your initial release).

How much time you want to spend on estimation & monitoring:

This is a more general question about the ratio of time spent doing vs. monitoring the work. Estimation is a tool to help you monitor and measure the work. Ideally, it's good to do some estimation, so that you can slot in work tactically. In particular, it's most useful when considering the business value generated and comparing it to the amount of work required to complete it.

But estimating out a year's worth of work, especially if there are no releases to customers during that entire period--that's a notch short of madness. Ideally your releases should be tight and getting feedback both from individual customers and also the market as a whole.

How much you need to make comparisons:

Like in the example opening this blog post, if you want to measure and compare individual or team efficiency, then cycle time is easily comparable. This is because the "denominator" is the same in all cases, namely elapsed time:

  • You can compare cycle time across various team members, ideally if they are doing similar work, for example QA.
  • Also you'd be able to compute averages to compare between teams, i.e. QA across different teams.
  • Standard deviation in cycle time can also be useful to figure out what is truly exceptional, so that you diagnose and troubleshoot (if bad) or repeat (if good)
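The comparisons above could look something like this in Python. The team members and cycle times are invented, and the cut-off of 2 standard deviations from the team mean is an arbitrary assumption, so pick whatever threshold makes sense for your context.

```python
from statistics import mean, stdev

# Hypothetical cycle times (in days) per QA team member.
cycle_times_days = {
    "tester_a": [2.0, 3.0, 2.5, 3.5],
    "tester_b": [2.0, 2.5, 3.0, 9.0],
}

# Average cycle time per person, comparable because the unit is always time.
per_member_avg = {name: mean(times) for name, times in cycle_times_days.items()}

# Team-wide mean and sample standard deviation across all items.
all_times = [t for times in cycle_times_days.values() for t in times]
team_avg = mean(all_times)
team_spread = stdev(all_times)

# Assumed cut-off: flag items more than 2 standard deviations from the mean.
THRESHOLD = 2.0
exceptional = [
    (name, t)
    for name, times in cycle_times_days.items()
    for t in times
    if abs(t - team_avg) > THRESHOLD * team_spread
]

print(per_member_avg)
print(f"team mean: {team_avg:.2f} days, standard deviation: {team_spread:.2f} days")
print(f"worth a closer look: {exceptional}")
```

A flagged item is a prompt for a conversation, not a verdict. The 9 day item might be a genuinely harder piece of work rather than a slow tester.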

Next steps

That should hopefully give you enough to get started. The next step is choosing which is most relevant for you, and figuring out how to gather the raw data from internal company systems. Ideally, this is done automatically & behind the scenes using software, so that your teams don't need to enter data manually, esp. time spent.

Key Takeaways

  • Velocity is a team based output metric that tracks story points completed over time.
  • Estimation can improve accountability and prioritization, but it costs time and is subject to bias.
  • Keep customer facing releases small, as this will improve your accuracy and reduce estimate variability.
