In my previous blog post, I wrote about data-driven experiments. Unfortunately, it lacked any practical application, and today I want to do something about that. But let’s get one thing straight first. This isn’t my favorite topic. Why? Because it focuses so heavily on data and so little on humans. When we get too wrapped up in data, we lose sight of the human dimension, and that’s bad. Never lose sight of what matters. That’s not you. That’s not data. It’s your teams. Keep this in mind when you run data-driven experiments.
Before I show you some of my own experiments, let me explain the questions I ask myself when building one:
- What problem (or behavior) are we attempting to solve (or highlight)? This is the heart of your experiment. Never lose sight of it. We will anyway, and when we do, we must bring it back to center as often as necessary.
- What data can we collect that addresses the question above? Collect this data and no more. The more data we collect, the more time it’ll take, and we risk muddying the focus. We also risk inundating our teams with too much data, so it’s imperative to pare it down to the absolute minimum.
- Who will we share this data with? Honesty in your data is important. If the team feels that they will be judged by those in power, they may unconsciously game the system and manufacture the data we want or expect to see.
- How can we create a simple view into our data? The simpler, the better. Consider the burndown chart. With just a few simple data points, it can inspire a great conversation. It’s these conversations that make for a powerful experiment.
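To show just how little data a useful view needs, here’s a minimal sketch in Python of the handful of numbers behind a burndown chart. All the values are invented for illustration:

```python
# Minimal burndown data: one number per day, story points remaining.
# The values below are made up for illustration.
sprint_days = 10
total_points = 40
remaining = [40, 38, 35, 35, 30, 24, 20, 14, 9, 3]  # recorded once a day

# The "ideal" line burns down evenly from the total to zero.
ideal = [total_points * (1 - day / sprint_days) for day in range(sprint_days)]

for day, (actual, target) in enumerate(zip(remaining, ideal), start=1):
    status = "ahead" if actual <= target else "behind"
    print(f"Day {day:2}: {actual:2} pts remaining (ideal {target:4.1f}), {status}")
```

Two short lists and a loop are enough to start a conversation about whether the team is ahead of or behind its own expectations.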
Work In Progress (WIP) Experiment
Multi-tasking has a cost, and it’s often a cost that many teams overlook. In fact, this was one of my top 10 tips in a previous blog post. While I was a scrum master for three teams, I wanted to highlight how much work each team had open on every day of the sprint. Once a day and at the same time, we recorded the percentage of stories in progress by each team and created the graph you see below. Notice that I also included an “optimal” range. This range isn’t based on any science; it’s simply where my gut told me our teams should be. Here’s what they looked like after the first sprint of the experiment.
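The daily recording step could be sketched roughly like this in Python. The team names, snapshot values, and the 20-40% “optimal” band are all hypothetical stand-ins, since the real range was a gut call:

```python
# Hypothetical daily snapshots: for each team, (stories in progress, total
# stories in the sprint), recorded once a day at the same time.
snapshots = {
    "Team A": [(5, 10), (6, 10), (4, 10), (3, 10), (2, 10)],
    "Team B": [(8, 12), (7, 12), (7, 12), (5, 12), (4, 12)],
}

# The "optimal" band is a judgment call, not science; here, 20-40% in progress.
OPTIMAL = (0.20, 0.40)

def wip_percentages(daily):
    """Percent of sprint stories open on each day."""
    return [in_progress / total for in_progress, total in daily]

for team, daily in snapshots.items():
    for day, pct in enumerate(wip_percentages(daily), start=1):
        flag = "in range" if OPTIMAL[0] <= pct <= OPTIMAL[1] else "out of range"
        print(f"{team} day {day}: {pct:.0%} in progress ({flag})")
```

The point isn’t the code; it’s that one ratio per team per day is all the experiment required.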
Here they are after the next sprint.
Notice how the act of observation alone helped reduce multi-tasking. It’s worth noting that with this experiment we weren’t attempting to adjust their behavior whatsoever. We simply wanted to highlight what percentage of stories were open for every day of the sprint. Of course, they knew what behavior I hoped to create, but I underemphasized this as much as possible during the course of the experiment.
Finally, here are the results of the entire experiment.
Some final thoughts on this experiment:
- As previously mentioned, this experiment wasn’t intended to change any behaviors but to highlight an existing behavior.
- Because this experiment wasn’t crafted to adjust behaviors, we were less concerned about creating any unintended side effects. I still made a point to underemphasize my intentions and instead focused the teams on the data so they could reach their own conclusions.
- Teams were also interested in seeing how other teams compared, so we shared the data across all my teams.
- Data was not shared with management or executives. This wasn’t for any particular reason. They supported the experiment, but the data wasn’t especially interesting for those not on a team.
Sprint Confidence Experiment
This experiment is more complicated than most and is certainly time intensive. Let’s talk about what inspired it:
- There was a great deal of churn in the sprint backlog in every active sprint.
- This change of the sprint backlog was done by the product owner with little to no input from the team.
- Logically, teams understood work shouldn’t roll over from one sprint to the next, to the next, and so on, but that’s what consistently occurred.
- This constant change in priorities due to a changing sprint backlog was becoming a burden on the team.
For this experiment, we asked team members how confident they were that we’d complete everything in the sprint backlog before sprint end. We did so every day, and we did so privately so as not to bias other team members. We also recorded how the sprint backlog changed on a daily basis to determine if there were any correlations between the change in the backlog and the team members’ confidence.
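One way to check for that kind of correlation is a plain Pearson coefficient over the two daily series. The sketch below uses invented numbers and a hand-rolled Pearson function; it’s an illustration of the idea, not the analysis we actually ran:

```python
from statistics import mean

# Hypothetical daily records for one sprint (all numbers invented):
# confidence  = average team confidence the sprint backlog will be done (0-1)
# backlog_pct = story points currently in the sprint vs. points at sprint start
confidence  = [0.90, 0.60, 0.65, 0.70, 0.75, 0.72, 0.68, 0.65]
backlog_pct = [1.00, 1.00, 1.05, 1.05, 1.10, 1.15, 1.15, 1.20]

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

r = pearson(backlog_pct, confidence)
print(f"correlation between backlog growth and confidence: {r:+.2f}")
```

Even a rough number like this can anchor the conversation: in the invented data above, confidence tends to slip as work is added mid-sprint.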
Below are some charts from the experiment. This first chart is the team’s confidence for each day of the sprint. Because we feared team members would feel compelled to report near-100% confidence, we masked their names during the course of the experiment. I found out later that only a handful preferred the anonymity, but it did emphasize to the team how important honesty was in their confidence values.
If a team member reports 100% confidence, it means they believe 100% of the story points in the sprint will be complete by sprint end. If a team member reports 60% confidence, it means they believe only 60% of the story points will be completed by sprint end.
I’m sure you noticed the drop in confidence on day 2. The team ran into a snag that day, realizing they were blocked by another team. This was an important teaching point for the product owner, who later realized he could have foreseen the blocker; the experiment helped him quantify its impact on the team. Here’s another graph showing a few pieces of data:
- (Blue) Average confidence by day of the team.
- (Orange) Percentage story points in the sprint by day. If over 100%, then this means work was inserted into the sprint after sprint start. If under 100%, then work was removed from the sprint after sprint start.
- (Gray) Percentage of work complete. If equal to 100% at sprint end, then all sprint work was completed by the team.
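Computing those three series from raw sprint data could look roughly like this. All the values are hypothetical, and I’m assuming the gray line divides completed points by the points currently in the sprint:

```python
# Hypothetical sprint data (all numbers invented for illustration).
committed_points = 50                      # story points at sprint start
points_in_sprint = [50, 50, 55, 55, 60]    # points in the sprint each day
points_done      = [0, 5, 12, 25, 48]      # cumulative points completed
daily_confidence = [[0.9, 0.8], [0.6, 0.5], [0.7, 0.6], [0.7, 0.8], [0.8, 0.9]]

# Blue line: average team confidence per day.
avg_confidence = [sum(day) / len(day) for day in daily_confidence]

# Orange line: points in the sprint vs. points committed at sprint start.
# Over 100% means work was added after sprint start; under 100% means removed.
backlog_pct = [p / committed_points for p in points_in_sprint]

# Gray line: percentage of the current sprint backlog that is complete.
complete_pct = [done / cur for done, cur in zip(points_done, points_in_sprint)]

for day in range(len(points_in_sprint)):
    print(f"Day {day + 1}: confidence {avg_confidence[day]:.0%}, "
          f"backlog {backlog_pct[day]:.0%}, complete {complete_pct[day]:.0%}")
```

Three simple ratios per day were enough to drive the charts and, more importantly, the conversations around them.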
Some final thoughts on this experiment:
- As the graphs above show, the team did a subpar job of completing its sprint backlog. Today, this same team completes 80% to 95% of its backlog every sprint.
- We ran this experiment for a total of 3 sprints.
- In retrospect, I wish we had explored other ways to represent the data since I don’t feel we did the data justice.
- Teams were excited to see the outcome of this experiment. I believe they enjoyed the act of reflecting on their own confidence in completing their sprint backlog and comparing it to others on the team.
- The conversations we had while looking at the charts were passionate and informative. The experiment also helped us make some impactful changes to better predict our capabilities, so we considered it a tremendous success.
- It helped highlight to the product owner group the importance of gaining team buy-in any time an adjustment to the sprint backlog is made, while simultaneously highlighting how often we were adding work to active sprints.
- Teams began owning their sprint backlogs to a greater degree and pushing back on product owners any time they asked to make a change.
- Executives were surprised to see such low confidence values. I feel they were undervaluing the importance of collective ownership and team buy-in when it comes to the sprint backlog.
Each experiment we’ve crafted is different, and we’ve never run the same one twice. With each, we started from a blank piece of paper and asked ourselves the questions that began this blog post. We didn’t draw on experiments from others in the community, and I provide the examples above not to be recreated but to inspire something that’s uniquely yours. Find data that your teams will enjoy. Measure it. Visualize it. Finally, talk about it with the team. Use data as a tool to better your teams, and never use it as a weapon to control or judge.