
Using Fibonacci Story Points for Agile Estimation


Estimating the complexity and effort required to complete work is essential to agile software development. Utilizing a story point system based on the Fibonacci sequence can offer an effective means of estimating complexity.

What are Story Points?

Story points serve as a unit used to estimate the overall complexity and effort needed to fully implement a user story or feature within an agile framework like Scrum. They correlate with complexity, effort, and risk rather than time.

The Fibonacci Sequence

The Fibonacci sequence is a series of numbers where each number is the sum of the previous two, starting with 0 and 1. The first several numbers in the Fibonacci sequence are:

0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89...

Employing this sequence for story points offers a good scale of relative sizing while also accounting for uncertainty in estimation. The gaps between the numbers grow larger, reflecting greater uncertainty as complexity increases.
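To make the widening gaps concrete, here is a minimal Python sketch that generates the sequence and the distance between neighboring values. The function name `fibonacci` is only illustrative.

```python
def fibonacci(count):
    """Return the first `count` Fibonacci numbers, starting with 0 and 1."""
    values = []
    a, b = 0, 1
    for _ in range(count):
        values.append(a)
        a, b = b, a + b
    return values


sequence = fibonacci(12)

# Distance between each value and the next one in the sequence.
gaps = [later - earlier for earlier, later in zip(sequence, sequence[1:])]

print(sequence)
print(gaps)
```

Past the first few values, each gap is itself the previous Fibonacci number, so the room for uncertainty grows along with the estimate.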

Estimating Complexity

In estimating with story points, the team discusses each user story and collectively assigns a story point value based on the relative complexity and effort involved. Some factors to consider:

  • Complexity of problem
  • Unknowns and risk
  • Dependencies on other stories
  • Writing automated tests
  • QA

Estimating story points instead of time keeps the focus on the complexity of the work rather than when it will be completed. Weighing both the unknowns in a story and the tedium involved in finishing it generally gives a good picture of the overall effort required.

Because the Fibonacci scale allows flexibility in estimation, it leaves room for unexpected work. When pointing, find the smallest piece of work the team understands and assign it a 3 or a 5. There may be a temptation to point it lower, at a 1 or a 2, but even smaller work may turn up in the future. Also, the smaller the numbers get, the smaller the difference between them, so they say less about how the complexity and effort of two tickets differ.

If a ticket is a 1, is a 2 twice as large? What is the delta between a 1 and a 2? What about a 2 and a 3? Is a 3 one and a half times as much effort as a 2? The delta between numbers this low doesn't say enough about how the work differs. Also, because the delta is smaller, there's an increased expectation that two tickets sized at a 1, or two sized at a 2, should take identical levels of effort. Tickets sized at a 3 or a 5 have a larger delta around them, which gives more flexibility in how much effort two tickets pointed at the same value, say a 5, can take.

For instance, a 5 has to be less than an 8 and greater than a 3. It may be similar to another 5, but it doesn't have to be exact. Because the goal of a story point is to measure complexity, there should be an element of the unknown in that estimation.

Leveraging the Cynefin Framework

A tool I often use when thinking about work is the Cynefin framework.

[Figure: the Cynefin framework. Credit: The Cynefin Co & Dave Snowden]

In development work, we often make the mistake of thinking all our work is complex when, in fact, most of it is complicated. It's essential to know the difference between these types of problems. Below is the mental model I use when thinking about what category work falls into.

The Clear domain. I often describe this work as anything that can be taught to someone else in a few hours and then delegated to them. In development work, I think of copy editing: for instance, a task to change copy on a website or the text on a button in a mobile application. Anyone who can use the tools at an entry level should be able to do this work. This work has the benefit of having best practices, and it is handled by sensing, categorizing, and responding.

Complicated work, however, requires a more specialized, trained skill set. It cannot be taught in a few days; it could take years. 3D modeling, animation, programming, architecture, and so on fall into this category. This work doesn't have best practices, but it does have well-established "good" practices. In other words, there may be a few ways to solve a problem using good practices, but no single best option. This is why the framework replaces the word categorize with analyze, which is the main distinction between clear and complicated. The analysis piece requires knowledge and experience.

Moving on to Complex work, we see a larger shift in how it is defined. Complex work has exaptive or emergent practices. In other words, we don't know how to solve the problem, and there isn't a pre-defined approach to its solution. This is why the first step is to probe: we need to run experiments to understand the work. This work requires fast feedback cycles, as learning quickly is the primary goal. Based on what we see from the results of those experiments, we respond by offering a solution.

How does this map to Fibonacci numbers? This is where my mental model for pointing comes in.

If the lower end of the spectrum of "complexity" is clear, then I like to treat anything that is a 1, 2, or 3 as a clear type of work. In other words, this is work "anyone" could do, even though the development team is being asked to work on it.

Complicated work requires a developer, but we can leverage good practices to solve it, so experimentation isn't really required. I treat this work as fairly well understood and known, and point it as a 5, 8, or 13.

If the work isn't well understood, we're probably dealing with complex work. This work would be pointed as a 21 or higher. When we look at the delta around a 21, 8 points down to a 13 and 13 points up to a 34, we get a lot of room for ambiguity.
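The mapping above can be sketched as a small Python function. The name `cynefin_domain` and the exact cut-offs are my reading of this mental model, not part of the Cynefin framework itself.

```python
def cynefin_domain(points):
    """Map a story point value to the domain used in this mental model.

    Assumed cut-offs: 1-3 is clear, 5-13 is complicated,
    and 21 or higher is complex.
    """
    if points <= 3:
        return "clear"
    if points <= 13:
        return "complicated"
    return "complex"


for points in (1, 2, 3, 5, 8, 13, 21, 34):
    print(points, cynefin_domain(points))
```

A lookup like this is only a conversation starter during pointing; the team's discussion decides the domain, not the number.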

How to manage this work

When a team is faced with a truly complex ticket pointed around a 21, the first thing we want to do is figure out whether we can split it and make it smaller. The reality is, we probably don't know enough about the work to split it up. This is where I like to run spikes or create discovery tickets. It's important to understand whether this work is complex for us or is generally perceived as complex in the industry. For instance, there could be good practices we're not aware of, in which case we're mischaracterizing the work as complex when it is really complicated.

If we can't find any known good practices for this work, then we truly have something complex and need to run experiments to identify potential solutions. This will require short feedback cycles with feedback from critical stakeholders to determine if we're providing a solution to the problem. This will also increase our learning and understanding of the problem through each iteration, eventually moving the actual work into a complicated state as we define a good practice for dealing with this type of problem.

Benefits of Story Points

Here are some benefits of using story points over time estimation:

  • Abstract from time - Story points measure complexity, not the time it will take. This removes the tendency to pad estimates to be safe.
  • Promote discussion - Assigning story points leads to a greater discussion of scope, unknowns, and effort needed. The focus stays on the complexity of the work.
  • Avoid false precision - Time estimates try to predict an exact number of hours/days despite inherent uncertainty. Story points use a scale like Fibonacci to show the ambiguity.
  • Separate estimation from commitment - With time estimates, if a task goes over the estimated time, it is seen as "late." Story points estimate complexity, while the team commits to a number of points per sprint.
  • Velocity metric - Velocity tracked in story points shows how much complexity a team can handle per sprint. This aids in planning and estimating future work.
  • Uncouple from individuals - Story points measure the complexity of the problem rather than a person's specific skill/speed.
  • Allow for variation - Story points account for higher degrees of uncertainty as the numbers get larger, and they also leave a larger delta between neighboring numbers. This means that when two similar tickets are sized an 8, they don't have to be equal. They're more complex than a 5 but less than a 13. This provides a mathematical buffer when calculating a team's velocity.

As a rule of thumb, when a story is pointed higher than 13, it's a strong indicator that the work isn't well defined and the level of uncertainty is too high. Seeing story points of this size indicates that the story needs to be split. When splitting, the new values don't have to sum to the original estimate. One benefit of splitting stories is that it becomes possible to be more specific about what should be done as part of each story. For instance, splitting a 13 in two may yield a 5 and an 8, or two 8s, or even two 5s. Remember, the individual stories should again be pointed in relation to other stories of similar size.

Story Points Aid with Prioritization

Once all the stories are pointed and split, evaluating which stories will likely put the sprint at risk becomes easier. Determining which stories are required to achieve the sprint goal is also easier.

Size Tasks Relatively

User stories and backlog items should be pointed relative to each other based on complexity. If the team has completed a task of similar complexity to a new task being pointed, use that history as a reference point. This doesn't mean things can't shift as the team gets more comfortable pointing, but historical context can help.

Understand Workload

When planning a sprint, the team can sum the story points of the planned tasks to understand the workload. Knowing the team's velocity and adjusted velocity (velocity minus the points typically spent on non-value-added work such as rework) helps ensure they aren't overcommitting. Typically, a team should plan up to their adjusted velocity so that a buffer is left for unplanned work. Knowing how many points per sprint the team spends on unplanned work helps them estimate and plan for future distractions and interruptions.
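As a rough illustration of that check, the sketch below sums the planned points and compares them with an adjusted velocity. The function name `remaining_capacity` and all numbers are hypothetical.

```python
def remaining_capacity(planned_points, adjusted_velocity):
    """Return the points still available before the sprint is overcommitted.

    A negative result means the plan exceeds the adjusted velocity.
    """
    return adjusted_velocity - sum(planned_points)


# Hypothetical sprint: five stories planned against an adjusted velocity of 30.
planned = [5, 8, 3, 5, 8]
print(remaining_capacity(planned, 30))
```

If the result goes negative during planning, that is the cue to pull a story out before the sprint starts rather than midway through it.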

Balance Workload

The team can look at story point totals by discipline and balance workloads. If you have a cross-functional team of front-end and back-end developers, you can see how the points are distributed between the two disciplines. If you plan up to your adjusted velocity but all the work is on the front end and none on the back end, it's a good indicator that the team won't complete the sprint. This can also indicate a need for cross-training, building full-stack developers, or re-assessing hiring needs. Depending on the number of teams, it may also indicate a need to change team composition and re-assign individuals to other teams where they can provide more value.
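One way to see that distribution is to total the points per discipline. This is a minimal sketch; the `(discipline, points)` tuples and the discipline labels are assumptions for illustration, not a format any particular tracker exports.

```python
from collections import defaultdict


def points_by_discipline(stories):
    """Sum story points per discipline from (discipline, points) pairs."""
    totals = defaultdict(int)
    for discipline, points in stories:
        totals[discipline] += points
    return dict(totals)


# Hypothetical sprint plan that is heavily skewed toward the front end.
sprint = [("frontend", 8), ("frontend", 5), ("frontend", 13), ("backend", 3)]
totals = points_by_discipline(sprint)
print(totals)
```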

Velocity Tracking

Velocity, or the rate of story point completion per sprint, guides future sprint planning and prioritization. As mentioned above, it's important to separate the completed points spent on value-added activities from those spent on non-value-added activities like rework. The points spent on non-value-added activities are the team's buffer. Subtracting that buffer from the total velocity gives the team their adjusted velocity, which is a good limit for sprint planning. This may fluctuate quite a bit in the first few sprints as the team gets comfortable with story pointing, but it should average out fairly quickly.
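The arithmetic can be sketched as follows, assuming each past sprint is recorded as a pair of total completed points and the portion of those points spent on non-value-added work. The function name and the sample history are hypothetical.

```python
from statistics import mean


def velocity_summary(sprints):
    """Return (velocity, buffer, adjusted_velocity) averaged over past sprints.

    Each sprint is a (completed_points, non_value_added_points) pair, where
    the second number is already included in the first.
    """
    velocity = mean([completed for completed, _ in sprints])
    buffer = mean([wasted for _, wasted in sprints])
    return velocity, buffer, velocity - buffer


# Hypothetical history of three sprints.
history = [(30, 6), (34, 8), (32, 7)]
velocity, buffer, adjusted = velocity_summary(history)
print(velocity, buffer, adjusted)
```

Averaging over several sprints smooths out the early fluctuation; a team could just as reasonably use a rolling window of the most recent sprints.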

Identify Process Issues

If the story points completed consistently differ from those planned, it could indicate an issue with estimation or planning. It's important to point everything, not just user stories. One advantage of story points is that they create a way to measure the capacity and throughput of a team. By pointing everything, a team can determine how much effort they spend on bugs and rework.
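A minimal sketch of that comparison is below. It assumes completed points are grouped by ticket type, and the type names ("story", "bug", "rework") are my own labels rather than anything a specific tool provides.

```python
def planning_report(planned, completed_by_type):
    """Compare planned points with completed points and measure rework.

    `completed_by_type` maps a ticket type to the points completed for it.
    """
    completed = sum(completed_by_type.values())
    rework = completed_by_type.get("bug", 0) + completed_by_type.get("rework", 0)
    return {
        "planned": planned,
        "completed": completed,
        "difference": completed - planned,
        # Share of completed effort that went to bugs and rework.
        "rework_share": rework / completed if completed else 0.0,
    }


# Hypothetical sprint: 30 points planned, 24 completed.
report = planning_report(30, {"story": 18, "bug": 4, "rework": 2})
print(report)
```

A single sprint's difference says little; it is the same gap showing up sprint after sprint that points to an estimation or planning problem.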

Examining story point sizes helps guide where to focus effort, how to balance workloads, and how to prioritize and sequence work. Story points add important context to the relative complexity of work for teams.

Here are some tips to help estimate story points accurately for tasks of varying complexity:

  • Frame a reference story - Estimate a simple story as 3 or 5 points to set a baseline. Use it as a reference to size other stories.
  • Consider all aspects - Factor in front-end, backend, testing, documentation, etc. A story may be more complex than it first appears.
  • Estimate by comparison - Estimate new stories relative to others already given points rather than in isolation.
  • Re-estimate when needed - If a story turns out more complex than initially thought, re-estimate it with the new information. Use this information to re-evaluate the success of the sprint.
  • Track velocity - Look at the team's velocity (points per sprint) to improve estimating and know how much work can be done.
  • Break large stories down - If a story is too complex to estimate, break it into smaller "bite-sized" stories to estimate individually.
  • Avoid false precision - Use the Fibonacci sequence scale (1, 2, 3, 5, 8, 13) rather than sequential numbers to account for uncertainty.
  • Estimate conservatively - When uncertain, round the story points up. It is better to estimate high than low, as low estimates increase the risk of not completing sprints. Remember to re-point a story whenever learning indicates it is smaller or larger than initially estimated. Pulling more work into a sprint is possible at any time. Removing work is also possible but requires another mini sprint planning session to identify which items to remove.
  • Improve over time - Estimating is a learned skill. Accuracy will improve with practice across sprints.
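The "round up when uncertain" tip can be sketched as a small helper that snaps a rough number to the next value on the scale. The helper name is hypothetical, and treating anything above 13 as a signal to split is an assumption based on the rule of thumb earlier in this post.

```python
import bisect

# The scale from the tips above.
SCALE = [1, 2, 3, 5, 8, 13]


def round_up_to_scale(raw_estimate):
    """Return the smallest scale value that is >= raw_estimate."""
    index = bisect.bisect_left(SCALE, raw_estimate)
    if index == len(SCALE):
        # Nothing on the scale is large enough.
        raise ValueError("estimate is above the scale; consider splitting the story")
    return SCALE[index]


print(round_up_to_scale(4))
print(round_up_to_scale(6))
```

Here a gut feeling of "about a 4" becomes a 5 and "about a 6" becomes an 8, which keeps the team from debating values that are not on the scale.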

Accurate story point estimation requires experience and collaboration. Allow time for the team to improve estimation skills together. The goal is to reflect relative complexity, not to predict the exact time needed.

Once a team can point stories consistently using the Fibonacci sequence, they can plan their sprints more accurately, have less need to work overtime, and be happier overall.

This also unlocks future capabilities such as:

  • Estimate large efforts, such as an entire project or initiative.
  • Identify how much effort is spent on waste in the system.
  • Calculate an accurate “buffer” and have an adjusted velocity that captures the amount of value-added work completed each sprint.
  • Identify process improvement opportunities by focusing on reducing the amount of unplanned work in the system.
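As a rough sketch of the first capability, a pointed backlog and an adjusted velocity are enough for a back-of-the-envelope sprint count. The function name and the numbers are hypothetical, and the result is only as good as the pointing behind it.

```python
import math


def sprints_needed(total_points, adjusted_velocity):
    """Estimate how many sprints a pointed backlog will take."""
    if adjusted_velocity <= 0:
        raise ValueError("adjusted velocity must be positive")
    # Round up: a partially filled sprint is still a sprint.
    return math.ceil(total_points / adjusted_velocity)


# Hypothetical initiative: 210 points at an adjusted velocity of 25 per sprint.
print(sprints_needed(210, 25))
```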

Other thoughts on managing complexity in software development

When looking at the Cynefin framework, we can see how our work can be distilled into taking things perceived as complex and moving them into the complicated domain. This is where innovation happens and where we see the largest value add from software development. However, we also see this occur when we take complicated work and move it to the clear domain. This is typically done by adding layers of abstraction on top of complicated tasks to simplify the job and create a best practice for solving this work.

When thinking about software development in that way, we can then leverage things like Wardley Mapping to identify where there may be a strategic advantage in building abstraction layers that move complicated work into clear. Think of things like AWS, which took highly complex infrastructure work and added layers of abstraction so that someone with a few hours of reading can set up an auto-scaling elastic container environment, something that would have been extremely difficult before services like AWS existed. We are also seeing this happen on a large scale with large language models and AI agents today.