Sprint Planning for AI Projects - Getting It Right
How to run effective sprint planning sessions for AI and ML teams, covering estimation techniques, capacity planning, and handling research uncertainty.
Sprint planning for AI projects is where methodology meets reality. Standard sprint planning assumes work can be estimated with reasonable accuracy and completed within the sprint. AI work includes experiments that might take two hours or two weeks, data dependencies that surface mid-sprint, and training jobs that fail at hour eleven. Effective sprint planning for AI teams addresses these challenges directly.
Before the Meeting
Preparation is more important for AI sprint planning than for typical software sprints:
Data readiness assessment. The tech lead or data engineer should confirm that all data required for planned work is available, accessible, and in the expected state. This check prevents the single most common sprint failure mode: committing to model work when the data is not ready.
Compute resource availability. Check GPU cluster utilization, scheduled maintenance windows, and any competing workloads. If the team plans to run large training jobs, confirm that resources will be available.
Experiment backlog grooming. Research stories should arrive at planning already structured with a hypothesis, evaluation method, and time cap. Grooming these during planning wastes the entire team’s time.
Stakeholder input on priorities. The Product Owner should have clear priority rankings before planning. AI backlogs often have competing priorities (improve accuracy vs. reduce latency vs. add new data source), and resolving these during planning is inefficient.
Estimation Approaches
Story points work poorly for AI research tasks because the variance is enormous. A team might estimate a story at 5 points and it takes 1 point of effort (the first approach worked) or 20 points (nothing worked and they had to pivot three times). Several alternative approaches work better:
Time-boxing. Instead of estimating how long research will take, allocate a fixed time budget. “We will spend 3 days exploring transformer architectures for this task. At the end of day 3, we decide whether to continue, pivot, or abandon.” This converts unbounded research into bounded sprints.
T-shirt sizing with confidence levels. Estimate the likely effort (S/M/L/XL) and pair it with a confidence level (high/medium/low). A “Medium effort, low confidence” story signals that the team should plan for it to be larger than expected.
Reference class estimation. Compare the planned work to similar work the team has done before. “The last time we added a new feature to this model, it took 4 days including evaluation. This is similar in scope.” This works well for teams with history on a project.
Spike and commit. For highly uncertain stories, plan a time-boxed spike in the first half of the sprint. At mid-sprint, the team reassesses based on spike results and commits to specific deliverables for the second half.
Capacity Planning
AI teams need modified capacity planning:
Account for operational interrupts. If the team also handles production model issues, reserve 15-25% of sprint capacity for unplanned work. Track actual interrupt load across sprints and adjust this buffer.
Factor in training time. Long-running training jobs keep work “in progress” but do not consume team attention. A 24-hour training run is not a full day of work - the engineer can context-switch to other tasks while waiting. Plan accordingly but recognize that context switching has its own cost.
Balance research and engineering. Explicitly allocate capacity between research and engineering tracks. A sprint that is 100% research produces no shippable increment; a sprint that is 100% engineering makes no model progress. Aim for a ratio that reflects the project phase.
Leave slack. Do not plan to 100% capacity. AI work has higher variance than software engineering. A sprint planned at 80% capacity is more likely to succeed than one planned at 100%, and the remaining capacity absorbs surprises.
The Planning Meeting
Structure the planning meeting in two parts:
Part 1: Sprint goal (15 minutes). Define one or two measurable goals for the sprint. “Achieve 0.88 F1 on the document classifier” or “Deploy the model serving infrastructure to staging.” The goal gives the team a north star when mid-sprint decisions arise.
Part 2: Story selection and tasking (45-60 minutes). Pull stories from the prioritized backlog until capacity is filled. For each story, confirm the definition of done, identify dependencies, and break into tasks if needed.
For research stories, explicitly discuss:
- What is the hypothesis?
- How will we evaluate the result?
- What is the time cap?
- What is the decision process if results are inconclusive?
For engineering stories, standard sprint planning practices apply. Identify dependencies, estimate effort, and assign owners.
Handling Mid-Sprint Changes
AI sprints are more susceptible to mid-sprint disruption than software sprints. Plan for it:
Pivot criteria. Define upfront what would cause the team to change direction mid-sprint. “If the data quality assessment reveals more than 20% missing values, we pivot to data cleaning instead of model training.”
Decision authority. Who can authorize a mid-sprint scope change? For AI teams, the ML lead or tech lead should be able to redirect research effort without waiting for a formal process, while scope changes to stakeholder-facing deliverables go through the Product Owner.
Communication protocol. When a pivot happens, notify the team and stakeholders immediately. Update the sprint board to reflect the new plan. Do not let pivots happen silently - they erode trust when stakeholders discover at sprint review that the plan changed.
Anti-Patterns
Planning without data readiness. Committing to model training when data has not been validated. Always verify data before planning dependent work.
Overcommitting on research. Filling the sprint with uncertain research stories and planning no engineering work. Ensure at least some sprint capacity goes to predictable deliverables.
Ignoring the evaluation plan. Planning model development without planning model evaluation. You need a test dataset, evaluation scripts, and baseline metrics before you can declare a model improvement “done.”
Treating all work as equal size. Using a flat story point scale for both “add a logging statement” and “redesign the feature pipeline.” Use different scales or time-boxing for different work types.
Effective sprint planning for AI is about honest assessment of uncertainty, explicit handling of dependencies, and leaving enough flexibility to adapt when experiments produce unexpected results.
Need help implementing this?
Turn this knowledge into a working prototype. Our structured workshop methodology takes you from idea to deployed AI solution in three sessions.
Explore AI Workshops