AI-First Sprint Planning and Task Breakdown

Sprint planning in an AI-first team looks different — estimation changes, task decomposition changes, and the relationship between a ticket and its implementation changes. What actually works and what to watch out for.

Sprint planning is where the AI-first team’s changed capabilities either get accounted for or get ignored.

Most teams that adopt AI tools leave their planning processes unchanged. The consequence: engineers complete work faster than estimated, the sprint ends early, and the team tries to pull in more work mid-sprint. This looks like success. It’s actually a planning gap.


How Estimation Changes

Traditional estimation accounts for: understanding the requirement, designing the approach, writing the code, writing tests, code review, and integration. AI significantly changes three of these: design, code, and tests can all be accelerated.

The part AI doesn’t change: understanding the requirement. For ambiguous or complex requirements, the investigation and clarification work is still slow and fully human. AI accelerates implementation of a clear spec; it doesn’t help with clarifying an unclear one.

Practical adjustment: when estimating tasks in an AI-first team, compress the implementation estimate but don’t compress the requirement clarification estimate. A task that’s clearly specified can be estimated 40–60% smaller than pre-AI. A task that requires significant requirement investigation is mostly unchanged.

The risk of uniform compression: engineers start estimating everything smaller, including ambiguous tasks. Then they hit the requirement-clarification wall and the sprint slips. Calibrate by task clarity, not uniformly.
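A minimal sketch of clarity-calibrated estimation, assuming each estimate can be split into a requirement-clarification part and an implementation part. The `Task` fields, the function name, and the 0.5 factor (the midpoint of the 40–60% range above) are illustrative assumptions, not measured values:

```python
from dataclasses import dataclass

# Assumed compression factor: midpoint of the 40-60% range. Tune per team.
IMPLEMENTATION_COMPRESSION = 0.5


@dataclass
class Task:
    name: str
    clarification_hours: float   # requirement investigation: not compressed
    implementation_hours: float  # design, code, tests: the part AI accelerates


def ai_adjusted_estimate(task: Task,
                         compression: float = IMPLEMENTATION_COMPRESSION) -> float:
    """Compress only the implementation share of the estimate."""
    return task.clarification_hours + task.implementation_hours * (1 - compression)


# A clearly specified task shrinks a lot; an ambiguous one barely moves.
clear_task = Task("cursor pagination", clarification_hours=1, implementation_hours=8)
ambiguous_task = Task("improve performance", clarification_hours=8, implementation_hours=2)

clear_estimate = ai_adjusted_estimate(clear_task)          # was 9 hours pre-AI
ambiguous_estimate = ai_adjusted_estimate(ambiguous_task)  # was 10 hours pre-AI
```

With these made-up numbers the clear task drops from 9 hours to 5, while the ambiguous one only drops from 10 to 9. That gap is the reason to calibrate by task clarity rather than applying one discount everywhere.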


Task Decomposition

AI changes what granularity of task is worth creating.

Before AI: very small tasks (under two hours) were often not worth creating individually because the ticket overhead wasn’t worth it. Engineers would batch them or handle them informally.

With AI: very small tasks are exactly the kind AI executes quickly and reliably. A well-specified two-hour task might take 30 minutes with AI. The ticket overhead is the same; the value per ticket is higher.

The practical implication: decompose more. Break epics into smaller, more precisely specified stories. Each story should be clearly specifiable — because clear specification is what makes AI output good. Vague tickets produce vague AI output.

The spec is the leverage point. A story that says “add pagination to the user list endpoint, return 20 per page, use cursor-based pagination with a next_cursor response field” is something AI can execute reliably. “Improve the user list endpoint performance” is not.
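To show how directly that first story maps to code, here is a sketch of what it pins down. Only the page size of 20 and the `next_cursor` field come from the story; the in-memory list, the integer `id` used as the cursor, and the `users` response key are assumptions for illustration:

```python
from typing import Optional

PAGE_SIZE = 20  # from the story: return 20 per page


def list_users(users: list[dict], cursor: Optional[int] = None) -> dict:
    """Return one page of users plus a next_cursor field.

    Assumes `users` is sorted by ascending "id" and that `cursor`
    is the id of the last user the client has already seen.
    """
    remaining = [u for u in users if cursor is None or u["id"] > cursor]
    page = remaining[:PAGE_SIZE]
    has_more = len(remaining) > PAGE_SIZE
    # next_cursor is None on the last page, so clients know when to stop.
    next_cursor = page[-1]["id"] if has_more else None
    return {"users": page, "next_cursor": next_cursor}


# Usage: walk through 45 hypothetical users page by page.
all_users = [{"id": i} for i in range(1, 46)]
first_page = list_users(all_users)
second_page = list_users(all_users, cursor=first_page["next_cursor"])
```

Every decision in the function is answered by the ticket. The vague version of the story leaves all of them open.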


The Definition of Done Changes

In a traditional team, “done” usually means: implemented, reviewed, tested, documented, merged.

In an AI-first team, “done” needs a verification dimension: AI-assisted output has been reviewed to the appropriate depth for the task. This needs to be explicit, because it’s easy to let AI-generated code through review without the verification it needs.

I’ve started adding to DoD for AI-assisted stories: “AI-generated code reviewed for correctness at the logic level, not just style.” This sounds obvious, but making it explicit prevents the pattern where AI output gets rubber-stamped in review because it looks clean.


Using AI in the Planning Meeting Itself

One underused application: using AI to help with task decomposition during planning.

The pattern: give Claude Code (or Copilot Chat) the user story and ask it to identify edge cases, suggest subtasks, and flag technical dependencies. Use the output as input to the planning discussion — not to replace the discussion, but to make it more thorough.

This works well for stories involving significant technical work. For pure process or requirement stories, it’s less useful.

What it catches: edge cases the team would have missed, dependencies that aren’t obvious from the story description, technical tasks that don’t show up in acceptance criteria.

What it doesn’t replace: the team’s architectural judgment about approach, the knowledge of what’s already built that makes some approaches better than others, and the estimation calibration that comes from having built things before.
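If the team wants the ask to be consistent from story to story, the prompt can be templated. This sketch only builds the text to paste into whichever assistant the team uses. It calls no API, and the function name and wording are assumptions:

```python
def build_decomposition_prompt(user_story: str, context: str = "") -> str:
    """Assemble a planning prompt covering the three asks described above."""
    asks = [
        "Identify edge cases the story does not mention.",
        "Suggest subtasks, each small enough to specify precisely.",
        "Flag technical dependencies between subtasks or on other systems.",
    ]
    lines = ["User story:", user_story.strip(), ""]
    if context.strip():
        # Optional notes on what is already built, supplied by the team.
        lines += ["Relevant context:", context.strip(), ""]
    lines.append("Please:")
    lines += [f"{i}. {ask}" for i, ask in enumerate(asks, start=1)]
    return "\n".join(lines)


prompt = build_decomposition_prompt(
    "As an admin, I can page through the user list 20 users at a time.",
    context="The endpoint currently returns every user in one response.",
)
```

The optional `context` argument is where the team supplies the knowledge of what's already built, since the tool does not have it.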


Capacity Planning

One more change worth naming: capacity planning in an AI-first team is different because velocity is harder to predict.

Traditional velocity: relatively stable. Engineers know what they can deliver in two weeks.

AI-augmented velocity: more variable. It depends on task type (AI helps a lot with some, less with others), on how well-specified the tasks are, and on the team’s current AI proficiency. Teams early in adoption have high variance.

The practical response: plan slightly more conservatively until you have accumulated enough data to know your actual AI-augmented velocity. Over-committing in the first few sprints after AI adoption is common and produces the “we always do this” retrospective complaint, when the actual cause was an overestimated velocity.
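One way to make "slightly more conservative" concrete is to discount the average velocity by how much it has varied. This is a sketch under assumptions: the one-standard-deviation margin is an arbitrary starting point, and the function name is hypothetical:

```python
from statistics import mean, stdev


def conservative_commitment(completed_points: list[float], margin: float = 1.0) -> float:
    """Mean velocity minus `margin` sample standard deviations, floored at zero."""
    if len(completed_points) < 2:
        raise ValueError("need at least two sprints of data")
    return max(0.0, mean(completed_points) - margin * stdev(completed_points))


# Hypothetical early-adoption history: same average, very different variance.
variable_team = [20, 30, 40]
stable_team = [30, 30, 30]

variable_commitment = conservative_commitment(variable_team)
stable_commitment = conservative_commitment(stable_team)
```

Both hypothetical teams average 30 points, but the variable one would commit to 20 and the stable one to the full 30. The discount shrinks as velocity settles.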


Day 7 of the AI-First Engineering Team series. Previous: Measuring Progress — What Metrics Actually Matter for an AI-First Team

This post is licensed under CC BY 4.0 by the author.