Copilot Workspace — Hands-On with Agentic GitHub Copilot
Copilot Workspace takes GitHub Copilot from in-editor suggestions to multi-step agentic task execution. Here's what it actually does, how it works in practice, and where it fits alongside Claude Code.
Everything I’ve covered in Arc 3 so far has been about Copilot as an in-editor assistant — completions, chat, slash commands, context variables. Today’s post is about something different: Copilot Workspace, which is GitHub’s move into agentic territory.
This is where Copilot stops being a suggestion engine and starts being something closer to an agent.
What Copilot Workspace Actually Is
Copilot Workspace is accessible from GitHub.com — not VS Code. You start with a GitHub issue or a task description, and Workspace generates a plan for implementing it: what files need to change, what changes to make in each file, and why.
You review the plan, can edit it, then Workspace executes it — making the actual code changes across your repository. The result is a branch with the implementation ready for review.
The workflow:
- Open a GitHub issue (or describe a task)
- Copilot generates a plan — a structured breakdown of the implementation
- You review and refine the plan
- Copilot executes the plan and produces code changes
- You review the resulting diff, test it, and open a PR
It’s the observe-reason-act loop I described in Day 3, applied to GitHub issues. The goal is the issue. The plan is the reasoning step. The execution is the action. Your review is the human-in-the-loop checkpoint.
What It Does Well
Well-defined, bounded tasks. Bug fixes with clear symptoms and isolated causes, adding a field to an API with clear requirements, updating a dependency and fixing the resulting breakages — these are the cases where Workspace produces useful output reliably.
The tighter the scope and the clearer the requirement, the better the plan. Issues that say “users report 500 errors on checkout when shipping address has an apartment number” get better plans than issues that say “improve the checkout flow.”
Navigating unfamiliar codebases. For contributors who don’t know a repository well, Workspace’s planning step surfaces the relevant files and the relationships between them. Even if you don’t use the generated code, the plan is a useful map of where the relevant code lives.
Standard CRUD patterns. Adding new endpoints, creating new models, adding validation — these follow patterns that Workspace handles well because they’re consistent and well-represented in training data.
Where It Falls Short
Complex logic with implicit business rules. When the correct implementation depends on understanding domain context that isn’t in the issue or the codebase — pricing rules, business process requirements, edge cases specific to your industry — Workspace generates plausible-looking code that may be subtly wrong. The plan looks reasonable. The implementation misses the nuance.
Cross-cutting concerns. Changes that touch how a system works at a deeper level — new authentication patterns, observability changes, security-relevant modifications — often require judgment about implications that Workspace doesn’t have. It implements the surface change without understanding the system consequences.
Large, complex issues. Issues that are really multiple tasks rolled into one produce plans that are either too broad (missing important detail) or too narrow (correctly planning one interpretation while missing others). Complex work should be broken into smaller issues before handing to Workspace.
The last 15%. The generated code usually needs adjustment. It’s almost right in ways that require understanding — choosing a specific library over another, matching a project-specific pattern, handling an edge case specific to your data. You’re editing a near-complete implementation, not accepting a finished one.
Copilot Workspace vs. Claude Code
Both take on multi-step coding tasks. They’re designed for different contexts and compose well.
Copilot Workspace is GitHub-native. It works with your GitHub issues and creates branches and PRs in your GitHub workflow. The integration is seamless if you live in GitHub. The context is your repository.
Claude Code is conversation-native. It works in a chat session where you can provide deep context, ask follow-up questions, explore approaches iteratively, and steer the direction mid-task. The integration is wherever you have a Claude Code session.
Where I use each:
- Workspace for well-scoped, clearly-described GitHub issues on codebases where the changes are mechanical and follow clear patterns
- Claude Code for complex tasks that require iterative reasoning, or for tasks where I need to explain the context, constraints, and tradeoffs as I go
They’re often sequential in the same workflow: Claude Code to think through the approach and validate the design, Workspace to implement the well-defined changes that result.
The Review Step Is Not Optional
The biggest mistake I see with Workspace: treating the generated code as done rather than as a draft.
The plan step is where most of the value is — seeing how Workspace interpreted the issue and correcting misinterpretations before they become code. Spend time here. Edit the plan. Make sure it’s implementing what you actually want before executing.
Then review the resulting diff as carefully as you would any other PR. The fact that AI wrote it doesn’t mean it’s correct. It means it’s a competent first draft that still needs a human to verify it does what the issue requires.
Where It’s Heading
Workspace is early. The current version is useful for a specific class of tasks and less useful for others. The trajectory — deeper codebase understanding, better multi-step reasoning, tighter integration with the rest of GitHub — points toward something that will handle a broader class of work reliably.
For now: use it deliberately for the tasks it handles well, review carefully, and don’t mistake confident-looking output for correct output.
Day 18 of the 30-Day AI Engineering series. Previous: Copilot for PR Descriptions, Docs, and Commit Messages.