As I have explored deeper automation, I have found that orchestrating agents in more deliberate ways can be more useful than simple prompting. That much is obvious, but the way to go about it is interesting.
It starts pretty simple. Call out to a model, send a prompt, and get a response back.
This is probably where most agent workflows begin, but as soon as we want to do more complex work, the broad prompt starts to feel inadequate. "Review this code" can mean many different things.
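To make that concrete, here is a rough sketch of the starting point. sendPrompt is a hypothetical helper standing in for whatever model API you actually use, not a real library call; the optional history parameter comes into play in the later sketches.

```swift
// A minimal sketch of the starting point: one prompt in, one response out.
// `sendPrompt` is a hypothetical stand-in for your model API of choice.
func sendPrompt(_ prompt: String, history: [String] = []) async throws -> String {
    // Call your model provider here with the prior turns plus the new prompt.
    fatalError("Wire this up to your model API")
}

func reviewCode(diff: String) async throws -> String {
    try await sendPrompt("Review this code:\n\(diff)")
}
```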
Break Into Specific Reviews
The next move is to split the broad prompt into more specific review prompts. Instead of asking for a general review, we ask for a specific lens.
This helps the agent focus. Each turn now has a specific job it is meant to perform, instead of one broad instruction trying to cover every kind of review at once.
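Here is a sketch of what that split might look like, reusing the hypothetical sendPrompt helper from the first example. The lens prompts are just examples, and notice that every turn still flows through the same conversation history.

```swift
// A sketch of splitting one broad review into specific lenses, still inside a
// single conversation. Each turn has one job, but every finding lands back in
// the same shared history.
let lensPrompts = [
    "Review this diff for concurrency issues: data races, actor isolation, task lifetimes.",
    "Review this diff for SwiftUI structure and design-system consistency.",
    "Review this diff against the acceptance criteria in the linked story."
]

func runSpecificReviews(diff: String) async throws -> [String] {
    var history = ["Here is the diff under review:\n\(diff)"]
    var findings: [String] = []
    for prompt in lensPrompts {
        let finding = try await sendPrompt(prompt, history: history)
        history.append(finding) // intermediate work keeps piling into one thread
        findings.append(finding)
    }
    return findings
}
```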
The Context Problem
The problem is that all of those investigations are now happening in the same context. The thread contains the diff, the concurrency notes, the design notes, the story notes, and the synthesis.
The more specific the reviews become, the more the main context fills up with intermediate work.
Introduce Sub-Agents
Sub-agents fix this: each agent gets its own context and one specific goal to focus on. It can go deep on that goal without filling the main thread with every detail.
This is similar to how teams are built today. Design, QA, product, and engineering all look at the same work from different angles. They each have a specific role in deciding whether something is ready.
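A minimal sketch of that idea, still assuming the hypothetical sendPrompt helper: each sub-agent starts from a fresh context containing only the diff and its own instructions, and only a short summary comes back to the main thread.

```swift
// A sketch of sub-agents. Each reviewer runs in its own fresh context and
// returns a brief summary; its deep dive never touches the main thread.
struct SubAgent {
    let name: String
    let instructions: String

    func review(diff: String) async throws -> String {
        try await sendPrompt("\(instructions)\n\nSummarize your findings briefly.\n\n\(diff)")
    }
}

func runSubAgents(_ agents: [SubAgent], diff: String) async throws -> [String] {
    try await withThrowingTaskGroup(of: String.self) { group in
        for agent in agents {
            group.addTask { try await agent.review(diff: diff) }
        }
        // Summaries arrive in completion order, which is fine for a later synthesis step.
        return try await group.reduce(into: [String]()) { $0.append($1) }
    }
}
```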
Add An Orchestrator
Once sub-agents exist, something needs to organize them. That is the orchestrator.
The orchestrator receives the PR, creates the sub-agents, waits for their reports, and returns one coherent finding to the user.
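Here is a sketch of what that orchestration layer could look like, building on the SubAgent and runSubAgents helpers above. The reviewer names and prompts are illustrative, not a prescription.

```swift
// A sketch of the orchestrator: fan the diff out to sub-agents, wait for their
// reports, then ask the model for one synthesized review.
func orchestrateReview(diff: String) async throws -> String {
    let reviewers = [
        SubAgent(name: "Concurrency", instructions: "Review for data races and actor isolation."),
        SubAgent(name: "SwiftUI", instructions: "Review view structure and design-system usage."),
        SubAgent(name: "Story", instructions: "Check the change against the story's acceptance criteria.")
    ]

    let reports = try await runSubAgents(reviewers, diff: diff)

    // The orchestrator's own turn: combine the individual reports into one finding.
    return try await sendPrompt(
        "Combine these review reports into a single, prioritized review:\n\n"
        + reports.joined(separator: "\n---\n")
    )
}
```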
Selective Routing
A lot of the time we have PRs that are solely focused on UI or data flow, which means we might not need to run every sub-agent every time.
So the orchestrator needs a discovery step. Before it fans work out to sub-agents, it should inspect the PR and decide what kind of review is actually useful.
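One way to sketch that discovery step, again with the hypothetical helpers from earlier: a quick classification pass decides which lenses apply, and only those reviewers get created.

```swift
// A sketch of selective routing. A cheap, read-only pass classifies the PR,
// and the orchestrator only runs the reviewers that match.
func discoverLenses(diff: String) async throws -> Set<String> {
    let answer = try await sendPrompt(
        "Look at this diff and answer with a comma-separated subset of "
        + "[concurrency, swiftui, story] that is worth reviewing:\n\n\(diff)"
    )
    return Set(answer.lowercased()
        .split(separator: ",")
        .map { piece in String(piece.filter { !$0.isWhitespace }) })
}

func routedReview(diff: String, reviewers: [String: SubAgent]) async throws -> [String] {
    let lenses = try await discoverLenses(diff: diff)
    let selected = reviewers.filter { lenses.contains($0.key) }.map(\.value)
    return try await runSubAgents(selected, diff: diff)
}
```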
Model Choice
Let's take this one step further by also splitting up the model choice. The orchestrator should run on our strongest reasoning model because it owns the plan and produces the final report.
The research step is different. It is read-only, and the goal is to inspect the PR quickly without editing code. That can run on a faster coding model while the main thread stays on the stronger model for judgment.
The reviewers can run on lighter models because their jobs are intentionally narrow. A SwiftUI reviewer, concurrency reviewer, or story reviewer is not responsible for the whole answer. Each one looks through a specific lens and reports back the useful findings.
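As a sketch of how that split could look in code, the hypothetical sendPrompt helper just gains a model parameter. The tier names are placeholders for whatever models your provider actually offers.

```swift
// A sketch of choosing a model per role. The tiers and names are assumptions.
enum ModelTier: String {
    case reasoning = "strong-reasoning-model" // orchestrator: planning and the final report
    case fast = "fast-coding-model"           // read-only discovery pass over the PR
    case light = "light-model"                // narrow, single-lens reviewers
}

func sendPrompt(_ prompt: String, model: ModelTier) async throws -> String {
    // Same idea as before, but the caller decides which tier handles the turn.
    fatalError("Wire this up to your model API, selecting \(model.rawValue)")
}

// How the earlier pipeline would use it (illustrative):
//   discovery -> sendPrompt(discoveryPrompt, model: .fast)
//   reviewers -> sendPrompt(lensPrompt, model: .light)
//   synthesis -> sendPrompt(synthesisPrompt, model: .reasoning)
```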
Recap
We started with a single prompt, which still has its uses, but we ended with an orchestrator that can adapt to real workflows.
The orchestrator can scale up or down depending on the complexity of the work. Sometimes one prompt is enough. Sometimes the workflow needs research, routing, specialist reviewers, and a final synthesis.
That is the bigger shift: these tools are no longer just better autocomplete or better chat. Used well, they become building blocks for engineering workflows that can reason, route, specialize, and report back with structure.
[Diagram: from the single-prompt start to the full orchestrated workflow]
Final Thoughts
I can imagine where we can take this in the future. There are so many different workflows that could benefit from these orchestrators.
A product-manager story groomer could pull in context from adjacent user stories, inspect the code to find relevant testing steps, and maybe even include Slack comments to give the story better business backing.
I am sure this structure is going to change as agents become more capable and our tooling becomes more advanced. But as we keep experimenting with all of this, it is fun finding new ways to tackle problems.