If you’re not using AI sub agents yet, read this
What Are AI Sub Agents?
In the note, the author explains that they picked up the idea from the developer of Claude Code and started using sub agents: specialized agents that each run in their own separate context window. These helpers handle repeatable tasks like code reviews or security checks without polluting the main conversation. By isolating such tasks, the main agent stays focused on the high-level goal.
Why Context Windows Matter
Large language models (LLMs) process information within a finite context window, and the longer that window grows, the harder it becomes for the model to recall important details. Research from Anthropic shows exactly this: as the number of tokens in the context increases, the model’s ability to accurately recall information decreases. This phenomenon, often called context rot, means that every unnecessary token can dilute the quality of the model’s output. An article from Inkeep makes a similar point: AI agents have limited attention budgets, and information retrieval accuracy drops as contexts grow longer. Managing context effectively is therefore critical to maintaining performance.
How Sub Agents Improve Performance
Sub agents can mitigate context rot by delegating messy, token‑intensive tasks to specialized workers. Jason Liu’s overview of context engineering explains that sub agents isolate messy tasks and return distilled insights to the main reasoning thread. By keeping logs, test outputs, or verbose code diffs out of the main context, you minimize noise and free up the main agent’s attention for reasoning. The Inkeep article further suggests that multi‑agent architectures - where specialized sub agents handle specific tasks - help preserve the main context and prevent it from being overwhelmed.
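As a rough sketch of this delegation pattern (the function names and the `call_llm` stub below are illustrative, not part of any specific framework), the sub agent receives the verbose material in a fresh context and hands only a distilled summary back to the main thread:

```python
def call_llm(system_prompt: str, user_message: str) -> str:
    """Stub for an LLM API call; swap in a real client here."""
    # Placeholder: a real implementation would call a model API.
    return f"Summary of: {user_message[:40]}..."

def run_sub_agent(task: str, raw_material: str) -> str:
    """Run a task in an isolated context and return only a distilled summary.

    The verbose input (logs, diffs, test output) never enters the
    main agent's context -- only the short summary does.
    """
    system_prompt = (
        "You are a specialist. Complete the task below and reply with "
        "a concise summary only."
    )
    return call_llm(system_prompt, f"Task: {task}\n\n{raw_material}")

# Main thread: delegate a noisy task, keep only the distilled result.
main_context = []
verbose_diff = "diff --git a/app.py b/app.py\n" + "+ print('debug')\n" * 200
report = run_sub_agent("Review this diff for bugs", verbose_diff)
main_context.append(report)  # short report, not the 200-line diff
```

The design point is that `main_context` only ever grows by the size of the summary, no matter how large the delegated input was.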
Real‑World Example: Code Review with Claude Code
The note’s author created their first sub agent to review unstaged changes. Instead of dumping all the changes into the main conversation, they spin up a dedicated reviewer that reads the diffs, summarizes the changes, and sends back a concise report. This approach reflects the practice described by Liu, where messy tasks like reading test outputs are handled by a separate agent to avoid polluting the main thread. Claude Code and similar tools integrate sub agents as built-in skills, and this multi-agent pattern is now common in many coding assistants. You can read the Claude Code overview for more context on how it works.
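For instance, in Claude Code a sub agent can be defined as a Markdown file with YAML frontmatter under `.claude/agents/`. The file below is an illustrative sketch of what such a reviewer might look like, not the author's exact agent; check the Claude Code documentation for the current format and available tool names:

```markdown
---
name: diff-reviewer
description: Reviews unstaged changes and reports back a concise summary.
tools: Read, Grep, Bash
---

You are a code reviewer. Inspect the unstaged diff, flag likely bugs,
style issues, and security concerns, and reply with a short, prioritized
summary. Do not include the full diff in your reply.
```

Because the reviewer runs in its own context window, the diff it reads never lands in the main conversation.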
Best Practices for Adopting Sub Agents
- Keep contexts small: Only include relevant details in the main agent’s context. Remove logs or tool outputs once they’re summarized. Anthropic recommends curating the smallest set of high‑signal tokens for the model.
- Leverage multi‑agent architectures: Use specialized sub agents for repeatable tasks like code reviews, security checks, data extraction, or research. This separation of concerns aligns with Inkeep’s recommendation to use multi‑agent architectures to prevent context pollution.
- Adopt just‑in‑time retrieval: Instead of loading everything upfront, fetch data only when needed. The Inkeep article highlights that modern agents retrieve information precisely when it’s required.
- Monitor context budgets: Pay attention to token counts and summarization strategies; the transformer architecture’s attention cost scales quadratically (O(n²)) with context length, which makes long contexts expensive to process.
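The just-in-time idea above can be sketched in a few lines. The data sources here (`logs`, `schema`) and their contents are hypothetical stand-ins; the point is simply that nothing is loaded until a step actually needs it:

```python
from typing import Callable, Dict, List

# Registry of lazy data sources: nothing is fetched at startup.
SOURCES: Dict[str, Callable[[], str]] = {
    "logs": lambda: "ERROR timeout in worker-3\n" * 5,  # stand-in for a real log fetch
    "schema": lambda: "CREATE TABLE users (id INT);",   # stand-in for a DB introspection call
}

def get_context(needed: List[str]) -> str:
    """Fetch only the sources a step actually needs, just in time."""
    return "\n\n".join(SOURCES[name]() for name in needed)

# A step that only needs the schema never pays the token cost of the logs.
ctx = get_context(["schema"])
```

Compared with loading everything upfront, this keeps each step's context limited to what that step can actually use.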
Final Thoughts
As your projects grow, context management becomes a bottleneck. Sub agents offer a simple yet powerful strategy for keeping your main chat focused and productive. By isolating repetitive tasks in dedicated contexts and applying context‑engineering techniques like just‑in‑time retrieval and summarization, you can maintain high output quality even when tackling complex workflows. If you haven’t started experimenting with sub agents yet, now is a great time to try. Tools like Claude Code already integrate them, and the benefits to clarity, efficiency, and scalability are significant.