I continue to experiment with AI in the context of software engineering. I'm fortunate that my team supports me in exploring different ways to impr...
The maturity levels framework is useful. One pattern that works well in practice: specialized small agents (vision agent for GUI, code agent for scripts, data agent for analysis) coordinated by a lightweight orchestrator. Each agent runs its own model optimized for that modality. This "scaling out" approach avoids the single-point-of-failure problem of routing everything through one monolithic LLM. The challenge shifts from "making one model do everything" to "making agents communicate efficiently."
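The "scaling out" pattern above can be sketched in a few lines: a lightweight orchestrator that routes each task to a modality-specific agent. This is a minimal illustration, not a real framework; the agent callables here are stubs standing in for models optimized per modality.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Task:
    modality: str   # e.g. "vision", "code", "data"
    payload: str

class Orchestrator:
    """Routes tasks to specialized agents instead of one monolithic LLM."""

    def __init__(self) -> None:
        self._agents: dict[str, Callable[[str], str]] = {}

    def register(self, modality: str, agent: Callable[[str], str]) -> None:
        self._agents[modality] = agent

    def dispatch(self, task: Task) -> str:
        agent = self._agents.get(task.modality)
        if agent is None:
            raise ValueError(f"no agent registered for {task.modality!r}")
        return agent(task.payload)

# Stub agents standing in for modality-specific models.
orch = Orchestrator()
orch.register("code", lambda p: f"[code-agent] {p}")
orch.register("vision", lambda p: f"[vision-agent] {p}")

print(orch.dispatch(Task("code", "write a parser")))
```

The failure mode shifts exactly as described: the orchestrator is trivial, and the hard part becomes the message format agents use to hand work to each other.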
The team-of-agents framing exposes a planning problem most blog posts skip: who holds the shared mental model? With humans, the senior on the team carries it in their head and corrects course informally. With agents you need that mental model to be explicit, queryable, and durable across runs, otherwise each agent rediscovers the same constraints in isolation. Coordination overhead climbs fast once you have more than two agents touching shared state.
Good point. I think the README.md, a custom PLAN.md, or the Wiki itself if you use GitHub can serve as the shared mental model.

Agreed. The shared mental model usually starts as README.md, ADRs, PLAN.md, or internal docs. The problem is that coding agents don't reliably operationalize those constraints during generation. That's the gap we're exploring with Mneme: turning architectural intent into enforceable workflow constraints.
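To make "enforceable workflow constraints" concrete, here is a toy sketch of turning a PLAN.md-style rule into a machine check. The `forbid: X imports Y` syntax is invented purely for illustration; real tooling (Mneme included) will have its own format.

```python
import re
from fnmatch import fnmatch

# Hypothetical constraints section from a PLAN.md-style document.
PLAN = """
Constraints:
- forbid: src/core/* imports src/ui/*
- forbid: src/db/* imports src/api/*
"""

def parse_constraints(plan: str) -> list[tuple[str, str]]:
    """Extract (source-glob, target-glob) forbidden-import pairs."""
    return [
        (m.group(1), m.group(2))
        for m in re.finditer(r"forbid:\s*(\S+)\s+imports\s+(\S+)", plan)
    ]

def violates(rules: list[tuple[str, str]], src_path: str, imported: str) -> bool:
    """Check one import edge against every forbidden pair."""
    return any(fnmatch(src_path, s) and fnmatch(imported, t) for s, t in rules)

rules = parse_constraints(PLAN)
print(violates(rules, "src/core/engine.py", "src/ui/widgets.py"))  # True
print(violates(rules, "src/ui/app.py", "src/core/engine.py"))      # False
```

The point isn't the glob matching; it's that once the constraint lives in a parseable form, an agent's output can be gated on it instead of hoping the model read the doc.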
the hard part is usually keeping context coherent between design and implementation agents. without a shared state layer they diverge fast. what's your coordination mechanism look like?
Designers write the PLAN.md, while implementors code it.
To be honest, I didn't get any divergence, but perhaps my scope is more limited than yours?
PLAN.md as a shared contract is smart — effectively the same idea. my divergence was runtime, not design-time: impl agents querying a state that design had already moved past. scope probably is the difference — once pipelines get interdependent it compounds fast.
I've tried a similar multi-agent setup, but it gets expensive quickly. Adding a shared memory layer helped a lot. Leveraging GitNexus (for codebase knowledge graphs) + Serena (semantic IDE-style tools via MCP) cuts TPM (tokens per minute) noticeably.
Oh yeah!
It's my professional environment setup, and we get quite a bit of leeway regarding token usage.
I love using agents. I have expert agents for the fields I develop in, and I also created expert agents for a framework I built, so I can create apps based on that framework very easily.
Great breakdown of autonomous agent teams! The planner/challenger/coder/tester structure makes a lot of sense — having a dedicated challenger agent to push back on plans before implementation is something I hadn't considered but seems really valuable for catching issues early.
The autonomy vs. security tradeoff you mention is real. In my own experiments with multi-agent setups, I've found that being explicit about what each agent can and can't touch (rather than blanket permissions) helps a lot. Looking forward to seeing where the experimental Agent Teams feature goes!
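"Explicit about what each agent can and can't touch" can be as simple as a per-agent capability allowlist checked before every tool call. This is a sketch under assumed names; the agents and tools here (planner/coder/tester, `read_file`, etc.) are placeholders, not any real product's API.

```python
# Per-agent allowlist: deny by default, grant specific tools only.
ALLOWED: dict[str, set[str]] = {
    "planner": {"read_file", "search"},
    "coder":   {"read_file", "write_file", "run_tests"},
    "tester":  {"read_file", "run_tests"},
}

def invoke_tool(agent: str, tool: str) -> str:
    """Gate every tool call on the agent's allowlist."""
    if tool not in ALLOWED.get(agent, set()):
        raise PermissionError(f"{agent!r} may not call {tool!r}")
    return f"{tool} ok"  # real tool dispatch would go here

print(invoke_tool("coder", "write_file"))
# invoke_tool("tester", "write_file")  # would raise PermissionError
```

The nice property is that the blast radius of a misbehaving agent is bounded by its row in the table, not by whatever blanket permission the whole team was granted.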
Great. I liked the idea, thanks for sharing.
I think the interesting direction is splitting responsibilities between local and cloud agents instead of making one giant agent do everything.
Local AI can continuously read the repo, collect context, track changes, and structure the user's intent into clean prompts. Then a stronger cloud model handles reasoning and creates the action plan. After that, the local agent executes the smaller modifications step by step inside the repo.
Feels cheaper, more private, and probably more scalable than sending the entire codebase to a cloud model every cycle.
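The local/cloud cycle described above could look roughly like this. Every function is a stub with an invented name; the shape of the loop is the point: context collection and edit execution stay local, and only the condensed intent + summary crosses the wire to the stronger model.

```python
def collect_context(repo_path: str) -> str:
    """Local agent: scan the repo and summarize relevant files (stubbed)."""
    return f"summary of {repo_path}"

def cloud_plan(prompt: str) -> list[str]:
    """Cloud model: stronger reasoning returns an ordered action plan (stubbed)."""
    return [f"step 1 for: {prompt}", f"step 2 for: {prompt}"]

def apply_step(step: str) -> None:
    """Local agent: execute one small modification inside the repo (stubbed)."""
    print(f"applying {step}")

def run_cycle(repo_path: str, intent: str) -> None:
    context = collect_context(repo_path)       # stays on-device
    plan = cloud_plan(f"{intent}\n{context}")  # only the summary leaves
    for step in plan:                          # local execution loop
        apply_step(step)

run_cycle("./my-repo", "add input validation")
```

Cost and privacy both fall out of the same decision: the cloud model never sees the codebase, only the local agent's distilled view of it.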