From Idea to Infrastructure: Building a DevOps Copilot with AI Agents
A Journey into Multi-Agent Systems with LangGraph and AWS MCP Server
Introduction
In modern software delivery, DevOps tasks—from high‑level architecture design to infrastructure provisioning—often involve repetitive, error‑prone steps. What if you could hand off that entire workflow to a squad of AI agents? In this post, we explore the DevOps Multi‑Agent Copilot, a system that transforms a simple user request into a validated, production‑ready Terraform project—all in under a minute.
You’ll learn how we leverage LangChain, LangGraph, FastAPI, MCP and Streamlit, along with a network of specialized agents, to create an interactive DevOps assistant that:
Understands your high‑level vision
Plans the right AWS services and components
Visualizes the architecture
Generates and validates infrastructure‑as‑code
The Big Picture: System Architecture
The system comprises four layers:
Agentic Core (LangGraph)
Defined in graph.py
Stateful, cyclical graph where AI agents collaborate
Memory managed via MemorySaver for long‑running, context‑aware workflows
Backend (FastAPI)
Exposes a /stream endpoint (server.py)
Orchestrates graph execution, streams events to the UI
MCP Servers (Toolbelt Layer)
Each specialized tool (diagram generation, Terraform validation, etc.) runs on its own MCP server
Exposes HTTP/SSE APIs so agents can call tools remotely
Decouples compute‑heavy or stateful operations from the main application
Frontend (Streamlit)
Interactive chat UI (app.py)
Streams agent “thinking” steps, renders diagrams, and shows code in real time
Meet the Team of AI Agents
Each agent is a ReAct specialist created via create_react_agent, with its own system prompt and toolset hosted on a remote MCP server.
1. The Supervisor: Project Manager & Orchestrator
The supervisor_agent is the heart of the graph. Its job isn't to do the work but to delegate it. After each step, the conversation returns to the supervisor, which analyzes the current state and decides which agent to route the task to next. This routing logic is defined in its system prompt and allows it to orchestrate the entire workflow, from planning to FINISH.
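In the project, the routing decision is made by the LLM from the supervisor's system prompt. As a rough, deterministic stand-in for that behavior, the sketch below routes on which artifacts already exist in the shared state. The state keys (plan, diagram, terraform) and the agent names other than FINISH are assumptions made for illustration, not the project's actual schema.

```python
from typing import Callable, Dict, List, Optional, TypedDict


class WorkflowState(TypedDict, total=False):
    """Hypothetical shared state; the real project's schema may differ."""
    request: str
    plan: Optional[str]
    diagram: Optional[str]
    terraform: Optional[str]


def route(state: WorkflowState) -> str:
    """Return the next agent to run, or "FINISH" when nothing is left to do."""
    if not state.get("plan"):
        return "planning_agent"
    if not state.get("diagram"):
        return "diagram_agent"
    if not state.get("terraform"):
        return "terraform_agent"
    return "FINISH"


def run(state: WorkflowState,
        agents: Dict[str, Callable[[WorkflowState], dict]],
        max_steps: int = 10) -> List[str]:
    """Ask the supervisor for a route, run that agent, then return to the supervisor."""
    visited: List[str] = []
    for _ in range(max_steps):  # guard against an agent that never fills its key
        next_agent = route(state)
        if next_agent == "FINISH":
            break
        state.update(agents[next_agent](state))  # agent returns a partial state update
        visited.append(next_agent)
    return visited


# Stub agents standing in for the real ReAct agents.
agents = {
    "planning_agent": lambda s: {"plan": "API Gateway -> Lambda -> DynamoDB"},
    "diagram_agent": lambda s: {"diagram": "diagram.png"},
    "terraform_agent": lambda s: {"terraform": "main.tf"},
}
state: WorkflowState = {"request": "serverless Python app with a database"}
visited = run(state, agents)
```

A prompt-driven supervisor can make richer decisions than this fixed order (for example, skipping the diagram when the user only asks for Terraform), but the control flow is the same: every agent hands control back to one router.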
2. The Planning Agent: Requirements Clarifier & Service Recommender
The planning agent plays the role of a solution architect. Given a vague idea, it produces a detailed technical plan: it applies best practices, suggests suitable AWS services, and designs a blueprint for the other agents to follow.
3. The Diagram Agent: Visual Architect
This agent takes the planner's blueprint and instantly renders it as a professional architecture diagram. It provides immediate visual feedback, ensuring the design is what you envisioned before a single line of infrastructure code is written.
4. The Terraform Agent: IaC Generator & Validator
The Terraform agent takes the final architecture and writes a complete, multi-file Terraform project. It does not stop at generation: it writes, validates, and corrects the code in a loop (up to three retries) until validation passes, so the output has been checked before it is handed back to you.
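The write, validate, and correct loop can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the project's code: generate stands in for the LLM call that writes the files, validate stands in for the terraform_validate tool, and both fake_ helpers are toy stubs that only check brace balance.

```python
from typing import Callable, Dict, List, Tuple

Files = Dict[str, str]  # filename -> HCL source


def generate_until_valid(
    generate: Callable[[List[str]], Files],
    validate: Callable[[Files], List[str]],
    max_retries: int = 3,
) -> Tuple[Files, int]:
    """Generate code, validate it, and feed errors back until it passes.

    `generate` receives the errors from the previous attempt (empty on the
    first call). `validate` returns a list of errors, empty when valid.
    Returns the valid files and the number of attempts used.
    """
    errors: List[str] = []
    # One initial attempt plus up to `max_retries` corrections.
    for attempt in range(1, max_retries + 2):
        files = generate(errors)
        errors = validate(files)
        if not errors:
            return files, attempt
    raise RuntimeError(f"still invalid after {max_retries} retries: {errors}")


def fake_generate(errors: List[str]) -> Files:
    # Toy stub: the first attempt leaves a brace open, the correction closes it.
    body = 'resource "aws_dynamodb_table" "items" {' + ("}" if errors else "")
    return {"main.tf": body}


def fake_validate(files: Files) -> List[str]:
    # Toy stub: a real validator checks far more than brace balance.
    return [
        f"{name}: unbalanced braces"
        for name, src in files.items()
        if src.count("{") != src.count("}")
    ]


files, attempts = generate_until_valid(fake_generate, fake_validate)
```

Passing the previous errors back into the generator is the important design choice: it turns a blind retry into a correction step.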
End‑to‑End Workflow
User Request
“Create a diagram for a serverless Python app on AWS with a database.”
Supervisor
Parses intent
Delegates to Planning Agent
Planning Agent
Clarifies requirements
Outputs: “API Gateway → Lambda → DynamoDB”
Supervisor → Diagram Agent
generate_diagram tool produces image + raw Python code
Supervisor → Terraform Agent
Translates diagram code to Terraform HCL
Writes to disk, runs terraform_validate (up to 3 retries)
Supervisor
On success, emits FINISH → user gets codebase + diagram
Result: From idea to validated Terraform code and architecture diagram in under a minute.
Under the Hood
1. LangGraph as the Brain
Graph model of nodes (agents) & edges (state transfers)
Ensures robust, stateful orchestration for multi‑step tasks
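To make the graph model concrete, here is a dependency-free toy that mimics the idea: nodes are functions that return partial state updates, routers act as edges, and a per-thread checkpoint plays the part that MemorySaver plays in the real system. TinyGraph and all of its method names are invented for this sketch and are not LangGraph's API.

```python
from typing import Callable, Dict

State = Dict[str, object]
Node = Callable[[State], State]    # returns a partial state update
Router = Callable[[State], str]    # returns the name of the next node

END = "__end__"


class TinyGraph:
    """Toy stand-in for a stateful, cyclical agent graph."""

    def __init__(self) -> None:
        self.nodes: Dict[str, Node] = {}
        self.routers: Dict[str, Router] = {}
        self.checkpoints: Dict[str, State] = {}  # thread_id -> last saved state

    def add_node(self, name: str, fn: Node, router: Router) -> None:
        self.nodes[name] = fn
        self.routers[name] = router

    def run(self, entry: str, update: State, thread_id: str,
            max_steps: int = 20) -> State:
        # Resume from the saved state for this thread, then apply the new input.
        state = {**self.checkpoints.get(thread_id, {}), **update}
        current = entry
        for _ in range(max_steps):
            if current == END:
                break
            state = {**state, **self.nodes[current](state)}
            self.checkpoints[thread_id] = dict(state)  # checkpoint after each node
            current = self.routers[current](state)
        return state


graph = TinyGraph()
graph.add_node(
    "supervisor",
    lambda s: {"turns": s.get("turns", 0) + 1},
    lambda s: "worker" if "result" not in s else END,
)
graph.add_node(
    "worker",
    lambda s: {"result": f"handled: {s['request']}"},
    lambda s: "supervisor",  # the edge back to the supervisor makes the graph cyclical
)

first = graph.run("supervisor", {"request": "plan a serverless app"}, thread_id="demo")
second = graph.run("supervisor", {"request": "add a database"}, thread_id="demo")
```

The second call reuses the thread's checkpoint, so the supervisor sees the earlier result and turn count. That carry-over is what lets a long-running conversation stay context-aware.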
2. ReAct Agents as the Experts
Reason & Act paradigm: each agent chooses tools, observes results, iterates
Prebuilt via create_react_agent for consistent behavior
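The Reason and Act loop itself is small. The sketch below shows its shape in plain Python, with a scripted policy standing in for the LLM; it is an illustration of the paradigm, not what create_react_agent does internally. The tool name list_services and the Decision tuple layout are assumptions made for the example.

```python
from typing import Callable, Dict, List, Tuple

# A decision is either ("act", tool_name, tool_input) or ("finish", answer, "").
Decision = Tuple[str, str, str]


def react(
    decide: Callable[[str, List[str]], Decision],
    tools: Dict[str, Callable[[str], str]],
    task: str,
    max_steps: int = 5,
) -> str:
    """Alternate between reasoning (decide) and acting (tool call) until done."""
    observations: List[str] = []
    for _ in range(max_steps):
        kind, name, arg = decide(task, observations)  # reason over what is known
        if kind == "finish":
            return name                               # `name` carries the answer
        observations.append(tools[name](arg))         # act, then record the observation
    raise RuntimeError("no final answer within max_steps")


def scripted_policy(task: str, observations: List[str]) -> Decision:
    # Stand-in for the LLM: call one tool, then answer from its result.
    if not observations:
        return ("act", "list_services", task)
    return ("finish", f"Plan: {observations[-1]}", "")


tools = {"list_services": lambda query: "API Gateway -> Lambda -> DynamoDB"}
answer = react(scripted_policy, tools, "serverless Python app with a database")
```

The step limit matters in practice: without it, a model that keeps choosing tools would never return control to the supervisor.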
3. MCP Servers as the Tool‑belt
Hosted microservices exposing tool APIs over HTTP
Decouples agents from underlying tool implementations
Easier to scale and secure than stdio‑based transports
Modernizing the MCP Servers
We adapted the original AWS Labs MCP (Model Context Protocol) servers:
From stdio → HTTP/SSE for network‑native tool access
Dependency upgrades for better performance and security
Streamable responses to support real‑time UIs
This refactor allows seamless, scalable interactions between agents and their specialized tools.
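For readers unfamiliar with SSE, the framing is simple: each event is a few "field: value" lines terminated by a blank line. The sketch below encodes and decodes that framing with the standard library only. The event names (tool_result, done) and payload fields are hypothetical, and the post does not show the servers' actual message schema.

```python
import json
from typing import Dict, Iterable, Iterator, List


def encode_sse(event: str, payload: Dict) -> str:
    """Frame one JSON payload as a Server-Sent Event."""
    # json.dumps emits a single line by default, so one data: line is enough.
    return f"event: {event}\ndata: {json.dumps(payload)}\n\n"


def _field_value(line: str, field: str) -> str:
    """Return the value after 'field:', dropping one optional leading space."""
    value = line[len(field) + 1:]
    return value[1:] if value.startswith(" ") else value


def decode_sse(lines: Iterable[str]) -> Iterator[Dict]:
    """Parse text lines into {"event": ..., "data": ...} dicts."""
    event, data = "message", []  # "message" is the default event type
    for raw in lines:
        line = raw.rstrip("\n")
        if line == "":  # a blank line dispatches the buffered event
            if data:
                yield {"event": event, "data": json.loads("\n".join(data))}
            event, data = "message", []
        elif line.startswith("event:"):
            event = _field_value(line, "event")
        elif line.startswith("data:"):
            data.append(_field_value(line, "data"))


stream = (
    encode_sse("tool_result", {"tool": "terraform_validate", "valid": True})
    + encode_sse("done", {})
)
events: List[Dict] = list(decode_sse(stream.splitlines()))
```

Because events arrive one at a time over a single HTTP response, a UI can render each tool result as soon as it is produced instead of waiting for the whole run to finish.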
Conclusion & Next Steps
By combining LangGraph orchestration with a squad of ReAct agents, we’ve built a DevOps Copilot that:
Automates complex workflows end‑to‑end
Remains highly extensible: add new agents (e.g., security scanner, cost estimator) by defining new graph nodes
Powers a conversational interface that feels like chatting with your DevOps team
Ready to try it?
🔗 Explore the GitHub repo → DevOps‑agent