Module 12 · ~2–6 hrs

Capstone: build Jarvis end-to-end

Everything you've learned, in one repo. By the end of this module you'll have a deployable, observable, hardened Jarvis you can demo. We'll lay out the project structure, build it module by module, run it locally, then deploy.

1. The architecture you're building

Slack / Web / Mobile → LangGraph Platform (control + data plane)

Supervisor, routing to four specialists:

IT agent → Helpdesk API
HR agent → HRIS
Calendar agent → Google Calendar (MCP)
Knowledge agent → Chroma KB

Plus: Postgres (checkpointer + store) · LangSmith (tracing/evals) · Slack MCP for notifications

2. Repository layout

jarvis/
├── langgraph.json              # deployment manifest
├── pyproject.toml              # deps
├── .env                        # secrets (gitignored)
├── src/
│   ├── jarvis.py               # exports `graph` for the platform
│   ├── state.py                # JarvisState typed dict + reducers
│   ├── supervisor.py           # supervisor node + routing model
│   ├── agents/
│   │   ├── it.py
│   │   ├── hr.py
│   │   ├── calendar.py
│   │   └── knowledge.py
│   ├── tools/
│   │   ├── helpdesk.py         # open_it_ticket, ticket_status
│   │   ├── hris.py             # leave_balance, policy_lookup
│   │   ├── calendar.py         # find_room, schedule_meeting
│   │   └── kb.py               # lookup_policy (retrieval over Chroma)
│   ├── memory/
│   │   ├── load.py             # load_user_memories node
│   │   └── extract.py          # post-turn fact extractor
│   ├── guards/
│   │   ├── injection.py        # input guardrail
│   │   └── rate_limit.py       # per-user quotas
│   └── auth.py                 # JWT verification → config.configurable
├── tests/
│   ├── test_tools.py
│   ├── test_routing.py         # FakeListChatModel routing tests
│   └── eval/
│       └── run_dataset.py      # LangSmith eval runner used in CI
└── scripts/
    └── seed_kb.py              # one-time: ingest policy PDFs into Chroma

TypeScript layout:

jarvis/
├── langgraph.json
├── package.json
├── tsconfig.json
├── .env
├── src/
│   ├── jarvis.ts               // exports `graph`
│   ├── state.ts                // JarvisState (Annotation.Root) + reducers
│   ├── supervisor.ts
│   ├── agents/
│   │   ├── it.ts
│   │   ├── hr.ts
│   │   ├── calendar.ts
│   │   └── knowledge.ts
│   ├── tools/
│   │   ├── helpdesk.ts
│   │   ├── hris.ts
│   │   ├── calendar.ts
│   │   └── kb.ts
│   ├── memory/
│   │   ├── load.ts
│   │   └── extract.ts
│   ├── guards/
│   │   ├── injection.ts
│   │   └── rateLimit.ts
│   └── auth.ts
├── tests/
│   ├── tools.test.ts
│   ├── routing.test.ts
│   └── eval/runDataset.ts
└── scripts/
    └── seedKb.ts
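Both layouts start with langgraph.json, the deployment manifest. As a rough sketch of what the Python variant might contain, the snippet below renders one from a dict. The keys (dependencies, graphs, env) follow the LangGraph CLI config format. The graph id "jarvis" and the path ./src/jarvis.py:graph are assumptions taken from the layout above and the client calls later in this module, so adjust them to your repo.

```python
import json

# Assumed manifest contents. "jarvis" is the graph id the client uses later;
# the path points at the `graph` variable exported by src/jarvis.py.
manifest = {
    "dependencies": ["."],
    "graphs": {"jarvis": "./src/jarvis.py:graph"},
    "env": ".env",
}


def render_manifest(config: dict) -> str:
    """Serialize the manifest as pretty-printed JSON for langgraph.json."""
    return json.dumps(config, indent=2) + "\n"


if __name__ == "__main__":
    # Print the file body; redirect to langgraph.json if it looks right.
    print(render_manifest(manifest), end="")
```

The TypeScript layout would point at ./src/jarvis.ts:graph instead.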

3. The build, in order

Build it in this sequence — each step gives you something you can run before moving on.

State. Define JarvisState with messages (add_messages reducer), next (supervisor's pick), user_email, org_id. (Module 4.)
Tools. Implement the four tool modules with real or mocked APIs. Add validation, error returns, identity from config. (Module 5.)
Specialists. One ReAct agent per domain, each with its own tools, prompt, and a small fast model. (Modules 3, 7.)
Supervisor. Routing model returning {next: ...}; supervisor node that calls it; conditional edges to specialists; edges back. (Module 7.)
Memory. Add the load_user_memories node before the supervisor; add the remember_preference tool to relevant specialists; configure store with embeddings. (Module 6.)
Human-in-the-loop. Wrap send_email and grant_access in interrupt(). Add a UI flow in your test client to approve/edit/decline. (Module 8.)
Guardrails. Input guardrail node ahead of memory; per-user rate limit in tools; allow-list of tools per specialist. (Module 11.)
Persistence. Compile with the platform's default checkpointer + Postgres store. Locally use SQLite. (Modules 6, 9.)
Deployment. langgraph.json + langgraph dev locally; langgraph deploy to the Platform. (Module 9.)
Observability. Tracing on. Build a small gold dataset (20 cases to start). Wire offline evals into your CI. Sample 10% of prod traces for online evals. (Module 10.)
Client. Minimal web chat that calls the SDK: get/create thread, stream a run, render updates, prompt for approvals on interrupts. (Module 9.)
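The Tools step above asks for validation, error returns, and identity from config. A framework-free sketch of that shape, assuming a mocked helpdesk backend: open_it_ticket and the user_email / org_id keys come from this module, while the length limits, the returned dict shape, and the ticket id format are illustrative assumptions.

```python
import itertools
import re
from typing import Any

EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
_ticket_ids = itertools.count(1001)  # stand-in for the real helpdesk's ids


def open_it_ticket(summary: str, config: dict[str, Any]) -> dict[str, Any]:
    """Open a helpdesk ticket for the calling user (mocked backend)."""
    configurable = config.get("configurable", {})
    user_email = configurable.get("user_email", "")
    org_id = configurable.get("org_id", "")

    # Identity comes from config, never from model-generated arguments.
    if not EMAIL_RE.match(user_email) or not org_id:
        return {"ok": False, "error": "missing or invalid caller identity"}

    # Validate inputs and return errors as data so the model can recover.
    summary = summary.strip()
    if not 5 <= len(summary) <= 200:
        return {"ok": False, "error": "summary must be 5-200 characters"}

    return {
        "ok": True,
        "ticket_id": f"IT-{next(_ticket_ids)}",
        "requester": user_email,
        "org_id": org_id,
    }
```

Returning an error dict rather than raising keeps the failure visible to the agent, which can then retry with corrected arguments.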

If you're stuck at any step, scroll back to that module — the code there is the source of truth.
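The Guardrails step mentions a per-user rate limit in tools. One way to sketch it is a sliding-window limiter keyed by user_email. The class name, the limits, and the in-memory storage are assumptions for illustration; a deployment with several replicas would need a shared store instead.

```python
import time
from collections import defaultdict, deque


class RateLimiter:
    """Allow at most `limit` calls per `window_s` seconds for each user."""

    def __init__(self, limit: int, window_s: float, clock=time.monotonic):
        self.limit = limit
        self.window_s = window_s
        self.clock = clock  # injectable so tests can control time
        self._calls = defaultdict(deque)  # user_email -> call timestamps

    def allow(self, user_email: str) -> bool:
        now = self.clock()
        calls = self._calls[user_email]
        # Drop timestamps that have aged out of the window.
        while calls and now - calls[0] >= self.window_s:
            calls.popleft()
        if len(calls) >= self.limit:
            return False
        calls.append(now)
        return True
```

A tool would call allow(user_email) first and return an error result when it gets False, instead of calling the backend.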

4. The wired-up `src/jarvis.py` — what it looks like

from langgraph.graph import StateGraph, START, END
from .state import JarvisState
from .memory.load import load_user_memories
from .guards.injection import injection_check
from .supervisor import supervisor_node, route_after_supervisor, SPECIALISTS

def build():
    g = StateGraph(JarvisState)

    # Pipeline nodes
    g.add_node("inject_check", injection_check)        # guardrail
    g.add_node("memories",    load_user_memories)      # load long-term facts
    g.add_node("supervisor",  supervisor_node)         # picks next worker

    # Specialist sub-agents (each itself a compiled ReAct graph)
    for name, agent in SPECIALISTS.items():
        g.add_node(name, agent)

    # Wire it
    g.add_edge(START, "inject_check")
    g.add_edge("inject_check", "memories")
    g.add_edge("memories", "supervisor")
    g.add_conditional_edges(
        "supervisor",
        route_after_supervisor,
        {**{n: n for n in SPECIALISTS}, "FINISH": END},
    )
    for name in SPECIALISTS:
        g.add_edge(name, "supervisor")                 # specialists report back

    return g

# Exported for langgraph deploy — Platform compiles with its own checkpointer/store.
graph = build()

TypeScript equivalent (src/jarvis.ts):

import { StateGraph, START, END } from "@langchain/langgraph";
import { JarvisState } from "./state";
import { loadUserMemories } from "./memory/load";
import { injectionCheck } from "./guards/injection";
import { supervisorNode, routeAfterSupervisor, SPECIALISTS } from "./supervisor";

function build() {
  const g = new StateGraph(JarvisState)
    .addNode("inject_check", injectionCheck)
    .addNode("memories", loadUserMemories)
    .addNode("supervisor", supervisorNode);

  for (const [name, agent] of Object.entries(SPECIALISTS)) g.addNode(name as any, agent);

  g.addEdge(START, "inject_check")
   .addEdge("inject_check", "memories")
   .addEdge("memories", "supervisor")
   .addConditionalEdges("supervisor", routeAfterSupervisor, {
      ...Object.fromEntries(Object.keys(SPECIALISTS).map(n => [n, n])),
      FINISH: END,
   });
  for (const name of Object.keys(SPECIALISTS)) g.addEdge(name as any, "supervisor");

  return g;
}

export const graph = build();

Every concept from the course shows up here in exactly one place. You should now be able to read this file front-to-back and explain every line to a coworker.
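The one function in that file whose behavior is not obvious from its name is route_after_supervisor. A minimal sketch of what it might do, assuming the supervisor writes its pick into state["next"] and that the specialist node names are the four shown here (both are assumptions; the real definitions live in supervisor.py):

```python
# Placeholder registry: in the real module the values are compiled agents.
SPECIALISTS = {"it": None, "hr": None, "calendar": None, "knowledge": None}


def route_after_supervisor(state: dict) -> str:
    """Return the next node name, or "FINISH" to end the run."""
    nxt = state.get("next")
    # Anything that is not a known specialist (including a missing or
    # malformed pick) ends the run rather than raising inside the graph.
    return nxt if nxt in SPECIALISTS else "FINISH"


# Same shape as the mapping passed to add_conditional_edges above;
# the string "END" stands in for langgraph's END constant.
path_map = {**{name: name for name in SPECIALISTS}, "FINISH": "END"}
```

Falling back to "FINISH" on an unknown value is a design choice: it trades a silent early exit for never routing to a node that does not exist.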

5. Running it

# Local: serves the API at http://localhost:2024 and opens the Studio UI against it
langgraph dev

# Deploy
langgraph deploy --name jarvis-prod

# Hit it from your client (Python)
import asyncio
from langgraph_sdk import get_client

async def main():
    client = get_client(url="https://...langgraph.app")
    thread = await client.threads.create()
    # Stream per-node state updates for a single run on this thread
    async for chunk in client.runs.stream(
        thread["thread_id"], "jarvis",
        input={"messages": [{"role":"user","content":"Printer floor 3 jammed; also lunch with Anuj tomorrow 1pm"}]},
        config={"configurable": {"user_email":"priya@acme.com", "org_id":"acme"}},
        stream_mode="updates",
    ):
        print(chunk.event, chunk.data)

asyncio.run(main())

TypeScript:

npx @langchain/langgraph-cli dev
npx @langchain/langgraph-cli deploy --name jarvis-prod

import { Client } from "@langchain/langgraph-sdk";
const client = new Client({ apiUrl: "https://...langgraph.app" });
const thread = await client.threads.create();
for await (const chunk of client.runs.stream(thread.thread_id, "jarvis", {
  input: { messages: [{ role: "user", content: "..." }] },
  config: { configurable: { user_email: "priya@acme.com", org_id: "acme" } },
  streamMode: "updates",
})) console.log(chunk.event, chunk.data);
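Both clients print raw chunks, but a real client also has to notice when a run pauses for approval. A small sketch of that check, assuming interrupts show up in an "updates" chunk under an "__interrupt__" key whose entries carry a "value" field (verify this against your SDK version). The resume payload shape with "decision" and "args" is hypothetical and must match whatever your interrupt() call expects.

```python
from typing import Any, Optional

DECISIONS = {"approve", "edit", "decline"}


def pending_approvals(data: Any) -> list:
    """Return the payloads of any interrupts found in one 'updates' chunk."""
    if not isinstance(data, dict):
        return []
    interrupts = data.get("__interrupt__") or []
    return [i.get("value") if isinstance(i, dict) else i for i in interrupts]


def build_resume(decision: str, edited_args: Optional[dict] = None) -> dict:
    """Build the value sent back to resume a paused run (hypothetical shape)."""
    if decision not in DECISIONS:
        raise ValueError(f"unknown decision: {decision!r}")
    if decision == "edit" and not edited_args:
        raise ValueError("an edit decision needs the edited arguments")
    payload: dict = {"decision": decision}
    if decision == "edit":
        payload["args"] = edited_args
    return payload
```

In the streaming loop you would call pending_approvals(chunk.data), show each payload to the user, and send the result of build_resume back as the resume value covered in Module 8.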

6. Self-assessment — can you do all of this?

If you can answer "yes" to every item, you can ship multi-agent systems on your own.

I can explain in one sentence what an LLM agent is, and pick agent vs. workflow correctly.
I can place LangChain / LangGraph / LangSmith / LangGraph Platform on the right layer of the stack.
I can build a working tool-calling agent from primitives (model, messages, prompt, tools) without a prebuilt.
I can express the agent loop as a LangGraph StateGraph with conditional edges, and explain reducers.
I can design good tools — names, descriptions, schemas, error returns, identity from config.
I can add short-term memory (checkpointer/threads) and long-term memory (store + namespaces).
I can pick the right multi-agent pattern (supervisor / network / hierarchical / swarm) for a given problem.
I can add a human-in-the-loop approval gate with interrupt() and resume with Command(resume=…).
I can deploy a graph to LangGraph Platform and explain control plane vs. data plane.
I can wire LangSmith tracing and run offline + online evals with a gold dataset.
I can list the production hardening checklist from Module 11 and explain why each item is on it.

7. What to learn next

Agent UX: building chat surfaces that handle streaming, interrupts, and approvals well. The Vercel AI SDK + LangChain's ai-sdk integration on the TS side is a good starting point.
Voice agents: wrap your graph with a realtime voice layer (Vapi, Deepgram, ElevenLabs). The graph stays the same; only the I/O changes.
Agent evals at scale: read up on trajectory evaluation (grading the path, not just the final answer), pairwise preference evals, and human-in-the-loop annotation queues.
Self-improving agents: agents that update their own procedural memory ("next time, remember to check toner first") based on success/failure feedback. The store + a post-turn reflection node is enough to start.
Specialised orchestration patterns: research plan-and-execute, Reflexion, tree-of-thoughts, and CodeAct agents. Most are 2–3 nodes added to a base ReAct graph.
Cost-conscious frontier-model use: re-read each provider's prompt-caching, batch, and structured-output docs every six months or so, because the economics shift about that often.
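Trajectory evaluation, from the list above, is easy to prototype before you adopt a full eval framework. The sketch below grades the sequence of nodes a run visited against an expected path. The function name, the node names in the example, and the three reported fields are illustrative assumptions, not a LangSmith API.

```python
def grade_trajectory(expected: list[str], actual: list[str]) -> dict:
    """Grade the path a run took, not just its final answer."""
    remaining = iter(actual)
    # Subsequence check: each expected step must appear in `actual`
    # after the previous one; `in` consumes the iterator as it searches.
    in_order = all(step in remaining for step in expected)
    return {
        "exact": expected == actual,
        "in_order": in_order,
        "extra_steps": max(0, len(actual) - len(expected)),
    }
```

For example, an expected path of ["supervisor", "it"] against an actual path of ["supervisor", "knowledge", "it", "supervisor"] is not an exact match, but the expected steps do appear in order with two extra steps, which may point to a wasted detour.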

8. You're done

★ Jarvis status: shipped

You built Jarvis. More importantly, you now understand the entire stack underneath it — from the ReAct loop to the control plane. Take this same architecture and apply it to whatever your real product is. Every other multi-agent system you'll build is a permutation of what you've already done.

Bookmark the glossary and come back to specific modules whenever you need a refresher.

Final check

1. What's the most important thing you should take away from this course?

Exact API signatures for LangGraph methods.
The mental model: state + nodes + edges; specialise → supervise; tool = name + description + schema; control plane vs. data plane.
Which AI model is best.

2. When LangGraph's API names change in 6 months, you will:

Have to relearn everything.
Skim the migration notes and keep building, because the concepts are the same.
Switch frameworks.