
LangGraph Agents in Production: Build Stateful AI Workflows with Python (2026)

Yassine El Haddad
Software Developer & Automation Specialist

I build production AI agents, web scrapers, and automation pipelines. Most of what I publish here comes from the actual problems they run into: proxies that get banned, anti-bot stacks that fingerprint your client, RAG that drifts when the underlying data moves. Stack: Python, TypeScript, Go, FastAPI, LangChain, Crawlee, Playwright, deployed on AWS, GCP, and Cloudflare.

LangGraph is LangChain's graph-based framework for building stateful, multi-step AI agents. Unlike simple chains, LangGraph lets you define nodes (functions), edges (transitions), conditional branching, loops, and human-in-the-loop checkpoints. It's the go-to choice for production agents that need persistence, interrupts, and complex control flow. Use Apify for web data in LangGraph.

What LangGraph Is

LangGraph models an agent as a state graph: a directed graph of nodes connected by edges. Each node is a function that receives and updates shared state. Edges define transitions; conditional edges let the graph branch based on state. You can loop (e.g., "search → summarize → validate → repeat if invalid") and add interrupts so a human can approve or correct before the next step.

This is a step up from basic LangChain chains, which are linear and stateless. LangGraph supports:

  • State persistence: checkpoint to PostgreSQL or memory; resume or replay runs
  • Loops: conditional edges that route back to earlier nodes
  • Human-in-the-loop: interrupt_before to pause for approval
  • Parallel execution: fan-out to multiple nodes when possible

Core Concepts: StateGraph, Nodes, Edges

from typing import TypedDict

from langgraph.graph import StateGraph, END

class State(TypedDict, total=False):
    results: list
    summary: str

def search_node(state: State):
    # state is a dict; return only the keys you want to update
    return {"results": [...]}

def summarize_node(state: State):
    return {"summary": "..."}

graph = StateGraph(State)
graph.add_node("search", search_node)
graph.add_node("summarize", summarize_node)
graph.add_edge("search", "summarize")
graph.add_edge("summarize", END)
graph.set_entry_point("search")
app = graph.compile()

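StateGraph takes a state schema, typically a TypedDict (a Pydantic model or dataclass also works). Each node receives the current state and returns a partial update, which LangGraph merges into state key by key: a returned key overwrites the old value unless that key declares a reducer. add_edge defines transitions. compile() produces a runnable that you invoke with the input state.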

Why LangGraph Over Basic LangChain Chains

| Capability | LangChain Chains | LangGraph |
| --- | --- | --- |
| State | Stateless per call | Full state, persisted |
| Loops | Not supported | Conditional edges back |
| Human approval | Manual wiring | interrupt_before |
| Checkpointing | None | PostgreSQL, memory |
| Streaming | Per-step | Per-node, intermediate |
| Parallel nodes | Limited | Native fan-out |

Use LangGraph when you need retry loops, approval gates, or long-running workflows that must survive restarts. For simple prompt → LLM → parser flows, a basic chain is enough.

Code Example: Research Agent with Search and Summarize Loop

import os
from typing import TypedDict

from apify_client import ApifyClient
from langchain_anthropic import ChatAnthropic
from langgraph.checkpoint.postgres import PostgresSaver
from langgraph.graph import StateGraph, END

# Assumes the Apify token is exported as APIFY_TOKEN
apify_client = ApifyClient(os.environ["APIFY_TOKEN"])

class ResearchState(TypedDict, total=False):
    query: str
    results: list
    summary: str
    valid: bool

def search_node(state):
    # Call Apify Actor or Tavily; match run_input to the Actor's own input schema
    run = apify_client.actor("apify/web-scraper").call(
        run_input={"urls": [state["query"]]}
    )
    # call() returns run metadata; the scraped items live in the run's default dataset
    items = apify_client.dataset(run["defaultDatasetId"]).list_items().items
    return {"results": items}

def summarize_node(state):
    # Replace the model name with an ID your Anthropic account supports
    llm = ChatAnthropic(model="claude-3-5-sonnet")
    summary = llm.invoke(
        f"Summarize: {state['results']}"
    ).content
    return {"summary": summary}

def validate_node(state):
    # Simple heuristic or LLM-based check
    is_valid = len(state["summary"]) > 100
    return {"valid": is_valid}

graph = StateGraph(ResearchState)
graph.add_node("search", search_node)
graph.add_node("summarize", summarize_node)
graph.add_node("validate", validate_node)
graph.add_edge("search", "summarize")
graph.add_edge("summarize", "validate")
graph.add_conditional_edges(
    "validate",
    lambda s: "end" if s["valid"] else "search",
    {"end": END, "search": "search"},
)
graph.set_entry_point("search")

# With checkpointing: from_conn_string is a context manager
with PostgresSaver.from_conn_string("postgresql://...") as checkpointer:
    checkpointer.setup()  # creates the checkpoint tables on first run
    app = graph.compile(checkpointer=checkpointer)
    result = app.invoke(
        {"query": "Latest AI news"},
        config={"configurable": {"thread_id": "1"}},
    )

This agent searches, summarizes, and loops back to search if the summary is too short. Checkpointing to PostgreSQL lets you resume or inspect runs.

Production Patterns

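Checkpointing to PostgreSQL: Open the saver with PostgresSaver.from_conn_string(), which is a context manager, and call setup() once to create the checkpoint tables. Pass thread_id in config so each conversation has its own thread. State is saved after each step of the graph.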

Streaming intermediate results: Use app.stream() instead of invoke. You get events per node; stream tokens or partial state to the UI.

Human-in-the-loop: Add interrupt_before=["validate"] when compiling, together with a checkpointer, since the paused state has to be stored somewhere. The graph pauses before validate; after the human approves, resume by calling app.invoke(None, config) with the same thread_id.

Cost and latency: Each node that calls an LLM consumes tokens. Cache repeated calls (e.g., same search query) with LangChain's caching. Use smaller models for validation, larger for summarization. Monitor token usage per graph run; a research agent with three LLM nodes can easily consume 10–50K tokens per query. Set budgets and fallbacks for production.
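The budget and caching ideas can be sketched with the standard library alone. Everything here is a hypothetical helper, not part of LangGraph or LangChain: `TokenBudget` tracks spend for one run, and `cached_search` stands in for an expensive search call. The token counts are made-up numbers; in practice they would come from the usage metadata your model client reports.

```python
from functools import lru_cache


class BudgetExceeded(RuntimeError):
    """Raised when a run would spend more tokens than allowed."""


class TokenBudget:
    """Tracks tokens spent during one graph run (hypothetical helper)."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.used = 0

    def charge(self, tokens: int) -> None:
        # Refuse the charge before spending, so the caller can fall back
        if self.used + tokens > self.limit:
            raise BudgetExceeded(
                f"{self.used + tokens} tokens would exceed the limit of {self.limit}"
            )
        self.used += tokens


search_calls = []  # records every real (uncached) search


@lru_cache(maxsize=256)
def cached_search(query: str) -> tuple:
    """Stand-in for an expensive search; identical queries hit the cache."""
    search_calls.append(query)
    return (f"result for {query}",)


budget = TokenBudget(limit=50_000)

cached_search("latest ai news")
cached_search("latest ai news")  # served from the cache, no second call

budget.charge(12_000)  # e.g. tokens for a summarize call
budget.charge(30_000)  # e.g. tokens for a second, larger call
try:
    budget.charge(20_000)  # 42,000 + 20,000 is over the 50,000 limit
    over_budget = False
except BudgetExceeded:
    over_budget = True

print(len(search_calls), budget.used, over_budget)
```

A node could call `budget.charge(...)` after each LLM response and route to a cheaper fallback node when `BudgetExceeded` is raised.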

Deploying LangGraph

| Option | Use Case |
| --- | --- |
| LangGraph Platform | Managed cloud; built-in LangSmith, deployment |
| FastAPI | Self-hosted; wrap app.invoke in an endpoint |
| Docker | Package FastAPI + LangGraph; deploy anywhere |
| Serverless | Possible but checkpoints need external Postgres; cold starts affect latency |

For self-hosted FastAPI:

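import uuid
from contextlib import asynccontextmanager

from fastapi import FastAPI
from langgraph.checkpoint.postgres import PostgresSaver

from research_graph import graph  # hypothetical module holding the StateGraph built earlier

@asynccontextmanager
async def lifespan(app: FastAPI):
    # from_conn_string is a context manager; keep it open for the app's lifetime
    with PostgresSaver.from_conn_string(...) as checkpointer:
        app.state.graph = graph.compile(checkpointer=checkpointer)
        yield

app = FastAPI(lifespan=lifespan)

@app.post("/research")
def research(query: str):
    # Sync handler: FastAPI runs it in a threadpool, so invoke does not block the event loop
    return app.state.graph.invoke(
        {"query": query},
        config={"configurable": {"thread_id": str(uuid.uuid4())}},
    )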

Comparison: LangGraph vs AutoGen vs CrewAI vs Bare Function Calling

| Attribute | LangGraph | AutoGen | CrewAI | Bare Function Calling |
| --- | --- | --- | --- | --- |
| State | Explicit, persisted | Conversational | Agent memory | Manual |
| Graph model | Yes (nodes, edges) | No (conversation) | No (roles/tasks) | No |
| Checkpointing | ✓ Built-in | Limited | Limited | Manual |
| Python-first | ✓ | ✓ | ✓ | Any |
| Apify integration | Custom node | Custom tool | Custom tool | Custom |
| Human-in-loop | ✓ interrupt_before | Manual | Manual | Manual |

LangGraph is best when you need explicit control flow, loops, and persistence. AutoGen and CrewAI suit multi-agent conversations; bare function calling suits simple tool-use without a framework.

Integration with Apify

Add an Apify node to your LangGraph:

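from langchain_apify import ApifyWrapper
from langchain_core.documents import Document

def apify_node(state):
    apify = ApifyWrapper()  # reads APIFY_API_TOKEN from the environment
    # call_actor runs the Actor and returns a loader over its dataset
    loader = apify.call_actor(
        actor_id="apify/website-content-crawler",
        run_input={"startUrls": [{"url": state["url"]}], "maxCrawlPages": 5},
        dataset_mapping_function=lambda item: Document(
            page_content=item.get("markdown") or "",
            metadata={"source": item.get("url")},
        ),
    )
    return {"documents": loader.load()}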

Use this for web research, RAG ingestion, or competitive intel. Chaining multiple Apify nodes (e.g., search, then content crawl) lets you build sophisticated research graphs without leaving the LangGraph model. See LangChain Apify content pipeline and AI research agent with LangGraph and Apify for full examples.


Frequently Asked Questions

What is LangGraph?
LangGraph is LangChain's graph-based framework for stateful AI agents. You define nodes (functions), edges (transitions), and conditional edges for loops and branching.

When should I use LangGraph instead of a basic chain?
Use LangGraph when you need loops, human-in-the-loop checkpoints, state persistence, or complex branching. Simple linear flows can stay as chains.

How do I add human-in-the-loop approval?
Use interrupt_before when compiling: graph.compile(interrupt_before=['node_name']). The graph pauses before that node; resume with invoke(None, config) after human approval.

Can I use Apify with LangGraph?
Yes. Add a node that uses ApifyWrapper or the Apify API. Common uses are web search, content crawling, or dataset retrieval as input to other nodes.

How do I reduce cost and latency?
Cache LLM calls for repeated inputs. Use smaller models for validation. Limit loop iterations. Consider streaming to surface results before the full run completes.

How do I deploy a LangGraph agent?
LangGraph Platform (managed), self-hosted FastAPI, or Docker. For production, use PostgreSQL checkpointing and ensure your deployment can access it.

Common mistakes and fixes

LangGraph checkpointing fails

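Ensure the PostgreSQL connection string is correct and that the database user can create and write the checkpoint tables (PostgresSaver's setup() creates them on first run). Use MemorySaver for local dev and PostgresSaver for production.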

Graph runs out of tokens or loops

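Set recursion_limit in the run config, for example config={"recursion_limit": 10}; LangGraph raises GraphRecursionError when a run exceeds it. Use conditional edges with a clear termination condition. Cache LLM calls where possible.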