DeerFlow Backend
DeerFlow is a LangGraph-based AI super agent with sandbox execution, persistent memory, and extensible tool integration. The backend enables AI agents to execute code, browse the web, manage files, delegate tasks to subagents, and retain context across conversations - all in isolated, per-thread environments.
Architecture
```
┌──────────────────────────────────────┐
│          Nginx (Port 2026)           │
│        Unified reverse proxy         │
└───────┬──────────────────┬───────────┘
        │                  │
 /api/langgraph/*          │  /api/* (other)
 rewritten to /api/*       │
        ▼                  ▼
┌────────────────────────────────────────┐
│          Gateway API (8001)            │
│      FastAPI REST + agent runtime      │
│                                        │
│ Models, MCP, Skills, Memory, Uploads,  │
│ Artifacts, Threads, Runs, Streaming    │
│                                        │
│ ┌────────────────────────────────────┐ │
│ │             Lead Agent             │ │
│ │ Middleware Chain, Tools, Subagents │ │
│ └────────────────────────────────────┘ │
└────────────────────────────────────────┘
```
Request Routing (via Nginx):
- `/api/langgraph/*` → Gateway LangGraph-compatible API - agent interactions, threads, streaming
- `/api/*` (other) → Gateway API - models, MCP, skills, memory, artifacts, uploads, thread-local cleanup
- `/` (non-API) → Frontend - Next.js web interface
Core Components
Lead Agent
The single LangGraph agent (`lead_agent`) is the runtime entry point, created via `make_lead_agent(config)`. It combines:
- Dynamic model selection with thinking and vision support
- Middleware chain for cross-cutting concerns (9 middlewares)
- Tool system with sandbox, MCP, community, and built-in tools
- Subagent delegation for parallel task execution
- System prompt with skills injection, memory context, and working directory guidance
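A minimal sketch of the entry point, with illustrative import paths (`deerflow.config.load_config` and the module path of `make_lead_agent` are assumptions; only the factory name comes from the description above):

```python
# Hypothetical wiring; the real modules live under src/agents/lead_agent/.
from deerflow.config import load_config                  # assumed config loader
from deerflow.agents.lead_agent import make_lead_agent   # assumed import path

config = load_config("config.yaml")
lead_agent = make_lead_agent(config)  # model selection + middleware chain + tools

# The compiled LangGraph graph is invoked per thread; thread_id scopes the
# sandbox, uploads, and memory to one conversation.
result = lead_agent.invoke(
    {"messages": [{"role": "user", "content": "List the files in my workspace"}]},
    config={"configurable": {"thread_id": "thread-123"}},
)
```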
Middleware Chain
Middlewares execute in strict order, each handling a specific concern:
| # | Middleware | Purpose |
|---|---|---|
| 1 | ThreadDataMiddleware | Creates per-thread isolated directories (workspace, uploads, outputs) |
| 2 | UploadsMiddleware | Injects newly uploaded files into conversation context |
| 3 | SandboxMiddleware | Acquires sandbox environment for code execution |
| 4 | SummarizationMiddleware | Reduces context when approaching token limits (optional) |
| 5 | TodoListMiddleware | Tracks multi-step tasks in plan mode (optional) |
| 6 | TitleMiddleware | Auto-generates conversation titles after first exchange |
| 7 | MemoryMiddleware | Queues conversations for async memory extraction |
| 8 | ViewImageMiddleware | Injects image data for vision-capable models (conditional) |
| 9 | ClarificationMiddleware | Intercepts clarification requests and interrupts execution (must be last) |
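The ordering itself is load-bearing. A small illustrative check (class names come from the table; everything else is a sketch, not the shipped code):

```python
# Illustrative only: the real classes live in src/agents/middlewares/.
# This pins the one hard rule from the table above: ClarificationMiddleware
# must run last so its interrupt is not swallowed by a later middleware.
MIDDLEWARE_ORDER = [
    "ThreadDataMiddleware",     # 1. create per-thread dirs before anything touches them
    "UploadsMiddleware",        # 2. needs the uploads dir from step 1
    "SandboxMiddleware",        # 3. sandbox mounts the thread workspace
    "SummarizationMiddleware",  # 4. optional
    "TodoListMiddleware",       # 5. optional (plan mode)
    "TitleMiddleware",          # 6. after the first exchange
    "MemoryMiddleware",         # 7. queues async extraction
    "ViewImageMiddleware",      # 8. conditional on vision support
    "ClarificationMiddleware",  # 9. must be last
]

assert MIDDLEWARE_ORDER[-1] == "ClarificationMiddleware"
```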
Sandbox System
Per-thread isolated execution with virtual path translation:
- Abstract interface: `execute_command`, `read_file`, `write_file`, `list_dir`
- Providers: `LocalSandboxProvider` (filesystem) and `AioSandboxProvider` (Docker, in community/)
- Virtual paths: `/mnt/user-data/{workspace,uploads,outputs}` → thread-specific physical directories (see the translation sketch below)
- Skills path: `/mnt/skills` → `deer-flow/skills/` directory
- Skills loading: Recursively discovers nested `SKILL.md` files under `skills/{public,custom}` and preserves nested container paths
- File-write safety: `str_replace` serializes read-modify-write per `(sandbox.id, path)`, so isolated sandboxes keep full concurrency even when virtual paths match
- Tools: `bash`, `ls`, `read_file`, `write_file`, `str_replace` (`write_file` overwrites by default and exposes `append` for end-of-file writes; `bash` is disabled by default when using `LocalSandboxProvider`; use `AioSandboxProvider` for isolated shell access)
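A minimal sketch of the virtual-path translation, assuming an illustrative per-thread directory layout:

```python
# Illustrative: the real translation lives in the sandbox providers.
from pathlib import Path

VIRTUAL_ROOTS = {
    "/mnt/user-data/workspace": "workspace",
    "/mnt/user-data/uploads": "uploads",
    "/mnt/user-data/outputs": "outputs",
}

def to_physical(virtual_path: str, thread_dir: Path) -> Path:
    """Map an agent-visible /mnt/user-data/... path into one thread's directory."""
    for prefix, subdir in VIRTUAL_ROOTS.items():
        if virtual_path == prefix or virtual_path.startswith(prefix + "/"):
            return thread_dir / subdir / virtual_path[len(prefix):].lstrip("/")
    raise ValueError(f"path outside the sandbox: {virtual_path}")

# Two threads resolve the same virtual path to different physical files:
print(to_physical("/mnt/user-data/outputs/report.md", Path("/data/threads/t1")))
print(to_physical("/mnt/user-data/outputs/report.md", Path("/data/threads/t2")))
```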
Subagent System
Async task delegation with concurrent execution:
- Built-in agents: `general-purpose` (full toolset) and `bash` (command specialist, exposed only when shell access is available)
- Concurrency: Max 3 subagents per turn, 15-minute timeout
- Execution: Background thread pools with status tracking and SSE events
- Flow: Agent calls the `task()` tool → executor runs the subagent in the background → polls for completion → returns the result (see the sketch after this list)
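A conceptual sketch of that flow under the stated limits; the real engine (src/subagents/executor.py) adds status tracking and SSE events, and `run_subagent` here is a stand-in for the actual subagent call:

```python
# Illustrative executor: max 3 concurrent subagents, 15-minute cap.
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

SUBAGENT_TIMEOUT_S = 15 * 60
_pool = ThreadPoolExecutor(max_workers=3)  # "max 3 subagents per turn"

def run_subagent(name: str, prompt: str) -> str:
    return f"[{name}] finished: {prompt!r}"  # placeholder for the real subagent run

def task(name: str, prompt: str) -> str:
    """What the task() tool conceptually does: submit, wait, return."""
    future = _pool.submit(run_subagent, name, prompt)
    try:
        return future.result(timeout=SUBAGENT_TIMEOUT_S)
    except FutureTimeout:
        future.cancel()
        return f"[{name}] timed out after 15 minutes"

print(task("general-purpose", "summarize the workspace"))
```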
Memory System
LLM-powered persistent context retention across conversations:
- Automatic extraction: Analyzes conversations for user context, facts, and preferences
- Structured storage: User context (work, personal, top-of-mind), history, and confidence-scored facts
- Debounced updates: Batches updates to minimize LLM calls (configurable wait time)
- System prompt injection: Top facts + context injected into agent prompts
- Storage: JSON file with mtime-based cache invalidation
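A sketch of the debounce pattern described above, with illustrative names and wait time rather than the shipped defaults:

```python
# Each queued conversation resets the timer, so a burst of turns ends in a
# single extraction LLM call. Illustrative only.
import asyncio

class DebouncedMemoryUpdater:
    def __init__(self, extract, wait_s: float = 30.0):
        self._extract = extract           # async fn: batch -> None (the LLM call)
        self._wait_s = wait_s             # configurable wait time
        self._pending: list = []
        self._task: asyncio.Task | None = None

    def queue(self, conversation) -> None:
        self._pending.append(conversation)
        if self._task:
            self._task.cancel()           # restart the wait on every new item
        self._task = asyncio.get_running_loop().create_task(self._flush_later())

    async def _flush_later(self) -> None:
        await asyncio.sleep(self._wait_s)
        batch, self._pending = self._pending, []
        await self._extract(batch)        # one LLM call for the whole batch
```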
Tool Ecosystem
| Category | Tools |
|---|---|
| Sandbox | bash, ls, read_file, write_file, str_replace |
| Built-in | present_files, ask_clarification, view_image, task (subagent) |
| Community | Tavily (web search), Jina AI (web fetch), Firecrawl (scraping), DuckDuckGo (image search) |
| MCP | Any Model Context Protocol server (stdio, SSE, HTTP transports) |
| Skills | Domain-specific workflows injected via system prompt |
Gateway API
FastAPI application providing REST endpoints for frontend integration:
| Route | Purpose |
|---|---|
| `GET /api/models` | List available LLM models |
| `GET/PUT /api/mcp/config` | Manage MCP server configurations |
| `GET/PUT /api/skills` | List and manage skills |
| `POST /api/skills/install` | Install skill from .skill archive |
| `GET /api/memory` | Retrieve memory data |
| `POST /api/memory/reload` | Force memory reload |
| `GET /api/memory/config` | Memory configuration |
| `GET /api/memory/status` | Combined config + data |
| `POST /api/threads/{id}/uploads` | Upload files (auto-converts PDF/PPT/Excel/Word to Markdown, rejects directory paths, auto-renames duplicate filenames in one request) |
| `GET /api/threads/{id}/uploads/list` | List uploaded files |
| `DELETE /api/threads/{id}` | Delete DeerFlow-managed local thread data after LangGraph thread deletion; unexpected failures are logged server-side and return a generic 500 detail |
| `GET /api/threads/{id}/artifacts/{path}` | Serve generated artifacts |
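Illustrative client calls against a local Gateway (payload shapes are simplified; the multipart field name is an assumption, so consult the API Reference for the exact schemas):

```python
import requests

BASE = "http://localhost:8001"

# List configured models
models = requests.get(f"{BASE}/api/models").json()

# Upload a file into a thread (auto-converted to Markdown when applicable).
# The "files" field name is assumed for illustration.
with open("report.pdf", "rb") as f:
    requests.post(
        f"{BASE}/api/threads/thread-123/uploads",
        files={"files": ("report.pdf", f, "application/pdf")},
    )
```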
IM Channels
The IM bridge supports Feishu, Slack, and Telegram. Slack and Telegram still use the final `runs.wait()` response path, while Feishu now streams through `runs.stream(["messages-tuple", "values"])` and updates a single in-thread card in place.
For Feishu card updates, DeerFlow stores the running card's `message_id` per inbound message and patches that same card until the run finishes, preserving the existing OK / DONE reaction flow.
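A conceptual sketch of that card-patching loop; `send_card` and `patch_card` are placeholders for the Feishu client, the stream is simplified to an async iterator, and only the `message_id` bookkeeping mirrors the behavior above:

```python
# Conceptual only; not the shipped IM bridge code.
running_cards: dict[str, str] = {}  # inbound message_id -> running card message_id

def render(mode: str, chunk) -> str:
    return f"{mode}: {chunk}"  # real code renders Feishu card JSON

async def handle_feishu_message(inbound_id: str, client, stream):
    card_id = await client.send_card("Working...")             # placeholder API
    running_cards[inbound_id] = card_id
    async for mode, chunk in stream:                           # runs.stream([...])
        await client.patch_card(card_id, render(mode, chunk))  # same card each time
    running_cards.pop(inbound_id, None)                        # run finished
```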
Quick Start
Prerequisites
- Python 3.12+
- uv package manager
- API keys for your chosen LLM provider
Installation
```bash
cd deer-flow
# Copy configuration files
cp config.example.yaml config.yaml
# Install backend dependencies
cd backend
make install
```
Configuration
Edit config.yaml in the project root:
```yaml
models:
  - name: gpt-4o
    display_name: GPT-4o
    use: langchain_openai:ChatOpenAI
    model: gpt-4o
    api_key: $OPENAI_API_KEY
    supports_thinking: false
    supports_vision: true
  - name: gpt-5-responses
    display_name: GPT-5 (Responses API)
    use: langchain_openai:ChatOpenAI
    model: gpt-5
    api_key: $OPENAI_API_KEY
    use_responses_api: true
    output_version: responses/v1
    supports_vision: true
```
Set your API keys:
```bash
export OPENAI_API_KEY="your-api-key-here"
```
Running
Full Application (from project root):
```bash
make dev # Starts Gateway + Frontend + Nginx
```
Access at: http://localhost:2026
Backend Only (from backend directory):
```bash
# Gateway API + embedded agent runtime
make dev
```
Direct access: Gateway at http://localhost:8001
Project Structure
```
backend/
├── src/
│ ├── agents/ # Agent system
│ │ ├── lead_agent/ # Main agent (factory, prompts)
│ │ ├── middlewares/ # 9 middleware components
│ │ ├── memory/ # Memory extraction & storage
│ │ └── thread_state.py # ThreadState schema
│ ├── gateway/ # FastAPI Gateway API
│ │ ├── app.py # Application setup
│ │ └── routers/ # 6 route modules
│ ├── sandbox/ # Sandbox execution
│ │ ├── local/ # Local filesystem provider
│ │ ├── sandbox.py # Abstract interface
│ │ ├── tools.py # bash, ls, read/write/str_replace
│ │ └── middleware.py # Sandbox lifecycle
│ ├── subagents/ # Subagent delegation
│ │ ├── builtins/ # general-purpose, bash agents
│ │ ├── executor.py # Background execution engine
│ │ └── registry.py # Agent registry
│ ├── tools/builtins/ # Built-in tools
│ ├── mcp/ # MCP protocol integration
│ ├── models/ # Model factory
│ ├── skills/ # Skill discovery & loading
│ ├── config/ # Configuration system
│ ├── community/ # Community tools & providers
│ ├── reflection/ # Dynamic module loading
│ └── utils/ # Utilities
├── docs/ # Documentation
├── tests/ # Test suite
├── langgraph.json # LangGraph graph registry for tooling/Studio compatibility
├── pyproject.toml # Python dependencies
├── Makefile # Development commands
└── Dockerfile # Container build
```
`langgraph.json` is not the default service entrypoint. The scripts and Docker deployments run the Gateway embedded runtime; the file is kept for LangGraph tooling, Studio, or direct LangGraph Server compatibility.
Configuration
Main Configuration (config.yaml)
Place in project root. Config values starting with $ resolve as environment variables.
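A sketch of that `$`-prefix convention (the real resolver lives in the config system and may handle more edge cases):

```python
import os

def resolve_env(value):
    """Resolve config values like "$OPENAI_API_KEY" from the environment."""
    if isinstance(value, str) and value.startswith("$"):
        name = value[1:]
        resolved = os.environ.get(name)
        if resolved is None:
            raise KeyError(f"config references ${name}, but it is not set")
        return resolved
    return value

os.environ["OPENAI_API_KEY"] = "sk-demo"
assert resolve_env("gpt-4o") == "gpt-4o"               # literals pass through
assert resolve_env("$OPENAI_API_KEY") == "sk-demo"     # $VAR reads the env
```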
Key sections:
- `models` - LLM configurations with class paths, API keys, thinking/vision flags
- `tools` - Tool definitions with module paths and groups
- `tool_groups` - Logical tool groupings
- `sandbox` - Execution environment provider
- `skills` - Skills directory paths
- `title` - Auto-title generation settings
- `summarization` - Context summarization settings
- `subagents` - Subagent system (enabled/disabled)
- `memory` - Memory system settings (enabled, storage, debounce, facts limits)
Provider note:
- `models[*].use` references provider classes by module path (for example `langchain_openai:ChatOpenAI`).
- If a provider module is missing, DeerFlow returns an actionable error with install guidance (for example `uv add langchain-google-genai`).
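A sketch of how such a `module:Class` reference can be loaded, with an illustrative install hint (the shipped loader may differ):

```python
import importlib

def load_provider(use: str):
    """Resolve a "module:Class" reference like langchain_openai:ChatOpenAI."""
    module_path, _, class_name = use.partition(":")
    try:
        module = importlib.import_module(module_path)
    except ImportError as exc:
        hint = module_path.replace("_", "-")  # illustrative hint mapping
        raise ImportError(
            f"Provider module '{module_path}' is not installed. Try: uv add {hint}"
        ) from exc
    return getattr(module, class_name)
```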
Extensions Configuration (extensions_config.json)
MCP servers and skill states in a single file:
```json
{
  "mcpServers": {
    "github": {
      "enabled": true,
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": {"GITHUB_TOKEN": "$GITHUB_TOKEN"}
    },
    "secure-http": {
      "enabled": true,
      "type": "http",
      "url": "https://api.example.com/mcp",
      "oauth": {
        "enabled": true,
        "token_url": "https://auth.example.com/oauth/token",
        "grant_type": "client_credentials",
        "client_id": "$MCP_OAUTH_CLIENT_ID",
        "client_secret": "$MCP_OAUTH_CLIENT_SECRET"
      }
    }
  },
  "skills": {
    "pdf-processing": {"enabled": true}
  }
}
```
Environment Variables
- `DEER_FLOW_CONFIG_PATH` - Override config.yaml location
- `DEER_FLOW_EXTENSIONS_CONFIG_PATH` - Override extensions_config.json location
- Model API keys: `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, `DEEPSEEK_API_KEY`, etc.
- Tool API keys: `TAVILY_API_KEY`, `GITHUB_TOKEN`, etc.
LangSmith Tracing
DeerFlow has built-in LangSmith integration for observability. When enabled, all LLM calls, agent runs, tool executions, and middleware processing are traced and visible in the LangSmith dashboard.
Setup:
- Sign up at smith.langchain.com and create a project.
- Add the following to your `.env` file in the project root:
```bash
LANGSMITH_TRACING=true
LANGSMITH_ENDPOINT=https://api.smith.langchain.com
LANGSMITH_API_KEY=lsv2_pt_xxxxxxxxxxxxxxxx
LANGSMITH_PROJECT=xxx
```
Legacy variables: The LANGCHAIN_TRACING_V2, LANGCHAIN_API_KEY, LANGCHAIN_PROJECT, and LANGCHAIN_ENDPOINT variables are also supported for backward compatibility. LANGSMITH_* variables take precedence when both are set.
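A sketch of that precedence rule (illustrative helper, not the shipped code):

```python
import os

# New-style variable -> legacy fallback; LANGSMITH_* wins when both are set.
_FALLBACKS = {
    "LANGSMITH_TRACING": "LANGCHAIN_TRACING_V2",
    "LANGSMITH_API_KEY": "LANGCHAIN_API_KEY",
    "LANGSMITH_PROJECT": "LANGCHAIN_PROJECT",
    "LANGSMITH_ENDPOINT": "LANGCHAIN_ENDPOINT",
}

def tracing_setting(name: str) -> str | None:
    return os.environ.get(name) or os.environ.get(_FALLBACKS[name])
```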
Langfuse Tracing
DeerFlow also supports Langfuse observability for LangChain-compatible runs.
Add the following to your .env file:
```bash
LANGFUSE_TRACING=true
LANGFUSE_PUBLIC_KEY=pk-lf-xxxxxxxxxxxxxxxx
LANGFUSE_SECRET_KEY=sk-lf-xxxxxxxxxxxxxxxx
LANGFUSE_BASE_URL=https://cloud.langfuse.com
```
If you are using a self-hosted Langfuse deployment, set LANGFUSE_BASE_URL to your Langfuse host.
Dual Provider Behavior
If both LangSmith and Langfuse are enabled, DeerFlow initializes and attaches both callbacks so the same run data is reported to both systems.
If a provider is explicitly enabled but required credentials are missing, or the provider callback cannot be initialized, DeerFlow raises an error when tracing is initialized during model creation instead of silently disabling tracing.
Docker: In docker-compose.yaml, tracing is disabled by default (LANGSMITH_TRACING=false). Set LANGSMITH_TRACING=true and/or LANGFUSE_TRACING=true in your .env, together with the required credentials, to enable tracing in containerized deployments.
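A sketch of the fail-fast rule described above (the string placeholders stand in for real callback objects; names are illustrative):

```python
import os

def init_tracing_callbacks() -> list:
    """Explicitly enabled provider + missing credentials = error, not a no-op."""
    callbacks = []
    if os.environ.get("LANGSMITH_TRACING") == "true":
        if not os.environ.get("LANGSMITH_API_KEY"):
            raise RuntimeError("LANGSMITH_TRACING=true but LANGSMITH_API_KEY is missing")
        callbacks.append("langsmith-callback")   # placeholder for the real tracer
    if os.environ.get("LANGFUSE_TRACING") == "true":
        if not (os.environ.get("LANGFUSE_PUBLIC_KEY") and os.environ.get("LANGFUSE_SECRET_KEY")):
            raise RuntimeError("LANGFUSE_TRACING=true but Langfuse keys are missing")
        callbacks.append("langfuse-callback")    # both attach when both are enabled
    return callbacks
```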
Development
Commands
```bash
make install # Install dependencies
make dev     # Run Gateway API + embedded agent runtime (port 8001)
make gateway # Run Gateway API without reload (port 8001)
make lint    # Run linter (ruff)
make format  # Format code (ruff)
```
Code Style
- Linter/Formatter: ruff
- Line length: 240 characters
- Python: 3.12+ with type hints
- Quotes: Double quotes
- Indentation: 4 spaces
Testing
```bash
uv run pytest
```
Technology Stack
- LangGraph (1.0.6+) - Agent framework and multi-agent orchestration
- LangChain (1.2.3+) - LLM abstractions and tool system
- FastAPI (0.115.0+) - Gateway REST API
- langchain-mcp-adapters - Model Context Protocol support
- agent-sandbox - Sandboxed code execution
- markitdown - Multi-format document conversion
- tavily-python / firecrawl-py - Web search and scraping
Documentation
- Configuration Guide
- Architecture Details
- API Reference
- File Upload
- Path Examples
- Context Summarization
- Plan Mode
- Setup Guide
License
See the LICENSE file in the project root.
Contributing
See CONTRIBUTING.md for contribution guidelines.