Tao Liu daa3ffc29b
feat(loop-detection): make loop detection configurable with per-tool frequency overrides (#2711)
* Make loop detection configurable

Expose LoopDetectionMiddleware thresholds through config.yaml while preserving existing defaults and allowing the middleware to be disabled.

Refs bytedance/deer-flow#2517

* feat(loop-detection): add per-tool tool_freq_overrides to Phase 1

Adds ToolFreqOverride model and tool_freq_overrides field to
LoopDetectionConfig, wires it through LoopDetectionMiddleware, and
documents the option in config.example.yaml.

Resolves the gap flagged in the #2586 review: without per-tool overrides,
users hit by #2510/#2511 (RNA-seq workflows exceeding the bash hard limit)
had no way to raise thresholds for one tool without loosening the global
limit for every tool.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* Potential fix for pull request finding

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

* docs(loop-detection): document tool_freq_overrides in LoopDetectionMiddleware docstring

Add the missing Args entry for tool_freq_overrides, explaining the
(warn, hard_limit) tuple structure and how per-tool thresholds supersede
the global tool_freq_warn / tool_freq_hard_limit for named tools.
Also run ruff format on the three files flagged by the lint check.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(loop-detection): validate LoopDetectionMiddleware __init__ params eagerly

Raise clear ValueError at construction time instead of crashing at
unpack-time inside _track_and_check when bad values are passed:
- tool_freq_overrides: must be 2-tuples of positive ints with hard_limit >= warn
- scalar thresholds: warn_threshold, hard_limit, tool_freq_warn,
  tool_freq_hard_limit must be >= 1 and hard limits must >= their warn pairs
- window_size, max_tracked_threads must be >= 1

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): isolate credential loader directory-path test from real ~/.claude

The test didn't monkeypatch HOME, so on any machine with real Claude Code
credentials at ~/.claude/.credentials.json the function fell through to
those credentials and the assertion failed. Adding HOME redirect ensures
the default credential path doesn't exist during the test.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* style(test): add blank lines after import pytest in TestInitValidation

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(loop-detection): collapse dual validation to LoopDetectionConfig

Modifications
  - LoopDetectionMiddleware.__init__: stripped of all ValueError raises;
    becomes a plain field-assignment constructor.
  - LoopDetectionMiddleware.from_config: classmethod that builds the
    middleware from a Pydantic-validated LoopDetectionConfig and handles
    the ToolFreqOverride -> tuple[int, int] conversion.
  - agents/factory.py: SDK construction routed through
    LoopDetectionMiddleware.from_config(LoopDetectionConfig()) so the
    defaults path is Pydantic-validated too.
  - agents/lead_agent/agent.py: uses from_config instead of unpacking
    config fields by hand.
  - tests/test_loop_detection_middleware.py: deleted TestInitValidation
    (16 methods exercising the removed __init__ checks); added
    TestFromConfig (4 tests: scalar field mapping, override tuple
    conversion, empty overrides, behavioral smoke test).

Result: one validation layer (Pydantic), zero duplication, no __new__
hacks. Both production construction sites flow through LoopDetectionConfig.

Test results
  make test   -> 2977 passed, 18 skipped, 0 failed (137s)
  make format -> All checks passed; 411 files left unchanged

* feat(agents): make loop_detection configurable in create_deerflow_agent

Adds a `loop_detection: bool | AgentMiddleware = True` field to
RuntimeFeatures, mirroring the existing pattern used by `sandbox`,
`memory`, and `vision`. SDK users can now disable LoopDetectionMiddleware
or replace it with a custom instance built from their own
LoopDetectionConfig — e.g.
`LoopDetectionMiddleware.from_config(my_cfg)` — instead of being stuck
with the hardcoded defaults previously installed by the SDK factory.

The lead-agent path (which already reads AppConfig.loop_detection) is
unchanged, and the default `True` preserves prior always-on behavior for
all existing callers.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: knight0940 <631532668@qq.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: Amorend <142649913+knight0940@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
2026-05-07 16:15:15 +08:00
..
2026-01-14 09:57:52 +08:00

DeerFlow Backend

DeerFlow is a LangGraph-based AI super agent with sandbox execution, persistent memory, and extensible tool integration. The backend enables AI agents to execute code, browse the web, manage files, delegate tasks to subagents, and retain context across conversations - all in isolated, per-thread environments.


Architecture

                        ┌──────────────────────────────────────┐
                        │          Nginx (Port 2026)           │
                        │      Unified reverse proxy           │
                        └───────┬──────────────────┬───────────┘
                                │                  │
              /api/langgraph/*  │                  │  /api/* (other)
                                ▼                  ▼
               ┌────────────────────┐  ┌────────────────────────┐
               │ LangGraph Server   │  │   Gateway API (8001)   │
               │    (Port 2024)     │  │   FastAPI REST         │
               │                    │  │                        │
               │ ┌────────────────┐ │  │ Models, MCP, Skills,   │
               │ │  Lead Agent    │ │  │ Memory, Uploads,       │
               │ │  ┌──────────┐  │ │  │ Artifacts              │
               │ │  │Middleware│  │ │  └────────────────────────┘
               │ │  │  Chain   │  │ │
               │ │  └──────────┘  │ │
               │ │  ┌──────────┐  │ │
               │ │  │  Tools   │  │ │
               │ │  └──────────┘  │ │
               │ │  ┌──────────┐  │ │
               │ │  │Subagents │  │ │
               │ │  └──────────┘  │ │
               │ └────────────────┘ │
               └────────────────────┘

Request Routing (via Nginx):

  • /api/langgraph/* → LangGraph Server - agent interactions, threads, streaming
  • /api/* (other) → Gateway API - models, MCP, skills, memory, artifacts, uploads, thread-local cleanup
  • / (non-API) → Frontend - Next.js web interface

Core Components

Lead Agent

The single LangGraph agent (lead_agent) is the runtime entry point, created via make_lead_agent(config). It combines:

  • Dynamic model selection with thinking and vision support
  • Middleware chain for cross-cutting concerns (9 middlewares)
  • Tool system with sandbox, MCP, community, and built-in tools
  • Subagent delegation for parallel task execution
  • System prompt with skills injection, memory context, and working directory guidance

Middleware Chain

Middlewares execute in strict order, each handling a specific concern:

# Middleware Purpose
1 ThreadDataMiddleware Creates per-thread isolated directories (workspace, uploads, outputs)
2 UploadsMiddleware Injects newly uploaded files into conversation context
3 SandboxMiddleware Acquires sandbox environment for code execution
4 SummarizationMiddleware Reduces context when approaching token limits (optional)
5 TodoListMiddleware Tracks multi-step tasks in plan mode (optional)
6 TitleMiddleware Auto-generates conversation titles after first exchange
7 MemoryMiddleware Queues conversations for async memory extraction
8 ViewImageMiddleware Injects image data for vision-capable models (conditional)
9 ClarificationMiddleware Intercepts clarification requests and interrupts execution (must be last)

Sandbox System

Per-thread isolated execution with virtual path translation:

  • Abstract interface: execute_command, read_file, write_file, list_dir
  • Providers: LocalSandboxProvider (filesystem) and AioSandboxProvider (Docker, in community/)
  • Virtual paths: /mnt/user-data/{workspace,uploads,outputs} → thread-specific physical directories
  • Skills path: /mnt/skillsdeer-flow/skills/ directory
  • Skills loading: Recursively discovers nested SKILL.md files under skills/{public,custom} and preserves nested container paths
  • File-write safety: str_replace serializes read-modify-write per (sandbox.id, path) so isolated sandboxes keep concurrency even when virtual paths match
  • Tools: bash, ls, read_file, write_file, str_replace (bash is disabled by default when using LocalSandboxProvider; use AioSandboxProvider for isolated shell access)

Subagent System

Async task delegation with concurrent execution:

  • Built-in agents: general-purpose (full toolset) and bash (command specialist, exposed only when shell access is available)
  • Concurrency: Max 3 subagents per turn, 15-minute timeout
  • Execution: Background thread pools with status tracking and SSE events
  • Flow: Agent calls task() tool → executor runs subagent in background → polls for completion → returns result

Memory System

LLM-powered persistent context retention across conversations:

  • Automatic extraction: Analyzes conversations for user context, facts, and preferences
  • Structured storage: User context (work, personal, top-of-mind), history, and confidence-scored facts
  • Debounced updates: Batches updates to minimize LLM calls (configurable wait time)
  • System prompt injection: Top facts + context injected into agent prompts
  • Storage: JSON file with mtime-based cache invalidation

Tool Ecosystem

Category Tools
Sandbox bash, ls, read_file, write_file, str_replace
Built-in present_files, ask_clarification, view_image, task (subagent)
Community Tavily (web search), Jina AI (web fetch), Firecrawl (scraping), DuckDuckGo (image search)
MCP Any Model Context Protocol server (stdio, SSE, HTTP transports)
Skills Domain-specific workflows injected via system prompt

Gateway API

FastAPI application providing REST endpoints for frontend integration:

Route Purpose
GET /api/models List available LLM models
GET/PUT /api/mcp/config Manage MCP server configurations
GET/PUT /api/skills List and manage skills
POST /api/skills/install Install skill from .skill archive
GET /api/memory Retrieve memory data
POST /api/memory/reload Force memory reload
GET /api/memory/config Memory configuration
GET /api/memory/status Combined config + data
POST /api/threads/{id}/uploads Upload files (auto-converts PDF/PPT/Excel/Word to Markdown, rejects directory paths)
GET /api/threads/{id}/uploads/list List uploaded files
DELETE /api/threads/{id} Delete DeerFlow-managed local thread data after LangGraph thread deletion; unexpected failures are logged server-side and return a generic 500 detail
GET /api/threads/{id}/artifacts/{path} Serve generated artifacts

IM Channels

The IM bridge supports Feishu, Slack, and Telegram. Slack and Telegram still use the final runs.wait() response path, while Feishu now streams through runs.stream(["messages-tuple", "values"]) and updates a single in-thread card in place.

For Feishu card updates, DeerFlow stores the running card's message_id per inbound message and patches that same card until the run finishes, preserving the existing OK / DONE reaction flow.


Quick Start

Prerequisites

  • Python 3.12+
  • uv package manager
  • API keys for your chosen LLM provider

Installation

cd deer-flow

# Copy configuration files
cp config.example.yaml config.yaml

# Install backend dependencies
cd backend
make install

Configuration

Edit config.yaml in the project root:

models:
  - name: gpt-4o
    display_name: GPT-4o
    use: langchain_openai:ChatOpenAI
    model: gpt-4o
    api_key: $OPENAI_API_KEY
    supports_thinking: false
    supports_vision: true

  - name: gpt-5-responses
    display_name: GPT-5 (Responses API)
    use: langchain_openai:ChatOpenAI
    model: gpt-5
    api_key: $OPENAI_API_KEY
    use_responses_api: true
    output_version: responses/v1
    supports_vision: true

Set your API keys:

export OPENAI_API_KEY="your-api-key-here"

Running

Full Application (from project root):

make dev  # Starts LangGraph + Gateway + Frontend + Nginx

Access at: http://localhost:2026

Backend Only (from backend directory):

# Terminal 1: LangGraph server
make dev

# Terminal 2: Gateway API
make gateway

Direct access: LangGraph at http://localhost:2024, Gateway at http://localhost:8001


Project Structure

backend/
├── src/
│   ├── agents/                  # Agent system
│   │   ├── lead_agent/         # Main agent (factory, prompts)
│   │   ├── middlewares/        # 9 middleware components
│   │   ├── memory/             # Memory extraction & storage
│   │   └── thread_state.py    # ThreadState schema
│   ├── gateway/                # FastAPI Gateway API
│   │   ├── app.py             # Application setup
│   │   └── routers/           # 6 route modules
│   ├── sandbox/                # Sandbox execution
│   │   ├── local/             # Local filesystem provider
│   │   ├── sandbox.py         # Abstract interface
│   │   ├── tools.py           # bash, ls, read/write/str_replace
│   │   └── middleware.py      # Sandbox lifecycle
│   ├── subagents/              # Subagent delegation
│   │   ├── builtins/          # general-purpose, bash agents
│   │   ├── executor.py        # Background execution engine
│   │   └── registry.py        # Agent registry
│   ├── tools/builtins/         # Built-in tools
│   ├── mcp/                    # MCP protocol integration
│   ├── models/                 # Model factory
│   ├── skills/                 # Skill discovery & loading
│   ├── config/                 # Configuration system
│   ├── community/              # Community tools & providers
│   ├── reflection/             # Dynamic module loading
│   └── utils/                  # Utilities
├── docs/                       # Documentation
├── tests/                      # Test suite
├── langgraph.json              # LangGraph server configuration
├── pyproject.toml              # Python dependencies
├── Makefile                    # Development commands
└── Dockerfile                  # Container build

Configuration

Main Configuration (config.yaml)

Place in project root. Config values starting with $ resolve as environment variables.

Key sections:

  • models - LLM configurations with class paths, API keys, thinking/vision flags
  • tools - Tool definitions with module paths and groups
  • tool_groups - Logical tool groupings
  • sandbox - Execution environment provider
  • skills - Skills directory paths
  • title - Auto-title generation settings
  • summarization - Context summarization settings
  • subagents - Subagent system (enabled/disabled)
  • memory - Memory system settings (enabled, storage, debounce, facts limits)

Provider note:

  • models[*].use references provider classes by module path (for example langchain_openai:ChatOpenAI).
  • If a provider module is missing, DeerFlow now returns an actionable error with install guidance (for example uv add langchain-google-genai).

Extensions Configuration (extensions_config.json)

MCP servers and skill states in a single file:

{
  "mcpServers": {
    "github": {
      "enabled": true,
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": {"GITHUB_TOKEN": "$GITHUB_TOKEN"}
    },
    "secure-http": {
      "enabled": true,
      "type": "http",
      "url": "https://api.example.com/mcp",
      "oauth": {
        "enabled": true,
        "token_url": "https://auth.example.com/oauth/token",
        "grant_type": "client_credentials",
        "client_id": "$MCP_OAUTH_CLIENT_ID",
        "client_secret": "$MCP_OAUTH_CLIENT_SECRET"
      }
    }
  },
  "skills": {
    "pdf-processing": {"enabled": true}
  }
}

Environment Variables

  • DEER_FLOW_CONFIG_PATH - Override config.yaml location
  • DEER_FLOW_EXTENSIONS_CONFIG_PATH - Override extensions_config.json location
  • Model API keys: OPENAI_API_KEY, ANTHROPIC_API_KEY, DEEPSEEK_API_KEY, etc.
  • Tool API keys: TAVILY_API_KEY, GITHUB_TOKEN, etc.

LangSmith Tracing

DeerFlow has built-in LangSmith integration for observability. When enabled, all LLM calls, agent runs, tool executions, and middleware processing are traced and visible in the LangSmith dashboard.

Setup:

  1. Sign up at smith.langchain.com and create a project.
  2. Add the following to your .env file in the project root:
LANGSMITH_TRACING=true
LANGSMITH_ENDPOINT=https://api.smith.langchain.com
LANGSMITH_API_KEY=lsv2_pt_xxxxxxxxxxxxxxxx
LANGSMITH_PROJECT=xxx

Legacy variables: The LANGCHAIN_TRACING_V2, LANGCHAIN_API_KEY, LANGCHAIN_PROJECT, and LANGCHAIN_ENDPOINT variables are also supported for backward compatibility. LANGSMITH_* variables take precedence when both are set.

Langfuse Tracing

DeerFlow also supports Langfuse observability for LangChain-compatible runs.

Add the following to your .env file:

LANGFUSE_TRACING=true
LANGFUSE_PUBLIC_KEY=pk-lf-xxxxxxxxxxxxxxxx
LANGFUSE_SECRET_KEY=sk-lf-xxxxxxxxxxxxxxxx
LANGFUSE_BASE_URL=https://cloud.langfuse.com

If you are using a self-hosted Langfuse deployment, set LANGFUSE_BASE_URL to your Langfuse host.

Dual Provider Behavior

If both LangSmith and Langfuse are enabled, DeerFlow initializes and attaches both callbacks so the same run data is reported to both systems.

If a provider is explicitly enabled but required credentials are missing, or the provider callback cannot be initialized, DeerFlow raises an error when tracing is initialized during model creation instead of silently disabling tracing.

Docker: In docker-compose.yaml, tracing is disabled by default (LANGSMITH_TRACING=false). Set LANGSMITH_TRACING=true and/or LANGFUSE_TRACING=true in your .env, together with the required credentials, to enable tracing in containerized deployments.


Development

Commands

make install    # Install dependencies
make dev        # Run LangGraph server (port 2024)
make gateway    # Run Gateway API (port 8001)
make lint       # Run linter (ruff)
make format     # Format code (ruff)

Code Style

  • Linter/Formatter: ruff
  • Line length: 240 characters
  • Python: 3.12+ with type hints
  • Quotes: Double quotes
  • Indentation: 4 spaces

Testing

uv run pytest

Technology Stack

  • LangGraph (1.0.6+) - Agent framework and multi-agent orchestration
  • LangChain (1.2.3+) - LLM abstractions and tool system
  • FastAPI (0.115.0+) - Gateway REST API
  • langchain-mcp-adapters - Model Context Protocol support
  • agent-sandbox - Sandboxed code execution
  • markitdown - Multi-format document conversion
  • tavily-python / firecrawl-py - Web search and scraping

Documentation


License

See the LICENSE file in the project root.

Contributing

See CONTRIBUTING.md for contribution guidelines.