mirror of https://github.com/bytedance/deer-flow.git synced 2026-04-29 13:28:11 +00:00

rayhpeng 229c8095be fix(threads): load history messages from event store, immune to summarize

``get_thread_history`` and ``get_thread_state`` in Gateway mode read
messages from ``checkpoint.channel_values["messages"]``. After
SummarizationMiddleware runs mid-run, that list is rewritten in-place:
pre-summarize messages are dropped and a synthetic summary-as-human
message takes position 0. The frontend then renders a chat history that
starts with ``"Here is a summary of the conversation to date:..."``
instead of the user's original query, and all earlier turns are gone.

The event store (``RunEventStore``) is append-only and never rewritten,
so it retains the full transcript. This commit adds a helper
``_get_event_store_messages`` that loads the event store's message
stream and overrides ``values["messages"]`` in both endpoints; the
checkpoint fallback kicks in only when the event store is unavailable.

Behavior contract of the helper:

- **Full pagination.** ``list_messages`` returns the newest ``limit``
  records when no cursor is given, so a fixed limit silently drops
  older messages on long threads. The helper sizes the read from
  ``count_messages()`` and pages forward with ``after_seq`` cursors.
- **Copy-on-read.** Each content dict is copied before ``id`` is
  patched so the live store object (``MemoryRunEventStore`` returns
  references) is never mutated.
- **Stable ids.** Messages with ``id=None`` (human + tool_result,
  which don't receive an id until checkpoint persistence) get a
  deterministic ``uuid5(NAMESPACE_URL, f"{thread_id}:{seq}")`` so
  React keys stay stable across requests. AI messages keep their
  LLM-assigned ``lc_run--*`` ids.
- **Legacy ``Command`` repr sanitization.** Rows captured before the
  ``journal.py`` ``on_tool_end`` fix (previous commit) stored
  ``str(Command(update={'messages': [ToolMessage(content='X', ...)]}))``
  as the tool_result content. ``_sanitize_legacy_command_repr``
  regex-extracts the inner text so old threads render cleanly.
- **Inline feedback.** When loading the stream, the helper also pulls
  ``feedback_repo.list_by_thread_grouped`` and attaches ``run_id`` to
  every message plus ``feedback`` to the final ``ai_message`` of each
  run. This removes the frontend's need to fetch a second endpoint
  and positional-index-map its way back to the right run. When the
  feedback subsystem is unavailable, the ``feedback`` field is left
  absent entirely so the frontend hides the button rather than
  rendering it over a broken write path.
- **User context.** ``DbRunEventStore`` is user-scoped by default via
  ``resolve_user_id(AUTO)``. The helper relies on the ``@require_permission``
  decorator having populated the user contextvar on both callers; the
  docstring documents this dependency explicitly so nobody wires it
  into a CLI or migration script without passing ``user_id=None``.

Real data verification against thread
``6d30913e-dcd4-41c8-8941-f66c716cf359``: checkpoint showed 12 messages
(summarize-corrupted), event store had 16. The original human message
``"最新伊美局势"`` (roughly "latest Iran-US situation") was preserved as
seq=1 in the event store and correctly restored to position 0 in the
helper output. Helper output for AI messages was byte-identical to
checkpoint for every overlapping message; only tool_result ids differed
(patched to uuid5) and the legacy Command repr at seq=48 was sanitized.
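
For illustration, the legacy ``Command`` repr sanitization could look
something like the sketch below. The function name
``sanitize_command_repr`` and the regex are hypothetical and are not the
code in this commit; in particular, returning the input unchanged when
nothing can be extracted is an assumption about what "unparseable
fallback" means.

```python
import re

# Matches ToolMessage(content='...') or ToolMessage(content="...") and captures
# the text between the quotes, stopping at the first unescaped closing quote.
_TOOL_CONTENT_RE = re.compile(
    r"""ToolMessage\(content=(?P<quote>['"])(?P<text>.*?)(?<!\\)(?P=quote)""",
    re.DOTALL,
)


def sanitize_command_repr(content: str) -> str:
    """Extract the inner tool text from a stringified Command, if present."""
    if not content.startswith("Command("):
        return content  # ordinary tool output passes through untouched
    match = _TOOL_CONTENT_RE.search(content)
    if match is None:
        return content  # assumption: keep the original text when unparseable
    return match.group("text")


single = sanitize_command_repr("Command(update={'messages': [ToolMessage(content='X', name='t')]})")
double = sanitize_command_repr("Command(update={'messages': [ToolMessage(content=\"it's done\", name='t')]})")
plain = sanitize_command_repr("plain tool output")
broken = sanitize_command_repr("Command(goto='next')")
```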

Tests:
- ``test_thread_state_event_store.py`` — 18 tests covering
  ``_sanitize_legacy_command_repr`` (passthrough, single/double-quote
  extraction, unparseable fallback), helper happy path (all message
  types, stable uuid5, store non-mutation), multi-page pagination,
  summarize regression (recovers pre-summarize messages), feedback
  attachment (per-run, multi-run threads, repo failure graceful),
  and dependency failure fallback to ``None``.

Docs:
- ``docs/superpowers/plans/2026-04-10-event-store-history.md`` — the
  implementation plan this commit realizes, with Task 1 revised after
  the evaluation findings (pagination, copy-on-read, Command wrap
  already landed in journal.py, frontend feedback pagination in the
  follow-up commit, Standard-mode follow-up noted).
- ``docs/superpowers/specs/2026-04-11-runjournal-history-evaluation.md``
  — the Claude + second-opinion evaluation document that drove the
  plan revisions (pagination bug, dict-mutation bug, feedback hidden
  bug, Command bug).
- ``docs/superpowers/specs/2026-04-11-summarize-marker-design.md`` —
  design for a follow-up PR that visually marks summarize events in
  history, based on a verified ``adispatch_custom_event`` experiment
  (``trace=False`` middleware nodes can still forward the Pregel task
  config via explicit signature injection).

Scope: Gateway mode only (``make dev-pro``). Standard mode
(``make dev``) hits LangGraph Server directly and bypasses these
endpoints; the summarize symptom is still present there and is
tracked as a separate follow-up in the plan.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-04-11 23:38:53 +08:00

Name	Last commit	Date
.vscode	chore: specify project title	2026-01-14 09:57:52 +08:00
app	fix(threads): load history messages from event store, immune to summarize	2026-04-11 23:38:53 +08:00
docs	refactor(persistence): unify SQLite to single deerflow.db and move checkpointer to runtime	2026-04-11 23:37:25 +08:00
packages/harness	fix(journal): unwrap Command tool results in on_tool_end	2026-04-11 23:38:53 +08:00
tests	fix(threads): load history messages from event store, immune to summarize	2026-04-11 23:38:53 +08:00
.gitignore	feat: add DeerFlowClient for embedded programmatic access (#926)	2026-02-28 14:38:15 +08:00
.python-version	chore: add Python and LangGraph stuff	2026-01-14 07:15:02 +08:00
AGENTS.md	docs: fix typo and grammar issues in docs (#1315)	2026-03-25 10:01:36 +08:00
CLAUDE.md	fix(backend): stream DeerFlowClient AI text as token deltas (#1969) (#1974)	2026-04-10 18:16:38 +08:00
CONTRIBUTING.md	refactor: split backend into harness (deerflow.*) and app (app.*) (#1131)	2026-03-14 22:55:52 +08:00
debug.py	refactor: split backend into harness (deerflow.*) and app (app.*) (#1131)	2026-03-14 22:55:52 +08:00
Dockerfile	feat(persistence): add unified persistence layer with event store, token tracking, and feedback (#1930)	2026-04-11 11:23:39 +08:00
langgraph.json	feat(auth): release-validation pass for 2.0-rc — 12 blockers + simplify follow-ups (#2008)	2026-04-11 11:25:38 +08:00
Makefile	fix: unblock concurrent threads and workspace hydration (#1839)	2026-04-04 21:19:35 +08:00
pyproject.toml	feat: replace auto-admin creation with secure interactive first-boot setup (#2063)	2026-04-11 16:39:12 +08:00
README.md	fix: prevent concurrent subagent file write conflicts in sandbox tools (#1714)	2026-04-02 15:39:41 +08:00
ruff.toml	refactor: split backend into harness (deerflow.*) and app (app.*) (#1131)	2026-03-14 22:55:52 +08:00
uv.lock	feat(dependencies): add langchain-ollama and ollama packages with optional dependencies	2026-04-11 16:49:32 +08:00
uv.toml	feat(auth): release-validation pass for 2.0-rc — 12 blockers + simplify follow-ups (#2008)	2026-04-11 11:25:38 +08:00

README.md

DeerFlow Backend

DeerFlow is a LangGraph-based AI super agent with sandbox execution, persistent memory, and extensible tool integration. The backend enables AI agents to execute code, browse the web, manage files, delegate tasks to subagents, and retain context across conversations - all in isolated, per-thread environments.

Architecture

                        ┌──────────────────────────────────────┐
                        │          Nginx (Port 2026)           │
                        │      Unified reverse proxy           │
                        └───────┬──────────────────┬───────────┘
                                │                  │
              /api/langgraph/*  │                  │  /api/* (other)
                                ▼                  ▼
               ┌────────────────────┐  ┌────────────────────────┐
               │ LangGraph Server   │  │   Gateway API (8001)   │
               │    (Port 2024)     │  │   FastAPI REST         │
               │                    │  │                        │
               │ ┌────────────────┐ │  │ Models, MCP, Skills,   │
               │ │  Lead Agent    │ │  │ Memory, Uploads,       │
               │ │  ┌──────────┐  │ │  │ Artifacts              │
               │ │  │Middleware│  │ │  └────────────────────────┘
               │ │  │  Chain   │  │ │
               │ │  └──────────┘  │ │
               │ │  ┌──────────┐  │ │
               │ │  │  Tools   │  │ │
               │ │  └──────────┘  │ │
               │ │  ┌──────────┐  │ │
               │ │  │Subagents │  │ │
               │ │  └──────────┘  │ │
               │ └────────────────┘ │
               └────────────────────┘

Request Routing (via Nginx):

/api/langgraph/* → LangGraph Server - agent interactions, threads, streaming
/api/* (other) → Gateway API - models, MCP, skills, memory, artifacts, uploads, thread-local cleanup
/ (non-API) → Frontend - Next.js web interface

Core Components

Lead Agent

The single LangGraph agent (lead_agent) is the runtime entry point, created via make_lead_agent(config). It combines:

Dynamic model selection with thinking and vision support
Middleware chain for cross-cutting concerns (9 middlewares)
Tool system with sandbox, MCP, community, and built-in tools
Subagent delegation for parallel task execution
System prompt with skills injection, memory context, and working directory guidance

Middleware Chain

Middlewares execute in strict order, each handling a specific concern:

#	Middleware	Purpose
1	ThreadDataMiddleware	Creates per-thread isolated directories (workspace, uploads, outputs)
2	UploadsMiddleware	Injects newly uploaded files into conversation context
3	SandboxMiddleware	Acquires sandbox environment for code execution
4	SummarizationMiddleware	Reduces context when approaching token limits (optional)
5	TodoListMiddleware	Tracks multi-step tasks in plan mode (optional)
6	TitleMiddleware	Auto-generates conversation titles after first exchange
7	MemoryMiddleware	Queues conversations for async memory extraction
8	ViewImageMiddleware	Injects image data for vision-capable models (conditional)
9	ClarificationMiddleware	Intercepts clarification requests and interrupts execution (must be last)
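
As a rough illustration of the ordering in the table above, the sketch below assembles the chain from feature flags. It only models names and order; `build_middleware_chain` and its `summarization`, `plan_mode`, and `vision` flags are hypothetical and are not how `make_lead_agent` actually wires middlewares.

```python
def build_middleware_chain(
    *,
    summarization: bool = False,
    plan_mode: bool = False,
    vision: bool = False,
) -> list[str]:
    """Return middleware names in execution order (names taken from the table)."""
    chain = ["ThreadDataMiddleware", "UploadsMiddleware", "SandboxMiddleware"]
    if summarization:  # optional
        chain.append("SummarizationMiddleware")
    if plan_mode:  # optional
        chain.append("TodoListMiddleware")
    chain += ["TitleMiddleware", "MemoryMiddleware"]
    if vision:  # conditional on a vision-capable model
        chain.append("ViewImageMiddleware")
    chain.append("ClarificationMiddleware")  # must always be last
    return chain


full_chain = build_middleware_chain(summarization=True, plan_mode=True, vision=True)
minimal_chain = build_middleware_chain()
```

Appending `ClarificationMiddleware` unconditionally at the end is one simple way to keep the "must be last" rule from depending on which optional middlewares are enabled.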

Sandbox System

Per-thread isolated execution with virtual path translation:

Abstract interface: execute_command, read_file, write_file, list_dir
Providers: LocalSandboxProvider (filesystem) and AioSandboxProvider (Docker, in community/)
Virtual paths: /mnt/user-data/{workspace,uploads,outputs} → thread-specific physical directories
Skills path: /mnt/skills → deer-flow/skills/ directory
Skills loading: Recursively discovers nested SKILL.md files under skills/{public,custom} and preserves nested container paths
File-write safety: str_replace serializes read-modify-write per (sandbox.id, path) so isolated sandboxes keep concurrency even when virtual paths match
Tools: bash, ls, read_file, write_file, str_replace (bash is disabled by default when using LocalSandboxProvider; use AioSandboxProvider for isolated shell access)
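
The virtual path translation can be sketched as below. This is an illustration under assumptions: `to_physical`, the `thread_root` layout, and the rejection rules are hypothetical and are not DeerFlow's implementation. Only the `/mnt/user-data/{workspace,uploads,outputs}` prefix comes from the list above.

```python
from pathlib import Path, PurePosixPath

VIRTUAL_ROOT = PurePosixPath("/mnt/user-data")
ALLOWED_AREAS = ("workspace", "uploads", "outputs")


def to_physical(virtual_path: str, thread_root: Path) -> Path:
    """Map a sandbox-visible path to a directory owned by one thread."""
    path = PurePosixPath(virtual_path)
    try:
        relative = path.relative_to(VIRTUAL_ROOT)
    except ValueError:
        raise ValueError(f"{virtual_path!r} is not under {VIRTUAL_ROOT}") from None
    if not relative.parts or relative.parts[0] not in ALLOWED_AREAS:
        raise ValueError(f"{virtual_path!r} is outside workspace/uploads/outputs")
    if ".." in relative.parts:
        # Refuse traversal instead of trying to normalize it away.
        raise ValueError(f"{virtual_path!r} contains '..'")
    return thread_root.joinpath(*relative.parts)


# Hypothetical per-thread directory; the real layout may differ.
thread_dir = Path("/data/threads/thread-1")
physical = to_physical("/mnt/user-data/workspace/report/draft.md", thread_dir)
```

Two threads that write to the same virtual path end up in different physical directories because only `thread_root` differs between them.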

Subagent System

Async task delegation with concurrent execution:

Built-in agents: general-purpose (full toolset) and bash (command specialist, exposed only when shell access is available)
Concurrency: Max 3 subagents per turn, 15-minute timeout
Execution: Background thread pools with status tracking and SSE events
Flow: Agent calls task() tool → executor runs subagent in background → polls for completion → returns result
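
A minimal sketch of bounded concurrent delegation, assuming a thread pool and plain callables in place of real subagents. `run_subagents` and the status strings are hypothetical, and rejecting more than three tasks with an error is an assumption about how the per-turn limit might be enforced.

```python
from collections.abc import Callable
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeoutError
from typing import Any

MAX_SUBAGENTS_PER_TURN = 3
SUBAGENT_TIMEOUT_SECONDS = 15 * 60


def run_subagents(
    tasks: dict[str, Callable[[], Any]],
    timeout: float = SUBAGENT_TIMEOUT_SECONDS,
) -> dict[str, tuple[str, Any]]:
    """Run tasks in background threads and collect (status, result) per task."""
    if len(tasks) > MAX_SUBAGENTS_PER_TURN:
        raise ValueError(f"at most {MAX_SUBAGENTS_PER_TURN} subagents per turn")
    results: dict[str, tuple[str, Any]] = {}
    # Note: leaving the `with` block waits for running threads to finish;
    # a Python thread cannot be forcibly stopped after a timeout.
    with ThreadPoolExecutor(max_workers=MAX_SUBAGENTS_PER_TURN) as pool:
        futures = {name: pool.submit(task) for name, task in tasks.items()}
        for name, future in futures.items():
            try:
                results[name] = ("completed", future.result(timeout=timeout))
            except FutureTimeoutError:
                results[name] = ("timed_out", None)
            except Exception as exc:  # the task itself raised
                results[name] = ("failed", str(exc))
    return results


outcome = run_subagents({
    "research": lambda: "3 sources found",
    "broken": lambda: 1 / 0,
})
```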

Memory System

LLM-powered persistent context retention across conversations:

Automatic extraction: Analyzes conversations for user context, facts, and preferences
Structured storage: User context (work, personal, top-of-mind), history, and confidence-scored facts
Debounced updates: Batches updates to minimize LLM calls (configurable wait time)
System prompt injection: Top facts + context injected into agent prompts
Storage: JSON file with mtime-based cache invalidation
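
The mtime-based cache invalidation mentioned above can be illustrated with a small sketch. `MtimeCachedJson` is a hypothetical class, not DeerFlow's memory storage; it shows only the idea of re-reading the file when its modification time changes.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Any


class MtimeCachedJson:
    """Cache a parsed JSON file and reload it only when its mtime changes."""

    def __init__(self, path: Path) -> None:
        self._path = path
        self._mtime_ns: int | None = None
        self._data: Any = None
        self.load_count = 0  # exposed only to make the demo observable

    def get(self) -> Any:
        mtime_ns = self._path.stat().st_mtime_ns
        if mtime_ns != self._mtime_ns:
            self._data = json.loads(self._path.read_text(encoding="utf-8"))
            self._mtime_ns = mtime_ns
            self.load_count += 1
        return self._data


with tempfile.TemporaryDirectory() as tmp:
    memory_file = Path(tmp) / "memory.json"
    memory_file.write_text(json.dumps({"facts": ["likes tea"]}), encoding="utf-8")
    cache = MtimeCachedJson(memory_file)
    cache.get()
    cache.get()  # served from cache, no second read
    loads_before = cache.load_count

    memory_file.write_text(json.dumps({"facts": ["likes tea", "uses uv"]}), encoding="utf-8")
    stat = memory_file.stat()
    # Bump mtime explicitly so the demo does not depend on clock resolution.
    os.utime(memory_file, ns=(stat.st_atime_ns, stat.st_mtime_ns + 1_000_000_000))
    latest = cache.get()
    loads_after = cache.load_count
```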

Tool Ecosystem

Category	Tools
Sandbox	`bash`, `ls`, `read_file`, `write_file`, `str_replace`
Built-in	`present_files`, `ask_clarification`, `view_image`, `task` (subagent)
Community	Tavily (web search), Jina AI (web fetch), Firecrawl (scraping), DuckDuckGo (image search)
MCP	Any Model Context Protocol server (stdio, SSE, HTTP transports)
Skills	Domain-specific workflows injected via system prompt

Gateway API

FastAPI application providing REST endpoints for frontend integration:

Route	Purpose
`GET /api/models`	List available LLM models
`GET/PUT /api/mcp/config`	Manage MCP server configurations
`GET/PUT /api/skills`	List and manage skills
`POST /api/skills/install`	Install skill from `.skill` archive
`GET /api/memory`	Retrieve memory data
`POST /api/memory/reload`	Force memory reload
`GET /api/memory/config`	Memory configuration
`GET /api/memory/status`	Combined config + data
`POST /api/threads/{id}/uploads`	Upload files (auto-converts PDF/PPT/Excel/Word to Markdown, rejects directory paths)
`GET /api/threads/{id}/uploads/list`	List uploaded files
`DELETE /api/threads/{id}`	Delete DeerFlow-managed local thread data after LangGraph thread deletion; unexpected failures are logged server-side and return a generic 500 detail
`GET /api/threads/{id}/artifacts/{path}`	Serve generated artifacts

IM Channels

The IM bridge supports Feishu, Slack, and Telegram. Slack and Telegram still use the final runs.wait() response path, while Feishu now streams through runs.stream(["messages-tuple", "values"]) and updates a single in-thread card in place.

For Feishu card updates, DeerFlow stores the running card's message_id per inbound message and patches that same card until the run finishes, preserving the existing OK / DONE reaction flow.

Quick Start

Prerequisites

Python 3.12+
uv package manager
API keys for your chosen LLM provider

Installation

cd deer-flow

# Copy configuration files
cp config.example.yaml config.yaml

# Install backend dependencies
cd backend
make install

Configuration

Edit config.yaml in the project root:

models:
  - name: gpt-4o
    display_name: GPT-4o
    use: langchain_openai:ChatOpenAI
    model: gpt-4o
    api_key: $OPENAI_API_KEY
    supports_thinking: false
    supports_vision: true

  - name: gpt-5-responses
    display_name: GPT-5 (Responses API)
    use: langchain_openai:ChatOpenAI
    model: gpt-5
    api_key: $OPENAI_API_KEY
    use_responses_api: true
    output_version: responses/v1
    supports_vision: true

Set your API keys:

export OPENAI_API_KEY="your-api-key-here"

Running

Full Application (from project root):

make dev  # Starts LangGraph + Gateway + Frontend + Nginx

Access at: http://localhost:2026

Backend Only (from backend directory):

# Terminal 1: LangGraph server
make dev

# Terminal 2: Gateway API
make gateway

Direct access: LangGraph at http://localhost:2024, Gateway at http://localhost:8001

Project Structure

backend/
├── src/
│   ├── agents/                  # Agent system
│   │   ├── lead_agent/         # Main agent (factory, prompts)
│   │   ├── middlewares/        # 9 middleware components
│   │   ├── memory/             # Memory extraction & storage
│   │   └── thread_state.py    # ThreadState schema
│   ├── gateway/                # FastAPI Gateway API
│   │   ├── app.py             # Application setup
│   │   └── routers/           # 6 route modules
│   ├── sandbox/                # Sandbox execution
│   │   ├── local/             # Local filesystem provider
│   │   ├── sandbox.py         # Abstract interface
│   │   ├── tools.py           # bash, ls, read/write/str_replace
│   │   └── middleware.py      # Sandbox lifecycle
│   ├── subagents/              # Subagent delegation
│   │   ├── builtins/          # general-purpose, bash agents
│   │   ├── executor.py        # Background execution engine
│   │   └── registry.py        # Agent registry
│   ├── tools/builtins/         # Built-in tools
│   ├── mcp/                    # MCP protocol integration
│   ├── models/                 # Model factory
│   ├── skills/                 # Skill discovery & loading
│   ├── config/                 # Configuration system
│   ├── community/              # Community tools & providers
│   ├── reflection/             # Dynamic module loading
│   └── utils/                  # Utilities
├── docs/                       # Documentation
├── tests/                      # Test suite
├── langgraph.json              # LangGraph server configuration
├── pyproject.toml              # Python dependencies
├── Makefile                    # Development commands
└── Dockerfile                  # Container build

Configuration

Main Configuration (`config.yaml`)

Place in project root. Config values starting with $ resolve as environment variables.

Key sections:

models - LLM configurations with class paths, API keys, thinking/vision flags
tools - Tool definitions with module paths and groups
tool_groups - Logical tool groupings
sandbox - Execution environment provider
skills - Skills directory paths
title - Auto-title generation settings
summarization - Context summarization settings
subagents - Subagent system (enabled/disabled)
memory - Memory system settings (enabled, storage, debounce, facts limits)

Provider note:

models[*].use references provider classes by module path (for example langchain_openai:ChatOpenAI).
If a provider module is missing, DeerFlow now returns an actionable error with install guidance (for example uv add langchain-google-genai).
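
A `module:Class` reference of this kind can be resolved with `importlib`, as in the sketch below. `resolve_class` is a hypothetical helper, not DeerFlow's reflection module, and deriving the package name by replacing underscores with hyphens is a heuristic assumed for the example.

```python
import importlib
from typing import Any


def resolve_class(path: str) -> Any:
    """Resolve "package.module:ClassName" to the class object."""
    module_name, separator, attribute = path.partition(":")
    if not separator or not module_name or not attribute:
        raise ValueError(f"expected 'module:Class', got {path!r}")
    try:
        module = importlib.import_module(module_name)
    except ModuleNotFoundError as exc:
        # Heuristic guess at the distribution name, e.g.
        # langchain_google_genai -> langchain-google-genai.
        package = module_name.split(".")[0].replace("_", "-")
        raise ImportError(
            f"Cannot import {module_name!r} for {path!r}. Try: uv add {package}"
        ) from exc
    return getattr(module, attribute)


ordered_dict_class = resolve_class("collections:OrderedDict")

try:
    resolve_class("langchain_not_installed_example:ChatModel")
    missing_message = ""
except ImportError as exc:
    missing_message = str(exc)
```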

Extensions Configuration (`extensions_config.json`)

MCP servers and skill states in a single file:

{
  "mcpServers": {
    "github": {
      "enabled": true,
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": {"GITHUB_TOKEN": "$GITHUB_TOKEN"}
    },
    "secure-http": {
      "enabled": true,
      "type": "http",
      "url": "https://api.example.com/mcp",
      "oauth": {
        "enabled": true,
        "token_url": "https://auth.example.com/oauth/token",
        "grant_type": "client_credentials",
        "client_id": "$MCP_OAUTH_CLIENT_ID",
        "client_secret": "$MCP_OAUTH_CLIENT_SECRET"
      }
    }
  },
  "skills": {
    "pdf-processing": {"enabled": true}
  }
}

Environment Variables

DEER_FLOW_CONFIG_PATH - Override config.yaml location
DEER_FLOW_EXTENSIONS_CONFIG_PATH - Override extensions_config.json location
Model API keys: OPENAI_API_KEY, ANTHROPIC_API_KEY, DEEPSEEK_API_KEY, etc.
Tool API keys: TAVILY_API_KEY, GITHUB_TOKEN, etc.

LangSmith Tracing

DeerFlow has built-in LangSmith integration for observability. When enabled, all LLM calls, agent runs, tool executions, and middleware processing are traced and visible in the LangSmith dashboard.

Setup:

Sign up at smith.langchain.com and create a project.
Add the following to your .env file in the project root:

LANGSMITH_TRACING=true
LANGSMITH_ENDPOINT=https://api.smith.langchain.com
LANGSMITH_API_KEY=lsv2_pt_xxxxxxxxxxxxxxxx
LANGSMITH_PROJECT=xxx

Legacy variables: The LANGCHAIN_TRACING_V2, LANGCHAIN_API_KEY, LANGCHAIN_PROJECT, and LANGCHAIN_ENDPOINT variables are also supported for backward compatibility. LANGSMITH_* variables take precedence when both are set.
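
The precedence rule can be sketched as a lookup that prefers the `LANGSMITH_*` name and falls back to its legacy counterpart. `resolve_langsmith_env` is a hypothetical function; treating an empty string as unset is an assumption.

```python
from collections.abc import Mapping

# New variable name -> legacy name it supersedes.
LEGACY_NAMES = {
    "LANGSMITH_TRACING": "LANGCHAIN_TRACING_V2",
    "LANGSMITH_API_KEY": "LANGCHAIN_API_KEY",
    "LANGSMITH_PROJECT": "LANGCHAIN_PROJECT",
    "LANGSMITH_ENDPOINT": "LANGCHAIN_ENDPOINT",
}


def resolve_langsmith_env(env: Mapping[str, str]) -> dict[str, str]:
    """Return effective settings keyed by the LANGSMITH_* names."""
    resolved: dict[str, str] = {}
    for primary, legacy in LEGACY_NAMES.items():
        value = env.get(primary) or env.get(legacy)  # primary wins when set
        if value:
            resolved[primary] = value
    return resolved


settings = resolve_langsmith_env({
    "LANGSMITH_API_KEY": "new-key",
    "LANGCHAIN_API_KEY": "old-key",
    "LANGCHAIN_TRACING_V2": "true",
    "LANGCHAIN_PROJECT": "legacy-project",
})
```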

Langfuse Tracing

DeerFlow also supports Langfuse observability for LangChain-compatible runs.

Add the following to your .env file:

LANGFUSE_TRACING=true
LANGFUSE_PUBLIC_KEY=pk-lf-xxxxxxxxxxxxxxxx
LANGFUSE_SECRET_KEY=sk-lf-xxxxxxxxxxxxxxxx
LANGFUSE_BASE_URL=https://cloud.langfuse.com

If you are using a self-hosted Langfuse deployment, set LANGFUSE_BASE_URL to your Langfuse host.

Dual Provider Behavior

If both LangSmith and Langfuse are enabled, DeerFlow initializes and attaches both callbacks so the same run data is reported to both systems.

If a provider is explicitly enabled but required credentials are missing, or the provider callback cannot be initialized, DeerFlow raises an error when tracing is initialized during model creation instead of silently disabling tracing.

Docker: In docker-compose.yaml, tracing is disabled by default (LANGSMITH_TRACING=false). Set LANGSMITH_TRACING=true and/or LANGFUSE_TRACING=true in your .env, together with the required credentials, to enable tracing in containerized deployments.

Development

Commands

make install    # Install dependencies
make dev        # Run LangGraph server (port 2024)
make gateway    # Run Gateway API (port 8001)
make lint       # Run linter (ruff)
make format     # Format code (ruff)

Code Style

Linter/Formatter: ruff
Line length: 240 characters
Python: 3.12+ with type hints
Quotes: Double quotes
Indentation: 4 spaces

Testing

uv run pytest

Technology Stack

LangGraph (1.0.6+) - Agent framework and multi-agent orchestration
LangChain (1.2.3+) - LLM abstractions and tool system
FastAPI (0.115.0+) - Gateway REST API
langchain-mcp-adapters - Model Context Protocol support
agent-sandbox - Sandboxed code execution
markitdown - Multi-format document conversion
tavily-python / firecrawl-py - Web search and scraping
