# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

DeerFlow is a LangGraph-based AI super agent system with a full-stack architecture. The backend provides a "super agent" with sandbox execution, persistent memory, subagent delegation, and extensible tool integration - all operating in per-thread isolated environments.

**Architecture**:
- **Gateway API** (port 8001): REST API plus embedded LangGraph-compatible agent runtime
- **Frontend** (port 3000): Next.js web interface
- **Nginx** (port 2026): Unified reverse proxy entry point
- **Provisioner** (port 8002, optional in Docker dev): Started only when sandbox is configured for provisioner/Kubernetes mode

**Runtime**:
- `make dev`, Docker dev, and production all run the agent runtime in Gateway via `RunManager` + `run_agent()` + `StreamBridge` (`packages/harness/deerflow/runtime/`). Nginx exposes that runtime at `/api/langgraph/*` and rewrites it to Gateway's native `/api/*` routers.

**Project Structure**:
```
deer-flow/
├── Makefile                    # Root commands (check, install, dev, stop)
├── config.yaml                 # Main application configuration
├── extensions_config.json      # MCP servers and skills configuration
├── backend/                    # Backend application (this directory)
│   ├── Makefile                # Backend-only commands (dev, gateway, lint)
│   ├── langgraph.json          # LangGraph Studio graph configuration
│   ├── packages/
│   │   └── harness/            # deerflow-harness package (import: deerflow.*)
│   │       ├── pyproject.toml
│   │       └── deerflow/
│   │           ├── agents/             # LangGraph agent system
│   │           │   ├── lead_agent/     # Main agent (factory + system prompt)
│   │           │   ├── middlewares/    # Middleware components
│   │           │   ├── memory/         # Memory extraction, queue, prompts
│   │           │   └── thread_state.py # ThreadState schema
│   │           ├── sandbox/            # Sandbox execution system
│   │           │   ├── local/          # Local filesystem provider
│   │           │   ├── sandbox.py      # Abstract Sandbox interface
│   │           │   ├── tools.py        # bash, ls, read/write/str_replace
│   │           │   └── middleware.py   # Sandbox lifecycle management
│   │           ├── subagents/          # Subagent delegation system
│   │           │   ├── builtins/       # general-purpose, bash agents
│   │           │   ├── executor.py     # Background execution engine
│   │           │   └── registry.py     # Agent registry
│   │           ├── tools/builtins/     # Built-in tools (present_files, ask_clarification, view_image)
│   │           ├── mcp/                # MCP integration (tools, cache, client)
│   │           ├── models/             # Model factory with thinking/vision support
│   │           ├── skills/             # Skills discovery, loading, parsing
│   │           ├── config/             # Configuration system (app, model, sandbox, tool, etc.)
│   │           ├── community/          # Community tools (tavily, jina_ai, firecrawl, image_search, aio_sandbox)
│   │           ├── reflection/         # Dynamic module loading (resolve_variable, resolve_class)
│   │           ├── utils/              # Utilities (network, readability)
│   │           └── client.py           # Embedded Python client (DeerFlowClient)
│   ├── app/                    # Application layer (import: app.*)
│   │   ├── gateway/            # FastAPI Gateway API
│   │   │   ├── app.py          # FastAPI application
│   │   │   └── routers/        # FastAPI route modules (models, mcp, memory, skills, uploads, threads, artifacts, agents, suggestions, channels)
│   │   └── channels/           # IM platform integrations
│   ├── tests/                  # Test suite
│   └── docs/                   # Documentation
├── frontend/                   # Next.js frontend application
└── skills/                     # Agent skills directory
    ├── public/                 # Public skills (committed)
    └── custom/                 # Custom skills (gitignored)
```

## Important Development Guidelines

### Documentation Update Policy
**CRITICAL: Always update README.md and CLAUDE.md after every code change**

When making code changes, you MUST update the relevant documentation:
- Update `README.md` for user-facing changes (features, setup, usage instructions)
- Update `CLAUDE.md` for development changes (architecture, commands, workflows, internal systems)
- Keep documentation synchronized with the codebase at all times
- Ensure accuracy and timeliness of all documentation

## Commands

**Root directory** (for full application):
```bash
make check      # Check system requirements
make install    # Install all dependencies (frontend + backend)
make dev        # Start all services (Gateway + Frontend + Nginx), with config.yaml preflight
make start      # Start production services locally
make stop       # Stop all services
```

**Backend directory** (for backend development only):
```bash
make install           # Install backend dependencies
make dev               # Run Gateway API with reload (port 8001)
make gateway           # Run Gateway API only (port 8001)
make test              # Run all backend tests
make test-blocking-io  # Run strict Blockbuster runtime gate on tests/blocking_io/
make lint              # Lint with ruff
make format            # Format code with ruff
```

The `detect-blocking-io` target parses `app/`, `packages/harness/deerflow/`,
and `scripts/` with AST. By default it reports only blocking IO candidates that
are inside async code, reachable from async code in the same file, or reachable
from sync-only `AgentMiddleware` before/after hooks that LangGraph can execute
on the async graph path. It prints a concise summary and writes complete JSON
findings to `.deer-flow/blocking-io-findings.json` at the repository root
(both `make detect-blocking-io` from the repo root and
`cd backend && make detect-blocking-io` resolve to the same repo-root path).
JSON findings include `priority`, `location`, `blocking_call`,
`event_loop_exposure`, `reason`, and `code` for model-assisted or manual
review. `priority` is a deterministic review ordering from operation type, not
proof of a bug. Bare-name same-file calls are resolved by function name, so
duplicate helper names in one file can conservatively over-report async
reachability. It is intentionally informational and is not run from CI in this
round.

Regression tests related to Docker/provisioner behavior:
- `tests/test_docker_sandbox_mode_detection.py` (mode detection from `config.yaml`)
- `tests/test_provisioner_kubeconfig.py` (kubeconfig file/directory handling)

Blocking-IO runtime gate (`tests/blocking_io/`):
- Wraps every item under `tests/blocking_io/` with a strict Blockbuster
  context scoped to `app.*` and `deerflow.*` (see
  `tests/support/detectors/blocking_io_runtime.py`). Any sync blocking IO
  call whose stack passes through DeerFlow business code while running on
  the asyncio event loop raises `BlockingError` and fails the test.
- Regression anchors live there: `test_skills_load.py` (locks the
  `asyncio.to_thread` offload around `LocalSkillStorage.load_skills`, fix
  for #1917); `test_sqlite_lifespan.py` (locks the offload around
  SQLite path resolution plus `ensure_sqlite_parent_dir`, fix for #1912);
  `test_jsonl_run_event_store.py` (locks `JsonlRunEventStore`'s async
  API offloading its file IO via `asyncio.to_thread`, fix #3084); and
  `test_uploads_middleware.py` (locks `UploadsMiddleware.abefore_agent`
  offloading the uploads-directory scan off the event loop).
- `test_gate_smoke.py` is a meta-test asserting the gate actually catches
  unoffloaded blocking IO and that the `@pytest.mark.allow_blocking_io`
  opt-out works.
- Coverage boundary: the gate only sees code that test execution actually
  touches. Static AST coverage is a separate concern (out of scope for
  this PR).
- CI: runs on every PR via `.github/workflows/backend-blocking-io-tests.yml`,
  hard-fail.
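
The offload pattern that the regression anchors above lock in can be sketched as follows. This is a minimal illustration, not the project's code: `load_skills_sync`, `load_skills`, and the temporary directory layout are hypothetical stand-ins for the real `LocalSkillStorage` API.

```python
import asyncio
import tempfile
from pathlib import Path


def load_skills_sync(root: Path) -> list[str]:
    """Hypothetical blocking scan: walks the tree synchronously."""
    return sorted(p.parent.name for p in root.rglob("SKILL.md"))


async def load_skills(root: Path) -> list[str]:
    # Run the blocking scan on a worker thread so the event loop stays free.
    return await asyncio.to_thread(load_skills_sync, root)


def demo() -> list[str]:
    """Build two fake skill directories and load them through the async API."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for name in ("alpha", "beta"):
            (root / name).mkdir()
            (root / name / "SKILL.md").write_text("---\nname: demo\n---\n")
        return asyncio.run(load_skills(root))
```

Calling `load_skills_sync` directly from a coroutine would be the kind of call a strict Blockbuster context is meant to flag; wrapping it in `asyncio.to_thread` is what the anchors check for.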

Boundary check (harness → app import firewall):
- `tests/test_harness_boundary.py` — ensures `packages/harness/deerflow/` never imports from `app.*`

CI runs these regression tests for every pull request via [.github/workflows/backend-unit-tests.yml](../.github/workflows/backend-unit-tests.yml).

## Architecture

### Harness / App Split

The backend is split into two layers with a strict dependency direction:

- **Harness** (`packages/harness/deerflow/`): Publishable agent framework package (`deerflow-harness`). Import prefix: `deerflow.*`. Contains agent orchestration, tools, sandbox, models, MCP, skills, config — everything needed to build and run agents.
- **App** (`app/`): Unpublished application code. Import prefix: `app.*`. Contains the FastAPI Gateway API and IM channel integrations (Feishu, Slack, Telegram, DingTalk).

**Dependency rule**: App imports deerflow, but deerflow never imports app. This boundary is enforced by `tests/test_harness_boundary.py` which runs in CI.

**Import conventions**:
```python
# Harness internal
from deerflow.agents import make_lead_agent
from deerflow.models import create_chat_model

# App internal
from app.gateway.app import app
from app.channels.service import start_channel_service

# App → Harness (allowed)
from deerflow.config import get_app_config

# Harness → App (FORBIDDEN — enforced by test_harness_boundary.py)
# from app.gateway.routers.uploads import ...  # ← will fail CI
```
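
One way such an import firewall could be checked is with an AST scan. The sketch below is an assumption about the general approach, not the contents of `tests/test_harness_boundary.py`; the function name `forbidden_imports` is hypothetical.

```python
import ast


def forbidden_imports(source: str, forbidden_root: str = "app") -> list[str]:
    """Return absolute imports in `source` that live under `forbidden_root`."""
    hits: list[str] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # level == 0 skips relative imports, which cannot leave the package.
            names = [node.module]
        else:
            continue
        hits.extend(
            name
            for name in names
            if name == forbidden_root or name.startswith(forbidden_root + ".")
        )
    return hits
```

A real test would presumably read every `.py` file under `packages/harness/deerflow/` and assert that the result is empty. Matching on `app` plus a dot avoids false positives for unrelated modules such as `application`.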

### Agent System

**Lead Agent** (`packages/harness/deerflow/agents/lead_agent/agent.py`):
- Entry point: `make_lead_agent(config: RunnableConfig)` registered in `langgraph.json`
- Dynamic model selection via `create_chat_model()` with thinking/vision support
- Tools loaded via `get_available_tools()` - combines sandbox, built-in, MCP, community, and subagent tools
- System prompt generated by `apply_prompt_template()` with skills, memory, and subagent instructions

**ThreadState** (`packages/harness/deerflow/agents/thread_state.py`):
- Extends `AgentState` with: `sandbox`, `thread_data`, `title`, `artifacts`, `todos`, `uploaded_files`, `viewed_images`
- Uses custom reducers: `merge_artifacts` (deduplicate), `merge_viewed_images` (merge/clear)

**Runtime Configuration** (via `config.configurable`):
- `thinking_enabled` - Enable model's extended thinking
- `model_name` - Select specific LLM model
- `is_plan_mode` - Enable TodoList middleware
- `subagent_enabled` - Enable task delegation tool

### Middleware Chain

Lead-agent middlewares are assembled in strict append order across `packages/harness/deerflow/agents/middlewares/tool_error_handling_middleware.py` (`build_lead_runtime_middlewares`) and `packages/harness/deerflow/agents/lead_agent/agent.py` (`_build_middlewares`):

1. **ThreadDataMiddleware** - Creates per-thread directories under the user's isolation scope (`backend/.deer-flow/users/{user_id}/threads/{thread_id}/user-data/{workspace,uploads,outputs}`); resolves `user_id` via `get_effective_user_id()` (falls back to `"default"` in no-auth mode); Web UI thread deletion now follows LangGraph thread removal with Gateway cleanup of the local thread directory
2. **UploadsMiddleware** - Tracks and injects newly uploaded files into conversation
3. **SandboxMiddleware** - Acquires sandbox, stores `sandbox_id` in state
4. **DanglingToolCallMiddleware** - Injects placeholder ToolMessages for AIMessage tool_calls that lack responses (e.g., due to user interruption), including raw provider tool-call payloads preserved only in `additional_kwargs["tool_calls"]`
5. **LLMErrorHandlingMiddleware** - Normalizes provider/model invocation failures into recoverable assistant-facing errors before later middleware/tool stages run
6. **GuardrailMiddleware** - Pre-tool-call authorization via pluggable `GuardrailProvider` protocol (optional, if `guardrails.enabled` in config). Evaluates each tool call and returns error ToolMessage on deny. Three provider options: built-in `AllowlistProvider` (zero deps), OAP policy providers (e.g. `aport-agent-guardrails`), or custom providers. See [docs/GUARDRAILS.md](docs/GUARDRAILS.md) for setup, usage, and how to implement a provider.
7. **SandboxAuditMiddleware** - Audits sandboxed shell/file operations for security logging before tool execution continues
8. **ToolErrorHandlingMiddleware** - Converts tool exceptions into error `ToolMessage`s so the run can continue instead of aborting
9. **SummarizationMiddleware** - Context reduction when approaching token limits (optional, if enabled)
10. **TodoListMiddleware** - Task tracking with `write_todos` tool (optional, if plan_mode)
11. **TokenUsageMiddleware** - Records token usage metrics when token tracking is enabled (optional); subagent usage is cached by `tool_call_id` only while token usage is enabled and merged back into the dispatching AIMessage by message position rather than message id
12. **TitleMiddleware** - Auto-generates thread title after first complete exchange and normalizes structured message content before prompting the title model
13. **MemoryMiddleware** - Queues conversations for async memory update (filters to user + final AI responses)
14. **ViewImageMiddleware** - Injects base64 image data before LLM call (conditional on vision support)
15. **DeferredToolFilterMiddleware** - Hides deferred (MCP) tool schemas from the bound model using a build-time deferred-name set + catalog hash, reading per-thread promotions from `ThreadState.promoted` (hash-scoped, no ContextVar); a tool becomes bound on subsequent turns after `tool_search` returns its schema (optional, if `tool_search.enabled`)
16. **SubagentLimitMiddleware** - Truncates excess `task` tool calls from model response to enforce `MAX_CONCURRENT_SUBAGENTS` limit (optional, if `subagent_enabled`)
17. **LoopDetectionMiddleware** - Detects repeated tool-call loops; hard-stop responses clear both structured `tool_calls` and raw provider tool-call metadata before forcing a final text answer
18. **ClarificationMiddleware** - Intercepts `ask_clarification` tool calls, interrupts via `Command(goto=END)` (must be last)

### Configuration System

**Main Configuration** (`config.yaml`):

Setup: Copy `config.example.yaml` to `config.yaml` in the **project root** directory.

**Config Versioning**: `config.example.yaml` has a `config_version` field. On startup, `AppConfig.from_file()` compares user version vs example version and emits a warning if outdated. Missing `config_version` = version 0. Run `make config-upgrade` to auto-merge missing fields. When changing the config schema, bump `config_version` in `config.example.yaml`.

**Config Caching**: `get_app_config()` caches the parsed config, but automatically reloads it when the resolved config path changes or the file's mtime increases. This keeps Gateway and LangGraph reads aligned with `config.yaml` edits without requiring a manual process restart.
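
The caching rule described above (reload when the resolved path changes or the mtime increases) can be sketched in a few lines. This is an illustration under stated assumptions, not the body of `get_app_config()`: `get_config_sketch`, `_parse`, and the module-level `_cache` are hypothetical, and parsing is reduced to reading the raw text.

```python
import os
import tempfile
from pathlib import Path

_cache: dict = {"path": None, "mtime": None, "value": None}
_parse_count = 0  # Counts real parses so the demo can show cache hits.


def _parse(path: Path) -> str:
    global _parse_count
    _parse_count += 1
    return path.read_text()


def get_config_sketch(path: Path) -> str:
    resolved = path.resolve()
    mtime = resolved.stat().st_mtime_ns
    stale = (
        _cache["path"] != resolved      # the resolved config path changed
        or _cache["mtime"] is None      # nothing cached yet
        or mtime > _cache["mtime"]      # the file was modified since the last parse
    )
    if stale:
        _cache.update(path=resolved, mtime=mtime, value=_parse(resolved))
    return _cache["value"]


def demo() -> tuple[str, str, int]:
    with tempfile.TemporaryDirectory() as tmp:
        cfg = Path(tmp) / "config.yaml"
        cfg.write_text("log_level: info\n")
        first = get_config_sketch(cfg)
        get_config_sketch(cfg)  # Unchanged mtime: served from the cache.
        cfg.write_text("log_level: debug\n")
        stat = cfg.stat()
        # Bump mtime explicitly so the demo does not rely on timestamp resolution.
        os.utime(cfg, ns=(stat.st_atime_ns, stat.st_mtime_ns + 1_000_000_000))
        second = get_config_sketch(cfg)
        return first, second, _parse_count
```

In the demo the file is parsed twice across three reads: once on first access and once after the edit.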

**Config Hot-Reload Boundary**: Gateway dependencies route through `get_app_config()` on every request, so per-run fields like `models[*].max_tokens`, `summarization.*`, `title.*`, `memory.*`, `subagents.*`, `tools[*]`, and the agent system prompt pick up `config.yaml` edits on the next message. `AppConfig` is intentionally **not** cached on `app.state` — `lifespan()` keeps a local `startup_config` variable for one-shot bootstrap work and passes it to `langgraph_runtime(app, startup_config)`.

Infrastructure fields are **restart-required**. The authoritative list lives in `packages/harness/deerflow/config/reload_boundary.py::STARTUP_ONLY_FIELDS` and is mirrored by the standardised `"startup-only:"` prefix on the corresponding `Field(description=...)` in `AppConfig`, so IDE hover on those fields surfaces the reason inline (no need to context-switch into this document). Currently registered: `database`, `checkpointer`, `run_events`, `stream_bridge`, `sandbox`, `log_level`, `channels`. Adding a new restart-required field requires updating the registry; drift is pinned by `tests/test_reload_boundary.py`.
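
The registry-plus-prefix idea can be illustrated with plain Python. Everything below is a hypothetical sketch: the `_SKETCH` registry, both helper functions, and the reason strings are invented for illustration and are not copied from `reload_boundary.py`. Only the `"startup-only:"` prefix and the field names come from the text above.

```python
STARTUP_ONLY_PREFIX = "startup-only:"

# Illustrative reasons only; the real registry owns the authoritative wording.
STARTUP_ONLY_FIELDS_SKETCH: dict[str, str] = {
    "database": "resources are built once at gateway startup (illustrative)",
    "log_level": "applied once at gateway startup (illustrative)",
}


def describe_field_sketch(field_path: str, *, field_doc: str = "") -> str:
    """Compose the restart marker with the field's own documentation."""
    marker = f"{STARTUP_ONLY_PREFIX} {STARTUP_ONLY_FIELDS_SKETCH[field_path]}"
    return f"{marker} {field_doc}" if field_doc else marker


def unregistered_fields(descriptions: dict[str, str]) -> list[str]:
    """Reverse-drift check: fields carrying the prefix but missing from the registry."""
    return sorted(
        name
        for name, text in descriptions.items()
        if text.startswith(STARTUP_ONLY_PREFIX) and name not in STARTUP_ONLY_FIELDS_SKETCH
    )
```

A drift test in this style would collect every schema field description, call `unregistered_fields`, and fail if the list is non-empty. The forward direction (every registered field carries the prefix) can be checked the same way.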

Configuration priority:
1. Explicit `config_path` argument
2. `DEER_FLOW_CONFIG_PATH` environment variable
3. `config.yaml` in current directory (backend/)
4. `config.yaml` in parent directory (project root - **recommended location**)

Config values starting with `$` are resolved as environment variables (e.g., `$OPENAI_API_KEY`).
`ModelConfig` also declares `use_responses_api` and `output_version` so OpenAI `/v1/responses` can be enabled explicitly while still using `langchain_openai:ChatOpenAI`.

**Extensions Configuration** (`extensions_config.json`):

MCP servers and skills are configured together in `extensions_config.json` in the project root.

Configuration priority:
1. Explicit `config_path` argument
2. `DEER_FLOW_EXTENSIONS_CONFIG_PATH` environment variable
3. `extensions_config.json` in current directory (backend/)
4. `extensions_config.json` in parent directory (project root - **recommended location**)

### Gateway API (`app/gateway/`)

FastAPI application on port 8001 with health check at `GET /health`. Set `GATEWAY_ENABLE_DOCS=false` to disable `/docs`, `/redoc`, and `/openapi.json` in production (default: enabled).

CORS is same-origin by default when requests enter through nginx on port 2026. Split-origin or port-forwarded browser clients must opt in with `GATEWAY_CORS_ORIGINS` (comma-separated exact origins); Gateway `CORSMiddleware` and `CSRFMiddleware` both read that variable so browser CORS and auth-origin checks stay aligned.

**Routers**:

| Router | Endpoints |
|--------|-----------|
| **Models** (`/api/models`) | `GET /` - list models; `GET /{name}` - model details |
| **MCP** (`/api/mcp`) | `GET /config` - get config; `PUT /config` - update config (saves to extensions_config.json) |
| **Skills** (`/api/skills`) | `GET /` - list skills; `GET /{name}` - details; `PUT /{name}` - update enabled; `POST /install` - install from .skill archive (accepts standard optional frontmatter like `version`, `author`, `compatibility`) |
| **Memory** (`/api/memory`) | `GET /` - memory data; `POST /reload` - force reload; `GET /config` - config; `GET /status` - config + data |
| **Uploads** (`/api/threads/{id}/uploads`) | `POST /` - upload files (auto-converts PDF/PPT/Excel/Word); `GET /list` - list; `DELETE /{filename}` - delete |
| **Threads** (`/api/threads/{id}`) | `DELETE /` - remove DeerFlow-managed local thread data after LangGraph thread deletion; unexpected failures are logged server-side and return a generic 500 detail |
| **Artifacts** (`/api/threads/{id}/artifacts`) | `GET /{path}` - serve artifacts; active content types (`text/html`, `application/xhtml+xml`, `image/svg+xml`) are always forced as download attachments to reduce XSS risk; `?download=true` still forces download for other file types |
| **Suggestions** (`/api/threads/{id}/suggestions`) | `POST /` - generate follow-up questions; rich list/block model content is normalized before JSON parsing |
| **Thread Runs** (`/api/threads/{id}/runs`) | `POST /` - create background run; `POST /stream` - create + SSE stream; `POST /wait` - create + block; `GET /` - list runs; `GET /{rid}` - run details; `POST /{rid}/cancel` - cancel; `GET /{rid}/join` - join SSE; `GET /{rid}/messages` - paginated messages `{data, has_more}`; `GET /{rid}/events` - full event stream; `GET /../messages` - thread messages with feedback; `GET /../token-usage` - aggregate tokens |
| **Feedback** (`/api/threads/{id}/runs/{rid}/feedback`) | `PUT /` - upsert feedback; `DELETE /` - delete user feedback; `POST /` - create feedback; `GET /` - list feedback; `GET /stats` - aggregate stats; `DELETE /{fid}` - delete specific |
| **Runs** (`/api/runs`) | `POST /stream` - stateless run + SSE; `POST /wait` - stateless run + block; `GET /{rid}/messages` - paginated messages by run_id `{data, has_more}` (cursor: `after_seq`/`before_seq`); `GET /{rid}/feedback` - list feedback by run_id |
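
The Artifacts row describes a small decision rule: active content types are always downloaded, and `?download=true` forces download for everything else. A minimal sketch of that rule follows. The function name `disposition_for` is hypothetical, and returning `"inline"` in the remaining case is an assumption about the default.

```python
ACTIVE_CONTENT_TYPES = {"text/html", "application/xhtml+xml", "image/svg+xml"}


def disposition_for(mime_type: str, *, download: bool = False) -> str:
    """Pick a Content-Disposition type for an artifact response."""
    # Strip parameters such as "; charset=utf-8" before comparing.
    base = mime_type.split(";", 1)[0].strip().lower()
    if base in ACTIVE_CONTENT_TYPES or download:
        # Active content can execute script in the browser, so never render it inline.
        return "attachment"
    return "inline"  # ASSUMPTION: other types are served inline by default.
```

Normalising the MIME type first matters because a `text/html; charset=utf-8` header would otherwise slip past an exact-match check.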

**RunManager / RunStore contract**:
- `RunManager.get()` is async; direct callers must `await` it.
- When a persistent `RunStore` is configured, `get()` and `list_by_thread()` hydrate historical runs from the store. In-memory records win for the same `run_id` so task, abort, and stream-control state stays attached to active local runs.
- `cancel()` and `create_or_reject(..., multitask_strategy="interrupt"|"rollback")` persist interrupted status through `RunStore.update_status()`, matching normal `set_status()` transitions.
- Store-only hydrated runs are readable history. If the current worker has no in-memory task/control state for that run, cancellation APIs can return 409 because this worker cannot stop the task.
- `POST /wait` (both thread-scoped and `/api/runs/wait`) drains the stream bridge via `wait_for_run_completion()` instead of bare `await record.task`, so it honours the run's `on_disconnect` setting and cancels the background run on real client disconnect rather than returning a stale checkpoint (issue #3265).

Proxied through nginx: `/api/langgraph/*` → Gateway LangGraph-compatible runtime, all other `/api/*` → Gateway REST APIs.

### Sandbox System (`packages/harness/deerflow/sandbox/`)

**Interface**: Abstract `Sandbox` with `execute_command`, `read_file`, `write_file`, `list_dir`
**Provider Pattern**: `SandboxProvider` with `acquire`, `acquire_async`, `get`, `release` lifecycle. Async agent/tool paths call async sandbox lifecycle hooks so Docker sandbox creation, discovery, cross-process locking, readiness polling, and release stay off the event loop.
**Implementations**:
- `LocalSandboxProvider` - Local filesystem execution. `acquire(thread_id)` returns a per-thread `LocalSandbox` (id `local:{thread_id}`) whose `path_mappings` resolve `/mnt/user-data/{workspace,uploads,outputs}` and `/mnt/acp-workspace` to that thread's host directories, so the public `Sandbox` API honours the `/mnt/user-data` contract uniformly with AIO. `acquire()` / `acquire(None)` keeps the legacy generic singleton (id `local`) for callers without a thread context. Per-thread sandboxes are held in an LRU cache (default 256 entries) guarded by a `threading.Lock`.
- `AioSandboxProvider` (`packages/harness/deerflow/community/`) - Docker-based isolation

**Virtual Path System**:
- Agent sees: `/mnt/user-data/{workspace,uploads,outputs}`, `/mnt/skills`
- Physical: `backend/.deer-flow/users/{user_id}/threads/{thread_id}/user-data/...`, `deer-flow/skills/`
- Translation: `LocalSandboxProvider` builds per-thread `PathMapping`s for the user-data prefixes at acquire time; `tools.py` keeps `replace_virtual_path()` / `replace_virtual_paths_in_command()` as a defense-in-depth layer (and for path validation). AIO has the directories volume-mounted at the same virtual paths inside its container, so both implementations accept `/mnt/user-data/...` natively.
- Detection: `is_local_sandbox()` accepts both `sandbox_id == "local"` (legacy / no-thread) and `sandbox_id.startswith("local:")` (per-thread)
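
The translation and detection rules above can be sketched as follows. `translate_virtual_path` and the plain-dict mapping are hypothetical simplifications, not the real `PathMapping` or `replace_virtual_path()` code; the detection helper simply restates the two conditions listed above.

```python
from pathlib import PurePosixPath


def translate_virtual_path(path: str, mappings: dict[str, str]) -> str:
    """Map a virtual path to a host path using the longest matching prefix."""
    target = PurePosixPath(path)
    for virtual in sorted(mappings, key=len, reverse=True):
        try:
            # relative_to compares whole path components, so
            # "/mnt/user-data/workspace2" does not match ".../workspace".
            relative = target.relative_to(virtual)
        except ValueError:
            continue
        return str(PurePosixPath(mappings[virtual]) / relative)
    raise ValueError(f"{path} is outside the mapped virtual roots")


def is_local_sandbox_sketch(sandbox_id: str) -> bool:
    """Accept the legacy singleton id and per-thread ids."""
    return sandbox_id == "local" or sandbox_id.startswith("local:")
```

Note that this sketch does not normalise `..` segments; a production implementation would need to validate those before touching the filesystem.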

**Sandbox Tools** (in `packages/harness/deerflow/sandbox/tools.py`):
- `bash` - Execute commands with path translation and error handling
- `ls` - Directory listing (tree format, max 2 levels)
- `read_file` - Read file contents with optional line range
- `write_file` - Write/append to files, creates directories; overwrites by default and exposes the `append` argument in the model-facing schema for end-of-file writes
- `str_replace` - Substring replacement (single or all occurrences); same-path serialization is scoped to `(sandbox.id, path)` so isolated sandboxes do not contend on identical virtual paths inside one process

### Subagent System (`packages/harness/deerflow/subagents/`)

**Built-in Agents**: `general-purpose` (all tools except `task`) and `bash` (command specialist)
**Execution**: Dual thread pool - `_scheduler_pool` (3 workers) + `_execution_pool` (3 workers)
**Concurrency**: `MAX_CONCURRENT_SUBAGENTS = 3` enforced by `SubagentLimitMiddleware` (truncates excess tool calls in `after_model`), 15-minute timeout
**Flow**: `task()` tool → `SubagentExecutor` → background thread → poll 5s → SSE events → result
**Events**: `task_started`, `task_running`, `task_completed`/`task_failed`/`task_timed_out`
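
The truncation step can be pictured with a small sketch. It assumes tool calls are dicts with a `name` key and that the first calls in order are the ones kept; `truncate_task_calls` is a hypothetical name, not the middleware's actual method.

```python
MAX_CONCURRENT_SUBAGENTS = 3


def truncate_task_calls(tool_calls: list[dict], limit: int = MAX_CONCURRENT_SUBAGENTS) -> list[dict]:
    """Keep every non-`task` call and only the first `limit` `task` calls."""
    kept: list[dict] = []
    task_count = 0
    for call in tool_calls:
        if call.get("name") == "task":
            task_count += 1
            if task_count > limit:
                continue  # Drop the excess delegation request.
        kept.append(call)
    return kept
```

Doing this in an `after_model` hook means the excess calls never reach the executor, so the thread pools are not asked to run more subagents than the limit allows.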

### Tool System (`packages/harness/deerflow/tools/`)

`get_available_tools(groups, include_mcp, model_name, subagent_enabled)` assembles:
1. **Config-defined tools** - Resolved from `config.yaml` via `resolve_variable()`
2. **MCP tools** - From enabled MCP servers (lazy initialized, cached with mtime invalidation)
3. **Built-in tools**:
   - `present_files` - Make output files visible to user (only `/mnt/user-data/outputs`)
   - `ask_clarification` - Request clarification (intercepted by ClarificationMiddleware → interrupts)
   - `view_image` - Read image as base64 (added only if model supports vision)
   - `setup_agent` - Bootstrap-only: persist a brand-new custom agent's `SOUL.md` and `config.yaml`. Bound only when `is_bootstrap=True`.
   - `update_agent` - Custom-agent-only: persist self-updates to the current agent's `SOUL.md` / `config.yaml` from inside a normal chat (partial update + atomic write). Bound when `agent_name` is set and `is_bootstrap=False`.
4. **Subagent tool** (if enabled):
   - `task` - Delegate to subagent (description, prompt, subagent_type)

**Community tools** (`packages/harness/deerflow/community/`):
- `tavily/` - Web search (5 results default) and web fetch (4KB limit)
- `jina_ai/` - Web fetch via Jina reader API with readability extraction
- `firecrawl/` - Web scraping via Firecrawl API
- `image_search/` - Image search via DuckDuckGo

**ACP agent tools**:
- `invoke_acp_agent` - Invokes external ACP-compatible agents from `config.yaml`
- ACP launchers must be real ACP adapters. The standard `codex` CLI is not ACP-compatible by itself; configure a wrapper such as `npx -y @zed-industries/codex-acp` or an installed `codex-acp` binary
- Missing ACP executables now return an actionable error message instead of a raw `[Errno 2]`
- Each ACP agent uses a per-thread workspace at `{base_dir}/users/{user_id}/threads/{thread_id}/acp-workspace/`. The workspace is accessible to the lead agent via the virtual path `/mnt/acp-workspace/` (read-only). In docker sandbox mode, the directory is volume-mounted into the container at `/mnt/acp-workspace` (read-only); in local sandbox mode, path translation is handled by `tools.py`

### MCP System (`packages/harness/deerflow/mcp/`)

- Uses `langchain-mcp-adapters` `MultiServerMCPClient` for multi-server management
- **Lazy initialization**: Tools loaded on first use via `get_cached_mcp_tools()`
- **Cache invalidation**: Detects config file changes via mtime comparison
- **Transports**: stdio (command-based), SSE, HTTP
- **OAuth (HTTP/SSE)**: Supports token endpoint flows (`client_credentials`, `refresh_token`) with automatic token refresh + Authorization header injection
- **Runtime updates**: Gateway API saves to extensions_config.json; the Gateway-embedded runtime detects changes via mtime

### Skills System (`packages/harness/deerflow/skills/`)

- **Location**: `deer-flow/skills/{public,custom}/`
- **Format**: Directory with `SKILL.md` (YAML frontmatter: name, description, license, allowed-tools)
- **Loading**: `load_skills()` recursively scans `skills/{public,custom}` for `SKILL.md`, parses metadata, and reads enabled state from extensions_config.json
- **Injection**: Enabled skills listed in agent system prompt with container paths
- **Installation**: `POST /api/skills/install` extracts .skill ZIP archive to custom/ directory
|
|
|
|
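As a rough illustration of the `SKILL.md` metadata step, the sketch below reads flat `key: value` frontmatter between `---` fences. It is a simplification under stated assumptions: `parse_skill_frontmatter` is a hypothetical name, and it handles only single-line scalar values rather than full YAML, which the real loader presumably parses with a YAML library.

```python
def parse_skill_frontmatter(text: str) -> dict[str, str]:
    """Hypothetical sketch: parse flat `key: value` frontmatter from SKILL.md text."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no frontmatter block at the top of the file
    meta: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return meta  # closing fence reached
        key, sep, value = line.partition(":")  # split on the first colon only
        if sep and key.strip():
            meta[key.strip()] = value.strip()
    return {}  # closing fence missing: treat as no frontmatter
```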
### Model Factory (`packages/harness/deerflow/models/factory.py`)

- `create_chat_model(name, thinking_enabled)` instantiates LLM from config via reflection
- Supports `thinking_enabled` flag with per-model `when_thinking_enabled` overrides
- Supports vLLM-style thinking toggles via `when_thinking_enabled.extra_body.chat_template_kwargs.enable_thinking` for Qwen reasoning models, while normalizing legacy `thinking` configs for backward compatibility
- Supports `supports_vision` flag for image understanding models
- Config values starting with `$` are resolved as environment variables
- Missing provider modules surface actionable install hints from reflection resolvers (for example `uv add langchain-google-genai`)

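The `$`-prefixed value rule could look something like the sketch below. `resolve_env_refs` is a hypothetical name, and raising `KeyError` for an unset variable is an assumption; the actual config loader may behave differently.

```python
import os
from typing import Any


def resolve_env_refs(value: Any) -> Any:
    """Hypothetical sketch: replace "$NAME" strings with os.environ["NAME"], recursively."""
    if isinstance(value, str) and value.startswith("$"):
        name = value[1:]
        if name not in os.environ:
            # Assumption: fail loudly instead of passing the literal "$NAME" through.
            raise KeyError(f"environment variable {name!r} is not set")
        return os.environ[name]
    if isinstance(value, dict):
        return {key: resolve_env_refs(item) for key, item in value.items()}
    if isinstance(value, list):
        return [resolve_env_refs(item) for item in value]
    return value  # numbers, booleans, None, and plain strings pass through
```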
### vLLM Provider (`packages/harness/deerflow/models/vllm_provider.py`)

- `VllmChatModel` subclasses `langchain_openai:ChatOpenAI` for vLLM 0.19.0 OpenAI-compatible endpoints
- Preserves vLLM's non-standard assistant `reasoning` field on full responses, streaming deltas, and follow-up tool-call turns
- Designed for configs that enable thinking through `extra_body.chat_template_kwargs.enable_thinking` on vLLM 0.19.0 Qwen reasoning models, while accepting the older `thinking` alias

### IM Channels System (`app/channels/`)

Bridges external messaging platforms (Feishu, Slack, Telegram, DingTalk) to the DeerFlow agent via Gateway's LangGraph-compatible API.

**Architecture**: Channels communicate with Gateway through the `langgraph-sdk` HTTP client (same as the frontend), ensuring threads are created and managed server-side. The internal SDK client injects process-local internal auth plus a matching CSRF cookie/header pair so Gateway accepts state-changing thread/run requests from channel workers without relying on browser session cookies.

**Components**:
- `message_bus.py` - Async pub/sub hub (`InboundMessage` → queue → dispatcher; `OutboundMessage` → callbacks → channels)
- `store.py` - JSON-file persistence mapping `channel_name:chat_id[:topic_id]` → `thread_id` (keys are `channel:chat` for root conversations and `channel:chat:topic` for threaded conversations)
- `manager.py` - Core dispatcher: creates threads via `client.threads.create()`, routes commands, keeps Slack/Telegram on `client.runs.wait()`, and uses `client.runs.stream(["messages-tuple", "values"])` for Feishu incremental outbound updates
- `base.py` - Abstract `Channel` base class (start/stop/send lifecycle)
- `service.py` - Manages lifecycle of all configured channels from `config.yaml`
- `slack.py` / `feishu.py` / `telegram.py` / `dingtalk.py` - Platform-specific implementations (`feishu.py` tracks the running card `message_id` in memory and patches the same card in place; `dingtalk.py` optionally uses AI Card streaming for in-place updates when `card_template_id` is configured)

**Message Flow**:
1. External platform -> Channel impl -> `MessageBus.publish_inbound()`
2. `ChannelManager._dispatch_loop()` consumes from queue
3. For chat: look up/create thread through Gateway's LangGraph-compatible API
4. Feishu chat: `runs.stream()` → accumulate AI text → publish multiple outbound updates (`is_final=False`) → publish final outbound (`is_final=True`)
5. Slack/Telegram chat: `runs.wait()` → extract final response → publish outbound
6. Feishu channel sends one running reply card up front, then patches the same card for each outbound update (card JSON sets `config.update_multi=true` for Feishu's patch API requirement)
7. DingTalk AI Card mode (when `card_template_id` configured): `runs.stream()` → create card with initial text → stream updates via `PUT /v1.0/card/streaming` → finalize on `is_final=True`. Falls back to `sampleMarkdown` if card creation or streaming fails
8. For commands (`/new`, `/status`, `/models`, `/memory`, `/help`): handle locally or query Gateway API
9. Outbound → channel callbacks → platform reply

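Step 4 of the flow above (accumulate streamed text, publish interim updates, then a final one) can be sketched in isolation. `OutboundUpdate` is a hypothetical stand-in for the real `OutboundMessage`, and the deltas are plain strings rather than LangGraph stream events.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class OutboundUpdate:
    """Hypothetical stand-in for OutboundMessage, reduced to the fields used here."""

    text: str
    is_final: bool


def stream_to_updates(deltas: Iterable[str]) -> Iterator[OutboundUpdate]:
    """Yield the accumulated text after each delta, then one final update."""
    text = ""
    for delta in deltas:
        if not delta:
            continue  # skip empty chunks so no redundant update is published
        text += delta
        yield OutboundUpdate(text=text, is_final=False)
    yield OutboundUpdate(text=text, is_final=True)
```

Each interim update carries the full text so far, which suits a channel that patches one card in place rather than appending messages.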
**Configuration** (`config.yaml` -> `channels`):
- `langgraph_url` - LangGraph-compatible Gateway API base URL (default: `http://localhost:8001/api`)
- `gateway_url` - Gateway API URL for auxiliary commands (default: `http://localhost:8001`)
- In Docker Compose, IM channels run inside the `gateway` container, so `localhost` points back to that container. Use `http://gateway:8001/api` for `langgraph_url` and `http://gateway:8001` for `gateway_url`, or set `DEER_FLOW_CHANNELS_LANGGRAPH_URL` / `DEER_FLOW_CHANNELS_GATEWAY_URL`.
- Per-channel configs: `feishu` (app_id, app_secret), `slack` (bot_token, app_token), `telegram` (bot_token), `dingtalk` (client_id, client_secret, optional `card_template_id` for AI Card streaming)

### Memory System (`packages/harness/deerflow/agents/memory/`)

**Components**:
- `updater.py` - LLM-based memory updates with fact extraction, whitespace-normalized fact deduplication (trims leading/trailing whitespace before comparing), and atomic file I/O
- `queue.py` - Debounced update queue (per-thread deduplication, configurable wait time); captures `user_id` at enqueue time so it survives the `threading.Timer` boundary
- `prompt.py` - Prompt templates for memory updates
- `storage.py` - File-based storage with per-user isolation; cache keyed by `(user_id, agent_name)` tuple

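The debounce behaviour attributed to `queue.py` can be sketched as below. This is a simplified illustration under stated assumptions: `DebouncedQueue` and its callback signature are hypothetical, and the real queue likely carries more context per entry. The point shown is that `user_id` is stored with the entry, because a contextvar would not be visible on the timer thread.

```python
import threading
from typing import Callable


class DebouncedQueue:
    """Hypothetical sketch: batch per-thread updates and flush after a quiet period."""

    def __init__(self, wait_seconds: float, process: Callable[[str, str, list[str]], None]) -> None:
        self._wait = wait_seconds
        self._process = process
        self._lock = threading.Lock()
        self._pending: dict[str, tuple[str, list[str]]] = {}
        self._timer: threading.Timer | None = None

    def add(self, thread_id: str, user_id: str, messages: list[str]) -> None:
        with self._lock:
            # A newer entry for the same thread replaces the older one (per-thread dedup),
            # and user_id is captured now, while the caller's context is still available.
            self._pending[thread_id] = (user_id, messages)
            if self._timer is not None:
                self._timer.cancel()  # restart the quiet period
            self._timer = threading.Timer(self._wait, self._flush)
            self._timer.daemon = True
            self._timer.start()

    def _flush(self) -> None:
        with self._lock:
            batch, self._pending = self._pending, {}
            self._timer = None
        for thread_id, (user_id, messages) in batch.items():
            self._process(thread_id, user_id, messages)
```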
**Per-User Isolation**:
- Memory is stored per-user at `{base_dir}/users/{user_id}/memory.json`
- Per-agent per-user memory at `{base_dir}/users/{user_id}/agents/{agent_name}/memory.json`
- Custom agent definitions (`SOUL.md` + `config.yaml`) are also per-user at `{base_dir}/users/{user_id}/agents/{agent_name}/`. The legacy shared layout `{base_dir}/agents/{agent_name}/` remains a read-only fallback for unmigrated installations
- `user_id` is resolved via `get_effective_user_id()` from `deerflow.runtime.user_context`
- In no-auth mode, `user_id` defaults to `"default"` (constant `DEFAULT_USER_ID`)
- Absolute `storage_path` in config opts out of per-user isolation
- **Migration**: Run `PYTHONPATH=. python scripts/migrate_user_isolation.py` to move legacy `memory.json`, `threads/`, and `agents/` into per-user layout. Supports `--dry-run` (preview changes) and `--user-id USER_ID` (assign unowned legacy data to a user, defaults to `default`).

**Data Structure** (stored in `{base_dir}/users/{user_id}/memory.json`):
- **User Context**: `workContext`, `personalContext`, `topOfMind` (1-3 sentence summaries)
- **History**: `recentMonths`, `earlierContext`, `longTermBackground`
- **Facts**: Discrete facts with `id`, `content`, `category` (preference/knowledge/context/behavior/goal), `confidence` (0-1), `createdAt`, `source`

**Workflow**:
1. `MemoryMiddleware` filters messages (user inputs + final AI responses), captures `user_id` via `get_effective_user_id()`, and queues conversation with the captured `user_id`
2. Queue debounces (30s default), batches updates, deduplicates per-thread
3. Background thread invokes LLM to extract context updates and facts, using the stored `user_id` (not the contextvar, which is unavailable on timer threads)
4. Applies updates atomically (temp file + rename) with cache invalidation, skipping duplicate fact content before append
5. Next interaction injects top 15 facts + context into `<memory>` tags in system prompt

Focused regression coverage for the updater lives in `backend/tests/test_memory_updater.py`.

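Step 4 of the workflow combines two small techniques: whitespace-normalized duplicate detection and an atomic write via temp file plus rename. A minimal sketch, assuming hypothetical helper names (`append_fact`, `write_json_atomic`) and a fact reduced to its `content` field:

```python
import json
import os
import tempfile
from pathlib import Path


def append_fact(facts: list[dict], content: str) -> bool:
    """Append a fact unless one with the same trimmed content already exists."""
    normalized = content.strip()
    if any(fact["content"].strip() == normalized for fact in facts):
        return False  # duplicate after trimming leading/trailing whitespace
    facts.append({"content": normalized})
    return True


def write_json_atomic(path: Path, data: dict) -> None:
    """Write to a temp file in the same directory, then rename it over the target."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(data, handle, ensure_ascii=False, indent=2)
        os.replace(tmp_name, path)  # readers see the old or the new file, never a partial one
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

Creating the temp file next to the target keeps the rename on one filesystem, which is what makes `os.replace` atomic.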
**Configuration** (`config.yaml` → `memory`):
- `enabled` / `injection_enabled` - Master switches
- `storage_path` - Path to `memory.json` (absolute path opts out of per-user isolation)
- `debounce_seconds` - Wait time before processing (default: 30)
- `model_name` - LLM for updates (null = default model)
- `max_facts` / `fact_confidence_threshold` - Fact storage limits (defaults: 100 / 0.7)
- `max_injection_tokens` - Token limit for prompt injection (default: 2000)

### Reflection System (`packages/harness/deerflow/reflection/`)

- `resolve_variable(path)` - Import module and return variable (e.g., `module.path:variable_name`)
- `resolve_class(path, base_class)` - Import and validate class against base class

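The `module.path:variable_name` convention can be illustrated with `importlib`. The functions below are sketches with a `_sketch` suffix to make clear they are not the project's `resolve_variable` / `resolve_class`; the error types are assumptions.

```python
from importlib import import_module
from typing import Any


def resolve_variable_sketch(path: str) -> Any:
    """Illustrative only: import `module.path` and return the attribute after the colon."""
    module_path, sep, name = path.partition(":")
    if not sep or not module_path or not name:
        raise ValueError(f"expected 'module.path:variable_name', got {path!r}")
    module = import_module(module_path)
    return getattr(module, name)


def resolve_class_sketch(path: str, base_class: type) -> type:
    """Illustrative only: resolve a class and check it derives from base_class."""
    obj = resolve_variable_sketch(path)
    if not isinstance(obj, type) or not issubclass(obj, base_class):
        raise TypeError(f"{path!r} is not a subclass of {base_class.__name__}")
    return obj
```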
### Tracing System (`packages/harness/deerflow/tracing/`)

LangSmith and Langfuse are both supported. The wiring lives in two layers:

- `factory.py::build_tracing_callbacks()` - returns the LangChain `CallbackHandler` list for the providers currently enabled via env vars (`LANGSMITH_TRACING`, `LANGFUSE_TRACING`, etc.). The handlers are attached at the **graph invocation root** for in-graph runs (`make_lead_agent` and `DeerFlowClient.stream` both append them to `config["callbacks"]` before invoking the graph) so a single run produces one trace with all node / LLM / tool calls as child spans. Standalone callers, meaning anything that invokes a model outside such a graph (e.g. `MemoryUpdater`), keep `create_chat_model`'s default `attach_tracing=True`, which falls back to model-level callback attachment.
- `metadata.py::build_langfuse_trace_metadata()` - builds the Langfuse-reserved trace attributes for `RunnableConfig.metadata`. The Langfuse v4 `langchain.CallbackHandler` lifts these onto the root trace (see its `_parse_langfuse_trace_attributes`), but only when it sees `on_chain_start(parent_run_id=None)`, which is why the callbacks have to live at the graph root, not the model.

**Trace-attribute injection points**: both `runtime/runs/worker.py::run_agent` (gateway path) and `client.py::DeerFlowClient.stream` (embedded path) merge the metadata into `config["metadata"]` right before constructing the graph. Caller-supplied keys win via `setdefault`, so an external `session_id` override is preserved. Field mapping:

| Langfuse field | Source |
|-----------------------|----------------------------------------------|
| `langfuse_session_id` | LangGraph `thread_id` |
| `langfuse_user_id` | `get_effective_user_id()` (`default` in no-auth) |
| `langfuse_trace_name` | `RunRecord.assistant_id` / client `agent_name` (defaults to `lead-agent`) |
| `langfuse_tags` | `env:<DEER_FLOW_ENV>` + `model:<model_name>` |

`build_langfuse_trace_metadata()` returns `{}` when Langfuse is not among the enabled providers, so LangSmith-only deployments are unaffected. Set `DEER_FLOW_ENV` (or `ENVIRONMENT`) to tag traces by deployment environment. Tests live in `tests/test_tracing_factory.py`, `tests/test_tracing_metadata.py`, `tests/test_worker_langfuse_metadata.py`, and `tests/test_client_langfuse_metadata.py`.

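The "caller-supplied keys win via `setdefault`" rule amounts to the merge below. `merge_trace_metadata` is a hypothetical name for this sketch, and the config is a plain dict rather than a real `RunnableConfig`.

```python
def merge_trace_metadata(config: dict, trace_metadata: dict) -> dict:
    """Hypothetical sketch: add trace attributes without overriding caller-supplied keys."""
    metadata = config.setdefault("metadata", {})
    for key, value in trace_metadata.items():
        metadata.setdefault(key, value)  # existing keys are left untouched
    return config
```

With this merge, an empty `trace_metadata` (the Langfuse-disabled case) leaves the caller's metadata exactly as it was.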
### Config Schema

**`config.yaml`** key sections:
- `models[]` - LLM configs with `use` class path, `supports_thinking`, `supports_vision`, provider-specific fields
  - vLLM reasoning models should use `deerflow.models.vllm_provider:VllmChatModel`; for Qwen-style parsers prefer `when_thinking_enabled.extra_body.chat_template_kwargs.enable_thinking`, and DeerFlow will also normalize the older `thinking` alias
- `tools[]` - Tool configs with `use` variable path and `group`
- `tool_groups[]` - Logical groupings for tools
- `sandbox.use` - Sandbox provider class path
- `skills.path` / `skills.container_path` - Host and container paths to skills directory
- `title` - Auto-title generation (enabled, max_words, max_chars, prompt_template)
- `summarization` - Context summarization (enabled, trigger conditions, keep policy)
- `subagents.enabled` - Master switch for subagent delegation
- `memory` - Memory system (enabled, storage_path, debounce_seconds, model_name, max_facts, fact_confidence_threshold, injection_enabled, max_injection_tokens)

**`extensions_config.json`**:
- `mcpServers` - Map of server name → config (enabled, type, command, args, env, url, headers, oauth, description)
- `skills` - Map of skill name → state (enabled)

Both can be modified at runtime via Gateway API endpoints or `DeerFlowClient` methods.

### Embedded Client (`packages/harness/deerflow/client.py`)

`DeerFlowClient` provides direct in-process access to all DeerFlow capabilities without HTTP services. All return types align with the Gateway API response schemas, so consumer code works identically in HTTP and embedded modes.

**Architecture**: Imports the same `deerflow` modules that Gateway API uses. Shares the same config files and data directories. No FastAPI dependency.

**Agent Conversation**:
- `chat(message, thread_id)` - synchronous, accumulates streaming deltas per message-id and returns the final AI text
- `stream(message, thread_id)` - subscribes to LangGraph `stream_mode=["values", "messages", "custom"]` and yields `StreamEvent`:
  - `"values"` - full state snapshot (title, messages, artifacts); AI text already delivered via `messages` mode is **not** re-synthesized here to avoid duplicate deliveries
  - `"messages-tuple"` - per-chunk update: for AI text this is a **delta** (concat per `id` to rebuild the full message); tool calls and tool results are emitted once each
  - `"custom"` - forwarded from `StreamWriter`
  - `"end"` - stream finished (carries cumulative `usage` counted once per message id)
- Agent created lazily via `create_agent()` + `_build_middlewares()`, same as `make_lead_agent`
- Supports `checkpointer` parameter for state persistence across turns
- `reset_agent()` forces agent recreation (e.g. after memory or skill changes)
- See [docs/STREAMING.md](docs/STREAMING.md) for the full design: why Gateway and DeerFlowClient are parallel paths, LangGraph's `stream_mode` semantics, the per-id dedup invariants, and regression testing strategy

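The "concat per `id`" rule for `"messages-tuple"` deltas can be sketched without any DeerFlow types. Here each chunk is assumed to be a plain `(message_id, delta_text)` pair, and `final_text` assumes the final AI text is the message whose id appeared last; both are simplifications of the real `StreamEvent` handling.

```python
from typing import Iterable


def rebuild_messages(chunks: Iterable[tuple[str, str]]) -> dict[str, str]:
    """Concatenate deltas per message id, keeping first-seen order of ids."""
    texts: dict[str, str] = {}
    for message_id, delta in chunks:
        texts[message_id] = texts.get(message_id, "") + delta
    return texts


def final_text(chunks: Iterable[tuple[str, str]]) -> str:
    """Assumption for this sketch: the final AI text belongs to the last id seen."""
    texts = rebuild_messages(chunks)
    return next(reversed(texts.values()), "")
```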
**Gateway Equivalent Methods** (replaces Gateway API):

| Category | Methods | Return format |
|----------|---------|---------------|
| Models | `list_models()`, `get_model(name)` | `{"models": [...]}`, `{name, display_name, ...}` |
| MCP | `get_mcp_config()`, `update_mcp_config(servers)` | `{"mcp_servers": {...}}` |
| Skills | `list_skills()`, `get_skill(name)`, `update_skill(name, enabled)`, `install_skill(path)` | `{"skills": [...]}` |
| Memory | `get_memory()`, `reload_memory()`, `get_memory_config()`, `get_memory_status()` | dict |
| Uploads | `upload_files(thread_id, files)`, `list_uploads(thread_id)`, `delete_upload(thread_id, filename)` | `{"success": true, "files": [...]}`, `{"files": [...], "count": N}` |
| Artifacts | `get_artifact(thread_id, path)` → `(bytes, mime_type)` | tuple |

**Key difference from Gateway**: Upload accepts local `Path` objects instead of HTTP `UploadFile`, rejects directory paths before copying, and reuses a single worker when document conversion must run inside an active event loop. Artifact returns `(bytes, mime_type)` instead of HTTP Response. The Gateway-only thread cleanup route deletes `.deer-flow/threads/{thread_id}` after LangGraph thread deletion; there is no matching `DeerFlowClient` method yet. `update_mcp_config()` and `update_skill()` automatically invalidate the cached agent.

**Tests**: `tests/test_client.py` (77 unit tests including `TestGatewayConformance`), `tests/test_client_live.py` (live integration tests, requires `config.yaml`)

**Gateway Conformance Tests** (`TestGatewayConformance`): Validate that every dict-returning client method conforms to the corresponding Gateway Pydantic response model. Each test parses the client output through the Gateway model: if Gateway adds a required field that the client doesn't provide, Pydantic raises `ValidationError` and CI catches the drift. Covers: `ModelsListResponse`, `ModelResponse`, `SkillsListResponse`, `SkillResponse`, `SkillInstallResponse`, `McpConfigResponse`, `UploadResponse`, `MemoryConfigResponse`, `MemoryStatusResponse`.

## Development Workflow

### Test-Driven Development (TDD) - MANDATORY

**Every new feature or bug fix MUST be accompanied by unit tests. No exceptions.**

- Write tests in `backend/tests/` following the existing naming convention `test_<feature>.py`
- Run the full suite before and after your change: `make test`
- Tests must pass before a feature is considered complete
- For lightweight config/utility modules, prefer pure unit tests with no external dependencies
- If a module causes circular import issues in tests, add a `sys.modules` mock in `tests/conftest.py` (see existing example for `deerflow.subagents.executor`)

```bash
# Run all tests
make test

# Run a specific test file
PYTHONPATH=. uv run pytest tests/test_<feature>.py -v
```

### Running the Full Application

From the **project root** directory:
```bash
make dev
```

This starts all services and makes the application available at `http://localhost:2026`.

**All startup modes:**

| | **Local Foreground** | **Local Daemon** | **Docker Dev** | **Docker Prod** |
|---|---|---|---|---|
| **Dev** | `./scripts/serve.sh --dev`<br/>`make dev` | `./scripts/serve.sh --dev --daemon`<br/>`make dev-daemon` | `./scripts/docker.sh start`<br/>`make docker-start` | n/a |
| **Prod** | `./scripts/serve.sh --prod`<br/>`make start` | `./scripts/serve.sh --prod --daemon`<br/>`make start-daemon` | n/a | `./scripts/deploy.sh`<br/>`make up` |

| Action | Local | Docker Dev | Docker Prod |
|---|---|---|---|
| **Stop** | `./scripts/serve.sh --stop`<br/>`make stop` | `./scripts/docker.sh stop`<br/>`make docker-stop` | `./scripts/deploy.sh down`<br/>`make down` |
| **Restart** | `./scripts/serve.sh --restart [flags]` | `./scripts/docker.sh restart` | n/a |

**Nginx routing**:
- `/api/langgraph/*` → Gateway embedded runtime (8001), rewritten to `/api/*`
- `/api/*` (other) → Gateway API (8001)
- `/` (non-API) → Frontend (3000)

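The three routing rules can be read as a small decision function. The sketch below models them in Python purely for illustration; it is not the nginx configuration, and `route` is a hypothetical name that returns the upstream port and the path the upstream would see.

```python
LANGGRAPH_PREFIX = "/api/langgraph"


def route(path: str) -> tuple[int, str]:
    """Illustrative model of the routing table: (upstream port, rewritten path)."""
    if path == LANGGRAPH_PREFIX or path.startswith(LANGGRAPH_PREFIX + "/"):
        # Strip the `/langgraph` segment so the Gateway sees `/api/...`.
        return 8001, "/api" + path[len(LANGGRAPH_PREFIX):]
    if path.startswith("/api/"):
        return 8001, path  # other API routes pass through unchanged
    return 3000, path  # everything else goes to the frontend
```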
### Running Backend Services Separately

From the **backend** directory:

```bash
# Gateway API
make gateway
```

Direct access (without nginx):
- Gateway: `http://localhost:8001`

### Frontend Configuration

The frontend uses environment variables to connect to backend services:
- `NEXT_PUBLIC_LANGGRAPH_BASE_URL` - Defaults to `/api/langgraph` (through nginx)
- `NEXT_PUBLIC_BACKEND_BASE_URL` - Defaults to empty string (through nginx)

When using `make dev` from root, the frontend automatically connects through nginx.

## Key Features

### File Upload

Multi-file upload with automatic document conversion:
- Endpoint: `POST /api/threads/{thread_id}/uploads`
- Supports: PDF, PPT, Excel, Word documents (converted via `markitdown`)
- Rejects directory inputs before copying so uploads stay all-or-nothing
- Reuses one conversion worker per request when called from an active event loop
- Files stored in thread-isolated directories
- Duplicate filenames in a single upload request are auto-renamed with `_N` suffixes so later files do not truncate earlier files
- Agent receives uploaded file list via `UploadsMiddleware`

See [docs/FILE_UPLOAD.md](docs/FILE_UPLOAD.md) for details.

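The `_N` renaming of duplicate filenames within one request might work along these lines. `dedupe_filenames` is a hypothetical helper, and starting the counter at 1 and placing the suffix before the extension are assumptions of this sketch.

```python
from pathlib import PurePath


def dedupe_filenames(names: list[str]) -> list[str]:
    """Hypothetical sketch: rename repeated filenames to `stem_N.ext` within one request."""
    used: set[str] = set()
    result: list[str] = []
    for name in names:
        stem, suffix = PurePath(name).stem, PurePath(name).suffix
        candidate, counter = name, 1
        while candidate in used:
            candidate = f"{stem}_{counter}{suffix}"
            counter += 1
        used.add(candidate)
        result.append(candidate)
    return result
```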
### Plan Mode

TodoList middleware for complex multi-step tasks:
- Controlled via runtime config: `config.configurable.is_plan_mode = True`
- Provides `write_todos` tool for task tracking
- One task `in_progress` at a time, real-time updates

See [docs/plan_mode_usage.md](docs/plan_mode_usage.md) for details.

### Context Summarization

Automatic conversation summarization when approaching token limits:
- Configured in `config.yaml` under `summarization` key
- Trigger types: tokens, messages, or fraction of max input
- Keeps recent messages while summarizing older ones

See [docs/summarization.md](docs/summarization.md) for details.

### Vision Support

For models with `supports_vision: true`:
- `ViewImageMiddleware` processes images in conversation
- `view_image_tool` added to agent's toolset
- Images automatically converted to base64 and injected into state

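The base64 conversion step can be sketched with the standard library. `image_to_data_url` is a hypothetical helper, and the data-URL output format is an assumption; the source does not say how the encoded image is stored in state.

```python
import base64
import mimetypes
from pathlib import Path


def image_to_data_url(path: Path) -> str:
    """Hypothetical sketch: encode an image file as a base64 data URL."""
    mime_type, _ = mimetypes.guess_type(path.name)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"not a recognized image file: {path.name}")
    encoded = base64.b64encode(path.read_bytes()).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"
```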
## Code Style

- Uses `ruff` for linting and formatting
- Line length: 240 characters
- Python 3.12+ with type hints
- Double quotes, space indentation

## Documentation

See `docs/` directory for detailed documentation:
- [CONFIGURATION.md](docs/CONFIGURATION.md) - Configuration options
- [ARCHITECTURE.md](docs/ARCHITECTURE.md) - Architecture details
- [API.md](docs/API.md) - API reference
- [SETUP.md](docs/SETUP.md) - Setup guide
- [FILE_UPLOAD.md](docs/FILE_UPLOAD.md) - File upload feature
- [PATH_EXAMPLES.md](docs/PATH_EXAMPLES.md) - Path types and usage
- [summarization.md](docs/summarization.md) - Context summarization
- [plan_mode_usage.md](docs/plan_mode_usage.md) - Plan mode with TodoList