mirror of https://github.com/bytedance/deer-flow.git synced 2026-06-10 01:22:09 +00:00

Go to file

rayhpeng 185f5649dd feat(persistence): add unified persistence layer with event store, token tracking, and feedback (#1930 )

* feat(persistence): add SQLAlchemy 2.0 async ORM scaffold

Introduce a unified database configuration (DatabaseConfig) that
controls both the LangGraph checkpointer and the DeerFlow application
persistence layer from a single `database:` config section.

New modules:
- deerflow.config.database_config — Pydantic config with memory/sqlite/postgres backends
- deerflow.persistence — async engine lifecycle, DeclarativeBase with to_dict mixin, Alembic skeleton
- deerflow.runtime.runs.store — RunStore ABC + MemoryRunStore implementation

Gateway integration initializes/tears down the persistence engine in
the existing langgraph_runtime() context manager. Legacy checkpointer
config is preserved for backward compatibility.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(persistence): add RunEventStore ABC + MemoryRunEventStore

Phase 2-A prerequisite for event storage: adds the unified run event
stream interface (RunEventStore) with an in-memory implementation,
RunEventsConfig, gateway integration, and comprehensive tests (27 cases).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(persistence): add ORM models, repositories, DB/JSONL event stores, RunJournal, and API endpoints

Phase 2-B: run persistence + event storage + token tracking.

- ORM models: RunRow (with token fields), ThreadMetaRow, RunEventRow
- RunRepository implements RunStore ABC via SQLAlchemy ORM
- ThreadMetaRepository with owner access control
- DbRunEventStore with trace content truncation and cursor pagination
- JsonlRunEventStore with per-run files and seq recovery from disk
- RunJournal (BaseCallbackHandler) captures LLM/tool/lifecycle events,
  accumulates token usage by caller type, buffers and flushes to store
- RunManager now accepts optional RunStore for persistent backing
- Worker creates RunJournal, writes human_message, injects callbacks
- Gateway deps use factory functions (RunRepository when DB available)
- New endpoints: messages, run messages, run events, token-usage
- ThreadCreateRequest gains assistant_id field
- 92 tests pass (33 new), zero regressions

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(persistence): add user feedback + follow-up run association

Phase 2-C: feedback and follow-up tracking.

- FeedbackRow ORM model (rating +1/-1, optional message_id, comment)
- FeedbackRepository with CRUD, list_by_run/thread, aggregate stats
- Feedback API endpoints: create, list, stats, delete
- follow_up_to_run_id in RunCreateRequest (explicit or auto-detected
  from latest successful run on the thread)
- Worker writes follow_up_to_run_id into human_message event metadata
- Gateway deps: feedback_repo factory + getter
- 17 new tests (14 FeedbackRepository + 3 follow-up association)
- 109 total tests pass, zero regressions

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test+config: comprehensive Phase 2 test coverage + deprecate checkpointer config

- config.example.yaml: deprecate standalone checkpointer section, activate
  unified database:sqlite as default (drives both checkpointer + app data)
- New: test_thread_meta_repo.py (14 tests) — full ThreadMetaRepository coverage
  including check_access owner logic, list_by_owner pagination
- Extended test_run_repository.py (+4 tests) — completion preserves fields,
  list ordering desc, limit, owner_none returns all
- Extended test_run_journal.py (+8 tests) — on_chain_error, track_tokens=false,
  middleware no ai_message, unknown caller tokens, convenience fields,
  tool_error, non-summarization custom event
- Extended test_run_event_store.py (+7 tests) — DB batch seq continuity,
  make_run_event_store factory (memory/db/jsonl/fallback/unknown)
- Extended test_phase2b_integration.py (+4 tests) — create_or_reject persists,
  follow-up metadata, summarization in history, full DB-backed lifecycle
- Fixed DB integration test to use proper fake objects (not MagicMock)
  for JSON-serializable metadata
- 157 total Phase 2 tests pass, zero regressions

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* config: move default sqlite_dir to .deer-flow/data

Keep SQLite databases alongside other DeerFlow-managed data
(threads, memory) under the .deer-flow/ directory instead of a
top-level ./data folder.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor(persistence): remove UTFJSON, use engine-level json_serializer + datetime.now()

- Replace custom UTFJSON type with standard sqlalchemy.JSON in all ORM
  models. Add json_serializer=json.dumps(ensure_ascii=False) to all
  create_async_engine calls so non-ASCII text (Chinese etc.) is stored
  as-is in both SQLite and Postgres.
- Change ORM datetime defaults from datetime.now(UTC) to datetime.now(),
  remove UTC imports.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor(gateway): simplify deps.py with getter factory + inline repos

- Replace 6 identical getter functions with _require() factory.
- Inline 3 _make_*_repo() factories into langgraph_runtime(), call
  get_session_factory() once instead of 3 times.
- Add thread_meta upsert in start_run (services.py).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(docker): add UV_EXTRAS build arg for optional dependencies

Support installing optional dependency groups (e.g. postgres) at
Docker build time via UV_EXTRAS build arg:
  UV_EXTRAS=postgres docker compose build

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor(journal): fix flush, token tracking, and consolidate tests

RunJournal fixes:
- _flush_sync: retain events in buffer when no event loop instead of
  dropping them; worker's finally block flushes via async flush().
- on_llm_end: add tool_calls filter and caller=="lead_agent" guard for
  ai_message events; mark message IDs for dedup with record_llm_usage.
- worker.py: persist completion data (tokens, message count) to RunStore
  in finally block.

Model factory:
- Auto-inject stream_usage=True for BaseChatOpenAI subclasses with
  custom api_base, so usage_metadata is populated in streaming responses.

Test consolidation:
- Delete test_phase2b_integration.py (redundant with existing tests).
- Move DB-backed lifecycle test into test_run_journal.py.
- Add tests for stream_usage injection in test_model_factory.py.
- Clean up executor/task_tool dead journal references.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(events): widen content type to str|dict in all store backends

Allow event content to be a dict (for structured OpenAI-format messages)
in addition to plain strings. Dict values are JSON-serialized for the DB
backend and deserialized on read; memory and JSONL backends handle dicts
natively. Trace truncation now serializes dicts to JSON before measuring.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(events): use metadata flag instead of heuristic for dict content detection

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(converters): add LangChain-to-OpenAI message format converters

Pure functions langchain_to_openai_message, langchain_to_openai_completion,
langchain_messages_to_openai, and _infer_finish_reason for converting
LangChain BaseMessage objects to OpenAI Chat Completions format, used by
RunJournal for event storage. 15 unit tests added.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(converters): handle empty list content as null, clean up test

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(events): human_message content uses OpenAI user message format

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(events): ai_message uses OpenAI format, add ai_tool_call message event

- ai_message content now uses {"role": "assistant", "content": "..."} format
- New ai_tool_call message event emitted when lead_agent LLM responds with tool_calls
- ai_tool_call uses langchain_to_openai_message converter for consistent format
- Both events include finish_reason in metadata ("stop" or "tool_calls")

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(events): add tool_result message event with OpenAI tool message format

Cache tool_call_id from on_tool_start keyed by run_id as fallback for on_tool_end,
then emit a tool_result message event (role=tool, tool_call_id, content) after each
successful tool completion.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(events): summary content uses OpenAI system message format

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(events): replace llm_start/llm_end with llm_request/llm_response in OpenAI format

Add on_chat_model_start to capture structured prompt messages as llm_request events.
Replace llm_end trace events with llm_response using OpenAI Chat Completions format.
Track llm_call_index to pair request/response events.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(events): add record_middleware method for middleware trace events

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test(events): add full run sequence integration test for OpenAI content format

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(events): align message events with checkpoint format and add middleware tag injection

- Message events (ai_message, ai_tool_call, tool_result, human_message) now use
  BaseMessage.model_dump() format, matching LangGraph checkpoint values.messages
- on_tool_end extracts tool_call_id/name/status from ToolMessage objects
- on_tool_error now emits tool_result message events with error status
- record_middleware uses middleware:{tag} event_type and middleware category
- Summarization custom events use middleware:summarize category
- TitleMiddleware injects middleware:title tag via get_config() inheritance
- SummarizationMiddleware model bound with middleware:summarize tag
- Worker writes human_message using HumanMessage.model_dump()

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(threads): switch search endpoint to threads_meta table and sync title

- POST /api/threads/search now queries threads_meta table directly,
  removing the two-phase Store + Checkpointer scan approach
- Add ThreadMetaRepository.search() with metadata/status filters
- Add ThreadMetaRepository.update_display_name() for title sync
- Worker syncs checkpoint title to threads_meta.display_name on run completion
- Map display_name to values.title in search response for API compatibility

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(threads): history endpoint reads messages from event store

- POST /api/threads/{thread_id}/history now combines two data sources:
  checkpointer for checkpoint_id, metadata, title, thread_data;
  event store for messages (complete history, not truncated by summarization)
- Strip internal LangGraph metadata keys from response
- Remove full channel_values serialization in favor of selective fields

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: remove duplicate optional-dependencies header in pyproject.toml

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(middleware): pass tagged config to TitleMiddleware ainvoke call

Without the config, the middleware:title tag was not injected,
causing the LLM response to be recorded as a lead_agent ai_message
in run_events.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: resolve merge conflict in .env.example

Keep both DATABASE_URL (from persistence-scaffold) and WECOM
credentials (from main) after the merge.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(persistence): address review feedback on PR #1851

- Fix naive datetime.now() → datetime.now(UTC) in all ORM models
- Fix seq race condition in DbRunEventStore.put() with FOR UPDATE
  and UNIQUE(thread_id, seq) constraint
- Encapsulate _store access in RunManager.update_run_completion()
- Deduplicate _store.put() logic in RunManager via _persist_to_store()
- Add update_run_completion to RunStore ABC + MemoryRunStore
- Wire follow_up_to_run_id through the full create path
- Add error recovery to RunJournal._flush_sync() lost-event scenario
- Add migration note for search_threads breaking change
- Fix test_checkpointer_none_fix mock to set database=None

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: update uv.lock

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(persistence): address 22 review comments from CodeQL, Copilot, and Code Quality

Bug fixes:
- Sanitize log params to prevent log injection (CodeQL)
- Reset threads_meta.status to idle/error when run completes
- Attach messages only to latest checkpoint in /history response
- Write threads_meta on POST /threads so new threads appear in search

Lint fixes:
- Remove unused imports (journal.py, migrations/env.py, test_converters.py)
- Convert lambda to named function (engine.py, Ruff E731)
- Remove unused logger definitions in repos (Ruff F841)
- Add logging to JSONL decode errors and empty except blocks
- Separate assert side-effects in tests (CodeQL)
- Remove unused local variables in tests (Ruff F841)
- Fix max_trace_content truncation to use byte length, not char length

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* style: apply ruff format to persistence and runtime files

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Potential fix for pull request finding 'Statement has no effect'

Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>

* refactor(runtime): introduce RunContext to reduce run_agent parameter bloat

Extract checkpointer, store, event_store, run_events_config, thread_meta_repo,
and follow_up_to_run_id into a frozen RunContext dataclass. Add get_run_context()
in deps.py to build the base context from app.state singletons. start_run() uses
dataclasses.replace() to enrich per-run fields before passing ctx to run_agent.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor(gateway): move sanitize_log_param to app/gateway/utils.py

Extract the log-injection sanitizer from routers/threads.py into a shared
utils module and rename to sanitize_log_param (public API). Eliminates the
reverse service → router import in services.py.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* perf: use SQL aggregation for feedback stats and thread token usage

Replace Python-side counting in FeedbackRepository.aggregate_by_run with
a single SELECT COUNT/SUM query. Add RunStore.aggregate_tokens_by_thread
abstract method with SQL GROUP BY implementation in RunRepository and
Python fallback in MemoryRunStore. Simplify the thread_token_usage
endpoint to delegate to the new method, eliminating the limit=10000
truncation risk.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: annotate DbRunEventStore.put() as low-frequency path

Add docstring clarifying that put() opens a per-call transaction with
FOR UPDATE and should only be used for infrequent writes (currently
just the initial human_message event). High-throughput callers should
use put_batch() instead.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(threads): fall back to Store search when ThreadMetaRepository is unavailable

When database.backend=memory (default) or no SQL session factory is
configured, search_threads now queries the LangGraph Store instead of
returning 503. Returns empty list if neither Store nor repo is available.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor(persistence): introduce ThreadMetaStore ABC for backend-agnostic thread metadata

Add ThreadMetaStore abstract base class with create/get/search/update/delete
interface. ThreadMetaRepository (SQL) now inherits from it. New
MemoryThreadMetaStore wraps LangGraph BaseStore for memory-mode deployments.

deps.py now always provides a non-None thread_meta_repo, eliminating all
`if thread_meta_repo is not None` guards in services.py, worker.py, and
routers/threads.py. search_threads no longer needs a Store fallback branch.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor(history): read messages from checkpointer instead of RunEventStore

The /history endpoint now reads messages directly from the
checkpointer's channel_values (the authoritative source) instead of
querying RunEventStore.list_messages(). The RunEventStore API is
preserved for other consumers.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(persistence): address new Copilot review comments

- feedback.py: validate thread_id/run_id before deleting feedback
- jsonl.py: add path traversal protection with ID validation
- run_repo.py: parse `before` to datetime for PostgreSQL compat
- thread_meta_repo.py: fix pagination when metadata filter is active
- database_config.py: use resolve_path for sqlite_dir consistency

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Implement skill self-evolution and skill_manage flow (#1874)

* chore: ignore .worktrees directory

* Add skill_manage self-evolution flow

* Fix CI regressions for skill_manage

* Address PR review feedback for skill evolution

* fix(skill-evolution): preserve history on delete

* fix(skill-evolution): tighten scanner fallbacks

* docs: add skill_manage e2e evidence screenshot

* fix(skill-manage): avoid blocking fs ops in session runtime

---------

Co-authored-by: Willem Jiang <willem.jiang@gmail.com>

* fix(config): resolve sqlite_dir relative to CWD, not Paths.base_dir

resolve_path() resolves relative to Paths.base_dir (.deer-flow),
which double-nested the path to .deer-flow/.deer-flow/data/app.db.
Use Path.resolve() (CWD-relative) instead.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Feature/feishu receive file (#1608)

* feat(feishu): add channel file materialization hook for inbound messages

- Introduce Channel.receive_file(msg, thread_id) as a base method for file materialization; default is no-op.
- Implement FeishuChannel.receive_file to download files/images from Feishu messages, save to sandbox, and inject virtual paths into msg.text.
- Update ChannelManager to call receive_file for any channel if msg.files is present, enabling downstream model access to user-uploaded files.
- No impact on Slack/Telegram or other channels (they inherit the default no-op).

* style(backend): format code with ruff for lint compliance

- Auto-formatted packages/harness/deerflow/agents/factory.py and tests/test_create_deerflow_agent.py using `ruff format`
- Ensured both files conform to project linting standards
- Fixes CI lint check failures caused by code style issues

* fix(feishu): handle file write operation asynchronously to prevent blocking

* fix(feishu): rename GetMessageResourceRequest to _GetMessageResourceRequest and remove redundant code

* test(feishu): add tests for receive_file method and placeholder replacement

* fix(manager): remove unnecessary type casting for channel retrieval

* fix(feishu): update logging messages to reflect resource handling instead of image

* fix(feishu): sanitize filename by replacing invalid characters in file uploads

* fix(feishu): improve filename sanitization and reorder image key handling in message processing

* fix(feishu): add thread lock to prevent filename conflicts during file downloads

* fix(test): correct bad merge in test_feishu_parser.py

* chore: run ruff and apply formatting cleanup
fix(feishu): preserve rich-text attachment order and improve fallback filename handling

* fix(docker): restore gateway env vars and fix langgraph empty arg issue (#1915)

Two production docker-compose.yaml bugs prevent `make up` from working:

1. Gateway missing DEER_FLOW_CONFIG_PATH and DEER_FLOW_EXTENSIONS_CONFIG_PATH
   environment overrides. Added in fb2d99f (#1836) but accidentally reverted
   by ca2fb95 (#1847). Without them, gateway reads host paths from .env via
   env_file, causing FileNotFoundError inside the container.

2. Langgraph command fails when LANGGRAPH_ALLOW_BLOCKING is unset (default).
   Empty $${allow_blocking} inserts a bare space between flags, causing
   ' --no-reload' to be parsed as unexpected extra argument. Fix by building
   args string first and conditionally appending --allow-blocking.

Co-authored-by: cooper <cooperfu@tencent.com>

* fix(frontend): resolve invalid HTML nesting and tabnabbing vulnerabilities (#1904)

* fix(frontend): resolve invalid HTML nesting and tabnabbing vulnerabilities

Fix `<button>` inside `<a>` invalid HTML in artifact components and add
missing `noopener,noreferrer` to `window.open` calls to prevent reverse
tabnabbing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(frontend): address Copilot review on tabnabbing and double-tab-open

Remove redundant parent onClick on web_fetch ChainOfThoughtStep to
prevent opening two tabs on link click, and explicitly null out
window.opener after window.open() for defensive tabnabbing hardening.

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* refactor(persistence): organize entities into per-entity directories

Restructure the persistence layer from horizontal "models/ + repositories/"
split into vertical entity-aligned directories. Each entity (thread_meta,
run, feedback) now owns its ORM model, abstract interface (where applicable),
and concrete implementations under a single directory with an aggregating
__init__.py for one-line imports.

Layout:
  persistence/thread_meta/{base,model,sql,memory}.py
  persistence/run/{model,sql}.py
  persistence/feedback/{model,sql}.py

models/__init__.py is kept as a facade so Alembic autogenerate continues to
discover all ORM tables via Base.metadata. RunEventRow remains under
models/run_event.py because its storage implementation lives in
runtime/events/store/db.py and has no matching repository directory.

The repositories/ directory is removed entirely. All call sites in
gateway/deps.py and tests are updated to import from the new entity
packages, e.g.:

    from deerflow.persistence.thread_meta import ThreadMetaRepository
    from deerflow.persistence.run import RunRepository
    from deerflow.persistence.feedback import FeedbackRepository

Full test suite passes (1690 passed, 14 skipped).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(gateway): sync thread rename and delete through ThreadMetaStore

The POST /threads/{id}/state endpoint previously synced title changes
only to the LangGraph Store via _store_upsert. In sqlite mode the search
endpoint reads from the ThreadMetaRepository SQL table, so renames never
appeared in /threads/search until the next agent run completed (worker.py
syncs title from checkpoint to thread_meta in its finally block).

Likewise the DELETE /threads/{id} endpoint cleaned up the filesystem,
Store, and checkpointer but left the threads_meta row orphaned in sqlite,
so deleted threads kept appearing in /threads/search.

Fix both endpoints by routing through the ThreadMetaStore abstraction
which already has the correct sqlite/memory implementations wired up by
deps.py. The rename path now calls update_display_name() and the delete
path calls delete() — both work uniformly across backends.

Verified end-to-end with curl in gateway mode against sqlite backend.
Existing test suite (1690 passed) and focused router/repo tests pass.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor(gateway): route all thread metadata access through ThreadMetaStore

Following the rename/delete bug fix in PR1, migrate the remaining direct
LangGraph Store reads/writes in the threads router and services to the
ThreadMetaStore abstraction so that the sqlite and memory backends behave
identically and the legacy dual-write paths can be removed.

Migrated endpoints (threads.py):
- create_thread: idempotency check + write now use thread_meta_repo.get/create
  instead of dual-writing the LangGraph Store and the SQL row.
- get_thread: reads from thread_meta_repo.get; the checkpoint-only fallback
  for legacy threads is preserved.
- patch_thread: replaced _store_get/_store_put with thread_meta_repo.update_metadata.
- delete_thread_data: dropped the legacy store.adelete; thread_meta_repo.delete
  already covers it.

Removed dead code (services.py):
- _upsert_thread_in_store — redundant with the immediately following
  thread_meta_repo.create() call.
- _sync_thread_title_after_run — worker.py's finally block already syncs
  the title via thread_meta_repo.update_display_name() after each run.

Removed dead code (threads.py):
- _store_get / _store_put / _store_upsert helpers (no remaining callers).
- THREADS_NS constant.
- get_store import (router no longer touches the LangGraph Store directly).

New abstract method:
- ThreadMetaStore.update_metadata(thread_id, metadata) merges metadata into
  the thread's metadata field. Implemented in both ThreadMetaRepository (SQL,
  read-modify-write inside one session) and MemoryThreadMetaStore. Three new
  unit tests cover merge / empty / nonexistent behaviour.

Net change: -134 lines. Full test suite: 1693 passed, 14 skipped.
Verified end-to-end with curl in gateway mode against sqlite backend
(create / patch / get / rename / search / delete).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>
Co-authored-by: DanielWalnut <45447813+hetaoBackend@users.noreply.github.com>
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
Co-authored-by: JilongSun <965640067@qq.com>
Co-authored-by: jie <49781832+stan-fu@users.noreply.github.com>
Co-authored-by: cooper <cooperfu@tencent.com>
Co-authored-by: yangzheli <43645580+yangzheli@users.noreply.github.com>

2026-04-11 11:23:39 +08:00

.agent/skills/smoke-test

feat(smoke-test): add smoke test skill (#1947 )

2026-04-09 18:56:28 +08:00

.github

ci: enforce code formatting checks for backend and frontend (#1536 )

2026-03-29 15:34:38 +08:00

backend

feat(persistence): add unified persistence layer with event store, token tracking, and feedback (#1930 )

2026-04-11 11:23:39 +08:00

docker

feat(persistence): add unified persistence layer with event store, token tracking, and feedback (#1930 )

2026-04-11 11:23:39 +08:00

docs

Implement skill self-evolution and skill_manage flow (#1874 )

2026-04-06 22:07:11 +08:00

frontend

feat(blog): implement blog structure with post listing, tagging, and layout enhancements (#1962 )

2026-04-10 20:24:52 +08:00

pr-build

Fix abnormal preview of HTML files (#1986 )

2026-04-09 16:32:01 +08:00

scripts

feat(dx): Setup Wizard + doctor command — closes #2030 (#2034 )

2026-04-10 17:43:39 +08:00

skills/public

test(skills): add evaluation + trigger analysis for systematic-literature-review (#2061 )

2026-04-10 18:02:45 +08:00

.dockerignore

Adds Kubernetes sandbox provisioner support (#35 )

2026-02-12 11:02:09 +08:00

.env.example

feat(persistence): add unified persistence layer with event store, token tracking, and feedback (#1930 )

2026-04-11 11:23:39 +08:00

.gitattributes

fix(git):add .gitattributes to avoid 'bash\r' issue (#924 )

2026-02-28 10:32:56 +08:00

.gitignore

Implement skill self-evolution and skill_manage flow (#1874 )

2026-04-06 22:07:11 +08:00

CODE_OF_CONDUCT.md

Add Contributor Covenant Code of Conduct

2026-04-10 22:26:40 +08:00

config.example.yaml

feat(persistence): add unified persistence layer with event store, token tracking, and feedback (#1930 )

2026-04-11 11:23:39 +08:00

CONTRIBUTING.md

docs: clarify deployment sizing guidance (#1963 )

2026-04-08 09:45:31 +08:00

deer-flow.code-workspace

Add TypeScript SDK path to code-workspace settings (#2052 )

2026-04-10 18:20:08 +08:00

extensions_config.example.json

chore(docker): Refactor sandbox state management and improve Docker integration (#1068 )

2026-03-11 10:03:01 +08:00

Install.md

docs: add install.md agent setup guide (#1402 )

2026-03-26 21:39:34 +08:00

LICENSE

docs: update LICENSE

2026-02-13 11:49:51 +08:00

Makefile

fix(makefile): route Windows shell-script targets through Git Bash (#2060 )

2026-04-11 09:30:22 +08:00

README_fr.md

feat(feishu): add configurable domain for Lark international support (#1535 )

2026-03-30 11:42:07 +08:00

README_ja.md

feat(feishu): add configurable domain for Lark international support (#1535 )

2026-03-30 11:42:07 +08:00

README_ru.md

feat(feishu): add configurable domain for Lark international support (#1535 )

2026-03-30 11:42:07 +08:00

README_zh.md

docs: clarify deployment sizing guidance (#1963 )

2026-04-08 09:45:31 +08:00

README.md

feat: add WeChat channel integration (#1869 )

2026-04-10 20:49:28 +08:00

SECURITY.md

docs: fix typo and grammar issues in docs (#1315 )

2026-03-25 10:01:36 +08:00

README.md

🦌 DeerFlow - 2.0

English | 中文 | 日本語 | Français | Русский

On February 28th, 2026, DeerFlow claimed the 🏆 #1 spot on GitHub Trending following the launch of version 2. Thanks a million to our incredible community — you made this happen! 💪🔥

DeerFlow (Deep Exploration and Efficient Research Flow) is an open-source super agent harness that orchestrates sub-agents, memory, and sandboxes to do almost anything — powered by extensible skills.

https://github.com/user-attachments/assets/a8bcadc4-e040-4cf2-8fda-dd768b999c18

Note

DeerFlow 2.0 is a ground-up rewrite. It shares no code with v1. If you're looking for the original Deep Research framework, it's maintained on the 1.x branch — contributions there are still welcome. Active development has moved to 2.0.

Official Website

Learn more and see real demos on our official website.

Coding Plan from ByteDance Volcengine

We strongly recommend using Doubao-Seed-2.0-Code, DeepSeek v3.2 and Kimi 2.5 to run DeerFlow
Learn more
中国大陆地区的开发者请点击这里

InfoQuest

DeerFlow has newly integrated the intelligent search and crawling toolset independently developed by BytePlus--InfoQuest (supports free online experience)

🦌 DeerFlow - 2.0

One-Line Agent Setup

If you use Claude Code, Codex, Cursor, Windsurf, or another coding agent, you can hand it the setup instructions in one sentence:

Help me clone DeerFlow if needed, then bootstrap it for local development by following https://raw.githubusercontent.com/bytedance/deer-flow/main/Install.md

That prompt is intended for coding agents. It tells the agent to clone the repo if needed, choose Docker when available, and stop with the exact next command plus any missing config the user still needs to provide.

Quick Start

Configuration

Clone the DeerFlow repository

git clone https://github.com/bytedance/deer-flow.git
cd deer-flow

Run the setup wizard

From the project root directory (deer-flow/), run:
```
make setup
```
This launches an interactive wizard that guides you through choosing an LLM provider, optional web search, and execution/safety preferences such as sandbox mode, bash access, and file-write tools. It generates a minimal config.yaml and writes your keys to .env. Takes about 2 minutes.

The wizard also lets you configure an optional web search provider, or skip it for now.

Run make doctor at any time to verify your setup and get actionable fix hints.

Advanced / manual configuration: If you prefer to edit config.yaml directly, run make config instead to copy the full template. See config.example.yaml for the complete reference including CLI-backed providers (Codex CLI, Claude Code OAuth), OpenRouter, Responses API, and more.
Manual model configuration examples
```
models:
  - name: gpt-4o
    display_name: GPT-4o
    use: langchain_openai:ChatOpenAI
    model: gpt-4o
    api_key: $OPENAI_API_KEY

  - name: openrouter-gemini-2.5-flash
    display_name: Gemini 2.5 Flash (OpenRouter)
    use: langchain_openai:ChatOpenAI
    model: google/gemini-2.5-flash-preview
    api_key: $OPENROUTER_API_KEY
    base_url: https://openrouter.ai/api/v1

  - name: gpt-5-responses
    display_name: GPT-5 (Responses API)
    use: langchain_openai:ChatOpenAI
    model: gpt-5
    api_key: $OPENAI_API_KEY
    use_responses_api: true
    output_version: responses/v1

  - name: qwen3-32b-vllm
    display_name: Qwen3 32B (vLLM)
    use: deerflow.models.vllm_provider:VllmChatModel
    model: Qwen/Qwen3-32B
    api_key: $VLLM_API_KEY
    base_url: http://localhost:8000/v1
    supports_thinking: true
    when_thinking_enabled:
      extra_body:
        chat_template_kwargs:
          enable_thinking: true
```
OpenRouter and similar OpenAI-compatible gateways should be configured with langchain_openai:ChatOpenAI plus base_url. If you prefer a provider-specific environment variable name, point api_key at that variable explicitly (for example api_key: $OPENROUTER_API_KEY).

To route OpenAI models through /v1/responses, keep using langchain_openai:ChatOpenAI and set use_responses_api: true with output_version: responses/v1.

For vLLM 0.19.0, use deerflow.models.vllm_provider:VllmChatModel. For Qwen-style reasoning models, DeerFlow toggles reasoning with extra_body.chat_template_kwargs.enable_thinking and preserves vLLM's non-standard reasoning field across multi-turn tool-call conversations. Legacy thinking configs are normalized automatically for backward compatibility. Reasoning models may also require the server to be started with --reasoning-parser .... If your local vLLM deployment accepts any non-empty API key, you can still set VLLM_API_KEY to a placeholder value.

CLI-backed provider examples:
```
models:
  - name: gpt-5.4
    display_name: GPT-5.4 (Codex CLI)
    use: deerflow.models.openai_codex_provider:CodexChatModel
    model: gpt-5.4
    supports_thinking: true
    supports_reasoning_effort: true

  - name: claude-sonnet-4.6
    display_name: Claude Sonnet 4.6 (Claude Code OAuth)
    use: deerflow.models.claude_provider:ClaudeChatModel
    model: claude-sonnet-4-6
    max_tokens: 4096
    supports_thinking: true
```
- Codex CLI reads ~/.codex/auth.json
- Claude Code accepts CLAUDE_CODE_OAUTH_TOKEN, ANTHROPIC_AUTH_TOKEN, CLAUDE_CODE_CREDENTIALS_PATH, or ~/.claude/.credentials.json
- ACP agent entries are separate from model providers — if you configure acp_agents.codex, point it at a Codex ACP adapter such as npx -y @zed-industries/codex-acp
- On macOS, export Claude Code auth explicitly if needed:
```
eval "$(python3 scripts/export_claude_code_oauth.py --print-export)"
```
API keys can also be set manually in .env (recommended) or exported in your shell:
```
OPENAI_API_KEY=your-openai-api-key
TAVILY_API_KEY=your-tavily-api-key
```

Running the Application

Deployment Sizing

Use the table below as a practical starting point when choosing how to run DeerFlow:

Deployment target	Starting point	Recommended	Notes
Local evaluation / `make dev`	4 vCPU, 8 GB RAM, 20 GB free SSD	8 vCPU, 16 GB RAM	Good for one developer or one light session with hosted model APIs. `2 vCPU / 4 GB` is usually not enough.
Docker development / `make docker-start`	4 vCPU, 8 GB RAM, 25 GB free SSD	8 vCPU, 16 GB RAM	Image builds, bind mounts, and sandbox containers need more headroom than pure local dev.
Long-running server / `make up`	8 vCPU, 16 GB RAM, 40 GB free SSD	16 vCPU, 32 GB RAM	Preferred for shared use, multi-agent runs, report generation, or heavier sandbox workloads.

These numbers cover DeerFlow itself. If you also host a local LLM, size that service separately.
Linux plus Docker is the recommended deployment target for a persistent server. macOS and Windows are best treated as development or evaluation environments.
If CPU or memory usage stays pinned, reduce concurrent runs first, then move to the next sizing tier.

Option 1: Docker (Recommended)

Development (hot-reload, source mounts):

make docker-init    # Pull sandbox image (only once or when image updates)
make docker-start   # Start services (auto-detects sandbox mode from config.yaml)

make docker-start starts provisioner only when config.yaml uses provisioner mode (sandbox.use: deerflow.community.aio_sandbox:AioSandboxProvider with provisioner_url).

Docker builds use the upstream uv registry by default. If you need faster mirrors in restricted networks, export UV_INDEX_URL=https://pypi.tuna.tsinghua.edu.cn/simple and NPM_REGISTRY=https://registry.npmmirror.com before running make docker-init or make docker-start.

Backend processes automatically pick up config.yaml changes on the next config access, so model metadata updates do not require a manual restart during development.

Tip

On Linux, if Docker-based commands fail with permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock, add your user to the docker group and re-login before retrying. See CONTRIBUTING.md for the full fix.

Production (builds images locally, mounts runtime config and data):

make up     # Build images and start all production services
make down   # Stop and remove containers

Note

The LangGraph agent server currently runs via langgraph dev (the open-source CLI server).

Access: http://localhost:2026

See CONTRIBUTING.md for detailed Docker development guide.

Option 2: Local Development

If you prefer running services locally:

Prerequisite: complete the "Configuration" steps above first (make setup). make dev requires a valid config.yaml in the project root (can be overridden via DEER_FLOW_CONFIG_PATH). Run make doctor to verify your setup before starting. On Windows, run the local development flow from Git Bash. Native cmd.exe and PowerShell shells are not supported for the bash-based service scripts, and WSL is not guaranteed because some scripts rely on Git for Windows utilities such as cygpath.

Check prerequisites:

make check  # Verifies Node.js 22+, pnpm, uv, nginx

Install dependencies:

make install  # Install backend + frontend dependencies

(Optional) Pre-pull sandbox image:

# Recommended if using Docker/Container-based sandbox
make setup-sandbox

(Optional) Load sample memory data for local review:
```
python scripts/load_memory_sample.py
```
This copies the sample fixture into the default local runtime memory file so reviewers can immediately test Settings > Memory. See backend/docs/MEMORY_SETTINGS_REVIEW.md for the shortest review flow.
Start services:
```
make dev
```
Access: http://localhost:2026

Startup Modes

DeerFlow supports multiple startup modes across two dimensions:

Dev / Prod — dev enables hot-reload; prod uses pre-built frontend
Standard / Gateway — standard uses a separate LangGraph server (4 processes); Gateway mode (experimental) embeds the agent runtime in the Gateway API (3 processes)

	Local Foreground	Local Daemon	Docker Dev	Docker Prod
Dev	`./scripts/serve.sh --dev` `make dev`	`./scripts/serve.sh --dev --daemon` `make dev-daemon`	`./scripts/docker.sh start` `make docker-start`	—
Dev + Gateway	`./scripts/serve.sh --dev --gateway` `make dev-pro`	`./scripts/serve.sh --dev --gateway --daemon` `make dev-daemon-pro`	`./scripts/docker.sh start --gateway` `make docker-start-pro`	—
Prod	`./scripts/serve.sh --prod` `make start`	`./scripts/serve.sh --prod --daemon` `make start-daemon`	—	`./scripts/deploy.sh` `make up`
Prod + Gateway	`./scripts/serve.sh --prod --gateway` `make start-pro`	`./scripts/serve.sh --prod --gateway --daemon` `make start-daemon-pro`	—	`./scripts/deploy.sh --gateway` `make up-pro`

Action	Local	Docker Dev	Docker Prod
Stop	`./scripts/serve.sh --stop` `make stop`	`./scripts/docker.sh stop` `make docker-stop`	`./scripts/deploy.sh down` `make down`
Restart	`./scripts/serve.sh --restart [flags]`	`./scripts/docker.sh restart`	—

Gateway mode eliminates the LangGraph server process — the Gateway API handles agent execution directly via async tasks, managing its own concurrency.

Why Gateway Mode?

In standard mode, DeerFlow runs a dedicated LangGraph Platform server alongside the Gateway API. This architecture works well but has trade-offs:

	Standard Mode	Gateway Mode
Architecture	Gateway (REST API) + LangGraph (agent runtime)	Gateway embeds agent runtime
Concurrency	`--n-jobs-per-worker` per worker (requires license)	`--workers` × async tasks (no per-worker cap)
Containers / Processes	4 (frontend, gateway, langgraph, nginx)	3 (frontend, gateway, nginx)
Resource usage	Higher (two Python runtimes)	Lower (single Python runtime)
LangGraph Platform license	Required for production images	Not required
Cold start	Slower (two services to initialize)	Faster

Both modes are functionally equivalent — the same agents, tools, and skills work in either mode.

Docker Production Deployment

deploy.sh supports building and starting separately. Images are mode-agnostic — runtime mode is selected at start time:

# One-step (build + start)
deploy.sh                    # standard mode (default)
deploy.sh --gateway          # gateway mode

# Two-step (build once, start with any mode)
deploy.sh build              # build all images
deploy.sh start              # start in standard mode
deploy.sh start --gateway    # start in gateway mode

# Stop
deploy.sh down

Advanced

Sandbox Mode

DeerFlow supports multiple sandbox execution modes:

Local Execution (runs sandbox code directly on the host machine)
Docker Execution (runs sandbox code in isolated Docker containers)
Docker Execution with Kubernetes (runs sandbox code in Kubernetes pods via provisioner service)

For Docker development, service startup follows config.yaml sandbox mode. In Local/Docker modes, provisioner is not started.

See the Sandbox Configuration Guide to configure your preferred mode.

MCP Server

DeerFlow supports configurable MCP servers and skills to extend its capabilities. For HTTP/SSE MCP servers, OAuth token flows are supported (client_credentials, refresh_token). See the MCP Server Guide for detailed instructions.

IM Channels

DeerFlow supports receiving tasks from messaging apps. Channels auto-start when configured — no public IP required for any of them.

Channel	Transport	Difficulty
Telegram	Bot API (long-polling)	Easy
Slack	Socket Mode	Moderate
Feishu / Lark	WebSocket	Moderate
WeChat	Tencent iLink (long-polling)	Moderate
WeCom	WebSocket	Moderate

Configuration in config.yaml:

channels:
  # LangGraph Server URL (default: http://localhost:2024)
  langgraph_url: http://localhost:2024
  # Gateway API URL (default: http://localhost:8001)
  gateway_url: http://localhost:8001

  # Optional: global session defaults for all mobile channels
  session:
    assistant_id: lead_agent  # or a custom agent name; custom agents are routed via lead_agent + agent_name
    config:
      recursion_limit: 100
    context:
      thinking_enabled: true
      is_plan_mode: false
      subagent_enabled: false

  feishu:
    enabled: true
    app_id: $FEISHU_APP_ID
    app_secret: $FEISHU_APP_SECRET
    # domain: https://open.feishu.cn       # China (default)
    # domain: https://open.larksuite.com   # International

  wecom:
    enabled: true
    bot_id: $WECOM_BOT_ID
    bot_secret: $WECOM_BOT_SECRET

  slack:
    enabled: true
    bot_token: $SLACK_BOT_TOKEN     # xoxb-...
    app_token: $SLACK_APP_TOKEN     # xapp-... (Socket Mode)
    allowed_users: []               # empty = allow all

  telegram:
    enabled: true
    bot_token: $TELEGRAM_BOT_TOKEN
    allowed_users: []               # empty = allow all

  wechat:
    enabled: false
    bot_token: $WECHAT_BOT_TOKEN
    ilink_bot_id: $WECHAT_ILINK_BOT_ID
    qrcode_login_enabled: true      # optional: allow first-time QR bootstrap when bot_token is absent
    allowed_users: []               # empty = allow all
    polling_timeout: 35
    state_dir: ./.deer-flow/wechat/state
    max_inbound_image_bytes: 20971520
    max_outbound_image_bytes: 20971520
    max_inbound_file_bytes: 52428800
    max_outbound_file_bytes: 52428800

    # Optional: per-channel / per-user session settings
    session:
      assistant_id: mobile-agent  # custom agent names are also supported here
      context:
        thinking_enabled: false
      users:
        "123456789":
          assistant_id: vip-agent
          config:
            recursion_limit: 150
          context:
            thinking_enabled: true
            subagent_enabled: true

Notes:

assistant_id: lead_agent calls the default LangGraph assistant directly.
If assistant_id is set to a custom agent name, DeerFlow still routes through lead_agent and injects that value as agent_name, so the custom agent's SOUL/config takes effect for IM channels.

Set the corresponding API keys in your .env file:

# Telegram
TELEGRAM_BOT_TOKEN=123456789:ABCdefGHIjklMNOpqrSTUvwxYZ

# Slack
SLACK_BOT_TOKEN=xoxb-...
SLACK_APP_TOKEN=xapp-...

# Feishu / Lark
FEISHU_APP_ID=cli_xxxx
FEISHU_APP_SECRET=your_app_secret

# WeChat iLink
WECHAT_BOT_TOKEN=your_ilink_bot_token
WECHAT_ILINK_BOT_ID=your_ilink_bot_id

# WeCom
WECOM_BOT_ID=your_bot_id
WECOM_BOT_SECRET=your_bot_secret

Telegram Setup

Chat with @BotFather, send /newbot, and copy the HTTP API token.
Set TELEGRAM_BOT_TOKEN in .env and enable the channel in config.yaml.

Slack Setup

Create a Slack App at api.slack.com/apps → Create New App → From scratch.
Under OAuth & Permissions, add Bot Token Scopes: app_mentions:read, chat:write, im:history, im:read, im:write, files:write.
Enable Socket Mode → generate an App-Level Token (xapp-…) with connections:write scope.
Under Event Subscriptions, subscribe to bot events: app_mention, message.im.
Set SLACK_BOT_TOKEN and SLACK_APP_TOKEN in .env and enable the channel in config.yaml.

Feishu / Lark Setup

Create an app on Feishu Open Platform → enable Bot capability.
Add permissions: im:message, im:message.p2p_msg:readonly, im:resource.
Under Events, subscribe to im.message.receive_v1 and select Long Connection mode.
Copy the App ID and App Secret. Set FEISHU_APP_ID and FEISHU_APP_SECRET in .env and enable the channel in config.yaml.

WeChat Setup

Enable the wechat channel in config.yaml.
Either set WECHAT_BOT_TOKEN in .env, or set qrcode_login_enabled: true for first-time QR bootstrap.
When bot_token is absent and QR bootstrap is enabled, watch backend logs for the QR content returned by iLink and complete the binding flow.
After the QR flow succeeds, DeerFlow persists the acquired token under state_dir for later restarts.
For Docker Compose deployments, keep state_dir on a persistent volume so the get_updates_buf cursor and saved auth state survive restarts.

WeCom Setup

Create a bot on the WeCom AI Bot platform and obtain the bot_id and bot_secret.
Enable channels.wecom in config.yaml and fill in bot_id / bot_secret.
Set WECOM_BOT_ID and WECOM_BOT_SECRET in .env.
Make sure backend dependencies include wecom-aibot-python-sdk. The channel uses a WebSocket long connection and does not require a public callback URL.
The current integration supports inbound text, image, and file messages. Final images/files generated by the agent are also sent back to the WeCom conversation.

When DeerFlow runs in Docker Compose, IM channels execute inside the gateway container. In that case, do not point channels.langgraph_url or channels.gateway_url at localhost; use container service names such as http://langgraph:2024 and http://gateway:8001, or set DEER_FLOW_CHANNELS_LANGGRAPH_URL and DEER_FLOW_CHANNELS_GATEWAY_URL.

Commands

Once a channel is connected, you can interact with DeerFlow directly from the chat:

Command	Description
`/new`	Start a new conversation
`/status`	Show current thread info
`/models`	List available models
`/memory`	View memory
`/help`	Show help

Messages without a command prefix are treated as regular chat — DeerFlow creates a thread and responds conversationally.

LangSmith Tracing

DeerFlow has built-in LangSmith integration for observability. When enabled, all LLM calls, agent runs, and tool executions are traced and visible in the LangSmith dashboard.

Add the following to your .env file:

LANGSMITH_TRACING=true
LANGSMITH_ENDPOINT=https://api.smith.langchain.com
LANGSMITH_API_KEY=lsv2_pt_xxxxxxxxxxxxxxxx
LANGSMITH_PROJECT=xxx

Langfuse Tracing

DeerFlow also supports Langfuse observability for LangChain-compatible runs.

Add the following to your .env file:

LANGFUSE_TRACING=true
LANGFUSE_PUBLIC_KEY=pk-lf-xxxxxxxxxxxxxxxx
LANGFUSE_SECRET_KEY=sk-lf-xxxxxxxxxxxxxxxx
LANGFUSE_BASE_URL=https://cloud.langfuse.com

If you are using a self-hosted Langfuse instance, set LANGFUSE_BASE_URL to your deployment URL.

Using Both Providers

If both LangSmith and Langfuse are enabled, DeerFlow attaches both tracing callbacks and reports the same model activity to both systems.

If a provider is explicitly enabled but missing required credentials, or if its callback fails to initialize, DeerFlow fails fast when tracing is initialized during model creation and the error message names the provider that caused the failure.

For Docker deployments, tracing is disabled by default. Set LANGSMITH_TRACING=true and LANGSMITH_API_KEY in your .env to enable it.

From Deep Research to Super Agent Harness

DeerFlow started as a Deep Research framework — and the community ran with it. Since launch, developers have pushed it far beyond research: building data pipelines, generating slide decks, spinning up dashboards, automating content workflows. Things we never anticipated.

That told us something important: DeerFlow wasn't just a research tool. It was a harness — a runtime that gives agents the infrastructure to actually get work done.

So we rebuilt it from scratch.

DeerFlow 2.0 is no longer a framework you wire together. It's a super agent harness — batteries included, fully extensible. Built on LangGraph and LangChain, it ships with everything an agent needs out of the box: a filesystem, memory, skills, sandbox-aware execution, and the ability to plan and spawn sub-agents for complex, multi-step tasks.

Use it as-is. Or tear it apart and make it yours.

Core Features

Skills & Tools

Skills are what make DeerFlow do almost anything.

A standard Agent Skill is a structured capability module — a Markdown file that defines a workflow, best practices, and references to supporting resources. DeerFlow ships with built-in skills for research, report generation, slide creation, web pages, image and video generation, and more. But the real power is extensibility: add your own skills, replace the built-in ones, or combine them into compound workflows.

Skills are loaded progressively — only when the task needs them, not all at once. This keeps the context window lean and makes DeerFlow work well even with token-sensitive models.

When you install .skill archives through the Gateway, DeerFlow accepts standard optional frontmatter metadata such as version, author, and compatibility instead of rejecting otherwise valid external skills.

Tools follow the same philosophy. DeerFlow comes with a core toolset — web search, web fetch, file operations, bash execution — and supports custom tools via MCP servers and Python functions. Swap anything. Add anything.

Gateway-generated follow-up suggestions now normalize both plain-string model output and block/list-style rich content before parsing the JSON array response, so provider-specific content wrappers do not silently drop suggestions.

# Paths inside the sandbox container
/mnt/skills/public
├── research/SKILL.md
├── report-generation/SKILL.md
├── slide-creation/SKILL.md
├── web-page/SKILL.md
└── image-generation/SKILL.md

/mnt/skills/custom
└── your-custom-skill/SKILL.md      ← yours

Claude Code Integration

The claude-to-deerflow skill lets you interact with a running DeerFlow instance directly from Claude Code. Send research tasks, check status, manage threads — all without leaving the terminal.

Install the skill:

npx skills add https://github.com/bytedance/deer-flow --skill claude-to-deerflow

Then make sure DeerFlow is running (default at http://localhost:2026) and use the /claude-to-deerflow command in Claude Code.

What you can do:

Send messages to DeerFlow and get streaming responses
Choose execution modes: flash (fast), standard, pro (planning), ultra (sub-agents)
Check DeerFlow health, list models/skills/agents
Manage threads and conversation history
Upload files for analysis

Environment variables (optional, for custom endpoints):

DEERFLOW_URL=http://localhost:2026            # Unified proxy base URL
DEERFLOW_GATEWAY_URL=http://localhost:2026    # Gateway API
DEERFLOW_LANGGRAPH_URL=http://localhost:2026/api/langgraph  # LangGraph API

See skills/public/claude-to-deerflow/SKILL.md for the full API reference.

Sub-Agents

Complex tasks rarely fit in a single pass. DeerFlow decomposes them.

The lead agent can spawn sub-agents on the fly — each with its own scoped context, tools, and termination conditions. Sub-agents run in parallel when possible, report back structured results, and the lead agent synthesizes everything into a coherent output.

This is how DeerFlow handles tasks that take minutes to hours: a research task might fan out into a dozen sub-agents, each exploring a different angle, then converge into a single report — or a website — or a slide deck with generated visuals. One harness, many hands.

Sandbox & File System

DeerFlow doesn't just talk about doing things. It has its own computer.

Each task gets its own execution environment with a full filesystem view — skills, workspace, uploads, outputs. The agent reads, writes, and edits files. It can view images and, when configured safely, execute shell commands.

With AioSandboxProvider, shell execution runs inside isolated containers. With LocalSandboxProvider, file tools still map to per-thread directories on the host, but host bash is disabled by default because it is not a secure isolation boundary. Re-enable host bash only for fully trusted local workflows.

This is the difference between a chatbot with tool access and an agent with an actual execution environment.

# Paths inside the sandbox container
/mnt/user-data/
├── uploads/          ← your files
├── workspace/        ← agents' working directory
└── outputs/          ← final deliverables

Context Engineering

Isolated Sub-Agent Context: Each sub-agent runs in its own isolated context. This means that the sub-agent will not be able to see the context of the main agent or other sub-agents. This is important to ensure that the sub-agent is able to focus on the task at hand and not be distracted by the context of the main agent or other sub-agents.

Summarization: Within a session, DeerFlow manages context aggressively — summarizing completed sub-tasks, offloading intermediate results to the filesystem, compressing what's no longer immediately relevant. This lets it stay sharp across long, multi-step tasks without blowing the context window.

Long-Term Memory

Most agents forget everything the moment a conversation ends. DeerFlow remembers.

Across sessions, DeerFlow builds a persistent memory of your profile, preferences, and accumulated knowledge. The more you use it, the better it knows you — your writing style, your technical stack, your recurring workflows. Memory is stored locally and stays under your control.

Memory updates now skip duplicate fact entries at apply time, so repeated preferences and context do not accumulate endlessly across sessions.

Recommended Models

DeerFlow is model-agnostic — it works with any LLM that implements the OpenAI-compatible API. That said, it performs best with models that support:

Long context windows (100k+ tokens) for deep research and multi-step tasks
Reasoning capabilities for adaptive planning and complex decomposition
Multimodal inputs for image understanding and video comprehension
Strong tool-use for reliable function calling and structured outputs

Embedded Python Client

DeerFlow can be used as an embedded Python library without running the full HTTP services. The DeerFlowClient provides direct in-process access to all agent and Gateway capabilities, returning the same response schemas as the HTTP Gateway API. The HTTP Gateway also exposes DELETE /api/threads/{thread_id} to remove DeerFlow-managed local thread data after the LangGraph thread itself has been deleted:

from deerflow.client import DeerFlowClient

client = DeerFlowClient()

# Chat
response = client.chat("Analyze this paper for me", thread_id="my-thread")

# Streaming (LangGraph SSE protocol: values, messages-tuple, end)
for event in client.stream("hello"):
    if event.type == "messages-tuple" and event.data.get("type") == "ai":
        print(event.data["content"])

# Configuration & management — returns Gateway-aligned dicts
models = client.list_models()        # {"models": [...]}
skills = client.list_skills()        # {"skills": [...]}
client.update_skill("web-search", enabled=True)
client.upload_files("thread-1", ["./report.pdf"])  # {"success": True, "files": [...]}

All dict-returning methods are validated against Gateway Pydantic response models in CI (TestGatewayConformance), ensuring the embedded client stays in sync with the HTTP API schemas. See backend/packages/harness/deerflow/client.py for full API documentation.

Documentation

Contributing Guide - Development environment setup and workflow
Configuration Guide - Setup and configuration instructions
Architecture Overview - Technical architecture details
Backend Architecture - Backend architecture and API reference

⚠️ Security Notice

Improper Deployment May Introduce Security Risks

DeerFlow has key high-privilege capabilities including system command execution, resource operations, and business logic invocation, and is designed by default to be deployed in a local trusted environment (accessible only via the 127.0.0.1 loopback interface). If you deploy the agent in untrusted environments — such as LAN networks, public cloud servers, or other multi-endpoint accessible environments — without strict security measures, it may introduce security risks, including:

Unauthorized illegal invocation: Agent functionality could be discovered by unauthorized third parties or malicious internet scanners, triggering bulk unauthorized requests that execute high-risk operations such as system commands and file read/write, potentially causing serious security consequences.
Compliance and legal risks: If the agent is illegally invoked to conduct cyberattacks, data theft, or other illegal activities, it may result in legal liability and compliance risks.

Security Recommendations

Note: We strongly recommend deploying DeerFlow in a local trusted network environment. If you need cross-device or cross-network deployment, you must implement strict security measures, such as:

IP allowlist: Use iptables, or deploy hardware firewalls / switches with Access Control Lists (ACL), to configure IP allowlist rules and deny access from all other IP addresses.
Authentication gateway: Configure a reverse proxy (e.g., nginx) and enable strong pre-authentication, blocking any unauthenticated access.
Network isolation: Where possible, place the agent and trusted devices in the same dedicated VLAN, isolated from other network devices.
Stay updated: Continue to follow DeerFlow's security feature updates.

Contributing

We welcome contributions! Please see CONTRIBUTING.md for development setup, workflow, and guidelines.

Regression coverage includes Docker sandbox mode detection and provisioner kubeconfig-path handling tests in backend/tests/. Gateway artifact serving now forces active web content types (text/html, application/xhtml+xml, image/svg+xml) to download as attachments instead of inline rendering, reducing XSS risk for generated artifacts.

License

This project is open source and available under the MIT License.

Acknowledgments

DeerFlow is built upon the incredible work of the open-source community. We are deeply grateful to all the projects and contributors whose efforts have made DeerFlow possible. Truly, we stand on the shoulders of giants.

We would like to extend our sincere appreciation to the following projects for their invaluable contributions:

LangChain: Their exceptional framework powers our LLM interactions and chains, enabling seamless integration and functionality.
LangGraph: Their innovative approach to multi-agent orchestration has been instrumental in enabling DeerFlow's sophisticated workflows.

These projects exemplify the transformative power of open-source collaboration, and we are proud to build upon their foundations.

Key Contributors

A heartfelt thank you goes out to the core authors of DeerFlow, whose vision, passion, and dedication have brought this project to life:

Daniel Walnut
Henry Li

Your unwavering commitment and expertise have been the driving force behind DeerFlow's success. We are honored to have you at the helm of this journey.

Star History

Description

An open-source long-horizon SuperAgent harness that researches, codes, and creates. With the help of sandboxes, memories, tools, skill, subagents and message gateway, it handles different levels of tasks that could take minutes to hours.

agent agentic agentic-framework agentic-workflow ai ai-agents deep-research harness langchain langgraph langmanus llm multi-agent nodejs podcast python superagent typescript

Readme MIT 300 MiB

Languages

Python 74.6%

TypeScript 15.1%

MDX 4%

HTML 2.6%

Shell 1.4%

Other 2.2%

README.md Unescape Escape

🦌 DeerFlow - 2.0

Official Website

Coding Plan from ByteDance Volcengine

InfoQuest

Table of Contents

One-Line Agent Setup

Quick Start

Configuration

Running the Application

Deployment Sizing

Option 1: Docker (Recommended)

Option 2: Local Development

Startup Modes

Why Gateway Mode?

Docker Production Deployment

Advanced

Sandbox Mode

MCP Server

IM Channels

LangSmith Tracing

Langfuse Tracing

Using Both Providers

From Deep Research to Super Agent Harness

Core Features

Skills & Tools

Claude Code Integration

Sub-Agents

Sandbox & File System

Context Engineering

Long-Term Memory

Recommended Models

Embedded Python Client

Documentation

⚠️ Security Notice

Improper Deployment May Introduce Security Risks

Security Recommendations

Contributing

License

Acknowledgments

Key Contributors

Star History

README.md