404 Commits

Author SHA1 Message Date
IECspace
f394c0d8c8
feat(mcp): support custom tool interceptors via extensions_config.json (#2451)
* feat(mcp): support custom tool interceptors via extensions_config.json

Add a generic extension point for registering custom MCP tool
interceptors through `extensions_config.json`. This allows downstream
projects to inject per-request header manipulation, auth context
propagation, or other cross-cutting concerns without modifying
DeerFlow source code.

Interceptors are declared as Python callable paths in a new
`mcpInterceptors` array field and loaded via the existing
`resolve_variable` reflection mechanism:

```json
{
  "mcpInterceptors": [
    "my_package.mcp.auth:build_auth_interceptor"
  ]
}
```

Each entry must resolve to a no-arg builder function that returns an
async interceptor compatible with `MultiServerMCPClient`'s
`tool_interceptors` interface.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test(mcp): add unit tests for custom tool interceptors

Cover all branches of the mcpInterceptors loading logic:

- valid interceptor loaded and appended to tool_interceptors
- multiple interceptors loaded in declaration order
- builder returning None is skipped
- resolve_variable ImportError logged and skipped
- builder raising exception logged and skipped
- absent mcpInterceptors field is safe (no-op)
- custom interceptors coexist with OAuth interceptor

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Potential fix for pull request finding

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

* fix(mcp): validate mcpInterceptors type and fix lint warnings

Address review feedback:

1. Validate mcpInterceptors config value before iterating:
   - Accept a single string and normalize to [string]
   - Ignore None silently
   - Log warning and skip for non-list/non-string types

2. Fix ruff F841 lint errors in tests:
   - Rename _make_mock_env to _make_patches, embed mock_client
   - Remove unused `as mock_cls` bindings where not needed
   - Extract _get_interceptors() helper to reduce repetition

3. Add two new test cases for type validation:
   - test_mcp_interceptors_single_string_is_normalized
   - test_mcp_interceptors_invalid_type_logs_warning

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(mcp): validate interceptor return type and fix import mock path

Address review feedback:

1. Validate builder return type with callable() check:
   - callable interceptor → append to tool_interceptors
   - None → silently skip (builder opted out)
   - non-callable → log warning with type name and skip

2. Fix test mock path: resolve_variable is a top-level import in
   tools.py, so mock deerflow.mcp.tools.resolve_variable instead of
   deerflow.reflection.resolve_variable to correctly intercept calls.

3. Add test_custom_interceptor_non_callable_return_logs_warning to
   cover the new non-callable validation branch.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs(mcp): add mcpInterceptors example and documentation

- Add mcpInterceptors field to extensions_config.example.json
- Add "Custom Tool Interceptors" section to MCP_SERVER.md with
  configuration format, example interceptor code, and edge case
  behavior notes

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: IECspace <IECspace@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
2026-04-25 09:18:13 +08:00
orbisai0security
950821cb9b
fix: use subprocess instead of os.system in local_backend.py (#2494)
* fix: use subprocess instead of os.system in local_backend.py

The sandbox backend and skill evaluation scripts use subprocess

* fixing the failing test

---------

Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
2026-04-25 08:59:31 +08:00
pyp0327
2bb1a2dfa2
feat(models): Provider for MindIE model engine (#2483)
* feat(models): 适配 MindIE引擎的模型

* test: add unit tests for MindIEChatModel adapter and fix PR review comments

* chore: update uv.lock with pytest-asyncio

* build: add pytest-asyncio to test dependencies

* fix: address PR review comments (lazy import, cache clients, safe newline escape, strict xml regex)

---------

Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
2026-04-25 08:59:03 +08:00
DanielWalnut
b970993425
fix: read lead agent options from context (#2515)
* fix: read lead agent options from context

* fix: validate runtime context config
2026-04-24 22:46:51 +08:00
DanielWalnut
ec8a8cae38
fix: gate deferred MCP tool execution (#2513)
* fix: gate deferred MCP tool execution

* style: format deferred tool middleware

* fix: address deferred tool review feedback
2026-04-24 22:45:41 +08:00
DanielWalnut
d78ed5c8f2
fix: inherit subagent skill allowlists (#2514) 2026-04-24 21:24:42 +08:00
Nan Gao
f9ff3a698d
fix(middleware): avoid rescuing non-skill tool outputs during summarization (#2458)
* fix(middelware): narrow skill rescue to skill-related tool outputs

* fix(summarization): address skill rescue review feedback

* fix: wire summarization skill rescue config

* fix: remove dead skill tool helper

* fix(lint): fix format

---------

Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
2026-04-24 21:19:46 +08:00
He Wang
3a61126824
fix: keep debug.py interactive terminal free from background log noise (#2466)
* fix(debug): keep terminal clean by redirecting all logs to file

- Redirect all logs to debug.log file to prevent background task logs
  from interfering with interactive terminal prompts
- Honor AppConfig.log_level setting instead of hard-coding to INFO
- Make logging setup idempotent by clearing pre-existing handlers
- Defer deerflow imports until after logging is configured to ensure
  import-time side effects are captured in debug.log
- Display active log level in startup banner
- Add prompt_toolkit installation tip for enhanced readline support

Made-with: Cursor

* attaching the file handler before importing/calling get_app_config()

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-04-24 17:09:41 +08:00
Airene Fang
11f557a2c6
feat(trace):Add run_name to the trace info for system agents. (#2492)
* feat(trace): Add `run_name` to the trace info for suggestions and memory.

before(in langsmith):
CodexChatModel
CodexChatModel
lead_agent
after:
suggest_agent
memory_agent
lead_agent

feat(trace): Add `run_name` to the trace info for suggestions and memory.

before(in langsmith):
CodexChatModel
CodexChatModel
lead_agent
after:
suggest_agent
memory_agent
lead_agent

* feat(trace): Add `run_name` to the trace info for system agents.

before(in langsmith):
CodexChatModel
CodexChatModel
CodexChatModel
CodexChatModel
lead_agent
after:
suggest_agent
title_agent
security_agent
memory_agent
lead_agent

* chore(code format):code format

---------

Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
2026-04-24 17:06:55 +08:00
d 🔹
e8572b9d0c
fix(jina): log transient failures at WARNING without traceback (#2484) (#2485)
The exception handler in JinaClient.crawl used logger.exception, which
emits an ERROR-level record with the full httpx/httpcore/anyio traceback
for every transient network failure (timeout, connection refused). Other
search/crawl providers in the project log the same class of recoverable
failures as a single line. One offline/slow-network session could produce
dozens of multi-frame ERROR stack traces, drowning out real problems.

Switch to logger.warning with a concise message that includes the
exception type and its str, matching the style used elsewhere for
recoverable transient failures (aio_sandbox, ddg, etc.). The exception
type now also surfaces into the returned "Error: ..." string so callers
retain diagnostic signal.

Adds a regression test that asserts the log record is WARNING, carries
no exc_info, and includes the exception class name.

Co-authored-by: voidborne-d <voidborne-d@users.noreply.github.com>
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
2026-04-24 16:00:14 +08:00
Willem Jiang
80a7446fd6 fix(backend): fix the unit test error in backend 2026-04-24 14:56:03 +08:00
Willem Jiang
cd12821134 fix(backend): Updated the uv.lock with new added dependency 2026-04-24 14:55:13 +08:00
Xinmin Zeng
30d619de08
feat(subagents): support per-subagent skill loading and custom subagent types (#2253)
* feat(subagents): support per-subagent skill loading and custom subagent types (#2230)

Add per-subagent skill configuration and custom subagent type registration,
aligned with Codex's role-based config layering and per-session skill injection.

Backend:
- SubagentConfig gains `skills` field (None=all, []=none, list=whitelist)
- New CustomSubagentConfig for user-defined subagent types in config.yaml
- SubagentsAppConfig gains `custom_agents` section and `get_skills_for()`
- Registry resolves custom agents with three-layer config precedence
- SubagentExecutor loads skills per-session as conversation items (Codex pattern)
- task_tool no longer appends skills to system_prompt
- Lead agent system prompt dynamically lists all registered subagent types
- setup_agent tool accepts optional skills parameter
- Gateway agents API transparently passes skills in CRUD operations

Frontend:
- Agent/CreateAgentRequest/UpdateAgentRequest types include skills field
- Agent card displays skills as badges alongside tool_groups

Config:
- config.example.yaml documents custom_agents and per-agent skills override

Tests:
- 40 new tests covering all skill config, custom agents, and registry logic
- Existing tests updated for new get_skills_prompt_section signature

Closes #2230

* fix: address review feedback on skills PR

- Remove stale get_skills_prompt_section monkeypatches from test_task_tool_core_logic.py
  (task_tool no longer imports this function after skill injection moved to executor)
- Add key prefixes (tg:/sk:) to agent-card badges to prevent React key collisions
  between tool_groups and skills

* fix(ci): resolve lint and test failures

- Format agent-card.tsx with prettier (lint-frontend)
- Remove stale "Skills Appendix" system_prompt assertion — skills are now
  loaded per-session by SubagentExecutor, not appended to system_prompt

* fix(ci): sort imports in test_subagent_skills_config.py (ruff I001)

* fix(ci): use nullish coalescing in agent-card badge condition (eslint)

* fix: address review feedback on skills PR

- Use model_fields_set in AgentUpdateRequest to distinguish "field omitted"
  from "explicitly set to null" — fixes skills=None ambiguity where None
  means "inherit all" but was treated as "don't change"
- Move lazy import of get_subagent_config outside loop in
  _build_available_subagents_description to avoid repeated import overhead

---------

Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
2026-04-23 23:59:47 +08:00
JerryChaox
4e72410154
fix(gateway): bound lifespan shutdown hooks to prevent worker hang under uvicorn reload (#2331)
* fix(gateway): bound lifespan shutdown hooks to prevent worker hang

Gateway worker can hang indefinitely in `uvicorn --reload` mode with
the listening socket still bound — all /api/* requests return 504,
and SIGKILL is the only recovery.

Root cause (py-spy dump from a reproduction showed 16+ stacked frames
of signal_handler -> Event.set -> threading.Lock.__enter__ on the
main thread): CPython's `threading.Event` uses `Condition(Lock())`
where the inner Lock is non-reentrant. uvicorn's BaseReload signal
handler calls `should_exit.set()` directly from signal context; if a
second signal (SIGTERM/SIGHUP from the reload supervisor, or
watchfiles-triggered reload) arrives while the first handler holds
the Lock, the reentrant call deadlocks on itself.

The reload supervisor keeps sending those signals only when the
worker fails to exit promptly. DeerFlow's lifespan currently awaits
`stop_channel_service()` with no timeout; if a channel's `stop()`
stalls (e.g. Feishu/Slack WebSocket waiting for an ack), the worker
can't exit, the supervisor keeps signaling, and the deadlock becomes
reachable.

This is a defense-in-depth fix — it does not repair the upstream
uvicorn/CPython issue, but it ensures DeerFlow's lifespan exits
within a bounded window so the supervisor has no reason to keep
firing signals. No behavior change on the happy path.

Wraps the shutdown hook in `asyncio.wait_for(timeout=5.0)` and logs
a warning on timeout before proceeding to worker exit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Update backend/app/gateway/app.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* style: apply make format (ruff) to test assertions

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-04-23 19:41:26 +08:00
He Wang
c42ae3af79
feat: add optional prompt-toolkit support to debug.py (#2461)
* feat: add optional prompt-toolkit support to debug.py

Use PromptSession.prompt_async() for arrow-key navigation and input
history when prompt-toolkit is available, falling back to plain input()
with a helpful install tip otherwise.

Made-with: Cursor

* fix: handle EOFError gracefully in debug.py

Catch EOFError alongside KeyboardInterrupt so that Ctrl-D exits
cleanly instead of printing a traceback.

Made-with: Cursor
2026-04-23 17:49:18 +08:00
d 🔹
b90f219bd1
fix(skills): validate bundled SKILL.md front-matter in CI (fixes #2443) (#2457)
* fix(skills): validate bundled SKILL.md front-matter in CI (fixes #2443)

Adds a parametrized backend test that runs `_validate_skill_frontmatter`
against every bundled SKILL.md under `skills/public/`, so a broken
front-matter fails CI with a per-skill error message instead of
surfacing as a runtime gateway-load warning.

The new test caught two pre-existing breakages on `main` and fixes them:

* `bootstrap/SKILL.md`: the unquoted description had a second `:` mid-line
  ("Also trigger for updates: ..."), which YAML parses as a nested mapping
  ("mapping values are not allowed here"). Rewrites the description as a
  folded scalar (`>-`), which preserves the original wording (including the
  embedded colon, double quotes, and apostrophes) without further escaping.
  This complements PR #2436 (single-file colon→hyphen patch) with a more
  general convention that survives future edits.

* `chart-visualization/SKILL.md`: used `dependency:` which is not in
  `ALLOWED_FRONTMATTER_PROPERTIES`. Renamed to `compatibility:`, the
  documented field for "Required tools, dependencies" per skill-creator.
  No code reads `dependency` (verified by grep across backend/).

* Apply suggestions from code review

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

* Fix the lint error

---------

Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
2026-04-23 14:06:14 +08:00
He Wang
c43c803f66
fix: remove mismatched context param in debug.py to suppress Pydantic warning (#2446)
* fix: remove mismatched context param in debug.py to suppress Pydantic warning

The ainvoke call passed context={"thread_id": ...} but the agent graph
has no context_schema (ContextT defaults to None), causing a
PydanticSerializationUnexpectedValue warning on every invocation.

Align with the production run_agent path by injecting context via
Runtime into configurable["__pregel_runtime"] instead.

Closes #2445

Made-with: Cursor

* refactor: derive runtime thread_id from config to avoid duplication

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Made-with: Cursor

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-04-23 09:56:57 +08:00
dependabot[bot]
dbd777fe62
chore(deps): bump python-dotenv from 1.2.1 to 1.2.2 in /backend (#2440)
Bumps [python-dotenv](https://github.com/theskumar/python-dotenv) from 1.2.1 to 1.2.2.
- [Release notes](https://github.com/theskumar/python-dotenv/releases)
- [Changelog](https://github.com/theskumar/python-dotenv/blob/main/CHANGELOG.md)
- [Commits](https://github.com/theskumar/python-dotenv/compare/v1.2.1...v1.2.2)

---
updated-dependencies:
- dependency-name: python-dotenv
  dependency-version: 1.2.2
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-04-22 16:48:09 +08:00
dependabot[bot]
1ca2621285
chore(deps): bump lxml from 6.0.2 to 6.1.0 in /backend (#2427)
Bumps [lxml](https://github.com/lxml/lxml) from 6.0.2 to 6.1.0.
- [Release notes](https://github.com/lxml/lxml/releases)
- [Changelog](https://github.com/lxml/lxml/blob/master/CHANGES.txt)
- [Commits](https://github.com/lxml/lxml/compare/lxml-6.0.2...lxml-6.1.0)

---
updated-dependencies:
- dependency-name: lxml
  dependency-version: 6.1.0
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-04-22 16:14:11 +08:00
Shawn Jasper
5ba1dacf25
fix: rename present_file to present_files in docs and prompts (#2393)
The tool is registered as `present_files` (plural) in present_file_tool.py,
but four references in documentation and prompt strings incorrectly used the
singular form `present_file`. This could cause confusion and potentially
lead to incorrect tool invocations.

Changed files:
- backend/docs/GUARDRAILS.md
- backend/docs/ARCHITECTURE.md
- backend/packages/harness/deerflow/agents/lead_agent/prompt.py (2 occurrences)
2026-04-21 16:10:14 +08:00
Ansel
6dce26a52e
fix: resolve tool duplication and skill parser YAML inconsistencies (#1803) (#2107)
* Refactor tests for SKILL.md parser

Updated tests for SKILL.md parser to handle quoted names and descriptions correctly. Added new tests for parsing plain and single-quoted names, and ensured multi-line descriptions are processed properly.

* Implement tool name validation and deduplication

Add tool name mismatch warning and deduplication logic

* Refactor skill file parsing and error handling

* Add tests for tool name deduplication

Added tests for tool name deduplication in get_available_tools(). Ensured that duplicates are not returned, the first occurrence is kept, and warnings are logged for skipped duplicates.

* Apply suggestions from code review

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

* Update minimal config to include tools list

* Update test for nonexistent skill file

Ensure the test for nonexistent files checks for None.

* Refactor tool loading and add skill management support

Refactor tool loading logic to include skill management tools based on configuration and clean up comments.

* Enhance code comments for tool loading logic

Added comments to clarify the purpose of various code sections related to tool loading and configuration.

* Fix assertion for duplicate tool name warning

* Fix indentation issues in tools.py

* Fix the lint error of test_tool_deduplication

* Fix the lint error of tools.py

* Fix the lint error

* Fix the lint error

* make format

---------

Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
2026-04-20 20:25:03 +08:00
imhaoran
fc94e90f6c
fix(setup-agent): prevent data loss when setup fails on existing agen… (#2254)
* fix(setup-agent): prevent data loss when setup fails on existing agent directory

Record whether the agent directory pre-existed before mkdir, and only
run shutil.rmtree cleanup when the directory was newly created during
this call. Previously, any failure would delete the entire directory
including pre-existing SOUL.md and config.yaml.

* fix: address PR review — init variables before try, remove unused result

* style: fix ruff I001 import block formatting in test file

* style: add missing blank lines between top-level definitions in test file
2026-04-20 20:17:30 +08:00
Admire
c99865f53d
fix(token-usage): enable stream usage for openai-compatible models (#2217)
* fix(token-usage): enable stream usage for openai-compatible models

* fix(token-usage): narrow stream_usage default to ChatOpenAI
2026-04-19 22:42:55 +08:00
Xun
a62ca5dd47
fix: Catch httpx.ReadError in the error handling (#2309)
* fix: Catch httpx.ReadError in the error handling

* fix
2026-04-19 22:30:22 +08:00
Nan Gao
f514e35a36
fix(backend): make clarification messages idempotent (#2350) (#2351) 2026-04-19 22:00:58 +08:00
Hinotobi
80e210f5bb
[security] fix(uploads): require explicit opt-in for host-side document conversion (#2332)
* fix: disable host-side upload conversion by default

* fix: address PR review comments on upload conversion gate
2026-04-18 22:47:42 +08:00
dependabot[bot]
5656f90792
chore(deps-dev): bump pytest from 9.0.2 to 9.0.3 in /backend (#2349)
Bumps [pytest](https://github.com/pytest-dev/pytest) from 9.0.2 to 9.0.3.
- [Release notes](https://github.com/pytest-dev/pytest/releases)
- [Changelog](https://github.com/pytest-dev/pytest/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pytest-dev/pytest/compare/9.0.2...9.0.3)

---
updated-dependencies:
- dependency-name: pytest
  dependency-version: 9.0.3
  dependency-type: direct:development
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-04-18 22:22:40 +08:00
Shawn Jasper
55474011c9
fix(subagent): inherit parent agent's tool_groups in task_tool (#2305)
* fix(subagent): inherit parent agent's tool_groups in task_tool

When a custom agent defines tool_groups (e.g. [file:read, file:write, bash]),
the restriction is correctly applied to the lead agent. However, when the lead
agent delegates work to a subagent via the task tool, get_available_tools() is
called without the groups parameter, causing the subagent to receive ALL tools
(including web_search, web_fetch, image_search, etc.) regardless of the parent
agent's configuration.

This fix propagates tool_groups through run metadata so that task_tool passes
the same group filter when building the subagent's tool set.

Changes:
- agent.py: include tool_groups in run metadata
- task_tool.py: read tool_groups from metadata and pass to get_available_tools()

* fix: initialize metadata before conditional block and update tests for tool_groups propagation

- Initialize metadata = {} before the 'if runtime is not None' block to
  avoid Ruff F821 (possibly-undefined variable) and simplify the
  parent_tool_groups expression.
- Update existing test assertion to expect groups=None in
  get_available_tools call signature.
- Add 3 new test cases:
  - test_task_tool_propagates_tool_groups_to_subagent
  - test_task_tool_no_tool_groups_passes_none
  - test_task_tool_runtime_none_passes_groups_none
2026-04-18 22:17:37 +08:00
imhaoran
24fe5fbd8c
fix(mcp): prevent RuntimeError from escaping except block in get_cach… (#2252)
* fix(mcp): prevent RuntimeError from escaping except block in get_cached_mcp_tools

When `asyncio.get_event_loop()` raises RuntimeError and the fallback
`asyncio.run()` also fails, the exception escapes unhandled because
Python does not route exceptions raised inside an `except` block to
sibling `except` clauses. Wrap the fallback call in its own try/except
so failures are logged and the function returns [] as intended.

* fix: use logger.exception to preserve stack traces on MCP init failure
2026-04-18 21:07:30 +08:00
dependabot[bot]
aa6098e6a4
chore(deps): bump langsmith from 0.6.4 to 0.7.31 in /backend (#2291)
Bumps [langsmith](https://github.com/langchain-ai/langsmith-sdk) from 0.6.4 to 0.7.31.
- [Release notes](https://github.com/langchain-ai/langsmith-sdk/releases)
- [Commits](https://github.com/langchain-ai/langsmith-sdk/compare/v0.6.4...v0.7.31)

---
updated-dependencies:
- dependency-name: langsmith
  dependency-version: 0.7.31
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-04-18 19:54:21 +08:00
Shawn Jasper
ca1b7d5f48
fix(sandbox): add missing path masking in ls_tool output (#2317)
ls_tool was the only file-system tool that did not call
mask_local_paths_in_output() before returning its result, causing host
absolute paths (e.g. /Users/.../backend/.deer-flow/knowledge-base/...)
to leak to the LLM instead of the expected virtual paths
(/mnt/knowledge-base/...).

This patch:
- Adds the mask_local_paths_in_output() call to ls_tool, consistent
  with bash_tool, glob_tool and grep_tool.
- Initialises thread_data = None before the is_local_sandbox branch
  (same pattern as glob_tool) so the variable is always in scope.
- Adds three new tests covering user-data path masking, skills path
  masking and the empty-directory edge case.
2026-04-18 08:46:59 +08:00
DanielWalnut
898f4e8ac2
fix: Memory update system has cache corruption, data loss, and thread-safety bugs (#2251)
* fix(memory): cache corruption, thread-safety, and caller mutation bugs

Bug 1 (updater.py): deep-copy current_memory before passing to
_apply_updates() so a subsequent save() failure cannot leave a
partially-mutated object in the storage cache.

Bug 3 (storage.py): add _cache_lock (threading.Lock) to
FileMemoryStorage and acquire it around every read/write of
_memory_cache, fixing concurrent-access races between the background
timer thread and HTTP reload calls.

Bug 4 (storage.py): replace in-place mutation
  memory_data["lastUpdated"] = ...
with a shallow copy
  memory_data = {**memory_data, "lastUpdated": ...}
so save() no longer silently modifies the caller's dict.

Regression tests added for all three bugs in test_memory_storage.py
and test_memory_updater.py.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* style: format test_memory_updater.py with ruff

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* style: remove stale bug-number labels from code comments and docstrings

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 12:00:31 +08:00
dependabot[bot]
259a6844bf
chore(deps): bump python-multipart from 0.0.22 to 0.0.26 in /backend (#2282)
Bumps [python-multipart](https://github.com/Kludex/python-multipart) from 0.0.22 to 0.0.26.
- [Release notes](https://github.com/Kludex/python-multipart/releases)
- [Changelog](https://github.com/Kludex/python-multipart/blob/master/CHANGELOG.md)
- [Commits](https://github.com/Kludex/python-multipart/compare/0.0.22...0.0.26)

---
updated-dependencies:
- dependency-name: python-multipart
  dependency-version: 0.0.26
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-04-16 09:07:28 +08:00
d 🔹
a664d2f5c4
fix(checkpointer): create parent directory before opening SQLite in sync provider (#2272)
* fix(checkpointer): create parent directory before opening SQLite in sync provider

The sync checkpointer factory (_sync_checkpointer_cm) opens a SQLite
connection without first ensuring the parent directory exists.  The async
provider and both store providers already call ensure_sqlite_parent_dir(),
but this call was missing from the sync path.

When the deer-flow harness package is used from an external virtualenv
(where the .deer-flow directory is not pre-created), the missing parent
directory causes:

    sqlite3.OperationalError: unable to open database file

Add the missing ensure_sqlite_parent_dir() call in the sync SQLite
branch, consistent with the async provider, and add a regression test.

Closes #2259

* style: fix ruff format + add call-order assertion for ensure_parent_dir

- Fix formatting in test_checkpointer.py (ruff format)
- Add test_sqlite_ensure_parent_dir_before_connect to verify
  ensure_sqlite_parent_dir is called before from_conn_string
  (addresses Copilot review suggestion)

---------

Co-authored-by: voidborne-d <voidborne-d@users.noreply.github.com>
2026-04-16 09:06:38 +08:00
YuJitang
105db00987
feat: show token usage per assistant response (#2270)
* feat: show token usage per assistant response

* fix: align client models response with token usage

* fix: address token usage review feedback

* docs: clarify token usage config example

---------

Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
2026-04-16 08:56:49 +08:00
Hinotobi
2176b2bbfc
fix: validate bootstrap agent names before filesystem writes (#2274)
* fix: validate bootstrap agent names before filesystem writes

* fix: tighten bootstrap agent-name validation
2026-04-16 08:36:42 +08:00
Wen
8e3591312a
test: add unit tests for ViewImageMiddleware (#2256)
* test: add unit tests for ViewImageMiddleware

- Add 33 test cases covering all 7 internal methods plus sync/async
  before_model hooks
- Cover normal path, edge cases (missing keys, empty base64, stale
  ToolMessages before assistant turn), and deduplication logic
- Related to Q2 Roadmap #1669

* test: add unit tests for ViewImageMiddleware

Add 35 test cases covering all internal methods, before_model hooks,
and edge cases (missing attrs, list-content dedup, stale ToolMessages).

Related to #1669
2026-04-15 23:54:30 +08:00
Jason
692f79452d
fix(gateway): forward agent_name and is_bootstrap from context to configurable (#2242)
The frontend sends agent_name and is_bootstrap via the context field
in run requests, but services.py only forwards a hardcoded whitelist
of keys (_CONTEXT_CONFIGURABLE_KEYS) into the agent's configurable
dict.  Since agent_name was missing, custom agents never received
their name — make_lead_agent always fell back to the default lead
agent, skipping SOUL.md, per-agent config and skill filtering.

Similarly, is_bootstrap was dropped, so the bootstrap creation flow
could never activate the setup_agent tool path.

Add both keys to the whitelist so they reach make_lead_agent.

Fixes #2222

Co-authored-by: JasonOA888 <JasonOA888@users.noreply.github.com>
2026-04-15 23:11:10 +08:00
DanielWalnut
8760937439
fix(memory): use asyncio.to_thread for blocking file I/O in aupdate_memory (#2220)
* fix(memory): use asyncio.to_thread for blocking file I/O in aupdate_memory

`_finalize_update` performs synchronous blocking operations (os.mkdir,
file open/write/rename/stat) that were called directly from the async
`aupdate_memory` method, causing `BlockingError` from blockbuster when
running under an ASGI server. Wrap the call with `asyncio.to_thread` to
offload all blocking I/O to a thread pool.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(memory): use unique temp filename to prevent concurrent write collision

`file_path.with_suffix(".tmp")` produces a fixed path — concurrent saves
for the same agent (now possible after wrapping _finalize_update in
asyncio.to_thread) would clobber the same temp file. Use a UUID-suffixed
temp file so each write is isolated.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(memory): also offload _prepare_update_prompt to thread pool

FileMemoryStorage.load() inside _prepare_update_prompt performs
synchronous stat() and file read, blocking the event loop just like
_finalize_update did. Wrap _prepare_update_prompt in asyncio.to_thread
for the same reason.

The async path now has no blocking file I/O on the event loop:
  to_thread(_prepare_update_prompt) → await model.ainvoke() → to_thread(_finalize_update)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 16:41:54 +08:00
DanielWalnut
4ba3167f48
feat: flush memory before summarization (#2176)
* feat: flush memory before summarization

* fix: keep agent-scoped memory on summarization flush

* fix: harden summarization hook plumbing

* fix: address summarization review feedback

* style: format memory middleware
2026-04-14 15:01:06 +08:00
Octopus
e4f896e90d
fix(todo-middleware): prevent premature agent exit with incomplete todos (#2135)
* fix(todo-middleware): prevent premature agent exit with incomplete todos

When plan mode is active (is_plan_mode=True), the agent occasionally
exits the loop and outputs a final response while todo items are still
incomplete. This happens because the routing edge only checks for
tool_calls, not todo completion state.

Fixes #2112

Add an after_model override to TodoMiddleware with
@hook_config(can_jump_to=["model"]). When the model produces a
response with no tool calls but there are still incomplete todos, the
middleware injects a todo_completion_reminder HumanMessage and returns
jump_to=model to force another model turn. A cap of 2 reminders
prevents infinite loops when the agent cannot make further progress.

Also adds _completion_reminder_count() helper and 14 new unit tests
covering all edge cases of the new after_model / aafter_model logic.

* Remove unnecessary blank line in test file

* Fix runtime argument annotation in before_model

* Apply suggestions from code review

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: octo-patch <octo-patch@github.com>
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-04-14 11:11:26 +08:00
luo jiyin
07fc25d285
feat: switch memory updater to async LLM calls (#2138)
* docs: mark memory updater async migration as completed

- Update TODO.md to mark the replacement of sync model.invoke()
  with async model.ainvoke() in title_middleware and memory updater
  as completed using [x] format

Addresses #2131

* feat: switch memory updater to async LLM calls

- Add async aupdate_memory() method using await model.ainvoke()
- Convert sync update_memory() to use async wrapper
- Add _run_async_update_sync() for nested loop context handling
- Maintain backward compatibility with existing sync API
- Add ThreadPoolExecutor for async execution from sync contexts

Addresses #2131

* test: add tests for async memory updater

- Add test_async_update_memory_uses_ainvoke() to verify async path
- Convert existing tests to use AsyncMock and ainvoke assertions
- Add test_sync_update_memory_wrapper_works_in_running_loop()
- Update all model mocks to use async await patterns

Addresses #2131

* fix: apply ruff formatting to memory updater

- Format multi-line expressions to single line
- Ensure code style consistency with project standards
- Fix lint issues caught by GitHub Actions

* test: add comprehensive tests for async memory updater

- Add test_async_update_memory_uses_ainvoke() to verify async path
- Convert existing tests to use AsyncMock and ainvoke assertions
- Add test_sync_update_memory_wrapper_works_in_running_loop()
- Update all model mocks to use async await patterns
- Ensure backward compatibility with sync API

* fix: satisfy ruff formatting in memory updater test

---------

Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
2026-04-14 11:10:42 +08:00
Nan Gao
55bc09ac33
fix(backend): fix uploads for mounted sandbox providers (#2199)
* fix uploads for mounted sandbox providers

* Potential fix for pull request finding

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
2026-04-14 10:44:31 +08:00
dependabot[bot]
c43a45ea40
chore(deps): bump pillow from 12.1.1 to 12.2.0 in /backend (#2206)
Bumps [pillow](https://github.com/python-pillow/Pillow) from 12.1.1 to 12.2.0.
- [Release notes](https://github.com/python-pillow/Pillow/releases)
- [Changelog](https://github.com/python-pillow/Pillow/blob/main/CHANGES.rst)
- [Commits](https://github.com/python-pillow/Pillow/compare/12.1.1...12.2.0)

---
updated-dependencies:
- dependency-name: pillow
  dependency-version: 12.2.0
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-04-14 10:35:59 +08:00
Admire
9cf7153b1d
fix(check): windows pnpm version detection in check script (#2189)
* fix: resolve Windows pnpm detection in check script

* style: format check script regression test

* Potential fix for pull request finding

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

* fix: resolve corepack fallback on windows

---------

Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
2026-04-14 10:29:44 +08:00
Octopus
c91785dd68
fix(title): strip <think> tags from title model responses and assistant context (#1927)
* fix(title): strip <think> tags from title model responses and assistant context

Reasoning models (e.g. minimax M2.7, DeepSeek-R1) emit <think>...</think>
blocks before their actual output. When such a model is used as the title
model (or as the main agent), the raw thinking content leaked into the thread
title stored in state, so the chat list showed the internal monologue instead
of a meaningful title.

Fixes #1884

- Add `_strip_think_tags()` helper using a regex to remove all <think>...</think> blocks
- Apply it in `_parse_title()` so the title model response is always clean
- Apply it to the assistant message in `_build_title_prompt()` so thinking
  content from the first AI turn is not fed back to the title model
- Add four new unit tests covering: stripping in parse, think-only response,
  assistant prompt stripping, and end-to-end async flow with think tags

* Fix the lint error

---------

Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
2026-04-14 09:51:39 +08:00
sqsge
053e18e1a6
fix(skills): avoid blocking custom skill deletion on readonly history writes (#2197) 2026-04-14 09:00:29 +08:00
Hinotobi
a7e7c6d667
fix: disable custom-agent management API by default (#2161)
* fix: disable custom-agent management API by default

* style: format agents API hardening files

* fix: address review feedback for agents API hardening

* fix: add missing disabled API coverage
2026-04-14 00:03:38 +08:00
Nan Gao
f4c17c66ce
fix(middleware): fix present_files thread id fallback (#2181)
* fix present files thread id fallback

* fix: resolve present_files thread id from runtime config
2026-04-13 22:59:13 +08:00
lesliewangwyc-dev
1df389b9d0
fix: wrap blocking readability call with asyncio.to_thread in web_fetch (#2157)
* fix: wrap blocking readability call with asyncio.to_thread in web_fetch

The readability extractor internally spawns a Node.js subprocess via
readabilipy, which blocks the async event loop and causes a
BlockingError when web_fetch is invoked inside LangGraph's async
runtime.

Wrap the synchronous extract_article call with asyncio.to_thread to
offload it to a thread pool, unblocking the event loop.

Note: community/infoquest/tools.py has the same latent issue and
should be addressed in a follow-up PR.

Closes #2152

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* test: verify web_fetch offloads extraction via asyncio.to_thread

Add a regression test that monkeypatches asyncio.to_thread to confirm
readability extraction is offloaded to a worker thread, preventing
future refactors from reintroducing the blocking call.

Addresses Copilot review feedback on #2157.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
2026-04-13 21:15:24 +08:00