* fix(gateway): bound lifespan shutdown hooks to prevent worker hang
Gateway worker can hang indefinitely in `uvicorn --reload` mode with
the listening socket still bound — all /api/* requests return 504,
and SIGKILL is the only recovery.
Root cause (py-spy dump from a reproduction showed 16+ stacked frames
of signal_handler -> Event.set -> threading.Lock.__enter__ on the
main thread): CPython's `threading.Event` uses `Condition(Lock())`
where the inner Lock is non-reentrant. uvicorn's BaseReload signal
handler calls `should_exit.set()` directly from signal context; if a
second signal (SIGTERM/SIGHUP from the reload supervisor, or
watchfiles-triggered reload) arrives while the first handler holds
the Lock, the reentrant call deadlocks on itself.
The reload supervisor keeps sending those signals only when the
worker fails to exit promptly. DeerFlow's lifespan currently awaits
`stop_channel_service()` with no timeout; if a channel's `stop()`
stalls (e.g. Feishu/Slack WebSocket waiting for an ack), the worker
can't exit, the supervisor keeps signaling, and the deadlock becomes
reachable.
This is a defense-in-depth fix — it does not repair the upstream
uvicorn/CPython issue, but it ensures DeerFlow's lifespan exits
within a bounded window so the supervisor has no reason to keep
firing signals. No behavior change on the happy path.
Wraps the shutdown hook in `asyncio.wait_for(timeout=5.0)` and logs
a warning on timeout before proceeding to worker exit.
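
A minimal sketch of the bounded-shutdown pattern described above. The `stop_channel_service` stand-in and the `bounded_shutdown` helper are hypothetical names for illustration, not the actual DeerFlow code; only `asyncio.wait_for` and the 5.0 s timeout come from the message.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)

SHUTDOWN_TIMEOUT_SECONDS = 5.0  # the timeout named in the commit message


async def stop_channel_service() -> None:
    """Hypothetical stand-in for a shutdown hook that stalls (e.g. waiting for an ack)."""
    await asyncio.sleep(3600)


async def bounded_shutdown(hook, timeout: float = SHUTDOWN_TIMEOUT_SECONDS) -> bool:
    """Await a shutdown hook, giving up after `timeout` seconds.

    Returns True if the hook finished in time, False if it timed out.
    """
    try:
        await asyncio.wait_for(hook(), timeout=timeout)
        return True
    except asyncio.TimeoutError:
        # Log and fall through so the worker can still exit.
        logger.warning("shutdown hook exceeded %.1fs; proceeding to worker exit", timeout)
        return False
```

On timeout, `asyncio.wait_for` cancels the awaited hook, so the stalled coroutine does not keep running after the lifespan returns.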
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Update backend/app/gateway/app.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* style: apply make format (ruff) to test assertions
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* feat: add optional prompt-toolkit support to debug.py
Use PromptSession.prompt_async() for arrow-key navigation and input
history when prompt-toolkit is available, falling back to plain input()
with a helpful install tip otherwise.
Made-with: Cursor
* fix: handle EOFError gracefully in debug.py
Catch EOFError alongside KeyboardInterrupt so that Ctrl-D exits
cleanly instead of printing a traceback.
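
A small sketch of the idea, assuming a plain `input()`-style reader. The `read_line` helper and its `reader` parameter are hypothetical and exist only to make the behavior easy to exercise.

```python
def read_line(prompt: str, reader=input):
    """Return the user's line, or None when the user asks to exit.

    Ctrl-D raises EOFError and Ctrl-C raises KeyboardInterrupt; both are
    treated as a clean exit request instead of surfacing a traceback.
    """
    try:
        return reader(prompt)
    except (EOFError, KeyboardInterrupt):
        return None
```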
Made-with: Cursor
* fix(skills): validate bundled SKILL.md front-matter in CI (fixes #2443)
Adds a parametrized backend test that runs `_validate_skill_frontmatter`
against every bundled SKILL.md under `skills/public/`, so a broken
front-matter fails CI with a per-skill error message instead of
surfacing as a runtime gateway-load warning.
The new test caught two pre-existing breakages on `main`, which this change also fixes:
* `bootstrap/SKILL.md`: the unquoted description had a second `:` mid-line
("Also trigger for updates: ..."), which YAML parses as a nested mapping
("mapping values are not allowed here"). Rewrites the description as a
folded scalar (`>-`), which preserves the original wording (including the
embedded colon, double quotes, and apostrophes) without further escaping.
This complements PR #2436 (single-file colon→hyphen patch) with a more
general convention that survives future edits.
* `chart-visualization/SKILL.md`: used `dependency:` which is not in
`ALLOWED_FRONTMATTER_PROPERTIES`. Renamed to `compatibility:`, the
documented field for "Required tools, dependencies" per skill-creator.
No code reads `dependency` (verified by grep across backend/).
* Apply suggestions from code review
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
* Fix the lint error
---------
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
* fix: remove mismatched context param in debug.py to suppress Pydantic warning
The ainvoke call passed context={"thread_id": ...} but the agent graph
has no context_schema (ContextT defaults to None), causing a
PydanticSerializationUnexpectedValue warning on every invocation.
Align with the production run_agent path by injecting context via
Runtime into configurable["__pregel_runtime"] instead.
Closes #2445
Made-with: Cursor
* refactor: derive runtime thread_id from config to avoid duplication
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Made-with: Cursor
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
The tool is registered as `present_files` (plural) in present_file_tool.py,
but four references in documentation and prompt strings incorrectly used the
singular form `present_file`. This could cause confusion and potentially
lead to incorrect tool invocations.
Changed files:
- backend/docs/GUARDRAILS.md
- backend/docs/ARCHITECTURE.md
- backend/packages/harness/deerflow/agents/lead_agent/prompt.py (2 occurrences)
- Remove f-string prefix on 7 strings with no placeholders (F541)
in analyze.py, aggregate_benchmark.py, run_loop.py, generate_review.py
- Remove unused `os` import in quick_validate.py (F401)
Found by ruff via HUMMBL Arbiter (https://hummbl.io/audit).
* Refactor tests for SKILL.md parser
Updated tests for SKILL.md parser to handle quoted names and descriptions correctly. Added new tests for parsing plain and single-quoted names, and ensured multi-line descriptions are processed properly.
* Implement tool name validation and deduplication
Add tool name mismatch warning and deduplication logic
* Refactor skill file parsing and error handling
* Add tests for tool name deduplication
Added tests for tool name deduplication in get_available_tools(). Ensured that duplicates are not returned, the first occurrence is kept, and warnings are logged for skipped duplicates.
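
A rough sketch of first-occurrence-wins deduplication with a warning per skipped duplicate. The `Tool` dataclass and `dedupe_by_name` helper are hypothetical; the real logic lives in `get_available_tools()` and may differ.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger(__name__)


@dataclass(frozen=True)
class Tool:
    """Hypothetical minimal tool record used only for this illustration."""
    name: str
    source: str


def dedupe_by_name(tools):
    """Keep the first tool for each name and warn about later duplicates."""
    seen = set()
    unique = []
    for tool in tools:
        if tool.name in seen:
            logger.warning("skipping duplicate tool %r from %s", tool.name, tool.source)
            continue
        seen.add(tool.name)
        unique.append(tool)
    return unique
```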
* Apply suggestions from code review
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
* Update minimal config to include tools list
* Update test for nonexistent skill file
Ensure the test for nonexistent files checks for None.
* Refactor tool loading and add skill management support
Refactor tool loading logic to include skill management tools based on configuration and clean up comments.
* Enhance code comments for tool loading logic
Added comments to clarify the purpose of various code sections related to tool loading and configuration.
* Fix assertion for duplicate tool name warning
* Fix indentation issues in tools.py
* Fix the lint error of test_tool_deduplication
* Fix the lint error of tools.py
* Fix the lint error
* Fix the lint error
* make format
---------
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
* fix(setup-agent): prevent data loss when setup fails on existing agent directory
Record whether the agent directory pre-existed before mkdir, and only
run shutil.rmtree cleanup when the directory was newly created during
this call. Previously, any failure would delete the entire directory
including pre-existing SOUL.md and config.yaml.
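
The guard can be sketched as follows. `setup_agent_dir` and its `write_files` callback are hypothetical names; the point is only that cleanup is conditional on the directory having been created by this call.

```python
import shutil
from pathlib import Path


def setup_agent_dir(agent_dir: Path, write_files) -> None:
    """Create (or reuse) an agent directory and populate it.

    On failure, remove the directory only if this call created it, so
    files that were already there are never deleted.
    """
    pre_existed = agent_dir.exists()  # record state before mkdir
    try:
        agent_dir.mkdir(parents=True, exist_ok=True)
        write_files(agent_dir)
    except Exception:
        if not pre_existed:
            shutil.rmtree(agent_dir, ignore_errors=True)
        raise
```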
* fix: address PR review — init variables before try, remove unused result
* style: fix ruff I001 import block formatting in test file
* style: add missing blank lines between top-level definitions in test file
* fix(subagent): inherit parent agent's tool_groups in task_tool
When a custom agent defines tool_groups (e.g. [file:read, file:write, bash]),
the restriction is correctly applied to the lead agent. However, when the lead
agent delegates work to a subagent via the task tool, get_available_tools() is
called without the groups parameter, causing the subagent to receive ALL tools
(including web_search, web_fetch, image_search, etc.) regardless of the parent
agent's configuration.
This fix propagates tool_groups through run metadata so that task_tool passes
the same group filter when building the subagent's tool set.
Changes:
- agent.py: include tool_groups in run metadata
- task_tool.py: read tool_groups from metadata and pass to get_available_tools()
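
A simplified sketch of the propagation. The `TOOL_GROUPS` table, this `get_available_tools` stand-in, and `subagent_tools` are hypothetical; only the idea of reading `tool_groups` from run metadata and passing it as `groups` comes from the change description.

```python
# Hypothetical tool-name -> group table, for illustration only.
TOOL_GROUPS = {
    "read_file": "file:read",
    "write_file": "file:write",
    "bash": "bash",
    "web_search": "web",
}


def get_available_tools(groups=None):
    """Stand-in filter: no groups means every tool is returned."""
    if groups is None:
        return sorted(TOOL_GROUPS)
    allowed = set(groups)
    return sorted(name for name, group in TOOL_GROUPS.items() if group in allowed)


def subagent_tools(run_metadata):
    """Build the subagent's tool set using the parent's group filter, if any."""
    metadata = run_metadata or {}
    return get_available_tools(groups=metadata.get("tool_groups"))
```

With this shape, a parent restricted to `["file:read", "bash"]` yields a subagent without `web_search`, while missing metadata keeps the previous unrestricted behavior.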
* fix: initialize metadata before conditional block and update tests for tool_groups propagation
- Initialize metadata = {} before the 'if runtime is not None' block to
avoid Ruff F821 (possibly-undefined variable) and simplify the
parent_tool_groups expression.
- Update existing test assertion to expect groups=None in
get_available_tools call signature.
- Add 3 new test cases:
- test_task_tool_propagates_tool_groups_to_subagent
- test_task_tool_no_tool_groups_passes_none
- test_task_tool_runtime_none_passes_groups_none
* fix(mcp): prevent RuntimeError from escaping except block in get_cached_mcp_tools
When `asyncio.get_event_loop()` raises RuntimeError and the fallback
`asyncio.run()` also fails, the exception escapes unhandled because
Python does not route exceptions raised inside an `except` block to
sibling `except` clauses. Wrap the fallback call in its own try/except
so failures are logged and the function returns [] as intended.
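
The language behavior behind this fix can be reproduced in isolation. In the sketch below, `primary` and `fallback` are hypothetical stand-ins for the event-loop lookup and the `asyncio.run()` fallback.

```python
import logging

logger = logging.getLogger(__name__)


def primary():
    raise RuntimeError("no current event loop")


def fallback():
    raise RuntimeError("fallback failed too")


def load_unguarded():
    try:
        return primary()
    except RuntimeError:
        # An exception raised here is NOT handled by the sibling clause below.
        return fallback()
    except Exception:
        return []


def load_guarded():
    try:
        return primary()
    except RuntimeError:
        try:
            return fallback()
        except Exception:
            # logger.exception records the stack trace along with the message.
            logger.exception("fallback initialization failed")
            return []
    except Exception:
        return []
```

`load_unguarded()` raises the second `RuntimeError`, while `load_guarded()` logs it and returns an empty list.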
* fix: use logger.exception to preserve stack traces on MCP init failure
When NEXT_PUBLIC_BACKEND_BASE_URL is unset, the frontend proxies API
requests to the gateway. Only /api/agents and /api/skills had rewrite
rules, causing 404s for /api/models, /api/threads, /api/memory,
/api/mcp, /api/suggestions, /api/runs, etc.
Add a catch-all /api/:path* rewrite that proxies all remaining gateway
API routes. The existing /api/langgraph rewrite takes priority because
it is pushed to the array first (Next.js checks rewrites in order).
Fixes #2327
Co-authored-by: JasonOA888 <JasonOA888@users.noreply.github.com>
ls_tool was the only file-system tool that did not call
mask_local_paths_in_output() before returning its result, causing host
absolute paths (e.g. /Users/.../backend/.deer-flow/knowledge-base/...)
to leak to the LLM instead of the expected virtual paths
(/mnt/knowledge-base/...).
This patch:
- Adds the mask_local_paths_in_output() call to ls_tool, consistent
with bash_tool, glob_tool and grep_tool.
- Initialises thread_data = None before the is_local_sandbox branch
(same pattern as glob_tool) so the variable is always in scope.
- Adds three new tests covering user-data path masking, skills path
masking and the empty-directory edge case.
* fix(memory): cache corruption, thread-safety, and caller mutation bugs
Bug 1 (updater.py): deep-copy current_memory before passing to
_apply_updates() so a subsequent save() failure cannot leave a
partially-mutated object in the storage cache.
Bug 3 (storage.py): add _cache_lock (threading.Lock) to
FileMemoryStorage and acquire it around every read/write of
_memory_cache, fixing concurrent-access races between the background
timer thread and HTTP reload calls.
Bug 4 (storage.py): replace in-place mutation
memory_data["lastUpdated"] = ...
with a shallow copy
memory_data = {**memory_data, "lastUpdated": ...}
so save() no longer silently modifies the caller's dict.
Regression tests added for all three bugs in test_memory_storage.py
and test_memory_updater.py.
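
The three fixes can be illustrated together in a toy store. `InMemoryStorage` and `apply_updates` below are hypothetical simplifications, not the real `FileMemoryStorage` or updater.

```python
import copy
import threading
from datetime import datetime, timezone


class InMemoryStorage:
    """Toy cache showing the lock and the non-mutating save."""

    def __init__(self):
        self._memory_cache = {}
        self._cache_lock = threading.Lock()

    def load(self, agent):
        with self._cache_lock:
            return self._memory_cache.get(agent, {})

    def save(self, agent, memory_data):
        # Shallow copy: the caller's dict does not gain a "lastUpdated" key.
        stamped = {**memory_data, "lastUpdated": datetime.now(timezone.utc).isoformat()}
        with self._cache_lock:
            self._memory_cache[agent] = stamped
        return stamped


def apply_updates(storage, agent, mutate):
    """Mutate a deep copy so a failure cannot corrupt the cached object."""
    working = copy.deepcopy(storage.load(agent))
    mutate(working)
    return storage.save(agent, working)
```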
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* style: format test_memory_updater.py with ruff
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* style: remove stale bug-number labels from code comments and docstrings
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(checkpointer): create parent directory before opening SQLite in sync provider
The sync checkpointer factory (_sync_checkpointer_cm) opens a SQLite
connection without first ensuring the parent directory exists. The async
provider and both store providers already call ensure_sqlite_parent_dir(),
but this call was missing from the sync path.
When the deer-flow harness package is used from an external virtualenv
(where the .deer-flow directory is not pre-created), the missing parent
directory causes:
sqlite3.OperationalError: unable to open database file
Add the missing ensure_sqlite_parent_dir() call in the sync SQLite
branch, consistent with the async provider, and add a regression test.
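
A minimal reproduction of the failure and the fix, using only the standard library. `ensure_parent_dir` and `open_checkpoint_db` are hypothetical stand-ins; the project's own helper is `ensure_sqlite_parent_dir()`, whose exact signature is not shown in this log.

```python
import sqlite3
from pathlib import Path


def ensure_parent_dir(db_path: Path) -> None:
    """Create the database's parent directory if it does not exist yet."""
    db_path.parent.mkdir(parents=True, exist_ok=True)


def open_checkpoint_db(db_path: Path) -> sqlite3.Connection:
    """Open a SQLite database, creating its parent directory first."""
    ensure_parent_dir(db_path)
    return sqlite3.connect(str(db_path))
```

Without the `ensure_parent_dir` call, `sqlite3.connect` on a path whose directory is missing raises `sqlite3.OperationalError: unable to open database file`.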
Closes #2259
* style: fix ruff format + add call-order assertion for ensure_parent_dir
- Fix formatting in test_checkpointer.py (ruff format)
- Add test_sqlite_ensure_parent_dir_before_connect to verify
ensure_sqlite_parent_dir is called before from_conn_string
(addresses Copilot review suggestion)
---------
Co-authored-by: voidborne-d <voidborne-d@users.noreply.github.com>
* fix(frontend): make Suggestion button opaque in dark mode
The outline Button variant applies dark:bg-input/30, leaving Suggestion
pills ~70% transparent in dark mode. Scrolled chat content bled through
the buttons, making suggestion text unreadable. Override with
dark:bg-background so it matches the opaque light-mode appearance.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix the lint error of commit
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
After a page refresh, the artifact panel's autoOpen/autoSelect state is
reset to true. Submitting a new question flips thread.isLoading to true,
which message-list passes to every MessageGroup — including historical
ones. The previous response's last write_file step then satisfies the
auto-open condition and re-pops the stale artifact.
Gate the auto-open on the tool call having no result yet, so only a
write_file that is still streaming in the current response can trigger
it; rehydrated tool calls always carry a result and are now skipped.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* test: add unit tests for ViewImageMiddleware
- Add 33 test cases covering all 7 internal methods plus sync/async
before_model hooks
- Cover normal path, edge cases (missing keys, empty base64, stale
ToolMessages before assistant turn), and deduplication logic
- Related to Q2 Roadmap #1669
* test: add unit tests for ViewImageMiddleware
Add 35 test cases covering all internal methods, before_model hooks,
and edge cases (missing attrs, list-content dedup, stale ToolMessages).
Related to #1669
Fixes #2203
When NEXT_PUBLIC_BACKEND_BASE_URL is not set, the frontend uses Next.js
rewrites to proxy API calls to the gateway. Skills API routes were missing
from the rewrite config, causing /api/skills to return the SPA HTML instead
of JSON, which produced 'Unexpected token <' errors in the skill settings page.
Co-authored-by: JasonOA888 <JasonOA888@users.noreply.github.com>
* fix(gateway): forward agent_name and is_bootstrap from context to configurable
The frontend sends agent_name and is_bootstrap via the context field
in run requests, but services.py only forwards a hardcoded whitelist
of keys (_CONTEXT_CONFIGURABLE_KEYS) into the agent's configurable
dict. Since agent_name was missing, custom agents never received
their name — make_lead_agent always fell back to the default lead
agent, skipping SOUL.md, per-agent config and skill filtering.
Similarly, is_bootstrap was dropped, so the bootstrap creation flow
could never activate the setup_agent tool path.
Add both keys to the whitelist so they reach make_lead_agent.
Fixes #2222
* fix(frontend): resolve /mnt/ links in markdown to artifact API URLs
AI agent messages contain links like /mnt/user-data/outputs/file.pdf
which were rendered as-is in the browser, resulting in 404 errors.
Images already got the correct treatment via MessageImage and
resolveArtifactURL, but anchor tags (<a>) were passed through
unchanged.
Add an 'a' component override in MessageContent_ that rewrites
/mnt/-prefixed hrefs to the artifact API endpoint, matching the
existing image handling pattern.
Fixes #2232
---------
Co-authored-by: JasonOA888 <JasonOA888@users.noreply.github.com>
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
* fix(memory): use asyncio.to_thread for blocking file I/O in aupdate_memory
`_finalize_update` performs synchronous blocking operations (os.mkdir,
file open/write/rename/stat) that were called directly from the async
`aupdate_memory` method, causing `BlockingError` from blockbuster when
running under an ASGI server. Wrap the call with `asyncio.to_thread` to
offload all blocking I/O to a thread pool.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(memory): use unique temp filename to prevent concurrent write collision
`file_path.with_suffix(".tmp")` produces a fixed path — concurrent saves
for the same agent (now possible after wrapping _finalize_update in
asyncio.to_thread) would clobber the same temp file. Use a UUID-suffixed
temp file so each write is isolated.
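
One way to sketch the unique-temp-file write, assuming JSON content and an atomic `os.replace`. The helper name `atomic_write_json` and the exact temp-name layout are illustrative assumptions.

```python
import json
import os
import uuid
from pathlib import Path


def atomic_write_json(file_path: Path, data: dict) -> None:
    """Write JSON via a per-call temp file, then atomically replace the target."""
    # A UUID in the name keeps concurrent writers from sharing a temp file.
    tmp_path = file_path.with_name(f"{file_path.name}.{uuid.uuid4().hex}.tmp")
    try:
        tmp_path.write_text(json.dumps(data), encoding="utf-8")
        os.replace(tmp_path, file_path)
    except BaseException:
        # Do not leave a stray temp file behind if the write or rename fails.
        tmp_path.unlink(missing_ok=True)
        raise
```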
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(memory): also offload _prepare_update_prompt to thread pool
FileMemoryStorage.load() inside _prepare_update_prompt performs
synchronous stat() and file read, blocking the event loop just like
_finalize_update did. Wrap _prepare_update_prompt in asyncio.to_thread
for the same reason.
The async path now has no blocking file I/O on the event loop:
to_thread(_prepare_update_prompt) → await model.ainvoke() → to_thread(_finalize_update)
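
The resulting pipeline shape can be sketched like this. All three step functions are hypothetical stand-ins (the model call is faked); only the `to_thread` / `await` / `to_thread` ordering reflects the description above.

```python
import asyncio
import time


def prepare_update_prompt(agent: str) -> str:
    """Stand-in for the blocking load + prompt build (simulated with sleep)."""
    time.sleep(0.01)
    return f"update memory for {agent}"


async def fake_model_ainvoke(prompt: str) -> str:
    """Stand-in for `await model.ainvoke(...)`."""
    await asyncio.sleep(0)
    return prompt.upper()


def finalize_update(response: str) -> dict:
    """Stand-in for the blocking write/rename step (simulated with sleep)."""
    time.sleep(0.01)
    return {"saved": response}


async def aupdate_memory(agent: str) -> dict:
    # Blocking steps run in the default thread pool; the loop stays responsive.
    prompt = await asyncio.to_thread(prepare_update_prompt, agent)
    response = await fake_model_ainvoke(prompt)
    return await asyncio.to_thread(finalize_update, response)
```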
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(todo-middleware): prevent premature agent exit with incomplete todos
When plan mode is active (is_plan_mode=True), the agent occasionally
exits the loop and outputs a final response while todo items are still
incomplete. This happens because the routing edge only checks for
tool_calls, not todo completion state.
Fixes #2112
Add an after_model override to TodoMiddleware with
@hook_config(can_jump_to=["model"]). When the model produces a
response with no tool calls but there are still incomplete todos, the
middleware injects a todo_completion_reminder HumanMessage and returns
jump_to=model to force another model turn. A cap of 2 reminders
prevents infinite loops when the agent cannot make further progress.
Also adds _completion_reminder_count() helper and 14 new unit tests
covering all edge cases of the new after_model / aafter_model logic.
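
The decision logic can be approximated without any framework. In the sketch below, messages and todos are plain dicts, and the state shape and return value are assumptions made for illustration; the real middleware works with LangChain message objects and the `@hook_config` decorator.

```python
MAX_COMPLETION_REMINDERS = 2  # cap described in the commit message
REMINDER_NAME = "todo_completion_reminder"


def completion_reminder_count(messages) -> int:
    """Count reminders already injected into the conversation."""
    return sum(1 for m in messages if m.get("name") == REMINDER_NAME)


def after_model(state):
    """Return a reminder + jump instruction, or None to let the agent finish."""
    messages = state["messages"]
    if not messages or messages[-1].get("tool_calls"):
        return None  # the model is still calling tools; nothing to do
    incomplete = [t for t in state.get("todos", []) if t.get("status") != "completed"]
    if not incomplete:
        return None
    if completion_reminder_count(messages) >= MAX_COMPLETION_REMINDERS:
        return None  # avoid looping forever when no progress is possible
    reminder = {
        "role": "human",
        "name": REMINDER_NAME,
        "content": f"{len(incomplete)} todo item(s) are still incomplete.",
    }
    return {"messages": [reminder], "jump_to": "model"}
```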
* Remove unnecessary blank line in test file
* Fix runtime argument annotation in before_model
* Apply suggestions from code review
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
---------
Co-authored-by: octo-patch <octo-patch@github.com>
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* docs: mark memory updater async migration as completed
- Update TODO.md to mark the replacement of sync model.invoke()
with async model.ainvoke() in title_middleware and memory updater
as completed using [x] format
Addresses #2131
* feat: switch memory updater to async LLM calls
- Add async aupdate_memory() method using await model.ainvoke()
- Convert sync update_memory() to use async wrapper
- Add _run_async_update_sync() for nested loop context handling
- Maintain backward compatibility with existing sync API
- Add ThreadPoolExecutor for async execution from sync contexts
Addresses #2131
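
A sketch of how a sync wrapper can drive a coroutine whether or not a loop is already running in the calling thread. `run_coro_sync` and `aupdate` are hypothetical names; the real helper is `_run_async_update_sync()`, whose body is not shown in this log.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


async def aupdate(value: int) -> int:
    """Stand-in for the async update coroutine."""
    await asyncio.sleep(0)
    return value * 2


def run_coro_sync(coro_factory):
    """Run a coroutine to completion from synchronous code."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop in this thread: asyncio.run is safe to call directly.
        return asyncio.run(coro_factory())
    # A loop is already running here, so use a fresh loop in a worker thread.
    with ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(lambda: asyncio.run(coro_factory())).result()
```

Note that when called from inside a running loop, this blocks that loop's thread until the worker finishes, which is the trade-off for keeping a synchronous API.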
* test: add tests for async memory updater
- Add test_async_update_memory_uses_ainvoke() to verify async path
- Convert existing tests to use AsyncMock and ainvoke assertions
- Add test_sync_update_memory_wrapper_works_in_running_loop()
- Update all model mocks to use async await patterns
Addresses #2131
* fix: apply ruff formatting to memory updater
- Format multi-line expressions to single line
- Ensure code style consistency with project standards
- Fix lint issues caught by GitHub Actions
* test: add comprehensive tests for async memory updater
- Add test_async_update_memory_uses_ainvoke() to verify async path
- Convert existing tests to use AsyncMock and ainvoke assertions
- Add test_sync_update_memory_wrapper_works_in_running_loop()
- Update all model mocks to use async await patterns
- Ensure backward compatibility with sync API
* fix: satisfy ruff formatting in memory updater test
---------
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
* fix uploads for mounted sandbox providers
* Potential fix for pull request finding
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
---------
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
* fix: resolve Windows pnpm detection in check script
* style: format check script regression test
* Potential fix for pull request finding
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
* fix: resolve corepack fallback on windows
---------
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>