deer-flow

mirror of https://github.com/bytedance/deer-flow.git synced 2026-07-19 04:27:55 +00:00

Author	SHA1	Message	Date
KiteEater	2b5bece744	fix(harness): reset local sandbox singleton with provider lifecycle (#2834) * Fix local sandbox singleton reset on provider lifecycle * Fix local sandbox singleton reset on provider reset --------- Co-authored-by: Willem Jiang <willem.jiang@gmail.com>	2026-05-11 07:42:15 +08:00
YuJitang	e82b2fb4d0	docs: clarify token usage accounting semantics (#2845)	2026-05-11 07:17:49 +08:00
Maz Benoscar	30a5846219	fix(tools): make write_file append discoverable in model-facing schema (#2843) * fix: make tool argument behavior discoverable The write_file tool already supported append=false by default with append=true for end-of-file writes, but the parsed docstring did not describe append in the model-facing schema. This records the overwrite default and append path in the tool description, adds resilient schema regression coverage, and keeps backend sandbox docs aligned. The regression now also checks that every public parameter in the existing tool schema test matrix has a description. Enabling docstring parsing on setup_agent and update_agent fills the two existing gaps with their existing Args docs instead of duplicating descriptions elsewhere. Constraint: Issue #2831 asks for a small docstring/schema discoverability fix without changing runtime file-writing behavior Rejected: Changing write_file defaults | would alter existing overwrite semantics and broaden the fix beyond schema discoverability Rejected: Exact phrase assertions | too brittle for future docstring rewording while testing the same behavior Confidence: high Scope-risk: narrow Directive: Keep model-facing tool parameters documented through parsed docstrings or equivalent schema descriptions Tested: cd backend && uv run pytest tests/test_setup_agent_tool.py tests/test_update_agent_tool.py tests/test_tool_args_schema_no_pydantic_warning.py tests/test_sandbox_tools_security.py::test_str_replace_and_append_on_same_path_should_preserve_both_updates -q Tested: cd backend && uv run ruff check packages/harness/deerflow/sandbox/tools.py packages/harness/deerflow/tools/builtins/setup_agent_tool.py packages/harness/deerflow/tools/builtins/update_agent_tool.py tests/test_tool_args_schema_no_pydantic_warning.py Not-tested: Full backend test suite Co-authored-by: OmX <omx@oh-my-codex.dev> * Fix the lint error --------- Co-authored-by: OmX <omx@oh-my-codex.dev> Co-authored-by: Willem Jiang <willem.jiang@gmail.com>	2026-05-10 23:09:03 +08:00
YuJitang	9892a7d468	fix: bucket subagent token usage into parent run totals (#2838 ) * fix: bucket subagent token usage into RunRow.subagent_tokens Add caller-bucketed token tracking to RunJournal so subagent and middleware LLM calls are written to the correct RunRow columns instead of all falling into lead_agent_tokens (default 0). - RunJournal: accumulate _lead_agent_tokens / _subagent_tokens / _middleware_tokens in on_llm_end, deduped by langchain run_id. Add record_external_llm_usage_records() for external sources (respects track_token_usage flag). Return caller buckets from get_completion_data(). - SubagentTokenCollector: new lightweight callback handler that collects LLM usage within subagent execution. - SubagentExecutor: wire collector into subagent run_config and sync records to SubagentResult on every chunk (timeout/cancel safe). - SubagentResult: add token_usage_records and usage_reported fields. - task_tool: report subagent usage to parent RunJournal on every terminal status (COMPLETED/FAILED/CANCELLED/TIMED_OUT), including the CancelledError path, guarded against double-reporting. No DB migration needed — RunRow columns already exist. * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * fix: address token usage review feedback * Address review follow-ups --------- Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>	2026-05-10 22:47:30 +08:00
Xinmin Zeng	94da8f67d7	fix(scripts): preserve uv extras across `make dev` restarts (#2754) (#2767) `make dev` ran `uv sync` unconditionally on every restart, wiping any optional extras the user had installed manually with `uv sync --all-packages --extra postgres`. The Docker image-build path already solved this via the `UV_EXTRAS` build-arg in backend/Dockerfile; the local serve.sh path and the docker-compose-dev startup command were the remaining outliers. `scripts/serve.sh` now resolves extras before `uv sync`: 1. honors `UV_EXTRAS` (parity with backend/Dockerfile and docker/docker-compose.yaml — no new convention introduced); 2. falls back to parsing config.yaml — `database.backend: postgres` or legacy `checkpointer.type: postgres` auto-pins `--extra postgres`, so the common case needs zero extra config. 3. detector stderr is no longer suppressed, so whitelist warnings or crashes surface to the dev terminal (review feedback). Detection lives in `scripts/detect_uv_extras.py` (stdlib-only — has to run before the venv exists). Extra names are validated against `^[A-Za-z][A-Za-z0-9_-]*$` so a stray shell metacharacter in `.env` cannot reach `uv sync` downstream (defense in depth). `docker/docker-compose-dev.yaml`'s startup command is now extracted to `docker/dev-entrypoint.sh` (review feedback — the inline command had grown to a ~350-char one-liner). The script: - parses comma/whitespace-separated UV_EXTRAS, applying the same `^[A-Za-z][A-Za-z0-9_-]*$` whitelist as the local detector; - emits one `--extra X` flag per token, so `UV_EXTRAS=postgres,ollama` works in Docker dev too (harmonized with local — review feedback); - calls `uv sync --all-packages` (PR #2584) so workspace member extras (deerflow-harness's postgres extra) are installed; - keeps the existing self-heal `(uv sync || (recreate venv && retry))` branch; - exposes `--print-extras` for dry-run testing. The compose file mounts the script read-only at runtime, so script edits take effect on `make docker-restart` without an image rebuild. The `--no-sync` alternative (a separate suggestion in the issue thread) was considered but rejected for dev paths because it would drop the self-heal branch and the auto-pickup of new pyproject deps. `--no-sync` is already in use for the production CMD (`backend/Dockerfile:101`) where it's appropriate. Updates the asyncpg-missing error message to include the `--all-packages` flag (matching #2584) plus the persistent install flow, and expands `config.example.yaml` so all three install paths (local / docker dev / docker image build) are documented with their multi-extra capabilities. Tests: - `tests/test_detect_uv_extras.py` (21 tests) — local-path env parsing, YAML edge cases, env-vs-config precedence, whitelist rejection of shell metacharacters. - `tests/test_dev_entrypoint.py` (15 tests) — docker-path validation via `--print-extras`, multi-extra parsing, metacharacter abort. - `tests/test_persistence_scaffold.py` (22 tests, unchanged) — passes with the merged `--all-packages --extra postgres` error message. Co-authored-by: Willem Jiang <willem.jiang@gmail.com>	2026-05-10 22:28:29 +08:00
YuJitang	5127f08e1a	enable token usage by default (#2841)	2026-05-10 22:00:57 +08:00
DanielWalnut	dfa4eb0c1a	[codex] fix follow-up suggestions layout (#2836) * fix follow-up suggestions layout * fix agent chat welcome layout transition --------- Co-authored-by: Willem Jiang <willem.jiang@gmail.com>	2026-05-10 15:10:44 +08:00
DanielWalnut	08ee7adeba	fix(lint): remove duplicate is_dynamic_context_reminder definition (#2837) Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-09 23:40:46 +08:00
Eilen Shin	1c96a6afc8	fix: keep new agent bootstrap in user scope (#2784)	2026-05-09 19:43:50 +08:00
YuJitang	417416087b	fix: use backend thread token usage for header total (#2800) * fix: use backend thread token usage for header total * Refactor thread token usage fetch	2026-05-09 19:40:32 +08:00
DanielWalnut	881ff71252	fix(harness): preserve dynamic context across summarization (#2823)	2026-05-09 19:39:36 +08:00
DanielWalnut	f76e4e35c8	fix title generation with dynamic context reminder (#2830)	2026-05-09 18:22:58 +08:00
yangyufan	0d1053ca44	fix(uploads): add Windows support for safe symlink-protected uploads (#2794) * fix(uploads): add Windows support for safe symlink-protected uploads * fix(uploads): update tests and translate comments;	2026-05-09 18:21:54 +08:00
He Wang	4063dd7157	feat(debug): print presented file paths with physical resolution (#2825) Surface artifacts produced via the present_files tool in the CLI debug REPL so headless clients without a frontend (VS Code launch configs, etc.) can locate output files. Each turn prints newly added artifacts plus their resolved host path. Works for any source that goes through present_files — ACP agents, subagents, or sandbox writes. Co-authored-by: Claude Opus 4 <noreply@anthropic.com>	2026-05-09 18:21:01 +08:00
ChenglongZ	7a3c58a733	Fix duplicate gateway upload filenames (#2789)	2026-05-09 18:02:40 +08:00
dependabot[bot]	1edc9d9fae	chore(deps): bump langchain-core from 1.3.2 to 1.3.3 in /backend (#2807 ) Bumps [langchain-core](https://github.com/langchain-ai/langchain) from 1.3.2 to 1.3.3. - [Release notes](https://github.com/langchain-ai/langchain/releases) - [Commits](https://github.com/langchain-ai/langchain/compare/langchain-core==1.3.2...langchain-core==1.3.3) --- updated-dependencies: - dependency-name: langchain-core dependency-version: 1.3.3 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-05-09 15:51:18 +08:00
KiteEater	7caf03e97c	fix(packaging): add postgres extra for store/checkpointer supportFix postgres extra install guidance (#2584 ) * Fix postgres extra install guidance * Fix postgres install message lint * Format postgres install messages * Fix postgres install guidance and config docs	2026-05-09 09:49:08 +08:00
dependabot[bot]	41b04a556f	chore(deps): bump uuid from 10.0.0 to 14.0.0 in /frontend (#2802 ) Bumps [uuid](https://github.com/uuidjs/uuid) from 10.0.0 to 14.0.0. - [Release notes](https://github.com/uuidjs/uuid/releases) - [Changelog](https://github.com/uuidjs/uuid/blob/main/CHANGELOG.md) - [Commits](https://github.com/uuidjs/uuid/compare/v10.0.0...v14.0.0) --- updated-dependencies: - dependency-name: uuid dependency-version: 14.0.0 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-05-09 09:33:00 +08:00
DanielWalnut	c1b7f1d189	feat: static system prompt with DynamicContextMiddleware for prefix-cache optimization (#2801) * feat(middleware): inject dynamic context via DynamicContextMiddleware Move memory and current date out of the system prompt and into a dedicated <system-reminder> HumanMessage injected once per session (frozen-snapshot pattern) via a new DynamicContextMiddleware. This keeps the system prompt byte-exact across all users and sessions, enabling maximum Anthropic/Bedrock prefix-cache reuse. Key design decisions: - ID-swap technique: reminder takes the first HumanMessage's ID (replacing it in-place via add_messages), original content gets a derived `{id}__user` ID (appended after). Preserves correct ordering. - hide_from_ui: True on reminder messages so frontend filters them out. - Midnight crossing: date-update reminder injected before the current turn's HumanMessage when the conversation spans midnight. - INFO-level logging for production diagnostics. Also adds prompt-caching breakpoint budget enforcement tests and updates ClaudeChatModel docs to reference the new pattern. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(token-usage): log input/output token detail breakdown in middleware Extend the LLM token usage log line to include input_token_details and output_token_details (cache_creation, cache_read, reasoning, audio, etc.) when present. Adds tests covering Anthropic cache detail logging from both usage_metadata and response_metadata. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: fix nginx * fix(middleware): always inject date; gate memory on injection_enabled Date injection is now unconditional — it is part of the static system prompt replacement and should always be present. Memory injection remains gated by `memory.injection_enabled` in the app config. Previously the entire DynamicContextMiddleware was skipped when injection_enabled was False, which also suppressed the date. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(lint): format files and correct test assertions for token usage middleware - ruff format dynamic_context_middleware.py and test_claude_provider_prompt_caching.py - Remove unused pytest import from test_dynamic_context_middleware.py - Fix two tests that asserted response_metadata fallback logic that doesn't exist: replace with tests that match actual middleware behavior Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(middleware): address Copilot review comments on DynamicContextMiddleware - Use additional_kwargs flag for reminder detection instead of content substring matching, so user messages containing '<system-reminder>' are not mistakenly treated as injected reminders - Generate stable UUID when original HumanMessage.id is None to prevent ambiguous 'None__user' derived IDs and message collisions - Downgrade per-turn no-op log to DEBUG; keep actual injection events at INFO - Add two new tests: missing-id UUID fallback and user-text false-positive Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-09 09:27:02 +08:00
dependabot[bot]	109490da25	chore(deps): bump python-multipart from 0.0.26 to 0.0.27 in /backend (#2799 ) Bumps [python-multipart](https://github.com/Kludex/python-multipart) from 0.0.26 to 0.0.27. - [Release notes](https://github.com/Kludex/python-multipart/releases) - [Changelog](https://github.com/Kludex/python-multipart/blob/main/CHANGELOG.md) - [Commits](https://github.com/Kludex/python-multipart/compare/0.0.26...0.0.27) --- updated-dependencies: - dependency-name: python-multipart dependency-version: 0.0.27 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-05-08 22:58:15 +08:00
dependabot[bot]	14c0a32ee6	chore(deps): bump mako from 1.3.11 to 1.3.12 in /backend (#2798 ) Bumps [mako](https://github.com/sqlalchemy/mako) from 1.3.11 to 1.3.12. - [Release notes](https://github.com/sqlalchemy/mako/releases) - [Changelog](https://github.com/sqlalchemy/mako/blob/main/CHANGES) - [Commits](https://github.com/sqlalchemy/mako/commits) --- updated-dependencies: - dependency-name: mako dependency-version: 1.3.12 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-05-08 22:57:48 +08:00
Willem Jiang	70737af7cd	fix(nignx):resolve CSRF auth failure on non-standard ports (#2796)	2026-05-08 22:40:38 +08:00
DanielWalnut	2b1fcb3e43	fix(task): remove max_turns parameter from task tool interface (#2783 ) * fix(task): remove max_turns parameter from task tool interface Subagents should always use their configured max_turns value. Exposing this parameter allowed callers to override the admin-configured limit, which is undesirable. The value is now exclusively driven by subagent config (per-agent overrides and global defaults in config.yaml). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: Willem Jiang <willem.jiang@gmail.com> Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>	2026-05-08 15:05:24 +08:00
He Wang	7de9b5828b	fix(tools): introduce Runtime type alias to eliminate Pydantic serialization warning (#2774) * fix(tools): introduce Runtime type alias to eliminate Pydantic serialization warning Add deerflow/tools/types.py with: Runtime = ToolRuntime[dict[str, Any], ThreadState] Replace every runtime: ToolRuntime[ContextT, ThreadState] and runtime: ToolRuntime[dict[str, Any], ThreadState] annotation in sandbox/tools.py, present_file_tool.py, task_tool.py, view_image_tool.py, and skill_manage_tool.py with the new Runtime alias. The unbound ContextT TypeVar (default None) caused PydanticSerializationUnexpectedValue warnings on every tool call because LangChain's BaseTool._parse_input calls model_dump() on the auto-generated args_schema while DeerFlow passes a dict as runtime context. Binding the context to dict[str, Any] aligns Pydantic's serialization expectations with reality and removes the noise from all run modes. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: Cursor <cursoragent@cursor.com> * fix(tools): extend Runtime alias to setup_agent and update_agent tools Replace bare ToolRuntime annotations in setup_agent_tool.py and update_agent_tool.py with the shared Runtime alias introduced in the previous commit, and add both tools to the Pydantic serialization warning regression test (13 cases total). Co-authored-by: Cursor <cursoragent@cursor.com> * test(tools): loosen Pydantic warning filter to avoid version-specific format Replace the brittle "field_name='context'" substring check with a looser "context" match so the assertion stays valid if Pydantic changes its internal warning format across versions. Co-authored-by: Cursor <cursoragent@cursor.com> * test(tools): simplify warning filter and clean up docstring Remove the "context" substring condition from the Pydantic warning filter — asserting that no PydanticSerializationUnexpectedValue fires at all is both simpler and more comprehensive, since the test payload contains only the tool's own args plus runtime. Also update the module docstring to remove the version-specific warning format example that was inconsistent with the looser filter. Co-authored-by: Cursor <cursoragent@cursor.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-08 14:50:33 +08:00
Eilen Shin	37db689349	fix(events): serialize structured db event content (#2762)	2026-05-08 10:17:17 +08:00
Eilen Shin	bd45cb2846	fix(sandbox): disable msys path conversion (#2766)	2026-05-08 10:13:11 +08:00
Eilen Shin	5fd0e6ac89	fix(middleware): sync raw tool call metadata (#2757)	2026-05-08 10:08:53 +08:00
YuJitang	530bda7107	fix: dedupe token usage aggregation by message id (#2770)	2026-05-08 09:54:20 +08:00
Willem Jiang	6c220a9aef	fix(chat): prevent first user message from being swallowed in new conversations (#2731 ) * fix(chat): prevent first user message from being swallowed in new conversations The optimistic message clearing effect cleared too eagerly — any stream message (including AI messages from messages-tuple events) triggered the clear before the server's human message had arrived via values events. For new threads this caused the user's first prompt to disappear permanently. Only clear optimistic messages once the server's human message has been confirmed to arrive in thread.messages, not just when any message arrives. Fixes #2730 * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>	2026-05-07 17:31:48 +08:00
Tao Liu	daa3ffc29b	feat(loop-detection): make loop detection configurable with per-tool frequency overrides (#2711) * Make loop detection configurable Expose LoopDetectionMiddleware thresholds through config.yaml while preserving existing defaults and allowing the middleware to be disabled. Refs bytedance/deer-flow#2517 * feat(loop-detection): add per-tool tool_freq_overrides to Phase 1 Adds ToolFreqOverride model and tool_freq_overrides field to LoopDetectionConfig, wires it through LoopDetectionMiddleware, and documents the option in config.example.yaml. Resolves the gap flagged in the #2586 review: without per-tool overrides, users hit by #2510/#2511 (RNA-seq workflows exceeding the bash hard limit) had no way to raise thresholds for one tool without loosening the global limit for every tool. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * docs(loop-detection): document tool_freq_overrides in LoopDetectionMiddleware docstring Add the missing Args entry for tool_freq_overrides, explaining the (warn, hard_limit) tuple structure and how per-tool thresholds supersede the global tool_freq_warn / tool_freq_hard_limit for named tools. Also run ruff format on the three files flagged by the lint check. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(loop-detection): validate LoopDetectionMiddleware __init__ params eagerly Raise clear ValueError at construction time instead of crashing at unpack-time inside _track_and_check when bad values are passed: - tool_freq_overrides: must be 2-tuples of positive ints with hard_limit >= warn - scalar thresholds: warn_threshold, hard_limit, tool_freq_warn, tool_freq_hard_limit must be >= 1 and hard limits must >= their warn pairs - window_size, max_tracked_threads must be >= 1 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(test): isolate credential loader directory-path test from real ~/.claude The test didn't monkeypatch HOME, so on any machine with real Claude Code credentials at ~/.claude/.credentials.json the function fell through to those credentials and the assertion failed. Adding HOME redirect ensures the default credential path doesn't exist during the test. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * style(test): add blank lines after import pytest in TestInitValidation Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * refactor(loop-detection): collapse dual validation to LoopDetectionConfig Modifications - LoopDetectionMiddleware.__init__: stripped of all ValueError raises; becomes a plain field-assignment constructor. - LoopDetectionMiddleware.from_config: classmethod that builds the middleware from a Pydantic-validated LoopDetectionConfig and handles the ToolFreqOverride -> tuple[int, int] conversion. - agents/factory.py: SDK construction routed through LoopDetectionMiddleware.from_config(LoopDetectionConfig()) so the defaults path is Pydantic-validated too. - agents/lead_agent/agent.py: uses from_config instead of unpacking config fields by hand. - tests/test_loop_detection_middleware.py: deleted TestInitValidation (16 methods exercising the removed __init__ checks); added TestFromConfig (4 tests: scalar field mapping, override tuple conversion, empty overrides, behavioral smoke test). Result: one validation layer (Pydantic), zero duplication, no __new__ hacks. Both production construction sites flow through LoopDetectionConfig. Test results make test -> 2977 passed, 18 skipped, 0 failed (137s) make format -> All checks passed; 411 files left unchanged * feat(agents): make loop_detection configurable in create_deerflow_agent Adds a `loop_detection: bool | AgentMiddleware = True` field to RuntimeFeatures, mirroring the existing pattern used by `sandbox`, `memory`, and `vision`. SDK users can now disable LoopDetectionMiddleware or replace it with a custom instance built from their own LoopDetectionConfig — e.g. `LoopDetectionMiddleware.from_config(my_cfg)` — instead of being stuck with the hardcoded defaults previously installed by the SDK factory. The lead-agent path (which already reads AppConfig.loop_detection) is unchanged, and the default `True` preserves prior always-on behavior for all existing callers. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: knight0940 <631532668@qq.com> Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> Co-authored-by: Amorend <142649913+knight0940@users.noreply.github.com> Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> Co-authored-by: Willem Jiang <willem.jiang@gmail.com>	2026-05-07 16:15:15 +08:00
Xinmin Zeng	27559f3675	fix(frontend): defer thread id to onStart to avoid 404 on new chat (#2749) * fix(frontend): defer thread id to onStart to avoid 404 on new chat The LangGraph SDK's useStream eagerly fetches /threads/{id}/history the moment it receives a thread id, and the local useThreadRuns issues GET /threads/{id}/runs for the same reason. The chats page used to flip isNewThread=false (and forward the client-generated thread id) inside the synchronous onSend callback, before thread.submit had created the thread on the backend. The two queries therefore raced ahead of POST /runs/stream and returned 404 on the very first send. Drop the onSend handler so isNewThread stays true until onStart fires from useStream's onCreated — by then the backend has the thread, and the SDK's submittingRef guard naturally suppresses the redundant history fetch. The agent chat page already uses this pattern, so this also unifies the two flows. Adds an E2E regression that records request ordering and asserts GET /history and GET /runs are never issued before POST /runs/stream on the first send from /chats/new. Closes #2746 * fix(frontend): split welcome layout from backend thread state Removing onSend kept GET /history and GET /runs from racing ahead of POST /runs/stream, but it also coupled the welcome layout (centered input, hero, quick actions) to backend thread creation. Until onCreated returned, the user's optimistic message and the welcome hero rendered on top of each other. Introduce a dedicated `isWelcomeMode` UI flag, separate from `isNewThread`: - `isNewThread` still tracks "backend has no thread yet" and gates the thread id forwarded to useStream. - `isWelcomeMode` drives the visual layout (header background, input box position, max width, hero, quick actions, autoFocus) and flips to false inside onSend so the layout animates immediately. `isWelcomeMode` is kept in sync with `isNewThread` via an effect so sidebar navigation and "new chat" still behave correctly. All 15 E2E tests pass, including the ordering regression added in the previous commit. * test(e2e): use monotonic sequence for thread-init ordering check Date.now() is millisecond-resolution, so two requests emitted within the same tick would share a timestamp and slip past the strict `<` ordering assertions. Replace the timestamp with a monotonic counter that increments on every observed request/requestfinished event so the ordering check is robust regardless of scheduling. Per PR #2749 review feedback from copilot-pull-request-reviewer. * refactor(input-box): rename isNewThread prop to isWelcomeMode Inside InputBox, the prop named `isNewThread` is only ever consulted for visual layout decisions — gating follow-up suggestions, the bottom background strip, and the welcome-mode quick-action SuggestionList. It never reflects "the backend has created the thread", which after #2746 is tracked separately via `isNewThread` in the chat pages themselves. Rename the prop to `isWelcomeMode` and update both call sites (workspace chats page and agent chats page) so the prop name matches its actual semantics. No behavior change. Per PR #2749 review feedback from @WillemJiang.	2026-05-07 16:11:44 +08:00
AochenShen99	cef4224381	fix(skills): enforce allowed-tools metadata (#2626) * fix(skills): parse allowed-tools frontmatter * fix(skills): validate allowed-tools metadata * fix(skills): add shared allowed-tools policy * fix(subagents): enforce skill allowed-tools * fix(agent): enforce skill allowed-tools * refactor(skills): dedupe TypeVar and reuse cached enabled skills - Drop redundant module-level TypeVar in tool_policy; rely on PEP 695 syntax. - Expose get_cached_enabled_skills() and have the lead agent reuse it instead of synchronously rescanning skills on every request. * fix(agent): expose config-scoped skill cache * fix(subagents): pass filtered tools explicitly * fix(skills): clean allowed-tools policy feedback	2026-05-07 08:34:43 +08:00
Hinotobi	2b0e62f679	[security] fix(auth): reject cross-site auth POSTs (#2740) * fix(security): reject cross-site auth posts * fix(auth): align secure cookie proxy scheme handling --------- Co-authored-by: Willem Jiang <willem.jiang@gmail.com>	2026-05-07 07:58:06 +08:00
Eilen Shin	1336872b15	fix(channels): authenticate gateway command requests (#2742)	2026-05-06 15:27:34 +08:00
KiteEater	4ead2c6b19	fix(config): reset config-backed singletons on hot reload (#2588) * Fix stale config singletons on reload * fix(config): update checkpointer imports after runtime move * Fix config reload singleton mutation on validation failure --------- Co-authored-by: Willem Jiang <willem.jiang@gmail.com>	2026-05-06 10:17:55 +08:00
yangzheli	59c4a3f0a4	feat(agent): add custom-agent self-updates with user isolation (#2713) * feat(agent): add update_agent tool for in-chat custom-agent self-updates (#2616) Custom agents had no built-in way to persist updates to their own SOUL.md / config.yaml from a normal chat — `setup_agent` was only bound during the bootstrap flow, so when the user asked the agent to refine its description or personality, the agent would shell out via bash/write_file and the edits landed in a temporary sandbox/tool workspace instead of `{base_dir}/agents/{agent_name}/`. Changes: - New `update_agent` builtin tool with partial-update semantics (only the fields you pass are written) and atomic temp-file + os.replace writes so a failed update never corrupts existing SOUL.md / config.yaml. - Lead agent now binds `update_agent` in the non-bootstrap path whenever `agent_name` is set in the runtime context. Default agent (no agent_name) and bootstrap flow are unchanged. - New `<self_update>` system-prompt section is injected for custom agents, instructing them to use `update_agent` — and explicitly NOT bash / write_file — to persist self-updates. - Tests: 11 new cases in `tests/test_update_agent_tool.py` covering validation (missing/invalid agent_name, unknown agent, no fields), partial updates (soul-only, description-only, skills=[] vs omitted), no-op detection, atomic-write safety, and AgentConfig round-tripping; plus 2 new cases in `tests/test_lead_agent_prompt.py` covering the self-update prompt section. - Docs: updated backend/CLAUDE.md builtin tools list and tools.mdx (en/zh) with the new tool description. * feat(agent): isolate custom agents per user Store custom agent definitions under the effective user, keep legacy agents readable until migration, and cover API/tool/migration behavior with tests. Co-authored-by: Cursor <cursoragent@cursor.com> * feat: consistent write/delete targets & add --user-id to migration --------- Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-05 23:17:42 +08:00
Nan Gao	e8675f266d	fix(loop-detection): keep tool-call pairing on warn injection (#2724) (#2725) * fix(loop-detection): keep tool-call pairing on warn injection (#2724) * make format * fix(loop-detection): avoid IMMessage leak to downstream consumer * fix(channels): filter loop warning text from IM replies	2026-05-05 18:53:49 +08:00
Xun	680187ddc2	fix: Supplement list_running in RemoteSandboxBackend (#2716) * fix: Supplement list_running in RemoteSandboxBackend * fix * except requests.RequestException as exc: * fix	2026-05-05 18:53:10 +08:00
Xinmin Zeng	aded753de3	fix(frontend): restore localhost fallback for getGatewayConfig in prod mode (#2705 ) (#2718 ) * fix(frontend): unify gateway-config localhost fallback for prod (#2705) `getGatewayConfig()` only fell back to localhost defaults when `NODE_ENV === "development"`, while `next.config.js` always falls back to `127.0.0.1:8001`. Running `make start` (which sets NODE_ENV=production via `next start`) without `DEER_FLOW_INTERNAL_GATEWAY_BASE_URL` / `DEER_FLOW_TRUSTED_ORIGINS` therefore caused zod to throw inside SSR layouts and surfaced as a 500. Drop the NODE_ENV gating and use localhost defaults everywhere — the "force explicit config in prod" intent should be enforced by deployment templates (docker-compose already sets both vars), not by request-time crashes. Document the two vars in both .env.example files and add unit coverage for the dev/prod env-unset paths. * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * Update internalGatewayUrl in gateway config tests --------- Co-authored-by: Willem Jiang <willem.jiang@gmail.com> Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>	2026-05-05 16:27:29 +08:00
Willem Jiang	028493bfd8	fix(docker):force ngix to resolve upstream names at request time (#2717 ) * fix(docker):force ngix to resolve upstream names at request time * fix(docker): set resolver valid=0s to eliminate DNS cache window for request-time re-resolution Agent-Logs-Url: https://github.com/bytedance/deer-flow/sessions/07bdb872-022f-4fd2-9fa8-d800a4ce34a7 Co-authored-by: WillemJiang <219644+WillemJiang@users.noreply.github.com> * Update DNS resolver valid time and add upstreams * fix the unit test error * Remove upstream server configurations from nginx.conf Removed upstream server configurations for gateway and frontend. --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>	2026-05-05 14:35:55 +08:00
Willem Jiang	8e48b7e85c	fix(channels): preserve clarification conversation history across follow-up turns (#2444 ) * fix(channels): preserve clarification conversation history across follow-up turns Pin channel-triggered runs to the root checkpoint namespace and ensure thread_id is always present in configurable run config so follow-up replies resume the same conversation state. Add regression coverage to channel tests: assert checkpoint_ns/thread_id are passed in wait and stream paths add an integration-style clarification flow test that verifies the second user reply continues prior context instead of starting a new session This addresses history loss after ask_clarification interruptions (issue #2425). * Apply suggestions from code review Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * fix(channels): copy configurable dict before injecting run-scoped fields When configurable was already a plain dict, _resolve_run_params mutated it in place, leaking checkpoint_ns and thread_id back into the shared session config. Always copy via dict() before mutating to prevent cross-user or cross-channel config pollution. --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2026-05-04 16:14:07 +08:00
Willem Jiang	af6e48ccaa	fix(i18n): add Chinese translations for account settings page (#2712 ) The account settings page had all user-facing strings (profile labels, password form placeholders, validation messages, button text) hardcoded in English. Replace them with i18n translation keys so the page renders correctly when the locale is set to Chinese. Fixed #2710 v2.0-m1-rc0	2026-05-04 11:15:16 +08:00
Willem Jiang	b10eb7bafc	feat(github): Added container push workflow (#2709) * feat(github):Added container push workflow * Apply suggestions from code review Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>	2026-05-04 11:14:34 +08:00
YuJitang	d02f762ab0	feat: refine token usage display modes (#2329 ) * feat: refine token usage display modes * docs: clarify token usage accounting semantics * fix: avoid duplicate subtask debug keys * style: format token usage tests * chore: address token attribution review feedback * Update test_token_usage_middleware.py * Update test_token_usage_middleware.py * chore: simplify token attribution fallback * fix token usage metadata follow-up handling --------- Co-authored-by: Willem Jiang <willem.jiang@gmail.com>	2026-05-04 09:56:16 +08:00
Willem Jiang	82e7936d36	fix(docker): set UTF-8 locale to prevent ASCII encoding errors in minimal containers (#2707) * fix(docker): set UTF-8 locale to prevent ASCII encoding errors in minimal containers * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>	2026-05-04 09:41:10 +08:00
Nan Gao	222a7773cb	fix(frontend): avoid misleading error message when agent api is disable (#2697) (#2698)	2026-05-04 09:38:05 +08:00
Nan Gao	f80ac961ec	fix(harness): restore legacy skills path fallback (#2694) (#2696) * fix(harness): restore legacy skills path fallback (#2694) * fix(format): make format * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>	2026-05-03 23:40:59 +08:00
wanxsb	44ab21fc44	feat(community): add Serper web search provider (#2630 ) * feat(community): add Serper web search provider Add a new community search provider backed by the Serper Google Search API (https://serper.dev). Serper returns real-time Google results via a simple JSON API and requires only an API key — no extra Python package. Changes: - backend/packages/harness/deerflow/community/serper/__init__.py - backend/packages/harness/deerflow/community/serper/tools.py Implements web_search_tool using httpx (already a project dependency). API key is read from config.yaml `api_key` field or SERPER_API_KEY env var. Follows the same interface / output shape as the existing ddg_search provider. Exposes max_results parameter (default 5) with config override logic. - backend/tests/test_serper_tools.py Unit tests covering API key resolution, config overrides, HTTP errors, empty results, and parameter passing. - config.example.yaml: add commented-out Serper example alongside other providers - .env.example: add SERPER_API_KEY placeholder Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Fix the lint error * Fix the lint error --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: Willem Jiang <willem.jiang@gmail.com>	2026-05-02 16:22:35 +08:00
Hinotobi	e543bbf5d6	[security] fix(upload): reject symlinked upload destinations (#2623) * fix: reject symlinked upload destinations * test: harden upload destination checks * fix: address PR feedback for #2623 * test: cover safe upload re-uploads * fix: preserve upload limit checks after rebase * fix(upload): stream safe HTTP upload writes	2026-05-02 15:19:28 +08:00
Xinmin Zeng	ca3332f8bf	fix(gateway): return ISO 8601 timestamps from threads endpoints (#2599 ) * fix(gateway): return ISO 8601 timestamps from threads endpoints (#2594) ThreadResponse documents created_at / updated_at as ISO timestamps, matching the LangGraph Platform schema (langgraph_sdk.schema.Thread exposes them as datetime, JSON-encoded as ISO 8601). The gateway threads router was instead emitting str(time.time()) — unix-second floats — breaking frontend new Date() parsing and producing a mixed ISO/unix wire format that also corrupted the search sort order. Centralize timestamp generation in deerflow.utils.time: - now_iso() — datetime.now(UTC).isoformat() - coerce_iso(x) — heals legacy unix-timestamp strings on read so the store converges to ISO without a one-shot migration threads.py: replace 6 time.time() call sites with now_iso(); wrap all read paths and Phase-2 checkpoint metadata with coerce_iso(); _store_upsert opportunistically heals legacy created_at on update; drop unused time import. thread_runs.py: reuse now_iso() instead of a private duplicate _now_iso(), preventing future drift between the two timestamp call sites. Tests: 9 unit tests for the helper; 5 integration tests pinning the ISO contract for create/get/patch/search and the legacy-healing path on the internal store upsert. Full suite: 2144 passed, 15 skipped, 0 failed. Closes #2594 * fix(gateway): coerce checkpoint metadata timestamps to ISO on read After the merge with main, three additional read paths in ``threads.py`` were still emitting raw ``str(metadata.get("created_at", ""))`` — ``get_thread_state``, ``update_thread_state``, and ``get_thread_history``. Same root cause as #2594: when the checkpoint metadata's ``created_at`` is a unix-second float (legacy data, or a checkpoint written by an older Gateway version), ``str(float)`` produces ``"1777252410.411327"`` and the frontend's ``new Date(...)`` returns ``Invalid Date``. 
The fix on the ``/threads/{id}`` GET path was already in place; these three sibling endpoints needed the same treatment. All four call sites now flow through ``coerce_iso``, so: - legacy float metadata heals to ISO on the way out, - ISO metadata passes through unchanged, - ``datetime`` instances (which the new ``coerce_iso`` branch handles explicitly) emit with the ``T`` separator instead of falling through to the space-separated ``str(datetime)`` form. Coverage added for the two endpoints not already pinned by the merge: - ``test_get_thread_state_returns_iso_for_legacy_checkpoint_metadata`` - ``test_get_thread_history_returns_iso_for_legacy_checkpoint_metadata`` Both pre-seed a checkpoint whose metadata carries the literal float from the issue body and assert the wire format is ISO.	2026-05-02 15:16:16 +08:00
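
The timestamp handling described in commit ca3332f8bf can be illustrated with a short sketch. The names `now_iso` and `coerce_iso` come from the commit message; the function bodies below are assumptions written for illustration, not the code in `deerflow.utils.time`, and the empty-value handling in particular is a guess.

```python
from datetime import datetime, timezone
import math


def now_iso() -> str:
    """Current UTC time as an ISO 8601 string (per the commit: datetime.now(UTC).isoformat())."""
    return datetime.now(timezone.utc).isoformat()


def coerce_iso(value) -> str:
    """Best-effort conversion of a stored timestamp to ISO 8601.

    Hypothetical sketch: the real helper may treat edge cases differently.
    """
    # Assumption: missing values become an empty string.
    if value is None or value == "":
        return ""
    # datetime instances are emitted with the "T" separator, not str(datetime).
    if isinstance(value, datetime):
        return value.isoformat()
    text = str(value)
    try:
        seconds = float(text)
    except ValueError:
        # Not numeric, so assume it is already an ISO string and pass it through.
        return text
    if not math.isfinite(seconds):
        # "nan" / "inf" parse as floats but are not timestamps; leave them untouched.
        return text
    # Legacy unix-second value (float or numeric string): heal to ISO in UTC.
    return datetime.fromtimestamp(seconds, tz=timezone.utc).isoformat()
```

Under these assumptions, a read path would wrap the stored value, for example `coerce_iso(metadata.get("created_at", ""))`, so legacy float strings such as `"1777252410.411327"` reach the frontend in a form that `new Date(...)` can parse, while values that are already ISO are returned unchanged.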

2083 Commits