deer-flow

mirror of https://github.com/bytedance/deer-flow.git synced 2026-05-30 12:28:10 +00:00

Author	SHA1	Message	Date
Xinmin Zeng	0287240728	fix(frontend): show new thread in sidebar immediately on creation (#3276) (#3283) When a user starts a new conversation, the sidebar list did not display it until the AI finished streaming and generated a title. This made it impossible to switch back to an in-progress conversation when working with multiple threads concurrently. Optimistically insert the new thread into the TanStack Query cache during the `onCreated` callback so the sidebar renders a placeholder entry ("New chat") as soon as the backend acknowledges thread creation. The existing `onUpdateEvent` title handler and `onFinish` query invalidation then update the entry in-place with the real title.	2026-05-28 15:27:38 +08:00
dependabot[bot]	9e332c594a	chore(deps): bump uuid from 10.0.0 to 14.0.0 in /frontend (#3281) Bumps [uuid](https://github.com/uuidjs/uuid) from 10.0.0 to 14.0.0. - [Release notes](https://github.com/uuidjs/uuid/releases) - [Changelog](https://github.com/uuidjs/uuid/blob/main/CHANGELOG.md) - [Commits](https://github.com/uuidjs/uuid/compare/v10.0.0...v14.0.0) --- updated-dependencies: - dependency-name: uuid dependency-version: 14.0.0 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-05-28 07:14:44 +08:00
Admire	f68bcb771c	fix(frontend): guard message copy clipboard access (#3211) * fix(frontend): guard message copy clipboard access * fix(frontend): reuse clipboard guard across copy actions	2026-05-26 09:37:51 +08:00
AochenShen99	11dd5b0683	fix(frontend): strip unclosed <think> tags from streaming AI content (#3218) * fix(frontend): strip unclosed <think> tags from streaming AI content During streaming, an opening <think> tag may arrive in one chunk while the matching </think> arrives in a later chunk. The existing splitInlineReasoning regex only matched fully closed pairs, so the mid-flight reasoning was left in message.content and rendered into the chat bubble via the markdown pipeline's rehypeRaw plugin until the closing tag landed. Extend splitInlineReasoning with a second pass: after stripping every closed <think>...</think> pair, route any remaining content from a lone opener to the reasoning slot and leave only the preceding preamble in content. Closed-tag behavior is unchanged. Covers every provider whose stream emits reasoning inline as <think> tags (MiniMax streaming path, MindIE, PatchedChatOpenAI, and any gateway-served DeepSeek/OpenAI-compatible model). * style(frontend): apply prettier formatting to streaming reasoning tests * fix(frontend): skip <think> split for literal think tags in inline code Treats a `<think>` opener immediately preceded by a backtick as part of markdown inline code rather than a streaming reasoning marker. Prevents permanent content truncation when an AI message documents the `<think>` tag literally (e.g. ``Use `<think>` markers``), where the streaming-safe fallback would otherwise route the rest of the answer into the reasoning panel because no `</think>` ever arrives. Adds regression tests for both the post-stream and mid-stream cases.	2026-05-26 09:35:07 +08:00
Admire	e7967a7fc3	fix(frontend): hide copy for streaming assistant turn (#3176)	2026-05-23 23:29:16 +08:00
Admire	d0fa37e71d	fix(frontend): avoid duplicate optimistic user message (#3002)	2026-05-23 17:02:23 +08:00
AochenShen99	604fcbb9d2	Stabilize write artifact previews (#3172)	2026-05-23 16:56:14 +08:00
Nan Gao	a64a39dbc0	config: raise default summarization trigger before v2.0-m1 (#3174) * config: update summarization configuration * docs: sync summarization trigger guidance	2026-05-23 15:38:25 +08:00
JeffJiang	b103d1a7f5	feat(frontend): support static website demo mode (#3170) * feat(frontend): support static website demo mode * fix(frontend): render html artifact previews from blob content * chore(frontend): apply pre-commit formatting * fix(frontend): address static demo PR review comments * Update the release information of DeerFlow --------- Co-authored-by: Willem Jiang <willem.jiang@gmail.com>	2026-05-23 00:10:56 +08:00
Nan Gao	914d6a4f1c	docs: add provider safety termination post (#3167)	2026-05-22 21:33:15 +08:00
Nan Gao	253542ea0d	docs: discourage MCP filesystem workspace config (#3141)	2026-05-22 09:19:23 +08:00
Xinmin Zeng	e93f658472	fix(stability): resolve P0 blockers from v2.0-m1-rc1 stability audit (#3107) (#3131) * fix(task-tool): unwrap callback manager when locating usage recorder `config["callbacks"]` may arrive as a `BaseCallbackManager` (e.g. the `AsyncCallbackManager` LangChain hands to async tool runs), not just a plain list. The previous `for cb in callbacks` loop raised `TypeError: 'AsyncCallbackManager' object is not iterable`, which `ToolErrorHandlingMiddleware` then converted into a failed `task` ToolMessage even though the subagent had completed internally — Ultra mode lost subagent results and the lead agent fell back to redoing the work. Unwrap `BaseCallbackManager.handlers` before searching for the recorder. Refs: bytedance/deer-flow#3107 (BUG-002) * fix(frontend): treat any task tool error as a terminal subtask failure The subtask card status machine matched only three English prefixes (`Task Succeeded. Result:`, `Task failed.`, `Task timed out`). Anything else fell through to `in_progress`, so a `task` tool error wrapped by `ToolErrorHandlingMiddleware` (`Error: Tool 'task' failed ...`) left the card spinning forever even after the run had ended. Extract the prefix logic into `parseSubtaskResult` and recognise any leading `Error:` token as a terminal failure. The extracted function is unit-tested against the legacy prefixes plus the `AsyncCallbackManager` regression captured in the upstream issue. Refs: bytedance/deer-flow#3107 (BUG-007) * fix(frontend): exclude hidden, reasoning, and tool payloads from chat export `formatThreadAsMarkdown` / `formatThreadAsJSON` iterated raw messages without running the UI-level `isHiddenFromUIMessage` filter. Exported transcripts therefore included `hide_from_ui` system reminders, memory injections, provider `reasoning_content`, tool calls, and tool result messages — content that is intentionally hidden in the chat view.
Filter the export to the user-visible transcript by default and gate reasoning / tool calls / tool messages / hidden messages behind explicit `ExportOptions` flags so a future debug export can opt back in without forking the formatter. Refs: bytedance/deer-flow#3107 (BUG-006) * fix(gateway): route get_config through get_app_config for mtime hot reload `get_config(request)` returned the `app.state.config` snapshot captured at startup. The worker / lead-agent path then threaded that frozen `AppConfig` through `RunContext` and `agent_factory`, so per-run fields edited in `config.yaml` (notably `max_tokens`) were ignored until the gateway process was restarted — even though `get_app_config()` already does mtime-based reload at the bottom layer. Route the request dependency through `get_app_config()` directly. Runtime `ContextVar` overrides (`push_current_app_config`) and test-injected singletons (`set_app_config`) keep working; `app.state.config` is now only read at startup for one-shot bootstrap (logging level, IM channels, `langgraph_runtime` engines). `tests/test_gateway_deps_config.py` encoded the old snapshot contract and is removed; `tests/test_gateway_config_freshness.py` replaces it with mtime, ContextVar, and `set_app_config` coverage. `test_skills_custom_router.py` and `test_uploads_router.py` now inject test configs via FastAPI `dependency_overrides[get_config]` instead of mutating `app.state.config`. Document the hot-reload boundary in `backend/CLAUDE.md` so reviewers know which fields are picked up on the next request vs. which still require a restart (`database`, `checkpointer`, `run_events`, `stream_bridge`, `sandbox.use`, `log_level`, `channels.`). Refs: bytedance/deer-flow#3107 (BUG-001) fix(gateway): broaden get_config 503 to any config-load failure Address review feedback on the previous commit: 1. Narrow exception catch removed. The old contract returned 503 whenever `app.state.config is None`. 
The first cut only mapped `FileNotFoundError`, leaving `PermissionError`, YAML parse errors, and pydantic `ValidationError` to bubble up as 500. At the request boundary we treat any inability to materialise the config as "configuration not available" (503) and log the original exception so the operator still has the stack. 2. Removed the unused `request: Request` parameter and the matching `# noqa: ARG001`. FastAPI's `Depends()` does not require the dependency to accept `Request`; the only call site uses the no-arg form. 3. `backend/CLAUDE.md` boundary now lists the reason each field is restart-required (engine binding, singleton caching, one-shot `apply_logging_level`, etc.), not just the field name, so reviewers do not have to reverse-engineer the boundary themselves. Tests parametrise four exception classes (`FileNotFoundError`, `PermissionError`, `ValueError`, `RuntimeError`) and assert 503 for each. Refs: bytedance/deer-flow#3107 (BUG-001) * fix(task-tool): defend _find_usage_recorder against non-list callbacks Address review feedback. The previous commit handled the two common shapes LangChain hands to async tool runs — a plain `list[BaseCallbackHandler]` and a `BaseCallbackManager` subclass — but iterated any other shape directly, which would still raise `TypeError` if e.g. a single handler instance leaked through without a list wrapper. Treat any non-list, non-manager `config["callbacks"]` value as "no recorder" rather than crash. Docstring now lists all four shapes explicitly. New tests cover the single-handler-object case, `runtime is None`, `callbacks is None`, and `runtime.config` being a non-dict — all required to be silent no-ops. Refs: bytedance/deer-flow#3107 (BUG-002) * fix(frontend): drop dead identity ternary and add opt-in export tests Address review feedback on the previous export commit: 1. Removed the no-op `typeof msg.content === "string" ? msg.content : msg.content` expression in `formatThreadAsJSON`. 
Both branches returned the same value; the message content now flows through unchanged whether it is a string or the rich `MessageContent[]` shape (LangChain JSON-serialises the array structure correctly already). 2. Expanded the JSDoc on `ExportOptions` to make it clearer that the four flags are not currently wired to any UI control — callers wanting a debug export must build the options object explicitly. The default behaviour continues to match the explicit prescription in bytedance/deer-flow#3107 BUG-006. 3. Added opt-in coverage. The previous tests only exercised the `options = {}` default path; the new cases verify each flag flips the corresponding payload back into the export so a future debug-export surface does not silently break the contract. Refs: bytedance/deer-flow#3107 (BUG-006) * fix(frontend): export subtask prefix constants and document fallback intent Address review feedback on the previous BUG-007 commit: 1. `SUCCESS_PREFIX`, `FAILURE_PREFIX`, `TIMEOUT_PREFIX`, and the `ERROR_WRAPPER_PATTERN` regex are now exported. The JSDoc explicitly pins them as part of the backend↔frontend contract defined in `task_tool.py` and `tool_error_handling_middleware.py`, so any future structured-status migration (e.g. backend writing `additional_kwargs.subagent_status` instead of leading text) can reference these from one canonical place rather than redefine them. 2. The `in_progress` fallback now carries a docstring explaining the deliberate choice — LangChain only ever emits a `ToolMessage` once the tool itself has returned, so unrecognised content means the contract has drifted and "still running" is the right operator signal (eagerly marking it terminal-failed would mask the drift). No behaviour change; this is documentation and an API export. Refs: bytedance/deer-flow#3107 (BUG-007) * fix(gateway): drop app.state.config snapshot and freeze run_events_config Address @ShenAC-SAC's BUG-001 review on #3131. 
The previous cut still stored an ``AppConfig`` snapshot on ``app.state.config`` for startup bootstrap. Two follow-on hazards from that: 1. Future code touching the gateway lifespan could accidentally start reading ``app.state.config`` again, silently regressing the request hot path back to a stale snapshot. 2. ``get_run_context()`` paired a freshly-reloaded ``AppConfig`` with the startup-bound ``event_store`` and a live ``run_events_config`` field — so an operator who edited ``run_events.backend`` mid-flight would have produced a run context whose ``event_store`` and ``run_events_config`` referred to different backends. Clean approach (aligned with the direction in PR #3128): - ``lifespan()`` keeps a local ``startup_config`` variable and passes it explicitly into ``langgraph_runtime(app, startup_config)`` and into ``start_channel_service``. No ``app.state.config`` attribute is set at any point. - ``langgraph_runtime`` now accepts ``startup_config`` as a required parameter, removing the ``getattr(app.state, "config", None)`` lookup and the "config not initialised" runtime error. - The matching ``run_events_config`` is frozen onto ``app.state`` next to ``run_event_store`` so ``get_run_context`` reads the two from the same startup-time source. ``app_config`` continues to be resolved live via ``get_app_config()``. - ``backend/CLAUDE.md`` boundary explanation updated to spell out the ``startup_config`` / ``get_app_config()`` split. New regression test ``test_run_context_app_config_reflects_yaml_edit`` exercises the worker-feeding path: it asserts that ``ctx.app_config`` follows a mid-flight ``config.yaml`` edit while ``ctx.run_events_config`` stays frozen to the startup snapshot the event store was built from. Refs: bytedance/deer-flow#3107 (BUG-001), bytedance/deer-flow#3131 review * fix(frontend): parse Task cancelled and polling timed out as terminal Address @ShenAC-SAC's BUG-007 review on #3131. `task_tool.py` actually emits five terminal strings: - `Task Succeeded. 
Result: …` - `Task failed. …` - `Task timed out. …` - `Task cancelled by user.` ← previously matched none - `Task polling timed out after N minutes …` ← previously matched none The previous cut handled three; the last two fell through to the "unknown content" branch and pushed the subtask card back to `in_progress` even though the backend had already reached a terminal state. Add explicit matches plus regression tests for both. The `in_progress` fallback is now reserved for genuinely unrecognised output (i.e. contract drift), as documented. Refs: bytedance/deer-flow#3107 (BUG-007), bytedance/deer-flow#3131 review * fix(frontend): sanitize JSON export content via the Markdown content path Address @ShenAC-SAC's BUG-006 review and the Copilot inline comment on #3131. The previous cut filtered hidden/tool messages out of the JSON export but still serialised `msg.content` verbatim, so: - inline `<think>…</think>` wrappers stayed in the exported `content` even with `includeReasoning: false`, - content-array thinking blocks leaked the `thinking` field, - `<uploaded_files>…</uploaded_files>` markers leaked the workspace paths a user uploaded files to. JSON now goes through the same sanitiser the Markdown path uses (`extractContentFromMessage` + `stripUploadedFilesTag`). Reasoning and tool_calls remain gated behind their `ExportOptions` flags. AI / human rows that sanitise to empty content with no opted-in reasoning or tool calls are dropped so the JSON matches the Markdown path's `continue` on empty assistant fragments. New regression tests cover the three leak shapes the reviewer called out plus the empty-content-drop case. 
Refs: bytedance/deer-flow#3107 (BUG-006), bytedance/deer-flow#3131 review * test(gateway): align lifespan stub with langgraph_runtime two-arg signature Codex round-3 review of c0bc7a06 flagged this: changing `langgraph_runtime` to require `startup_config` as a second positional argument broke the one-arg stub `_noop_langgraph_runtime(_app)` in `test_gateway_lifespan_shutdown.py`, which is patched into `app.gateway.app.langgraph_runtime` by the lifespan shutdown bounded-timeout regression. Lifespan would then call the stub with two args and raise `TypeError` before the bounded-shutdown assertion ran. Update the stub to match the new signature. The shutdown test itself is unaffected — it only cares about the channel `stop_channel_service` hang path. Refs: bytedance/deer-flow#3107 (BUG-001), bytedance/deer-flow#3131 review * fix(frontend): strip every known backend marker in export, not just uploads Codex round-3 review of 258ca800 and the matching maintainer feedback on PR #3131 made the same point: the JSON export now ran the Markdown-side sanitiser, but that sanitiser only stripped `<uploaded_files>`. The full set of payloads middleware embeds inside message `content` is larger: - `<uploaded_files>` — `UploadsMiddleware` - `<system-reminder>` — `DynamicContextMiddleware` - `<memory>` — `DynamicContextMiddleware` (nested inside system-reminder) - `<current_date>` — `DynamicContextMiddleware` The primary protection is still `isHiddenFromUIMessage`: the `<system-reminder>` HumanMessage is marked `hide_from_ui: true` and never reaches the formatter. This commit adds the second line of defence so a regression that drops the `hide_from_ui` flag — or any future middleware that injects the same tag vocabulary into a visible HumanMessage — cannot leak the payload into the export file. Concrete changes: - New `INTERNAL_MARKER_TAGS` constant + `stripInternalMarkers(content)` helper in `core/messages/utils.ts`. 
The constant doubles as documentation for the backend↔frontend contract. - `formatMessageContent` in `export.ts` now calls `stripInternalMarkers` instead of `stripUploadedFilesTag`. UI render paths (`message-list-item.tsx`) keep using the narrower function so a user legitimately typing `<memory>` in a meta-discussion is preserved. - The "drop empty rows" guard in `buildJSONMessage` switched from `=== undefined` to truthy `!` checks. Codex spotted the asymmetry: when `extractReasoningContentFromMessage` returned the empty string (which it legitimately can), the JSON path emitted `{reasoning: ""}` while the Markdown path's `!reasoning` `continue` correctly dropped the row. New regression tests cover the defence-in-depth strip with a `<system-reminder><memory><current_date>` payload deliberately not marked `hide_from_ui`; tool-message sanitization under `includeToolMessages: true`; the mixed-content-array case (`thinking + text + image_url`); and the opted-in empty-reasoning drop. Live verification on a real Ultra-mode thread that uploaded a PDF (`曾鑫民-薪资交易流水.pdf`): backend state's first HumanMessage carries the `<uploaded_files>` block (with `/mnt/user-data/uploads/...` paths) as part of a content-array. The Markdown and JSON export blobs both come back free of `<uploaded_files>`, `<system-reminder>`, `<current_date>`, `tool_calls`, and reasoning — while preserving the user's `这是什么？` prompt and the assistant's visible answer. Refs: bytedance/deer-flow#3107 (BUG-006), bytedance/deer-flow#3131 review * test(frontend): cover trim, varied N, and pre-execution Error: prefixes Codex round-3 review of 50e2c257 flagged three coverage gaps in the subtask-status parser: 1. `Task cancelled by user.` and `Task polling timed out` previously had no whitespace-trim coverage — the original trim test only exercised the success prefix. Streaming chunks can arrive with leading/trailing newlines; the regex needed an explicit assertion. 2. 
The polling-timeout case was tested only at one `N` (15 minutes). The backend interpolates the live `timeout_seconds // 60` value, so the matcher must hold for any positive integer. Now we run the case for 1, 5, and 60 minutes. 3. `task_tool.py` also emits three `Error:` strings for pre-execution failures — unknown subagent type, host-bash disabled, and "task disappeared from background tasks". They are intentionally handled by `ERROR_WRAPPER_PATTERN` rather than dedicated prefixes (the wrapper already produces the right terminal-failed shape) but had no test coverage proving that wiring. Codex was right that a refactor splitting one of them off into its own prefix would silently break things. The JSDoc on the constants block now spells the three pre-execution errors out so the relationship between `task_tool.py` returns and the prefix vocabulary is explicit. No production code change beyond the docstring — this commit is pure coverage hardening for the contract that already exists. Refs: bytedance/deer-flow#3107 (BUG-007), bytedance/deer-flow#3131 review	2026-05-21 21:18:10 +08:00
Nan Gao	dcc6f1e678	feat(loop-detection): defer warning injection (#2752) * fix(loop-detection): defer warn injection to wrap_model_call The warn branch in LoopDetectionMiddleware injected a HumanMessage into state from after_model. The tools node had not yet produced ToolMessage responses to the previous AIMessage(tool_calls=...), so the new HumanMessage landed between the assistant's tool_calls and their responses. OpenAI/Moonshot reject the next request with "tool_call_ids did not have response messages" because their validators require tool_calls to be followed immediately by tool messages. Detection now runs in after_model as before, but only enqueues the warning into a per-thread list. Injection happens in wrap_model_call, where every prior ToolMessage is already present in request.messages. The warning is appended at the end as HumanMessage(name="loop_warning") — pairing intact, AIMessage semantics untouched, no SystemMessage issues for Anthropic. Closes #2029, addresses #2255 #2293 #2304 #2511. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(channels): remove loop warning display filter * feat(loop-detection): scope pending warnings by run * docs(loop-detection): update docs * test(loop-detection): assert deferred warnings are queued * fix(loop-detection): cap transient warning state * docs: update docs * add async awrap_model_call test coverage * docs(loop-detection): document transient warnings --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-21 14:36:07 +08:00
dependabot[bot]	006948232c	chore(deps): bump brace-expansion from 1.1.12 to 5.0.5 in /frontend (#3078) Bumps [brace-expansion](https://github.com/juliangruber/brace-expansion) from 1.1.12 to 5.0.5. - [Release notes](https://github.com/juliangruber/brace-expansion/releases) - [Commits](https://github.com/juliangruber/brace-expansion/compare/v1.1.12...v5.0.5) --- updated-dependencies: - dependency-name: brace-expansion dependency-version: 5.0.5 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-05-20 07:42:03 +08:00
jinghuan-Chen	c0233cae26	fix(frontend): resolve login page flickering and resize observer loop. (#2954) * fix(frontend): resolve login page flickering and resize observer loop. * fix(frontend): allow vertical scrolling on login page Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>	2026-05-17 09:01:42 +08:00
pereverzev	4538c32298	Fix type check for 'thinking' in message content (#2964) * Fix type check for 'thinking' in message content When Gemini via Vertex AI returns content as a string inside an array, the in operator throws TypeError because it can't be used on primitives. * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: Zil6n <136249885+Zil6n@users.noreply.github.com> Co-authored-by: Willem Jiang <willem.jiang@gmail.com> Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>	2026-05-16 17:55:34 +08:00
Nan Gao	6d3cffb4f0	fix(frontend): deduplicate restored thread messages (#2958) * fix(frontend): fix duplicate messages when reopening agent sessions (#2957) * make format * fix(frontend): retry pending thread history loads	2026-05-16 08:48:19 +08:00
Admire	7c42ab3e16	fix(frontend): wait for async chat submit before clearing (#2940) * fix(frontend): wait for async chat submit before clearing * test(frontend): cover pending attachment uploads * fix(frontend): preserve sync submit semantics	2026-05-15 22:27:10 +08:00
Nan Gao	0c37509b38	fix(middleware): Prevent todo completion reminder IMMessage leak (#2907) * fix(middleware): Prevent todo completion reminder IMMessage leak (#2892) * make format * fix(middleware): Clear stale todo reminder counts (#2892) * add size guard for _completion_reminder_counts and add an integration test	2026-05-15 22:12:37 +08:00
YuJitang	eab7ae3d62	feat: stream subagent token usage to header via terminal task events (#2882) * feat: real-time subagent token usage display in header and per-turn Backend: - Persist subagent token usage to AIMessage.usage_metadata via TokenUsageMiddleware, so accumulateUsage() naturally includes subagent tokens without frontend state management - Cache subagent usage by tool_call_id in task_tool, write back to the dispatching AIMessage on next model response - Emit subagent token usage on all terminal task events (task_completed, task_failed, task_cancelled, task_timed_out) - Report subagent usage to parent RunJournal for API totals - Search backward from ToolMessage to find dispatching AIMessage for correct multi-tool-call attribution Frontend: - Remove subagentUsage state, custom event handling, and prop threading — subagent tokens are now embedded in message metadata - Simplify selectHeaderTokenUsage (no subagentUsage parameter) - Per-turn inline badges show turn-specific usage via message accumulation - Remove isLoading guard from MessageTokenUsageList for dynamic updates during streaming * fix: prevent header token double counting from baseline reset race onFinish, onError, and thread-switch useEffect all reset pendingUsageBaselineMessageIdsRef to an empty Set. If thread.isLoading is still true on the next render, all messages pass the getMessagesAfterBaseline filter and their tokens are added to backendUsage (which already includes them), causing the header to display up to 2× the actual token count. Capture current message IDs instead of using an empty Set so that getMessagesAfterBaseline correctly returns no pending messages even if thread.isLoading lags behind the stream end.
* fix: write back subagent tokens for all concurrent task tool calls TokenUsageMiddleware only processed messages[-2], so when a single model response dispatched multiple task tool calls only the last ToolMessage had its cached subagent usage written back to the dispatch AIMessage.usage_metadata. Earlier tasks' usage stayed in _subagent_usage_cache indefinitely (leak) and never appeared in the per-turn inline token display. Walk backward through all consecutive ToolMessages before the new AIMessage, and accumulate updates targeting the same dispatch message into one state update so overlapping writes don't clobber each other. * fix: clean up subagent usage cache entry on task cancellation When a task_tool invocation is cancelled via CancelledError, any cached subagent usage entry leaked because the TokenUsageMiddleware writeback path never fires after cancellation. Pop the cache entry before re-raising to prevent unbounded growth of the module-level _subagent_usage_cache dict. * fix: address token usage review feedback * fix: handle missing config for subagent usage cache --------- Co-authored-by: Willem Jiang <willem.jiang@gmail.com>	2026-05-13 23:52:19 +08:00
Eilen Shin	84f88b6610	docs: align runtime docs with gateway mode (#2868) Co-authored-by: Willem Jiang <willem.jiang@gmail.com>	2026-05-12 16:19:21 +08:00
dependabot[bot]	0009655454	chore(deps): bump next from 16.1.7 to 16.2.6 in /frontend (#2899) Bumps [next](https://github.com/vercel/next.js) from 16.1.7 to 16.2.6. - [Release notes](https://github.com/vercel/next.js/releases) - [Changelog](https://github.com/vercel/next.js/blob/canary/release.js) - [Commits](https://github.com/vercel/next.js/compare/v16.1.7...v16.2.6) --- updated-dependencies: - dependency-name: next dependency-version: 16.2.6 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-05-12 10:45:40 +08:00
YuJitang	e82b2fb4d0	docs: clarify token usage accounting semantics (#2845)	2026-05-11 07:17:49 +08:00
YuJitang	9892a7d468	fix: bucket subagent token usage into parent run totals (#2838) * fix: bucket subagent token usage into RunRow.subagent_tokens Add caller-bucketed token tracking to RunJournal so subagent and middleware LLM calls are written to the correct RunRow columns instead of all falling into lead_agent_tokens (default 0). - RunJournal: accumulate _lead_agent_tokens / _subagent_tokens / _middleware_tokens in on_llm_end, deduped by langchain run_id. Add record_external_llm_usage_records() for external sources (respects track_token_usage flag). Return caller buckets from get_completion_data(). - SubagentTokenCollector: new lightweight callback handler that collects LLM usage within subagent execution. - SubagentExecutor: wire collector into subagent run_config and sync records to SubagentResult on every chunk (timeout/cancel safe). - SubagentResult: add token_usage_records and usage_reported fields. - task_tool: report subagent usage to parent RunJournal on every terminal status (COMPLETED/FAILED/CANCELLED/TIMED_OUT), including the CancelledError path, guarded against double-reporting. No DB migration needed — RunRow columns already exist. * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * fix: address token usage review feedback * Address review follow-ups --------- Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>	2026-05-10 22:47:30 +08:00
YuJitang	5127f08e1a	enable token usage by default (#2841)	2026-05-10 22:00:57 +08:00
DanielWalnut	dfa4eb0c1a	[codex] fix follow-up suggestions layout (#2836) * fix follow-up suggestions layout * fix agent chat welcome layout transition --------- Co-authored-by: Willem Jiang <willem.jiang@gmail.com>	2026-05-10 15:10:44 +08:00
Eilen Shin	1c96a6afc8	fix: keep new agent bootstrap in user scope (#2784)	2026-05-09 19:43:50 +08:00
YuJitang	417416087b	fix: use backend thread token usage for header total (#2800) * fix: use backend thread token usage for header total * Refactor thread token usage fetch	2026-05-09 19:40:32 +08:00
dependabot[bot]	41b04a556f	chore(deps): bump uuid from 10.0.0 to 14.0.0 in /frontend (#2802) Bumps [uuid](https://github.com/uuidjs/uuid) from 10.0.0 to 14.0.0. - [Release notes](https://github.com/uuidjs/uuid/releases) - [Changelog](https://github.com/uuidjs/uuid/blob/main/CHANGELOG.md) - [Commits](https://github.com/uuidjs/uuid/compare/v10.0.0...v14.0.0) --- updated-dependencies: - dependency-name: uuid dependency-version: 14.0.0 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-05-09 09:33:00 +08:00
YuJitang	530bda7107	fix: dedupe token usage aggregation by message id (#2770)	2026-05-08 09:54:20 +08:00
Willem Jiang	6c220a9aef	fix(chat): prevent first user message from being swallowed in new conversations (#2731) * fix(chat): prevent first user message from being swallowed in new conversations The optimistic message clearing effect cleared too eagerly — any stream message (including AI messages from messages-tuple events) triggered the clear before the server's human message had arrived via values events. For new threads this caused the user's first prompt to disappear permanently. Only clear optimistic messages once the server's human message has been confirmed to arrive in thread.messages, not just when any message arrives. Fixes #2730 * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>	2026-05-07 17:31:48 +08:00
Xinmin Zeng	27559f3675	fix(frontend): defer thread id to onStart to avoid 404 on new chat (#2749 ) * fix(frontend): defer thread id to onStart to avoid 404 on new chat The LangGraph SDK's useStream eagerly fetches /threads/{id}/history the moment it receives a thread id, and the local useThreadRuns issues GET /threads/{id}/runs for the same reason. The chats page used to flip isNewThread=false (and forward the client-generated thread id) inside the synchronous onSend callback, before thread.submit had created the thread on the backend. The two queries therefore raced ahead of POST /runs/stream and returned 404 on the very first send. Drop the onSend handler so isNewThread stays true until onStart fires from useStream's onCreated — by then the backend has the thread, and the SDK's submittingRef guard naturally suppresses the redundant history fetch. The agent chat page already uses this pattern, so this also unifies the two flows. Adds an E2E regression that records request ordering and asserts GET /history and GET /runs are never issued before POST /runs/stream on the first send from /chats/new. Closes #2746 * fix(frontend): split welcome layout from backend thread state Removing onSend kept GET /history and GET /runs from racing ahead of POST /runs/stream, but it also coupled the welcome layout (centered input, hero, quick actions) to backend thread creation. Until onCreated returned, the user's optimistic message and the welcome hero rendered on top of each other. Introduce a dedicated `isWelcomeMode` UI flag, separate from `isNewThread`: - `isNewThread` still tracks "backend has no thread yet" and gates the thread id forwarded to useStream. - `isWelcomeMode` drives the visual layout (header background, input box position, max width, hero, quick actions, autoFocus) and flips to false inside onSend so the layout animates immediately. `isWelcomeMode` is kept in sync with `isNewThread` via an effect so sidebar navigation and "new chat" still behave correctly. 
All 15 E2E tests pass, including the ordering regression added in the previous commit. * test(e2e): use monotonic sequence for thread-init ordering check Date.now() is millisecond-resolution, so two requests emitted within the same tick would share a timestamp and slip past the strict `<` ordering assertions. Replace the timestamp with a monotonic counter that increments on every observed request/requestfinished event so the ordering check is robust regardless of scheduling. Per PR #2749 review feedback from copilot-pull-request-reviewer. * refactor(input-box): rename isNewThread prop to isWelcomeMode Inside InputBox, the prop named `isNewThread` is only ever consulted for visual layout decisions — gating follow-up suggestions, the bottom background strip, and the welcome-mode quick-action SuggestionList. It never reflects "the backend has created the thread", which after #2746 is tracked separately via `isNewThread` in the chat pages themselves. Rename the prop to `isWelcomeMode` and update both call sites (workspace chats page and agent chats page) so the prop name matches its actual semantics. No behavior change. Per PR #2749 review feedback from @WillemJiang.	2026-05-07 16:11:44 +08:00
yangzheli	59c4a3f0a4	feat(agent): add custom-agent self-updates with user isolation (#2713 ) * feat(agent): add update_agent tool for in-chat custom-agent self-updates (#2616) Custom agents had no built-in way to persist updates to their own SOUL.md / config.yaml from a normal chat — `setup_agent` was only bound during the bootstrap flow, so when the user asked the agent to refine its description or personality, the agent would shell out via bash/write_file and the edits landed in a temporary sandbox/tool workspace instead of `{base_dir}/agents/{agent_name}/`. Changes: - New `update_agent` builtin tool with partial-update semantics (only the fields you pass are written) and atomic temp-file + os.replace writes so a failed update never corrupts existing SOUL.md / config.yaml. - Lead agent now binds `update_agent` in the non-bootstrap path whenever `agent_name` is set in the runtime context. Default agent (no agent_name) and bootstrap flow are unchanged. - New `<self_update>` system-prompt section is injected for custom agents, instructing them to use `update_agent` — and explicitly NOT bash / write_file — to persist self-updates. - Tests: 11 new cases in `tests/test_update_agent_tool.py` covering validation (missing/invalid agent_name, unknown agent, no fields), partial updates (soul-only, description-only, skills=[] vs omitted), no-op detection, atomic-write safety, and AgentConfig round-tripping; plus 2 new cases in `tests/test_lead_agent_prompt.py` covering the self-update prompt section. - Docs: updated backend/CLAUDE.md builtin tools list and tools.mdx (en/zh) with the new tool description. * feat(agent): isolate custom agents per user Store custom agent definitions under the effective user, keep legacy agents readable until migration, and cover API/tool/migration behavior with tests. 
Co-authored-by: Cursor <cursoragent@cursor.com> * feat: consistent write/delete targets & add --user-id to migration --------- Co-authored-by: Cursor <cursoragent@cursor.com>	2026-05-05 23:17:42 +08:00
Xinmin Zeng	aded753de3	fix(frontend): restore localhost fallback for getGatewayConfig in prod mode (#2705 ) (#2718 ) * fix(frontend): unify gateway-config localhost fallback for prod (#2705) `getGatewayConfig()` only fell back to localhost defaults when `NODE_ENV === "development"`, while `next.config.js` always falls back to `127.0.0.1:8001`. Running `make start` (which sets NODE_ENV=production via `next start`) without `DEER_FLOW_INTERNAL_GATEWAY_BASE_URL` / `DEER_FLOW_TRUSTED_ORIGINS` therefore caused zod to throw inside SSR layouts and surfaced as a 500. Drop the NODE_ENV gating and use localhost defaults everywhere — the "force explicit config in prod" intent should be enforced by deployment templates (docker-compose already sets both vars), not by request-time crashes. Document the two vars in both .env.example files and add unit coverage for the dev/prod env-unset paths. * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * Update internalGatewayUrl in gateway config tests --------- Co-authored-by: Willem Jiang <willem.jiang@gmail.com> Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>	2026-05-05 16:27:29 +08:00
Willem Jiang	af6e48ccaa	fix(i18n): add Chinese translations for account settings page (#2712 ) The account settings page had all user-facing strings (profile labels, password form placeholders, validation messages, button text) hardcoded in English. Replace them with i18n translation keys so the page renders correctly when the locale is set to Chinese. Fixed #2710	2026-05-04 11:15:16 +08:00
YuJitang	d02f762ab0	feat: refine token usage display modes (#2329 ) * feat: refine token usage display modes * docs: clarify token usage accounting semantics * fix: avoid duplicate subtask debug keys * style: format token usage tests * chore: address token attribution review feedback * Update test_token_usage_middleware.py * Update test_token_usage_middleware.py * chore: simplify token attribution fallback * fix token usage metadata follow-up handling --------- Co-authored-by: Willem Jiang <willem.jiang@gmail.com>	2026-05-04 09:56:16 +08:00
Nan Gao	222a7773cb	fix(frontend): avoid misleading error message when agent api is disable (#2697 ) (#2698 )	2026-05-04 09:38:05 +08:00
Jsonz	24a5a00679	fix: avoid duplicate call to extractReasoningContentFromMessage (#2661 ) In convertToSteps(), the extractReasoningContentFromMessage function was called twice for the same message - once to check if reasoning exists and again to assign it to the step object. Reuse the already-extracted value from the local variable instead.	2026-04-30 11:33:49 +08:00
yangzheli	f7b10d42e4	fix(frontend): create thread on first submit in new-agent page (#2656 ) The new-agent page pre-generates a thread UUID and passed it directly to useThreadStream, which made the LangGraph SDK POST to /threads/{uuid}/runs/stream against a thread the backend had never created. After PR #2566 introduced multi-tenant owner checks on the runs endpoints, that request now 404s with "Thread not found". Pass threadId: undefined to useThreadStream so the SDK takes the create-then-run path. The pre-generated UUID is still forwarded via SubmitOptions.threadId in sendMessage, so the new thread is created with that exact id and onCreated rebinds the hook to it. Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-30 06:41:54 +08:00
yangzheli	748429ef0d	fix(frontend): add missing mock routes for runs-list, models, and suggestions (#2578 ) * fix(frontend): add missing mock routes for runs-list, models, and suggestions * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: Willem Jiang <willem.jiang@gmail.com> Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>	2026-04-26 23:29:59 +08:00
JeffJiang	7bf618de67	Refactor DeerFlow to use Gateway's LangGraph-compatible API - Updated documentation and comments to reflect the transition from LangGraph Server to Gateway. - Changed default URLs in ChannelManager and tests to point to Gateway. - Removed references to LangGraph Server in deployment scripts and configurations. - Updated Nginx configuration to route API traffic to Gateway. - Adjusted frontend configurations to utilize Gateway's API. - Removed LangGraph service from Docker Compose files, consolidating services under Gateway. - Added regression tests to ensure Gateway integration works as expected. Co-authored-by: Copilot <copilot@github.com>	2026-04-26 20:38:34 +08:00
yangzheli	c5d57b4533	fix: resolve make dev and test-e2e errors (#2570 )	2026-04-26 17:27:32 +08:00
Willem Jiang	64a43bc448	fix the lint error by updating the .prettierignore	2026-04-26 16:19:36 +08:00
Willem Jiang	3f88045b98	try to fix the frontend e2e test errors	2026-04-26 15:48:57 +08:00
Willem Jiang	9eca429a29	fix the lint errors in the frontend	2026-04-26 15:37:16 +08:00
Willem Jiang	28381e1383	fix the lint errors in frontend	2026-04-26 15:11:22 +08:00
JeffJiang	98a5b34f76	fix: resolve merge conflict in pnpm-lock.yaml and clean up better-auth dependencies	2026-04-26 12:31:52 +08:00
JeffJiang	db5ad86381	feat: enhance chat history loading with new hooks and UI components (#2338 ) * Refactor API fetch calls to use a unified fetch function; enhance chat history loading with new hooks and UI components - Replaced `fetchWithAuth` with a generic `fetch` function across various API modules for consistency. - Updated `useThreadStream` and `useThreadHistory` hooks to manage chat history loading, including loading states and pagination. - Introduced `LoadMoreHistoryIndicator` component for better user experience when loading more chat history. - Enhanced message handling in `MessageList` to accommodate new loading states and history management. - Added support for run messages in the thread context, improving the overall message handling logic. - Updated translations for loading indicators in English and Chinese. * Fix test assertions for run ordering in RunManager tests - Updated assertions in `test_list_by_thread` to reflect correct ordering of runs. - Modified `test_list_by_thread_is_stable_when_timestamps_tie` to ensure stable ordering when timestamps are tied.	2026-04-26 11:20:17 +08:00
foreleven	00a90bbd3d	refactor: Remove init_token handling from admin initialization logic and related tests	2026-04-26 11:09:56 +08:00
JeffJiang	44d9953e2e	feat: Add metadata and descriptions to various documentation pages in Chinese - Added titles and descriptions to workspace usage, configuration, customization, design principles, installation, integration guide, lead agent, MCP integration, memory system, middleware, quick start, sandbox, skills, subagents, and tools documentation. - Removed outdated API/Gateway reference and concepts glossary pages. - Updated configuration reference to reflect current structure and removed unnecessary sections. - Introduced new model provider documentation for Ark and updated the index page for model providers. - Enhanced tutorials with titles and descriptions for better clarity and navigation.	2026-04-26 11:09:55 +08:00
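
Note on commit 59c4a3f0a4 above: its message describes "atomic temp-file + os.replace writes so a failed update never corrupts existing SOUL.md / config.yaml". The log does not show the actual implementation, so the following is only a minimal sketch of that general technique. The function name `atomic_write_text` and the example path are hypothetical and not taken from the repository.

```python
import os
import tempfile
from pathlib import Path


def atomic_write_text(path: Path, content: str) -> None:
    """Hypothetical helper: replace `path` with `content` in one atomic step.

    Readers see either the complete old file or the complete new file.
    """
    path.parent.mkdir(parents=True, exist_ok=True)
    # The temp file lives in the target directory so that os.replace
    # never has to cross a filesystem boundary.
    fd, tmp_name = tempfile.mkstemp(
        dir=str(path.parent), prefix=path.name + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(content)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before the swap
        os.replace(tmp_name, path)  # atomic rename over the old file
    except BaseException:
        # The write or the swap failed: drop the temp file and leave
        # the original file untouched.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise


if __name__ == "__main__":
    # Example path is illustrative only.
    base = Path(tempfile.mkdtemp())
    soul = base / "agents" / "demo-agent" / "SOUL.md"
    atomic_write_text(soul, "first version\n")
    atomic_write_text(soul, "second version\n")
    print(soul.read_text(encoding="utf-8"))
```

Writing to a sibling temp file and renaming it means an interrupted or failing write leaves only a stray `.tmp` file at worst, never a half-written target.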

455 Commits