1921 Commits

Author SHA1 Message Date
DanielWalnut
eef0a6e2da
feat(dx): Setup Wizard + doctor command — closes #2030 (#2034) 2026-04-10 17:43:39 +08:00
Javen Fang
b107444878
docs(api): document recursion_limit for LangGraph API runs (#1929)
The /api/langgraph/* endpoints proxy straight to the LangGraph server,
so clients inherit LangGraph's native recursion_limit default of 25
instead of the 100 that build_run_config sets for the Gateway and IM
channel paths. 25 is too low for plan-mode or subagent runs and
reliably triggers GraphRecursionError on the lead agent's final
synthesis step after subagents return.

Set recursion_limit: 100 in the Create Run example and the cURL
snippet, and add a short note explaining the discrepancy so users
following the docs don't hit the 25-step ceiling as a surprise.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 09:28:57 +08:00
KKK
16aa51c9b3
feat(skills): add systematic-literature-review skill for multi-paper SLR workflows (#2032)
* feat(skills): add systematic-literature-review skill for multi-paper SLR workflows

Adds a new skill that produces a structured systematic literature review (SLR)
across multiple academic papers on a topic. Addresses #1862 with a pure skill
approach: no new tools, no architectural changes, no new dependencies.

Skill layout:
- SKILL.md — 4+1 phase workflow (plan, search, extract, synthesize, present)
- scripts/arxiv_search.py — arXiv API client, stdlib only, with a
  requests->urllib fallback shim modeled after github-deep-research's
  github_api.py
- templates/{apa,ieee,bibtex}.md — citation format templates selected
  dynamically in Phase 4, mirroring podcast-generation's templates/ pattern

Design notes:
- Multi-paper synthesis uses the existing `task` tool to dispatch extraction
  subagents in parallel. SKILL.md's Phase 3 includes a fixed decision table
  for batch splitting to respect the runtime's MAX_CONCURRENT_SUBAGENTS = 3
  cap, and explicitly tells the agent to strip the "Task Succeeded. Result: "
  prefix before parsing subagent JSON output.
- arXiv only, by design. Semantic Scholar and PubMed adapters would push the
  scope toward a standalone MCP server (see #933) and are intentionally out
  of scope for this skill.
- Coexists with the existing `academic-paper-review` skill: this skill does
  breadth-first synthesis across many papers, academic-paper-review does
  single-paper peer review. The two are routed via distinct triggers and
  can compose (SLR on many + deep review on 1-2 important ones).
- Hard upper bound of 50 papers, tied to the Phase 3 concurrency strategy.
  Larger surveys degrade in synthesis quality and are better split by
  sub-topic.

BibTeX template explicitly uses @misc for arXiv preprints (not @article),
which is the most common mistake when generating BibTeX for arXiv papers.

arxiv_search.py was smoke-tested end-to-end against the live arXiv API with
two query shapes (relevance sort, submittedDate sort with category filter);
all returned JSON fields parse correctly (id normalization, Atom namespace
handling, URL encoding for multi-word queries).

* fix(skills): prevent LLM from saving intermediate search results to file

Adds an explicit "do not save" instruction at the end of Phase 2.
Observed during Test 1 with DeepSeek: the model saved search results
to a markdown file before proceeding to Phase 3, wasting 2-3 tool call
rounds and increasing the risk of hitting the graph recursion limit.
The search JSON should stay in context for Phase 3, not be persisted.

* fix(skills): use relevance+start-date instead of submittedDate sorting

Test 2 revealed that arXiv's submittedDate sorting returns the most
recently submitted papers in the category regardless of query relevance.
Searching "diffusion models" with sortBy=submittedDate in cs.CV returned
papers on spatial memory, Navier-Stokes, and photon-counting CT — none
about diffusion models. The LLM then retried with 4 different queries,
wasting tool calls and approaching the recursion limit.

Fix: always sort by relevance; when the user wants "recent" papers,
combine relevance sorting with --start-date to constrain the time window.
Also add an explicit "run the search exactly once" instruction to prevent
the retry loop.

* fix(skills): wrap multi-word arXiv queries in double quotes for phrase matching

Without quotes, `all:diffusion model` is parsed by arXiv's Lucene as
`all:diffusion OR model`, pulling in unrelated papers from physics
(thermal diffusion) and other fields. Wrapping in double quotes forces
phrase matching: `all:"diffusion model"`.

Also fixes date filtering: the previous bug caused 2011 papers to appear
in results despite --start-date 2024-04-09, because the unquoted query
words were OR'd with the date constraint.

Verified: "diffusion models" --category cs.CV --start-date 2024-04-09
now returns only relevant diffusion model papers published after April
2024.

* fix(skills): add query phrasing guide and enforce subagent delegation

Two fixes from Test 2 observations with DeepSeek:

1. Query phrasing: add a table showing good vs bad query examples.
   The script wraps multi-word queries in double quotes for phrase
   matching, so long queries like "diffusion models in computer vision"
   return 0 results. Guide the LLM to use 2-3 core keywords + --category
   instead.

2. Subagent enforcement: DeepSeek was extracting metadata inline via
   python -c scripts instead of using the task tool. Strengthen Phase 3
   to explicitly name the task tool, say "do not extract metadata
   yourself", and explain why (token budget, isolation). This is more
   direct than the previous natural-language-only approach while still
   providing the reasoning behind the constraint.

* fix(skills): strengthen search keyword guidance and subagent enforcement

Address two issues found during end-to-end testing with DeepSeek:

1. Search retry: LLM passed full topic descriptions as queries (e.g.
   "diffusion models in computer vision"), which returned 0 results due
   to exact phrase matching and triggered retries. Added explicit
   instruction to extract 2-3 core keywords before searching.

2. Subagent bypass: LLM used python -c to extract metadata instead of
   dispatching via task tool. Added explicit prohibition list (python -c,
   bash scripts, inline extraction) with  markers for clarity.

* fix(skills): address Copilot review feedback on SLR skill

- Fix legacy arXiv ID parsing: preserve archive prefix for pre-2007
  papers (e.g. hep-th/9901001 instead of just 9901001)
- Fix phase count: "four phases" -> "five phases"
- Add subagent_enabled prerequisite note to SKILL.md Notes section
- Remove PR-specific references ("PR 1") from ieee.md and bibtex.md
  templates, replace with workflow-scoped wording
- Fix script header: "stdlib only" -> "no additional dependencies
  required", fix relative path to github_api.py reference
- Remove reference to non-existent docs/enhancement/ path in header

* Apply suggestions from code review

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-04-10 08:54:28 +08:00
Javen Fang
133ffe7174
feat(models): add langchain-ollama for native Ollama thinking support (#2062)
Add langchain-ollama as an optional dependency and provide ChatOllama
config examples, enabling proper thinking/reasoning content preservation
for local Ollama models.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 08:38:31 +08:00
yangzheli
f88970985a
fix(frontend): replace invalid "context" select field with "metadata" in threads.search (#2053)
* fix(frontend): replace invalid "context" select field with "metadata" in threads.search

The LangGraph API server does not support "context" as a select field for
threads/search, causing a 422 Unprocessable Entity error introduced by
commit 60e0abf (#1771).

- Replace "context" with "metadata" in the default select list
- Persist agent_name into thread metadata on creation so search results
  carry the agent identity
- Update pathOfThread() to fall back to metadata.agent_name when
  context is unavailable from search results
- Add regression tests for metadata-based agent routing

Fixes #2037

Made-with: Cursor

* fix: apply Copilot suggestions

* style: fix the lint error
2026-04-10 08:35:07 +08:00
knukn
6572fa5b75
feat(smoke-test): add smoke test skill (#1947)
* feat(smoke-test): add end-to-end smoke test skill

* Update .agent/skills/smoke-test/SKILL.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update .agent/skills/smoke-test/SKILL.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update .agent/skills/smoke-test/references/SOP.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update .agent/skills/smoke-test/scripts/check_local_env.sh

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update .agent/skills/smoke-test/scripts/check_docker.sh

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update .agent/skills/smoke-test/scripts/deploy_docker.sh

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* refactor(smoke-test): optimize health check scripts and update document structure

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-04-09 18:56:28 +08:00
shivam johri
194bab4691
feat(config): add when_thinking_disabled support for model configs (#1970)
* feat(config): add when_thinking_disabled support for model configs

Allow users to explicitly configure what parameters are sent to the
model when thinking is disabled, via a new `when_thinking_disabled`
field in model config. This mirrors the existing `when_thinking_enabled`
pattern and takes full precedence over the hardcoded disable behavior
when set. Backwards compatible — existing configs work unchanged.

Closes #1675

* fix(config): address copilot review — gate when_thinking_disabled independently

- Switch truthiness check to `is not None` so empty dict overrides work
- Restructure disable path so when_thinking_disabled is gated independently
  of has_thinking_settings, allowing it to work without when_thinking_enabled
- Update test to reflect new behavior
2026-04-09 18:49:00 +08:00
luo jiyin
35f141fc48
feat: implement full checkpoint rollback on user cancellation (#1867)
* feat: implement full checkpoint rollback on user cancellation

- Capture pre-run checkpoint snapshot including checkpoint state, metadata, and pending_writes
- Add _rollback_to_pre_run_checkpoint() function to restore thread state
- Implement _call_checkpointer_method() helper to support both async and sync checkpointer methods
- Rollback now properly restores checkpoint, metadata, channel_versions, and pending_writes
- Remove obsolete TODO comment (Phase 2) as rollback is now complete

This resolves the TODO(Phase 2) comment and enables full thread state
restoration when a run is cancelled by the user.

* fix: address rollback review feedback

* fix: strengthen checkpoint rollback validation and error handling

- Validate restored_config structure and checkpoint_id before use
- Raise RuntimeError on malformed pending_writes instead of silent skip
- Normalize None checkpoint_ns to empty string instead of "None"
- Move delete_thread to only execute when pre_run_snapshot is None
- Add docstring noting non-atomic rollback as known limitation

This addresses review feedback on PR #1867 regarding data integrity
in the checkpoint rollback implementation.

* test: add comprehensive coverage for checkpoint rollback edge cases

- test_rollback_restores_snapshot_without_deleting_thread
- test_rollback_deletes_thread_when_no_snapshot_exists
- test_rollback_raises_when_restore_config_has_no_checkpoint_id
- test_rollback_normalizes_none_checkpoint_ns_to_root_namespace
- test_rollback_raises_on_malformed_pending_write_not_a_tuple
- test_rollback_raises_on_malformed_pending_write_non_string_channel
- test_rollback_propagates_aput_writes_failure

Covers all scenarios from PR #1867 review feedback.

* test: format rollback worker tests
2026-04-09 17:56:36 +08:00
Xinmin Zeng
0b6fa8b9e1
fix(sandbox): add startup reconciliation to prevent orphaned container leaks (#1976)
* fix(sandbox): add startup reconciliation to prevent orphaned container leaks

Sandbox containers were never cleaned up when the managing process restarted,
because all lifecycle tracking lived in in-memory dictionaries. This adds
startup reconciliation that enumerates running containers via `docker ps` and
either destroys orphans (age > idle_timeout) or adopts them into the warm pool.

Closes #1972

* fix(sandbox): address Copilot review — adopt-all strategy, improved error handling

- Reconciliation now adopts all containers into warm pool unconditionally,
  letting the idle checker decide cleanup. Avoids destroying containers
  that another concurrent process may still be using.
- list_running() logs stderr on docker ps failure and catches
  FileNotFoundError/OSError.
- Signal handler test restores SIGTERM/SIGINT in addition to SIGHUP.
- E2E test docstring corrected to match actual coverage scope.

* fix(sandbox): address maintainer review — batch inspect, lock tightening, import hygiene

- _reconcile_orphans(): merge check-and-insert into a single lock acquisition
  per container to eliminate the TOCTOU window.
- list_running(): batch the per-container docker inspect into a single call.
  Total subprocess calls drop from 2N+1 to 2 (one ps + one batch inspect).
  Parse port and created_at from the inspect JSON payload.
- Extract _parse_docker_timestamp() and _extract_host_port() as module-level
  pure helpers and test them directly.
- Move datetime/json imports to module top level.
- _make_provider_for_reconciliation(): document the __new__ bypass and the
  lockstep coupling to AioSandboxProvider.__init__.
- Add assertion that list_running() makes exactly ONE inspect call.
2026-04-09 17:21:23 +08:00
Admire
140907ce1d
Fix abnormal preview of HTML files (#1986)
* Fix HTML artifact preview rendering

* Add after screenshot for HTML preview fix

* Add before screenshot for HTML preview fix

* Update before screenshot for HTML preview fix

* Update after screenshot for HTML preview fix

* Update before screenshot to Tsinghua homepage repro

* Update after screenshot to Tsinghua homepage preview

* Address PR review on HTML artifact preview

* Harden HTML artifact preview isolation
2026-04-09 16:32:01 +08:00
yangzheli
52718b0f23
fix(frontend): disable incomplete markdown parsing for human messages (#2014)
Streamdown's streaming safeguard appends closing markers (e.g. `*`) to
text with unmatched markdown syntax. This causes user messages containing
literal `*` (such as `99 * 87`) to display with a spurious trailing
asterisk. Human messages are always complete, so the incomplete-markdown
pre-processing is unnecessary.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-09 16:30:32 +08:00
Admire
563383c60f
fix(agent): file-io path guidance in agent prompts (#2019)
* fix(prompt): guide workspace-relative file io

* Clarify bash agent file IO path guidance
2026-04-09 16:12:34 +08:00
Xun
1b74d84590
fix: resolve missing serialized kwargs in PatchedChatDeepSeek (#2025)
* add tests

* fix ci

* fix ci
2026-04-09 16:07:16 +08:00
Zhou
823f3af98c
fix(docker): dev uv cache mounts on macOS (#2036) 2026-04-09 15:59:33 +08:00
Gao Mingfei
13664e99e7
fix(docker): nginx fails to start on hosts without IPv6 (#2027)
* fix(docker): nginx fails to start on hosts without IPv6

- Detect IPv6 support at runtime and remove `listen [::]` directive
  when unavailable, preventing nginx startup failure on non-IPv6 hosts
- Use `exec` to replace shell with nginx as PID 1 for proper signal
  handling (graceful shutdown on SIGTERM)
- Reformat command from YAML folded scalar to block scalar (no
  functional change)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(docker): harden nginx startup script (Copilot review feedback)

Add `set -e` so envsubst failures exit immediately instead of starting
nginx with an incomplete config. Narrow the sed pattern to match only
the `listen [::]:2026;` directive to avoid accidentally removing future
lines containing [::].

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-09 15:58:30 +08:00
60e0abfdb8
fix(frontend): preserve agent context in thread history routes (#1771)
* fix(frontend): preserve agent context in thread history routes

* fix(frontend): preserve agent thread fallback context

* style(frontend): format thread route utils test

---------

Co-authored-by: luoxiao6645 <luoxiao6645@gmail.com>
2026-04-09 15:11:57 +08:00
Octopus
616caa92b1
fix(models): resolve duplicate keyword argument error when reasoning_effort appears in both config and kwargs (#2017)
When a model config includes `reasoning_effort` as an extra YAML field
(ModelConfig uses `extra="allow"`), and the thinking-disabled code path
also injects `reasoning_effort="minimal"` into kwargs, the previous
`model_class(**kwargs, **model_settings_from_config)` call raises:

  TypeError: got multiple values for keyword argument 'reasoning_effort'

Fix by merging the two dicts before instantiation, giving runtime kwargs
precedence over config values: `{**model_settings_from_config, **kwargs}`.

Fixes #1977

Co-authored-by: octo-patch <octo-patch@github.com>
2026-04-09 15:09:39 +08:00
knukn
31a3c9a3de
feat(client): add thread query methods list_threads and get_thread (#1609)
* feat(client): add thread query methods `list_threads` and `get_thread`

Implemented two public API methods in `DeerFlowClient` to query threads using the underlying `checkpointer`.

* Update backend/packages/harness/deerflow/client.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update backend/packages/harness/deerflow/client.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update backend/tests/test_client.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update backend/packages/harness/deerflow/client.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* fix(deerflow): Fix possible KeyError issue when sorting threads

* fix unit test

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-04-09 15:00:22 +08:00
Xinmin Zeng
ad6d934a5f
fix(middleware): handle string-serialized options in ClarificationMiddleware (#1997)
* fix(middleware): handle string-serialized options in ClarificationMiddleware (#1995)

Some models (e.g. Qwen3-Max) serialize array tool parameters as JSON
strings instead of native arrays. Add defensive type checking in
_format_clarification_message() to deserialize string options before
iteration, preventing per-character rendering.

* fix(middleware): normalize options after JSON deserialization

Address Copilot review feedback:
- Add post-deserialization normalization so options is always a list
  (handles json.loads returning a scalar string, dict, or None)
- Add test for JSON-encoded scalar string ("development")
- Fix test_json_string_with_mixed_types to use actual mixed types
2026-04-08 21:04:20 +08:00
hung_ng__
5350b2fb24
feat(community): add Exa search as community tool provider (#1357)
* feat(community): add Exa search as community tool provider

Add Exa (exa.ai) as a new community search provider alongside Tavily,
Firecrawl, InfoQuest, and Jina AI. Exa is an AI-native search engine
with neural, keyword, and auto search types.

New files:
- community/exa/tools.py: web_search_tool and web_fetch_tool
- tests/test_exa_tools.py: 10 unit tests with mocked Exa client

Changes:
- pyproject.toml: add exa-py dependency
- config.example.yaml: add commented-out Exa configuration examples

Usage: set `use: deerflow.community.exa.tools:web_search_tool` in
config.yaml and provide EXA_API_KEY.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(community): address PR review comments for Exa tools

- Make _get_exa_client() accept tool_name param so web_fetch reads its own config
- Remove __init__.py to match namespace package pattern of other providers
- Add duplicate tool name warning in config.example.yaml
- Add regression tests for web_fetch config resolution

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Update revision in uv.lock to 3

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
2026-04-08 17:13:39 +08:00
Gao Mingfei
29817c3b34
fix(backend): use timezone-aware UTC in memory modules (fix pytest DeprecationWarnings) (#1992)
* fix(backend): use timezone-aware UTC in memory modules

Replace datetime.utcnow() with datetime.now(timezone.utc) and a shared
utc_now_iso_z() helper so persisted ISO timestamps keep the trailing Z
suffix without triggering Python 3.12+ deprecation warnings.

Made-with: Cursor

* refactor(backend): use removesuffix for utc_now_iso_z suffix

Makes the +00:00 -> Z transform explicit for the trailing offset only
(Copilot review on PR #1992).

Made-with: Cursor

* style(backend): satisfy ruff UP017 with datetime.UTC in memory queue

Made-with: Cursor

---------

Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
2026-04-08 16:28:00 +08:00
Saber
e5b149068c
Fix(subagent): Event loop conflict in SubagentExecutor.execute() (#1965)
* Fix event loop conflict in SubagentExecutor.execute()

When SubagentExecutor.execute() is called from within an already-running
event loop (e.g., when the parent agent uses async/await), calling
asyncio.run() creates a new event loop that conflicts with asyncio
primitives (like httpx.AsyncClient) that were created in and bound to
the parent loop.

This fix detects if we're already in a running event loop, and if so,
runs the subagent in a separate thread with its own isolated event loop
to avoid conflicts.

Fixes: sub-task cards not appearing in Ultra mode when using async parent agents

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(subagent): harden isolated event loop execution

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-08 11:46:06 +08:00
85b7ed3cec
fix(frontend): avoid using route new as thread id (#1967)
Co-authored-by: luoxiao6645 <luoxiao6645@gmail.com>
2026-04-08 10:08:55 +08:00
siwuai
24805200f0
fix(frontend): prevent stale 'new' thread ID from triggering 422 history requests (#1960)
After history.replaceState updates the URL from /chats/new to
/chats/{UUID}, Next.js useParams does not update because replaceState
bypasses the router. The useEffect in useThreadChat would then set
threadIdFromPath ('new') as the threadId, causing the LangGraph SDK
to call POST /threads/new/history which returns HTTP 422 (Invalid
thread ID: must be a UUID).

This fix adds a guard to skip the threadId update when
threadIdFromPath is the literal string 'new', preserving the
already-correct UUID that was set when the thread was created.
2026-04-08 10:03:07 +08:00
13ernkastel
722a9c4753
docs: clarify deployment sizing guidance (#1963) 2026-04-08 09:45:31 +08:00
Xinmin Zeng
d1baf7212b
fix(frontend): UI polish - fix CSS typo, dark mode border, and hardcoded colors (#1942)
- Fix `font-norma` typo to `font-normal` in message-list subtask count
- Fix dark mode `--border` using reddish hue (22.216) instead of neutral
- Replace hardcoded `rgb(184,184,192)` in hero with `text-muted-foreground`
- Replace hardcoded `bg-[#a3a1a1]` in streaming indicator with `bg-muted-foreground`
- Add missing `font-sans` to welcome description `<pre>` for consistency
- Make case-study-section padding responsive (`px-4 md:px-20`)

Closes #1940
2026-04-08 09:07:39 +08:00
Async23
0948c7a4e1
fix(provider): preserve streamed Codex output when response.completed.output is empty (#1928)
* fix: preserve streamed Codex output items

* fix: prefer completed Codex output over streamed placeholders
2026-04-07 18:21:22 +08:00
koppx
c3170f22da
fix(backend): make loop detection hash tool calls by stable keys (#1911)
* fix(backend): make loop detection hash tool calls by stable keys

The loop detection middleware previously hashed full tool call arguments,
which made repeated calls look different when only non-essential argument
details changed. In particular, `read_file` calls with nearby line ranges
could bypass repetition detection even when the agent was effectively
reading the same file region again and again.

- Hash tool calls using stable keys instead of the full raw args payload
- Bucket `read_file` line ranges so nearby reads map to the same region key
- Prefer stable identifiers such as `path`, `url`, `query`, or `command`
  before falling back to JSON serialization of args
- Keep hashing order-independent so the same tool call set produces the
  same hash regardless of call order

Fixes #1905

* fix(backend): harden loop detection hash normalization

- Normalize and parse stringified tool args defensively
- Expand stable key derivation to include pattern, glob, and cmd
- Normalize reversed read_file ranges before bucketing

Fixes #1905

* fix(backend): harden loop detection tool format

* exclude write_file and str_replace from the stable-key path — writing different content to the same file shouldn't be flagged.

---------

Co-authored-by: JeffJiang <for-eleven@hotmail.com>
2026-04-07 17:46:33 +08:00
Anson Li
1193ac64dc
fix(frontend): unify local settings runtime state and remove sidebar layout from LocalSettings (#1879)
* fix(frontend): resolve layout flickering by migrating workspace sidebar state to cookie

* fix(frontend): unify local settings runtime state to fix state drift

* fix(frontend): only persist thread model on explicit context model updates
2026-04-07 17:41:34 +08:00
Admire
ab41de2961
fix(frontend):keep DeerFlow chat thread ids in sync (#1931)
* fix: replay thread sync changes on top of main

* fix: avoid stale thread ids during stream startup
2026-04-07 17:15:46 +08:00
KKK
3b3e8e1b0b
feat(sandbox): strengthen bash command auditing with compound splitting and expanded patterns (#1881)
* fix(sandbox): strengthen regex coverage in SandboxAuditMiddleware

Expand high-risk patterns from 6 to 13 and medium-risk from 4 to 6,
closing several bypass vectors identified by cross-referencing Claude
Code's BashSecurity validator chain against DeerFlow's threat model.

High-risk additions:
- Generalised pipe-to-sh (replaces narrow curl|sh rule)
- Targeted command substitution ($() / backtick with dangerous executables)
- base64 decode piped to execution
- Overwrite system binaries (/usr/bin/, /bin/, /sbin/)
- Overwrite shell startup files (~/.bashrc, ~/.profile, etc.)
- /proc/*/environ leakage
- LD_PRELOAD / LD_LIBRARY_PATH hijack
- /dev/tcp/ bash built-in networking

Medium-risk additions:
- sudo/su (no-op under Docker root, warn only)
- PATH= modification (long attack chain, warn only)

Design decisions:
- Command substitution uses targeted matching (curl/wget/bash/sh/python/
  ruby/perl/base64) rather than blanket block to avoid false positives
  on safe usage like $(date) or `whoami`.
- Skipped encoding/obfuscation checks (hex, octal, Unicode homoglyphs)
  as ROI is low in Docker sandbox — LLMs don't generate encoded commands
  and container isolation bounds the blast radius.
- Merged pip/pip3 into single pip3? pattern.

* feat(sandbox): compound command splitting and fork bomb detection

Split compound bash commands (&&, ||, ;) into sub-commands and classify
each independently — prevents dangerous commands hidden after safe
prefixes (e.g. "cd /workspace && rm -rf /") from bypassing detection.

- Add _split_compound_command() with shlex quote-aware splitting
- Add fork bomb detection patterns (classic and while-loop variants)
- Most severe verdict wins; block short-circuits
- 15 new tests covering compound commands, splitting, and fork bombs

* test(sandbox): add async tests for fork bomb and compound commands

Cover awrap_tool_call path for fork bomb detection (3 variants) and
compound command splitting (block/warn/pass scenarios).

* fix(sandbox): address Copilot review — no-whitespace operators, >>/etc/, whole-command scan

- _split_compound_command: replace shlex-based implementation with a
  character-by-character quote/escape-aware scanner. shlex.split only
  separates '&&' / '||' / ';' when they are surrounded by whitespace,
  so payloads like 'rm -rf /&&echo ok' or 'safe;rm -rf /' bypassed the
  previous splitter and therefore the per-sub-command classifier.
- _HIGH_RISK_PATTERNS: change r'>\s*/etc/' to r'>+\s*/etc/' so append
  redirection ('>>/etc/hosts') is also blocked.
- _classify_command: run a whole-command high-risk scan *before*
  splitting. Structural attacks like 'while true; do bash & done'
  span multiple shell statements — splitting on ';' destroys the
  pattern context, so the raw command must be scanned first.
- tests: add no-whitespace operator cases to TestSplitCompoundCommand
  and test_compound_command_classification to lock in the bypass fix.
2026-04-07 17:15:24 +08:00
Admire
4004fb849f
Fix agent gallery after bootstrap creation 修复新建智能体后菜单仍为空的问题 (#1934)
* fix: persist agent before bootstrap chat

* style: normalize line endings for agent creation page

* fix: address review feedback for agent creation flow

---------

Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
2026-04-07 17:10:08 +08:00
Henry Li
f467e613b6
feat: add BytePlus logo (#1948) 2026-04-07 16:07:37 +08:00
lulusiyuyu
f0dd8cb0d2
fix(subagents): add cooperative cancellation for subagent threads (#1873)
* fix(subagents): add cooperative cancellation for subagent threads

Subagent tasks run inside ThreadPoolExecutor threads with their own
event loop (asyncio.run). When a user clicks stop, RunManager cancels
the parent asyncio.Task, but Future.cancel() cannot terminate a running
thread and asyncio.Event does not propagate across event loops. This
causes subagent threads to keep executing (writing files, calling LLMs)
even after the user explicitly stops the run.

Fix: add a threading.Event (cancel_event) to SubagentResult and check
it cooperatively in _aexecute()'s astream iteration loop. On cancel,
request_cancel_background_task() sets the event, and the thread exits
at the next iteration boundary.

Changes:
- executor.py: Add cancel_event field to SubagentResult, check it in
  _aexecute loop, set it on timeout, add request_cancel_background_task
- task_tool.py: Call request_cancel_background_task on CancelledError

* fix(subagents): guard cancel status and add pre-check before astream

- Only overwrite status to FAILED when still RUNNING, preserving
  TIMED_OUT set by the scheduler thread.
- Add cancel_event pre-check before entering the astream loop so
  cancellation is detected immediately when already signalled.

* fix(subagents): guard status updates with lock to prevent race condition

Wrap the check-and-set on result.status in _aexecute with
_background_tasks_lock so the timeout handler in execute_async
cannot interleave between the read and write.

* fix(subagents): add dedicated CANCELLED status for user cancellation

Introduce SubagentStatus.CANCELLED to distinguish user-initiated
cancellation from actual execution failures.  Update _aexecute,
task_tool polling, cleanup terminal-status sets, and test fixtures.

* test(subagents): add cancellation tests and fix timeout regression test

- Add dedicated TestCooperativeCancellation test class with 6 tests:
  - Pre-set cancel_event prevents astream from starting
  - Mid-stream cancel_event returns CANCELLED immediately
  - request_cancel_background_task() sets cancel_event correctly
  - request_cancel on nonexistent task is a no-op
  - Real execute_async timeout does not overwrite CANCELLED (deterministic
    threading.Event sync, no wall-clock sleeps)
  - cleanup_background_task removes CANCELLED tasks

- Add task_tool cancellation coverage:
  - test_cancellation_calls_request_cancel: assert CancelledError path
    calls request_cancel_background_task(task_id)
  - test_task_tool_returns_cancelled_message: assert CANCELLED polling
    branch emits task_cancelled event and returns expected message

- Fix pre-existing test infrastructure issue: add deerflow.sandbox.security
  to _MOCKED_MODULE_NAMES (fixes ModuleNotFoundError for all executor tests)

- Add RUNNING guard to timeout handler in executor.py to prevent
  TIMED_OUT from overwriting CANCELLED status

- Add cooperative cancellation granularity comment documenting that
  cancellation is only detected at astream iteration boundaries

---------

Co-authored-by: lulusiyuyu <lulusiyuyu@users.noreply.github.com>
2026-04-07 11:12:25 +08:00
DanielWalnut
7643a46fca
fix(skill): make skill prompt cache refresh nonblocking (#1924)
* fix: make skill prompt cache refresh nonblocking

* fix: harden skills prompt cache refresh

* chore: add timeout to skills cache warm-up
2026-04-07 10:50:34 +08:00
Markus Corazzione
c4da0e8ca9
Move async SQLite mkdir off the event loop (#1921)
Co-authored-by: DanielWalnut <45447813+hetaoBackend@users.noreply.github.com>
2026-04-07 10:47:20 +08:00
yangzheli
3acdf79beb
fix(frontend): resolve invalid HTML nesting and tabnabbing vulnerabilities (#1904)
* fix(frontend): resolve invalid HTML nesting and tabnabbing vulnerabilities

Fix `<button>` inside `<a>` invalid HTML in artifact components and add
missing `noopener,noreferrer` to `window.open` calls to prevent reverse
tabnabbing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(frontend): address Copilot review on tabnabbing and double-tab-open

Remove redundant parent onClick on web_fetch ChainOfThoughtStep to
prevent opening two tabs on link click, and explicitly null out
window.opener after window.open() for defensive tabnabbing hardening.

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-07 09:44:17 +08:00
jie
2d068cc075
fix(docker): restore gateway env vars and fix langgraph empty arg issue (#1915)
Two production docker-compose.yaml bugs prevent `make up` from working:

1. Gateway missing DEER_FLOW_CONFIG_PATH and DEER_FLOW_EXTENSIONS_CONFIG_PATH
   environment overrides. Added in fb2d99f (#1836) but accidentally reverted
   by ca2fb95 (#1847). Without them, gateway reads host paths from .env via
   env_file, causing FileNotFoundError inside the container.

2. Langgraph command fails when LANGGRAPH_ALLOW_BLOCKING is unset (default).
   Empty $${allow_blocking} inserts a bare space between flags, causing
   ' --no-reload' to be parsed as unexpected extra argument. Fix by building
   args string first and conditionally appending --allow-blocking.

Co-authored-by: cooper <cooperfu@tencent.com>
2026-04-07 08:54:44 +08:00
JilongSun
88e535269e
Feature/feishu receive file (#1608)
* feat(feishu): add channel file materialization hook for inbound messages

- Introduce Channel.receive_file(msg, thread_id) as a base method for file materialization; default is no-op.
- Implement FeishuChannel.receive_file to download files/images from Feishu messages, save to sandbox, and inject virtual paths into msg.text.
- Update ChannelManager to call receive_file for any channel if msg.files is present, enabling downstream model access to user-uploaded files.
- No impact on Slack/Telegram or other channels (they inherit the default no-op).

* style(backend): format code with ruff for lint compliance

- Auto-formatted packages/harness/deerflow/agents/factory.py and tests/test_create_deerflow_agent.py using `ruff format`
- Ensured both files conform to project linting standards
- Fixes CI lint check failures caused by code style issues

* fix(feishu): handle file write operation asynchronously to prevent blocking

* fix(feishu): rename GetMessageResourceRequest to _GetMessageResourceRequest and remove redundant code

* test(feishu): add tests for receive_file method and placeholder replacement

* fix(manager): remove unnecessary type casting for channel retrieval

* fix(feishu): update logging messages to reflect resource handling instead of image

* fix(feishu): sanitize filename by replacing invalid characters in file uploads

* fix(feishu): improve filename sanitization and reorder image key handling in message processing

* fix(feishu): add thread lock to prevent filename conflicts during file downloads

* fix(test): correct bad merge in test_feishu_parser.py

* chore: run ruff and apply formatting cleanup
fix(feishu): preserve rich-text attachment order and improve fallback filename handling
2026-04-06 22:14:12 +08:00
DanielWalnut
888f7bfb9d
Implement skill self-evolution and skill_manage flow (#1874)
* chore: ignore .worktrees directory

* Add skill_manage self-evolution flow

* Fix CI regressions for skill_manage

* Address PR review feedback for skill evolution

* fix(skill-evolution): preserve history on delete

* fix(skill-evolution): tighten scanner fallbacks

* docs: add skill_manage e2e evidence screenshot

* fix(skill-manage): avoid blocking fs ops in session runtime

---------

Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
2026-04-06 22:07:11 +08:00
KKK
055e4df049
fix(sandbox): add input sanitisation guard to SandboxAuditMiddleware (#1872)
* fix(sandbox): add L2 input sanitisation to SandboxAuditMiddleware

Add _validate_input() to reject malformed bash commands before regex
classification: empty commands, oversized commands (>10 000 chars), and
null bytes that could cause detection/execution layer inconsistency.

* fix(sandbox): address Copilot review — type guard, log truncation, reject reason

- Coerce None/non-string command to str before validation
- Truncate oversized commands in audit logs to prevent log amplification
- Propagate reject_reason through _pre_process() to block message
- Remove L2 label from comments and test class names

* fix(sandbox): isinstance type guard + async input sanitisation tests

Address review comments:
- Replace str() coercion with isinstance(raw_command, str) guard so
  non-string truthy values (0, [], False) fall back to empty string
  instead of passing validation as "0"/"[]"/"False".
- Add TestInputSanitisationBlocksInAwrapToolCall with 4 async tests
  covering empty, null-byte, oversized, and None command via
  awrap_tool_call path.
2026-04-06 17:21:58 +08:00
Zhou
1ced6e977c
fix(backend): preserve viewed image reducer metadata (#1900)
Fix concurrent viewed_images state updates for multi-image input by preserving the reducer metadata in the vision middleware state schema.
2026-04-06 16:47:19 +08:00
Zhou
f5088ed70d
fix(frontend): artifact download action bounds and lint errors (#1899)
* fix: keep artifact download action in bounds

* fix: fix lint error
2026-04-06 16:34:40 +08:00
Zhou
55e78de6fc
fix: wrap suggestion chips without overlapping input (#1895)
* fix: wrap suggestion chips without overlapping input

* fix: fix lint error
2026-04-06 16:30:57 +08:00
NmanQAQ
dd30e609f7
feat(models): add vLLM provider support (#1860)
support for vLLM 0.19.0 OpenAI-compatible chat endpoints and fixes the Qwen reasoning toggle so flash mode can actually disable thinking.

Co-authored-by: NmanQAQ <normangyao@qq.com>
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
2026-04-06 15:18:34 +08:00
yangzheli
5fd2c581f6
fix: add output truncation to ls_tool to prevent context window overflow (#1896)
ls_tool was the only sandbox tool without output size limits, allowing
multi-MB results from large directories to blow up the model context
window. Add head-truncation (configurable via ls_output_max_chars,
default 20000) consistent with existing bash and read_file truncation.

Closes #1887

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-06 15:09:57 +08:00
Chincherry93
d7a3eff23e
fix(docker): command syntax for LANGGRAPH_ALLOW_BLOCKING (#1891) 2026-04-06 15:02:29 +08:00
qqwas
ee06440205
fix(frontend): Update route.ts default backend port(#1892) 2026-04-06 14:54:50 +08:00
7c68dd4ad4
Fix(#1702): stream resume run (#1858)
* fix: repair stream resume run metadata

# Conflicts:
#	backend/packages/harness/deerflow/runtime/stream_bridge/memory.py
#	frontend/src/core/threads/hooks.ts

* fix(stream): repair resumable replay validation

---------

Co-authored-by: luoxiao6645 <luoxiao6645@gmail.com>
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
2026-04-06 14:51:10 +08:00
suyua9
29575c32f9
fix: expose custom events from DeerFlowClient.stream() (#1827)
* fix: expose custom client stream events

Signed-off-by: suyua9 <1521777066@qq.com>

* fix(client): normalize streamed custom mode values

* test(client): satisfy backend ruff import ordering

---------

Signed-off-by: suyua9 <1521777066@qq.com>
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
2026-04-06 10:09:39 +08:00