* fix(frontend): add missing rel="noopener noreferrer" to target="_blank" links
Prevent tabnabbing attacks and referrer leakage by ensuring all
external links with target="_blank" include both noopener and
noreferrer in the rel attribute.
Made-with: Cursor
* style: fix code formatting
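* fix(sandbox): URL paths misidentified as unsafe absolute paths (#1385)
In local sandbox mode, when the bash tool runs its absolute-path safety
check on a command, it misidentifies the HTTPS URL in a curl command
(e.g. https://example.com/api/v1/check) as a local absolute path and
blocks it.
Root cause: the negative lookbehind (?<![:\w]) in the
_ABSOLUTE_PATH_PATTERN regex excludes only colons and word characters.
The second slash in :// is preceded by the first slash (/), which is not
in the exclusion list, so //example.com/api/... is matched as the
absolute path /example.com/api/....
Fix: add the slash character to the negative lookbehind, changing it to
(?<![:\w/]), so that the consecutive slashes in :// no longer trigger an
absolute-path match. Also add URL-related unit test cases.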
Signed-off-by: moose-lab <moose-lab@users.noreply.github.com>
* fix(sandbox): refine absolute path regex to preserve file:// defense-in-depth
Change lookbehind from (?<![:\w/]) to (?<![:\w])(?<!:/) so only the
second slash in :// sequences is excluded. This keeps URL paths from
false-positiving while still letting the regex detect /etc/passwd in
file:///etc/passwd. Also add explicit file:// URL blocking and tests.
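The three lookbehind variants described above can be compared side by side. This is a minimal sketch: the path body (`_PATH_BODY`) is a simplified stand-in and the names `ORIGINAL`, `FIRST_FIX`, `REFINED`, and `find_paths` are hypothetical; only the lookbehind prefixes are taken from the commit messages.

```python
import re

# Assumption: a simplified path body; the real _ABSOLUTE_PATH_PATTERN may differ.
_PATH_BODY = r"/[^\s\"'`;&|<>()]+"

ORIGINAL = re.compile(r"(?<![:\w])" + _PATH_BODY)         # before #1385
FIRST_FIX = re.compile(r"(?<![:\w/])" + _PATH_BODY)       # first fix
REFINED = re.compile(r"(?<![:\w])(?<!:/)" + _PATH_BODY)   # refined fix


def find_paths(pattern, command):
    """Return every substring the pattern treats as an absolute path."""
    return pattern.findall(command)


url_cmd = "curl https://example.com/api/v1/check"
file_cmd = "curl file:///etc/passwd"

# ORIGINAL false-positives on the URL; FIRST_FIX misses /etc/passwd inside
# file:///; REFINED skips only the second slash of "://".
print(find_paths(ORIGINAL, url_cmd))
print(find_paths(FIRST_FIX, file_cmd))
print(find_paths(REFINED, file_cmd))
```

Two fixed-width lookbehinds are used rather than one alternation because Python's `re` requires each lookbehind to have a fixed width.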
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Signed-off-by: moose-lab <moose-lab@users.noreply.github.com>
Co-authored-by: moose-lab <moose-lab@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
* fix: prevent concurrent subagent file write conflicts
Serialize same-path str_replace operations in sandbox tools
Guard AioSandbox write_file/update_file with the existing sandbox lock
Add regression tests for concurrent str_replace and append races
Verify with backend full tests and ruff lint checks
* fix(sandbox): Fix the concurrency issue of file operations on the same path in isolated sandboxes.
Ensure that different sandbox instances use independent locks for file operations on the same virtual path to avoid concurrency conflicts. Change the lock key from a single path to a composite key of (sandbox.id, path), and add tests to verify the concurrent safety of isolated sandboxes.
* feat(sandbox): Extract file operation lock logic to standalone module and fix concurrency issues
Extract file operation lock related logic from tools.py into a separate file_operation_lock.py module.
Fix data race issues during concurrent str_replace and write_file operations.
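A minimal sketch of the composite-key locking idea, assuming a process-wide registry of `threading.Lock` objects. The names `get_file_operation_lock` and `append_line` are hypothetical and are not taken from file_operation_lock.py.

```python
import threading

# Registry of locks keyed by (sandbox_id, path); the guard protects the dict.
_locks: dict[tuple[str, str], threading.Lock] = {}
_locks_guard = threading.Lock()


def get_file_operation_lock(sandbox_id: str, path: str) -> threading.Lock:
    """Return the lock for one path inside one sandbox, creating it on demand."""
    key = (sandbox_id, path)
    with _locks_guard:
        lock = _locks.get(key)
        if lock is None:
            lock = _locks[key] = threading.Lock()
        return lock


def append_line(store: dict, sandbox_id: str, path: str, line: str) -> None:
    """Read-modify-write on an in-memory 'file', serialized per (sandbox, path)."""
    with get_file_operation_lock(sandbox_id, path):
        key = (sandbox_id, path)
        store[key] = store.get(key, "") + line + "\n"
```

Keying on the pair means two sandboxes that both expose the same virtual path never wait on each other, while writers inside one sandbox still queue up.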
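* feat(agent): add skills field to AgentConfig and update lead_agent system prompt
Add a skills field to AgentConfig to support configuring the skills available to an agent
Update the lead_agent system prompt template to include the available skills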
* fix: resolve agent skill configuration edge cases and add tests
* Update backend/packages/harness/deerflow/agents/lead_agent/prompt.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* refactor(agent): address PR review comments for skills configuration
- Add detailed docstring to `skills` field in `AgentConfig` to clarify the semantics of `None` vs `[]`.
- Add unit tests in `test_custom_agent.py` to verify `load_agent_config()` correctly parses omitted skills and explicit empty lists.
- Fix `test_make_lead_agent_empty_skills_passed_correctly` to include `agent_name` in the runtime config, ensuring it exercises the real code path.
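* docs: add configuration notes on per-agent skill filtering
Add notes to the example config file and the docs explaining how to restrict the loaded skills through an agent's config.yaml file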
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* feat(sandbox): truncate oversized bash and read_file tool outputs
Long tool outputs (large directory listings, multi-MB source files) can
overflow the model's context window. Two new configurable limits:
- bash_output_max_chars (default 20000): middle-truncates bash output,
preserving both head and tail so stderr at the end is not lost
- read_file_output_max_chars (default 50000): head-truncates file output
with a hint to use start_line/end_line for targeted reads
Both limits are enforced at the tool layer (sandbox/tools.py) rather
than middleware, so truncation is guaranteed regardless of call path.
Setting either limit to 0 disables truncation entirely.
Measured: read_file on a 250KB source file drops from 63,698 tokens to
19,927 tokens (69% reduction) with the default limit.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(tests): remove unused pytest import and fix import sort order
* style: apply ruff format to sandbox/tools.py
* refactor(sandbox): address Copilot review feedback on truncation feature
- strict hard cap: while-loop ensures result (including marker) ≤ max_chars
- max_chars=0 now returns "" instead of original output
- get_app_config() wrapped in try/except with fallback to defaults
- sandbox_config.py: add ge=0 validation on truncation limit fields
- config.example.yaml: bump config_version 4→5
- tests: add len(result) <= max_chars assertions, edge-case (max=0, small
max, various sizes) tests; fix skipped-count test for strict hard cap
* refactor(sandbox): replace while-loop truncation with fixed marker budget
Use a pre-allocated constant (_MARKER_MAX_LEN) instead of a convergence
loop to ensure result <= max_chars. Simpler, safer, and skipped-char
count in the marker is now an exact predictable value.
* refactor(sandbox): compute marker budget dynamically instead of hardcoding
* fix(sandbox): make max_chars=0 disable truncation instead of returning empty string
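The end state of the truncation commits above (middle truncation, a dynamically computed marker budget, and `max_chars=0` meaning "disabled") might look roughly like this. It is a sketch under assumptions: the marker text and the names `_marker` and `truncate_middle` are hypothetical, not the code in sandbox/tools.py.

```python
def _marker(skipped: int) -> str:
    # Hypothetical marker wording; the real text may differ.
    return f"\n... [{skipped} characters truncated] ...\n"


def truncate_middle(text: str, max_chars: int) -> str:
    """Keep the head and tail of text so the result is at most max_chars long."""
    if max_chars == 0 or len(text) <= max_chars:
        return text  # 0 disables truncation entirely
    # Budget for the marker: the skipped count can never exceed len(text),
    # so this is an upper bound on the marker's final length.
    marker_budget = len(_marker(len(text)))
    if max_chars <= marker_budget:
        return text[:max_chars]  # no room for a marker; plain head cut
    keep = max_chars - marker_budget
    tail = keep // 2
    head = keep - tail
    skipped = len(text) - head - tail
    tail_text = text[-tail:] if tail else ""
    return text[:head] + _marker(skipped) + tail_text
```

Reserving the marker budget up front is what removes the need for a convergence loop: the head and tail sizes are fixed before the marker is rendered, and the rendered marker can only be as long as the budget or shorter.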
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: JeffJiang <for-eleven@hotmail.com>
Feishu channel classified any slash-prefixed text (including absolute
paths such as /mnt/user-data/...) as a COMMAND, causing them to be
misrouted through the command pipeline instead of the chat pipeline.
Fix by introducing a shared KNOWN_CHANNEL_COMMANDS frozenset in
app/channels/commands.py — the single authoritative source for the set
of supported slash commands. Both the Feishu inbound parser and the
ChannelManager's unknown-command reply now derive from it, so adding
or removing a command requires only one edit.
Changes:
- app/channels/commands.py (new): defines KNOWN_CHANNEL_COMMANDS
- app/channels/feishu.py: replace local KNOWN_FEISHU_COMMANDS with the
shared constant; _is_feishu_command() now gates on it
- app/channels/manager.py: import KNOWN_CHANNEL_COMMANDS and use it in
the unknown-command fallback reply so the displayed list stays in sync
- tests/test_feishu_parser.py: parametrize over every entry in
KNOWN_CHANNEL_COMMANDS (each must yield msg_type=command) and add
parametrized chat cases for /unknown, absolute paths, etc.
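A minimal sketch of gating on a shared command set. The entries in `KNOWN_CHANNEL_COMMANDS` below are placeholders (the real set is not listed in this message), and `classify_message` and `unknown_command_reply` are hypothetical helper names.

```python
# Assumption: placeholder entries; the real set lives in app/channels/commands.py.
KNOWN_CHANNEL_COMMANDS = frozenset({"/new", "/help", "/status"})


def classify_message(text: str) -> str:
    """Return 'command' only when the first token is a known slash command."""
    stripped = text.strip()
    if not stripped.startswith("/"):
        return "chat"
    first_token = stripped.split(maxsplit=1)[0].lower()
    return "command" if first_token in KNOWN_CHANNEL_COMMANDS else "chat"


def unknown_command_reply() -> str:
    """Build the fallback reply from the same set so the list stays in sync."""
    return "Unknown command. Available commands: " + ", ".join(sorted(KNOWN_CHANNEL_COMMANDS))


print(classify_message("/help"))
print(classify_message("/mnt/user-data/outputs/report.md"))
```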
Made-with: Cursor
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
* fix(gateway): prevent 400 error when client sends context with configurable
Fixes #1290
LangGraph >= 0.6.0 rejects requests that include both 'configurable' and
'context' in the run config. If the client (e.g. useStream hook) sends
a 'context' key, we now honour it and skip creating our own
'configurable' dict to avoid the conflict.
When no 'context' is provided, we fall back to the existing
'configurable' behaviour with thread_id.
* fix(gateway): address review feedback — warn on dual keys, fix runtime injection, add tests
- Log a warning when client sends both 'context' and 'configurable' so
it's no longer silently dropped (reviewer feedback)
- Ensure thread_id is available in config['context'] when present so
middlewares can find it there too
- Add test coverage for the context path, the both-keys-present case,
passthrough of other keys, and the no-config fallback
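The context-versus-configurable rule can be sketched as follows. This is an illustration under assumptions, not the code in services.py: the function name `build_run_config_sketch`, the plain-dict config shape, and the warning text are all hypothetical.

```python
import logging

logger = logging.getLogger(__name__)


def build_run_config_sketch(thread_id, request_config=None):
    """Emit either 'context' or 'configurable' in the run config, never both."""
    request_config = request_config or {}
    # Pass every other key through untouched.
    config = {k: v for k, v in request_config.items() if k not in ("context", "configurable")}
    if "context" in request_config:
        if "configurable" in request_config:
            logger.warning("Both 'context' and 'configurable' were sent; ignoring 'configurable'")
        context = dict(request_config["context"] or {})
        context.setdefault("thread_id", thread_id)  # middlewares can find it here
        config["context"] = context
    else:
        configurable = dict(request_config.get("configurable") or {})
        configurable["thread_id"] = thread_id
        config["configurable"] = configurable
    return config
```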
* style: ruff format services.py
---------
Co-authored-by: JasonOA888 <JasonOA888@users.noreply.github.com>
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
* windows check and dev fixes
* fix windows startup scripts
---------
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
The langgraph-compat layer dropped the DeerFlow-specific `context` field
from run requests, causing agent config (subagent_enabled, is_plan_mode,
thinking_enabled, etc.) to fall back to defaults. Add `context` to
RunCreateRequest and merge allowlisted keys into config.configurable in
start_run, with existing configurable values taking precedence.
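A minimal sketch of the allowlisted merge. The allowlist below contains only the three keys named in this message (the real list is longer, per the "etc."), and `merge_context_into_configurable` is a hypothetical name.

```python
# Assumption: only the keys named in the commit message; the real allowlist may be larger.
CONTEXT_ALLOWLIST = frozenset({"subagent_enabled", "is_plan_mode", "thinking_enabled"})


def merge_context_into_configurable(configurable, context):
    """Copy allowlisted context keys, letting existing configurable values win."""
    merged = {k: v for k, v in (context or {}).items() if k in CONTEXT_ALLOWLIST}
    merged.update(configurable or {})  # configurable takes precedence
    return merged


print(merge_context_into_configurable(
    {"thinking_enabled": False},
    {"thinking_enabled": True, "is_plan_mode": True, "not_allowed": 1},
))
```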
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor: replace sync requests with async httpx in Jina AI client
Replace synchronous `requests.post()` with `httpx.AsyncClient` in
JinaClient.crawl() and make web_fetch_tool async. This is part of the
planned async concurrency optimization for the agent hot path
(see docs/TODO.md).
* fix: address Copilot review feedback on async Jina client
- Short-circuit error strings in web_fetch_tool before passing to
ReadabilityExtractor, preventing misleading extraction results
- Log missing JINA_API_KEY warning only once per process to reduce
noise under concurrent async fetching
- Use logger.exception instead of logger.error in crawl exception
handler to preserve stack traces for debugging
- Add async web_fetch_tool tests and warn-once coverage
* fix: mock get_app_config in web_fetch_tool tests for CI
The web_fetch_tool tests failed in CI because get_app_config requires
a config.yaml file that isn't present in the test environment. Mock
the config loader to remove the filesystem dependency.
---------
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
#1623 added the `--n-jobs-per-worker 10` flag to both Docker Compose files but missed the
backend Makefile used by `make dev`. Without it `langgraph dev`
defaults to n_jobs_per_worker=1, so all conversation runs are
serialised and concurrent requests block.
This mirrors the Docker configuration.
Docker's -v host:container syntax is ambiguous for Windows drive-letter
paths (e.g. D:/...) because ':' is both the drive separator and the
volume separator, causing mount failures on Windows hosts.
Introduce _format_container_mount() which uses '--mount type=bind,...'
for Docker (unambiguous on all platforms) and keeps '-v' for Apple
Container runtime which does not support the --mount flag yet.
Adds unit tests covering Windows paths, read-only mounts, and Apple
Container pass-through.
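A sketch of the mount-formatting idea, assuming the helper returns a list of CLI arguments. The signature, the `runtime` values, and the name `format_container_mount` are hypothetical; the `--mount type=bind,source=...,target=...[,readonly]` and `-v host:container[:ro]` syntaxes are the standard Docker forms.

```python
def format_container_mount(runtime, host_path, container_path, read_only=False):
    """Return CLI args that bind-mount host_path at container_path."""
    if runtime == "docker":
        # key=value pairs are comma-separated, so a drive-letter colon is harmless.
        spec = f"type=bind,source={host_path},target={container_path}"
        if read_only:
            spec += ",readonly"
        return ["--mount", spec]
    # Other runtimes (e.g. Apple Container) keep the colon-separated -v form.
    spec = f"{host_path}:{container_path}"
    if read_only:
        spec += ":ro"
    return ["-v", spec]


print(format_container_mount("docker", "D:/work/data", "/mnt/user-data", read_only=True))
```

One caveat of this sketch: a host path that itself contains a comma would still need escaping in the `--mount` form.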
Made-with: Cursor
* fix(gateway): forward assistant_id as agent_name in build_run_config
Fixes #1644
When the LangGraph Platform-compatible /runs endpoint receives a custom
assistant_id (e.g. 'finalis'), the Gateway's build_run_config() silently
ignored it — configurable['agent_name'] was never set, so make_lead_agent
fell through to the default lead agent and SOUL.md was never loaded.
Root cause (introduced in #1403):
resolve_agent_factory() correctly falls back to make_lead_agent for all
assistant_id values, but build_run_config() had no assistant_id parameter
and never injected configurable['agent_name']. The full call chain:
  POST /runs (assistant_id='finalis')
    → resolve_agent_factory('finalis')   # returns make_lead_agent ✓
    → build_run_config(thread_id, ...)   # no agent_name injected ✗
    → make_lead_agent(config)
      → cfg.get('agent_name') → None
      → load_agent_soul(None) → base SOUL.md (doesn't exist) → None
Fix:
- Add a keyword-only assistant_id parameter to build_run_config().
- When assistant_id is set and differs from 'lead_agent', inject it as
configurable['agent_name'] (matching the channel manager's existing
_resolve_run_params() logic for IM channels).
- Honour an explicit configurable['agent_name'] in the request body;
assistant_id mapping only fills the gap when it is absent.
- Remove stale log-only branch from resolve_agent_factory(); update
docstring to explain the factory/configurable split.
Tests added (test_gateway_services.py):
- Custom assistant_id injects configurable['agent_name']
- 'lead_agent' assistant_id does NOT inject agent_name
- None assistant_id does NOT inject agent_name
- Explicit configurable['agent_name'] in request is not overwritten
- resolve_agent_factory returns make_lead_agent for all inputs
* style: format with ruff
* fix: validate and normalize assistant_id to prevent path traversal
Addresses Copilot review: strip/lowercase/replace underscores and
reject names that don't match [a-z0-9-]+, consistent with
ChannelManager._normalize_custom_agent_name().
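The injection and normalization rules above can be sketched together. This is an illustration under assumptions: underscores are assumed to be replaced with hyphens, the config is a plain dict, and `normalize_agent_name` is a hypothetical helper rather than the Gateway's actual function.

```python
import re

_AGENT_NAME_RE = re.compile(r"[a-z0-9-]+")


def normalize_agent_name(assistant_id: str) -> str:
    """Strip, lowercase, map '_' to '-', and reject anything outside [a-z0-9-]+."""
    name = assistant_id.strip().lower().replace("_", "-")
    if not _AGENT_NAME_RE.fullmatch(name):
        raise ValueError(f"Invalid assistant_id: {assistant_id!r}")
    return name


def build_run_config(thread_id, request_config=None, *, assistant_id=None):
    """Fill configurable['agent_name'] from assistant_id only when it is absent."""
    configurable = dict((request_config or {}).get("configurable") or {})
    configurable.setdefault("thread_id", thread_id)
    if assistant_id and assistant_id != "lead_agent" and "agent_name" not in configurable:
        configurable["agent_name"] = normalize_agent_name(assistant_id)
    return {"configurable": configurable}
```

Because `fullmatch` rejects dots and slashes, a value such as `../etc/passwd` never reaches the code that resolves the agent's directory.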
---------
Co-authored-by: voidborne-d <voidborne-d@users.noreply.github.com>
* fix(sandbox): serialize concurrent exec_command calls in AioSandbox
The AIO sandbox container maintains a single persistent shell session
that corrupts when multiple exec_command requests arrive concurrently
(e.g. when ToolNode issues parallel tool_calls). The corrupted session
returns 'ErrorObservation' strings as output, cascading into subsequent
commands.
Add a threading.Lock to AioSandbox to serialize shell commands. As a
secondary defense, detect ErrorObservation in output and retry with a
fresh session ID.
Fixes #1433
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(sandbox): address Copilot review findings
- Fix shell injection in list_dir: use shlex.quote(path) to escape
user-provided paths in the find command
- Narrow ErrorObservation retry condition from broad substring match
to the specific corruption signature to prevent false retries
- Improve test_lock_prevents_concurrent_execution: use threading.Barrier
to ensure all workers contend for the lock simultaneously
- Improve test_list_dir_uses_lock: assert lock.locked() is True during
exec_command to verify lock acquisition
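The shell-injection fix in the first bullet comes down to quoting the path before it is interpolated into the command string. A minimal sketch, with a hypothetical helper name and hypothetical `find` flags:

```python
import shlex


def build_list_dir_command(path: str, max_depth: int = 2) -> str:
    """Build a find command in which the path is always a single shell word."""
    # Assumption: the real list_dir command uses different flags.
    return f"find {shlex.quote(path)} -maxdepth {int(max_depth)}"


cmd = build_list_dir_command("/tmp/x; rm -rf /")
print(cmd)
# Splitting the way a POSIX shell would shows the path stays one argument.
print(shlex.split(cmd))
```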
* style: auto-format with ruff
---------
Co-authored-by: Matt Van Horn <455140+mvanhorn@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
view_image_tool.py had a top-level import of deerflow.sandbox.tools, which
created a circular dependency chain:
  sandbox.tools
    -> deerflow.agents.thread_state (triggers agents/__init__.py)
      -> agents/factory.py
        -> tools/builtins/__init__.py
          -> view_image_tool.py
            -> deerflow.sandbox.tools  <-- circular!
This caused ImportError when any test directly imported sandbox.tools,
making test_sandbox_tools_security.py fail to collect since #1522.
Fix: move the sandbox.tools import inside the view_image_tool function body.
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(frontend): distinguish CORS errors from generic name check failures
* fix(frontend): improve network error message for agent name check
* Fix network error message in zh-CN locale
---------
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
The `environment` section in docker-compose.yaml set
`LANGSMITH_TRACING=${LANGSMITH_TRACING:-false}`, which always resolves
to `false` because Docker Compose evaluates `${}` substitutions from
the host shell environment, not from `env_file`.
Since `environment` entries take precedence over `env_file`, setting
`LANGSMITH_TRACING=true` in `.env` had no effect — tracing stayed
disabled despite following the documented instructions.
Remove the explicit `LANGSMITH_TRACING` from `environment` so the
value from `.env` (loaded via `env_file`) is used as intended.
The dev Docker Compose uses named volumes (langgraph-venv, gateway-venv)
to persist .venv across container restarts. Docker only populates named
volumes from the image on first creation — subsequent rebuilds do NOT
refresh existing volume contents.
When new dependencies are added to packages/harness/pyproject.toml
(e.g. langchain-anthropic), the stale named volume still contains
the old .venv missing the new packages, causing ModuleNotFoundError
at runtime.
Add `uv sync` before launching both gateway and langgraph services.
When deps are already satisfied this is a no-op (~1s), but when the
volume is stale it installs missing packages before the service starts.
Fixes #1624
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
`langgraph dev` defaults `n_jobs_per_worker` to 1 when the flag is not
explicitly passed (see langgraph_api/cli.py), even though the
`N_JOBS_PER_WORKER` env-var default is 10.
This causes the LangGraph server to run with a single background worker,
meaning all conversation runs are processed serially. When one run is
busy (e.g. summarization or long tool-calling chains), all other threads
are blocked until it finishes.
Add `--n-jobs-per-worker 10` to both production and dev Docker Compose
files to match the intended default concurrency.
* feat(gateway): implement LangGraph Platform API in Gateway, replace langgraph-cli
Implement all core LangGraph Platform API endpoints in the Gateway,
allowing it to fully replace the langgraph-cli dev server for local
development. This eliminates a heavyweight dependency and simplifies
the development stack.
Changes:
- Add runs lifecycle endpoints (create, stream, wait, cancel, join)
- Add threads CRUD and search endpoints
- Add assistants compatibility endpoints (search, get, graph, schemas)
- Add StreamBridge (in-memory pub/sub for SSE) and async provider
- Add RunManager with atomic create_or_reject (eliminates TOCTOU race)
- Add worker with interrupt/rollback cancel actions and runtime context injection
- Route /api/langgraph/* to Gateway in nginx config
- Skip langgraph-cli startup by default (SKIP_LANGGRAPH_SERVER=0 to restore)
- Add unit tests for RunManager, SSE format, and StreamBridge
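The atomic create_or_reject idea can be illustrated with an `asyncio.Lock` that covers both the check and the insert. This is a sketch under assumptions: the real RunManager's rejection criteria, data model, and method signatures are not described here, so the "one active run per thread" rule below is hypothetical.

```python
import asyncio
import uuid


class RunManager:
    """Hypothetical sketch: allow at most one active run per thread."""

    def __init__(self):
        self._lock = asyncio.Lock()
        self._active = {}  # thread_id -> run_id

    async def create_or_reject(self, thread_id):
        # Check and insert happen under one lock, so no other coroutine can
        # slip in between them (the time-of-check/time-of-use gap).
        async with self._lock:
            if thread_id in self._active:
                return None  # rejected
            await asyncio.sleep(0)  # stands in for an awaited step, e.g. persistence
            run_id = str(uuid.uuid4())
            self._active[thread_id] = run_id
            return run_id

    async def finish(self, thread_id):
        async with self._lock:
            self._active.pop(thread_id, None)


async def demo():
    manager = RunManager()
    return await asyncio.gather(*(manager.create_or_reject("t1") for _ in range(5)))


results = asyncio.run(demo())
print(sum(r is not None for r in results), "run accepted out of", len(results))
```

Without the lock, the `await` between the check and the insert would let several of the five concurrent callers pass the check before any of them recorded a run.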
* fix: drain bridge queue on client disconnect to prevent backpressure
When on_disconnect=continue, keep consuming events from the bridge
without yielding, so the worker is not blocked by a full queue.
Only on_disconnect=cancel breaks out immediately.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* fix: remove pytest import
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* fix: Fix default stream_mode to ["values", "messages-tuple"]
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* fix: Remove unused if_exists field from ThreadCreateRequest
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* fix: address review comments on gateway LangGraph API
- Mount runs.py router in app.py (missing include_router)
- Normalize interrupt_before/after "*" to node list before run_agent()
- Use entry.id for SSE event ID instead of counter
- Drain bridge queue on disconnect when on_disconnect=continue
- Reuse serialization helper in wait_run() for consistent wire format
- Reject unsupported multitask_strategy with 400
- Remove SKIP_LANGGRAPH_SERVER fallback, always use Gateway
* feat: extract app.state access into deps.py
Encapsulate read/write operations for singleton objects (RunManager,
StreamBridge, checkpointer) held in app.state into a shared utility,
reducing repeated access patterns across router modules.
* feat: extract deerflow.runtime.serialization module with tests
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor: replace duplicated serialization with deerflow.runtime.serialization
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: extract app/gateway/services.py with run lifecycle logic
Create a service layer that centralizes SSE formatting, input/config
normalization, and run lifecycle management. Router modules will delegate
to these functions instead of using private cross-imported helpers.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor: wire routers to use services layer, remove cross-module private imports
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* style: apply ruff formatting to refactored files
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(runtime): support LangGraph dev server and add compat route
- Enable official LangGraph dev server for local development workflow
- Decouple runtime components from agents package for better separation
- Provide gateway-backed fallback route when dev server is skipped
- Simplify lifecycle management using context manager in gateway
* feat(runtime): add Store providers with auto-backend selection
- Add async_provider.py and provider.py under deerflow/runtime/store/
- Support memory, sqlite, postgres backends matching checkpointer config
- Integrate into FastAPI lifespan via AsyncExitStack in deps.py
- Replace hardcoded InMemoryStore with config-driven factory
* refactor(gateway): migrate thread management from checkpointer to Store and resolve multiple endpoint failures
- Add Store-backed CRUD helpers (_store_get, _store_put, _store_upsert)
- Replace checkpoint-scanning search with two-phase strategy:
phase 1 reads Store (O(threads)), phase 2 backfills from checkpointer
for legacy/LangGraph Server threads with lazy migration
- Extend Store record schema with values field for title persistence
- Sync thread title from checkpoint to Store after run completion
- Fix /threads/{id}/runs/{run_id}/stream 405 by accepting both
GET and POST methods; POST handles interrupt/rollback actions
- Fix /threads/{id}/state 500 by separating read_config and
write_config, adding checkpoint_ns to configurable, and
shallow-copying checkpoint/metadata before mutation
- Sync title to Store on state update for immediate search reflection
- Move _upsert_thread_in_store into services.py, remove duplicate logic
- Add _sync_thread_title_after_run: await run task, read final
checkpoint title, write back to Store record
- Spawn title sync as background task from start_run when Store exists
* refactor(runtime): deduplicate store and checkpointer provider logic
Extract _ensure_sqlite_parent_dir() helper into checkpointer/provider.py
and use it in all three places that previously inlined the same mkdir logic.
Consolidate duplicate error constants in store/async_provider.py by importing
from store/provider.py instead of redefining them.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* refactor(runtime): move SQLite helpers to runtime/store, checkpointer imports from store
_resolve_sqlite_conn_str and _ensure_sqlite_parent_dir now live in
runtime/store/provider.py. agents/checkpointer/provider and
agents/checkpointer/async_provider import from there, reversing the
previous dependency direction (store → checkpointer becomes
checkpointer → store).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* refactor(runtime): extract SQLite helpers into runtime/store/_sqlite_utils.py
Move resolve_sqlite_conn_str and ensure_sqlite_parent_dir out of
checkpointer/provider.py into a dedicated _sqlite_utils module.
Functions are now public (no underscore prefix), making cross-module
imports semantically correct. All four provider files import from
the single shared location.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(gateway): use adelete_thread to fully remove thread checkpoints on delete
AsyncSqliteSaver has no adelete method — the previous hasattr check
always evaluated to False, silently leaving all checkpoint rows in the
database. Switch to adelete_thread(thread_id) which deletes every
checkpoint and pending-write row for the thread across all namespaces
(including sub-graph checkpoints).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(gateway): remove dead bridge_cm/ckpt_cm code and fix StrEnum lint
app.py had unreachable code after the async-with lifespan refactor:
bridge_cm and ckpt_cm were referenced but never defined (F821), and
the channel service startup/shutdown was outside the langgraph_runtime
block so it never ran. Move channel service lifecycle inside the
async-with block where it belongs.
Replace str+Enum inheritance in RunStatus and DisconnectMode with
StrEnum as suggested by UP042.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* style: format with ruff
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: JeffJiang <for-eleven@hotmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
The lark-oapi SDK defaults to open.feishu.cn (China), but apps on the
international Lark platform (open.larksuite.com) fail to connect with
error 1000040351 'Incorrect domain name'.
Changes:
- Add 'domain' config option to feishu channel (default: open.feishu.cn)
- Pass domain to both API client and WebSocket client
- Update config.example.yaml and all README files
* fix: promote matched tools from deferred registry after tool_search returns schema
After tool_search returns a tool's full schema, the tool is promoted
(removed from the deferred registry) so DeferredToolFilterMiddleware
stops filtering it from bind_tools on subsequent LLM calls.
Without this, deferred tools are permanently filtered — the LLM gets
the schema from tool_search but can never invoke the tool because
the middleware keeps stripping it.
Fixes #1554
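A minimal sketch of the promote-after-search flow. The class and function names below are hypothetical stand-ins and do not reproduce the real deferred registry or DeferredToolFilterMiddleware.

```python
class DeferredToolRegistry:
    """Holds tools whose schemas are withheld until the model searches for them."""

    def __init__(self, tools):
        self._deferred = dict(tools)  # name -> schema

    def search(self, query):
        return [name for name in self._deferred if query.lower() in name.lower()]

    def get_schema(self, name):
        return self._deferred[name]

    def promote(self, names):
        """Remove tools from the deferred set; unknown names are a no-op."""
        for name in names:
            self._deferred.pop(name, None)

    def is_deferred(self, name):
        return name in self._deferred


def tool_search(registry, query):
    """Return matching schemas, then promote the matches so they can be bound."""
    matches = registry.search(query)
    schemas = {name: registry.get_schema(name) for name in matches}
    registry.promote(matches)
    return schemas


def filter_bound_tools(tool_names, registry):
    """What the filtering middleware would let through to bind_tools."""
    return [name for name in tool_names if not registry.is_deferred(name)]
```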
* test: add promote() and tool_search promotion tests
Tests cover:
- promote removes tools from registry
- promote nonexistent/empty is no-op
- search returns nothing after promote
- middleware passes promoted tools through
- tool_search auto-promotes matched tools (select + keyword)
* fix: address review — lint blank line + empty registry guard
- Add missing blank line between FakeRequest methods (E301)
- Use 'if not registry' to handle empty registries consistently
---------
Co-authored-by: d 🔹 <258577966+voidborne-d@users.noreply.github.com>
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
* fix(config): correct MiniMax M2.7 highspeed model name and add thinking support
- Rename minimax-m2.5-highspeed to minimax-m2.7-highspeed for CN region
- Add supports_thinking: true for both M2.7 and M2.7-highspeed models
* Add supports_thinking option to config examples
Added supports_thinking configuration option in examples.
---------
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
* fix(dev): exclude sandbox dirs from gateway hot-reload watcher
The dev-mode gateway uses --reload which watches for file changes.
Sandbox containers mount the repo and write .pyc/__pycache__ during
execution, triggering spurious gateway restarts mid-request.
Add --reload-exclude for .pyc, __pycache__, and sandbox/ paths so
only actual source changes trigger a reload.
Fixes #1513
* Apply suggestions from code review
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
---------
Co-authored-by: JasonOA888 <JasonOA888@users.noreply.github.com>
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* feat(sandbox): add SandboxAuditMiddleware for bash command security auditing
Addresses the LocalSandbox escape vector reported in #1224 where bash tool
calls can execute destructive commands against the host filesystem.
- Add SandboxAuditMiddleware with three-tier command classification:
- High-risk (block): rm -rf /, curl|bash, dd if=, mkfs, /etc/shadow access
- Medium-risk (warn): pip install, apt install, chmod 777
- Safe (pass): normal workspace operations
- Register middleware after GuardrailMiddleware in _build_runtime_middlewares,
applied to both lead agent and subagents
- Structured audit log via standard logger (visible in langgraph.log)
- Medium-risk commands execute but append a warning to the tool result,
allowing the LLM to self-correct without blocking legitimate workflows
- High-risk commands return an error ToolMessage without calling the handler,
so the agent loop continues gracefully
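The three-tier classification can be sketched with a few regular expressions. The patterns below are simplified illustrations derived from the examples listed above, not the middleware's actual rules, and `classify_command` is a hypothetical name.

```python
import re

# Assumption: simplified patterns; the real middleware's rules may differ.
HIGH_RISK = [
    re.compile(r"\brm\s+-\w*r\w*\s+(?:/\*?|~(?:/\*)?|/home|/root)(?:\s|$)"),
    re.compile(r"\b(?:curl|wget)\b[^|]*\|\s*(?:ba)?sh\b"),
    re.compile(r"\bdd\s+if="),
    re.compile(r"\bmkfs\b"),
    re.compile(r"/etc/shadow\b"),
]
MEDIUM_RISK = [
    re.compile(r"\bpip3?\s+install\b"),
    re.compile(r"\bapt(?:-get)?\s+install\b"),
    re.compile(r"\bchmod\s+777\b"),
]


def classify_command(command: str) -> str:
    """Return 'block' for high risk, 'warn' for medium risk, otherwise 'pass'."""
    if any(p.search(command) for p in HIGH_RISK):
        return "block"
    if any(p.search(command) for p in MEDIUM_RISK):
        return "warn"
    return "pass"


for cmd in ("rm -rf /", "rm -rf /mnt/user-data/tmp", "pip install requests", "ls -la"):
    print(cmd, "->", classify_command(cmd))
```

In this sketch a "block" result would short-circuit into an error ToolMessage, while a "warn" result would run the command and append a warning to its output.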
* fix(lint): sort imports in test_sandbox_audit_middleware
* refactor(sandbox-audit): address Copilot review feedback (3/5/6)
- Fix class docstring to match implementation: medium-risk commands are
executed with a warning appended (not rejected), and cwd anchoring note
removed (handled in a separate PR)
- Remove capsys.disabled() from benchmark test to avoid CI log noise;
keep assertions for recall/precision targets
- Remove misleading 'cwd fix' from test module docstring
* test(sandbox-audit): add async tests for awrap_tool_call
* fix(sandbox-audit): address Copilot review feedback (1/2)
- Narrow rm high-risk regex to only block truly destructive targets
(/, /*, ~, ~/*, /home, /root); legitimate workspace paths like
/mnt/user-data/ are no longer false-positived
- Handle list-typed ToolMessage content in _append_warn_to_result;
append a text block instead of str()-ing the list to avoid breaking
structured content normalization
* style: apply ruff format to sandbox_audit_middleware files
* fix(sandbox-audit): update benchmark comment to match assert-based implementation
---------
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
* fix(task_tool): fallback to configurable thread_id when context is missing
task_tool only read thread_id from runtime.context, but when invoked
via LangGraph Server, thread_id lives in config.configurable instead.
Add the same fallback that ThreadDataMiddleware uses (PR #1237).
Fixes subagent execution failure: 'Thread ID is required in runtime
context or config.configurable'
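A minimal sketch of the fallback order, assuming both the runtime context and the run config are plain dicts. `resolve_thread_id` is a hypothetical helper name; the error text is the one quoted above.

```python
def resolve_thread_id(context, config):
    """Prefer runtime context, then fall back to config['configurable']."""
    thread_id = (context or {}).get("thread_id")
    if thread_id is None:
        thread_id = ((config or {}).get("configurable") or {}).get("thread_id")
    if thread_id is None:
        raise ValueError("Thread ID is required in runtime context or config.configurable")
    return thread_id


print(resolve_thread_id(None, {"configurable": {"thread_id": "t-42"}}))
```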
* remove debug logging from task_tool
* fix(sandbox): anchor relative paths to thread workspace in local mode
In local sandbox mode, bash commands using relative paths were resolved
against the langgraph server process cwd (backend/) instead of the
per-thread workspace directory. This allowed relative-path writes to
escape the thread isolation boundary.
Root cause: validate_local_bash_command_paths and
replace_virtual_paths_in_command only process absolute paths (scanning
for '/' prefix). Relative paths pass through untouched and inherit the
process cwd at subprocess.run time.
Fix: after virtual path translation, prepend `cd {workspace} &&` to
anchor the shell's cwd to the thread-isolated workspace directory before
execution. shlex.quote() ensures paths with spaces or special characters
are handled safely.
This mirrors the approach used by OpenHands (fixed cwd at execution
layer) and is the correct fix for local mode where each subprocess.run
is an independent process with no persistent shell session.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* refactor(sandbox): extract _apply_cwd_prefix and add unit tests
Extract the workspace cd-prefix logic from bash_tool into a dedicated
_apply_cwd_prefix() helper so it can be unit-tested in isolation.
Add four tests covering: normal prefix, no thread_data, missing
workspace_path, and paths with spaces (shlex.quote).
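The cd-prefix step can be sketched as below. The dict-shaped `thread_data` argument and the `workspace_path` key are assumptions based on the test cases listed above, and `apply_cwd_prefix` here is an illustration rather than the real `_apply_cwd_prefix()`.

```python
import shlex


def apply_cwd_prefix(command: str, thread_data) -> str:
    """Anchor the command's cwd to the thread workspace when one is known."""
    workspace = (thread_data or {}).get("workspace_path")
    if not workspace:
        return command  # no thread_data or no workspace_path: leave unchanged
    return f"cd {shlex.quote(workspace)} && {command}"


print(apply_cwd_prefix("ls", {"workspace_path": "/tmp/my ws"}))
```

Using `&&` rather than `;` means the command does not run at all if the `cd` fails, so a missing workspace cannot silently fall back to the server's own cwd.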
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* revert: remove unrelated configurable thread_id fallback from sandbox/tools.py
This change belongs in a separate PR.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* style: remove trailing whitespace in test_sandbox_tools_security
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
* Fix path for TitleMiddleware implementation
* Fix link to Provisioner Setup Guide in CONFIGURATION.md
* Update file path for TitleMiddleware implementation
* Update image paths in Leica photography article
* fix: use Git Bash for Windows local startup
* Apply suggestions from code review
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
---------
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>