25 Commits

Author SHA1 Message Date
DanielWalnut
c1b7f1d189
feat: static system prompt with DynamicContextMiddleware for prefix-cache optimization (#2801)
* feat(middleware): inject dynamic context via DynamicContextMiddleware

Move memory and current date out of the system prompt and into a
dedicated <system-reminder> HumanMessage injected once per session
(frozen-snapshot pattern) via a new DynamicContextMiddleware.

This keeps the system prompt byte-exact across all users and sessions,
enabling maximum Anthropic/Bedrock prefix-cache reuse.

Key design decisions:
- ID-swap technique: reminder takes the first HumanMessage's ID
  (replacing it in-place via add_messages), original content gets a
  derived `{id}__user` ID (appended after). Preserves correct ordering.
- hide_from_ui: True on reminder messages so frontend filters them out.
- Midnight crossing: date-update reminder injected before the current
  turn's HumanMessage when the conversation spans midnight.
- INFO-level logging for production diagnostics.

Also adds prompt-caching breakpoint budget enforcement tests and
updates ClaudeChatModel docs to reference the new pattern.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(token-usage): log input/output token detail breakdown in middleware

Extend the LLM token usage log line to include input_token_details and
output_token_details (cache_creation, cache_read, reasoning, audio, etc.)
when present. Adds tests covering Anthropic cache detail logging from
both usage_metadata and response_metadata.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: fix nginx

* fix(middleware): always inject date; gate memory on injection_enabled

Date injection is now unconditional — it is part of the static system
prompt replacement and should always be present. Memory injection
remains gated by `memory.injection_enabled` in the app config.

Previously the entire DynamicContextMiddleware was skipped when
injection_enabled was False, which also suppressed the date.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(lint): format files and correct test assertions for token usage middleware

- ruff format dynamic_context_middleware.py and test_claude_provider_prompt_caching.py
- Remove unused pytest import from test_dynamic_context_middleware.py
- Fix two tests that asserted response_metadata fallback logic that
  doesn't exist: replace with tests that match actual middleware behavior

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(middleware): address Copilot review comments on DynamicContextMiddleware

- Use additional_kwargs flag for reminder detection instead of content
  substring matching, so user messages containing '<system-reminder>'
  are not mistakenly treated as injected reminders
- Generate stable UUID when original HumanMessage.id is None to prevent
  ambiguous 'None__user' derived IDs and message collisions
- Downgrade per-turn no-op log to DEBUG; keep actual injection events at INFO
- Add two new tests: missing-id UUID fallback and user-text false-positive

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-09 09:27:02 +08:00
Willem Jiang
bb8b234d85
chroe(2585): keep polishing the code of codex token usage (#2689) 2026-05-02 15:04:11 +08:00
KiteEater
866d1ca409
Populate Codex usage metadata for token accounting (#2585) 2026-05-02 11:16:03 +08:00
Willem Jiang
844ad8e528
Merge branch 'main' into release/2.0-rc 2026-04-28 15:44:02 +08:00
pyp0327
395c14357b
chore(adpator):Adapt MindIE engine model and improve testing and fixes (#2523)
* feat(models): 适配 MindIE引擎的模型

* test: add unit tests for MindIEChatModel adapter and fix PR review comments

* chore: update uv.lock with pytest-asyncio

* build: add pytest-asyncio to test dependencies

* fix: address PR review comments (lazy import, cache clients, safe newline escape, strict xml regex)

* fix(mindie): preserve string args without JSON quotes in XML tool call serialization

* fix(mindie): preserve string args without JSON quotes in XML tool call serialization

* test_mindie_provider:format

* Potential fix for pull request finding

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

* fix(mindie): prevent nested tool_call params from leaking into outer args

* fixed by escaping XML entities in _fix_messages and unescaping during parse, with regression tests added.

---------

Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
2026-04-28 15:09:31 +08:00
greatmengqi
e82940c03d
refactor: thread release config through lead path (#2612)
Co-authored-by: greatmengqi <chenmengqi.0376@bytedance.com>
2026-04-28 14:53:18 +08:00
rayhpeng
d8ecaf46c9 feat(persistence): add unified persistence layer with event store, token tracking, and feedback (#1930)
* feat(persistence): add SQLAlchemy 2.0 async ORM scaffold

Introduce a unified database configuration (DatabaseConfig) that
controls both the LangGraph checkpointer and the DeerFlow application
persistence layer from a single `database:` config section.

New modules:
- deerflow.config.database_config — Pydantic config with memory/sqlite/postgres backends
- deerflow.persistence — async engine lifecycle, DeclarativeBase with to_dict mixin, Alembic skeleton
- deerflow.runtime.runs.store — RunStore ABC + MemoryRunStore implementation

Gateway integration initializes/tears down the persistence engine in
the existing langgraph_runtime() context manager. Legacy checkpointer
config is preserved for backward compatibility.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(persistence): add RunEventStore ABC + MemoryRunEventStore

Phase 2-A prerequisite for event storage: adds the unified run event
stream interface (RunEventStore) with an in-memory implementation,
RunEventsConfig, gateway integration, and comprehensive tests (27 cases).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(persistence): add ORM models, repositories, DB/JSONL event stores, RunJournal, and API endpoints

Phase 2-B: run persistence + event storage + token tracking.

- ORM models: RunRow (with token fields), ThreadMetaRow, RunEventRow
- RunRepository implements RunStore ABC via SQLAlchemy ORM
- ThreadMetaRepository with owner access control
- DbRunEventStore with trace content truncation and cursor pagination
- JsonlRunEventStore with per-run files and seq recovery from disk
- RunJournal (BaseCallbackHandler) captures LLM/tool/lifecycle events,
  accumulates token usage by caller type, buffers and flushes to store
- RunManager now accepts optional RunStore for persistent backing
- Worker creates RunJournal, writes human_message, injects callbacks
- Gateway deps use factory functions (RunRepository when DB available)
- New endpoints: messages, run messages, run events, token-usage
- ThreadCreateRequest gains assistant_id field
- 92 tests pass (33 new), zero regressions

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(persistence): add user feedback + follow-up run association

Phase 2-C: feedback and follow-up tracking.

- FeedbackRow ORM model (rating +1/-1, optional message_id, comment)
- FeedbackRepository with CRUD, list_by_run/thread, aggregate stats
- Feedback API endpoints: create, list, stats, delete
- follow_up_to_run_id in RunCreateRequest (explicit or auto-detected
  from latest successful run on the thread)
- Worker writes follow_up_to_run_id into human_message event metadata
- Gateway deps: feedback_repo factory + getter
- 17 new tests (14 FeedbackRepository + 3 follow-up association)
- 109 total tests pass, zero regressions

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test+config: comprehensive Phase 2 test coverage + deprecate checkpointer config

- config.example.yaml: deprecate standalone checkpointer section, activate
  unified database:sqlite as default (drives both checkpointer + app data)
- New: test_thread_meta_repo.py (14 tests) — full ThreadMetaRepository coverage
  including check_access owner logic, list_by_owner pagination
- Extended test_run_repository.py (+4 tests) — completion preserves fields,
  list ordering desc, limit, owner_none returns all
- Extended test_run_journal.py (+8 tests) — on_chain_error, track_tokens=false,
  middleware no ai_message, unknown caller tokens, convenience fields,
  tool_error, non-summarization custom event
- Extended test_run_event_store.py (+7 tests) — DB batch seq continuity,
  make_run_event_store factory (memory/db/jsonl/fallback/unknown)
- Extended test_phase2b_integration.py (+4 tests) — create_or_reject persists,
  follow-up metadata, summarization in history, full DB-backed lifecycle
- Fixed DB integration test to use proper fake objects (not MagicMock)
  for JSON-serializable metadata
- 157 total Phase 2 tests pass, zero regressions

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* config: move default sqlite_dir to .deer-flow/data

Keep SQLite databases alongside other DeerFlow-managed data
(threads, memory) under the .deer-flow/ directory instead of a
top-level ./data folder.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor(persistence): remove UTFJSON, use engine-level json_serializer + datetime.now()

- Replace custom UTFJSON type with standard sqlalchemy.JSON in all ORM
  models. Add json_serializer=json.dumps(ensure_ascii=False) to all
  create_async_engine calls so non-ASCII text (Chinese etc.) is stored
  as-is in both SQLite and Postgres.
- Change ORM datetime defaults from datetime.now(UTC) to datetime.now(),
  remove UTC imports.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor(gateway): simplify deps.py with getter factory + inline repos

- Replace 6 identical getter functions with _require() factory.
- Inline 3 _make_*_repo() factories into langgraph_runtime(), call
  get_session_factory() once instead of 3 times.
- Add thread_meta upsert in start_run (services.py).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(docker): add UV_EXTRAS build arg for optional dependencies

Support installing optional dependency groups (e.g. postgres) at
Docker build time via UV_EXTRAS build arg:
  UV_EXTRAS=postgres docker compose build

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor(journal): fix flush, token tracking, and consolidate tests

RunJournal fixes:
- _flush_sync: retain events in buffer when no event loop instead of
  dropping them; worker's finally block flushes via async flush().
- on_llm_end: add tool_calls filter and caller=="lead_agent" guard for
  ai_message events; mark message IDs for dedup with record_llm_usage.
- worker.py: persist completion data (tokens, message count) to RunStore
  in finally block.

Model factory:
- Auto-inject stream_usage=True for BaseChatOpenAI subclasses with
  custom api_base, so usage_metadata is populated in streaming responses.

Test consolidation:
- Delete test_phase2b_integration.py (redundant with existing tests).
- Move DB-backed lifecycle test into test_run_journal.py.
- Add tests for stream_usage injection in test_model_factory.py.
- Clean up executor/task_tool dead journal references.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(events): widen content type to str|dict in all store backends

Allow event content to be a dict (for structured OpenAI-format messages)
in addition to plain strings. Dict values are JSON-serialized for the DB
backend and deserialized on read; memory and JSONL backends handle dicts
natively. Trace truncation now serializes dicts to JSON before measuring.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(events): use metadata flag instead of heuristic for dict content detection

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(converters): add LangChain-to-OpenAI message format converters

Pure functions langchain_to_openai_message, langchain_to_openai_completion,
langchain_messages_to_openai, and _infer_finish_reason for converting
LangChain BaseMessage objects to OpenAI Chat Completions format, used by
RunJournal for event storage. 15 unit tests added.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(converters): handle empty list content as null, clean up test

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(events): human_message content uses OpenAI user message format

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(events): ai_message uses OpenAI format, add ai_tool_call message event

- ai_message content now uses {"role": "assistant", "content": "..."} format
- New ai_tool_call message event emitted when lead_agent LLM responds with tool_calls
- ai_tool_call uses langchain_to_openai_message converter for consistent format
- Both events include finish_reason in metadata ("stop" or "tool_calls")

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(events): add tool_result message event with OpenAI tool message format

Cache tool_call_id from on_tool_start keyed by run_id as fallback for on_tool_end,
then emit a tool_result message event (role=tool, tool_call_id, content) after each
successful tool completion.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(events): summary content uses OpenAI system message format

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(events): replace llm_start/llm_end with llm_request/llm_response in OpenAI format

Add on_chat_model_start to capture structured prompt messages as llm_request events.
Replace llm_end trace events with llm_response using OpenAI Chat Completions format.
Track llm_call_index to pair request/response events.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(events): add record_middleware method for middleware trace events

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test(events): add full run sequence integration test for OpenAI content format

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(events): align message events with checkpoint format and add middleware tag injection

- Message events (ai_message, ai_tool_call, tool_result, human_message) now use
  BaseMessage.model_dump() format, matching LangGraph checkpoint values.messages
- on_tool_end extracts tool_call_id/name/status from ToolMessage objects
- on_tool_error now emits tool_result message events with error status
- record_middleware uses middleware:{tag} event_type and middleware category
- Summarization custom events use middleware:summarize category
- TitleMiddleware injects middleware:title tag via get_config() inheritance
- SummarizationMiddleware model bound with middleware:summarize tag
- Worker writes human_message using HumanMessage.model_dump()

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(threads): switch search endpoint to threads_meta table and sync title

- POST /api/threads/search now queries threads_meta table directly,
  removing the two-phase Store + Checkpointer scan approach
- Add ThreadMetaRepository.search() with metadata/status filters
- Add ThreadMetaRepository.update_display_name() for title sync
- Worker syncs checkpoint title to threads_meta.display_name on run completion
- Map display_name to values.title in search response for API compatibility

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(threads): history endpoint reads messages from event store

- POST /api/threads/{thread_id}/history now combines two data sources:
  checkpointer for checkpoint_id, metadata, title, thread_data;
  event store for messages (complete history, not truncated by summarization)
- Strip internal LangGraph metadata keys from response
- Remove full channel_values serialization in favor of selective fields

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: remove duplicate optional-dependencies header in pyproject.toml

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(middleware): pass tagged config to TitleMiddleware ainvoke call

Without the config, the middleware:title tag was not injected,
causing the LLM response to be recorded as a lead_agent ai_message
in run_events.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: resolve merge conflict in .env.example

Keep both DATABASE_URL (from persistence-scaffold) and WECOM
credentials (from main) after the merge.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(persistence): address review feedback on PR #1851

- Fix naive datetime.now() → datetime.now(UTC) in all ORM models
- Fix seq race condition in DbRunEventStore.put() with FOR UPDATE
  and UNIQUE(thread_id, seq) constraint
- Encapsulate _store access in RunManager.update_run_completion()
- Deduplicate _store.put() logic in RunManager via _persist_to_store()
- Add update_run_completion to RunStore ABC + MemoryRunStore
- Wire follow_up_to_run_id through the full create path
- Add error recovery to RunJournal._flush_sync() lost-event scenario
- Add migration note for search_threads breaking change
- Fix test_checkpointer_none_fix mock to set database=None

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: update uv.lock

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(persistence): address 22 review comments from CodeQL, Copilot, and Code Quality

Bug fixes:
- Sanitize log params to prevent log injection (CodeQL)
- Reset threads_meta.status to idle/error when run completes
- Attach messages only to latest checkpoint in /history response
- Write threads_meta on POST /threads so new threads appear in search

Lint fixes:
- Remove unused imports (journal.py, migrations/env.py, test_converters.py)
- Convert lambda to named function (engine.py, Ruff E731)
- Remove unused logger definitions in repos (Ruff F841)
- Add logging to JSONL decode errors and empty except blocks
- Separate assert side-effects in tests (CodeQL)
- Remove unused local variables in tests (Ruff F841)
- Fix max_trace_content truncation to use byte length, not char length

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* style: apply ruff format to persistence and runtime files

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Potential fix for pull request finding 'Statement has no effect'

Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>

* refactor(runtime): introduce RunContext to reduce run_agent parameter bloat

Extract checkpointer, store, event_store, run_events_config, thread_meta_repo,
and follow_up_to_run_id into a frozen RunContext dataclass. Add get_run_context()
in deps.py to build the base context from app.state singletons. start_run() uses
dataclasses.replace() to enrich per-run fields before passing ctx to run_agent.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor(gateway): move sanitize_log_param to app/gateway/utils.py

Extract the log-injection sanitizer from routers/threads.py into a shared
utils module and rename to sanitize_log_param (public API). Eliminates the
reverse service → router import in services.py.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* perf: use SQL aggregation for feedback stats and thread token usage

Replace Python-side counting in FeedbackRepository.aggregate_by_run with
a single SELECT COUNT/SUM query. Add RunStore.aggregate_tokens_by_thread
abstract method with SQL GROUP BY implementation in RunRepository and
Python fallback in MemoryRunStore. Simplify the thread_token_usage
endpoint to delegate to the new method, eliminating the limit=10000
truncation risk.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: annotate DbRunEventStore.put() as low-frequency path

Add docstring clarifying that put() opens a per-call transaction with
FOR UPDATE and should only be used for infrequent writes (currently
just the initial human_message event). High-throughput callers should
use put_batch() instead.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(threads): fall back to Store search when ThreadMetaRepository is unavailable

When database.backend=memory (default) or no SQL session factory is
configured, search_threads now queries the LangGraph Store instead of
returning 503. Returns empty list if neither Store nor repo is available.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor(persistence): introduce ThreadMetaStore ABC for backend-agnostic thread metadata

Add ThreadMetaStore abstract base class with create/get/search/update/delete
interface. ThreadMetaRepository (SQL) now inherits from it. New
MemoryThreadMetaStore wraps LangGraph BaseStore for memory-mode deployments.

deps.py now always provides a non-None thread_meta_repo, eliminating all
`if thread_meta_repo is not None` guards in services.py, worker.py, and
routers/threads.py. search_threads no longer needs a Store fallback branch.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor(history): read messages from checkpointer instead of RunEventStore

The /history endpoint now reads messages directly from the
checkpointer's channel_values (the authoritative source) instead of
querying RunEventStore.list_messages(). The RunEventStore API is
preserved for other consumers.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(persistence): address new Copilot review comments

- feedback.py: validate thread_id/run_id before deleting feedback
- jsonl.py: add path traversal protection with ID validation
- run_repo.py: parse `before` to datetime for PostgreSQL compat
- thread_meta_repo.py: fix pagination when metadata filter is active
- database_config.py: use resolve_path for sqlite_dir consistency

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Implement skill self-evolution and skill_manage flow (#1874)

* chore: ignore .worktrees directory

* Add skill_manage self-evolution flow

* Fix CI regressions for skill_manage

* Address PR review feedback for skill evolution

* fix(skill-evolution): preserve history on delete

* fix(skill-evolution): tighten scanner fallbacks

* docs: add skill_manage e2e evidence screenshot

* fix(skill-manage): avoid blocking fs ops in session runtime

---------

Co-authored-by: Willem Jiang <willem.jiang@gmail.com>

* fix(config): resolve sqlite_dir relative to CWD, not Paths.base_dir

resolve_path() resolves relative to Paths.base_dir (.deer-flow),
which double-nested the path to .deer-flow/.deer-flow/data/app.db.
Use Path.resolve() (CWD-relative) instead.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Feature/feishu receive file (#1608)

* feat(feishu): add channel file materialization hook for inbound messages

- Introduce Channel.receive_file(msg, thread_id) as a base method for file materialization; default is no-op.
- Implement FeishuChannel.receive_file to download files/images from Feishu messages, save to sandbox, and inject virtual paths into msg.text.
- Update ChannelManager to call receive_file for any channel if msg.files is present, enabling downstream model access to user-uploaded files.
- No impact on Slack/Telegram or other channels (they inherit the default no-op).

* style(backend): format code with ruff for lint compliance

- Auto-formatted packages/harness/deerflow/agents/factory.py and tests/test_create_deerflow_agent.py using `ruff format`
- Ensured both files conform to project linting standards
- Fixes CI lint check failures caused by code style issues

* fix(feishu): handle file write operation asynchronously to prevent blocking

* fix(feishu): rename GetMessageResourceRequest to _GetMessageResourceRequest and remove redundant code

* test(feishu): add tests for receive_file method and placeholder replacement

* fix(manager): remove unnecessary type casting for channel retrieval

* fix(feishu): update logging messages to reflect resource handling instead of image

* fix(feishu): sanitize filename by replacing invalid characters in file uploads

* fix(feishu): improve filename sanitization and reorder image key handling in message processing

* fix(feishu): add thread lock to prevent filename conflicts during file downloads

* fix(test): correct bad merge in test_feishu_parser.py

* chore: run ruff and apply formatting cleanup
fix(feishu): preserve rich-text attachment order and improve fallback filename handling

* fix(docker): restore gateway env vars and fix langgraph empty arg issue (#1915)

Two production docker-compose.yaml bugs prevent `make up` from working:

1. Gateway missing DEER_FLOW_CONFIG_PATH and DEER_FLOW_EXTENSIONS_CONFIG_PATH
   environment overrides. Added in fb2d99f (#1836) but accidentally reverted
   by ca2fb95 (#1847). Without them, gateway reads host paths from .env via
   env_file, causing FileNotFoundError inside the container.

2. Langgraph command fails when LANGGRAPH_ALLOW_BLOCKING is unset (default).
   Empty $${allow_blocking} inserts a bare space between flags, causing
   ' --no-reload' to be parsed as unexpected extra argument. Fix by building
   args string first and conditionally appending --allow-blocking.

Co-authored-by: cooper <cooperfu@tencent.com>

* fix(frontend): resolve invalid HTML nesting and tabnabbing vulnerabilities (#1904)

* fix(frontend): resolve invalid HTML nesting and tabnabbing vulnerabilities

Fix `<button>` inside `<a>` invalid HTML in artifact components and add
missing `noopener,noreferrer` to `window.open` calls to prevent reverse
tabnabbing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(frontend): address Copilot review on tabnabbing and double-tab-open

Remove redundant parent onClick on web_fetch ChainOfThoughtStep to
prevent opening two tabs on link click, and explicitly null out
window.opener after window.open() for defensive tabnabbing hardening.

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* refactor(persistence): organize entities into per-entity directories

Restructure the persistence layer from horizontal "models/ + repositories/"
split into vertical entity-aligned directories. Each entity (thread_meta,
run, feedback) now owns its ORM model, abstract interface (where applicable),
and concrete implementations under a single directory with an aggregating
__init__.py for one-line imports.

Layout:
  persistence/thread_meta/{base,model,sql,memory}.py
  persistence/run/{model,sql}.py
  persistence/feedback/{model,sql}.py

models/__init__.py is kept as a facade so Alembic autogenerate continues to
discover all ORM tables via Base.metadata. RunEventRow remains under
models/run_event.py because its storage implementation lives in
runtime/events/store/db.py and has no matching repository directory.

The repositories/ directory is removed entirely. All call sites in
gateway/deps.py and tests are updated to import from the new entity
packages, e.g.:

    from deerflow.persistence.thread_meta import ThreadMetaRepository
    from deerflow.persistence.run import RunRepository
    from deerflow.persistence.feedback import FeedbackRepository

Full test suite passes (1690 passed, 14 skipped).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(gateway): sync thread rename and delete through ThreadMetaStore

The POST /threads/{id}/state endpoint previously synced title changes
only to the LangGraph Store via _store_upsert. In sqlite mode the search
endpoint reads from the ThreadMetaRepository SQL table, so renames never
appeared in /threads/search until the next agent run completed (worker.py
syncs title from checkpoint to thread_meta in its finally block).

Likewise the DELETE /threads/{id} endpoint cleaned up the filesystem,
Store, and checkpointer but left the threads_meta row orphaned in sqlite,
so deleted threads kept appearing in /threads/search.

Fix both endpoints by routing through the ThreadMetaStore abstraction
which already has the correct sqlite/memory implementations wired up by
deps.py. The rename path now calls update_display_name() and the delete
path calls delete() — both work uniformly across backends.

Verified end-to-end with curl in gateway mode against sqlite backend.
Existing test suite (1690 passed) and focused router/repo tests pass.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor(gateway): route all thread metadata access through ThreadMetaStore

Following the rename/delete bug fix in PR1, migrate the remaining direct
LangGraph Store reads/writes in the threads router and services to the
ThreadMetaStore abstraction so that the sqlite and memory backends behave
identically and the legacy dual-write paths can be removed.

Migrated endpoints (threads.py):
- create_thread: idempotency check + write now use thread_meta_repo.get/create
  instead of dual-writing the LangGraph Store and the SQL row.
- get_thread: reads from thread_meta_repo.get; the checkpoint-only fallback
  for legacy threads is preserved.
- patch_thread: replaced _store_get/_store_put with thread_meta_repo.update_metadata.
- delete_thread_data: dropped the legacy store.adelete; thread_meta_repo.delete
  already covers it.

Removed dead code (services.py):
- _upsert_thread_in_store — redundant with the immediately following
  thread_meta_repo.create() call.
- _sync_thread_title_after_run — worker.py's finally block already syncs
  the title via thread_meta_repo.update_display_name() after each run.

Removed dead code (threads.py):
- _store_get / _store_put / _store_upsert helpers (no remaining callers).
- THREADS_NS constant.
- get_store import (router no longer touches the LangGraph Store directly).

New abstract method:
- ThreadMetaStore.update_metadata(thread_id, metadata) merges metadata into
  the thread's metadata field. Implemented in both ThreadMetaRepository (SQL,
  read-modify-write inside one session) and MemoryThreadMetaStore. Three new
  unit tests cover merge / empty / nonexistent behaviour.

Net change: -134 lines. Full test suite: 1693 passed, 14 skipped.
Verified end-to-end with curl in gateway mode against sqlite backend
(create / patch / get / rename / search / delete).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>
Co-authored-by: DanielWalnut <45447813+hetaoBackend@users.noreply.github.com>
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
Co-authored-by: JilongSun <965640067@qq.com>
Co-authored-by: jie <49781832+stan-fu@users.noreply.github.com>
Co-authored-by: cooper <cooperfu@tencent.com>
Co-authored-by: yangzheli <43645580+yangzheli@users.noreply.github.com>
2026-04-26 11:05:47 +08:00
Octopus
1f59e945af
fix: cap prompt caching breakpoints at 4 to prevent API 400 errors (#2449)
* fix: cap prompt caching breakpoints at 4 to prevent API 400 errors (fixes #2448)

The previous _apply_prompt_caching() attached cache_control to every text
block in the system prompt, every content block in the last N messages, and
the last tool definition. In multi-turn conversations with structured content
blocks this easily exceeded the 4-breakpoint hard limit enforced by both the
Anthropic API and AWS Bedrock, producing a 400 Bad Request (or a silent
"No generations found in stream" when streaming).

Fix: collect all candidate blocks in document order, then apply cache_control
only to the last MAX_CACHE_BREAKPOINTS (4) of them. Later breakpoints cover a
larger prefix and therefore yield better cache hit rates, making this the
optimal placement strategy as well as the safe one.

Adds 13 unit tests covering the budget cap, edge cases, and correct
last-candidate placement.

* docs: clarify _apply_prompt_caching docstring includes tool definitions

Per Copilot review: the implementation also caches the last tool definition
(see the candidates list at lines 202-205), so the docstring summary should
explicitly mention tools alongside system and recent messages.

* Fix the lint error

* style: fix ruff format check for test_claude_provider_prompt_caching.py

Add the missing blank line before the 'Edge cases' section comment so
that ruff format --check passes in CI.

---------

Co-authored-by: octo-patch <octo-patch@github.com>
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
2026-04-25 19:40:06 +08:00
pyp0327
2bb1a2dfa2
feat(models): Provider for MindIE model engine (#2483)
* feat(models): 适配 MindIE引擎的模型

* test: add unit tests for MindIEChatModel adapter and fix PR review comments

* chore: update uv.lock with pytest-asyncio

* build: add pytest-asyncio to test dependencies

* fix: address PR review comments (lazy import, cache clients, safe newline escape, strict xml regex)

---------

Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
2026-04-25 08:59:03 +08:00
Admire
c99865f53d
fix(token-usage): enable stream usage for openai-compatible models (#2217)
* fix(token-usage): enable stream usage for openai-compatible models

* fix(token-usage): narrow stream_usage default to ChatOpenAI
2026-04-19 22:42:55 +08:00
shivam johri
194bab4691
feat(config): add when_thinking_disabled support for model configs (#1970)
* feat(config): add when_thinking_disabled support for model configs

Allow users to explicitly configure what parameters are sent to the
model when thinking is disabled, via a new `when_thinking_disabled`
field in model config. This mirrors the existing `when_thinking_enabled`
pattern and takes full precedence over the hardcoded disable behavior
when set. Backwards compatible — existing configs work unchanged.

Closes #1675

* fix(config): address copilot review — gate when_thinking_disabled independently

- Switch truthiness check to `is not None` so empty dict overrides work
- Restructure disable path so when_thinking_disabled is gated independently
  of has_thinking_settings, allowing it to work without when_thinking_enabled
- Update test to reflect new behavior
2026-04-09 18:49:00 +08:00
Xun
1b74d84590
fix: resolve missing serialized kwargs in PatchedChatDeepSeek (#2025)
* add tests

* fix ci

* fix ci
2026-04-09 16:07:16 +08:00
Octopus
616caa92b1
fix(models): resolve duplicate keyword argument error when reasoning_effort appears in both config and kwargs (#2017)
When a model config includes `reasoning_effort` as an extra YAML field
(ModelConfig uses `extra="allow"`), and the thinking-disabled code path
also injects `reasoning_effort="minimal"` into kwargs, the previous
`model_class(**kwargs, **model_settings_from_config)` call raises:

  TypeError: got multiple values for keyword argument 'reasoning_effort'

Fix by merging the two dicts before instantiation, giving runtime kwargs
precedence over config values: `{**model_settings_from_config, **kwargs}`.

Fixes #1977

Co-authored-by: octo-patch <octo-patch@github.com>
2026-04-09 15:09:39 +08:00
Async23
0948c7a4e1
fix(provider): preserve streamed Codex output when response.completed.output is empty (#1928)
* fix: preserve streamed Codex output items

* fix: prefer completed Codex output over streamed placeholders
2026-04-07 18:21:22 +08:00
NmanQAQ
dd30e609f7
feat(models): add vLLM provider support (#1860)
support for vLLM 0.19.0 OpenAI-compatible chat endpoints and fixes the Qwen reasoning toggle so flash mode can actually disable thinking.

Co-authored-by: NmanQAQ <normangyao@qq.com>
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
2026-04-06 15:18:34 +08:00
totoyang
2d1f90d5dc
feat(tracing): add optional Langfuse support (#1717)
* feat(tracing): add optional Langfuse support

* Fix tracing fail-fast behavior for explicitly enabled providers

* fix(lint)
2026-04-02 13:06:10 +08:00
Markus Corazzione
5ceb19f6f6
fix(oauth): Harden Claude OAuth cache-control handling (#1583) 2026-03-30 07:41:18 +08:00
greatmengqi
084dc7e748
ci: enforce code formatting checks for backend and frontend (#1536) 2026-03-29 15:34:38 +08:00
taka6745
43ef3691a5
fix(oauth): inject billing header for Claude oAuth Models (#1442)
* fix(oauth): inject billing header for non-Haiku model access

The Anthropic Messages API requires a billing identification block
in the system prompt when using Claude Code OAuth tokens (sk-ant-oat*)
to access non-Haiku models (Opus, Sonnet). Without it, the API returns
a generic 400 "Error" with no actionable detail.

This was discovered by intercepting Claude Code CLI requests — the CLI
injects an `x-anthropic-billing-header` text block as the first system
prompt entry on every request. Third-party consumers of the same OAuth
tokens must do the same.

Changes:
- Add `_apply_oauth_billing()` to `ClaudeChatModel` that prepends the
  billing header block to the system prompt when `_is_oauth` is True
- Add `metadata.user_id` with device/session identifiers (required by
  the API alongside the billing header)
- Called from `_get_request_payload()` before prompt caching runs

Verified with Claude Max OAuth tokens against all three model tiers:
- claude-opus-4-6: 200 OK
- claude-sonnet-4-6: 200 OK
- claude-haiku-4-5-20251001: 200 OK (was already working)

Closes #1245

* fix(oauth): address review feedback on billing header injection

- Make OAUTH_BILLING_HEADER configurable via ANTHROPIC_BILLING_HEADER env var
- Normalize billing block to always be first in system list (strip + reinsert)
- Guard metadata with isinstance check for non-dict values
- Replace os.uname() with socket.gethostname() for Windows compat
- Fix docstrings to say "all OAuth requests" instead of "non-Haiku"
- Move inline imports to module level (fixes ruff I001)
- Add 9 unit tests for _apply_oauth_billing

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-28 08:49:34 +08:00
Admire
b9583f7204
Fix Windows backend test compatibility (#1384)
* Fix Windows backend test compatibility

* Preserve ACP path style on Windows

* Fix installer import ordering

* Address review comments for Windows fixes

---------

Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
2026-03-26 17:39:16 +08:00
Willem Jiang
a087fe7bcc
fix(LLM): fixing Gemini thinking + tool calls via OpenAI gateway (#1180) (#1205)
* fix(LLM): fixing Gemini thinking + tool calls via OpenAI gateway (#1180)

When using Gemini with thinking enabled through an OpenAI-compatible gateway,
the API requires that  fields on thinking content blocks are
preserved and echoed back verbatim in subsequent requests. Standard
 silently drops these signatures when serializing
messages, causing HTTP 400 errors:

Changes:
- Add PatchedChatOpenAI adapter that re-injects signed thinking blocks into
  request payloads, preserving the signature chain across multi-turn
  conversations with tool calls.
- Support two LangChain storage patterns: additional_kwargs.thinking_blocks
  and content list.
- Add 11 unit tests covering signed/unsigned blocks, storage patterns, edge
  cases, and precedence rules.
- Update config.example.yaml with Gemini + thinking gateway example.
- Update CONFIGURATION.md with detailed guidance and error explanation.

Fixes: #1180

* Updated the patched_openai.py with thought_signature of function call

* Apply suggestions from code review

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* docs: fix inaccurate thought_signature description in CONFIGURATION.md (#1220)

* Initial plan

* docs: fix CONFIGURATION.md wording for thought_signature - tool-call objects, not thinking blocks

Co-authored-by: WillemJiang <219644+WillemJiang@users.noreply.github.com>
Agent-Logs-Url: https://github.com/bytedance/deer-flow/sessions/360f5226-4631-48a7-a050-189094af8ffe

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: WillemJiang <219644+WillemJiang@users.noreply.github.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
2026-03-26 15:07:05 +08:00
DanielWalnut
d119214fee
feat(harness): integration ACP agent tool (#1344)
* refactor: extract shared utils to break harness→app cross-layer imports

Move _validate_skill_frontmatter to src/skills/validation.py and
CONVERTIBLE_EXTENSIONS + convert_file_to_markdown to src/utils/file_conversion.py.
This eliminates the two reverse dependencies from client.py (harness layer)
into gateway/routers/ (app layer), preparing for the harness/app package split.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor: split backend/src into harness (deerflow.*) and app (app.*)

Physically split the monolithic backend/src/ package into two layers:

- **Harness** (`packages/harness/deerflow/`): publishable agent framework
  package with import prefix `deerflow.*`. Contains agents, sandbox, tools,
  models, MCP, skills, config, and all core infrastructure.

- **App** (`app/`): unpublished application code with import prefix `app.*`.
  Contains gateway (FastAPI REST API) and channels (IM integrations).

Key changes:
- Move 13 harness modules to packages/harness/deerflow/ via git mv
- Move gateway + channels to app/ via git mv
- Rename all imports: src.* → deerflow.* (harness) / app.* (app layer)
- Set up uv workspace with deerflow-harness as workspace member
- Update langgraph.json, config.example.yaml, all scripts, Docker files
- Add build-system (hatchling) to harness pyproject.toml
- Add PYTHONPATH=. to gateway startup commands for app.* resolution
- Update ruff.toml with known-first-party for import sorting
- Update all documentation to reflect new directory structure

Boundary rule enforced: harness code never imports from app.
All 429 tests pass. Lint clean.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: add harness→app boundary check test and update docs

Add test_harness_boundary.py that scans all Python files in
packages/harness/deerflow/ and fails if any `from app.*` or
`import app.*` statement is found. This enforces the architectural
rule that the harness layer never depends on the app layer.

Update CLAUDE.md to document the harness/app split architecture,
import conventions, and the boundary enforcement test.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: add config versioning with auto-upgrade on startup

When config.example.yaml schema changes, developers' local config.yaml
files can silently become outdated. This adds a config_version field and
auto-upgrade mechanism so breaking changes (like src.* → deerflow.*
renames) are applied automatically before services start.

- Add config_version: 1 to config.example.yaml
- Add startup version check warning in AppConfig.from_file()
- Add scripts/config-upgrade.sh with migration registry for value replacements
- Add `make config-upgrade` target
- Auto-run config-upgrade in serve.sh and start-daemon.sh before starting services
- Add config error hints in service failure messages

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix comments

* fix: update src.* import in test_sandbox_tools_security to deerflow.*

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: handle empty config and search parent dirs for config.example.yaml

Address Copilot review comments on PR #1131:
- Guard against yaml.safe_load() returning None for empty config files
- Search parent directories for config.example.yaml instead of only
  looking next to config.yaml, fixing detection in common setups

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: correct skills root path depth and config_version type coercion

- loader.py: fix get_skills_root_path() to use 5 parent levels (was 3)
  after harness split, file lives at packages/harness/deerflow/skills/
  so parent×3 resolved to backend/packages/harness/ instead of backend/
- app_config.py: coerce config_version to int() before comparison in
  _check_config_version() to prevent TypeError when YAML stores value
  as string (e.g. config_version: "1")
- tests: add regression tests for both fixes

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: update test imports from src.* to deerflow.*/app.* after harness refactor

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(harness): add tool-first ACP agent invocation (#37)

* feat(harness): add tool-first ACP agent invocation

* build(harness): make ACP dependency required

* fix(harness): address ACP review feedback

* feat(harness): decouple ACP agent workspace from thread data

ACP agents (codex, claude-code) previously used per-thread workspace
directories, causing path resolution complexity and coupling task
execution to DeerFlow's internal thread data layout. This change:

- Replace _resolve_cwd() with a fixed _get_work_dir() that always uses
  {base_dir}/acp-workspace/, eliminating virtual path translation and
  thread_id lookups
- Introduce /mnt/acp-workspace virtual path for lead agent read-only
  access to ACP agent output files (same pattern as /mnt/skills)
- Add security guards: read-only validation, path traversal prevention,
  command path allowlisting, and output masking for acp-workspace
- Update system prompt and tool description to guide LLM: send
  self-contained tasks to ACP agents, copy results via /mnt/acp-workspace
- Add 11 new security tests for ACP workspace path handling

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor(prompt): inject ACP section only when ACP agents are configured

The ACP agent guidance in the system prompt is now conditionally built
by _build_acp_section(), which checks get_acp_agents() and returns an
empty string when no ACP agents are configured. This avoids polluting
the prompt with irrelevant instructions for users who don't use ACP.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix lint

* fix(harness): address Copilot review comments on sandbox path handling and ACP tool

- local_sandbox: fix path-segment boundary bug in _resolve_path (== or startswith +"/")
  and add lookahead in _resolve_paths_in_command regex to prevent /mnt/skills matching
  inside /mnt/skills-extra
- local_sandbox_provider: replace print() with logger.warning(..., exc_info=True)
- invoke_acp_agent_tool: guard getattr(option, "optionId") with None default + continue;
  move full prompt from INFO to DEBUG level (truncated to 200 chars)
- sandbox/tools: fix _get_acp_workspace_host_path docstring to match implementation;
  remove misleading "read-only" language from validate_local_bash_command_paths

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(acp): thread-isolated workspaces, permission guardrail, and ContextVar registry

P1.1 – ACP workspace thread isolation
- Add `Paths.acp_workspace_dir(thread_id)` for per-thread paths
- `_get_work_dir(thread_id)` in invoke_acp_agent_tool now uses
  `{base_dir}/threads/{thread_id}/acp-workspace/`; falls back to
  global workspace when thread_id is absent or invalid
- `_invoke` extracts thread_id from `RunnableConfig` via
  `Annotated[RunnableConfig, InjectedToolArg]`
- `sandbox/tools.py`: `_get_acp_workspace_host_path(thread_id)`,
  `_resolve_acp_workspace_path(path, thread_id)`, and all callers
  (`replace_virtual_paths_in_command`, `mask_local_paths_in_output`,
  `ls_tool`, `read_file_tool`) now resolve ACP paths per-thread

P1.2 – ACP permission guardrail
- New `auto_approve_permissions: bool = False` field in `ACPAgentConfig`
- `_build_permission_response(options, *, auto_approve: bool)` now
  defaults to deny; only approves when `auto_approve=True`
- Document field in `config.example.yaml`

P2 – Deferred tool registry race condition
- Replace module-level `_registry` global with `contextvars.ContextVar`
- Each asyncio request context gets its own registry; worker threads
  inherit the context automatically via `loop.run_in_executor`
- Expose `get_deferred_registry` / `set_deferred_registry` /
  `reset_deferred_registry` helpers

Tests: 831 pass (57 for affected modules, 3 new tests)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(sandbox): mount /mnt/acp-workspace in docker sandbox container

The AioSandboxProvider was not mounting the ACP workspace into the
sandbox container, so /mnt/acp-workspace was inaccessible when the lead
agent tried to read ACP results in docker mode.

Changes:
- `ensure_thread_dirs`: also create `acp-workspace/` (chmod 0o777) so
  the directory exists before the sandbox container starts — required
  for Docker volume mounts
- `_get_thread_mounts`: add read-only `/mnt/acp-workspace` mount using
  the per-thread host path (`host_paths.acp_workspace_dir(thread_id)`)
- Update stale CLAUDE.md description (was "fixed global workspace")

Tests: `test_aio_sandbox_provider.py` (4 new tests)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(lint): remove unused imports in test_aio_sandbox_provider

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix config

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 14:20:18 +08:00
Purricane
835ba041f8
feat: add Claude Code OAuth and Codex CLI as LLM providers (#1166)
* feat: add Claude Code OAuth and Codex CLI providers

Port of bytedance/deer-flow#1136 from @solanian's feat/cli-oauth-providers branch.\n\nCarries the feature forward on top of current main without the original CLA-blocked commit metadata, while preserving attribution in the commit message for review.

* fix: harden CLI credential loading

Align Codex auth loading with the current ~/.codex/auth.json shape, make Docker credential mounts directory-based to avoid broken file binds on hosts without exported credential files, and add focused loader tests.

* refactor: tighten codex auth typing

Replace the temporary Any return type in CodexChatModel._load_codex_auth with the concrete CodexCliCredential type after the credential loader was stabilized.

* fix: load Claude Code OAuth from Keychain

Match Claude Code's macOS storage strategy more closely by checking the Keychain-backed credentials store before falling back to ~/.claude/.credentials.json. Keep explicit file overrides and add focused tests for the Keychain path.

* fix: require explicit Claude OAuth handoff

* style: format thread hooks reasoning request

* docs: document CLI-backed auth providers

* fix: address provider review feedback

* fix: harden provider edge cases

* Fix deferred tools, Codex message normalization, and local sandbox paths

* chore: narrow PR scope to OAuth providers

* chore: remove unrelated frontend changes

* chore: reapply OAuth branch frontend scope cleanup

* fix: preserve upload guards with reasoning effort wiring

---------

Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
2026-03-22 22:39:50 +08:00
Simon Su
ceab7fac14
fix: improve MiniMax code plan integration (#1169)
This PR improves MiniMax Code Plan integration in DeerFlow by fixing three issues in the current flow: stream errors were not clearly surfaced in the UI, the frontend could not display the actual provider model ID, and MiniMax reasoning output could leak into final assistant content as inline <think>...</think>. The change adds a MiniMax-specific adapter, exposes real model IDs end-to-end, and adds a frontend fallback for historical messages.
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
2026-03-20 17:18:59 +08:00
DanielWalnut
76803b826f
refactor: split backend into harness (deerflow.*) and app (app.*) (#1131)
* refactor: extract shared utils to break harness→app cross-layer imports

Move _validate_skill_frontmatter to src/skills/validation.py and
CONVERTIBLE_EXTENSIONS + convert_file_to_markdown to src/utils/file_conversion.py.
This eliminates the two reverse dependencies from client.py (harness layer)
into gateway/routers/ (app layer), preparing for the harness/app package split.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor: split backend/src into harness (deerflow.*) and app (app.*)

Physically split the monolithic backend/src/ package into two layers:

- **Harness** (`packages/harness/deerflow/`): publishable agent framework
  package with import prefix `deerflow.*`. Contains agents, sandbox, tools,
  models, MCP, skills, config, and all core infrastructure.

- **App** (`app/`): unpublished application code with import prefix `app.*`.
  Contains gateway (FastAPI REST API) and channels (IM integrations).

Key changes:
- Move 13 harness modules to packages/harness/deerflow/ via git mv
- Move gateway + channels to app/ via git mv
- Rename all imports: src.* → deerflow.* (harness) / app.* (app layer)
- Set up uv workspace with deerflow-harness as workspace member
- Update langgraph.json, config.example.yaml, all scripts, Docker files
- Add build-system (hatchling) to harness pyproject.toml
- Add PYTHONPATH=. to gateway startup commands for app.* resolution
- Update ruff.toml with known-first-party for import sorting
- Update all documentation to reflect new directory structure

Boundary rule enforced: harness code never imports from app.
All 429 tests pass. Lint clean.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: add harness→app boundary check test and update docs

Add test_harness_boundary.py that scans all Python files in
packages/harness/deerflow/ and fails if any `from app.*` or
`import app.*` statement is found. This enforces the architectural
rule that the harness layer never depends on the app layer.

Update CLAUDE.md to document the harness/app split architecture,
import conventions, and the boundary enforcement test.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: add config versioning with auto-upgrade on startup

When config.example.yaml schema changes, developers' local config.yaml
files can silently become outdated. This adds a config_version field and
auto-upgrade mechanism so breaking changes (like src.* → deerflow.*
renames) are applied automatically before services start.

- Add config_version: 1 to config.example.yaml
- Add startup version check warning in AppConfig.from_file()
- Add scripts/config-upgrade.sh with migration registry for value replacements
- Add `make config-upgrade` target
- Auto-run config-upgrade in serve.sh and start-daemon.sh before starting services
- Add config error hints in service failure messages

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix comments

* fix: update src.* import in test_sandbox_tools_security to deerflow.*

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: handle empty config and search parent dirs for config.example.yaml

Address Copilot review comments on PR #1131:
- Guard against yaml.safe_load() returning None for empty config files
- Search parent directories for config.example.yaml instead of only
  looking next to config.yaml, fixing detection in common setups

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: correct skills root path depth and config_version type coercion

- loader.py: fix get_skills_root_path() to use 5 parent levels (was 3)
  after harness split, file lives at packages/harness/deerflow/skills/
  so parent×3 resolved to backend/packages/harness/ instead of backend/
- app_config.py: coerce config_version to int() before comparison in
  _check_config_version() to prevent TypeError when YAML stores value
  as string (e.g. config_version: "1")
- tests: add regression tests for both fixes

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: update test imports from src.* to deerflow.*/app.* after harness refactor

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-14 22:55:52 +08:00