feat(persistence): per-user filesystem isolation, run-scoped APIs, and state/history simplification (#2153)

* feat(persistence): add unified persistence layer with event store, token tracking, and feedback (#1930)

* feat(persistence): add SQLAlchemy 2.0 async ORM scaffold

Introduce a unified database configuration (DatabaseConfig) that
controls both the LangGraph checkpointer and the DeerFlow application
persistence layer from a single `database:` config section.

New modules:
- deerflow.config.database_config — Pydantic config with memory/sqlite/postgres backends
- deerflow.persistence — async engine lifecycle, DeclarativeBase with to_dict mixin, Alembic skeleton
- deerflow.runtime.runs.store — RunStore ABC + MemoryRunStore implementation

Gateway integration initializes/tears down the persistence engine in
the existing langgraph_runtime() context manager. Legacy checkpointer
config is preserved for backward compatibility.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(persistence): add RunEventStore ABC + MemoryRunEventStore

Phase 2-A prerequisite for event storage: adds the unified run event
stream interface (RunEventStore) with an in-memory implementation,
RunEventsConfig, gateway integration, and comprehensive tests (27 cases).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(persistence): add ORM models, repositories, DB/JSONL event stores, RunJournal, and API endpoints

Phase 2-B: run persistence + event storage + token tracking.

- ORM models: RunRow (with token fields), ThreadMetaRow, RunEventRow
- RunRepository implements RunStore ABC via SQLAlchemy ORM
- ThreadMetaRepository with owner access control
- DbRunEventStore with trace content truncation and cursor pagination
- JsonlRunEventStore with per-run files and seq recovery from disk
- RunJournal (BaseCallbackHandler) captures LLM/tool/lifecycle events,
  accumulates token usage by caller type, buffers and flushes to store
- RunManager now accepts optional RunStore for persistent backing
- Worker creates RunJournal, writes human_message, injects callbacks
- Gateway deps use factory functions (RunRepository when DB available)
- New endpoints: messages, run messages, run events, token-usage
- ThreadCreateRequest gains assistant_id field
- 92 tests pass (33 new), zero regressions

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(persistence): add user feedback + follow-up run association

Phase 2-C: feedback and follow-up tracking.

- FeedbackRow ORM model (rating +1/-1, optional message_id, comment)
- FeedbackRepository with CRUD, list_by_run/thread, aggregate stats
- Feedback API endpoints: create, list, stats, delete
- follow_up_to_run_id in RunCreateRequest (explicit or auto-detected
  from latest successful run on the thread)
- Worker writes follow_up_to_run_id into human_message event metadata
- Gateway deps: feedback_repo factory + getter
- 17 new tests (14 FeedbackRepository + 3 follow-up association)
- 109 total tests pass, zero regressions

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test+config: comprehensive Phase 2 test coverage + deprecate checkpointer config

- config.example.yaml: deprecate standalone checkpointer section, activate
  unified database:sqlite as default (drives both checkpointer + app data)
- New: test_thread_meta_repo.py (14 tests) — full ThreadMetaRepository coverage
  including check_access owner logic, list_by_owner pagination
- Extended test_run_repository.py (+4 tests) — completion preserves fields,
  list ordering desc, limit, owner_none returns all
- Extended test_run_journal.py (+8 tests) — on_chain_error, track_tokens=false,
  middleware no ai_message, unknown caller tokens, convenience fields,
  tool_error, non-summarization custom event
- Extended test_run_event_store.py (+7 tests) — DB batch seq continuity,
  make_run_event_store factory (memory/db/jsonl/fallback/unknown)
- Extended test_phase2b_integration.py (+4 tests) — create_or_reject persists,
  follow-up metadata, summarization in history, full DB-backed lifecycle
- Fixed DB integration test to use proper fake objects (not MagicMock)
  for JSON-serializable metadata
- 157 total Phase 2 tests pass, zero regressions

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* config: move default sqlite_dir to .deer-flow/data

Keep SQLite databases alongside other DeerFlow-managed data
(threads, memory) under the .deer-flow/ directory instead of a
top-level ./data folder.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor(persistence): remove UTFJSON, use engine-level json_serializer + datetime.now()

- Replace custom UTFJSON type with standard sqlalchemy.JSON in all ORM
  models. Add json_serializer=json.dumps(ensure_ascii=False) to all
  create_async_engine calls so non-ASCII text (Chinese etc.) is stored
  as-is in both SQLite and Postgres.
- Change ORM datetime defaults from datetime.now(UTC) to datetime.now(),
  remove UTC imports.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor(gateway): simplify deps.py with getter factory + inline repos

- Replace 6 identical getter functions with _require() factory.
- Inline 3 _make_*_repo() factories into langgraph_runtime(), call
  get_session_factory() once instead of 3 times.
- Add thread_meta upsert in start_run (services.py).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(docker): add UV_EXTRAS build arg for optional dependencies

Support installing optional dependency groups (e.g. postgres) at
Docker build time via UV_EXTRAS build arg:
  UV_EXTRAS=postgres docker compose build

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor(journal): fix flush, token tracking, and consolidate tests

RunJournal fixes:
- _flush_sync: retain events in buffer when no event loop instead of
  dropping them; worker's finally block flushes via async flush().
- on_llm_end: add tool_calls filter and caller=="lead_agent" guard for
  ai_message events; mark message IDs for dedup with record_llm_usage.
- worker.py: persist completion data (tokens, message count) to RunStore
  in finally block.

Model factory:
- Auto-inject stream_usage=True for BaseChatOpenAI subclasses with
  custom api_base, so usage_metadata is populated in streaming responses.

Test consolidation:
- Delete test_phase2b_integration.py (redundant with existing tests).
- Move DB-backed lifecycle test into test_run_journal.py.
- Add tests for stream_usage injection in test_model_factory.py.
- Clean up executor/task_tool dead journal references.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(events): widen content type to str|dict in all store backends

Allow event content to be a dict (for structured OpenAI-format messages)
in addition to plain strings. Dict values are JSON-serialized for the DB
backend and deserialized on read; memory and JSONL backends handle dicts
natively. Trace truncation now serializes dicts to JSON before measuring.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(events): use metadata flag instead of heuristic for dict content detection

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(converters): add LangChain-to-OpenAI message format converters

Pure functions langchain_to_openai_message, langchain_to_openai_completion,
langchain_messages_to_openai, and _infer_finish_reason for converting
LangChain BaseMessage objects to OpenAI Chat Completions format, used by
RunJournal for event storage. 15 unit tests added.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(converters): handle empty list content as null, clean up test

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(events): human_message content uses OpenAI user message format

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(events): ai_message uses OpenAI format, add ai_tool_call message event

- ai_message content now uses {"role": "assistant", "content": "..."} format
- New ai_tool_call message event emitted when lead_agent LLM responds with tool_calls
- ai_tool_call uses langchain_to_openai_message converter for consistent format
- Both events include finish_reason in metadata ("stop" or "tool_calls")

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(events): add tool_result message event with OpenAI tool message format

Cache tool_call_id from on_tool_start keyed by run_id as fallback for on_tool_end,
then emit a tool_result message event (role=tool, tool_call_id, content) after each
successful tool completion.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(events): summary content uses OpenAI system message format

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(events): replace llm_start/llm_end with llm_request/llm_response in OpenAI format

Add on_chat_model_start to capture structured prompt messages as llm_request events.
Replace llm_end trace events with llm_response using OpenAI Chat Completions format.
Track llm_call_index to pair request/response events.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(events): add record_middleware method for middleware trace events

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test(events): add full run sequence integration test for OpenAI content format

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(events): align message events with checkpoint format and add middleware tag injection

- Message events (ai_message, ai_tool_call, tool_result, human_message) now use
  BaseMessage.model_dump() format, matching LangGraph checkpoint values.messages
- on_tool_end extracts tool_call_id/name/status from ToolMessage objects
- on_tool_error now emits tool_result message events with error status
- record_middleware uses middleware:{tag} event_type and middleware category
- Summarization custom events use middleware:summarize category
- TitleMiddleware injects middleware:title tag via get_config() inheritance
- SummarizationMiddleware model bound with middleware:summarize tag
- Worker writes human_message using HumanMessage.model_dump()

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(threads): switch search endpoint to threads_meta table and sync title

- POST /api/threads/search now queries threads_meta table directly,
  removing the two-phase Store + Checkpointer scan approach
- Add ThreadMetaRepository.search() with metadata/status filters
- Add ThreadMetaRepository.update_display_name() for title sync
- Worker syncs checkpoint title to threads_meta.display_name on run completion
- Map display_name to values.title in search response for API compatibility

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(threads): history endpoint reads messages from event store

- POST /api/threads/{thread_id}/history now combines two data sources:
  checkpointer for checkpoint_id, metadata, title, thread_data;
  event store for messages (complete history, not truncated by summarization)
- Strip internal LangGraph metadata keys from response
- Remove full channel_values serialization in favor of selective fields

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: remove duplicate optional-dependencies header in pyproject.toml

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(middleware): pass tagged config to TitleMiddleware ainvoke call

Without the config, the middleware:title tag was not injected,
causing the LLM response to be recorded as a lead_agent ai_message
in run_events.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: resolve merge conflict in .env.example

Keep both DATABASE_URL (from persistence-scaffold) and WECOM
credentials (from main) after the merge.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(persistence): address review feedback on PR #1851

- Fix naive datetime.now() → datetime.now(UTC) in all ORM models
- Fix seq race condition in DbRunEventStore.put() with FOR UPDATE
  and UNIQUE(thread_id, seq) constraint
- Encapsulate _store access in RunManager.update_run_completion()
- Deduplicate _store.put() logic in RunManager via _persist_to_store()
- Add update_run_completion to RunStore ABC + MemoryRunStore
- Wire follow_up_to_run_id through the full create path
- Add error recovery to RunJournal._flush_sync() lost-event scenario
- Add migration note for search_threads breaking change
- Fix test_checkpointer_none_fix mock to set database=None

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: update uv.lock

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(persistence): address 22 review comments from CodeQL, Copilot, and Code Quality

Bug fixes:
- Sanitize log params to prevent log injection (CodeQL)
- Reset threads_meta.status to idle/error when run completes
- Attach messages only to latest checkpoint in /history response
- Write threads_meta on POST /threads so new threads appear in search

Lint fixes:
- Remove unused imports (journal.py, migrations/env.py, test_converters.py)
- Convert lambda to named function (engine.py, Ruff E731)
- Remove unused logger definitions in repos (Ruff F841)
- Add logging to JSONL decode errors and empty except blocks
- Separate assert side-effects in tests (CodeQL)
- Remove unused local variables in tests (Ruff F841)
- Fix max_trace_content truncation to use byte length, not char length

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* style: apply ruff format to persistence and runtime files

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Potential fix for pull request finding 'Statement has no effect'

Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>

* refactor(runtime): introduce RunContext to reduce run_agent parameter bloat

Extract checkpointer, store, event_store, run_events_config, thread_meta_repo,
and follow_up_to_run_id into a frozen RunContext dataclass. Add get_run_context()
in deps.py to build the base context from app.state singletons. start_run() uses
dataclasses.replace() to enrich per-run fields before passing ctx to run_agent.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor(gateway): move sanitize_log_param to app/gateway/utils.py

Extract the log-injection sanitizer from routers/threads.py into a shared
utils module and rename to sanitize_log_param (public API). Eliminates the
reverse service → router import in services.py.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* perf: use SQL aggregation for feedback stats and thread token usage

Replace Python-side counting in FeedbackRepository.aggregate_by_run with
a single SELECT COUNT/SUM query. Add RunStore.aggregate_tokens_by_thread
abstract method with SQL GROUP BY implementation in RunRepository and
Python fallback in MemoryRunStore. Simplify the thread_token_usage
endpoint to delegate to the new method, eliminating the limit=10000
truncation risk.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: annotate DbRunEventStore.put() as low-frequency path

Add docstring clarifying that put() opens a per-call transaction with
FOR UPDATE and should only be used for infrequent writes (currently
just the initial human_message event). High-throughput callers should
use put_batch() instead.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(threads): fall back to Store search when ThreadMetaRepository is unavailable

When database.backend=memory (default) or no SQL session factory is
configured, search_threads now queries the LangGraph Store instead of
returning 503. Returns empty list if neither Store nor repo is available.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor(persistence): introduce ThreadMetaStore ABC for backend-agnostic thread metadata

Add ThreadMetaStore abstract base class with create/get/search/update/delete
interface. ThreadMetaRepository (SQL) now inherits from it. New
MemoryThreadMetaStore wraps LangGraph BaseStore for memory-mode deployments.

deps.py now always provides a non-None thread_meta_repo, eliminating all
`if thread_meta_repo is not None` guards in services.py, worker.py, and
routers/threads.py. search_threads no longer needs a Store fallback branch.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor(history): read messages from checkpointer instead of RunEventStore

The /history endpoint now reads messages directly from the
checkpointer's channel_values (the authoritative source) instead of
querying RunEventStore.list_messages(). The RunEventStore API is
preserved for other consumers.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(persistence): address new Copilot review comments

- feedback.py: validate thread_id/run_id before deleting feedback
- jsonl.py: add path traversal protection with ID validation
- run_repo.py: parse `before` to datetime for PostgreSQL compat
- thread_meta_repo.py: fix pagination when metadata filter is active
- database_config.py: use resolve_path for sqlite_dir consistency

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Implement skill self-evolution and skill_manage flow (#1874)

* chore: ignore .worktrees directory

* Add skill_manage self-evolution flow

* Fix CI regressions for skill_manage

* Address PR review feedback for skill evolution

* fix(skill-evolution): preserve history on delete

* fix(skill-evolution): tighten scanner fallbacks

* docs: add skill_manage e2e evidence screenshot

* fix(skill-manage): avoid blocking fs ops in session runtime

---------

Co-authored-by: Willem Jiang <willem.jiang@gmail.com>

* fix(config): resolve sqlite_dir relative to CWD, not Paths.base_dir

resolve_path() resolves relative to Paths.base_dir (.deer-flow),
which double-nested the path to .deer-flow/.deer-flow/data/app.db.
Use Path.resolve() (CWD-relative) instead.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Feature/feishu receive file (#1608)

* feat(feishu): add channel file materialization hook for inbound messages

- Introduce Channel.receive_file(msg, thread_id) as a base method for file materialization; default is no-op.
- Implement FeishuChannel.receive_file to download files/images from Feishu messages, save to sandbox, and inject virtual paths into msg.text.
- Update ChannelManager to call receive_file for any channel if msg.files is present, enabling downstream model access to user-uploaded files.
- No impact on Slack/Telegram or other channels (they inherit the default no-op).

* style(backend): format code with ruff for lint compliance

- Auto-formatted packages/harness/deerflow/agents/factory.py and tests/test_create_deerflow_agent.py using `ruff format`
- Ensured both files conform to project linting standards
- Fixes CI lint check failures caused by code style issues

* fix(feishu): handle file write operation asynchronously to prevent blocking

* fix(feishu): rename GetMessageResourceRequest to _GetMessageResourceRequest and remove redundant code

* test(feishu): add tests for receive_file method and placeholder replacement

* fix(manager): remove unnecessary type casting for channel retrieval

* fix(feishu): update logging messages to reflect resource handling instead of image

* fix(feishu): sanitize filename by replacing invalid characters in file uploads

* fix(feishu): improve filename sanitization and reorder image key handling in message processing

* fix(feishu): add thread lock to prevent filename conflicts during file downloads

* fix(test): correct bad merge in test_feishu_parser.py

* chore: run ruff and apply formatting cleanup
fix(feishu): preserve rich-text attachment order and improve fallback filename handling

* fix(docker): restore gateway env vars and fix langgraph empty arg issue (#1915)

Two production docker-compose.yaml bugs prevent `make up` from working:

1. Gateway missing DEER_FLOW_CONFIG_PATH and DEER_FLOW_EXTENSIONS_CONFIG_PATH
   environment overrides. Added in fb2d99f (#1836) but accidentally reverted
   by ca2fb95 (#1847). Without them, gateway reads host paths from .env via
   env_file, causing FileNotFoundError inside the container.

2. Langgraph command fails when LANGGRAPH_ALLOW_BLOCKING is unset (default).
   Empty $${allow_blocking} inserts a bare space between flags, causing
   ' --no-reload' to be parsed as unexpected extra argument. Fix by building
   args string first and conditionally appending --allow-blocking.

Co-authored-by: cooper <cooperfu@tencent.com>

* fix(frontend): resolve invalid HTML nesting and tabnabbing vulnerabilities (#1904)

* fix(frontend): resolve invalid HTML nesting and tabnabbing vulnerabilities

Fix `<button>` inside `<a>` invalid HTML in artifact components and add
missing `noopener,noreferrer` to `window.open` calls to prevent reverse
tabnabbing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(frontend): address Copilot review on tabnabbing and double-tab-open

Remove redundant parent onClick on web_fetch ChainOfThoughtStep to
prevent opening two tabs on link click, and explicitly null out
window.opener after window.open() for defensive tabnabbing hardening.

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* refactor(persistence): organize entities into per-entity directories

Restructure the persistence layer from horizontal "models/ + repositories/"
split into vertical entity-aligned directories. Each entity (thread_meta,
run, feedback) now owns its ORM model, abstract interface (where applicable),
and concrete implementations under a single directory with an aggregating
__init__.py for one-line imports.

Layout:
  persistence/thread_meta/{base,model,sql,memory}.py
  persistence/run/{model,sql}.py
  persistence/feedback/{model,sql}.py

models/__init__.py is kept as a facade so Alembic autogenerate continues to
discover all ORM tables via Base.metadata. RunEventRow remains under
models/run_event.py because its storage implementation lives in
runtime/events/store/db.py and has no matching repository directory.

The repositories/ directory is removed entirely. All call sites in
gateway/deps.py and tests are updated to import from the new entity
packages, e.g.:

    from deerflow.persistence.thread_meta import ThreadMetaRepository
    from deerflow.persistence.run import RunRepository
    from deerflow.persistence.feedback import FeedbackRepository

Full test suite passes (1690 passed, 14 skipped).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(gateway): sync thread rename and delete through ThreadMetaStore

The POST /threads/{id}/state endpoint previously synced title changes
only to the LangGraph Store via _store_upsert. In sqlite mode the search
endpoint reads from the ThreadMetaRepository SQL table, so renames never
appeared in /threads/search until the next agent run completed (worker.py
syncs title from checkpoint to thread_meta in its finally block).

Likewise the DELETE /threads/{id} endpoint cleaned up the filesystem,
Store, and checkpointer but left the threads_meta row orphaned in sqlite,
so deleted threads kept appearing in /threads/search.

Fix both endpoints by routing through the ThreadMetaStore abstraction
which already has the correct sqlite/memory implementations wired up by
deps.py. The rename path now calls update_display_name() and the delete
path calls delete() — both work uniformly across backends.

Verified end-to-end with curl in gateway mode against sqlite backend.
Existing test suite (1690 passed) and focused router/repo tests pass.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor(gateway): route all thread metadata access through ThreadMetaStore

Following the rename/delete bug fix in PR1, migrate the remaining direct
LangGraph Store reads/writes in the threads router and services to the
ThreadMetaStore abstraction so that the sqlite and memory backends behave
identically and the legacy dual-write paths can be removed.

Migrated endpoints (threads.py):
- create_thread: idempotency check + write now use thread_meta_repo.get/create
  instead of dual-writing the LangGraph Store and the SQL row.
- get_thread: reads from thread_meta_repo.get; the checkpoint-only fallback
  for legacy threads is preserved.
- patch_thread: replaced _store_get/_store_put with thread_meta_repo.update_metadata.
- delete_thread_data: dropped the legacy store.adelete; thread_meta_repo.delete
  already covers it.

Removed dead code (services.py):
- _upsert_thread_in_store — redundant with the immediately following
  thread_meta_repo.create() call.
- _sync_thread_title_after_run — worker.py's finally block already syncs
  the title via thread_meta_repo.update_display_name() after each run.

Removed dead code (threads.py):
- _store_get / _store_put / _store_upsert helpers (no remaining callers).
- THREADS_NS constant.
- get_store import (router no longer touches the LangGraph Store directly).

New abstract method:
- ThreadMetaStore.update_metadata(thread_id, metadata) merges metadata into
  the thread's metadata field. Implemented in both ThreadMetaRepository (SQL,
  read-modify-write inside one session) and MemoryThreadMetaStore. Three new
  unit tests cover merge / empty / nonexistent behaviour.

Net change: -134 lines. Full test suite: 1693 passed, 14 skipped.
Verified end-to-end with curl in gateway mode against sqlite backend
(create / patch / get / rename / search / delete).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>
Co-authored-by: DanielWalnut <45447813+hetaoBackend@users.noreply.github.com>
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
Co-authored-by: JilongSun <965640067@qq.com>
Co-authored-by: jie <49781832+stan-fu@users.noreply.github.com>
Co-authored-by: cooper <cooperfu@tencent.com>
Co-authored-by: yangzheli <43645580+yangzheli@users.noreply.github.com>

* feat(auth): release-validation pass for 2.0-rc — 12 blockers + simplify follow-ups (#2008)

* feat(auth): introduce backend auth module

Port RFC-001 authentication core from PR #1728:
- JWT token handling (create_access_token, decode_token, TokenPayload)
- Password hashing (bcrypt) with verify_password
- SQLite UserRepository with base interface
- Provider Factory pattern (LocalAuthProvider)
- CLI reset_admin tool
- Auth-specific errors (AuthErrorCode, TokenError, AuthErrorResponse)

Deps:
- bcrypt>=4.0.0
- pyjwt>=2.9.0
- email-validator>=2.0.0
- backend/uv.toml pins public PyPI index

Tests: 12 pure unit tests (test_auth_config.py, test_auth_errors.py).

Scope note: authz.py, test_auth.py, and test_auth_type_system.py are
deferred to commit 2 because they depend on middleware and deps wiring
that is not yet in place. Commit 1 stays "pure new files only" as the
spec mandates.

* feat(auth): wire auth end-to-end (middleware + frontend replacement)

Backend:
- Port auth_middleware, csrf_middleware, langgraph_auth, routers/auth
- Port authz decorator (owner_filter_key defaults to 'owner_id')
- Merge app.py: register AuthMiddleware + CSRFMiddleware + CORS, add
  _ensure_admin_user lifespan hook, _migrate_orphaned_threads helper,
  register auth router
- Merge deps.py: add get_local_provider, get_current_user_from_request,
  get_optional_user_from_request; keep get_current_user as thin str|None
  adapter for feedback router
- langgraph.json: add auth path pointing to langgraph_auth.py:auth
- Rename metadata['user_id'] -> metadata['owner_id'] in langgraph_auth
  (both metadata write and LangGraph filter dict) + test fixtures

Frontend:
- Delete better-auth library and api catch-all route
- Remove better-auth npm dependency and env vars (BETTER_AUTH_SECRET,
  BETTER_AUTH_GITHUB_*) from env.js
- Port frontend/src/core/auth/* (AuthProvider, gateway-config,
  proxy-policy, server-side getServerSideUser, types)
- Port frontend/src/core/api/fetcher.ts
- Port (auth)/layout, (auth)/login, (auth)/setup pages
- Rewrite workspace/layout.tsx as server component that calls
  getServerSideUser and wraps in AuthProvider
- Port workspace/workspace-content.tsx for the client-side sidebar logic

Tests:
- Port 5 auth test files (test_auth, test_auth_middleware,
  test_auth_type_system, test_ensure_admin, test_langgraph_auth)
- 176 auth tests PASS

After this commit: login/logout/registration flow works, but persistence
layer does not yet filter by owner_id. Commit 4 closes that gap.

* feat(auth): account settings page + i18n

- Port account-settings-page.tsx (change password, change email, logout)
- Wire into settings-dialog.tsx as new "account" section with UserIcon,
  rendered first in the section list
- Add i18n keys:
  - en-US/zh-CN: settings.sections.account ("Account" / "账号")
  - en-US/zh-CN: button.logout ("Log out" / "退出登录")
  - types.ts: matching type declarations

* feat(auth): enforce owner_id across 2.0-rc persistence layer

Add request-scoped contextvar-based owner filtering to threads_meta,
runs, run_events, and feedback repositories. Router code is unchanged
— isolation is enforced at the storage layer so that any caller that
forgets to pass owner_id still gets filtered results, and new routes
cannot accidentally leak data.

Core infrastructure
-------------------
- deerflow/runtime/user_context.py (new):
  - ContextVar[CurrentUser | None] with default None
  - runtime_checkable CurrentUser Protocol (structural subtype with .id)
  - set/reset/get/require helpers
  - AUTO sentinel + resolve_owner_id(value, method_name) for sentinel
    three-state resolution: AUTO reads contextvar, explicit str
    overrides, explicit None bypasses the filter (for migration/CLI)

Repository changes
------------------
- ThreadMetaRepository: create/get/search/update_*/delete gain
  owner_id=AUTO kwarg; read paths filter by owner, writes stamp it,
  mutations check ownership before applying
- RunRepository: put/get/list_by_thread/delete gain owner_id=AUTO kwarg
- FeedbackRepository: create/get/list_by_run/list_by_thread/delete
  gain owner_id=AUTO kwarg
- DbRunEventStore: list_messages/list_events/list_messages_by_run/
  count_messages/delete_by_thread/delete_by_run gain owner_id=AUTO
  kwarg. Write paths (put/put_batch) read contextvar softly: when a
  request-scoped user is available, owner_id is stamped; background
  worker writes without a user context pass None which is valid
  (orphan row to be bound by migration)

Schema
------
- persistence/models/run_event.py: RunEventRow.owner_id = Mapped[
  str | None] = mapped_column(String(64), nullable=True, index=True)
- No alembic migration needed: 2.0 ships fresh, Base.metadata.create_all
  picks up the new column automatically

Middleware
----------
- auth_middleware.py: after cookie check, call get_optional_user_from_
  request to load the real User, stamp it into request.state.user AND
  the contextvar via set_current_user, reset in a try/finally. Public
  paths and unauthenticated requests continue without contextvar, and
  @require_auth handles the strict 401 path

Test infrastructure
-------------------
- tests/conftest.py: @pytest.fixture(autouse=True) _auto_user_context
  sets a default SimpleNamespace(id="test-user-autouse") on every test
  unless marked @pytest.mark.no_auto_user. Keeps existing 20+
  persistence tests passing without modification
- pyproject.toml [tool.pytest.ini_options]: register no_auto_user
  marker so pytest does not emit warnings for opt-out tests
- tests/test_user_context.py: 6 tests covering three-state semantics,
  Protocol duck typing, and require/optional APIs
- tests/test_thread_meta_repo.py: one test updated to pass owner_id=
  None explicitly where it was previously relying on the old default

Test results
------------
- test_user_context.py: 6 passed
- test_auth*.py + test_langgraph_auth.py + test_ensure_admin.py: 127
- test_run_event_store / test_run_repository / test_thread_meta_repo
  / test_feedback: 92 passed
- Full backend suite: 1905 passed, 2 failed (both @requires_llm flaky
  integration tests unrelated to auth), 1 skipped

* feat(auth): extend orphan migration to 2.0-rc persistence tables

_ensure_admin_user now runs a three-step pipeline on every boot:

  Step 1 (fatal):     admin user exists / is created / password is reset
  Step 2 (non-fatal): LangGraph store orphan threads → admin
  Step 3 (non-fatal): SQL persistence tables → admin
    - threads_meta
    - runs
    - run_events
    - feedback

Each step is idempotent. The fatal/non-fatal split mirrors PR #1728's
original philosophy: admin creation failure blocks startup (the system
is unusable without an admin), whereas migration failures log a warning
and let the service proceed (a partial migration is recoverable; a
missing admin is not).

Key helpers
-----------
- _iter_store_items(store, namespace, *, page_size=500):
  async generator that cursor-paginates across LangGraph store pages.
  Fixes PR #1728's hardcoded limit=1000 bug that would silently lose
  orphans beyond the first page.

- _migrate_orphaned_threads(store, admin_user_id):
  Rewritten to use _iter_store_items. Returns the migrated count so the
  caller can log it; raises only on unhandled exceptions.

- _migrate_orphan_sql_tables(admin_user_id):
  Imports the 4 ORM models lazily, grabs the shared session factory,
  runs one UPDATE per table in a single transaction, commits once.
  No-op when no persistence backend is configured (in-memory dev).

Tests: test_ensure_admin.py (8 passed)

* test(auth): port AUTH test plan docs + lint/format pass

- Port backend/docs/AUTH_TEST_PLAN.md and AUTH_UPGRADE.md from PR #1728
- Rename metadata.user_id → metadata.owner_id in AUTH_TEST_PLAN.md
  (4 occurrences from the original PR doc)
- ruff auto-fix UP037 in sentinel type annotations: drop quotes around
  "str | None | _AutoSentinel" now that from __future__ import
  annotations makes them implicit string forms
- ruff format: 2 files (app/gateway/app.py, runtime/user_context.py)

Note on test coverage additions:
- conftest.py autouse fixture was already added in commit 4 (had to
  be co-located with the repository changes to keep pre-existing
  persistence tests passing)
- cross-user isolation E2E tests (test_owner_isolation.py) deferred
  — enforcement is already proven by the 98-test repository suite
  via the autouse fixture + explicit _AUTO sentinel exercises
- New test cases (TC-API-17..20, TC-ATK-13, TC-MIG-01..07) listed
  in AUTH_TEST_PLAN.md are deferred to a follow-up PR — they are
  manual-QA test cases rather than pytest code, and the spec-level
  coverage is already met by test_user_context.py + the 98-test
  repository suite.

Final test results:
- Auth suite (test_auth*, test_langgraph_auth, test_ensure_admin,
  test_user_context): 186 passed
- Persistence suite (test_run_event_store, test_run_repository,
  test_thread_meta_repo, test_feedback): 98 passed
- Lint: ruff check + ruff format both clean

* test(auth): add cross-user isolation test suite

10 tests exercising the storage-layer owner filter by manually
switching the user_context contextvar between two users. Verifies
the safety invariant:

  After a repository write with owner_id=A, a subsequent read with
  owner_id=B must not return the row, and vice versa.

Covers all 4 tables that own user-scoped data:

TC-API-17  threads_meta  — read, search, update, delete cross-user
TC-API-18  runs          — get, list_by_thread, delete cross-user
TC-API-19  run_events    — list_messages, list_events, count_messages,
                           delete_by_thread (CRITICAL: raw conversation
                           content leak vector)
TC-API-20  feedback      — get, list_by_run, delete cross-user

Plus two meta-tests verifying the sentinel pattern itself:
- AUTO + unset contextvar raises RuntimeError
- explicit owner_id=None bypasses the filter (migration escape hatch)

Architecture note
-----------------
These tests bypass the HTTP layer by design. The full chain
(cookie → middleware → contextvar → repository) is covered piecewise:

- test_auth_middleware.py: middleware sets contextvar from cookies
- test_owner_isolation.py: repositories enforce isolation when
  contextvar is set to different users

Together they prove the end-to-end safety property without the
ceremony of spinning up a full TestClient + in-memory DB for every
router endpoint.

Tests pass: 231 (full auth + persistence + isolation suite)
Lint: clean

* refactor(auth): migrate user repository to SQLAlchemy ORM

Move the users table into the shared persistence engine so auth
matches the pattern of threads_meta, runs, run_events, and feedback —
one engine, one session factory, one schema init codepath.

New files
---------
- persistence/user/__init__.py, persistence/user/model.py: UserRow
  ORM class with partial unique index on (oauth_provider, oauth_id)
- Registered in persistence/models/__init__.py so
  Base.metadata.create_all() picks it up

Modified
--------
- auth/repositories/sqlite.py: rewritten as async SQLAlchemy,
  identical constructor pattern to the other four repositories
  (def __init__(self, session_factory) + self._sf = session_factory)
- auth/config.py: drop users_db_path field — storage is configured
  through config.database like every other table
- deps.py/get_local_provider: construct SQLiteUserRepository with
  the shared session factory, fail fast if engine is not initialised
- tests/test_auth.py: rewrite test_sqlite_round_trip_new_fields to
  use the shared engine (init_engine + close_engine in a tempdir)
- tests/test_auth_type_system.py: add per-test autouse fixture that
  spins up a scratch engine and resets deps._cached_* singletons

* refactor(auth): remove SQL orphan migration (unused in supported scenarios)

The _migrate_orphan_sql_tables helper existed to bind NULL owner_id
rows in threads_meta, runs, run_events, and feedback to the admin on
first boot. But in every supported upgrade path, it's a no-op:

  1. Fresh install: create_all builds fresh tables, no legacy rows
  2. No-auth → with-auth (no existing persistence DB): persistence
     tables are created fresh by create_all, no legacy rows
  3. No-auth → with-auth (has existing persistence DB from #1930):
     NOT a supported upgrade path — "有 DB 到有 DB" schema evolution
     is out of scope; users wipe DB or run manual ALTER

So the SQL orphan migration never has anything to do in the
supported matrix. Delete the function, simplify _ensure_admin_user
from a 3-step pipeline to a 2-step one (admin creation + LangGraph
store orphan migration only).

LangGraph store orphan migration stays: it serves the real
"no-auth → with-auth" upgrade path where a user's existing LangGraph
thread metadata has no owner_id field and needs to be stamped with
the newly-created admin's id.

Tests: 284 passed (auth + persistence + isolation)
Lint: clean

* security(auth): write initial admin password to 0600 file instead of logs

CodeQL py/clear-text-logging-sensitive-data flagged 3 call sites that
logged the auto-generated admin password to stdout via logger.info().
Production log aggregators (ELK/Splunk/etc) would have captured those
cleartext secrets. Replace with a shared helper that writes to
.deer-flow/admin_initial_credentials.txt with mode 0600, and log only
the path.

New file
--------
- app/gateway/auth/credential_file.py: write_initial_credentials()
  helper. Takes email, password, and a "initial"/"reset" label.
  Creates .deer-flow/ if missing, writes a header comment plus the
  email+password, chmods 0o600, returns the absolute Path.

Modified
--------
- app/gateway/app.py: both _ensure_admin_user paths (fresh creation
  + needs_setup password reset) now write to file and log the path
- app/gateway/auth/reset_admin.py: rewritten to use the shared ORM
  repo (SQLiteUserRepository with session_factory) and the
  credential_file helper. The previous implementation was broken
  after the earlier ORM refactor — it still imported _get_users_conn
  and constructed SQLiteUserRepository() without a session factory.

No tests changed — the three password-log sites are all exercised
via existing test_ensure_admin.py which checks that startup
succeeds, not that a specific string appears in logs.

CodeQL alerts 272, 283, 284: all resolved.

* security(auth): strict JWT validation in middleware (fix junk cookie bypass)

AUTH_TEST_PLAN test 7.5.8 expects junk cookies to be rejected with
401. The previous middleware behaviour was "presence-only": check
that some access_token cookie exists, then pass through. In
combination with my Task-12 decision to skip @require_auth
decorators on routes, this created a gap where a request with any
cookie-shaped string (e.g. access_token=not-a-jwt) would bypass
authentication on routes that do not touch the repository
(/api/models, /api/mcp/config, /api/memory, /api/skills, …).

Fix: middleware now calls get_current_user_from_request() strictly
and catches the resulting HTTPException to render a 401 with the
proper fine-grained error code (token_invalid, token_expired,
user_not_found, …). On success it stamps request.state.user and
the contextvar so repository-layer owner filters work downstream.

The 4 old "_with_cookie_passes" tests in test_auth_middleware.py
were written for the presence-only behaviour; they asserted that
a junk cookie would make the handler return 200. They are renamed
to "_with_junk_cookie_rejected" and their assertions flipped to
401. The negative path (no cookie → 401 not_authenticated)
is unchanged.

Verified:
  no cookie       → 401 not_authenticated
  junk cookie     → 401 token_invalid     (the fixed bug)
  expired cookie  → 401 token_expired

Tests: 284 passed (auth + persistence + isolation)
Lint: clean

* security(auth): wire @require_permission(owner_check=True) on isolation routes

Apply the require_permission decorator to all 28 routes that take a
{thread_id} path parameter. Combined with the strict middleware
(previous commit), this gives the double-layer protection that
AUTH_TEST_PLAN test 7.5.9 documents:

  Layer 1 (AuthMiddleware): cookie + JWT validation, rejects junk
                            cookies and stamps request.state.user
  Layer 2 (@require_permission with owner_check=True): per-resource
                            ownership verification via
                            ThreadMetaStore.check_access — returns
                            404 if a different user owns the thread

The decorator's owner_check branch is rewritten to use the SQL
thread_meta_repo (the 2.0-rc persistence layer) instead of the
LangGraph store path that PR #1728 used (_store_get / get_store
in routers/threads.py). The inject_record convenience is dropped
— no caller in 2.0 needs the LangGraph blob, and the SQL repo has
a different shape.

Routes decorated (28 total):
- threads.py: delete, patch, get, get-state, post-state, post-history
- thread_runs.py: post-runs, post-runs-stream, post-runs-wait,
  list_runs, get_run, cancel_run, join_run, stream_existing_run,
  list_thread_messages, list_run_messages, list_run_events,
  thread_token_usage
- feedback.py: create, list, stats, delete
- uploads.py: upload (added Request param), list, delete
- artifacts.py: get_artifact
- suggestions.py: generate (renamed body parameter to avoid
  conflict with FastAPI Request)

Test fixes:
- test_suggestions_router.py: bypass the decorator via __wrapped__
  (the unit tests cover parsing logic, not auth — no point spinning
  up a thread_meta_repo just to test JSON unwrapping)
- test_auth_middleware.py 4 fake-cookie tests: already updated in
  the previous commit (745bf432)

Tests: 293 passed (auth + persistence + isolation + suggestions)
Lint: clean

* security(auth): defense-in-depth fixes from release validation pass

Eight findings caught while running the AUTH_TEST_PLAN end-to-end against
the deployed sg_dev stack. Each is a pre-condition for shipping
release/2.0-rc that the previous PRs missed.

Backend hardening
- routers/auth.py: rate limiter X-Real-IP now requires AUTH_TRUSTED_PROXIES
  whitelist (CIDR/IP allowlist). Without nginx in front, the previous code
  honored arbitrary X-Real-IP, letting an attacker rotate the header to
  fully bypass the per-IP login lockout.
- routers/auth.py: 36-entry common-password blocklist via Pydantic
  field_validator on RegisterRequest + ChangePasswordRequest. The shared
  _validate_strong_password helper keeps the constraint in one place.
- routers/threads.py: ThreadCreateRequest + ThreadPatchRequest strip
  server-reserved metadata keys (owner_id, user_id) via Pydantic
  field_validator so a forged value can never round-trip back to other
  clients reading the same thread. The actual ownership invariant stays
  on the threads_meta row; this closes the metadata-blob echo gap.
- authz.py + thread_meta/sql.py: require_permission gains a require_existing
  flag plumbed through check_access(require_existing=True). Destructive
  routes (DELETE/PATCH/state-update/runs/feedback) now treat a missing
  thread_meta row as 404 instead of "untracked legacy thread, allow",
  closing the cross-user delete-idempotence gap where any user could
  successfully DELETE another user's deleted thread.
- repositories/sqlite.py + base.py: update_user raises UserNotFoundError
  on a vanished row instead of silently returning the input. Concurrent
  delete during password reset can no longer look like a successful update.
- runtime/user_context.py: resolve_owner_id() coerces User.id (UUID) to
  str at the contextvar boundary so SQLAlchemy String(64) columns can
  bind it. The whole 2.0-rc isolation pipeline was previously broken
  end-to-end (POST /api/threads → 500 "type 'UUID' is not supported").
- persistence/engine.py: SQLAlchemy listener enables PRAGMA journal_mode=WAL,
  synchronous=NORMAL, foreign_keys=ON on every new SQLite connection.
  TC-UPG-06 in the test plan expects WAL; previous code shipped with the
  default 'delete' journal.
- auth_middleware.py: stamp request.state.auth = AuthContext(...) so
  @require_permission's short-circuit fires; previously every isolation
  request did a duplicate JWT decode + users SELECT. Also unifies the
  401 payload through AuthErrorResponse(...).model_dump().
- app.py: _ensure_admin_user restructure removes the noqa F821 scoping
  bug where 'password' was referenced outside the branch that defined it.
  New _announce_credentials helper absorbs the duplicate log block in
  the fresh-admin and reset-admin branches.

* fix(frontend+nginx): rollout CSRF on every state-changing client path

The frontend was 100% broken in gateway-pro mode for any user trying to
open a specific chat thread. Three cumulative bugs each silently
masked the next.

LangGraph SDK CSRF gap (api-client.ts)
- The Client constructor took only apiUrl, no defaultHeaders, no fetch
  interceptor. The SDK's internal fetch never sent X-CSRF-Token, so
  every state-changing /api/langgraph-compat/* call (runs/stream,
  threads/search, threads/{tid}/history, ...) hit CSRFMiddleware and
  got 403 before reaching the auth check. UI symptom: empty thread page
  with no error message; the SPA's hooks swallowed the rejection.
- Fix: pass an onRequest hook that injects X-CSRF-Token from the
  csrf_token cookie per request. Reading the cookie per call (not at
  construction time) handles login / logout / password-change cookie
  rotation transparently. The SDK's prepareFetchOptions calls
  onRequest for both regular requests AND streaming/SSE/reconnect, so
  the same hook covers runs.stream and runs.joinStream.

Raw fetch CSRF gap (7 files)
- Audit: 11 frontend fetch sites, only 2 included CSRF (login/setup +
  account-settings change-password). The other 7 routed through raw
  fetch() with no header — suggestions, memory, agents, mcp, skills,
  uploads, and the local thread cleanup hook all 403'd silently.
- Fix: enhance fetcher.ts:fetchWithAuth to auto-inject X-CSRF-Token on
  POST/PUT/DELETE/PATCH from a single shared readCsrfCookie() helper.
  Convert all 7 raw fetch() callers to fetchWithAuth so the contract
  is centrally enforced. api-client.ts and fetcher.ts share
  readCsrfCookie + STATE_CHANGING_METHODS to avoid drift.

nginx routing + buffering (nginx.local.conf)
- The auth feature shipped without updating the nginx config: per-API
  explicit location blocks but no /api/v1/auth/, /api/feedback, /api/runs.
  The frontend's client-side fetches to /api/v1/auth/login/local 404'd
  from the Next.js side because nginx routed /api/* to the frontend.
- Fix: add catch-all `location /api/` that proxies to the gateway.
  nginx longest-prefix matching keeps the explicit blocks (/api/models,
  /api/threads regex, /api/langgraph/, ...) winning for their paths.
- Fix: disable proxy_buffering + proxy_request_buffering for the
  frontend `location /` block. Without it, nginx tries to spool large
  Next.js chunks into /var/lib/nginx/proxy (root-owned) and fails with
  Permission denied → ERR_INCOMPLETE_CHUNKED_ENCODING → ChunkLoadError.

* test(auth): release-validation test infra and new coverage

Test fixtures and unit tests added during the validation pass.

Router test helpers (NEW: tests/_router_auth_helpers.py)
- make_authed_test_app(): builds a FastAPI test app with a stub
  middleware that stamps request.state.user + request.state.auth and a
  permissive thread_meta_repo mock. TestClient-based router tests
  (test_artifacts_router, test_threads_router) use it instead of bare
  FastAPI() so the new @require_permission(owner_check=True) decorators
  short-circuit cleanly.
- call_unwrapped(): walks the __wrapped__ chain to invoke the underlying
  handler without going through the authz wrappers. Direct-call tests
  (test_uploads_router) use it. Typed with ParamSpec so the wrapped
  signature flows through.

Backend test additions
- test_auth.py: 7 tests for the new _get_client_ip trust model (no
  proxy / trusted proxy / untrusted peer / XFF rejection / invalid
  CIDR / no client). 5 tests for the password blocklist (literal,
  case-insensitive, strong password accepted, change-password binding,
  short-password length-check still fires before blocklist).
  test_update_user_raises_when_row_concurrently_deleted: closes a
  shipped-without-coverage gap on the new UserNotFoundError contract.
- test_thread_meta_repo.py: 4 tests for check_access(require_existing=True)
  — strict missing-row denial, strict owner match, strict owner mismatch,
  strict null-owner still allowed (shared rows survive the tightening).
- test_ensure_admin.py: 3 tests for _migrate_orphaned_threads /
  _iter_store_items pagination, covering the TC-UPG-02 upgrade story
  end-to-end via mock store. Closes the gap where the cursor pagination
  was untested even though the previous PR rewrote it.
- test_threads_router.py: 5 tests for _strip_reserved_metadata
  (owner_id removal, user_id removal, safe-keys passthrough, empty
  input, both-stripped).
- test_auth_type_system.py: replace "password123" fixtures with
  Tr0ub4dor3a / AnotherStr0ngPwd! so the new password blocklist
  doesn't reject the test data.

* docs(auth): refresh TC-DOCKER-05 + document Docker validation gap

- AUTH_TEST_PLAN.md TC-DOCKER-05: the previous expectation
  ("admin password visible in docker logs") was stale after the simplify
  pass that moved credentials to a 0600 file. The grep "Password:" check
  would have silently failed and given a false sense of coverage. New
  expectation matches the actual file-based path: 0600 file in
  DEER_FLOW_HOME, log shows the path (not the secret), reverse-grep
  asserts no leaked password in container logs.
- NEW: docs/AUTH_TEST_DOCKER_GAP.md documents the only un-executed
  block in the test plan (TC-DOCKER-01..06). Reason: sg_dev validation
  host has no Docker daemon installed. The doc maps each Docker case
  to an already-validated bare-metal equivalent (TC-1.1, TC-REENT-01,
  TC-API-02 etc.) so the gap is auditable, and includes pre-flight
  reproduction steps for whoever has Docker available.

---------

Co-authored-by: greatmengqi <chenmengqi.0376@bytedance.com>

* refactor(persistence): unify SQLite to single deerflow.db and move checkpointer to runtime

Merge checkpoints.db and app.db into a single deerflow.db file (WAL mode
handles concurrent access safely). Move checkpointer module from
agents/checkpointer to runtime/checkpointer to better reflect its role
as a runtime infrastructure concern.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor(persistence): rename owner_id to user_id and thread_meta_repo to thread_store

Rename owner_id to user_id across all persistence models, repositories,
stores, routers, and tests for clearer semantics. Rename thread_meta_repo
to thread_store for consistency with run_store/run_event_store naming.
Add ThreadMetaStore return type annotation to get_thread_store().

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(persistence): unify ThreadMetaStore interface with user isolation and factory

Add user_id parameter to all ThreadMetaStore abstract methods. Implement
owner isolation in MemoryThreadMetaStore with _get_owned_record helper.
Add check_access to base class and memory implementation. Add
make_thread_store factory to simplify deps.py initialization. Add
memory-backend isolation tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(feedback): add UNIQUE(thread_id, run_id, user_id) constraint

Add UNIQUE constraint to FeedbackRow to enforce one feedback per user per run,
enabling upsert behavior in Task 2. Update tests to use distinct user_ids for
multiple feedback records per run, and pass user_id=None to list_by_run for
admin-style queries that bypass user isolation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(feedback): add upsert() method with UNIQUE enforcement

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(feedback): add delete_by_run() and list_by_thread_grouped()

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(feedback): add PUT upsert and DELETE-by-run endpoints

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(feedback): enrich messages endpoint with per-run feedback data

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(feedback): add frontend feedback API client

Adds upsertFeedback and deleteFeedback API functions backed by
fetchWithAuth, targeting the /api/threads/{id}/runs/{id}/feedback
endpoint.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(feedback): wire feedback data into message rendering for history echo

Adds useThreadFeedback hook that fetches run-level feedback from the
messages API and builds a runId->FeedbackData map. MessageList now calls
this hook and passes feedback and runId to each MessageListItem so
previously-submitted thumbs are pre-filled when revisiting a thread.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(feedback): correct run_id mapping for feedback echo

The feedbackMap was keyed by run_id but looked up by LangGraph message ID.
Fixed by tracking AI message ordinal index to correlate event store
run_ids with LangGraph SDK messages.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(feedback): use real threadId and refresh after stream

- Pass threadId prop to MessageListItem instead of reading "new" from URL params
- Invalidate thread-feedback query on stream finish so buttons appear immediately
- Show feedback buttons always visible, copy button on hover only

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* style(feedback): group copy and feedback buttons together on the left

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* style(feedback): always show toolbar buttons without hover

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(persistence): stream hang when run_events.backend=db

DbRunEventStore._user_id_from_context() returned user.id without
coercing it to str. User.id is a Pydantic UUID, and aiosqlite cannot
bind a raw UUID object to a VARCHAR column, so the INSERT for the
initial human_message event silently rolled back and raised out of
the worker task. Because that put() sat outside the worker's try
block, the finally-clause that publishes end-of-stream never ran
and the SSE stream hung forever.

jsonl mode was unaffected because json.dumps(default=str) coerces
UUID objects transparently.

Fixes:
- db.py: coerce user.id to str at the context-read boundary (matches
  what resolve_user_id already does for the other repositories)
- worker.py: move RunJournal init + human_message put inside the try
  block so any failure flows through the finally/publish_end path
  instead of hanging the subscriber

Defense-in-depth:
- engine.py: add PRAGMA busy_timeout=5000 so checkpointer and event
  store wait for each other on the shared deerflow.db file instead
  of failing immediately under write-lock contention
- journal.py: skip fire-and-forget _flush_sync when a previous flush
  task is still in flight, to avoid piling up concurrent put_batch
  writes on the same SQLAlchemy engine during streaming; flush() now
  waits for pending tasks before draining the buffer
- database_config.py: doc-only update clarifying WAL + busy_timeout
  keep the unified deerflow.db safe for both workloads

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore(persistence): drop redundant busy_timeout PRAGMA

Python's sqlite3 driver defaults to a 5-second busy timeout via the
``timeout`` kwarg of ``sqlite3.connect``, and aiosqlite + SQLAlchemy's
aiosqlite dialect inherit that default. Setting ``PRAGMA busy_timeout=5000``
explicitly was a no-op — verified by reading back the PRAGMA on a fresh
connection (it already reports 5000ms without our PRAGMA).

Concurrent stress test (50 checkpoint writes + 20 event batches + 50
thread_meta updates on the same deerflow.db) still completes with zero
errors and 200/200 rows after removing the explicit PRAGMA.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(journal): unwrap Command tool results in on_tool_end

Tools that update graph state (e.g. ``present_files``) return
``Command(update={'messages': [ToolMessage(...)], 'artifacts': [...]})``.
LangGraph later unwraps the inner ``ToolMessage`` into checkpoint state,
but ``RunJournal.on_tool_end`` was receiving the ``Command`` object
directly via the LangChain callback chain and storing
``str(Command(update={...}))`` as the tool_result content.

This produced a visible divergence between the event-store and the
checkpoint for any thread that used a Command-returning tool, blocking
the event-store-backed history fix in the follow-up commit. Concrete
example from thread ``6d30913e-dcd4-41c8-8941-f66c716cf359`` (seq=48):
checkpoint had ``'Successfully presented files'`` while event_store
stored the full Command repr.

The fix detects ``Command`` in ``on_tool_end``, extracts the first
``ToolMessage`` from ``update['messages']``, and lets the existing
ToolMessage branch handle the ``model_dump()`` path. Legacy rows still
containing the Command repr are separately cleaned up by the history
helper in the follow-up commit.

Tests:
- ``test_tool_end_unwraps_command_with_inner_tool_message`` — unit test
  of the unwrap branch with a constructed Command
- ``test_tool_invoke_end_to_end_unwraps_command`` — end-to-end via
  ``CallbackManager`` + ``tool.invoke`` to exercise the real LangChain
  dispatch path that production uses, matching the repro shape from
  ``present_files``
- Counter-proof: temporarily reverted the patch, both tests failed with
  the exact ``Command(update={...})`` repr that was stored in the
  production SQLite row at seq=48, confirming LangChain does pass the
  ``Command`` through callbacks (the unwrap is load-bearing)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(threads): load history messages from event store, immune to summarize

``get_thread_history`` and ``get_thread_state`` in Gateway mode read
messages from ``checkpoint.channel_values["messages"]``. After
SummarizationMiddleware runs mid-run, that list is rewritten in-place:
pre-summarize messages are dropped and a synthetic summary-as-human
message takes position 0. The frontend then renders a chat history that
starts with ``"Here is a summary of the conversation to date:..."``
instead of the user's original query, and all earlier turns are gone.

The event store (``RunEventStore``) is append-only and never rewritten,
so it retains the full transcript. This commit adds a helper
``_get_event_store_messages`` that loads the event store's message
stream and overrides ``values["messages"]`` in both endpoints; the
checkpoint fallback kicks in only when the event store is unavailable.

Behavior contract of the helper:

- **Full pagination.** ``list_messages`` returns the newest ``limit``
  records when no cursor is given, so a fixed limit silently drops
  older messages on long threads. The helper sizes the read from
  ``count_messages()`` and pages forward with ``after_seq`` cursors.
- **Copy-on-read.** Each content dict is copied before ``id`` is
  patched so the live store object (``MemoryRunEventStore`` returns
  references) is never mutated.
- **Stable ids.** Messages with ``id=None`` (human + tool_result,
  which don't receive an id until checkpoint persistence) get a
  deterministic ``uuid5(NAMESPACE_URL, f"{thread_id}:{seq}")`` so
  React keys stay stable across requests. AI messages keep their
  LLM-assigned ``lc_run--*`` ids.
- **Legacy ``Command`` repr sanitization.** Rows captured before the
  ``journal.py`` ``on_tool_end`` fix (previous commit) stored
  ``str(Command(update={'messages': [ToolMessage(content='X', ...)]}))``
  as the tool_result content. ``_sanitize_legacy_command_repr``
  regex-extracts the inner text so old threads render cleanly.
- **Inline feedback.** When loading the stream, the helper also pulls
  ``feedback_repo.list_by_thread_grouped`` and attaches ``run_id`` to
  every message plus ``feedback`` to the final ``ai_message`` of each
  run. This removes the frontend's need to fetch a second endpoint
  and positional-index-map its way back to the right run. When the
  feedback subsystem is unavailable, the ``feedback`` field is left
  absent entirely so the frontend hides the button rather than
  rendering it over a broken write path.
- **User context.** ``DbRunEventStore`` is user-scoped by default via
  ``resolve_user_id(AUTO)``. The helper relies on the ``@require_permission``
  decorator having populated the user contextvar on both callers; the
  docstring documents this dependency explicitly so nobody wires it
  into a CLI or migration script without passing ``user_id=None``.

Real data verification against thread
``6d30913e-dcd4-41c8-8941-f66c716cf359``: checkpoint showed 12 messages
(summarize-corrupted), event store had 16. The original human message
``"最新伊美局势"`` was preserved as seq=1 in the event store and
correctly restored to position 0 in the helper output. Helper output
for AI messages was byte-identical to checkpoint for every overlapping
message; only tool_result ids differed (patched to uuid5) and the
legacy Command repr at seq=48 was sanitized.

Tests:
- ``test_thread_state_event_store.py`` — 18 tests covering
  ``_sanitize_legacy_command_repr`` (passthrough, single/double-quote
  extraction, unparseable fallback), helper happy path (all message
  types, stable uuid5, store non-mutation), multi-page pagination,
  summarize regression (recovers pre-summarize messages), feedback
  attachment (per-run, multi-run threads, repo failure graceful),
  and dependency failure fallback to ``None``.

Docs:
- ``docs/superpowers/plans/2026-04-10-event-store-history.md`` — the
  implementation plan this commit realizes, with Task 1 revised after
  the evaluation findings (pagination, copy-on-read, Command wrap
  already landed in journal.py, frontend feedback pagination in the
  follow-up commit, Standard-mode follow-up noted).
- ``docs/superpowers/specs/2026-04-11-runjournal-history-evaluation.md``
  — the Claude + second-opinion evaluation document that drove the
  plan revisions (pagination bug, dict-mutation bug, feedback hidden
  bug, Command bug).
- ``docs/superpowers/specs/2026-04-11-summarize-marker-design.md`` —
  design for a follow-up PR that visually marks summarize events in
  history, based on a verified ``adispatch_custom_event`` experiment
  (``trace=False`` middleware nodes can still forward the Pregel task
  config via explicit signature injection).

Scope: Gateway mode only (``make dev-pro``). Standard mode
(``make dev``) hits LangGraph Server directly and bypasses these
endpoints; the summarize symptom is still present there and is
tracked as a separate follow-up in the plan.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor(feedback): inline feedback on history and drop positional mapping

The old ``useThreadFeedback`` hook loaded ``GET /api/threads/{id}/messages?limit=200``
and built two parallel lookup tables: ``runIdByAiIndex`` (an ordinal array of
run_ids for every ``ai_message``-typed event) and ``feedbackByRunId``. The render
loop in ``message-list.tsx`` walked the AI messages in order, incrementing
``aiMessageIndex`` on each non-human message, and used that ordinal to look up
the run_id and feedback.

This shape had three latent bugs we could observe on real threads:

1. **Fetch was capped at 200 messages.** Long or tool-heavy threads silently
   dropped earlier entries from the map, so feedback buttons could be missing
   on messages they should own.
2. **Ordinal mismatch.** The render loop counted every non-human message
   (including each intermediate ``ai_tool_call``), but ``runIdByAiIndex`` only
   pushed entries for ``event_type == "ai_message"``. A run with 3 tool_calls
   + 1 final AI message would push 1 entry while the render consumed 4
   positions, so buttons mapped to the wrong positions across multi-run
   threads.
3. **Two parallel data paths.** The ``/history`` render path and the
   ``/messages`` feedback-lookup path could drift in-between an
   ``invalidateQueries`` call and the next refetch, producing transient
   mismaps.

The previous commit moved the authoritative message source for history to
the event store and added ``run_id`` + ``feedback`` inline on each message
dict returned by ``_get_event_store_messages``. This commit aligns the
frontend with that contract:

- **Delete** ``useThreadFeedback``, ``ThreadFeedbackData``,
  ``runIdByAiIndex``, ``feedbackByRunId``, and ``fetchAllThreadMessages``.
- **Introduce** ``useThreadMessageEnrichment`` that fetches
  ``POST /history?limit=1`` once, indexes the returned messages by
  ``message.id`` into a ``Map<id, {run_id, feedback?}>``, and invalidates
  on stream completion (``onFinish`` in ``useThreadStream``). Keying by
  ``message.id`` is stable across runs, tool_call chains, and summarize.
- **Simplify** ``message-list.tsx`` to drop the ``aiMessageIndex``
  counter and read ``enrichment?.get(msg.id)`` at each render step.
- **Rewire** ``message-list-item.tsx`` so the feedback button renders
  when ``feedback !== undefined`` rather than when the message happens
  to be non-human. ``feedback`` is ``undefined`` for non-eligible
  messages (humans, non-final AI, tools), ``null`` for the final
  ai_message of an unrated run, and a ``FeedbackData`` object once
  rated — cleanly distinguishing "not eligible" from "eligible but
  unrated".

``/api/threads/{id}/messages`` is kept as a debug/export surface; no
frontend code calls it anymore but the backend router is untouched.

Validation:
- ``pnpm check`` clean (0 errors, 1 pre-existing unrelated warning)
- Live test on thread ``3d5dea4a`` after gateway restart confirmed the
  original user query is restored to position 0 and the feedback
  button behaves correctly on the final AI message.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(rebase): remove duplicate definitions and update stale module paths

Rebase left duplicate function blocks in worker.py (triple human_message
write causing 3x user messages in /history), deps.py, and prompt.py.
Also update checkpointer imports from the old deerflow.agents.checkpointer
path to deerflow.runtime.checkpointer, and clean up orphaned feedback
props in the frontend message components.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(rebase): restore FeedbackButtons component and enrichment lost during rebase

The FeedbackButtons component (defined inline in message-list-item.tsx)
was introduced in commit 95df8d13 but lost during rebase. The previous
rebase cleanup commit incorrectly removed the feedback/runId props and
enrichment hook as "orphaned code" instead of restoring the missing
component. This commit restores:

- FeedbackButtons component with thumbs up/down toggle and optimistic state
- FeedbackData/upsertFeedback/deleteFeedback imports
- feedback and runId props on MessageListItem
- useThreadMessageEnrichment hook and entry lookup in message-list.tsx

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(user-context): add DEFAULT_USER_ID and get_effective_user_id helper

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(paths): add user-aware path methods with optional user_id parameter

Add _validate_user_id(), user_dir(), user_memory_file(), user_agent_memory_file()
and optional keyword-only user_id parameter to all thread-related path methods.
When user_id is provided, paths resolve under users/{user_id}/threads/{thread_id}/;
when omitted, legacy layout is preserved for backward compatibility.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(memory): add user_id to MemoryStorage interface for per-user isolation

Thread user_id through MemoryStorage.load/reload/save abstract methods and
FileMemoryStorage, re-keying the in-memory cache from bare agent_name to a
(user_id, agent_name) tuple to prevent cross-user cache collisions.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(memory): thread user_id through memory updater layer

Add `user_id` keyword-only parameter to all public updater functions
(_save_memory_to_file, get_memory_data, reload_memory_data, import_memory_data,
clear_memory_data, create/delete/update_memory_fact) and regular keyword param
to MemoryUpdater.update_memory + update_memory_from_conversation, propagating
it to every storage load/save/reload call.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(memory): capture user_id at enqueue time for async-safe thread isolation

Add user_id field to ConversationContext and MemoryUpdateQueue.add() so the
user identity is stored explicitly at request time, before threading.Timer
fires on a different thread where ContextVar values do not propagate.
MemoryMiddleware.after_agent() now calls get_effective_user_id() at enqueue
time and passes the value through to updater.update_memory().

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(isolation): wire user_id through all Paths and memory callsites

Pass user_id=get_effective_user_id() at every callsite that invokes
Paths methods or memory functions, enabling per-user filesystem isolation
throughout the harness and app layers.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(migration): add idempotent script for per-user data migration

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: update CLAUDE.md and config docs for per-user isolation

* feat(events): add pagination to list_messages_by_run on all store backends

Replicates the existing before_seq/after_seq/limit cursor-pagination pattern
from list_messages onto list_messages_by_run across the abstract interface,
MemoryRunEventStore, JsonlRunEventStore, and DbRunEventStore.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(api): add GET /api/runs/{run_id}/messages with cursor pagination

New endpoint resolves thread_id from the run record and delegates to
RunEventStore.list_messages_by_run for cursor-based pagination.
Ownership is enforced implicitly via RunStore.get() user filtering.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(api): add GET /api/runs/{run_id}/feedback

Delegates to FeedbackRepository.list_by_run via the existing _resolve_run
helper; includes tests for success, 404, empty list, and 503 (no DB).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(api): retrofit cursor pagination onto GET /threads/{tid}/runs/{rid}/messages

Replace bare list[dict] response with {data: [...], has_more: bool} envelope,
forwarding limit/before_seq/after_seq query params to the event store.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add run-level API endpoints to CLAUDE.md routers table

* refactor(threads): remove event-store message loader and feedback from state/history endpoints

State and history endpoints now return messages purely from the
checkpointer's channel_values. The _get_event_store_messages helper
(which loaded the full event-store transcript with feedback attached)
is removed along with its tests. Frontend will use the dedicated
GET /api/runs/{run_id}/messages and /feedback endpoints instead.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(persistence): add unified persistence layer with event store, token tracking, and feedback (#1930)

* feat(persistence): add SQLAlchemy 2.0 async ORM scaffold

Introduce a unified database configuration (DatabaseConfig) that
controls both the LangGraph checkpointer and the DeerFlow application
persistence layer from a single `database:` config section.

New modules:
- deerflow.config.database_config — Pydantic config with memory/sqlite/postgres backends
- deerflow.persistence — async engine lifecycle, DeclarativeBase with to_dict mixin, Alembic skeleton
- deerflow.runtime.runs.store — RunStore ABC + MemoryRunStore implementation

Gateway integration initializes/tears down the persistence engine in
the existing langgraph_runtime() context manager. Legacy checkpointer
config is preserved for backward compatibility.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(persistence): add RunEventStore ABC + MemoryRunEventStore

Phase 2-A prerequisite for event storage: adds the unified run event
stream interface (RunEventStore) with an in-memory implementation,
RunEventsConfig, gateway integration, and comprehensive tests (27 cases).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(persistence): add ORM models, repositories, DB/JSONL event stores, RunJournal, and API endpoints

Phase 2-B: run persistence + event storage + token tracking.

- ORM models: RunRow (with token fields), ThreadMetaRow, RunEventRow
- RunRepository implements RunStore ABC via SQLAlchemy ORM
- ThreadMetaRepository with owner access control
- DbRunEventStore with trace content truncation and cursor pagination
- JsonlRunEventStore with per-run files and seq recovery from disk
- RunJournal (BaseCallbackHandler) captures LLM/tool/lifecycle events,
  accumulates token usage by caller type, buffers and flushes to store
- RunManager now accepts optional RunStore for persistent backing
- Worker creates RunJournal, writes human_message, injects callbacks
- Gateway deps use factory functions (RunRepository when DB available)
- New endpoints: messages, run messages, run events, token-usage
- ThreadCreateRequest gains assistant_id field
- 92 tests pass (33 new), zero regressions

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(persistence): add user feedback + follow-up run association

Phase 2-C: feedback and follow-up tracking.

- FeedbackRow ORM model (rating +1/-1, optional message_id, comment)
- FeedbackRepository with CRUD, list_by_run/thread, aggregate stats
- Feedback API endpoints: create, list, stats, delete
- follow_up_to_run_id in RunCreateRequest (explicit or auto-detected
  from latest successful run on the thread)
- Worker writes follow_up_to_run_id into human_message event metadata
- Gateway deps: feedback_repo factory + getter
- 17 new tests (14 FeedbackRepository + 3 follow-up association)
- 109 total tests pass, zero regressions

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test+config: comprehensive Phase 2 test coverage + deprecate checkpointer config

- config.example.yaml: deprecate standalone checkpointer section, activate
  unified database:sqlite as default (drives both checkpointer + app data)
- New: test_thread_meta_repo.py (14 tests) — full ThreadMetaRepository coverage
  including check_access owner logic, list_by_owner pagination
- Extended test_run_repository.py (+4 tests) — completion preserves fields,
  list ordering desc, limit, owner_none returns all
- Extended test_run_journal.py (+8 tests) — on_chain_error, track_tokens=false,
  middleware no ai_message, unknown caller tokens, convenience fields,
  tool_error, non-summarization custom event
- Extended test_run_event_store.py (+7 tests) — DB batch seq continuity,
  make_run_event_store factory (memory/db/jsonl/fallback/unknown)
- Extended test_phase2b_integration.py (+4 tests) — create_or_reject persists,
  follow-up metadata, summarization in history, full DB-backed lifecycle
- Fixed DB integration test to use proper fake objects (not MagicMock)
  for JSON-serializable metadata
- 157 total Phase 2 tests pass, zero regressions

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* config: move default sqlite_dir to .deer-flow/data

Keep SQLite databases alongside other DeerFlow-managed data
(threads, memory) under the .deer-flow/ directory instead of a
top-level ./data folder.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor(persistence): remove UTFJSON, use engine-level json_serializer + datetime.now()

- Replace custom UTFJSON type with standard sqlalchemy.JSON in all ORM
  models. Add json_serializer=json.dumps(ensure_ascii=False) to all
  create_async_engine calls so non-ASCII text (Chinese etc.) is stored
  as-is in both SQLite and Postgres.
- Change ORM datetime defaults from datetime.now(UTC) to datetime.now(),
  remove UTC imports.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor(gateway): simplify deps.py with getter factory + inline repos

- Replace 6 identical getter functions with _require() factory.
- Inline 3 _make_*_repo() factories into langgraph_runtime(), call
  get_session_factory() once instead of 3 times.
- Add thread_meta upsert in start_run (services.py).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(docker): add UV_EXTRAS build arg for optional dependencies

Support installing optional dependency groups (e.g. postgres) at
Docker build time via UV_EXTRAS build arg:
  UV_EXTRAS=postgres docker compose build

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor(journal): fix flush, token tracking, and consolidate tests

RunJournal fixes:
- _flush_sync: retain events in buffer when no event loop instead of
  dropping them; worker's finally block flushes via async flush().
- on_llm_end: add tool_calls filter and caller=="lead_agent" guard for
  ai_message events; mark message IDs for dedup with record_llm_usage.
- worker.py: persist completion data (tokens, message count) to RunStore
  in finally block.

Model factory:
- Auto-inject stream_usage=True for BaseChatOpenAI subclasses with
  custom api_base, so usage_metadata is populated in streaming responses.

Test consolidation:
- Delete test_phase2b_integration.py (redundant with existing tests).
- Move DB-backed lifecycle test into test_run_journal.py.
- Add tests for stream_usage injection in test_model_factory.py.
- Clean up executor/task_tool dead journal references.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(events): widen content type to str|dict in all store backends

Allow event content to be a dict (for structured OpenAI-format messages)
in addition to plain strings. Dict values are JSON-serialized for the DB
backend and deserialized on read; memory and JSONL backends handle dicts
natively. Trace truncation now serializes dicts to JSON before measuring.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(events): use metadata flag instead of heuristic for dict content detection

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(converters): add LangChain-to-OpenAI message format converters

Pure functions langchain_to_openai_message, langchain_to_openai_completion,
langchain_messages_to_openai, and _infer_finish_reason for converting
LangChain BaseMessage objects to OpenAI Chat Completions format, used by
RunJournal for event storage. 15 unit tests added.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(converters): handle empty list content as null, clean up test

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(events): human_message content uses OpenAI user message format

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(events): ai_message uses OpenAI format, add ai_tool_call message event

- ai_message content now uses {"role": "assistant", "content": "..."} format
- New ai_tool_call message event emitted when lead_agent LLM responds with tool_calls
- ai_tool_call uses langchain_to_openai_message converter for consistent format
- Both events include finish_reason in metadata ("stop" or "tool_calls")

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(events): add tool_result message event with OpenAI tool message format

Cache tool_call_id from on_tool_start keyed by run_id as fallback for on_tool_end,
then emit a tool_result message event (role=tool, tool_call_id, content) after each
successful tool completion.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(events): summary content uses OpenAI system message format

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(events): replace llm_start/llm_end with llm_request/llm_response in OpenAI format

Add on_chat_model_start to capture structured prompt messages as llm_request events.
Replace llm_end trace events with llm_response using OpenAI Chat Completions format.
Track llm_call_index to pair request/response events.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(events): add record_middleware method for middleware trace events

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test(events): add full run sequence integration test for OpenAI content format

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(events): align message events with checkpoint format and add middleware tag injection

- Message events (ai_message, ai_tool_call, tool_result, human_message) now use
  BaseMessage.model_dump() format, matching LangGraph checkpoint values.messages
- on_tool_end extracts tool_call_id/name/status from ToolMessage objects
- on_tool_error now emits tool_result message events with error status
- record_middleware uses middleware:{tag} event_type and middleware category
- Summarization custom events use middleware:summarize category
- TitleMiddleware injects middleware:title tag via get_config() inheritance
- SummarizationMiddleware model bound with middleware:summarize tag
- Worker writes human_message using HumanMessage.model_dump()

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(threads): switch search endpoint to threads_meta table and sync title

- POST /api/threads/search now queries threads_meta table directly,
  removing the two-phase Store + Checkpointer scan approach
- Add ThreadMetaRepository.search() with metadata/status filters
- Add ThreadMetaRepository.update_display_name() for title sync
- Worker syncs checkpoint title to threads_meta.display_name on run completion
- Map display_name to values.title in search response for API compatibility

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(threads): history endpoint reads messages from event store

- POST /api/threads/{thread_id}/history now combines two data sources:
  checkpointer for checkpoint_id, metadata, title, thread_data;
  event store for messages (complete history, not truncated by summarization)
- Strip internal LangGraph metadata keys from response
- Remove full channel_values serialization in favor of selective fields

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: remove duplicate optional-dependencies header in pyproject.toml

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(middleware): pass tagged config to TitleMiddleware ainvoke call

Without the config, the middleware:title tag was not injected,
causing the LLM response to be recorded as a lead_agent ai_message
in run_events.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: resolve merge conflict in .env.example

Keep both DATABASE_URL (from persistence-scaffold) and WECOM
credentials (from main) after the merge.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(persistence): address review feedback on PR #1851

- Fix naive datetime.now() → datetime.now(UTC) in all ORM models
- Fix seq race condition in DbRunEventStore.put() with FOR UPDATE
  and UNIQUE(thread_id, seq) constraint
- Encapsulate _store access in RunManager.update_run_completion()
- Deduplicate _store.put() logic in RunManager via _persist_to_store()
- Add update_run_completion to RunStore ABC + MemoryRunStore
- Wire follow_up_to_run_id through the full create path
- Add error recovery to RunJournal._flush_sync() lost-event scenario
- Add migration note for search_threads breaking change
- Fix test_checkpointer_none_fix mock to set database=None

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: update uv.lock

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(persistence): address 22 review comments from CodeQL, Copilot, and Code Quality

Bug fixes:
- Sanitize log params to prevent log injection (CodeQL)
- Reset threads_meta.status to idle/error when run completes
- Attach messages only to latest checkpoint in /history response
- Write threads_meta on POST /threads so new threads appear in search

Lint fixes:
- Remove unused imports (journal.py, migrations/env.py, test_converters.py)
- Convert lambda to named function (engine.py, Ruff E731)
- Remove unused logger definitions in repos (Ruff F841)
- Add logging to JSONL decode errors and empty except blocks
- Separate assert side-effects in tests (CodeQL)
- Remove unused local variables in tests (Ruff F841)
- Fix max_trace_content truncation to use byte length, not char length

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* style: apply ruff format to persistence and runtime files

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Potential fix for pull request finding 'Statement has no effect'

Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>

* refactor(runtime): introduce RunContext to reduce run_agent parameter bloat

Extract checkpointer, store, event_store, run_events_config, thread_meta_repo,
and follow_up_to_run_id into a frozen RunContext dataclass. Add get_run_context()
in deps.py to build the base context from app.state singletons. start_run() uses
dataclasses.replace() to enrich per-run fields before passing ctx to run_agent.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor(gateway): move sanitize_log_param to app/gateway/utils.py

Extract the log-injection sanitizer from routers/threads.py into a shared
utils module and rename to sanitize_log_param (public API). Eliminates the
reverse service → router import in services.py.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* perf: use SQL aggregation for feedback stats and thread token usage

Replace Python-side counting in FeedbackRepository.aggregate_by_run with
a single SELECT COUNT/SUM query. Add RunStore.aggregate_tokens_by_thread
abstract method with SQL GROUP BY implementation in RunRepository and
Python fallback in MemoryRunStore. Simplify the thread_token_usage
endpoint to delegate to the new method, eliminating the limit=10000
truncation risk.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: annotate DbRunEventStore.put() as low-frequency path

Add docstring clarifying that put() opens a per-call transaction with
FOR UPDATE and should only be used for infrequent writes (currently
just the initial human_message event). High-throughput callers should
use put_batch() instead.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(threads): fall back to Store search when ThreadMetaRepository is unavailable

When database.backend=memory (default) or no SQL session factory is
configured, search_threads now queries the LangGraph Store instead of
returning 503. Returns empty list if neither Store nor repo is available.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor(persistence): introduce ThreadMetaStore ABC for backend-agnostic thread metadata

Add ThreadMetaStore abstract base class with create/get/search/update/delete
interface. ThreadMetaRepository (SQL) now inherits from it. New
MemoryThreadMetaStore wraps LangGraph BaseStore for memory-mode deployments.

deps.py now always provides a non-None thread_meta_repo, eliminating all
`if thread_meta_repo is not None` guards in services.py, worker.py, and
routers/threads.py. search_threads no longer needs a Store fallback branch.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor(history): read messages from checkpointer instead of RunEventStore

The /history endpoint now reads messages directly from the
checkpointer's channel_values (the authoritative source) instead of
querying RunEventStore.list_messages(). The RunEventStore API is
preserved for other consumers.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(persistence): address new Copilot review comments

- feedback.py: validate thread_id/run_id before deleting feedback
- jsonl.py: add path traversal protection with ID validation
- run_repo.py: parse `before` to datetime for PostgreSQL compat
- thread_meta_repo.py: fix pagination when metadata filter is active
- database_config.py: use resolve_path for sqlite_dir consistency

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Implement skill self-evolution and skill_manage flow (#1874)

* chore: ignore .worktrees directory

* Add skill_manage self-evolution flow

* Fix CI regressions for skill_manage

* Address PR review feedback for skill evolution

* fix(skill-evolution): preserve history on delete

* fix(skill-evolution): tighten scanner fallbacks

* docs: add skill_manage e2e evidence screenshot

* fix(skill-manage): avoid blocking fs ops in session runtime

---------

Co-authored-by: Willem Jiang <willem.jiang@gmail.com>

* fix(config): resolve sqlite_dir relative to CWD, not Paths.base_dir

resolve_path() resolves relative to Paths.base_dir (.deer-flow),
which double-nested the path to .deer-flow/.deer-flow/data/app.db.
Use Path.resolve() (CWD-relative) instead.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Feature/feishu receive file (#1608)

* feat(feishu): add channel file materialization hook for inbound messages

- Introduce Channel.receive_file(msg, thread_id) as a base method for file materialization; default is no-op.
- Implement FeishuChannel.receive_file to download files/images from Feishu messages, save to sandbox, and inject virtual paths into msg.text.
- Update ChannelManager to call receive_file for any channel if msg.files is present, enabling downstream model access to user-uploaded files.
- No impact on Slack/Telegram or other channels (they inherit the default no-op).

* style(backend): format code with ruff for lint compliance

- Auto-formatted packages/harness/deerflow/agents/factory.py and tests/test_create_deerflow_agent.py using `ruff format`
- Ensured both files conform to project linting standards
- Fixes CI lint check failures caused by code style issues

* fix(feishu): handle file write operation asynchronously to prevent blocking

* fix(feishu): rename GetMessageResourceRequest to _GetMessageResourceRequest and remove redundant code

* test(feishu): add tests for receive_file method and placeholder replacement

* fix(manager): remove unnecessary type casting for channel retrieval

* fix(feishu): update logging messages to reflect resource handling instead of image

* fix(feishu): sanitize filename by replacing invalid characters in file uploads

* fix(feishu): improve filename sanitization and reorder image key handling in message processing

* fix(feishu): add thread lock to prevent filename conflicts during file downloads

* fix(test): correct bad merge in test_feishu_parser.py

* chore: run ruff and apply formatting cleanup
fix(feishu): preserve rich-text attachment order and improve fallback filename handling

* fix(docker): restore gateway env vars and fix langgraph empty arg issue (#1915)

Two production docker-compose.yaml bugs prevent `make up` from working:

1. Gateway missing DEER_FLOW_CONFIG_PATH and DEER_FLOW_EXTENSIONS_CONFIG_PATH
   environment overrides. Added in fb2d99f (#1836) but accidentally reverted
   by ca2fb95 (#1847). Without them, gateway reads host paths from .env via
   env_file, causing FileNotFoundError inside the container.

2. Langgraph command fails when LANGGRAPH_ALLOW_BLOCKING is unset (default).
   Empty $${allow_blocking} inserts a bare space between flags, causing
   ' --no-reload' to be parsed as unexpected extra argument. Fix by building
   args string first and conditionally appending --allow-blocking.

Co-authored-by: cooper <cooperfu@tencent.com>

* fix(frontend): resolve invalid HTML nesting and tabnabbing vulnerabilities (#1904)

* fix(frontend): resolve invalid HTML nesting and tabnabbing vulnerabilities

Fix `<button>` inside `<a>` invalid HTML in artifact components and add
missing `noopener,noreferrer` to `window.open` calls to prevent reverse
tabnabbing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(frontend): address Copilot review on tabnabbing and double-tab-open

Remove redundant parent onClick on web_fetch ChainOfThoughtStep to
prevent opening two tabs on link click, and explicitly null out
window.opener after window.open() for defensive tabnabbing hardening.

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* refactor(persistence): organize entities into per-entity directories

Restructure the persistence layer from horizontal "models/ + repositories/"
split into vertical entity-aligned directories. Each entity (thread_meta,
run, feedback) now owns its ORM model, abstract interface (where applicable),
and concrete implementations under a single directory with an aggregating
__init__.py for one-line imports.

Layout:
  persistence/thread_meta/{base,model,sql,memory}.py
  persistence/run/{model,sql}.py
  persistence/feedback/{model,sql}.py

models/__init__.py is kept as a facade so Alembic autogenerate continues to
discover all ORM tables via Base.metadata. RunEventRow remains under
models/run_event.py because its storage implementation lives in
runtime/events/store/db.py and has no matching repository directory.

The repositories/ directory is removed entirely. All call sites in
gateway/deps.py and tests are updated to import from the new entity
packages, e.g.:

    from deerflow.persistence.thread_meta import ThreadMetaRepository
    from deerflow.persistence.run import RunRepository
    from deerflow.persistence.feedback import FeedbackRepository

Full test suite passes (1690 passed, 14 skipped).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(gateway): sync thread rename and delete through ThreadMetaStore

The POST /threads/{id}/state endpoint previously synced title changes
only to the LangGraph Store via _store_upsert. In sqlite mode the search
endpoint reads from the ThreadMetaRepository SQL table, so renames never
appeared in /threads/search until the next agent run completed (worker.py
syncs title from checkpoint to thread_meta in its finally block).

Likewise the DELETE /threads/{id} endpoint cleaned up the filesystem,
Store, and checkpointer but left the threads_meta row orphaned in sqlite,
so deleted threads kept appearing in /threads/search.

Fix both endpoints by routing through the ThreadMetaStore abstraction
which already has the correct sqlite/memory implementations wired up by
deps.py. The rename path now calls update_display_name() and the delete
path calls delete() — both work uniformly across backends.

Verified end-to-end with curl in gateway mode against sqlite backend.
Existing test suite (1690 passed) and focused router/repo tests pass.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor(gateway): route all thread metadata access through ThreadMetaStore

Following the rename/delete bug fix in PR1, migrate the remaining direct
LangGraph Store reads/writes in the threads router and services to the
ThreadMetaStore abstraction so that the sqlite and memory backends behave
identically and the legacy dual-write paths can be removed.

Migrated endpoints (threads.py):
- create_thread: idempotency check + write now use thread_meta_repo.get/create
  instead of dual-writing the LangGraph Store and the SQL row.
- get_thread: reads from thread_meta_repo.get; the checkpoint-only fallback
  for legacy threads is preserved.
- patch_thread: replaced _store_get/_store_put with thread_meta_repo.update_metadata.
- delete_thread_data: dropped the legacy store.adelete; thread_meta_repo.delete
  already covers it.

Removed dead code (services.py):
- _upsert_thread_in_store — redundant with the immediately following
  thread_meta_repo.create() call.
- _sync_thread_title_after_run — worker.py's finally block already syncs
  the title via thread_meta_repo.update_display_name() after each run.

Removed dead code (threads.py):
- _store_get / _store_put / _store_upsert helpers (no remaining callers).
- THREADS_NS constant.
- get_store import (router no longer touches the LangGraph Store directly).

New abstract method:
- ThreadMetaStore.update_metadata(thread_id, metadata) merges metadata into
  the thread's metadata field. Implemented in both ThreadMetaRepository (SQL,
  read-modify-write inside one session) and MemoryThreadMetaStore. Three new
  unit tests cover merge / empty / nonexistent behaviour.

Net change: -134 lines. Full test suite: 1693 passed, 14 skipped.
Verified end-to-end with curl in gateway mode against sqlite backend
(create / patch / get / rename / search / delete).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>
Co-authored-by: DanielWalnut <45447813+hetaoBackend@users.noreply.github.com>
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
Co-authored-by: JilongSun <965640067@qq.com>
Co-authored-by: jie <49781832+stan-fu@users.noreply.github.com>
Co-authored-by: cooper <cooperfu@tencent.com>
Co-authored-by: yangzheli <43645580+yangzheli@users.noreply.github.com>

* feat(auth): release-validation pass for 2.0-rc — 12 blockers + simplify follow-ups (#2008)

* feat(auth): introduce backend auth module

Port RFC-001 authentication core from PR #1728:
- JWT token handling (create_access_token, decode_token, TokenPayload)
- Password hashing (bcrypt) with verify_password
- SQLite UserRepository with base interface
- Provider Factory pattern (LocalAuthProvider)
- CLI reset_admin tool
- Auth-specific errors (AuthErrorCode, TokenError, AuthErrorResponse)

Deps:
- bcrypt>=4.0.0
- pyjwt>=2.9.0
- email-validator>=2.0.0
- backend/uv.toml pins public PyPI index

Tests: 12 pure unit tests (test_auth_config.py, test_auth_errors.py).

Scope note: authz.py, test_auth.py, and test_auth_type_system.py are
deferred to commit 2 because they depend on middleware and deps wiring
that is not yet in place. Commit 1 stays "pure new files only" as the
spec mandates.

* feat(auth): wire auth end-to-end (middleware + frontend replacement)

Backend:
- Port auth_middleware, csrf_middleware, langgraph_auth, routers/auth
- Port authz decorator (owner_filter_key defaults to 'owner_id')
- Merge app.py: register AuthMiddleware + CSRFMiddleware + CORS, add
  _ensure_admin_user lifespan hook, _migrate_orphaned_threads helper,
  register auth router
- Merge deps.py: add get_local_provider, get_current_user_from_request,
  get_optional_user_from_request; keep get_current_user as thin str|None
  adapter for feedback router
- langgraph.json: add auth path pointing to langgraph_auth.py:auth
- Rename metadata['user_id'] -> metadata['owner_id'] in langgraph_auth
  (both metadata write and LangGraph filter dict) + test fixtures

Frontend:
- Delete better-auth library and api catch-all route
- Remove better-auth npm dependency and env vars (BETTER_AUTH_SECRET,
  BETTER_AUTH_GITHUB_*) from env.js
- Port frontend/src/core/auth/* (AuthProvider, gateway-config,
  proxy-policy, server-side getServerSideUser, types)
- Port frontend/src/core/api/fetcher.ts
- Port (auth)/layout, (auth)/login, (auth)/setup pages
- Rewrite workspace/layout.tsx as server component that calls
  getServerSideUser and wraps in AuthProvider
- Port workspace/workspace-content.tsx for the client-side sidebar logic

Tests:
- Port 5 auth test files (test_auth, test_auth_middleware,
  test_auth_type_system, test_ensure_admin, test_langgraph_auth)
- 176 auth tests PASS

After this commit: login/logout/registration flow works, but persistence
layer does not yet filter by owner_id. Commit 4 closes that gap.

* feat(auth): account settings page + i18n

- Port account-settings-page.tsx (change password, change email, logout)
- Wire into settings-dialog.tsx as new "account" section with UserIcon,
  rendered first in the section list
- Add i18n keys:
  - en-US/zh-CN: settings.sections.account ("Account" / "账号")
  - en-US/zh-CN: button.logout ("Log out" / "退出登录")
  - types.ts: matching type declarations

* feat(auth): enforce owner_id across 2.0-rc persistence layer

Add request-scoped contextvar-based owner filtering to threads_meta,
runs, run_events, and feedback repositories. Router code is unchanged
— isolation is enforced at the storage layer so that any caller that
forgets to pass owner_id still gets filtered results, and new routes
cannot accidentally leak data.

Core infrastructure
-------------------
- deerflow/runtime/user_context.py (new):
  - ContextVar[CurrentUser | None] with default None
  - runtime_checkable CurrentUser Protocol (structural subtype with .id)
  - set/reset/get/require helpers
  - AUTO sentinel + resolve_owner_id(value, method_name) for sentinel
    three-state resolution: AUTO reads contextvar, explicit str
    overrides, explicit None bypasses the filter (for migration/CLI)

Repository changes
------------------
- ThreadMetaRepository: create/get/search/update_*/delete gain
  owner_id=AUTO kwarg; read paths filter by owner, writes stamp it,
  mutations check ownership before applying
- RunRepository: put/get/list_by_thread/delete gain owner_id=AUTO kwarg
- FeedbackRepository: create/get/list_by_run/list_by_thread/delete
  gain owner_id=AUTO kwarg
- DbRunEventStore: list_messages/list_events/list_messages_by_run/
  count_messages/delete_by_thread/delete_by_run gain owner_id=AUTO
  kwarg. Write paths (put/put_batch) read contextvar softly: when a
  request-scoped user is available, owner_id is stamped; background
  worker writes without a user context pass None which is valid
  (orphan row to be bound by migration)

Schema
------
- persistence/models/run_event.py: RunEventRow.owner_id = Mapped[
  str | None] = mapped_column(String(64), nullable=True, index=True)
- No alembic migration needed: 2.0 ships fresh, Base.metadata.create_all
  picks up the new column automatically

Middleware
----------
- auth_middleware.py: after cookie check, call get_optional_user_from_
  request to load the real User, stamp it into request.state.user AND
  the contextvar via set_current_user, reset in a try/finally. Public
  paths and unauthenticated requests continue without contextvar, and
  @require_auth handles the strict 401 path

Test infrastructure
-------------------
- tests/conftest.py: @pytest.fixture(autouse=True) _auto_user_context
  sets a default SimpleNamespace(id="test-user-autouse") on every test
  unless marked @pytest.mark.no_auto_user. Keeps existing 20+
  persistence tests passing without modification
- pyproject.toml [tool.pytest.ini_options]: register no_auto_user
  marker so pytest does not emit warnings for opt-out tests
- tests/test_user_context.py: 6 tests covering three-state semantics,
  Protocol duck typing, and require/optional APIs
- tests/test_thread_meta_repo.py: one test updated to pass owner_id=
  None explicitly where it was previously relying on the old default

Test results
------------
- test_user_context.py: 6 passed
- test_auth*.py + test_langgraph_auth.py + test_ensure_admin.py: 127
- test_run_event_store / test_run_repository / test_thread_meta_repo
  / test_feedback: 92 passed
- Full backend suite: 1905 passed, 2 failed (both @requires_llm flaky
  integration tests unrelated to auth), 1 skipped

* feat(auth): extend orphan migration to 2.0-rc persistence tables

_ensure_admin_user now runs a three-step pipeline on every boot:

  Step 1 (fatal):     admin user exists / is created / password is reset
  Step 2 (non-fatal): LangGraph store orphan threads → admin
  Step 3 (non-fatal): SQL persistence tables → admin
    - threads_meta
    - runs
    - run_events
    - feedback

Each step is idempotent. The fatal/non-fatal split mirrors PR #1728's
original philosophy: admin creation failure blocks startup (the system
is unusable without an admin), whereas migration failures log a warning
and let the service proceed (a partial migration is recoverable; a
missing admin is not).

Key helpers
-----------
- _iter_store_items(store, namespace, *, page_size=500):
  async generator that cursor-paginates across LangGraph store pages.
  Fixes PR #1728's hardcoded limit=1000 bug that would silently lose
  orphans beyond the first page.

- _migrate_orphaned_threads(store, admin_user_id):
  Rewritten to use _iter_store_items. Returns the migrated count so the
  caller can log it; raises only on unhandled exceptions.

- _migrate_orphan_sql_tables(admin_user_id):
  Imports the 4 ORM models lazily, grabs the shared session factory,
  runs one UPDATE per table in a single transaction, commits once.
  No-op when no persistence backend is configured (in-memory dev).

Tests: test_ensure_admin.py (8 passed)

* test(auth): port AUTH test plan docs + lint/format pass

- Port backend/docs/AUTH_TEST_PLAN.md and AUTH_UPGRADE.md from PR #1728
- Rename metadata.user_id → metadata.owner_id in AUTH_TEST_PLAN.md
  (4 occurrences from the original PR doc)
- ruff auto-fix UP037 in sentinel type annotations: drop quotes around
  "str | None | _AutoSentinel" now that from __future__ import
  annotations makes them implicit string forms
- ruff format: 2 files (app/gateway/app.py, runtime/user_context.py)

Note on test coverage additions:
- conftest.py autouse fixture was already added in commit 4 (had to
  be co-located with the repository changes to keep pre-existing
  persistence tests passing)
- cross-user isolation E2E tests (test_owner_isolation.py) deferred
  — enforcement is already proven by the 98-test repository suite
  via the autouse fixture + explicit _AUTO sentinel exercises
- New test cases (TC-API-17..20, TC-ATK-13, TC-MIG-01..07) listed
  in AUTH_TEST_PLAN.md are deferred to a follow-up PR — they are
  manual-QA test cases rather than pytest code, and the spec-level
  coverage is already met by test_user_context.py + the 98-test
  repository suite.

Final test results:
- Auth suite (test_auth*, test_langgraph_auth, test_ensure_admin,
  test_user_context): 186 passed
- Persistence suite (test_run_event_store, test_run_repository,
  test_thread_meta_repo, test_feedback): 98 passed
- Lint: ruff check + ruff format both clean

* test(auth): add cross-user isolation test suite

10 tests exercising the storage-layer owner filter by manually
switching the user_context contextvar between two users. Verifies
the safety invariant:

  After a repository write with owner_id=A, a subsequent read with
  owner_id=B must not return the row, and vice versa.

Covers all 4 tables that own user-scoped data:

TC-API-17  threads_meta  — read, search, update, delete cross-user
TC-API-18  runs          — get, list_by_thread, delete cross-user
TC-API-19  run_events    — list_messages, list_events, count_messages,
                           delete_by_thread (CRITICAL: raw conversation
                           content leak vector)
TC-API-20  feedback      — get, list_by_run, delete cross-user

Plus two meta-tests verifying the sentinel pattern itself:
- AUTO + unset contextvar raises RuntimeError
- explicit owner_id=None bypasses the filter (migration escape hatch)

Architecture note
-----------------
These tests bypass the HTTP layer by design. The full chain
(cookie → middleware → contextvar → repository) is covered piecewise:

- test_auth_middleware.py: middleware sets contextvar from cookies
- test_owner_isolation.py: repositories enforce isolation when
  contextvar is set to different users

Together they prove the end-to-end safety property without the
ceremony of spinning up a full TestClient + in-memory DB for every
router endpoint.

Tests pass: 231 (full auth + persistence + isolation suite)
Lint: clean

* refactor(auth): migrate user repository to SQLAlchemy ORM

Move the users table into the shared persistence engine so auth
matches the pattern of threads_meta, runs, run_events, and feedback —
one engine, one session factory, one schema init codepath.

New files
---------
- persistence/user/__init__.py, persistence/user/model.py: UserRow
  ORM class with partial unique index on (oauth_provider, oauth_id)
- Registered in persistence/models/__init__.py so
  Base.metadata.create_all() picks it up

Modified
--------
- auth/repositories/sqlite.py: rewritten as async SQLAlchemy,
  identical constructor pattern to the other four repositories
  (def __init__(self, session_factory) + self._sf = session_factory)
- auth/config.py: drop users_db_path field — storage is configured
  through config.database like every other table
- deps.py/get_local_provider: construct SQLiteUserRepository with
  the shared session factory, fail fast if engine is not initialised
- tests/test_auth.py: rewrite test_sqlite_round_trip_new_fields to
  use the shared engine (init_engine + close_engine in a tempdir)
- tests/test_auth_type_system.py: add per-test autouse fixture that
  spins up a scratch engine and resets deps._cached_* singletons

* refactor(auth): remove SQL orphan migration (unused in supported scenarios)

The _migrate_orphan_sql_tables helper existed to bind NULL owner_id
rows in threads_meta, runs, run_events, and feedback to the admin on
first boot. But in every supported upgrade path, it's a no-op:

  1. Fresh install: create_all builds fresh tables, no legacy rows
  2. No-auth → with-auth (no existing persistence DB): persistence
     tables are created fresh by create_all, no legacy rows
  3. No-auth → with-auth (has existing persistence DB from #1930):
     NOT a supported upgrade path — "有 DB 到有 DB" schema evolution
     is out of scope; users wipe DB or run manual ALTER

So the SQL orphan migration never has anything to do in the
supported matrix. Delete the function, simplify _ensure_admin_user
from a 3-step pipeline to a 2-step one (admin creation + LangGraph
store orphan migration only).

LangGraph store orphan migration stays: it serves the real
"no-auth → with-auth" upgrade path where a user's existing LangGraph
thread metadata has no owner_id field and needs to be stamped with
the newly-created admin's id.

Tests: 284 passed (auth + persistence + isolation)
Lint: clean

* security(auth): write initial admin password to 0600 file instead of logs

CodeQL py/clear-text-logging-sensitive-data flagged 3 call sites that
logged the auto-generated admin password to stdout via logger.info().
Production log aggregators (ELK/Splunk/etc) would have captured those
cleartext secrets. Replace with a shared helper that writes to
.deer-flow/admin_initial_credentials.txt with mode 0600, and log only
the path.

New file
--------
- app/gateway/auth/credential_file.py: write_initial_credentials()
  helper. Takes email, password, and a "initial"/"reset" label.
  Creates .deer-flow/ if missing, writes a header comment plus the
  email+password, chmods 0o600, returns the absolute Path.

Modified
--------
- app/gateway/app.py: both _ensure_admin_user paths (fresh creation
  + needs_setup password reset) now write to file and log the path
- app/gateway/auth/reset_admin.py: rewritten to use the shared ORM
  repo (SQLiteUserRepository with session_factory) and the
  credential_file helper. The previous implementation was broken
  after the earlier ORM refactor — it still imported _get_users_conn
  and constructed SQLiteUserRepository() without a session factory.

No tests changed — the three password-log sites are all exercised
via existing test_ensure_admin.py which checks that startup
succeeds, not that a specific string appears in logs.

CodeQL alerts 272, 283, 284: all resolved.

* security(auth): strict JWT validation in middleware (fix junk cookie bypass)

AUTH_TEST_PLAN test 7.5.8 expects junk cookies to be rejected with
401. The previous middleware behaviour was "presence-only": check
that some access_token cookie exists, then pass through. In
combination with my Task-12 decision to skip @require_auth
decorators on routes, this created a gap where a request with any
cookie-shaped string (e.g. access_token=not-a-jwt) would bypass
authentication on routes that do not touch the repository
(/api/models, /api/mcp/config, /api/memory, /api/skills, …).

Fix: middleware now calls get_current_user_from_request() strictly
and catches the resulting HTTPException to render a 401 with the
proper fine-grained error code (token_invalid, token_expired,
user_not_found, …). On success it stamps request.state.user and
the contextvar so repository-layer owner filters work downstream.

The 4 old "_with_cookie_passes" tests in test_auth_middleware.py
were written for the presence-only behaviour; they asserted that
a junk cookie would make the handler return 200. They are renamed
to "_with_junk_cookie_rejected" and their assertions flipped to
401. The negative path (no cookie → 401 not_authenticated)
is unchanged.

Verified:
  no cookie       → 401 not_authenticated
  junk cookie     → 401 token_invalid     (the fixed bug)
  expired cookie  → 401 token_expired

Tests: 284 passed (auth + persistence + isolation)
Lint: clean

* security(auth): wire @require_permission(owner_check=True) on isolation routes

Apply the require_permission decorator to all 28 routes that take a
{thread_id} path parameter. Combined with the strict middleware
(previous commit), this gives the double-layer protection that
AUTH_TEST_PLAN test 7.5.9 documents:

  Layer 1 (AuthMiddleware): cookie + JWT validation, rejects junk
                            cookies and stamps request.state.user
  Layer 2 (@require_permission with owner_check=True): per-resource
                            ownership verification via
                            ThreadMetaStore.check_access — returns
                            404 if a different user owns the thread

The decorator's owner_check branch is rewritten to use the SQL
thread_meta_repo (the 2.0-rc persistence layer) instead of the
LangGraph store path that PR #1728 used (_store_get / get_store
in routers/threads.py). The inject_record convenience is dropped
— no caller in 2.0 needs the LangGraph blob, and the SQL repo has
a different shape.

Routes decorated (28 total):
- threads.py: delete, patch, get, get-state, post-state, post-history
- thread_runs.py: post-runs, post-runs-stream, post-runs-wait,
  list_runs, get_run, cancel_run, join_run, stream_existing_run,
  list_thread_messages, list_run_messages, list_run_events,
  thread_token_usage
- feedback.py: create, list, stats, delete
- uploads.py: upload (added Request param), list, delete
- artifacts.py: get_artifact
- suggestions.py: generate (renamed body parameter to avoid
  conflict with FastAPI Request)

Test fixes:
- test_suggestions_router.py: bypass the decorator via __wrapped__
  (the unit tests cover parsing logic, not auth — no point spinning
  up a thread_meta_repo just to test JSON unwrapping)
- test_auth_middleware.py 4 fake-cookie tests: already updated in
  the previous commit (745bf432)

Tests: 293 passed (auth + persistence + isolation + suggestions)
Lint: clean

* security(auth): defense-in-depth fixes from release validation pass

Eight findings caught while running the AUTH_TEST_PLAN end-to-end against
the deployed sg_dev stack. Each is a pre-condition for shipping
release/2.0-rc that the previous PRs missed.

Backend hardening
- routers/auth.py: rate limiter X-Real-IP now requires AUTH_TRUSTED_PROXIES
  whitelist (CIDR/IP allowlist). Without nginx in front, the previous code
  honored arbitrary X-Real-IP, letting an attacker rotate the header to
  fully bypass the per-IP login lockout.
- routers/auth.py: 36-entry common-password blocklist via Pydantic
  field_validator on RegisterRequest + ChangePasswordRequest. The shared
  _validate_strong_password helper keeps the constraint in one place.
- routers/threads.py: ThreadCreateRequest + ThreadPatchRequest strip
  server-reserved metadata keys (owner_id, user_id) via Pydantic
  field_validator so a forged value can never round-trip back to other
  clients reading the same thread. The actual ownership invariant stays
  on the threads_meta row; this closes the metadata-blob echo gap.
- authz.py + thread_meta/sql.py: require_permission gains a require_existing
  flag plumbed through check_access(require_existing=True). Destructive
  routes (DELETE/PATCH/state-update/runs/feedback) now treat a missing
  thread_meta row as 404 instead of "untracked legacy thread, allow",
  closing the cross-user delete-idempotence gap where any user could
  successfully DELETE another user's deleted thread.
- repositories/sqlite.py + base.py: update_user raises UserNotFoundError
  on a vanished row instead of silently returning the input. Concurrent
  delete during password reset can no longer look like a successful update.
- runtime/user_context.py: resolve_owner_id() coerces User.id (UUID) to
  str at the contextvar boundary so SQLAlchemy String(64) columns can
  bind it. The whole 2.0-rc isolation pipeline was previously broken
  end-to-end (POST /api/threads → 500 "type 'UUID' is not supported").
- persistence/engine.py: SQLAlchemy listener enables PRAGMA journal_mode=WAL,
  synchronous=NORMAL, foreign_keys=ON on every new SQLite connection.
  TC-UPG-06 in the test plan expects WAL; previous code shipped with the
  default 'delete' journal.
- auth_middleware.py: stamp request.state.auth = AuthContext(...) so
  @require_permission's short-circuit fires; previously every isolation
  request did a duplicate JWT decode + users SELECT. Also unifies the
  401 payload through AuthErrorResponse(...).model_dump().
- app.py: _ensure_admin_user restructure removes the noqa F821 scoping
  bug where 'password' was referenced outside the branch that defined it.
  New _announce_credentials helper absorbs the duplicate log block in
  the fresh-admin and reset-admin branches.

* fix(frontend+nginx): rollout CSRF on every state-changing client path

The frontend was 100% broken in gateway-pro mode for any user trying to
open a specific chat thread. Three cumulative bugs each silently
masked the next.

LangGraph SDK CSRF gap (api-client.ts)
- The Client constructor took only apiUrl, no defaultHeaders, no fetch
  interceptor. The SDK's internal fetch never sent X-CSRF-Token, so
  every state-changing /api/langgraph-compat/* call (runs/stream,
  threads/search, threads/{tid}/history, ...) hit CSRFMiddleware and
  got 403 before reaching the auth check. UI symptom: empty thread page
  with no error message; the SPA's hooks swallowed the rejection.
- Fix: pass an onRequest hook that injects X-CSRF-Token from the
  csrf_token cookie per request. Reading the cookie per call (not at
  construction time) handles login / logout / password-change cookie
  rotation transparently. The SDK's prepareFetchOptions calls
  onRequest for both regular requests AND streaming/SSE/reconnect, so
  the same hook covers runs.stream and runs.joinStream.

Raw fetch CSRF gap (7 files)
- Audit: 11 frontend fetch sites, only 2 included CSRF (login/setup +
  account-settings change-password). The other 7 routed through raw
  fetch() with no header — suggestions, memory, agents, mcp, skills,
  uploads, and the local thread cleanup hook all 403'd silently.
- Fix: enhance fetcher.ts:fetchWithAuth to auto-inject X-CSRF-Token on
  POST/PUT/DELETE/PATCH from a single shared readCsrfCookie() helper.
  Convert all 7 raw fetch() callers to fetchWithAuth so the contract
  is centrally enforced. api-client.ts and fetcher.ts share
  readCsrfCookie + STATE_CHANGING_METHODS to avoid drift.

nginx routing + buffering (nginx.local.conf)
- The auth feature shipped without updating the nginx config: per-API
  explicit location blocks but no /api/v1/auth/, /api/feedback, /api/runs.
  The frontend's client-side fetches to /api/v1/auth/login/local 404'd
  from the Next.js side because nginx routed /api/* to the frontend.
- Fix: add catch-all `location /api/` that proxies to the gateway.
  nginx longest-prefix matching keeps the explicit blocks (/api/models,
  /api/threads regex, /api/langgraph/, ...) winning for their paths.
- Fix: disable proxy_buffering + proxy_request_buffering for the
  frontend `location /` block. Without it, nginx tries to spool large
  Next.js chunks into /var/lib/nginx/proxy (root-owned) and fails with
  Permission denied → ERR_INCOMPLETE_CHUNKED_ENCODING → ChunkLoadError.

* test(auth): release-validation test infra and new coverage

Test fixtures and unit tests added during the validation pass.

Router test helpers (NEW: tests/_router_auth_helpers.py)
- make_authed_test_app(): builds a FastAPI test app with a stub
  middleware that stamps request.state.user + request.state.auth and a
  permissive thread_meta_repo mock. TestClient-based router tests
  (test_artifacts_router, test_threads_router) use it instead of bare
  FastAPI() so the new @require_permission(owner_check=True) decorators
  short-circuit cleanly.
- call_unwrapped(): walks the __wrapped__ chain to invoke the underlying
  handler without going through the authz wrappers. Direct-call tests
  (test_uploads_router) use it. Typed with ParamSpec so the wrapped
  signature flows through.

Backend test additions
- test_auth.py: 7 tests for the new _get_client_ip trust model (no
  proxy / trusted proxy / untrusted peer / XFF rejection / invalid
  CIDR / no client). 5 tests for the password blocklist (literal,
  case-insensitive, strong password accepted, change-password binding,
  short-password length-check still fires before blocklist).
  test_update_user_raises_when_row_concurrently_deleted: closes a
  shipped-without-coverage gap on the new UserNotFoundError contract.
- test_thread_meta_repo.py: 4 tests for check_access(require_existing=True)
  — strict missing-row denial, strict owner match, strict owner mismatch,
  strict null-owner still allowed (shared rows survive the tightening).
- test_ensure_admin.py: 3 tests for _migrate_orphaned_threads /
  _iter_store_items pagination, covering the TC-UPG-02 upgrade story
  end-to-end via mock store. Closes the gap where the cursor pagination
  was untested even though the previous PR rewrote it.
- test_threads_router.py: 5 tests for _strip_reserved_metadata
  (owner_id removal, user_id removal, safe-keys passthrough, empty
  input, both-stripped).
- test_auth_type_system.py: replace "password123" fixtures with
  Tr0ub4dor3a / AnotherStr0ngPwd! so the new password blocklist
  doesn't reject the test data.

* docs(auth): refresh TC-DOCKER-05 + document Docker validation gap

- AUTH_TEST_PLAN.md TC-DOCKER-05: the previous expectation
  ("admin password visible in docker logs") was stale after the simplify
  pass that moved credentials to a 0600 file. The grep "Password:" check
  would have silently failed and given a false sense of coverage. New
  expectation matches the actual file-based path: 0600 file in
  DEER_FLOW_HOME, log shows the path (not the secret), reverse-grep
  asserts no leaked password in container logs.
- NEW: docs/AUTH_TEST_DOCKER_GAP.md documents the only un-executed
  block in the test plan (TC-DOCKER-01..06). Reason: sg_dev validation
  host has no Docker daemon installed. The doc maps each Docker case
  to an already-validated bare-metal equivalent (TC-1.1, TC-REENT-01,
  TC-API-02 etc.) so the gap is auditable, and includes pre-flight
  reproduction steps for whoever has Docker available.

---------

Co-authored-by: greatmengqi <chenmengqi.0376@bytedance.com>

* fix(persistence): stream hang when run_events.backend=db

DbRunEventStore._user_id_from_context() returned user.id without
coercing it to str. User.id is a Pydantic UUID, and aiosqlite cannot
bind a raw UUID object to a VARCHAR column, so the INSERT for the
initial human_message event silently rolled back and raised out of
the worker task. Because that put() sat outside the worker's try
block, the finally-clause that publishes end-of-stream never ran
and the SSE stream hung forever.

jsonl mode was unaffected because json.dumps(default=str) coerces
UUID objects transparently.

Fixes:
- db.py: coerce user.id to str at the context-read boundary (matches
  what resolve_user_id already does for the other repositories)
- worker.py: move RunJournal init + human_message put inside the try
  block so any failure flows through the finally/publish_end path
  instead of hanging the subscriber

Defense-in-depth:
- engine.py: add PRAGMA busy_timeout=5000 so checkpointer and event
  store wait for each other on the shared deerflow.db file instead
  of failing immediately under write-lock contention
- journal.py: skip fire-and-forget _flush_sync when a previous flush
  task is still in flight, to avoid piling up concurrent put_batch
  writes on the same SQLAlchemy engine during streaming; flush() now
  waits for pending tasks before draining the buffer
- database_config.py: doc-only update clarifying WAL + busy_timeout
  keep the unified deerflow.db safe for both workloads

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore(persistence): drop redundant busy_timeout PRAGMA

Python's sqlite3 driver defaults to a 5-second busy timeout via the
``timeout`` kwarg of ``sqlite3.connect``, and aiosqlite + SQLAlchemy's
aiosqlite dialect inherit that default. Setting ``PRAGMA busy_timeout=5000``
explicitly was a no-op — verified by reading back the PRAGMA on a fresh
connection (it already reports 5000ms without our PRAGMA).

Concurrent stress test (50 checkpoint writes + 20 event batches + 50
thread_meta updates on the same deerflow.db) still completes with zero
errors and 200/200 rows after removing the explicit PRAGMA.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(rebase): remove duplicate definitions and update stale module paths

Rebase left duplicate function blocks in worker.py (triple human_message
write causing 3x user messages in /history), deps.py, and prompt.py.
Also update checkpointer imports from the old deerflow.agents.checkpointer
path to deerflow.runtime.checkpointer, and clean up orphaned feedback
props in the frontend message components.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(user-context): add DEFAULT_USER_ID and get_effective_user_id helper

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(paths): add user-aware path methods with optional user_id parameter

Add _validate_user_id(), user_dir(), user_memory_file(), user_agent_memory_file()
and optional keyword-only user_id parameter to all thread-related path methods.
When user_id is provided, paths resolve under users/{user_id}/threads/{thread_id}/;
when omitted, legacy layout is preserved for backward compatibility.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(memory): add user_id to MemoryStorage interface for per-user isolation

Thread user_id through MemoryStorage.load/reload/save abstract methods and
FileMemoryStorage, re-keying the in-memory cache from bare agent_name to a
(user_id, agent_name) tuple to prevent cross-user cache collisions.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(memory): thread user_id through memory updater layer

Add `user_id` keyword-only parameter to all public updater functions
(_save_memory_to_file, get_memory_data, reload_memory_data, import_memory_data,
clear_memory_data, create/delete/update_memory_fact) and regular keyword param
to MemoryUpdater.update_memory + update_memory_from_conversation, propagating
it to every storage load/save/reload call.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(memory): capture user_id at enqueue time for async-safe thread isolation

Add user_id field to ConversationContext and MemoryUpdateQueue.add() so the
user identity is stored explicitly at request time, before threading.Timer
fires on a different thread where ContextVar values do not propagate.
MemoryMiddleware.after_agent() now calls get_effective_user_id() at enqueue
time and passes the value through to updater.update_memory().

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(isolation): wire user_id through all Paths and memory callsites

Pass user_id=get_effective_user_id() at every callsite that invokes
Paths methods or memory functions, enabling per-user filesystem isolation
throughout the harness and app layers.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(migration): add idempotent script for per-user data migration

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: update CLAUDE.md and config docs for per-user isolation

* feat(events): add pagination to list_messages_by_run on all store backends

Replicates the existing before_seq/after_seq/limit cursor-pagination pattern
from list_messages onto list_messages_by_run across the abstract interface,
MemoryRunEventStore, JsonlRunEventStore, and DbRunEventStore.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(api): add GET /api/runs/{run_id}/messages with cursor pagination

New endpoint resolves thread_id from the run record and delegates to
RunEventStore.list_messages_by_run for cursor-based pagination.
Ownership is enforced implicitly via RunStore.get() user filtering.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(api): add GET /api/runs/{run_id}/feedback

Delegates to FeedbackRepository.list_by_run via the existing _resolve_run
helper; includes tests for success, 404, empty list, and 503 (no DB).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(api): retrofit cursor pagination onto GET /threads/{tid}/runs/{rid}/messages

Replace bare list[dict] response with {data: [...], has_more: bool} envelope,
forwarding limit/before_seq/after_seq query params to the event store.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add run-level API endpoints to CLAUDE.md routers table

* refactor(threads): remove event-store message loader and feedback from state/history endpoints

State and history endpoints now return messages purely from the
checkpointer's channel_values. The _get_event_store_messages helper
(which loaded the full event-store transcript with feedback attached)
is removed along with its tests. Frontend will use the dedicated
GET /api/runs/{run_id}/messages and /feedback endpoints instead.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>
Co-authored-by: DanielWalnut <45447813+hetaoBackend@users.noreply.github.com>
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
Co-authored-by: JilongSun <965640067@qq.com>
Co-authored-by: jie <49781832+stan-fu@users.noreply.github.com>
Co-authored-by: cooper <cooperfu@tencent.com>
Co-authored-by: yangzheli <43645580+yangzheli@users.noreply.github.com>
Co-authored-by: greatmengqi <chenmengqi.0376@gmail.com>
Co-authored-by: greatmengqi <chenmengqi.0376@bytedance.com>
This commit is contained in:
rayhpeng 2026-04-12 20:04:32 +08:00 committed by GitHub
parent 0572ef44b9
commit 7b9d224b3a
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194
52 changed files with 1641 additions and 791 deletions

View File

@ -158,7 +158,7 @@ from deerflow.config import get_app_config
Middlewares execute in strict order in `packages/harness/deerflow/agents/lead_agent/agent.py`:
1. **ThreadDataMiddleware** - Creates per-thread directories (`backend/.deer-flow/threads/{thread_id}/user-data/{workspace,uploads,outputs}`); Web UI thread deletion now follows LangGraph thread removal with Gateway cleanup of the local `.deer-flow/threads/{thread_id}` directory
1. **ThreadDataMiddleware** - Creates per-thread directories under the user's isolation scope (`backend/.deer-flow/users/{user_id}/threads/{thread_id}/user-data/{workspace,uploads,outputs}`); resolves `user_id` via `get_effective_user_id()` (falls back to `"default"` in no-auth mode); Web UI thread deletion now follows LangGraph thread removal with Gateway cleanup of the local thread directory
2. **UploadsMiddleware** - Tracks and injects newly uploaded files into conversation
3. **SandboxMiddleware** - Acquires sandbox, stores `sandbox_id` in state
4. **DanglingToolCallMiddleware** - Injects placeholder ToolMessages for AIMessage tool_calls that lack responses (e.g., due to user interruption)
@ -216,6 +216,9 @@ FastAPI application on port 8001 with health check at `GET /health`.
| **Threads** (`/api/threads/{id}`) | `DELETE /` - remove DeerFlow-managed local thread data after LangGraph thread deletion; unexpected failures are logged server-side and return a generic 500 detail |
| **Artifacts** (`/api/threads/{id}/artifacts`) | `GET /{path}` - serve artifacts; active content types (`text/html`, `application/xhtml+xml`, `image/svg+xml`) are always forced as download attachments to reduce XSS risk; `?download=true` still forces download for other file types |
| **Suggestions** (`/api/threads/{id}/suggestions`) | `POST /` - generate follow-up questions; rich list/block model content is normalized before JSON parsing |
| **Thread Runs** (`/api/threads/{id}/runs`) | `POST /` - create background run; `POST /stream` - create + SSE stream; `POST /wait` - create + block; `GET /` - list runs; `GET /{rid}` - run details; `POST /{rid}/cancel` - cancel; `GET /{rid}/join` - join SSE; `GET /{rid}/messages` - paginated messages `{data, has_more}`; `GET /{rid}/events` - full event stream; `GET /../messages` - thread messages with feedback; `GET /../token-usage` - aggregate tokens |
| **Feedback** (`/api/threads/{id}/runs/{rid}/feedback`) | `PUT /` - upsert feedback; `DELETE /` - delete user feedback; `POST /` - create feedback; `GET /` - list feedback; `GET /stats` - aggregate stats; `DELETE /{fid}` - delete specific |
| **Runs** (`/api/runs`) | `POST /stream` - stateless run + SSE; `POST /wait` - stateless run + block; `GET /{rid}/messages` - paginated messages by run_id `{data, has_more}` (cursor: `after_seq`/`before_seq`); `GET /{rid}/feedback` - list feedback by run_id |
Proxied through nginx: `/api/langgraph/*` → LangGraph, all other `/api/*` → Gateway.
@ -229,7 +232,7 @@ Proxied through nginx: `/api/langgraph/*` → LangGraph, all other `/api/*` →
**Virtual Path System**:
- Agent sees: `/mnt/user-data/{workspace,uploads,outputs}`, `/mnt/skills`
- Physical: `backend/.deer-flow/threads/{thread_id}/user-data/...`, `deer-flow/skills/`
- Physical: `backend/.deer-flow/users/{user_id}/threads/{thread_id}/user-data/...`, `deer-flow/skills/`
- Translation: `replace_virtual_path()` / `replace_virtual_paths_in_command()`
- Detection: `is_local_sandbox()` checks `sandbox_id == "local"`
@ -269,7 +272,7 @@ Proxied through nginx: `/api/langgraph/*` → LangGraph, all other `/api/*` →
- `invoke_acp_agent` - Invokes external ACP-compatible agents from `config.yaml`
- ACP launchers must be real ACP adapters. The standard `codex` CLI is not ACP-compatible by itself; configure a wrapper such as `npx -y @zed-industries/codex-acp` or an installed `codex-acp` binary
- Missing ACP executables now return an actionable error message instead of a raw `[Errno 2]`
- Each ACP agent uses a per-thread workspace at `{base_dir}/threads/{thread_id}/acp-workspace/`. The workspace is accessible to the lead agent via the virtual path `/mnt/acp-workspace/` (read-only). In docker sandbox mode, the directory is volume-mounted into the container at `/mnt/acp-workspace` (read-only); in local sandbox mode, path translation is handled by `tools.py`
- Each ACP agent uses a per-thread workspace at `{base_dir}/users/{user_id}/threads/{thread_id}/acp-workspace/`. The workspace is accessible to the lead agent via the virtual path `/mnt/acp-workspace/` (read-only). In docker sandbox mode, the directory is volume-mounted into the container at `/mnt/acp-workspace` (read-only); in local sandbox mode, path translation is handled by `tools.py`
- `image_search/` - Image search via DuckDuckGo
### MCP System (`packages/harness/deerflow/mcp/`)
@ -338,18 +341,27 @@ Bridges external messaging platforms (Feishu, Slack, Telegram) to the DeerFlow a
**Components**:
- `updater.py` - LLM-based memory updates with fact extraction, whitespace-normalized fact deduplication (trims leading/trailing whitespace before comparing), and atomic file I/O
- `queue.py` - Debounced update queue (per-thread deduplication, configurable wait time)
- `queue.py` - Debounced update queue (per-thread deduplication, configurable wait time); captures `user_id` at enqueue time so it survives the `threading.Timer` boundary
- `prompt.py` - Prompt templates for memory updates
- `storage.py` - File-based storage with per-user isolation; cache keyed by `(user_id, agent_name)` tuple
**Data Structure** (stored in `backend/.deer-flow/memory.json`):
**Per-User Isolation**:
- Memory is stored per-user at `{base_dir}/users/{user_id}/memory.json`
- Per-agent per-user memory at `{base_dir}/users/{user_id}/agents/{agent_name}/memory.json`
- `user_id` is resolved via `get_effective_user_id()` from `deerflow.runtime.user_context`
- In no-auth mode, `user_id` defaults to `"default"` (constant `DEFAULT_USER_ID`)
- Absolute `storage_path` in config opts out of per-user isolation
- **Migration**: Run `PYTHONPATH=. python scripts/migrate_user_isolation.py` to move legacy `memory.json` and `threads/` into per-user layout; supports `--dry-run`
**Data Structure** (stored in `{base_dir}/users/{user_id}/memory.json`):
- **User Context**: `workContext`, `personalContext`, `topOfMind` (1-3 sentence summaries)
- **History**: `recentMonths`, `earlierContext`, `longTermBackground`
- **Facts**: Discrete facts with `id`, `content`, `category` (preference/knowledge/context/behavior/goal), `confidence` (0-1), `createdAt`, `source`
**Workflow**:
1. `MemoryMiddleware` filters messages (user inputs + final AI responses) and queues conversation
1. `MemoryMiddleware` filters messages (user inputs + final AI responses), captures `user_id` via `get_effective_user_id()`, and queues conversation with the captured `user_id`
2. Queue debounces (30s default), batches updates, deduplicates per-thread
3. Background thread invokes LLM to extract context updates and facts
3. Background thread invokes LLM to extract context updates and facts, using the stored `user_id` (not the contextvar, which is unavailable on timer threads)
4. Applies updates atomically (temp file + rename) with cache invalidation, skipping duplicate fact content before append
5. Next interaction injects top 15 facts + context into `<memory>` tags in system prompt
@ -357,7 +369,7 @@ Focused regression coverage for the updater lives in `backend/tests/test_memory_
**Configuration** (`config.yaml``memory`):
- `enabled` / `injection_enabled` - Master switches
- `storage_path` - Path to memory.json
- `storage_path` - Path to memory.json (absolute path opts out of per-user isolation)
- `debounce_seconds` - Wait time before processing (default: 30)
- `model_name` - LLM for updates (null = default model)
- `max_facts` / `fact_confidence_threshold` - Fact storage limits (100 / 0.7)

View File

@ -13,6 +13,7 @@ from app.channels.base import Channel
from app.channels.commands import KNOWN_CHANNEL_COMMANDS
from app.channels.message_bus import InboundMessage, InboundMessageType, MessageBus, OutboundMessage, ResolvedAttachment
from deerflow.config.paths import VIRTUAL_PATH_PREFIX, get_paths
from deerflow.runtime.user_context import get_effective_user_id
from deerflow.sandbox.sandbox_provider import get_sandbox_provider
logger = logging.getLogger(__name__)
@ -344,8 +345,9 @@ class FeishuChannel(Channel):
return f"Failed to obtain the [{type}]"
paths = get_paths()
paths.ensure_thread_dirs(thread_id)
uploads_dir = paths.sandbox_uploads_dir(thread_id).resolve()
user_id = get_effective_user_id()
paths.ensure_thread_dirs(thread_id, user_id=user_id)
uploads_dir = paths.sandbox_uploads_dir(thread_id, user_id=user_id).resolve()
ext = "png" if type == "image" else "bin"
raw_filename = getattr(response, "file_name", "") or f"feishu_{file_key[-12:]}.{ext}"

View File

@ -17,6 +17,7 @@ from langgraph_sdk.errors import ConflictError
from app.channels.commands import KNOWN_CHANNEL_COMMANDS
from app.channels.message_bus import InboundMessage, InboundMessageType, MessageBus, OutboundMessage, ResolvedAttachment
from app.channels.store import ChannelStore
from deerflow.runtime.user_context import get_effective_user_id
logger = logging.getLogger(__name__)
@ -341,14 +342,15 @@ def _resolve_attachments(thread_id: str, artifacts: list[str]) -> list[ResolvedA
attachments: list[ResolvedAttachment] = []
paths = get_paths()
outputs_dir = paths.sandbox_outputs_dir(thread_id).resolve()
user_id = get_effective_user_id()
outputs_dir = paths.sandbox_outputs_dir(thread_id, user_id=user_id).resolve()
for virtual_path in artifacts:
# Security: only allow files from the agent outputs directory
if not virtual_path.startswith(_OUTPUTS_VIRTUAL_PREFIX):
logger.warning("[Manager] rejected non-outputs artifact path: %s", virtual_path)
continue
try:
actual = paths.resolve_virtual_path(thread_id, virtual_path)
actual = paths.resolve_virtual_path(thread_id, virtual_path, user_id=user_id)
# Verify the resolved path is actually under the outputs directory
# (guards against path-traversal even after prefix check)
try:

View File

@ -5,6 +5,7 @@ from pathlib import Path
from fastapi import HTTPException
from deerflow.config.paths import get_paths
from deerflow.runtime.user_context import get_effective_user_id
def resolve_thread_virtual_path(thread_id: str, virtual_path: str) -> Path:
@ -22,7 +23,7 @@ def resolve_thread_virtual_path(thread_id: str, virtual_path: str) -> Path:
HTTPException: If the path is invalid or outside allowed directories.
"""
try:
return get_paths().resolve_virtual_path(thread_id, virtual_path)
return get_paths().resolve_virtual_path(thread_id, virtual_path, user_id=get_effective_user_id())
except ValueError as e:
status = 403 if "traversal" in str(e) else 400
raise HTTPException(status_code=status, detail=str(e))

View File

@ -13,6 +13,7 @@ from deerflow.agents.memory.updater import (
update_memory_fact,
)
from deerflow.config.memory_config import get_memory_config
from deerflow.runtime.user_context import get_effective_user_id
router = APIRouter(prefix="/api", tags=["memory"])
@ -147,7 +148,7 @@ async def get_memory() -> MemoryResponse:
}
```
"""
memory_data = get_memory_data()
memory_data = get_memory_data(user_id=get_effective_user_id())
return MemoryResponse(**memory_data)
@ -167,7 +168,7 @@ async def reload_memory() -> MemoryResponse:
Returns:
The reloaded memory data.
"""
memory_data = reload_memory_data()
memory_data = reload_memory_data(user_id=get_effective_user_id())
return MemoryResponse(**memory_data)
@ -181,7 +182,7 @@ async def reload_memory() -> MemoryResponse:
async def clear_memory() -> MemoryResponse:
"""Clear all persisted memory data."""
try:
memory_data = clear_memory_data()
memory_data = clear_memory_data(user_id=get_effective_user_id())
except OSError as exc:
raise HTTPException(status_code=500, detail="Failed to clear memory data.") from exc
@ -202,6 +203,7 @@ async def create_memory_fact_endpoint(request: FactCreateRequest) -> MemoryRespo
content=request.content,
category=request.category,
confidence=request.confidence,
user_id=get_effective_user_id(),
)
except ValueError as exc:
raise _map_memory_fact_value_error(exc) from exc
@ -221,7 +223,7 @@ async def create_memory_fact_endpoint(request: FactCreateRequest) -> MemoryRespo
async def delete_memory_fact_endpoint(fact_id: str) -> MemoryResponse:
"""Delete a single fact from memory by fact id."""
try:
memory_data = delete_memory_fact(fact_id)
memory_data = delete_memory_fact(fact_id, user_id=get_effective_user_id())
except KeyError as exc:
raise HTTPException(status_code=404, detail=f"Memory fact '{fact_id}' not found.") from exc
except OSError as exc:
@ -245,6 +247,7 @@ async def update_memory_fact_endpoint(fact_id: str, request: FactPatchRequest) -
content=request.content,
category=request.category,
confidence=request.confidence,
user_id=get_effective_user_id(),
)
except ValueError as exc:
raise _map_memory_fact_value_error(exc) from exc
@ -265,7 +268,7 @@ async def update_memory_fact_endpoint(fact_id: str, request: FactPatchRequest) -
)
async def export_memory() -> MemoryResponse:
"""Export the current memory data."""
memory_data = get_memory_data()
memory_data = get_memory_data(user_id=get_effective_user_id())
return MemoryResponse(**memory_data)
@ -279,7 +282,7 @@ async def export_memory() -> MemoryResponse:
async def import_memory(request: MemoryResponse) -> MemoryResponse:
"""Import and persist memory data."""
try:
memory_data = import_memory_data(request.model_dump())
memory_data = import_memory_data(request.model_dump(), user_id=get_effective_user_id())
except OSError as exc:
raise HTTPException(status_code=500, detail="Failed to import memory data.") from exc
@ -337,7 +340,7 @@ async def get_memory_status() -> MemoryStatusResponse:
Combined memory configuration and current data.
"""
config = get_memory_config()
memory_data = get_memory_data()
memory_data = get_memory_data(user_id=get_effective_user_id())
return MemoryStatusResponse(
config=MemoryConfigResponse(

View File

@ -11,10 +11,11 @@ import asyncio
import logging
import uuid
from fastapi import APIRouter, Request
from fastapi import APIRouter, HTTPException, Query, Request
from fastapi.responses import StreamingResponse
from app.gateway.deps import get_checkpointer, get_run_manager, get_stream_bridge
from app.gateway.authz import require_permission
from app.gateway.deps import get_checkpointer, get_feedback_repo, get_run_event_store, get_run_manager, get_run_store, get_stream_bridge
from app.gateway.routers.thread_runs import RunCreateRequest
from app.gateway.services import sse_consumer, start_run
from deerflow.runtime import serialize_channel_values
@ -85,3 +86,57 @@ async def stateless_wait(body: RunCreateRequest, request: Request) -> dict:
logger.exception("Failed to fetch final state for run %s", record.run_id)
return {"status": record.status.value, "error": record.error}
# ---------------------------------------------------------------------------
# Run-scoped read endpoints
# ---------------------------------------------------------------------------
async def _resolve_run(run_id: str, request: Request) -> dict:
"""Fetch run by run_id with user ownership check. Raises 404 if not found."""
run_store = get_run_store(request)
record = await run_store.get(run_id) # user_id=AUTO filters by contextvar
if record is None:
raise HTTPException(status_code=404, detail=f"Run {run_id} not found")
return record
@router.get("/{run_id}/messages")
@require_permission("runs", "read")
async def run_messages(
run_id: str,
request: Request,
limit: int = Query(default=50, le=200, ge=1),
before_seq: int | None = Query(default=None),
after_seq: int | None = Query(default=None),
) -> dict:
"""Return paginated messages for a run (cursor-based).
Pagination:
- after_seq: messages with seq > after_seq (forward)
- before_seq: messages with seq < before_seq (backward)
- neither: latest messages
Response: { data: [...], has_more: bool }
"""
run = await _resolve_run(run_id, request)
event_store = get_run_event_store(request)
rows = await event_store.list_messages_by_run(
run["thread_id"], run_id,
limit=limit + 1,
before_seq=before_seq,
after_seq=after_seq,
)
has_more = len(rows) > limit
data = rows[:limit] if has_more else rows
return {"data": data, "has_more": has_more}
@router.get("/{run_id}/feedback")
@require_permission("runs", "read")
async def run_feedback(run_id: str, request: Request) -> list[dict]:
"""Return all feedback for a run."""
run = await _resolve_run(run_id, request)
feedback_repo = get_feedback_repo(request)
return await feedback_repo.list_by_run(run["thread_id"], run_id)

View File

@ -325,10 +325,28 @@ async def list_thread_messages(
@router.get("/{thread_id}/runs/{run_id}/messages")
@require_permission("runs", "read", owner_check=True)
async def list_run_messages(thread_id: str, run_id: str, request: Request) -> list[dict]:
"""Return displayable messages for a specific run."""
async def list_run_messages(
thread_id: str,
run_id: str,
request: Request,
limit: int = Query(default=50, le=200, ge=1),
before_seq: int | None = Query(default=None),
after_seq: int | None = Query(default=None),
) -> dict:
"""Return paginated messages for a specific run.
Response: { data: [...], has_more: bool }
"""
event_store = get_run_event_store(request)
return await event_store.list_messages_by_run(thread_id, run_id)
rows = await event_store.list_messages_by_run(
thread_id, run_id,
limit=limit + 1,
before_seq=before_seq,
after_seq=after_seq,
)
has_more = len(rows) > limit
data = rows[:limit] if has_more else rows
return {"data": data, "has_more": has_more}
@router.get("/{thread_id}/runs/{run_id}/events")

View File

@ -22,10 +22,11 @@ from fastapi import APIRouter, HTTPException, Request
from pydantic import BaseModel, Field, field_validator
from app.gateway.authz import require_permission
from app.gateway.deps import get_checkpointer, get_current_user, get_feedback_repo, get_run_event_store
from app.gateway.deps import get_checkpointer
from app.gateway.utils import sanitize_log_param
from deerflow.config.paths import Paths, get_paths
from deerflow.runtime import serialize_channel_values
from deerflow.runtime.user_context import get_effective_user_id
logger = logging.getLogger(__name__)
router = APIRouter(prefix="/api/threads", tags=["threads"])
@ -143,11 +144,11 @@ class ThreadHistoryRequest(BaseModel):
# ---------------------------------------------------------------------------
def _delete_thread_data(thread_id: str, paths: Paths | None = None) -> ThreadDeleteResponse:
def _delete_thread_data(thread_id: str, paths: Paths | None = None, *, user_id: str | None = None) -> ThreadDeleteResponse:
"""Delete local persisted filesystem data for a thread."""
path_manager = paths or get_paths()
try:
path_manager.delete_thread_dir(thread_id)
path_manager.delete_thread_dir(thread_id, user_id=user_id)
except ValueError as exc:
raise HTTPException(status_code=422, detail=str(exc)) from exc
except FileNotFoundError:
@ -198,7 +199,7 @@ async def delete_thread_data(thread_id: str, request: Request) -> ThreadDeleteRe
from app.gateway.deps import get_thread_store
# Clean local filesystem
response = _delete_thread_data(thread_id)
response = _delete_thread_data(thread_id, user_id=get_effective_user_id())
# Remove checkpoints (best-effort)
checkpointer = getattr(request.app.state, "checkpointer", None)
@ -404,164 +405,6 @@ async def get_thread(thread_id: str, request: Request) -> ThreadResponse:
# ---------------------------------------------------------------------------
# Event-store-backed message loader
# ---------------------------------------------------------------------------
_LEGACY_CMD_INNER_CONTENT_RE = re.compile(
r"ToolMessage\(content=(?P<q>['\"])(?P<inner>.*?)(?P=q)",
re.DOTALL,
)
def _sanitize_legacy_command_repr(content_field: Any) -> Any:
"""Recover the inner ToolMessage text from a legacy ``str(Command(...))`` repr.
Runs captured before the ``on_tool_end`` fix in ``journal.py`` stored
``str(Command(update={'messages':[ToolMessage(content='X', ...)]}))`` as the
tool_result content. New runs store ``'X'`` directly. For legacy rows, try
to extract ``'X'`` defensively; return the original string if extraction
fails (still no worse than the checkpoint fallback for summarized threads).
"""
if not isinstance(content_field, str) or not content_field.startswith("Command(update="):
return content_field
match = _LEGACY_CMD_INNER_CONTENT_RE.search(content_field)
return match.group("inner") if match else content_field
async def _get_event_store_messages(request: Request, thread_id: str) -> list[dict] | None:
"""Load the full message stream for ``thread_id`` from the event store.
The event store is append-only and unaffected by summarization the
checkpoint's ``channel_values["messages"]`` is rewritten in-place when the
SummarizationMiddleware runs, which drops all pre-summarize messages. The
event store retains the full transcript, so callers in Gateway mode should
prefer it for rendering the conversation history.
In addition to the core message content, this helper attaches two extra
fields to every returned dict:
- ``run_id``: the ``run_id`` of the event that produced this message.
Always present.
- ``feedback``: thumbs-up/down data. Present only on the **final
``ai_message`` of each run** (matching the per-run feedback semantics
of ``POST /api/threads/{id}/runs/{run_id}/feedback``). The frontend uses
the presence of this field to decide whether to render the feedback
button, which sidesteps the positional-index mapping bug that an
out-of-band ``/messages`` fetch exhibited.
Behaviour contract:
- **Full pagination.** ``RunEventStore.list_messages`` returns the newest
``limit`` records when no cursor is given, so a fixed limit silently
drops older messages on long threads. We size the read from
``count_messages()`` and then page forward with ``after_seq`` cursors.
- **Copy-on-read.** Each content dict is copied before ``id`` is patched
so the live store object is never mutated; ``MemoryRunEventStore``
returns live references.
- **Stable ids.** Messages with ``id=None`` (human + tool_result) receive
a deterministic ``uuid5(NAMESPACE_URL, f"{thread_id}:{seq}")`` so React
keys are stable across requests without altering stored data. AI messages
retain their LLM-assigned ``lc_run--*`` ids.
- **Legacy Command repr.** Rows captured before the ``journal.py``
``on_tool_end`` fix stored ``str(Command(update={...}))`` as the tool
result content. ``_sanitize_legacy_command_repr`` extracts the inner
ToolMessage text.
- **User context.** ``DbRunEventStore`` is user-scoped by default via
``resolve_user_id(AUTO)`` in ``runtime/user_context.py``. This helper
must run inside a request where ``@require_permission`` has populated
the user contextvar. Both callers below are decorated appropriately.
Do not call this helper from CLI or migration scripts without passing
``user_id=None`` explicitly to the underlying store methods.
Returns ``None`` when the event store is not configured or has no message
events for this thread, so callers fall back to checkpoint messages.
"""
try:
event_store = get_run_event_store(request)
except Exception:
return None
try:
total = await event_store.count_messages(thread_id)
except Exception:
logger.exception("count_messages failed for thread %s", sanitize_log_param(thread_id))
return None
if not total:
return None
# Batch by page_size to keep memory bounded for very long threads.
page_size = 500
collected: list[dict] = []
after_seq: int | None = None
while True:
try:
page = await event_store.list_messages(thread_id, limit=page_size, after_seq=after_seq)
except Exception:
logger.exception("list_messages failed for thread %s", sanitize_log_param(thread_id))
return None
if not page:
break
collected.extend(page)
if len(page) < page_size:
break
next_cursor = page[-1].get("seq")
if next_cursor is None or (after_seq is not None and next_cursor <= after_seq):
break
after_seq = next_cursor
# Build the message list; track the final ``ai_message`` index per run so
# feedback can be attached at the right position (matches thread_runs.py).
messages: list[dict] = []
last_ai_per_run: dict[str, int] = {}
for evt in collected:
raw = evt.get("content")
if not isinstance(raw, dict) or "type" not in raw:
continue
content = dict(raw)
if content.get("id") is None:
content["id"] = str(uuid.uuid5(uuid.NAMESPACE_URL, f"{thread_id}:{evt['seq']}"))
if content.get("type") == "tool":
content["content"] = _sanitize_legacy_command_repr(content.get("content"))
run_id = evt.get("run_id")
if run_id:
content["run_id"] = run_id
if evt.get("event_type") == "ai_message" and run_id:
last_ai_per_run[run_id] = len(messages)
messages.append(content)
if not messages:
return None
# Attach feedback to the final ai_message of each run. If the feedback
# subsystem is unavailable, leave the ``feedback`` field absent entirely
# so the frontend hides the button rather than showing it over a broken
# write path.
feedback_available = False
feedback_map: dict[str, dict] = {}
try:
feedback_repo = get_feedback_repo(request)
user_id = await get_current_user(request)
feedback_map = await feedback_repo.list_by_thread_grouped(thread_id, user_id=user_id)
feedback_available = True
except Exception:
logger.exception("feedback lookup failed for thread %s", sanitize_log_param(thread_id))
if feedback_available:
for run_id, idx in last_ai_per_run.items():
fb = feedback_map.get(run_id)
messages[idx]["feedback"] = (
{
"feedback_id": fb["feedback_id"],
"rating": fb["rating"],
"comment": fb.get("comment"),
}
if fb
else None
)
return messages
@router.get("/{thread_id}/state", response_model=ThreadStateResponse)
@require_permission("threads", "read", owner_check=True)
async def get_thread_state(thread_id: str, request: Request) -> ThreadStateResponse:
@ -602,11 +445,6 @@ async def get_thread_state(thread_id: str, request: Request) -> ThreadStateRespo
values = serialize_channel_values(channel_values)
# Prefer event-store messages: append-only, immune to summarization.
es_messages = await _get_event_store_messages(request, thread_id)
if es_messages is not None:
values["messages"] = es_messages
return ThreadStateResponse(
values=values,
next=next_tasks,
@ -726,11 +564,6 @@ async def get_thread_history(thread_id: str, body: ThreadHistoryRequest, request
if body.before:
config["configurable"]["checkpoint_id"] = body.before
# Load the full event-store message stream once; attach to the latest
# checkpoint entry only (matching the prior semantics). The event store
# is append-only and immune to summarization.
es_messages = await _get_event_store_messages(request, thread_id)
entries: list[HistoryEntry] = []
is_latest_checkpoint = True
try:
@ -754,17 +587,11 @@ async def get_thread_history(thread_id: str, body: ThreadHistoryRequest, request
if thread_data := channel_values.get("thread_data"):
values["thread_data"] = thread_data
# Attach messages only to the latest checkpoint. Prefer the
# event-store stream (complete and unaffected by summarization);
# fall back to checkpoint channel_values when the event store is
# unavailable or empty.
# Attach messages only to the latest checkpoint entry.
if is_latest_checkpoint:
if es_messages is not None:
values["messages"] = es_messages
else:
messages = channel_values.get("messages")
if messages:
values["messages"] = serialize_channel_values({"messages": messages}).get("messages", [])
messages = channel_values.get("messages")
if messages:
values["messages"] = serialize_channel_values({"messages": messages}).get("messages", [])
is_latest_checkpoint = False
# Derive next tasks

View File

@ -9,6 +9,7 @@ from pydantic import BaseModel
from app.gateway.authz import require_permission
from deerflow.config.paths import get_paths
from deerflow.runtime.user_context import get_effective_user_id
from deerflow.sandbox.sandbox_provider import get_sandbox_provider
from deerflow.uploads.manager import (
PathTraversalError,
@ -69,7 +70,7 @@ async def upload_files(
uploads_dir = ensure_uploads_dir(thread_id)
except ValueError as e:
raise HTTPException(status_code=400, detail=str(e))
sandbox_uploads = get_paths().sandbox_uploads_dir(thread_id)
sandbox_uploads = get_paths().sandbox_uploads_dir(thread_id, user_id=get_effective_user_id())
uploaded_files = []
sandbox_provider = get_sandbox_provider()
@ -147,7 +148,7 @@ async def list_uploaded_files(thread_id: str, request: Request) -> dict:
enrich_file_listing(result, thread_id)
# Gateway additionally includes the sandbox-relative path.
sandbox_uploads = get_paths().sandbox_uploads_dir(thread_id)
sandbox_uploads = get_paths().sandbox_uploads_dir(thread_id, user_id=get_effective_user_id())
for f in result["files"]:
f["path"] = str(sandbox_uploads / f["filename"])

View File

@ -519,12 +519,13 @@ def _get_memory_context(agent_name: str | None = None) -> str:
try:
from deerflow.agents.memory import format_memory_for_injection, get_memory_data
from deerflow.config.memory_config import get_memory_config
from deerflow.runtime.user_context import get_effective_user_id
config = get_memory_config()
if not config.enabled or not config.injection_enabled:
return ""
memory_data = get_memory_data(agent_name)
memory_data = get_memory_data(agent_name, user_id=get_effective_user_id())
memory_content = format_memory_for_injection(memory_data, max_tokens=config.max_injection_tokens)
if not memory_content.strip():

View File

@ -20,6 +20,7 @@ class ConversationContext:
messages: list[Any]
timestamp: datetime = field(default_factory=lambda: datetime.now(UTC))
agent_name: str | None = None
user_id: str | None = None
correction_detected: bool = False
reinforcement_detected: bool = False
@ -44,6 +45,7 @@ class MemoryUpdateQueue:
thread_id: str,
messages: list[Any],
agent_name: str | None = None,
user_id: str | None = None,
correction_detected: bool = False,
reinforcement_detected: bool = False,
) -> None:
@ -53,6 +55,9 @@ class MemoryUpdateQueue:
thread_id: The thread ID.
messages: The conversation messages.
agent_name: If provided, memory is stored per-agent. If None, uses global memory.
user_id: The user ID captured at enqueue time. Stored in ConversationContext so it
survives the threading.Timer boundary (ContextVar does not propagate across
raw threads).
correction_detected: Whether recent turns include an explicit correction signal.
reinforcement_detected: Whether recent turns include a positive reinforcement signal.
"""
@ -71,6 +76,7 @@ class MemoryUpdateQueue:
thread_id=thread_id,
messages=messages,
agent_name=agent_name,
user_id=user_id,
correction_detected=merged_correction_detected,
reinforcement_detected=merged_reinforcement_detected,
)
@ -136,6 +142,7 @@ class MemoryUpdateQueue:
agent_name=context.agent_name,
correction_detected=context.correction_detected,
reinforcement_detected=context.reinforcement_detected,
user_id=context.user_id,
)
if success:
logger.info("Memory updated successfully for thread %s", context.thread_id)

View File

@ -43,17 +43,17 @@ class MemoryStorage(abc.ABC):
"""Abstract base class for memory storage providers."""
@abc.abstractmethod
def load(self, agent_name: str | None = None) -> dict[str, Any]:
def load(self, agent_name: str | None = None, *, user_id: str | None = None) -> dict[str, Any]:
"""Load memory data for the given agent."""
pass
@abc.abstractmethod
def reload(self, agent_name: str | None = None) -> dict[str, Any]:
def reload(self, agent_name: str | None = None, *, user_id: str | None = None) -> dict[str, Any]:
"""Force reload memory data for the given agent."""
pass
@abc.abstractmethod
def save(self, memory_data: dict[str, Any], agent_name: str | None = None) -> bool:
def save(self, memory_data: dict[str, Any], agent_name: str | None = None, *, user_id: str | None = None) -> bool:
"""Save memory data for the given agent."""
pass
@ -63,9 +63,9 @@ class FileMemoryStorage(MemoryStorage):
def __init__(self):
"""Initialize the file memory storage."""
# Per-agent memory cache: keyed by agent_name (None = global)
# Per-user/agent memory cache: keyed by (user_id, agent_name) tuple (None = global)
# Value: (memory_data, file_mtime)
self._memory_cache: dict[str | None, tuple[dict[str, Any], float | None]] = {}
self._memory_cache: dict[tuple[str | None, str | None], tuple[dict[str, Any], float | None]] = {}
def _validate_agent_name(self, agent_name: str) -> None:
"""Validate that the agent name is safe to use in filesystem paths.
@ -78,21 +78,29 @@ class FileMemoryStorage(MemoryStorage):
if not AGENT_NAME_PATTERN.match(agent_name):
raise ValueError(f"Invalid agent name {agent_name!r}: names must match {AGENT_NAME_PATTERN.pattern}")
def _get_memory_file_path(self, agent_name: str | None = None) -> Path:
def _get_memory_file_path(self, agent_name: str | None = None, *, user_id: str | None = None) -> Path:
"""Get the path to the memory file."""
if user_id is not None:
if agent_name is not None:
self._validate_agent_name(agent_name)
return get_paths().user_agent_memory_file(user_id, agent_name)
config = get_memory_config()
if config.storage_path and Path(config.storage_path).is_absolute():
return Path(config.storage_path)
return get_paths().user_memory_file(user_id)
# Legacy: no user_id
if agent_name is not None:
self._validate_agent_name(agent_name)
return get_paths().agent_memory_file(agent_name)
config = get_memory_config()
if config.storage_path:
p = Path(config.storage_path)
return p if p.is_absolute() else get_paths().base_dir / p
return get_paths().memory_file
def _load_memory_from_file(self, agent_name: str | None = None) -> dict[str, Any]:
def _load_memory_from_file(self, agent_name: str | None = None, *, user_id: str | None = None) -> dict[str, Any]:
"""Load memory data from file."""
file_path = self._get_memory_file_path(agent_name)
file_path = self._get_memory_file_path(agent_name, user_id=user_id)
if not file_path.exists():
return create_empty_memory()
@ -105,40 +113,42 @@ class FileMemoryStorage(MemoryStorage):
logger.warning("Failed to load memory file: %s", e)
return create_empty_memory()
def load(self, agent_name: str | None = None) -> dict[str, Any]:
def load(self, agent_name: str | None = None, *, user_id: str | None = None) -> dict[str, Any]:
"""Load memory data (cached with file modification time check)."""
file_path = self._get_memory_file_path(agent_name)
file_path = self._get_memory_file_path(agent_name, user_id=user_id)
try:
current_mtime = file_path.stat().st_mtime if file_path.exists() else None
except OSError:
current_mtime = None
cached = self._memory_cache.get(agent_name)
cache_key = (user_id, agent_name)
cached = self._memory_cache.get(cache_key)
if cached is None or cached[1] != current_mtime:
memory_data = self._load_memory_from_file(agent_name)
self._memory_cache[agent_name] = (memory_data, current_mtime)
memory_data = self._load_memory_from_file(agent_name, user_id=user_id)
self._memory_cache[cache_key] = (memory_data, current_mtime)
return memory_data
return cached[0]
def reload(self, agent_name: str | None = None) -> dict[str, Any]:
def reload(self, agent_name: str | None = None, *, user_id: str | None = None) -> dict[str, Any]:
"""Reload memory data from file, forcing cache invalidation."""
file_path = self._get_memory_file_path(agent_name)
memory_data = self._load_memory_from_file(agent_name)
file_path = self._get_memory_file_path(agent_name, user_id=user_id)
memory_data = self._load_memory_from_file(agent_name, user_id=user_id)
try:
mtime = file_path.stat().st_mtime if file_path.exists() else None
except OSError:
mtime = None
self._memory_cache[agent_name] = (memory_data, mtime)
cache_key = (user_id, agent_name)
self._memory_cache[cache_key] = (memory_data, mtime)
return memory_data
def save(self, memory_data: dict[str, Any], agent_name: str | None = None) -> bool:
def save(self, memory_data: dict[str, Any], agent_name: str | None = None, *, user_id: str | None = None) -> bool:
"""Save memory data to file and update cache."""
file_path = self._get_memory_file_path(agent_name)
file_path = self._get_memory_file_path(agent_name, user_id=user_id)
try:
file_path.parent.mkdir(parents=True, exist_ok=True)
@ -155,7 +165,8 @@ class FileMemoryStorage(MemoryStorage):
except OSError:
mtime = None
self._memory_cache[agent_name] = (memory_data, mtime)
cache_key = (user_id, agent_name)
self._memory_cache[cache_key] = (memory_data, mtime)
logger.info("Memory saved to %s", file_path)
return True
except OSError as e:

View File

@ -27,27 +27,28 @@ def _create_empty_memory() -> dict[str, Any]:
return create_empty_memory()
def _save_memory_to_file(memory_data: dict[str, Any], agent_name: str | None = None) -> bool:
def _save_memory_to_file(memory_data: dict[str, Any], agent_name: str | None = None, *, user_id: str | None = None) -> bool:
"""Backward-compatible wrapper around the configured memory storage save path."""
return get_memory_storage().save(memory_data, agent_name)
return get_memory_storage().save(memory_data, agent_name, user_id=user_id)
def get_memory_data(agent_name: str | None = None) -> dict[str, Any]:
def get_memory_data(agent_name: str | None = None, *, user_id: str | None = None) -> dict[str, Any]:
"""Get the current memory data via storage provider."""
return get_memory_storage().load(agent_name)
return get_memory_storage().load(agent_name, user_id=user_id)
def reload_memory_data(agent_name: str | None = None) -> dict[str, Any]:
def reload_memory_data(agent_name: str | None = None, *, user_id: str | None = None) -> dict[str, Any]:
"""Reload memory data via storage provider."""
return get_memory_storage().reload(agent_name)
return get_memory_storage().reload(agent_name, user_id=user_id)
def import_memory_data(memory_data: dict[str, Any], agent_name: str | None = None) -> dict[str, Any]:
def import_memory_data(memory_data: dict[str, Any], agent_name: str | None = None, *, user_id: str | None = None) -> dict[str, Any]:
"""Persist imported memory data via storage provider.
Args:
memory_data: Full memory payload to persist.
agent_name: If provided, imports into per-agent memory.
user_id: If provided, scopes memory to a specific user.
Returns:
The saved memory data after storage normalization.
@ -56,15 +57,15 @@ def import_memory_data(memory_data: dict[str, Any], agent_name: str | None = Non
OSError: If persisting the imported memory fails.
"""
storage = get_memory_storage()
if not storage.save(memory_data, agent_name):
if not storage.save(memory_data, agent_name, user_id=user_id):
raise OSError("Failed to save imported memory data")
return storage.load(agent_name)
return storage.load(agent_name, user_id=user_id)
def clear_memory_data(agent_name: str | None = None) -> dict[str, Any]:
def clear_memory_data(agent_name: str | None = None, *, user_id: str | None = None) -> dict[str, Any]:
"""Clear all stored memory data and persist an empty structure."""
cleared_memory = create_empty_memory()
if not _save_memory_to_file(cleared_memory, agent_name):
if not _save_memory_to_file(cleared_memory, agent_name, user_id=user_id):
raise OSError("Failed to save cleared memory data")
return cleared_memory
@ -81,6 +82,8 @@ def create_memory_fact(
category: str = "context",
confidence: float = 0.5,
agent_name: str | None = None,
*,
user_id: str | None = None,
) -> dict[str, Any]:
"""Create a new fact and persist the updated memory data."""
normalized_content = content.strip()
@ -90,7 +93,7 @@ def create_memory_fact(
normalized_category = category.strip() or "context"
validated_confidence = _validate_confidence(confidence)
now = utc_now_iso_z()
memory_data = get_memory_data(agent_name)
memory_data = get_memory_data(agent_name, user_id=user_id)
updated_memory = dict(memory_data)
facts = list(memory_data.get("facts", []))
facts.append(
@ -105,15 +108,15 @@ def create_memory_fact(
)
updated_memory["facts"] = facts
if not _save_memory_to_file(updated_memory, agent_name):
if not _save_memory_to_file(updated_memory, agent_name, user_id=user_id):
raise OSError("Failed to save memory data after creating fact")
return updated_memory
def delete_memory_fact(fact_id: str, agent_name: str | None = None) -> dict[str, Any]:
def delete_memory_fact(fact_id: str, agent_name: str | None = None, *, user_id: str | None = None) -> dict[str, Any]:
"""Delete a fact by its id and persist the updated memory data."""
memory_data = get_memory_data(agent_name)
memory_data = get_memory_data(agent_name, user_id=user_id)
facts = memory_data.get("facts", [])
updated_facts = [fact for fact in facts if fact.get("id") != fact_id]
if len(updated_facts) == len(facts):
@ -122,7 +125,7 @@ def delete_memory_fact(fact_id: str, agent_name: str | None = None) -> dict[str,
updated_memory = dict(memory_data)
updated_memory["facts"] = updated_facts
if not _save_memory_to_file(updated_memory, agent_name):
if not _save_memory_to_file(updated_memory, agent_name, user_id=user_id):
raise OSError(f"Failed to save memory data after deleting fact '{fact_id}'")
return updated_memory
@ -134,9 +137,11 @@ def update_memory_fact(
category: str | None = None,
confidence: float | None = None,
agent_name: str | None = None,
*,
user_id: str | None = None,
) -> dict[str, Any]:
"""Update an existing fact and persist the updated memory data."""
memory_data = get_memory_data(agent_name)
memory_data = get_memory_data(agent_name, user_id=user_id)
updated_memory = dict(memory_data)
updated_facts: list[dict[str, Any]] = []
found = False
@ -163,7 +168,7 @@ def update_memory_fact(
updated_memory["facts"] = updated_facts
if not _save_memory_to_file(updated_memory, agent_name):
if not _save_memory_to_file(updated_memory, agent_name, user_id=user_id):
raise OSError(f"Failed to save memory data after updating fact '{fact_id}'")
return updated_memory
@ -276,6 +281,7 @@ class MemoryUpdater:
agent_name: str | None = None,
correction_detected: bool = False,
reinforcement_detected: bool = False,
user_id: str | None = None,
) -> bool:
"""Update memory based on conversation messages.
@ -285,6 +291,7 @@ class MemoryUpdater:
agent_name: If provided, updates per-agent memory. If None, updates global memory.
correction_detected: Whether recent turns include an explicit correction signal.
reinforcement_detected: Whether recent turns include a positive reinforcement signal.
user_id: If provided, scopes memory to a specific user.
Returns:
True if update was successful, False otherwise.
@ -298,7 +305,7 @@ class MemoryUpdater:
try:
# Get current memory
current_memory = get_memory_data(agent_name)
current_memory = get_memory_data(agent_name, user_id=user_id)
# Format conversation for prompt
conversation_text = format_conversation_for_update(messages)
@ -353,7 +360,7 @@ class MemoryUpdater:
updated_memory = _strip_upload_mentions_from_memory(updated_memory)
# Save
return get_memory_storage().save(updated_memory, agent_name)
return get_memory_storage().save(updated_memory, agent_name, user_id=user_id)
except json.JSONDecodeError as e:
logger.warning("Failed to parse LLM response for memory update: %s", e)
@ -455,6 +462,7 @@ def update_memory_from_conversation(
agent_name: str | None = None,
correction_detected: bool = False,
reinforcement_detected: bool = False,
user_id: str | None = None,
) -> bool:
"""Convenience function to update memory from a conversation.
@ -464,9 +472,10 @@ def update_memory_from_conversation(
agent_name: If provided, updates per-agent memory. If None, updates global memory.
correction_detected: Whether recent turns include an explicit correction signal.
reinforcement_detected: Whether recent turns include a positive reinforcement signal.
user_id: If provided, scopes memory to a specific user.
Returns:
True if successful, False otherwise.
"""
updater = MemoryUpdater()
return updater.update_memory(messages, thread_id, agent_name, correction_detected, reinforcement_detected)
return updater.update_memory(messages, thread_id, agent_name, correction_detected, reinforcement_detected, user_id=user_id)

View File

@ -11,6 +11,7 @@ from langgraph.runtime import Runtime
from deerflow.agents.memory.queue import get_memory_queue
from deerflow.config.memory_config import get_memory_config
from deerflow.runtime.user_context import get_effective_user_id
logger = logging.getLogger(__name__)
@ -236,11 +237,16 @@ class MemoryMiddleware(AgentMiddleware[MemoryMiddlewareState]):
# Queue the filtered conversation for memory update
correction_detected = detect_correction(filtered_messages)
reinforcement_detected = not correction_detected and detect_reinforcement(filtered_messages)
# Capture user_id at enqueue time while the request context is still alive.
# threading.Timer fires on a different thread where ContextVar values are not
# propagated, so we must store user_id explicitly in ConversationContext.
user_id = get_effective_user_id()
queue = get_memory_queue()
queue.add(
thread_id=thread_id,
messages=filtered_messages,
agent_name=self._agent_name,
user_id=user_id,
correction_detected=correction_detected,
reinforcement_detected=reinforcement_detected,
)

View File

@ -8,6 +8,7 @@ from langgraph.runtime import Runtime
from deerflow.agents.thread_state import ThreadDataState
from deerflow.config.paths import Paths, get_paths
from deerflow.runtime.user_context import get_effective_user_id
logger = logging.getLogger(__name__)
@ -46,32 +47,34 @@ class ThreadDataMiddleware(AgentMiddleware[ThreadDataMiddlewareState]):
self._paths = Paths(base_dir) if base_dir else get_paths()
self._lazy_init = lazy_init
def _get_thread_paths(self, thread_id: str) -> dict[str, str]:
def _get_thread_paths(self, thread_id: str, user_id: str | None = None) -> dict[str, str]:
"""Get the paths for a thread's data directories.
Args:
thread_id: The thread ID.
user_id: Optional user ID for per-user path isolation.
Returns:
Dictionary with workspace_path, uploads_path, and outputs_path.
"""
return {
"workspace_path": str(self._paths.sandbox_work_dir(thread_id)),
"uploads_path": str(self._paths.sandbox_uploads_dir(thread_id)),
"outputs_path": str(self._paths.sandbox_outputs_dir(thread_id)),
"workspace_path": str(self._paths.sandbox_work_dir(thread_id, user_id=user_id)),
"uploads_path": str(self._paths.sandbox_uploads_dir(thread_id, user_id=user_id)),
"outputs_path": str(self._paths.sandbox_outputs_dir(thread_id, user_id=user_id)),
}
def _create_thread_directories(self, thread_id: str) -> dict[str, str]:
def _create_thread_directories(self, thread_id: str, user_id: str | None = None) -> dict[str, str]:
"""Create the thread data directories.
Args:
thread_id: The thread ID.
user_id: Optional user ID for per-user path isolation.
Returns:
Dictionary with the created directory paths.
"""
self._paths.ensure_thread_dirs(thread_id)
return self._get_thread_paths(thread_id)
self._paths.ensure_thread_dirs(thread_id, user_id=user_id)
return self._get_thread_paths(thread_id, user_id=user_id)
@override
def before_agent(self, state: ThreadDataMiddlewareState, runtime: Runtime) -> dict | None:
@ -84,12 +87,14 @@ class ThreadDataMiddleware(AgentMiddleware[ThreadDataMiddlewareState]):
if thread_id is None:
raise ValueError("Thread ID is required in runtime context or config.configurable")
user_id = get_effective_user_id()
if self._lazy_init:
# Lazy initialization: only compute paths, don't create directories
paths = self._get_thread_paths(thread_id)
paths = self._get_thread_paths(thread_id, user_id=user_id)
else:
# Eager initialization: create directories immediately
paths = self._create_thread_directories(thread_id)
paths = self._create_thread_directories(thread_id, user_id=user_id)
logger.debug("Created thread data directories for thread %s", thread_id)
return {

View File

@ -10,6 +10,7 @@ from langchain_core.messages import HumanMessage
from langgraph.runtime import Runtime
from deerflow.config.paths import Paths, get_paths
from deerflow.runtime.user_context import get_effective_user_id
from deerflow.utils.file_conversion import extract_outline
logger = logging.getLogger(__name__)
@ -221,7 +222,7 @@ class UploadsMiddleware(AgentMiddleware[UploadsMiddlewareState]):
thread_id = get_config().get("configurable", {}).get("thread_id")
except RuntimeError:
pass # get_config() raises outside a runnable context (e.g. unit tests)
uploads_dir = self._paths.sandbox_uploads_dir(thread_id) if thread_id else None
uploads_dir = self._paths.sandbox_uploads_dir(thread_id, user_id=get_effective_user_id()) if thread_id else None
# Get newly uploaded files from the current message's additional_kwargs.files
new_files = self._files_from_kwargs(last_message, uploads_dir) or []

View File

@ -40,6 +40,7 @@ from deerflow.config.app_config import get_app_config, reload_app_config
from deerflow.config.extensions_config import ExtensionsConfig, SkillStateConfig, get_extensions_config, reload_extensions_config
from deerflow.config.paths import get_paths
from deerflow.models import create_chat_model
from deerflow.runtime.user_context import get_effective_user_id
from deerflow.skills.installer import install_skill_from_archive
from deerflow.uploads.manager import (
claim_unique_filename,
@ -769,19 +770,19 @@ class DeerFlowClient:
"""
from deerflow.agents.memory.updater import get_memory_data
return get_memory_data()
return get_memory_data(user_id=get_effective_user_id())
def export_memory(self) -> dict:
"""Export current memory data for backup or transfer."""
from deerflow.agents.memory.updater import get_memory_data
return get_memory_data()
return get_memory_data(user_id=get_effective_user_id())
def import_memory(self, memory_data: dict) -> dict:
"""Import and persist full memory data."""
from deerflow.agents.memory.updater import import_memory_data
return import_memory_data(memory_data)
return import_memory_data(memory_data, user_id=get_effective_user_id())
def get_model(self, name: str) -> dict | None:
"""Get a specific model's configuration by name.
@ -956,13 +957,13 @@ class DeerFlowClient:
"""
from deerflow.agents.memory.updater import reload_memory_data
return reload_memory_data()
return reload_memory_data(user_id=get_effective_user_id())
def clear_memory(self) -> dict:
"""Clear all persisted memory data."""
from deerflow.agents.memory.updater import clear_memory_data
return clear_memory_data()
return clear_memory_data(user_id=get_effective_user_id())
def create_memory_fact(self, content: str, category: str = "context", confidence: float = 0.5) -> dict:
"""Create a single fact manually."""
@ -1179,7 +1180,7 @@ class DeerFlowClient:
ValueError: If the path is invalid.
"""
try:
actual = get_paths().resolve_virtual_path(thread_id, path)
actual = get_paths().resolve_virtual_path(thread_id, path, user_id=get_effective_user_id())
except ValueError as exc:
if "traversal" in str(exc):
from deerflow.uploads.manager import PathTraversalError

View File

@ -27,6 +27,7 @@ except ImportError: # pragma: no cover - Windows fallback
from deerflow.config import get_app_config
from deerflow.config.paths import VIRTUAL_PATH_PREFIX, get_paths
from deerflow.runtime.user_context import get_effective_user_id
from deerflow.sandbox.sandbox import Sandbox
from deerflow.sandbox.sandbox_provider import SandboxProvider
@ -260,15 +261,16 @@ class AioSandboxProvider(SandboxProvider):
mounted Docker socket (DooD), the host Docker daemon can resolve the paths.
"""
paths = get_paths()
paths.ensure_thread_dirs(thread_id)
user_id = get_effective_user_id()
paths.ensure_thread_dirs(thread_id, user_id=user_id)
return [
(paths.host_sandbox_work_dir(thread_id), f"{VIRTUAL_PATH_PREFIX}/workspace", False),
(paths.host_sandbox_uploads_dir(thread_id), f"{VIRTUAL_PATH_PREFIX}/uploads", False),
(paths.host_sandbox_outputs_dir(thread_id), f"{VIRTUAL_PATH_PREFIX}/outputs", False),
(paths.host_sandbox_work_dir(thread_id, user_id=user_id), f"{VIRTUAL_PATH_PREFIX}/workspace", False),
(paths.host_sandbox_uploads_dir(thread_id, user_id=user_id), f"{VIRTUAL_PATH_PREFIX}/uploads", False),
(paths.host_sandbox_outputs_dir(thread_id, user_id=user_id), f"{VIRTUAL_PATH_PREFIX}/outputs", False),
# ACP workspace: read-only inside the sandbox (lead agent reads results;
# the ACP subprocess writes from the host side, not from within the container).
(paths.host_acp_workspace_dir(thread_id), "/mnt/acp-workspace", True),
(paths.host_acp_workspace_dir(thread_id, user_id=user_id), "/mnt/acp-workspace", True),
]
@staticmethod
@ -480,8 +482,9 @@ class AioSandboxProvider(SandboxProvider):
across multiple processes, preventing container-name conflicts.
"""
paths = get_paths()
paths.ensure_thread_dirs(thread_id)
lock_path = paths.thread_dir(thread_id) / f"{sandbox_id}.lock"
user_id = get_effective_user_id()
paths.ensure_thread_dirs(thread_id, user_id=user_id)
lock_path = paths.thread_dir(thread_id, user_id=user_id) / f"{sandbox_id}.lock"
with open(lock_path, "a", encoding="utf-8") as lock_file:
locked = False

View File

@ -14,8 +14,9 @@ class MemoryConfig(BaseModel):
default="",
description=(
"Path to store memory data. "
"If empty, defaults to `{base_dir}/memory.json` (see Paths.memory_file). "
"Absolute paths are used as-is. "
"If empty, defaults to per-user memory at `{base_dir}/users/{user_id}/memory.json`. "
"Absolute paths are used as-is and opt out of per-user isolation "
"(all users share the same file). "
"Relative paths are resolved against `Paths.base_dir` "
"(not the backend working directory). "
"Note: if you previously set this to `.deer-flow/memory.json`, "

View File

@ -7,6 +7,7 @@ from pathlib import Path, PureWindowsPath
VIRTUAL_PATH_PREFIX = "/mnt/user-data"
_SAFE_THREAD_ID_RE = re.compile(r"^[A-Za-z0-9_\-]+$")
_SAFE_USER_ID_RE = re.compile(r"^[A-Za-z0-9_\-]+$")
def _default_local_base_dir() -> Path:
@ -22,6 +23,13 @@ def _validate_thread_id(thread_id: str) -> str:
return thread_id
def _validate_user_id(user_id: str) -> str:
"""Validate a user ID before using it in filesystem paths."""
if not _SAFE_USER_ID_RE.match(user_id):
raise ValueError(f"Invalid user_id {user_id!r}: only alphanumeric characters, hyphens, and underscores are allowed.")
return user_id
def _join_host_path(base: str, *parts: str) -> str:
"""Join host filesystem path segments while preserving native style.
@ -134,44 +142,63 @@ class Paths:
"""Per-agent memory file: `{base_dir}/agents/{name}/memory.json`."""
return self.agent_dir(name) / "memory.json"
def thread_dir(self, thread_id: str) -> Path:
def user_dir(self, user_id: str) -> Path:
"""Directory for a specific user: `{base_dir}/users/{user_id}/`."""
return self.base_dir / "users" / _validate_user_id(user_id)
def user_memory_file(self, user_id: str) -> Path:
"""Per-user memory file: `{base_dir}/users/{user_id}/memory.json`."""
return self.user_dir(user_id) / "memory.json"
def user_agent_memory_file(self, user_id: str, agent_name: str) -> Path:
"""Per-user per-agent memory: `{base_dir}/users/{user_id}/agents/{name}/memory.json`."""
return self.user_dir(user_id) / "agents" / agent_name.lower() / "memory.json"
def thread_dir(self, thread_id: str, *, user_id: str | None = None) -> Path:
"""
Host path for a thread's data: `{base_dir}/threads/{thread_id}/`
Host path for a thread's data.
When *user_id* is provided:
`{base_dir}/users/{user_id}/threads/{thread_id}/`
Otherwise (legacy layout):
`{base_dir}/threads/{thread_id}/`
This directory contains a `user-data/` subdirectory that is mounted
as `/mnt/user-data/` inside the sandbox.
Raises:
ValueError: If `thread_id` contains unsafe characters (path separators
or `..`) that could cause directory traversal.
ValueError: If `thread_id` or `user_id` contains unsafe characters (path
separators or `..`) that could cause directory traversal.
"""
if user_id is not None:
return self.user_dir(user_id) / "threads" / _validate_thread_id(thread_id)
return self.base_dir / "threads" / _validate_thread_id(thread_id)
def sandbox_work_dir(self, thread_id: str) -> Path:
def sandbox_work_dir(self, thread_id: str, *, user_id: str | None = None) -> Path:
"""
Host path for the agent's workspace directory.
Host: `{base_dir}/threads/{thread_id}/user-data/workspace/`
Sandbox: `/mnt/user-data/workspace/`
"""
return self.thread_dir(thread_id) / "user-data" / "workspace"
return self.thread_dir(thread_id, user_id=user_id) / "user-data" / "workspace"
def sandbox_uploads_dir(self, thread_id: str) -> Path:
def sandbox_uploads_dir(self, thread_id: str, *, user_id: str | None = None) -> Path:
"""
Host path for user-uploaded files.
Host: `{base_dir}/threads/{thread_id}/user-data/uploads/`
Sandbox: `/mnt/user-data/uploads/`
"""
return self.thread_dir(thread_id) / "user-data" / "uploads"
return self.thread_dir(thread_id, user_id=user_id) / "user-data" / "uploads"
def sandbox_outputs_dir(self, thread_id: str) -> Path:
def sandbox_outputs_dir(self, thread_id: str, *, user_id: str | None = None) -> Path:
"""
Host path for agent-generated artifacts.
Host: `{base_dir}/threads/{thread_id}/user-data/outputs/`
Sandbox: `/mnt/user-data/outputs/`
"""
return self.thread_dir(thread_id) / "user-data" / "outputs"
return self.thread_dir(thread_id, user_id=user_id) / "user-data" / "outputs"
def acp_workspace_dir(self, thread_id: str) -> Path:
def acp_workspace_dir(self, thread_id: str, *, user_id: str | None = None) -> Path:
"""
Host path for the ACP workspace of a specific thread.
Host: `{base_dir}/threads/{thread_id}/acp-workspace/`
@ -180,41 +207,43 @@ class Paths:
Each thread gets its own isolated ACP workspace so that concurrent
sessions cannot read each other's ACP agent outputs.
"""
return self.thread_dir(thread_id) / "acp-workspace"
return self.thread_dir(thread_id, user_id=user_id) / "acp-workspace"
def sandbox_user_data_dir(self, thread_id: str) -> Path:
def sandbox_user_data_dir(self, thread_id: str, *, user_id: str | None = None) -> Path:
"""
Host path for the user-data root.
Host: `{base_dir}/threads/{thread_id}/user-data/`
Sandbox: `/mnt/user-data/`
"""
return self.thread_dir(thread_id) / "user-data"
return self.thread_dir(thread_id, user_id=user_id) / "user-data"
def host_thread_dir(self, thread_id: str) -> str:
def host_thread_dir(self, thread_id: str, *, user_id: str | None = None) -> str:
"""Host path for a thread directory, preserving Windows path syntax."""
if user_id is not None:
return _join_host_path(self._host_base_dir_str(), "users", _validate_user_id(user_id), "threads", _validate_thread_id(thread_id))
return _join_host_path(self._host_base_dir_str(), "threads", _validate_thread_id(thread_id))
def host_sandbox_user_data_dir(self, thread_id: str) -> str:
def host_sandbox_user_data_dir(self, thread_id: str, *, user_id: str | None = None) -> str:
"""Host path for a thread's user-data root."""
return _join_host_path(self.host_thread_dir(thread_id), "user-data")
return _join_host_path(self.host_thread_dir(thread_id, user_id=user_id), "user-data")
def host_sandbox_work_dir(self, thread_id: str) -> str:
def host_sandbox_work_dir(self, thread_id: str, *, user_id: str | None = None) -> str:
"""Host path for the workspace mount source."""
return _join_host_path(self.host_sandbox_user_data_dir(thread_id), "workspace")
return _join_host_path(self.host_sandbox_user_data_dir(thread_id, user_id=user_id), "workspace")
def host_sandbox_uploads_dir(self, thread_id: str) -> str:
def host_sandbox_uploads_dir(self, thread_id: str, *, user_id: str | None = None) -> str:
"""Host path for the uploads mount source."""
return _join_host_path(self.host_sandbox_user_data_dir(thread_id), "uploads")
return _join_host_path(self.host_sandbox_user_data_dir(thread_id, user_id=user_id), "uploads")
def host_sandbox_outputs_dir(self, thread_id: str) -> str:
def host_sandbox_outputs_dir(self, thread_id: str, *, user_id: str | None = None) -> str:
"""Host path for the outputs mount source."""
return _join_host_path(self.host_sandbox_user_data_dir(thread_id), "outputs")
return _join_host_path(self.host_sandbox_user_data_dir(thread_id, user_id=user_id), "outputs")
def host_acp_workspace_dir(self, thread_id: str) -> str:
def host_acp_workspace_dir(self, thread_id: str, *, user_id: str | None = None) -> str:
"""Host path for the ACP workspace mount source."""
return _join_host_path(self.host_thread_dir(thread_id), "acp-workspace")
return _join_host_path(self.host_thread_dir(thread_id, user_id=user_id), "acp-workspace")
def ensure_thread_dirs(self, thread_id: str) -> None:
def ensure_thread_dirs(self, thread_id: str, *, user_id: str | None = None) -> None:
"""Create all standard sandbox directories for a thread.
Directories are created with mode 0o777 so that sandbox containers
@ -228,24 +257,24 @@ class Paths:
ACP agent invocation.
"""
for d in [
self.sandbox_work_dir(thread_id),
self.sandbox_uploads_dir(thread_id),
self.sandbox_outputs_dir(thread_id),
self.acp_workspace_dir(thread_id),
self.sandbox_work_dir(thread_id, user_id=user_id),
self.sandbox_uploads_dir(thread_id, user_id=user_id),
self.sandbox_outputs_dir(thread_id, user_id=user_id),
self.acp_workspace_dir(thread_id, user_id=user_id),
]:
d.mkdir(parents=True, exist_ok=True)
d.chmod(0o777)
def delete_thread_dir(self, thread_id: str) -> None:
def delete_thread_dir(self, thread_id: str, *, user_id: str | None = None) -> None:
"""Delete all persisted data for a thread.
The operation is idempotent: missing thread directories are ignored.
"""
thread_dir = self.thread_dir(thread_id)
thread_dir = self.thread_dir(thread_id, user_id=user_id)
if thread_dir.exists():
shutil.rmtree(thread_dir)
def resolve_virtual_path(self, thread_id: str, virtual_path: str) -> Path:
def resolve_virtual_path(self, thread_id: str, virtual_path: str, *, user_id: str | None = None) -> Path:
"""Resolve a sandbox virtual path to the actual host filesystem path.
Args:
@ -253,6 +282,7 @@ class Paths:
virtual_path: Virtual path as seen inside the sandbox, e.g.
``/mnt/user-data/outputs/report.pdf``.
Leading slashes are stripped before matching.
user_id: Optional user ID for user-scoped path resolution.
Returns:
The resolved absolute host filesystem path.
@ -270,7 +300,7 @@ class Paths:
raise ValueError(f"Path must start with /{prefix}")
relative = stripped[len(prefix) :].lstrip("/")
base = self.sandbox_user_data_dir(thread_id).resolve()
base = self.sandbox_user_data_dir(thread_id, user_id=user_id).resolve()
actual = (base / relative).resolve()
try:

View File

@ -83,8 +83,18 @@ class RunEventStore(abc.ABC):
self,
thread_id: str,
run_id: str,
*,
limit: int = 50,
before_seq: int | None = None,
after_seq: int | None = None,
) -> list[dict]:
"""Return displayable messages (category=message) for a specific run, ordered by seq ascending."""
"""Return displayable messages (category=message) for a specific run, ordered by seq ascending.
Supports bidirectional cursor pagination:
- after_seq: return the first ``limit`` records with seq > after_seq (ascending)
- before_seq: return the last ``limit`` records with seq < before_seq (ascending)
- neither: return the latest ``limit`` records (ascending)
"""
@abc.abstractmethod
async def count_messages(self, thread_id: str) -> int:

View File

@ -205,16 +205,35 @@ class DbRunEventStore(RunEventStore):
thread_id,
run_id,
*,
limit=50,
before_seq=None,
after_seq=None,
user_id: str | None | _AutoSentinel = AUTO,
):
resolved_user_id = resolve_user_id(user_id, method_name="DbRunEventStore.list_messages_by_run")
stmt = select(RunEventRow).where(RunEventRow.thread_id == thread_id, RunEventRow.run_id == run_id, RunEventRow.category == "message")
stmt = select(RunEventRow).where(
RunEventRow.thread_id == thread_id,
RunEventRow.run_id == run_id,
RunEventRow.category == "message",
)
if resolved_user_id is not None:
stmt = stmt.where(RunEventRow.user_id == resolved_user_id)
stmt = stmt.order_by(RunEventRow.seq.asc())
async with self._sf() as session:
result = await session.execute(stmt)
return [self._row_to_dict(r) for r in result.scalars()]
if before_seq is not None:
stmt = stmt.where(RunEventRow.seq < before_seq)
if after_seq is not None:
stmt = stmt.where(RunEventRow.seq > after_seq)
if after_seq is not None:
stmt = stmt.order_by(RunEventRow.seq.asc()).limit(limit)
async with self._sf() as session:
result = await session.execute(stmt)
return [self._row_to_dict(r) for r in result.scalars()]
else:
stmt = stmt.order_by(RunEventRow.seq.desc()).limit(limit)
async with self._sf() as session:
result = await session.execute(stmt)
rows = list(result.scalars())
return [self._row_to_dict(r) for r in reversed(rows)]
async def count_messages(
self,

View File

@ -152,9 +152,17 @@ class JsonlRunEventStore(RunEventStore):
events = [e for e in events if e.get("event_type") in event_types]
return events[:limit]
async def list_messages_by_run(self, thread_id, run_id):
async def list_messages_by_run(self, thread_id, run_id, *, limit=50, before_seq=None, after_seq=None):
events = self._read_run_events(thread_id, run_id)
return [e for e in events if e.get("category") == "message"]
filtered = [e for e in events if e.get("category") == "message"]
if before_seq is not None:
filtered = [e for e in filtered if e.get("seq", 0) < before_seq]
if after_seq is not None:
filtered = [e for e in filtered if e.get("seq", 0) > after_seq]
if after_seq is not None:
return filtered[:limit]
else:
return filtered[-limit:] if len(filtered) > limit else filtered
async def count_messages(self, thread_id):
all_events = self._read_thread_events(thread_id)

View File

@ -97,9 +97,17 @@ class MemoryRunEventStore(RunEventStore):
filtered = [e for e in filtered if e["event_type"] in event_types]
return filtered[:limit]
async def list_messages_by_run(self, thread_id, run_id):
async def list_messages_by_run(self, thread_id, run_id, *, limit=50, before_seq=None, after_seq=None):
all_events = self._events.get(thread_id, [])
return [e for e in all_events if e["run_id"] == run_id and e["category"] == "message"]
filtered = [e for e in all_events if e["run_id"] == run_id and e["category"] == "message"]
if before_seq is not None:
filtered = [e for e in filtered if e["seq"] < before_seq]
if after_seq is not None:
filtered = [e for e in filtered if e["seq"] > after_seq]
if after_seq is not None:
return filtered[:limit]
else:
return filtered[-limit:] if len(filtered) > limit else filtered
async def count_messages(self, thread_id):
all_events = self._events.get(thread_id, [])

View File

@ -87,6 +87,8 @@ async def run_agent(
journal = None
journal = None
# Track whether "events" was requested but skipped
if "events" in requested_modes:
logger.info(

View File

@ -90,6 +90,25 @@ def require_current_user() -> CurrentUser:
return user
# ---------------------------------------------------------------------------
# Effective user_id helpers (filesystem isolation)
# ---------------------------------------------------------------------------
DEFAULT_USER_ID: Final[str] = "default"
def get_effective_user_id() -> str:
"""Return the current user's id as a string, or DEFAULT_USER_ID if unset.
Unlike :func:`require_current_user` this never raises it is designed
for filesystem-path resolution where a valid user bucket is always needed.
"""
user = _current_user.get()
if user is None:
return DEFAULT_USER_ID
return str(user.id)
# ---------------------------------------------------------------------------
# Sentinel-based user_id resolution
# ---------------------------------------------------------------------------

View File

@ -200,8 +200,9 @@ def _get_acp_workspace_host_path(thread_id: str | None = None) -> str | None:
if thread_id is not None:
try:
from deerflow.config.paths import get_paths
from deerflow.runtime.user_context import get_effective_user_id
host_path = get_paths().acp_workspace_dir(thread_id)
host_path = get_paths().acp_workspace_dir(thread_id, user_id=get_effective_user_id())
if host_path.exists():
return str(host_path)
except Exception:

View File

@ -33,11 +33,12 @@ def _get_work_dir(thread_id: str | None) -> str:
An absolute physical filesystem path to use as the working directory.
"""
from deerflow.config.paths import get_paths
from deerflow.runtime.user_context import get_effective_user_id
paths = get_paths()
if thread_id:
try:
work_dir = paths.acp_workspace_dir(thread_id)
work_dir = paths.acp_workspace_dir(thread_id, user_id=get_effective_user_id())
except ValueError:
logger.warning("Invalid thread_id %r for ACP workspace, falling back to global", thread_id)
work_dir = paths.base_dir / "acp-workspace"

View File

@ -8,6 +8,7 @@ from langgraph.typing import ContextT
from deerflow.agents.thread_state import ThreadState
from deerflow.config.paths import VIRTUAL_PATH_PREFIX, get_paths
from deerflow.runtime.user_context import get_effective_user_id
OUTPUTS_VIRTUAL_PREFIX = f"{VIRTUAL_PATH_PREFIX}/outputs"
@ -47,7 +48,7 @@ def _normalize_presented_filepath(
virtual_prefix = VIRTUAL_PATH_PREFIX.lstrip("/")
if stripped == virtual_prefix or stripped.startswith(virtual_prefix + "/"):
actual_path = get_paths().resolve_virtual_path(thread_id, filepath)
actual_path = get_paths().resolve_virtual_path(thread_id, filepath, user_id=get_effective_user_id())
else:
actual_path = Path(filepath).expanduser().resolve()

View File

@ -10,6 +10,7 @@ from pathlib import Path
from urllib.parse import quote
from deerflow.config.paths import VIRTUAL_PATH_PREFIX, get_paths
from deerflow.runtime.user_context import get_effective_user_id
class PathTraversalError(ValueError):
@ -33,7 +34,7 @@ def validate_thread_id(thread_id: str) -> None:
def get_uploads_dir(thread_id: str) -> Path:
"""Return the uploads directory path for a thread (no side effects)."""
validate_thread_id(thread_id)
return get_paths().sandbox_uploads_dir(thread_id)
return get_paths().sandbox_uploads_dir(thread_id, user_id=get_effective_user_id())
def ensure_uploads_dir(thread_id: str) -> Path:

View File

@ -0,0 +1,160 @@
"""One-time migration: move legacy thread dirs and memory into per-user layout.
Usage:
PYTHONPATH=. python scripts/migrate_user_isolation.py [--dry-run]
The script is idempotent re-running it after a successful migration is a no-op.
"""
import argparse
import json
import logging
import shutil
from pathlib import Path
from deerflow.config.paths import Paths, get_paths
logger = logging.getLogger(__name__)
def migrate_thread_dirs(
paths: Paths,
thread_owner_map: dict[str, str],
*,
dry_run: bool = False,
) -> list[dict]:
"""Move legacy thread directories into per-user layout.
Args:
paths: Paths instance.
thread_owner_map: Mapping of thread_id -> user_id from threads_meta table.
dry_run: If True, only log what would happen.
Returns:
List of migration report entries.
"""
report: list[dict] = []
legacy_threads = paths.base_dir / "threads"
if not legacy_threads.exists():
logger.info("No legacy threads directory found — nothing to migrate.")
return report
for thread_dir in sorted(legacy_threads.iterdir()):
if not thread_dir.is_dir():
continue
thread_id = thread_dir.name
user_id = thread_owner_map.get(thread_id, "default")
dest = paths.base_dir / "users" / user_id / "threads" / thread_id
entry = {"thread_id": thread_id, "user_id": user_id, "action": ""}
if dest.exists():
conflicts_dir = paths.base_dir / "migration-conflicts" / thread_id
entry["action"] = f"conflict -> {conflicts_dir}"
if not dry_run:
conflicts_dir.parent.mkdir(parents=True, exist_ok=True)
shutil.move(str(thread_dir), str(conflicts_dir))
logger.warning("Conflict for thread %s: moved to %s", thread_id, conflicts_dir)
else:
entry["action"] = f"moved -> {dest}"
if not dry_run:
dest.parent.mkdir(parents=True, exist_ok=True)
shutil.move(str(thread_dir), str(dest))
logger.info("Migrated thread %s -> user %s", thread_id, user_id)
report.append(entry)
# Clean up empty legacy threads dir
if not dry_run and legacy_threads.exists() and not any(legacy_threads.iterdir()):
legacy_threads.rmdir()
return report
def migrate_memory(
paths: Paths,
user_id: str = "default",
*,
dry_run: bool = False,
) -> None:
"""Move legacy global memory.json into per-user layout.
Args:
paths: Paths instance.
user_id: Target user to receive the legacy memory.
dry_run: If True, only log.
"""
legacy_mem = paths.base_dir / "memory.json"
if not legacy_mem.exists():
logger.info("No legacy memory.json found — nothing to migrate.")
return
dest = paths.user_memory_file(user_id)
if dest.exists():
legacy_backup = paths.base_dir / "memory.legacy.json"
logger.warning("Destination %s exists; renaming legacy to %s", dest, legacy_backup)
if not dry_run:
legacy_mem.rename(legacy_backup)
return
logger.info("Migrating memory.json -> %s", dest)
if not dry_run:
dest.parent.mkdir(parents=True, exist_ok=True)
shutil.move(str(legacy_mem), str(dest))
def _build_owner_map_from_db(paths: Paths) -> dict[str, str]:
"""Query threads_meta table for thread_id -> user_id mapping.
Uses raw sqlite3 to avoid async dependencies.
"""
import sqlite3
db_path = paths.base_dir / "deer-flow.db"
if not db_path.exists():
logger.info("No database found at %s — using empty owner map.", db_path)
return {}
conn = sqlite3.connect(str(db_path))
try:
cursor = conn.execute("SELECT thread_id, user_id FROM threads_meta WHERE user_id IS NOT NULL")
return {row[0]: row[1] for row in cursor.fetchall()}
except sqlite3.OperationalError as e:
logger.warning("Failed to query threads_meta: %s", e)
return {}
finally:
conn.close()
def main() -> None:
parser = argparse.ArgumentParser(description="Migrate DeerFlow data to per-user layout")
parser.add_argument("--dry-run", action="store_true", help="Log actions without making changes")
args = parser.parse_args()
logging.basicConfig(level=logging.INFO, format="%(levelname)s: %(message)s")
paths = get_paths()
logger.info("Base directory: %s", paths.base_dir)
logger.info("Dry run: %s", args.dry_run)
owner_map = _build_owner_map_from_db(paths)
logger.info("Found %d thread ownership records in DB", len(owner_map))
report = migrate_thread_dirs(paths, owner_map, dry_run=args.dry_run)
migrate_memory(paths, user_id="default", dry_run=args.dry_run)
if report:
logger.info("Migration report:")
for entry in report:
logger.info(" thread=%s user=%s action=%s", entry["thread_id"], entry["user_id"], entry["action"])
else:
logger.info("No threads to migrate.")
unowned = [e for e in report if e["user_id"] == "default"]
if unowned:
logger.warning("%d thread(s) had no owner and were assigned to 'default':", len(unowned))
for e in unowned:
logger.warning(" %s", e["thread_id"])
if __name__ == "__main__":
main()

View File

@ -57,6 +57,7 @@ def test_get_thread_mounts_includes_acp_workspace(tmp_path, monkeypatch):
"""_get_thread_mounts must include /mnt/acp-workspace (read-only) for docker sandbox."""
aio_mod = importlib.import_module("deerflow.community.aio_sandbox.aio_sandbox_provider")
monkeypatch.setattr(aio_mod, "get_paths", lambda: Paths(base_dir=tmp_path))
monkeypatch.setattr(aio_mod, "get_effective_user_id", lambda: None)
mounts = aio_mod.AioSandboxProvider._get_thread_mounts("thread-3")
@ -95,6 +96,7 @@ def test_get_thread_mounts_preserves_windows_host_path_style(tmp_path, monkeypat
aio_mod = importlib.import_module("deerflow.community.aio_sandbox.aio_sandbox_provider")
monkeypatch.setenv("DEER_FLOW_HOST_BASE_DIR", r"C:\Users\demo\deer-flow\backend\.deer-flow")
monkeypatch.setattr(aio_mod, "get_paths", lambda: Paths(base_dir=tmp_path))
monkeypatch.setattr(aio_mod, "get_effective_user_id", lambda: None)
mounts = aio_mod.AioSandboxProvider._get_thread_mounts("thread-10")

View File

@ -231,7 +231,7 @@ class TestResolveAttachments:
mock_paths = MagicMock()
mock_paths.sandbox_outputs_dir.return_value = outputs_dir
def resolve_side_effect(tid, vpath):
def resolve_side_effect(tid, vpath, *, user_id=None):
if "data.csv" in vpath:
return good_file
return tmp_path / "missing.txt"

View File

@ -1241,7 +1241,10 @@ class TestMemoryManagement:
with patch("deerflow.agents.memory.updater.import_memory_data", return_value=imported) as mock_import:
result = client.import_memory(imported)
mock_import.assert_called_once_with(imported)
assert mock_import.call_count == 1
call_args = mock_import.call_args
assert call_args.args == (imported,)
assert "user_id" in call_args.kwargs
assert result == imported
def test_reload_memory(self, client):
@ -1487,9 +1490,12 @@ class TestUploads:
class TestArtifacts:
def test_get_artifact(self, client):
from deerflow.runtime.user_context import get_effective_user_id
with tempfile.TemporaryDirectory() as tmp:
paths = Paths(base_dir=tmp)
outputs = paths.sandbox_outputs_dir("t1")
user_id = get_effective_user_id()
outputs = paths.sandbox_outputs_dir("t1", user_id=user_id)
outputs.mkdir(parents=True)
(outputs / "result.txt").write_text("artifact content")
@ -1500,9 +1506,12 @@ class TestArtifacts:
assert "text" in mime
def test_get_artifact_not_found(self, client):
from deerflow.runtime.user_context import get_effective_user_id
with tempfile.TemporaryDirectory() as tmp:
paths = Paths(base_dir=tmp)
paths.sandbox_user_data_dir("t1").mkdir(parents=True)
user_id = get_effective_user_id()
paths.sandbox_outputs_dir("t1", user_id=user_id).mkdir(parents=True)
with patch("deerflow.client.get_paths", return_value=paths):
with pytest.raises(FileNotFoundError):
@ -1513,9 +1522,12 @@ class TestArtifacts:
client.get_artifact("t1", "bad/path/file.txt")
def test_get_artifact_path_traversal(self, client):
from deerflow.runtime.user_context import get_effective_user_id
with tempfile.TemporaryDirectory() as tmp:
paths = Paths(base_dir=tmp)
paths.sandbox_user_data_dir("t1").mkdir(parents=True)
user_id = get_effective_user_id()
paths.sandbox_outputs_dir("t1", user_id=user_id).mkdir(parents=True)
with patch("deerflow.client.get_paths", return_value=paths):
with pytest.raises(PathTraversalError):
@ -1699,13 +1711,16 @@ class TestScenarioFileLifecycle:
def test_upload_then_read_artifact(self, client):
"""Upload a file, simulate agent producing artifact, read it back."""
from deerflow.runtime.user_context import get_effective_user_id
with tempfile.TemporaryDirectory() as tmp:
tmp_path = Path(tmp)
uploads_dir = tmp_path / "uploads"
uploads_dir.mkdir()
paths = Paths(base_dir=tmp_path)
outputs_dir = paths.sandbox_outputs_dir("t-artifact")
user_id = get_effective_user_id()
outputs_dir = paths.sandbox_outputs_dir("t-artifact", user_id=user_id)
outputs_dir.mkdir(parents=True)
# Upload phase
@ -1955,11 +1970,14 @@ class TestScenarioThreadIsolation:
def test_artifacts_isolated_per_thread(self, client):
"""Artifacts in thread-A are not accessible from thread-B."""
from deerflow.runtime.user_context import get_effective_user_id
with tempfile.TemporaryDirectory() as tmp:
paths = Paths(base_dir=tmp)
outputs_a = paths.sandbox_outputs_dir("thread-a")
user_id = get_effective_user_id()
outputs_a = paths.sandbox_outputs_dir("thread-a", user_id=user_id)
outputs_a.mkdir(parents=True)
paths.sandbox_user_data_dir("thread-b").mkdir(parents=True)
paths.sandbox_outputs_dir("thread-b", user_id=user_id).mkdir(parents=True)
(outputs_a / "result.txt").write_text("thread-a artifact")
with patch("deerflow.client.get_paths", return_value=paths):
@ -2864,9 +2882,12 @@ class TestUploadDeleteSymlink:
class TestArtifactHardening:
def test_artifact_directory_rejected(self, client):
"""get_artifact rejects paths that resolve to a directory."""
from deerflow.runtime.user_context import get_effective_user_id
with tempfile.TemporaryDirectory() as tmp:
paths = Paths(base_dir=tmp)
subdir = paths.sandbox_outputs_dir("t1") / "subdir"
user_id = get_effective_user_id()
subdir = paths.sandbox_outputs_dir("t1", user_id=user_id) / "subdir"
subdir.mkdir(parents=True)
with patch("deerflow.client.get_paths", return_value=paths):
@ -2875,9 +2896,12 @@ class TestArtifactHardening:
def test_artifact_leading_slash_stripped(self, client):
"""Paths with leading slash are handled correctly."""
from deerflow.runtime.user_context import get_effective_user_id
with tempfile.TemporaryDirectory() as tmp:
paths = Paths(base_dir=tmp)
outputs = paths.sandbox_outputs_dir("t1")
user_id = get_effective_user_id()
outputs = paths.sandbox_outputs_dir("t1", user_id=user_id)
outputs.mkdir(parents=True)
(outputs / "file.txt").write_text("content")
@ -2991,9 +3015,12 @@ class TestBugArtifactPrefixMatchTooLoose:
def test_exact_prefix_without_subpath_accepted(self, client):
"""Bare 'mnt/user-data' is accepted (will later fail as directory, not at prefix)."""
from deerflow.runtime.user_context import get_effective_user_id
with tempfile.TemporaryDirectory() as tmp:
paths = Paths(base_dir=tmp)
paths.sandbox_user_data_dir("t1").mkdir(parents=True)
user_id = get_effective_user_id()
paths.sandbox_outputs_dir("t1", user_id=user_id).mkdir(parents=True)
with patch("deerflow.client.get_paths", return_value=paths):
# Accepted at prefix check, but fails because it's a directory.

View File

@ -262,8 +262,9 @@ class TestFileUploadIntegration:
# Physically exists
from deerflow.config.paths import get_paths
from deerflow.runtime.user_context import get_effective_user_id
assert (get_paths().sandbox_uploads_dir(tid) / "readme.txt").exists()
assert (get_paths().sandbox_uploads_dir(tid, user_id=get_effective_user_id()) / "readme.txt").exists()
def test_upload_duplicate_rename(self, e2e_env, tmp_path):
"""Uploading two files with the same name auto-renames the second."""
@ -472,12 +473,13 @@ class TestArtifactAccess:
def test_get_artifact_happy_path(self, e2e_env):
"""Write a file to outputs, then read it back via get_artifact()."""
from deerflow.config.paths import get_paths
from deerflow.runtime.user_context import get_effective_user_id
c = DeerFlowClient(checkpointer=None, thinking_enabled=False)
tid = str(uuid.uuid4())
# Create an output file in the thread's outputs directory
outputs_dir = get_paths().sandbox_outputs_dir(tid)
outputs_dir = get_paths().sandbox_outputs_dir(tid, user_id=get_effective_user_id())
outputs_dir.mkdir(parents=True, exist_ok=True)
(outputs_dir / "result.txt").write_text("hello artifact")
@ -488,11 +490,12 @@ class TestArtifactAccess:
def test_get_artifact_nested_path(self, e2e_env):
"""Artifacts in subdirectories are accessible."""
from deerflow.config.paths import get_paths
from deerflow.runtime.user_context import get_effective_user_id
c = DeerFlowClient(checkpointer=None, thinking_enabled=False)
tid = str(uuid.uuid4())
outputs_dir = get_paths().sandbox_outputs_dir(tid)
outputs_dir = get_paths().sandbox_outputs_dir(tid, user_id=get_effective_user_id())
sub = outputs_dir / "charts"
sub.mkdir(parents=True, exist_ok=True)
(sub / "data.json").write_text('{"x": 1}')

View File

@ -152,8 +152,10 @@ def test_get_work_dir_uses_base_dir_when_no_thread_id(monkeypatch, tmp_path):
def test_get_work_dir_uses_per_thread_path_when_thread_id_given(monkeypatch, tmp_path):
"""P1.1: _get_work_dir(thread_id) uses {base_dir}/threads/{thread_id}/acp-workspace/."""
from deerflow.config import paths as paths_module
from deerflow.runtime import user_context as uc_module
monkeypatch.setattr(paths_module, "get_paths", lambda: paths_module.Paths(base_dir=tmp_path))
monkeypatch.setattr(uc_module, "get_effective_user_id", lambda: None)
result = _get_work_dir("thread-abc-123")
expected = tmp_path / "threads" / "thread-abc-123" / "acp-workspace"
assert result == str(expected)
@ -310,8 +312,10 @@ async def test_invoke_acp_agent_uses_fixed_acp_workspace(monkeypatch, tmp_path):
async def test_invoke_acp_agent_uses_per_thread_workspace_when_thread_id_in_config(monkeypatch, tmp_path):
"""P1.1: When thread_id is in the RunnableConfig, ACP agent uses per-thread workspace."""
from deerflow.config import paths as paths_module
from deerflow.runtime import user_context as uc_module
monkeypatch.setattr(paths_module, "get_paths", lambda: paths_module.Paths(base_dir=tmp_path))
monkeypatch.setattr(uc_module, "get_effective_user_id", lambda: None)
monkeypatch.setattr(
"deerflow.config.extensions_config.ExtensionsConfig.from_file",

View File

@ -48,6 +48,7 @@ def test_process_queue_forwards_correction_flag_to_updater() -> None:
agent_name="lead_agent",
correction_detected=True,
reinforcement_detected=False,
user_id=None,
)
@ -88,4 +89,5 @@ def test_process_queue_forwards_reinforcement_flag_to_updater() -> None:
agent_name="lead_agent",
correction_detected=False,
reinforcement_detected=True,
user_id=None,
)

View File

@ -0,0 +1,38 @@
"""Tests for user_id propagation through memory queue."""
from unittest.mock import MagicMock, patch
from deerflow.agents.memory.queue import ConversationContext, MemoryUpdateQueue
def test_conversation_context_has_user_id():
ctx = ConversationContext(thread_id="t1", messages=[], user_id="alice")
assert ctx.user_id == "alice"
def test_conversation_context_user_id_default_none():
ctx = ConversationContext(thread_id="t1", messages=[])
assert ctx.user_id is None
def test_queue_add_stores_user_id():
q = MemoryUpdateQueue()
with patch.object(q, "_reset_timer"):
q.add(thread_id="t1", messages=["msg"], user_id="alice")
assert len(q._queue) == 1
assert q._queue[0].user_id == "alice"
q.clear()
def test_queue_process_passes_user_id_to_updater():
q = MemoryUpdateQueue()
with patch.object(q, "_reset_timer"):
q.add(thread_id="t1", messages=["msg"], user_id="alice")
mock_updater = MagicMock()
mock_updater.update_memory.return_value = True
with patch("deerflow.agents.memory.updater.MemoryUpdater", return_value=mock_updater):
q._process_queue()
mock_updater.update_memory.assert_called_once()
call_kwargs = mock_updater.update_memory.call_args.kwargs
assert call_kwargs["user_id"] == "alice"

View File

@ -258,12 +258,13 @@ def test_update_memory_fact_route_preserves_omitted_fields() -> None:
)
assert response.status_code == 200
update_fact.assert_called_once_with(
fact_id="fact_edit",
content="User prefers spaces",
category=None,
confidence=None,
)
assert update_fact.call_count == 1
call_kwargs = update_fact.call_args.kwargs
assert call_kwargs.get("fact_id") == "fact_edit"
assert call_kwargs.get("content") == "User prefers spaces"
assert call_kwargs.get("category") is None
assert call_kwargs.get("confidence") is None
assert "user_id" in call_kwargs
assert response.json()["facts"] == updated_memory["facts"]

View File

@ -0,0 +1,150 @@
"""Tests for per-user memory storage isolation."""
import pytest
from pathlib import Path
from unittest.mock import patch
from deerflow.agents.memory.storage import FileMemoryStorage, create_empty_memory
@pytest.fixture
def base_dir(tmp_path: Path) -> Path:
return tmp_path
@pytest.fixture
def storage() -> FileMemoryStorage:
return FileMemoryStorage()
class TestUserIsolatedStorage:
def test_save_and_load_per_user(self, storage: FileMemoryStorage, base_dir: Path):
from deerflow.config.paths import Paths
paths = Paths(base_dir)
with patch("deerflow.agents.memory.storage.get_paths", return_value=paths):
memory_a = create_empty_memory()
memory_a["user"]["workContext"]["summary"] = "User A context"
storage.save(memory_a, user_id="alice")
memory_b = create_empty_memory()
memory_b["user"]["workContext"]["summary"] = "User B context"
storage.save(memory_b, user_id="bob")
loaded_a = storage.load(user_id="alice")
loaded_b = storage.load(user_id="bob")
assert loaded_a["user"]["workContext"]["summary"] == "User A context"
assert loaded_b["user"]["workContext"]["summary"] == "User B context"
def test_user_memory_file_location(self, base_dir: Path):
from deerflow.config.paths import Paths
paths = Paths(base_dir)
with patch("deerflow.agents.memory.storage.get_paths", return_value=paths):
s = FileMemoryStorage()
memory = create_empty_memory()
s.save(memory, user_id="alice")
expected_path = base_dir / "users" / "alice" / "memory.json"
assert expected_path.exists()
def test_cache_isolated_per_user(self, base_dir: Path):
from deerflow.config.paths import Paths
paths = Paths(base_dir)
with patch("deerflow.agents.memory.storage.get_paths", return_value=paths):
s = FileMemoryStorage()
memory_a = create_empty_memory()
memory_a["user"]["workContext"]["summary"] = "A"
s.save(memory_a, user_id="alice")
memory_b = create_empty_memory()
memory_b["user"]["workContext"]["summary"] = "B"
s.save(memory_b, user_id="bob")
loaded_a = s.load(user_id="alice")
assert loaded_a["user"]["workContext"]["summary"] == "A"
def test_no_user_id_uses_legacy_path(self, base_dir: Path):
from deerflow.config.paths import Paths
from deerflow.config.memory_config import MemoryConfig
paths = Paths(base_dir)
with patch("deerflow.agents.memory.storage.get_paths", return_value=paths):
with patch("deerflow.agents.memory.storage.get_memory_config", return_value=MemoryConfig(storage_path="")):
s = FileMemoryStorage()
memory = create_empty_memory()
s.save(memory, user_id=None)
expected_path = base_dir / "memory.json"
assert expected_path.exists()
def test_user_and_legacy_do_not_interfere(self, base_dir: Path):
"""user_id=None (legacy) and user_id='alice' must use different files and caches."""
from deerflow.config.paths import Paths
from deerflow.config.memory_config import MemoryConfig
paths = Paths(base_dir)
with patch("deerflow.agents.memory.storage.get_paths", return_value=paths):
with patch("deerflow.agents.memory.storage.get_memory_config", return_value=MemoryConfig(storage_path="")):
s = FileMemoryStorage()
legacy_mem = create_empty_memory()
legacy_mem["user"]["workContext"]["summary"] = "legacy"
s.save(legacy_mem, user_id=None)
user_mem = create_empty_memory()
user_mem["user"]["workContext"]["summary"] = "alice"
s.save(user_mem, user_id="alice")
assert s.load(user_id=None)["user"]["workContext"]["summary"] == "legacy"
assert s.load(user_id="alice")["user"]["workContext"]["summary"] == "alice"
def test_user_agent_memory_file_location(self, base_dir: Path):
"""Per-user per-agent memory uses the user_agent_memory_file path."""
from deerflow.config.paths import Paths
paths = Paths(base_dir)
with patch("deerflow.agents.memory.storage.get_paths", return_value=paths):
s = FileMemoryStorage()
memory = create_empty_memory()
memory["user"]["workContext"]["summary"] = "agent scoped"
s.save(memory, "test-agent", user_id="alice")
expected_path = base_dir / "users" / "alice" / "agents" / "test-agent" / "memory.json"
assert expected_path.exists()
def test_cache_key_is_user_agent_tuple(self, base_dir: Path):
"""Cache keys must be (user_id, agent_name) tuples, not bare agent names."""
from deerflow.config.paths import Paths
paths = Paths(base_dir)
with patch("deerflow.agents.memory.storage.get_paths", return_value=paths):
s = FileMemoryStorage()
memory = create_empty_memory()
s.save(memory, user_id="alice")
# After save, cache should have tuple key
assert ("alice", None) in s._memory_cache
def test_reload_with_user_id(self, base_dir: Path):
"""reload() with user_id should force re-read from the user-scoped file."""
from deerflow.config.paths import Paths
paths = Paths(base_dir)
with patch("deerflow.agents.memory.storage.get_paths", return_value=paths):
s = FileMemoryStorage()
memory = create_empty_memory()
memory["user"]["workContext"]["summary"] = "initial"
s.save(memory, user_id="alice")
# Load once to prime cache
s.load(user_id="alice")
# Write updated content directly to file
user_file = base_dir / "users" / "alice" / "memory.json"
import json
updated = create_empty_memory()
updated["user"]["workContext"]["summary"] = "updated"
user_file.write_text(json.dumps(updated))
# reload should pick up the new content
reloaded = s.reload(user_id="alice")
assert reloaded["user"]["workContext"]["summary"] == "updated"

View File

@ -301,8 +301,8 @@ def test_import_memory_data_saves_and_returns_imported_memory() -> None:
with patch("deerflow.agents.memory.updater.get_memory_storage", return_value=mock_storage):
result = import_memory_data(imported_memory)
mock_storage.save.assert_called_once_with(imported_memory, None)
mock_storage.load.assert_called_once_with(None)
mock_storage.save.assert_called_once_with(imported_memory, None, user_id=None)
mock_storage.load.assert_called_once_with(None, user_id=None)
assert result == imported_memory

View File

@ -0,0 +1,29 @@
"""Tests for user_id propagation in memory updater."""
from unittest.mock import MagicMock, patch
from deerflow.agents.memory.updater import get_memory_data, clear_memory_data, _save_memory_to_file
def test_get_memory_data_passes_user_id():
mock_storage = MagicMock()
mock_storage.load.return_value = {"version": "1.0"}
with patch("deerflow.agents.memory.updater.get_memory_storage", return_value=mock_storage):
get_memory_data(user_id="alice")
mock_storage.load.assert_called_once_with(None, user_id="alice")
def test_save_memory_passes_user_id():
mock_storage = MagicMock()
mock_storage.save.return_value = True
with patch("deerflow.agents.memory.updater.get_memory_storage", return_value=mock_storage):
_save_memory_to_file({"version": "1.0"}, user_id="bob")
mock_storage.save.assert_called_once_with({"version": "1.0"}, None, user_id="bob")
def test_clear_memory_data_passes_user_id():
mock_storage = MagicMock()
mock_storage.save.return_value = True
with patch("deerflow.agents.memory.updater.get_memory_storage", return_value=mock_storage):
clear_memory_data(user_id="charlie")
# Verify save was called with user_id
assert mock_storage.save.call_args.kwargs["user_id"] == "charlie"

View File

@ -0,0 +1,116 @@
"""Tests for per-user data migration."""
import json
import pytest
from pathlib import Path
from deerflow.config.paths import Paths
@pytest.fixture
def base_dir(tmp_path: Path) -> Path:
return tmp_path
@pytest.fixture
def paths(base_dir: Path) -> Paths:
return Paths(base_dir)
class TestMigrateThreadDirs:
def test_moves_thread_to_user_dir(self, base_dir: Path, paths: Paths):
legacy = base_dir / "threads" / "t1" / "user-data" / "workspace"
legacy.mkdir(parents=True)
(legacy / "file.txt").write_text("hello")
from scripts.migrate_user_isolation import migrate_thread_dirs
migrate_thread_dirs(paths, thread_owner_map={"t1": "alice"})
expected = base_dir / "users" / "alice" / "threads" / "t1" / "user-data" / "workspace" / "file.txt"
assert expected.exists()
assert expected.read_text() == "hello"
assert not (base_dir / "threads" / "t1").exists()
def test_unowned_thread_goes_to_default(self, base_dir: Path, paths: Paths):
legacy = base_dir / "threads" / "t2" / "user-data" / "workspace"
legacy.mkdir(parents=True)
from scripts.migrate_user_isolation import migrate_thread_dirs
migrate_thread_dirs(paths, thread_owner_map={})
expected = base_dir / "users" / "default" / "threads" / "t2"
assert expected.exists()
def test_idempotent_skip_already_migrated(self, base_dir: Path, paths: Paths):
new_dir = base_dir / "users" / "alice" / "threads" / "t1" / "user-data" / "workspace"
new_dir.mkdir(parents=True)
from scripts.migrate_user_isolation import migrate_thread_dirs
migrate_thread_dirs(paths, thread_owner_map={"t1": "alice"})
assert new_dir.exists()
def test_conflict_preserved(self, base_dir: Path, paths: Paths):
legacy = base_dir / "threads" / "t1" / "user-data" / "workspace"
legacy.mkdir(parents=True)
(legacy / "old.txt").write_text("old")
dest = base_dir / "users" / "alice" / "threads" / "t1" / "user-data" / "workspace"
dest.mkdir(parents=True)
(dest / "new.txt").write_text("new")
from scripts.migrate_user_isolation import migrate_thread_dirs
migrate_thread_dirs(paths, thread_owner_map={"t1": "alice"})
assert (dest / "new.txt").read_text() == "new"
conflicts = base_dir / "migration-conflicts" / "t1"
assert conflicts.exists()
def test_cleans_up_empty_legacy_dir(self, base_dir: Path, paths: Paths):
legacy = base_dir / "threads" / "t1" / "user-data"
legacy.mkdir(parents=True)
from scripts.migrate_user_isolation import migrate_thread_dirs
migrate_thread_dirs(paths, thread_owner_map={})
assert not (base_dir / "threads").exists()
def test_dry_run_does_not_move(self, base_dir: Path, paths: Paths):
legacy = base_dir / "threads" / "t1" / "user-data"
legacy.mkdir(parents=True)
from scripts.migrate_user_isolation import migrate_thread_dirs
report = migrate_thread_dirs(paths, thread_owner_map={"t1": "alice"}, dry_run=True)
assert len(report) == 1
assert (base_dir / "threads" / "t1").exists() # not moved
assert not (base_dir / "users" / "alice" / "threads" / "t1").exists()
class TestMigrateMemory:
def test_moves_global_memory(self, base_dir: Path, paths: Paths):
legacy_mem = base_dir / "memory.json"
legacy_mem.write_text(json.dumps({"version": "1.0", "facts": []}))
from scripts.migrate_user_isolation import migrate_memory
migrate_memory(paths, user_id="default")
expected = base_dir / "users" / "default" / "memory.json"
assert expected.exists()
assert not legacy_mem.exists()
def test_skips_if_destination_exists(self, base_dir: Path, paths: Paths):
legacy_mem = base_dir / "memory.json"
legacy_mem.write_text(json.dumps({"version": "old"}))
dest = base_dir / "users" / "default" / "memory.json"
dest.parent.mkdir(parents=True)
dest.write_text(json.dumps({"version": "new"}))
from scripts.migrate_user_isolation import migrate_memory
migrate_memory(paths, user_id="default")
assert json.loads(dest.read_text())["version"] == "new"
assert (base_dir / "memory.legacy.json").exists()
def test_no_legacy_memory_is_noop(self, base_dir: Path, paths: Paths):
from scripts.migrate_user_isolation import migrate_memory
migrate_memory(paths, user_id="default") # should not raise

View File

@ -0,0 +1,167 @@
"""Tests for user-scoped path resolution in Paths."""
import pytest
from pathlib import Path
from deerflow.config.paths import Paths
@pytest.fixture
def paths(tmp_path: Path) -> Paths:
return Paths(tmp_path)
class TestValidateUserId:
def test_valid_user_id(self, paths: Paths):
d = paths.user_dir("u-abc-123")
assert d == paths.base_dir / "users" / "u-abc-123"
def test_rejects_path_traversal(self, paths: Paths):
with pytest.raises(ValueError, match="Invalid user_id"):
paths.user_dir("../escape")
def test_rejects_slash(self, paths: Paths):
with pytest.raises(ValueError, match="Invalid user_id"):
paths.user_dir("foo/bar")
def test_rejects_empty(self, paths: Paths):
with pytest.raises(ValueError, match="Invalid user_id"):
paths.user_dir("")
class TestUserDir:
def test_user_dir(self, paths: Paths):
assert paths.user_dir("alice") == paths.base_dir / "users" / "alice"
class TestUserMemoryFile:
def test_user_memory_file(self, paths: Paths):
assert paths.user_memory_file("bob") == paths.base_dir / "users" / "bob" / "memory.json"
class TestUserAgentMemoryFile:
def test_user_agent_memory_file(self, paths: Paths):
expected = paths.base_dir / "users" / "bob" / "agents" / "myagent" / "memory.json"
assert paths.user_agent_memory_file("bob", "myagent") == expected
def test_user_agent_memory_file_lowercases_name(self, paths: Paths):
expected = paths.base_dir / "users" / "bob" / "agents" / "myagent" / "memory.json"
assert paths.user_agent_memory_file("bob", "MyAgent") == expected
class TestUserThreadDir:
def test_user_thread_dir(self, paths: Paths):
expected = paths.base_dir / "users" / "u1" / "threads" / "t1"
assert paths.thread_dir("t1", user_id="u1") == expected
def test_thread_dir_no_user_id_falls_back_to_legacy(self, paths: Paths):
expected = paths.base_dir / "threads" / "t1"
assert paths.thread_dir("t1") == expected
class TestUserSandboxDirs:
def test_sandbox_work_dir(self, paths: Paths):
expected = paths.base_dir / "users" / "u1" / "threads" / "t1" / "user-data" / "workspace"
assert paths.sandbox_work_dir("t1", user_id="u1") == expected
def test_sandbox_uploads_dir(self, paths: Paths):
expected = paths.base_dir / "users" / "u1" / "threads" / "t1" / "user-data" / "uploads"
assert paths.sandbox_uploads_dir("t1", user_id="u1") == expected
def test_sandbox_outputs_dir(self, paths: Paths):
expected = paths.base_dir / "users" / "u1" / "threads" / "t1" / "user-data" / "outputs"
assert paths.sandbox_outputs_dir("t1", user_id="u1") == expected
def test_sandbox_user_data_dir(self, paths: Paths):
expected = paths.base_dir / "users" / "u1" / "threads" / "t1" / "user-data"
assert paths.sandbox_user_data_dir("t1", user_id="u1") == expected
def test_acp_workspace_dir(self, paths: Paths):
expected = paths.base_dir / "users" / "u1" / "threads" / "t1" / "acp-workspace"
assert paths.acp_workspace_dir("t1", user_id="u1") == expected
def test_legacy_sandbox_work_dir(self, paths: Paths):
expected = paths.base_dir / "threads" / "t1" / "user-data" / "workspace"
assert paths.sandbox_work_dir("t1") == expected
class TestHostPathsWithUserId:
def test_host_thread_dir_with_user_id(self, paths: Paths):
result = paths.host_thread_dir("t1", user_id="u1")
assert "users" in result
assert "u1" in result
assert "threads" in result
assert "t1" in result
def test_host_thread_dir_legacy(self, paths: Paths):
result = paths.host_thread_dir("t1")
assert "threads" in result
assert "t1" in result
assert "users" not in result
def test_host_sandbox_user_data_dir_with_user_id(self, paths: Paths):
result = paths.host_sandbox_user_data_dir("t1", user_id="u1")
assert "users" in result
assert "user-data" in result
def test_host_sandbox_work_dir_with_user_id(self, paths: Paths):
result = paths.host_sandbox_work_dir("t1", user_id="u1")
assert "workspace" in result
def test_host_sandbox_uploads_dir_with_user_id(self, paths: Paths):
result = paths.host_sandbox_uploads_dir("t1", user_id="u1")
assert "uploads" in result
def test_host_sandbox_outputs_dir_with_user_id(self, paths: Paths):
result = paths.host_sandbox_outputs_dir("t1", user_id="u1")
assert "outputs" in result
def test_host_acp_workspace_dir_with_user_id(self, paths: Paths):
result = paths.host_acp_workspace_dir("t1", user_id="u1")
assert "acp-workspace" in result
class TestEnsureAndDeleteWithUserId:
def test_ensure_thread_dirs_creates_user_scoped(self, paths: Paths):
paths.ensure_thread_dirs("t1", user_id="u1")
assert paths.sandbox_work_dir("t1", user_id="u1").is_dir()
assert paths.sandbox_uploads_dir("t1", user_id="u1").is_dir()
assert paths.sandbox_outputs_dir("t1", user_id="u1").is_dir()
assert paths.acp_workspace_dir("t1", user_id="u1").is_dir()
def test_delete_thread_dir_removes_user_scoped(self, paths: Paths):
paths.ensure_thread_dirs("t1", user_id="u1")
assert paths.thread_dir("t1", user_id="u1").exists()
paths.delete_thread_dir("t1", user_id="u1")
assert not paths.thread_dir("t1", user_id="u1").exists()
def test_delete_thread_dir_idempotent(self, paths: Paths):
paths.delete_thread_dir("nonexistent", user_id="u1") # should not raise
def test_ensure_thread_dirs_legacy_still_works(self, paths: Paths):
paths.ensure_thread_dirs("t1")
assert paths.sandbox_work_dir("t1").is_dir()
def test_user_scoped_and_legacy_are_independent(self, paths: Paths):
paths.ensure_thread_dirs("t1", user_id="u1")
paths.ensure_thread_dirs("t1")
# Both exist independently
assert paths.thread_dir("t1", user_id="u1").exists()
assert paths.thread_dir("t1").exists()
# Delete one doesn't affect the other
paths.delete_thread_dir("t1", user_id="u1")
assert not paths.thread_dir("t1", user_id="u1").exists()
assert paths.thread_dir("t1").exists()
class TestResolveVirtualPathWithUserId:
def test_resolve_virtual_path_with_user_id(self, paths: Paths):
paths.ensure_thread_dirs("t1", user_id="u1")
result = paths.resolve_virtual_path("t1", "/mnt/user-data/workspace/file.txt", user_id="u1")
expected_base = paths.sandbox_user_data_dir("t1", user_id="u1").resolve()
assert str(result).startswith(str(expected_base))
def test_resolve_virtual_path_legacy(self, paths: Paths):
paths.ensure_thread_dirs("t1")
result = paths.resolve_virtual_path("t1", "/mnt/user-data/workspace/file.txt")
expected_base = paths.sandbox_user_data_dir("t1").resolve()
assert str(result).startswith(str(expected_base))

View File

@ -38,7 +38,7 @@ def test_present_files_keeps_virtual_outputs_path(tmp_path, monkeypatch):
monkeypatch.setattr(
present_file_tool_module,
"get_paths",
lambda: SimpleNamespace(resolve_virtual_path=lambda thread_id, path: artifact_path),
lambda: SimpleNamespace(resolve_virtual_path=lambda thread_id, path, *, user_id=None: artifact_path),
)
result = present_file_tool_module.present_file_tool.func(

View File

@ -0,0 +1,107 @@
"""Tests for paginated list_messages_by_run across all RunEventStore backends."""
import pytest
from deerflow.runtime.events.store.memory import MemoryRunEventStore
@pytest.fixture
def base_store():
return MemoryRunEventStore()
@pytest.mark.anyio
async def test_list_messages_by_run_default_returns_all(base_store):
store = base_store
for i in range(7):
await store.put(
thread_id="t1", run_id="run-a",
event_type="human_message" if i % 2 == 0 else "ai_message",
category="message", content=f"msg-a-{i}",
)
for i in range(3):
await store.put(
thread_id="t1", run_id="run-b",
event_type="human_message", category="message", content=f"msg-b-{i}",
)
await store.put(thread_id="t1", run_id="run-a", event_type="tool_call", category="trace", content="trace")
msgs = await store.list_messages_by_run("t1", "run-a")
assert len(msgs) == 7
assert all(m["category"] == "message" for m in msgs)
assert all(m["run_id"] == "run-a" for m in msgs)
@pytest.mark.anyio
async def test_list_messages_by_run_with_limit(base_store):
store = base_store
for i in range(7):
await store.put(
thread_id="t1", run_id="run-a",
event_type="human_message" if i % 2 == 0 else "ai_message",
category="message", content=f"msg-a-{i}",
)
msgs = await store.list_messages_by_run("t1", "run-a", limit=3)
assert len(msgs) == 3
seqs = [m["seq"] for m in msgs]
assert seqs == sorted(seqs)
@pytest.mark.anyio
async def test_list_messages_by_run_after_seq(base_store):
store = base_store
for i in range(7):
await store.put(
thread_id="t1", run_id="run-a",
event_type="human_message" if i % 2 == 0 else "ai_message",
category="message", content=f"msg-a-{i}",
)
all_msgs = await store.list_messages_by_run("t1", "run-a")
cursor_seq = all_msgs[2]["seq"]
msgs = await store.list_messages_by_run("t1", "run-a", after_seq=cursor_seq, limit=50)
assert all(m["seq"] > cursor_seq for m in msgs)
assert len(msgs) == 4
@pytest.mark.anyio
async def test_list_messages_by_run_before_seq(base_store):
store = base_store
for i in range(7):
await store.put(
thread_id="t1", run_id="run-a",
event_type="human_message" if i % 2 == 0 else "ai_message",
category="message", content=f"msg-a-{i}",
)
all_msgs = await store.list_messages_by_run("t1", "run-a")
cursor_seq = all_msgs[4]["seq"]
msgs = await store.list_messages_by_run("t1", "run-a", before_seq=cursor_seq, limit=50)
assert all(m["seq"] < cursor_seq for m in msgs)
assert len(msgs) == 4
@pytest.mark.anyio
async def test_list_messages_by_run_does_not_include_other_run(base_store):
store = base_store
for i in range(7):
await store.put(
thread_id="t1", run_id="run-a",
event_type="human_message", category="message", content=f"msg-a-{i}",
)
for i in range(3):
await store.put(
thread_id="t1", run_id="run-b",
event_type="human_message", category="message", content=f"msg-b-{i}",
)
msgs = await store.list_messages_by_run("t1", "run-b")
assert len(msgs) == 3
assert all(m["run_id"] == "run-b" for m in msgs)
@pytest.mark.anyio
async def test_list_messages_by_run_empty_run(base_store):
store = base_store
msgs = await store.list_messages_by_run("t1", "nonexistent")
assert msgs == []

View File

@ -0,0 +1,243 @@
"""Tests for GET /api/runs/{run_id}/messages and GET /api/runs/{run_id}/feedback endpoints."""
from __future__ import annotations
from unittest.mock import AsyncMock, MagicMock, patch
import pytest
from _router_auth_helpers import make_authed_test_app
from fastapi.testclient import TestClient
from app.gateway.routers import runs
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------
def _make_app(run_store=None, event_store=None, feedback_repo=None):
"""Build a test FastAPI app with stub auth and mocked state."""
app = make_authed_test_app()
app.include_router(runs.router)
if run_store is not None:
app.state.run_store = run_store
if event_store is not None:
app.state.run_event_store = event_store
if feedback_repo is not None:
app.state.feedback_repo = feedback_repo
return app
def _make_run_store(run_record: dict | None):
"""Return an AsyncMock run store whose get() returns run_record."""
store = MagicMock()
store.get = AsyncMock(return_value=run_record)
return store
def _make_event_store(rows: list[dict]):
"""Return an AsyncMock event store whose list_messages_by_run() returns rows."""
store = MagicMock()
store.list_messages_by_run = AsyncMock(return_value=rows)
return store
def _make_message(seq: int) -> dict:
return {"seq": seq, "event_type": "on_chat_model_stream", "category": "message", "content": f"msg-{seq}"}
# ---------------------------------------------------------------------------
# Tests
# ---------------------------------------------------------------------------
def test_run_messages_returns_envelope():
"""GET /api/runs/{run_id}/messages returns {data: [...], has_more: bool}."""
rows = [_make_message(i) for i in range(1, 4)]
run_record = {"run_id": "run-1", "thread_id": "thread-1"}
app = _make_app(
run_store=_make_run_store(run_record),
event_store=_make_event_store(rows),
)
with TestClient(app) as client:
response = client.get("/api/runs/run-1/messages")
assert response.status_code == 200
body = response.json()
assert "data" in body
assert "has_more" in body
assert body["has_more"] is False
assert len(body["data"]) == 3
def test_run_messages_404_when_run_not_found():
"""Returns 404 when the run store returns None."""
app = _make_app(
run_store=_make_run_store(None),
event_store=_make_event_store([]),
)
with TestClient(app) as client:
response = client.get("/api/runs/missing-run/messages")
assert response.status_code == 404
assert "missing-run" in response.json()["detail"]
def test_run_messages_has_more_true_when_extra_row_returned():
"""has_more=True when event store returns limit+1 rows."""
# Default limit is 50; provide 51 rows
rows = [_make_message(i) for i in range(1, 52)] # 51 rows
run_record = {"run_id": "run-2", "thread_id": "thread-2"}
app = _make_app(
run_store=_make_run_store(run_record),
event_store=_make_event_store(rows),
)
with TestClient(app) as client:
response = client.get("/api/runs/run-2/messages")
assert response.status_code == 200
body = response.json()
assert body["has_more"] is True
assert len(body["data"]) == 50 # trimmed to limit
def test_run_messages_passes_after_seq_to_event_store():
"""after_seq query param is forwarded to event_store.list_messages_by_run."""
rows = [_make_message(10)]
run_record = {"run_id": "run-3", "thread_id": "thread-3"}
event_store = _make_event_store(rows)
app = _make_app(
run_store=_make_run_store(run_record),
event_store=event_store,
)
with TestClient(app) as client:
response = client.get("/api/runs/run-3/messages?after_seq=5")
assert response.status_code == 200
event_store.list_messages_by_run.assert_awaited_once_with(
"thread-3", "run-3",
limit=51, # default limit(50) + 1
before_seq=None,
after_seq=5,
)
def test_run_messages_respects_custom_limit():
"""Custom limit is respected and capped at 200."""
rows = [_make_message(i) for i in range(1, 6)]
run_record = {"run_id": "run-4", "thread_id": "thread-4"}
event_store = _make_event_store(rows)
app = _make_app(
run_store=_make_run_store(run_record),
event_store=event_store,
)
with TestClient(app) as client:
response = client.get("/api/runs/run-4/messages?limit=10")
assert response.status_code == 200
event_store.list_messages_by_run.assert_awaited_once_with(
"thread-4", "run-4",
limit=11, # 10 + 1
before_seq=None,
after_seq=None,
)
def test_run_messages_passes_before_seq_to_event_store():
"""before_seq query param is forwarded to event_store.list_messages_by_run."""
rows = [_make_message(3)]
run_record = {"run_id": "run-5", "thread_id": "thread-5"}
event_store = _make_event_store(rows)
app = _make_app(
run_store=_make_run_store(run_record),
event_store=event_store,
)
with TestClient(app) as client:
response = client.get("/api/runs/run-5/messages?before_seq=10")
assert response.status_code == 200
event_store.list_messages_by_run.assert_awaited_once_with(
"thread-5", "run-5",
limit=51,
before_seq=10,
after_seq=None,
)
def test_run_messages_empty_data():
"""Returns empty data list when no messages exist."""
run_record = {"run_id": "run-6", "thread_id": "thread-6"}
app = _make_app(
run_store=_make_run_store(run_record),
event_store=_make_event_store([]),
)
with TestClient(app) as client:
response = client.get("/api/runs/run-6/messages")
assert response.status_code == 200
body = response.json()
assert body["data"] == []
assert body["has_more"] is False
def _make_feedback_repo(rows: list[dict]):
"""Return an AsyncMock feedback repo whose list_by_run() returns rows."""
repo = MagicMock()
repo.list_by_run = AsyncMock(return_value=rows)
return repo
def _make_feedback(run_id: str, idx: int) -> dict:
return {"id": f"fb-{idx}", "run_id": run_id, "thread_id": "thread-x", "value": "up"}
# ---------------------------------------------------------------------------
# TestRunFeedback
# ---------------------------------------------------------------------------
class TestRunFeedback:
def test_returns_list_of_feedback_dicts(self):
"""GET /api/runs/{run_id}/feedback returns a list of feedback dicts."""
run_record = {"run_id": "run-fb-1", "thread_id": "thread-fb-1"}
rows = [_make_feedback("run-fb-1", i) for i in range(3)]
app = _make_app(
run_store=_make_run_store(run_record),
feedback_repo=_make_feedback_repo(rows),
)
with TestClient(app) as client:
response = client.get("/api/runs/run-fb-1/feedback")
assert response.status_code == 200
body = response.json()
assert isinstance(body, list)
assert len(body) == 3
def test_404_when_run_not_found(self):
"""Returns 404 when run store returns None."""
app = _make_app(
run_store=_make_run_store(None),
feedback_repo=_make_feedback_repo([]),
)
with TestClient(app) as client:
response = client.get("/api/runs/missing-run/feedback")
assert response.status_code == 404
assert "missing-run" in response.json()["detail"]
def test_empty_list_when_no_feedback(self):
"""Returns empty list when no feedback exists for the run."""
run_record = {"run_id": "run-fb-2", "thread_id": "thread-fb-2"}
app = _make_app(
run_store=_make_run_store(run_record),
feedback_repo=_make_feedback_repo([]),
)
with TestClient(app) as client:
response = client.get("/api/runs/run-fb-2/feedback")
assert response.status_code == 200
assert response.json() == []
def test_503_when_feedback_repo_not_configured(self):
"""Returns 503 when feedback_repo is None (no DB configured)."""
run_record = {"run_id": "run-fb-3", "thread_id": "thread-fb-3"}
app = _make_app(
run_store=_make_run_store(run_record),
)
# Explicitly set feedback_repo to None to simulate missing DB
app.state.feedback_repo = None
with TestClient(app) as client:
response = client.get("/api/runs/run-fb-3/feedback")
assert response.status_code == 503

View File

@ -0,0 +1,128 @@
"""Tests for paginated GET /api/threads/{thread_id}/runs/{run_id}/messages endpoint."""
from __future__ import annotations
from unittest.mock import AsyncMock, MagicMock
import pytest
from _router_auth_helpers import make_authed_test_app
from fastapi.testclient import TestClient
from app.gateway.routers import thread_runs
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------
def _make_app(event_store=None):
"""Build a test FastAPI app with stub auth and mocked state."""
app = make_authed_test_app()
app.include_router(thread_runs.router)
if event_store is not None:
app.state.run_event_store = event_store
return app
def _make_event_store(rows: list[dict]):
"""Return an AsyncMock event store whose list_messages_by_run() returns rows."""
store = MagicMock()
store.list_messages_by_run = AsyncMock(return_value=rows)
return store
def _make_message(seq: int) -> dict:
return {"seq": seq, "event_type": "ai_message", "category": "message", "content": f"msg-{seq}"}
# ---------------------------------------------------------------------------
# Tests
# ---------------------------------------------------------------------------
def test_returns_paginated_envelope():
"""GET /api/threads/{tid}/runs/{rid}/messages returns {data: [...], has_more: bool}."""
rows = [_make_message(i) for i in range(1, 4)]
app = _make_app(event_store=_make_event_store(rows))
with TestClient(app) as client:
response = client.get("/api/threads/thread-1/runs/run-1/messages")
assert response.status_code == 200
body = response.json()
assert "data" in body
assert "has_more" in body
assert body["has_more"] is False
assert len(body["data"]) == 3
def test_has_more_true_when_extra_row_returned():
"""has_more=True when event store returns limit+1 rows."""
# Default limit is 50; provide 51 rows
rows = [_make_message(i) for i in range(1, 52)] # 51 rows
app = _make_app(event_store=_make_event_store(rows))
with TestClient(app) as client:
response = client.get("/api/threads/thread-2/runs/run-2/messages")
assert response.status_code == 200
body = response.json()
assert body["has_more"] is True
assert len(body["data"]) == 50 # trimmed to limit
def test_after_seq_forwarded_to_event_store():
"""after_seq query param is forwarded to event_store.list_messages_by_run."""
rows = [_make_message(10)]
event_store = _make_event_store(rows)
app = _make_app(event_store=event_store)
with TestClient(app) as client:
response = client.get("/api/threads/thread-3/runs/run-3/messages?after_seq=5")
assert response.status_code == 200
event_store.list_messages_by_run.assert_awaited_once_with(
"thread-3", "run-3",
limit=51, # default limit(50) + 1
before_seq=None,
after_seq=5,
)
def test_before_seq_forwarded_to_event_store():
"""before_seq query param is forwarded to event_store.list_messages_by_run."""
rows = [_make_message(3)]
event_store = _make_event_store(rows)
app = _make_app(event_store=event_store)
with TestClient(app) as client:
response = client.get("/api/threads/thread-4/runs/run-4/messages?before_seq=10")
assert response.status_code == 200
event_store.list_messages_by_run.assert_awaited_once_with(
"thread-4", "run-4",
limit=51,
before_seq=10,
after_seq=None,
)
def test_custom_limit_forwarded_to_event_store():
"""Custom limit is forwarded as limit+1 to the event store."""
rows = [_make_message(i) for i in range(1, 6)]
event_store = _make_event_store(rows)
app = _make_app(event_store=event_store)
with TestClient(app) as client:
response = client.get("/api/threads/thread-5/runs/run-5/messages?limit=10")
assert response.status_code == 200
event_store.list_messages_by_run.assert_awaited_once_with(
"thread-5", "run-5",
limit=11, # 10 + 1
before_seq=None,
after_seq=None,
)
def test_empty_data_when_no_messages():
"""Returns empty data list with has_more=False when no messages exist."""
app = _make_app(event_store=_make_event_store([]))
with TestClient(app) as client:
response = client.get("/api/threads/thread-6/runs/run-6/messages")
assert response.status_code == 200
body = response.json()
assert body["data"] == []
assert body["has_more"] is False

View File

@ -1,439 +0,0 @@
"""Tests for event-store-backed message loading in thread state/history endpoints.
Covers the helper functions added to ``app/gateway/routers/threads.py``:
- ``_sanitize_legacy_command_repr`` extracts inner ToolMessage text from
legacy ``str(Command(...))`` strings captured before the ``journal.py``
fix for state-updating tools like ``present_files``.
- ``_get_event_store_messages`` loads the full message stream with full
pagination, copy-on-read id patching, legacy Command sanitization, and
a clean fallback to ``None`` when the event store is unavailable.
"""
from __future__ import annotations
import uuid
from types import SimpleNamespace
from typing import Any
import pytest
from app.gateway.routers.threads import (
_get_event_store_messages,
_sanitize_legacy_command_repr,
)
from deerflow.runtime.events.store.memory import MemoryRunEventStore
@pytest.fixture()
def event_store() -> MemoryRunEventStore:
return MemoryRunEventStore()
class _FakeFeedbackRepo:
"""Minimal ``FeedbackRepository`` stand-in that returns a configured map."""
def __init__(self, by_run: dict[str, dict] | None = None) -> None:
self._by_run = by_run or {}
async def list_by_thread_grouped(self, thread_id: str, *, user_id: str | None) -> dict[str, dict]:
return dict(self._by_run)
def _make_request(
event_store: MemoryRunEventStore,
feedback_repo: _FakeFeedbackRepo | None = None,
) -> Any:
"""Build a minimal FastAPI-like Request object.
``get_run_event_store(request)`` reads ``request.app.state.run_event_store``.
``get_feedback_repo(request)`` reads ``request.app.state.feedback_repo``.
``get_current_user`` is monkey-patched separately in tests that need it.
"""
state = SimpleNamespace(
run_event_store=event_store,
feedback_repo=feedback_repo or _FakeFeedbackRepo(),
)
app = SimpleNamespace(state=state)
return SimpleNamespace(app=app)
@pytest.fixture(autouse=True)
def _stub_current_user(monkeypatch):
"""Stub out ``get_current_user`` so tests don't need real auth context."""
import app.gateway.routers.threads as threads_mod
async def _fake(_request):
return None
monkeypatch.setattr(threads_mod, "get_current_user", _fake)
async def _seed_simple_run(store: MemoryRunEventStore, thread_id: str, run_id: str) -> None:
"""Seed one run: human + ai_tool_call + tool_result + final ai_message, plus a trace."""
await store.put(
thread_id=thread_id, run_id=run_id,
event_type="human_message", category="message",
content={
"type": "human", "id": None,
"content": [{"type": "text", "text": "hello"}],
"additional_kwargs": {}, "response_metadata": {}, "name": None,
},
)
await store.put(
thread_id=thread_id, run_id=run_id,
event_type="ai_tool_call", category="message",
content={
"type": "ai", "id": "lc_run--tc1",
"content": "",
"tool_calls": [{"name": "search", "args": {"q": "x"}, "id": "call_1", "type": "tool_call"}],
"invalid_tool_calls": [],
"additional_kwargs": {}, "response_metadata": {}, "name": None,
"usage_metadata": {"input_tokens": 10, "output_tokens": 5, "total_tokens": 15},
},
)
await store.put(
thread_id=thread_id, run_id=run_id,
event_type="tool_result", category="message",
content={
"type": "tool", "id": None,
"content": "results",
"tool_call_id": "call_1", "name": "search",
"artifact": None, "status": "success",
"additional_kwargs": {}, "response_metadata": {},
},
)
await store.put(
thread_id=thread_id, run_id=run_id,
event_type="ai_message", category="message",
content={
"type": "ai", "id": "lc_run--final1",
"content": "done",
"tool_calls": [], "invalid_tool_calls": [],
"additional_kwargs": {}, "response_metadata": {"finish_reason": "stop"}, "name": None,
"usage_metadata": {"input_tokens": 20, "output_tokens": 10, "total_tokens": 30},
},
)
# Non-message trace — must be filtered out.
await store.put(
thread_id=thread_id, run_id=run_id,
event_type="llm_request", category="trace",
content={"model": "test"},
)
class TestSanitizeLegacyCommandRepr:
def test_passthrough_non_string(self):
assert _sanitize_legacy_command_repr(None) is None
assert _sanitize_legacy_command_repr(42) == 42
assert _sanitize_legacy_command_repr([{"type": "text", "text": "x"}]) == [{"type": "text", "text": "x"}]
def test_passthrough_plain_string(self):
assert _sanitize_legacy_command_repr("Successfully presented files") == "Successfully presented files"
assert _sanitize_legacy_command_repr("") == ""
def test_extracts_inner_content_single_quotes(self):
legacy = (
"Command(update={'artifacts': ['/mnt/user-data/outputs/report.md'], "
"'messages': [ToolMessage(content='Successfully presented files', "
"tool_call_id='call_abc')]})"
)
assert _sanitize_legacy_command_repr(legacy) == "Successfully presented files"
def test_extracts_inner_content_double_quotes(self):
legacy = 'Command(update={"messages": [ToolMessage(content="ok", tool_call_id="x")]})'
assert _sanitize_legacy_command_repr(legacy) == "ok"
def test_unparseable_command_returns_original(self):
legacy = "Command(update={'something_else': 1})"
assert _sanitize_legacy_command_repr(legacy) == legacy
class TestGetEventStoreMessages:
@pytest.mark.anyio
async def test_returns_none_when_store_empty(self, event_store):
request = _make_request(event_store)
assert await _get_event_store_messages(request, "t_missing") is None
@pytest.mark.anyio
async def test_extracts_all_message_types_in_order(self, event_store):
await _seed_simple_run(event_store, "t1", "r1")
request = _make_request(event_store)
messages = await _get_event_store_messages(request, "t1")
assert messages is not None
types = [m["type"] for m in messages]
assert types == ["human", "ai", "tool", "ai"]
# Trace events must not appear
for m in messages:
assert m.get("type") in {"human", "ai", "tool"}
@pytest.mark.anyio
async def test_null_ids_get_deterministic_uuid5(self, event_store):
await _seed_simple_run(event_store, "t1", "r1")
request = _make_request(event_store)
messages = await _get_event_store_messages(request, "t1")
assert messages is not None
# AI messages keep their LLM ids
assert messages[1]["id"] == "lc_run--tc1"
assert messages[3]["id"] == "lc_run--final1"
# Human (seq=1) + tool (seq=3) get deterministic uuid5
expected_human_id = str(uuid.uuid5(uuid.NAMESPACE_URL, "t1:1"))
expected_tool_id = str(uuid.uuid5(uuid.NAMESPACE_URL, "t1:3"))
assert messages[0]["id"] == expected_human_id
assert messages[2]["id"] == expected_tool_id
# Re-running produces the same ids (stability across requests)
messages2 = await _get_event_store_messages(request, "t1")
assert [m["id"] for m in messages2] == [m["id"] for m in messages]
@pytest.mark.anyio
async def test_helper_does_not_mutate_store(self, event_store):
"""Helper must copy content dicts; the live store must stay unchanged."""
await _seed_simple_run(event_store, "t1", "r1")
request = _make_request(event_store)
_ = await _get_event_store_messages(request, "t1")
# Raw store records still have id=None for human/tool
raw = await event_store.list_messages("t1", limit=500)
human = next(e for e in raw if e["content"]["type"] == "human")
tool = next(e for e in raw if e["content"]["type"] == "tool")
assert human["content"]["id"] is None
assert tool["content"]["id"] is None
@pytest.mark.anyio
async def test_legacy_command_repr_sanitized(self, event_store):
"""A tool_result whose content is a legacy ``str(Command(...))`` is cleaned."""
legacy = (
"Command(update={'artifacts': ['/mnt/user-data/outputs/x.md'], "
"'messages': [ToolMessage(content='Successfully presented files', "
"tool_call_id='call_p')]})"
)
await event_store.put(
thread_id="t2", run_id="r1",
event_type="tool_result", category="message",
content={
"type": "tool", "id": None,
"content": legacy,
"tool_call_id": "call_p", "name": "present_files",
"artifact": None, "status": "success",
"additional_kwargs": {}, "response_metadata": {},
},
)
request = _make_request(event_store)
messages = await _get_event_store_messages(request, "t2")
assert messages is not None and len(messages) == 1
assert messages[0]["content"] == "Successfully presented files"
@pytest.mark.anyio
async def test_pagination_covers_more_than_one_page(self, event_store, monkeypatch):
"""Simulate a long thread that exceeds a single page to exercise the loop."""
thread_id = "t_long"
# Seed 12 human messages
for i in range(12):
await event_store.put(
thread_id=thread_id, run_id="r1",
event_type="human_message", category="message",
content={
"type": "human", "id": None,
"content": [{"type": "text", "text": f"msg {i}"}],
"additional_kwargs": {}, "response_metadata": {}, "name": None,
},
)
# Force small page size to exercise pagination
import app.gateway.routers.threads as threads_mod
original = threads_mod._get_event_store_messages
# Monkeypatch MemoryRunEventStore.list_messages to assert it's called with cursor pagination
calls: list[dict] = []
real_list = event_store.list_messages
async def spy_list_messages(tid, *, limit=50, before_seq=None, after_seq=None):
calls.append({"limit": limit, "after_seq": after_seq})
return await real_list(tid, limit=limit, before_seq=before_seq, after_seq=after_seq)
monkeypatch.setattr(event_store, "list_messages", spy_list_messages)
request = _make_request(event_store)
messages = await original(request, thread_id)
assert messages is not None
assert len(messages) == 12
assert [m["content"][0]["text"] for m in messages] == [f"msg {i}" for i in range(12)]
# At least one call was made with after_seq=None (the initial page)
assert any(c["after_seq"] is None for c in calls)
@pytest.mark.anyio
async def test_summarize_regression_recovers_pre_summarize_messages(self, event_store):
"""The exact bug: checkpoint would have only post-summarize messages;
event store must surface the original pre-summarize human query."""
# Run 1 (pre-summarize)
await event_store.put(
thread_id="t_sum", run_id="r1",
event_type="human_message", category="message",
content={
"type": "human", "id": None,
"content": [{"type": "text", "text": "original question"}],
"additional_kwargs": {}, "response_metadata": {}, "name": None,
},
)
await event_store.put(
thread_id="t_sum", run_id="r1",
event_type="ai_message", category="message",
content={
"type": "ai", "id": "lc_run--r1",
"content": "first answer",
"tool_calls": [], "invalid_tool_calls": [],
"additional_kwargs": {}, "response_metadata": {}, "name": None,
"usage_metadata": {"input_tokens": 0, "output_tokens": 0, "total_tokens": 0},
},
)
# Run 2 (post-summarize — what the checkpoint still has)
await event_store.put(
thread_id="t_sum", run_id="r2",
event_type="human_message", category="message",
content={
"type": "human", "id": None,
"content": [{"type": "text", "text": "follow up"}],
"additional_kwargs": {}, "response_metadata": {}, "name": None,
},
)
await event_store.put(
thread_id="t_sum", run_id="r2",
event_type="ai_message", category="message",
content={
"type": "ai", "id": "lc_run--r2",
"content": "second answer",
"tool_calls": [], "invalid_tool_calls": [],
"additional_kwargs": {}, "response_metadata": {}, "name": None,
"usage_metadata": {"input_tokens": 0, "output_tokens": 0, "total_tokens": 0},
},
)
request = _make_request(event_store)
messages = await _get_event_store_messages(request, "t_sum")
assert messages is not None
# 4 messages, not 2 (which is what the summarized checkpoint would yield)
assert len(messages) == 4
assert messages[0]["content"][0]["text"] == "original question"
assert messages[1]["id"] == "lc_run--r1"
assert messages[3]["id"] == "lc_run--r2"
@pytest.mark.anyio
async def test_run_id_attached_to_every_message(self, event_store):
await _seed_simple_run(event_store, "t1", "r1")
request = _make_request(event_store)
messages = await _get_event_store_messages(request, "t1")
assert messages is not None
assert all(m.get("run_id") == "r1" for m in messages)
@pytest.mark.anyio
async def test_feedback_attached_only_to_final_ai_message_per_run(self, event_store):
await _seed_simple_run(event_store, "t1", "r1")
feedback_repo = _FakeFeedbackRepo(
{"r1": {"feedback_id": "fb1", "rating": 1, "comment": "great"}}
)
request = _make_request(event_store, feedback_repo=feedback_repo)
messages = await _get_event_store_messages(request, "t1")
assert messages is not None
# human (0), ai_tool_call (1), tool (2), ai_message (3)
final_ai = messages[3]
assert final_ai["feedback"] == {
"feedback_id": "fb1",
"rating": 1,
"comment": "great",
}
# Non-final messages must NOT have a feedback key at all — the
# frontend keys button visibility off of this.
assert "feedback" not in messages[0]
assert "feedback" not in messages[1]
assert "feedback" not in messages[2]
@pytest.mark.anyio
async def test_feedback_none_when_no_row_for_run(self, event_store):
await _seed_simple_run(event_store, "t1", "r1")
request = _make_request(event_store, feedback_repo=_FakeFeedbackRepo({}))
messages = await _get_event_store_messages(request, "t1")
assert messages is not None
# Final ai_message gets an explicit ``None`` — distinguishes "eligible
# but unrated" from "not eligible" (field absent).
assert messages[3]["feedback"] is None
@pytest.mark.anyio
async def test_feedback_per_run_for_multi_run_thread(self, event_store):
"""A thread with two runs: each final ai_message should get its own feedback."""
# Run 1
await event_store.put(
thread_id="t_multi", run_id="r1",
event_type="human_message", category="message",
content={"type": "human", "id": None, "content": "q1",
"additional_kwargs": {}, "response_metadata": {}, "name": None},
)
await event_store.put(
thread_id="t_multi", run_id="r1",
event_type="ai_message", category="message",
content={"type": "ai", "id": "lc_run--a1", "content": "a1",
"tool_calls": [], "invalid_tool_calls": [],
"additional_kwargs": {}, "response_metadata": {}, "name": None,
"usage_metadata": None},
)
# Run 2
await event_store.put(
thread_id="t_multi", run_id="r2",
event_type="human_message", category="message",
content={"type": "human", "id": None, "content": "q2",
"additional_kwargs": {}, "response_metadata": {}, "name": None},
)
await event_store.put(
thread_id="t_multi", run_id="r2",
event_type="ai_message", category="message",
content={"type": "ai", "id": "lc_run--a2", "content": "a2",
"tool_calls": [], "invalid_tool_calls": [],
"additional_kwargs": {}, "response_metadata": {}, "name": None,
"usage_metadata": None},
)
feedback_repo = _FakeFeedbackRepo({
"r1": {"feedback_id": "fb_r1", "rating": 1, "comment": None},
"r2": {"feedback_id": "fb_r2", "rating": -1, "comment": "meh"},
})
request = _make_request(event_store, feedback_repo=feedback_repo)
messages = await _get_event_store_messages(request, "t_multi")
assert messages is not None
# human[r1], ai[r1], human[r2], ai[r2]
assert messages[1]["feedback"]["feedback_id"] == "fb_r1"
assert messages[1]["feedback"]["rating"] == 1
assert messages[3]["feedback"]["feedback_id"] == "fb_r2"
assert messages[3]["feedback"]["rating"] == -1
# Humans don't get feedback
assert "feedback" not in messages[0]
assert "feedback" not in messages[2]
@pytest.mark.anyio
async def test_feedback_repo_failure_does_not_break_helper(self, monkeypatch, event_store):
"""If feedback lookup throws, messages still come back without feedback."""
await _seed_simple_run(event_store, "t1", "r1")
class _BoomRepo:
async def list_by_thread_grouped(self, *a, **kw):
raise RuntimeError("db down")
request = _make_request(event_store, feedback_repo=_BoomRepo())
messages = await _get_event_store_messages(request, "t1")
assert messages is not None
assert len(messages) == 4
for m in messages:
assert "feedback" not in m
@pytest.mark.anyio
async def test_returns_none_when_dep_raises(self, monkeypatch, event_store):
"""When ``get_run_event_store`` is not configured, helper returns None."""
import app.gateway.routers.threads as threads_mod
def boom(_request):
raise RuntimeError("no store")
monkeypatch.setattr(threads_mod, "get_run_event_store", boom)
request = _make_request(event_store)
assert await threads_mod._get_event_store_messages(request, "t1") is None

View File

@ -50,10 +50,13 @@ def test_delete_thread_data_rejects_invalid_thread_id(tmp_path):
def test_delete_thread_route_cleans_thread_directory(tmp_path):
from deerflow.runtime.user_context import get_effective_user_id
paths = Paths(tmp_path)
thread_dir = paths.thread_dir("thread-route")
paths.sandbox_work_dir("thread-route").mkdir(parents=True, exist_ok=True)
(paths.sandbox_work_dir("thread-route") / "notes.txt").write_text("hello", encoding="utf-8")
user_id = get_effective_user_id()
thread_dir = paths.thread_dir("thread-route", user_id=user_id)
paths.sandbox_work_dir("thread-route", user_id=user_id).mkdir(parents=True, exist_ok=True)
(paths.sandbox_work_dir("thread-route", user_id=user_id) / "notes.txt").write_text("hello", encoding="utf-8")
app = make_authed_test_app()
app.include_router(threads.router)

View File

@ -34,7 +34,9 @@ def _runtime(thread_id: str | None = THREAD_ID) -> MagicMock:
def _uploads_dir(tmp_path: Path, thread_id: str = THREAD_ID) -> Path:
d = Paths(str(tmp_path)).sandbox_uploads_dir(thread_id)
from deerflow.runtime.user_context import get_effective_user_id
d = Paths(str(tmp_path)).sandbox_uploads_dir(thread_id, user_id=get_effective_user_id())
d.mkdir(parents=True, exist_ok=True)
return d

View File

@ -11,7 +11,9 @@ import pytest
from deerflow.runtime.user_context import (
CurrentUser,
DEFAULT_USER_ID,
get_current_user,
get_effective_user_id,
require_current_user,
reset_current_user,
set_current_user,
@ -67,3 +69,42 @@ def test_protocol_rejects_no_id():
"""Objects without .id do not satisfy CurrentUser Protocol."""
not_a_user = SimpleNamespace(email="no-id@example.com")
assert not isinstance(not_a_user, CurrentUser)
# ---------------------------------------------------------------------------
# get_effective_user_id / DEFAULT_USER_ID tests
# ---------------------------------------------------------------------------
def test_default_user_id_is_default():
assert DEFAULT_USER_ID == "default"
@pytest.mark.no_auto_user
def test_effective_user_id_returns_default_when_no_user():
"""No user in context -> fallback to DEFAULT_USER_ID."""
assert get_effective_user_id() == "default"
@pytest.mark.no_auto_user
def test_effective_user_id_returns_user_id_when_set():
user = SimpleNamespace(id="u-abc-123")
token = set_current_user(user)
try:
assert get_effective_user_id() == "u-abc-123"
finally:
reset_current_user(token)
@pytest.mark.no_auto_user
def test_effective_user_id_coerces_to_str():
"""User.id might be a UUID object; must come back as str."""
import uuid
uid = uuid.uuid4()
user = SimpleNamespace(id=uid)
token = set_current_user(user)
try:
assert get_effective_user_id() == str(uid)
finally:
reset_current_user(token)