mirror of https://github.com/bytedance/deer-flow.git synced 2026-06-09 17:12:01 +00:00

fix(replay-e2e): match by conversation, not the living system prompt (#3436)

* fix(replay-e2e): match by conversation, not the living system prompt

The model-replay match key hashed the full input including the lead-agent
system prompt. That prompt is edited frequently (e.g. #3195 added a "File
Editing Workflow" section), so the committed fixture went stale the moment
the prompt changed on main — turning the Layer-2 render gate RED on every
unrelated PR (#3430, #3432, ...). This was a self-inflicted false positive.

Root-cause fix:
- replay_provider._canonical_messages now EXCLUDES the system message from
  the hash. The conversation (human/ai/tool) is the stable contract that
  identifies a recorded turn; the system prompt is an internal detail not
  part of the front-back contract under test. (Mirrors how open-design keys
  its mock picker on the user prompt, not the system internals.) Proven
  robust: injecting a prompt edit no longer causes a replay miss.
- Layer-1 golden was BLIND to replay misses: the gateway swallows a miss
  into an assistant error message, so the shape-only golden stayed green on
  a stale fixture. It now inspects replay_provider.replay_misses() and fails
  loud. (Layer-2 already fails on a miss.)
- Re-recorded write_read_file.ultra fixture + regenerated golden under the
  new conversation-only hash.
- Layer-2 render spec: assert the in-graph auto-title (deterministic); the
  follow-up suggestion is fired async and depends on a clean JSON model
  output, so assert it only when the fixture captured one — never gate on
  its absence (recording flakiness must not block CI).
- docs: REPLAY_E2E.md updated.
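
The first two points above can be illustrated with a minimal sketch. This is not the actual replay_provider code: the message shape (dicts with `role` and `content`), the helper names `canonical_messages` and `match_key`, and the `ReplayStore` class are hypothetical stand-ins. Only the idea (hash the conversation without the system message, and record every miss so a test can fail on it) comes from the description above.

```python
import hashlib
import json


def canonical_messages(messages):
    """Keep only the conversation turns; the system prompt is excluded.

    Assumed message shape (hypothetical): {"role": ..., "content": ...}
    with role in {"system", "human", "ai", "tool"}.
    """
    return [
        {"role": m["role"], "content": m["content"]}
        for m in messages
        if m["role"] != "system"
    ]


def match_key(messages):
    """Stable hash over the conversation-only view of the input."""
    payload = json.dumps(
        canonical_messages(messages), sort_keys=True, ensure_ascii=False
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class ReplayStore:
    """Hypothetical fixture store that remembers every lookup miss."""

    def __init__(self, fixtures):
        self._fixtures = dict(fixtures)
        self._misses = []

    def lookup(self, messages):
        key = match_key(messages)
        if key not in self._fixtures:
            # Record the miss instead of hiding it, so a golden test
            # can inspect replay_misses() and fail loudly.
            self._misses.append(key)
            return None
        return self._fixtures[key]

    def replay_misses(self):
        return list(self._misses)
```

Under these assumptions, two inputs that differ only in the system prompt produce the same key, and a shape-only golden test can additionally assert that `replay_misses()` is empty after the run.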

Verified: Layer-1 golden green (no miss), Layer-2 both specs green,
CI=true make test 4033 passed / 0 failed, frontend pnpm check clean.

* test(replay-e2e): restore suggestions coverage with a reliable capture

Addresses review feedback (the suggestion path was dropped from Layer-2):

- record spec now waits for the `/suggestions` response before checking
  capture stability, so the recorded fixture reliably includes the
  frontend-fired suggestions turn (previously the stability window could
  return before suggestions fired, yielding a fixture without it).
- Re-recorded write_read_file.ultra: 5 turns (write_file, auto-title,
  read_file, answer, suggestions). Golden unchanged — suggestions is a
  separate /suggestions call, not part of the /runs/stream SSE sequence.
- Layer-2 spec: restore the hard `EXPECTED_SUGGESTION` assertion. With the
  record spec now waiting for /suggestions, a fixture missing the suggestion
  turn means a broken recording and must fail loud, not pass silently.

Verified: Layer-1 golden green (no miss), Layer-2 both specs green
(auto-title + suggestion render), frontend pnpm check clean.

* ci: re-trigger (flaky Docker Hub image pull in sandbox e2e, unrelated)

backend-unit-tests failed only in test_sandbox_orphan_reconciliation_e2e.py
with 'docker pull busybox:latest ... context deadline exceeded' — a CI-runner
network flake reaching Docker Hub, not related to this docs/tests-only change.
Empty commit to re-run CI.

---------

Co-authored-by: DanielWalnut <45447813+hetaoBackend@users.noreply.github.com>

2026-06-08 17:32:41 +08:00

File	Last commit	Date
API.md	fix(security): harden MCP config endpoint (#3425)	2026-06-08 12:21:02 +08:00
APPLE_CONTAINER.md	Fix command syntax for container image pull (#1349)	2026-03-26 00:14:08 +08:00
ARCHITECTURE.md	docs: clarify LangGraph compatibility entrypoints (#2914)	2026-05-12 23:15:11 +08:00
AUTH_DESIGN.md	docs: document auth design and user isolation (#2913)	2026-05-12 23:07:11 +08:00
AUTH_TEST_DOCKER_GAP.md	docs: clean gateway runtime transition remnants (#3334)	2026-06-02 10:03:28 +08:00
AUTH_TEST_PLAN.md	docs: clean standalone LangGraph server remnants (#3301)	2026-05-29 11:36:45 +08:00
AUTH_UPGRADE.md	docs: clean gateway runtime transition remnants (#3334)	2026-06-02 10:03:28 +08:00
AUTO_TITLE_GENERATION.md	docs: fix some broken links (#1864)	2026-04-05 15:35:42 +08:00
BLOCKING_IO_DETECTION.md	fix(agents): offload UploadsMiddleware uploads scan off the event loop (#3311)	2026-05-30 21:46:35 +08:00
CONFIGURATION.md	feat: upgrade MiniMax default model to M3 (#3357)	2026-06-03 17:04:16 +08:00
FILE_UPLOAD.md	fix(uploads): enforce streaming upload limits in gateway (#2589)	2026-05-01 20:19:30 +08:00
GUARDRAILS.md	fix: rename present_file to present_files in docs and prompts (#2393)	2026-04-21 16:10:14 +08:00
MCP_SERVER.md	docs: discourage MCP filesystem workspace config (#3141)	2026-05-22 09:19:23 +08:00
MEMORY_IMPROVEMENTS_SUMMARY.md	refactor: split backend into harness (deerflow.*) and app (app.*) (#1131)	2026-03-14 22:55:52 +08:00
MEMORY_IMPROVEMENTS.md	fix(memory): inject stored facts into system prompt memory context (#1083)	2026-03-13 14:37:40 +08:00
MEMORY_SETTINGS_REVIEW.md	feat: support manual add and edit for memory facts (#1538)	2026-03-29 23:53:23 +08:00
memory-settings-sample.json	feat: support manual add and edit for memory facts (#1538)	2026-03-29 23:53:23 +08:00
middleware-execution-flow.md	feat(loop-detection): defer warning injection (#2752)	2026-05-21 14:36:07 +08:00
PATH_EXAMPLES.md	refactor: split backend into harness (deerflow.*) and app (app.*) (#1131)	2026-03-14 22:55:52 +08:00
plan_mode_usage.md	refactor: split backend into harness (deerflow.*) and app (app.*) (#1131)	2026-03-14 22:55:52 +08:00
README.md	chore: add sandbox memory profiling tools (#3249)	2026-06-03 22:02:27 +08:00
REPLAY_E2E.md	fix(replay-e2e): match by conversation, not the living system prompt (#3436)	2026-06-08 17:32:41 +08:00
rfc-create-deerflow-agent.md	feat: add create_deerflow_agent SDK entry point (Phase 1) (#1203)	2026-03-29 15:31:18 +08:00
rfc-extract-shared-modules.md	refactor: extract shared skill installer and upload manager to harness (#1202)	2026-03-25 16:28:33 +08:00
rfc-grep-glob-tools.md	feat(sandbox): add built-in grep and glob tools (#1784)	2026-04-03 16:03:06 +08:00
SANDBOX_MEMORY_PROFILING.md	chore: add sandbox memory profiling tools (#3249)	2026-06-03 22:02:27 +08:00
SETUP.md	fix(harness): resolve runtime paths from project root (#2642)	2026-05-01 22:19:50 +08:00
STREAMING.md	fix(backend): stream DeerFlowClient AI text as token deltas (#1969) (#1974)	2026-04-10 18:16:38 +08:00
summarization.md	fix(middleware): avoid rescuing non-skill tool outputs during summarization (#2458)	2026-04-24 21:19:46 +08:00
task_tool_improvements.md	refactor: split backend into harness (deerflow.*) and app (app.*) (#1131)	2026-03-14 22:55:52 +08:00
TITLE_GENERATION_IMPLEMENTATION.md	feat(persistence):Unified persistence layer with event store, feedback, and rebase cleanup (#2134)	2026-04-26 11:09:55 +08:00
TODO.md	docs: clean standalone LangGraph server remnants (#3301)	2026-05-29 11:36:45 +08:00

README.md

Documentation

This directory contains detailed documentation for the DeerFlow backend.

Quick Links

Document	Description
ARCHITECTURE.md	System architecture overview
API.md	Complete API reference
AUTH_DESIGN.md	User authentication, CSRF, and per-user isolation design
CONFIGURATION.md	Configuration options
SETUP.md	Quick setup guide

Feature Documentation

Document	Description
STREAMING.md	Token-level streaming design: Gateway vs DeerFlowClient paths, `stream_mode` semantics, per-id dedup
FILE_UPLOAD.md	File upload functionality
PATH_EXAMPLES.md	Path types and usage examples
SANDBOX_MEMORY_PROFILING.md	Sandbox memory baseline and runtime comparison guide
summarization.md	Context summarization feature
plan_mode_usage.md	Plan mode with TodoList
AUTO_TITLE_GENERATION.md	Automatic title generation

Development

Document	Description
TODO.md	Planned features and known issues

Getting Started

New to DeerFlow? Start with SETUP.md for a quick installation.
Configuring the system? See CONFIGURATION.md.
Understanding the architecture? Read ARCHITECTURE.md.
Building integrations? Check API.md for the API reference.

Document Organization

docs/
├── README.md                  # This file
├── ARCHITECTURE.md            # System architecture
├── API.md                     # API reference
├── AUTH_DESIGN.md             # User authentication and isolation design
├── CONFIGURATION.md           # Configuration guide
├── SETUP.md                   # Setup instructions
├── FILE_UPLOAD.md             # File upload feature
├── PATH_EXAMPLES.md           # Path usage examples
├── summarization.md           # Summarization feature
├── plan_mode_usage.md         # Plan mode feature
├── STREAMING.md               # Token-level streaming design
├── AUTO_TITLE_GENERATION.md   # Title generation
├── TITLE_GENERATION_IMPLEMENTATION.md  # Title implementation details
└── TODO.md                    # Roadmap and issues