mirror of https://github.com/bytedance/deer-flow.git synced 2026-06-09 17:12:01 +00:00

History

fix(replay-e2e): match by conversation, not the living system prompt (#3436 )

* fix(replay-e2e): match by conversation, not the living system prompt

The model-replay match key hashed the full input including the lead-agent
system prompt. That prompt is edited frequently (e.g. #3195 added a "File
Editing Workflow" section), so the committed fixture went stale the moment
the prompt changed on main — turning the Layer-2 render gate RED on every
unrelated PR (#3430, #3432, ...). This was a self-inflicted false positive.

Root-cause fix:
- replay_provider._canonical_messages now EXCLUDES the system message from
  the hash. The conversation (human/ai/tool) is the stable contract that
  identifies a recorded turn; the system prompt is an internal detail not
  part of the front-back contract under test. (Mirrors how open-design keys
  its mock picker on the user prompt, not the system internals.) Proven
  robust: injecting a prompt edit no longer causes a replay miss.
- Layer-1 golden was BLIND to replay misses: the gateway swallows a miss
  into an assistant error message, so the shape-only golden stayed green on
  a stale fixture. It now inspects replay_provider.replay_misses() and fails
  loud. (Layer-2 already fails on a miss.)
- Re-recorded write_read_file.ultra fixture + regenerated golden under the
  new conversation-only hash.
- Layer-2 render spec: assert the in-graph auto-title (deterministic); the
  follow-up suggestion is fired async and depends on a clean JSON model
  output, so assert it only when the fixture captured one — never gate on
  its absence (recording flakiness must not block CI).
- docs: REPLAY_E2E.md updated.

Verified: Layer-1 golden green (no miss), Layer-2 both specs green,
CI=true make test 4033 passed / 0 failed, frontend pnpm check clean.

* test(replay-e2e): restore suggestions coverage with a reliable capture

Addresses review feedback (the suggestion path was dropped from Layer-2):

- record spec now waits for the `/suggestions` response before checking
  capture stability, so the recorded fixture reliably includes the
  frontend-fired suggestions turn (previously the stability window could
  return before suggestions fired, yielding a fixture without it).
- Re-recorded write_read_file.ultra: 5 turns (write_file, auto-title,
  read_file, answer, suggestions). Golden unchanged — suggestions is a
  separate /suggestions call, not part of the /runs/stream SSE sequence.
- Layer-2 spec: restore the hard `EXPECTED_SUGGESTION` assertion. With the
  record spec now waiting for /suggestions, a fixture missing the suggestion
  turn means a broken recording and must fail loud, not pass silently.

Verified: Layer-1 golden green (no miss), Layer-2 both specs green
(auto-title + suggestion render), frontend pnpm check clean.

* ci: re-trigger (flaky Docker Hub image pull in sandbox e2e, unrelated)

backend-unit-tests failed only in test_sandbox_orphan_reconciliation_e2e.py
with 'docker pull busybox:latest ... context deadline exceeded' — a CI-runner
network flake reaching Docker Hub, not related to this docs/tests-only change.
Empty commit to re-run CI.

---------

Co-authored-by: DanielWalnut <45447813+hetaoBackend@users.noreply.github.com>

2026-06-08 17:32:41 +08:00

.vscode

chore: specify project name

2026-01-14 09:58:53 +08:00

public

feat(frontend): support static website demo mode (#3170 )

2026-05-23 00:10:56 +08:00

scripts

feat: add uploads

2026-01-24 19:38:08 +08:00

src

fix(frontend): truncate overflowing text in agent cards (#3391 )

2026-06-07 23:29:59 +08:00

tests

fix(replay-e2e): match by conversation, not the living system prompt (#3436 )

2026-06-08 17:32:41 +08:00

.env.example

docs: clean standalone LangGraph server remnants (#3301 )

2026-05-29 11:36:45 +08:00

.gitignore

chore: create frontend project from boilerplate

2026-01-14 09:50:26 +08:00

.npmrc

chore: add .npmrc back

2026-02-10 22:07:25 +08:00

.prettierignore

Stabilize write artifact previews (#3172 )

2026-05-23 16:56:14 +08:00

AGENTS.md

feat(frontend): add Playwright E2E tests with CI workflow (#2279 )

2026-04-18 08:21:08 +08:00

CLAUDE.md

docs: clean standalone LangGraph server remnants (#3301 )

2026-05-29 11:36:45 +08:00

components.json

feat: implement the first section of landing page

2026-01-23 00:15:21 +08:00

Dockerfile

chore(uv): speed up Docker builds with mirrors (#1600 )

2026-03-30 20:16:44 +08:00

eslint.config.js

fix: fix eslint errors and warnings

2026-01-31 21:46:31 +08:00

Makefile

feat(frontend): support static website demo mode (#3170 )

2026-05-23 00:10:56 +08:00

next.config.js

feat(frontend): support static website demo mode (#3170 )

2026-05-23 00:10:56 +08:00

package.json

chore(deps): bump next from 16.1.7 to 16.2.6 in /frontend (#2899 )

2026-05-12 10:45:40 +08:00

playwright.config.ts

fix: resolve make dev and test-e2e errors (#2570 )

2026-04-26 17:27:32 +08:00

playwright.real-backend.config.ts

test(e2e): deterministic record/replay front-back contract verification (#3365 )

2026-06-08 12:35:03 +08:00

playwright.record.config.ts

test(e2e): deterministic record/replay front-back contract verification (#3365 )

2026-06-08 12:35:03 +08:00

pnpm-lock.yaml

chore(deps): bump uuid from 10.0.0 to 14.0.0 in /frontend (#3281 )

2026-05-28 07:14:44 +08:00

pnpm-workspace.yaml

Add packages section to pnpm-workspace.yaml (#1382 )

2026-03-26 16:09:35 +08:00

postcss.config.js

chore: create frontend project from boilerplate

2026-01-14 09:50:26 +08:00

prettier.config.js

chore: create frontend project from boilerplate

2026-01-14 09:50:26 +08:00

README.md

docs: align runtime docs with gateway mode (#2868 )

2026-05-12 16:19:21 +08:00

tsconfig.json

feat: implement the first version of landing page

2026-01-23 13:24:03 +08:00

vitest.config.ts

feat(frontend): set up Vitest frontend testing infrastructure with CI workflow (#2147 )

2026-04-12 18:00:43 +08:00

README.md

DeerFlow Frontend

Like the original DeerFlow 1.0, we would love to give the community a minimalistic and easy-to-use web interface with a more modern and flexible architecture.

Tech Stack

Framework: Next.js 16 with App Router
UI: React 19, Tailwind CSS 4, Shadcn UI, MagicUI and React Bits
AI Integration: LangGraph SDK and Vercel AI Elements

Quick Start

Prerequisites

Node.js 22+
pnpm 10.26.2+

Installation

# Install dependencies
pnpm install

# Copy environment variables
cp .env.example .env
# Edit .env with your configuration

Development

# Start development server
pnpm dev

# The app will be available at http://localhost:3000

Build & Test

# Type check
pnpm typecheck

# Check formatting
pnpm format

# Apply formatting
pnpm format:write

# Lint
pnpm lint

# Run unit tests
pnpm test

# One-time setup: install Playwright Chromium browser
pnpm exec playwright install chromium

# Run E2E tests (builds and starts production server automatically)
pnpm test:e2e

# Build for production
pnpm build

# Start production server
pnpm start

Site Map

├── /                    # Landing page
├── /chats               # Chat list
├── /chats/new           # New chat page
└── /chats/[thread_id]   # A specific chat page

Configuration

Environment Variables

Key environment variables (see .env.example for full list):

# Backend API URL (optional, uses local Next.js/nginx proxy by default)
NEXT_PUBLIC_BACKEND_BASE_URL="http://localhost:8001"
# LangGraph-compatible API URL (optional, uses local Next.js/nginx proxy by default)
NEXT_PUBLIC_LANGGRAPH_BASE_URL="http://localhost:8001/api"

Project Structure

tests/
├── e2e/                    # E2E tests (Playwright, Chromium, mocked backend)
└── unit/                   # Unit tests (mirrors src/ layout)
src/
├── app/                    # Next.js App Router pages
│   ├── api/                # API routes
│   ├── workspace/          # Main workspace pages
│   └── mock/               # Mock/demo pages
├── components/             # React components
│   ├── ui/                 # Reusable UI components
│   ├── workspace/          # Workspace-specific components
│   ├── landing/            # Landing page components
│   └── ai-elements/        # AI-related UI elements
├── core/                   # Core business logic
│   ├── api/                # API client & data fetching
│   ├── artifacts/          # Artifact management
│   ├── config/              # App configuration
│   ├── i18n/               # Internationalization
│   ├── mcp/                # MCP integration
│   ├── messages/           # Message handling
│   ├── models/             # Data models & types
│   ├── settings/           # User settings
│   ├── skills/             # Skills system
│   ├── threads/            # Thread management
│   ├── todos/              # Todo system
│   └── utils/              # Utility functions
├── hooks/                  # Custom React hooks
├── lib/                    # Shared libraries & utilities
├── server/                 # Server-side code
│   └── better-auth/        # Authentication setup and session helpers
└── styles/                 # Global styles

Scripts

Command	Description
`pnpm dev`	Start development server with Turbopack
`pnpm build`	Build for production
`pnpm start`	Start production server
`pnpm test`	Run unit tests with Vitest
`pnpm test:e2e`	Run E2E tests with Playwright
`pnpm format`	Check formatting with Prettier
`pnpm format:write`	Apply formatting with Prettier
`pnpm lint`	Run ESLint
`pnpm lint:fix`	Fix ESLint issues
`pnpm typecheck`	Run TypeScript type checking
`pnpm check`	Run both lint and typecheck

Development Notes

Uses pnpm workspaces (see packageManager in package.json)
Turbopack enabled by default in development for faster builds
Environment validation can be skipped with SKIP_ENV_VALIDATION=1 (useful for Docker)
Backend API URLs are optional; nginx proxy is used by default in development

License

MIT License. See LICENSE for details.