deer-flow

mirror of https://github.com/bytedance/deer-flow.git synced 2026-06-09 17:12:01 +00:00

Author	SHA1	Message	Date
Xinmin Zeng	90e23bfd09	fix(ci): consolidate PR/issue labeling and fix reviewing-job crash + label thrash (#3455 ) * fix(ci): consolidate PR/issue labeling into one triage.yml; fix reviewing crash & label thrash - Replace pr-labeler + pr-triage + issue-triage with a single triage.yml; drop actions/labeler. Its sync-labels removed labels outside its config (clobbered size/risk/needs-validation and could clobber maintainer labels). Area is now computed in-script and reconciled only within owned namespaces (area:/size//risk:/needs-validation); first-time/reviewing are add-only. - reviewing: gate on author_association in {OWNER,MEMBER,COLLABORATOR} + user.type==='User' instead of getCollaboratorPermissionLevel, which 404'd on bot reviewers ('Copilot is not a user') and crashed the job. Excludes all review bots with no denylist and no API call. - Read live state (listFiles + listLabelsOnIssue) not the stale event payload, so rapid synchronize events converge instead of thrashing. Size churn excludes lockfiles/snapshots. * fix(ci): read labels live via paginate in reviewing & issue-triage jobs Address review feedback on #3455: - reviewing: listLabelsOnIssue now paginates (per_page:100) instead of the default 30, matching pr-labels, so a 'reviewing' label is never missed on PRs with many labels. - issue-triage: read live labels via the API instead of the event payload, consistent with the live-state reads documented in the header.	2026-06-09 11:14:19 +08:00
Xinmin Zeng	88759015e4	test(e2e): deterministic record/replay front-back contract verification (#3365 ) * test(e2e): record/replay front-back contract verification Guards the front-back contract with a deterministic, key-free record/replay harness (mirrors open-design's golden-trace approach): - ReplayChatModel (tests/replay_provider.py): replays recorded LLM turns by a normalized hash of the model input. Strips <system-reminder>/date/uuid/tmp-path so one fixture replays across days and from both the browser and direct-POST paths; a miss raises loudly (no silent divergence). - Recording is record-through-browser (scripts/record_gateway.py + build_fixture_from_jsonl.py + frontend/tests/e2e-record): a real run is driven through the real frontend so captured inputs match exactly what the browser sends; fixtures contain no API key. - Layer 1 — backend golden (tests/test_replay_golden.py): replay through the real gateway, assert the SSE event sequence == committed golden. - Layer 2 — full-stack render (frontend/tests/e2e-real-backend): real Next.js + real gateway (replay model) + Chromium; assert the replayed auto-title and follow-up suggestions render. DOM assertions are the gate; visual regression is a local dev gate (CI uploads the render as an artifact). - CI (.github/workflows/replay-e2e.yml): both layers, triggered on EITHER side of the contract (frontend/** or backend gateway/harness/fixtures). * test(e2e): multi-run render-order cross-stack scenario (#3352) Guards the dangerous front-back class where a backend ordering change silently breaks a frontend assumption while both sides' unit tests stay green. Reproduces issue #3352: backend list_by_thread returns runs newest-first (#2932) and the frontend prepended per-run pages, inverting chronological order once the checkpoint no longer held the older messages. - tests/seed_runs_router.py: test-only seeder, mounted on the replay gateway only when DEERFLOW_ENABLE_TEST_SEED=1 (never in the production app). 
Seeds a thread with >=2 runs + per-run message events and no checkpoint -- the #3352 precondition -- so the frontend per-run reload path is the sole source of truth and the prepend inversion is observable. - frontend/tests/e2e-real-backend/multi-run-order.spec.ts: drives the real frontend against the real gateway, asserts the first run renders above the second. Reverting the #3354 fix turns it red. - replay-e2e.yml: trigger on the new replay test-infra paths. - docs: REPLAY_E2E.md cross-stack scenario section. * test(e2e): address Copilot review on the replay harness - Fix stale recorder references (scripts/record_traces.py -> scripts/record_gateway.py + scripts/build_fixture_from_jsonl.py) in replay_provider.py, test_replay_golden.py, _replay_fixture.py. - MODE_CONTEXT['ultra']: thinking_enabled False -> True, mirroring the frontend's `context.mode !== 'flash'` (hooks.ts). It did not affect the hashed input (Layer 1 golden still green), but the table now matches the real frontend context it claims to mirror. - replay_provider.py docstring: stop claiming memory is recorded-enabled; the replay config disables memory/summarization for determinism (title stays, as an in-graph deterministic call). - record_gateway.py / run_replay_gateway.py: override DEER_FLOW_HOME instead of setdefault, so an outer value can't leak into the hermetic harness. - record_gateway.py: clear error when DEERFLOW_RECORD_OUT is unset (was a bare KeyError). - playwright.record.config.ts: forward OPENAI_/DEERFLOW_RECORD_OUT only when set, so the gateway raises a clear 'missing env' error instead of getting ''. test(e2e): address Copilot review round 2 - seed_runs_router.py: constrain SeedMessage.role to Literal['human','ai'] so a bad value is a clean 422 at the boundary instead of a 500 (KeyError on _EVENT_TYPE). - record-write-read-file.spec.ts: waitForCaptureStable now throws on timeout instead of returning the last count, so a truncated/partial recording can't pass silently. 
- real-backend-render.spec.ts: guard the suggestions JSON.parse; a bracket-prefixed non-JSON turn falls back to '' so the existing not.toBe('') assertion fails clearly instead of a generic parse throw.	2026-06-08 12:35:03 +08:00
Xinmin Zeng	aca7acc105	feat(ci): PR/issue auto-labeling + declarative label sync (#3360) - .github/labels.yml: declarative source of truth (29 namespaced labels) - scripts/sync_labels.py + label-sync.yml: idempotent label sync (self-bootstraps on merge) - labeler.yml + pr-labeler.yml: area:* labels by changed path (actions/labeler) - pr-triage.yml: size/, risk:, needs-validation, first-time-contributor, reviewing - issue-triage.yml: needs-triage on new issues (self-healing) All PR workflows use pull_request_target but never check out or run PR code (read changed-file metadata via the API only).	2026-06-03 16:40:24 +08:00
AochenShen99	e344be8d94	feat(tests): add Blockbuster runtime gate for event-loop blocking IO (#3229 ) * feat(tests): add Blockbuster runtime gate for event-loop blocking IO Adds a strict runtime gate that fails CI when sync blocking IO calls run on the asyncio event loop thread through DeerFlow business code. Components: - backend/tests/support/detectors/blocking_io_runtime.py — Blockbuster context scoped to `app.` and `deerflow.` so test infrastructure, pytest internals, and third-party libraries stay silent. - backend/tests/blocking_io/conftest.py — pytest_runtest_protocol hookwrapper that wraps every item (setup + call + teardown) with the strict context. Respects `@pytest.mark.allow_blocking_io` opt-out. - backend/tests/blocking_io/test_skills_load.py — regression anchor for the #1917 fix (asyncio.to_thread offload around LocalSkillStorage.load_skills). - backend/tests/blocking_io/test_sqlite_lifespan.py — regression anchor for the #1912 fix (asyncio.to_thread offload around ensure_sqlite_parent_dir). - backend/tests/blocking_io/test_gate_smoke.py — meta-test asserting the gate actually catches unoffloaded blocking IO and that the `@pytest.mark.allow_blocking_io` opt-out works. - backend/Makefile — `make test-blocking-io` target. - .github/workflows/backend-blocking-io-tests.yml — hard-fail PR gate on ubuntu-latest. Windows matrix deferred to follow-up. Dependencies: - blockbuster>=1.5.26,<1.6 added to dev group. Coverage boundary (called out in PR body): the gate only catches blocking IO on code paths the test suite actually exercises. Static AST inventory (separate, informational) is the complementary coverage tool. Three blind spot categories — untested paths, mocked-away paths, env-mismatched paths — are documented in the PR description. Findings surfaced while authoring this PR: - resolve_sqlite_conn_str in runtime/store/_sqlite_utils.py:19 does sync Path.resolve() -> os.path.abspath on the lifespan loop thread, ahead of the #1912 fix. 
Not addressed here; tracked as follow-up. Tests: 4 passed locally (`make test-blocking-io`). Lint/format: clean (`ruff check` and `ruff format --check`). * fix(tests): scope Blockbuster gate to blocking-io suite * fix(tests): harden Blockbuster runtime gate * test(blocking-io): add project rule extension point * test(blocking-io): address review cleanup	2026-05-26 23:03:49 +08:00
Willem Jiang	b10eb7bafc	feat(github): Added container push workflow (#2709) * feat(github):Added container push workflow * Apply suggestions from code review Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>	2026-05-04 11:14:34 +08:00
yangzheli	c6b0423558	feat(frontend): add Playwright E2E tests with CI workflow (#2279) * feat(frontend): add Playwright E2E tests with CI workflow Add end-to-end testing infrastructure using Playwright (Chromium only). 14 tests across 5 spec files cover landing page, chat workspace, thread history, sidebar navigation, and agent chat — all with mocked LangGraph/Backend APIs via network interception (zero backend dependency). New files: - playwright.config.ts — Chromium, 30s timeout, auto-start Next.js - tests/e2e/utils/mock-api.ts — shared API mocks & SSE stream helpers - tests/e2e/{landing,chat,thread-history,sidebar,agent-chat}.spec.ts - .github/workflows/e2e-tests.yml — push main + PR trigger, paths filter Updated: package.json, Makefile, .gitignore, CONTRIBUTING.md, frontend/CLAUDE.md, frontend/AGENTS.md, frontend/README.md Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: apply Copilot suggestions --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: Willem Jiang <willem.jiang@gmail.com>	2026-04-18 08:21:08 +08:00
yangzheli	4efc8d404f	feat(frontend): set up Vitest frontend testing infrastructure with CI workflow (#2147) * feat: set up Vitest frontend testing infrastructure with CI workflow Migrate existing 4 frontend test files from Node.js native test runner (node:test + node:assert/strict) to Vitest, reorganize test directory structure under tests/unit/ mirroring src/ layout, and add a dedicated CI workflow for frontend unit tests. - Add vitest as devDependency, remove tsx - Create vitest.config.ts with @/ path alias - Migrate tests to Vitest API (test/expect/vi) - Rename .mjs test files to .ts - Move tests from src/ to tests/unit/ (mirrors src/ layout) - Add frontend/Makefile `test` target - Add .github/workflows/frontend-unit-tests.yml (parallel to backend) - Update CONTRIBUTING.md, README.md, AGENTS.md, CLAUDE.md Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * style: fix the lint error * style: fix the lint error --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: Willem Jiang <willem.jiang@gmail.com>	2026-04-12 18:00:43 +08:00
greatmengqi	084dc7e748	ci: enforce code formatting checks for backend and frontend (#1536)	2026-03-29 15:34:38 +08:00
luo jiyin	ca20b48601	chore(ci): align workflow action versions (#1484)	2026-03-27 23:25:55 +08:00
Willem Jiang	d0049ad904	chron(ci):setup the lint check in frontend (#1276) * chron(ci):setup the lint check in frontend * Apply suggestions from code review Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * fix(ci): correct lint-check.yml indentation, add Python 3.12 setup, upgrade checkout to v4 (#1277) * Initial plan * Fix lint-check.yml: fix steps indentation, add Python 3.12 setup, upgrade checkout to v4 Co-authored-by: WillemJiang <219644+WillemJiang@users.noreply.github.com> Agent-Logs-Url: https://github.com/bytedance/deer-flow/sessions/7b4d4fad-f024-453a-9f93-5fc2dd83b471 --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: WillemJiang <219644+WillemJiang@users.noreply.github.com> --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>	2026-03-24 10:48:18 +08:00
Willem Jiang	72f01a1638	Update workflow to trigger on push to main Add push trigger for unit tests on main branch	2026-03-22 17:57:06 +08:00
Salman Chishti	902ff3b9f3	Upgrade GitHub Actions to latest versions (#913) Signed-off-by: Salman Muin Kayser Chishti <13schishti@gmail.com> Co-authored-by: Willem Jiang <willem.jiang@gmail.com>	2026-02-26 22:49:32 +08:00
Salman Chishti	32a22069e9	Upgrade GitHub Actions for Node 24 compatibility (#912) Signed-off-by: Salman Muin Kayser Chishti <13schishti@gmail.com>	2026-02-26 20:23:57 +08:00
DanielWalnut	faa422072c	feat(subagents): make subagent timeout configurable via config.yaml (#897) * feat(subagents): make subagent timeout configurable via config.yaml - Add SubagentsAppConfig supporting global and per-agent timeout_seconds - Load subagents config section in AppConfig.from_file() - Registry now applies config.yaml overrides without mutating builtin defaults - Polling safety-net in task_tool is now dynamic (execution timeout + 60s buffer) - Document subagents section in config.example.yaml - Add make test command and enforce TDD policy in CLAUDE.md - Add 38 unit tests covering config validation, timeout resolution, registry override behavior, and polling timeout formula Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(subagents): add logging for subagent timeout config and execution - Log loaded timeout config (global default + per-agent overrides) on startup - Log debug message in registry when config.yaml overrides a builtin timeout - Include timeout in executor's async execution start log - Log effective timeout and polling limit when a task is dispatched - Fix UnboundLocalError: move max_poll_count assignment before logger.info Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * ci(backend): add lint step and run all unit tests via Makefile Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix lint --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-25 08:39:29 +08:00
Willem Jiang	03705acf3a	fix(sandbox):deer-flow-provisioner container fails to start in local execution mode (#889)	2026-02-24 08:31:52 +08:00
Willem Jiang	a66d8c94fa	Prepare to merge deer-flow-2	2026-02-14 16:28:12 +08:00
Willem Jiang	1d71f8910e	fix: react key warnings from duplicate message IDs + establish jest testing framework (#655) * fix: resolve issue #588 - react key warnings from duplicate message IDs + establish jest testing framework * Update the makefile and workflow with the js test * Apply suggestions from code review Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2025-10-25 20:46:43 +08:00
Willem Jiang	d9f829b608	Add frontend tests step to frontend lint workflow	2025-10-16 19:19:07 +08:00
Willem Jiang	4c17d88029	feat: creating mogodb and postgres mock instance in checkpoint test (#561) * fix: using mongomock for the checkpoint test * Add postgres mock setting to the unit test * Added utils file of postgres_mock_utils * fixed the runtime loading error of deerflow server	2025-09-09 22:49:11 +08:00
Willem Jiang	72f9c59195	feat: add lint check of front-end (#534) * feat: add lint check of front-end * add pnpm installation * add pnpm installation	2025-08-22 21:08:53 +08:00
Willem Jiang	b08e9ad3ac	fix: GitHub workflow action version warning (#520) * fix: using commit hash as the action version * fix: using commit hash as the action version --------- Co-authored-by: Willem Jiang <143703838+willem-bd@users.noreply.github.com>	2025-08-20 14:39:02 +08:00
Willem Jiang	c6d152a074	fix: using commit hash as the action version (#519) Co-authored-by: Willem Jiang <143703838+willem-bd@users.noreply.github.com>	2025-08-20 13:52:00 +08:00
CHANGXUBO	1bfec3ad05	feat: Enhance chat streaming and tool call processing (#498 ) * feat: Enhance chat streaming and tool call processing - Added support for MongoDB checkpointer in the chat streaming workflow. - Introduced functions to process tool call chunks and sanitize arguments. - Improved event message creation with additional metadata. - Enhanced error handling for JSON serialization in event messages. - Updated the frontend to convert escaped characters in tool call arguments. - Refactored the workflow input preparation and initial message processing. - Added new dependencies for MongoDB integration and tool argument sanitization. * fix: Update MongoDB checkpointer configuration to use LANGGRAPH_CHECKPOINT_DB_URL * feat: Add support for Postgres checkpointing and update README with database recommendations * feat: Implement checkpoint saver functionality and update MongoDB connection handling * refactor: Improve code formatting and readability in app.py and json_utils.py * refactor: Clean up commented code and improve formatting in server.py * refactor: Remove unused imports and improve code organization in app.py * refactor: Improve code organization and remove unnecessary comments in app.py * chore: use langgraph-checkpoint-postgres==2.0.21 to avoid the JSON convert issue in the latest version, implement chat stream persistant with Postgres * feat: add MongoDB and PostgreSQL support for LangGraph checkpointing, enhance environment variable handling * fix: update comments for clarity on Windows event loop policy * chore: remove empty code changes in MongoDB and PostgreSQL checkpoint tests * chore: clean up unused imports and code in checkpoint-related files * chore: remove empty code changes in test_checkpoint.py * chore: remove empty code changes in test_checkpoint.py * chore: remove empty code changes in test_checkpoint.py * test: update status code assertions in MCP endpoint tests to allow for 403 responses * test: update MCP endpoint tests to assert 
specific status codes and enable MCP server configuration * chore: remove unnecessary environment variables from unittest workflow * fix: invert condition for MCP server configuration check to raise 403 when disabled * chore: remove pymongo from test dependencies in uv.lock * chore: optimize the _get_agent_name method * test: enhance ChatStreamManager tests for PostgreSQL and MongoDB initialization * test: add persistence tests for ChatStreamManager with PostgreSQL and MongoDB * test: add unit tests for ChatStreamManager initialization with PostgreSQL and MongoDB * test: enhance persistence tests for ChatStreamManager with PostgreSQL and MongoDB to verify message aggregation * test: add unit tests for ChatStreamManager with PostgreSQL and MongoDB * test: add unit tests for ChatStreamManager initialization with PostgreSQL and MongoDB * test: add unit tests for ChatStreamManager initialization with PostgreSQL and MongoDB --------- Co-authored-by: Willem Jiang <willem.jiang@gmail.com>	2025-08-16 21:03:12 +08:00
Willem Jiang	d8016809b2	fix: the typo of setup-uv action (#393) * fix: spine the github hash on the third party actions * fix: the typo of action * fix: try to fix the build by specify the action version	2025-07-07 08:43:11 +08:00
Willem Jiang	6c254c0783	fix: spine the github hash on the third party actions (#392)	2025-07-07 08:18:17 +08:00
Johannes Maron	5977b4a03e	Publish containers to GitHub (#375) This workflow creates two offical container images: * `ghcr.io/codingjoe/deer-flow:main` * `ghcr.io/codingjoe/deer-flow-web:main`	2025-06-29 20:55:51 +08:00
Willem Jiang	db3e74629f	fix: added permissions setting in the workflow (#273) * fix: added permissions setting in the workflow * fix: reformat the code of src/tools/retriever.py	2025-06-03 11:48:51 +08:00
XingLiu0923	9cff113862	feat(ut): add ut coverage check (#170)	2025-05-15 08:56:13 -07:00
DanielWalnut	a4da95f412	chore: add github workflow (#15)	2025-05-09 13:41:05 +08:00
Li Xin	0ac6efa7f0	chore: remove	2025-05-07 15:24:03 +08:00
He Tao	03798ded08	feat: lite deep researcher implementation	2025-04-09 20:32:16 +08:00

31 Commits