mirror of https://github.com/bytedance/deer-flow.git synced 2026-06-09 17:12:01 +00:00

History

feat: MiniMax provider for image/video/podcast skills + new music-generation skill (#3437 )

* docs(spec): MiniMax integration for generation skills + new music skill

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs(plan): MiniMax generation providers implementation plan

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test(skills): add importlib loader + FakeResp for skill tests

* test(skills): register loaded module in sys.modules; raise requests.HTTPError in FakeResp

* feat(image-generation): add MiniMax provider with env auto-detect

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(image-generation): guard unknown provider, derive ref MIME, strengthen tests

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(video-generation): add MiniMax provider with async poll/download

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(video-generation): surface base_resp errors while polling; add timeout test

* feat(podcast-generation): add MiniMax t2a_v2 provider with env auto-detect

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(podcast-generation): restore TTS credential guard; add volcengine + voice tests

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(music-generation): new MiniMax music skill via skill-creator

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* refactor(music-generation): treat empty lyrics as absent; test no-audio-data path

* refactor(skills): add request timeouts to MiniMax network calls

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Potential fix for pull request finding 'Explicit returns mixed with implicit (fall through) returns'

Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>

* fix(models): strip inconsistent user-message names for MiniMax chat

DeerFlow middlewares tag user messages with provenance names (user-input, summary, loop_warning); langchain serializes them into the OpenAI-compatible payload and MiniMax rejects mismatched user-message names with "user name must be consistent (2013)". PatchedChatMiniMax now drops the per-message name from user-role messages. Point the config.example MiniMax models at PatchedChatMiniMax so they also get reasoning_content mapping.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(image-generation): MiniMax sends JSON prompt field, guard 1500-char limit

MiniMax image-01 takes one text string capped at 1500 chars, but the skill was sending the whole structured JSON. The MiniMax provider now extracts the JSON `prompt` field (relying on prompt_optimizer to expand it) and fails fast with a clear error before calling the API when that field exceeds 1500 chars. Authoring stays provider-agnostic; Gemini still receives the full JSON.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(podcast-generation): per-provider TTS concurrency and retry/backoff

Each TTS provider owns its concurrency internally — MiniMax runs single-threaded to reduce rate-limit failures, Volcengine keeps 4 workers — with automatic retry and backoff on transient HTTP and base_resp errors. No caller-facing concurrency knob.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(skills): address Copilot review comments on generation skills

- video: add raise_for_status + timeout to the Gemini download/POST/poll calls so non-2xx responses surface as clear HTTP errors instead of JSON/KeyError or hangs
- video: check the task Fail status before the generic base_resp check so the failure keeps its task_id context
- video/image: create the output file parent directory before writing (matching music-generation) so nested output paths do not raise FileNotFoundError
- music: require a non-empty prompt and fail fast with ValueError instead of sending an empty prompt to the API

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(scripts): reclaim dev ports across worktrees in make stop/dev

All deer-flow worktrees (main checkout + linked worktrees) hardcode the same dev ports (8001/3000/2026), so a service started from any worktree must be reclaimable from another. stop_all now resolves the set of worktree roots (DEERFLOW_ROOTS) and treats a process as deer-flow-owned when its open files live under any of them. It also force-kills survivors on 2026 alongside 8001/3000, fixing `make dev` aborting on the nginx port preflight when a prior nginx lingered on 2026.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(view-image): hide the injected image-context message from the UI

ViewImageMiddleware injects a HumanMessage (text + base64 images) so the vision model can see viewed images, but it was the only internal injector that set neither hide_from_ui nor a hidden name, so it leaked into the chat UI (and IM channels) as a user bubble reading "Here are the images you've viewed:". Mark it with additional_kwargs={"hide_from_ui": True}, matching todo/dynamic_context injections, which the frontend isHiddenFromUIMessage and the channel sender already honor. The model still receives the full content.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(minimax): mark M2.7 models as text-only (no vision)

MiniMax M2.7 / M2.7-highspeed do not support vision; only M3 does. The
provider config asserted vision support for M2.7 in four places.

- config.example.yaml: 4 M2.7 entries -> supports_vision: false
- backend/docs/CONFIGURATION.md: M2.7 + highspeed -> supports_vision: false
- wizard: add LLMProvider.model_vision_overrides + extra_config_for() so
  selecting an M2.7 model writes supports_vision: false while M3 (default)
  keeps vision; wire it through setup_wizard.py
- tests: M2.7-highspeed fixture -> supports_vision=False; add
  test_minimax_vision_is_per_model

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>

2026-06-08 22:04:38 +08:00

API.md

fix(security): harden MCP config endpoint (#3425 )

2026-06-08 12:21:02 +08:00

APPLE_CONTAINER.md

Fix command syntax for container image pull (#1349 )

2026-03-26 00:14:08 +08:00

ARCHITECTURE.md

docs: clarify LangGraph compatibility entrypoints (#2914 )

2026-05-12 23:15:11 +08:00

AUTH_DESIGN.md

docs: document auth design and user isolation (#2913 )

2026-05-12 23:07:11 +08:00

AUTH_TEST_DOCKER_GAP.md

docs: clean gateway runtime transition remnants (#3334 )

2026-06-02 10:03:28 +08:00

AUTH_TEST_PLAN.md

docs: clean standalone LangGraph server remnants (#3301 )

2026-05-29 11:36:45 +08:00

AUTH_UPGRADE.md

docs: clean gateway runtime transition remnants (#3334 )

2026-06-02 10:03:28 +08:00

AUTO_TITLE_GENERATION.md

docs: fix some broken links (#1864 )

2026-04-05 15:35:42 +08:00

BLOCKING_IO_DETECTION.md

fix(agents): offload UploadsMiddleware uploads scan off the event loop (#3311 )

2026-05-30 21:46:35 +08:00

CONFIGURATION.md

feat: MiniMax provider for image/video/podcast skills + new music-generation skill (#3437 )

2026-06-08 22:04:38 +08:00

FILE_UPLOAD.md

fix(uploads): enforce streaming upload limits in gateway (#2589 )

2026-05-01 20:19:30 +08:00

GUARDRAILS.md

fix: rename present_file to present_files in docs and prompts (#2393 )

2026-04-21 16:10:14 +08:00

MCP_SERVER.md

docs: discourage MCP filesystem workspace config (#3141 )

2026-05-22 09:19:23 +08:00

MEMORY_IMPROVEMENTS_SUMMARY.md

refactor: split backend into harness (deerflow.*) and app (app.*) (#1131 )

2026-03-14 22:55:52 +08:00

MEMORY_IMPROVEMENTS.md

fix(memory): inject stored facts into system prompt memory context (#1083 )

2026-03-13 14:37:40 +08:00

MEMORY_SETTINGS_REVIEW.md

feat: support manual add and edit for memory facts (#1538 )

2026-03-29 23:53:23 +08:00

memory-settings-sample.json

feat: support manual add and edit for memory facts (#1538 )

2026-03-29 23:53:23 +08:00

middleware-execution-flow.md

feat(loop-detection): defer warning injection (#2752 )

2026-05-21 14:36:07 +08:00

PATH_EXAMPLES.md

refactor: split backend into harness (deerflow.*) and app (app.*) (#1131 )

2026-03-14 22:55:52 +08:00

plan_mode_usage.md

refactor: split backend into harness (deerflow.*) and app (app.*) (#1131 )

2026-03-14 22:55:52 +08:00

README.md

chore: add sandbox memory profiling tools (#3249 )

2026-06-03 22:02:27 +08:00

REPLAY_E2E.md

fix(replay-e2e): match by conversation, not the living system prompt (#3436 )

2026-06-08 17:32:41 +08:00

rfc-create-deerflow-agent.md

feat: add create_deerflow_agent SDK entry point (Phase 1) (#1203 )

2026-03-29 15:31:18 +08:00

rfc-extract-shared-modules.md

refactor: extract shared skill installer and upload manager to harness (#1202 )

2026-03-25 16:28:33 +08:00

rfc-grep-glob-tools.md

feat(sandbox): add built-in grep and glob tools (#1784 )

2026-04-03 16:03:06 +08:00

SANDBOX_MEMORY_PROFILING.md

chore: add sandbox memory profiling tools (#3249 )

2026-06-03 22:02:27 +08:00

SETUP.md

fix(harness): resolve runtime paths from project root (#2642 )

2026-05-01 22:19:50 +08:00

STREAMING.md

fix(backend): stream DeerFlowClient AI text as token deltas (#1969 ) (#1974 )

2026-04-10 18:16:38 +08:00

summarization.md

fix(middleware): avoid rescuing non-skill tool outputs during summarization (#2458 )

2026-04-24 21:19:46 +08:00

task_tool_improvements.md

refactor: split backend into harness (deerflow.*) and app (app.*) (#1131 )

2026-03-14 22:55:52 +08:00

TITLE_GENERATION_IMPLEMENTATION.md

feat(persistence):Unified persistence layer with event store, feedback, and rebase cleanup (#2134 )

2026-04-26 11:09:55 +08:00

TODO.md

docs: clean standalone LangGraph server remnants (#3301 )

2026-05-29 11:36:45 +08:00

README.md

Documentation

This directory contains detailed documentation for the DeerFlow backend.

Quick Links

Document	Description
ARCHITECTURE.md	System architecture overview
API.md	Complete API reference
AUTH_DESIGN.md	User authentication, CSRF, and per-user isolation design
CONFIGURATION.md	Configuration options
SETUP.md	Quick setup guide

Feature Documentation

Document	Description
STREAMING.md	Token-level streaming design: Gateway vs DeerFlowClient paths, `stream_mode` semantics, per-id dedup
FILE_UPLOAD.md	File upload functionality
PATH_EXAMPLES.md	Path types and usage examples
SANDBOX_MEMORY_PROFILING.md	Sandbox memory baseline and runtime comparison guide
summarization.md	Context summarization feature
plan_mode_usage.md	Plan mode with TodoList
AUTO_TITLE_GENERATION.md	Automatic title generation

Development

Document	Description
TODO.md	Planned features and known issues

Getting Started

New to DeerFlow? Start with SETUP.md for quick installation
Configuring the system? See CONFIGURATION.md
Understanding the architecture? Read ARCHITECTURE.md
Building integrations? Check API.md for API reference

Document Organization

docs/
├── README.md                  # This file
├── ARCHITECTURE.md            # System architecture
├── API.md                     # API reference
├── AUTH_DESIGN.md             # User authentication and isolation design
├── CONFIGURATION.md           # Configuration guide
├── SETUP.md                   # Setup instructions
├── FILE_UPLOAD.md             # File upload feature
├── PATH_EXAMPLES.md           # Path usage examples
├── summarization.md           # Summarization feature
├── plan_mode_usage.md         # Plan mode feature
├── STREAMING.md               # Token-level streaming design
├── AUTO_TITLE_GENERATION.md   # Title generation
├── TITLE_GENERATION_IMPLEMENTATION.md  # Title implementation details
└── TODO.md                    # Roadmap and issues