deer-flow

mirror of https://github.com/bytedance/deer-flow.git synced 2026-04-25 11:18:22 +00:00

Author	SHA1	Message	Date
KKK	654354c624	test(skills): add evaluation + trigger analysis for systematic-literature-review (#2061) * test(skills): add trigger eval set for systematic-literature-review skill 20 eval queries (10 should-trigger, 10 should-not-trigger) for use with skill-creator's run_eval.py. Includes real-world SLR queries contributed by @VANDRANKI (issue #1862 author) and edge cases for routing disambiguation with academic-paper-review. * test(skills): add grader expectations for SLR skill evaluation 5 eval cases with 39 expectations covering: - Standard SLR flow (APA/BibTeX/IEEE format selection) - Keyword extraction and search behavior - Subagent dispatch for metadata extraction - Report structure (themes, convergences, gaps, per-paper annotations) - Negative case: single-paper routing to academic-paper-review - Edge case: implicit SLR without explicit keywords * refactor(skills): shorten SLR description for better trigger rate Reduce description from 833 to 344 chars. Key changes: - Lead with "systematic literature review" as primary trigger phrase - Strengthen single-paper exclusion: "Not for single-paper tasks" - Remove verbose example patterns that didn't improve routing Tested with run_eval.py (10 runs/query): - False positive "best paper on RL": 67% → 20% (improved) - True positive explicit SLR query: ~30% (unchanged) Low recall is a routing-layer limitation, not a description issue; see PR description for full analysis. * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>	2026-04-10 18:02:45 +08:00
KKK	16aa51c9b3	feat(skills): add systematic-literature-review skill for multi-paper SLR workflows (#2032) * feat(skills): add systematic-literature-review skill for multi-paper SLR workflows Adds a new skill that produces a structured systematic literature review (SLR) across multiple academic papers on a topic. Addresses #1862 with a pure skill approach: no new tools, no architectural changes, no new dependencies. Skill layout: - SKILL.md: 4+1 phase workflow (plan, search, extract, synthesize, present) - scripts/arxiv_search.py: arXiv API client, stdlib only, with a requests->urllib fallback shim modeled after github-deep-research's github_api.py - templates/{apa,ieee,bibtex}.md: citation format templates selected dynamically in Phase 4, mirroring podcast-generation's templates/ pattern Design notes: - Multi-paper synthesis uses the existing `task` tool to dispatch extraction subagents in parallel. SKILL.md's Phase 3 includes a fixed decision table for batch splitting to respect the runtime's MAX_CONCURRENT_SUBAGENTS = 3 cap, and explicitly tells the agent to strip the "Task Succeeded. Result: " prefix before parsing subagent JSON output. - arXiv only, by design. Semantic Scholar and PubMed adapters would push the scope toward a standalone MCP server (see #933) and are intentionally out of scope for this skill. - Coexists with the existing `academic-paper-review` skill: this skill does breadth-first synthesis across many papers, academic-paper-review does single-paper peer review. The two are routed via distinct triggers and can compose (SLR on many + deep review on 1-2 important ones). - Hard upper bound of 50 papers, tied to the Phase 3 concurrency strategy. Larger surveys degrade in synthesis quality and are better split by sub-topic. BibTeX template explicitly uses @misc for arXiv preprints (not @article), which is the most common mistake when generating BibTeX for arXiv papers. 
arxiv_search.py was smoke-tested end-to-end against the live arXiv API with two query shapes (relevance sort, submittedDate sort with category filter); all returned JSON fields parse correctly (id normalization, Atom namespace handling, URL encoding for multi-word queries). * fix(skills): prevent LLM from saving intermediate search results to file Adds an explicit "do not save" instruction at the end of Phase 2. Observed during Test 1 with DeepSeek: the model saved search results to a markdown file before proceeding to Phase 3, wasting 2-3 tool call rounds and increasing the risk of hitting the graph recursion limit. The search JSON should stay in context for Phase 3, not be persisted. * fix(skills): use relevance+start-date instead of submittedDate sorting Test 2 revealed that arXiv's submittedDate sorting returns the most recently submitted papers in the category regardless of query relevance. Searching "diffusion models" with sortBy=submittedDate in cs.CV returned papers on spatial memory, Navier-Stokes, and photon-counting CT — none about diffusion models. The LLM then retried with 4 different queries, wasting tool calls and approaching the recursion limit. Fix: always sort by relevance; when the user wants "recent" papers, combine relevance sorting with --start-date to constrain the time window. Also add an explicit "run the search exactly once" instruction to prevent the retry loop. * fix(skills): wrap multi-word arXiv queries in double quotes for phrase matching Without quotes, `all:diffusion model` is parsed by arXiv's Lucene as `all:diffusion OR model`, pulling in unrelated papers from physics (thermal diffusion) and other fields. Wrapping in double quotes forces phrase matching: `all:"diffusion model"`. Also fixes date filtering: the previous bug caused 2011 papers to appear in results despite --start-date 2024-04-09, because the unquoted query words were OR'd with the date constraint. 
Verified: "diffusion models" --category cs.CV --start-date 2024-04-09 now returns only relevant diffusion model papers published after April 2024. * fix(skills): add query phrasing guide and enforce subagent delegation Two fixes from Test 2 observations with DeepSeek: 1. Query phrasing: add a table showing good vs bad query examples. The script wraps multi-word queries in double quotes for phrase matching, so long queries like "diffusion models in computer vision" return 0 results. Guide the LLM to use 2-3 core keywords + --category instead. 2. Subagent enforcement: DeepSeek was extracting metadata inline via python -c scripts instead of using the task tool. Strengthen Phase 3 to explicitly name the task tool, say "do not extract metadata yourself", and explain why (token budget, isolation). This is more direct than the previous natural-language-only approach while still providing the reasoning behind the constraint. * fix(skills): strengthen search keyword guidance and subagent enforcement Address two issues found during end-to-end testing with DeepSeek: 1. Search retry: LLM passed full topic descriptions as queries (e.g. "diffusion models in computer vision"), which returned 0 results due to exact phrase matching and triggered retries. Added explicit instruction to extract 2-3 core keywords before searching. 2. Subagent bypass: LLM used python -c to extract metadata instead of dispatching via task tool. Added explicit prohibition list (python -c, bash scripts, inline extraction) with ❌ markers for clarity. * fix(skills): address Copilot review feedback on SLR skill - Fix legacy arXiv ID parsing: preserve archive prefix for pre-2007 papers (e.g. 
hep-th/9901001 instead of just 9901001) - Fix phase count: "four phases" -> "five phases" - Add subagent_enabled prerequisite note to SKILL.md Notes section - Remove PR-specific references ("PR 1") from ieee.md and bibtex.md templates, replace with workflow-scoped wording - Fix script header: "stdlib only" -> "no additional dependencies required", fix relative path to github_api.py reference - Remove reference to non-existent docs/enhancement/ path in header * Apply suggestions from code review Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: Willem Jiang <willem.jiang@gmail.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2026-04-10 08:54:28 +08:00
Adem Akdoğan	8bb14fa1a7	feat(skills): add academic-paper-review, code-documentation, and newsletter-generation skills (#1861) Add three new public skills to enhance DeerFlow's content creation capabilities: - academic-paper-review: Structured peer-review-quality analysis of research papers following top-venue review standards (NeurIPS, ICML, ACL). Covers methodology assessment, contribution evaluation, literature positioning, and constructive feedback with a 3-phase workflow. - code-documentation: Professional documentation generation for software projects, including README generation, API reference docs, architecture documentation with Mermaid diagrams, and inline code documentation supporting Python, TypeScript, Go, Rust, and Java conventions. - newsletter-generation: Curated newsletter creation with research workflow, supporting daily digest, weekly roundup, deep-dive, and industry briefing formats. Includes audience-specific tone adaptation and multi-source content curation. All skills: - Follow the existing SKILL.md frontmatter convention (name + description) - Pass the official _validate_skill_frontmatter() validation - Use hyphen-case naming consistent with existing skills - Contain only allowed frontmatter properties - Include comprehensive examples, quality checklists, and output templates	2026-04-05 10:19:35 +08:00
SCPZ24	6b13f5c9fb	feat: Support GitHub PAT configuration for higher GitHub API rate limits. (#1374) * feat: Add GitHub PAT configs, allowing higher GitHub API rate limits. * Update comment to English for better clarity * fix: Remove unused config lines in config.example.yaml and unreferenced declarations in app_config. Fix lint issues and update documentation. * fix: Remove unused imports so that the ruff check passes. --------- Co-authored-by: Willem Jiang <willem.jiang@gmail.com>	2026-03-27 09:54:14 +08:00
orbisai0security	14a3fa5290	fix: use subprocess instead of os.system in analyze.py (#1289) The data analysis skill executes shell commands using os Resolves V-001 Co-authored-by: orbisai0security <orbisai0security@users.noreply.github.com>	2026-03-24 20:42:03 +08:00
Jason	79acc3939a	fix: add error handling for podcast generation failures (#1257) * fix: add error handling for podcast generation failures When TTS processing fails, the system was generating 0-second audio files without any error indication. This fix adds: 1. Track failed TTS lines and log warning with indices 2. Raise ValueError when all TTS generation fails with helpful message 3. Check for empty audio output in mix_audio and raise error 4. Log success/failure ratio for debugging Fixes #30 * fix: address Copilot review feedback - Use `not audio` to catch both None and empty bytes - Log failed lines with 1-based indices for user-friendly output - Handle empty script case with clear error message - Validate env vars before ThreadPoolExecutor for fast-fail on config errors --------- Co-authored-by: Willem Jiang <willem.jiang@gmail.com>	2026-03-24 00:20:12 +08:00
lailoo	9809af1f26	feat: add citation/reference support to deep research reports (#1143) * feat: add citation/reference support to deep research reports (#1141) - Enhance lead agent system prompt with mandatory citation requirements after web_search/web_fetch tool usage - Add citation examples and best practices to GitHub Deep Research skill - Add citation hints to report template (Executive Summary, Key Analysis) - Style regular markdown links in frontend for visual distinction (color, underline, hover effect) - Fix TitleMiddleware being registered when title generation is disabled * fix: address PR review comments - Revert TitleMiddleware conditional registration (agent.py) to avoid sync/async incompatibility with DeerFlowClient - Fix markdown link rendering: merge classNames instead of overwriting, only set target=_blank for external http(s) URLs - Remove unrelated package.json/pnpm-lock.yaml changes * fix: use plain markdown links in Sources section for cleaner rendering Inline citations in report body use [citation:Title](URL) for pill/badge style. Sources section uses plain [Title](URL) for simple underlined link style. * fix(frontend): render plain links as underlined text in artifact markdown Only links with citation: prefix render as Badge pills. Regular links in Sources section now render as underlined text links. --------- Co-authored-by: Willem Jiang <willem.jiang@gmail.com>	2026-03-17 09:51:08 +08:00
-Astraia-	191b60a326	fix: issue 1138 windows encoding (#1139) * fix(windows): use utf-8 for text file operations * fix(windows): normalize sandbox path masking * fix(windows): preserve utf-8 handling after backend split	2026-03-16 16:53:12 +08:00
DanielWalnut	8871fca5cb	feat: add claude-to-deerflow skill for DeerFlow API integration (#1024) * feat: add claude-to-deerflow skill for DeerFlow API integration Add a new skill that enables Claude Code to interact with the DeerFlow AI agent platform via its HTTP API, including chat streaming and status checking capabilities. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: fix telegram channel --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-08 22:06:24 +08:00
DanielWalnut	75b7302000	feat: add IM channels for Feishu, Slack, and Telegram (#1010) * feat: add IM channels system for Feishu, Slack, and Telegram integration Bridge external messaging platforms to DeerFlow via LangGraph Server with async message bus, thread management, and per-channel configuration. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address review comments on IM channels system Fix topic_id handling in store remove/list_entries and manager commands, correct Telegram reply threading, remove unused imports/variables, update docstrings and docs to match implementation, and prevent config mutation. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * update skill creator * fix im reply text * fix comments --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-08 15:21:18 +08:00
JeffJiang	7de94394d4	feat(agent): support custom agents and refactor the chat experience (#957) * feat: add agent management functionality with creation, editing, and deletion * feat: enhance agent creation and chat experience - Added AgentWelcome component to display agent description on new thread creation. - Improved agent name validation with availability check during agent creation. - Updated NewAgentPage to handle agent creation flow more effectively, including enhanced error handling and user feedback. - Refactored chat components to streamline message handling and improve user experience. - Introduced new bootstrap skill for personalized onboarding conversations, including detailed conversation phases and a structured SOUL.md template. - Updated localization files to reflect new features and error messages. - General code cleanup and optimizations across various components and hooks. * Refactor workspace layout and agent management components - Updated WorkspaceLayout to use useLayoutEffect for sidebar state initialization. - Removed unused AgentFormDialog and related edit functionality from AgentCard. - Introduced ArtifactTrigger component to manage artifact visibility. - Enhanced ChatBox to handle artifact selection and display. - Improved message list rendering logic to avoid loading states. - Updated localization files to remove deprecated keys and add new translations. - Refined hooks for local settings and thread management to improve performance and clarity. - Added temporal awareness guidelines to deep research skill documentation. 
* feat: refactor chat components and introduce thread management hooks * feat: improve artifact file detail preview logic and clean up console logs * feat: refactor lead agent creation logic and improve logging details * feat: validate agent name format and enhance error handling in agent setup * feat: simplify thread search query by removing unnecessary metadata * feat: update query key in useDeleteThread and useRenameThread for consistency * feat: add isMock parameter to thread and artifact handling for improved testing * fix: reorder import of setup_agent for consistency in builtins module * feat: append mock parameter to thread links in CaseStudySection for testing purposes * fix: update load_agent_soul calls to use cfg.name for improved clarity * fix: update date format in apply_prompt_template for consistency * feat: integrate isMock parameter into artifact content loading for enhanced testing * docs: add license section to SKILL.md for clarity and attribution * feat(agent): enhance model resolution and agent configuration handling * chore: remove unused import of _resolve_model_name from agents * feat(agent): remove unused field * fix(agent): set default value for requested_model_name in _resolve_model_name function * feat(agent): update get_available_tools call to handle optional agent_config and improve middleware function signature --------- Co-authored-by: Willem Jiang <willem.jiang@gmail.com>	2026-03-03 21:32:01 +08:00
JeffJiang	33595f0bac	fix(skill): enhance data authenticity protocols and clarify reporting guidelines (#905)	2026-02-25 22:25:23 +08:00
JeffJiang	4d5fdcb8db	Consolidates market and data analysis skills; adds chart viz (#36) Unifies market analysis, data analysis, and consulting reporting into a comprehensive consulting-analysis skill, enabling a two-phase workflow from analysis framework design to professional report generation. Introduces a DuckDB-based data analysis utility for Excel/CSV files and a chart-visualization skill with a flexible JS interface and extensive chart type documentation. Removes the legacy market analysis skill to streamline report generation and improve extensibility for consulting and data-driven workflows.	2026-02-12 11:08:09 +08:00
hetao	e87fd74e17	docs(ppt-generation): enforce sequential slide image generation Explicitly prohibit parallel image generation to ensure each slide can use the previous slide as a reference image for visual consistency. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-11 15:36:51 +08:00
LofiSu	46048c76ce	chore: remove all citations-related logic in preparation for a later refactor - Backend: remove citations_format and citation-related reminders from lead_agent / general_purpose; artifact downloads no longer strip citations from markdown and always go through FileResponse, keeping Response for binary inline content - Frontend: remove the core/citations module, inline-citation, and safe-citation-content; add MarkdownContent, which only renders Markdown; message/artifact preview and copy both use the raw content - i18n: remove the citations namespace (loadingCitations, loadingCitationsWithCount) - Skills and demo: change wording to "references"; remove <citations> blocks from demo data - Docs: update the CLAUDE/AGENTS/README descriptions and add a per-file diff summary of the code changes Co-authored-by: Cursor <cursoragent@cursor.com>	2026-02-09 16:24:01 +08:00
hetaoBackend	6eb4cdd3ec	feat: disallow present_files tool in subagents and add market-analysis skill Add "present_files" to disallowed_tools for bash and general-purpose subagents to prevent them from presenting files directly. Also add the new market-analysis skill for generating consulting-grade reports. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-08 23:38:55 +08:00
Henry Li	60be7ee20d	docs: update description for surprise-me skill to enhance clarity	2026-02-07 10:51:43 +08:00
Henry Li	c758a28a3e	styles: format	2026-02-07 10:50:08 +08:00
Henry Li	22dea3fd43	feat: add surprise-me	2026-02-06 14:04:15 +08:00
hetao	db0461142e	feat: enhance memory system with tiktoken and improved prompt guidelines Add accurate token counting using tiktoken library and significantly enhance memory update prompts with detailed section guidelines, multilingual support, and improved fact extraction. Update deep-research skill to be more proactive for research queries. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-04 20:44:26 +08:00
Henry Li	efd56fdf51	feat: use list of links	2026-02-02 13:25:21 +08:00
hetaoBackend	f082ef3d87	feat: add find-skills skill for discovering agent skills Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-01 23:54:08 +08:00
Henry Li	f206a574c5	feat: update github-deep-research skill	2026-02-01 10:55:21 +08:00
Henry Li	46feff6c16	feat: add github-deep-research skill	2026-02-01 10:54:19 +08:00
hetaoBackend	43ee8a2968	fix: fix aio sandbox shutdown bug	2026-01-30 22:02:07 +08:00
hetao	2c7a56dd33	feat: optimize vision tools and image handling - Add model-aware vision tool loading based on supports_vision flag - Move view_image_tool from config to builtin tools for dynamic inclusion - Add timeout to image search to prevent hanging requests - Optimize image search results format using thumbnails - Add image validation for reference images in generation - Improve error handling with detailed messages Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-01-29 14:57:26 +08:00
hetaoBackend	1926c58cf2	feat: add image search builtin tool	2026-01-29 08:23:50 +08:00
hetaoBackend	248ffe61bc	feat: modernize PPT styles and add deep-research skill Update presentation generation with contemporary design styles (glassmorphism, dark-premium, neo-brutalist, etc.) and add a new deep-research skill to guide thorough web research before content generation tasks. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-29 01:54:57 +08:00
Henry Li	90782f29a2	chore: remove	2026-01-27 13:37:09 +08:00
hetao	9215c9cce7	feat: add ppt-generation skill Creates presentations by generating AI images for each slide and composing them into PPTX files. Features include: - Multiple presentation styles (business, academic, minimal, keynote, creative) - Visual consistency through reference image chaining (each slide uses the previous slide as reference) - Speaker notes from presentation plan Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-26 19:35:29 +08:00
hetao	22004406a7	perf: parallelize TTS generation in podcast skill Use ThreadPoolExecutor to generate audio for multiple script lines concurrently, significantly speeding up podcast generation. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-26 14:50:56 +08:00
hetao	dddd745b5b	refactor: simplify podcast-generation to use direct JSON script input - Remove LLM script generation from Python script, model now generates JSON script directly (similar to image-generation skill) - Add --transcript-file option to generate markdown transcript - Add optional "title" field in JSON for transcript heading - Remove dependency on OPENAI_API_KEY for podcast generation - Update SKILL.md with new workflow and JSON format documentation Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-26 14:01:48 +08:00
hetaoBackend	9f5658fa0e	feat: add podcast generation skill - Add podcast-generation skill for creating tech explainer podcasts - Include generate.py script with TTS synthesis capabilities - Add tech-explainer template for structured podcast content - Increase sandbox command timeout from 30s to 600s to support longer-running skill scripts	2026-01-26 13:16:35 +08:00
Henry Li	ae0e7de3b7	feat: add image and video generation skills	2026-01-25 21:57:44 +08:00
hetao	6e147a772e	feat: add environment variable injection for Docker sandbox - Add environment field to sandbox config for injecting env vars into container - Support $VAR syntax to resolve values from host environment variables - Refactor frontend API modules to use centralized getBackendBaseURL() - Improve Doraemon skill with explicit input/output path arguments - Add .env.example file Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-24 22:36:05 +08:00
Henry Li	c468381064	feat: add Doraemon Skill	2026-01-24 21:54:01 +08:00
hetao	08101aa432	refactor: refine skills	2026-01-21 21:22:56 +08:00
hetao	5a45b9c131	feat: add SSE and HTTP transport support for MCP servers - Add type, url, and headers fields to MCP server config - Update MCP client to handle stdio, sse, and http transports - Add todos field to ThreadState - Add Deerflow branding requirement to frontend-design skill - Update extensions_config.example.json with SSE/HTTP examples Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-21 16:14:00 +08:00
hetaoBackend	5888a5ba16	fix: fix skill md path	2026-01-20 21:10:05 +08:00
hetaoBackend	50810c8212	feat: add skills api	2026-01-20 13:57:36 +08:00
DanielWalnut	9f755ecc30	feat: add skills system for specialized agent workflows (#6) Implement a skills framework that enables specialized workflows for specific tasks (e.g., PDF processing, web page generation). Skills are discovered from the skills/ directory and automatically mounted in sandboxes with path mapping support. - Add SkillsConfig for configuring skills path and container mount point - Implement dynamic skill loading from SKILL.md files with YAML frontmatter - Add path mapping in LocalSandbox to translate container paths to local paths - Mount skills directory in AIO Docker sandbox containers - Update lead agent prompt to dynamically inject available skills - Add setup documentation and expand config.example.yaml Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-16 14:44:51 +08:00

41 Commits