deer-flow

mirror of https://github.com/bytedance/deer-flow.git synced 2026-06-09 17:12:01 +00:00

History

fix(uploads): handle split-bold headings and ** ** artefacts in extract_outline (#1838 )

* feat(uploads): guide agent to use grep/glob/read_file for uploaded documents

Add workflow guidance to the <uploaded_files> context block so the agent
knows to use grep and glob (added in #1784) alongside read_file when
working with uploaded documents, rather than falling back to web search.

This is the final piece of the three-PR PDF agentic search pipeline:
- PR1 (#1727): pymupdf4llm converter produces structured Markdown with headings
- PR2 (#1738): document outline injected into agent context with line numbers
- PR3 (this):  agent guided to use outline + grep + read_file workflow

* feat(uploads): add file-first priority and fallback guidance to uploaded_files context

* fix(uploads): handle split-bold headings and ** ** artefacts in extract_outline

- Add _clean_bold_title() to merge adjacent bold spans (** **) produced
  by pymupdf4llm when bold text crosses span boundaries
- Add _SPLIT_BOLD_HEADING_RE (Style 3) to recognise **<num>** **<title>**
  headings common in academic papers; excludes pure-number table headers
  and rows with more than 4 bold blocks
- When outline is empty, read first 5 non-empty lines of the .md as a
  content preview and surface a grep hint in the agent context
- Update _format_file_entry to render the preview + grep hint instead of
  silently omitting the outline section
- Add 3 new extract_outline tests and 2 new middleware tests (65 total)

* fix(uploads): address Copilot review comments on extract_outline regex

- Replace ASCII [A-Za-z] guard with negative lookahead to support non-ASCII
  titles (e.g. **1** **概述**); pure-numeric/punctuation blocks still excluded
- Replace .+ with [^*]+ and cap repetition at {0,2} (four blocks total) to
  keep _SPLIT_BOLD_HEADING_RE linear and avoid ReDoS on malformed input
- Remove now-redundant len(blocks) <= 4 code-level check (enforced by regex)
- Log debug message with exc_info when preview extraction fails

2026-04-04 14:25:08 +08:00

conftest.py

refactor: split backend into harness (deerflow.*) and app (app.*) (#1131 )

2026-03-14 22:55:52 +08:00

test_acp_config.py

feat(acp): add env field to ACPAgentConfig for subprocess env injection (#1447 )

2026-03-27 20:03:30 +08:00

test_aio_sandbox_local_backend.py

fix: use safe docker bind mount syntax for sandbox mounts (#1655 )

2026-04-01 11:42:12 +08:00

test_aio_sandbox_provider.py

fix Windows Docker sandbox path mounting (#1634 )

2026-03-31 22:19:27 +08:00

test_aio_sandbox.py

fix: prevent concurrent subagent file write conflicts in sandbox tools (#1714 )

2026-04-02 15:39:41 +08:00

test_app_config_reload.py

fix(config): reload AppConfig when config path or mtime changes (#1239 )

2026-03-22 20:34:01 +08:00

test_artifacts_router.py

fix(gateway): enforce safe download for active artifact MIME types to mitigate stored XSS (#1389 )

2026-03-26 17:44:25 +08:00

test_channel_file_attachments.py

refactor: split backend into harness (deerflow.*) and app (app.*) (#1131 )

2026-03-14 22:55:52 +08:00

test_channels.py

feat: support wecom channel (#1390 )

2026-04-04 11:28:35 +08:00

test_checkpointer_none_fix.py

refactor: split backend into harness (deerflow.*) and app (app.*) (#1131 )

2026-03-14 22:55:52 +08:00

test_checkpointer.py

fix: normalize structured LLM content in serialization and memory updater (#1215 )

2026-03-22 17:29:29 +08:00

test_claude_provider_oauth_billing.py

fix(oauth): Harden Claude OAuth cache-control handling (#1583 )

2026-03-30 07:41:18 +08:00

test_cli_auth_providers.py

feat(harness): integration ACP agent tool (#1344 )

2026-03-26 14:20:18 +08:00

test_client_e2e.py

[Security] Address critical host-shell escape in LocalSandboxProvider (#1547 )

2026-03-29 21:03:58 +08:00

test_client_live.py

[Security] Address critical host-shell escape in LocalSandboxProvider (#1547 )

2026-03-29 21:03:58 +08:00

test_client.py

feat(client): add available_skills parameter to DeerFlowClient (#1779 )

2026-04-03 11:22:58 +08:00

test_config_version.py

refactor: split backend into harness (deerflow.*) and app (app.*) (#1131 )

2026-03-14 22:55:52 +08:00

test_create_deerflow_agent_live.py

feat: add create_deerflow_agent SDK entry point (Phase 1) (#1203 )

2026-03-29 15:31:18 +08:00

test_create_deerflow_agent.py

style: format unformatted files and add .omc/ to prettierignore (#1539 )

2026-03-29 16:45:31 +08:00

test_credential_loader.py

feat: add Claude Code OAuth and Codex CLI as LLM providers (#1166 )

2026-03-22 22:39:50 +08:00

test_custom_agent.py

feat/per agent skill filter (#1650 )

2026-04-02 15:02:09 +08:00

test_dangling_tool_call_middleware.py

test: add unit tests for DanglingToolCallMiddleware (#1305 )

2026-03-26 00:20:08 +08:00

test_docker_sandbox_mode_detection.py

fix Windows Docker sandbox path mounting (#1634 )

2026-03-31 22:19:27 +08:00

test_feishu_parser.py

fix: avoid treating Feishu file paths as commands (#1654 )

2026-04-01 23:23:00 +08:00

test_file_conversion.py

fix(uploads): handle split-bold headings and ** ** artefacts in extract_outline (#1838 )

2026-04-04 14:25:08 +08:00

test_gateway_services.py

fix(gateway): prevent 400 error when client sends context with configurable (#1660 )

2026-04-01 23:21:32 +08:00

test_guardrail_middleware.py

feat(guardrails): add pre-tool-call authorization middleware with pluggable providers (#1240 )

2026-03-23 18:07:33 +08:00

test_harness_boundary.py

refactor: split backend into harness (deerflow.*) and app (app.*) (#1131 )

2026-03-14 22:55:52 +08:00

test_infoquest_client.py

feat(harness): integration ACP agent tool (#1344 )

2026-03-26 14:20:18 +08:00

test_invoke_acp_agent_tool.py

fix ACP mcpServers payload (#1735 )

2026-04-03 15:28:56 +08:00

test_jina_client.py

refactor: replace sync requests with async httpx in Jina AI client (#1603 )

2026-04-01 17:02:39 +08:00

test_lead_agent_model_resolution.py

ci: enforce code formatting checks for backend and frontend (#1536 )

2026-03-29 15:34:38 +08:00

test_lead_agent_prompt.py

fix: surface configured sandbox mounts to agents (#1638 )

2026-03-31 22:22:30 +08:00

test_lead_agent_skills.py

feat/per agent skill filter (#1650 )

2026-04-02 15:02:09 +08:00

test_llm_error_handling_middleware.py

Fix/1681 llm call retry handling (#1683 )

2026-04-02 10:12:17 +08:00

test_local_bash_tool_loading.py

[Security] Address critical host-shell escape in LocalSandboxProvider (#1547 )

2026-03-29 21:03:58 +08:00

test_local_sandbox_encoding.py

fix: add Windows shell fallback for local sandbox (#1505 )

2026-03-29 21:31:29 +08:00

test_local_sandbox_provider_mounts.py

feat(sandbox): add read-only support for local sandbox path mappings (#1808 )

2026-04-03 19:46:22 +08:00

test_loop_detection_middleware.py

fix(middleware): handle list-type AIMessage.content in LoopDetectionMiddleware (#1823 )

2026-04-04 10:38:22 +08:00

test_mcp_client_config.py

refactor: split backend into harness (deerflow.*) and app (app.*) (#1131 )

2026-03-14 22:55:52 +08:00

test_mcp_oauth.py

refactor: split backend into harness (deerflow.*) and app (app.*) (#1131 )

2026-03-14 22:55:52 +08:00

test_mcp_sync_wrapper.py

feat(harness): integration ACP agent tool (#1344 )

2026-03-26 14:20:18 +08:00

test_memory_prompt_injection.py

fix: inject longTermBackground into memory prompt (#1734 )

2026-04-03 11:21:58 +08:00

test_memory_queue.py

feat(memory): structured reflection + correction detection in MemoryMiddleware (#1620 ) (#1668 )

2026-04-01 16:45:29 +08:00

test_memory_router.py

feat(memory): structured reflection + correction detection in MemoryMiddleware (#1620 ) (#1668 )

2026-04-01 16:45:29 +08:00

test_memory_storage.py

ci: enforce code formatting checks for backend and frontend (#1536 )

2026-03-29 15:34:38 +08:00

test_memory_updater.py

feat(memory): structured reflection + correction detection in MemoryMiddleware (#1620 ) (#1668 )

2026-04-01 16:45:29 +08:00

test_memory_upload_filtering.py

feat(memory): structured reflection + correction detection in MemoryMiddleware (#1620 ) (#1668 )

2026-04-01 16:45:29 +08:00

test_model_config.py

feat(codex): support explicit OpenAI Responses API config (#1235 )

2026-03-22 20:39:26 +08:00

test_model_factory.py

feat(tracing): add optional Langfuse support (#1717 )

2026-04-02 13:06:10 +08:00

test_patched_minimax.py

fix: improve MiniMax code plan integration (#1169 )

2026-03-20 17:18:59 +08:00

test_patched_openai.py

fix(LLM): fixing Gemini thinking + tool calls via OpenAI gateway (#1180 ) (#1205 )

2026-03-26 15:07:05 +08:00

test_present_file_tool_core_logic.py

refactor: split backend into harness (deerflow.*) and app (app.*) (#1131 )

2026-03-14 22:55:52 +08:00

test_provisioner_kubeconfig.py

feat(subagents): make subagent timeout configurable via config.yaml (#897 )

2026-02-25 08:39:29 +08:00

test_readability.py

refactor: split backend into harness (deerflow.*) and app (app.*) (#1131 )

2026-03-14 22:55:52 +08:00

test_reflection_resolvers.py

refactor: split backend into harness (deerflow.*) and app (app.*) (#1131 )

2026-03-14 22:55:52 +08:00

test_run_manager.py

fix: surface configured sandbox mounts to agents (#1638 )

2026-03-31 22:22:30 +08:00

test_sandbox_audit_middleware.py

feat(sandbox): add SandboxAuditMiddleware for bash command security auditing (#1532 )

2026-03-30 07:48:31 +08:00

test_sandbox_search_tools.py

feat(sandbox): add built-in grep and glob tools (#1784 )

2026-04-03 16:03:06 +08:00

test_sandbox_tools_security.py

feat(sandbox): add read-only support for local sandbox path mappings (#1808 )

2026-04-03 19:46:22 +08:00

test_serialization.py

feat(gateway): implement LangGraph Platform API in Gateway, replace langgraph-cli (#1403 )

2026-03-30 16:02:23 +08:00

test_serialize_message_content.py

feat(harness): integration ACP agent tool (#1344 )

2026-03-26 14:20:18 +08:00

test_skills_archive_root.py

refactor: extract shared skill installer and upload manager to harness (#1202 )

2026-03-25 16:28:33 +08:00

test_skills_installer.py

Fix Windows backend test compatibility (#1384 )

2026-03-26 17:39:16 +08:00

test_skills_loader.py

feat(harness): integration ACP agent tool (#1344 )

2026-03-26 14:20:18 +08:00

test_skills_parser.py

fix(skills): support parsing multiline YAML strings in SKILL.md frontmatter (#1703 )

2026-04-01 23:08:30 +08:00

test_skills_validation.py

test: add unit tests for skill frontmatter validation (#1309 )

2026-03-27 20:20:31 +08:00

test_sse_format.py

feat(gateway): implement LangGraph Platform API in Gateway, replace langgraph-cli (#1403 )

2026-03-30 16:02:23 +08:00

test_stream_bridge.py

fix: guarantee END sentinel delivery when stream bridge queue is full (#1695 )

2026-04-03 20:12:30 +08:00

test_subagent_executor.py

refactor: split backend into harness (deerflow.*) and app (app.*) (#1131 )

2026-03-14 22:55:52 +08:00

test_subagent_limit_middleware.py

test: add unit tests for SubagentLimitMiddleware (#1306 )

2026-03-25 10:20:16 +08:00

test_subagent_prompt_security.py

[Security] Address critical host-shell escape in LocalSandboxProvider (#1547 )

2026-03-29 21:03:58 +08:00

test_subagent_timeout_config.py

refactor: split backend into harness (deerflow.*) and app (app.*) (#1131 )

2026-03-14 22:55:52 +08:00

test_suggestions_router.py

fix: use SystemMessage+HumanMessage for follow-up question generation (#1751 )

2026-04-03 20:09:01 +08:00

test_task_tool_core_logic.py

[Security] Address critical host-shell escape in LocalSandboxProvider (#1547 )

2026-03-29 21:03:58 +08:00

test_thread_data_middleware.py

Fix Windows backend test compatibility (#1384 )

2026-03-26 17:39:16 +08:00

test_threads_router.py

fix(threads): clean up local thread data after thread deletion (#1262 )

2026-03-24 00:36:08 +08:00

test_title_generation.py

refactor: split backend into harness (deerflow.*) and app (app.*) (#1131 )

2026-03-14 22:55:52 +08:00

test_title_middleware_core_logic.py

fix: add sync after_model to TitleMiddleware (#1190 )

2026-03-19 15:46:31 +08:00

test_todo_middleware.py

test: add unit tests for TodoMiddleware (#1307 )

2026-03-26 00:20:50 +08:00

test_token_usage.py

feat(harness): integration ACP agent tool (#1344 )

2026-03-26 14:20:18 +08:00

test_tool_error_handling_middleware.py

refactor: split backend into harness (deerflow.*) and app (app.*) (#1131 )

2026-03-14 22:55:52 +08:00

test_tool_output_truncation.py

feat(sandbox): truncate oversized bash and read_file tool outputs (#1677 )

2026-04-02 09:22:41 +08:00

test_tool_search.py

fix: promote deferred tools after tool_search returns schema (#1570 )

2026-03-30 11:23:15 +08:00

test_tracing_config.py

feat(tracing): add optional Langfuse support (#1717 )

2026-04-02 13:06:10 +08:00

test_tracing_factory.py

feat(tracing): add optional Langfuse support (#1717 )

2026-04-02 13:06:10 +08:00

test_uploads_manager.py

Fix Windows backend test compatibility (#1384 )

2026-03-26 17:39:16 +08:00

test_uploads_middleware_core_logic.py

fix(uploads): handle split-bold headings and ** ** artefacts in extract_outline (#1838 )

2026-04-04 14:25:08 +08:00

test_uploads_router.py

fix(sandbox): Relax upload permissions for aio sandbox sync (#1409 )

2026-03-27 17:37:44 +08:00