Mirror of https://github.com/bytedance/deer-flow.git, synced 2026-05-16 05:33:46 +00:00
* Make loop detection configurable

  Expose LoopDetectionMiddleware thresholds through config.yaml while preserving existing defaults and allowing the middleware to be disabled.

  Refs bytedance/deer-flow#2517

* feat(loop-detection): add per-tool tool_freq_overrides to Phase 1

  Adds ToolFreqOverride model and tool_freq_overrides field to LoopDetectionConfig, wires it through LoopDetectionMiddleware, and documents the option in config.example.yaml.

  Resolves the gap flagged in the #2586 review: without per-tool overrides, users hit by #2510/#2511 (RNA-seq workflows exceeding the bash hard limit) had no way to raise thresholds for one tool without loosening the global limit for every tool.

  Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* Potential fix for pull request finding

  Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

* docs(loop-detection): document tool_freq_overrides in LoopDetectionMiddleware docstring

  Add the missing Args entry for tool_freq_overrides, explaining the (warn, hard_limit) tuple structure and how per-tool thresholds supersede the global tool_freq_warn / tool_freq_hard_limit for named tools.

  Also run ruff format on the three files flagged by the lint check.

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(loop-detection): validate LoopDetectionMiddleware __init__ params eagerly

  Raise clear ValueError at construction time instead of crashing at unpack-time inside _track_and_check when bad values are passed:

  - tool_freq_overrides: must be 2-tuples of positive ints with hard_limit >= warn
  - scalar thresholds: warn_threshold, hard_limit, tool_freq_warn, tool_freq_hard_limit must be >= 1 and hard limits must be >= their warn pairs
  - window_size, max_tracked_threads must be >= 1

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(test): isolate credential loader directory-path test from real ~/.claude

  The test didn't monkeypatch HOME, so on any machine with real Claude Code credentials at ~/.claude/.credentials.json the function fell through to those credentials and the assertion failed. Adding a HOME redirect ensures the default credential path doesn't exist during the test.

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* style(test): add blank lines after import pytest in TestInitValidation

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor(loop-detection): collapse dual validation to LoopDetectionConfig

  Modifications

  - LoopDetectionMiddleware.__init__: stripped of all ValueError raises; becomes a plain field-assignment constructor.
  - LoopDetectionMiddleware.from_config: classmethod that builds the middleware from a Pydantic-validated LoopDetectionConfig and handles the ToolFreqOverride -> tuple[int, int] conversion.
  - agents/factory.py: SDK construction routed through LoopDetectionMiddleware.from_config(LoopDetectionConfig()) so the defaults path is Pydantic-validated too.
  - agents/lead_agent/agent.py: uses from_config instead of unpacking config fields by hand.
  - tests/test_loop_detection_middleware.py: deleted TestInitValidation (16 methods exercising the removed __init__ checks); added TestFromConfig (4 tests: scalar field mapping, override tuple conversion, empty overrides, behavioral smoke test).

  Result: one validation layer (Pydantic), zero duplication, no __new__ hacks. Both production construction sites flow through LoopDetectionConfig.

  Test results

  make test -> 2977 passed, 18 skipped, 0 failed (137s)
  make format -> All checks passed; 411 files left unchanged

* feat(agents): make loop_detection configurable in create_deerflow_agent

  Adds a `loop_detection: bool | AgentMiddleware = True` field to RuntimeFeatures, mirroring the existing pattern used by `sandbox`, `memory`, and `vision`. SDK users can now disable LoopDetectionMiddleware or replace it with a custom instance built from their own LoopDetectionConfig (e.g. `LoopDetectionMiddleware.from_config(my_cfg)`) instead of being stuck with the hardcoded defaults previously installed by the SDK factory.

  The lead-agent path (which already reads AppConfig.loop_detection) is unchanged, and the default `True` preserves prior always-on behavior for all existing callers.

  Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: knight0940 <631532668@qq.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: Amorend <142649913+knight0940@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
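The commit notes say that per-tool thresholds supersede the global tool_freq_warn / tool_freq_hard_limit for named tools, stored as (warn, hard_limit) tuples. A minimal sketch of that lookup might look like the following. It is an illustration only, not the middleware's code: `resolve_tool_freq_thresholds` is a hypothetical helper, the `bash` values are made-up examples, and the global defaults of 30 and 50 are the ones declared in LoopDetectionConfig.

```python
# Hypothetical helper, not part of deer-flow. It only illustrates how a
# per-tool (warn, hard_limit) tuple can take precedence over global limits.
GLOBAL_TOOL_FREQ_WARN = 30        # default of tool_freq_warn
GLOBAL_TOOL_FREQ_HARD_LIMIT = 50  # default of tool_freq_hard_limit


def resolve_tool_freq_thresholds(
    tool_name: str,
    overrides: dict[str, tuple[int, int]],
    global_warn: int = GLOBAL_TOOL_FREQ_WARN,
    global_hard_limit: int = GLOBAL_TOOL_FREQ_HARD_LIMIT,
) -> tuple[int, int]:
    """Return (warn, hard_limit) for a tool, preferring its override."""
    return overrides.get(tool_name, (global_warn, global_hard_limit))


# Example values only: raise the limits for bash, leave other tools alone.
overrides = {"bash": (150, 300)}
bash_limits = resolve_tool_freq_thresholds("bash", overrides)
search_limits = resolve_tool_freq_thresholds("web_search", overrides)
print(bash_limits)    # (150, 300)
print(search_limits)  # (30, 50)
```

Only the named tool gets the looser limits; every other tool still falls back to the global pair, which is the gap the per-tool overrides were added to close.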
74 lines
2.6 KiB
Python
"""Configuration for loop detection middleware."""

from pydantic import BaseModel, Field, model_validator


class ToolFreqOverride(BaseModel):
    """Per-tool frequency threshold override.

    Can be higher or lower than the global defaults. Commonly used to raise
    thresholds for high-frequency tools like bash in batch workflows (e.g.
    RNA-seq pipelines) without weakening protection on every other tool.
    """

    warn: int = Field(ge=1)
    hard_limit: int = Field(ge=1)

    @model_validator(mode="after")
    def _validate(self) -> "ToolFreqOverride":
        if self.hard_limit < self.warn:
            raise ValueError("hard_limit must be >= warn")
        return self


class LoopDetectionConfig(BaseModel):
    """Configuration for repetitive tool-call loop detection."""

    enabled: bool = Field(
        default=True,
        description="Whether to enable repetitive tool-call loop detection",
    )
    warn_threshold: int = Field(
        default=3,
        ge=1,
        description="Number of identical tool-call sets before injecting a warning",
    )
    hard_limit: int = Field(
        default=5,
        ge=1,
        description="Number of identical tool-call sets before forcing a stop",
    )
    window_size: int = Field(
        default=20,
        ge=1,
        description="Number of recent tool-call sets to track per thread",
    )
    max_tracked_threads: int = Field(
        default=100,
        ge=1,
        description="Maximum number of thread histories to keep in memory",
    )
    tool_freq_warn: int = Field(
        default=30,
        ge=1,
        description="Number of calls to the same tool type before injecting a frequency warning",
    )
    tool_freq_hard_limit: int = Field(
        default=50,
        ge=1,
        description="Number of calls to the same tool type before forcing a stop",
    )
    tool_freq_overrides: dict[str, ToolFreqOverride] = Field(
        default_factory=dict,
        description=("Per-tool overrides for tool_freq_warn / tool_freq_hard_limit, keyed by tool name. Values can be higher or lower than the global defaults. Commonly used to raise thresholds for high-frequency tools like bash."),
    )

    @model_validator(mode="after")
    def validate_thresholds(self) -> "LoopDetectionConfig":
        """Ensure hard stop cannot happen before the warning threshold."""
        if self.hard_limit < self.warn_threshold:
            raise ValueError("hard_limit must be greater than or equal to warn_threshold")
        if self.tool_freq_hard_limit < self.tool_freq_warn:
            raise ValueError("tool_freq_hard_limit must be greater than or equal to tool_freq_warn")
        return self
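The validation rules in this file can be exercised with a short usage sketch. To stay self-contained, the snippet repeats a local copy of ToolFreqOverride instead of importing it, since the module's import path is not shown here. It assumes Pydantic v2 (the version that provides `model_validator`), and `is_valid` is a hypothetical helper added only for the demonstration.

```python
from pydantic import BaseModel, Field, ValidationError, model_validator


class ToolFreqOverride(BaseModel):
    """Local copy of the model in this file, repeated so the snippet runs alone."""

    warn: int = Field(ge=1)
    hard_limit: int = Field(ge=1)

    @model_validator(mode="after")
    def _validate(self) -> "ToolFreqOverride":
        if self.hard_limit < self.warn:
            raise ValueError("hard_limit must be >= warn")
        return self


def is_valid(warn: int, hard_limit: int) -> bool:
    """Hypothetical helper: True if the pair passes validation, else False."""
    try:
        ToolFreqOverride(warn=warn, hard_limit=hard_limit)
    except ValidationError:
        return False
    return True


ok = is_valid(warn=150, hard_limit=300)        # hard_limit >= warn
inverted = is_valid(warn=300, hard_limit=150)  # rejected by the model validator
zero = is_valid(warn=0, hard_limit=10)         # rejected by Field(ge=1)
print(ok, inverted, zero)
```

Both failure modes surface as a Pydantic `ValidationError` at construction time, so an inverted or non-positive pair is caught before any middleware is built from the config.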