feat(guardrails): add pre-tool-call authorization middleware with pluggable providers (#1240)

Add GuardrailMiddleware that evaluates every tool call before execution. Three provider options: built-in AllowlistProvider (zero deps), OAP passport providers (open standard), or custom providers loaded by class path.

- GuardrailProvider protocol with GuardrailRequest/Decision dataclasses
- GuardrailMiddleware (AgentMiddleware, position 5 in chain)
- AllowlistProvider for simple deny/allow by tool name
- GuardrailsConfig (Pydantic singleton, loaded from config.yaml)
- 25 tests covering allow/deny, fail-closed/open, async, GraphBubbleUp
- Comprehensive docs at backend/docs/GUARDRAILS.md

Closes #1213

Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
2026-04-25 11:18:22 +00:00 · 2026-03-23 06:07:33 -04:00 · 2026-03-23 06:07:33 -04:00 · a29134d7c9
commit a29134d7c9
parent fe75cb35ca
11 changed files with 1041 additions and 7 deletions
--- a/backend/CLAUDE.md
+++ b/backend/CLAUDE.md
@ -156,13 +156,14 @@ Middlewares execute in strict order in `packages/harness/deerflow/agents/lead_ag
 2. **UploadsMiddleware** - Tracks and injects newly uploaded files into conversation
 3. **SandboxMiddleware** - Acquires sandbox, stores `sandbox_id` in state
 4. **DanglingToolCallMiddleware** - Injects placeholder ToolMessages for AIMessage tool_calls that lack responses (e.g., due to user interruption)
-5. **SummarizationMiddleware** - Context reduction when approaching token limits (optional, if enabled)
+5. **GuardrailMiddleware** - Pre-tool-call authorization via pluggable `GuardrailProvider` protocol (optional, if `guardrails.enabled` in config). Evaluates each tool call and returns error ToolMessage on deny. Three provider options: built-in `AllowlistProvider` (zero deps), OAP policy providers (e.g. `aport-agent-guardrails`), or custom providers. See [docs/GUARDRAILS.md](docs/GUARDRAILS.md) for setup, usage, and how to implement a provider.
-6. **TodoListMiddleware** - Task tracking with `write_todos` tool (optional, if plan_mode)
+6. **SummarizationMiddleware** - Context reduction when approaching token limits (optional, if enabled)
-7. **TitleMiddleware** - Auto-generates thread title after first complete exchange and normalizes structured message content before prompting the title model
+7. **TodoListMiddleware** - Task tracking with `write_todos` tool (optional, if plan_mode)
-8. **MemoryMiddleware** - Queues conversations for async memory update (filters to user + final AI responses)
+8. **TitleMiddleware** - Auto-generates thread title after first complete exchange and normalizes structured message content before prompting the title model
-9. **ViewImageMiddleware** - Injects base64 image data before LLM call (conditional on vision support)
+9. **MemoryMiddleware** - Queues conversations for async memory update (filters to user + final AI responses)
-10. **SubagentLimitMiddleware** - Truncates excess `task` tool calls from model response to enforce `MAX_CONCURRENT_SUBAGENTS` limit (optional, if subagent_enabled)
+10. **ViewImageMiddleware** - Injects base64 image data before LLM call (conditional on vision support)
-11. **ClarificationMiddleware** - Intercepts `ask_clarification` tool calls, interrupts via `Command(goto=END)` (must be last)
+11. **SubagentLimitMiddleware** - Truncates excess `task` tool calls from model response to enforce `MAX_CONCURRENT_SUBAGENTS` limit (optional, if subagent_enabled)
+12. **ClarificationMiddleware** - Intercepts `ask_clarification` tool calls, interrupts via `Command(goto=END)` (must be last)
 ### Configuration System
--- a/backend/docs/GUARDRAILS.md
+++ b/backend/docs/GUARDRAILS.md
@ -0,0 +1,385 @@
 # Guardrails: Pre-Tool-Call Authorization
 > **Context:** [Issue #1213](https://github.com/bytedance/deer-flow/issues/1213) — DeerFlow has Docker sandboxing and human approval via `ask_clarification`, but no deterministic, policy-driven authorization layer for tool calls. An agent running autonomous multi-step tasks can execute any loaded tool with any arguments. Guardrails add a middleware that evaluates every tool call against a policy **before** execution.
 ## Why Guardrails
 ```
 Without guardrails:                      With guardrails:
  Agent                                    Agent
    │                                        │
    ▼                                        ▼
  ┌──────────┐                             ┌──────────┐
  │ bash     │──▶ executes immediately     │ bash     │──▶ GuardrailMiddleware
  │ rm -rf / │                             │ rm -rf / │        │
  └──────────┘                             └──────────┘        ▼
                                                         ┌──────────────┐
                                                         │  Provider    │
                                                         │  evaluates   │
                                                         │  against     │
                                                         │  policy      │
                                                         └──────┬───────┘
                                                                │
                                                          ┌─────┴─────┐
                                                          │           │
                                                        ALLOW       DENY
                                                          │           │
                                                          ▼           ▼
                                                      Tool runs   Agent sees:
                                                      normally    "Guardrail denied:
                                                                   rm -rf blocked"
 ```
 - **Sandboxing** provides process isolation but not semantic authorization. A sandboxed `bash` can still `curl` data out.
 - **Human approval** (`ask_clarification`) requires a human in the loop for every action. Not viable for autonomous workflows.
 - **Guardrails** provide deterministic, policy-driven authorization that works without human intervention.
 ## Architecture
 ```
 ┌─────────────────────────────────────────────────────────────────────┐
 │                        Middleware Chain                               │
 │                                                                      │
 │  1. ThreadDataMiddleware     ─── per-thread dirs                     │
 │  2. UploadsMiddleware        ─── file upload tracking                │
 │  3. SandboxMiddleware        ─── sandbox acquisition                 │
 │  4. DanglingToolCallMiddleware ── fix incomplete tool calls           │
 │  5. GuardrailMiddleware ◄──── EVALUATES EVERY TOOL CALL             │
 │  6. ToolErrorHandlingMiddleware ── convert exceptions to messages     │
 │  7-12. (Summarization, Title, Memory, Vision, Subagent, Clarify)    │
 │                                                                      │
 └─────────────────────────────────────────────────────────────────────┘
                         │
                         ▼
           ┌──────────────────────────┐
           │    GuardrailProvider     │  ◄── pluggable: any class
           │    (configured in YAML)  │      with evaluate/aevaluate
           └────────────┬─────────────┘
                        │
              ┌─────────┼──────────────┐
              │         │              │
              ▼         ▼              ▼
         Built-in   OAP Passport    Custom
         Allowlist  Provider        Provider
         (zero dep) (open standard) (your code)
                        │
                  Any implementation
                  (e.g. APort, or
                   your own evaluator)
 ```
 The `GuardrailMiddleware` implements `wrap_tool_call` / `awrap_tool_call` (the same `AgentMiddleware` pattern used by `ToolErrorHandlingMiddleware`). It:
 1. Builds a `GuardrailRequest` with tool name, arguments, and passport reference
 2. Calls `provider.evaluate(request)` on whatever provider is configured
 3. If **deny**: returns `ToolMessage(status="error")` with the reason -- agent sees the denial and adapts
 4. If **allow**: passes through to the actual tool handler
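 5. If **provider error** and `fail_closed=true` (default): blocks the call with `oap.evaluator_error`. With `fail_closed=false`, the error is logged and the tool call proceeds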
 6. `GraphBubbleUp` exceptions (LangGraph control signals) are always propagated, never caught
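
The six steps above can be sketched without LangChain or DeerFlow imports. This is a minimal illustration, not the shipped middleware: `Reason`, `Decision`, `ControlSignal` (a stand-in for `GraphBubbleUp`), and `guarded_call` are hypothetical names, and the denial is returned as a plain string rather than a `ToolMessage`.

```python
from dataclasses import dataclass, field


@dataclass
class Reason:
    code: str
    message: str = ""


@dataclass
class Decision:
    allow: bool
    reasons: list[Reason] = field(default_factory=list)


class ControlSignal(Exception):
    """Hypothetical stand-in for LangGraph's GraphBubbleUp."""


def guarded_call(evaluate, run_tool, tool_name, tool_input, *, fail_closed=True):
    """Run the tool only if the provider allows it; otherwise return a denial string."""
    try:
        decision = evaluate(tool_name, tool_input)
    except ControlSignal:
        raise  # control-flow signals are never swallowed
    except Exception:
        if not fail_closed:
            return run_tool(tool_name, tool_input)  # fail-open: let the call through
        decision = Decision(False, [Reason("oap.evaluator_error", "guardrail provider error (fail-closed)")])
    if not decision.allow:
        code = decision.reasons[0].code if decision.reasons else "oap.denied"
        return f"Guardrail denied: tool '{tool_name}' was blocked ({code})"
    return run_tool(tool_name, tool_input)
```

The ordering of the `except` clauses matters: the control signal is re-raised before the generic handler can turn it into a fail-closed denial.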
 ## Three Provider Options
 ### Option 1: Built-in AllowlistProvider (Zero Dependencies)
 The simplest option. Ships with DeerFlow. Block or allow tools by name. No external packages, no passport, no network.
 **config.yaml:**
 ```yaml
 guardrails:
  enabled: true
  provider:
    use: deerflow.guardrails.builtin:AllowlistProvider
    config:
      denied_tools: ["bash", "write_file"]
 ```
 This blocks `bash` and `write_file` for all requests. All other tools pass through.
 You can also use an allowlist (only these tools are permitted):
 ```yaml
 guardrails:
  enabled: true
  provider:
    use: deerflow.guardrails.builtin:AllowlistProvider
    config:
      allowed_tools: ["web_search", "read_file", "ls"]
 ```
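
When both lists are set, the order of the checks decides the outcome. The sketch below mirrors the logic of `AllowlistProvider.evaluate` in `builtin.py` from this commit as a standalone function; `is_tool_allowed` is a hypothetical helper name used only for illustration.

```python
def is_tool_allowed(tool_name, allowed_tools=None, denied_tools=None):
    """Return True if the tool passes the allowlist and denylist checks."""
    # An empty or missing allowlist means "no allowlist", not "deny everything".
    allowed = set(allowed_tools) if allowed_tools else None
    denied = set(denied_tools) if denied_tools else set()
    # The allowlist is checked first: anything outside it is rejected.
    if allowed is not None and tool_name not in allowed:
        return False
    # The denylist still applies to tools that passed the allowlist.
    return tool_name not in denied
```

Two consequences are worth noting: a tool named in both lists is denied, and `allowed_tools: []` behaves like an unset allowlist rather than blocking every tool.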
 **Try it:**
 1. Add the config above to your `config.yaml`
 2. Start DeerFlow: `make dev`
 3. Ask the agent: "Use bash to run echo hello"
 4. The agent sees: `Guardrail denied: tool 'bash' was blocked (oap.tool_not_allowed). Reason: tool 'bash' is denied. Choose an alternative approach.`
 ### Option 2: OAP Passport Provider (Policy-Based)
 For policy enforcement based on the [Open Agent Passport (OAP)](https://github.com/aporthq/aport-spec) open standard. An OAP passport is a JSON document that declares an agent's identity, capabilities, and operational limits. Any provider that reads an OAP passport and returns OAP-compliant decisions works with DeerFlow.
 ```
 ┌─────────────────────────────────────────────────────────────┐
 │                    OAP Passport (JSON)                        │
 │                   (open standard, any provider)              │
 │  {                                                           │
 │    "spec_version": "oap/1.0",                                │
 │    "status": "active",                                       │
 │    "capabilities": [                                         │
 │      {"id": "system.command.execute"},                       │
 │      {"id": "data.file.read"},                               │
 │      {"id": "data.file.write"},                              │
 │      {"id": "web.fetch"},                                    │
 │      {"id": "mcp.tool.execute"}                              │
 │    ],                                                        │
 │    "limits": {                                               │
 │      "system.command.execute": {                             │
 │        "allowed_commands": ["git", "npm", "node", "ls"],     │
 │        "blocked_patterns": ["rm -rf", "sudo", "chmod 777"]   │
 │      }                                                       │
 │    }                                                         │
 │  }                                                           │
 └──────────────────────────┬──────────────────────────────────┘
                           │
               Any OAP-compliant provider
          ┌────────────────┼────────────────┐
          │                │                │
     Your own         APort (ref.      Other future
     evaluator        implementation)  implementations
 ```
 **Creating a passport manually:**
 An OAP passport is just a JSON file. You can create one by hand following the [OAP specification](https://github.com/aporthq/aport-spec/blob/main/oap/oap-spec.md) and validate it against the [JSON schema](https://github.com/aporthq/aport-spec/blob/main/oap/passport-schema.json). See the [examples](https://github.com/aporthq/aport-spec/tree/main/oap/examples) directory for templates.
 **Using APort as a reference implementation:**
 [APort Agent Guardrails](https://github.com/aporthq/aport-agent-guardrails) is one open-source (Apache 2.0) implementation of an OAP provider. It handles passport creation, local evaluation, and optional hosted API evaluation.
 ```bash
 pip install aport-agent-guardrails
 aport setup --framework deerflow
 ```
 This creates:
 - `~/.aport/deerflow/config.yaml` -- evaluator config (local or API mode)
 - `~/.aport/deerflow/aport/passport.json` -- OAP passport with capabilities and limits
 **config.yaml (using APort as the provider):**
 ```yaml
 guardrails:
  enabled: true
  provider:
    use: aport_guardrails.providers.generic:OAPGuardrailProvider
 ```
 **config.yaml (using your own OAP provider):**
 ```yaml
 guardrails:
  enabled: true
  provider:
    use: my_oap_provider:MyOAPProvider
    config:
      passport_path: ./my-passport.json
 ```
 Any provider that implements `evaluate`/`aevaluate` works. DeerFlow passes `framework="deerflow"` to the constructor only when the constructor accepts it. The OAP standard defines the passport format and decision codes; DeerFlow doesn't care which provider reads them.
 **What the passport controls:**
 | Passport field | What it does | Example |
 |---|---|---|
 | `capabilities[].id` | Which tool categories the agent can use | `system.command.execute`, `data.file.write` |
 | `limits.*.allowed_commands` | Which commands are allowed | `["git", "npm", "node"]` or `["*"]` for all |
 | `limits.*.blocked_patterns` | Patterns always denied | `["rm -rf", "sudo", "chmod 777"]` |
 | `status` | Kill switch | `active`, `suspended`, `revoked` |
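
To make the table concrete, here is one way a provider could apply these fields to a shell command. This is a rough sketch under stated assumptions, not the algorithm defined by the OAP specification or used by APort: `evaluate_command` is a hypothetical function, blocked patterns are matched as plain substrings, and only the first word of the command is compared against `allowed_commands`.

```python
PASSPORT = {
    "status": "active",
    "limits": {
        "system.command.execute": {
            "allowed_commands": ["git", "npm", "node", "ls"],
            "blocked_patterns": ["rm -rf", "sudo", "chmod 777"],
        }
    },
}


def evaluate_command(passport: dict, command: str) -> tuple[bool, str]:
    """Return (allow, reason_code) for a shell command (illustrative only)."""
    # Kill switch: anything other than "active" denies every call.
    if passport.get("status") != "active":
        return False, "oap.passport_suspended"
    limits = passport.get("limits", {}).get("system.command.execute", {})
    # Blocked patterns win over the command allowlist (assumed precedence).
    for pattern in limits.get("blocked_patterns", []):
        if pattern in command:
            return False, "oap.blocked_pattern"
    allowed = limits.get("allowed_commands", [])
    if "*" in allowed:
        return True, "oap.allowed"
    words = command.split()
    program = words[0] if words else ""
    if program in allowed:
        return True, "oap.allowed"
    return False, "oap.command_not_allowed"
```

Substring matching is deliberately naive here; a real evaluator would need to handle quoting, pipes, and command chaining.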
 **Evaluation modes (provider-dependent):**
 OAP providers may support different evaluation modes. For example, the APort reference implementation supports:
 | Mode | How it works | Network | Latency |
 |---|---|---|---|
 | **Local** | Evaluates passport locally (bash script). | None | ~300ms |
 | **API** | Sends passport + context to a hosted evaluator. Signed decisions. | Yes | ~65ms |
 A custom OAP provider can implement any evaluation strategy -- the DeerFlow middleware doesn't care how the provider reaches its decision.
 **Try it:**
 1. Install and set up as above
 2. Start DeerFlow and ask: "Create a file called test.txt with content hello"
 3. Then ask: "Now delete it using bash rm -rf"
 4. Guardrail blocks it: `oap.blocked_pattern: Command contains blocked pattern: rm -rf`
 ### Option 3: Custom Provider (Bring Your Own)
 Any Python class with `evaluate(request)` and `aevaluate(request)` methods works. No base class or inheritance needed -- it's a structural protocol.
 ```python
 # my_guardrail.py
 class MyGuardrailProvider:
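    name = "my-company"
    def __init__(self, **kwargs):
        # DeerFlow may pass framework="deerflow" and any `config:` values as keyword arguments
        pass
    def evaluate(self, request):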
        from deerflow.guardrails.provider import GuardrailDecision, GuardrailReason
        # Example: block any bash command containing "delete"
        if request.tool_name == "bash" and "delete" in str(request.tool_input):
            return GuardrailDecision(
                allow=False,
                reasons=[GuardrailReason(code="custom.blocked", message="delete not allowed")],
                policy_id="custom.v1",
            )
        return GuardrailDecision(allow=True, reasons=[GuardrailReason(code="oap.allowed")])
    async def aevaluate(self, request):
        return self.evaluate(request)
 ```
 **config.yaml:**
 ```yaml
 guardrails:
  enabled: true
  provider:
    use: my_guardrail:MyGuardrailProvider
 ```
 Make sure `my_guardrail.py` is on the Python path (e.g. in the backend directory or installed as a package).
 **Try it:**
 1. Create `my_guardrail.py` in the backend directory
 2. Add the config
 3. Start DeerFlow and ask: "Use bash to delete test.txt"
 4. Your provider blocks it
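
The example provider's `aevaluate` calls `evaluate` directly, which is fine for in-memory checks. If `evaluate` does blocking work (a file read, an HTTP call), one option is to offload it to a worker thread so the event loop stays responsive. The sketch below assumes that situation; `Request`, `Verdict`, and `SlowPolicyProvider` are hypothetical stand-ins, not DeerFlow classes.

```python
import asyncio
import time
from dataclasses import dataclass


@dataclass
class Request:  # hypothetical stand-in for GuardrailRequest
    tool_name: str


@dataclass
class Verdict:  # hypothetical stand-in for GuardrailDecision
    allow: bool


class SlowPolicyProvider:
    name = "slow-policy"

    def evaluate(self, request):
        time.sleep(0.01)  # stands in for blocking I/O such as a policy lookup
        return Verdict(allow=request.tool_name != "bash")

    async def aevaluate(self, request):
        # Run the blocking evaluate() in a worker thread instead of on the event loop.
        return await asyncio.to_thread(self.evaluate, request)
```

A provider whose policy backend has a native async client could `await` that client in `aevaluate` instead.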
 ## Implementing a Provider
 ### Required Interface
 ```
 ┌──────────────────────────────────────────────────┐
 │              GuardrailProvider Protocol            │
 │                                                   │
 │  name: str                                        │
 │                                                   │
 │  evaluate(request: GuardrailRequest)              │
 │      -> GuardrailDecision                         │
 │                                                   │
 │  aevaluate(request: GuardrailRequest)   (async)   │
 │      -> GuardrailDecision                         │
 └──────────────────────────────────────────────────┘
 ┌──────────────────────────┐    ┌──────────────────────────┐
 │     GuardrailRequest      │    │    GuardrailDecision      │
 │                           │    │                           │
 │  tool_name: str           │    │  allow: bool              │
 │  tool_input: dict         │    │  reasons: [GuardrailReason]│
 │  agent_id: str | None     │    │  policy_id: str | None    │
 │  thread_id: str | None    │    │  metadata: dict           │
 │  is_subagent: bool        │    │                           │
 │  timestamp: str           │    │  GuardrailReason:         │
 │                           │    │    code: str              │
 └──────────────────────────┘    │    message: str           │
                                └──────────────────────────┘
 ```
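
Because the protocol is structural and `runtime_checkable`, an `isinstance` check only verifies that the required members exist. The sketch below uses a hypothetical `ProviderLike` protocol with the same three members to show what such a check does and does not catch.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class ProviderLike(Protocol):
    """Hypothetical mirror of the GuardrailProvider shape."""

    name: str

    def evaluate(self, request): ...

    async def aevaluate(self, request): ...


class Complete:
    name = "complete"

    def evaluate(self, request):
        return True

    async def aevaluate(self, request):
        return True


class MissingAsync:
    name = "sync-only"

    def evaluate(self, request):
        return True


# isinstance() checks that the members exist, not their signatures or return types.
complete_ok = isinstance(Complete(), ProviderLike)
missing_ok = isinstance(MissingAsync(), ProviderLike)
```

Here `complete_ok` is `True` and `missing_ok` is `False`. A class whose `evaluate` returns the wrong type would still pass the check, so return values need their own tests.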
 ### DeerFlow Tool Names
 These are the tool names your provider will see in `request.tool_name`:
 | Tool | What it does |
 |---|---|
 | `bash` | Shell command execution |
 | `write_file` | Create/overwrite a file |
 | `str_replace` | Edit a file (find and replace) |
 | `read_file` | Read file content |
 | `ls` | List directory |
 | `web_search` | Web search query |
 | `web_fetch` | Fetch URL content |
 | `image_search` | Image search |
 | `present_file` | Present file to user |
 | `view_image` | Display image |
 | `ask_clarification` | Ask user a question |
 | `task` | Delegate to subagent |
 | `mcp__*` | MCP tools (dynamic) |
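
MCP tool names are dynamic, so a provider that wants to cover all of them needs pattern matching. The built-in `AllowlistProvider` in this commit compares exact names, so `mcp__*` in `denied_tools` would only match a tool literally named `mcp__*`. A custom provider could match glob patterns with the standard library, as in this sketch; `matches_any` is a hypothetical helper and the MCP tool name in the usage note is made up for illustration.

```python
from fnmatch import fnmatchcase


def matches_any(tool_name: str, patterns: list[str]) -> bool:
    """Return True if tool_name matches at least one shell-style pattern."""
    # fnmatchcase is case-sensitive on every platform, unlike fnmatch.fnmatch.
    return any(fnmatchcase(tool_name, pattern) for pattern in patterns)
```

For example, `matches_any("mcp__github__create_issue", ["mcp__*"])` is true, while `matches_any("bash", ["mcp__*", "web_*"])` is false.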
 ### OAP Reason Codes
 Standard codes used by the [OAP specification](https://github.com/aporthq/aport-spec):
 | Code | Meaning |
 |---|---|
 | `oap.allowed` | Tool call authorized |
 | `oap.tool_not_allowed` | Tool not in allowlist |
 | `oap.command_not_allowed` | Command not in allowed_commands |
 | `oap.blocked_pattern` | Command matches a blocked pattern |
 | `oap.limit_exceeded` | Operation exceeds a limit |
 | `oap.passport_suspended` | Passport status is suspended/revoked |
 | `oap.evaluator_error` | Provider crashed (fail-closed) |
 ### Provider Loading
 DeerFlow loads providers via `resolve_variable()` -- the same mechanism used for models, tools, and sandbox providers. The `use:` field is a Python class path: `package.module:ClassName`.
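 The provider is instantiated with the `config:` values as keyword arguments. `framework="deerflow"` is added only when `config:` does not already set `framework` and the constructor either declares a `framework` parameter or accepts `**kwargs`. Accept `**kwargs` to stay forward-compatible: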
 ```python
 class YourProvider:
    def __init__(self, framework: str = "generic", **kwargs):
        # framework="deerflow" tells you which config dir to use
        ...
 ```
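
The loader in `tool_error_handling_middleware.py` decides whether to pass `framework` by inspecting the constructor signature. The sketch below restates that logic as a standalone function; `build_provider_kwargs`, `Strict`, and `Flexible` are hypothetical names used only for illustration.

```python
import inspect


def build_provider_kwargs(provider_cls, config=None, framework="deerflow"):
    """Return constructor kwargs, adding `framework` only if the class can accept it."""
    kwargs = dict(config or {})
    if "framework" in kwargs:
        return kwargs  # an explicit value from `config:` is left alone
    try:
        params = inspect.signature(provider_cls.__init__).parameters
    except (ValueError, TypeError):
        return kwargs  # signature not introspectable: pass config only
    accepts_var_kw = any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())
    if "framework" in params or accepts_var_kw:
        kwargs["framework"] = framework
    return kwargs


class Strict:
    def __init__(self, *, denied_tools=None):
        self.denied_tools = denied_tools


class Flexible:
    def __init__(self, framework="generic", **kwargs):
        self.framework = framework
```

With these classes, `Strict` receives only its configured `denied_tools`, while `Flexible` also receives `framework="deerflow"`.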
 ## Configuration Reference
 ```yaml
 guardrails:
  # Enable/disable guardrail middleware (default: false)
  enabled: true
  # Block tool calls if provider raises an exception (default: true)
  fail_closed: true
  # Passport reference -- passed as request.agent_id to the provider.
  # File path, hosted agent ID, or null (provider resolves from its config).
  passport: null
  # Provider: loaded by class path via resolve_variable
  provider:
    use: deerflow.guardrails.builtin:AllowlistProvider
    config:  # optional kwargs passed to provider.__init__
      denied_tools: ["bash"]
 ```
 ## Testing
 ```bash
 cd backend
 uv run python -m pytest tests/test_guardrail_middleware.py -v
 ```
 25 tests covering:
 - AllowlistProvider: allow, deny, both allowlist+denylist, async
 - GuardrailMiddleware: allow passthrough, deny with OAP codes, fail-closed, fail-open, passport forwarding, empty reasons fallback, empty tool name, protocol isinstance check
 - Async paths: awrap_tool_call for allow, deny, fail-closed, fail-open
 - GraphBubbleUp: LangGraph control signals propagate through (not caught)
 - Config: defaults, from_dict, singleton load/reset
 ## Files
 ```
 packages/harness/deerflow/guardrails/
    __init__.py              # Public exports
    provider.py              # GuardrailProvider protocol, GuardrailRequest, GuardrailDecision
    middleware.py             # GuardrailMiddleware (AgentMiddleware subclass)
    builtin.py               # AllowlistProvider (zero deps)
 packages/harness/deerflow/config/
    guardrails_config.py     # GuardrailsConfig Pydantic model + singleton
 packages/harness/deerflow/agents/middlewares/
    tool_error_handling_middleware.py  # Registers GuardrailMiddleware in chain
 config.example.yaml          # Three provider options documented
 tests/test_guardrail_middleware.py  # 25 tests
 docs/GUARDRAILS.md           # This file
 ```
--- a/backend/packages/harness/deerflow/agents/middlewares/tool_error_handling_middleware.py
+++ b/backend/packages/harness/deerflow/agents/middlewares/tool_error_handling_middleware.py
@ -90,6 +90,31 @@ def _build_runtime_middlewares(
        middlewares.append(DanglingToolCallMiddleware())
    # Guardrail middleware (if configured)
    from deerflow.config.guardrails_config import get_guardrails_config
    guardrails_config = get_guardrails_config()
    if guardrails_config.enabled and guardrails_config.provider:
        import inspect
        from deerflow.guardrails.middleware import GuardrailMiddleware
        from deerflow.reflection import resolve_variable
        provider_cls = resolve_variable(guardrails_config.provider.use)
        provider_kwargs = dict(guardrails_config.provider.config) if guardrails_config.provider.config else {}
        # Pass framework hint if the provider accepts it (e.g. for config discovery).
        # Built-in providers like AllowlistProvider don't need it, so only inject
        # when the constructor accepts 'framework' or '**kwargs'.
        if "framework" not in provider_kwargs:
            try:
                sig = inspect.signature(provider_cls.__init__)
                if "framework" in sig.parameters or any(p.kind == inspect.Parameter.VAR_KEYWORD for p in sig.parameters.values()):
                    provider_kwargs["framework"] = "deerflow"
            except (ValueError, TypeError):
                pass
        provider = provider_cls(**provider_kwargs)
        middlewares.append(GuardrailMiddleware(provider, fail_closed=guardrails_config.fail_closed, passport=guardrails_config.passport))
    middlewares.append(ToolErrorHandlingMiddleware())
    return middlewares
--- a/backend/packages/harness/deerflow/config/app_config.py
+++ b/backend/packages/harness/deerflow/config/app_config.py
@ -9,6 +9,7 @@ from pydantic import BaseModel, ConfigDict, Field
 from deerflow.config.checkpointer_config import CheckpointerConfig, load_checkpointer_config_from_dict
 from deerflow.config.extensions_config import ExtensionsConfig
 from deerflow.config.guardrails_config import load_guardrails_config_from_dict
 from deerflow.config.memory_config import load_memory_config_from_dict
 from deerflow.config.model_config import ModelConfig
 from deerflow.config.sandbox_config import SandboxConfig
@ -107,6 +108,10 @@ class AppConfig(BaseModel):
        if "tool_search" in config_data:
            load_tool_search_config_from_dict(config_data["tool_search"])
        # Load guardrails config if present
        if "guardrails" in config_data:
            load_guardrails_config_from_dict(config_data["guardrails"])
        # Load checkpointer config if present
        if "checkpointer" in config_data:
            load_checkpointer_config_from_dict(config_data["checkpointer"])
--- a/backend/packages/harness/deerflow/config/guardrails_config.py
+++ b/backend/packages/harness/deerflow/config/guardrails_config.py
@ -0,0 +1,48 @@
 """Configuration for pre-tool-call authorization."""
 from pydantic import BaseModel, Field
 class GuardrailProviderConfig(BaseModel):
    """Configuration for a guardrail provider."""
    use: str = Field(description="Class path (e.g. 'deerflow.guardrails.builtin:AllowlistProvider')")
    config: dict = Field(default_factory=dict, description="Provider-specific settings passed as kwargs")
 class GuardrailsConfig(BaseModel):
    """Configuration for pre-tool-call authorization.
    When enabled, every tool call passes through the configured provider
    before execution. The provider receives tool name, arguments, and the
    agent's passport reference, and returns an allow/deny decision.
    """
    enabled: bool = Field(default=False, description="Enable guardrail middleware")
    fail_closed: bool = Field(default=True, description="Block tool calls if provider errors")
    passport: str | None = Field(default=None, description="OAP passport path or hosted agent ID")
    provider: GuardrailProviderConfig | None = Field(default=None, description="Guardrail provider configuration")
 _guardrails_config: GuardrailsConfig | None = None
 def get_guardrails_config() -> GuardrailsConfig:
    """Get the guardrails config, returning defaults if not loaded."""
    global _guardrails_config
    if _guardrails_config is None:
        _guardrails_config = GuardrailsConfig()
    return _guardrails_config
 def load_guardrails_config_from_dict(data: dict) -> GuardrailsConfig:
    """Load guardrails config from a dict (called during AppConfig loading)."""
    global _guardrails_config
    _guardrails_config = GuardrailsConfig.model_validate(data)
    return _guardrails_config
 def reset_guardrails_config() -> None:
    """Reset the cached config instance. Used in tests to prevent singleton leaks."""
    global _guardrails_config
    _guardrails_config = None
--- a/backend/packages/harness/deerflow/guardrails/__init__.py
+++ b/backend/packages/harness/deerflow/guardrails/__init__.py
@ -0,0 +1,14 @@
 """Pre-tool-call authorization middleware."""
 from deerflow.guardrails.builtin import AllowlistProvider
 from deerflow.guardrails.middleware import GuardrailMiddleware
 from deerflow.guardrails.provider import GuardrailDecision, GuardrailProvider, GuardrailReason, GuardrailRequest
 __all__ = [
    "AllowlistProvider",
    "GuardrailDecision",
    "GuardrailMiddleware",
    "GuardrailProvider",
    "GuardrailReason",
    "GuardrailRequest",
 ]
--- a/backend/packages/harness/deerflow/guardrails/builtin.py
+++ b/backend/packages/harness/deerflow/guardrails/builtin.py
@ -0,0 +1,23 @@
 """Built-in guardrail providers that ship with DeerFlow."""
 from deerflow.guardrails.provider import GuardrailDecision, GuardrailReason, GuardrailRequest
 class AllowlistProvider:
    """Simple allowlist/denylist provider. No external dependencies."""
    name = "allowlist"
    def __init__(self, *, allowed_tools: list[str] | None = None, denied_tools: list[str] | None = None):
        self._allowed = set(allowed_tools) if allowed_tools else None
        self._denied = set(denied_tools) if denied_tools else set()
    def evaluate(self, request: GuardrailRequest) -> GuardrailDecision:
        if self._allowed is not None and request.tool_name not in self._allowed:
            return GuardrailDecision(allow=False, reasons=[GuardrailReason(code="oap.tool_not_allowed", message=f"tool '{request.tool_name}' not in allowlist")])
        if request.tool_name in self._denied:
            return GuardrailDecision(allow=False, reasons=[GuardrailReason(code="oap.tool_not_allowed", message=f"tool '{request.tool_name}' is denied")])
        return GuardrailDecision(allow=True, reasons=[GuardrailReason(code="oap.allowed")])
    async def aevaluate(self, request: GuardrailRequest) -> GuardrailDecision:
        return self.evaluate(request)
--- a/backend/packages/harness/deerflow/guardrails/middleware.py
+++ b/backend/packages/harness/deerflow/guardrails/middleware.py
@ -0,0 +1,98 @@
 """GuardrailMiddleware - evaluates tool calls against a GuardrailProvider before execution."""
 import logging
 from collections.abc import Awaitable, Callable
 from datetime import UTC, datetime
 from typing import override
 from langchain.agents import AgentState
 from langchain.agents.middleware import AgentMiddleware
 from langchain_core.messages import ToolMessage
 from langgraph.errors import GraphBubbleUp
 from langgraph.prebuilt.tool_node import ToolCallRequest
 from langgraph.types import Command
 from deerflow.guardrails.provider import GuardrailDecision, GuardrailProvider, GuardrailReason, GuardrailRequest
 logger = logging.getLogger(__name__)
 class GuardrailMiddleware(AgentMiddleware[AgentState]):
    """Evaluate tool calls against a GuardrailProvider before execution.
    Denied calls return an error ToolMessage so the agent can adapt.
    If the provider raises, behavior depends on fail_closed:
      - True (default): block the call
      - False: allow it through with a warning
    """
    def __init__(self, provider: GuardrailProvider, *, fail_closed: bool = True, passport: str | None = None):
        self.provider = provider
        self.fail_closed = fail_closed
        self.passport = passport
    def _build_request(self, request: ToolCallRequest) -> GuardrailRequest:
        return GuardrailRequest(
            tool_name=str(request.tool_call.get("name", "")),
            tool_input=request.tool_call.get("args", {}),
            agent_id=self.passport,
            timestamp=datetime.now(UTC).isoformat(),
        )
    def _build_denied_message(self, request: ToolCallRequest, decision: GuardrailDecision) -> ToolMessage:
        tool_name = str(request.tool_call.get("name", "unknown_tool"))
        tool_call_id = str(request.tool_call.get("id", "missing_id"))
        reason_text = decision.reasons[0].message if decision.reasons else "blocked by guardrail policy"
        reason_code = decision.reasons[0].code if decision.reasons else "oap.denied"
        return ToolMessage(
            content=f"Guardrail denied: tool '{tool_name}' was blocked ({reason_code}). Reason: {reason_text}. Choose an alternative approach.",
            tool_call_id=tool_call_id,
            name=tool_name,
            status="error",
        )
    @override
    def wrap_tool_call(
        self,
        request: ToolCallRequest,
        handler: Callable[[ToolCallRequest], ToolMessage | Command],
    ) -> ToolMessage | Command:
        gr = self._build_request(request)
        try:
            decision = self.provider.evaluate(gr)
        except GraphBubbleUp:
            # Preserve LangGraph control-flow signals (interrupt/pause/resume).
            raise
        except Exception:
            logger.exception("Guardrail provider error (sync)")
            if self.fail_closed:
                decision = GuardrailDecision(allow=False, reasons=[GuardrailReason(code="oap.evaluator_error", message="guardrail provider error (fail-closed)")])
            else:
                return handler(request)
        if not decision.allow:
            logger.warning("Guardrail denied: tool=%s policy=%s code=%s", gr.tool_name, decision.policy_id, decision.reasons[0].code if decision.reasons else "unknown")
            return self._build_denied_message(request, decision)
        return handler(request)
    @override
    async def awrap_tool_call(
        self,
        request: ToolCallRequest,
        handler: Callable[[ToolCallRequest], Awaitable[ToolMessage | Command]],
    ) -> ToolMessage | Command:
        gr = self._build_request(request)
        try:
            decision = await self.provider.aevaluate(gr)
        except GraphBubbleUp:
            # Preserve LangGraph control-flow signals (interrupt/pause/resume).
            raise
        except Exception:
            logger.exception("Guardrail provider error (async)")
            if self.fail_closed:
                decision = GuardrailDecision(allow=False, reasons=[GuardrailReason(code="oap.evaluator_error", message="guardrail provider error (fail-closed)")])
            else:
                return await handler(request)
        if not decision.allow:
            logger.warning("Guardrail denied: tool=%s policy=%s code=%s", gr.tool_name, decision.policy_id, decision.reasons[0].code if decision.reasons else "unknown")
            return self._build_denied_message(request, decision)
        return await handler(request)
--- a/backend/packages/harness/deerflow/guardrails/provider.py
+++ b/backend/packages/harness/deerflow/guardrails/provider.py
@ -0,0 +1,56 @@
 """GuardrailProvider protocol and data structures for pre-tool-call authorization."""
 from __future__ import annotations
 from dataclasses import dataclass, field
 from typing import Any, Protocol, runtime_checkable
 @dataclass
 class GuardrailRequest:
    """Context passed to the provider for each tool call."""
    tool_name: str
    tool_input: dict[str, Any]
    agent_id: str | None = None
    thread_id: str | None = None
    is_subagent: bool = False
    timestamp: str = ""
 @dataclass
 class GuardrailReason:
    """Structured reason for an allow/deny decision (OAP reason object)."""
    code: str
    message: str = ""
 @dataclass
 class GuardrailDecision:
    """Provider's allow/deny verdict (aligned with OAP Decision object)."""
    allow: bool
    reasons: list[GuardrailReason] = field(default_factory=list)
    policy_id: str | None = None
    metadata: dict[str, Any] = field(default_factory=dict)
 @runtime_checkable
 class GuardrailProvider(Protocol):
    """Contract for pluggable tool-call authorization.
    Any class with these methods works - no base class required.
    Providers are loaded by class path via resolve_variable(),
    the same mechanism DeerFlow uses for models, tools, and sandbox.
    """
    name: str
    def evaluate(self, request: GuardrailRequest) -> GuardrailDecision:
        """Evaluate whether a tool call should proceed."""
        ...
    async def aevaluate(self, request: GuardrailRequest) -> GuardrailDecision:
        """Async variant."""
        ...
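For illustration, a custom provider only needs a `name` attribute plus `evaluate` and `aevaluate` methods. The sketch below is hypothetical and not part of this commit: `BlockedSubstringProvider`, its `blocked` parameter, and the `sketch.blocked_substring.v1` policy id are invented for the example, and the dataclasses are trimmed local stand-ins for the ones defined in `provider.py` so that the block runs on its own. The `oap.denied` and `oap.allowed` codes are borrowed from the test helpers in this commit.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


# Local stand-ins mirroring the dataclasses in provider.py (trimmed).
# In the real package these would be imported from deerflow.guardrails.provider.
@dataclass
class GuardrailRequest:
    tool_name: str
    tool_input: dict[str, Any]
    agent_id: Optional[str] = None


@dataclass
class GuardrailReason:
    code: str
    message: str = ""


@dataclass
class GuardrailDecision:
    allow: bool
    reasons: list[GuardrailReason] = field(default_factory=list)
    policy_id: Optional[str] = None


class BlockedSubstringProvider:
    """Hypothetical provider: deny a call if any string argument contains a blocked substring."""

    name = "blocked-substring"

    def __init__(self, blocked: Optional[list[str]] = None) -> None:
        self.blocked = list(blocked or [])

    def evaluate(self, request: GuardrailRequest) -> GuardrailDecision:
        for value in request.tool_input.values():
            if isinstance(value, str) and any(b in value for b in self.blocked):
                return GuardrailDecision(
                    allow=False,
                    reasons=[GuardrailReason(code="oap.denied", message=f"blocked argument for {request.tool_name}")],
                    policy_id="sketch.blocked_substring.v1",
                )
        return GuardrailDecision(allow=True, reasons=[GuardrailReason(code="oap.allowed")])

    async def aevaluate(self, request: GuardrailRequest) -> GuardrailDecision:
        # No I/O is involved, so the async variant reuses the sync logic.
        return self.evaluate(request)
```

Because `GuardrailProvider` is a `runtime_checkable` protocol, a class shaped like this should need no base class; it would be referenced from the config by its class path.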
--- a/backend/tests/test_guardrail_middleware.py
+++ b/backend/tests/test_guardrail_middleware.py
@ -0,0 +1,344 @@
 """Tests for the guardrail middleware and built-in providers."""
 from __future__ import annotations
 import asyncio
 from unittest.mock import MagicMock
 import pytest
 from langgraph.errors import GraphBubbleUp
 from deerflow.guardrails.builtin import AllowlistProvider
 from deerflow.guardrails.middleware import GuardrailMiddleware
 from deerflow.guardrails.provider import GuardrailDecision, GuardrailReason, GuardrailRequest
 # --- Helpers ---
 def _make_tool_call_request(name: str = "bash", args: dict | None = None, call_id: str = "call_1"):
    """Create a mock ToolCallRequest."""
    req = MagicMock()
    req.tool_call = {"name": name, "args": args or {}, "id": call_id}
    return req
 class _AllowAllProvider:
    name = "allow-all"
    def evaluate(self, request: GuardrailRequest) -> GuardrailDecision:
        return GuardrailDecision(allow=True, reasons=[GuardrailReason(code="oap.allowed")])
    async def aevaluate(self, request: GuardrailRequest) -> GuardrailDecision:
        return self.evaluate(request)
 class _DenyAllProvider:
    name = "deny-all"
    def evaluate(self, request: GuardrailRequest) -> GuardrailDecision:
        return GuardrailDecision(
            allow=False,
            reasons=[GuardrailReason(code="oap.denied", message="all tools blocked")],
            policy_id="test.deny.v1",
        )
    async def aevaluate(self, request: GuardrailRequest) -> GuardrailDecision:
        return self.evaluate(request)
 class _ExplodingProvider:
    name = "exploding"
    def evaluate(self, request: GuardrailRequest) -> GuardrailDecision:
        raise RuntimeError("provider crashed")
    async def aevaluate(self, request: GuardrailRequest) -> GuardrailDecision:
        raise RuntimeError("provider crashed")
 # --- AllowlistProvider tests ---
 class TestAllowlistProvider:
    def test_no_restrictions_allows_all(self):
        provider = AllowlistProvider()
        req = GuardrailRequest(tool_name="bash", tool_input={})
        decision = provider.evaluate(req)
        assert decision.allow is True
    def test_denied_tools(self):
        provider = AllowlistProvider(denied_tools=["bash", "write_file"])
        req = GuardrailRequest(tool_name="bash", tool_input={})
        decision = provider.evaluate(req)
        assert decision.allow is False
        assert decision.reasons[0].code == "oap.tool_not_allowed"
    def test_denied_tools_allows_unlisted(self):
        provider = AllowlistProvider(denied_tools=["bash"])
        req = GuardrailRequest(tool_name="web_search", tool_input={})
        decision = provider.evaluate(req)
        assert decision.allow is True
    def test_allowed_tools_blocks_unlisted(self):
        provider = AllowlistProvider(allowed_tools=["web_search", "read_file"])
        req = GuardrailRequest(tool_name="bash", tool_input={})
        decision = provider.evaluate(req)
        assert decision.allow is False
    def test_allowed_tools_allows_listed(self):
        provider = AllowlistProvider(allowed_tools=["web_search"])
        req = GuardrailRequest(tool_name="web_search", tool_input={})
        decision = provider.evaluate(req)
        assert decision.allow is True
    def test_both_allowed_and_denied(self):
        provider = AllowlistProvider(allowed_tools=["bash", "web_search"], denied_tools=["bash"])
        # bash is in both: allowlist passes, denylist blocks
        req = GuardrailRequest(tool_name="bash", tool_input={})
        decision = provider.evaluate(req)
        assert decision.allow is False
    def test_async_delegates_to_sync(self):
        provider = AllowlistProvider(denied_tools=["bash"])
        req = GuardrailRequest(tool_name="bash", tool_input={})
        decision = asyncio.run(provider.aevaluate(req))
        assert decision.allow is False
 # --- GuardrailMiddleware tests ---
 class TestGuardrailMiddleware:
    def test_allowed_tool_passes_through(self):
        mw = GuardrailMiddleware(_AllowAllProvider())
        req = _make_tool_call_request("web_search")
        expected = MagicMock()
        handler = MagicMock(return_value=expected)
        result = mw.wrap_tool_call(req, handler)
        handler.assert_called_once_with(req)
        assert result is expected
    def test_denied_tool_returns_error_message(self):
        mw = GuardrailMiddleware(_DenyAllProvider())
        req = _make_tool_call_request("bash")
        handler = MagicMock()
        result = mw.wrap_tool_call(req, handler)
        handler.assert_not_called()
        assert result.status == "error"
        assert "oap.denied" in result.content
        assert result.name == "bash"
    def test_fail_closed_on_provider_error(self):
        mw = GuardrailMiddleware(_ExplodingProvider(), fail_closed=True)
        req = _make_tool_call_request("bash")
        handler = MagicMock()
        result = mw.wrap_tool_call(req, handler)
        handler.assert_not_called()
        assert result.status == "error"
        assert "oap.evaluator_error" in result.content
    def test_fail_open_on_provider_error(self):
        mw = GuardrailMiddleware(_ExplodingProvider(), fail_closed=False)
        req = _make_tool_call_request("bash")
        expected = MagicMock()
        handler = MagicMock(return_value=expected)
        result = mw.wrap_tool_call(req, handler)
        handler.assert_called_once_with(req)
        assert result is expected
    def test_passport_passed_as_agent_id(self):
        captured = {}
        class CapturingProvider:
            name = "capture"
            def evaluate(self, request):
                captured["agent_id"] = request.agent_id
                return GuardrailDecision(allow=True)
            async def aevaluate(self, request):
                return self.evaluate(request)
        mw = GuardrailMiddleware(CapturingProvider(), passport="./guardrails/passport.json")
        req = _make_tool_call_request("bash")
        mw.wrap_tool_call(req, MagicMock())
        assert captured["agent_id"] == "./guardrails/passport.json"
    def test_decision_contains_oap_reason_codes(self):
        mw = GuardrailMiddleware(_DenyAllProvider())
        req = _make_tool_call_request("bash")
        result = mw.wrap_tool_call(req, MagicMock())
        assert "oap.denied" in result.content
        assert "all tools blocked" in result.content
    def test_deny_with_empty_reasons_uses_fallback(self):
        """Provider returns deny with empty reasons list -- middleware uses fallback text."""
        class EmptyReasonProvider:
            name = "empty-reason"
            def evaluate(self, request):
                return GuardrailDecision(allow=False, reasons=[])
            async def aevaluate(self, request):
                return self.evaluate(request)
        mw = GuardrailMiddleware(EmptyReasonProvider())
        req = _make_tool_call_request("bash")
        result = mw.wrap_tool_call(req, MagicMock())
        assert result.status == "error"
        assert "blocked by guardrail policy" in result.content
    def test_empty_tool_name(self):
        """Tool call with empty name is handled gracefully."""
        mw = GuardrailMiddleware(_AllowAllProvider())
        req = _make_tool_call_request("")
        expected = MagicMock()
        handler = MagicMock(return_value=expected)
        result = mw.wrap_tool_call(req, handler)
        assert result is expected
    def test_protocol_isinstance_check(self):
        """AllowlistProvider satisfies GuardrailProvider protocol at runtime."""
        from deerflow.guardrails.provider import GuardrailProvider
        assert isinstance(AllowlistProvider(), GuardrailProvider)
    def test_async_allowed(self):
        mw = GuardrailMiddleware(_AllowAllProvider())
        req = _make_tool_call_request("web_search")
        expected = MagicMock()
        async def handler(r):
            return expected
        async def run():
            return await mw.awrap_tool_call(req, handler)
        result = asyncio.run(run())
        assert result is expected
    def test_async_denied(self):
        mw = GuardrailMiddleware(_DenyAllProvider())
        req = _make_tool_call_request("bash")
        async def handler(r):
            return MagicMock()
        async def run():
            return await mw.awrap_tool_call(req, handler)
        result = asyncio.run(run())
        assert result.status == "error"
    def test_async_fail_closed(self):
        mw = GuardrailMiddleware(_ExplodingProvider(), fail_closed=True)
        req = _make_tool_call_request("bash")
        async def handler(r):
            return MagicMock()
        async def run():
            return await mw.awrap_tool_call(req, handler)
        result = asyncio.run(run())
        assert result.status == "error"
    def test_async_fail_open(self):
        mw = GuardrailMiddleware(_ExplodingProvider(), fail_closed=False)
        req = _make_tool_call_request("bash")
        expected = MagicMock()
        async def handler(r):
            return expected
        async def run():
            return await mw.awrap_tool_call(req, handler)
        result = asyncio.run(run())
        assert result is expected
    def test_graph_bubble_up_not_swallowed(self):
        """GraphBubbleUp (LangGraph interrupt/pause) must propagate, not be caught."""
        class BubbleProvider:
            name = "bubble"
            def evaluate(self, request):
                raise GraphBubbleUp()
            async def aevaluate(self, request):
                raise GraphBubbleUp()
        mw = GuardrailMiddleware(BubbleProvider(), fail_closed=True)
        req = _make_tool_call_request("bash")
        with pytest.raises(GraphBubbleUp):
            mw.wrap_tool_call(req, MagicMock())
    def test_async_graph_bubble_up_not_swallowed(self):
        """Async: GraphBubbleUp must propagate."""
        class BubbleProvider:
            name = "bubble"
            def evaluate(self, request):
                raise GraphBubbleUp()
            async def aevaluate(self, request):
                raise GraphBubbleUp()
        mw = GuardrailMiddleware(BubbleProvider(), fail_closed=True)
        req = _make_tool_call_request("bash")
        async def handler(r):
            return MagicMock()
        async def run():
            return await mw.awrap_tool_call(req, handler)
        with pytest.raises(GraphBubbleUp):
            asyncio.run(run())
 # --- Config tests ---
 class TestGuardrailsConfig:
    def test_config_defaults(self):
        from deerflow.config.guardrails_config import GuardrailsConfig
        config = GuardrailsConfig()
        assert config.enabled is False
        assert config.fail_closed is True
        assert config.passport is None
        assert config.provider is None
    def test_config_from_dict(self):
        from deerflow.config.guardrails_config import GuardrailsConfig
        config = GuardrailsConfig.model_validate(
            {
                "enabled": True,
                "fail_closed": False,
                "passport": "./guardrails/passport.json",
                "provider": {
                    "use": "deerflow.guardrails.builtin:AllowlistProvider",
                    "config": {"denied_tools": ["bash"]},
                },
            }
        )
        assert config.enabled is True
        assert config.fail_closed is False
        assert config.passport == "./guardrails/passport.json"
        assert config.provider.use == "deerflow.guardrails.builtin:AllowlistProvider"
        assert config.provider.config == {"denied_tools": ["bash"]}
    def test_singleton_load_and_get(self):
        from deerflow.config.guardrails_config import get_guardrails_config, load_guardrails_config_from_dict, reset_guardrails_config
        try:
            load_guardrails_config_from_dict({"enabled": True, "provider": {"use": "test:Foo"}})
            config = get_guardrails_config()
            assert config.enabled is True
        finally:
            reset_guardrails_config()
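The `TestAllowlistProvider` cases in this file pin down the allow/deny semantics: no lists means allow, a denylist blocks listed tools, an allowlist blocks unlisted tools, and a tool present in both lists is denied. A minimal sketch of that decision order follows. It is not the `AllowlistProvider` source (which is not shown in this part of the diff); the `decide` function is hypothetical, and reusing `oap.tool_not_allowed` for the allowlist miss and treating an empty allowlist as "block everything" are both assumptions.

```python
from typing import Optional


def decide(
    tool_name: str,
    allowed: Optional[list[str]] = None,
    denied: Optional[list[str]] = None,
) -> tuple[bool, str]:
    """Return (allow, reason_code) for a tool name."""
    # An allowlist, when given, blocks every tool that is not on it.
    # Assumption: an empty list counts as "given" and therefore blocks everything.
    if allowed is not None and tool_name not in allowed:
        return False, "oap.tool_not_allowed"
    # The denylist is checked second, so a tool in both lists is still denied.
    if denied and tool_name in denied:
        return False, "oap.tool_not_allowed"
    return True, "oap.allowed"
```

Checking the denylist after the allowlist is what makes deny win when a tool appears in both, which matches `test_both_allowed_and_denied`.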
--- a/config.example.yaml
+++ b/config.example.yaml
@ -505,3 +505,38 @@ checkpointer:
 #           context:
 #             thinking_enabled: true
 #             subagent_enabled: true
 # ============================================================================
 # Guardrails Configuration
 # ============================================================================
 # Optional pre-execution authorization for tool calls.
 # When enabled, every tool call passes through the configured provider
 # before execution. Three options: built-in allowlist, OAP policy provider,
 # or custom provider. See backend/docs/GUARDRAILS.md for full documentation.
 #
 # Providers are loaded by class path via resolve_variable (same as models/tools).
 # --- Option 1: Built-in AllowlistProvider (zero external deps) ---
 # guardrails:
 #   enabled: true
 #   provider:
 #     use: deerflow.guardrails.builtin:AllowlistProvider
 #     config:
 #       denied_tools: ["bash", "write_file"]
 # --- Option 2: OAP passport provider (open standard, any implementation) ---
 #   The Open Agent Passport (OAP) spec defines passport format and decision codes.
 #   Any OAP-compliant provider works. Example using APort (reference implementation):
 #     pip install aport-agent-guardrails && aport setup --framework deerflow
 # guardrails:
 #   enabled: true
 #   provider:
 #     use: aport_guardrails.providers.generic:OAPGuardrailProvider
 # --- Option 3: Custom provider (any class with evaluate/aevaluate methods) ---
 # guardrails:
 #   enabled: true
 #   provider:
 #     use: my_package:MyGuardrailProvider
 #     config:
 #       key: value
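The `use:` values above follow a `package.module:ClassName` pattern, with `config:` supplying constructor arguments. As a rough sketch of how such a class path could be resolved with the standard library: this is not DeerFlow's `resolve_variable`, whose exact signature and behavior are not shown here; `load_by_class_path` is a hypothetical helper, and passing `config` as keyword arguments is an assumption.

```python
import importlib
from typing import Any, Optional


def load_by_class_path(use: str, config: Optional[dict[str, Any]] = None) -> Any:
    """Import 'package.module:ClassName' and instantiate it with config as keyword arguments."""
    module_path, sep, attr = use.partition(":")
    if not sep or not module_path or not attr:
        raise ValueError(f"expected 'module:Class', got {use!r}")
    # import_module raises ImportError for a bad module; getattr raises AttributeError for a bad name.
    cls = getattr(importlib.import_module(module_path), attr)
    return cls(**(config or {}))
```

For example, `load_by_class_path("fractions:Fraction", {"numerator": 1, "denominator": 2})` builds a standard-library `Fraction`; a guardrail provider class path would be resolved the same way under these assumptions.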