deerflow.runtime Design Overview
This document describes the current implementation of backend/packages/harness/deerflow/runtime, including its overall design, boundary model, the collaboration between runs and stream_bridge, how it interacts with external infrastructure and the app layer, and how actor_context is dynamically injected to provide user isolation.
1. Overall Role
deerflow.runtime is the runtime kernel layer of DeerFlow.
It sits below agents / tools / middlewares and above app / gateway / infra. Its purpose is to define runtime semantics and boundary contracts, without directly owning web endpoints, ORM models, or concrete infrastructure implementations.
Its public surface is re-exported from __init__.py and currently exposes four main capability areas:
- runs: run domain types, execution facade, lifecycle observers, and store protocols
- stream_bridge: stream event bridge contract and public stream types
- actor_context: request/task-scoped actor context and user-isolation bridge
- serialization: runtime serialization helpers for LangChain / LangGraph data and outward-facing events
Structurally, the current package looks like:
runtime
├─ runs
│ ├─ facade / types / observer / store
│ ├─ internal/*
│ └─ callbacks/*
├─ stream_bridge
│ ├─ contract
│ └─ exceptions
├─ actor_context
└─ serialization / converters
2. Overall Design and Constraint Model
2.1 Design Goal
The core goal of runtime is to decouple runtime control-plane semantics from infrastructure implementations.
It only cares about:
- What a run is and how run state changes over time
- What lifecycle events and stream events are produced during execution
- Which capabilities must be injected from the outside, such as checkpointer, event store, stream bridge, and durable stores
- Who the current actor is, and how lower layers can use that for isolation
It deliberately does not care about:
- Whether events are stored in memory, Redis, or another transport
- How run / thread / feedback data is persisted
- HTTP / SSE / FastAPI details
- How the auth plugin resolves the request user
2.2 Boundary Rules
The current package has a fairly clear boundary model:
- runs owns execution orchestration, not ORM or SQL writes
- stream_bridge defines stream semantics, not app-level bridge construction
- actor_context defines runtime context, not auth-plugin behavior
- Durable data enters only through boundary protocols: RunCreateStore, RunQueryStore, RunDeleteStore, RunEventStore
- Lifecycle side effects enter only through RunObserver
- User isolation is not implemented ad hoc in each module; it is propagated through actor context
In one sentence:
runtime defines semantics and contracts; app.infra provides implementations.
3. runs Subsystem Design
3.1 Purpose
runtime/runs is the run orchestration domain. It is responsible for:
- Defining run domain objects and status transitions
- Organizing create / stream / wait / join / cancel / delete behavior
- Maintaining the in-process runtime control plane
- Emitting stream events and lifecycle events during execution
- Collecting trace, token, title, and message data through callbacks
3.2 Core Objects
See runs/types.py.
The most important types are:
- RunSpec: built by the app-side input layer; the real execution input
- RunRecord: the runtime record managed by RunRegistry
- RunStatus: pending, starting, running, success, error, interrupted, timeout
- RunScope: distinguishes stateful vs stateless execution and temporary thread behavior
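For illustration, a minimal sketch of these shapes that keeps only the statuses and roles named above (exact field names beyond the status values are assumptions):

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any


class RunStatus(str, Enum):
    """The status values listed above."""
    PENDING = "pending"
    STARTING = "starting"
    RUNNING = "running"
    SUCCESS = "success"
    ERROR = "error"
    INTERRUPTED = "interrupted"
    TIMEOUT = "timeout"


@dataclass
class RunSpec:
    """The real execution input, built by the app-side input layer (fields assumed)."""
    run_id: str
    thread_id: str | None = None
    input: dict[str, Any] = field(default_factory=dict)


@dataclass
class RunRecord:
    """In-process runtime record managed by RunRegistry (fields assumed)."""
    spec: RunSpec
    status: RunStatus = RunStatus.PENDING
```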
3.3 Current Constraints
The current implementation explicitly limits some parts of the problem space:
- multitask_strategy currently supports only reject and interrupt on the main path
- enqueue, after_seconds, and batch execution are not on the current primary path
- RunRegistry is an in-process state source, not a durable source of truth
- External queries may use durable stores, but the live control plane still centers on the in-memory registry
3.4 Facade and Internal Components
RunsFacade in runs/facade.py provides the unified API:
- create_background
- create_and_stream
- create_and_wait
- join_stream
- join_wait
- cancel
- get_run
- list_runs
- delete_run
Internally it composes:
- RunRegistry
- ExecutionPlanner
- RunSupervisor
- RunStreamService
- RunWaitService
- RunCreateStore / RunQueryStore / RunDeleteStore
- RunObserver
So RunsFacade is the public entry point, while execution and state transitions are distributed across smaller components.
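A hypothetical usage sketch of that surface (the import path, return shapes, and whether the create/join methods yield events directly are assumptions, not the actual signatures):

```python
from deerflow.runtime.runs import RunsFacade, RunSpec  # import path assumed


async def demo(facade: RunsFacade, spec: RunSpec) -> None:
    # Fire-and-forget background execution.
    record = await facade.create_background(spec)

    # Create a run and consume its stream events as they arrive.
    async for event in facade.create_and_stream(spec):
        print(event)

    # Attach to an existing run's stream, then cancel the run.
    async for event in facade.join_stream(record.run_id):
        print(event)
    await facade.cancel(record.run_id)
```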
4. stream_bridge Design and Implementation
4.1 Why stream_bridge Is a Separate Abstraction
StreamBridge is defined in stream_bridge/contract.py.
It exists because run execution needs an event channel that is:
- Subscribable
- Replayable
- Terminal-state aware
- Resume-capable
That behavior must not be hard-coupled to HTTP SSE, in-memory queues, or Redis-specific details.
So:
- harness defines stream semantics
- the app layer owns backend selection and implementation
4.2 Contract Contents
The abstract StreamBridge currently exposes:
- publish(run_id, event, data)
- publish_end(run_id)
- publish_terminal(run_id, kind, data)
- subscribe(run_id, last_event_id, heartbeat_interval)
- cleanup(run_id, delay=0)
- cancel(run_id)
- mark_awaiting_input(run_id)
- start()
- close()
Public types include:
- StreamEvent
- StreamStatus
- ResumeResult
- HEARTBEAT_SENTINEL / END_SENTINEL / CANCELLED_SENTINEL
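Put together, the contract shape looks roughly like this (a sketch: async signatures and type hints are assumptions; the authoritative definition is stream_bridge/contract.py):

```python
from abc import ABC, abstractmethod
from collections.abc import AsyncIterator
from typing import Any


class StreamBridge(ABC):
    """Abstract stream-event channel for runs (simplified sketch)."""

    @abstractmethod
    async def publish(self, run_id: str, event: str, data: Any) -> None: ...

    @abstractmethod
    async def publish_end(self, run_id: str) -> None: ...

    @abstractmethod
    async def publish_terminal(self, run_id: str, kind: str, data: Any) -> None: ...

    @abstractmethod
    def subscribe(
        self,
        run_id: str,
        last_event_id: str | None = None,
        heartbeat_interval: float | None = None,
    ) -> AsyncIterator[Any]: ...

    @abstractmethod
    async def cleanup(self, run_id: str, delay: float = 0) -> None: ...

    @abstractmethod
    async def cancel(self, run_id: str) -> None: ...

    @abstractmethod
    async def mark_awaiting_input(self, run_id: str) -> None: ...

    @abstractmethod
    async def start(self) -> None: ...

    @abstractmethod
    async def close(self) -> None: ...
```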
4.3 Semantic Boundary
The contract explicitly distinguishes:
- end / cancel / error: real business-level terminal events for a run
- close(): bridge-level shutdown, not equivalent to run cancellation
4.4 Current Implementation Style
The concrete implementation currently used is the app-layer MemoryStreamBridge.
Its design is effectively “one in-memory event log per run”:
- _RunStream stores the event list, offset mapping, status, subscriber count, and awaiting-input state
- publish() generates increasing event IDs and appends to the per-run log
- subscribe() supports replay, heartbeat, resume, and terminal exit
- cleanup_loop() handles old streams, active streams with no publish activity, orphan terminal streams, and TTL expiration
- mark_awaiting_input() extends timeout behavior for HITL flows
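A minimal sketch of the per-run append-only log idea, assuming a single asyncio event loop (the real _RunStream also tracks offsets, subscriber counts, status, and awaiting-input state):

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any


@dataclass
class _RunStreamSketch:
    """Per-run append-only event log with replay (simplified)."""
    events: list[tuple[int, str, Any]] = field(default_factory=list)
    next_id: int = 1
    new_event: asyncio.Event = field(default_factory=asyncio.Event)
    closed: bool = False

    def publish(self, event: str, data: Any) -> int:
        """Assign an increasing event ID and append to the per-run log."""
        event_id = self.next_id
        self.next_id += 1
        self.events.append((event_id, event, data))
        self.new_event.set()
        return event_id

    async def subscribe(self, last_event_id: int = 0):
        """Replay events after last_event_id, then follow new ones until closed."""
        cursor = last_event_id
        while True:
            self.new_event.clear()
            for event_id, event, data in list(self.events):
                if event_id > cursor:
                    cursor = event_id
                    yield event_id, event, data
            if self.closed:
                return
            await self.new_event.wait()
```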
The Redis implementation is still only a placeholder in RedisStreamBridge.
4.5 Call Chain
The stream bridge participates in the execution chain like this:
RunsFacade
-> RunStreamService
-> StreamBridge
-> app route converts events to SSE
More concretely:
- _RunExecution._start() publishes metadata
- _RunExecution._stream() converts agent astream() output into bridge events
- _RunExecution._finish_success() / _finish_failed() / _finish_aborted() publish terminal events
- RunWaitService waits by subscribing for values, error, or terminal events
- The app route layer converts those events into outward-facing SSE
4.6 Future Extensions
Likely future directions include:
- A real Redis bridge for cross-process / multi-instance streaming
- Stronger Last-Event-ID gap recovery behavior
- Richer HITL state handling
- Cross-node run coordination and more explicit dead-letter strategies
5. External Communication and Store Read/Write Boundaries
5.1 Two Main Outward Boundaries
runtime does not send HTTP requests directly and does not write ORM models directly, but it communicates outward through two main boundaries:
- StreamBridge: for outward-facing stream events
- store / observer: for durable data and lifecycle side effects
5.2 Store Boundary Protocols
Under runs/store, the harness layer defines:
- RunCreateStore
- RunQueryStore
- RunDeleteStore
- RunEventStore
These are not harness-internal persistence implementations. They are app-facing contracts declared by the runtime.
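As an illustration of what "app-facing contract" means in code, a sketch of one such protocol using typing.Protocol (the method names and return shapes are assumptions):

```python
from typing import Any, Protocol


class RunQueryStore(Protocol):
    """App-facing durable read contract declared by the runtime (sketch)."""

    async def get_run(self, run_id: str) -> dict[str, Any] | None:
        """Return the durable run record, or None if unknown."""
        ...

    async def list_runs(self, thread_id: str | None = None) -> list[dict[str, Any]]:
        """Return durable run records, optionally scoped to one thread."""
        ...
```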
5.3 How the app Layer Supplies Store Implementations
The app layer currently provides AppRunCreateStore, AppRunQueryStore, and AppRunDeleteStore (see 8.1), plus the run event stores described in 5.5.
The shared pattern is:
- harness depends only on protocols
- the app layer owns session lifecycle, commit behavior, access control, and backend choice
- durable data eventually lands in store.repositories.* or JSONL files
5.4 How Run Lifecycle Data Leaves the Runtime
The single-run executor _RunExecution does not write to the database directly.
It exports data through three paths:
- Bridge events: streamed outward to subscribers
- Callbacks -> RunEventStore: execution trace / message / tool / custom events are persisted in batches
- Lifecycle events -> RunObserver: run started, completed, failed, cancelled, and thread-status updates are emitted for app observers
5.5 RunEventStore Backends
The app-side factory app/infra/run_events/factory.py currently selects:
run_events.backend == "db"AppRunEventStore
run_events.backend == "jsonl"JsonlRunEventStore
So the runtime does not care whether events end up in a database or in files. It only requires the event-store protocol.
6. Run Lifecycle Data, Callbacks, Write-Back, and Query Flow
6.1 Main Single-Run Flow
The main _RunExecution.run() flow is:
1. _start()
2. _prepare()
3. _stream()
4. _finish_after_stream()
5. finally: _emit_final_thread_status(), callbacks.flush(), bridge.cleanup(run_id)
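Rendered as a sketch, the flow is a plain try/finally skeleton (stage bodies are stubs; the real methods live on _RunExecution, and the bridge/callbacks attributes are assumed):

```python
class _RunExecutionSketch:
    """Shape of the single-run flow above (sketch; stage bodies are stubs)."""

    def __init__(self, run_id: str, bridge, callbacks):
        self.run_id = run_id
        self.bridge = bridge        # StreamBridge-like object (assumed)
        self.callbacks = callbacks  # callback collection with flush() (assumed)

    async def _start(self) -> None: ...                # status + RUN_STARTED + metadata
    async def _prepare(self) -> None: ...              # agent, checkpoint, callbacks
    async def _stream(self) -> None: ...               # agent astream() -> bridge events
    async def _finish_after_stream(self) -> None: ...  # terminal handling
    async def _emit_final_thread_status(self) -> None: ...

    async def run(self) -> None:
        try:
            await self._start()
            await self._prepare()
            await self._stream()
            await self._finish_after_stream()
        finally:
            # These cleanup steps run on success, failure, and cancellation.
            await self._emit_final_thread_status()
            await self.callbacks.flush()
            await self.bridge.cleanup(self.run_id)
```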
6.2 What the Start Phase Records
_start():
- sets run status to running
- emits RUN_STARTED
- extracts the first human message and emits HUMAN_MESSAGE
- captures the pre-run checkpoint ID
- publishes a metadata stream event
6.3 What the Callbacks Collect
Callbacks live under runs/callbacks.
The main ones are:
- RunEventCallback: records run_start, run_end, llm_request, llm_response, tool_start, tool_end, tool_result, custom_event, and more; flushes batches into RunEventStore
- RunTokenCallback: aggregates token usage, LLM call counts, lead/subagent/middleware token split, message counts, first human message, and last AI message
- RunTitleCallback: extracts thread title from title middleware output or custom events
6.4 How completion_data Is Produced
RunTokenCallback.completion_data() yields RunCompletionData, including:
- total_input_tokens / total_output_tokens / total_tokens
- llm_call_count
- lead_agent_tokens / subagent_tokens / middleware_tokens
- message_count
- last_ai_message / first_human_message
The executor includes this data in lifecycle payloads on success, failure, and cancellation.
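Rendered as a dataclass sketch with the fields listed above (the types are assumptions):

```python
from dataclasses import dataclass


@dataclass
class RunCompletionData:
    """Fields yielded by RunTokenCallback.completion_data() (types assumed)."""
    total_input_tokens: int
    total_output_tokens: int
    total_tokens: int
    llm_call_count: int
    lead_agent_tokens: int
    subagent_tokens: int
    middleware_tokens: int
    message_count: int
    last_ai_message: str | None
    first_human_message: str | None
```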
6.5 How the app Layer Writes Lifecycle Results Back
The executor emits RunLifecycleEvent objects through RunEventEmitter.
The app-layer StorageRunObserver then persists durable state:
- RUN_STARTED: marks the run as running
- RUN_COMPLETED: writes completion data; syncs thread title if present
- RUN_FAILED: writes error and completion data
- RUN_CANCELLED: writes interrupted state and completion data
- THREAD_STATUS_UPDATED: syncs thread status
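A sketch of how an observer in that role might dispatch (the event kinds follow the list above; the class shape, method names, and the run_store adapter are all illustrative):

```python
from typing import Any, Protocol


class RunLifecycleEvent(Protocol):
    """Lifecycle event shape (assumed for this sketch)."""
    kind: str                  # e.g. "run_started", "run_completed", ...
    run_id: str
    payload: dict[str, Any]


class StorageRunObserverSketch:
    """Persists durable state per lifecycle event (simplified sketch)."""

    def __init__(self, run_store):
        self.run_store = run_store  # assumed durable adapter

    async def on_event(self, event: RunLifecycleEvent) -> None:
        if event.kind == "run_started":
            await self.run_store.mark_running(event.run_id)
        elif event.kind == "run_completed":
            await self.run_store.write_completion(event.run_id, event.payload)
        elif event.kind == "run_failed":
            await self.run_store.write_error(event.run_id, event.payload)
        elif event.kind == "run_cancelled":
            await self.run_store.write_interrupted(event.run_id, event.payload)
        elif event.kind == "thread_status_updated":
            await self.run_store.sync_thread_status(event.run_id, event.payload)
```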
6.6 Query Paths
RunsFacade.get_run() and list_runs() have two paths:
- If a RunQueryStore is injected, durable state is used first
- Otherwise, the facade falls back to RunRegistry
So:
- the in-memory registry is the control plane
- the durable store is the preferred query surface
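A sketch of that fallback order (a free-function rendering; the real logic is a method on RunsFacade, and the registry API is assumed):

```python
from typing import Any


async def get_run(query_store, registry, run_id: str) -> Any:
    """Sketch of RunsFacade.get_run() fallback order (names illustrative)."""
    # Prefer durable state when a RunQueryStore was injected...
    if query_store is not None:
        record = await query_store.get_run(run_id)
        if record is not None:
            return record
    # ...otherwise fall back to the in-process RunRegistry.
    return registry.get(run_id)
```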
7. How actor_context Is Dynamically Injected for User Isolation
7.1 Design Goal
actor_context is defined in actor_context.py.
Its purpose is to let the runtime and lower-level infrastructure modules depend on a stable notion of “who the current actor is” without importing the auth plugin, FastAPI request objects, or a specific user model.
7.2 Current Implementation
The current implementation is a request/task-scoped context built on top of ContextVar:
- ActorContext: currently carries only user_id
- _current_actor: a ContextVar[ActorContext | None]
- bind_actor_context(actor): binds the current actor
- reset_actor_context(token): restores the previous context
- get_actor_context(): returns the current actor
- get_effective_user_id(): returns the current user ID or DEFAULT_USER_ID
- resolve_user_id(value=AUTO | explicit | None): resolves repository/storage-facing user IDs consistently
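The whole mechanism is small; a condensed sketch of the documented surface (the DEFAULT_USER_ID value and the exact signatures are assumptions):

```python
from contextvars import ContextVar, Token
from dataclasses import dataclass

DEFAULT_USER_ID = "default"  # placeholder; the real constant lives in actor_context.py


@dataclass(frozen=True)
class ActorContext:
    user_id: str


_current_actor: ContextVar[ActorContext | None] = ContextVar(
    "actor_context", default=None
)


def bind_actor_context(actor: ActorContext) -> Token:
    """Bind the current actor; returns a token for later restoration."""
    return _current_actor.set(actor)


def reset_actor_context(token: Token) -> None:
    """Restore whatever actor (or None) was bound before."""
    _current_actor.reset(token)


def get_actor_context() -> ActorContext | None:
    return _current_actor.get()


def get_effective_user_id() -> str:
    actor = _current_actor.get()
    return actor.user_id if actor is not None else DEFAULT_USER_ID
```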
7.3 How the app Layer Injects It Dynamically
Dynamic injection currently happens at the app/auth boundary.
For HTTP request flows:
- app.plugins.auth.security.middleware: builds ActorContext(user_id=...) from the authenticated request user, and binds/resets the runtime actor context around request handling
- app.plugins.auth.security.actor_context: provides bind_request_actor_context(request) and bind_user_actor_context(user_id), allowing routes and non-HTTP entry points to bind the runtime actor context explicitly
For non-HTTP / external channel flows:
Those entry points also wrap execution with bind_user_actor_context(user_id) before they enter runtime-facing code. This matters because:
- the runtime does not need to distinguish HTTP from Feishu or other channels
- any entry point that can resolve a user ID can inject the same isolation semantics
- the same runtime/store/path/memory code can stay protocol-agnostic
So the runtime itself does not know what a request is, and it does not know the auth plugin’s user model. It only knows whether an ActorContext is currently bound in the ContextVar.
7.4 Propagation Semantics After Injection
In practice, “dynamic injection” here does not mean manually threading user_id through every function signature. The app boundary binds the actor into a ContextVar, and runtime-facing code reads it only where isolation is actually needed.
The current semantics are:
- an entry boundary calls bind_actor_context(...)
- the async call chain created inside that context sees the same actor view
- the boundary restores the previous value with reset_actor_context(token) when the request/task exits
That gives two practical outcomes:
- most runtime interfaces do not need to carry user_id as an explicit parameter through every layer
- boundaries that do need durable isolation or path isolation can still read it explicitly via resolve_user_id() or get_effective_user_id()
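In practice the bind/reset pair is typically wrapped once at the boundary, for example as a context manager (an illustrative helper, not an existing API; the import path is assumed):

```python
from contextlib import contextmanager

from deerflow.runtime import (  # import path assumed
    ActorContext,
    bind_actor_context,
    get_effective_user_id,
    reset_actor_context,
)


@contextmanager
def actor_scope(user_id: str):
    """Illustrative helper: bind an actor for the duration of a request/task."""
    token = bind_actor_context(ActorContext(user_id=user_id))
    try:
        yield
    finally:
        # Restore whatever actor (or None) was bound before.
        reset_actor_context(token)


# Any runtime-facing code inside the scope sees the same actor view:
with actor_scope("user-42"):
    assert get_effective_user_id() == "user-42"
```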
7.5 How User Isolation Actually Works
User isolation is implemented through “dynamic injection + boundary-specific reads”.
The main paths are:
- path / uploads / sandbox / memory: use get_effective_user_id() to derive per-user directories and resource scopes
- app storage adapters: use resolve_user_id(AUTO) in RunStoreAdapter, ThreadMetaStorage, and related boundaries
- run event store: AppRunEventStore reads get_actor_context() and decides whether the current actor may see a thread
So user isolation is not centralized in a single middleware and then forgotten. Instead:
- the app boundary dynamically binds the actor into runtime context
- runtime and lower layers read that context when they need isolation input
- each boundary applies the user ID according to its own responsibility
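As a concrete illustration of a boundary-specific read, a hypothetical path helper (the directory layout and import path are assumptions):

```python
from pathlib import Path

from deerflow.runtime import get_effective_user_id  # import path assumed


def user_upload_dir(base: Path) -> Path:
    """Hypothetical helper: derive a per-user directory from the bound actor."""
    user_id = get_effective_user_id()  # DEFAULT_USER_ID when no actor is bound
    path = base / "uploads" / user_id
    path.mkdir(parents=True, exist_ok=True)
    return path
```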
7.6 Why This Approach Works Well
The current design has several practical strengths:
- The runtime does not depend on a specific auth implementation
- HTTP and non-HTTP entry points can reuse the same isolation mechanism
- The same user ID propagates naturally into paths, memory, store access, and event visibility
- Where stronger enforcement is needed, AUTO + resolve_user_id() can require a bound actor context
7.7 Future Extensions
ActorContext already contains explicit future-extension hints. The current pattern can be extended without changing the architecture:
- tenant_id: for multi-tenant isolation
- subject_id: for a more stable identity key
- scopes: for finer-grained authorization
- auth_source: to track the source channel or auth mechanism
The recommended extension model is to preserve the current shape:
- The app/auth boundary binds a richer ActorContext
- Lower layers read only the fields they actually need
- Store / path / sandbox / stream / memory boundaries can gradually become tenant-aware or scope-aware
More concretely, stronger isolation can be added incrementally at the boundaries:
- store boundaries: add tenant_id filtering in RunStoreAdapter, ThreadMetaStorage, and feedback/event stores
- path and sandbox boundaries: shard directories by tenant_id/user_id instead of user_id alone
- event-visibility boundaries: layer scopes or subject_id checks into run-event and thread queries
- external-channel boundaries: populate auth_source so API, channel, and internal-job traffic can be distinguished
That keeps the runtime dependent on the abstract “current actor context” concept, not on FastAPI request objects or a specific auth implementation.
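Rendered as a sketch, the extended context would keep the same frozen-dataclass shape, only with the hinted optional fields added (the defaults are assumptions):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActorContext:
    """Sketch: the current shape plus the hinted extension fields."""
    user_id: str
    tenant_id: str | None = None    # multi-tenant isolation
    subject_id: str | None = None   # a more stable identity key
    scopes: tuple[str, ...] = ()    # finer-grained authorization
    auth_source: str | None = None  # source channel / auth mechanism
```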
8. Interaction with the app Layer
8.1 How the app Layer Wires the Runtime
The app composition root for runs is app/gateway/services/runs/facade_factory.py.
It assembles:
- RunRegistry
- ExecutionPlanner
- RunSupervisor
- RunStreamService
- RunWaitService
- RunsRuntime (bridge, checkpointer, store, event_store, agent_factory_resolver)
- StorageRunObserver
- AppRunCreateStore / AppRunQueryStore / AppRunDeleteStore
8.2 How app.state Provides Infrastructure
- init_persistence() creates: persistence, checkpointer, run_store, thread_meta_storage, run_event_store
- init_runtime() creates: stream_bridge
Those objects are then attached to app.state for dependency injection and facade construction.
8.3 The app Boundary for stream_bridge
Concrete stream bridge construction now belongs entirely to the app layer:
- harness exports only the StreamBridge contract
- app.infra.stream_bridge.build_stream_bridge constructs the actual implementation
That is a very explicit boundary:
- harness defines runtime semantics and interfaces
- app selects and constructs infrastructure
9. Summary
The most accurate one-line summary of deerflow.runtime today is:
It is a runtime kernel built around run orchestration, a stream bridge as the streaming boundary, actor context as the dynamic isolation bridge, and store / observer protocols as the durable and side-effect boundaries.
More concretely:
- runs owns orchestration and lifecycle progression
- stream_bridge owns stream semantics
- actor_context owns runtime-scoped user context and isolation bridging
- serialization / converters own outward event and message formatting
- the app layer owns real persistence, stream infrastructure, and auth-driven context injection
The main strengths of this structure are:
- Runtime semantics are decoupled from infrastructure implementations
- Request identity is decoupled from runtime logic
- HTTP, CLI, and channel-worker entry points can reuse the same runtime boundaries
- The system can grow toward multi-tenancy, cross-process stream bridges, and richer durable backends without changing the core model
The current limitations are also clear:
- RunRegistry is still an in-process control plane
- The Redis bridge is not implemented yet
- Some multitask strategies and batch capabilities are still outside the main path
- ActorContext currently carries only user_id, not richer fields such as tenant, scopes, or auth source
So the best way to understand the current code is not as a final platform, but as a runtime kernel with clear semantics and extension boundaries.