mirror of
https://github.com/bytedance/deer-flow.git
synced 2026-06-09 09:02:02 +00:00
* fix(dev): create backend/sandbox before uvicorn reload-exclude (#3459)

  #3426 switched the dev gateway's --reload-exclude patterns to absolute paths. uvicorn only excludes an absolute path directly when it already exists as a directory; otherwise it globs the pattern, and Python 3.12's pathlib raises NotImplementedError("Non-relative patterns are unsupported") for an absolute glob pattern.

  serve.sh mkdir'd the .deer-flow excludes but not backend/sandbox, so `make dev` crashed on startup on a fresh checkout under Python 3.12 (#3454). docker/dev-entrypoint.sh had the same latent gap.

  Create backend/sandbox in both launchers so every absolute exclude stays on uvicorn's is_dir() short-circuit. Add a regression test that pins the uvicorn mechanism (crash on missing dir, safe once created) and enforces that every absolute --reload-exclude is mkdir'd before launch.

  Closes #3459

* test(dev): harden reload-exclude invariant parser against false pass/negatives

  The launcher invariant test parsed shell with a "mkdir -p" line filter and a substring membership check. Two latent gaps (sub-threshold for this fix, but this code guards a user-facing startup path, so close them):

  - A `\`-continued multi-line `mkdir` would drop arguments on continuation lines, silently weakening coverage.
  - Substring membership could false-pass when an exclude is a path-prefix of a different created dir (e.g. `/app/backend/sandbox` "found" inside `/app/backend/sandbox-other`).

  Fold line-continuations, drop comments, and shlex-tokenize each `mkdir` argument list into an exact set (quotes stripped, `$VAR` literal); assert exact set membership. Same shlex handling for `--reload-exclude` values.

  Verified the parser still flags the pre-fix missing `backend/sandbox` (RED preserved) and no longer false-passes on a path-prefix.

* fix(dev): gitignore backend/sandbox runtime dir + pin mkdir-before-launch

  Address two review findings on the #3459 fix:

  - backend/sandbox was described as "gitignored runtime state" but no ignore rule actually matched it. Add an anchored `/sandbox/` to backend/.gitignore (anchored so it does NOT shadow the source package backend/packages/harness/deerflow/sandbox/) so sandbox artifacts created at runtime can't pollute the working tree or be committed by accident. New test asserts content under backend/sandbox is ignored, making the claim verifiable.
  - The launcher invariant test only proved the sandbox mkdir exists somewhere, not that it runs before uvicorn starts. Add an order test (sandbox mkdir line must precede the `uv run uvicorn` launch) so a future edit can't move the mkdir below the launch and silently reintroduce the crash.

* Potential fix for pull request finding

  Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

* test(dev): fix reload-exclude parser to handle serve.sh's quoted flag bundle

  The previous autofix tokenized each whole line with shlex, but serve.sh packs every flag into a single double-quoted `GATEWAY_EXTRA_FLAGS="..."` assignment. shlex collapses that into one token, so no `--reload-exclude` flag is found and `test_launcher_precreates_every_absolute_reload_exclude[scripts/serve.sh]` failed CI with "expected at least one absolute reload-exclude".

  Parse `--reload-exclude` with a regex that matches a balanced single/double quoted group or a bare token, so the assignment's surrounding `"` is never swallowed into the value. This recovers all three serve.sh excludes (the prior regex also silently dropped the last `$BACKEND_RUNTIME_HOME` because the adjacent closing quote broke shlex) while still covering dev-entrypoint.sh and the space-separated `--reload-exclude <value>` form.

---------

Co-authored-by: Willem Jiang <willem.jiang@gmail.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
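
The invariant these commits describe (every absolute `--reload-exclude` path exists as a directory before uvicorn starts) could be sketched in POSIX sh roughly as follows. This is an illustration only, not code from the repository: the helper name `ensure_exclude_dirs` is hypothetical, and it assumes the exclude paths contain no whitespace.

```shell
#!/usr/bin/env sh
# Hypothetical helper (not part of the repository): create every absolute
# exclude directory first, then print the matching --reload-exclude flags.
# Assumes paths contain no whitespace, since the flags are space-joined.
ensure_exclude_dirs() {
    flags=""
    for dir in "$@"; do
        case "$dir" in
            /*) ;;  # absolute path: accepted
            *)
                echo "reload-exclude '$dir' is not absolute" >&2
                return 1
                ;;
        esac
        # Creating the directory up front keeps the exclude on the
        # "already exists as a directory" path described above.
        mkdir -p "$dir" || return 1
        flags="$flags --reload-exclude=$dir"
    done
    # Strip the single leading space left by the concatenation.
    printf '%s\n' "${flags# }"
}
```

A launcher could capture the printed flags and pass them to uvicorn unquoted, so the directories are guaranteed to exist by the time the flags are used.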
100 lines
4.3 KiB
Bash
Executable File
#!/usr/bin/env sh
#
# DeerFlow gateway dev entrypoint — runs inside the docker-compose-dev gateway
# container. Extracted from docker/docker-compose-dev.yaml's inline `command:`
# (PR #2767, addressing review on Issue #2754).
#
# Responsibilities:
# 1. Resolve `--extra X` flags from UV_EXTRAS (comma- or whitespace-separated,
# mirroring scripts/detect_uv_extras.py for parity with local `make dev`).
# 2. Validate each extra against [A-Za-z][A-Za-z0-9_-]* so a stray shell
# metacharacter in `.env` cannot reach `uv sync`.
# 3. `uv sync --all-packages` so workspace member extras (deerflow-harness's
# postgres extra in particular) are installed — see PR #2584.
# 4. Self-heal: if the first sync fails, recreate .venv and retry once.
# 5. Hand off to uvicorn with reload, replacing this shell so uvicorn becomes
# PID 1 inside the container.
#
# Anchored at /bin/sh (not bash) since alpine-based base images may not ship
# bash. Uses POSIX-only constructs throughout.

set -e

# `--print-extras` is a dry-run hook: parse + validate UV_EXTRAS, print the
# resulting `--extra X` flags to stdout, and exit. Used by the unit test in
# backend/tests/test_dev_entrypoint.py and useful for ad-hoc debugging.
PRINT_EXTRAS_ONLY=0
if [ "${1:-}" = "--print-extras" ]; then
    PRINT_EXTRAS_ONLY=1
fi

# Mirror the legacy command's behavior: redirect both stdout and stderr to the
# host-mounted log file (../logs/gateway.log → /app/logs/gateway.log). Skip
# the redirect under --print-extras so the test runner can capture stdout.
if [ "$PRINT_EXTRAS_ONLY" = "0" ]; then
    exec >/app/logs/gateway.log 2>&1
fi

# ── Resolve extras ──────────────────────────────────────────────────────────

EXTRAS_FLAGS=""
if [ -n "${UV_EXTRAS:-}" ]; then
    # Normalize comma → space, then split on whitespace via the unquoted `for`.
    for raw in $(printf '%s' "$UV_EXTRAS" | tr ',' ' '); do
        [ -z "$raw" ] && continue
        # Reject anything that does not look like an identifier.
        # Two patterns: leading non-letter, or any non-[A-Za-z0-9_-] character.
        case "$raw" in
            [!A-Za-z]* | *[!A-Za-z0-9_-]*)
echo "[startup] UV_EXTRAS entry '$raw' is invalid (must match [A-Za-z][A-Za-z0-9_-]*) — aborting" >&2
                exit 1
                ;;
        esac
        EXTRAS_FLAGS="$EXTRAS_FLAGS --extra $raw"
    done
fi

if [ "$PRINT_EXTRAS_ONLY" = "1" ]; then
    # Trim leading space for tidier output, then exit.
    printf '%s\n' "${EXTRAS_FLAGS# }"
    exit 0
fi

if [ -n "$EXTRAS_FLAGS" ]; then
    echo "[startup] uv extras:$EXTRAS_FLAGS"
fi

# Keep runtime-owned files out of uvicorn's reload watcher. Each excluded path
# must exist before uvicorn starts so watchfiles treats it as an excluded
# directory, not as a plain glob pattern — on Python 3.12, globbing an absolute
# pattern raises NotImplementedError and crashes startup (#3459 / #3454). That
# means `sandbox` must be created here too, not just `.deer-flow`.
: "${DEER_FLOW_HOME:=/app/backend/.deer-flow}"
export DEER_FLOW_HOME
mkdir -p "$DEER_FLOW_HOME" /app/backend/.deer-flow /app/backend/sandbox

# ── Sync dependencies (with self-heal) ──────────────────────────────────────

cd /app/backend

# `--all-packages` propagates extras into workspace members (PR #2584).
# `$EXTRAS_FLAGS` intentionally unquoted so each `--extra X` becomes its own arg.
# shellcheck disable=SC2086 # word-splitting is intentional here
if ! uv sync --all-packages $EXTRAS_FLAGS; then
    echo "[startup] uv sync failed; recreating .venv and retrying once"
    uv venv --allow-existing .venv
    # shellcheck disable=SC2086
    uv sync --all-packages $EXTRAS_FLAGS
fi

# ── Hand off to uvicorn ─────────────────────────────────────────────────────

PYTHONPATH=. exec uv run uvicorn app.gateway.app:app \
    --host 0.0.0.0 --port 8001 \
    --reload \
    --reload-include='*.yaml' \
    --reload-include='.env' \
    --reload-exclude=/app/backend/sandbox \
    --reload-exclude="$DEER_FLOW_HOME" \
    --reload-exclude=/app/backend/.deer-flow
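
The extras-resolution step of the script can be tried in isolation, without touching `uv` or the log redirect. The following is a minimal sketch that wraps the same comma-to-space normalization and `case` validation in a function; the name `resolve_extras` and the shortened error message are assumptions made for illustration and do not appear in the repository.

```shell
#!/usr/bin/env sh
# Hypothetical wrapper (not in the repository) around the script's
# UV_EXTRAS parsing: takes the raw value as $1 and prints the flags.
resolve_extras() {
    flags=""
    # Commas become spaces; the unquoted substitution then splits on
    # whitespace, exactly as the entrypoint's `for` loop does.
    for raw in $(printf '%s' "$1" | tr ',' ' '); do
        case "$raw" in
            # Leading non-letter, or any character outside [A-Za-z0-9_-].
            [!A-Za-z]* | *[!A-Za-z0-9_-]*)
                echo "invalid extra: '$raw'" >&2
                return 1
                ;;
        esac
        flags="$flags --extra $raw"
    done
    # Drop the single leading space left by the concatenation.
    printf '%s\n' "${flags# }"
}

resolve_extras 'postgres, redis'
```

Returning a non-zero status instead of calling `exit` keeps the function usable from a test harness; the real script exits immediately because it runs as the container entrypoint.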