penpot/frontend/src/app/main/data/uploads.cljs
Andrey Antukh 6fa440cf92 🎉 Add chunked upload API for large media and binary files
Introduce a purpose-agnostic three-step session-based upload API that
allows uploading large binary blobs (media files and .penpot imports)
without hitting multipart size limits.

Backend:
- Migration 0147: new `upload_session` table (profile_id, total_chunks,
  created_at) with indexes on profile_id and created_at.
- Three new RPC commands in media.clj:
    * `create-upload-session`  – allocates a session row; enforces
      `upload-sessions-per-profile` and `upload-chunks-per-session`
      quota limits (configurable in config.clj, defaults 5 / 20).
    * `upload-chunk`           – stores each slice as a storage object;
      validates chunk index bounds and profile ownership.
    * `assemble-file-media-object` – reassembles chunks via the shared
      `assemble-chunks!` helper and creates the final media object.
- `assemble-chunks!` is a public helper in media.clj shared by both
  `assemble-file-media-object` and `import-binfile`.
- `import-binfile` (binfile.clj): accepts an optional `upload-id` param;
  when provided, materialises the temp file from chunks instead of
  expecting an inline multipart body, removing the 200 MiB body limit
  on .penpot imports.  Schema updated with an `:and` validator requiring
  either `:file` or `:upload-id`.
- quotes.clj: new `upload-sessions-per-profile` quota check.
- Background GC task (`tasks/upload_session_gc.clj`): deletes stalled
  (never-completed) sessions older than 1 hour; scheduled daily at
  midnight via the cron system in main.clj.
- backend/AGENTS.md: document the background-task wiring pattern.
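
The chunk-index bounds check and the "missing chunks" condition described above can be illustrated with a small sketch. This is not the code in media.clj; `valid-chunk-index?` and `missing-chunk-indices` are hypothetical names chosen for illustration, and the sketch assumes chunks are identified by zero-based integer indices.

```clojure
;; Hypothetical helpers, not part of the actual backend.

(defn valid-chunk-index?
  "True when `index` is an integer in the range [0, total-chunks)."
  [total-chunks index]
  (and (integer? index)
       (<= 0 index)
       (< index total-chunks)))

(defn missing-chunk-indices
  "Returns a vector of the chunk indices in [0, total-chunks) that do not
  appear in `uploaded` (any collection of indices), in ascending order."
  [total-chunks uploaded]
  (let [seen (set uploaded)]
    (vec (remove seen (range total-chunks)))))
```

Duplicate indices in `uploaded` are harmless in this sketch, since they collapse into the set; an assemble step could refuse to proceed whenever the returned vector is non-empty.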

Frontend:
- New `app.main.data.uploads` namespace: generic `upload-blob-chunked`
  helper drives steps 1–2 (create session + upload all chunks with a
  concurrency cap of 2) and emits `{:session-id uuid}` for callers.
- `config.cljs`: expose `upload-chunk-size` (default 25 MiB, overridable
  via `penpotUploadChunkSize` global).
- `workspace/media.cljs`: blobs ≥ chunk-size go through the chunked path
  (`upload-blob-chunked` → `assemble-file-media-object`); smaller blobs
  use the existing direct `upload-file-media-object` path.
  `handle-media-error` simplified; `on-error` callback removed.
- `worker/import.cljs`: new `import-blob-via-upload` helper replaces the
  inline multipart approach for both binfile-v1 and binfile-v3 imports.
- `repo.cljs`: `:upload-chunk` derived as a `::multipart-upload`;
  `form-data?` removed from `import-binfile` (JSON params only).
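
The routing rule above (blobs at or above the chunk size take the chunked path) and the size ceiling implied by the quotas can be sketched as follows. `upload-route` and `max-session-bytes` are hypothetical names, not functions from `workspace/media.cljs`. The ceiling calculation reads the "5 / 20" defaults as 5 sessions per profile and 20 chunks per session, which is an assumption about the ordering.

```clojure
;; Hypothetical helpers illustrating the size threshold, not the real code.

(def mib (* 1024 1024))

(defn upload-route
  "Returns :chunked when `blob-size` is at least `chunk-size` bytes,
  otherwise :direct."
  [blob-size chunk-size]
  (if (>= blob-size chunk-size) :chunked :direct))

(defn max-session-bytes
  "Largest payload a single session can carry, given the chunk size and
  the maximum number of chunks allowed per session."
  [chunk-size max-chunks]
  (* chunk-size max-chunks))
```

Under those assumed defaults, `(max-session-bytes (* 25 mib) 20)` works out to 500 MiB per session.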

Tests:
- Backend (rpc_media_test.clj): happy path, idempotency, permission
  isolation, invalid media type, missing chunks, session-not-found,
  chunk-index out-of-range, and quota-limit scenarios.
- Frontend (uploads_test.cljs): session creation and chunk-count
  correctness for `upload-blob-chunked`.
- Frontend (workspace_media_test.cljs): direct-upload path for small
  blobs, chunked path for large blobs, and chunk-count correctness for
  `process-blobs`.
- `helpers/http.cljs`: shared fetch-mock helpers (`install-fetch-mock!`,
  `make-json-response`, `make-transit-response`, `url->cmd`).

Signed-off-by: Andrey Antukh <niwi@niwi.nz>
2026-04-16 19:43:57 +02:00

;; This Source Code Form is subject to the terms of the Mozilla Public
;; License, v. 2.0. If a copy of the MPL was not distributed with this
;; file, You can obtain one at http://mozilla.org/MPL/2.0/.
;;
;; Copyright (c) KALEIDOS INC

(ns app.main.data.uploads
  "Generic chunked-upload helpers.

  Provides a purpose-agnostic three-step session API that can be used
  by any feature that needs to upload large binary blobs:

    1. create-upload-session: obtain a session-id
    2. upload-chunk: upload each slice (max-parallel-chunk-uploads in-flight)
    3. caller-specific step: e.g. assemble-file-media-object or import-binfile

  `upload-blob-chunked` drives steps 1 and 2 and emits the completed
  `{:session-id …}` map so that the caller can proceed with its own
  step 3."
  (:require
   [app.common.data.macros :as dm]
   [app.common.uuid :as uuid]
   [app.config :as cf]
   [app.main.repo :as rp]
   [beicon.v2.core :as rx]))

;; Size of each upload chunk in bytes. Reads the penpotUploadChunkSize global
;; variable at startup; defaults to 25 MiB (overridden in production).
(def ^:private chunk-size cf/upload-chunk-size)

(def ^:private max-parallel-chunk-uploads
  "Maximum number of chunk upload requests that may be in-flight at the
  same time within a single chunked upload session."
  2)

(defn upload-blob-chunked
  "Uploads `blob` via the three-step chunked session API.

  Steps performed:
    1. Creates an upload session (`create-upload-session`).
    2. Slices `blob` and uploads every chunk (`upload-chunk`),
       with at most `max-parallel-chunk-uploads` concurrent requests.

  Returns an observable that emits exactly one map:
    `{:session-id <uuid>}`

  The caller is responsible for the final step (assemble / import)."
  [blob]
  (let [total-size   (.-size blob)
        total-chunks (js/Math.ceil (/ total-size chunk-size))]
    (->> (rp/cmd! :create-upload-session
                  {:total-chunks total-chunks})
         (rx/mapcat
          (fn [{raw-session-id :session-id}]
            (let [;; Coerce a string session id into a uuid instance.
                  session-id (cond-> raw-session-id
                               (string? raw-session-id) uuid/uuid)
                  chunk-uploads
                  (->> (range total-chunks)
                       (map (fn [idx]
                              (let [start (* idx chunk-size)
                                    ;; The final chunk may be shorter than chunk-size.
                                    end   (min (+ start chunk-size) total-size)
                                    chunk (.slice blob start end)]
                                (rp/cmd! :upload-chunk
                                         {:session-id session-id
                                          :index idx
                                          :content (list chunk (dm/str "chunk-" idx))})))))]
              ;; Run at most max-parallel-chunk-uploads requests at once and
              ;; emit the session id once all of them have completed.
              (->> (rx/from chunk-uploads)
                   (rx/merge-all max-parallel-chunk-uploads)
                   (rx/last)
                   (rx/map (fn [_] {:session-id session-id})))))))))
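
The slicing arithmetic inside `upload-blob-chunked` can be checked in isolation. The sketch below is a pure function that mirrors the `start`/`end` computation without touching blobs or the network; `chunk-ranges` is a hypothetical name, and it uses integer division rather than `js/Math.ceil` so that it runs in both Clojure and ClojureScript.

```clojure
;; Hypothetical helper mirroring the slice boundaries used for each chunk.

(defn chunk-ranges
  "Returns a vector of {:index :start :end} maps covering `total-size`
  bytes in slices of at most `chunk-size` bytes. `:end` is exclusive,
  matching the semantics of Blob.slice."
  [total-size chunk-size]
  (let [;; Ceiling division without floating point.
        total-chunks (quot (+ total-size (dec chunk-size)) chunk-size)]
    (mapv (fn [idx]
            (let [start (* idx chunk-size)]
              {:index idx
               :start start
               ;; The final chunk may be shorter than chunk-size.
               :end   (min (+ start chunk-size) total-size)}))
          (range total-chunks))))
```

For example, `(chunk-ranges 60 25)` yields three ranges, the last one covering bytes 50 to 60. A zero-byte input yields no ranges at all, an edge case that callers of the real function may want to rule out before starting a session.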