Introduce a purpose-agnostic three-step session-based upload API that
allows uploading large binary blobs (media files and .penpot imports)
without hitting multipart size limits.
Backend:
- Migration 0147: new `upload_session` table (profile_id, total_chunks,
created_at) with indexes on profile_id and created_at.
- Three new RPC commands in media.clj (flow sketched after this list):
* `create-upload-session` – allocates a session row; enforces
`upload-sessions-per-profile` and `upload-chunks-per-session`
quota limits (configurable in config.clj, defaults 5 / 20).
* `upload-chunk` – stores each slice as a storage object;
validates chunk index bounds and profile ownership.
* `assemble-file-media-object` – reassembles chunks via the shared
`assemble-chunks!` helper and creates the final media object.
- `assemble-chunks!` is a public helper in media.clj shared by both
`assemble-file-media-object` and `import-binfile`.
- `import-binfile` (binfile.clj): accepts an optional `upload-id` param;
when provided, materialises the temp file from chunks instead of
expecting an inline multipart body, removing the 200 MiB body limit
on .penpot imports. Schema updated with an `:and` validator requiring
either `:file` or `:upload-id`.
- quotes.clj: new `upload-sessions-per-profile` quota check.
- Background GC task (`tasks/upload_session_gc.clj`): deletes stalled
(never-completed) sessions older than 1 hour; scheduled daily at
midnight via the cron system in main.clj.
- backend/AGENTS.md: document the background-task wiring pattern.
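For reference, a hedged sketch of the wire-level flow the three
commands implement; the exact param names are assumptions based on
the descriptions above, not the committed wire format:

```clojure
;; Step 1: allocate a session sized for the number of chunks.
{:cmd :create-upload-session
 :params {:total-chunks 3}}
;; => {:id #uuid "..."}

;; Step 2: upload each slice (one request per chunk, index-addressed,
;; validated against the session's total-chunks bound).
{:cmd :upload-chunk
 :params {:session-id session-id :index 0 :content chunk-0}}

;; Step 3: reassemble the chunks into the final media object.
{:cmd :assemble-file-media-object
 :params {:session-id session-id :file-id file-id :name "big-video.mp4"}}
```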
Frontend:
- New `app.main.data.uploads` namespace: generic `upload-blob-chunked`
helper drives steps 1–2 (create session + upload all chunks with a
concurrency cap of 2) and emits `{:session-id uuid}` for callers
(sketched after this list).
- `config.cljs`: expose `upload-chunk-size` (default 25 MiB, overridable
via `penpotUploadChunkSize` global).
- `workspace/media.cljs`: blobs ≥ chunk-size go through the chunked path
(`upload-blob-chunked` → `assemble-file-media-object`); smaller blobs
use the existing direct `upload-file-media-object` path.
`handle-media-error` simplified; `on-error` callback removed.
- `worker/import.cljs`: new `import-blob-via-upload` helper replaces the
inline multipart approach for both binfile-v1 and binfile-v3 imports.
- `repo.cljs`: `:upload-chunk` derived as a `::multipart-upload`;
`form-data?` removed from `import-binfile` (JSON params only).
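A minimal ClojureScript sketch of the slicing step inside
`upload-blob-chunked`; the helper name `blob->chunks` is
hypothetical, but the arithmetic is what the chunked path relies on:

```clojure
;; Hypothetical helper: split a Blob into chunk-size slices using the
;; standard Blob.slice web API; chunk-size comes from the
;; upload-chunk-size config value described above.
(defn blob->chunks
  [blob chunk-size]
  (let [total (.-size blob)]
    (->> (range 0 total chunk-size)
         (mapv (fn [offset]
                 (.slice blob offset (min total (+ offset chunk-size))))))))
```

With the 25 MiB default, a 60 MiB blob yields three chunks
(25 + 25 + 10 MiB).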
Tests:
- Backend (rpc_media_test.clj): happy path, idempotency, permission
isolation, invalid media type, missing chunks, session-not-found,
chunk-index out-of-range, and quota-limit scenarios.
- Frontend (uploads_test.cljs): session creation and chunk-count
correctness for `upload-blob-chunked`.
- Frontend (workspace_media_test.cljs): direct-upload path for small
blobs, chunked path for large blobs, and chunk-count correctness for
`process-blobs`.
- `helpers/http.cljs`: shared fetch-mock helpers (`install-fetch-mock!`,
`make-json-response`, `make-transit-response`, `url->cmd`); usage
sketched below.
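Hypothetical usage sketch of those helpers (the real signatures live
in `helpers/http.cljs`):

```clojure
;; Assumed shape: install a handler that inspects each request and
;; returns a canned response per RPC command.
(http/install-fetch-mock!
 (fn [request]
   (case (http/url->cmd (.-url request))
     "create-upload-session" (http/make-transit-response {:id session-id})
     "upload-chunk"          (http/make-transit-response {}))))
```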
Signed-off-by: Andrey Antukh <niwi@niwi.nz>
Extend the telemetry payload with a sorted list of unique email domains
extracted from all registered profile email addresses. The new
:email-domains field is populated via a single SQL query using
split_part and DISTINCT, and is included in the stats sent when
telemetry is enabled.
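A sketch of the query shape (the committed SQL may differ in
details):

```clojure
;; Sketch only: collect the distinct domain part of every profile
;; email, sorted, using Postgres split_part.
(def ^:private sql:email-domains
  "SELECT DISTINCT split_part(email, '@', 2) AS domain
     FROM profile
    ORDER BY domain")

(defn- get-email-domains
  [conn]
  (->> (db/exec! conn [sql:email-domains])
       (mapv :domain)))
```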
Also update the tasks-telemetry-test to assert the new field is present
and contains the expected domain values.
This simplifies the mental model of how it works and makes the
related code easier to test.
This also normalizes storage object deletion to follow the same
pattern as the rest of the objects in Penpot (a future deletion date
on a storage object now also means the object is scheduled for
deletion).
* ⚡ Improve efficiency on objects-gc tasks
Replacing db/cursor with db/plan (see the sketch below)
* 🐛 Use correct order of subprocs on objects-gc
Mainly affects the file deletion
* 📎 Increase chunk-size to 100 on objects-gc
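The cursor-to-plan change, roughly; this assumes next.jdbc-style
semantics where `db/plan` returns a reducible that streams rows, and
`delete-object!` plus the SQL name are hypothetical:

```clojure
;; Before: iterate a server-side cursor row by row.
(doseq [row (db/cursor conn [sql:get-expired-objects])]
  (delete-object! conn row))

;; After: reduce over a plan, streaming rows without building
;; intermediate row maps.
(run! #(delete-object! conn %)
      (db/plan conn [sql:get-expired-objects]))
```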
This upgrade also includes the complete elimination of spec usage
from the backend codebase, completing the long-running migration to
fully use malli for validation and decoding.
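As an example of the malli style this migration lands on, the
`import-binfile` params described earlier can be expressed with a
schema-level `:and` constraint (a sketch; fields other than `:file`
and `:upload-id` are assumptions):

```clojure
;; Sketch: a malli schema requiring either :file or :upload-id.
;; ::media/upload stands in for whatever schema validates the
;; multipart upload value.
(def schema:import-binfile
  [:and
   [:map
    [:project-id :uuid]
    [:file {:optional true} ::media/upload]
    [:upload-id {:optional true} :uuid]]
   [:fn (fn [{:keys [file upload-id]}]
          (or (some? file) (some? upload-id)))]])
```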
The main issue was the long-running GC operation that affects
storage objects with deduplication. The long-running transaction
ended up locking some storage object rows, which collaterally caused
operations like import-binfile to block indefinitely on the same rows
(because of deduplication).
The solution used in this commit is to split the operation into
small chunks, so we no longer use long-running transactions that hold
too many locks. This approach opens a window for the distinct
operations that require locks on the same rows to proceed
concurrently.
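The shape of the fix, sketched below; `db/tx-run!` follows penpot's
db layer, but the loop and the SQL name are illustrative, not the
committed code:

```clojure
;; Illustrative sketch: each chunk runs in its own short transaction,
;; so row locks are released between chunks and concurrent operations
;; (e.g. import-binfile) can interleave. sql:delete-chunk is
;; hypothetical: delete up to chunk-size expired rows, returning them.
(defn- gc-in-chunks!
  [pool chunk-size]
  (loop [total 0]
    (let [n (db/tx-run! pool
                        (fn [{:keys [::db/conn]}]
                          (count (db/exec! conn [sql:delete-chunk chunk-size]))))]
      (if (pos? n)
        (recur (+ total n))
        total))))
```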