Introduce a purpose-agnostic three-step session-based upload API that
allows uploading large binary blobs (media files and .penpot imports)
without hitting multipart size limits.
Backend:
- Migration 0147: new `upload_session` table (profile_id, total_chunks,
created_at) with indexes on profile_id and created_at.
- Three new RPC commands in media.clj:
* `create-upload-session` – allocates a session row; enforces
`upload-sessions-per-profile` and `upload-chunks-per-session`
quota limits (configurable in config.clj, defaults 5 / 20).
* `upload-chunk` – stores each slice as a storage object;
validates chunk index bounds and profile ownership.
* `assemble-file-media-object` – reassembles chunks via the shared
`assemble-chunks!` helper and creates the final media object.
- `assemble-chunks!` is a public helper in media.clj shared by both
`assemble-file-media-object` and `import-binfile`.
- `import-binfile` (binfile.clj): accepts an optional `upload-id` param;
when provided, materialises the temp file from chunks instead of
expecting an inline multipart body, removing the 200 MiB body limit
on .penpot imports. Schema updated with an `:and` validator requiring
either `:file` or `:upload-id`.
- quotes.clj: new `upload-sessions-per-profile` quota check.
- Background GC task (`tasks/upload_session_gc.clj`): deletes stalled
(never-completed) sessions older than 1 hour; scheduled daily at
midnight via the cron system in main.clj.
- backend/AGENTS.md: document the background-task wiring pattern.
Frontend:
- New `app.main.data.uploads` namespace: generic `upload-blob-chunked`
helper drives steps 1–2 (create session + upload all chunks with a
concurrency cap of 2) and emits `{:session-id uuid}` for callers.
- `config.cljs`: expose `upload-chunk-size` (default 25 MiB, overridable
via `penpotUploadChunkSize` global).
- `workspace/media.cljs`: blobs ≥ chunk-size go through the chunked path
(`upload-blob-chunked` → `assemble-file-media-object`); smaller blobs
use the existing direct `upload-file-media-object` path.
`handle-media-error` simplified; `on-error` callback removed.
- `worker/import.cljs`: new `import-blob-via-upload` helper replaces the
inline multipart approach for both binfile-v1 and binfile-v3 imports.
- `repo.cljs`: `:upload-chunk` derived as a `::multipart-upload`;
`form-data?` removed from `import-binfile` (JSON params only).
Tests:
- Backend (rpc_media_test.clj): happy path, idempotency, permission
isolation, invalid media type, missing chunks, session-not-found,
chunk-index out-of-range, and quota-limit scenarios.
- Frontend (uploads_test.cljs): session creation and chunk-count
correctness for `upload-blob-chunked`.
- Frontend (workspace_media_test.cljs): direct-upload path for small
blobs, chunked path for large blobs, and chunk-count correctness for
`process-blobs`.
- `helpers/http.cljs`: shared fetch-mock helpers (`install-fetch-mock!`,
`make-json-response`, `make-transit-response`, `url->cmd`).
Signed-off-by: Andrey Antukh <niwi@niwi.nz>
Replace general usage of virtual threads with platform threads
and use virtual threads for lightweight procs such that websocket
connections. This decision is made mainly because virtual threads
does not appear on thread dumps in an easy way so debugging issues
becomes very difficult.
The threads requirement of penpot for serving http requests
is not very big so having so this decision does not really affects
the resource usage.
this allows almost all api operations to success usin application/json
encoding with the exception of the update-file, which we need to
approach a bit differently;
the reason update-file is different, is because the operations vector
is right now defined without the context of shape type, so we are just
unable to properly parse the value to correct type using the schema
decoding mechanism
The climit previously of this commit is heavily used inside a
transactions, so in heavy contention operation such that file thumbnail
creation can cause a db pool exhaust.
This commit fixes this issue setting up a better resource limiting
mechanism that works outside the transactions so, contention will
no longer hold an open connection/transaction.
It also adds general improvement to the traceability to the climit
mechanism: it now properly logs the profile-id that is currently
cause some contention on specific resources.
It also add a general/root climit that is applied to all requests
so if someone start making abussive requests, we can clearly detect
it.
The main objective is prevent deletion of objects that can leave
unreachable orphan objects which we are unable to correctly track.
Additionally, this commit includes:
1. Properly implement safe cascade deletion of all participating
tables on soft deletion in the objects-gc task;
2. Make the file thumbnail related tables also participate in the
touch/refcount mechanism applyign to the same safety checks;
3. Add helper for db query lazy iteration using PostgreSQL support
for server side cursors;
4. Fix efficiency issues on gc related task using server side
cursors instead of custom chunked iteration for processing data.
The problem resided when a large chunk of rows that has identical
value on the deleted_at column and the chunk size is small (the
default); when the custom chunked iteration only reads a first N
items and skip the rest of the set to the next run.
This has caused many objects to remain pending to be eliminated,
taking up space for longer than expected. The server side cursor
based iteration does not has this problem and iterates correctly
over all objects.
5. Fix refcount issues on font variant deletion RPC methods
Mainly the followin changes:
- Pass majority of code to the old and plain synchronous style
and start using virtual threads for the RPC (and partially some
HTTP server middlewares).
- Make some improvements on how CLIMIT is handled, simplifying code
- Improve considerably performance reducing the reflection and
unnecesary funcion calls on the whole stack-trace of an RPC call.
- Improve efficiency reducing considerably the total threads number.
- makes the profile access more efficient (replace in-app joins to a
simple select query on profile table
- add partial support for access-tokens (still missing some RPC methods)
- move router definitions to specific modules and simplify the main http
module definitions to simple includes
- simplifiy authentication code related to access-tokens and sessions
- normalize db parameters with proper namespaced props
- more work on convert all modules initialization to use proper specs
with fully-qualified keyword config props