Introduce a purpose-agnostic three-step session-based upload API that
allows uploading large binary blobs (media files and .penpot imports)
without hitting multipart size limits.
Backend:
- Migration 0147: new `upload_session` table (profile_id, total_chunks,
created_at) with indexes on profile_id and created_at.
- Three new RPC commands in media.clj:
* `create-upload-session` – allocates a session row; enforces
`upload-sessions-per-profile` and `upload-chunks-per-session`
quota limits (configurable in config.clj, defaults 5 / 20).
* `upload-chunk` – stores each slice as a storage object;
validates chunk index bounds and profile ownership.
* `assemble-file-media-object` – reassembles chunks via the shared
`assemble-chunks!` helper and creates the final media object.
- `assemble-chunks!` is a public helper in media.clj shared by both
`assemble-file-media-object` and `import-binfile`.
- `import-binfile` (binfile.clj): accepts an optional `upload-id` param;
when provided, materialises the temp file from chunks instead of
expecting an inline multipart body, removing the 200 MiB body limit
on .penpot imports. Schema updated with an `:and` validator requiring
either `:file` or `:upload-id`.
- quotes.clj: new `upload-sessions-per-profile` quota check.
- Background GC task (`tasks/upload_session_gc.clj`): deletes stalled
(never-completed) sessions older than 1 hour; scheduled daily at
midnight via the cron system in main.clj.
- backend/AGENTS.md: document the background-task wiring pattern.
Frontend:
- New `app.main.data.uploads` namespace: generic `upload-blob-chunked`
helper drives steps 1–2 (create session + upload all chunks with a
concurrency cap of 2) and emits `{:session-id uuid}` for callers.
- `config.cljs`: expose `upload-chunk-size` (default 25 MiB, overridable
via `penpotUploadChunkSize` global).
- `workspace/media.cljs`: blobs ≥ chunk-size go through the chunked path
(`upload-blob-chunked` → `assemble-file-media-object`); smaller blobs
use the existing direct `upload-file-media-object` path.
`handle-media-error` simplified; `on-error` callback removed.
- `worker/import.cljs`: new `import-blob-via-upload` helper replaces the
inline multipart approach for both binfile-v1 and binfile-v3 imports.
- `repo.cljs`: `:upload-chunk` derived as a `::multipart-upload`;
`form-data?` removed from `import-binfile` (JSON params only).
Tests:
- Backend (rpc_media_test.clj): happy path, idempotency, permission
isolation, invalid media type, missing chunks, session-not-found,
chunk-index out-of-range, and quota-limit scenarios.
- Frontend (uploads_test.cljs): session creation and chunk-count
correctness for `upload-blob-chunked`.
- Frontend (workspace_media_test.cljs): direct-upload path for small
blobs, chunked path for large blobs, and chunk-count correctness for
`process-blobs`.
- `helpers/http.cljs`: shared fetch-mock helpers (`install-fetch-mock!`,
`make-json-response`, `make-transit-response`, `url->cmd`).
Signed-off-by: Andrey Antukh <niwi@niwi.nz>
# Penpot Backend – Agent Instructions

Clojure backend (RPC) service running on the JVM.
Uses Integrant for dependency injection, PostgreSQL for storage, and Redis for messaging/caching.

## General Guidelines

To ensure consistency across the Penpot JVM stack, all contributions must adhere to these criteria:
### 1. Testing & Validation

- **Coverage:** if code is added or modified in `src/`, corresponding tests in `test/backend_tests/` must be added or updated.
- **Execution:**
  - Isolated: run `clojure -M:dev:test --focus backend-tests.my-ns-test` for the specific test namespace.
  - Regression: run `clojure -M:dev:test` to ensure the suite passes without regressions in related functional areas.
### 2. Code Quality & Formatting

- **Linting:** all code must pass `clj-kondo` checks (run `pnpm run lint:clj`).
- **Formatting:** all code must pass the formatting check (run `pnpm run check-fmt`). Use `pnpm run fmt` to fix formatting issues. Avoid "dirty" diffs caused by unrelated whitespace changes.
- **Type Hinting:** use explicit JVM type hints (e.g., `^String`, `^long`) in performance-critical paths to avoid reflection overhead.
## Code Conventions

### Namespace Overview

The source is located under the `src` directory. This is a general overview of the namespace structure:

- `app.rpc.commands.*` – RPC command implementations (`auth`, `files`, `teams`, etc.)
- `app.http.*` – HTTP routes and middleware
- `app.db.*` – database layer
- `app.tasks.*` – background job tasks
- `app.main` – Integrant system setup and entrypoint
- `app.loggers` – internal loggers (auditlog, mattermost, etc.; not to be confused with `app.common.logging`)
## RPC

The RPC methods are implemented using a multimethod-like structure via the
`app.util.services` namespace. The main RPC methods are collected under the
`app.rpc.commands` namespace and exposed under `/api/rpc/command/<cmd-name>`.

An RPC method accepts POST and GET requests indistinctly and uses the `Accept`
header to negotiate the response encoding (which can be Transit — the default —
or plain JSON). It also accepts Transit (default) or JSON as input, which should
be indicated using the `Content-Type` header.

The main convention is: use the `get-` prefix on an RPC name when it is a READ
operation.
Example of RPC method definition:

```clojure
(sv/defmethod ::my-command
  {::rpc/auth true         ;; requires auth
   ::doc/added "1.18"
   ::sm/params [:map ...]  ;; malli input schema
   ::sm/result [:map ...]} ;; malli output schema
  [{:keys [::db/pool] :as cfg} {:keys [::rpc/profile-id] :as params}]
  ;; return a plain map or throw
  {:id (uuid/next)})
```
Look under `src/app/rpc/commands/*.clj` to see more examples.
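As a sketch of the content negotiation described above, an RPC command can be invoked over HTTP like this (the command name, host, port, and auth cookie are placeholders, not taken from any real deployment):

```
# Hypothetical invocation; substitute a real command, host, and session token.
curl "http://localhost:6060/api/rpc/command/get-profile" \
     -X POST \
     -H "Content-Type: application/json" \
     -H "Accept: application/json" \
     -H "Cookie: auth-token=<session-token>" \
     -d '{}'
```

Because the `Accept` header requests JSON, the response is plain JSON instead of the default Transit encoding.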
## Tests

Test namespaces match `.*-test$` under `test/`. Config is in `tests.edn`.
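A minimal test namespace following that convention might look like this (the namespace and test names are illustrative; real backend tests additionally use the project's database fixtures):

```clojure
(ns backend-tests.my-ns-test
  (:require [clojure.test :as t]))

(t/deftest my-command-test
  ;; run in isolation with:
  ;;   clojure -M:dev:test --focus backend-tests.my-ns-test
  (t/is (= 4 (+ 2 2))))
```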
## Integrant System

`src/app/main.clj` declares the system map. Each key is a component; values
are config maps with `::ig/ref` for dependencies. Components implement
`ig/init-key` / `ig/halt-key!`.
## Connecting to the Database

Two PostgreSQL databases are used in this environment:

| Database | Purpose | Connection string |
|---|---|---|
| `penpot` | Development / app | `postgresql://penpot:penpot@postgres/penpot` |
| `penpot_test` | Test suite | `postgresql://penpot:penpot@postgres/penpot_test` |

Interactive psql session:

```shell
# development DB
psql "postgresql://penpot:penpot@postgres/penpot"

# test DB
psql "postgresql://penpot:penpot@postgres/penpot_test"
```

One-shot query (non-interactive):

```shell
psql "postgresql://penpot:penpot@postgres/penpot" -c "SELECT id, name FROM team LIMIT 5;"
```
Useful psql meta-commands:

```
\dt         -- list all tables
\d <table>  -- describe a table (columns, types, constraints)
\di         -- list indexes
\q          -- quit
```
**Migrations table:** applied migrations are tracked in the `migrations` table with columns `module`, `step`, and `created_at`. When renaming a migration's logical name, update this table in both databases to match the new name; otherwise the runner will attempt to re-apply the migration on the next startup.

```shell
# Example: fix a renamed migration entry in the test DB
psql "postgresql://penpot:penpot@postgres/penpot_test" \
  -c "UPDATE migrations SET step = 'new-name' WHERE step = 'old-name';"
```
## Database Access (Clojure)

`app.db` wraps `next.jdbc`. Queries use a SQL builder that auto-converts kebab-case ↔ snake_case.

```clojure
;; Query helpers
(db/get cfg-or-pool :table {:id id})              ;; fetch one row (throws if missing)
(db/get* cfg-or-pool :table {:id id})             ;; fetch one row (returns nil)
(db/query cfg-or-pool :table {:team-id team-id})  ;; fetch multiple rows

(db/insert! cfg-or-pool :table {:name "x" :team-id id})  ;; insert
(db/update! cfg-or-pool :table {:name "y"} {:id id})     ;; update
(db/delete! cfg-or-pool :table {:id id})                 ;; delete

;; Run multiple statements/queries on a single connection
(db/run! cfg (fn [{:keys [::db/conn]}]
               (db/insert! conn :table row1)
               (db/insert! conn :table row2)))

;; Transactions
(db/tx-run! cfg (fn [{:keys [::db/conn]}]
                  (db/insert! conn :table row)))
```

Almost all functions in the `app.db` namespace accept a pool, a connection, or a
`cfg` map as the first parameter.
Migrations live in `src/app/migrations/` as numbered SQL files. They run automatically on startup.
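For orientation, a migration file has this general shape (the file name and exact DDL below are illustrative, not the contents of any actual migration):

```sql
-- e.g. src/app/migrations/sql/NNNN-add-my-table.sql (hypothetical name)
CREATE TABLE my_table (
  id uuid PRIMARY KEY,
  profile_id uuid NOT NULL,
  created_at timestamptz NOT NULL DEFAULT now()
);

CREATE INDEX my_table__profile_id__idx ON my_table (profile_id);
```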
## Error Handling

The exception helpers are defined in the Common module and are available under the
`app.common.exceptions` namespace.

Example of raising an exception:

```clojure
(ex/raise :type :not-found
          :code :object-not-found
          :hint "File does not exist"
          :file-id id)
```

Common types: `:not-found`, `:validation`, `:authorization`, `:conflict`, `:internal`.
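On the consuming side, those keyword properties travel in the exception's data map. This sketch assumes `ex/raise` builds on `ex-info`, as is conventional:

```clojure
(try
  (ex/raise :type :not-found
            :code :object-not-found
            :hint "File does not exist")
  (catch Exception cause
    ;; the properties passed to ex/raise are recoverable via ex-data
    (:code (ex-data cause))))
```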
## Performance Macros (`app.common.data.macros`)

Always prefer these macros over their `clojure.core` equivalents — they provide
optimized implementations:

```clojure
(dm/select-keys m [:a :b])  ;; faster than core/select-keys
(dm/get-in obj [:a :b :c])  ;; faster than core/get-in
(dm/str "a" "b" "c")        ;; string concatenation
```
## Configuration

`src/app/config.clj` reads `PENPOT_*` environment variables, validated with
Malli. Access them anywhere via `(cf/get :smtp-host)`. Feature flags: `(cf/flags :enable-smtp)`.
## Background Tasks

Background tasks live in `src/app/tasks/`. Each task is an Integrant component
that exposes a `::handler` key and follows this three-method pattern:

```clojure
(defmethod ig/assert-key ::handler ;; validate config at startup
  [_ params]
  (assert (db/pool? (::db/pool params)) "expected a valid database pool"))

(defmethod ig/expand-key ::handler ;; inject defaults before init
  [k v]
  {k (assoc v ::my-option default-value)})

(defmethod ig/init-key ::handler ;; return the task fn
  [_ cfg]
  (fn [_task] ;; receives the task row from the worker
    (db/tx-run! cfg (fn [{:keys [::db/conn]}]
                      ;; … do work …
                      ))))
```
Wiring a new task requires two changes in `src/app/main.clj`:

1. **Handler config** – add an entry in `system-config` with the dependencies:

   ```clojure
   :app.tasks.my-task/handler
   {::db/pool (ig/ref ::db/pool)}
   ```

2. **Registry + cron** – register the handler name and schedule it:

   ```clojure
   ;; in ::wrk/registry ::wrk/tasks map:
   :my-task (ig/ref :app.tasks.my-task/handler)

   ;; in worker-config ::wrk/cron ::wrk/entries vector:
   {:cron #penpot/cron "0 0 0 * * ?" ;; daily at midnight
    :task :my-task}
   ```
Useful cron patterns (Quartz format — six fields: s m h dom mon dow):

| Expression | Meaning |
|---|---|
| `"0 0 0 * * ?"` | Daily at midnight |
| `"0 0 */6 * * ?"` | Every 6 hours |
| `"0 */5 * * * ?"` | Every 5 minutes |
Time helpers (`app.common.time`):

```clojure
(ct/now)                          ;; current instant
(ct/duration {:hours 1})          ;; java.time.Duration
(ct/minus (ct/now) some-duration) ;; subtract duration from instant
```

`db/interval` converts a `Duration` (or millis / string) to a PostgreSQL
interval object suitable for use in SQL queries:

```clojure
(db/interval (ct/duration {:hours 1})) ;; → PGInterval "3600.0 seconds"
```
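Putting these helpers together, a cleanup-style task handler could be sketched like this. This is illustrative only, not the actual `upload_session_gc` implementation; the table name, threshold, and use of a raw SQL vector follow the conventions above but are assumptions:

```clojure
(defmethod ig/init-key ::handler
  [_ cfg]
  (fn [_task]
    (db/tx-run! cfg
      (fn [{:keys [::db/conn]}]
        ;; delete rows older than the threshold (1 hour here is illustrative)
        (db/exec-one! conn
          ["DELETE FROM upload_session WHERE created_at < now() - ?"
           (db/interval (ct/duration {:hours 1}))])))))
```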