* ✨ Add dedicated concurrency limit for restore-file-snapshot
This adds a dedicated climit configuration for the restore-file-snapshot
RPC method with :permits 1 per profile (plus queue of 2 and 60s timeout)
and a global limit of 3. Previously the method only used the generic
root/by-profile and root/global limits, allowing up to 7 concurrent
restore operations per profile which caused database row lock contention
on FOR UPDATE and connection pool exhaustion.
* ✨ Skip locking on restore! to avoid blocking other operations
Changes the row lock acquisition in restore! from a blocking FOR UPDATE
to FOR UPDATE SKIP LOCKED. If the file row is already locked by another
concurrent operation (e.g., another restore or an update-file), the query
returns no rows and the caller fails fast with a clear conflict error
instead of blocking indefinitely holding a database connection.
* ✨ Add queue and timeout limits to root/by-profile concurrency limit
Previously root/by-profile had no queue limit (unbounded Integer/MAX_VALUE)
and no timeout, allowing requests to pile up indefinitely behind a profile
whose permits were exhausted by long-running operations. This could lead
to memory pressure and cascading failures. Now limited to 30 queued
requests with a 30-second timeout so excess requests fail fast.
* ✨ Move backup snapshot creation outside restore transaction
The backup snapshot (fsnap/create!) is now created in its own short-lived
connection before the actual restore transaction begins. This ensures the
backup is persisted independently of the restore outcome and reduces the
restore transaction window.
The restore itself runs inside a db/tx-run! block with an optimistic
locking check: it reads the file with FOR UPDATE and compares its revn
against the value captured at backup time. If the file was edited
concurrently, the restore aborts with a conflict error to prevent data
loss.
Co-dependent with the SKIP LOCKED change in restore! — the FOR UPDATE
acquired here is in the same transaction as restore!, so the SKIP LOCKED
inside restore! correctly sees the row as unlocked (same transaction).
* ♻️ Remove unused private function get-minimal-file
The local get-minimal-file function in file_snapshots.clj is no longer
used since restore! switched to direct exec-one! with FOR UPDATE SKIP
LOCKED. The sql:get-minimal-file SQL constant is still used directly.
* ✨ Add minor improvements on db connection management
* ♻️ Refactor create-file-snapshot to use explicit transaction management
Remove automatic transaction wrapping (`::db/transaction true`) and
pass `cfg` through the call chain instead of destructured `conn`.
Wrap `fsnap/create!` in an explicit `db/tx-run!` for clearer
transaction boundaries.
Signed-off-by: Andrey Antukh <niwi@niwi.nz>
* ✨ Add dedicated concurrency limit for create-file-snapshot
This adds a dedicated climit configuration for the create-file-snapshot
RPC method with :permits 1 per profile (plus queue of 2 and 60s timeout)
and a global limit of 3. Previously the method only used the generic
root/by-profile and root/global limits, allowing up to 10 concurrent
snapshot creation operations per profile which could cause database
contention and connection pool exhaustion.
Signed-off-by: Andrey Antukh <niwi@niwi.nz>
---------
Signed-off-by: Andrey Antukh <niwi@niwi.nz>
This upgrade also includes complete elimination of use spec
from the backend codebase, completing the long running migration
to fully use malli for validation and decoding.
The main objective is prevent deletion of objects that can leave
unreachable orphan objects which we are unable to correctly track.
Additionally, this commit includes:
1. Properly implement safe cascade deletion of all participating
tables on soft deletion in the objects-gc task;
2. Make the file thumbnail related tables also participate in the
touch/refcount mechanism applyign to the same safety checks;
3. Add helper for db query lazy iteration using PostgreSQL support
for server side cursors;
4. Fix efficiency issues on gc related task using server side
cursors instead of custom chunked iteration for processing data.
The problem resided when a large chunk of rows that has identical
value on the deleted_at column and the chunk size is small (the
default); when the custom chunked iteration only reads a first N
items and skip the rest of the set to the next run.
This has caused many objects to remain pending to be eliminated,
taking up space for longer than expected. The server side cursor
based iteration does not has this problem and iterates correctly
over all objects.
5. Fix refcount issues on font variant deletion RPC methods
Mainly move all pointer-map related helpers from app.rpc.commands.files
to the the app.features.fdata namespace and normalizes codestile around
feature handling on all affected code.
This commit also comes with several features related bugifxes on the
components-v2 migration code:
- properly migrate legacy feature names on apply components-v2 migration
- start using new fdata feature related functions
- prevent generation of a ephimeral pointer on each graphic migration
operation; on large files this caused a very noticiable overhead of
creating a big number of completly unused pointer maps
- do persistence after validation and not before
- makes the profile access more efficient (replace in-app joins to a
simple select query on profile table
- add partial support for access-tokens (still missing some RPC methods)
- move router definitions to specific modules and simplify the main http
module definitions to simple includes
- simplifiy authentication code related to access-tokens and sessions
- normalize db parameters with proper namespaced props
- more work on convert all modules initialization to use proper specs
with fully-qualified keyword config props