When the :telemetry flag is ON and :audit-log is OFF, frontend and
backend events are stored anonymously in the audit_log table and
shipped in compressed batches by the existing telemetry task.
Stored rows strip props and ip-addr but preserve the profile-id, since
Penpot profile UUIDs are already anonymous random identifiers with no
PII attached. Timestamps are truncated to day precision to avoid leaking
exact event timing. Only a safe subset of context fields is preserved:
- Backend events: initiator, version, client-version, client-user-agent
- Frontend events: browser, os, locale, screen metrics and event-origin
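The anonymisation rules above can be illustrated with a minimal sketch. It is written in Python for brevity and is not the actual Clojure implementation: the event shape, the `anonymize_event` helper and the `BACKEND_CONTEXT_KEYS` name are assumptions made for illustration only.

```python
from datetime import datetime, timezone

# Hypothetical allow-list mirroring the backend context fields named above.
BACKEND_CONTEXT_KEYS = {"initiator", "version", "client-version", "client-user-agent"}


def truncate_to_day(ts):
    """Drop the time-of-day part so only the calendar day survives."""
    return ts.replace(hour=0, minute=0, second=0, microsecond=0)


def anonymize_event(event, allowed_keys):
    """Build the row to store: props and ip-addr are simply not copied."""
    context = event.get("context", {})
    return {
        "name": event["name"],
        "profile-id": event["profile-id"],
        "source": "telemetry",
        "created-at": truncate_to_day(event["created-at"]),
        "context": {k: v for k, v in context.items() if k in allowed_keys},
    }
```

Building the row from an explicit allow-list, rather than deleting known-sensitive keys, means any field added to events later is dropped by default.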
Backend (app.loggers.audit):
- Store backend telemetry events with source='telemetry', the safe
context subset described above, and timestamps truncated to day
precision via ct/truncate.
Frontend RPC (app.rpc.commands.audit):
- Add filter-safe-context to retain only the allowed frontend context
fields.
- Add xf:map-telemetry-event-row transducer that anonymises frontend
events before inserting them.
- push-audit-events now accepts events when telemetry is active.
Telemetry task (app.tasks.telemetry):
- gc-telemetry-events: enforces a 100,000-row safety cap by dropping
the oldest rows first.
- collect-and-send-audit-events: loop that fetches up to 10,000 rows
per iteration, encodes and sends each page, deletes it on success,
and stops immediately on failure leaving remaining rows for retry.
- send-event-batch: POSTs a fressian+zstd batch (base64-encoded via
blob/encode-str) to the telemetry endpoint, including instance-id
and profile-id per event.
- delete-sent-events: deletes successfully shipped rows by id.
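The paging behaviour of collect-and-send-audit-events can be sketched roughly as follows. This is a Python illustration, not the real task code: `fetch_page`, `send_batch` and `delete_rows` are hypothetical callbacks standing in for the database query, the HTTP POST and the delete-by-id step.

```python
def collect_and_send(fetch_page, send_batch, delete_rows, page_size=10_000):
    """Ship stored rows page by page and return how many rows were shipped.

    fetch_page(n)   -> list of up to n oldest rows (dicts with an "id" key)
    send_batch(rows)-> True on success, False on failure
    delete_rows(ids)-> removes the given rows from storage
    """
    shipped = 0
    while True:
        rows = fetch_page(page_size)
        if not rows:
            return shipped  # nothing left to send
        if not send_batch(rows):
            return shipped  # stop at once; this page and the rest stay for retry
        delete_rows([row["id"] for row in rows])
        shipped += len(rows)
```

Deleting a page only after it has been sent successfully means a failure can never lose rows. In this sketch the next run simply starts again from the oldest remaining row.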
Blob utilities (app.util.blob):
- Add blob/encode-str and blob/decode-str: convenience wrappers that
combine blob encoding with base64 for JSON-safe string transport.
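The idea behind these wrappers is: serialize, compress, then base64-encode so the result can travel inside a JSON string. The Python sketch below only illustrates that idea. It substitutes JSON for fressian and zlib for zstd because those are in the standard library, so it is not wire-compatible with the real blob format.

```python
import base64
import json
import zlib


def encode_str(value):
    """Serialize, compress and base64-encode a value into an ASCII string."""
    raw = json.dumps(value).encode("utf-8")  # stand-in for fressian
    packed = zlib.compress(raw)              # stand-in for zstd
    return base64.b64encode(packed).decode("ascii")


def decode_str(text):
    """Reverse encode_str: base64-decode, decompress and deserialize."""
    packed = base64.b64decode(text)
    return json.loads(zlib.decompress(packed).decode("utf-8"))
```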
Database:
- Add index on audit_log (source, created_at ASC) to support efficient
queries for telemetry batch collection.
Tests (backend-tests.tasks-telemetry-test):
- 21 tests, 94 assertions covering all code paths: disabled/enabled
telemetry, no-events no-op, happy-path batch send and delete, failure
retention, payload anonymity, context stripping, timestamp day
precision, batch encoding round-trip, multi-page iteration, GC cap
enforcement.
Signed-off-by: Andrey Antukh <niwi@niwi.nz>
The current binfile export process uses a streaming technique. The
major problem with the streaming approach arises when an error
happens in the middle of generation: we have no way to notify the
user about it, because the response has already been sent and the
contents are streaming directly to the user's client/browser.
This commit replaces streaming with temporary files and an
SSE-encoded response that emits export progress events; once the
export is finished, a temporary URI to the exported artifact is
emitted to the user via the "end" event and the frontend code
automatically triggers the download.
The SSE approach also avoids possible transport timeouts when
exporting large files, because progress data keeps flowing over the
open connection.
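A rough sketch of the event stream the client would see, in Python for illustration only. The "end" event name comes from the description above; the "progress" event name, the payload fields and the `export_stream` helper are assumptions.

```python
import json


def sse_event(name, data):
    """Encode one Server-Sent Event frame (event name plus JSON data line)."""
    return f"event: {name}\ndata: {json.dumps(data)}\n\n"


def export_stream(files, artifact_uri):
    """Yield a progress frame per exported file, then the final 'end' frame."""
    total = len(files)
    for done, file_id in enumerate(files, start=1):
        # In a real export, the file would be written to a temporary
        # artifact here, before progress is reported.
        yield sse_event("progress", {"file": file_id, "done": done, "total": total})
    # Only after everything succeeded is the download URI handed out.
    yield sse_event("end", {"uri": artifact_uri})
```

Because the download URI is sent only in the last frame, an error raised midway can be reported as its own event instead of silently truncating a half-streamed file.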
This commit also removes obsolete code related to old binfile
formats.
Replace general usage of virtual threads with platform threads
and use virtual threads for lightweight processes such as websocket
connections. This decision is made mainly because virtual threads
do not show up in thread dumps in an easy way, so debugging issues
becomes very difficult.
The thread requirements of penpot for serving HTTP requests are
not very big, so this decision does not really affect resource
usage.
This upgrade also includes the complete elimination of spec usage
from the backend codebase, completing the long-running migration
to fully use malli for validation and decoding.
This commit also comes with:
- a fix for incorrect conflict handling on team access request creation
- a fix for incorrect handling of file-data when it is offloaded
- replace some inefficient queries with efficient ones
- remove redundant validation on creation of request-access
This allows almost all API operations to succeed using
application/json encoding, with the exception of update-file, which
we need to approach a bit differently.
The reason update-file is different is that the operations vector
is currently defined without the context of the shape type, so we
are unable to properly parse the value to the correct type using
the schema decoding mechanism.
Also optimizes some functions for faster shape and rect props
access (there is still a lot of work ahead optimizing the rest of
the functions)
Also normalizes shape creation and validation to ensure that all
mandatory properties are set up correctly.
Mainly the following changes:
- Move the majority of the code back to the old, plain synchronous
style and start using virtual threads for the RPC (and partially
some HTTP server middlewares).
- Make some improvements to how CLIMIT is handled, simplifying the code.
- Considerably improve performance by reducing reflection and
unnecessary function calls across the whole stack trace of an RPC call.
- Improve efficiency by considerably reducing the total number of threads.
- make profile access more efficient (replace in-app joins with a
simple select query on the profile table)
- add partial support for access-tokens (still missing some RPC methods)
- move router definitions to specific modules and simplify the main http
module definitions to simple includes
- simplify authentication code related to access-tokens and sessions
- normalize db parameters with proper namespaced props
- more work on converting all module initialization to use proper specs
with fully-qualified keyword config props