104 Commits

Author SHA1 Message Date
Andrey Antukh
7d4be33d4f 🎉 Add telemetry anonymous event collection (#9483)
* 🎉 Add telemetry anonymous event collection

Rewrite the audit logging subsystem to support three operating modes and
add anonymous telemetry event collection:

Modes:
- A (audit-log only): events persisted with full context
- B (audit-log + telemetry): same as A, plus events are collected for
  telemetry shipping
- C (telemetry-only): events stored anonymously with PII stripped,
  telemetry flag active, audit-log flag inactive

Audit system refactoring (app.loggers.audit):
- Replace qualified map keys (::audit/name etc.) with plain keywords
- Rename submit! -> submit, insert! -> insert, prepare-event ->
  prepare-rpc-event
- Add submit* as a lower-level public API
- Add process-event dispatch function that handles all three modes and
  webhooks in a single tx-run!
- Add :id to event schema (auto-generated if omitted)
- Add filter-telemetry-props: anonymises event props per event type.
  Keeps UUID/boolean/number values; for login/identify events preserves
  lang, auth-backend, email-domain; for navigate events preserves route,
  file-id, team-id, page-id; instance-start trigger passes through.
- Add filter-telemetry-context: retains only safe context keys.
  Backend: version, initiator, client-version, client-user-agent.
  Frontend: browser, os, locale, screen metrics, event-origin.
- Timestamps truncated to day precision via ct/truncate for telemetry
  storage
- PII stripped: props emptied, ip-addr zeroed, session-linking and
  access-token fields removed from context
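
The prop-filtering and timestamp rules above can be sketched roughly as follows. This is an illustrative Python model, not the Clojure code in app.loggers.audit: the function names, the EXTRA_KEYS table and the assumption that props arrive as a flat map are hypothetical. Only the keep rules (UUID/boolean/number values, plus the listed keys for login/identify and navigate events) and the day-precision truncation are taken from this message.

```python
import uuid
from datetime import datetime, timezone

# Hypothetical allowlist of extra keys preserved per event name,
# modelled on the lists in the commit message.
EXTRA_KEYS = {
    "login": {"lang", "auth-backend", "email-domain"},
    "identify": {"lang", "auth-backend", "email-domain"},
    "navigate": {"route", "file-id", "team-id", "page-id"},
}


def filter_telemetry_props(event_name, props):
    """Keep UUID/boolean/number values and per-event allowlisted keys."""
    allowed = EXTRA_KEYS.get(event_name, set())
    return {
        key: value
        for key, value in props.items()
        if key in allowed or isinstance(value, (uuid.UUID, bool, int, float))
    }


def truncate_to_day(ts):
    """Drop the time-of-day part so only day precision is stored."""
    return ts.replace(hour=0, minute=0, second=0, microsecond=0)


if __name__ == "__main__":
    props = {"email": "user@example.com", "lang": "en", "retries": 3}
    print(filter_telemetry_props("login", props))
    print(truncate_to_day(datetime(2026, 5, 11, 12, 42, 1, tzinfo=timezone.utc)))
```

Free-form strings such as the email address are dropped unless their key is explicitly allowlisted for that event type.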

Config (app.config):
- Derive :enable-telemetry flag from telemetry-enabled config option

Email utilities (app.email):
- Add email/clean and email/get-domain helper functions for domain
  extraction from email addresses

Setup (app.setup):
- Emit instance-start trigger event at system startup
- Simplify handle-instance-id (remove read-only check)

RPC layer (app.rpc):
- wrap-audit now activates when :telemetry flag is set
- Add :request-id to RPC params context for event correlation

RPC commands (management, teams_invitations, verify_token, OIDC auth,
webhooks): migrate all audit call sites to use the new plain-key API

SREPL (app.srepl.main):
- Migrate all audit/insert! calls to audit/insert with plain keys

Telemetry task (app.tasks.telemetry):
- Restructure legacy report into make-legacy-request; distinguish
  payload type as :telemetry-legacy-report
- Add collect-and-send-audit-events: loops fetching up to 10,000 rows
  per iteration, encodes and sends each page, deletes it on success,
  and stops immediately on failure so the rows are retried
- Add send-event-batch: POSTs fressian+zstd batch (base64 via
  blob/encode-str) to the telemetry endpoint with instance-id per event
- Add gc-telemetry-events: enforces 100,000-row safety cap by dropping
  oldest rows first
- Add delete-sent-events: deletes successfully shipped rows by id
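
The page-by-page shipping loop described above can be modelled like this. It is a minimal Python sketch under stated assumptions, not the code in app.tasks.telemetry: the three callbacks (fetch_page, send_batch, delete_sent) and the "id" row key are hypothetical stand-ins for the database query, the HTTP POST and the delete-by-id step.

```python
PAGE_SIZE = 10_000  # rows fetched per iteration, as stated in the message


def collect_and_send(fetch_page, send_batch, delete_sent, page_size=PAGE_SIZE):
    """Ship stored events page by page; return how many were shipped.

    fetch_page(n)    -> list of up to n pending rows, oldest first
    send_batch(rows) -> True when the endpoint accepted the batch
    delete_sent(ids) -> removes the shipped rows
    """
    shipped = 0
    while True:
        rows = fetch_page(page_size)
        if not rows:
            return shipped  # nothing left to send
        if not send_batch(rows):
            # Stop immediately; unsent rows stay in place for the next run.
            return shipped
        delete_sent([row["id"] for row in rows])
        shipped += len(rows)


if __name__ == "__main__":
    store = [{"id": i} for i in range(25)]

    def fetch(n):
        return store[:n]

    def delete(ids):
        gone = set(ids)
        store[:] = [row for row in store if row["id"] not in gone]

    print(collect_and_send(fetch, lambda rows: True, delete, page_size=10))
```

Deleting only after a successful send means a failed page is retried whole, at the cost of possible duplicates if the endpoint accepted the batch but the response was lost.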

Blob utilities (app.util.blob):
- Add encode-str/decode-str: combine fressian+zstd encoding with URL-
  safe base64 for JSON-safe string transport
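
The serialise, compress, then base64 layering can be illustrated with the Python standard library. Note the substitutions: json stands in for fressian and zlib stands in for zstd, purely so the sketch runs without third-party packages. The function names mirror encode-str/decode-str but this is not the app.util.blob implementation, and whether the real helper keeps base64 padding is not stated in this message.

```python
import base64
import json
import zlib


def encode_str(data):
    """Serialise, compress, then URL-safe base64 encode into a str."""
    raw = json.dumps(data).encode("utf-8")     # stand-in for fressian
    packed = zlib.compress(raw)                # stand-in for zstd
    return base64.urlsafe_b64encode(packed).decode("ascii")


def decode_str(text):
    """Reverse encode_str: base64 decode, decompress, deserialise."""
    packed = base64.urlsafe_b64decode(text.encode("ascii"))
    return json.loads(zlib.decompress(packed).decode("utf-8"))


if __name__ == "__main__":
    token = encode_str({"events": [1, 2, 3]})
    print(token)
    print(decode_str(token))
```

The URL-safe alphabet avoids "+" and "/", so the result can be embedded in a JSON string field without escaping concerns.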

Database:
- Add migration 0145: index on audit_log (source, created_at ASC) for
  efficient telemetry batch collection queries

Frontend:
- Always initialize event system regardless of :audit-log flag
- Defer auth events (signin identify) to after profile is set
- Refactor event subsystem for telemetry support

Tests (21 test vars, 94 assertions in tasks-telemetry-test):
- Cover all code paths: disabled/enabled telemetry, no-events no-op,
  happy-path batch send and delete, failure retention, payload anonymity,
  context stripping, timestamp day precision, batch encoding round-trip,
  multi-page iteration, GC cap enforcement, partial failure handling
- blob encode-str/decode-str round-trip tests (14 test vars)
- RPC audit integration tests (5 test vars)

Signed-off-by: Andrey Antukh <niwi@niwi.nz>

* 📎 Add pr feedback changes

---------

Signed-off-by: Andrey Antukh <niwi@niwi.nz>
2026-05-11 12:42:01 +02:00
Andrey Antukh
e5f9c1e863 🎉 Add chunked upload API for large media and binary files
Introduce a purpose-agnostic three-step session-based upload API that
allows uploading large binary blobs (media files and .penpot imports)
without hitting multipart size limits.

Backend:
- Migration 0147: new `upload_session` table (profile_id, total_chunks,
  created_at) with indexes on profile_id and created_at.
- Three new RPC commands in media.clj:
    * `create-upload-session`  – allocates a session row; enforces
      `upload-sessions-per-profile` and `upload-chunks-per-session`
      quota limits (configurable in config.clj, defaults 5 / 20).
    * `upload-chunk`           – stores each slice as a storage object;
      validates chunk index bounds and profile ownership.
    * `assemble-file-media-object` – reassembles chunks via the shared
      `assemble-chunks!` helper and creates the final media object.
- `assemble-chunks!` is a public helper in media.clj shared by both
  `assemble-file-media-object` and `import-binfile`.
- `import-binfile` (binfile.clj): accepts an optional `upload-id` param;
  when provided, materialises the temp file from chunks instead of
  expecting an inline multipart body, removing the 200 MiB body limit
  on .penpot imports.  Schema updated with an `:and` validator requiring
  either `:file` or `:upload-id`.
- quotes.clj: new `upload-sessions-per-profile` quota check.
- Background GC task (`tasks/upload_session_gc.clj`): deletes stalled
  (never-completed) sessions older than 1 hour; scheduled daily at
  midnight via the cron system in main.clj.
- backend/AGENTS.md: document the background-task wiring pattern.

Frontend:
- New `app.main.data.uploads` namespace: generic `upload-blob-chunked`
  helper drives steps 1–2 (create session + upload all chunks with a
  concurrency cap of 2) and emits `{:session-id uuid}` for callers.
- `config.cljs`: expose `upload-chunk-size` (default 25 MiB, overridable
  via `penpotUploadChunkSize` global).
- `workspace/media.cljs`: blobs ≥ chunk-size go through the chunked path
  (`upload-blob-chunked` → `assemble-file-media-object`); smaller blobs
  use the existing direct `upload-file-media-object` path.
  `handle-media-error` simplified; `on-error` callback removed.
- `worker/import.cljs`: new `import-blob-via-upload` helper replaces the
  inline multipart approach for both binfile-v1 and binfile-v3 imports.
- `repo.cljs`: `:upload-chunk` derived as a `::multipart-upload`;
  `form-data?` removed from `import-binfile` (JSON params only).
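
The size threshold and chunk layout described above can be sketched as follows. This is a hypothetical Python model, not the ClojureScript in `workspace/media.cljs`: `plan_upload` and its return shape are invented for illustration, and reading "defaults 5 / 20" as 5 sessions per profile and 20 chunks per session is an assumption. The 25 MiB chunk size and the "≥ chunk-size goes chunked" rule are taken from this message.

```python
import math

CHUNK_SIZE = 25 * 1024 * 1024  # default upload-chunk-size (25 MiB)
MAX_CHUNKS = 20                # assumed upload-chunks-per-session default


def plan_upload(blob_size, chunk_size=CHUNK_SIZE, max_chunks=MAX_CHUNKS):
    """Return ("direct", []) or ("chunked", [(index, start, end), ...])."""
    if blob_size < chunk_size:
        return ("direct", [])  # small blobs keep the existing direct path
    total = math.ceil(blob_size / chunk_size)
    if total > max_chunks:
        # Assumption: the session is rejected when the quota is exceeded.
        raise ValueError(f"{total} chunks exceeds the limit of {max_chunks}")
    chunks = [
        (index, index * chunk_size, min((index + 1) * chunk_size, blob_size))
        for index in range(total)
    ]
    return ("chunked", chunks)


if __name__ == "__main__":
    path, chunks = plan_upload(60 * 1024 * 1024)
    print(path, len(chunks))
```

Under these assumed defaults a single session tops out at 20 × 25 MiB = 500 MiB.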

Tests:
- Backend (rpc_media_test.clj): happy path, idempotency, permission
  isolation, invalid media type, missing chunks, session-not-found,
  chunk-index out-of-range, and quota-limit scenarios.
- Frontend (uploads_test.cljs): session creation and chunk-count
  correctness for `upload-blob-chunked`.
- Frontend (workspace_media_test.cljs): direct-upload path for small
  blobs, chunked path for large blobs, and chunk-count correctness for
  `process-blobs`.
- `helpers/http.cljs`: shared fetch-mock helpers (`install-fetch-mock!`,
  `make-json-response`, `make-transit-response`, `url->cmd`).

Signed-off-by: Andrey Antukh <niwi@niwi.nz>
2026-04-21 18:51:10 +00:00
Andrey Antukh
3a39676969 Backport MCP from staging (part 1) 2026-04-20 19:37:02 +02:00
Andrey Antukh
ca4d00df69 🐛 Fix latest error report related migration 2026-02-04 15:36:07 +01:00
Andrey Antukh
d80ba1856a
Add several improvements to frontend error reporting
*  Add major improvement on error handling

*  Add the ability to store frontend reports

* 📎 Add PR feedback changes
2026-02-04 12:45:38 +01:00
Andrey Antukh
363b4e3778
♻️ Make the SSO code more modular (#7575)
* 📎 Disable by default social auth on devenv

* 🎉 Add the ability to import profile picture from SSO provider

* 📎 Add srepl helper for inserting custom SSO config

* 🎉 Add custom SSO auth flow
2025-11-12 12:49:10 +01:00
Andrey Antukh
5717708b56 ♻️ Refactor file storage
Make it more scalable and easily extensible
2025-10-13 12:24:05 +02:00
Andrey Antukh
85c1750706
🐛 Fix backend last migration naming (#7333) 2025-09-17 10:47:14 +02:00
Pablo Alba
06441063f2 Add "advanced" events to variants 2025-09-08 15:33:14 +02:00
Laurie Crean
0b47a366ab Implement version locking functionality for file snapshots
Signed-off-by: Laurie Crean <lmcrean@gmail.com>
2025-08-01 11:41:30 +02:00
Andrey Antukh
019bc2f183 Add migrations handling on file snapshots 2025-07-24 11:40:54 +02:00
Andrey Antukh
893f19fa5e Remove automatic cascade on file_change table fk constraint 2025-02-21 14:24:07 +01:00
Andrey Antukh
f871f88f30
♻️ Refactor file data migrations subsystem (#5692)
* ♻️ Refactor file data migrations subsystem

* 📎 Add backend scripts/run helper script
2025-01-31 13:37:41 +01:00
alonso.torres
b1dda02b47 Add mentions to notifications 2025-01-09 11:55:53 +01:00
Pablo Alba
cbc92e9f1e Add created-by to invitations, and an event related 2024-11-11 17:00:54 +01:00
Andrey Antukh
32126d1874 ♻️ Refactor file changes gc tasks
Make it more friendly with the current snapshotting mechanism
2024-10-30 13:39:38 +01:00
alonso.torres
ecb7f0a2f6 File history versions management 2024-10-29 14:23:35 +01:00
Eva Marco
043c4105db Add viewer only mode on webhook 2024-10-15 13:38:46 +02:00
Andrey Antukh
50df2279a7 🐛 Make the media cleaning on file-gc task aware of snapshots
It now takes snapshots into account and prevents
deletion of media files used in snapshots.
2024-09-03 14:50:17 +02:00
Andrey Antukh
ceaafdbb1c Add offload mechanism for file snapshots 2024-08-26 13:52:42 +02:00
Andrey Antukh
215148ca81 Add better fillfactor setting for storage_object and task tables 2024-08-23 15:09:58 +02:00
Pablo Alba
6169f5c2e8 🎉 New oops page with login and request access 2024-08-14 15:32:04 +02:00
Andrey Antukh
3219c150d4 Add better internal fillfactor setting for file table
Increases the chance of HOT updates on the db for this heavy-update
table
2024-08-09 14:28:18 +02:00
Andrey Antukh
ba167f256b Add performance enhancements on telemetry related queries 2024-08-09 14:28:18 +02:00
Andrey Antukh
0e92bcc0de 🎉 Add file-data offload mechanism 2024-08-09 14:28:18 +02:00
Andrey Antukh
86a732600b Add naming consistency changes for file_data_fragment table 2024-08-07 16:34:39 +02:00
Andrey Antukh
763fc3532e Simplify local audit table
Remove unnecessary partitioning
2024-03-25 17:58:39 +01:00
Andrey Antukh
b718a282e0 ♻️ Add minor refactor to file migrations
Relevant changes:

- Add the ability to create migration in both directions, defaulting
  to identity if not provided
- Move the version attribute to a file table column to make it more
  accessible (previously it was in the data blob)
- Reduce db update operations on file-update rpc method
2024-02-19 09:20:47 +01:00
Andrey Antukh
a71e7f7906 Remove partitioning from task table
Partitioning causes strange random delays when a row is moved from one
partition to another. Also, there is evidence that partitioning is
not providing real value here.
2024-02-06 17:23:18 +01:00
Andrey Antukh
addb392ecc Add safety mechanism for direct object deletion
The main objective is to prevent deletion of objects that can leave
unreachable orphan objects which we are unable to correctly track.

Additionally, this commit includes:

1. Properly implement safe cascade deletion of all participating
   tables on soft deletion in the objects-gc task;

2. Make the file thumbnail related tables also participate in the
   touch/refcount mechanism, applying the same safety checks;

3. Add helper for db query lazy iteration using PostgreSQL support
   for server side cursors;

4. Fix efficiency issues on gc related task using server side
   cursors instead of custom chunked iteration for processing data.

   The problem arose when a large set of rows had an identical
   value in the deleted_at column and the chunk size was small (the
   default): the custom chunked iteration read only the first N
   items and skipped the rest of the set until the next run.

   This caused many objects to remain pending deletion,
   taking up space for longer than expected. The server side cursor
   based iteration does not have this problem and iterates correctly
   over all objects.

5. Fix refcount issues on font variant deletion RPC methods
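
One way the skip described in point 4 can arise is sketched below. This is a hypothetical Python model, not the actual gc task: it assumes the chunked iteration advanced by comparing strictly against the last seen deleted_at value, which this message does not confirm. Rows are plain dicts and both iterator functions are invented for illustration.

```python
def chunked_iter(rows, chunk_size):
    """Keyset-style paging that advances strictly past the last deleted_at."""
    ordered = sorted(rows, key=lambda row: row["deleted_at"])
    cursor = None
    while True:
        page = [
            row for row in ordered
            if cursor is None or row["deleted_at"] > cursor
        ][:chunk_size]
        if not page:
            return
        yield from page
        # Rows sharing this value that did not fit in the page are skipped.
        cursor = page[-1]["deleted_at"]


def cursor_iter(rows):
    """Models a server side cursor: one ordered pass over every row."""
    yield from sorted(rows, key=lambda row: row["deleted_at"])


if __name__ == "__main__":
    rows = [{"id": i, "deleted_at": 1} for i in range(5)]
    rows.append({"id": 5, "deleted_at": 2})
    print(len(list(chunked_iter(rows, chunk_size=2))))
    print(len(list(cursor_iter(rows))))
```

With five rows sharing one deleted_at value and a chunk size of 2, the chunked pass visits only three of the six rows, while the single ordered pass visits all of them.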
2024-01-03 10:56:57 +01:00
Andrey Antukh
6d49e1cac5 🐛 Add missing index on file_tagged_object_thumbnail media_id field 2023-11-24 10:41:27 +01:00
Aitor
7951350762 🐛 Fix db table tagged thumbnails 2023-11-07 16:01:00 +01:00
Andrey Antukh
6f93b41920 🎉 Add features assignation for teams 2023-11-07 12:48:31 +01:00
Aitor
c28c55bf0b 🎉 Add tag property to thumbnails 2023-11-02 11:06:30 +01:00
Andrey Antukh
80bf7cc1e5 Merge remote-tracking branch 'origin/staging' into develop 2023-08-07 12:59:17 +02:00
Andrey Antukh
1190cf837b Add an internal approach to prevent xlog gc to remove file changes 2023-08-03 16:40:42 +02:00
Andrey Antukh
1bfc28f63d Add missing index on server_error_report table 2023-08-02 13:43:53 +02:00
Andrey Antukh
e8ffcbae69 🎉 Add support for multipart upload of thumbnails
and improve thumbnail storage by offloading it
to the storage subsystem
2023-05-05 17:00:35 +02:00
Alejandro Alonso
890583a13a Add mvp access-token support 2023-05-04 22:14:55 +02:00
Andrey Antukh
bb055a3c84 ♻️ Refactor logging subsystem and error reporting 2023-02-02 13:38:04 +01:00
Andrey Antukh
fa17ce5d40 📎 Avoid email index change on profile indexes migration 2023-01-23 09:56:21 +01:00
Andrey Antukh
db689d151e ♻️ Refactor profile and session handling
- make profile access more efficient (replace in-app joins with a
  simple select query on the profile table)
- add partial support for access-tokens (still missing some RPC methods)
- move router definitions to specific modules and simplify the main http
  module definitions to simple includes
- simplify authentication code related to access-tokens and sessions
- normalize db parameters with proper namespaced props
- more work on converting all module initialization to use proper specs
  with fully-qualified keyword config props
2023-01-18 10:51:58 +01:00
Andrey Antukh
bafe3ec087 Revert some changes related to admin that are no longer necessary 2023-01-13 10:19:39 +01:00
Andrey Antukh
73a3e0c0ae 🎉 Add usage quotes 2022-12-31 11:22:36 +01:00
Andrey Antukh
b929564fa7 ♻️ Add admin facilities on the code base
- Fix bugs related to orphan teams on profile deletion
- Separate session based profile-id param from api user provided
2022-12-22 16:42:45 +01:00
Andrey Antukh
d7459db292 🎉 Add task deduplication by label 2022-12-13 23:13:11 +01:00
Andrey Antukh
5b9f0ed0b1 🎉 Add webhook processing worker 2022-12-13 16:17:31 +01:00
Andrey Antukh
39b9daa3a7 🎉 Add webhooks rpc API 2022-12-05 15:20:29 +01:00
Andrey Antukh
76333cec26 🎉 Integrate storage/pointer-map file feature 2022-11-08 13:02:14 +01:00
Andrey Antukh
951b3eb4fe Integrate objects-map and introduce file feature flags 2022-10-18 15:49:18 +02:00