394 Commits

Author SHA1 Message Date
Pablo Alba
6e61e3304b
Add and endpoint for nitrate to check the SSO configuration for an organization (#10432) 2026-06-26 11:38:18 +02:00
Juanfran
d328cb4a9e
Enable org owners to view organization teams (#10388) 2026-06-25 13:07:20 +02:00
Pablo Alba
2a5b6a69ad
Send warning for email about nitrate orgs with sso (#10413) 2026-06-25 09:53:07 +02:00
Andrey Antukh
9e52bb40d0
Add process-level resource limits to font processing tools (#10274)
*  Add font processing resource limits via prlimit

Font processing tools (fontforge, sfnt2woff, woff2sfnt, woff2_decompress)
were invoked via clojure.java.shell/sh with no timeouts or resource limits.
This adds process-level resource limits using prlimit(1) and the shell/exec!
infrastructure from the ImageMagick hardening work.

shell/exec! changes:
- Add :prlimit parameter that prepends prlimit(1) to the command
- :prlimit takes {:mem <MiB> :cpu <seconds>} for address space and CPU time
  limits, enforced by the kernel's RLIMIT subsystem
- prlimit-cmd builds the prlimit command prefix (private helper)

Font processing changes:
- Replace all clojure.java.shell/sh calls with shell/exec! via exec-font!
- exec-font! applies font-prlimit (512 MiB, 30s CPU, 60s wall-clock)
- All 5 conversion functions (ttf->otf, otf->ttf, ttf-or-otf->woff,
  woff->sfnt, woff2->sfnt) use try/finally for explicit temp file cleanup
- Remove clojure.java.shell require from media.clj

Tests:
- Add exec-prlimit-normal, exec-prlimit-cpu, exec-prlimit-memory tests

Closes #10234

Co-authored-by: mimo-v2.5-pro <mimo-v2.5-pro@penpot.app>

*  Make font processing resource limits configurable

Replace hardcoded font-prlimit map and wall-clock timeout with
config-driven values under the PENPOT_FONT_PROCESS_* namespace.
The prlimit implementation detail is not exposed in config keys.

Co-authored-by: deepseek-v4-flash <deepseek-v4-flash@penpot.app>

---------

Co-authored-by: mimo-v2.5-pro <mimo-v2.5-pro@penpot.app>
Co-authored-by: deepseek-v4-flash <deepseek-v4-flash@penpot.app>
2026-06-19 11:30:48 +02:00
Andrey Antukh
b4532486e3
Add configurable resource usage limits for imagemagick (#10240)
* 🐳 Add ImageMagick policy.xml resource limits to backend Docker image

Add a restrictive policy.xml to the backend Docker image that caps
ImageMagick resource usage: 256MiB memory, 512MiB map, 128MP area,
30s time limit, 16KP max dimensions. Blocks PS/EPS/PDF/XPS coders
to prevent Ghostscript attack surface.

Co-authored-by: mimo-v2.5-pro <mimo-v2.5-pro@penpot.app>

*  Add timeout support to shell/exec!

Add optional :timeout parameter (in seconds) that uses
Process.waitFor(long, TimeUnit). On timeout, the process is
destroyed forcibly and an :internal/:process-timeout exception
is raised. Stdout/stderr readers handle IOException from closed
streams when the process is killed.

Co-authored-by: mimo-v2.5-pro <mimo-v2.5-pro@penpot.app>

* ♻️ Rename ::wrk/netty-executor to ::wrk/executor with cached pool

Replace DefaultEventExecutorGroup (fixed Netty thread pool) with a
cached thread pool (px/cached-executor) for general async task
offloading. The cached pool creates threads on demand and reuses
idle ones, which is more appropriate for blocking I/O workloads
(shell commands, message bus, rate limiting, etc.).

Changes:
- Rename ::wrk/netty-executor to ::wrk/executor in worker/executor.clj
- Switch implementation from DefaultEventExecutorGroup to px/cached-executor
- Update all ig/ref wiring in main.clj (msgbus, tmp cleaner, climit, rlimit, rpc)
- Remove ::wrk/netty-executor from redis.clj (let lettuce create its own
  eventExecutorGroup instead of sharing a Netty executor)
- Assert executor is present in shell/exec! to prevent silent nil usage
- Remove executor-threads config (no longer needed for cached pool)

The ::wrk/netty-io-executor (NioEventLoopGroup) remains unchanged as it
handles actual non-blocking network I/O for Redis and S3.

Co-authored-by: mimo-v2.5-pro <mimo-v2.5-pro@penpot.app>

* 🔥 Remove im4java dependency and replace with direct ImageMagick CLI calls

- Replace im4java Java library with direct 'magick' CLI calls via shell/exec!
- Add PENPOT_IMAGEMAGICK_* config env vars for resource limits (thread, memory, map, area, disk, time, width, height)
- Use configurable ImageMagick environment with sensible defaults matching policy.xml
- Remove -Dim4java.useV7=true JVM flag from startup scripts
- Remove org.im4java/im4java from deps.edn
- All ImageMagick commands now use shell/exec! with 60s timeout and resource limits

Co-authored-by: mimo-v2.5-pro <mimo-v2.5-pro@penpot.app>

* 💄 Rename imagemagick env functions and optimize config reads

- Rename imagemagick-defaults -> imagemagick-default-env
- Rename imagemagick-env -> get-imagemagick-env
- Optimize to avoid double cf/get calls per config key

Co-authored-by: mimo-v2.5-pro <mimo-v2.5-pro@penpot.app>

*  Add tests for shell/exec! timeout and media processing

- Add shell_test.clj: tests for exec! timeout, env vars, stdin, stderr
- Add media_test.clj: tests for info, generic-thumbnail, profile-thumbnail
- Fix generic-process to prefer explicit format over input mtype
- Fix shell/exec! to use cached executor when system has no executor
- Fix reduce-kv accumulator in set-env (must return penv)

Co-authored-by: mimo-v2.5-pro <mimo-v2.5-pro@penpot.app>

* ♻️ Refactor media/process to take system as first argument

- Change (defmulti process :cmd) -> (defmulti process (fn [_system params] (:cmd params)))
- Change (run params) -> (run system params)
- All process methods now receive [system params]
- Update all callers: rpc/commands/media, profile, auth, fonts
- Revert shell/exec! to require system with executor (no fallback)
- Fix lint warnings and formatting

Co-authored-by: mimo-v2.5-pro <mimo-v2.5-pro@penpot.app>

* 🔥 Remove unused app.svgo namespace

Co-authored-by: mimo-v2.5-pro <mimo-v2.5-pro@penpot.app>

* 🔥 Remove Node.js from backend Docker image

- Delete unused svgo-cli.js script
- Remove Node.js installation from Dockerfile.backend
- Remove svgo-cli.js copy from backend build script

Co-authored-by: mimo-v2.5-pro <mimo-v2.5-pro@penpot.app>

* 🔥 Remove unused process-error multimethod

- Remove process-error multimethod and its default handler
- Simplify media/run to directly call process
- Fix alignment in main.clj

Co-authored-by: mimo-v2.5-pro <mimo-v2.5-pro@penpot.app>

* 📚 Add ImageMagick resource limits configuration to technical guide

Co-authored-by: mimo-v2.5-pro <mimo-v2.5-pro@penpot.app>

---------

Co-authored-by: mimo-v2.5-pro <mimo-v2.5-pro@penpot.app>
2026-06-18 17:52:01 +02:00
Andrey Antukh
b391a4c8d3
♻️ Add mcp integration state management refactor (#10226)
* ♻️ Add mcp integration state management refactor

* 🐛 Fix access tokens do not appear

* ♻️ Refactor some names

* ♻️ Refactor token deletion

---------

Co-authored-by: Luis de Dios <luis.dedios@kaleidos.net>
2026-06-16 18:35:30 +02:00
Andrey Antukh
d44c6250ea Merge remote-tracking branch 'origin/staging' into develop 2026-06-15 13:04:19 +02:00
Andrey Antukh
79227c4de8
Batch multiple thumbnail deletions into a single RPC call (#9943)
*  Batch multiple thumbnail deletions into a single RPC call

Replace the old per-object immediate thumbnail deletion with a
debounced batched approach. The frontend queues object-ids in state
and waits 200ms before sending a single RPC request with up to 200
object-ids. The backend deletes all matching thumbnails in one SQL
statement with a single RETURNING clause, then touches the affected
media objects.

This reduces RPC overhead when rapidly clearing thumbnails (e.g.
navigating pages) and makes deletions more efficient.

Signed-off-by: Andrey Antukh <niwi@niwi.nz>

* 📎 Fix missing issues

---------

Signed-off-by: Andrey Antukh <niwi@niwi.nz>
2026-06-15 12:01:32 +02:00
María Valderrama
68d4238277 🐛 Fix license not loading in theme change 2026-06-12 12:13:50 +02:00
Alonso Torres
f5c258b868
🐛 Activate inactive profile when re-registering with disable-email-verification (#9999) 2026-06-11 12:52:07 +02:00
Andrey Antukh
12e7ea038f Merge remote-tracking branch 'origin/staging' into develop 2026-06-11 10:58:19 +02:00
Andrey Antukh
ba9b03268a
⬆️ Update backend and common dependencies (#10108)
* 🐛 Fix incorrect valkey uri on backend tests

* ⬆️ Update backend dependencies

* ⬆️ Update pnpm dependencies

* 📎 Fix minor linter issues
2026-06-11 10:11:23 +02:00
Andrey Antukh
9f5a0f767e Merge remote-tracking branch 'origin/staging' into develop 2026-06-10 11:26:48 +02:00
Andrey Antukh
0ac092d177
🐛 Make set-file-shared idempotent to fix race condition with optimistic updates (#10093) 2026-06-10 10:58:50 +02:00
Pablo Alba
2a48747cf6 Review nitrate add team members permission 2026-06-08 10:56:18 +02:00
yong2bba
bb89ca526b
Avoid deduplicating temporary export files (#9959)
* 🐛 Avoid deduplicating temporary export files

* 📎 Update changelog

Signed-off-by: Andrey Antukh <niwi@niwi.nz>

---------

Signed-off-by: Andrey Antukh <niwi@niwi.nz>
Co-authored-by: yongjin <yongjin@yongjinui-Macmini.local>
Co-authored-by: Andrey Antukh <niwi@niwi.nz>
2026-06-03 14:08:10 +02:00
María Valderrama
7bf519a127 Inherit subscriptions perks to Nitrate 2026-06-02 09:33:02 +02:00
Andrey Antukh
4a8fb5af53 Merge remote-tracking branch 'origin/staging' into develop 2026-06-01 13:15:57 +02:00
Andrey Antukh
c5de4c27b0 Merge remote-tracking branch 'origin/main' into staging 2026-06-01 12:57:39 +02:00
Yamila Moreno
ddba2ffa75
📎 Update Kaleidos Copyright (#9929) 2026-05-29 11:24:58 +02:00
Pablo Alba
c51a137ca9 Add nitrate permission team members design review changes 2026-05-29 11:23:53 +02:00
Chan
ac3950e36c
🐛 Fix CORS middleware reflecting arbitrary origins (#9675)
*  Align profile and dashboard files with penpot develop

* 🐛 Fix CORS origin allowlist for issue #9659

---------

Signed-off-by: Chan <101856681+enjoyandlove@users.noreply.github.com>
Co-authored-by: Andrey Antukh <niwi@niwi.nz>
2026-05-29 11:23:16 +02:00
Andrey Antukh
1a1c7355e2 Merge remote-tracking branch 'refs/remotes/origin/develop' into develop 2026-05-27 13:37:17 +02:00
Andrey Antukh
3858993a57 Merge remote-tracking branch 'origin/staging' into develop 2026-05-27 13:37:02 +02:00
María Valderrama
15d6df48f5 🐛 Fix default team showing up in count 2026-05-27 13:36:35 +02:00
Andrey Antukh
f6c76711f4 Merge remote-tracking branch 'origin/main' into staging 2026-05-27 11:34:59 +02:00
Pablo Alba
a637dda554 Check nitrate permission only org members for move teams 2026-05-26 13:25:20 +02:00
María Valderrama
87384aaccd 🐛 Fix nitrate delete and leave org flow 2026-05-25 14:39:03 +02:00
Pablo Alba
dac98c0625 Add nitrate add team members permission 2026-05-23 17:18:27 +02:00
Andrey Antukh
405a73e8ba
Add climit impl and config for file snapshot methods (#9722)
*  Add dedicated concurrency limit for restore-file-snapshot

This adds a dedicated climit configuration for the restore-file-snapshot
RPC method with :permits 1 per profile (plus queue of 2 and 60s timeout)
and a global limit of 3. Previously the method only used the generic
root/by-profile and root/global limits, allowing up to 7 concurrent
restore operations per profile which caused database row lock contention
on FOR UPDATE and connection pool exhaustion.

*  Skip locking on restore! to avoid blocking other operations

Changes the row lock acquisition in restore! from a blocking FOR UPDATE
to FOR UPDATE SKIP LOCKED. If the file row is already locked by another
concurrent operation (e.g., another restore or an update-file), the query
returns no rows and the caller fails fast with a clear conflict error
instead of blocking indefinitely holding a database connection.

*  Add queue and timeout limits to root/by-profile concurrency limit

Previously root/by-profile had no queue limit (unbounded Integer/MAX_VALUE)
and no timeout, allowing requests to pile up indefinitely behind a profile
whose permits were exhausted by long-running operations. This could lead
to memory pressure and cascading failures. Now limited to 30 queued
requests with a 30-second timeout so excess requests fail fast.

*  Move backup snapshot creation outside restore transaction

The backup snapshot (fsnap/create!) is now created in its own short-lived
connection before the actual restore transaction begins. This ensures the
backup is persisted independently of the restore outcome and reduces the
restore transaction window.

The restore itself runs inside a db/tx-run! block with an optimistic
locking check: it reads the file with FOR UPDATE and compares its revn
against the value captured at backup time. If the file was edited
concurrently, the restore aborts with a conflict error to prevent data
loss.

Co-dependent with the SKIP LOCKED change in restore! — the FOR UPDATE
acquired here is in the same transaction as restore!, so the SKIP LOCKED
inside restore! correctly sees the row as unlocked (same transaction).

* ♻️ Remove unused private function get-minimal-file

The local get-minimal-file function in file_snapshots.clj is no longer
used since restore! switched to direct exec-one! with FOR UPDATE SKIP
LOCKED. The sql:get-minimal-file SQL constant is still used directly.

*  Add minor improvements on db connection management

* ♻️ Refactor create-file-snapshot to use explicit transaction management

Remove automatic transaction wrapping (`::db/transaction true`) and
pass `cfg` through the call chain instead of destructured `conn`.
Wrap `fsnap/create!` in an explicit `db/tx-run!` for clearer
transaction boundaries.

Signed-off-by: Andrey Antukh <niwi@niwi.nz>

*  Add dedicated concurrency limit for create-file-snapshot

This adds a dedicated climit configuration for the create-file-snapshot
RPC method with :permits 1 per profile (plus queue of 2 and 60s timeout)
and a global limit of 3. Previously the method only used the generic
root/by-profile and root/global limits, allowing up to 10 concurrent
snapshot creation operations per profile which could cause database
contention and connection pool exhaustion.

Signed-off-by: Andrey Antukh <niwi@niwi.nz>

---------

Signed-off-by: Andrey Antukh <niwi@niwi.nz>
2026-05-19 14:30:44 +02:00
Andrey Antukh
595ec599c6 Merge remote-tracking branch 'origin/staging' into develop 2026-05-18 20:00:47 +02:00
Andrey Antukh
1b6b367951 Add diagnostic keys to SSRF validation exceptions
Add :uri and :scheme/:host keys to exceptions raised by
`validate-uri` for better error diagnostics. Also fix a bug
where (str url) was used instead of (str uri) in the
host-missing exception path.

Update the existing blocked-target test to verify the new :uri
key, and add three new tests covering scheme rejection, missing
host, and DNS failure error paths. All 27 tests pass with 60
assertions and 0 failures.

Signed-off-by: Andrey Antukh <niwi@niwi.nz>
2026-05-18 15:57:55 +00:00
Juanfran
8e86416b0b Cascade owned organization deletion on account removal 2026-05-18 16:05:08 +02:00
Andrey Antukh
6f41a2b729 Merge remote-tracking branch 'origin/staging' into develop 2026-05-18 15:24:02 +02:00
Andrey Antukh
208182cab1 Merge remote-tracking branch 'origin/main' into staging 2026-05-18 15:23:46 +02:00
Andrey Antukh
9de25c5404
🐛 Fix incorrect content-type on doc endpoint response (#9681)
The /api/main/doc endpoint was returning HTML content with a
text/plain content-type header instead of text/html. This caused
browsers to render the response as plain text.

Added content-type: text/html; charset=utf-8 header to the
response in the doc handler and added a regression test to
verify the fix.

Closes #9680

Signed-off-by: Andrey Antukh <niwi@niwi.nz>
2026-05-18 12:54:16 +02:00
Andrey Antukh
ff23f786b4 🐛 Fix broken authentication on /assets handlers
- Add ::setup/props and ::db/pool to :app.http.assets/routes config
  so session renewal works correctly for asset requests.
- Add actoken/authz middleware to the assets middleware chain so
  access tokens are properly recognized.
- Add authenticated? helper that checks both ::session/profile-id
  and ::actoken/profile-id, fixing 401 errors when accessing
  protected assets with a valid access token.
- Add comprehensive test suite for assets auth scenarios.

Closes #9677

Signed-off-by: Andrey Antukh <niwi@niwi.nz>
2026-05-15 12:05:02 +02:00
Andrey Antukh
d620c86053 Merge remote-tracking branch 'origin/staging' into develop 2026-05-15 11:58:06 +02:00
Andrey Antukh
9021544c05 Merge remote-tracking branch 'origin/main' into staging 2026-05-14 15:24:29 +02:00
Andrey Antukh
67d9567971
🐛 Prevent CSS injection vulnerability in font family names
Add a shared `schema:font-family` whitelist validator in
app.common.types.font that only allows letters, digits, spaces,
hyphens, underscores, and dots in font family names. Apply the schema
to create-font-variant and update-font RPC endpoints on the
backend, and add client-side validation in the dashboard fonts UI.
Include unit tests for the schema and integration tests for the RPC
handlers.

Signed-off-by: Andrey Antukh <niwi@niwi.nz>
2026-05-14 13:46:02 +02:00
Andrey Antukh
52588412c7 Merge remote-tracking branch 'origin/staging' into develop 2026-05-14 11:12:01 +02:00
Andrey Antukh
d78074307f Merge remote-tracking branch 'origin/main' into staging 2026-05-14 11:07:42 +02:00
Pablo Alba
fffafdab93
🐛 Fix library updates reappear after file is reloaded (#9563)
* 🐛 Fix library updates reappear after file is reloaded

Summary
Migrate synced_at timestamps to a standalone file_library_sync table to ensure sync state is tracked for both direct and transitive libraries.

Problem
Transitive libraries (libraries imported by other libraries) are not stored as direct rows in file_library_rel. Because the system previously coupled synced_at directly to the file_library_rel schema, transitive libraries lacked a persistent location for their sync timestamps. This caused sync states to be lost or incorrectly reported for nested dependencies.

Changes
Schema Migration: Created file_library_sync and migrated existing synced_at values from file_library_rel.

Decoupling: Removed tight Foreign Key coupling to allow sync rows to exist independently of specific relationship records.

Persistent Writes: Added upsert-file-library-sync! helper. Updated all import, duplication, and RPC write paths (v1/v2/v3 importers, link-file-library) to ensure every write persists a sync row.

Unified Reads: Updated both direct and recursive/transitive library queries to fetch synced_at from the new table.

Testing: Added regression tests to verify that sync rows are correctly created/updated even when a transitive relation is absent in file_library_rel.

Impact
This fix ensures that the system accurately records and retrieves sync states for the entire library dependency tree, resolving the bug where nested libraries appeared out of sync.

*  MR review
2026-05-13 11:29:05 +02:00
Andrey Antukh
947f6d392d
🎉 Add chunked upload support for font variants (#9551)
*  Add additional logging and validation for image upload

* 🎉 Add chunked upload support for font variants

Extend the font variant upload flow across frontend, backend, and common
to support the standardized chunked upload protocol.

**Backend:**
- Add \`:font-max-file-size\` config default (30 MiB) and schema entry
- Add \`validate-font-size!\` in \`media.clj\` (mirrors
  \`validate-media-size!\`, raises \`:font-max-file-size-reached\`)
- Extend \`schema:create-font-variant\` to accept either \`:data\`
  (legacy bytes or chunk-vector) or \`:uploads\` (new chunked session
  map), with a validator requiring exactly one
- Add \`prepare-font-data-from-uploads\`: assembles each chunked
  session via \`cmedia/assemble-chunks\`, validates type+size
- Add \`prepare-font-data-from-legacy\`: normalises legacy byte/chunk
  entries, writing to a tempfile (joining via SequenceInputStream),
  validates type+size
- Add structured logging ("init"/"end") with \`:size\`, \`:mtypes\`,
  and \`:elapsed\` in \`create-font-variant\`

**Frontend:**
- \`upload-blob-chunked\` accepts a per-caller \`:chunk-size\` option
- Add \`font-upload-chunk-size\` (10 MiB) and \`upload-font-variant\`
  fn that uploads each mtype as a separate chunked session
- \`on-upload*\` in dashboard fonts now calls \`upload-font-variant\`
  instead of issuing \`create-font-variant\` RPC directly
- \`process-upload\` stores raw ArrayBuffer instead of chunking
  client-side

**Common:**
- Replace \`"font/opentype"\` with \`"font/woff2"\` in \`font-types\`

**Tests:**
- 25 tests / 224 assertions covering all three upload paths (direct
  bytes, legacy chunk-vector, new chunked sessions), size validation,
  and media type validation

Signed-off-by: Andrey Antukh <niwi@niwi.nz>

* 📎 Add a script for check the commit format locally

---------

Signed-off-by: Andrey Antukh <niwi@niwi.nz>
2026-05-12 18:30:19 +02:00
jony376
f7fbd3007e
🐛 Prevent viewers from overwriting file thumbnails (#9285)
* 🐛 Prevent viewers from overwriting file thumbnails

* 🐛 Fix message

---------

Co-authored-by: jony376 <jony376@gmail.com>
Co-authored-by: Andrey Antukh <niwi@niwi.nz>
2026-05-11 14:38:40 +02:00
Andrey Antukh
7d4be33d4f 🎉 Add telemetry anonymous event collection (#9483)
* 🎉 Add telemetry anonymous event collection

Rewrite the audit logging subsystem to support three operating modes and
add anonymous telemetry event collection:

Modes:
- A (audit-log only): events persisted with full context
- B (audit-log + telemetry): same as A, plus events are collected for
  telemetry shipping
- C (telemetry-only): events stored anonymously with PII stripped,
  telemetry flag active, audit-log flag inactive

Audit system refactoring (app.loggers.audit):
- Replace qualified map keys (::audit/name etc.) with plain keywords
- Rename submit! -> submit, insert! -> insert, prepare-event ->
  prepare-rpc-event
- Add submit* as a lower-level public API
- Add process-event dispatch function that handles all three modes and
  webhooks in a single tx-run!
- Add :id to event schema (auto-generated if omitted)
- Add filter-telemetry-props: anonymises event props per event type.
  Keeps UUID/boolean/number values; for login/identify events preserves
  lang, auth-backend, email-domain; for navigate events preserves route,
  file-id, team-id, page-id; instance-start trigger passes through.
- Add filter-telemetry-context: retains only safe context keys.
  Backend: version, initiator, client-version, client-user-agent.
  Frontend: browser, os, locale, screen metrics, event-origin.
- Timestamps truncated to day precision via ct/truncate for telemetry
  storage
- PII stripped: props emptied, ip-addr zeroed, session-linking and
  access-token fields removed from context

Config (app.config):
- Derive :enable-telemetry flag from telemetry-enabled config option

Email utilities (app.email):
- Add email/clean and email/get-domain helper functions for domain
  extraction from email addresses

Setup (app.setup):
- Emit instance-start trigger event at system startup
- Simplify handle-instance-id (remove read-only check)

RPC layer (app.rpc):
- wrap-audit now activates when :telemetry flag is set
- Add :request-id to RPC params context for event correlation

RPC commands (management, teams_invitations, verify_token, OIDC auth,
webhooks): migrate all audit call sites to use the new plain-key API

SREPL (app.srepl.main):
- Migrate all audit/insert! calls to audit/insert with plain keys

Telemetry task (app.tasks.telemetry):
- Restructure legacy report into make-legacy-request; distinguish
  payload type as :telemetry-legacy-report
- Add collect-and-send-audit-events: loop fetching up to 10,000 rows
  per iteration, encodes and sends each page, deletes on success,
  stops immediately on failure for retry
- Add send-event-batch: POSTs fressian+zstd batch (base64 via
  blob/encode-str) to the telemetry endpoint with instance-id per event
- Add gc-telemetry-events: enforces 100,000-row safety cap by dropping
  oldest rows first
- Add delete-sent-events: deletes successfully shipped rows by id

Blob utilities (app.util.blob):
- Add encode-str/decode-str: combine fressian+zstd encoding with URL-
  safe base64 for JSON-safe string transport

Database:
- Add migration 0145: index on audit_log (source, created_at ASC) for
  efficient telemetry batch collection queries

Frontend:
- Always initialize event system regardless of :audit-log flag
- Defer auth events (signin identify) to after profile is set
- Refactor event subsystem for telemetry support

Tests (21 test vars, 94 assertions in tasks-telemetry-test):
- Cover all code paths: disabled/enabled telemetry, no-events no-op,
  happy-path batch send and delete, failure retention, payload anonymity,
  context stripping, timestamp day precision, batch encoding round-trip,
  multi-page iteration, GC cap enforcement, partial failure handling
- blob encode-str/decode-str round-trip tests (14 test vars)
- RPC audit integration tests (5 test vars)

Signed-off-by: Andrey Antukh <niwi@niwi.nz>

* 📎 Add pr feedback changes

---------

Signed-off-by: Andrey Antukh <niwi@niwi.nz>
2026-05-11 12:42:01 +02:00
Andrey Antukh
cd4a4da0f2
🎉 Add telemetry anonymous event collection (#9483)
* 🎉 Add telemetry anonymous event collection

Rewrite the audit logging subsystem to support three operating modes and
add anonymous telemetry event collection:

Modes:
- A (audit-log only): events persisted with full context
- B (audit-log + telemetry): same as A, plus events are collected for
  telemetry shipping
- C (telemetry-only): events stored anonymously with PII stripped,
  telemetry flag active, audit-log flag inactive

Audit system refactoring (app.loggers.audit):
- Replace qualified map keys (::audit/name etc.) with plain keywords
- Rename submit! -> submit, insert! -> insert, prepare-event ->
  prepare-rpc-event
- Add submit* as a lower-level public API
- Add process-event dispatch function that handles all three modes and
  webhooks in a single tx-run!
- Add :id to event schema (auto-generated if omitted)
- Add filter-telemetry-props: anonymises event props per event type.
  Keeps UUID/boolean/number values; for login/identify events preserves
  lang, auth-backend, email-domain; for navigate events preserves route,
  file-id, team-id, page-id; instance-start trigger passes through.
- Add filter-telemetry-context: retains only safe context keys.
  Backend: version, initiator, client-version, client-user-agent.
  Frontend: browser, os, locale, screen metrics, event-origin.
- Timestamps truncated to day precision via ct/truncate for telemetry
  storage
- PII stripped: props emptied, ip-addr zeroed, session-linking and
  access-token fields removed from context

Config (app.config):
- Derive :enable-telemetry flag from telemetry-enabled config option

Email utilities (app.email):
- Add email/clean and email/get-domain helper functions for domain
  extraction from email addresses

Setup (app.setup):
- Emit instance-start trigger event at system startup
- Simplify handle-instance-id (remove read-only check)

RPC layer (app.rpc):
- wrap-audit now activates when :telemetry flag is set
- Add :request-id to RPC params context for event correlation

RPC commands (management, teams_invitations, verify_token, OIDC auth,
webhooks): migrate all audit call sites to use the new plain-key API

SREPL (app.srepl.main):
- Migrate all audit/insert! calls to audit/insert with plain keys

Telemetry task (app.tasks.telemetry):
- Restructure legacy report into make-legacy-request; distinguish
  payload type as :telemetry-legacy-report
- Add collect-and-send-audit-events: loop fetching up to 10,000 rows
  per iteration, encodes and sends each page, deletes on success,
  stops immediately on failure for retry
- Add send-event-batch: POSTs fressian+zstd batch (base64 via
  blob/encode-str) to the telemetry endpoint with instance-id per event
- Add gc-telemetry-events: enforces 100,000-row safety cap by dropping
  oldest rows first
- Add delete-sent-events: deletes successfully shipped rows by id

Blob utilities (app.util.blob):
- Add encode-str/decode-str: combine fressian+zstd encoding with URL-
  safe base64 for JSON-safe string transport

Database:
- Add migration 0145: index on audit_log (source, created_at ASC) for
  efficient telemetry batch collection queries

Frontend:
- Always initialize event system regardless of :audit-log flag
- Defer auth events (signin identify) to after profile is set
- Refactor event subsystem for telemetry support

Tests (21 test vars, 94 assertions in tasks-telemetry-test):
- Cover all code paths: disabled/enabled telemetry, no-events no-op,
  happy-path batch send and delete, failure retention, payload anonymity,
  context stripping, timestamp day precision, batch encoding round-trip,
  multi-page iteration, GC cap enforcement, partial failure handling
- blob encode-str/decode-str round-trip tests (14 test vars)
- RPC audit integration tests (5 test vars)

Signed-off-by: Andrey Antukh <niwi@niwi.nz>

* 📎 Add pr feedback changes

---------

Signed-off-by: Andrey Antukh <niwi@niwi.nz>
2026-05-11 12:19:59 +02:00
Andrey Antukh
10a23a6869 Merge remote-tracking branch 'origin/main' into staging 2026-05-10 09:16:41 +02:00
Andrey Antukh
279231240d
🐛 Harden outbound HTTP requests against SSRF and restrict assets handlers (#9390)
* ⬆️ Update root deps

* 🐛 Harden outbound HTTP requests against SSRF and restrict unauthenticated asset access

- Add app.util.ssrf URL/host validator that resolves hostnames and blocks
  loopback, link-local, site-local, cloud metadata, and operator-supplied CIDRs
- Add app.media.sanitize image EOF truncator that strips trailing data after
  PNG IEND, JPEG EOI, GIF trailer, and WebP RIFF markers
- Disable HTTP client auto-redirect; add req-with-redirects! helper that
  revalidates every redirect hop against the SSRF blocklist
- Wire SSRF validation and EOF sanitization into media/download-image
- Validate webhook URLs and OIDC profile picture URLs against SSRF
- Restrict /assets/by-id to require authentication for non-public buckets
  (profile) while keeping public access for file-media-object,
  file-object-thumbnail, team-font-variant, and file-data-fragment
- Add config knobs: ssrf-protection-enabled, ssrf-allowed-hosts,
  ssrf-extra-blocked-cidrs

Signed-off-by: Andrey Antukh <niwi@niwi.nz>

---------

Signed-off-by: Andrey Antukh <niwi@niwi.nz>
2026-05-08 09:18:22 +02:00
Andrey Antukh
14a0660352 Merge remote-tracking branch 'origin/main' into staging 2026-05-06 15:06:59 +02:00