# Penpot Backend – Agent Instructions
Clojure backend (RPC) service running on the JVM.
Uses Integrant for dependency injection, PostgreSQL for storage, and Redis for messaging/caching.
## General Guidelines
To ensure consistency across the Penpot JVM stack, all contributions must adhere to these criteria.
IMPORTANT: all CLI commands must be executed from the `backend/` subdirectory for them to work correctly.
### 1. Testing & Validation

- **Coverage:** If code is added or modified in `src/`, corresponding tests in `test/backend_tests/` must be added or updated.
- **Execution:**
  - **Isolated:** Run `clojure -M:dev:test --focus backend-tests.my-ns-test` for a specific test namespace.
  - **Regression:** Run `clojure -M:dev:test` to ensure the suite passes without regressions in related functional areas.
### 2. Code Quality & Formatting

- **Linting:** All code must pass linter checks (run `pnpm run lint:clj` or `pnpm run lint` from the repository root).
- **Formatting:** All code must pass the formatting check (run `pnpm run check-fmt`). Use `pnpm run fmt` to fix formatting issues. Avoid "dirty" diffs caused by unrelated whitespace changes.
- **Type Hinting:** Use explicit JVM type hints (e.g., `^String`, `^long`) in performance-critical paths to avoid reflection overhead.
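As a small illustration of the kind of reflection-sensitive interop this refers to (a generic sketch, not Penpot code):

```clojure
;; Without the ^String hint, both interop calls below would go through
;; reflection; with it, the compiler emits direct method calls.
(set! *warn-on-reflection* true)

(defn upper-first
  "Upper-cases the first character of s."
  [^String s]
  (str (.toUpperCase (.substring s 0 1))
       (.substring s 1)))

;; (upper-first "penpot") → "Penpot"
```

Enabling `*warn-on-reflection*` during development surfaces any remaining reflective call sites as compiler warnings.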
## Code Conventions

### Namespace Overview

The source is located under the `src` directory. A general overview of the
namespace structure:

- `app.rpc.commands.*` – RPC command implementations (`auth`, `files`, `teams`, etc.)
- `app.http.*` – HTTP routes and middleware
- `app.db.*` – Database layer
- `app.tasks.*` – Background job tasks
- `app.main` – Integrant system setup and entrypoint
- `app.loggers` – Internal loggers (audit log, mattermost, etc.), not to be confused with `app.common.logging`
## RPC

The RPC methods are implemented using a multimethod-like structure via the
`app.util.services` namespace. The main RPC methods are collected under the
`app.rpc.commands` namespace and exposed under `/api/rpc/command/<cmd-name>`.
RPC methods accept POST and GET requests interchangeably and use the `Accept`
header to negotiate the response encoding (which can be Transit — the default —
or plain JSON). They also accept Transit (default) or JSON as input, which should
be indicated using the `Content-Type` header.
The main naming convention: use the `get-` prefix on the RPC name when it is a
READ operation.
Example of an RPC method definition:

```clojure
(sv/defmethod ::my-command
  {::rpc/auth true          ;; requires auth
   ::doc/added "1.18"
   ::sm/params [:map ...]   ;; malli input schema
   ::sm/result [:map ...]}  ;; malli output schema
  [{:keys [::db/pool] :as cfg} {:keys [::rpc/profile-id] :as params}]
  ;; return a plain map or throw
  {:id (uuid/next)})
```
Look under `src/app/rpc/commands/*.clj` to see more examples.
## Tests

Test namespaces match `.*-test$` under `test/`. Config is in `tests.edn`.
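A minimal test namespace following that naming convention might look like this (an illustrative sketch; real tests in this repo typically rely on shared helpers and database fixtures not shown here):

```clojure
(ns backend-tests.example-test
  (:require [clojure.test :as t]))

(t/deftest addition-sanity-test
  (t/is (= 4 (+ 2 2))))
```

Run it in isolation with `clojure -M:dev:test --focus backend-tests.example-test`.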
## Integrant System

`src/app/main.clj` declares the system map. Each key is a component; values
are config maps with `::ig/ref` for dependencies. Components implement
`ig/init-key` / `ig/halt-key!`.
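A minimal component pair, sketched with illustrative key names (real components in `app.main` carry more configuration and real resources):

```clojure
(defmethod ig/init-key ::my-component
  [_ cfg]
  ;; build and return the running component
  ;; (e.g. a connection, thread pool, or plain map)
  {::started true ::cfg cfg})

(defmethod ig/halt-key! ::my-component
  [_ component]
  ;; release any resources held by the component
  nil)
```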
## Connecting to the Database

Two PostgreSQL databases are used in this environment:

| Database | Purpose | Connection string |
|---|---|---|
| `penpot` | Development / app | `postgresql://penpot:penpot@postgres/penpot` |
| `penpot_test` | Test suite | `postgresql://penpot:penpot@postgres/penpot_test` |
Interactive psql session:

```bash
# development DB
psql "postgresql://penpot:penpot@postgres/penpot"

# test DB
psql "postgresql://penpot:penpot@postgres/penpot_test"
```
One-shot query (non-interactive):

```bash
psql "postgresql://penpot:penpot@postgres/penpot" -c "SELECT id, name FROM team LIMIT 5;"
```
Useful psql meta-commands:

```
\dt         -- list all tables
\d <table>  -- describe a table (columns, types, constraints)
\di         -- list indexes
\q          -- quit
```
Migrations table: Applied migrations are tracked in the `migrations` table
with columns `module`, `step`, and `created_at`. When renaming a migration's
logical name, update this table in both databases to match the new name;
otherwise the runner will attempt to re-apply the migration on the next startup.

```bash
# Example: fix a renamed migration entry in the test DB
psql "postgresql://penpot:penpot@postgres/penpot_test" \
  -c "UPDATE migrations SET step = 'new-name' WHERE step = 'old-name';"
```
## Database Access (Clojure)

`app.db` wraps `next.jdbc`. Queries use a SQL builder that auto-converts kebab-case ↔ snake_case.
```clojure
;; Query helpers
(db/get cfg-or-pool :table {:id id})   ;; fetch one row (throws if missing)
(db/get* cfg-or-pool :table {:id id})  ;; fetch one row (returns nil)
(db/query cfg-or-pool :table {:team-id team-id})         ;; fetch multiple rows
(db/insert! cfg-or-pool :table {:name "x" :team-id id})  ;; insert
(db/update! cfg-or-pool :table {:name "y"} {:id id})     ;; update
(db/delete! cfg-or-pool :table {:id id})                 ;; delete

;; Run multiple statements/queries on a single connection
(db/run! cfg (fn [{:keys [::db/conn]}]
               (db/insert! conn :table row1)
               (db/insert! conn :table row2)))

;; Transactions
(db/tx-run! cfg (fn [{:keys [::db/conn]}]
                  (db/insert! conn :table row)))
```
Almost all functions in the `app.db` namespace accept a pool, a connection, or
a `cfg` map as params.

Migrations live in `src/app/migrations/` as numbered SQL files. They run automatically on startup.
## Error Handling

The exception helpers are defined in the Common module and are available under
the `app.common.exceptions` namespace.

Example of raising an exception:

```clojure
(ex/raise :type :not-found
          :code :object-not-found
          :hint "File does not exist"
          :file-id id)
```

Common types: `:not-found`, `:validation`, `:authorization`, `:conflict`, `:internal`.
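Exceptions raised this way carry their keyword arguments as data, so a handler can dispatch on them; a sketch assuming the standard `ex-info`/`ex-data` mechanics:

```clojure
(try
  (ex/raise :type :not-found
            :code :object-not-found
            :hint "File does not exist")
  (catch Throwable cause
    (let [{:keys [type code]} (ex-data cause)]
      (when (= :not-found type)
        ;; handle the missing-object case here
        ))))
```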
## Performance Macros (`app.common.data.macros`)

Always prefer these macros over their `clojure.core` equivalents — they provide
optimized implementations:

```clojure
(dm/select-keys m [:a :b])  ;; faster than core/select-keys
(dm/get-in obj [:a :b :c])  ;; faster than core/get-in
(dm/str "a" "b" "c")        ;; string concatenation
```
## Configuration

`src/app/config.clj` reads `PENPOT_*` environment variables, validated with
Malli. Access anywhere via `(cf/get :smtp-host)`. Feature flags: `(cf/flags :enable-smtp)`.
## Background Tasks

Background tasks live in `src/app/tasks/`. Each task is an Integrant component
that exposes a `::handler` key and follows this three-method pattern:

```clojure
(defmethod ig/assert-key ::handler   ;; validate config at startup
  [_ params]
  (assert (db/pool? (::db/pool params)) "expected a valid database pool"))

(defmethod ig/expand-key ::handler   ;; inject defaults before init
  [k v]
  {k (assoc v ::my-option default-value)})

(defmethod ig/init-key ::handler     ;; return the task fn
  [_ cfg]
  (fn [_task]  ;; receives the task row from the worker
    (db/tx-run! cfg (fn [{:keys [::db/conn]}]
                      ;; … do work …
                      ))))
```
Wiring a new task requires two changes in `src/app/main.clj`:

- Handler config – add an entry in `system-config` with the dependencies:

  ```clojure
  :app.tasks.my-task/handler
  {::db/pool (ig/ref ::db/pool)}
  ```

- Registry + cron – register the handler name and schedule it:

  ```clojure
  ;; in ::wrk/registry ::wrk/tasks map:
  :my-task (ig/ref :app.tasks.my-task/handler)

  ;; in worker-config ::wrk/cron ::wrk/entries vector:
  {:cron #penpot/cron "0 0 0 * * ?" ;; daily at midnight
   :task :my-task}
  ```
Useful cron patterns (Quartz format — six fields: s m h dom mon dow):

| Expression | Meaning |
|---|---|
| `"0 0 0 * * ?"` | Daily at midnight |
| `"0 0 */6 * * ?"` | Every 6 hours |
| `"0 */5 * * * ?"` | Every 5 minutes |
Time helpers (`app.common.time`):

```clojure
(ct/now)                           ;; current instant
(ct/duration {:hours 1})           ;; java.time.Duration
(ct/minus (ct/now) some-duration)  ;; subtract duration from instant
```
`db/interval` converts a Duration (or millis / string) to a PostgreSQL
interval object suitable for use in SQL queries:

```clojure
(db/interval (ct/duration {:hours 1})) ;; → PGInterval "3600.0 seconds"
```
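A usage sketch, passing the interval as a bound query parameter (table and column names are illustrative, and `db/exec-one!` is assumed to accept a SQL vector as in `next.jdbc`):

```clojure
;; Delete audit rows older than 30 days; the PGInterval binds as a
;; native interval parameter. Table/column names are illustrative.
(db/exec-one! conn
              ["delete from audit_log where created_at < now() - ?"
               (db/interval (ct/duration {:days 30}))])
```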