System monitoring
The System Monitoring page in the Liberator UI surfaces the most operationally relevant signals from the Liberator stack in a single place: cluster health, queue depth, dataset access patterns, long-running queries, and license usage. It’s intended for super-admins, on-call engineers, and capacity planners. End users do not see this page.
The System Monitoring page is backed by the same Prometheus instance you can connect Grafana to. See the Grafana integration guide if you want the same data alongside metrics from systems outside CloudQuant.
Opening system monitoring
- Sign in to the Liberator UI as a super-admin.
- Click System Monitoring in the top navigation.
Tabs
System Monitoring is organized into five tabs. The first four are backed by Prometheus and share a global time-range selector (6h / 24h) in the upper-right. The fifth is backed by the entitlements database and uses its own dedicated 1d / 1w / 1m / 1y selector.
Cluster
Real-time cluster health from the Liberator gateway, application pods, and host nodes. What you’ll see:
- Gateway request rate and latency percentiles (p50/p95/p99)
- Per-pod CPU and memory utilization for Liberator components
- Per-node CPU, memory, and filesystem utilization
- Data-cache worker pool status
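
To make the latency percentiles concrete, here is a minimal sketch of how p50/p95/p99 can be read off a set of request latencies using the nearest-rank method. The function name and the sample data are illustrative assumptions; the values on the Cluster tab come from Prometheus, and this page does not specify how they are computed there.

```python
import math


def nearest_rank_percentile(samples, pct):
    """Return the pct-th percentile of samples using the nearest-rank method."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    # 1-based rank of the smallest value that covers pct percent of the samples.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# Hypothetical request latencies in milliseconds (not real Liberator data).
latencies_ms = list(range(1, 101))

summary = {f"p{p}": nearest_rank_percentile(latencies_ms, p) for p in (50, 95, 99)}
```

A widening gap between p50 and p99 usually means a small share of requests is much slower than the typical one, which is why the tab reports all three.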
Queue
The Liberator waiting room queue: how many requests are queued, how long they’ve been waiting, and which users own them. What you’ll see:
- Active connections (in flight) and queued connections (waiting)
- Per-user breakdown of queue occupancy
- Maximum in-flight query duration (a useful early-warning signal)
The CQAIOps service account often appears as a high-volume user; that reflects automated platform monitoring and is expected.
Use this tab to answer: “Is anyone being blocked, and by whom?”
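
The per-user breakdown can be pictured as a simple grouping over a snapshot of the waiting room. The sketch below assumes a hypothetical snapshot of `(user, state, seconds)` tuples; the field names, states, and values are made up for illustration and are not the Liberator schema.

```python
from collections import Counter

# Hypothetical snapshot: (user, state, seconds in that state).
snapshot = [
    ("alice", "active", 12.0),
    ("alice", "queued", 40.5),
    ("bob", "queued", 3.2),
    ("cqaiops", "active", 1.1),
    ("alice", "active", 95.0),
]


def occupancy_by_user(entries):
    """Count active (in-flight) and queued connections per user."""
    active = Counter()
    queued = Counter()
    for user, state, _seconds in entries:
        if state == "active":
            active[user] += 1
        elif state == "queued":
            queued[user] += 1
    return active, queued


def longest_in_flight(entries):
    """Return the longest duration among active entries, or 0.0 if none."""
    return max((secs for _user, state, secs in entries if state == "active"), default=0.0)


active, queued = occupancy_by_user(snapshot)
```

In this made-up snapshot, alice holds two in-flight slots (one of them running for 95 seconds) while bob waits, which is the kind of pattern the tab is meant to reveal.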
Datasets
Dataset access patterns over the selected time range. What you’ll see:
- Top datasets by query count
- Top datasets by bytes returned
- Distribution of access by client (Python, REST, Excel, etc.)
Long queries
The slowest individual queries in the selected window. What you’ll see:
- A ranked list of queries with execution time, user, dataset, and from/to window
- Click-through to see the full query text and result-set size
Usage
License utilization from the entitlements database. Distinct from the other tabs in two ways:
- Different selector. This tab exposes 1d / 1w / 1m / 1y windows instead of the Prometheus 6h / 24h, because license utilization is measured against per-contract caps that operate on much longer windows.
- Different backing store. Numbers come from the entitlements database, not Prometheus, so they survive Prometheus retention rollovers and reflect contract truth.
What you’ll see:
- Active vs. licensed seat count, by license tier
- Per-dataset utilization vs. contract caps
- Trend lines that make it easy to spot accounts approaching their limits
- Most queried datasets — use this to prioritize cache pre-generation
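
As a rough illustration of comparing usage against caps, the sketch below flags accounts whose utilization is at or above a threshold. The account names, seat counts, and the 0.8 threshold are all hypothetical; the Usage tab's own numbers come from the entitlements database.

```python
def utilization(used, cap):
    """Return used/cap as a fraction; cap must be positive."""
    if cap <= 0:
        raise ValueError("cap must be positive")
    return used / cap


def approaching_limit(usage_by_account, threshold=0.8):
    """Return accounts at or above threshold utilization, highest first."""
    ratios = {
        name: utilization(used, cap)
        for name, (used, cap) in usage_by_account.items()
    }
    flagged = [name for name, ratio in ratios.items() if ratio >= threshold]
    return sorted(flagged, key=lambda name: ratios[name], reverse=True)


# Hypothetical (active seats, licensed seats) per account.
seats = {"acme": (45, 50), "globex": (10, 40), "initech": (20, 20)}
flagged = approaching_limit(seats)
```

The same comparison applies to per-dataset utilization against contract caps; only the units change.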
The 1d / 1w selector labels on this tab may still reflect monthly aggregation in the backing entitlements store in some releases. Treat long-window utilization as directional until label semantics match the aggregation period in your environment.
Grafana integration
The action in the upper-right of every tab opens the Grafana Integration dialog. Super-admins can use it to:
- Issue Bearer tokens for external Prometheus consumers (the full token is shown exactly once at creation, so copy it immediately).
- List existing tokens with their issue time and issuing user.
- Revoke tokens that are no longer needed or may have leaked.
How the data flows
Two routes (/metrics-api/* for the in-product UI and /metrics-api-bearer/* for external consumers) expose the same read-only subset of the Prometheus HTTP API. The OIDC-fronted route is what the in-product tabs use; the Bearer-fronted route is what Grafana and federated Prometheus servers use.
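
For an external consumer, a request through the Bearer-fronted route might look like the sketch below. It assumes that the standard Prometheus instant-query path (`/api/v1/query`) sits directly under the `/metrics-api-bearer/` prefix and is part of the exposed read-only subset; neither is confirmed here, so check your deployment. The host name, token, and function name are placeholders.

```python
import urllib.parse
import urllib.request


def build_query_request(base_url, token, promql):
    """Build (but do not send) a GET request for a Prometheus instant query."""
    # Assumption: the Prometheus HTTP API path is appended directly to the
    # /metrics-api-bearer/ prefix. Confirm the real path in your deployment.
    query = urllib.parse.urlencode({"query": promql})
    url = f"{base_url.rstrip('/')}/metrics-api-bearer/api/v1/query?{query}"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


# Placeholder host and token; "up" is a standard PromQL expression.
request = build_query_request("https://liberator.example.com", "YOUR_TOKEN", "up")
# To send it: urllib.request.urlopen(request, timeout=10), then JSON-decode the body.
```

Building the request separately from sending it keeps the token handling easy to inspect before any traffic leaves the machine.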
