# System Monitoring Overview

> Built-in observability for the CloudQuant Data Liberator stack, with a tour of each tab and when to look at it

# System monitoring

The **System Monitoring** page in the Liberator UI surfaces the most operationally relevant signals from the Liberator stack in a single place: cluster health, queue depth, dataset access patterns, long-running queries, and license usage.

It's intended for super-admins, on-call engineers, and capacity planners. End users do not see this page.

<Note>
  The System Monitoring page is backed by the same Prometheus instance you can connect Grafana to. See the [Grafana](/integrations/grafana) integration guide if you want the same data alongside metrics from systems outside CloudQuant.
</Note>

## Opening system monitoring

1. Sign in to the Liberator UI as a super-admin.
2. Click **System Monitoring** in the top navigation.

If the menu item doesn't appear, your account doesn't have super-admin privileges. Contact your CloudQuant administrator.

## Tabs

System Monitoring is organized into five tabs. The first four are backed by Prometheus and share a global **time-range selector** (`6h` / `24h`) in the upper-right. The fifth is backed by the entitlements database and uses its own dedicated `1d / 1w / 1m / 1y` selector.
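As a rough illustration of what the global selector implies for the Prometheus-backed tabs, the sketch below turns a `6h` or `24h` window into the `start`, `end`, and `step` parameters that a Prometheus range query accepts. The `range_params` helper and the target of 360 points per series are assumptions made for this example, not documented Liberator behavior.

```python
from datetime import datetime, timedelta, timezone

# Windows offered by the global selector on the Prometheus-backed tabs.
WINDOWS = {"6h": timedelta(hours=6), "24h": timedelta(hours=24)}


def range_params(window: str, now: datetime, points: int = 360) -> dict:
    """Build start/end/step values for a Prometheus range query.

    `points` (target samples per series) is an illustrative assumption,
    not a documented Liberator setting.
    """
    span = WINDOWS[window]
    step_seconds = max(1, int(span.total_seconds() // points))
    return {
        "start": (now - span).timestamp(),  # Unix seconds
        "end": now.timestamp(),
        "step": f"{step_seconds}s",
    }


now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
print(range_params("6h", now)["step"])   # 60s
print(range_params("24h", now)["step"])  # 240s
```

Keeping the point count fixed means the longer window is simply sampled more coarsely, which is one common way to keep chart payloads a similar size across windows.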

### Cluster

Real-time cluster health from the Liberator gateway, application pods, and host nodes.

What you'll see:

* Gateway request rate and latency percentiles (`p50` / `p95` / `p99`)
* Per-pod CPU and memory utilization for Liberator components
* Per-node CPU, memory, and filesystem utilization
* Data-cache worker pool status

<Warning>
  Watch **volume / disk usage** on mounted filesystems. Sustained usage at or above roughly **85%** on a volume warrants immediate attention to avoid query failures from insufficient write space.
</Warning>

Use this tab to answer: *"Is the cluster behaving normally right now, and if not, where is the problem?"*
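The disk-usage warning above can be turned into a simple check. This is a minimal sketch, assuming you have already pulled per-volume usage fractions for the selected window (for example from Prometheus). The mount points and sample values are hypothetical, and "sustained" is interpreted here as every sample in the window being at or above the threshold.

```python
THRESHOLD = 0.85  # the roughly-85% level called out in the warning


def sustained_over(samples: list[float], threshold: float = THRESHOLD) -> bool:
    """True when the window is non-empty and every sample is >= threshold."""
    return bool(samples) and all(s >= threshold for s in samples)


# Hypothetical usage fractions per mount point over the selected window.
usage = {
    "/data": [0.86, 0.88, 0.91],
    "/cache": [0.70, 0.90, 0.72],  # a brief spike, not sustained
}

flagged = [mount for mount, samples in usage.items() if sustained_over(samples)]
print(flagged)  # ['/data']
```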

### Queue

Shows the state of the Liberator **waiting room** queue: how many requests are queued, how long they have been waiting, and which users own them.

What you'll see:

* Active connections (in flight) and queued connections (waiting)
* Per-user breakdown of queue occupancy
* Maximum in-flight query duration (a useful early-warning signal)

The `CQAIOps` service account often appears as a high-volume user; that reflects automated platform monitoring and is expected.

Use this tab to answer: *"Is anyone being blocked, and by whom?"*
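If you export a snapshot of the queue, the per-user breakdown is easy to reproduce offline. The sketch below assumes a hypothetical list of `(user, seconds_waited)` pairs and skips the `CQAIOps` service account, since its traffic is expected. Both the data shape and the decision to filter that account are choices made for this example, not a description of how the tab computes its view.

```python
from collections import Counter

# Accounts whose queue traffic is expected (automated platform monitoring).
EXPECTED_ACCOUNTS = {"CQAIOps"}


def queue_breakdown(queued: list[tuple[str, float]]) -> list[tuple[str, int, float]]:
    """Return (user, queued_count, longest_wait_seconds), busiest user first."""
    counts: Counter = Counter()
    longest: dict[str, float] = {}
    for user, waited in queued:
        if user in EXPECTED_ACCOUNTS:
            continue
        counts[user] += 1
        longest[user] = max(longest.get(user, 0.0), waited)
    return [(user, n, longest[user]) for user, n in counts.most_common()]


# Hypothetical snapshot of queued requests.
snapshot = [("alice", 12.0), ("CQAIOps", 1.5), ("bob", 40.0), ("alice", 95.0)]
print(queue_breakdown(snapshot))  # [('alice', 2, 95.0), ('bob', 1, 40.0)]
```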

### Datasets

Dataset access patterns over the selected time range.

What you'll see:

* Top datasets by query count
* Top datasets by bytes returned
* Distribution of access by client (Python, REST, Excel, etc.)

Use this tab for capacity planning and detecting anomalous access patterns (e.g. a previously dormant dataset suddenly receiving heavy traffic).
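The "dormant dataset suddenly receiving heavy traffic" pattern can be approximated by comparing query counts from two windows. This is a sketch under stated assumptions: the dataset names are made up, and the `dormant_max` and `surge_min` thresholds are illustrative values you would tune for your own environment.

```python
def newly_active(
    baseline: dict[str, int],
    current: dict[str, int],
    dormant_max: int = 5,
    surge_min: int = 500,
) -> list[str]:
    """Datasets that were near-idle in the baseline window but are busy now."""
    return sorted(
        name
        for name, count in current.items()
        if baseline.get(name, 0) <= dormant_max and count >= surge_min
    )


# Hypothetical query counts per dataset for a prior and a current window.
baseline = {"daily_bars": 1200, "legacy_ticks": 2}
current = {"daily_bars": 1350, "legacy_ticks": 800, "new_feed": 10}
print(newly_active(baseline, current))  # ['legacy_ticks']
```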

### Long queries

The slowest individual queries in the selected window.

What you'll see:

* A ranked list of queries with execution time, user, dataset, and `from` / `to` window
* Click-through to see the full query text and result-set size

Use this tab to find candidates for query rewriting, dataset re-partitioning, or user education.

### Usage

License utilization from the entitlements database. This tab differs from the other four in two ways:

1. **Different selector.** This tab exposes `1d / 1w / 1m / 1y` windows instead of the Prometheus `6h / 24h`, because license utilization is measured against per-contract caps that operate on much longer windows.
2. **Different backing store.** Numbers come from the entitlements database, not Prometheus, so they survive Prometheus retention rollovers and reflect contract truth.

What you'll see:

* Active vs. licensed seat count, by license tier
* Per-dataset utilization vs. contract caps
* Trend lines that make it easy to spot accounts approaching their limits
* **Most queried datasets**, which you can use to prioritize [cache pre-generation](/administration/cache-pre-generation)

<Note>
  The `1d` / `1w` selector labels on this tab may still reflect monthly aggregation in the backing entitlements store in some releases. Treat long-window utilization as directional until label semantics match the aggregation period in your environment.
</Note>
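To make "approaching their limits" concrete, the sketch below compares active seats against licensed seats per tier and reports the tiers at or above a warning ratio. The tier names, seat counts, and 90% threshold are hypothetical. The tab itself reads these figures from the entitlements database, whose schema is not described here.

```python
def approaching_limit(
    active: dict[str, int],
    licensed: dict[str, int],
    warn_at: float = 0.9,
) -> dict[str, float]:
    """Map each tier at or above `warn_at` utilization to its ratio."""
    flagged = {}
    for tier, cap in licensed.items():
        if cap <= 0:
            continue  # skip tiers with no seats under contract
        ratio = active.get(tier, 0) / cap
        if ratio >= warn_at:
            flagged[tier] = round(ratio, 2)
    return flagged


# Hypothetical seat counts by license tier.
active_seats = {"standard": 46, "premium": 3}
licensed_seats = {"standard": 50, "premium": 10}
print(approaching_limit(active_seats, licensed_seats))  # {'standard': 0.92}
```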

## Grafana integration

The action in the upper-right of every tab opens the **Grafana Integration** dialog. Super-admins can use it to:

* **Issue Bearer tokens** for external Prometheus consumers (the full token is shown exactly once at creation, so copy it immediately).
* **List existing tokens** with their issue time and issuing user.
* **Revoke tokens** that are no longer needed or may have leaked.

See the full setup walkthrough in the [Grafana integration guide](/integrations/grafana).

## How the data flows

```
┌────────────────────────────────────────────────────────────────────┐
│  Liberator UI                                                      │
│  ┌────────────────────────┐    ┌──────────────────────────────┐    │
│  │  Cluster / Queue /     │    │  Usage tab                   │    │
│  │  Datasets / Long Q     │    │                              │    │
│  └──┬─────────────────────┘    └──┬───────────────────────────┘    │
└─────┼────────────────────────────┼────────────────────────────────┘
      │ /metrics-api/* (OIDC cookie)│ /admin-api/entitlements/*
      ▼                             ▼
┌──────────────┐              ┌──────────────────┐
│  Prometheus  │              │  Entitlements DB │
│  (read-only) │              │  (PostgreSQL)    │
└──────────────┘              └──────────────────┘
      ▲
      │ /metrics-api-bearer/* (Bearer token)
      │
┌──────────────┐
│  External    │
│  Grafana,    │
│  Federation  │
└──────────────┘
```

Both endpoints that front Prometheus (`/metrics-api/*` for the in-product UI and `/metrics-api-bearer/*` for external consumers) expose the same read-only subset of the Prometheus HTTP API. They differ in how callers authenticate: the in-product tabs use the OIDC session cookie, while Grafana and federated Prometheus servers present a Bearer token.
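For an external consumer, a request against the Bearer route might be prepared as in the sketch below. Several details are assumptions: the host name is a placeholder, the `bearer_query` helper is hypothetical, and the example presumes that the standard Prometheus instant-query path (`/api/v1/query`) is part of the exposed read-only subset. The request is only constructed, not sent.

```python
import urllib.parse
import urllib.request


def bearer_query(base_url: str, token: str, promql: str) -> urllib.request.Request:
    """Prepare (but do not send) an instant query against the Bearer route."""
    query_string = urllib.parse.urlencode({"query": promql})
    # Assumed path: Bearer route prefix + Prometheus instant-query endpoint.
    url = f"{base_url.rstrip('/')}/metrics-api-bearer/api/v1/query?{query_string}"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


# Placeholder host and token; substitute your own deployment's values.
req = bearer_query("https://liberator.example.com", "REDACTED", "up")
print(req.get_method(), req.full_url)
```

To actually send it, pass `req` to `urllib.request.urlopen` and parse the JSON body. Load the token from a secret store rather than hard-coding it.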
