Cache pre-generation
By default, Liberator builds its cache the first time a query runs (cache warming). For large datasets or slow source databases, that first query can take a long time. Pre-generated cache lets Super Admins schedule cache creation ahead of time so data is ready when users query it. Cache files are written in Parquet format (replacing the older Arrow cache format) and can be stored in Amazon S3 or Google Cloud Storage, so cached data does not need to live on the Liberator server. Parquet is typically much more compact than the previous Arrow caches, which reduces storage cost at scale.
The Super Admin role is required to configure cache storage connections and pre-generation settings.
Step 1 — Create a cache storage connection
Before enabling pre-generated cache on a dataset, create a cache storage destination.
Configure storage
Select your storage provider (S3 is supported today; confirm GCS availability with your CloudQuant account team if needed). Enter the bucket name and credentials.
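Before entering a bucket name, it can help to check it against the general Amazon S3 naming rules. The following is a minimal sketch covering only a subset of those rules (length, allowed characters, no consecutive dots, not shaped like an IP address). The function name and the sample bucket names are hypothetical, and this is not part of Liberator; it does not confirm that the bucket exists or that the credentials can reach it.

```python
import re

# Subset of the general-purpose S3 bucket naming rules:
# 3-63 characters, lowercase letters, digits, dots and hyphens,
# starting and ending with a letter or digit.
_BUCKET_PATTERN = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")
_IP_LIKE_PATTERN = re.compile(r"\d{1,3}(\.\d{1,3}){3}")


def looks_like_valid_s3_bucket_name(name: str) -> bool:
    """Return True if `name` passes the basic S3 naming checks above."""
    if _BUCKET_PATTERN.fullmatch(name) is None:
        return False
    if ".." in name:
        # Consecutive dots are not allowed.
        return False
    if _IP_LIKE_PATTERN.fullmatch(name) is not None:
        # Names formatted like an IPv4 address are not allowed.
        return False
    return True


if __name__ == "__main__":
    # Hypothetical bucket names, for illustration only.
    for candidate in ["liberator-cache-prod", "Liberator_Cache", "192.168.1.10"]:
        print(candidate, looks_like_valid_s3_bucket_name(candidate))
```

A check like this only catches typos early; the storage connection itself remains the real test of bucket access.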
Step 2 — Enable pre-generated cache on a dataset
Format and destination
Select Parquet as the cache format (recommended). Under Destination, select the cache storage connection from Step 1.
Step 3 — Configure the cache window
| Option | Use when |
|---|---|
| Rolling window | Users mostly query recent data; keeps the last N days cached and advances automatically. |
| Fixed date range | You need a known historical slice that does not change. |
| Full dataset | You want complete coverage and the dataset is bounded in size. |
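To make the three options concrete, the sketch below shows how each one could translate into a date range to cache. It is illustrative only: `CacheWindow`, `resolve_window`, and the mode names are hypothetical and do not reflect Liberator's internal implementation, and it assumes a rolling window of N days includes the current day.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional, Tuple


@dataclass(frozen=True)
class CacheWindow:
    """Hypothetical description of a cache window setting."""
    mode: str                     # "rolling", "fixed", or "full"
    days: Optional[int] = None    # used by "rolling"
    start: Optional[date] = None  # used by "fixed"
    end: Optional[date] = None    # used by "fixed"


def resolve_window(window: CacheWindow, today: date,
                   dataset_start: date) -> Tuple[date, date]:
    """Return the inclusive (start, end) dates the window covers."""
    if window.mode == "rolling":
        if window.days is None or window.days < 1:
            raise ValueError("rolling window needs days >= 1")
        # Assumption: the last N days include today, so the range
        # moves forward by one day each time it is re-evaluated.
        return today - timedelta(days=window.days - 1), today
    if window.mode == "fixed":
        if window.start is None or window.end is None:
            raise ValueError("fixed range needs start and end")
        if window.start > window.end:
            raise ValueError("start must not be after end")
        # The range does not depend on today's date.
        return window.start, window.end
    if window.mode == "full":
        # Everything from the first available date up to today.
        return dataset_start, today
    raise ValueError(f"unknown mode: {window.mode}")


if __name__ == "__main__":
    rolling = CacheWindow(mode="rolling", days=30)
    print(resolve_window(rolling, date(2024, 3, 31), date(2020, 1, 1)))
```

The rolling case is the only one whose result changes from day to day, which is why it suits datasets where users mostly query recent data.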
Step 4 — Run or schedule generation
Scheduled: Once a schedule is configured on the dataset, cache generation runs automatically. No further action is required.
Manual: Open the dataset and click Trigger Cache Pre-Generation. A status indicator shows whether the job is in progress or complete.
What users experience
After pre-generated cache is populated, queries against that dataset are much faster because Liberator reads the warm cache instead of the source system. This matters most for large SQL-backed datasets where the first cold query previously took minutes.
Tips
- Use rolling window for datasets queried over recent windows (for example, the last 30 days of market data).
- Use fixed date range for historical snapshots that do not change.
- Use full dataset only when storage budget and dataset size are understood.
- The Most queried datasets view on System Monitoring helps prioritize which datasets to pre-generate.
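One possible way to turn query counts into a priority order is to weight each dataset's query volume by how long its cold query takes, so that frequently queried, slow datasets come first. The sketch below is a heuristic only: `DatasetStats`, `rank_for_pregeneration`, and the sample figures are hypothetical, and the numbers would need to be gathered by hand, since this does not assume any particular export from System Monitoring.

```python
from typing import List, NamedTuple


class DatasetStats(NamedTuple):
    """Hypothetical per-dataset figures collected manually."""
    name: str
    queries_per_day: float
    cold_query_seconds: float


def rank_for_pregeneration(stats: List[DatasetStats]) -> List[str]:
    """Order dataset names by estimated daily seconds spent on cold queries."""
    ranked = sorted(
        stats,
        key=lambda s: s.queries_per_day * s.cold_query_seconds,
        reverse=True,  # largest estimated wait first
    )
    return [s.name for s in ranked]


if __name__ == "__main__":
    # Made-up sample figures, for illustration only.
    sample = [
        DatasetStats("daily_bars", queries_per_day=100, cold_query_seconds=2),
        DatasetStats("tick_history", queries_per_day=10, cold_query_seconds=120),
        DatasetStats("reference_data", queries_per_day=50, cold_query_seconds=10),
    ]
    print(rank_for_pregeneration(sample))
```

The product of the two figures overstates the benefit for datasets whose cache stays warm after the first query, so treat the ranking as a starting point rather than a measurement.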

