
Cache pre-generation

By default, Liberator builds a dataset's cache the first time a query runs against it (cache warming). For large datasets or slow source databases, that first query can take a long time. Pre-generated cache lets Super Admins schedule cache creation ahead of time so the data is ready when users query it.
Cache files are written in Parquet format, which replaces the older Arrow cache format, and can be stored in object storage such as Amazon S3 or Google Cloud Storage (see Step 1 for current provider availability), so cached data does not need to live on the Liberator server. Parquet is typically much more compact than the previous Arrow caches, which reduces storage cost at scale.
Super Admin role is required to configure cache storage connections and pre-generation settings.
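
The difference between cache warming and pre-generation can be illustrated with a toy model. The Python sketch below is not Liberator code: the class `ToyDatasetCache`, its methods, and `load_rows` are hypothetical names invented for illustration, and the "source" is a plain function standing in for a slow database.

```python
class ToyDatasetCache:
    """Toy model only: a cache built either on first query or ahead of time."""

    def __init__(self, load_from_source):
        self._load_from_source = load_from_source  # stands in for the slow source system
        self._cache = None
        self.source_reads = 0

    def _build(self):
        self._cache = self._load_from_source()
        self.source_reads += 1

    def pregenerate(self):
        """Build the cache before any user query (what a scheduled job would do)."""
        if self._cache is None:
            self._build()

    def query(self):
        """Return (rows, served_from_cache)."""
        served_from_cache = self._cache is not None
        if not served_from_cache:
            self._build()  # cache warming: the first query pays the build cost
        return self._cache, served_from_cache


def load_rows():
    # Hypothetical source read; a real source would be a slow SQL query.
    return list(range(5))


on_demand = ToyDatasetCache(load_rows)
_, first_on_demand = on_demand.query()
_, second_on_demand = on_demand.query()

pregenerated = ToyDatasetCache(load_rows)
pregenerated.pregenerate()
_, first_pregenerated = pregenerated.query()

print(first_on_demand, second_on_demand, first_pregenerated)  # False True True
```

In the on-demand case the first user query triggers the build; with pre-generation the build has already happened, so the first user query is served from cache.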

Step 1 — Create a cache storage connection

Before enabling pre-generated cache on a dataset, create a cache storage destination.
1. Add a connection
   Go to Connections → Add Connection.

2. Select cache storage
   Select Cache Storage as the connection type.

3. Configure storage
   Select your storage provider (S3 is supported today; confirm GCS availability with your CloudQuant account team if needed). Enter the bucket name and credentials.

4. Test and save
   Click Test Connection, then Save.
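
Before entering a bucket name in the connection form, it can help to check it against Amazon S3's general naming rules. The sketch below covers only a subset of those rules (length, allowed characters, adjacent dots, IP-address form) and applies to S3 only. The function name `bucket_name_problems` and the sample bucket names are hypothetical; Liberator's own validation, if any, may differ.

```python
import re

# Subset of the S3 general-purpose bucket naming rules:
# 3-63 characters; lowercase letters, digits, dots, hyphens;
# must start and end with a letter or digit.
_BUCKET_SHAPE = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")
_IPV4_SHAPE = re.compile(r"\d{1,3}(\.\d{1,3}){3}")


def bucket_name_problems(name: str) -> list[str]:
    """Return a list of human-readable problems; an empty list means no problem found."""
    problems = []
    if not _BUCKET_SHAPE.fullmatch(name):
        problems.append(
            "must be 3-63 characters of lowercase letters, digits, dots, or "
            "hyphens, starting and ending with a letter or digit"
        )
    if ".." in name:
        problems.append("must not contain two adjacent dots")
    if _IPV4_SHAPE.fullmatch(name):
        problems.append("must not be formatted like an IP address")
    return problems


# Hypothetical bucket names, for illustration only.
for candidate in ["liberator-cache-prod", "My_Bucket", "192.168.1.1"]:
    print(candidate, bucket_name_problems(candidate) or "ok")
```

A clean result here does not replace Test Connection, which also exercises the credentials and permissions.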

Step 2 — Enable pre-generated cache on a dataset

1. Open the dataset
   Open the dataset and click Edit.

2. Advanced options
   Scroll to Advanced Options and toggle on Pre-Generated Cache.

3. Format and destination
   Select Parquet as the cache format (recommended). Under Destination, select the cache storage connection from Step 1.

4. Test destination
   Click Test next to the destination to confirm connectivity, then save.

Step 3 — Configure the cache window

| Option | Use when |
| --- | --- |
| Rolling window | Users mostly query recent data; keeps the last N days cached and advances automatically. |
| Fixed date range | You need a known historical slice that does not change. |
| Full dataset | You want complete coverage and the dataset is bounded in size. |

Enter the number of days (rolling) or the start/end dates (fixed), then save.
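
To make the three window options concrete, the following Python sketch resolves each one to a date range. It is an illustration under stated assumptions, not Liberator's implementation: the function `resolve_cache_window` and the mode strings are hypothetical, and it assumes "last N days" includes the current day.

```python
from datetime import date, timedelta


def resolve_cache_window(mode, today, days=None, start=None, end=None):
    """Return inclusive (start, end) dates to cache, or (None, None) for the full dataset."""
    if mode == "rolling":
        if days is None or days < 1:
            raise ValueError("rolling window needs days >= 1")
        # Assumption: the window includes today, so N days span today - (N - 1) .. today.
        return today - timedelta(days=days - 1), today
    if mode == "fixed":
        if start is None or end is None or start > end:
            raise ValueError("fixed range needs start <= end")
        return start, end
    if mode == "full":
        return None, None  # no date bounds: cache everything
    raise ValueError(f"unknown mode: {mode!r}")


# A 30-day rolling window advances by one day when the date advances.
march_31 = resolve_cache_window("rolling", date(2024, 3, 31), days=30)
april_1 = resolve_cache_window("rolling", date(2024, 4, 1), days=30)
print(march_31)  # (datetime.date(2024, 3, 2), datetime.date(2024, 3, 31))
print(april_1)   # (datetime.date(2024, 3, 3), datetime.date(2024, 4, 1))
```

The fixed range returns the same dates no matter when it is evaluated, which is why it suits historical slices that do not change.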
Long or open-ended retention periods trigger a storage impact warning in the UI. Review the projected volume before saving: a large window can accumulate a significant volume of Parquet files in object storage over time.
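
A rough way to review projected volume is to multiply the average Parquet output per day by the number of days retained. The sketch below does only that arithmetic. The function names are hypothetical, and the 2 GiB per day figure is an invented input, not a measurement; substitute a size observed from your own dataset.

```python
def projected_cache_bytes(bytes_per_day: int, days_retained: int) -> int:
    """Steady-state size of a window that keeps `days_retained` days of cache."""
    return bytes_per_day * days_retained


def human_size(num_bytes: float) -> str:
    """Format a byte count using binary units (KiB, MiB, ...)."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]
    size = float(num_bytes)
    for unit in units:
        if size < 1024 or unit == units[-1]:
            return f"{size:.1f} {unit}"
        size /= 1024


GIB = 1024 ** 3
daily = 2 * GIB  # hypothetical: 2 GiB of Parquet written per day

for days in (30, 365):
    print(days, "days:", human_size(projected_cache_bytes(daily, days)))
# 30 days: 60.0 GiB
# 365 days: 730.0 GiB
```

A rolling window levels off at this size, while an open-ended retention period keeps growing by the daily amount.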

Step 4 — Run or schedule generation

  • Scheduled: Once a schedule is configured on the dataset, cache generation runs automatically. No further action is required.
  • Manual: Open the dataset and click Trigger Cache Pre-Generation. A status indicator shows whether the job is in progress or complete.

What users experience

After pre-generated cache is populated, queries against that dataset are much faster because Liberator reads the warm cache instead of the source system. This matters most for large SQL-backed datasets where the first cold query previously took minutes.

Tips

  • Use a rolling window for datasets that are mostly queried for recent data (for example, the last 30 days of market data).
  • Use a fixed date range for historical snapshots that do not change.
  • Use the full dataset only when the storage budget and dataset size are understood.
  • The Most queried datasets view on System Monitoring helps prioritize which datasets to pre-generate.
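
One possible way to turn the prioritization tip into a ranking is to weight how often a dataset is queried by how slow its cold query is. This is only a heuristic sketch: the `DatasetStats` fields, the scoring formula, and the sample figures are all hypothetical, and the numbers would have to be read manually from System Monitoring and your own timing observations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetStats:
    name: str
    queries_per_week: int      # hypothetical: taken from the Most queried datasets view
    cold_query_seconds: float  # hypothetical: observed time of a first, uncached query


def rank_for_pregeneration(stats):
    """Sort datasets so the largest total cold-query cost comes first."""
    return sorted(
        stats,
        key=lambda s: s.queries_per_week * s.cold_query_seconds,
        reverse=True,
    )


# Invented sample figures, for illustration only.
candidates = [
    DatasetStats("trades", queries_per_week=120, cold_query_seconds=90.0),
    DatasetStats("reference", queries_per_week=400, cold_query_seconds=2.0),
    DatasetStats("quotes", queries_per_week=60, cold_query_seconds=300.0),
]

ranked_names = [s.name for s in rank_for_pregeneration(candidates)]
print(ranked_names)  # ['quotes', 'trades', 'reference']
```

Note that the most queried dataset ranks last here because its cold query is already fast, so query count alone is not always the best guide.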