SuperQuery

SuperQuery lets you query multiple CloudQuant Data Liberator datasets simultaneously and resample the resulting pandas DataFrames onto a common time axis.

Initial Setup

import liberator

symbols = ['TSLA']

as_of = '2024-06-07'

back_to = '2024-06-04'

# %time is an IPython/Jupyter magic that reports how long the call takes
%time df1 = liberator.get_dataframe(
    liberator.query(
        symbols=symbols,
        name='minute_bars',
        as_of=as_of,
        back_to=back_to
    )
)

liberator.get_dataframe(
    liberator.query(
        name='daily_bars',
        symbols=symbols,
        as_of=as_of,
        back_to=back_to
    )
)

These two calls load the datasets independently: one DataFrame of minute bars and one of daily bars.

Querying Multiple Datasets Together

SuperQuery lets you specify how multiple datasets are reindexed and merged together using the superq_resample_rule parameter.
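
To make the idea of reindexing onto a common time axis concrete, here is a minimal pandas-only sketch. It does not call Liberator and is not a description of how SuperQuery is implemented: the data is synthetic, the column names `minute_close` and `daily_close` are made up, and `resample_to_common_axis` is a hypothetical helper that assumes the last observation in each bucket is the one to keep.

```python
import numpy as np
import pandas as pd

# Synthetic stand-ins for the two datasets; these are NOT Liberator results.
days = pd.to_datetime(["2024-06-04", "2024-06-05", "2024-06-06"])
minute_index = pd.DatetimeIndex(
    [d + pd.Timedelta(hours=9, minutes=30 + m) for d in days for m in range(390)]
)
minute_bars = pd.DataFrame(
    {"minute_close": np.arange(len(minute_index), dtype=float)}, index=minute_index
)
daily_bars = pd.DataFrame({"daily_close": [10.0, 11.0, 12.0]}, index=days)


def resample_to_common_axis(frames, rule):
    """Hypothetical helper: keep the last row per bucket, then join on the shared index."""
    resampled = [frame.resample(rule).last() for frame in frames]
    return pd.concat(resampled, axis=1)


# One row per day, holding the daily bar next to the final minute bar of that day.
merged = resample_to_common_axis([daily_bars, minute_bars], "1D")
print(merged)
```

With a daily rule, the three synthetic days collapse to three rows, which mirrors the shape described for the 1D case in this document.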

Daily (1D) Resampling

For a three-day query resampled at daily frequency, the result contains three rows, each combining that day's daily bar with the final minute bar of the day:

df = liberator.get_dataframe(
    liberator.query(
        symbols=symbols,
        name=['daily_bars', 'minute_bars'],
        as_of=as_of,
        back_to=back_to,
        superq_periods_per_batch_override=5,
        superq_resample_rule='1D'
    )
)

df

Hourly (60T) Resampling

For a three-day query resampled at 60-minute frequency, the result contains 24 rows per day. The daily bar data is timestamped at 8 pm:

%time df4 = liberator.get_dataframe(
    liberator.query(
        symbols=symbols,
        name=['daily_bars', 'minute_bars'],
        as_of=as_of,
        back_to=back_to,
        superq_periods_per_batch_override=5,
        superq_resample_rule='60T'
    )
)

df4
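
The row count and the 8 pm placement described above can be illustrated with plain pandas. This is only a sketch on synthetic data: the `daily_close` column is made up, and how SuperQuery populates the remaining hourly rows is not stated here, so the example simply leaves them empty.

```python
import pandas as pd

# A 60-minute grid covering one calendar day: 24 buckets, 00:00 through 23:00.
hourly_axis = pd.date_range("2024-06-04", periods=24, freq="60min")

# Synthetic daily bar stamped at 8 pm (20:00), as described for the hourly case.
daily_bar = pd.DataFrame(
    {"daily_close": [101.0]}, index=[pd.Timestamp("2024-06-04 20:00")]
)

# Align the daily bar to the hourly grid; hours without a daily value become NaN.
aligned = daily_bar.reindex(hourly_axis)
populated = aligned.dropna()
print(len(aligned), populated.index[0])
```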

The superq_resample_rule Parameter

The superq_resample_rule parameter uses pandas “Offset Aliases” to define the resampling frequency. Common values:

Rule    Frequency
1T      1 minute
5T      5 minutes
15T     15 minutes
60T     60 minutes
1D      1 day

For a complete list of offset aliases, see the pandas time series offset aliases documentation.
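
As a quick way to check what a rule means before using it in a query, pandas can parse an alias into an offset object. The sketch below uses the `min` spelling because pandas 2.2 deprecated the `T` alias in favor of `min`; whether Liberator accepts `min` in superq_resample_rule is not stated here, so treat that as an assumption and keep the `T` forms shown above when calling Liberator.

```python
import pandas as pd
from pandas.tseries.frequencies import to_offset

start = pd.Timestamp("2024-06-04")

# Pandas-side spellings of the rules in the table above ("T" written as "min").
rules = ["1min", "5min", "15min", "60min", "1D"]

# Adding the parsed offset to a timestamp shows where the next bucket begins.
next_boundary = {rule: start + to_offset(rule) for rule in rules}

for rule, boundary in next_boundary.items():
    print(rule, boundary)
```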