Selecting Specific Columns

CloudQuant Data Liberator automatically returns a minimum default set of columns for each dataset, which varies but typically includes _seq, muts, timestamp, and symbol. Since some datasets contain hundreds of columns, you may want to limit results to specific fields, especially on slower connections.

Filtering columns with the fields parameter

To reduce returned columns, pass the fields parameter with your desired column list:

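import liberator

# Request only the open and close columns for two symbols over three days.
# Field names are spelled as they appear in df.columns for this dataset.
df = liberator.get_dataframe(liberator.query(
    name = 'daily_bars',
    as_of = '2024-07-24',
    back_to = '2024-07-22',
    symbols = ['AAPL', 'GOOGL'],
    fields = ['open', 'close']
))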

CloudQuant Data Liberator places your selected columns at the front of the DataFrame, followed by the default columns.
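
Because the default columns still come back alongside your selection, you may want to trim them with pandas once the query returns. The following is a minimal sketch, not part of the Liberator API: the DataFrame is a hand-built stand-in for a query result with made-up values, and the helper name keep_requested and its keys argument are hypothetical.

```python
import pandas as pd

# Stand-in for a query result: requested fields first, then default columns.
# All values are invented for illustration; they are not real market data.
df = pd.DataFrame({
    "open": [1.0, 2.0],
    "close": [1.5, 2.5],
    "_seq": [0, 1],
    "muts": [1721600000000000, 1721686400000000],
    "timestamp": ["2024-07-22", "2024-07-23"],
    "symbol": ["AAPL", "AAPL"],
})

def keep_requested(frame, fields, keys=("symbol", "timestamp")):
    """Return only the key columns plus the requested fields present in frame."""
    wanted = [c for c in list(keys) + list(fields) if c in frame.columns]
    return frame[wanted]

trimmed = keep_requested(df, ["open", "close"])
print(list(trimmed.columns))
```

Keeping symbol and timestamp as key columns is a design choice for this sketch, so that each row can still be identified after the other defaults are dropped.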

Discovering available columns

To identify column names in a dataset, use one of these approaches:

Using the schema function

liberator.datasets(schema=True)['nameOfDataset']

Inspecting a sample query result

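df.columns                 # pandas Index holding the column names
list(df.columns)           # the same names as a plain Python list
print(list(df.columns))    # print the list, for example from a script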

Example output for daily_bars:

['_seq', '_dsname', 'timestamp', 'msg_len', 'msg', 'muts', 'symbol',
'length', 'open', 'high', 'low', 'close', 'volume', 'vwap', 'bvwap',
'spread', 'bidvol', 'askvol', 'count', 'avgdelta', 'Date', 'Time',
'Hour', 'DateTime', 'DateHour']
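
Once you have the column list, it can help to check your requested fields against it before sending a query. This document does not say whether the fields parameter is case-sensitive, so the sketch below simply maps each requested name to the dataset's exact spelling and reports anything it cannot find. The helper name resolve_fields and the requested name 'adj_close' are hypothetical. The available list is copied from the example output above.

```python
# Column names copied from the daily_bars example output.
available = ['_seq', '_dsname', 'timestamp', 'msg_len', 'msg', 'muts', 'symbol',
             'length', 'open', 'high', 'low', 'close', 'volume', 'vwap', 'bvwap',
             'spread', 'bidvol', 'askvol', 'count', 'avgdelta', 'Date', 'Time',
             'Hour', 'DateTime', 'DateHour']

def resolve_fields(requested, available):
    """Map requested names to the dataset's exact spelling, ignoring case.

    Returns (resolved, missing): matched names in their exact spelling,
    and requested names with no match in the available columns.
    """
    by_lower = {name.lower(): name for name in available}
    resolved, missing = [], []
    for name in requested:
        match = by_lower.get(name.lower())
        if match is None:
            missing.append(name)
        else:
            resolved.append(match)
    return resolved, missing

resolved, missing = resolve_fields(['Open', 'Close', 'adj_close'], available)
print(resolved)
print(missing)
```

The resolved list can then be passed as the fields argument, and a non-empty missing list is a cue to recheck the column names.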
