Local File (CSV/TSV)
Local file datasources read CSV, TSV, or other delimited flat files from a directory on the CloudQuant Data Liberator server or a mounted filesystem. This is the simplest file-based connection type and serves as the foundation for understanding all other file-based sources.
Connection Configuration
Required Fields
| Field | Type | Description |
|---|---|---|
| connection_type | string | Must be "file" |
| behavior | string | Must be "file" |
| location | string | Absolute path to the directory containing data files |
The location field should point to a directory, not an individual file. CloudQuant Data Liberator will scan the directory for files matching the file_pattern in data_args.
Example Connection
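The example that belongs here is not present on this page, so the following is only a minimal sketch built from the three required fields above. It is written as a Python dict that serializes to JSON. The path `/data/feeds/trades` and the helper `check_connection` are hypothetical, and how a connection is actually submitted to Data Liberator is not shown here.

```python
import json
import posixpath

# Hypothetical connection definition using the three required fields.
connection = {
    "connection_type": "file",
    "behavior": "file",
    "location": "/data/feeds/trades",  # assumed path: a directory, not a file
}

def check_connection(conn):
    """Return a list of problems found in a file connection (empty if none)."""
    problems = []
    for field in ("connection_type", "behavior"):
        if conn.get(field) != "file":
            problems.append(f'{field} must be "file"')
    if not posixpath.isabs(conn.get("location", "")):
        problems.append("location must be an absolute path")
    return problems

print(check_connection(connection))
print(json.dumps(connection, indent=2))
```

The check only mirrors the rules stated in the table. It does not confirm that the directory exists on the server.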
Dataset Configuration (data_args)
All file-based datasources share the same data_args fields. These control how CloudQuant Data Liberator finds, parses, and interprets your files.
Required Fields
| Field | Type | Description |
|---|---|---|
| file_pattern | string | Glob pattern to match files, e.g., "*.csv", "prefix_*.tsv" |
| data_dt_column | string or list | Column(s) containing the datetime value |
| data_dt_format | string or list | strptime format string, or special values: "muts", "uts", "nuts", "datetime", "date" |
| data_key_column | string or list | Column(s) used as the symbol/key for query filtering |
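To make the file_pattern and data_dt_format fields concrete, here is a rough sketch using Python's own `fnmatch` and `datetime.strptime`. The filenames and sample timestamp are invented for illustration. Data Liberator's matcher and parser may differ in detail from the Python standard library.

```python
from datetime import datetime
from fnmatch import fnmatchcase

# Hypothetical directory listing and glob pattern.
filenames = ["trades_20240102.csv", "trades_20240103.csv", "notes.txt", "prefix_a.tsv"]
file_pattern = "*.csv"

# Keep only the names that match the glob.
matched = [name for name in filenames if fnmatchcase(name, file_pattern)]
print(matched)

# A strptime format string maps the raw column text to a datetime.
data_dt_format = "%Y-%m-%d %H:%M:%S"
parsed = datetime.strptime("2024-01-02 09:30:00", data_dt_format)
print(parsed.isoformat())
```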
Optional Fields
| Field | Type | Default | Description |
|---|---|---|---|
| sep_override | string | "," | Delimiter character: "," (comma), "\t" (tab), "\|" (pipe), ";" (semicolon) |
| encoding | string | "utf-8" | File encoding (e.g., "utf-8", "latin-1", "ascii") |
| data_dt_timezone | string | "UTC" | Timezone of source data, e.g., "UTC", "America/New_York" |
| fname_dt_regex | string | | Regex to extract a date from the filename |
| fname_dt_format | string | | strptime format for the date extracted by fname_dt_regex |
| fname_dt_timezone | string | | Timezone of the filename-derived date |
| fname_dt_nudge | int | 0 | Microsecond offset applied to filename-derived dates |
| fname_dt_approx_seconds | int | | Approximate number of seconds of data per file (used for query optimization) |
| arrow_sort | list | ["symbol", "muts"] | Sort order for the resulting Arrow table |
| arrow_timestamp | bool | true | Whether to generate the human-readable timestamp column |
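The filename-related fields can be pictured with a short sketch. It assumes that the first capture group of fname_dt_regex holds the date text and that fname_dt_nudge is added to the parsed value. Neither detail is confirmed on this page, and the regex, format, and filenames are hypothetical.

```python
import re
from datetime import datetime, timedelta

# Hypothetical values for the filename-related data_args fields.
fname_dt_regex = r"(\d{8})"
fname_dt_format = "%Y%m%d"
fname_dt_nudge = 0  # microseconds

def filename_datetime(filename):
    """Extract the date embedded in a filename, or None if the regex does not match."""
    match = re.search(fname_dt_regex, filename)
    if match is None:
        return None
    # Assumption: the first capture group contains the date text.
    parsed = datetime.strptime(match.group(1), fname_dt_format)
    # Assumption: the nudge is a simple additive offset in microseconds.
    return parsed + timedelta(microseconds=fname_dt_nudge)

print(filename_datetime("trades_20240102.csv"))
print(filename_datetime("notes.txt"))
```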
Complete Example
Below is a full configuration showing both the connection and a dataset for daily trade CSV files.
Connection
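The connection half might look like the sketch below, written as a Python dict that serializes to JSON. Only the three field names come from the Required Fields table. The directory path is an assumption.

```python
import json

# Hypothetical connection for a directory of daily trade files.
trades_connection = {
    "connection_type": "file",
    "behavior": "file",
    "location": "/data/trades/daily",  # assumed directory path
}

print(json.dumps(trades_connection, indent=2))
```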
Dataset
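A matching data_args sketch follows. It assumes files named like `trades_20240102.csv` with `timestamp` and `symbol` columns. Those names, the timezone, and the one-day estimate are illustrative choices rather than values taken from this page.

```python
import json

# Hypothetical data_args for daily trade files (column names are assumptions).
trades_data_args = {
    # Required fields
    "file_pattern": "trades_*.csv",
    "data_dt_column": "timestamp",
    "data_dt_format": "%Y-%m-%d %H:%M:%S.%f",
    "data_key_column": "symbol",
    # Optional fields
    "data_dt_timezone": "America/New_York",
    "fname_dt_regex": r"trades_(\d{8})\.csv",
    "fname_dt_format": "%Y%m%d",
    "fname_dt_timezone": "America/New_York",
    "fname_dt_approx_seconds": 86400,  # roughly one day of data per file
}

print(json.dumps({"data_args": trades_data_args}, indent=2))
```

Fields left out of the dict, such as sep_override and encoding, would fall back to the defaults listed in the Optional Fields table.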
Tab-Separated Files (TSV)
For TSV files, set sep_override to "\t":
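A minimal sketch of such a dataset, with hypothetical column names, is shown here. The `csv` call at the end illustrates the effect of a tab delimiter on a sample row. It is not how Data Liberator itself reads the file.

```python
import csv
import io

# Hypothetical data_args for tab-separated files.
tsv_data_args = {
    "file_pattern": "*.tsv",
    "data_dt_column": "timestamp",
    "data_dt_format": "%Y-%m-%d %H:%M:%S",
    "data_key_column": "symbol",
    "sep_override": "\t",
}

# Parse an in-memory sample using the same delimiter.
sample = "timestamp\tsymbol\tprice\n2024-01-02 09:30:00\tAAPL\t185.5\n"
reader = csv.DictReader(io.StringIO(sample), delimiter=tsv_data_args["sep_override"])
rows = list(reader)
print(rows[0]["symbol"], rows[0]["price"])
```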
Composite Key Example
When the symbol is constructed from multiple columns, list those columns in data_key_column. The resulting keys look like NYSE_AAPL, NASDAQ_MSFT, etc.
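The sketch below shows one way this could be configured. The column names `exchange` and `ticker` are hypothetical, and the underscore join is inferred from the key shapes NYSE_AAPL and NASDAQ_MSFT rather than stated on this page.

```python
# Hypothetical data_args with a two-column key.
composite_data_args = {
    "file_pattern": "*.csv",
    "data_dt_column": "timestamp",
    "data_dt_format": "%Y-%m-%d %H:%M:%S",
    "data_key_column": ["exchange", "ticker"],  # assumed column names
}

def composite_key(row, key_columns):
    """Join the key columns with an underscore (assumed separator)."""
    return "_".join(row[col] for col in key_columns)

rows = [
    {"exchange": "NYSE", "ticker": "AAPL"},
    {"exchange": "NASDAQ", "ticker": "MSFT"},
]
keys = [composite_key(row, composite_data_args["data_key_column"]) for row in rows]
print(keys)
```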
Multiple Datetime Columns
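When the date and time are in separate columns, list both columns in data_dt_column and give a data_dt_format that covers the combined value, for example "%Y-%m-%d %H:%M:%S.%f".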
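As a rough illustration, the sketch below joins a hypothetical `date` column and `time` column with a single space before parsing. The space separator and the column names are assumptions. Only the format string comes from the text above.

```python
from datetime import datetime

# Hypothetical data_args with the datetime split across two columns.
multi_dt_data_args = {
    "file_pattern": "*.csv",
    "data_dt_column": ["date", "time"],  # assumed column names
    "data_dt_format": "%Y-%m-%d %H:%M:%S.%f",
    "data_key_column": "symbol",
}

row = {"date": "2024-01-02", "time": "09:30:00.123456", "symbol": "AAPL"}

# Assumption: column values are joined with a single space before parsing.
combined = " ".join(row[col] for col in multi_dt_data_args["data_dt_column"])
parsed = datetime.strptime(combined, multi_dt_data_args["data_dt_format"])
print(parsed.isoformat())
```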
