Datasource Configuration
CloudQuant Data Liberator supports a wide range of datasource types for ingesting time series data. Each datasource requires a connection (how to reach the data) and a dataset (what data to extract and how to interpret it).
Supported Datasource Types
File-Based Sources
| Type | Description |
|---|---|
| Local File (CSV/TSV) | Flat files on local/mounted storage |
| S3 | Amazon S3 or S3-compatible object storage |
| Azure Blob Storage | Microsoft Azure Blob containers |
| SFTP | SSH File Transfer Protocol servers |
| FTPS | FTP over TLS/SSL |
| CIFS/SMB | Windows/Samba network file shares |
Database Sources
| Type | Description |
|---|---|
| PostgreSQL | High-performance native driver |
| MySQL | Via ODBC driver (MySQL-compatible) |
| SQL Server | Via ODBC driver (ODBC Driver 18) |
| Oracle | Via Oracle database driver (thin mode) |
| Snowflake | High-performance native driver |
Additional Supported File Types
CloudQuant Data Liberator also supports Arrow IPC (binary columnar), Parquet, Excel (.xlsx), XML, and HDF5 files. These are configured the same way as other file-based sources through the CloudQuant Data Liberator UI.
Architecture: Connection + Dataset
Every datasource in CloudQuant Data Liberator is composed of two parts:
Connection
Defines how to reach the data: credentials, endpoints, paths, and transport protocol.
Dataset
Defines what to extract: which table/files, timestamp columns, key columns, schema, and data frequency.
Common Configuration Concepts
Timestamp Configuration
All datasources require timestamp configuration to map source data into CloudQuant Data Liberator’s microsecond timestamp (muts) format:
| Field | Description |
|---|---|
| `data_dt_column` | Column(s) containing the datetime |
| `data_dt_format` | Format string or parsing specification |
| `data_dt_timezone` | Timezone of the source data (e.g., "UTC", "America/New_York") |
| `data_dt_nudge` | Microsecond offset applied to timestamps |
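
How these fields combine can be illustrated with a minimal Python sketch. It is not CloudQuant Data Liberator's implementation: the function names `resolve_tz` and `to_muts` are hypothetical, and the sketch assumes that `data_dt_nudge` is simply added to the converted value.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def resolve_tz(name: str):
    """Map a timezone name to a tzinfo object; "UTC" needs no tz database lookup."""
    return timezone.utc if name.upper() == "UTC" else ZoneInfo(name)


def to_muts(text: str, dt_format: str, tz_name: str, nudge_us: int = 0) -> int:
    """Parse a source datetime string and return microseconds since the Unix epoch.

    dt_format plays the role of data_dt_format (strptime style only),
    tz_name the role of data_dt_timezone, nudge_us the role of data_dt_nudge.
    Assumption: the nudge is additive.
    """
    local = datetime.strptime(text, dt_format).replace(tzinfo=resolve_tz(tz_name))
    # Integer division of timedeltas avoids floating-point rounding.
    return (local - EPOCH) // timedelta(microseconds=1) + nudge_us
```

Under these assumptions, a source value of "2024-01-02 09:30:00" with timezone "UTC" and no nudge maps to 1704187800000000.
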
Supported DateTime Formats
| Format | Description |
|---|---|
| `"%Y-%m-%d %H:%M:%S"` | Standard strptime format |
| `"datetime"` | Native database datetime column |
| `"date"` | Native date column (date32/date64) |
| `"muts"` | Unix epoch microseconds |
| `"uts"` | Unix epoch seconds |
| `"nuts"` | Unix epoch nanoseconds |
| `true` | Auto-detect native datetime (database sources) |
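
The three epoch-based formats differ only in their unit. As a rough illustration, the following Python sketch normalizes each to microseconds. The function name `epoch_to_muts` is hypothetical, and truncating nanoseconds (rather than rounding) is an assumption, not documented behavior.

```python
def epoch_to_muts(value: int, dt_format: str) -> int:
    """Convert an integer epoch value to microseconds since the Unix epoch."""
    if dt_format == "uts":    # seconds -> microseconds
        return value * 1_000_000
    if dt_format == "muts":   # already microseconds
        return value
    if dt_format == "nuts":   # nanoseconds -> microseconds (assumed truncation)
        return value // 1_000
    raise ValueError(f"not an epoch format: {dt_format!r}")
```

For example, `epoch_to_muts(1704187800, "uts")` and `epoch_to_muts(1704187800000000000, "nuts")` both yield 1704187800000000.
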
Key Column Configuration
The `data_key_column` field defines the symbol/key used for filtering queries.
Schema Definition
Each column in a dataset schema requires a type, one of: string, int64, uint64, double, float, bool, date32, date64, time64.
Column groups:
| Group | Description |
|---|---|
| `key` | Symbol/key columns |
| `time` | Timestamp columns |
| `value` | Data columns |
| `meta` | System columns (`_seq`, `muts`, etc.) |
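
To make the type and group constraints concrete, here is a small Python sketch that checks a schema against the lists above. The dictionary layout (`name`, `type`, `group` keys) and the function `validate_schema` are assumptions made for illustration; they are not the product's configuration syntax.

```python
ALLOWED_TYPES = {
    "string", "int64", "uint64", "double", "float",
    "bool", "date32", "date64", "time64",
}
ALLOWED_GROUPS = {"key", "time", "value", "meta"}


def validate_schema(columns):
    """Return a list of problems; an empty list means every column passed."""
    problems = []
    for col in columns:
        name = col.get("name", "<unnamed>")
        if col.get("type") not in ALLOWED_TYPES:
            problems.append(f"{name}: unsupported type {col.get('type')!r}")
        if col.get("group") not in ALLOWED_GROUPS:
            problems.append(f"{name}: unknown group {col.get('group')!r}")
    return problems


# Hypothetical schema: one key column, one system column, one data column.
example_schema = [
    {"name": "symbol", "type": "string", "group": "key"},
    {"name": "muts", "type": "int64", "group": "meta"},
    {"name": "price", "type": "double", "group": "value"},
]
```

Calling `validate_schema(example_schema)` returns an empty list, while a column declared with a type outside the supported set is reported by name.
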
Auto-Generated Columns
CloudQuant Data Liberator automatically generates these columns if not present in source data:
| Column | Type | Description |
|---|---|---|
| `_seq` | uint64 | Sequential row number within partition |
| `muts` | int64 | Microseconds since Unix epoch |
| `timestamp` | string | Human-readable timestamp (America/New_York) |
| `symbol` | string | Key column (copied from `data_key_column`) |
File Name Date Extraction
For file-based sources, dates can be extracted from filenames:
| Field | Description | Example |
|---|---|---|
| `fname_dt_regex` | Regex to match date portion of filename | `data_(\d{4}-\d{2}-\d{2})\.csv` |
| `fname_dt_format` | strptime format for the matched portion | `%Y-%m-%d` |
| `fname_dt_timezone` | Timezone of the filename date | `UTC` |
| `fname_dt_nudge` | Microsecond offset | `0` |
| `fname_dt_approx_seconds` | Approximate seconds per file | `86400` |
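
The example values in the table can be exercised with a short Python sketch. It rests on several assumptions that the table does not confirm: the first capture group holds the date, the timezone is fixed to UTC, the nudge is additive, and `fname_dt_approx_seconds` describes the time span a file covers. The function name `filename_window` is hypothetical.

```python
import re
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def filename_window(fname, regex, dt_format, nudge_us=0, approx_seconds=86400):
    """Return (start, end) in microseconds since the epoch, or None if no match.

    regex, dt_format, nudge_us and approx_seconds stand in for fname_dt_regex,
    fname_dt_format, fname_dt_nudge and fname_dt_approx_seconds.
    """
    match = re.search(regex, fname)
    if match is None:
        return None
    # Assumption: the date text is the first capture group, interpreted as UTC.
    day = datetime.strptime(match.group(1), dt_format).replace(tzinfo=timezone.utc)
    start = (day - EPOCH) // timedelta(microseconds=1) + nudge_us
    return start, start + approx_seconds * 1_000_000


window = filename_window(
    "data_2024-01-02.csv", r"data_(\d{4}-\d{2}-\d{2})\.csv", "%Y-%m-%d"
)
```

Here `window` is `(1704153600000000, 1704240000000000)`, that is, midnight UTC on 2024-01-02 and the same instant one day later. A filename that does not match the regex yields `None`.
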

