Query Liberator for a Stream of Time Range Data as arrow::RecordBatch in C++
Result query(const std::map<std::string, Arg> &args);
The query function takes a std::map of arguments and streams data as a sequence of Arrow record batches. The generator returned by the query is a std::function; call it repeatedly, typically in a for loop, to produce record batches until the end of the stream.
Each entry in the std::map pairs a std::string key with an Arg value, where Arg is the following std::variant.
using Arg = std::variant<std::monostate,
                         bool,
                         int64_t,
                         double,
                         std::string,
                         std::vector<std::string>,
                         std::ostream*>;
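Because Arg is a variant, the alternative selected for each value depends on the type of the expression you pass. The sketch below copies the Arg definition so it compiles on its own, then builds a small argument map with explicitly typed values and reports which alternative each entry holds. The helpers arg_type_name and make_example_args are hypothetical and not part of Liberator. One reason to prefer the s suffix on string values: on some older standard library implementations a bare string literal can select the bool alternative instead of std::string.

```cpp
#include <cstdint>
#include <iostream>
#include <map>
#include <string>
#include <type_traits>
#include <variant>
#include <vector>

using namespace std::string_literals;  // enables the "..."s literal suffix

// Copied from the Arg definition above so the sketch compiles on its own.
using Arg = std::variant<std::monostate, bool, std::int64_t, double, std::string,
                         std::vector<std::string>, std::ostream*>;

// Hypothetical helper (not part of Liberator): names the alternative an Arg holds.
std::string arg_type_name(const Arg &arg)
{
    return std::visit([](const auto &value) -> std::string {
        using T = std::decay_t<decltype(value)>;
        if constexpr (std::is_same_v<T, std::monostate>) return "monostate";
        else if constexpr (std::is_same_v<T, bool>) return "bool";
        else if constexpr (std::is_same_v<T, std::int64_t>) return "int64_t";
        else if constexpr (std::is_same_v<T, double>) return "double";
        else if constexpr (std::is_same_v<T, std::string>) return "std::string";
        else if constexpr (std::is_same_v<T, std::vector<std::string>>)
            return "std::vector<std::string>";
        else return "std::ostream*";
    }, arg);
}

// Hypothetical helper: builds an argument map with an explicit type for every value.
std::map<std::string, Arg> make_example_args()
{
    return {
        {"symbols", std::vector<std::string>{"AAPL", "GOOGL"}},
        {"name", "daily_bars"s},            // std::string, thanks to the s suffix
        {"record_limit", std::int64_t{5}},  // explicit 64-bit integer
        {"compress", false},
        {"warning_stream", &std::cerr},
    };
}
```

Calling arg_type_name(make_example_args().at("name")) returns "std::string", which is a quick way to confirm that a value landed in the alternative you intended before passing the map to query.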
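The return type Result is the following std::variant. For a streaming query, the result holds a Func: a std::function that wraps a custom generator. Each call to the generator returns the next item, and a std::monostate (index 0) marks the end of the stream.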
using Func = std::function<std::variant<std::monostate,
                                        std::shared_ptr<rapidjson::Document>,
                                        std::shared_ptr<arrow::RecordBatch>>()>;
using Result = std::variant<std::shared_ptr<rapidjson::Document>, Func>;
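The generator protocol can be tried out without Arrow or RapidJSON. The sketch below is a minimal illustration, not Liberator's implementation: FakeDocument and FakeBatch are stand-ins for rapidjson::Document and arrow::RecordBatch, and make_fake_generator and count_rows are hypothetical helpers. It assumes the generator returns std::monostate once the stream is exhausted, which is what the index-based loop in the example on this page relies on.

```cpp
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <variant>
#include <vector>

// Stand-ins for rapidjson::Document and arrow::RecordBatch so the sketch
// compiles without either library. They are NOT the real types.
struct FakeDocument { std::string text; };
struct FakeBatch { int num_rows; };

// Same shape as the Func alias above, with the stand-in types substituted.
using Item = std::variant<std::monostate,
                          std::shared_ptr<FakeDocument>,
                          std::shared_ptr<FakeBatch>>;
using Func = std::function<Item()>;

// Hypothetical generator: yields one batch per entry, then std::monostate.
Func make_fake_generator(std::vector<int> rows_per_batch)
{
    auto next = std::make_shared<std::size_t>(0);  // position shared by copies
    return [rows = std::move(rows_per_batch), next]() -> Item {
        if (*next >= rows.size())
            return std::monostate{};  // end of stream
        return std::make_shared<FakeBatch>(FakeBatch{rows[(*next)++]});
    };
}

// Drains a generator and returns the total row count across all batches.
int count_rows(const Func &generator)
{
    int total = 0;
    // index() == 0 means std::monostate, which ends the loop.
    for (Item item = generator(); item.index() != 0; item = generator())
    {
        // Alternative 2 is the batch; a document (alternative 1) is skipped.
        if (auto *batch = std::get_if<2>(&item))
            total += (*batch)->num_rows;
    }
    return total;
}
```

With these stand-ins, count_rows(make_fake_generator({3, 4, 5})) returns 12. Checking std::get_if before dereferencing keeps the loop safe if the generator ever yields a document instead of a batch.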
The query function accepts the arguments listed in the table below. The example that follows the table demonstrates how to receive data from the generator until the end of the data stream.
Argument | Description | Type | Example
symbols | The security trading symbol(s) you wish to query. | std::string or std::vector<std::string> | {"symbols", std::vector<std::string>{"AAPL","GOOGL"}}
name | The name of the dataset. | std::string | {"name", "daily_bars"s}
as_of | Any past date, so that you can see the data as it was known on the "as of" date. Defaults to now. | std::string, format YYYY-MM-DD HH:MM:SS (HH:MM:SS optional) | {"as_of", "2019-09-15"s}
back_to | The date where the returned dataset should begin. All data is read "back to" the specified date. | std::string, format YYYY-MM-DD HH:MM:SS (HH:MM:SS optional) | {"back_to", "2019-07-15"s}
fields | An optional filter of field names. (Some fields are mandatory.) | std::string | {"fields", "volume"s}
stats | Set to 'total' to get a count per symbol as a JSON result. | std::string | {"stats", "total"s}
crux_key | The key to use if querying a Crux dataset. | std::string | {"crux_key", "<Your Key>"s}
compress | The data compression method on the wire. CloudQuant uses compression. | bool. Always compressed_transfer | {"compress", false}
json_xfer | JSON transfer. This is usually false. | bool. Always false | {"json_xfer", false}
debug_stream | Send log info to this std::ostream pointer. | std::ostream* | {"debug_stream", &std::cerr}
warning_stream | Send warning info to this std::ostream pointer. | std::ostream* | {"warning_stream", &std::cerr}
record_limit | Limit the number of records returned. | int (stored as int64_t) | {"record_limit", 5}
user | The user identifier (as assigned by CloudQuant). | std::string | {"user", "myUserID"s}
token | The user's assigned token. | std::string | {"token", "mypersonal-private-token"s}
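The as_of and back_to arguments are plain strings, so a malformed date is easy to pass by accident. The sketch below is one way to check the documented shape (YYYY-MM-DD, with an optional HH:MM:SS) before building the argument map. Both looks_like_timestamp and make_time_range_args are hypothetical helpers, not part of Liberator, and the check covers digits and separators only, not whether the date exists on the calendar. How Liberator itself reacts to a malformed date is not covered here.

```cpp
#include <cstdint>
#include <map>
#include <ostream>
#include <regex>
#include <stdexcept>
#include <string>
#include <variant>
#include <vector>

// Copied from the Arg definition earlier on this page so the sketch
// compiles on its own.
using Arg = std::variant<std::monostate, bool, std::int64_t, double, std::string,
                         std::vector<std::string>, std::ostream*>;

// Hypothetical helper: true if text has the shape YYYY-MM-DD or
// YYYY-MM-DD HH:MM:SS. Only digits and separators are checked.
bool looks_like_timestamp(const std::string &text)
{
    static const std::regex pattern(R"(\d{4}-\d{2}-\d{2}( \d{2}:\d{2}:\d{2})?)");
    return std::regex_match(text, pattern);
}

// Hypothetical helper: builds the arguments for a time-range query and
// rejects dates that do not match the documented format.
std::map<std::string, Arg> make_time_range_args(const std::string &name,
                                                const std::vector<std::string> &symbols,
                                                const std::string &back_to,
                                                const std::string &as_of)
{
    if (!looks_like_timestamp(back_to) || !looks_like_timestamp(as_of))
        throw std::invalid_argument("expected YYYY-MM-DD or YYYY-MM-DD HH:MM:SS");
    return {
        {"name", name},
        {"symbols", symbols},
        {"back_to", back_to},
        {"as_of", as_of},
    };
}
```

The returned map can be passed directly to query. Taking the dates as std::string parameters also guarantees that each value is stored in the std::string alternative of Arg.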
Example: query a stream of data as arrow::RecordBatch
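using namespace std::string_literals;  // enables the "..."s literal suffix

Liberator liberator;
Liberator::Result result = liberator.query({
    {"symbols", std::vector<std::string>{"AAPL", "GOOGL"}},
    {"name", "daily_bars"s}
});
// The result holds either a JSON document or a generator; only the generator streams batches.
if (auto *generator = std::get_if<Liberator::Func>(&result))
{
    // index() == 0 (std::monostate) marks the end of the stream.
    for (auto res = (*generator)(); res.index() != 0; res = (*generator)())
    {
        // Alternative 2 is the record batch; skip anything else rather than dereference a null pointer.
        if (auto *batch = std::get_if<2>(&res))
            (void)arrow::PrettyPrint(**batch, arrow::PrettyPrintOptions(0, 1), &std::cout);
    }
}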