C++ Queries - a stream or time range of data

Query Liberator for a Stream or Time Range of Data as arrow::RecordBatch in C++

Result query(const std::map<std::string, Arg> &args);

 

The query function takes a std::map of arguments and streams the data back as a generator of Arrow record batches. Call the generator function returned by query in a for loop to receive record batches until the end of the stream.

Each entry in the std::map pairs a std::string key with an Arg value. Arg is the following std::variant:

     

using Arg = std::variant< std::monostate,
                       bool,
                       int64_t,
                       double,
                       std::string,
                       std::vector<std::string>,
                       std::ostream* >;
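
As a rough illustration of how the Arg alternatives behave, the sketch below builds an argument map and inspects it with std::visit. The helper names describe and make_args are hypothetical and not part of Liberator; the Arg alias simply repeats the definition above. Note the s suffix on string values: with some older standard libraries a bare string literal may select the bool alternative of a std::variant rather than std::string.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <ostream>
#include <string>
#include <type_traits>
#include <variant>
#include <vector>

// Same alternatives as the Arg variant defined above.
using Arg = std::variant<std::monostate, bool, std::int64_t, double,
                         std::string, std::vector<std::string>, std::ostream*>;

// Hypothetical helper (not part of Liberator): render one Arg as text.
std::string describe(const Arg& arg) {
    return std::visit([](const auto& v) -> std::string {
        using T = std::decay_t<decltype(v)>;
        if constexpr (std::is_same_v<T, std::monostate>) {
            return "(unset)";
        } else if constexpr (std::is_same_v<T, bool>) {
            return v ? "true" : "false";
        } else if constexpr (std::is_same_v<T, std::int64_t> ||
                             std::is_same_v<T, double>) {
            return std::to_string(v);
        } else if constexpr (std::is_same_v<T, std::string>) {
            return v;
        } else if constexpr (std::is_same_v<T, std::vector<std::string>>) {
            std::string out;
            for (const auto& s : v) out += (out.empty() ? "" : ",") + s;
            return out;
        } else {
            return v ? "(stream)" : "(null stream)";  // std::ostream*
        }
    }, arg);
}

// Hypothetical helper: build a small argument map of the kind query accepts.
std::map<std::string, Arg> make_args() {
    using namespace std::string_literals;  // enables the "..."s suffix
    std::map<std::string, Arg> args = {
        {"symbols", std::vector<std::string>{"AAPL", "GOOGL"}},
        {"name", "daily_bars"s},
        {"record_limit", std::int64_t{5}},
        {"compress", true},
    };
    // The s suffix guarantees the std::string alternative was chosen.
    assert(std::holds_alternative<std::string>(args.at("name")));
    return args;
}
```

Spelling out std::int64_t{5} and "daily_bars"s keeps the choice of alternative explicit instead of leaving it to the variant's converting constructor.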

 

The return value, Result, is the following std::variant. A streaming query returns the Func alternative, a std::function that wraps a custom generator function.

 

using Func = std::function<std::variant<std::monostate,
                                        std::shared_ptr<rapidjson::Document>,
                                        std::shared_ptr<arrow::RecordBatch>>()>;
using Result = std::variant<std::shared_ptr<rapidjson::Document>, Func>;
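
The generator protocol implied by these two types can be sketched without Liberator, rapidjson, or Arrow. In the sketch below, JsonDoc and Batch are hypothetical stand-ins for rapidjson::Document and arrow::RecordBatch, and make_generator and total_rows are illustrative helpers, not library functions. It assumes, as the example later in this page suggests, that std::monostate marks the end of the stream.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical stand-ins so the sketch compiles without rapidjson or Arrow.
struct JsonDoc { std::string text; };
struct Batch { int rows; };

// Same shape as the Func and Result aliases above, using the stand-ins.
using Item = std::variant<std::monostate,
                          std::shared_ptr<JsonDoc>,
                          std::shared_ptr<Batch>>;
using Func = std::function<Item()>;
using Result = std::variant<std::shared_ptr<JsonDoc>, Func>;

// Hypothetical source: yields one batch per entry in `sizes`,
// then std::monostate on every later call.
Func make_generator(std::vector<int> sizes) {
    return [sizes = std::move(sizes), i = std::size_t{0}]() mutable -> Item {
        if (i >= sizes.size()) return std::monostate{};
        return std::make_shared<Batch>(Batch{sizes[i++]});
    };
}

// Drain a Result. Returns the total row count, or -1 when the Result
// holds a JSON document instead of a generator.
long total_rows(const Result& result) {
    const Func* gen = std::get_if<Func>(&result);
    if (gen == nullptr) return -1;
    assert(*gen && "generator must be callable");
    long rows = 0;
    for (Item item = (*gen)(); !std::holds_alternative<std::monostate>(item);
         item = (*gen)()) {
        // Only batch items contribute rows; a JSON item is skipped.
        if (auto* batch = std::get_if<std::shared_ptr<Batch>>(&item))
            rows += (*batch)->rows;
    }
    return rows;
}
```

Checking with std::get_if before dereferencing avoids a null pointer dereference when the variant holds a different alternative than expected.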

 

The table below lists the arguments that query accepts. The example after it demonstrates how to receive data from the generator until the end of the data stream.

 

| Argument | Description | Type | Example |
|---|---|---|---|
| symbols | The security trading symbol(s) you wish to query. | std::string or std::vector<std::string> | {"symbols", std::vector<std::string>{"AAPL", "GOOGL"}} |
| name | The name of the dataset. | std::string | {"name", "daily_bars"s} |
| as_of | Any past date, so that you can see the data as it was known on the "as of" date. Defaults to now. | std::string, format YYYY-MM-DD HH:MM:SS (HH:MM:SS optional) | {"as_of", "2019-09-15"s} |
| back_to | The date at which the returned dataset should begin. All data "back to" the specified date is read. | std::string, format YYYY-MM-DD HH:MM:SS (HH:MM:SS optional) | {"back_to", "2019-07-15"s} |
| fields | An optional filter of field names. (Some fields are mandatory.) | std::string | {"fields", "volume"s} |
| stats | Set to "total" to get a count per symbol as a JSON result. | std::string | {"stats", "total"s} |
| crux_key | Used when querying a Crux dataset. | std::string | {"crux_key", "<Your Key>"s} |
| compress | The data compression method on the wire. CloudQuant uses compression. | bool. Always compressed_transfer | {"compress", false} |
| json_xfer | JSON transfer. This is usually false. | bool. Always false | {"json_xfer", false} |
| debug_stream | Send log info to this ostream pointer. | std::ostream* | {"debug_stream", &std::cerr} |
| warning_stream | Send warning info to this ostream pointer. | std::ostream* | {"warning_stream", &std::cerr} |
| record_limit | Limit the number of records. | int (held as int64_t in Arg) | {"record_limit", 5} |
| user | The user identifier (as assigned by CloudQuant). | std::string | {"user", "myUserID"s} |
| token | The user's assigned token. | std::string | {"token", "mypersonal-private-token"s} |
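
Because as_of and back_to are passed as plain strings, it can help to check their shape before building the argument map. The sketch below is one possible approach, assuming only the YYYY-MM-DD HH:MM:SS format (time part optional) stated for those arguments. The helpers looks_like_timestamp and window_args are hypothetical and not part of Liberator, and the check covers the digit layout only, not calendar validity.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <ostream>
#include <regex>
#include <string>
#include <variant>
#include <vector>

// Same alternatives as the Arg variant defined earlier.
using Arg = std::variant<std::monostate, bool, std::int64_t, double,
                         std::string, std::vector<std::string>, std::ostream*>;

// Hypothetical helper: true if `s` is YYYY-MM-DD, optionally followed by
// a space and HH:MM:SS. Only the digit layout is checked.
bool looks_like_timestamp(const std::string& s) {
    static const std::regex pattern(
        R"(\d{4}-\d{2}-\d{2}( \d{2}:\d{2}:\d{2})?)");
    return std::regex_match(s, pattern);
}

// Hypothetical helper: arguments for one dataset over a date window.
std::map<std::string, Arg> window_args(const std::string& name,
                                       const std::string& back_to,
                                       const std::string& as_of) {
    assert(looks_like_timestamp(back_to) && looks_like_timestamp(as_of));
    // Passing std::string objects selects the std::string alternative.
    return {{"name", name}, {"back_to", back_to}, {"as_of", as_of}};
}
```

A call such as window_args("daily_bars", "2019-07-15", "2019-09-15") would produce a map that could be passed to query, possibly after adding symbols, user, and token entries.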

 

Example: Query a stream of data as arrow::RecordBatch

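using namespace std::string_literals;  // enables the "..."s suffix

Liberator liberator;
Liberator::Result ptr = liberator.query({
    {"symbols", std::vector<std::string>{"AAPL", "GOOGL"}},
    {"name", "daily_bars"s}
});

// Result may hold a JSON document instead of a generator, so check first.
if (auto *generator = std::get_if<Liberator::Func>(&ptr))
{
    // Index 0 (std::monostate) marks the end of the stream.
    for (auto res = (*generator)(); res.index(); res = (*generator)())
    {
        // Index 2 is std::shared_ptr<arrow::RecordBatch>; skip anything else.
        if (auto *batch = std::get_if<2>(&res))
            (void)arrow::PrettyPrint(**batch, arrow::PrettyPrintOptions(0, 1),
                                     &std::cout);
    }
}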