base module ¶

Base class for working with records.

vectorbt works with two different representations of data: matrices and records.

A matrix, in this context, is just an array of one-dimensional arrays, each corresponding to a separate feature. The matrix itself holds only one kind of information (one attribute). For example, one can create a matrix for entry signals, with columns being different strategy configurations. But what if the matrix is huge and sparse? What if there is more information we would like to represent by each element? Creating multiple matrices would be a waste of memory.

Records make it possible to represent complex, sparse information in a dense format. They are just an array of one-dimensional arrays of fixed schema. You can think of records as a DataFrame, where each row represents a record and each column represents a specific attribute.

               a     b
         0   1.0   5.0
attr1 =  1   2.0   NaN
         2   NaN   7.0
         3   4.0   8.0
               a     b
         0   9.0  13.0
attr2 =  1  10.0   NaN
         2   NaN  15.0
         3  12.0  16.0
            |
            v
      id  col  idx  attr1  attr2
0      0    0    0      1      9
1      1    0    1      2     10
2      2    0    3      4     12
3      3    1    0      5     13
4      4    1    2      7     15
5      5    1    3      8     16
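
The conversion pictured above can be sketched in plain Python. This is only an illustration of the idea and not how vectorbt builds records; the helper name `matrices_to_records` is hypothetical, and the sketch assumes that both matrices are missing values at the same positions.

```python
import math

nan = float("nan")

# Two sparse matrices stored column by column (columns a and b).
attr1 = [[1.0, 2.0, nan, 4.0], [5.0, nan, 7.0, 8.0]]
attr2 = [[9.0, 10.0, nan, 12.0], [13.0, nan, 15.0, 16.0]]


def matrices_to_records(first, second):
    """Hypothetical helper: emit one (id, col, idx, attr1, attr2) tuple per non-missing cell."""
    records = []
    for col, (col1, col2) in enumerate(zip(first, second)):
        for idx, (v1, v2) in enumerate(zip(col1, col2)):
            if math.isnan(v1):  # assumes both matrices are missing at the same cells
                continue
            records.append((len(records), col, idx, v1, v2))
    return records


records = matrices_to_records(attr1, attr2)
for rec in records:
    print(rec)
```

Each tuple carries its own column and row position, so the missing cells never have to be stored.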

Another advantage of records is that they are not constrained by size. Multiple records can map to a single element in a matrix. For example, one can define multiple orders at the same time step, which is impossible to represent in a matrix form without using complex data types.
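
As a rough illustration of several records sharing one matrix element, the plain-Python sketch below groups made-up order records by their `(col, idx)` position. The `orders` data and the `size` attribute are assumptions chosen for the example only.

```python
from collections import defaultdict

# (id, col, idx, size) order records; two orders share column 0, time step 1.
orders = [(0, 0, 0, 1.0), (1, 0, 1, 2.0), (2, 0, 1, -0.5), (3, 1, 1, 3.0)]

# Collect the ids of all orders that fall onto the same matrix element.
by_element = defaultdict(list)
for order_id, col, idx, size in orders:
    by_element[(col, idx)].append(order_id)

print(by_element[(0, 1)])
```

A matrix could hold only one of the two orders at position `(0, 1)`, while the record list keeps both.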

Consider the following example:

>>> import numpy as np
>>> import pandas as pd
>>> from numba import njit
>>> from collections import namedtuple
>>> import vectorbt as vbt

>>> example_dt = np.dtype([
...     ('id', np.int64),
...     ('col', np.int64),
...     ('idx', np.int64),
...     ('some_field', np.float64)
... ])
>>> records_arr = np.array([
...     (0, 0, 0, 10.),
...     (1, 0, 1, 11.),
...     (2, 0, 2, 12.),
...     (3, 1, 0, 13.),
...     (4, 1, 1, 14.),
...     (5, 1, 2, 15.),
...     (6, 2, 0, 16.),
...     (7, 2, 1, 17.),
...     (8, 2, 2, 18.)
... ], dtype=example_dt)
>>> wrapper = vbt.ArrayWrapper(index=['x', 'y', 'z'],
...     columns=['a', 'b', 'c'], ndim=2, freq='1 day')
>>> records = vbt.Records(wrapper, records_arr)

Printing¶

There are two ways to print records:

Raw dataframe that preserves field names and data types:

>>> records.records
   id  col  idx  some_field
0   0    0    0        10.0
1   1    0    1        11.0
2   2    0    2        12.0
3   3    1    0        13.0
4   4    1    1        14.0
5   5    1    2        15.0
6   6    2    0        16.0
7   7    2    1        17.0
8   8    2    2        18.0

Readable dataframe that takes into consideration Records.field_config:

>>> records.records_readable
   Id Column Timestamp  some_field
0   0      a         x        10.0
1   1      a         y        11.0
2   2      a         z        12.0
3   3      b         x        13.0
4   4      b         y        14.0
5   5      b         z        15.0
6   6      c         x        16.0
7   7      c         y        17.0
8   8      c         z        18.0
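
The difference between the two views can be sketched without vectorbt: the readable view replaces positional `col` and `idx` values with the wrapper's labels and uses the configured titles as headers. The names `raw` and `readable` below are illustrative, and the sketch is not the library's implementation.

```python
columns = ["a", "b", "c"]
index = ["x", "y", "z"]

# (id, col, idx, some_field) tuples, as in the raw view.
raw = [(0, 0, 0, 10.0), (1, 0, 1, 11.0), (2, 0, 2, 12.0), (3, 1, 0, 13.0)]

# Replace positions with labels and rename the base fields to their titles.
readable = [
    {"Id": rec_id, "Column": columns[col], "Timestamp": index[idx], "some_field": value}
    for rec_id, col, idx, value in raw
]
print(readable[3])
```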

Mapping¶

Records are just structured arrays with a bunch of methods and properties for processing them. Their main feature is to map the records array and to reduce it by column (similar to the MapReduce paradigm). The main advantage is that it all happens without converting to the matrix form, so no memory is wasted.
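
The map-then-reduce idea can be sketched in plain Python, assuming records are simple `(id, col, idx, some_field)` tuples. This is a conceptual sketch only; vectorbt operates on structured NumPy arrays with Numba-compiled functions.

```python
from collections import defaultdict

# (id, col, idx, some_field) records for three columns.
records = [
    (0, 0, 0, 10.0), (1, 0, 1, 11.0), (2, 0, 2, 12.0),
    (3, 1, 0, 13.0), (4, 1, 1, 14.0), (5, 1, 2, 15.0),
    (6, 2, 0, 16.0), (7, 2, 1, 17.0), (8, 2, 2, 18.0),
]

# Map step: one scalar per record, paired with the record's column.
mapped = [(col, value ** 2) for _, col, _, value in records]

# Reduce step: aggregate the mapped values column by column.
sums = defaultdict(float)
for col, value in mapped:
    sums[col] += value

print(dict(sums))
```

At no point is a 3x3 matrix allocated; the column index stored in each record is enough to route the value to the right accumulator.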

Records can be mapped to MappedArray in several ways:

Use Records.map_field() to map a record field:

>>> records.map_field('some_field')
<vectorbt.records.mapped_array.MappedArray at 0x7ff49bd31a58>

>>> records.map_field('some_field').values
array([10., 11., 12., 13., 14., 15., 16., 17., 18.])

Use Records.map() to map records using a custom function:

>>> @njit
... def power_map_nb(record, pow):
...     return record.some_field ** pow

>>> records.map(power_map_nb, 2)
<vectorbt.records.mapped_array.MappedArray at 0x7ff49c990cf8>

>>> records.map(power_map_nb, 2).values
array([100., 121., 144., 169., 196., 225., 256., 289., 324.])

Use Records.map_array() to convert an array to MappedArray:

>>> records.map_array(records_arr['some_field'] ** 2)
<vectorbt.records.mapped_array.MappedArray object at 0x7fe9bccf2978>

>>> records.map_array(records_arr['some_field'] ** 2).values
array([100., 121., 144., 169., 196., 225., 256., 289., 324.])

Use Records.apply() to apply a function on each column/group:

>>> @njit
... def cumsum_apply_nb(records):
...     return np.cumsum(records.some_field)

>>> records.apply(cumsum_apply_nb)
<vectorbt.records.mapped_array.MappedArray at 0x7ff49c990cf8>

>>> records.apply(cumsum_apply_nb).values
array([10., 21., 33., 13., 27., 42., 16., 33., 51.])

>>> group_by = np.array(['first', 'first', 'second'])
>>> records.apply(cumsum_apply_nb, group_by=group_by, apply_per_group=True).values
array([10., 21., 33., 46., 60., 75., 16., 33., 51.])

Notice how cumsum resets at each column in the first example and at each group in the second example.
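
The reset behavior can be reproduced with a small plain-Python sketch. The helper `cumsum_by` is hypothetical and assumes that records belonging to the same column (or group) are stored contiguously.

```python
from itertools import accumulate, groupby

cols = [0, 0, 0, 1, 1, 1, 2, 2, 2]
values = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0, 18.0]
col_to_group = {0: "first", 1: "first", 2: "second"}


def cumsum_by(keys, vals):
    """Cumulative sum that restarts whenever the key changes (keys must be contiguous)."""
    out = []
    pairs = zip(keys, vals)
    for _, chunk in groupby(pairs, key=lambda pair: pair[0]):
        out.extend(accumulate(v for _, v in chunk))
    return out


per_column = cumsum_by(cols, values)
per_group = cumsum_by([col_to_group[c] for c in cols], values)
print(per_column)
print(per_group)
```

The only difference between the two calls is the key used to split the records: the column index in the first case and the group label in the second.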

Filtering¶

Use Records.apply_mask() to filter elements per column/group:

>>> mask = [True, False, True, False, True, False, True, False, True]
>>> filtered_records = records.apply_mask(mask)
>>> filtered_records.count()
a    2
b    1
c    2
dtype: int64

>>> filtered_records.values['id']
array([0, 2, 4, 6, 8])
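
The same filtering can be sketched in plain Python to show what the mask does to the per-column counts. This is a conceptual sketch on tuples, not the library's implementation.

```python
from collections import Counter
from itertools import compress

# (id, col, idx, some_field) records for three columns.
records = [
    (0, 0, 0, 10.0), (1, 0, 1, 11.0), (2, 0, 2, 12.0),
    (3, 1, 0, 13.0), (4, 1, 1, 14.0), (5, 1, 2, 15.0),
    (6, 2, 0, 16.0), (7, 2, 1, 17.0), (8, 2, 2, 18.0),
]
mask = [True, False, True, False, True, False, True, False, True]

# Keep only the records whose mask entry is True.
filtered = list(compress(records, mask))

# Count the surviving records per column.
counts = Counter(col for _, col, _, _ in filtered)
print([rec[0] for rec in filtered], dict(counts))
```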

Grouping¶

One of the key features of Records is that you can perform reducing operations on a group of columns as if they were a single column. Groups can be specified by group_by, which can be anything from positions or names of column levels, to a NumPy array with actual groups.

There are multiple ways to define grouping:

When creating Records, pass group_by to ArrayWrapper:

>>> group_by = np.array(['first', 'first', 'second'])
>>> grouped_wrapper = wrapper.replace(group_by=group_by)
>>> grouped_records = vbt.Records(grouped_wrapper, records_arr)

>>> grouped_records.map_field('some_field').mean()
first     12.5
second    17.0
dtype: float64

Regroup an existing Records:

>>> records.regroup(group_by).map_field('some_field').mean()
first     12.5
second    17.0
dtype: float64

Pass group_by directly to the mapping method:

>>> records.map_field('some_field', group_by=group_by).mean()
first     12.5
second    17.0
dtype: float64

Pass group_by directly to the reducing method:

>>> records.map_field('some_field').mean(group_by=group_by)
first     12.5
second    17.0
dtype: float64

Note

Grouping applies only to reducing operations; the underlying arrays are not changed.
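
A plain-Python sketch of this point: the stored values stay the same, and grouping only changes the key under which each value is aggregated. The helper `mean_by` is hypothetical.

```python
from statistics import mean

cols = [0, 0, 0, 1, 1, 1, 2, 2, 2]
values = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0, 18.0]
group_by = ["first", "first", "second"]  # one label per column


def mean_by(keys, vals):
    """Average the values that share the same key."""
    buckets = {}
    for key, val in zip(keys, vals):
        buckets.setdefault(key, []).append(val)
    return {key: mean(bucket) for key, bucket in buckets.items()}


by_column = mean_by(cols, values)
by_group = mean_by([group_by[c] for c in cols], values)
print(by_column)
print(by_group)
```

Both reductions read the same `values` list; only the lookup from column to key differs.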

Indexing¶

Like any other class subclassing Wrapping, we can do pandas indexing on a Records instance, which forwards the indexing operation to each object with columns:

>>> records['a'].records
   id  col  idx  some_field
0   0    0    0        10.0
1   1    0    1        11.0
2   2    0    2        12.0

>>> grouped_records['first'].records
   id  col  idx  some_field
0   0    0    0        10.0
1   1    0    1        11.0
2   2    0    2        12.0
3   3    1    0        13.0
4   4    1    1        14.0
5   5    1    2        15.0

Note

Changing index (time axis) is not supported. The object should be treated as a Series rather than a DataFrame; for example, use some_field.iloc[0] instead of some_field.iloc[:, 0].

Indexing behavior depends solely upon ArrayWrapper. For example, if group_select is enabled, indexing will be performed on groups, otherwise on single columns.

Caching¶

Records supports caching. If a method or a property requires heavy computation, it's wrapped with cached_method() or cached_property, respectively. Caching can be disabled globally via caching in settings.

Note

Because of caching, the class is meant to be immutable and all properties are read-only. To change any attribute, use the replace method and pass the attribute as a keyword argument.
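
Why caching and immutability go together can be sketched with the standard library alone. The class `TinyRecords` is a hypothetical stand-in, and `functools.cached_property` and `dataclasses.replace` are used here only as analogies for vectorbt's own decorators and methods.

```python
from dataclasses import dataclass, replace
from functools import cached_property


@dataclass(frozen=True)
class TinyRecords:
    """Hypothetical stand-in for an immutable records container."""

    values: tuple

    @cached_property
    def total(self):
        # Computed on first access, then stored on the instance.
        return sum(self.values)


first = TinyRecords(values=(10.0, 11.0, 12.0))
# Instead of mutating `first`, build a new instance with a fresh cache.
second = replace(first, values=(13.0, 14.0, 15.0))
print(first.total, second.total)
```

If `values` could be changed in place, the cached `total` would silently go stale; creating a new instance avoids that.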

Saving and loading¶

Like any other class subclassing Pickleable, we can save a Records instance to the disk with Pickleable.save() and load it with Pickleable.load().

Stats¶

Hint

See StatsBuilderMixin.stats() and Records.metrics.

>>> records.stats(column='a')
Start                          x
End                            z
Period           3 days 00:00:00
Count                          3
Name: a, dtype: object

StatsBuilderMixin.stats() also supports (re-)grouping:

>>> grouped_records.stats(column='first')
Start                          x
End                            z
Period           3 days 00:00:00
Count                          6
Name: first, dtype: object

Plots¶

Hint

See PlotsBuilderMixin.plots() and Records.subplots.

This class is too generic to have any subplots, but feel free to add custom subplots to your subclass.

Extending¶

The Records class can be extended by subclassing.

In case some of our fields have the same meaning but different naming (such as the base field idx) or other properties, we can override field_config using override_field_config(). It will look for configs of all base classes and merge our config on top of them. This preserves any base class property that is not explicitly listed in our config.

>>> from vectorbt.records.decorators import override_field_config

>>> my_dt = np.dtype([
...     ('my_id', np.int64),
...     ('my_col', np.int64),
...     ('my_idx', np.int64)
... ])

>>> my_fields_config = dict(
...     dtype=my_dt,
...     settings=dict(
...         id=dict(name='my_id'),
...         col=dict(name='my_col'),
...         idx=dict(name='my_idx')
...     )
... )
>>> @override_field_config(my_fields_config)
... class MyRecords(vbt.Records):
...     pass

>>> records_arr = np.array([
...     (0, 0, 0),
...     (1, 0, 1),
...     (2, 1, 0),
...     (3, 1, 1)
... ], dtype=my_dt)
>>> wrapper = vbt.ArrayWrapper(index=['x', 'y'],
...     columns=['a', 'b'], ndim=2, freq='1 day')
>>> my_records = MyRecords(wrapper, records_arr)

>>> my_records.id_arr
array([0, 1, 2, 3])

>>> my_records.col_arr
array([0, 0, 1, 1])

>>> my_records.idx_arr
array([0, 1, 0, 1])

Alternatively, we can override the _field_config class attribute.

>>> @override_field_config
... class MyRecords(vbt.Records):
...     _field_config = dict(
...         dtype=my_dt,
...         settings=dict(
...             id=dict(name='my_id'),
...             idx=dict(name='my_idx'),
...             col=dict(name='my_col')
...         )
...     )

Note

Don't forget to decorate the class with @override_field_config to inherit configs from base classes.

You can stop inheritance by not decorating or passing merge_configs=False to the decorator.
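
The effect of merging a subclass config on top of the base config can be sketched with a recursive dictionary merge. The function `merge_dicts` is hypothetical, the string `"my_dt"` stands in for the actual dtype, and the real merging logic in vectorbt may differ in detail.

```python
def merge_dicts(base, override):
    """Recursively merge `override` on top of `base` without mutating either."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_dicts(merged[key], value)
        else:
            merged[key] = value
    return merged


base_config = {
    "dtype": None,
    "settings": {
        "id": {"name": "id", "title": "Id"},
        "col": {"name": "col", "title": "Column", "mapping": "columns"},
        "idx": {"name": "idx", "title": "Timestamp", "mapping": "index"},
    },
}
my_config = {
    "dtype": "my_dt",  # placeholder for the structured dtype
    "settings": {
        "id": {"name": "my_id"},
        "col": {"name": "my_col"},
        "idx": {"name": "my_idx"},
    },
}

merged = merge_dicts(base_config, my_config)
print(merged["settings"]["idx"])
```

Only the `name` entries are replaced; `title` and `mapping` from the base config survive the merge.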

MetaFields class ¶

MetaFields(
    *args,
    **kwargs
)

Meta class that exposes a read-only class property MetaFields.field_config.

Superclasses

builtins.type

Subclasses

MetaRecords

field_config property ¶

Field config.

MetaRecords class ¶

MetaRecords(
    *args,
    **kwargs
)

Meta class that exposes a read-only class property StatsBuilderMixin.metrics.

Superclasses

Inherited members

Records class ¶

Records(
    wrapper,
    records_arr,
    col_mapper=None,
    **kwargs
)

Wraps the actual records array (such as trades) and exposes methods for mapping it to some array of values (such as PnL of each trade).

Args

wrapper : ArrayWrapper

Array wrapper.

See ArrayWrapper.

records_arr : array_like

A structured NumPy array of records.

Must have the fields id (record index) and col (column index).

col_mapper : ColumnMapper

Column mapper if already known.

Note

It depends on records_arr, so make sure to invalidate col_mapper upon creating a Records instance with a modified records_arr.

Records.replace() does it automatically.

**kwargs

Custom keyword arguments passed to the config.

Useful if any subclass wants to extend the config.

Superclasses

Inherited members

Subclasses

apply method ¶

Records.apply(
    apply_func_nb,
    *args,
    group_by=None,
    apply_per_group=False,
    dtype=None,
    **kwargs
)

Apply function on records per column/group. Returns mapped array.

Applies per group if apply_per_group is True.

See apply_on_records_nb().

**kwargs are passed to Records.map_array().

apply_mask method ¶

Records.apply_mask(
    mask,
    group_by=None,
    **kwargs
)

Return a new class instance, filtered by mask.

build_field_config_doc class method ¶

Records.build_field_config_doc(
    source_cls=None
)

Build field config documentation.

col_arr property ¶

Get column array.

col_mapper property ¶

Column mapper.

See ColumnMapper.

count method ¶

Records.count(
    group_by=None,
    wrap_kwargs=None
)

Return count by column.

field_config class variable ¶

Field config of Records.

Config({
    "dtype": null,
    "settings": {
        "id": {
            "name": "id",
            "title": "Id"
        },
        "col": {
            "name": "col",
            "title": "Column",
            "mapping": "columns"
        },
        "idx": {
            "name": "idx",
            "title": "Timestamp",
            "mapping": "index"
        }
    }
})

get_apply_mapping_arr method ¶

Records.get_apply_mapping_arr(
    field,
    **kwargs
)

Resolve the mapped array on the field, with mapping applied. Uses Records.field_config.

get_by_col_idxs method ¶

Records.get_by_col_idxs(
    col_idxs
)

Get records corresponding to column indices.

Returns new records array.
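
A plain-Python sketch of selecting records by column indices follows. The helper `select_by_col_idxs` is hypothetical, and renumbering the column index to its position in the selection is an assumption of this sketch rather than documented behavior.

```python
# (id, col, idx, some_field) records for three columns.
records = [
    (0, 0, 0, 10.0), (1, 0, 1, 11.0), (2, 0, 2, 12.0),
    (3, 1, 0, 13.0), (4, 1, 1, 14.0), (5, 1, 2, 15.0),
    (6, 2, 0, 16.0), (7, 2, 1, 17.0), (8, 2, 2, 18.0),
]


def select_by_col_idxs(records, col_idxs):
    """Hypothetical helper: keep records of the requested columns, in the requested order."""
    selected = []
    for new_col, old_col in enumerate(col_idxs):
        for rec_id, col, idx, value in records:
            if col == old_col:
                # Assumption: the column index is renumbered to its position in the selection.
                selected.append((rec_id, new_col, idx, value))
    return selected


selected = select_by_col_idxs(records, [1, 2])
print(selected)
```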

get_field_arr method ¶

Records.get_field_arr(
    field
)

Resolve the array of the field. Uses Records.field_config.

get_field_mapping method ¶

Records.get_field_mapping(
    field
)

Resolve the mapping of the field. Uses Records.field_config.

get_field_name method ¶

Records.get_field_name(
    field
)

Resolve the name of the field. Uses Records.field_config.

get_field_setting method ¶

Records.get_field_setting(
    field,
    setting,
    default=None
)

Resolve any setting of the field. Uses Records.field_config.
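
A lookup with a default can be sketched on a plain dictionary shaped like the field config shown earlier. The function `lookup_field_setting` is hypothetical and only mirrors the documented signature; it is not the library's implementation.

```python
field_config = {
    "dtype": None,
    "settings": {
        "id": {"name": "id", "title": "Id"},
        "col": {"name": "col", "title": "Column", "mapping": "columns"},
        "idx": {"name": "idx", "title": "Timestamp", "mapping": "index"},
    },
}


def lookup_field_setting(config, field, setting, default=None):
    """Return the setting of a field, or `default` if the field or setting is missing."""
    return config["settings"].get(field, {}).get(setting, default)


print(lookup_field_setting(field_config, "idx", "title"))
print(lookup_field_setting(field_config, "some_field", "title", "some_field"))
print(lookup_field_setting(field_config, "id", "mapping"))
```

Falling back to a default lets fields without an entry in the config, such as a custom `some_field`, still resolve to something usable.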

get_field_title method ¶

Records.get_field_title(
    field
)

Resolve the title of the field. Uses Records.field_config.

get_map_field method ¶

Records.get_map_field(
    field,
    **kwargs
)

Resolve the mapped array of the field. Uses Records.field_config.

get_map_field_to_index method ¶

Records.get_map_field_to_index(
    field,
    **kwargs
)

Resolve the mapped array on the field, with index applied. Uses Records.field_config.

id_arr property ¶

Get id array.

idx_arr property ¶

Get index array.

indexing_func method ¶

Records.indexing_func(
    pd_indexing_func,
    **kwargs
)

Perform indexing on Records.

indexing_func_meta method ¶

Records.indexing_func_meta(
    pd_indexing_func,
    **kwargs
)

Perform indexing on Records and return metadata.

is_sorted method ¶

Records.is_sorted(
    incl_id=False
)

Check whether records are sorted.
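
What such a check involves can be sketched in plain Python, assuming the sort order is by column first and, when `incl_id` is set, by id second. The function `records_are_sorted` is hypothetical.

```python
def records_are_sorted(records, incl_id=False):
    """Check that (id, col, ...) records are ordered by column, optionally also by id."""
    keys = [(col, rec_id) if incl_id else (col,) for rec_id, col, *_ in records]
    return all(a <= b for a, b in zip(keys, keys[1:]))


# Columns are in order, but ids within column 0 are not.
records = [(1, 0, 0, 11.0), (0, 0, 1, 10.0), (2, 1, 0, 12.0)]

print(records_are_sorted(records))
print(records_are_sorted(records, incl_id=True))
```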

map method ¶

Records.map(
    map_func_nb,
    *args,
    dtype=None,
    **kwargs
)

Map each record to a scalar value. Returns mapped array.

See map_records_nb().

**kwargs are passed to Records.map_array().

map_array method ¶

Records.map_array(
    a,
    idx_arr=None,
    mapping=None,
    group_by=None,
    **kwargs
)

Convert array to mapped array.

The length of the array should match that of the records.

map_field method ¶

Records.map_field(
    field,
    **kwargs
)

Convert field to mapped array.

**kwargs are passed to Records.map_array().

metrics class variable ¶

Metrics supported by Records.

Config({
    "start": {
        "title": "Start",
        "calc_func": "<function Records.<lambda> at 0x119967060>",
        "agg_func": null,
        "tags": "wrapper"
    },
    "end": {
        "title": "End",
        "calc_func": "<function Records.<lambda> at 0x119967100>",
        "agg_func": null,
        "tags": "wrapper"
    },
    "period": {
        "title": "Period",
        "calc_func": "<function Records.<lambda> at 0x1199671a0>",
        "apply_to_timedelta": true,
        "agg_func": null,
        "tags": "wrapper"
    },
    "count": {
        "title": "Count",
        "calc_func": "count",
        "tags": "records"
    }
})

Returns Records._metrics, which gets (deep) copied upon creation of each instance. Thus, changing this config won't affect the class.

To change metrics, you can either change the config in-place, override this property, or overwrite the instance variable Records._metrics.
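
The copy-on-creation behavior described above can be sketched with `copy.deepcopy`. The class `TinyStats` is a hypothetical stand-in and the replacement title is made up for the example.

```python
import copy


class TinyStats:
    """Hypothetical class holding a class-level metrics config."""

    _metrics = {"count": {"title": "Count", "calc_func": "count"}}

    def __init__(self):
        # Each instance works on its own deep copy of the class-level config.
        self._metrics = copy.deepcopy(type(self)._metrics)

    @property
    def metrics(self):
        return self._metrics


obj = TinyStats()
obj.metrics["count"]["title"] = "Number of Records"  # in-place change on the instance

print(obj.metrics["count"]["title"])
print(TinyStats._metrics["count"]["title"])
```

The in-place change is visible on `obj` only; the class-level config and any new instance keep the original title.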

override_field_config_doc class method ¶

Records.override_field_config_doc(
    __pdoc__,
    source_cls=None
)

Call this method on each subclass that overrides field_config.

plots_defaults property ¶

Defaults for PlotsBuilderMixin.plots().

Merges PlotsBuilderMixin.plots_defaults and records.plots from settings.

recarray property ¶

records property ¶

Records.

records_arr property ¶

Records array.

records_readable property ¶

Records in readable format.

replace method ¶

Records.replace(
    **kwargs
)

See Configured.replace().

Also, makes sure that Records.col_mapper is not passed to the new instance.

sort method ¶

Records.sort(
    incl_id=False,
    group_by=None,
    **kwargs
)

Sort records by columns (primary) and ids (secondary, optional).

Note

Sorting is expensive. A better approach is to append records already in the correct order.
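
The two sort orders can be sketched with Python's stable `sorted`. The data is made up, and the sketch operates on tuples rather than on a structured array.

```python
# (id, col, idx, some_field) records appended out of order.
records = [(3, 1, 1, 13.0), (1, 0, 1, 11.0), (2, 1, 0, 12.0), (0, 0, 0, 10.0)]

# Primary key only: columns are ordered, ids keep their arrival order within a column.
by_col = sorted(records, key=lambda rec: rec[1])

# Primary and secondary key: columns first, then ids.
by_col_and_id = sorted(records, key=lambda rec: (rec[1], rec[0]))

print([rec[0] for rec in by_col])
print([rec[0] for rec in by_col_and_id])
```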

stats_defaults property ¶

Defaults for StatsBuilderMixin.stats().

Merges StatsBuilderMixin.stats_defaults and records.stats from settings.

subplots class variable ¶

Subplots supported by Records.

Config({})

Returns Records._subplots, which gets (deep) copied upon creation of each instance. Thus, changing this config won't affect the class.

To change subplots, you can either change the config in-place, override this property, or overwrite the instance variable Records._subplots.

values property ¶

Records array.

RecordsWithFields class ¶

RecordsWithFields()

Class that exposes a read-only class property RecordsWithFields.field_config.

Subclasses

Records

field_config function ¶

Field config of ${cls_name}.

${field_config}
