Getting started

What is vectorbt?

vectorbt is a Python package for quantitative analysis that takes a novel approach to backtesting: it operates entirely on pandas and NumPy objects, and is accelerated by Numba to analyze any data at speed and scale. This allows for testing of many thousands of strategies in seconds.

In contrast to other backtesters, vectorbt represents complex data as (structured) NumPy arrays. This enables superfast computation using vectorized operations with NumPy and non-vectorized but dynamically compiled operations with Numba. It also integrates Plotly and Jupyter Widgets to display complex charts and dashboards akin to Tableau right in the Jupyter notebook. Due to high performance, vectorbt can process large amounts of data even without GPU and parallelization and enables the user to interact with data-hungry widgets without significant delays.
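
For readers unfamiliar with structured NumPy arrays, here is a minimal sketch of the idea. The `order_dt` dtype and its field names are hypothetical, chosen for illustration only, and are not vectorbt's actual record layout.

```python
import numpy as np

# Hypothetical record layout, for illustration only
order_dt = np.dtype([
    ('id', np.int64),       # record identifier
    ('col', np.int64),      # column (strategy instance) the order belongs to
    ('idx', np.int64),      # row (time step) at which the order was filled
    ('size', np.float64),   # signed number of units traded
    ('price', np.float64),  # fill price
])

orders = np.array([
    (0, 0, 5, 1.0, 100.0),
    (1, 0, 9, -1.0, 110.0),
    (2, 1, 3, 2.0, 50.0),
], dtype=order_dt)

# Fields are accessed by name and behave like regular arrays,
# so filtering and arithmetic stay vectorized.
col0 = orders[orders['col'] == 0]
cash_flow = -(col0['size'] * col0['price']).sum()
```

Each record keeps heterogeneous fields together, while every field can still be sliced out as a plain array for fast computation.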

With vectorbt, you can:

- Backtest strategies in a couple of lines of Python code
- Enjoy the best of both worlds: the ecosystem of Python and the speed of C
- Retain full control over execution and your data (as opposed to web-based services such as TradingView)
- Optimize your trading strategy against many parameters, assets, and periods in one go
- Uncover hidden patterns in financial markets
- Analyze time series and engineer new features for ML models
- Supercharge pandas and your favorite tools to run much faster
- Visualize strategy performance using interactive charts and dashboards (both in Jupyter and browser)
- Fetch and process data periodically, send Telegram notifications, and more
- Support us to get access to parallelization, portfolio optimization, pattern recognition, event projections, limit orders, leverage, and 100+ other hot features!

Why vectorbt?

While there are many great backtesting packages for Python, vectorbt combines an extremely fast backtester with a data science tool: it excels at processing performance and offers interactive tools to explore complex phenomena in trading. With it, you can traverse a huge number of strategy configurations, time periods, and instruments in little time, to explore where your strategy performs best and to uncover hidden patterns in data. Accessing and analyzing this information yourself could give you an informational advantage in your own trading.

How it works

vectorbt was implemented to address common performance shortcomings of backtesting libraries. It builds upon the idea that each instance of a trading strategy can be represented in a vectorized form, so multiple strategy instances can be packed into a single multi-dimensional array, processed in a highly efficient manner, and compared easily. It departs from the traditional OOP approach, which represents strategies as classes and other data structures: these are easier to write and extend than vectors, but harder to analyze, and they require additional effort to process quickly.

Thanks to the time-series nature of trading data, most aspects of backtesting can be translated into vectors. Instead of processing one element at a time, vectorization allows us to avoid naive looping and perform the same operation on all elements at once. The path-dependency problem that comes with vectorization is solved by Numba, which lets us write iterative code and compiles slow Python loops to run at native machine-code speed.
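
The difference between the two kinds of computation can be sketched with plain NumPy. The toy prices and the `simulate_position` helper below are illustrative assumptions and do not reproduce vectorbt's simulation logic; the loop is left undecorated to keep the example dependency-free, but it is the kind of function one would compile with Numba.

```python
import numpy as np

price = np.array([10.0, 11.0, 12.1, 11.0, 12.0])

# Vectorized: one expression computes all period returns at once
returns_vec = price[1:] / price[:-1] - 1

# Equivalent naive loop, processing one element at a time
returns_loop = np.empty(len(price) - 1)
for i in range(1, len(price)):
    returns_loop[i - 1] = price[i] / price[i - 1] - 1

# Path-dependent: each step depends on the state left by the previous
# step, so the logic is naturally written as a loop.
def simulate_position(entries, exits):
    in_position = False
    position = np.zeros(len(entries), dtype=np.bool_)
    for i in range(len(entries)):
        if entries[i] and not in_position:
            in_position = True
        elif exits[i] and in_position:
            in_position = False
        position[i] = in_position
    return position

entries = np.array([False, True, True, False, False])
exits = np.array([False, False, False, True, False])
position = simulate_position(entries, exits)
```

The second entry signal is ignored because a position is already open, which is exactly the kind of state that a single vectorized expression cannot easily capture.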

Example

Let's say we have a complex strategy with lots of (hyper-)parameters that have to be tuned. While brute-forcing all combinations seems rather unrealistic, we can still interpolate, and vectorbt makes exactly this possible. It doesn't care whether we have one strategy instance or millions. As long as their vectors can be concatenated into a matrix and we have enough memory, we can analyze them in one go.
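
As a rough sketch of how a parameter grid turns into many strategy instances, the standard library can enumerate window pairs. The window range and variable names here are assumptions made for illustration.

```python
from itertools import combinations

windows = range(2, 6)  # candidate window sizes: 2, 3, 4, 5

# Every (fast, slow) pair with fast < slow
pairs = list(combinations(windows, 2))
fast_windows = [fast for fast, _ in pairs]
slow_windows = [slow for _, slow in pairs]
```

Each pair corresponds to one strategy instance, and therefore to one column once the instances are stacked into a matrix.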

Let's start with fetching the daily price of Bitcoin:

>>> import numpy as np
>>> import pandas as pd
>>> from datetime import datetime

>>> import vectorbt as vbt

>>> # Prepare data
>>> start = '2019-01-01 UTC'  # crypto is in UTC
>>> end = '2020-01-01 UTC'
>>> btc_price = vbt.YFData.download('BTC-USD', start=start, end=end).get('Close')

>>> btc_price
Date
2019-01-01 00:00:00+00:00    3843.520020
2019-01-02 00:00:00+00:00    3943.409424
2019-01-03 00:00:00+00:00    3836.741211
...                                  ...
2019-12-30 00:00:00+00:00    7292.995117
2019-12-31 00:00:00+00:00    7193.599121
2020-01-01 00:00:00+00:00    7200.174316
Freq: D, Name: Close, Length: 366, dtype: float64

We are going to test a simple Dual Moving Average Crossover (DMAC) strategy. For this, we are going to use the MA class for calculating moving averages and generating signals.

Our first test is rather simple: buy when the 10-day moving average crosses above the 20-day moving average, and sell when the opposite occurs.

>>> fast_ma = vbt.MA.run(btc_price, 10, short_name='fast')
>>> slow_ma = vbt.MA.run(btc_price, 20, short_name='slow')

>>> entries = fast_ma.ma_crossed_above(slow_ma)
>>> entries
Date
2019-01-01 00:00:00+00:00    False
2019-01-02 00:00:00+00:00    False
2019-01-03 00:00:00+00:00    False
...                            ...
2019-12-30 00:00:00+00:00    False
2019-12-31 00:00:00+00:00    False
2020-01-01 00:00:00+00:00    False
Freq: D, Length: 366, dtype: bool

>>> exits = fast_ma.ma_crossed_below(slow_ma)
>>> exits
Date
2019-01-01 00:00:00+00:00    False
2019-01-02 00:00:00+00:00    False
2019-01-03 00:00:00+00:00    False
...                            ...
2019-12-30 00:00:00+00:00    False
2019-12-31 00:00:00+00:00    False
2020-01-01 00:00:00+00:00    False
Freq: D, Length: 366, dtype: bool

>>> pf = vbt.Portfolio.from_signals(btc_price, entries, exits)
>>> pf.total_return()
0.636680693047752

One strategy instance of DMAC produced one column in signals and one performance value.
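
To make the crossover rule concrete, here is a pandas-only sketch of what "crossed above" and "crossed below" mean. It is an approximation under stated assumptions, not vectorbt's implementation: the toy prices and window sizes are made up, and this simple version treats the first valid comparison after the warm-up period as a crossover, which vectorbt may handle differently.

```python
import pandas as pd

price = pd.Series([1.0, 2.0, 3.0, 2.0, 1.0, 2.0, 4.0, 6.0])

fast = price.rolling(2).mean()
slow = price.rolling(3).mean()

# Comparisons involving NaN (warm-up period) evaluate to False
above = fast > slow
below = fast < slow

# A crossover is a bar where the condition holds now
# but did not hold on the previous bar.
crossed_above = above & ~above.shift(1, fill_value=False)
crossed_below = below & ~below.shift(1, fill_value=False)
```

Only the bars where the relationship flips are marked, which is why the signal series printed above are mostly False.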

Adding one more strategy instance is as simple as adding one more column. Here we pass an array of window sizes instead of a single value. For each window size in this array, MA.run computes a moving average over the entire price series and stores it in a distinct column.

>>> # Multiple strategy instances: (10, 30) and (20, 30)
>>> fast_ma = vbt.MA.run(btc_price, [10, 20], short_name='fast')
>>> slow_ma = vbt.MA.run(btc_price, [30, 30], short_name='slow')

>>> entries = fast_ma.ma_crossed_above(slow_ma)
>>> entries
fast_window                   10     20
slow_window                   30     30
Date
2019-01-01 00:00:00+00:00  False  False
2019-01-02 00:00:00+00:00  False  False
2019-01-03 00:00:00+00:00  False  False
...                          ...    ...
2019-12-30 00:00:00+00:00  False  False
2019-12-31 00:00:00+00:00  False  False
2020-01-01 00:00:00+00:00  False  False

[366 rows x 2 columns]

>>> exits = fast_ma.ma_crossed_below(slow_ma)
>>> exits
fast_window                   10     20
slow_window                   30     30
Date
2019-01-01 00:00:00+00:00  False  False
2019-01-02 00:00:00+00:00  False  False
2019-01-03 00:00:00+00:00  False  False
...                          ...    ...
2019-12-30 00:00:00+00:00  False  False
2019-12-31 00:00:00+00:00  False  False
2020-01-01 00:00:00+00:00  False  False

[366 rows x 2 columns]

>>> pf = vbt.Portfolio.from_signals(btc_price, entries, exits)
>>> pf.total_return()
fast_window  slow_window
10           30             0.848840
20           30             0.543411
Name: total_return, dtype: float64

For the sake of convenience, vectorbt has created the column levels fast_window and slow_window for us to easily distinguish which window size corresponds to which column.
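
The column layout can be imitated with pandas alone. This is a sketch under assumptions (toy prices, small windows, manual construction of the column index) and not how vectorbt builds its output internally.

```python
import numpy as np
import pandas as pd

price = pd.Series(np.arange(1.0, 11.0), name='Close')  # toy prices 1..10

fast_windows = [2, 3]
slow_windows = [4, 4]

# One column per (fast, slow) pair, labeled by a two-level column index
columns = pd.MultiIndex.from_arrays(
    [fast_windows, slow_windows], names=['fast_window', 'slow_window'])

fast = pd.concat([price.rolling(w).mean() for w in fast_windows], axis=1)
slow = pd.concat([price.rolling(w).mean() for w in slow_windows], axis=1)
fast.columns = columns
slow.columns = columns

# The element-wise comparison applies to all columns at once
fast_above_slow = fast > slow
```

Because both frames share the same column index, a column such as `fast_above_slow[(2, 4)]` can be traced back to its window pair directly.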

Notice how the signal generation part remains the same for each example: most functions in vectorbt work on time series of any shape. This makes it possible to build analysis pipelines that are agnostic to the shape of the input data.

The representation of different features as columns offers endless possibilities for backtesting. We could, for example, go a step further and conduct the same tests for Ethereum. To compare both instruments, combine price series for Bitcoin and Ethereum into one DataFrame and run the same backtesting pipeline.

>>> # Multiple strategy instances and instruments
>>> eth_price = vbt.YFData.download('ETH-USD', start=start, end=end).get('Close')
>>> comb_price = btc_price.vbt.concat(eth_price,
...     keys=pd.Index(['BTC', 'ETH'], name='symbol'))
>>> comb_price.vbt.drop_levels(-1, inplace=True)
>>> comb_price
symbol                             BTC         ETH
Date
2019-01-01 00:00:00+00:00  3843.520020  140.819412
2019-01-02 00:00:00+00:00  3943.409424  155.047684
2019-01-03 00:00:00+00:00  3836.741211  149.135010
...                                ...         ...
2019-12-30 00:00:00+00:00  7292.995117  132.633484
2019-12-31 00:00:00+00:00  7193.599121  129.610855
2020-01-01 00:00:00+00:00  7200.174316  130.802002

[366 rows x 2 columns]

>>> fast_ma = vbt.MA.run(comb_price, [10, 20], short_name='fast')
>>> slow_ma = vbt.MA.run(comb_price, [30, 30], short_name='slow')

>>> entries = fast_ma.ma_crossed_above(slow_ma)
>>> entries
fast_window                          10            20
slow_window                          30            30
symbol                       BTC    ETH    BTC    ETH
Date
2019-01-01 00:00:00+00:00  False  False  False  False
2019-01-02 00:00:00+00:00  False  False  False  False
2019-01-03 00:00:00+00:00  False  False  False  False
...                          ...    ...    ...    ...
2019-12-30 00:00:00+00:00  False  False  False  False
2019-12-31 00:00:00+00:00  False  False  False  False
2020-01-01 00:00:00+00:00  False  False  False  False

[366 rows x 4 columns]

>>> exits = fast_ma.ma_crossed_below(slow_ma)
>>> exits
fast_window                          10            20
slow_window                          30            30
symbol                       BTC    ETH    BTC    ETH
Date
2019-01-01 00:00:00+00:00  False  False  False  False
2019-01-02 00:00:00+00:00  False  False  False  False
2019-01-03 00:00:00+00:00  False  False  False  False
...                          ...    ...    ...    ...
2019-12-30 00:00:00+00:00  False  False  False  False
2019-12-31 00:00:00+00:00  False  False  False  False
2020-01-01 00:00:00+00:00  False  False  False  False

[366 rows x 4 columns]

>>> pf = vbt.Portfolio.from_signals(comb_price, entries, exits)
>>> pf.total_return()
fast_window  slow_window  symbol
10           30           BTC       0.848840
                          ETH       0.244204
20           30           BTC       0.543411
                          ETH      -0.319102
Name: total_return, dtype: float64

>>> mean_return = pf.total_return().groupby('symbol').mean()
>>> mean_return.vbt.barplot(xaxis_title='Symbol', yaxis_title='Mean total return')

Not only strategies and instruments can act as separate features, but also time. If we want to find out when our strategy performs best, it's reasonable to backtest over multiple time periods. vectorbt allows us to split one time period into many, provided they have the same length and frequency, and represent them as distinct columns. For example, let's split the whole time period into two equal time periods and backtest them at once.

>>> # Multiple strategy instances, instruments, and time periods
>>> mult_comb_price, _ = comb_price.vbt.range_split(n=2)
>>> mult_comb_price
split_idx                         0                         1
symbol              BTC         ETH           BTC         ETH
0           3843.520020  140.819412  11961.269531  303.099976
1           3943.409424  155.047684  11215.437500  284.523224
2           3836.741211  149.135010  10978.459961  287.997528
...                 ...         ...           ...         ...
180        10817.155273  290.695984   7292.995117  132.633484
181        10583.134766  293.641113   7193.599121  129.610855
182        10801.677734  291.596436   7200.174316  130.802002

[183 rows x 4 columns]

>>> fast_ma = vbt.MA.run(mult_comb_price, [10, 20], short_name='fast')
>>> slow_ma = vbt.MA.run(mult_comb_price, [30, 30], short_name='slow')

>>> entries = fast_ma.ma_crossed_above(slow_ma)
>>> exits = fast_ma.ma_crossed_below(slow_ma)

>>> pf = vbt.Portfolio.from_signals(mult_comb_price, entries, exits, freq='1D')
>>> pf.total_return()
fast_window  slow_window  split_idx  symbol
10           30           0          BTC       1.632259
                                     ETH       0.946786
                          1          BTC      -0.288720
                                     ETH      -0.308387
20           30           0          BTC       1.721449
                                     ETH       0.343274
                          1          BTC      -0.418280
                                     ETH      -0.257947
Name: total_return, dtype: float64

Notice how the index is no longer datetime-like, since it captures multiple time periods. That's why we need to pass the frequency freq to the Portfolio class here: without it, performance metrics that depend on time, such as the Sharpe ratio, cannot be computed.
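
The effect of splitting on the index can be reproduced with plain NumPy and pandas. This sketch uses made-up prices and does not reproduce range_split itself; it only shows why the datetime index is lost once periods become columns.

```python
import numpy as np
import pandas as pd

index = pd.date_range('2019-01-01', periods=6, freq='D')
price = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], index=index)

n = 2
length = len(price) // n  # every period must have the same length

# Reshape to (n, length) and transpose so each period becomes a column
split = pd.DataFrame(
    price.to_numpy()[:n * length].reshape(n, length).T,
    columns=pd.Index(range(n), name='split_idx'))
```

The rows of `split` are positions within a period rather than dates, so the frequency has to be supplied separately.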

The index hierarchy of the final performance series can then be used to group the performance by any feature, such as window pair, symbol, or time period.

>>> mean_return = pf.total_return().groupby(['split_idx', 'symbol']).mean()
>>> mean_return.unstack(level=-1).vbt.barplot(
...     xaxis_title='Split index',
...     yaxis_title='Mean total return',
...     legend_title_text='Symbol')
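
Grouping by other index levels works the same way. The sketch below rebuilds the performance series from the values printed above using pandas only, so it can be run without downloading data, and groups it by symbol and by window pair.

```python
import pandas as pd

# Same level order as in the output above
index = pd.MultiIndex.from_product(
    [[10, 20], [30], [0, 1], ['BTC', 'ETH']],
    names=['fast_window', 'slow_window', 'split_idx', 'symbol'])
total_return = pd.Series(
    [1.632259, 0.946786, -0.288720, -0.308387,
     1.721449, 0.343274, -0.418280, -0.257947],
    index=index, name='total_return')

# Group by a single level or by several levels at once
by_symbol = total_return.groupby('symbol').mean()
by_pair = total_return.groupby(['fast_window', 'slow_window']).mean()
```

Any combination of level names can be passed to `groupby`, so the same series answers questions about symbols, window pairs, and periods.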

There is much more to backtesting than simply stacking columns: vectorbt offers functions for most parts of a backtesting pipeline, from building indicators and generating signals to modeling portfolio performance and visualizing results.

Disclaimer

This software is for educational purposes only. Do not risk money which you are afraid to lose. USE THE SOFTWARE AT YOUR OWN RISK. THE AUTHORS AND ALL AFFILIATES ASSUME NO RESPONSIBILITY FOR YOUR TRADING RESULTS.