Backtesting.py Quick Start User Guide

This tutorial shows some of the features of backtesting.py, a Python framework for backtesting trading strategies.

Backtesting.py is a small, lightweight, and fast backtesting framework built on the standard Python data stack (Python 3.6+, pandas, NumPy, Bokeh). Its API is small and simple, so it is easy to remember and quick to shape towards meaningful results. The library is not designed for stock picking or for trading strategies that rely on arbitrage or multi-asset portfolio rebalancing. Instead, it works with one tradeable asset at a time and is best suited for optimizing position entry and exit signal strategies and decisions based on the values of technical indicators. It also serves as a versatile tool for interactive trade visualization and statistics.

Data

You bring your own data. Backtesting ingests all kinds of OHLC data (stocks, forex, futures, crypto, ...) as a pandas.DataFrame with columns 'Open', 'High', 'Low', 'Close' and (optionally) 'Volume'. Such data is widely obtainable (see: pandas-datareader, Quandl, findatapy). Besides these, your data frames can have additional columns, which are accessible in your strategies in a similar manner.

The DataFrame should ideally have a datetime index (convert it with pd.to_datetime()); otherwise, a simple range index will do.
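As a minimal sketch of preparing such a frame, the snippet below renames columns and builds a datetime index. The raw values and the lowercase column names are made up for illustration; only the target column names and pd.to_datetime() come from the text above.

```python
import pandas as pd

# Hypothetical raw data, e.g. as loaded from a CSV file with a text date column.
raw = pd.DataFrame({
    'date': ['2024-01-04', '2024-01-02', '2024-01-03'],
    'open': [102.0, 100.0, 101.5],
    'high': [104.5, 102.0, 103.0],
    'low': [101.0, 99.5, 100.5],
    'close': [104.0, 101.5, 102.0],
    'volume': [1100, 1200, 1500],
})

# Rename the price columns to the capitalized names the framework expects.
data = raw.rename(columns={
    'open': 'Open', 'high': 'High', 'low': 'Low',
    'close': 'Close', 'volume': 'Volume',
})

# Convert the text dates and use them as an ascending datetime index.
data['date'] = pd.to_datetime(data['date'])
data = data.set_index('date').sort_index()

print(data)
```

Sorting the index keeps the bars in chronological order, which is what a bar-by-bar simulation assumes.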

Strategy

Let's create our first strategy to backtest on historical Google data: a simple moving average (MA) cross-over strategy.

Backtesting.py doesn't ship its own set of technical analysis indicators. Users favoring TA should probably refer to functions from proven indicator libraries, such as TA-Lib or Tulipy. For this example, though, we can define a simple moving average helper function ourselves.
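One way to write such a helper is with a pandas rolling window. This is a sketch; the name SMA matches the "SMA function" mentioned later in this guide, and the sample values are made up.

```python
import pandas as pd


def SMA(values, n):
    """Return the simple moving average of `values` over a window of `n` points.

    The first n - 1 entries are NaN because the window is not yet full.
    """
    return pd.Series(values).rolling(n).mean()


# Made-up closing prices, just to show the shape of the result.
print(SMA([1, 2, 3, 4, 5], 3))
```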

A new strategy needs to extend the Strategy class and override its two abstract methods: init() and next().

Method init() is invoked before the strategy is run. Within it, one ideally precomputes, in an efficient, vectorized manner, whatever indicators and signals the strategy depends on.

Method next() is then iteratively called by the Backtest instance, once for each data point (data frame row), simulating the incremental availability of each new full candlestick bar.

Note that backtesting.py cannot make decisions or trades within candlesticks: any new orders are executed on the next candle's open (or on the current candle's close if trade_on_close=True). If you find yourself wishing to trade within candlesticks (e.g. daytrading), you need to begin with more fine-grained (e.g. hourly) data instead.

In init() as well as in next(), the data the strategy is simulated on is available as an instance variable self.data.

In init(), we declare and compute indicators indirectly by wrapping them in self.I(). The wrapper is passed a function (our SMA function) along with any arguments to call it with (our close values and the MA lag). Indicators wrapped in this way will be automatically plotted, and their legend strings will be intelligently inferred.

In next(), we simply check whether the faster moving average just crossed the slower one. If it crossed upwards, we close any open short position and go long; if it crossed downwards, we close any open long position and go short. Note that we don't adjust the order size, so Backtesting.py assumes the maximal possible position. We use the backtesting.lib.crossover() function instead of writing more obscure and confusing conditions that compare the last two values of both averages by hand.
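For comparison, a hand-written version of the upward cross check might look like the sketch below. The function name crossed_above and the sample values are hypothetical; inside a strategy the same comparison would be made on the strategy's own indicator arrays.

```python
import numpy as np


def crossed_above(series1, series2):
    """True if series1 was below series2 on the previous bar and is above it now."""
    return bool(series1[-2] < series2[-2] and series1[-1] > series2[-1])


# Made-up indicator values: the fast average rises through the slow one on the last bar.
fast = np.array([1.0, 2.0, 4.0])
slow = np.array([3.0, 3.0, 3.0])

print(crossed_above(fast, slow))  # fast moved from below slow to above slow
print(crossed_above(slow, fast))  # slow did not cross above fast
```

Spelling out the indices by hand works, but it is easy to mix up which bar is which, which is the reason for preferring the library helper.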

In init(), the whole series of points was available, whereas in next(), the length of self.data and all declared indicators is adjusted on each next() call so that array[-1] (e.g. self.data.Close[-1] or self.sma1[-1]) always contains the most recent value, array[-2] the previous value, etc. (ordinary Python indexing of ascending-sorted 1D arrays).
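The growing-window behaviour can be emulated in plain NumPy. This is only an illustration of the indexing convention, not the library's implementation; the replay loop and the price values are made up.

```python
import numpy as np

# Made-up closing prices, oldest first.
close = np.array([10.0, 11.0, 12.5, 12.0, 13.0])

seen = []
for i in range(1, len(close)):
    # On each simulated step only the bars up to and including bar i are visible.
    visible = close[:i + 1]
    # visible[-1] is the most recent value, visible[-2] the one before it.
    seen.append((float(visible[-1]), float(visible[-2])))

print(seen)
```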

Note: self.data and any indicators wrapped with self.I (e.g. self.sma1) are NumPy arrays for performance reasons. If you prefer pandas Series or DataFrame objects, use Strategy.data.<column>.s or Strategy.data.df accessors respectively. You could also construct the series manually, e.g. pd.Series(self.data.Close, index=self.data.index).

We could avoid the self.position.close() calls by constructing the Backtest instance with Backtest(..., exclusive_orders=True), which makes each new order close the previous position automatically.

Backtesting

Let's see how our strategy performs on historical Google data. The Backtest instance is initialized with OHLC data and a strategy class (see the API reference for additional options). We begin with 10,000 units of cash and set the broker's commission to a realistic 0.2%.

The Backtest.run() method returns a pandas Series of simulation results and statistics associated with our strategy. We see that this simple strategy returns almost 600% over a period of 9 years, with a maximum drawdown of 33% and a longest drawdown period of almost two years.

Backtest.plot() method provides the same insights in a more visual form.

Optimization

We hard-coded the two lag parameters (n1 and n2) into our strategy above. However, the strategy may work better with 15–30 or some other cross-over. We declared the parameters as optimizable by making them class variables.

We optimize the two parameters by calling the Backtest.optimize() method, passing each parameter as a keyword argument that points to its pool of candidate values. Parameter n1 is tested for values between 5 and 30, and parameter n2 for values between 10 and 70. Some combinations of the two parameters are invalid: n1 should not be larger than or equal to n2. We restrict the admissible combinations with an ad hoc constraint function, which takes the parameters and returns True (i.e. admissible) whenever n1 is less than n2. Additionally, we search for the parameter combination that maximizes return over the observed period. We could instead choose to optimize any other key from the returned stats series.

We can look into stats['_strategy'] to access the Strategy instance and its optimal parameter values (10 and 15).