First Zipline Backtest

This is the "Hello World" of Zipline backtests from the book Trading Evolved by Andreas F. Clenow (https://www.followingthetrend.com/trading-evolved/). The Google Colab installation preliminaries are by Jim White, and the notebook is on GitHub at https://github.com/fovi-llc/trading_evolved/Chapter 7 - Backtesting Trading Strategies/First Zipline Backtest.ipynb. The backtest itself is code from Clenow's book, packaged by Ahmed Aboumalwa at https://github.com/ahmedengu/trading_evolved with updates from https://github.com/RiseT/trading_evolved.

Install TA-Lib

Colab doesn't have the C/C++ TA-Lib (https://ta-lib.org/) library installed, and the Linux distribution it runs on doesn't package it either, so installing it here is annoying.

This is a fast way to get TA-Lib installed on Colab, binary included, in a form that works for zipline-reloaded. The script hardcodes the python3.10 dist-packages path, so it will break when Colab upgrades to Python 3.11.

Even though the PyPI package ta-lib-bin supplies the Python package talib, it won't satisfy zipline-reloaded's dependency on the PyPI package ta-lib. pip will then try to install ta-lib from PyPI, which fails to build its wheel on Colab. The workaround is to install ta-lib-bin and then make a copy of its metadata under the name ta-lib, which satisfies the dependency requirement.

In [1]:
%%bash

pip install ta-lib-bin==0.4.26
pip show ta-lib-bin|grep Location
cd /usr/local/lib/python3.10/dist-packages
cp -R ta_lib_bin.libs ta_lib.libs
cp -R ta_lib_bin-0.4.26.dist-info ta_lib-0.4.26.dist-info
sed -i 's/^Name: ta-lib-bin/Name: ta-lib/' ta_lib-0.4.26.dist-info/METADATA
diff -u ta_lib_bin-0.4.26.dist-info/METADATA ta_lib-0.4.26.dist-info/METADATA
[ $? -eq 1 ] && exit 0
echo "Should have got diff for name in METADATA file"
exit 1
Collecting ta-lib-bin==0.4.26
  Downloading ta_lib_bin-0.4.26-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (19 kB)
Requirement already satisfied: numpy in /usr/local/lib/python3.10/dist-packages (from ta-lib-bin==0.4.26) (1.26.4)
Downloading ta_lib_bin-0.4.26-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.9 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.9/2.9 MB 31.1 MB/s eta 0:00:00
Installing collected packages: ta-lib-bin
Successfully installed ta-lib-bin-0.4.26
Location: /usr/local/lib/python3.10/dist-packages
--- ta_lib_bin-0.4.26.dist-info/METADATA	2024-09-22 23:00:28.122164238 +0000
+++ ta_lib-0.4.26.dist-info/METADATA	2024-09-22 23:00:33.512705317 +0000
@@ -1,5 +1,5 @@
 Metadata-Version: 2.1
-Name: ta-lib-bin
+Name: ta-lib
 Version: 0.4.26
 Summary: Python wrapper for TA-Lib
 Home-page: https://github.com/minggnim/ta-lib
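
The same copy-and-rename steps can be sketched in Python, which avoids hardcoding the python3.10 path. This is only a sketch under assumptions: the function name `alias_distribution` is made up for illustration, and `site_dir` is expected to be the directory that `pip show ta-lib-bin` prints as Location.

```python
import re
import shutil
from pathlib import Path


def alias_distribution(site_dir, version="0.4.26"):
    """Copy ta-lib-bin's installed metadata under the name ta-lib.

    Hypothetical helper mirroring the shell cell above; site_dir is the
    directory reported as Location by `pip show ta-lib-bin`.
    """
    site_dir = Path(site_dir)
    src_info = site_dir / f"ta_lib_bin-{version}.dist-info"
    dst_info = site_dir / f"ta_lib-{version}.dist-info"

    # Equivalent of the two `cp -R` lines.
    shutil.copytree(src_info, dst_info, dirs_exist_ok=True)
    src_libs = site_dir / "ta_lib_bin.libs"
    if src_libs.is_dir():
        shutil.copytree(src_libs, site_dir / "ta_lib.libs", dirs_exist_ok=True)

    # Equivalent of the `sed` line: rename the distribution in METADATA.
    metadata = dst_info / "METADATA"
    new_text, count = re.subn(
        r"^Name: ta-lib-bin$",
        "Name: ta-lib",
        metadata.read_text(),
        count=1,
        flags=re.MULTILINE,
    )
    if count != 1:
        raise RuntimeError("Expected a 'Name: ta-lib-bin' line in METADATA")
    metadata.write_text(new_text)
    return dst_info
```

Like the shell version, it fails loudly if the Name line was not rewritten, since a silent no-op would leave the zipline-reloaded dependency unsatisfied.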

Building TA-Lib binary on Colab

If the above doesn't work for you, this should, but it is more complicated.

Colab doesn't have the C/C++ TA-Lib (https://ta-lib.org/) library installed so we have to build it. This takes a while, so I recommend activating Google Drive using the cell below and then running this script cell. Of course, if the Linux environment changes then the build will need to be deleted and rebuilt.

The cell below is a shell script that builds the TA-Lib C/C++ library, based on https://github.com/TA-Lib/ta-lib-python/issues/590#issuecomment-1534248996.

In [2]:
%%bash

pip show ta-lib
if [ $? -eq 0 ]; then
  echo "ta-lib is already installed"
  exit 0
fi
pwd
if [ -d ta-lib ]; then
  echo "ta-lib directory exists."
else
  wget http://prdownloads.sourceforge.net/ta-lib/ta-lib-0.4.0-src.tar.gz
  tar -xzf ta-lib-0.4.0-src.tar.gz
fi
cd ta-lib
./configure --prefix=/usr
make
make install
Name: ta-lib
Version: 0.4.26
Summary: Python wrapper for TA-Lib
Home-page: https://github.com/minggnim/ta-lib
Author: John Benediktsson
Author-email: mrjbq7@gmail.com
License: BSD
Location: /usr/local/lib/python3.10/dist-packages
Requires: numpy
Required-by: 
ta-lib is already installed

An alternative way to run those commands is described at https://stackoverflow.com/questions/49648391/how-to-install-ta-lib-in-google-colab.
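
After a build, one quick sanity check is to ask the system linker whether it can see the C library. The sketch below assumes the shared library is named libta_lib, as in the 0.4.0 sources, and the helper name `ta_lib_c_library` is made up for illustration. A result of None can also mean the linker cache simply hasn't been refreshed yet.

```python
from ctypes.util import find_library


def ta_lib_c_library():
    """Return the library name the system linker resolves for TA-Lib, or None."""
    # Assumption: the C library is installed as libta_lib.
    return find_library("ta_lib")


found = ta_lib_c_library()
print("TA-Lib C library:", found if found else "not found")
```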

pip install zipline-reloaded

Quantopian's Zipline is no longer maintained, but Stefan Jansen (https://www.ml4trading.io/), author of the comprehensive (and recommended) book Machine Learning for Trading, which also uses Zipline, has "reloaded" it along with the dependencies that became unsupported after the book was published. Code for Stefan's book is at https://github.com/stefan-jansen/machine-learning-for-trading and the reloaded Python packages (zipline-reloaded, bcolz-zipline, pyfolio-reloaded, empyrical-reloaded, and alphalens-reloaded) are here: https://github.com/stefan-jansen?tab=repositories.

I use the bcolz-zipline>=1.2.11 and numpy<2 constraints here to deal with the incompatibility problems caused by the move to NumPy 2 (https://github.com/stefan-jansen/bcolz-zipline/issues/61). If you're doing this in the future then hopefully the constraints can be omitted. Also, if you can accept the incompatibilities that other Colab PyPI packages have with numpy>=2, then go ahead.

In [3]:
!pip install "zipline-reloaded>=3.0.4" "bcolz-zipline>=1.2.11" "numpy<2"
Collecting zipline-reloaded>=3.0.4
  Downloading zipline_reloaded-3.0.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (24 kB)
Collecting bcolz-zipline>=1.2.11
  Downloading bcolz_zipline-1.2.11-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (12 kB)
Requirement already satisfied: numpy<2 in /usr/local/lib/python3.10/dist-packages (1.26.4)
Collecting alembic>=0.7.7 (from zipline-reloaded>=3.0.4)
  Downloading alembic-1.13.2-py3-none-any.whl.metadata (7.4 kB)
Collecting bottleneck>=1.0.0 (from zipline-reloaded>=3.0.4)
  Downloading Bottleneck-1.4.0-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (7.9 kB)
Requirement already satisfied: click>=4.0.0 in /usr/local/lib/python3.10/dist-packages (from zipline-reloaded>=3.0.4) (8.1.7)
Collecting empyrical-reloaded>=0.5.7 (from zipline-reloaded>=3.0.4)
  Downloading empyrical_reloaded-0.5.10-py3-none-any.whl.metadata (21 kB)
Requirement already satisfied: h5py>=2.7.1 in /usr/local/lib/python3.10/dist-packages (from zipline-reloaded>=3.0.4) (3.11.0)
Collecting intervaltree>=2.1.0 (from zipline-reloaded>=3.0.4)
  Downloading intervaltree-3.1.0.tar.gz (32 kB)
  Preparing metadata (setup.py) ... done
Collecting iso3166>=2.1.1 (from zipline-reloaded>=3.0.4)
  Downloading iso3166-2.1.1-py3-none-any.whl.metadata (6.6 kB)
Collecting iso4217>=1.6.20180829 (from zipline-reloaded>=3.0.4)
  Downloading iso4217-1.12.20240625-py2.py3-none-any.whl.metadata (2.7 kB)
Collecting lru-dict>=1.1.4 (from zipline-reloaded>=3.0.4)
  Downloading lru_dict-1.3.0-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (4.5 kB)
Requirement already satisfied: multipledispatch>=0.6.0 in /usr/local/lib/python3.10/dist-packages (from zipline-reloaded>=3.0.4) (1.0.0)
Requirement already satisfied: networkx>=2.0 in /usr/local/lib/python3.10/dist-packages (from zipline-reloaded>=3.0.4) (3.3)
Requirement already satisfied: numexpr>=2.6.1 in /usr/local/lib/python3.10/dist-packages (from zipline-reloaded>=3.0.4) (2.10.1)
Requirement already satisfied: pandas>=1.3 in /usr/local/lib/python3.10/dist-packages (from zipline-reloaded>=3.0.4) (2.1.4)
Requirement already satisfied: patsy>=0.4.0 in /usr/local/lib/python3.10/dist-packages (from zipline-reloaded>=3.0.4) (0.5.6)
Requirement already satisfied: python-dateutil>=2.4.2 in /usr/local/lib/python3.10/dist-packages (from zipline-reloaded>=3.0.4) (2.8.2)
Collecting python-interface>=1.5.3 (from zipline-reloaded>=3.0.4)
  Downloading python-interface-1.6.1.tar.gz (19 kB)
  Preparing metadata (setup.py) ... done
Requirement already satisfied: pytz>=2018.5 in /usr/local/lib/python3.10/dist-packages (from zipline-reloaded>=3.0.4) (2024.2)
Requirement already satisfied: requests>=2.9.1 in /usr/local/lib/python3.10/dist-packages (from zipline-reloaded>=3.0.4) (2.32.3)
Requirement already satisfied: scipy>=0.17.1 in /usr/local/lib/python3.10/dist-packages (from zipline-reloaded>=3.0.4) (1.13.1)
Requirement already satisfied: six>=1.10.0 in /usr/local/lib/python3.10/dist-packages (from zipline-reloaded>=3.0.4) (1.16.0)
Requirement already satisfied: sqlalchemy>=2 in /usr/local/lib/python3.10/dist-packages (from zipline-reloaded>=3.0.4) (2.0.35)
Requirement already satisfied: statsmodels>=0.6.1 in /usr/local/lib/python3.10/dist-packages (from zipline-reloaded>=3.0.4) (0.14.3)
Requirement already satisfied: ta-lib>=0.4.09 in /usr/local/lib/python3.10/dist-packages (from zipline-reloaded>=3.0.4) (0.4.26)
Requirement already satisfied: tables>=3.4.3 in /usr/local/lib/python3.10/dist-packages (from zipline-reloaded>=3.0.4) (3.8.0)
Requirement already satisfied: toolz>=0.8.2 in /usr/local/lib/python3.10/dist-packages (from zipline-reloaded>=3.0.4) (0.12.1)
Collecting exchange-calendars>=4.2.4 (from zipline-reloaded>=3.0.4)
  Downloading exchange_calendars-4.5.6-py3-none-any.whl.metadata (37 kB)
Collecting Mako (from alembic>=0.7.7->zipline-reloaded>=3.0.4)
  Downloading Mako-1.3.5-py3-none-any.whl.metadata (2.9 kB)
Requirement already satisfied: typing-extensions>=4 in /usr/local/lib/python3.10/dist-packages (from alembic>=0.7.7->zipline-reloaded>=3.0.4) (4.12.2)
Collecting peewee<3.17.4 (from empyrical-reloaded>=0.5.7->zipline-reloaded>=3.0.4)
  Downloading peewee-3.17.3.tar.gz (3.0 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.0/3.0 MB 27.3 MB/s eta 0:00:00
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting pyluach (from exchange-calendars>=4.2.4->zipline-reloaded>=3.0.4)
  Downloading pyluach-2.2.0-py3-none-any.whl.metadata (4.3 kB)
Requirement already satisfied: tzdata in /usr/local/lib/python3.10/dist-packages (from exchange-calendars>=4.2.4->zipline-reloaded>=3.0.4) (2024.1)
Collecting korean-lunar-calendar (from exchange-calendars>=4.2.4->zipline-reloaded>=3.0.4)
  Downloading korean_lunar_calendar-0.3.1-py3-none-any.whl.metadata (2.8 kB)
Requirement already satisfied: sortedcontainers<3.0,>=2.0 in /usr/local/lib/python3.10/dist-packages (from intervaltree>=2.1.0->zipline-reloaded>=3.0.4) (2.4.0)
Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests>=2.9.1->zipline-reloaded>=3.0.4) (3.3.2)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests>=2.9.1->zipline-reloaded>=3.0.4) (3.10)
Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests>=2.9.1->zipline-reloaded>=3.0.4) (2.0.7)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests>=2.9.1->zipline-reloaded>=3.0.4) (2024.8.30)
Requirement already satisfied: greenlet!=0.4.17 in /usr/local/lib/python3.10/dist-packages (from sqlalchemy>=2->zipline-reloaded>=3.0.4) (3.1.0)
Requirement already satisfied: packaging>=21.3 in /usr/local/lib/python3.10/dist-packages (from statsmodels>=0.6.1->zipline-reloaded>=3.0.4) (24.1)
Requirement already satisfied: cython>=0.29.21 in /usr/local/lib/python3.10/dist-packages (from tables>=3.4.3->zipline-reloaded>=3.0.4) (3.0.11)
Requirement already satisfied: blosc2~=2.0.0 in /usr/local/lib/python3.10/dist-packages (from tables>=3.4.3->zipline-reloaded>=3.0.4) (2.0.0)
Requirement already satisfied: py-cpuinfo in /usr/local/lib/python3.10/dist-packages (from tables>=3.4.3->zipline-reloaded>=3.0.4) (9.0.0)
Requirement already satisfied: msgpack in /usr/local/lib/python3.10/dist-packages (from blosc2~=2.0.0->tables>=3.4.3->zipline-reloaded>=3.0.4) (1.0.8)
Requirement already satisfied: MarkupSafe>=0.9.2 in /usr/local/lib/python3.10/dist-packages (from Mako->alembic>=0.7.7->zipline-reloaded>=3.0.4) (2.1.5)
Downloading zipline_reloaded-3.0.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (11.0 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 11.0/11.0 MB 90.2 MB/s eta 0:00:00
Downloading bcolz_zipline-1.2.11-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (5.4 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 5.4/5.4 MB 92.8 MB/s eta 0:00:00
Downloading alembic-1.13.2-py3-none-any.whl (232 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 233.0/233.0 kB 16.6 MB/s eta 0:00:00
Downloading Bottleneck-1.4.0-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (356 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 356.2/356.2 kB 20.7 MB/s eta 0:00:00
Downloading empyrical_reloaded-0.5.10-py3-none-any.whl (32 kB)
Downloading exchange_calendars-4.5.6-py3-none-any.whl (196 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 196.2/196.2 kB 13.0 MB/s eta 0:00:00
Downloading iso3166-2.1.1-py3-none-any.whl (9.8 kB)
Downloading iso4217-1.12.20240625-py2.py3-none-any.whl (10 kB)
Downloading lru_dict-1.3.0-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (30 kB)
Downloading korean_lunar_calendar-0.3.1-py3-none-any.whl (9.0 kB)
Downloading Mako-1.3.5-py3-none-any.whl (78 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 78.6/78.6 kB 6.0 MB/s eta 0:00:00
Downloading pyluach-2.2.0-py3-none-any.whl (25 kB)
Building wheels for collected packages: intervaltree, python-interface, peewee
  Building wheel for intervaltree (setup.py) ... done
  Created wheel for intervaltree: filename=intervaltree-3.1.0-py2.py3-none-any.whl size=26096 sha256=b39a83f600fdb1b24be4b42e27e93890ad1ba130d78810bdf62539e63832b620
  Stored in directory: /root/.cache/pip/wheels/fa/80/8c/43488a924a046b733b64de3fac99252674c892a4c3801c0a61
  Building wheel for python-interface (setup.py) ... done
  Created wheel for python-interface: filename=python_interface-1.6.1-py3-none-any.whl size=23218 sha256=f5e5bfb2c5c79d840d987012c7acebb654f45501b826d2175b7f9a5907db9f2a
  Stored in directory: /root/.cache/pip/wheels/14/be/d4/63092bf0e2bcab18e7ce315aa8935d786907f5782db969fbe8
  Building wheel for peewee (pyproject.toml) ... done
  Created wheel for peewee: filename=peewee-3.17.3-cp310-cp310-linux_x86_64.whl size=727578 sha256=63c7dcca57f4ef62a9c8b36dc894fa517391517d39a3edce14559c0e7cee2269
  Stored in directory: /root/.cache/pip/wheels/0d/32/b7/cad9f818b37cf97df4c87a8308da86a433af81651d98f8d8aa
Successfully built intervaltree python-interface peewee
Installing collected packages: peewee, korean-lunar-calendar, python-interface, pyluach, Mako, lru-dict, iso4217, iso3166, intervaltree, bottleneck, bcolz-zipline, alembic, exchange-calendars, empyrical-reloaded, zipline-reloaded
  Attempting uninstall: peewee
    Found existing installation: peewee 3.17.6
    Uninstalling peewee-3.17.6:
      Successfully uninstalled peewee-3.17.6
Successfully installed Mako-1.3.5 alembic-1.13.2 bcolz-zipline-1.2.11 bottleneck-1.4.0 empyrical-reloaded-0.5.10 exchange-calendars-4.5.6 intervaltree-3.1.0 iso3166-2.1.1 iso4217-1.12.20240625 korean-lunar-calendar-0.3.1 lru-dict-1.3.0 peewee-3.17.3 pyluach-2.2.0 python-interface-1.6.1 zipline-reloaded-3.0.4
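
To confirm what actually ended up installed, the versions can be read back with the standard library. This is a minimal sketch: the helper names `installed_versions` and `satisfies_numpy_pin` are made up for illustration, and the pin check only looks at the major version number.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def satisfies_numpy_pin(version):
    """True when a version string is below NumPy 2 (the numpy<2 constraint)."""
    return int(version.split(".")[0]) < 2


report = installed_versions(["zipline-reloaded", "bcolz-zipline", "numpy", "ta-lib"])
for name, version in report.items():
    print(f"{name}: {version or 'not installed'}")
```

If ta-lib shows up here, the renamed metadata from the first cell is visible to Python's packaging machinery, which is what the dependency check relies on.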

Data Ingestion

To run a Zipline backtest you first need to "ingest" (load) a data bundle. The default bundle included in Zipline is Quandl. Quandl is no longer active and the data is no longer updated, but the dataset (https://data.nasdaq.com/databases/WIKIP) covers the period 1996-01-01 through 2018-03-27 and is available with a free API key from https://data.nasdaq.com/.

I recommend using Colab secrets for your API key and naming it QUANDL_API_KEY. Bundle ingestion takes some time and bandwidth, so to avoid repeating it I also recommend linking Google Drive and setting the ZIPLINE_ROOT environment variable to a folder there, so the Zipline data is stored on Drive.

Of course if you're running this locally then you can just use the command line in the usual fashion.

N.B. Linux shells and build scripts don't like spaces (or other "special" characters) in paths, so Colab plays tricks to hide the space in "My Drive": the mounted folder appears as MyDrive.

In [5]:
import os
from google.colab import drive
drive.mount('/content/drive')

if not os.environ.get("MY_WORKSPACE"):
  os.environ["MY_WORKSPACE"] = "/content/drive/MyDrive/Workspace"

%mkdir -p /content/drive/MyDrive/Workspace
%cd /content/drive/MyDrive/Workspace
!pwd

from google.colab import userdata

if not os.environ.get("QUANDL_API_KEY"):
  os.environ["QUANDL_API_KEY"] = userdata.get("QUANDL_API_KEY")

if not os.environ.get("ZIPLINE_ROOT"):
  os.environ["ZIPLINE_ROOT"] = os.path.join(os.environ["MY_WORKSPACE"], "zipline")
Mounted at /content/drive
/content/drive/MyDrive/Workspace
/content/drive/MyDrive/Workspace
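
Outside Colab, the same defaults can be applied without `google.colab`. The sketch below is an assumption-laden rewrite of the cell above as a plain function: the name `configure_zipline_env` is made up, and the caller supplies the workspace path and API key however suits the local setup.

```python
import os


def configure_zipline_env(env, workspace, api_key=None):
    """Fill in MY_WORKSPACE, ZIPLINE_ROOT and QUANDL_API_KEY if unset or empty.

    Hypothetical helper; `env` is any dict-like mapping such as os.environ.
    """
    if not env.get("MY_WORKSPACE"):
        env["MY_WORKSPACE"] = workspace
    if api_key and not env.get("QUANDL_API_KEY"):
        env["QUANDL_API_KEY"] = api_key
    if not env.get("ZIPLINE_ROOT"):
        env["ZIPLINE_ROOT"] = os.path.join(env["MY_WORKSPACE"], "zipline")
    return env


# Example with a plain dict; pass os.environ to change the real environment.
example = configure_zipline_env({}, "/tmp/workspace", api_key="not-a-real-key")
print(example["ZIPLINE_ROOT"])
```

Existing values are never overwritten, matching the `if not os.environ.get(...)` guards in the Colab cell.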

List Bundle Ingestions

Before running the Zipline ingest command, let's see if you've already done that, because every time you run the command it ingests the bundle again, even if you've done it before (it assumes the data source updates regularly and doesn't know that Quandl is now moribund).

In [6]:
%%bash

export ZIPLINE_ROOT="$ZIPLINE_ROOT"
echo $ZIPLINE_ROOT
zipline bundles
/content/drive/MyDrive/Workspace/zipline
csvdir <no ingestions>
quandl 2024-09-22 22:18:49.991135
quantopian-quandl <no ingestions>
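
The same check can be scripted so that the ingest step is skipped automatically. This sketch assumes Zipline keeps each ingestion in its own timestamped folder under `$ZIPLINE_ROOT/data/<bundle>/`. That is how the layout appears to work, but treat it as an assumption and prefer `zipline bundles` as the authority. The function names are made up for illustration.

```python
from pathlib import Path


def list_ingestions(zipline_root, bundle="quandl"):
    """Return the names of ingestion folders found on disk for a bundle.

    Assumes the layout <zipline_root>/data/<bundle>/<timestamp>/; hidden
    folders (for example a cache directory) are ignored.
    """
    bundle_dir = Path(zipline_root) / "data" / bundle
    if not bundle_dir.is_dir():
        return []
    return sorted(
        p.name for p in bundle_dir.iterdir()
        if p.is_dir() and not p.name.startswith(".")
    )


def needs_ingest(zipline_root, bundle="quandl"):
    """True when no ingestion folder exists yet for the bundle."""
    return not list_ingestions(zipline_root, bundle)
```

In the notebook this could be called as `needs_ingest(os.environ["ZIPLINE_ROOT"])` before deciding whether to run the ingest cell.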

Ingest the Quandl Bundle (Once!)

If that shows no quandl data bundle ingestions then we should ingest now. Afterwards you can come back and rerun the cell above to see the updated list of bundles; it should list one with a date and timestamp.

I omit the very lengthy output to keep things a little more readable in the notebook. There are a bunch of complaints about dividends which, as far as I know, are mostly benign.

In [ ]:
%%bash

export QUANDL_API_KEY="$QUANDL_API_KEY"
export ZIPLINE_ROOT="$ZIPLINE_ROOT"

zipline ingest -b quandl

Run the Backtest!

At last we can run the backtest! The end_date is 2018-03-28 because the last day in the old Quandl WIKIP dataset is 2018-03-27 and we just get errors on the 28th and beyond. Of course, getting an up-to-date data bundle is the cure for this.

In [9]:
# This ensures that our graphs will be shown properly in the notebook.
%matplotlib inline

# Import Zipline functions that we need
from zipline import run_algorithm
from zipline.api import order_target_percent, symbol

# Import visualization
import matplotlib.pyplot as plt

# Import Pandas
import pandas as pd


def initialize(context):
    # Which stock to trade
    context.stock = symbol("AAPL")

    # Moving average window
    context.index_average_window = 100


def handle_data(context, data):
    # Request history for the stock
    equities_hist = data.history(
        context.stock, "close", context.index_average_window, "1d"
    )

    # Check if price is above moving average
    if equities_hist.iloc[-1] > equities_hist.mean():
        stock_weight = 1.0
    else:
        stock_weight = 0.0

    # Place order
    order_target_percent(context.stock, stock_weight)


def analyze(context, perf):
    fig = plt.figure(figsize=(12, 8))

    # First chart
    ax = fig.add_subplot(311)
    ax.set_title("Strategy Results")
    ax.semilogy(
        perf["portfolio_value"], linestyle="-", label="Equity Curve", linewidth=3.0
    )
    ax.legend()
    ax.grid(False)

    # Second chart
    ax = fig.add_subplot(312)
    ax.plot(perf["gross_leverage"], label="Exposure", linestyle="-", linewidth=1.0)
    ax.legend()
    ax.grid(True)

    # Third chart
    ax = fig.add_subplot(313)
    ax.plot(perf["returns"], label="Returns", linestyle="-.", linewidth=1.0)
    ax.legend()
    ax.grid(True)


# Set start and end date
start_date = pd.Timestamp("1996-01-01")
end_date = pd.Timestamp("2018-03-28")

# Fire off the backtest
results = run_algorithm(
    start=start_date,
    end=end_date,
    initialize=initialize,
    analyze=analyze,
    handle_data=handle_data,
    capital_base=10000,
    data_frequency="daily",
    bundle="quandl",
)
/usr/local/lib/python3.10/dist-packages/zipline/finance/ledger.py:424: FutureWarning: Series.__setitem__ treating keys as positions is deprecated. In a future version, integer keys will always be treated as labels (consistent with DataFrame behavior). To set a value by position, use `ser.iloc[pos] = value`
  self.daily_returns_series[session_ix] = self.todays_returns
WARNING:ZiplineLog:Cannot place order for AAPL, as it has de-listed. Any existing positions for this asset will be liquidated on 2018-03-28 00:00:00.
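
The trading rule inside handle_data can be tried in isolation with plain pandas, which is handy for checking the logic without a data bundle. This is only a sketch on made-up prices: the function name `trend_weights` is hypothetical, it ignores order fills, commissions and slippage, and it stays flat until a full window of prices is available, which may differ from what Zipline does at the start of the data.

```python
import pandas as pd


def trend_weights(close, window=100):
    """Target weight per bar: 1.0 when the close is above its moving average."""
    # The rolling mean includes the current bar, like the mean of data.history().
    moving_average = close.rolling(window).mean()
    # Comparisons against NaN are False, so the warm-up period gets weight 0.0.
    return (close > moving_average).astype(float)


# Tiny made-up series with a 2-bar window to make the rule easy to follow.
prices = pd.Series([1.0, 2.0, 3.0, 2.0, 1.0])
print(trend_weights(prices, window=2).tolist())
```

With the 2-bar window the weight is 1.0 only on the two rising bars, where the latest close sits above the average of itself and the previous close.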