Stats-Based Crypto-Trading Time Series Models (ADL)

Statistics | Autoregressive Distributed Lag Model | Trading Strategies | Backtesting

Preamble

This project was completed as part of the Statistics Module at SMU, which is based on the textbook Introduction to Econometrics, Global 4th Edition (2020). The course covered different ways of building models and assessing their fit using statistics alone, and gave me a deep understanding of the concepts on which current machine learning models are built. In terms of value, the project develops a workflow for quickly determining the profitability of simple trading strategies through both data engineering and backtesting.

Having worked with multiple machine learning models before this module, I found the statistical approach to developing models often more tedious, both in setting up the data and in choosing the correct model for the specific context. However, its major advantage is the interpretability of the model's fit (especially for simpler models such as Ordinary Least Squares regression). The statistics generated after fitting quantify how well a model fits the data from different angles (error, mean, variance, lags), which is invaluable for interpreting the model and understanding its limitations. This is (currently) often lacking for machine learning models.

The report explaining the project is shown below. As my group was busy working on other projects, most of the programming and writing was completed by me.

The contents of the notebook appended below dive straight into the first of 5 stages:

  1. Converting Individual Technical Indicator Trading Strategies into Code
  2. Generating Crypto Instrument Features Based on Technical Indicators (e.g. RSI, MACD)
  3. Converting Features into Time Series Model Data
  4. Fitting Data into the Autoregressive Distributed Lag Model
  5. Testing the Profitability of the Trading Strategies Built on Model Insights

RSI (SIGNAL)
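
As a rough illustration of how an RSI signal feature can be computed, here is a dependency-free sketch. It assumes Wilder's smoothing, a 14-bar period and the conventional 30/70 thresholds; the function names and defaults are illustrative and are not taken from the original notebook.

```python
def rsi(closes, period=14):
    """Wilder-smoothed RSI aligned with `closes`; None until `period` changes exist."""
    if len(closes) <= period:
        return [None] * len(closes)
    changes = [curr - prev for prev, curr in zip(closes, closes[1:])]
    gains = [max(c, 0.0) for c in changes]
    losses = [max(-c, 0.0) for c in changes]

    def to_rsi(avg_gain, avg_loss):
        if avg_loss == 0:
            return 100.0  # convention when the window has no losses
        return 100.0 - 100.0 / (1.0 + avg_gain / avg_loss)

    # Seed with simple averages, then apply Wilder's recursive smoothing.
    avg_gain = sum(gains[:period]) / period
    avg_loss = sum(losses[:period]) / period
    out = [None] * period + [to_rsi(avg_gain, avg_loss)]
    for gain, loss in zip(gains[period:], losses[period:]):
        avg_gain = (avg_gain * (period - 1) + gain) / period
        avg_loss = (avg_loss * (period - 1) + loss) / period
        out.append(to_rsi(avg_gain, avg_loss))
    return out


def rsi_signal(rsi_values, oversold=30.0, overbought=70.0):
    """+1 (buy) when oversold, -1 (sell) when overbought, 0 otherwise."""
    signals = []
    for value in rsi_values:
        if value is None:
            signals.append(0)
        elif value < oversold:
            signals.append(1)
        elif value > overbought:
            signals.append(-1)
        else:
            signals.append(0)
    return signals
```

The signal is kept as a small integer so that it can later enter a regression as an ordinary numeric column.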

MACD (SIGNAL)
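
A minimal sketch of a MACD signal, assuming the conventional 12/26/9 spans and an EMA seeded with the first observation. The helper names are hypothetical, and the crossover rule is one common way to turn MACD into a signal, not necessarily the rule used in the notebook.

```python
def ema(values, span):
    """Exponential moving average seeded with the first value (non-empty input assumed)."""
    alpha = 2.0 / (span + 1)
    out = [float(values[0])]
    for value in values[1:]:
        out.append(alpha * value + (1 - alpha) * out[-1])
    return out


def macd(closes, fast=12, slow=26, signal=9):
    """Return (macd_line, signal_line, histogram), each aligned with `closes`."""
    fast_ema = ema(closes, fast)
    slow_ema = ema(closes, slow)
    macd_line = [f - s for f, s in zip(fast_ema, slow_ema)]
    signal_line = ema(macd_line, signal)
    histogram = [m - s for m, s in zip(macd_line, signal_line)]
    return macd_line, signal_line, histogram


def crossover_signal(macd_line, signal_line):
    """+1 when MACD crosses above its signal line, -1 when it crosses below, else 0."""
    signals = [0]
    for i in range(1, len(macd_line)):
        prev_diff = macd_line[i - 1] - signal_line[i - 1]
        diff = macd_line[i] - signal_line[i]
        if prev_diff <= 0 < diff:
            signals.append(1)
        elif prev_diff >= 0 > diff:
            signals.append(-1)
        else:
            signals.append(0)
    return signals
```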

EMA (SIGNAL/CONTROL)

The next step is to label each order based on conditions as the price changes.

ATR (CONTROL)
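
For reference, ATR is a volatility measure built from the true range of each bar. The sketch below assumes Wilder's smoothing and a 14-bar default; the function names are illustrative and the notebook may have used a TA library instead.

```python
def true_range(highs, lows, closes):
    """True range per bar; the first bar falls back to high minus low."""
    ranges = [highs[0] - lows[0]]
    for i in range(1, len(closes)):
        ranges.append(max(
            highs[i] - lows[i],
            abs(highs[i] - closes[i - 1]),
            abs(lows[i] - closes[i - 1]),
        ))
    return ranges


def atr(highs, lows, closes, period=14):
    """Wilder-smoothed average true range; None until `period` bars exist."""
    ranges = true_range(highs, lows, closes)
    if len(ranges) < period:
        return [None] * len(ranges)
    value = sum(ranges[:period]) / period  # seed with a simple average
    out = [None] * (period - 1) + [value]
    for tr in ranges[period:]:
        value = (value * (period - 1) + tr) / period
        out.append(value)
    return out
```

Used as a control, ATR lets the regression account for how volatile the market was when a signal fired.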

BBANDS (SIGNAL)
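
A small sketch of Bollinger Bands, assuming a 20-bar window, bands at two population standard deviations, and a mean-reversion reading of the signal (buy below the lower band, sell above the upper band). These defaults and names are assumptions for illustration only.

```python
from statistics import mean, pstdev


def bollinger_bands(closes, period=20, num_std=2.0):
    """Return (lower, middle, upper) per bar; None until a full window exists."""
    bands = []
    for i in range(len(closes)):
        if i + 1 < period:
            bands.append(None)
            continue
        window = closes[i + 1 - period:i + 1]
        middle = mean(window)
        spread = num_std * pstdev(window)
        bands.append((middle - spread, middle, middle + spread))
    return bands


def bbands_signal(closes, bands):
    """+1 when the close is below the lower band, -1 when above the upper band."""
    signals = []
    for close, band in zip(closes, bands):
        if band is None:
            signals.append(0)
        elif close < band[0]:
            signals.append(1)
        elif close > band[2]:
            signals.append(-1)
        else:
            signals.append(0)
    return signals
```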

Generating Regression Data

Get Data

Windowed Linear Regression and Slope
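
One way to read this step: fit an ordinary least squares line to each rolling window of a series and keep the slope as a trend feature. The sketch below assumes the regressor is simply the bar index within the window (0, 1, ..., n-1); the function names and the choice of window are illustrative.

```python
def ols_slope(ys):
    """OLS slope of `ys` regressed on the bar index 0..n-1 (needs n >= 2)."""
    n = len(ys)
    x_mean = (n - 1) / 2.0
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys))
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    return sxy / sxx


def rolling_slope(values, window):
    """Slope of the trailing `window` observations; None until the window is full."""
    if window < 2:
        raise ValueError("window must contain at least two observations")
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(ols_slope(values[i + 1 - window:i + 1]))
    return out
```

Because each window only looks backwards, the slope at bar t uses no information from later bars.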

Time Series Model

Note that the TA variables are all endogenous, since each is determined by its relationship with other variables. An exogenous variable would be something like hype or fundamentals, which are not a function of the closing, low or high price.

class statsmodels.tsa.ardl.ARDL(endog, lags, exog=None, order=0, trend='c', *, fixed=None, causal=False, seasonal=False, deterministic=None, hold_back=None, period=None, missing='none')
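
For orientation, the model behind this signature can be written as follows. This is the standard ARDL form under the defaults shown (`trend='c'`, `causal=False`); the symbols are chosen here for illustration and are not taken from the notebook.

```latex
y_t = \delta
    + \sum_{i=1}^{p} \phi_i \, y_{t-i}
    + \sum_{k=1}^{K} \sum_{j=0}^{q_k} \beta_{k,j} \, x_{k,t-j}
    + \varepsilon_t
```

Here `endog` is the dependent series y, `lags` sets the autoregressive order p, `exog` holds the K regressors x_k, and `order` sets the distributed lag length q_k for each regressor. With `trend='c'` the only deterministic term is the constant δ, and with `causal=False` the inner sum starts at j = 0, so contemporaneous values of the regressors are included.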

Non-Linear Terms

Interaction Terms
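
An interaction term is the product of two regressors; including it lets the estimated effect of one regressor depend on the level of the other. The sketch below builds such a column on a list of row dictionaries. The column names (`rsi`, `atr`) and the helper name are hypothetical.

```python
def add_interaction(rows, left, right, name=None):
    """Return new rows with an extra column equal to rows[left] * rows[right]."""
    column = name or f"{left}_x_{right}"
    return [{**row, column: row[left] * row[right]} for row in rows]


# Hypothetical feature rows: one signal and one control per bar.
features = [
    {"rsi": 30.0, "atr": 2.0},
    {"rsi": 70.0, "atr": 0.5},
]
with_interaction = add_interaction(features, "rsi", "atr")
```

The input rows are left unchanged, so the same base features can be reused to try different interactions.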

Testing Strategies
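
A minimal backtest sketch for signals of the +1 / 0 / -1 kind. It assumes the position chosen at the close of one bar is held over the next bar, and it ignores fees, slippage and funding costs; the function name is illustrative and the notebook's own backtest may differ.

```python
def backtest(closes, signals):
    """Equity curve starting at 1.0; signals[t] is the position held from bar t to t+1."""
    if len(closes) != len(signals):
        raise ValueError("closes and signals must have the same length")
    equity = [1.0]
    for t in range(1, len(closes)):
        bar_return = closes[t] / closes[t - 1] - 1.0
        # Apply the previous bar's position so the signal never sees the future.
        equity.append(equity[-1] * (1.0 + signals[t - 1] * bar_return))
    return equity
```

Comparing the final equity of a strategy against a constant long position (all signals equal to 1) gives a quick buy-and-hold benchmark.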

RSI

MACD