Creating a Predictive Trading Algorithm using TensorFlow

From: xkcd.com


Looking for the next exciting thing to do in quarantine, I decided to give Algorithmic Trading a whirl. What's better than a computer program that can magically make money trading stocks by itself?

While researching ways to trade algorithmically, I discovered everything from writing your own strategies in C++ with homemade backtesting frameworks to eToro, where you just copy someone else's algorithm. There were more packages and online platforms than I could count for building the various pieces of an algorithm. What I wanted was a middle-ground guide that brought some of these packages together into a base framework that isn't reliant on online services, which could close shop or have limited usability. That is what I hope this guide provides: a starting point where anyone can begin programming their ideas without installing anything on their computer, all for free (and with the added spice of AI/ML).

In this guide, we are going to cover topics including:

  • Exploring a basic Moving Average Cross Trading Algorithm and measuring its effectiveness.

  • Building an LSTM Neural Net to predict future values of indicators.

  • Backtesting this model to see if our new predictive algorithm is any better than the original.

  • And all of the nasty details in between.

The only prerequisite is a Gmail account with Google Drive set up (if you want to save your models and data). The tools we are going to use are:

  • Google Colab - Free Online Python Notebook. (Or any Python Environment)

  • Finnhub - Data Source for Stock Tickers (You will have to get your API Key from finnhub.io)

  • Backtrader - Financial Strategy Building and Testing

  • TensorFlow Keras - Neural Net Building and Testing

  • Keras-Tuner - Hyperparameter Tuning

  • and many standard data science packages.

This guide starts out with the assumption that you can open up a new Google Colab notebook, and that's about it.

If you just want to look at the code, skip to Finding an Optimal Slow/Fast Moving Average

Moving Average Cross Trading Algorithm

The MACross algorithm is relatively simple. You have two indicators:

  • Fast Average: The moving average of the closing price of a stock over the past X days.

  • Slow Average: The moving average of a stock's closing price over the past Y days where Y is greater than X.

Whenever the fast average crosses from below the slow average to above it, you buy. Whenever the fast average crosses from above the slow average to below it, you sell.
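
Before touching Backtrader, the rule is simple enough to sketch in a few lines of pandas (a toy illustration with a made-up price series, not the strategy object we build later):

import numpy as np
import pandas as pd

# Hypothetical price series, purely for illustration
prices = pd.Series(100 + 10*np.sin(np.linspace(0, 12, 300)))

fast = prices.rolling(window=20).mean()  # X = 20 days
slow = prices.rolling(window=50).mean()  # Y = 50 days

above = fast > slow
buy  = above & ~above.shift(1, fill_value=False)  # fast crossed above slow: buy
sell = ~above & above.shift(1, fill_value=False)  # fast crossed below slow: sell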


Now, this is the most basic version of this algorithm. Other versions use additional indicators to determine whether these may be false signals, but this will work for us for now.

Taking this strategy a step further: assuming that every buy signal is a good buy and every sell signal is a good sell, it would be nice if we could have bought a little before the cross and sold a little before the sell signal. This could give us an edge over the typical MACross algorithm. So that's what we are going to try to do.

I propose that there is enough information in the past Z days of the fast average and the past Z days of the slow average to predict whether a cross will happen the next day.

So we will create two models:

  • Fast Model - Takes in a certain number of previous fast moving average values along with the current value and predicts tomorrow's fast moving average.

  • Slow Model - Takes in a certain number of previous slow moving average values along with the current value and predicts tomorrow's slow moving average.

Using these predicted values, we will try to predict whether the averages will cross tomorrow. If they do, we will buy or sell accordingly.
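
The decision rule we are aiming for looks roughly like this (a sketch with a hypothetical helper function; the real implementation later in the post uses Backtrader's CrossOver indicator on the predicted series instead):

def predicted_cross_signal(fast_today, slow_today, pred_fast, pred_slow):
    """Return 1 for a predicted upward cross, -1 for a predicted downward cross, 0 otherwise."""
    if fast_today <= slow_today and pred_fast > pred_slow:
        return 1   # cross to the upside predicted for tomorrow: buy
    if fast_today >= slow_today and pred_fast < pred_slow:
        return -1  # cross to the downside predicted for tomorrow: sell
    return 0

print(predicted_cross_signal(99.5, 100.0, 100.2, 100.1))  # 1: buy ahead of the cross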

LSTM Neural Nets and Why Not a Linear Regression 

Neural Net on Left, LSTM Neural Net on Right


To be brief, a typical Neural Net is made up of a bunch of neurons that take in a set of inputs and spit out an output. The neurons are connected to one another, with a weight associated with each connection. These NNs can learn the correct weights and neuron parameters to give you an accurate result based on a set of training data.

The difference with an LSTM Neural Net is that it has LSTM (Long Short-Term Memory) elements. These elements can remember patterns and give you an output that takes a time series of data into consideration.

I propose that we don't need to know much about the inner workings of an LSTM model to use it. Instead, we must understand the principles of Data Science and parameter tuning to get by. (Of course, more in-depth knowledge will let you build a better model, but hyperparameter tuning can remedy this somewhat.)

"Why do we even need this?" you may ask. "I learned about linear regressions back in high school/college. Cant those extrapolate what is in the future." They can but with a lot more work than you may think. Classical Linear Regressions assume the absence of these principals:

  • Multicollinearity: Two of the input variables are related to one another. Say we want to predict stock prices based on the prices of gold and silver; the prices of gold and silver are probably very linearly related.

  • Heteroscedasticity: The variance is not constant. As the time series goes farther into the future, the difference between the predicted and actual values naturally increases.

  • Autocorrelation: The change in a variable is related to its previous change. Whether a stock went up or down yesterday can be related to whether it goes up or down tomorrow.

We can account for these using concepts from Econometrics, but having taken Econometrics in college, I can assure you that making an LSTM model is more straightforward.

Where We Can Go Wrong from the Start

An important concept to explain first is lookahead bias. While making a predictive model, we need to be careful that no future information is used when building it. Basically, nothing fed to the model can come from, or be determined by, the future at any point in time. If it is, we can get fantastic results in testing, but the model will perform poorly when used in real time. An example of this is actually the first version of this algorithm I came up with: I wanted to use the past Z days of fast and slow moving average values to predict how long it would be until the cross. To train that model, the training data would consist of the last Z days fit to a "Days until Cross" variable. But on any given day, I would need to know future information to compute "Days until Cross".

Knowing about this bias, we are going to set up our model so it uses today's and the 59 previous days' fast and slow moving averages to predict tomorrow's moving average values. When training the model on any given day, the previous 60 values are fit to that day's value. This way, no future information is used to train or execute the model.
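
As a tiny illustration of the windowing (using a lookback of 3 instead of 60), note that no window ever contains its own target or anything after it:

series = [10, 11, 12, 13, 14]
lookback = 3
for i in range(lookback, len(series)):
    window, target = series[i-lookback:i], series[i]
    print(window, '->', target)
# [10, 11, 12] -> 13
# [11, 12, 13] -> 14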

Another concept is data leakage. This overlaps with lookahead bias some: we cannot use any information that's not included in the training set when training the model. A better way to put it is "training for the test" by letting some of the test information into the training set. For example, we are going to try to find the best durations for the fast and slow moving windows. If we use our test and training data to determine them, we will get better test results than we would in production. We can't know these optimal window sizes at any given moment in real time, so we need to find them on independent data.

To avoid these downfalls as best we can, the data will be split into three major sections. Assuming we use the past 20 years of data:

  • For the first 7 years, we will explore practical fast and slow window durations for a particular stock.

  • The next 7 years are used for model building. This will include a training set, a validation set, and a test set.

    • The training set is what will build the model.

    • The validation set is what we will use to test the model over a comprehensive set of parameters, aka Hyper Parameter Tuning.

    • The test set will test the model and help determine if we are overfitting the data or underfitting.

  • The last 6 years of data will not have had any way to leak into the model and will be used for backtesting it.
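
As a sketch, the three sections correspond to the date ranges we set in the parameters cell later on:

from datetime import datetime

window_tuning  = (datetime(2000, 10, 19), datetime(2006, 12, 31))  # explore fast/slow windows
model_building = (datetime(2007, 1, 1),   datetime(2013, 12, 1))   # train / validation / test
backtesting    = (datetime(2014, 1, 1),   datetime(2020, 12, 9))   # untouched until the end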

These dates and durations can differ significantly, but we are just using this as a starting point for this exercise. Using this much data has issues of its own:

  • Survivorship Bias: Let's say you want 30 years of data, so you find a large company with a lot of historical data. You build a model based on this data and then apply it to another company; however, you lose all your money when that company goes bankrupt. It is very possible that a long-standing company showed signals that were profitable for it but would have sunk 20 other companies. So typically, you want to test your model on these defunct companies along with successful ones. The problem is that complete financial data including defunct companies is costly and probably out of reach for you and me. From what I've read, this is a larger problem for some strategies than others.

  • Regime Changes: You could have trained your model over 5 years in which a company had a great CEO. That CEO then resigns, and another comes into power. Your model might not fare well under the fiscal policies of the new CEO. The new CEO doesn't even need to be worse, just behave differently.

Now for some actual code.

Finding an Optimal Slow/Fast Moving Average

Before we start making a model, let's:

  • Install/Import our packages

  • Set up model parameters

  • Retrieve some data using Finnhub API

  • Set up the MACross strategy in Backtrader

  • Find optimal parameters and get a baseline performance.

Install and Import Packages

First, we need to install all of our packages and import the needed libraries.

If you are using Google Colab, the last line lets the notebook access your Google Drive to save and load models if you wish. The first time you run it, it will ask you for credentials, but it shouldn't after that.

# Install needed packages. If using Google Colab these need to be in separate cells.
pip install finnhub-python
pip install keras-tuner
pip install backtrader[plotting]

# Import needed libraries
# (The __future__ import must come first if you run this as a single script
# rather than in separate Colab cells.)
from __future__ import (absolute_import, division, print_function,
                        unicode_literals)
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.markers as mark
import tensorflow as tf
import sys 
from sklearn.preprocessing import MinMaxScaler
import math
from pandas_datareader import data as web
import finnhub
from datetime import datetime
from datetime import timezone
from time import time, sleep
import requests
from google.colab import drive
from google.colab import files
import seaborn as sns

from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, LSTM
from tensorflow import keras
from kerastuner import HyperModel
from kerastuner.tuners import RandomSearch 
from kerastuner.tuners import Hyperband 
from pylab import rcParams

rcParams['figure.figsize'] = 16,9
rcParams['figure.facecolor'] = '#eeeeee'
# Draw and close a dummy plot so the rcParams style settings take effect
plt.title('dummy')
plt.plot([1,3,2,4])
plt.close()

import backtrader as bt
import argparse
import backtrader.feeds as btfeeds
import os.path  # To manage paths
import sys  # To find out the script name (in argv[0])

# Have all of our plots show automatically
%matplotlib inline

# Mount your Google Drive to the worksheet
drive.mount('/content/drive')

Set Up Model Parameters

Next we want to set up all of our parameters so there is a central place to make any model changes. Hopefully, when testing out new tickers, granularities, or data sets, all you need to do is change these variables and rerun the worksheet.

# Global Parameters
ticker = "CVX"
granularity = "D"

# The number of time steps the LSTM model will look back for predictions
lookback_distance = 60

# Time Period for Strategy Tuning
dt_start_model_parameter_tuning = datetime(2000, 10, 19)
dt_end_model_parameter_tuning = datetime(2006, 12, 31)

# Time Period for Model Building
dt_start_nn_model_training = datetime(2007, 1,1)
dt_end_nn_model_training = datetime(2013, 12, 1)

# Time Period for Backtesting
dt_start_backtest = datetime(2014, 1,1)
dt_end_backtest = datetime(2020, 12, 9)

# SMA Cross strategy search grid. This defines the windows we will test to find an optimal 
# fast and slow window combination.
pslow_start  = 30
pslow_end    = 200
pslow_step   = 10

pfast_start  = 30
pfast_end    = 200
pfast_step   = 10

# API Keys. Go to the finnhub website to get your API key and put it here.
finnhub_api_key = "Your Key Here"

# Drive Path
google_drive_path = "/content/drive/My Drive/"

Retrieve and Setup Data

Using Finnhub, our program will retrieve the stock candles for our specified ticker over the dates we set. The data is then stored in a pandas dataframe with the correct column names so that Backtrader can read it by default. Lastly, we plot the data to sanity check it.

# Convert the dates to UTC timestamps that will be used by the Finnhub API call
timestamp_start = int(dt_start_model_parameter_tuning.replace(tzinfo=timezone.utc).timestamp())
timestamp_end = int(dt_end_model_parameter_tuning.replace(tzinfo=timezone.utc).timestamp())

# Initialize the Client
finnhub_client = finnhub.Client(api_key=finnhub_api_key)

# Get data and put into dataframe.
res = finnhub_client.stock_candles(ticker, granularity, timestamp_start, timestamp_end)
data = pd.DataFrame(res)

# Make Data Backtrader Friendly
dataBacktrader = data
dataBacktrader.columns = ['close', 'high', 'low','open','s','t','volume']

dataBacktrader['date'] = dataBacktrader.apply(lambda row : datetime.utcfromtimestamp(row['t']), axis = 1)
dataBacktrader.drop(['s','t'], axis = 1, inplace=True)
dataBacktrader['openinterest'] = 0

# Plot the data
plt.plot(dataBacktrader['close'])

Set Up the SmaCross Strategy

Next we will create the strategy object that Backtrader will run. This strategy comes straight from the Backtrader quickstart page. I added one check that skips runs with invalid parameters so we don't waste compute time. Later we will make our own custom strategy object, but this gives us a good idea of how a strategy is structured.

# Create a subclass of Strategy to define the indicators and logic
class SmaCross(bt.Strategy):
    # list of parameters which are configurable for the strategy
    params = dict(
        pfast=60,  # period for the fast moving average window
        pslow=128   # period for the slow moving average window
    )
    
    # The __init__ method runs at the start of any strategy.
    def __init__(self):
        sma1 = bt.ind.SMA(period=self.p.pfast)  # fast moving average
        sma2 = bt.ind.SMA(period=self.p.pslow)  # slow moving average
        
        # Only test valid (and workable) parameter combinations,
        # otherwise skip the run
        if ((self.p.pfast <= self.p.pslow) and (self.p.pslow - self.p.pfast >= 5)) :
          self.crossover = bt.ind.CrossOver(sma1, sma2)  # crossover signal
        else :
          raise bt.StrategySkipError   

    # The next() method runs at each step of the strategy.
    # A step is determined by the dataframe and the duration
    # between each data point.
    def next(self):
        if not self.position:  # not in the market
            if self.crossover > 0:  # if fast crosses slow to the upside
                self.buy()  # enter long

        elif self.crossover < 0:  # in the market & cross to the downside
            self.close()  # close long position
            
    # The stop method runs at the end of a strategy run.
    def stop(self):
        print('(FA Period %2d) (SL Period %2d) Ending Value %.2f' %
                 (self.params.pfast, self.params.pslow, self.broker.getvalue()))     

Run SMACross Strategy and Baseline Performance

Now we want to build and run a Cerebro object so that Backtrader will run our strategy over a set of data.

# Create array to store the results from all the backtrader runs.
results_list = []

# First we want to initialize our Cerebro object.
cerebro = bt.Cerebro(stdstats=False,maxcpus=None)

# Next we set the amount of cash we want to start out with.
cerebro.broker.setcash(100000.0)

# Next we read in the dataframe with the data we want to backtest
dataFrame = bt.feeds.PandasData(dataname=dataBacktrader, datetime='date')
cerebro.adddata(dataFrame)

# Now we load the actual strategy. Since we want to run the strategy over a large
# combination of parameters, we use the optstrategy method. We
# define the range of parameters we want to run over with the additional arguments.
# Here we are running our stratagy over:
# - pFast = pfast_start to pfast_end, with a stepsize of pfast_step
# - pSlow = pslow_start to pslow_end, with a stepsize of pslow_step
cerebro.optstrategy(SmaCross, pslow=range(pslow_start,pslow_end+1,pslow_step), pfast=range(pfast_start,pfast_end+1,pfast_step))

# Next we add a sizer to the cerebro object. This determines how large our buy and 
# sell orders are. Here we fix it to 99% of our portfolio. 
cerebro.addsizer(bt.sizers.PercentSizer, percents=99)

# We are almost there, but we still need to add analyzers to the cerebro object.
# These analyzers let us extract information about how well the strategy performed. 
# (Links to more info about these analyzers at the bottom of the post)
cerebro.addanalyzer(bt.analyzers.SharpeRatio_A, _name="sharperatio", timeframe=bt.TimeFrame.Days)
cerebro.addanalyzer(bt.analyzers.TimeReturn, timeframe=bt.TimeFrame.NoTimeFrame)
cerebro.addanalyzer(bt.analyzers.DrawDown)
cerebro.addanalyzer(bt.analyzers.SQN, _name="sqn")

# Lastly, we set how much each transaction will cost in commission fees. Since I personally
# use Robinhood, I'll just set this to 0.
cerebro.broker.setcommission(commission=0)

# Finally we run the model. This runs the strategy over the wide array of parameters
# we set above. An object is returned which holds all of the results of the runs
# (and the information we told our Analyzers to gather).
results = cerebro.run();

# Next we want to run over this results object and extract the information so it is easily
# readable and indexable. 
for i in range(0, len(results)):  
  # Need to check that this is a run that fully ran
  if (len(results[i]) >= 1) :
    # Temporarily store all the analyzer values into variables. 
    r_strat  = "slow_fast_" + str(results[i][0].params.pslow) + "_" + str(results[i][0].params.pfast);
    r_return = list(results[i][0].analyzers.timereturn.get_analysis().values())[0]*100
    r_sharpe = results[i][0].analyzers.sharperatio.get_analysis()['sharperatio'];
    r_draw   = results[i][0].analyzers.drawdown.get_analysis()['drawdown']/100;
    r_sqn   = results[i][0].analyzers.sqn.get_analysis()['sqn']

    # Append a dictionary which holds all the important results into our results array.
    results_list.append({
       "Stratagy" : r_strat,
       "pslow" : results[i][0].params.pslow,
       "pfast" : results[i][0].params.pfast,
       "Return" : r_return,
       "Sharpe" : r_sharpe,
       "SQN" : r_sqn,
       "Drawdown" : r_draw
    })


# Make the results a Pandas dataframe and show statistical information 
# about the values of the runs. 
results = pd.DataFrame(results_list)               
results.describe()
        pslow       pfast       Return      Sharpe      SQN          Drawdown
count   153.000000  153.000000  153.000000  153.000000  153.000000  153.000000
mean    146.666667  83.333333   55.581273   0.499984    0.710518    0.052910
std     41.231056   41.231056   20.770793   0.156723    0.353140    0.045569
min     40.000000   30.000000   3.309254    0.048054    -0.034536   0.031449
25%     120.000000  50.000000   40.986669   0.393614    0.472027    0.031853
50%     150.000000  80.000000   53.110563   0.485314    0.658597    0.031938
75%     180.000000  110.000000  69.834780   0.604150    0.960126    0.044747
max     200.000000  190.000000  126.805272  0.983955    1.684354    0.262238

Looking at our results table, we see several columns:

  • pslow - duration of the slow-moving average window

  • pfast - duration of the fast-moving average window

  • Return - Percentage return of our strategy.

  • Annualized Sharpe - A metric that takes into account the return above, the return of a risk-free investment, and the standard deviation of our returns; roughly (strategy return - risk-free return) / standard deviation of returns. The Sharpe ratio gives us a better idea of how good a strategy is, since it accounts for the risk and volatility of the strategy. I would suggest reading up some more on the Sharpe ratio.

  • SQN - This number takes into account the average profit from a trade, the variability of that profit, and how many trades you made. (From the Backtrader docs:) System Quality Number, defined by Van K. Tharp to categorize trading systems:

    • 1.6 - 1.9 Below average

    • 2.0 - 2.4 Average

    • 2.5 - 2.9 Good

    • 3.0 - 5.0 Excellent

    • 5.1 - 6.9 Superb

  • Drawdown - The drawdown amount in %: essentially the largest amount the strategy was down at any given point. It's important to consider how far down a strategy may be at any point when implementing it. For example, if a strategy can get you amazing returns but at any moment you can be 50% down, you may not want to take that risk.

Looking at SQN, the best we could do was a below-average strategy. It's important to note that the same combination of moving average windows probably did not have the highest SQN, highest Sharpe, and lowest Drawdown all at once. This table just gives us an idea of whether the MACross can make money for this stock and what values to expect moving forward. If you get a table of all negative numbers, you may want to find a new region of pslow and pfast values, or try different sectors/classes of companies for your strategy.

Next, we want to create a heatmap of the Sharpe values to look for patterns. I have to confess I am still a novice when it comes to dataframe manipulation, so there may well be a better way to do this. I chose Sharpe because it takes into account not just the return but also the variation in return.

# Set up data frame where the rows are pslow and the columns are pfast.
heat_df = pd.DataFrame(columns=results['pfast'].drop_duplicates(keep='first', inplace=False))
row = results['pslow'].drop_duplicates(keep='first', inplace=False)
heat_df['pslow'] = row
heat_df = heat_df.set_index('pslow')

heat_df.columns.name = None
heat_df.index.names = ['']

# Go through all the combinations of pfast and pslow and put 
# in the Sharpe ratio of the run with those params.
pslowCols = results.filter(['pslow','pfast','Sharpe'])
for index, row in pslowCols.iterrows():
    heat_df.loc[row['pslow'],row['pfast']] = float(row['Sharpe'])

# I needed to change the type of data in the dataframe from
# object to float. 
heat_df = heat_df.astype(np.float16)

# Display heat map
sns.heatmap(heat_df)
Heat Map of Sharpe Values over Parameter Range


Here we see the pfast values on the x-axis and pslow values on the y-axis. It's important to have some context about your data. For example, here the largest value (150, 170) is surrounded by other high values; however, that is not always the case. Sometimes the best value is surrounded by a bunch of combinations with negative Sharpe ratios. If that's the case, you may want to modify the range, switch the value you optimize for, or select another local max that's in a more stable region.

Now I'm not going to lie: one of the reasons I like Data Science is the cool graphs that can show really interesting patterns. That is why I did a run with a few thousand parameter combinations to get a larger, higher-resolution heat map. This is a heat map from an MSFT run I did while testing the code. Note that this one shows Returns, not Sharpe.


Now I found this extremely interesting.

  1. The highest values surround a large negative-Return space.

  2. There's a neat region that looks like a "Sharpe" claw. HaHAHAH, get it? This region possibly has values that didn't capture a big upswing.

  3. The cloudiness of the data. There's an interesting mix of patterns here.

  4. Bands that happen at odd intervals.

  5. It looks like a picture I see at the eye doctor every time I visit.

I think a whole lot could be done by looking into heatmaps like this for different companies and seeing how they differ. They could provide neat insights. I just wanted to share this cool graph.

Now you could stop here if you just want to implement an algorithmic trading system. You set up a strategy and found well-performing parameters; all you would have to do is backtest the strategy over the most recent 10 years and see what you get. But that's no fun, so now we are going to add some AI/ML to our model.

Setting up and Training the LSTM Model

 To start making the model, we need to:

  • Retrieve some data using Finnhub

  • Set up testing and training data sets

  • Scale the data

  • Train a model using Keras (easier)

  • Train an optimized model using Keras-Tuner (harder)

  • Test the models and evaluate.

Retrieve and Setup Data

Here we want to get the next set of data from Finnhub much like before.

timestamp_start = int(dt_start_nn_model_training.replace(tzinfo=timezone.utc).timestamp())
timestamp_end = int(dt_end_nn_model_training.replace(tzinfo=timezone.utc).timestamp())

finnhub_client = finnhub.Client(api_key=finnhub_api_key)
res = finnhub_client.stock_candles(ticker, granularity, timestamp_start, timestamp_end)
data = pd.DataFrame(res)

Then we want to get the optimized pfast and pslow values from above. Using those values, we create two new columns in our data set corresponding to the fast and slow-moving average indicators.

row_optimized = results["Sharpe"].argmax()
opt_slow = results.loc[row_optimized,"pslow"]
opt_fast = results.loc[row_optimized,"pfast"]

# Rolling averages of the closing price (the 'c' column in the raw Finnhub response)
data['fast_ra'] = data['c'].rolling(window=opt_fast).mean()
data['slow_ra'] = data['c'].rolling(window=opt_slow).mean()
data = data.dropna()

Set Up Training and Testing Data

Next we create two arrays for each of the fast and slow moving averages: one containing the train data and one containing the test data. Since the fast and slow indicators each have their own model, they need their own test and train data sets.

fast_data = data.filter(['fast_ra'])
slow_data = data.filter(['slow_ra'])

# Get Number of Rows to train model. (Use 80%)
fast_training_data_len = math.ceil( len(fast_data) * .8)
# Create train and test data arrays.
fast_train_data = fast_data[0:fast_training_data_len]
fast_test_data = fast_data[fast_training_data_len-lookback_distance:]

# Get Number of Rows to train model. (Use 80%)
slow_training_data_len = math.ceil( len(slow_data) * .8)
# Create train and test data arrays.
slow_train_data = slow_data[0:slow_training_data_len]
slow_test_data = slow_data[slow_training_data_len-lookback_distance:]

Next we need to scale the data using a MinMaxScaler. This transforms the data so the minimum value is 0 and the maximum value is 1, with all in-between values linearly mapped to a value between 0 and 1.

For example: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10] turns into [0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]

This makes it easier for the model to fit and train.
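
To see the scaler do exactly that, here is the toy example from above in code:

from sklearn.preprocessing import MinMaxScaler
import numpy as np

values = np.arange(0, 11, dtype=float).reshape(-1, 1)  # 0 through 10 as a column
scaler = MinMaxScaler(feature_range=(0, 1))
print(scaler.fit_transform(values).ravel())
# [0.  0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1. ]
print(scaler.inverse_transform([[0.5]]))  # [[5.]] -- we undo the scaling the same way later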

# Scale the data. Fit the scaler on the training data only, then apply the
# same scaling to the test data so no test information leaks into the scaler.
fast_scalar = MinMaxScaler(feature_range=(0,1))
fast_scaled_train_data = fast_scalar.fit_transform(fast_train_data)
fast_scaled_test_data = fast_scalar.transform(fast_test_data)

slow_scalar = MinMaxScaler(feature_range=(0,1))
slow_scaled_train_data = slow_scalar.fit_transform(slow_train_data)
slow_scaled_test_data = slow_scalar.transform(slow_test_data)

Now we need to set up the training data. We are going to make two models: one for the fast moving average and one for the slow moving average. This means two nearly identical blocks of setup code, so you will see a lot of duplication.

We need to create two arrays for each model: one array where each entry is a window of 60 moving average values (x), and another array holding the target moving average for each window (y).

For example:

The first entry of the x array will be [ma0, ma1, ma2, … ma59] and the first entry of the y array is ma60

The second entry of the x array will be [ma1, ma2, ma3, … ma60] and the second entry of the y array is ma61

The pattern is we get 60 ma values to predict the 61st ma value.

fast_x_train = []
fast_y_train = []

slow_x_train = []
slow_y_train = []

# Create the two arrays: one with the window of 60 MA values leading up to
# the target MA, and one with the target MA itself.
# We start at the lookback distance so the first target has a full window
# of values behind it.
for i in range(lookback_distance, len(fast_scaled_train_data)):
    fast_x_train.append(fast_scaled_train_data[i-lookback_distance:i, 0])
    fast_y_train.append(fast_scaled_train_data[i, 0])
    slow_x_train.append(slow_scaled_train_data[i-lookback_distance:i, 0])
    slow_y_train.append(slow_scaled_train_data[i, 0])

# Convert x_train and y_train to numpy array
fast_x_train,  fast_y_train = np.array(fast_x_train), np.array(fast_y_train)
slow_x_train,  slow_y_train = np.array(slow_x_train), np.array(slow_y_train)

# Reshape Data into Samples, Timesteps, Features
fast_x_train = np.reshape(fast_x_train, (fast_x_train.shape[0] ,fast_x_train.shape[1], 1))
slow_x_train = np.reshape(slow_x_train, (slow_x_train.shape[0] ,slow_x_train.shape[1], 1))

Train the Model using Keras

First we will make a static model for both the fast and slow-moving averages with an initial set of parameters. The model will look like:

(60 Previous MA Values) -> LSTM Layer -> LSTM Layer -> Dense Hidden Layer -> Single Neuron Output -> (Predicted MA)

# Initialize the constructor
best_fast_model = Sequential()
# Add the first LSTM layer
best_fast_model.add(LSTM(30, return_sequences=True, input_shape = (fast_x_train.shape[1],1)))
# Add a second LSTM layer
best_fast_model.add(LSTM(30, return_sequences=False))
# Add a dense hidden layer
best_fast_model.add(Dense(8))
# Add an output layer with one neuron and no activation specified
best_fast_model.add(Dense(1))

# Initialize the constructor
best_slow_model = Sequential()
# Add the first LSTM layer
best_slow_model.add(LSTM(30, return_sequences=True, input_shape = (slow_x_train.shape[1],1)))
# Add a second LSTM layer
best_slow_model.add(LSTM(30, return_sequences=False))
# Add a dense hidden layer
best_slow_model.add(Dense(8))
# Add an output layer with one neuron and no activation specified
best_slow_model.add(Dense(1))

opt = tf.keras.optimizers.Adam(learning_rate=0.001)

# Compile the models. (Accuracy isn't a meaningful metric for a regression
# target like this; mse is the number to watch.)
best_fast_model.compile(optimizer=opt,loss='mean_squared_error', metrics=['accuracy','mse'])
best_slow_model.compile(optimizer=opt,loss='mean_squared_error', metrics=['accuracy','mse'])

Before we train our model, we need some background. A Neural Net is trained by taking a sample from the training set, feeding it into a randomly initialized Neural Net, and receiving an output. The difference between the output and the target variable determines the error. The computer then propagates backward through the model, correcting for this error by modifying the weights and thresholds of the neurons. This is done repeatedly over all the training data to build an accurate model. One forward and backward pass over all the training data is called an epoch. One forward and backward pass over a subset of the training data is called an iteration, and the size of that subset is called the batch size. For example, with 1,000 training samples and a batch size of 1, one epoch consists of 1,000 iterations.

model.fit() Inputs:

  • an array of leading ma values (X)

  • an array of target ma values (y)

  • batch_size - The amount of training data used for each iteration of a forward and backward pass. (The higher it is, the more memory each pass needs.)

  • epochs - The number of epochs to train the model. (More epochs generally fit the training data better, at the risk of overfitting.)

  • validation_split - The fraction of the data set aside and not used for training. It is used to evaluate the specified metrics at the end of each epoch.

# Train the model
best_fast_model.fit(fast_x_train, fast_y_train, batch_size=1, epochs=4, validation_split=0.1)
best_slow_model.fit(slow_x_train, slow_y_train, batch_size=1, epochs=4, validation_split=0.1)
Epoch 4/4 ... mse: 1.8177e-04 - val_loss: 1.4620e-04 - val_accuracy: 0.0083 - val_mse: 1.4620e-04

Epoch 4/4 ... mse: 1.3387e-04 - val_loss: 1.1088e-05 - val_accuracy: 0.0083 - val_mse: 1.1088e-05

One of the things we want to look at is the MSE, the Mean Squared Error. This measurement tells us how far off our model is on average; its units are the units of the target variable squared (dollars squared, in this case). Here we see the mse is approximately equal to the val_mse. This is important: if the mse on the training set is significantly lower than on validation, we are probably overfitting, meaning our model is fitting too much to the noise. If it's much higher, we are probably underfitting, meaning the model isn't fitting the data well enough.
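
A quick way to eyeball over- or underfitting is to plot the training and validation loss from the History object that .fit() returns (a sketch, capturing the return value of the same fit call as above):

history = best_fast_model.fit(fast_x_train, fast_y_train, batch_size=1, epochs=4, validation_split=0.1)
plt.plot(history.history['loss'], label='train loss')
plt.plot(history.history['val_loss'], label='validation loss')
plt.legend()
plt.show()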

After this step we have two fully functioning models that can predict the fast and slow moving averages. However, there are a lot of numbers and parameters we just set arbitrarily or by default when building the model. It would be useful to run the model over a whole set of parameters and see which combination is optimal. This is called hyperparameter optimization.

Train the Model using Keras-Tuner

We are going to use a package called keras-tuner to tune our model. To set it up, we make a HyperModel, which is just the model we had before with some of the fixed parameters replaced by ranges of values to test.

# Subclass the kerastuner HyperModel base class (renamed so it doesn't shadow it)
class MAHyperModel(HyperModel):
  # When making the model it's possible to have meta parameters such as levels and shape.
  # These come in when initializing the HyperModel, and here we have 2:
  # - The depth of the model. (We are just using 1 for now but thought it would be useful to show)
  # - The shape of the input data.
  def __init__(self,levels,input_shape) :
    self.levels      = levels
    self.input_shape = input_shape

  # Here we define our model just like above.
  def build(self, hp) :
    model = Sequential()
    # But here we replace 30 with this object which says we want to test values 28 to 36.
    # (Each tuned value gets its own name; hp.Int calls that reuse a name would
    # share the first registered range and sampled value.)
    model.add(LSTM(
          units = hp.Int(
            'lstm_units_1',
            min_value=28,
            max_value=36,
            step=2,
            default=32
        ),
        return_sequences=True, 
        input_shape = self.input_shape
        )
    )

    # The same again for the second LSTM layer, under its own name
    model.add(LSTM(
          units = hp.Int(
            'lstm_units_2',
            min_value=28,
            max_value=36,
            step=2,
            default=32
        ),
        return_sequences=False
        )
    )

    # Here we have a loop which can create multiple hidden layers according to levels
    for i in range(0, self.levels):
      # Here we replace 8 with this object which says we want to test values 6 to 10
      model.add(Dense(
        units = hp.Int(
            'dense_units',
            min_value=6,
            max_value=10,
            step=2,
            default=8
        ),
        # Here we replace the default activation with other activation functions. 
        activation=hp.Choice(
            'dense_activation',
            values=['relu', 'tanh', 'sigmoid','linear'],
            default='linear'
        )
      ))

    # The output layer reuses the 'dense_activation' choice, so the hidden and
    # output layers share the sampled activation function.
    model.add(Dense(1,
        activation=hp.Choice(
            'dense_activation',
            values=['relu', 'tanh', 'sigmoid','linear'],
            default='linear'
        )
    ))
    
    # We can also optimize parameters to the compile command. Here we test different learning rates.
    model.compile(
        optimizer=keras.optimizers.Adam(
            hp.Float(
                'learning_rate',
                min_value=8e-4,
                max_value=1e-3,
                sampling='LOG',
                default=1e-3
            )
        ),
        loss='mean_squared_error', 
        metrics=['mse', 'accuracy']
    )

    return model

# Create fast and slow models from the same general HyperModel
fasthypermodel = MAHyperModel(1, (fast_x_train.shape[1],1))
slowhypermodel = MAHyperModel(1, (slow_x_train.shape[1],1))

Next we set up the tuner. The tuner determines how we search the parameter space, since we cannot possibly test every combination. One option is RandomSearch, which just tests a certain number of random combinations; here we select a search method with some more smarts, Hyperband.

We can have it search for as short or as long as we want; the longer it runs, the better the results tend to be.

# Once you run there will be a dialogue showing what iteration you are on.
# To get an estimate of how many iterations will be run use this formula: 
# hyperband_iterations * max_epochs * (math.log(max_epochs, factor) ** 2)
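# With the settings below, that's roughly 5 * 4 * (math.log(4, 2) ** 2) = 80.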

# Note: I had to change the project name each time I ran this. The good news is that you
# can run it, stop it, and then continue where it left off, as it continuously saves your model.
# You can even load a model from before.

# It took about 2 hrs to get a result with these settings. 

fast_tuner = Hyperband(
    fasthypermodel,
    objective='mse',
    seed=5354,
    max_epochs=4,
    hyperband_iterations=5,
    factor = 2,
    executions_per_trial=1,
    directory=google_drive_path+'hyperparameter_tuning',
    project_name='name0'
)

slow_tuner = Hyperband(
    slowhypermodel,
    objective='mse',
    seed=9953,
    max_epochs=4,
    hyperband_iterations=5,
    factor = 2,
    executions_per_trial=1,
    directory=google_drive_path+'hyperparameter_tuning',
    project_name='name1'
)

# (Optional) These output the search space the hyper tuner will run.
fast_tuner.search_space_summary()
slow_tuner.search_space_summary()

After we set up the tuners we just have to run them. They take the same arguments as the .fit() method.

fast_tuner.search(fast_x_train, fast_y_train, batch_size=1, epochs=4,validation_split=0.1) 
slow_tuner.search(slow_x_train, slow_y_train, batch_size=1, epochs=4,validation_split=0.1)
# (Optional) Show a summary of the search
fast_tuner.results_summary()
slow_tuner.results_summary()

# Retrieve the best model.
best_opt_fast_model = fast_tuner.get_best_models(num_models=1)[0]
best_opt_slow_model = slow_tuner.get_best_models(num_models=1)[0]

# (Optional) Save Models
best_fast_model.save(google_drive_path+"fast_model")
best_slow_model.save(google_drive_path+"slow_model")
best_opt_fast_model.save(google_drive_path+"fast_opt_model")
best_opt_slow_model.save(google_drive_path+"slow_opt_model")
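
On a later run you can load the saved models back instead of retraining (a sketch using the same paths as above):

best_fast_model = keras.models.load_model(google_drive_path+"fast_model")
best_slow_model = keras.models.load_model(google_drive_path+"slow_model")
best_opt_fast_model = keras.models.load_model(google_drive_path+"fast_opt_model")
best_opt_slow_model = keras.models.load_model(google_drive_path+"slow_opt_model")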

At this point we have four models in total: two created without any optimization and two optimized with hyperparameter tuning. Now we need to create the test set the same way we created the training set, get predictions, grab the RMSE (the square root of the MSE, which puts the error back in the units of the target), and plot the results to see how the models did.

Test Model and Evaluate

# Create the testing set
fast_x_test = []
fast_y_test = fast_data[fast_training_data_len:]

slow_x_test = []
slow_y_test = slow_data[slow_training_data_len:]

for i in range(lookback_distance, len(fast_scaled_test_data)):
    fast_x_test.append(fast_scaled_test_data[i-lookback_distance:i, 0])
    slow_x_test.append(slow_scaled_test_data[i-lookback_distance:i, 0])

# Convert the data to a numpy
fast_x_test = np.array( fast_x_test)
slow_x_test = np.array( slow_x_test)

# Reshape Data
fast_x_test = np.reshape( fast_x_test, ( fast_x_test.shape[0] , fast_x_test.shape[1], 1))
slow_x_test = np.reshape( slow_x_test, ( slow_x_test.shape[0] , slow_x_test.shape[1], 1))

# Predictions. Remember the model takes in scaled data,
# so we have to unscale what comes out.
predictions_opt_fast = best_opt_fast_model.predict(fast_x_test)
predictions_opt_fast = fast_scalar.inverse_transform(predictions_opt_fast)

predictions_opt_slow = best_opt_slow_model.predict(slow_x_test)
predictions_opt_slow = slow_scalar.inverse_transform(predictions_opt_slow)


# Predictions. Remember the model takes in scaled data,
# so we have to unscale what comes out.
predictions_fast = best_fast_model.predict(fast_x_test)
predictions_fast = fast_scalar.inverse_transform(predictions_fast)

predictions_slow = best_slow_model.predict(slow_x_test)
predictions_slow = slow_scalar.inverse_transform(predictions_slow)

# Evaluate the models and get the RMSE (square the errors before averaging)
print("Hyperparameter Tuned")
fast_opt_rmse = np.sqrt( np.mean( (predictions_opt_fast - fast_y_test)**2 ) )
slow_opt_rmse = np.sqrt( np.mean( (predictions_opt_slow - slow_y_test)**2 ) )
print(fast_opt_rmse)
print(slow_opt_rmse)

# Evaluate the models and get the RMSE
print("Untuned")
fast_rmse = np.sqrt( np.mean( (predictions_fast - fast_y_test)**2 ) )
slow_rmse = np.sqrt( np.mean( (predictions_slow - slow_y_test)**2 ) )
print(fast_rmse)
print(slow_rmse)
Hyperparameter Tuned
fast_ra    0.313589
dtype: float64
slow_ra    0.083455
dtype: float64
Untuned
fast_ra    0.151402
dtype: float64
slow_ra    0.090759
dtype: float64

Here we calculate RMSE because it is easier to interpret: it's in the units of what we are predicting (dollars in this case).

The errors here are above what we got in training and validation, which may point to overfitting. To account for this, we could do things such as lowering the neuron count of each layer.

Here I was getting RMSEs smaller than a dollar, but the RMSE will change depending on how the specific stock behaves. You also need to take into account how expensive the stock is when evaluating this value, since it's not normalized. For example, being 20 cents off on a $5 stock is a lot worse than 20 cents off on a $300 stock.

Now we want to plot the data.

ftrain = fast_data[:fast_training_data_len]
fvalid = fast_data[fast_training_data_len:].copy()
fvalid['predictions'] = predictions_fast

strain = slow_data[:slow_training_data_len]
svalid = slow_data[slow_training_data_len:].copy()
svalid['predictions'] = predictions_slow

plt.figure(figsize=(16,8))
plt.plot(ftrain['fast_ra'])
plt.plot(strain['slow_ra'])

plt.plot(fvalid[['fast_ra','predictions']])
plt.plot(svalid[['slow_ra','predictions']])

Here we see our models did a pretty good job, but now we need to backtest to see just how well the strategy works. (The bolded lines are where the predicted and actual values overlay.)

Backtesting and Implementing the Predictive Strategy

Now that we have the models, we need to implement the strategy. Let's:

  • Retrieve some data using Finnhub API

  • Create a dataframe with all of our new indicators.

  • Tell Backtrader how to read in this new dataframe.

  • Create a custom strategy using the new indicators.

  • Run strategy and evaluate results.

Retrieve and Setup Data

Same stuff. Different date range.

# Validation for PFast and PSlow Parameters of MA Crossover
timestamp_start = int(dt_start_backtest.replace(tzinfo=timezone.utc).timestamp())
timestamp_end = int(dt_end_backtest.replace(tzinfo=timezone.utc).timestamp())

finnhub_client = finnhub.Client(api_key=finnhub_api_key)
res = finnhub_client.stock_candles(ticker, 'D', timestamp_start, timestamp_end)
data = pd.DataFrame(res)
data.dropna(inplace=True)

# Make Data Backtrader Friendly
dataBacktrader = data
dataBacktrader.columns = ['close', 'high', 'low','open','s','t','volume']

dataBacktrader['date'] = dataBacktrader.apply(lambda row : datetime.utcfromtimestamp(row['t']), axis = 1)
dataBacktrader.drop(['s','t'], axis = 1, inplace=True)
dataBacktrader['openinterest'] = 0

Create Pandas Dataframe with Indicators

Now we want to create a dataframe that includes all of our indicators and that Backtrader can run. Here we add 6 columns to our typical dataframe:

  • fast_ra - Fast Rolling Average

  • slow_ra - Slow Rolling Average

  • pred_fast_opt - Tomorrow's predicted fast ra from the tuned model.

  • pred_slow_opt - Tomorrow's predicted slow ra from the tuned model.

  • pred_fast - Tomorrow's predicted fast ra from the untuned model.

  • pred_slow - Tomorrow's predicted slow ra from the untuned model.

# Populate the data frame with rolling averages of the closing price
data['fast_ra'] = data['close'].rolling(window=opt_fast).mean()
data['slow_ra'] = data['close'].rolling(window=opt_slow).mean()
# Keep the leading NaN rows for now so the row count still lines up with the
# prediction arrays below; they get dropped after the predictions are added.

fast_data = data.filter(['fast_ra'])
slow_data = data.filter(['slow_ra'])

# Here we do the same thing as when we constructed the testing set,
# except we use all of the data in the backtesting data set.
fast_x_test = []
fast_y_test = fast_data[lookback_distance:]

slow_x_test = []
slow_y_test = slow_data[lookback_distance:]

# Scale using the scalers fit on the training data (transform only, no refit,
# so nothing from the backtest window leaks into the scaling)
fast_scaled_test_data = fast_scalar.transform(fast_data)
slow_scaled_test_data = slow_scalar.transform(slow_data)

for i in range(lookback_distance, len(fast_scaled_test_data)):
    fast_x_test.append(fast_scaled_test_data[i-lookback_distance:i, 0])
    slow_x_test.append(slow_scaled_test_data[i-lookback_distance:i, 0])

# Convert the data to a numpy
fast_x_test = np.array( fast_x_test)
slow_x_test = np.array( slow_x_test)

# Reshape Data
fast_x_test = np.reshape( fast_x_test, ( fast_x_test.shape[0] , fast_x_test.shape[1], 1))
slow_x_test = np.reshape( slow_x_test, ( slow_x_test.shape[0] , slow_x_test.shape[1], 1))

# Predictions. Remember the model takes in scaled data,
# so we have to unscale what comes out.
predictions_opt_fast = best_opt_fast_model.predict(fast_x_test)
predictions_opt_fast = fast_scalar.inverse_transform(predictions_opt_fast)

predictions_opt_slow = best_opt_slow_model.predict(slow_x_test)
predictions_opt_slow = slow_scalar.inverse_transform(predictions_opt_slow)

# Predictions. Remember the model takes in scaled data,
# so we have to unscale what comes out.
predictions_fast = best_fast_model.predict(fast_x_test)
predictions_fast = fast_scalar.inverse_transform(predictions_fast)

predictions_slow = best_slow_model.predict(slow_x_test)
predictions_slow = slow_scalar.inverse_transform(predictions_slow)

# Evaluate the models and get the RMSE on the backtesting set
print("Hyperparameter Tuned")
fast_opt_rmse = np.sqrt( np.mean( (predictions_opt_fast - fast_y_test)**2 ) )
slow_opt_rmse = np.sqrt( np.mean( (predictions_opt_slow - slow_y_test)**2 ) )
print(fast_opt_rmse)
print(slow_opt_rmse)

print("Untuned")
fast_rmse = np.sqrt( np.mean( (predictions_fast - fast_y_test)**2 ) )
slow_rmse = np.sqrt( np.mean( (predictions_slow - slow_y_test)**2 ) )
print(fast_rmse)
print(slow_rmse)
Hyperparameter Tuned
fast_ra    0.617608
dtype: float64
slow_ra    0.477331
dtype: float64
Untuned
fast_ra    0.168646
dtype: float64
slow_ra    0.025827
dtype: float64

Here we see the RMSE grew even more, which makes sense; however, it is still a relatively low value. I just used the RMSEs here as a checkpoint to make sure the models are correctly trained and fit.

Now to construct the actual DataFrame.

# Construct the dataframe that will feed the Backtrader strategy. Drop the
# first lookback_distance rows so the rows line up with the prediction arrays.
dataBacktrader_trunc = dataBacktrader.loc[lookback_distance:,:]
dataBacktrader_trunc['pred_fast_opt'] = predictions_opt_fast
dataBacktrader_trunc['pred_slow_opt'] = predictions_opt_slow
dataBacktrader_trunc['pred_fast'] = predictions_fast
dataBacktrader_trunc['pred_slow'] = predictions_slow
dataBacktrader_trunc.dropna(inplace=True)

plt.figure(figsize=(16,8))
plt.plot(dataBacktrader_trunc[['fast_ra','pred_fast_opt','pred_fast']])
plt.plot(dataBacktrader_trunc[['slow_ra','pred_slow_opt','pred_slow']])

Here we see areas, such as the right peak, where the predicted values deviate the most. These are probably the largest contributors to the RMSE values.

Since we added indicators to our dataframe, we need to tell Backtrader how to read it in. To do this, we create a new class to take in the dataframe object.

# Here we define the fields of our custom dataframe object that will feed Backtrader
class MAData(btfeeds.PandasData):

    # The new data that will be available in the strategy's lines object
    lines = ('fast_ra', 'slow_ra', 'pred_fast_opt', 'pred_slow_opt', 'pred_fast', 'pred_slow')

    # Which columns go to which variable
    params = (
        ('open', 'open'),
        ('high', 'high'),
        ('low', 'low'),
        ('close', 'close'),
        ('volume', 'volume'),
        ('openinterest', 'openinterest'),
        ('fast_ra', 7),
        ('slow_ra', 8),
        ('pred_fast_opt', 9),
        ('pred_slow_opt', 10),
        ('pred_fast', 11),
        ('pred_slow', 12)
    )

Now we want to create our custom strategy that implements our models. You could also go the route of creating custom-built indicators, but that could be for another time.

# Create a subclass of Strategy to define the indicators and logic
class SmaCrossPredicted(bt.Strategy):
    # We have a single parameter which indicates whether we want to use the:
    # 0 - non-hyperparameter-tuned model
    # 1 - hyperparameter-tuned model
    params = dict(
        opt=0
    )

    # At initialization of the strategy
    def __init__(self):
        # Set up the cross indicator on the predicted values, picking the tuned
        # or untuned model's predictions based on the opt parameter.
        if self.p.opt:
            self.crossover = bt.ind.CrossOver(self.data.l.pred_fast_opt, self.data.l.pred_slow_opt)
        else:
            self.crossover = bt.ind.CrossOver(self.data.l.pred_fast, self.data.l.pred_slow)

    def next(self):
        if not self.position:  # not in the market
            if self.crossover > 0:  # if fast crosses slow to the upside
                self.buy()  # enter long

        elif self.crossover < 0:  # in the market & cross to the downside
            self.close()  # close long position

    def stop(self):
        print("...",self.p.opt)

Backtest and Evaluate Strategies

Now using our custom strategy we want to backtest three versions:

  • opt - The tuned model

  • non opt - The untuned model

  • no model - The Classic SMACross strategy

# Arrays to store result data
results_list   = []
results_struct = []
strat          = ["Tuned Model", "Non Tuned Model", "Classic MA Cross"]

# Arrays to store buy/sell transactions
txns           = [[],[],[]]

# Go through all of the strategies in the strat array
for i in range(0,3):
  # Same initialization as before
  cerebro = bt.Cerebro(stdstats=False,maxcpus=None)
  cerebro.broker.setcash(100000.0)

  dataFrame = MAData(dataname=dataBacktrader_trunc, datetime='date')
  cerebro.adddata(dataFrame)

  # Depending on strat, add the correct strategy
  if (i == 0) :
    cerebro.addstrategy(SmaCrossPredicted, opt = 1)  
  elif (i == 1) :
    cerebro.addstrategy(SmaCrossPredicted,  opt = 0)  
  else :
    cerebro.addstrategy(SmaCross, pslow=opt_slow, pfast=opt_fast)

  # Same as before
  cerebro.addsizer(bt.sizers.PercentSizer, percents=99)
  cerebro.addanalyzer(bt.analyzers.SharpeRatio, timeframe=bt.TimeFrame.Days)
  cerebro.addanalyzer(bt.analyzers.TimeReturn, timeframe=bt.TimeFrame.NoTimeFrame)
  cerebro.addanalyzer(bt.analyzers.TradeAnalyzer, _name="ta")
  cerebro.addanalyzer(bt.analyzers.DrawDown)
  cerebro.addanalyzer(bt.analyzers.SQN, _name="sqn")

  # Here we add a Transactions analyzer so we can see how the strategies differ
  # in terms of transaction start dates and lengths.
  cerebro.addanalyzer(bt.analyzers.Transactions, _name="txn")

  cerebro.broker.setcommission(commission=0)
  results = cerebro.run();

  # Now we have to accumulate the results ourselves 
  # vs all of the results being returned in one cerebro run.
  results_list.append(results)

  # We need to convert the transactions from dates of buys and sells to pairs of
  # (date of buy, how long it was held). This helps us plot the transaction lengths.
  for item in results_list[i][0].analyzers.txn.get_analysis().items():
    if (item[1][0][0] < 0):
       sell_date = int(item[0].replace(tzinfo=timezone.utc).timestamp())   
       txns[i].append((start_date,sell_date-start_date)) 
    else : 
       start_date = int(item[0].replace(tzinfo=timezone.utc).timestamp())

# Go through results and create a results dataframe.
for i in range(0, len(results_list)):    
  r_strat  = strat[i];
  r_return = list(results_list[i][0].analyzers.timereturn.get_analysis().values())[0]*100
  r_sharpe = results_list[i][0].analyzers.sharperatio.get_analysis()['sharperatio'];
  r_draw   = results_list[i][0].analyzers.drawdown.get_analysis()['drawdown']/100;
  r_sqn    = results_list[i][0].analyzers.sqn.get_analysis()['sqn']

  results_struct.append({
      "Stratagy" : r_strat,
      "Return" : r_return,
      "Sharpe" : r_sharpe,
      "Drawdown" : r_draw,
      "SQN" : r_sqn
  })

resultspd = pd.DataFrame(results_struct)    

for i in range(0, 3):   
  # See References and Resources
  printTradeAnalysis(results_list[i][0].analyzers.ta.get_analysis())

resultspd
Tuned Trade Analysis Results:
               Total Open     Total Closed   Total Won      Total Lost
               0              3              2              1
               Strike Rate    Win Streak     Losing Streak  PnL Net
               66.67          2              1              38933.27
Untuned Trade Analysis Results:
               Total Open     Total Closed   Total Won      Total Lost
               0              7              3              4
               Strike Rate    Win Streak     Losing Streak  PnL Net
               42.86          2              4              31463.47
MA Cross Trade Analysis Results:
               Total Open     Total Closed   Total Won      Total Lost
               0              7              3              4
               Strike Rate    Win Streak     Losing Streak  PnL Net
               42.86          2              4              21216.55

            Strategy         Return     Sharpe      Drawdown    SQN
0           Tuned Model      38.933272  0.033459    0.055129    1.342116
1           Non Tuned Model  31.463466  0.020075    0.054352    0.946327
2           Classic MA Cross 21.216555  0.013938    0.083485    0.811401

Here we see interesting results. 

  • First, our models appear to do better in terms of returns. In addition, they do better in terms of the success rate and consistency of transactions, as indicated by the higher SQN and Sharpe.

  • The tuned model does better than our untuned model.

  • However, we still have a below-average SQN.

Now we create a broken bar plot to visualize when each strategy held positions in the stock, which might provide insight into why our model did better.

fig, ax = plt.subplots(figsize=(30, 10))
ax.broken_barh(txns[0], (10, 9), facecolors='gray', hatch = 'X')
ax.broken_barh(txns[1], (20, 9), facecolors='gray', hatch = '/')
ax.broken_barh(txns[2], (30, 9), facecolors='gray', hatch = '.')
plt.ylabel("Tuned Model                                   Untuned Model                              MACross")
for i in range(0,len(txns[2])):
  ax.axvline(x=txns[2][i][0], color="black")
plt.show()

Here we see our model didn't go exactly as planned. The top row is the MACross, the middle is the untuned model, and the bottom is the tuned model. The black lines indicate when each MACross buy happened.

It looks like, rather than buying or selling before the cross, our model filtered the buys and sells down to the highest-return areas, possibly acting as a filter on transient events. Our model may work like those secondary indicators I mentioned above, which can flag false buy and sell signals. Since the stock is held for a shorter amount of time, we could buy other stocks during the idle periods and gain an even better return. However, a portfolio-style setup is for another post.

Conclusions

In the end our model didn't do exactly what we wanted, but it had an interesting effect nonetheless. It looks like the LSTM filtered the signal down to what was supposed to happen according to previous patterns, rather than reacting to transient noise or events. I ran this model over a whole bunch of stocks and found a similar pattern in each: typically fewer transactions over shorter periods, but with quite good success rates on those transactions. This example shows the model in a good light, where the returns beat the MACross, which held onto the stock for a long time. In many of the other runs I saw a higher success rate on shorter transactions; however, the returns were smaller since the stock wasn't held as long.

In the end, I wouldn't go trading your retirement on this. Still, I hope this worksheet provided a framework to test more (and hopefully better) ideas utilizing Neural Nets.

Thanks for reading my first post. Feel free to contact me if you see any typos or things that are just incorrect. Also, if you have any questions, feel free to ping me.

Future Work

Of course there are always some more things we could do. I might investigate these more in a future post.

  • Test this over different granularities, such as months, or see how it does with intraday trading.

  • Make a more robust model that forecasts movement vs. trying to capture a very noisy next-day indicator.

  • Try to predict a target further in the future than a day.

  • Find a better slow/fast window optimization method that considers a region of values vs. just the max value.

  • Run this strategy over a portfolio to have less dead time for our money.

  • Add a Twitter bot to ping you when a cross might happen.

References and Resources

[General Concepts] Quantitative Trading: How to Build Your Own Algorithmic Trading Business by Ernie Chan (Author)

[Backtrader] General Quickstart for Backtrader: https://www.backtrader.com/docu/quickstart/quickstart/

[Backtrader] How to implement the simple MA Crossover: https://community.backtrader.com/topic/2381/simple-ma-crossover/2

[Backtrader] How to use datafeeds to get Pandas: https://www.backtrader.com/docu/pandas-datafeed/pandas-datafeed/

[Backtrader] How to use Analyzers https://github.com/soulmachine/crypto-notebooks/blob/master/backtest/backtrader-SMA-Cross.ipynb

[LSTM] How to predict Stock Prices using LSTM Models: https://www.youtube.com/watch?v=QIUxPv5PJOY

[Finnhub] Finnhub API Documentation: https://finnhub.io/docs/api

[Twitter API] Twitter API Direct Messages: https://developer.twitter.com/en/docs/twitter-api/v1/direct-messages/sending-and-receiving/api-reference/new-event

[Hyper Parameters] https://www.sicara.ai/blog/hyperparameter-tuning-keras-tuner

[Hyper Parameters] https://keras-team.github.io/keras-tuner/documentation/tuners/

[Backtrader Strategy Development] https://medium.com/@danjrod/custom-indicator-development-in-python-with-backtrader-bc775552dc3e

[Analyzers] https://backtest-rookies.com/2017/06/11/using-analyzers-backtrader/

Author: www.backtest-rookies.com under MIT License
def printTradeAnalysis(analyzer):
    '''
    Function to print the Technical Analysis results in a nice format.
    '''
    #Get the results we are interested in
    total_open = analyzer.total.open
    total_closed = analyzer.total.closed
    total_won = analyzer.won.total
    total_lost = analyzer.lost.total
    win_streak = analyzer.streak.won.longest
    lose_streak = analyzer.streak.lost.longest
    pnl_net = round(analyzer.pnl.net.total,2)
    # Round so the strike rate fits within the 15-character column width below
    strike_rate = round((total_won / total_closed) * 100, 2)
    #Designate the rows
    h1 = ['Total Open', 'Total Closed', 'Total Won', 'Total Lost']
    h2 = ['Strike Rate','Win Streak', 'Losing Streak', 'PnL Net']
    r1 = [total_open, total_closed, total_won, total_lost]
    r2 = [strike_rate, win_streak, lose_streak, pnl_net]
    #Check which set of headers is the longest.
    if len(h1) > len(h2):
        header_length = len(h1)
    else:
        header_length = len(h2)
    #Print the rows
    print_list = [h1,r1,h2,r2]
    row_format ="{:<15}" * (header_length + 1)
    print("Trade Analysis Results:")
    for row in print_list:
        print(row_format.format('',*row))

Disclaimer

This model doesn’t work that well. Don’t use it plz. I am not a professional financial advisor and have no formal training to give out personal financial advice. You need to do your own research or consult with a professional before doing anything actionable in terms of trading securities. This guide is meant for educational purposes only.
