Time Series Regression-based Trading Strategy
In this approach, we'll use past data on SPY's components to predict its future returns with a certain degree of accuracy. The least complex version of this task is to predict only one day into the future. This is plausible because, just as in the trend-following strategy, we update our position on a daily basis: at each time step we can use all the data up to the current time step and predict only the return for the next one. Our decision to buy, sell, or do nothing is updated every single day, so we only need to worry about predicting one day ahead.
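The one-day-ahead setup can be sketched as follows. This is a minimal illustration on synthetic returns, not the article's actual data: the ticker columns, the date range, and the choice to keep SPY's own return among the features are all assumptions.

```python
import numpy as np
import pandas as pd

# Synthetic daily returns; tickers and dates are placeholders (assumption).
rng = np.random.default_rng(0)
dates = pd.bdate_range("2020-01-01", periods=6)
returns = pd.DataFrame(
    rng.normal(0.0, 0.01, size=(6, 3)),
    index=dates,
    columns=["AAPL", "MSFT", "SPY"],
)

# Target: SPY's return on day t+1, shifted back so it sits on row t.
target = returns["SPY"].shift(-1)

# The last row has no "next day", so drop it from both features and target.
X = returns.iloc[:-1]
y = target.iloc[:-1]
```

Each row of `X` holds only information available at the close of day t, while the matching entry of `y` is the return realized on day t+1, so the model never sees the value it is asked to predict.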
To feed our model, we'll choose a handful of S&P 500 companies with the largest market caps. (The choice is arbitrary, as we're just exploring.)
A seemingly more intuitive approach would be to feed the model all 500-odd companies and let it do the work; however, with that many inputs the model would overfit.
We will also use returns in lieu of stock prices, because ML models are not good at extrapolation and stock prices generally trend upward over time. If we were to train our model on prices in the range of $100 to $200, it might learn what to do for that range. But if the price goes up to $300 in the test set, the model has never seen such values and doesn't know how to respond. Returns, on the other hand, are more or less stationary, which makes them a better candidate than prices for input features.
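To make the point concrete, here is a small sketch with made-up prices. It assumes simple percentage returns via pandas' `pct_change`; the article may equally well use log returns.

```python
import pandas as pd

# Made-up prices: one stretch near $100, one near $300 (assumption).
low_range = pd.Series([100.0, 101.0])
high_range = pd.Series([300.0, 303.0])

# Simple daily return: (p_t - p_{t-1}) / p_{t-1}. The first value is NaN.
low_ret = low_range.pct_change().dropna()
high_ret = high_range.pct_change().dropna()

print(low_ret.iloc[0], high_ret.iloc[0])
```

Both moves come out as roughly a 1% return even though the price levels differ by a factor of three, so a model trained on returns from the lower range sees inputs on the same scale in the higher range.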
As displayed above, our data currently holds the close prices of various S&P 500 stocks, lined up by day. We've preprocessed the dataframe by removing any columns with missing values and dropping any rows in which all the values are missing.
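The preprocessing described above might look like the sketch below. The tickers and prices are invented, and the order of the two steps is an assumption: dropping all-empty rows first (for example, market holidays) prevents a single empty day from wiping out every column in the second step.

```python
import numpy as np
import pandas as pd

# Invented close prices; 2020-01-02 is an entirely empty row (assumption).
close = pd.DataFrame(
    {
        "AAA": [10.0, np.nan, 10.5, 10.7],
        "BBB": [20.0, np.nan, 20.4, 20.1],
        "CCC": [np.nan, np.nan, 5.2, 5.3],
    },
    index=pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-03", "2020-01-06"]),
)

# Step 1: drop rows in which every value is missing.
close = close.dropna(axis=0, how="all")

# Step 2: drop any column that still has a missing value.
close = close.dropna(axis=1, how="any")
```

After both steps only `AAA` and `BBB` remain, over three trading days, and the frame contains no missing values.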
O/P: (0.0082717541782342, -0.011369618185062102)
As you can see, these scores are quite low: a perfect score would be one, and always predicting the average would score zero. We can think of the latter as the naive prediction. On the training set we are only slightly above the naive prediction, and on the test set we get a negative value, so we are actually doing worse than the naive prediction. Nonetheless, this won't stop us from seeing how the model performs.
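The behaviour of the score can be checked by hand. The sketch below assumes the score is the coefficient of determination (R²), which matches the description of one for a perfect fit and zero for predicting the average; the sample values are made up.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot, computed against the mean of y_true."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # model's squared error
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # naive predictor's squared error
    return 1.0 - ss_res / ss_tot

# Made-up daily returns (assumption).
y = np.array([0.01, -0.02, 0.015, -0.005])

perfect = r_squared(y, y)                        # predictions equal the targets
naive = r_squared(y, np.full_like(y, y.mean()))  # always predict the average
worse = r_squared(y, -y)                         # predictions with the wrong sign
```

A negative value simply means the model's squared error exceeds that of the constant-average predictor on the same data, which is what the test-set score above indicates.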
We don't actually care about the value of the prediction; we simply want to know whether it is positive (we buy) or negative (we sell).
We call the model's model.predict function, which gives us Ptrain and Ptest. We then measure their accuracy using the sign function, which maps each element of an array to plus one if it is positive and minus one if it is negative. We apply this to both the predictions and the targets and compare the results, which gives us an array of booleans; taking the mean of that boolean array yields our classification accuracy.
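The accuracy calculation can be sketched with NumPy as follows. The arrays are made-up stand-ins for the targets and for the output of model.predict, and the helper name `directional_accuracy` is hypothetical.

```python
import numpy as np

def directional_accuracy(y_true, y_pred):
    """Fraction of samples where prediction and target have the same sign."""
    same_sign = np.sign(y_pred) == np.sign(y_true)  # array of booleans
    return float(np.mean(same_sign))                # True counts as 1, False as 0

# Made-up targets and predictions (assumption, not the article's data).
Ytrain = np.array([0.01, -0.02, 0.005, -0.001])
Ptrain = np.array([0.002, -0.001, -0.003, -0.004])

train_acc = directional_accuracy(Ytrain, Ptrain)
print(train_acc)  # 0.75: three of the four signs agree
```

Note that `np.sign` returns zero for an input of exactly zero, so a zero prediction only counts as correct when the target is also zero.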