Data Science / R / R News / Statistics

Trading strategy: Making the most of the out of sample data

by The R Trader · August 19, 2016

This article is originally published at http://www.thertrader.com

When testing trading strategies a common approach is to divide the initial data set into in sample data: the part of the data designed to calibrate the model and out of sample data: the part of the data used to validate the calibration and ensure that the performance created in sample will be reflected in the real world. As a rule of thumb around 70% of the initial data can be used for calibration (i.e. in sample) and 30% for validation (i.e. out of sample). Then a comparison of the in and out of sample data help to decide whether the model is robust enough. This post aims at going a step further and provides a statistical method to decide whether the out of sample data is in line with what was created in sample.

In the chart below the blue area represents the out of sample performance for one of my strategies.

A simple visual inspection reveals a good fit between the in and out of sample performance but what degree of confidence do I have in this? At this stage not much and this is the issue. What is truly needed is a measure of similarity between the in and out of sample data sets. In statistical terms this could be translated as the likelihood that the in and out of sample performance figures coming from the same distribution. There is a non-parametric statistical test that does exactly this: the Kruskall-Wallis Test. A good definition of this test could be found on R-Tutor “A collection of data samples are independent if they come from unrelated populations and the samples do not affect each other. Using the Kruskal-Wallis Test, we can decide whether the population distributions are identical without assuming them to follow the normal distribution.” The added benefit of this test is not assuming a normal distribution.

It exists other tests of the same nature that could fit into that framework. The Mann-Whitney-Wilcoxon test or the Kolmogorov-Smirnov tests would perfectly suits the framework describes here however this is beyond the scope of this article to discuss the pros and cons of each of these tests. A good description along with R examples can be found here.

Here’s the code used to generate the chart above and the analysis:

################################################
## Making the most of the OOS data
##
## [email protected] - Aug. 2016
################################################
library(xts)
library(PerformanceAnalytics)

thePath <- "myPath" #change this
theFile <- "data.csv" 
data <- read.csv(paste0(thePath,theFile),header=TRUE,sep=",")
data <- xts(data[,2],order.by=as.Date(as.character(data[,1]),format = "%d/%m/%Y"))

##----- Strategy's Chart
par(mex=0.8,cex=1)
thePeriod <- c("2012-02/2016-05")
chart.TimeSeries(cumsum(data),
 main = "System 1",
 ylab="",
 period.areas = thePeriod,
 grid.color = "lightgray",
 period.color = "slategray1")

##----- Kruskal tests
pValue <- NULL
i <- 1
while (i < 1000){
 isSample <- sample(isData,length(osData))
 pValue <- rbind(pValue,kruskal.test(list(osData, isSample))$p.value)
 i <- i + 1
}

##----- Mean of p-values 
mean(pValue)

In the example above the in sample period is longer than the out of sample period therefore I randomly created 1000 subsets of the in sample data each of them having the same length as the out of sample data. Then I tested each in sample subset against the out of sample data and I recorded the p-values. This process creates not a single p-value for the Kruskall-Wallis test but a distribution making the analysis more robust. In this example the mean of the p-values is well above zero (0.478) indicating that the null hypothesis should be accepted: there are strong evidences that the in and out of sample data is coming from the same distribution.

As usual what is presented in this post is a toy example that only scratches the surface of the problem and should be tailored to individual needs. However I think it proposes an interesting and rational statistical framework to evaluate out of sample results.

This post is inspired by the following two papers:

Vigier Alexandre, Chmil Swann (2007), “Effects of Various Optimization Functions on the Out of Sample Performance of Genetically Evolved Trading Strategies”, Forecasting Financial Markets Conference

Vigier Alexandre, Chmil Swann (2010), « An optimization process to improve in/out of sample consistency, a Stock Market case», JP Morgan Cazenove Equity Quantitative Conference, London October 2010

Thanks for visiting r-craft.org
This article is originally published at http://www.thertrader.com
Please visit source website for post related comments.

Cookie	Duration	Description
cookielawinfo-checkbox-analytics	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Analytics".
cookielawinfo-checkbox-functional	11 months	The cookie is set by GDPR cookie consent to record the user consent for the cookies in the category "Functional".
cookielawinfo-checkbox-necessary	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Necessary".
cookielawinfo-checkbox-others	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Other.
cookielawinfo-checkbox-performance	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Performance".
viewed_cookie_policy	11 months	The cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data.

Trading strategy: Making the most of the out of sample data

You may also like...

Categories

Trading strategy: Making the most of the out of sample data

You may also like...

Using RcppParallel to aggregate to a vector

More about Flexible Frequency Models

Causal Inference cheat sheet for data scientists

Categories