Validation Principle
The second point is that the
fundamental Validation Principle must guide any validation method:
If a sub-set of trading system
parameters produces an optimal result on all available data and these
results are statistically significant
at level alpha as compared with results from other parameter sub-sets and
a benchmark random entry and exit
system having the same number of total trades, then the system can be expected to continue to produce optimal
results as long as the data remains stationary.
Validation Methods
Almost all systems encompass a
large range of possible parameter sub-sets and trading results. At least
one of these sub-sets generally
produces a random-like result. Therefore the task of validating most systems reduces to finding a statistically
significant optimal parameter sub-set utilizing all the data, developmental data periods included.
To determine statistical
significance, numerous sample results from each parameter sub-set must be obtained. The results must be compared between
parameter sets for each sample. If a system is being tested on a single security, for example U.S. 30-Year
T-Bond Futures, then sample period results are determined by closing out open trades at the end of
each chosen calendar period (weeks, months, quarters, half-years, etc.).
However if a system is being tested over a number of different securities, for
example common stocks, then the samples
can be the total results for each security during a standardized period of
time. In either case the system results
will be denoted as r(i, j), where i indexes calendar periods or securities and
j indexes the parameter sub-sets, i
Several methods are available to
test the statistical significance of a trading system. Space limitations confine the discussion to methods that, in
most instances, have the superior statistical power. Another method frequently used is a studentized
range method.
DEPENDENT t TEST – REPEATED MEASURE ANOVA METHOD
The first method to be considered is
a simple dependent t test. Differences are calculated between the returns for the optimal system and the m-1
sub-optimal returns for each period, d(i, k) = r(i, opt) – r(i, k), where
k ∈ [1, m-1] indexes the sub-optimal systems. Each
of the m-1 means over the n samples, mu(k) = Σ_i d(i, k)/n, is tested by comparing the corresponding sample
t value, t(k), with tcrit(n-1, alpha’). Recognizing that "data-snooping" of all
potential pairs of parameter sub-sets is taking place, alpha’ is the
multi-comparison adjusted value of the
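desired risk level alpha. A straightforward Bonferroni-type adjustment (strictly, the Šidák form,
which is nearly identical to alpha divided by the m(m-1)/2 comparisons when alpha is small) is
sufficient:
alpha’ = 1 – (1 – alpha)^[2/(m(m-1))]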
The system is validated if any of
the m-1 difference means, mu(k) are found to be significant; t(k) > tcrit.
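The procedure above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the author's implementation: the function names, the list-of-rows layout for r(i, j), and the one-sided reading of t(k) > tcrit are all assumptions made here. Instead of looking up tcrit(n-1, alpha’), the sketch computes the upper-tail probability of t(k) by numerical integration and compares it with alpha’, which is the same decision.

```python
import math

def sidak_alpha(alpha, m):
    """Per-comparison level alpha' = 1 - (1 - alpha)^(2 / (m (m - 1)))."""
    return 1.0 - (1.0 - alpha) ** (2.0 / (m * (m - 1)))

def t_upper_tail(t, df, steps=2000):
    """P(T > t) for Student's t with df degrees of freedom (Simpson's rule).

    Adequate for moderate |t|; a statistics library is preferable in practice.
    """
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    pdf = lambda x: c * (1.0 + x * x / df) ** (-(df + 1) / 2)
    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    p = max(0.5 - total * h / 3.0, 0.0)  # area to the right of |t|
    return p if t >= 0 else 1.0 - p

def dependent_t_tests(r, opt, alpha=0.05):
    """One-sided dependent t tests of the optimal sub-set against each other one.

    r[i][j] is the result for sample i (period or security) and sub-set j;
    opt is the column index of the optimal sub-set (hypothetical layout).
    """
    n, m = len(r), len(r[0])
    alpha_adj = sidak_alpha(alpha, m)
    results = {}
    for k in range(m):
        if k == opt:
            continue
        d = [row[opt] - row[k] for row in r]            # d(i, k)
        mean = sum(d) / n                               # mu(k)
        var = sum((x - mean) ** 2 for x in d) / (n - 1)
        if var == 0.0:
            raise ValueError("differences are constant; t is undefined")
        t = mean / math.sqrt(var / n)                   # t(k), df = n - 1
        p = t_upper_tail(t, n - 1)
        results[k] = {"mean": mean, "t": t, "p": p, "significant": p < alpha_adj}
    return results

# Made-up results: 5 sample periods (rows) by 3 parameter sub-sets (columns).
r = [[3, 1, 2], [4, 2, 2], [5, 2, 3], [4, 1, 2], [6, 3, 3]]
res = dependent_t_tests(r, opt=0)
```

The numbers in r are invented purely to exercise the code; they are not trading results.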
Computationally, performing
dependent t tests on m(m-1)/2 pairs is equivalent to a Repeated Measure ANOVA of r(i, j), where each measure
represents a distinct parameter sub-set. Validation and significance are determined by the usual ANOVA F test carried
out at a significance level of alpha, not alpha’. Because Repeated Measure ANOVA is concise, it is the
favored computational procedure.
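As a rough sketch of the computation, the one-way Repeated Measure ANOVA F statistic can be obtained from r(i, j) by partitioning the total sum of squares into a parameter sub-set part, a sample (period or security) part, and a residual. The function name and the list-of-rows layout are assumptions made for illustration. The sketch stops at F and its degrees of freedom; the comparison with the critical F value at level alpha would be done with a table or a statistics library (for example scipy.stats.f.sf, if SciPy is available).

```python
def rm_anova_f(r):
    """One-way repeated-measures ANOVA F statistic.

    r[i][j] is the result for sample i and parameter sub-set j
    (hypothetical layout). Returns (F, df_treatment, df_error).
    """
    n, m = len(r), len(r[0])
    grand = sum(sum(row) for row in r) / (n * m)
    col_means = [sum(row[j] for row in r) / n for j in range(m)]  # per sub-set
    row_means = [sum(row) / m for row in r]                       # per sample

    ss_treat = n * sum((c - grand) ** 2 for c in col_means)
    ss_subj = m * sum((s - grand) ** 2 for s in row_means)
    ss_total = sum((x - grand) ** 2 for row in r for x in row)
    ss_err = ss_total - ss_treat - ss_subj  # sample-by-sub-set residual

    df_treat = m - 1
    df_err = (n - 1) * (m - 1)
    f_stat = (ss_treat / df_treat) / (ss_err / df_err)
    return f_stat, df_treat, df_err

# Made-up data with two sub-sets: here F equals the squared paired t value.
f_stat, df1, df2 = rm_anova_f([[3, 1], [4, 2], [5, 2], [4, 1], [6, 3]])
```

With only two parameter sub-sets the F statistic reduces to the square of the dependent t value, which gives a quick sanity check on the arithmetic.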
Prior information concerning the
system performance can be easily incorporated in the RM ANOVA test without using clumsy and complicated
Bayesian procedures. The use of pre-planned contrasts (formulated before conducting the test) that
compare one parameter sub-set with all the others greatly increases the power of the test because multi-comparison
adjustments are not required. The use of such a pre-planned contrast set is illustrated in the example
below.
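Separately from the author's example, one common way to evaluate a single pre-planned contrast in a repeated-measures design is to compute a contrast score for every sample and apply a one-sample t test to those scores. The sketch below assumes the contrast "chosen sub-set minus the average of all the others"; the function name, the data layout, and the choice of contrast weights are illustrative assumptions, not taken from the text.

```python
import math

def planned_contrast_t(r, chosen):
    """t statistic for the contrast: chosen sub-set vs. the mean of the others.

    r[i][j] is the result for sample i and parameter sub-set j
    (hypothetical layout). Returns (mean_contrast, t, df) with df = n - 1.
    The contrast must be fixed before looking at the results.
    """
    n, m = len(r), len(r[0])
    scores = []
    for row in r:
        others = [row[j] for j in range(m) if j != chosen]
        scores.append(row[chosen] - sum(others) / (m - 1))
    mean = sum(scores) / n
    var = sum((s - mean) ** 2 for s in scores) / (n - 1)
    t = mean / math.sqrt(var / n)
    return mean, t, n - 1

# Made-up results: 5 samples (rows) by 3 parameter sub-sets (columns).
r = [[3, 1, 2], [4, 2, 2], [5, 2, 3], [4, 1, 2], [6, 3, 3]]
mean_c, t_c, df_c = planned_contrast_t(r, chosen=0)
```

Because only one comparison is planned, t_c would be compared with tcrit(n-1, alpha) at the unadjusted level alpha.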