An appropriate Null Hypothesis for many nonlinear dynamics tests is that the data arise from a linear dynamical system. In order to establish the significance of a test against this Null, one can generate many realizations of the Null, and estimate the significance empirically.
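The empirical estimate described above can be sketched as a rank test: compute a test statistic on the original data and on each realization of the Null, then see how extreme the original value is among them. The function below is a minimal sketch under that assumption. The name surrogate_pvalue and its arguments are hypothetical and not part of the toolbox, and the statistic and the Null generator are passed in as function handles.

```matlab
function p = surrogate_pvalue(ts, statfun, surrfun, nsurr)
% SURROGATE_PVALUE  Empirical one-sided p-value of a statistic against a Null.
%   Hypothetical helper for illustration only.
%   ts      - time series (vector)
%   statfun - handle returning a scalar test statistic for a series
%   surrfun - handle returning one realization of the Null for a series
%   nsurr   - number of Null realizations to generate
s0 = statfun(ts);                  % statistic on the original data
s  = zeros(nsurr, 1);
for k = 1:nsurr
    s(k) = statfun(surrfun(ts));   % statistic on one realization of the Null
end
% Fraction of the nsurr+1 values that are at least as large as s0.
% Counting the original itself keeps p from ever being exactly zero.
p = (1 + sum(s >= s0)) / (nsurr + 1);
end
```

With 99 realizations the smallest attainable value is p = 0.01. A possible call, using a cubed-difference asymmetry statistic chosen here purely as an illustration:
» trev = @(x) abs(mean(diff(x).^3));
» p = surrogate_pvalue(ts, trev, @fftsurr, 99);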
The Null of "linear dynamical system" is not very specific. For instance, it doesn't say anything about simple quantities such as the mean and variance. One approach to making a specific Null is to set the mean and variance equal to those of the original data. In addition, and very importantly, the autocorrelation function can be specified as being the same as that of the original data. Surrogate data is random data generated to have the same mean, variance, and autocorrelation function as the original data.
The function fftsurr(ts) makes one realization of surrogate
data from time series ts. The surrogate data is random in the
sense that it is generated from random numbers. An optional argument seed
allows you to specify the random number generator's seed. This is useful when
the same surrogate data is used repeatedly during development or testing of
algorithms or software.
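One common way to produce random data with a prescribed mean, variance, and autocorrelation function is phase randomization: keep the magnitude of each Fourier coefficient and replace its phase with a random one. The function below is a minimal sketch of that idea for a real-valued series. The name phase_surrogate is hypothetical, the use of rng for seeding is an assumption, and fftsurr itself may be implemented differently.

```matlab
function y = phase_surrogate(ts, seed)
% PHASE_SURROGATE  One phase-randomized surrogate of a real time series.
%   Hypothetical stand-in for illustration; fftsurr may differ in detail.
if nargin > 1
    rng(seed);                     % fix the generator state for repeatability
end
x = ts(:);                         % work on a column vector
n = numel(x);
X = fft(x);
h = floor((n - 1) / 2);            % positive-frequency bins, excluding Nyquist
phi = 2 * pi * rand(h, 1);         % independent uniform random phases
% Rotate the positive-frequency coefficients, then mirror them onto the
% negative frequencies so that the inverse transform is real.
X(2:h+1) = X(2:h+1) .* exp(1i * phi);
X(n:-1:n-h+1) = conj(X(2:h+1));
% X(1) (the mean) and, for even n, the Nyquist bin are left untouched.
y = real(ifft(X));                 % discard round-off imaginary parts
y = reshape(y, size(ts));          % return the same shape as the input
end
```

Because only the phases change, the surrogate keeps the mean and the power spectrum of ts, and therefore its variance and its (circular) sample autocorrelation. Calling the function twice with the same seed returns the same surrogate.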
As an example, consider the time series generated by the sine function of a biased random walk. (Physical interpretation: a drunk is walking on a circle, taking a random step that is biased in the counter-clockwise direction. At each step, we measure the y-component of his position.)
» r = .1 + .2*randn(1000,1);
» phase = cumsum(r);
» ts = sin(phase);
Making surrogate data from this is a simple matter:
» tssurr = fftsurr(ts);
Time plots of the original data and one surrogate are shown below. Note that both the original and the surrogate have roughly the same period between "cycles."
The histogram of tssurr is Gaussian, while that of
the original data ts is not. This certainly indicates
a nonlinearity, but it may be a nonlinearity of measurement and not
of dynamics. Theiler introduced the idea of "amplitude-adjusted" surrogate
data, where the histogram of the surrogate data is the same as that
of the original data. In fact, the surrogate data is simply a reordered
version of the original data, but the reordering is done in a very careful
way that attempts to match the autocorrelation function of the original data.
Amplitude-adjusted surrogate data is implemented by the program
ampsurr.
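The careful reordering can be sketched in three steps, following the general recipe usually attributed to Theiler: make a Gaussian series with the same rank order as the data, take a linear surrogate of that Gaussian series, and then put the original values into the rank order of the surrogate. The function below is a minimal sketch under those assumptions. The name amp_surrogate is hypothetical, the linear surrogate generator is passed in as a handle, and ampsurr may differ in detail.

```matlab
function y = amp_surrogate(ts, surrfun)
% AMP_SURROGATE  Amplitude-adjusted surrogate of a time series.
%   Hypothetical sketch for illustration; ampsurr may differ in detail.
%   surrfun - handle producing one linear surrogate of a series
x = ts(:);
n = numel(x);
[xs, ix] = sort(x);        % xs: sorted data, ix: time index of each sorted value
rx = zeros(n, 1);
rx(ix) = (1:n)';           % rx(t) is the rank of x(t)
g = sort(randn(n, 1));     % sorted Gaussian sample of the same length
g = g(rx);                 % Gaussian series with the same rank order as x
gs = surrfun(g);           % linear surrogate of the Gaussianized series
[~, ig] = sort(gs(:));
rs = zeros(n, 1);
rs(ig) = (1:n)';           % rs(t) is the rank of gs(t)
y = xs(rs);                % original values, reordered to follow gs
y = reshape(y, size(ts));  % return the same shape as the input
end
```

The result contains exactly the values of ts, so its histogram is identical to that of the original data. A possible call with the toolbox's own generator:
» tsamp = amp_surrogate(ts, @fftsurr);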
The use of amplitude-adjusted surrogate data overcomes many pitfalls in the use of surrogate data, but a warning must be given: surrogate data is not foolproof. Still, surrogate data provides the best sanity check available against spurious detection of nonlinear dynamics.