Delta-Epsilon

Introduction

This page describes a method of indirectly establishing the existence of a dynamical function underlying time-series data. The idea is to look for evidence of the continuity of a function, rather than estimating the function itself. It turns out that in some cases it is very much easier to see that there is an underlying continuous function than to estimate it.

The method was first described in D.T. Kaplan, Exceptional Events as Evidence for Determinism Physica D 73:38-48 (1994).

Software

The base program is deltaeps(pre, post). pre and post are embedded data and the scalar images of that data. If the hypothetical function linking pre and post is continuous, then we expect that points that are nearby in pre will also be nearby in post.

The program deltaeps(pre,post) simply calculates two columns of numbers, the distances between pairs of points in pre and the corresponding distances in post. An example will perhaps make things clear: we will

  1. Generate some data from the chaotic quadratic map x(t+1) = 4x(t)(1-x(t)).
  2. Embed the data. We will use dim=2 and lag=1 here.
  3. Pull out the images of the embedded data.
  4. Run deltaeps to calculate the distances.
» ts = fditer('quadmap',0.4325, 4, 100);
» x = lagembed(ts,2);
» [pre,post] = getimage(x,1);
» [delta,epsilon] = deltaeps(pre,post);
A scatter plot of delta versus epsilon shows the anticipated sign of continuity: as delta goes to zero, so does epsilon.

Plot of epsilon versus delta

Cautions: The output of deltaeps can be quite voluminous, so you are cautioned to put a semi-colon at the end of the line where it is invoked. Also, the program will work at a reasonable speed only on short files. It takes roughly a second to run on a time series of length 100, but the run time goes as the square of the time-series length.

An alternative way of graphing epsilon

An effective way of viewing epsilon vs delta is to take a cummulative average of epsilon. This is implemented by the function plotde(delta,epsilon). As an example, let's sum two independent quadratic maps, embed them in a three-dimensional space, and calculate delta-epsilon. We will then compare to two surrogates. (Two surrogates is too few for effectively judging statistical significance, but it gives us a reasonable visual impression.)

ts1 = fditer('quadmap',.342, 4, 100);
ts2 = fditer('quadmap',.42, 4, 100);
ts = ts1+ts2;
x = lagembed(ts,3);
[pre,post] = getimage(x,1);
[delta,epsilon] = deltaeps(pre,post);
% plot out the cummulative average of epsilon versus delta
% in yellow
plotde(delta,epsilon,'y'); 
%now look at two surrogates, for comparison
sur1 = ampsurr(ts);
sur2 = ampsurr(ts);
x = lagembed(sur1,3);
[pre,post] = getimage(x,1);
[delta,epsilon] = deltaeps(pre,post);
hold on;
plotde(delta,epsilon,'g'); 
x = lagembed(sur2,3);
[pre,post] = getimage(x,1);
[delta,epsilon] = deltaeps(pre,post);
plotde(delta,epsilon,'r'); 
hold off;
The results, shown below, clearly illustrate the difference between the sum-of-quadratic-maps and surrogate data. This type of plot is helpful in picking a deltamax as discussed in the next section. The test data is in yellow, and the surrogate data is in green and red.

Delta-Eps for sum of two quadratic maps.

Reduction to a single number

It's often important to be able to generate a single number that reflects the data. For instance, when comparing to surrogate data, having a single summary statistic for each time series allows a t-test or related non-parametric test to be used to establish the significance level at which the Null can be rejected.

There is as yet no published description of such a method for the delta-epsilon test. The method I have been using involves a linear regression of epsilon against delta, and finding the intercept of the regression line. That intercept can be interpreted as the hypothetical distance between images when the pre-images are identical --- in short, the "level of nondeterminism" in the data. The program defit implements one way of doing this.

The central problem in doing the linear regression is the fact that the relationship between delta and epsilon is not linear. A linear relationship does hold approximately for small ranges of delta, and the relevant small range for the purposes of characterizing continuity seems to be very small delta. However, there may not be many delta-epsilon pairs at small delta, so there is a question of how small to use.

I do not know of a always applicable answer to this question, but I offer two suggestions:

  1. Use plotde to pick a deltamax below which the curve is fairly straight. In the sum-of-two-quadratic-maps example above, deltamax=.2 seems appropriate.
  2. A rule of thumb that I have found useful: in the absence of other information, use roughly 500 delta-epsilon pairs for fitting, the ones with smallest delta. To make it easy to find this proposed maximum delta, there is a function disttyp which takes as arguments the embedded data (pre) and a list of percentiles, and returns a list of distances corresponding to each percentile. (This list, which can be used to estimate the correlation dimension, is really only an estimate, since only a fraction of the pairs of points are used in order to speed up the calculation.)

The program quickde summarizes the second method in a convenient way. It takes the pre and post images, computes delta and epsilon, finds deltamax, and fits a line to them. It returns a single number that can be interpreted as the "level of nondeterminism" in the data, i.e., the amount of noise.

» quickde(ts,2,1)
ans = .0123

This result indicates that there is a very small level of noise in the time series ts compared to the standard deviation. (The standard deviation is the rms residual of the mean, the mean being perhaps the simplest useful model of the data.)

» ans/std(ts)
ans = 0.0358

More generally, the function defit can be used to find the delta-epsilon statistic.

» defit(delta,epsilon,.1)
ans = 0.0077
The third argument, .1 here, is deltamax.

I regard the question of how best to extract a single number from the delta-epsilon pairs as an open question. I would welcome suggestions.

Delta-Epsilon and Surrogate Data

Using surrogate data is with delta-epsilon is not particularly different from using surrogate data with any other test statistic. One generates many realizations of the surrogate data, and calculates delta-epsilon independently on each one of them.

%find the value of delta-epsilon for the test data
» testde = quickde(ts,2,1)
testde = 0.0123

% do the same for many realizations of the surrogates
» for k=1:200
      tss = fftsurr(ts);
      res(k) = quickde(tss,2,1);
  end

% get a handle on the significance of the result by calculating
% the t-statistic
» (testde-mean(res))/std(res)
ans = -9.1157

% this is a very significant result!
Of course the "significance" of the result is in a statistical sense: the probability that a value as extreme as that found from the test data would arise from the Null. An estimate of the "strength" of the result might be the ratio of the mean of the surrogates to the test data
» mean(res)/testde
ans = 31.6267
in this case suggesting that there is 31 times as much noise in the surrogates as in the test data.

As in all applications of surrogate data, one needs to be careful to apply exactly the same test statistic to the surrogates as to the original data. In general, this means that the embedding dimension and lag should be the same for the surrogate data as for the test data. But there is also the question of deltamax. Here one has a choice:

  1. Use the same deltamax for the test data and for the surrogate data sets.
  2. Use the same number of delta-epsilon pairs for the surrogate data as for the test data.
I suggest using the second option. If the data really do arise from, say, a chaotic attractor with small dimension (this is the dream case --- how many times has this occurred in reality?), then the dimensionality of the surrogate sets will be higher than the test data. In our case, this means that a deltamax that might be quite reasonable for the low-dimensional test data might be too small for the surrogate data; there might be too few pairs of points closer than deltamax. The result is that the quality of the fit will be poor --- the surrogates will have a larger spread than if a larger deltamax were used and this will diminish the significance of the results. On the other hand, using a large deltamax for the test data in order to accomodate the surrogates may result in missing the determinism of the test data.

Weaknesses of Delta-Epsilon

The Delta-Epsilon method often works reasonably well. It's main shortcoming (at least the main one that I have observed) is in the large spread of results for surrogate data which tends to decrease the significance of results found with delta-epsilon. Other techniques, for instance piecewise linear function fitting, give a tighter range of values for surrogate data. The problem, I guess, is that delta-epsilon is unable to exploit the long-range smoothness of functions in the same way that a linear fit can. So, for functions that have large, locally linear neighborhoods, delta-epsilon is at a disadvantage to function fitting.

Delta-epsilon is at it's best when the linear neighborhoods are very small. An An extreme example of this is found in the linear congruential random-number generator found in Matlab; delta-epsilon is easily able to detect the deterministic nature of these random numbers with only about 1000 data points. This is because of the quite small size of the locally linear regions in the 1-dimensional map that describes the random number generator. In most real-world applications, however, measurement noise would wash out such fine structure.

Part of the effectiveness of delta-epsilon arises from the exceptional pairs of points that are very close together. It is a few such pairs that produce the positive results on the Matlab random-number generator. For high-dimensional systems (where "high" might mean greater than or equal to two!) such pairs are so infrequent as not to be of much practical significance.