Resampling


The term resampling has a number of separate meanings in statistical analysis. The first, more properly denoted re-sampling, is the process of taking repeated samples from the population under study, particularly where the initial sampling may have been inadequate, for example because it was too small, unrepresentative or incorrectly applied. In many cases, however, re-sampling is not a practical option, so efforts must be focused on extracting the maximum information from the available sample(s).

The term resampling (no hyphen) is used within statistical analysis to describe a procedure where repeated samples are drawn from an existing sample of data, with or without replacement. Random permutations are an example of resampling, as are other computational techniques such as jackknifing and bootstrapping.

In geospatial analysis and image processing, the term resampling refers to the change of grid resolution, either increasing the resolution to produce a finer grid, or decreasing the resolution to produce a coarser grid. Simple (generally non-statistical) interpolation techniques are usually applied in such resampling, with the purpose often being to enable grids of different resolutions, but with the same spatial extent, to be combined.
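
As a toy illustration of this last sense of the term (a sketch for illustration only; the function and grid are invented for this example), the following R code resamples a grid to a new resolution using nearest-neighbor assignment, the simplest such technique:

# Nearest-neighbor resampling of a grid to a new resolution (illustrative)
resample_nn <- function(grid, nrow_out, ncol_out) {
  rows <- round(seq(1, nrow(grid), length.out = nrow_out))
  cols <- round(seq(1, ncol(grid), length.out = ncol_out))
  grid[rows, cols]   # repeat/drop cells to match the target resolution
}

g <- matrix(1:16, nrow = 4)   # a 4 x 4 grid
resample_nn(g, 8, 8)          # upsampled to a finer 8 x 8 grid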

Bootstrapping

Bootstrapping is a family of essentially simple but powerful computational techniques for obtaining the approximate sampling distribution of a statistic, R, from a random sample obtained from some unknown population frequency distribution, F. Instead of taking random sub-samples, as is the case with jackknifing, Efron (1979, [EFR1]) proposed that random samples with replacement be used.

We will use Efron's paper (1979, pp. 2-3) to describe the basic method for a single-sample problem. Here we assume we have a completely unspecified probability distribution function, F, and a random sample X from this distribution whose observed realization is the set of values {x1, x2, ..., xn}. We are interested in obtaining the sampling distribution of some random variable or 'statistic' R(X,F), such as the sample mean, median or correlation coefficient (for a worked example of the latter, see the discussion on obtaining confidence intervals for the product moment correlation coefficient later in this Handbook). The steps are as follows:

1. Construct the sample probability distribution F*, putting mass 1/n at each point x1, x2, ..., xn

2. With F* fixed, draw a random sample of size n from F*. Call this the bootstrap sample, X*

3. Approximate the sampling distribution of R(X,F) by the bootstrap distribution of R* = R(X*,F*)

Efron proposed that Monte Carlo simulation be used to provide a general (non-analytical) means of obtaining the approximation to the bootstrap distribution, and this is the approach adopted for bootstrap algorithms. This simply involves taking repeated random samples of size n from F*, with replacement, calculating the statistic R*, and then computing the histogram of values of the statistic as the approximation sought. By taking a large number of samples the sampling distribution can be extremely well estimated for many problems. Bootstrapping in this form thus provides a method of non-parametric inference. Following the publication of Efron's original paper and his subsequent research monograph, an extensive body of research into the properties of bootstrap estimates was undertaken, expanding the range of applications and identifying cases for which bootstrapping is not effective. Chernick (2008, [CHE1]) provides a thorough and approachable discussion of these topics.
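
The procedure is easily expressed in code. The following R sketch illustrates the Monte Carlo approach just described; it is not the Handbook's own sample code, and the sample x, the seed, and the choice of the median as the statistic are arbitrary assumptions:

# Monte Carlo bootstrap: draw B samples of size n from F* (i.e. from the
# observed sample, with replacement), computing the statistic R* each time
bootstrap_dist <- function(x, stat = median, B = 1000) {
  replicate(B, stat(sample(x, size = length(x), replace = TRUE)))
}

set.seed(123)                        # illustrative seed
x <- rexp(50)                        # an illustrative sample, n = 50
r_star <- bootstrap_dist(x, median)  # bootstrap distribution of the median
hist(r_star)                         # the approximation to the sampling distribution
sd(r_star)                           # bootstrap estimate of the standard error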

An obvious question is how large the sample size, n, needs to be in order to ensure that it is both representative and unlikely to result in repeated bootstrap samples. As a rule of thumb, if n>50 then in most cases an approximate sampling distribution can be obtained that is asymptotically close to the true sampling distribution (Chernick, 2008, p174 [CHE1]), but sample sizes as low as 10-15 may yield usable results, especially when estimating a simple measure such as the standard error. For heavy-tailed distributions, such as the lognormal, much larger sample sizes (100+) are advisable. The number of bootstrap samples to draw is a further issue, with no definitive answer on the ideal value (except insofar as the number of replicates is affected by the sample size). Typical replicate counts are 1000+ for simple measures and 10,000+ for confidence intervals (Chernick, 2008).

As a simple example of the application of bootstrapping we use the classic "US Cities" dataset, which comprises 49 pairs of values: the populations, in thousands, in 1920 and 1930 of a random sample of 49 US cities drawn from the 196 largest US cities in 1920 (the first 10 records are shown in the table below). The statistic of interest, R, is the ratio of the total population in 1930 to that in 1920, i.e. a measure of urban growth. For the 10 records shown this ratio is 973/640 = 1.520313, but what is the standard error of this estimate?

ID    1920    1930
 1     138     143
 2      93     104
 3      61      69
 4     179     260
 5      48      75
 6      37      63
 7      29      50
 8      23      48
 9      30     111
10       2      50

Using the R library package "boot" (see R code sample in the Resources section) we can generate a bootstrap estimate from these 10 data items, using 999 replicates. The first few example replicates are:

       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
  [1,]    1    0    1    1    5    0    0    0    2     0
  [2,]    0    0    0    3    0    1    1    1    3     1
  [3,]    1    0    0    2    3    1    0    1    0     2

Thus row 1 shows that the pair with ID=1 was selected once, ID=2 was not selected, ..., and ID=5 was selected 5 times, yielding an estimate of the ratio for this replicate of (143 + 69 + 260 + 5×75 + 2×111)/(138 + 61 + 179 + 5×48 + 2×30) = 1069/678 = 1.576696. Repeating this process 999 times yields a sample result of the form:

Bootstrap Statistics :
    original        bias          std. error
    1.520313        0.03281374    0.2108548

This summary confirms the original ratio we computed, followed by the bias, i.e. the difference between the mean of the bootstrap replicates (in this case 1.553126) and the original value. The standard error is estimated at 0.21, which is quite large, as can be seen from the histogram of bootstrap results below.

Bootstrap frequency histogram of ratio values

If we had used all 49 cities (available as the bigcity dataset in the R boot package) the bias would have been negligible and the standard error much reduced, at approximately 0.033.
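
A minimal sketch of how this example can be reproduced with the boot package is shown below; the seed is an arbitrary assumption, so the bias and standard error obtained will vary slightly from run to run:

library(boot)   # includes the 'city' (10 rows) and 'bigcity' (49 rows) datasets

# Ratio of total 1930 population to total 1920 population; d has columns
# u (1920) and x (1930), and i is the vector of resampled row indices
ratio <- function(d, i) sum(d$x[i]) / sum(d$u[i])

set.seed(1)                       # illustrative seed
b <- boot(city, ratio, R = 999)   # 999 bootstrap replicates
b                                 # prints original, bias and std. error
boot.array(b)[1:3, ]              # frequency table of the first 3 replicates

boot(bigcity, ratio, R = 999)     # the same analysis using all 49 cities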

Jackknifing

Given a sample of size N from an infinite, or finite but large, population, a sub-sample of size n<N may be selected. A statistic, S, may then be computed from this sub-sample and the process repeated for a large number of sub-samples (or all possible sub-samples). The result is a set of values for S that approximates the sampling distribution of S. In its standard form, n=N-1, and the sub-samples are simply copies of the original sample with a different single value removed each time. The jackknife can then be used to provide an estimate of the standard error of the statistic in question, and assuming the complete set of all N jackknife samples is computed, the resulting estimate will be repeatable (unlike the bootstrap estimate, above). However, unless the statistic being estimated is relatively smooth, the jackknife can produce poor estimates of the standard error. This is particularly noticeable for percentile and median estimates, where the technique is known to fail. The rise in readily available computational power means that the jackknife by itself is little used today, with bootstrap or pure Monte Carlo estimation being preferred.
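
In its standard leave-one-out form the jackknife standard error is straightforward to compute directly, as in the following R sketch (an illustration using the sample mean as the statistic; for the mean the result matches the usual analytic formula):

# Leave-one-out jackknife estimate of the standard error of a statistic
jackknife_se <- function(x, stat = mean) {
  N <- length(x)
  # recompute the statistic with each observation removed in turn
  theta <- sapply(seq_len(N), function(i) stat(x[-i]))
  sqrt((N - 1) / N * sum((theta - mean(theta))^2))
}

set.seed(1)
x <- rnorm(30, mean = 10, sd = 2)   # an illustrative sample
jackknife_se(x)                     # equals sd(x)/sqrt(length(x)) for the mean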

The jackknife approach can be used in related applications, such as helping to identify single values that might overly influence the results of a bootstrapping exercise (an approach known as jackknife-after-bootstrap; see further, Efron, 1992, [EFR2]). This technique is supported (with a special graphical output) in the R library "boot" via the jack.after.boot function. In a variety of applications a model (e.g. a regression model or a time series model) is built using input data. The quality of the model can then be analyzed by jackknifing the input data and predicting the value of each omitted item from the model fitted to the remainder, a form of leave-one-out cross-validation (see further, Efron and Tibshirani, 1997, [EFR3]). Repeating this over all of the input data values gives a sum of squared errors measure that can be used to compare alternative models or model parameterizations.
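
Continuing the bootstrap sketch above, the diagnostic plot is produced with a single call (this assumes the boot object b created in the earlier example):

library(boot)
jack.after.boot(b)   # jackknife-after-bootstrap influence plot for the ratio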

References

[CHE1] Chernick M R (2008) Bootstrap Methods: A Guide for Practitioners and Researchers. 2nd ed., J Wiley, Hoboken, NJ

[EFR1] Efron B (1979) Bootstrap methods: another look at the jackknife. Ann. Statist. 7, 1-26

[EFR2] Efron B (1992) Jackknife-after-bootstrap standard errors and influence functions (with Discussion). Journal of the Royal Statistical Society, B, 54, 83–127

[EFR3] Efron B, Tibshirani R J (1997) Improvements on cross-validation: The .632+ bootstrap method. J. American Stat. Assoc., 92, 548-560