Efron's Development of the Bootstrap
Overview
Powerful computer-based data-analysis techniques referred to by statisticians as "bootstrap statistics" allow mathematicians, scientists, and scholars working with problems in statistics to determine, with great accuracy, the reliability of data. The techniques, invented in 1977 by Stanford University mathematician Bradley Efron, allow statisticians to analyze data, draw conclusions about data, and make predictions from smaller, less complete samples of data. Bootstrap techniques have found wide use in almost all fields of scholarship, including subjects as diverse as politics, economics, biology, and astrophysics.
Background
Almost all of the research and innovation in statistics during the last two decades of the twentieth century was a result of, or was deeply influenced by, the increasing availability and power of computers. In 1979, Efron's seminal article "Computers and the Theory of Statistics: Thinking the Unthinkable" argued that statistical methods once dismissed as absurd because of the enormous number of calculations they required would, given the growth of computing, soon become common mathematical tools. Subsequently, Efron's bootstrap, a technique that involves resampling an initial set of data thousands of times, did indeed become a standard tool of statistical analysis.
Descriptive statistics is a branch of mathematics concerned, in general, with determining quantities, means (averages), and other characteristics of a set of data. In contrast, inferential statistics is concerned with the reliability of descriptive statistics. Prior to the invention of bootstrap techniques, inferential statisticians largely relied on mathematical equations and techniques developed during the nineteenth century. Many of these techniques required a large number of time-consuming calculations and, as a result, were often fraught with human error. At best, accurate statistical computations were often the result of a slow and tedious process. During the last quarter of the twentieth century, high-speed computing power made the development of bootstrap techniques possible. In contrast to the traditional limitations of statistical calculation, computers could quickly and accurately perform millions of calculations. In addition, computers could quickly incorporate large and complex databases and make them accessible.
Regardless of the subject (e.g., a poll on political preferences or the depiction of subatomic particle behavior in physics experiments), statisticians scrutinize data sampled from a small group in order to draw conclusions about the attributes of a larger one. Bootstrap statistics allow statisticians to determine specifically whether the selection of a particular sampling group influences the ultimate conclusions. In other words, statisticians can determine whether selecting another group of data would alter the outcome. Alternatively, bootstrap techniques allow statisticians to determine exactly how small a data sample can be before it is no longer a valid representation of the larger group.
Older statistical methodologies determine the accuracy of a sample by comparing it with artificial samples projected from the assumption that data usually fall into one of several patterns of distribution (e.g., a bell-shaped curve). In contrast, computers running bootstrap statistical programs randomly select data elements from a set of data. If, for instance, astronomers made 10 measurements of the radiation (e.g., gamma rays or x rays) emitted by a particular star, a computer programmed to use bootstrap techniques would work from those 10 measurements. It would randomly select one of the 10 measurements and copy it to a new data set (the original value remaining part of the larger set of all measurements). By continuing to choose values at random from the original data set in this manner, a complete new data set of the same size is eventually formed. Computers often repeat this procedure hundreds or thousands of times, forming artificial data sets that might, for instance, contain multiple copies of some original measurements or, in another variation, be missing certain measurements present in the original data. Statisticians can then examine how these artificial data sets vary from the original measurements to determine the reliability of conclusions derived from the original data.
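The following is a minimal sketch, in Python, of the resampling loop just described. It is illustrative only: the 10 "measurements" are invented values rather than real observations, and the statistic of interest is assumed to be the sample mean.

    # Sketch of bootstrap resampling from 10 hypothetical measurements.
    import random
    import statistics

    measurements = [4.1, 3.8, 4.4, 5.0, 3.9, 4.2, 4.7, 4.0, 4.3, 4.6]  # invented values

    n_resamples = 5000
    bootstrap_means = []
    for _ in range(n_resamples):
        # Draw 10 values with replacement: some originals may appear several
        # times in a resample, while others may be missing entirely.
        resample = [random.choice(measurements) for _ in measurements]
        bootstrap_means.append(statistics.mean(resample))

    # The spread of the resampled means indicates how reliable the original
    # sample mean is as an estimate.
    print("original mean:", statistics.mean(measurements))
    print("spread of resampled means:", statistics.stdev(bootstrap_means))

Each pass through the loop builds one of the "artificial data sets" described above; the variation among the thousands of resampled means is what conveys the reliability of the original estimate.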
Efron labeled his techniques "bootstrap" because of their ability to allow measurements to "pull themselves up" (i.e., to provide information regarding their reliability) by their own bootstraps (i.e., the data themselves). For his important advancement of "the bootstrap," Efron, a 1983 MacArthur Prize Fellow, was elected to the National Academy of Sciences. In 1990, Efron was awarded the S. S. Wilks Medal, the most prestigious prize of the American Statistical Association.
After Efron advanced his bootstrap techniques, they received considerable scrutiny from the mathematics community. Although bootstrap techniques gave substantially the same results as conventional techniques when the application of a bell-curve was appropriate, the bootstrap proved more broadly useful when sets of data were not distributed according to the traditional bell-curve.
Bootstrap techniques gave statisticians a way to determine the trustworthiness of data and of statistical measurements derived from that data. The techniques are comparable to other statistical measures such as standard deviation. With many measurements it is important to determine the mean (i.e., the statistical average) and the standard deviation about that mean. Using these common statistical methods, one can predict that a data point will fall within one standard deviation of the mean about 68% of the time and within two standard deviations about 95% of the time. Statisticians also refer to such intervals as "confidence intervals": a standard interval allowing for a certain error above and below an estimate. The bootstrap gives mathematicians a method to improve on the reliability of such error measurements. In some cases estimates of error could be improved by whole orders of magnitude.
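A short, self-contained sketch of how the resampled means from a bootstrap can be turned into such an error estimate is given below. The data are again hypothetical, NumPy is assumed to be available, and the 95% percentile interval shown is only one of several ways a bootstrap confidence interval can be defined, not Efron's only formulation.

    # Sketch: bootstrap standard error and a 95% percentile interval.
    import numpy as np

    rng = np.random.default_rng(0)
    data = np.array([4.1, 3.8, 4.4, 5.0, 3.9, 4.2, 4.7, 4.0, 4.3, 4.6])  # invented values

    boot_means = np.array([
        rng.choice(data, size=data.size, replace=True).mean()
        for _ in range(10_000)
    ])

    # Normal-theory standard error of the mean, for comparison.
    se_normal = data.std(ddof=1) / np.sqrt(data.size)

    # Bootstrap analogues: the spread of the resampled means, and an interval
    # covering the central 95% of them (akin to "within two standard deviations").
    se_boot = boot_means.std(ddof=1)
    ci_95 = np.percentile(boot_means, [2.5, 97.5])

    print(f"normal-theory standard error: {se_normal:.3f}")
    print(f"bootstrap standard error:     {se_boot:.3f}")
    print(f"95% percentile interval: [{ci_95[0]:.2f}, {ci_95[1]:.2f}]")

For bell-shaped data the two error estimates tend to agree; the advantage of the bootstrap interval is that it does not depend on that assumption.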
Although the classifications are not agreed upon by all mathematicians, in general there are four basic types of resampling techniques: randomization exact tests (i.e., permutation tests), cross-validation, the jackknife, and the bootstrap. Bootstrap techniques often provide a more accurate statistical picture than traditional first-order asymptotic approximations (e.g., standard curves) but, by using the power of computers, without the laborious computation or mathematical complexity. Bootstrap techniques are also important in the fitting of curves to data and in the elimination of "noise" in data.
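For comparison with the bootstrap sketches above, here is a minimal Python sketch of the first of these related techniques, a two-sample permutation (randomization) test. The two groups of values are invented for illustration, and the test statistic is assumed to be the difference of the group means.

    # Sketch of a two-sample permutation (randomization) test.
    import random
    import statistics

    group_a = [5.1, 4.8, 5.6, 5.0, 5.3]   # invented values
    group_b = [4.2, 4.6, 4.1, 4.5, 4.4]   # invented values

    observed = statistics.mean(group_a) - statistics.mean(group_b)
    pooled = group_a + group_b

    count = 0
    n_permutations = 10_000
    for _ in range(n_permutations):
        # Relabel the pooled data at random and recompute the statistic.
        random.shuffle(pooled)
        diff = statistics.mean(pooled[:len(group_a)]) - statistics.mean(pooled[len(group_a):])
        if abs(diff) >= abs(observed):
            count += 1

    # The p-value is the fraction of random relabelings at least as extreme
    # as the observed difference between the two groups.
    print("approximate p-value:", count / n_permutations)

Unlike the bootstrap, which resamples with replacement to gauge the reliability of an estimate, the permutation test reshuffles group labels to ask whether an observed difference could plausibly have arisen by chance.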
Impact
Many statisticians consider the advancement of bootstrap methods to be among the most important innovations in mathematics during the last half of the twentieth century.
By the end of the twentieth century, bootstrap techniques were widely used to analyze both applied and theoretical problems. One attraction of bootstrap techniques was that, regardless of the underlying theory or application, the essential techniques remained unchanged. Moreover, one of the major impacts of using bootstrap techniques was the ability of statisticians to extract more information from data than could be obtained without them. This increased statistical capability allowed statisticians and scientists to solve many problems previously thought too complicated to tackle.
In politics, bootstrap techniques led to explosive growth in polling. Polls purporting to show how people would vote, for which candidate they would vote, or what voters thought about key issues came to be a dominant force on the political landscape in America and other technologically advanced nations. The use of bootstrap techniques allowed pollsters to make predictions quickly, inexpensively, and confidently from very small voter samples. This increased statistical efficiency also provoked opposition from those who felt that the use of such polls was potentially disruptive to elections and the process of representative government. There was a fear that in a close election the practice of "calling an election" based on small voter samples might discourage people from voting and thus sway the election. Other critics decried political leadership based on polling because they felt that political leaders might too easily decide policy affecting millions based not on what was proper, but rather on what was merely popular with a few hundred people in a statistical database.
Bootstrap techniques are useful to biologists because they allow construction of evolutionary trees based on small samples of DNA data. In medicine, pharmacologists are able to evaluate the effectiveness and safety of new drugs from small clinical trials involving relatively few patients. For both government and business, bootstrap techniques are increasingly used to analyze and forecast economic data. For business, the impact of bootstrap techniques on marketing cannot be overstated. Major purchasing and manufacturing decisions are now routinely based on small samples of consumer preferences. In archaeology and forensic science, bootstrap techniques allow for the identification and characterization of remains. In the environmental sciences, the bootstrap allows widespread predictions of climate change based upon historically limited observational data. In geology, researchers using the bootstrap can better develop earthquake models based on preliminary seismic data, which, in turn, can enhance earthquake warning systems.
Over the last decade of the twentieth century, the availability and accessibility of bootstrap techniques increasingly influenced the teaching of statistics. Traditionally, difficult statistical theories were reduced to oversimplified teaching models designed to limit the amount of calculation required of students; these models were often of little practical use. Resampling bootstrap techniques made "real" analysis possible for students and allowed mathematics teachers to use more real-world examples in the teaching of probability and statistics.
Late in the twentieth century, as an understanding of statistics and probability became an increasingly critical quantitative skill for coping with the mountains of data made possible by computers and improved sensing technologies, bootstrap statistical techniques provided scholars with mechanisms to avoid data "overload" (i.e., too much data to handle through routine calculation). In addition, armed with more accessible statistical techniques, scholars in many fields were able to reexamine old data and often find meaningful statistics that had once been lost in a fog of algebraic complexity.
K. LEE LERNER
Further Reading
Books
Barlow, R. Statistics. New York: Wiley, 1989.
Edgington, E.S. Randomization Tests. New York: M. Dekker, 1995.
Efron, B. and R.J. Tibshirani. An Introduction to the Bootstrap. New York: Chapman & Hall, 1993.
Maddala, G.S. and J. Jeong. "A Perspective on Application of Bootstrap Methods in Econometrics." In Handbook of Statistics, vol. 11. Amsterdam: North-Holland, 1993, pp. 573-610.
Mammen, E. When Does Bootstrap Work? Asymptotic Results and Simulations. New York: Springer-Verlag, 1992.
Periodical Articles
Diaconis, P. and B. Efron. "Computer-Intensive Methods in Statistics." Scientific American (May 1983): 116-130.
Singer, J. D. and Willett, J. B. "Improving the Teaching of Applied Statistics: Putting the Data Back Into Data Analysis." The American Statistician vol. 44, no. 3 (August 1990): 223-30.