Econometric Models, Aggregate
An econometric model is a set of equations designed to provide a quantitative explanation of the behavior of economic variables. This article discusses models that focus on the behavior of an economy in the aggregate, especially on the time paths of variables such as national income and product, consumption, investment, employment, the price level, and the interest rate. The pioneering model of this type was constructed by Jan Tinbergen (1939). The leading aggregate model builder for some years has been Lawrence R. Klein.
Aggregate econometric models grew out of a blend of several different streams of work. One is the mathematical stream springing from the work of Leon Walras, which represents the economy by a system of simultaneous equations. Another is the work of Ragnar Frisch and others in the theory of economic dynamics. A third is the work in statistical inference associated with Karl Pearson and his successors, showing how to estimate the value of unknown parameters with the aid of prior information and observed data. A fourth is the development by Willford King, Simon Kuznets, and others of numerical estimates of national income and expenditure and their components [see National income and product accounts]. A fifth is the formulation of aggregative economic theories of income and employment, by R. F. Kahn, John Maynard Keynes, and others.
General features. The general characteristics of aggregate econometric models are described in the following sections. Then a very simple example is given, and contemporary models are discussed.
Definitional equations (identities). In any aggregate model some of the equations are definitions (usually called identities), of the type arising in national accounting; they are supposed to hold exactly and contain no unknown parameters. Examples are, “Consumption plus net investment plus government purchases plus exports minus imports equals net national product,” and “Total money wage bill equals average money wage rate times quantity of labor input.”
Stochastic equations. The remaining equations are stochastic. They are supposed to hold only approximately, and they contain disturbances that are assumed to be unobservable, small, and random, with expected values of zero. An example is, “Consumption during any period equals a constant proportion of that period’s disposable income plus a constant proportion of the preceding period’s consumption plus a third constant plus a random disturbance.” In some models the disturbances take the form of random errors in the measurement of the variables. The assumption of random disturbances is very convenient for statistical estimation of the values of the unknown constants, called parameters. It is sometimes justifiable even if the disturbances have systematic components; for if those components are small and numerous and independent of each other, their total effect behaves approximately as if it were random. In formulating a model of this kind, one hopes to include explicitly in each equation all the important systematic influences that are present, so that the disturbances will be small and at least approximately random.
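The claim that many small, independent influences add up to something that behaves like a random disturbance can be illustrated with a short simulation. The sketch below is only an illustration under stated assumptions: the number of components, their uniform distribution, and all names in the code are hypothetical choices, not part of any model discussed in this article.

```python
import random

def total_disturbance(rng, n_components=40):
    """Sum of many small, independent influences (hypothetical: each uniform on [-0.5, 0.5])."""
    return sum(rng.uniform(-0.5, 0.5) for _ in range(n_components))

rng = random.Random(0)  # fixed seed so the run is reproducible
draws = [total_disturbance(rng) for _ in range(10_000)]

mean = sum(draws) / len(draws)
variance = sum((x - mean) ** 2 for x in draws) / len(draws)
std = variance ** 0.5

# For a normal distribution about 68 per cent of values lie within one
# standard deviation of the mean; the summed influences come close to that.
share_within_one_sd = sum(abs(x - mean) <= std for x in draws) / len(draws)

print(f"mean = {mean:.3f}, std = {std:.3f}, share within 1 sd = {share_within_one_sd:.3f}")
```

No single component here is normally distributed, yet the total has a mean near zero and a roughly normal spread, which is the sense in which the disturbance can be treated as random.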
Structural equations. Some of the stochastic equations describe the behavior of a group in the economy, such as consumers (as in the foregoing example), investors in real capital goods, etc. Some describe institutional or technological restraints, such as the tax laws or the so-called production function, which indicates the maximum output that can be produced with any given quantities of inputs. Some describe adjustment processes that take place in particular markets (e.g., for labor or goods) when there is excess demand or supply. (A special case of an adjustment equation is an equilibrium condition asserting that demand equals supply.) These four types of equations (definitional, behavior, restraint, and adjustment) are called structural equations, for each is supposed to describe some more or less well-defined part of the structure of the economy.
Types of variables. In addition to constant parameters and unobservable random disturbances, the equations contain observable variables, usually more than there are equations in the model. Some of the variables are supposed to be determined by forces completely outside the model, and their values are assumed to be given; these are called exogenous. Variables often regarded as exogenous are government policy variables, population, foreign countries’ actions, etc. The other variables, whose values are determined by the system when parameters, disturbances, and exogenous variables are given, are called endogenous. Typically, in a complete model there are just as many equations as endogenous variables. In many cases the equations for a given period will contain both current and lagged (i.e., past) values of the endogenous variables. The current endogenous variables are known as jointly dependent variables. The exogenous variables and lagged endogenous variables together are known as predetermined variables, for their values are determined as of any time period (either outside the system or by the past operation of the system) when the system goes to work to determine the jointly dependent variables for that time period.
The reduced form and forecasting. Suppose that the system of structural equations is solved for the jointly dependent variables, each being expressed as a function of structural-equation parameters, predetermined variables, and disturbances. The result is called the reduced form of the model. It could be used to forecast future values of the jointly dependent variables if its parameters and the future values of disturbances and predetermined variables were known in advance. In practice these are unknown, so that parameters must be estimated, future disturbances must be approximated by estimates of their expected values (zero is often used for these estimates, since the disturbances are assumed to have zero expected values), and future values of predetermined variables must be assumed. Thus, forecasts made from the reduced form are necessarily approximate.
When the exogenous and lagged endogenous variables are taken as given, reduced-form forecasts based on them are said to be conditional upon the values of the exogenous and lagged endogenous variables. For example, a model might forecast that if tax rates are cut 10 per cent at the end of this year and other predetermined variables are unchanged, then national income next year will be 7 per cent higher than this year; whereas if tax rates are not changed, other things being the same, then national income next year will be only 4 per cent higher than this year. When the unknown future values of predetermined variables are forecast in some way (for exogenous variables this involves using information from outside the model), reduced-form forecasts of jointly dependent variables are said to be unconditional. For example, one might forecast that tax rates will be cut by 10 per cent at the end of this year, and then use a model to forecast that next year’s national income will be 7 per cent higher than this year’s.
Dynamic features. If a model contains lagged endogenous variables, it has a dynamic character, for its jointly dependent variables are affected not only by parameters, disturbances, and exogenous variables but also by the past history of the system. Simple systems containing lagged values or year-to-year changes of endogenous variables can generate cycles and/or long-term growth or decline, even with no changes in parameters, disturbances, or exogenous variables. There are other devices for introducing dynamic effects, e.g., time-trend variables, derivatives with respect to time, and cumulative variables such as capital stock, which is the sum of past net investments. [See Time Series.]
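A small simulation can show how lagged endogenous variables alone produce cycles. The sketch below uses a textbook multiplier-accelerator scheme, which is brought in here purely for illustration and is not one of the models described in this article; the parameter values, starting values, and function name are all hypothetical.

```python
def simulate_income(b, v, g, y0, y1, periods):
    """Income path when c_t = b*y_{t-1}, i_t = v*(c_t - c_{t-1}), y_t = c_t + i_t + g.

    b, v, and g stay fixed for the whole run, so any movement in income
    comes only from the lagged values of the system itself.
    """
    path = [y0, y1]
    for _ in range(periods - 2):
        c_now = b * path[-1]            # consumption depends on last period's income
        c_prev = b * path[-2]
        i_now = v * (c_now - c_prev)    # investment responds to the change in consumption
        path.append(c_now + i_now + g)
    return path

# Hypothetical values: start at equilibrium, then displace income by one unit.
b, v, g = 0.6, 1.0, 10.0
y_star = g / (1.0 - b)                  # stationary income level
path = simulate_income(b, v, g, y0=y_star, y1=y_star + 1.0, periods=40)

deviations = [y - y_star for y in path]
turns = sum(1 for p, q in zip(deviations, deviations[1:]) if p * q < 0)
print(f"equilibrium income = {y_star:.2f}, crossings of equilibrium = {turns}")
```

With these values income overshoots, falls below its stationary level, and oscillates with shrinking amplitude, even though no parameter, disturbance, or exogenous variable changes after the initial displacement.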
Linearity versus nonlinearity. If the structural equations are linear in the jointly dependent variables and if the matrix of parameters of those variables is nonsingular, then the solution (the reduced form) is linear in the jointly dependent variables and is unique. If the structural equations contain nonlinearities in the jointly dependent variables, their solution will be nonlinear and may fail to be unique. In that case, one may use additional information to rule out the spurious solutions and find the one that represents the behavior of the economy (for example, any solution giving a negative national income would be spurious). Alternatively, a nonlinear model may be approximated by a linear one, with results that are acceptable as long as the range of variation of the variables being studied is small relative to their extreme values (this is likely to be so over short periods, but not over long periods). A model thus linearized has a linear reduced form. While models nonlinear in variables are fairly common, almost all models are built to be linear in unknown parameters, because that makes estimation of the parameters vastly simpler.
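The linear case can be written compactly in matrix form. The notation below is introduced only for illustration and does not appear elsewhere in this article: y_t is the vector of jointly dependent variables, z_t the vector of predetermined variables, u_t the vector of disturbances, and B and Γ the matrices of structural parameters. The sketch assumes the structural equations are linear in the predetermined variables as well.

```latex
\begin{aligned}
B\,y_t + \Gamma\,z_t &= u_t, \qquad \det B \neq 0, \\
y_t &= -B^{-1}\Gamma\,z_t + B^{-1}u_t \;=\; \Pi\,z_t + v_t,
\qquad \Pi = -B^{-1}\Gamma, \quad v_t = B^{-1}u_t .
\end{aligned}
```

Because B is nonsingular, its inverse exists and is unique, so each set of values for z_t and u_t yields exactly one solution for y_t.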
Model building and estimation. Numerical estimates of the unknown and supposedly constant parameters of reduced-form or structural equations are obtained by statistically fitting the equations to past data for the jointly dependent and predetermined variables, provided that the parameters have the property of identifiability [see Simultaneous equation estimation; Statistical identifiability].
In this estimation process, one uses specifications indicating what variables appear in the model; which are endogenous and which are exogenous; what lags appear, if any; which variables appear in each of the structural equations; what the mathematical form of each structural equation is; and what properties the probability distribution of the random disturbances is assumed to have. In principle, these specifications are supposed to represent the model builder’s prior knowledge of the economy, prior in the sense of arising from sources other than the observed data that are to be used in estimating the parameters. In practice, the model builder is often uncertain about some of the specifications of the model, and so he may try fitting several differently specified, theoretically plausible models to the data and then choose the one that offers the best combination of (a) goodness of fit to the data and (b) consistency with any knowledge that may not have been incorporated into the formal specifications of the models. Such knowledge may come from economic theory, cross-section studies, results obtained for other countries or time periods, or other sources. Recently developed methods of Bayesian inference are suitable for incorporating probabilistic prior knowledge into the estimation process [see Bayesian inference].
Forecasting and model testing in practice. Conditional forecasts can be expected to be quite accurate if (a) the specification of the model is substantially correct for both the sample period and the forecast period; (b) the data sample and estimation technique used are such as to give approximately correct estimates of the reduced-form parameters under assumption (a); and (c) a highly accurate explanation of the sample-period jointly dependent variables is provided by the estimated reduced-form equations, in the sense that when the observed values of the predetermined variables are substituted into the reduced form to get calculated sample-period values of the jointly dependent variables, the calculated values are close to the observed values. Conditional forecasts can be expected to have substantial errors if the foregoing conditions are not met.
Condition (c), above, can readily be tested by substituting sample-period data into the estimated reduced form. Condition (b) can be tested in a probabilistic sense, with the aid of statistical inference techniques that reveal the degree of confidence one should have in the estimated parameters, on the assumption that the model is correctly specified. Condition (a), the correct specification of the model, is more difficult to test. Economic theory provides some information concerning the adequacy of a model’s specifications, but the most powerful test is the indirect one based on the quality of forecasts that a model makes when conditions (b) and (c) are reasonably well satisfied. Individual structural equations can also be usefully tested by similar procedures, although the typical structural equation contains more than one jointly dependent variable and hence is not capable of making forecasts in the same sense as is the reduced form.
Unconditional forecasts, of course, can go wrong if the three foregoing conditions are not met, and also if the values of the predetermined variables used in the forecasts are not substantially correct.
A simple example. A very simple three-equation model will illustrate many of the foregoing points. Equation (1), below, is the accounting definition mentioned earlier: net national product (NNP) equals consumption plus net private domestic investment plus government purchases plus exports minus imports. Equation (2), below, is the consumer behavior equation mentioned earlier, specifying that consumption is a linear function of disposable income and lagged consumption plus a random disturbance. A third equation is needed to relate disposable income to NNP. Assume an economy in which (a) the whole of government revenue is raised by an income tax whose yield is a linear function of NNP; (b) there are no transfer payments; and (c) all business income is paid out to individuals, so that disposable income is also a linear function of NNP. Equation (3), below, expresses this. The three structural equations of this model are as follows (the notation and units are explained below):
(1) $y = c + i + g,$
(2) $c = \alpha + \beta d + \gamma c_{-1} + u \approx 5.7 + 0.69\,d + 0.25\,c_{-1},$
(3) $d = y(1 - m) - h.$
This model specifies further that there are three endogenous variables (c = consumption, d = disposable income, and y = NNP); and that there are four exogenous variables (i = net private domestic investment, g = government purchases plus exports less imports, h = the fixed part of tax revenues independent of NNP, and m = the marginal tax rate on NNP). Lagged consumption is denoted by $c_{t-1}$ or in brief by $c_{-1}$; α, β, and γ are three unknown parameters; and u is a random disturbance with zero mean. All these quantities are expressed in billions of real (i.e., deflated) dollars per year, except for m, β, and γ, which are pure numbers between 0 and 1.
The approximate numerical estimate of the consumption equation (2), above, was obtained by the two-stage least squares estimation method, from United States data (expressed in billions of 1954 dollars per year) for the years 1929–1941 and 1946–1959. The ratios of the three estimated parameters to their estimated standard errors are respectively 1.8, 12, and 3.5; and the standard error of estimate is 2.7 billion 1954 dollars per year (as compared with the maximum and minimum observed consumption values of 288.9 and 103.5 billion respectively).
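The reported ratios can be turned back into approximate standard errors, since each ratio is an estimate divided by its standard error. The arithmetic below is a rough sketch only: the published estimates and ratios are rounded, so the implied standard errors are approximate, and the variable names are hypothetical.

```python
# Estimates and ratios as reported in the text (rounded figures).
estimates = {"alpha": 5.7, "beta": 0.69, "gamma": 0.25}
ratios = {"alpha": 1.8, "beta": 12.0, "gamma": 3.5}

# ratio = estimate / standard error, so standard error = estimate / ratio.
implied_se = {name: estimates[name] / ratios[name] for name in estimates}

# Size of the standard error of estimate relative to observed consumption.
see = 2.7
share_of_min = see / 103.5
share_of_max = see / 288.9

for name, se in implied_se.items():
    print(f"{name}: estimate {estimates[name]}, implied standard error about {se:.3f}")
print(f"standard error of estimate: {share_of_max:.1%} to {share_of_min:.1%} of observed consumption")
```

On these figures the constant term is the least precisely estimated of the three parameters, while the coefficient of disposable income is the most precisely estimated.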
The tax variable h may be negative if the income tax allows for a fixed total exemption. If equation (3) were to be applied to the United States economy, it would have to include a disturbance, for the United States tax structure is only very roughly described by a linear function of NNP.
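The reduced form of this model is obtained by solving the three structural equations for y, d, and c, thus:
$$y = \frac{\alpha + i + g - \beta h + \gamma c_{-1} + u}{1 - \beta(1 - m)},$$
$$d = \frac{(1 - m)(\alpha + i + g + \gamma c_{-1} + u) - h}{1 - \beta(1 - m)},$$
$$c = \frac{\alpha + \beta(1 - m)(i + g) - \beta h + \gamma c_{-1} + u}{1 - \beta(1 - m)}.$$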
Note that the model and its reduced form are linear in endogenous variables, but not in all variables because of the term containing ym in equation (3). If in the reduced form one substitutes estimated values for the three structural parameters, zero for the disturbance u, and numerical values for the five predetermined variables for a certain year, one obtains estimates of the expected values of the three jointly dependent variables y, d, and c for that year, conditional on the chosen values of the predetermined variables.
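The substitution just described can be sketched numerically. In the code below the three parameter values are the estimates quoted for equation (2); every value assigned to a predetermined variable (i, g, h, m, and lagged consumption) is hypothetical and chosen only to make the arithmetic concrete, and the function name is likewise an assumption.

```python
def reduced_form(alpha, beta, gamma, i, g, h, m, c_prev, u=0.0):
    """Solve structural equations (1)-(3) for (y, d, c), given predetermined variables."""
    denom = 1.0 - beta * (1.0 - m)
    if denom == 0.0:
        raise ValueError("no unique solution: 1 - beta*(1 - m) is zero")
    # Substituting (3) into (2) and (2) into (1) and solving for y:
    y = (alpha + i + g - beta * h + gamma * c_prev + u) / denom
    d = y * (1.0 - m) - h                      # equation (3)
    c = alpha + beta * d + gamma * c_prev + u  # equation (2)
    return y, d, c

# Parameter estimates quoted in the text for equation (2).
ALPHA, BETA, GAMMA = 5.7, 0.69, 0.25

# Hypothetical predetermined values, in billions of real dollars (m is a pure number).
i, g, h, c_prev = 40.0, 80.0, -10.0, 280.0

# Two conditional forecasts that differ only in the assumed marginal tax rate.
y_base, d_base, c_base = reduced_form(ALPHA, BETA, GAMMA, i, g, h, m=0.20, c_prev=c_prev)
y_cut, d_cut, c_cut = reduced_form(ALPHA, BETA, GAMMA, i, g, h, m=0.18, c_prev=c_prev)

print(f"m = 0.20: y = {y_base:.1f}, d = {d_base:.1f}, c = {c_base:.1f}")
print(f"m = 0.18: y = {y_cut:.1f}, d = {d_cut:.1f}, c = {c_cut:.1f}")
```

Setting u to zero gives the expected values of y, d, and c conditional on the chosen predetermined values. Comparing the two runs is a conditional forecast of the effect of a tax-rate change with other things held the same.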
Medium-scale models. Most of the medium-scale aggregate econometric models published so far have from 14 to 48 equations and accordingly are much more detailed and complex than the simple example just given, although, of course, they still involve great simplifications of reality. These models differ from each other in significant ways, but the following features are typical of most of the medium-scale models listed in the bibliography.
There is an identity, substantially like equation (1), above, stating that national product equals consumption plus investment plus government purchases plus exports minus imports. Consumption, private investment, and imports (and in some cases exports) are endogenous variables, and there are behavior equations to explain them or their components. Consumption is sometimes divided into parts (such as consumer durable goods, nondurable goods, and services), with one equation for each, and is explained in terms of variables such as disposable income, past consumption, liquid asset holdings, consumer credit conditions, and income distribution. Investment is commonly separated into plant and equipment purchases and inventory investment, and in some models residential construction is treated in one or more separate equations. Plant and equipment investment is explained in terms of accelerator variables, such as output and capital stock, or profits or both. Inventory investment is usually specified to depend on lagged inventory holdings, sales or output, and other variables. Most import functions depend on income and exogenous import prices. If exports are not exogenous, they are a function of exogenous variables such as world income and world prices. Government purchases are regarded as exogenous. [See Consumption Function; Inventories; Investment, article on the aggregate investment function.]
In the early models tax revenues and government transfer payments (such as social security benefits, unemployment compensation, and interest on the national debt) were specified as exogenous, but in more recent models the tax and transfer schedules are specified as exogenous and equations are provided to explain tax and transfer payments as endogenous variables depending upon national income, the number of retired persons, unemployment, the national debt, and so on, according to the exogenous schedules.
There is a production function, to explain the output of goods and services in terms of inputs of labor and capital. There is a demand-for-labor equation, in many cases expressing total real labor income in terms of total real output. There is an identity expressing property income as the difference between total income and labor income. In many cases there is an equation explaining the allocation of property income between retained income (which is not part of disposable income) and payments to individuals (including interest and dividends).
There is a wage-rate adjustment equation, commonly expressing the change in the money wage rate in terms of the unemployment rate and the rate of inflation. (But see Netherlands 1961 for a model in which the wage rate is exogenous, since it has been a policy variable in the Netherlands in recent years.) Unemployment of course is the difference between the labor force, typically exogenous, and employment.
The general price level is endogenous in most of these models; and in some cases other prices appear that, when endogenous, are usually expressed as functions of the general price level. Typically, real output and the price level can be thought of as determined by a pair of equations containing only these two variables plus predetermined variables, the equations being obtained by substituting into two important identities all the other equations of the model. One of these identities is that equating output to the sum of all expenditure components; the other expresses the total real wage bill as employment multiplied by the real wage rate.
Among the more commonly used exogenous policy variables are government purchases, government transfer payments, and tax revenues or tax rates. In some models there are few or no variables or equations describing interest rates and the supply and demand for money, and where they do appear, they are commonly rather loosely tied to the rest of the model. Thus, most models built so far are much better suited to an analysis of effects of fiscal policy (government purchases, taxes, and transfer payments) than the effects of monetary policy.
Among the more commonly used exogenous non-policy variables are population and its distribution (and sometimes labor force), import prices, exports (or world income and world prices), and time. In certain more recent models there are variables measuring attitudes or anticipations, obtained from surveys.
Behavior equations, in most cases, contain variables in real (deflated) terms rather than in money terms, to reflect the theoretical postulate that real economic behavior depends upon real tastes and real opportunities, unaffected by price changes that leave these things the same. Some consumer behavior equations are stated in per capita terms, to allow for the possibility that an increase in aggregate income may have different effects, depending upon how it is distributed between increases in population and in per capita income; but most behavior equations are stated in aggregate terms.
Nearly all models contain some nonlinearities, especially where identities of the form “value equals price times quantity” are involved; but nonlinearities in unknown parameters are rare in stochastic equations whose parameters have to be estimated. Tinbergen’s models (1939, 1951) and the model of the Netherlands (1961) have had all their nonlinear equations linearized.
First differences (i.e., year-to-year changes) of the data are used occasionally (e.g., Suits 1962), but most models use ordinary data, without this transformation. Early models used annual data, but quarterly models are becoming more common as quarterly data become available. Time trends, lags, and cumulated variables are the main devices used for dynamic effects, although occasionally a ratchet-type variable is used for this purpose. An example of a ratchet-type variable is the value of disposable income at its previous peak, which is sometimes included in the consumption function.
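Both transformations mentioned above are simple to compute. The sketch below uses a made-up income series, and the function names are hypothetical; it reads "previous peak" as the highest value reached in any earlier period, which is one reasonable interpretation rather than a definition taken from a particular model.

```python
def first_differences(series):
    """Year-to-year changes: x_t - x_{t-1} for each period after the first."""
    return [later - earlier for earlier, later in zip(series, series[1:])]

def previous_peak(series):
    """Ratchet variable: highest value reached before each period (None for the first)."""
    peaks = []
    running_max = None
    for value in series:
        peaks.append(running_max)
        if running_max is None or value > running_max:
            running_max = value
    return peaks

# Hypothetical disposable income series (any units).
income = [100, 104, 110, 107, 103, 108, 115]

print(first_differences(income))  # [4, 6, -3, -4, 5, 7]
print(previous_peak(income))      # [None, 100, 104, 110, 110, 110, 110]
```

The ratchet variable rises with income during an expansion but stays at the old peak while income declines, so a consumption function that includes it lets consumption fall less than proportionally in a downturn.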
Parameters of structural equations are estimated by a variety of methods. Least squares is common, in spite of its asymptotic bias in a simultaneous equations context. With the advent of electronic computers, consistent estimating methods have become cheap and are increasingly used, especially the limited-information and the two-stage least squares methods [see Simultaneous equation estimation].
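The asymptotic bias of least squares, and the way two-stage least squares removes it, can be seen in a small simulation. The sketch below uses a deliberately stripped-down two-equation system (c = α + βy + u and y = c + z, with z exogenous) rather than any model from the bibliography; the parameter values, distributions, and function names are all hypothetical.

```python
import random

def ols(x, y):
    """Intercept and slope from a least squares regression of y on a single regressor x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def simulate(n, alpha, beta, seed):
    """Generate data from c = alpha + beta*y + u and the identity y = c + z."""
    rng = random.Random(seed)
    z = [rng.gauss(50.0, 5.0) for _ in range(n)]   # exogenous expenditure
    u = [rng.gauss(0.0, 5.0) for _ in range(n)]    # consumption disturbance
    y = [(alpha + zi + ui) / (1.0 - beta) for zi, ui in zip(z, u)]  # reduced form for y
    c = [yi - zi for yi, zi in zip(y, z)]
    return z, y, c

TRUE_BETA = 0.6
z, y, c = simulate(n=5000, alpha=10.0, beta=TRUE_BETA, seed=0)

# Ordinary least squares: regress c directly on y, which is correlated with u.
_, b_ols = ols(y, c)

# Two-stage least squares: first regress y on the exogenous z,
# then regress c on the fitted values of y from that first stage.
a1, b1 = ols(z, y)
y_hat = [a1 + b1 * zi for zi in z]
_, b_2sls = ols(y_hat, c)

print(f"true beta = {TRUE_BETA}, least squares = {b_ols:.3f}, two-stage = {b_2sls:.3f}")
```

Because income y depends on the disturbance u through the identity, the direct regression overstates β even in a large sample, whereas the two-stage estimate uses only the part of y explained by the exogenous variable and settles near the true value.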
Large-scale models. Two large-scale models have appeared. One, dealing with the United States and having 219 equations, is sponsored by the Social Science Research Council (SSRC) and the Brookings Institution (Duesenberry et al. 1965). The other, dealing with Japan and having 164 equations, is the work of a group at Osaka University (Ichimura et al. 1964). In essential conception these models are similar to the medium-scale models discussed above; they innovate chiefly in providing a more detailed treatment of certain markets and sectors of the economy. The SSRC-Brookings model goes into detail particularly regarding consumption, housing, fixed and inventory investment, new orders, six nonagricultural production sectors, agriculture, government, and population and labor force. The Osaka model is particularly detailed regarding fixed and inventory investment, eight production sectors (including agriculture), foreign trade, and the monetary and financial sector. The SSRC-Brookings model contains a seven-by-seven input-output model corresponding to its seven production sectors, and plans are under way for a more detailed model to contain about 32 production sectors [see Input-output analysis].
Application and evaluation. Econometric model forecasts and their comparison with subsequent actual events have been all too uncommon but are becoming accepted as one important means of evaluating models. Some recent results have been quite good. The Klein-Goldberger model (1955) and its successors under Suits (1962) have been used each November since 1952 to make annual unconditional forecasts of real United States gross national product (GNP) for the following year. The percentage errors for the years 1953 to 1962 have been as follows (where, for example, -0.7 means that the model’s forecast of GNP was too low by 0.7 per cent of the subsequently observed value, +0.5 means that it was 0.5 per cent too high, etc.): -0.7, +0.5, -6.4, +0.2, +0.5, +0.05, -4.0, -1.6, +0.7, -0.1.
Such models may also be used for simulation studies of the economy’s stability and long-term behavior, as in Adelman and Adelman (1959), or of its reaction to policy changes. Simulation studies, as well as forecasts, acquire practical value to the extent that the models used can be shown to be accurate representations of the relevant aspects of economic behavior and not merely systems of equations that fit past data well. [See Simulation, article on economic processes.]
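The ten forecast errors listed above can be summarized with a few lines of arithmetic. The sketch below uses only the figures given in the text; the choice of summary measures (mean error, mean absolute error, and the count of errors within one per cent) and the variable names are assumptions made for illustration.

```python
# Percentage errors of the GNP forecasts for 1953 through 1962, as listed in the text.
years = list(range(1953, 1963))
errors = [-0.7, 0.5, -6.4, 0.2, 0.5, 0.05, -4.0, -1.6, 0.7, -0.1]

mean_error = sum(errors) / len(errors)                  # signed: negative means forecasts ran low
mean_abs_error = sum(abs(e) for e in errors) / len(errors)
within_one_pct = sum(1 for e in errors if abs(e) <= 1.0)
worst_year = max(zip(years, errors), key=lambda pair: abs(pair[1]))[0]

print(f"mean error: {mean_error:.3f} per cent")
print(f"mean absolute error: {mean_abs_error:.3f} per cent")
print(f"forecasts within 1 per cent: {within_one_pct} of {len(errors)}")
print(f"largest miss: {worst_year}")
```

Seven of the ten forecasts fall within one per cent of the observed value, while the two large misses, both underestimates, account for most of the average error.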
Carl F. Christ
[Directly related are the entries Aggregation; Business cycles, article on Mathematical models; Prediction and forecasting, economic; Income and employment theory.]
BIBLIOGRAPHY
Following most of the entries is a notation giving the time period fitted, the number of equations, and the number of exogenous variables.
Adelman, Irma; and Adelman, Frank L. 1959 The Dynamic Properties of the Klein-Goldberger Model. Econometrica 27:596–625.
Brown, T. M. 1964 A Forecast Determination of National Product, Employment, and Price Level in Canada, From an Econometric Model. Pages 59–86 in Conference on Research in Income and Wealth, Models of Income Determination. Studies in Income and Wealth, No. 28. Princeton Univ. Press. → 1926–1941 and 1946–1956 annually; 40 equations, 47 exogenous variables.
Christ, Carl F. 1951 A Test of an Econometric Model for the United States: 1921–1947. Pages 35–107 in Conference on Business Cycles, New York, 1949. New York: National Bureau of Economic Research. → 1921–1941 and 1946–1947 annually; 14 equations, 16 exogenous variables.
Christ, Carl F. 1956 Aggregate Econometric Models: A Review Article. American Economic Review 46, no. 3:385–408.
Duesenberry, James S.; Eckstein, Otto; and Fromm, Gary 1960 A Simulation of the United States Economy in Recession. Econometrica 28:749–809.
Duesenberry, James S., et al. (editors) 1965 The Brookings Quarterly Econometric Model of the United States. Chicago: Rand McNally; Amsterdam: North-Holland Publishing. → 1948–1962 quarterly; 219 independent equations (including about 150 estimated equations), over 100 exogenous variables.
Ichimura, Shinichi et al. 1964 A Quarterly Econometric Model of Japan: 1952–1959. Osaka Economic Papers 12, no. 2:19–44. → 1952–1959 quarterly; 164 equations, about 130 exogenous variables. This paper presents only the equations and definitions of symbols; a book describing the model is scheduled to appear.
Klein, Lawrence R. 1950 Economic Fluctuations in the United States: 1921–1941. New York: Wiley. → 1921–1941 annually; 16 equations, 13 exogenous variables.
Klein, Lawrence R. 1961 A Model of Japanese Economic Growth: 1878–1937. Econometrica 29:277–292. → 1878–1937 quinquennially; 10 equations, 3 exogenous variables.
Klein, Lawrence R. 1964 A Postwar Quarterly Model: Description and Applications. Pages 11–36 in Conference on Research in Income and Wealth, Models of Income Determination. Studies in Income and Wealth, No. 28. Princeton Univ. Press. → A model of the United States. 1948–1958 quarterly; 34 equations, 19 exogenous variables.
Klein, Lawrence R.; and Goldberger, A. S. 1955 An Econometric Model of the United States: 1929–1952. Amsterdam: North-Holland Publishing. → 1929–1941 and 1946–1952 annually; 20 or 25 equations, 20 exogenous variables.
Klein, Lawrence R.; and Shinkai, Y. 1963 An Econometric Model of Japan: 1930–1959. International Economic Review 4:1–28. → 1930–1936 and 1951–1958 annually; 22 equations, 15 exogenous variables.
Klein, Lawrence R. et al. 1961 An Econometric Model of the United Kingdom. Oxford: Blackwell. → 1948–1956 quarterly; 37 equations, 34 exogenous variables.
Liebenberg, Maurice; Hirsch, Albert A.; and Popkin, Joel 1966 A Quarterly Econometric Model of the United States: A Progress Report. Survey of Current Business 46, no. 5:13–39. → 1953–1964 quarterly; 49 equations, 32 exogenous variables.
Liu, Ta-chung 1963 An Exploratory Quarterly Econometric Model of Effective Demand in the Postwar U.S. Economy. Econometrica 31:301–348. → 1947–1959 quarterly; 36 equations, 16 exogenous variables.
Narasimhan, Nuti V. A. 1956 A Short-term Planning Model for India. Amsterdam: North-Holland Publishing. → 1923–1948 annually; 18 equations, 13 exogenous variables.
Nerlove, Marc 1962 A Quarterly Econometric Model for the United Kingdom: A Review Article. American Economic Review 52, no. 1:154–176.
Nerlove, Marc 1966 A Tabular Survey of Macro-econometric Models. International Economic Review 7:127–175.
Netherlands, Central Planbureau 1961 Central Economic Plan 1961. The Hague: The Bureau. → 1923–1938 and 1949–1957 annually; 30 equations, 20 exogenous variables.
Smith, Paul E. 1963 An Econometric Growth Model of the United States. American Economic Review 53, no. 4:682–693. → 1910–1959 annually; 10 equations, 1 exogenous variable.
Suits, Daniel B. 1962 Forecasting and Analysis With an Econometric Model. American Economic Review 52, no. 1:104–132. → 1947–1960 annually; 32 equations, 21 exogenous variables.
Tinbergen, Jan 1939 Statistical Testing of Business-cycle Theories. Volume 2: Business Cycles in the United States of America: 1919–1932. Geneva: Economic Intelligence Service, League of Nations. → 1919–1932 annually; 48 equations, 22 exogenous variables.
Tinbergen, Jan 1951 Business Cycles in the United Kingdom: 1870–1914. Amsterdam: North-Holland Publishing. → 1870–1914 annually; 45 equations, 9 exogenous variables.
Ueno, Hiroya 1963 A Long-term Model of the Japanese Economy, 1920–1958. International Economic Review 4:171–193. → 1920–1936 and 1952–1958 annually; 38 equations, 35 exogenous variables.
Valavanis-Vail, Stefan 1955 An Econometric Model of Growth: USA 1869–1953. American Economic Review 45, no. 2:208–221. → 1869–1948 quinquennially; 20 equations, 7 exogenous variables.