Covariance

views updated Jun 27 2018


The covariance is a measure of the magnitude of association between the scores of cases on two variables that have been measured at the interval or ratio level. It describes both the direction and the strength of the association. In the social sciences, the covariance is most commonly used in structural equation modeling of systems of linear equations of measured and unmeasured variables.

Formally, the covariance between the scores of N cases (i = 1 through N) on the variables X and Y is:

$$\mathrm{COV}(X, Y) = \frac{\sum_{i=1}^{N} (X_i - \bar{X})(Y_i - \bar{Y})}{N}$$

That is: subtract the mean of X from the first case’s score on X; subtract the mean of Y from the first case’s score on Y; multiply these two “deviations.” Repeat this process for all of the cases, and sum the resulting products. Divide this sum by the population size (N).

When the relationship between X and Y is being examined in a random sample of cases drawn from the population, N − 1 is usually substituted in the denominator, which gives an unbiased estimate of the population covariance. Most statistical software uses N − 1.
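The calculation described above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `covariance`, its `sample` flag, and the five pairs of scores are all hypothetical and chosen only to make the arithmetic easy to follow.

```python
def covariance(x, y, sample=True):
    """Covariance of paired scores.

    Divides the sum of cross-products of deviations by N - 1 when
    sample is True, and by N (the population formula) otherwise.
    """
    if len(x) != len(y):
        raise ValueError("x and y must contain the same number of cases")
    n = len(x)
    if n < 2:
        raise ValueError("at least two cases are required")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum over cases of (X_i - mean of X) * (Y_i - mean of Y).
    cross_products = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return cross_products / (n - 1 if sample else n)


# Hypothetical scores for five cases (illustrative only).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

population_cov = covariance(x, y, sample=False)  # divides by N = 5
sample_cov = covariance(x, y, sample=True)       # divides by N - 1 = 4
print(population_cov, sample_cov)
```

With so few cases the two denominators give visibly different answers; as N grows, the difference between dividing by N and by N − 1 shrinks.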

The covariance of a variable with itself (e.g., COV(X, X)) is the variance. (For a more in-depth formal treatment of the covariance, see Snedecor and Cochran 1980.)

If there is a tendency for higher scores on X to co-occur with higher scores on Y, the covariance will have a positive value; if there is a tendency for higher scores on X to co-occur with lower scores on Y, the covariance will be negative. If the scores on two variables are not linearly associated, the covariance will equal zero. The unit of measurement of the covariance is the product of the units of X and Y; for example, if X was measured in dollars, and Y was measured in years, the covariance would be expressed in dollar-years. When we are working with multiple variables, the variances and covariances among all the variables are arrayed in a symmetric “variance-covariance” matrix.
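A small sketch may help show how a variance-covariance matrix is laid out. The helper names `covariance` and `covariance_matrix`, the variable labels X, Y, and Z, and the scores themselves are hypothetical; the N − 1 denominator is assumed throughout.

```python
def covariance(x, y):
    """Sample covariance (N - 1 denominator) of two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


def covariance_matrix(columns):
    """Nested dict of covariances for every pair of named variables."""
    names = list(columns)
    return {
        row: {col: covariance(columns[row], columns[col]) for col in names}
        for row in names
    }


# Hypothetical scores on three variables for five cases (illustrative only).
data = {
    "X": [1, 2, 3, 4, 5],
    "Y": [2, 4, 5, 4, 5],
    "Z": [5, 3, 4, 1, 2],
}

matrix = covariance_matrix(data)
for name, row in matrix.items():
    print(name, [round(value, 2) for value in row.values()])
```

The diagonal cells hold each variable's variance (its covariance with itself), and the cell for X with Y equals the cell for Y with X, which is why the matrix is symmetric. The negative entry for X with Z reflects higher scores on X tending to co-occur with lower scores on Z.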

Consider the relationship shown in the scatter-plot, between the level of urbanization (X) and female life expectancy (Y) in nineteen African countries in the mid-1990s. Inspection suggests that the scores positively covary: on the average (but not in all cases), the higher the urbanization, the higher the life expectancy.

The covariance for this relationship is 57.538. The positive value indicates a positive relationship. The strength of the relationship is difficult to assess because the unit of measurement of the covariance is percent-years. Because of this peculiar metric, the covariance is rarely used as a simple description. The Pearson correlation (which is 0.47 in this example) is preferred because it is unit-free and ranges from −1 to +1.
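The dependence of the covariance on units of measurement can be illustrated with a short sketch. The Pearson correlation divides the covariance by the product of the two standard deviations, which removes the units. The data here are hypothetical and are not the urbanization and life expectancy scores discussed above; the function names are likewise assumptions made for illustration.

```python
import math


def covariance(x, y):
    """Sample covariance (N - 1 denominator)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


def pearson_r(x, y):
    """Covariance divided by the product of the two standard deviations."""
    sd_x = math.sqrt(covariance(x, x))
    sd_y = math.sqrt(covariance(y, y))
    return covariance(x, y) / (sd_x * sd_y)


# Hypothetical scores (illustrative only).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# Re-express X in a unit 100 times smaller, e.g. cents instead of dollars.
x_rescaled = [100 * value for value in x]

cov_original = covariance(x, y)
cov_rescaled = covariance(x_rescaled, y)
r_original = pearson_r(x, y)
r_rescaled = pearson_r(x_rescaled, y)
print(cov_original, cov_rescaled, r_original, r_rescaled)
```

Changing the unit of X multiplies the covariance by 100 but leaves the correlation unchanged, which is why the correlation is easier to read as a description of strength.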

The covariance is the most commonly used measure of association when research involves predicting Y from X using structural equation modeling. In predictive modeling, researchers often want to describe the relationship between Y and X in the original scales of the variables: How much Y do we get for each unit of X?
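The simplest case of this idea is a bivariate linear regression rather than a full structural equation model. There, the slope for predicting Y from X is the covariance of X and Y divided by the variance of X, so it is expressed in units of Y per unit of X. The sketch below assumes that simplified setting; the data and names are hypothetical.

```python
def covariance(x, y):
    """Sample covariance (N - 1 denominator)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


# Hypothetical scores (illustrative only).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# Least-squares slope: COV(X, Y) / VAR(X), in units of Y per unit of X.
slope = covariance(x, y) / covariance(x, x)

# The fitted line passes through the point (mean of X, mean of Y).
intercept = sum(y) / len(y) - slope * sum(x) / len(x)

print(f"predicted Y = {intercept:.2f} + {slope:.2f} * X")
```

Because the slope keeps the original scales, it answers the "how much Y per unit of X" question directly, which a unit-free correlation cannot do.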

Some warnings: Restricted variation in either variable, non-linearity in the relationship, and non-normality in the joint distribution of X and Y can limit the validity of the covariance as an index of the strength and direction of the relationship.
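The warning about non-linearity can be made concrete with a deliberately extreme, hypothetical example. In the sketch below Y is completely determined by X (Y is the square of X), yet the covariance is zero because the relationship is not linear.

```python
def covariance(x, y):
    """Sample covariance (N - 1 denominator)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


# Hypothetical scores: X is symmetric around zero and Y = X squared.
x = [-2, -1, 0, 1, 2]
y = [value ** 2 for value in x]

# Positive and negative cross-products cancel exactly.
cov_xy = covariance(x, y)
print(cov_xy)
```

A covariance of zero here signals the absence of a linear association, not the absence of any association.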

SEE ALSO Standard Deviation

BIBLIOGRAPHY

Snedecor, George W., and William Gemmell Cochran. 1980. Statistical Methods. 7th ed. Ames: Iowa State Press.

Robert Hanneman

International Encyclopedia of the Social Sciences

covariance

views updated Jun 11 2018

covariance A measure of the joint variation of two random variables, analogous to variance (see measures of variation). If the variables are x and y, observed on n cases, then the covariance of x and y is Σ(x_i – x̄)(y_i – ȳ)/n, with n – 1 replacing n when the covariance is estimated from a sample.

The analysis of covariance is an extension of the analysis of variance in which the variables to be tested are adjusted to take account of assumed linear relationships with other variables. See also correlation.

A Dictionary of Computing JOHN DAINTITH

covariance

views updated May 08 2018

covariance In statistics, a measure of the association between two variables. Covariance is calculated as the difference between the average product of corresponding values in the two data sets and the product of the means of the two data sets. A value of zero indicates no linear relationship between the data sets. The covariance value can be used to calculate linear regression and principal components.
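The calculation described in this entry (average product minus product of means) gives the same result as averaging the products of deviations from the means, using the population denominator N. A brief sketch with hypothetical data and variable names shows the two forms side by side.

```python
# Hypothetical paired values from two data sets (illustrative only).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n

# Form 1: average product of corresponding values minus product of the means.
mean_of_products = sum(a * b for a, b in zip(x, y)) / n
cov_shortcut = mean_of_products - mean_x * mean_y

# Form 2: average product of deviations from the means (population formula).
cov_deviations = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n

print(cov_shortcut, cov_deviations)
```

The two forms agree up to floating-point rounding. The deviation form is usually the safer choice numerically when the means are large relative to the spread of the data.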

A Dictionary of Earth Sciences AILSA ALLABY and MICHAEL ALLABY