Statistical Correlations



A correlation exists between two variables when a change in one variable is accompanied by a change in the other. The relationship can be positive or negative.

Height and weight are correlated. As the height of a population increases, weight tends to increase. This is a positive correlation.

As a population’s fruit and vegetable consumption increases, the mortality rate for heart disease decreases. This is a negative correlation.

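Pearson’s correlation coefficient is the most commonly used measure of correlation. It measures the strength of the linear relationship between two variables. It is represented by the Greek letter rho (ρ) when it describes a population parameter and by r when it describes a sample statistic.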
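As a minimal sketch of what the coefficient measures, the sample statistic r can be computed directly from its definition: the sum of the products of the deviations from the two means, divided by the square root of the product of the two sums of squared deviations. The function name `pearson_r` and the height and weight values below are made up for illustration and are not taken from any study.

```python
import math


def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient r of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical illustrative data (not real measurements)
heights_cm = [150, 160, 170, 180, 190]
weights_kg = [52, 58, 69, 77, 88]

r = pearson_r(heights_cm, weights_kg)
print(f"r = {r:.3f}")
```

With these made-up values r is close to +1, which is what a positive correlation between height and weight looks like numerically.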

The coefficient is a number between -1 and +1. In medicine, correlations are interpreted as:

  • 0.9 A very strong correlation
  • 0.7 A strong correlation
  • 0.5 A moderate correlation
  • 0.3 A weak correlation
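The list above can be turned into a small lookup, sketched below under two assumptions that the list itself does not state: each listed value is treated as a lower bound on the absolute value of the coefficient, and anything below 0.3 is labelled "negligible". The function name `describe_strength` is hypothetical.

```python
def describe_strength(r):
    """Label a correlation coefficient using the cut-offs listed above.

    Assumptions (not stated in the list): each value is a lower bound on |r|,
    and values below 0.3 are called "negligible".
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and +1")
    size = abs(r)
    if size >= 0.9:
        label = "very strong"
    elif size >= 0.7:
        label = "strong"
    elif size >= 0.5:
        label = "moderate"
    elif size >= 0.3:
        label = "weak"
    else:
        label = "negligible"  # assumed label, not part of the original list
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "zero"
    return f"{label} {direction}"


print(describe_strength(0.65))
print(describe_strength(-0.95))
```

The sign gives the direction of the relationship and the absolute value gives its strength, so -0.95 and +0.95 are equally strong.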

Below is an example of calculating Pearson’s correlation coefficient using information stored in a database, instead of a spreadsheet.

The data is obtained from the China-Cornell-Oxford Project (The China Study) that examines the link between blood cholesterol and the amount of animal protein consumed as a percentage of total protein. Male and female data are very similar. Only the female data is shown in the table.

Surveys were conducted in 1983–1984 and 1989–1990. The study consisted of 6,500 people in 65 counties from 25 provinces. In each county, two villages (xiang) were selected with 25 men and 25 women from different families selected from each village. Blood, urine and food samples were obtained for analysis, questionnaires were completed and three-day diet information was recorded.

The A89AllVariables table contains the following information:

  • Province Code
  • Province Name
  • County
  • Sex: M, F, T (combined M and F)
  • Xiang: Xiang 3 combines data from xiang 1 and 2
  • P001: Total cholesterol mg/dL
  • D036: % animal protein / total protein consumption
  • plus an additional 360 variables
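
The correlation is calculated with the following SQL query:

  SELECT
    'Correlation P001 & D036: ' AS Labels,
    CASE WHEN Sex = 'M' THEN 'Male' ELSE 'Female' END AS Sex,
    (SUM(P001 * D036) - ((SUM(P001) * SUM(D036)) / COUNT(P001))) /
    SQRT((SUM(P001 * P001) - ((SUM(P001) * SUM(P001)) / COUNT(P001))) *
    (SUM(D036 * D036) - ((SUM(D036) * SUM(D036)) / COUNT(D036))))
        AS Correlation
  FROM A89AllVariables
  -- Assumed: the filter and grouping are not shown in the original query.
  -- Exclude the combined 'T' rows; a filter on Xiang may also be required.
  WHERE Sex IN ('M', 'F')
  GROUP BY Sex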

This produces the following results, which match the results obtained from the LibreOffice spreadsheet program to 7 decimal places.

Correlation P001 & D036:  Female  0.65
Correlation P001 & D036:  Male    0.67
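The sums-based formula used in the query can be checked against the definition of the coefficient on a small data set. The sketch below does this with Python's built-in sqlite3 module. The table `toy`, its rows, and the helper `pearson_r` are hypothetical and are not China Study data; the query is also simplified to group by sex directly, which is an assumption about how the per-sex values are produced.

```python
import math
import sqlite3

# Hypothetical toy rows (Sex, P001, D036); NOT China Study data
rows = [
    ("F", 120.0, 5.0), ("F", 135.0, 9.0), ("F", 150.0, 20.0), ("F", 128.0, 12.0),
    ("M", 118.0, 4.0), ("M", 140.0, 15.0), ("M", 155.0, 18.0), ("M", 125.0, 10.0),
]

conn = sqlite3.connect(":memory:")
# Register sqrt explicitly, since not every SQLite build includes math functions
conn.create_function("sqrt", 1, math.sqrt)
conn.execute("CREATE TABLE toy (Sex TEXT, P001 REAL, D036 REAL)")
conn.executemany("INSERT INTO toy VALUES (?, ?, ?)", rows)

# Same computational formula as the query above, one result per sex
query = """
SELECT Sex,
       (SUM(P001 * D036) - SUM(P001) * SUM(D036) / COUNT(*)) /
       sqrt((SUM(P001 * P001) - SUM(P001) * SUM(P001) / COUNT(*)) *
            (SUM(D036 * D036) - SUM(D036) * SUM(D036) / COUNT(*)))
FROM toy
GROUP BY Sex
ORDER BY Sex
"""
sql_r = dict(conn.execute(query).fetchall())


def pearson_r(xs, ys):
    """Pearson's r from its definition (deviations from the means)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


py_r = {}
for sex in ("F", "M"):
    xs = [p for s, p, d in rows if s == sex]
    ys = [d for s, p, d in rows if s == sex]
    py_r[sex] = pearson_r(xs, ys)

for sex in ("F", "M"):
    print(sex, round(sql_r[sex], 7), round(py_r[sex], 7))
```

The two approaches are algebraically equivalent, so the values should agree to within floating-point rounding. The sums-based form suits SQL because it needs only aggregate functions and a single pass over the table.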
