How can we know what food is healthy? Proving any causal relationship in nutrition is nearly impossible. I highly recommend this Freakonomics episode about how unreliable nutrition studies are. It’s not that the people studying nutrition aren’t smart; it’s just impractical to conduct the kinds of experiments we’d need to prove anything. What we need are large, long-term randomized trials. What we typically get are observational studies, many of which are small, short-term, and retrospective rather than prospective. These observational studies often fail to measure things like smoking, exercise, or income that could correlate with healthy eating but also independently contribute to a longer lifespan. Even truly randomized experiments have to contend with substitution effects: if you tell people not to eat one thing, they’ll eat another thing. It’s as if in a trial of NSAIDs the treatment group is given aspirin and the control group is given ibuprofen. If everyone in the trial is taking an NSAID, how on earth can you figure out what difference NSAIDs make?
In this post I’ll contrast three different views on how we can determine which foods are healthy or not. First, Schwingshackl et al. (hereafter, “the Europeans”) say meta-studies should compare what happens when people eat specific food groups, not specific nutrients. This approach can result in straightforward recommendations for what foods to eat, regardless of exactly which chemical in a given food is good or bad for you.
Second, Wang et al. (hereafter, “the Americans”) take the opposite view and compare what happens when people get different portions of their calories from carbohydrates or the four basic kinds of fats. The Americans argue that by focusing on nutrients and controlling for substitution effects they have a more direct link to mortality than the food groups in the European meta-study.
Third, John Ioannidis, a medical school professor with a joint appointment in statistics, argues these findings are all based on poorly designed studies with flawed statistical analysis. He pushes for reform in nutrition research that would provide larger grants for a few well-designed randomized trials instead of a large number of low-quality observational studies. Until we complete large RCTs, all we really know about nutrition is that obesity and severe malnutrition are bad for you.
The Europeans calculate correlations between consumption of 12 food groups and all-cause mortality. For each food group they present both a linear dose-response estimate, which I summarize in the table below, and nonlinear dose-response curves, for which I show their table. Nuts, whole grains, fish, fruit and vegetables are correlated with lower mortality. Red meat and processed meat are correlated with higher mortality. The significance of the other five groups depends on whether the dose response is constrained to be linear or allowed to be nonlinear (the nonlinear fit makes overfitting easier).
Food | Dose in g/day [range] | Relative Risk of Mortality |
Nuts | 28 [0-52] | 0.76 [0.69, 0.84] |
Whole grains | 30 [0-110] | 0.92 [0.89, 0.95] |
Fish | 100 [0-225] | 0.93 [0.88, 0.98] |
Fruit | 100 [0-626] | 0.94 [0.92, 0.97] |
Vegetables | 100 [5-663] | 0.96 [0.95, 0.98] |
Legumes | 50 [6-166] | Not significant |
Refined Grains | 30 [0-183] | Not significant |
Eggs | 50 [4-68] | Not significant |
Dairy | 200 [0-1041] | Not significant |
Sugar-sweetened beverages | 250 [0-930] mL/day | Not significant |
Red meat | 100 [0-200] | 1.10 [1.04, 1.18] |
Processed meat | 50 [0-200] | 1.23 [1.12, 1.36] |
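The doses in the table differ by food group (28 g of nuts versus 100 g of fish), so the relative risks are not directly comparable row to row. Here is a minimal sketch that rescales each significant row to a common 10 g/day. It rests on an assumption of mine that the post does not state: that the linear dose-response is linear in the logarithm of the relative risk, so the risk at a different dose is the tabulated risk raised to the ratio of the doses. The function name `rescale_rr` is hypothetical, and the results only make sense within each food's observed intake range.

```python
import math

# (food, dose in g/day, relative risk of mortality at that dose),
# copied from the significant rows of the table above.
TABLE = [
    ("Nuts", 28, 0.76),
    ("Whole grains", 30, 0.92),
    ("Fish", 100, 0.93),
    ("Fruit", 100, 0.94),
    ("Vegetables", 100, 0.96),
    ("Red meat", 100, 1.10),
    ("Processed meat", 50, 1.23),
]


def rescale_rr(rr, dose, new_dose):
    """Rescale a relative risk from `dose` to `new_dose`.

    Assumption (not from the source): log(RR) is proportional to dose,
    so RR(new_dose) = RR(dose) ** (new_dose / dose).
    """
    return math.exp(math.log(rr) * new_dose / dose)


if __name__ == "__main__":
    for food, dose, rr in TABLE:
        per_10g = rescale_rr(rr, dose, 10)
        print(f"{food:15s} RR per 10 g/day: {per_10g:.3f}")
```

Putting every row on the same dose makes it easier to see that the nut estimate is far larger per gram than any other food group, which is one reason to treat it with some suspicion.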
The Americans use data from the Nurses’ Health Study and the Health Professionals Follow-Up Study, which followed tens of thousands of men and women and asked them, among other things, about what food they ate. They measured the effect on all-cause mortality of substituting one calorie of carbohydrate (whether it’s fiber, sugar, or another kind) with one calorie of trans fat, saturated fat, monounsaturated fat, or polyunsaturated fat. The short answer: trans fat is super bad, saturated fat is about one fourth as bad, and unsaturated fat is good. See their figure below.
In some ways, these correlations match the results from the Europeans and received wisdom. Salmon, hazelnuts, and olive oil are all supposed superfoods high in unsaturated fat. Sunflower oil actually has an even higher portion of polyunsaturated fat than olive oil. Extrapolating from the figure above, sunflower oil should be even better for you than olive oil. You know what food is made with sunflower oil and is high in polyunsaturated fat? Sun Chips. Now, Sun Chips do sort of market themselves as the healthy alternative to potato chips. But guess what. Lay’s Potato Chips have fewer carbs and more of both polyunsaturated and monounsaturated fat. If we accept the correlations from this study and the trials it’s based on as proof that extra virgin olive oil is good for you, then we would have to conclude that Lay’s Potato Chips are good for you.
The last view is that these observational studies are all deeply flawed ways of studying causal relationships in nutrition. Problems highlighted by Ioannidis include:
- These studies are based entirely on asking people to remember what they ate. We are all terrible at doing that. If our memory lapses were random, then the true effect sizes would actually be larger than those found here. Alternatively, if healthy behaviors like exercise are positively correlated with exaggerating how many vegetables we ate, then the true effect sizes would actually be smaller.
- The European study finds that “High levels of statistical heterogeneity persisted in subgroup analyses.” What that means is that the individual studies pooled in their meta-analysis strongly contradicted each other even after trying to explain away those contradictions based on variations in the study features they could think of. Drawing one central conclusion from studies that contradict each other does not make for strong evidence.
- The European study also finds “publication bias in the analyses of vegetables, fruits, nuts, dairy products, and red meat.” Studies that find a positive correlation are more likely to be submitted to journals and more likely to be accepted. For topics that are very popular, like whether or not vegetables make you live longer, you can actually test for publication bias. If there is one true value that describes the mortality benefits of vegetables, individual studies should report values that follow a symmetrical distribution around that true value. If the observed distribution is actually skewed to the right, it suggests that studies with negative findings are going unpublished. There is evidence of that happening here.
- Confounders: The two meta-studies referenced in this post make a valiant attempt to control for potential confounding variables like smoking and exercise. It’s still possible their simple models and limited data (such as asking people whether they smoke but not how much) don’t fully control for these confounders. Other observational studies often don’t control for confounders at all.
- Many of the papers included in the meta-study cover non-overlapping sets of foods. If a study that asks about whole grains doesn’t ask about nuts and vice versa, and if consumption of nuts and whole grains is correlated, then any health benefits of nuts and whole grains get double counted in the meta-study.
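Two of the problems in this list, random recall error and double counting across studies that ask about different foods, can be illustrated with a toy simulation. This is only a sketch with made-up numbers, not data from either meta-study: the "risk score", the effect size of -1.0 per unit of intake, and every variance below are hypothetical choices, and `ols_slope` and `simulate` are illustrative names.

```python
import random


def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def simulate(n=20000, seed=0):
    """Simulate hypothetical intakes and risk scores, then fit naive slopes."""
    rng = random.Random(seed)
    effect = -1.0  # hypothetical true effect of one unit of either food
    nuts, grains, recalled_nuts = [], [], []
    risk_nuts_only, risk_both = [], []
    for _ in range(n):
        habit = rng.gauss(0, 1)          # shared "healthy eater" tendency
        nut = habit + rng.gauss(0, 1)    # true nut intake
        grain = habit + rng.gauss(0, 1)  # true grain intake, correlated with nuts
        nuts.append(nut)
        grains.append(grain)
        # Random recall error with the same variance as true intake.
        recalled_nuts.append(nut + rng.gauss(0, 2 ** 0.5))
        noise = rng.gauss(0, 1)
        risk_nuts_only.append(effect * nut + noise)        # only nuts matter
        risk_both.append(effect * (nut + grain) + noise)   # both foods matter
    return {
        # Recall error: the same outcome regressed on true vs. recalled intake.
        "true_intake": ols_slope(nuts, risk_nuts_only),
        "recalled_intake": ols_slope(recalled_nuts, risk_nuts_only),
        # Double counting: each study measures only one of two correlated foods.
        "nuts_study": ols_slope(nuts, risk_both),
        "grains_study": ols_slope(grains, risk_both),
    }


if __name__ == "__main__":
    for name, slope in simulate().items():
        print(f"{name:16s} {slope:+.2f}")
```

With these particular variances, the slope on recalled intake should come out near half the true slope, which is the sense in which random recall error hides part of a real effect. The two single-food studies should each report a slope near -1.5, so adding them up credits the pair of foods with about -3 when the true combined effect is -2.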
Until the NIH pays for large randomized trials in nutrition, let’s just all live on potato chips.