What’s the infection fatality rate of COVID-19?

First, let’s define some terms. Case fatality rate (CFR) is the total number of known deaths divided by the total number of known cases. Infection fatality rate (IFR) is total deaths divided by total infections, including the infections that were never diagnosed. The CFR in Italy as of April 1st is 11.9%. The CFR in South Korea as of April 1st is 1.7%. When we look at forecasts, worst-case scenarios, and best-case scenarios for how many people will get infected, we need to know the IFR to convert those infection numbers to fatalities. In the worst case, if we simply fall back on herd immunity and something like 2/3rds of the population is infected, does that mean (11.9% x 67% =) 8% of the population will die? Or (1.7% x 67% =) 1.1%? Or some other fraction?
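
To make that arithmetic concrete, here is a minimal Python sketch. The function name `population_fatality_share` is my own label, and the sketch assumes a single fatality rate applies to everyone infected, which is exactly the assumption the rest of this post questions.

```python
def population_fatality_share(fatality_rate, attack_rate):
    """Share of the whole population that dies if `attack_rate` of people
    are infected and `fatality_rate` of the infected die."""
    return fatality_rate * attack_rate

ATTACK_RATE = 0.67  # roughly 2/3rds of the population infected

# Naively plugging each country's April 1st CFR in where the IFR belongs.
italy_share = population_fatality_share(0.119, ATTACK_RATE)
korea_share = population_fatality_share(0.017, ATTACK_RATE)

print(f"Using Italy's CFR:       {italy_share:.1%} of the population")
print(f"Using South Korea's CFR: {korea_share:.1%} of the population")
```

The two answers differ by a factor of seven, which is why it matters which fatality rate we plug in.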

CFR can over- or underestimate IFR. If we fail to diagnose most cases, we’ll likely overestimate the IFR. That’s probably what’s happening in Italy. Even though some people who die of COVID are likely not being diagnosed, people are more likely to be diagnosed if their symptoms are severe, so the denominator (number of infections) is underestimated more than the numerator (number of deaths). On the other hand, if the number of known infections is growing exponentially, most of the infections are recent, and the time from infection to death from COVID averages around 20 days, so many of the deaths those infections will cause haven’t happened yet. So if we’re identifying the vast majority of cases, as is likely in South Korea, the early CFR numbers underestimate the IFR and will likely creep up. On March 1st, the CFR in South Korea was 0.52%. Today, April 1st, now that the number of new cases has stayed low for weeks, the CFR has risen to 1.66%. It may continue to creep up slightly.
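
The second effect is easy to see in a toy simulation. The sketch below is not a model of any real outbreak: the 1% fatality rate, the 15% daily growth, and the 40 days of growth are all made-up assumptions, every case is assumed to be detected, and every death is assumed to occur exactly 20 days after infection.

```python
TRUE_FATALITY_RATE = 0.01  # assumed, for illustration only
DEATH_DELAY_DAYS = 20      # simplification: every death comes exactly 20 days after infection
GROWTH_DAYS = 40           # new cases grow for this many days, then stop entirely
DAILY_GROWTH = 1.15        # assumed: 15% more new cases each day

def naive_cfr_by_day(total_days):
    """Return the naive CFR (cumulative deaths / cumulative cases) for each day."""
    new_cases = [DAILY_GROWTH ** day if day < GROWTH_DAYS else 0.0
                 for day in range(total_days)]
    cfr = []
    cum_cases = 0.0
    cum_deaths = 0.0
    for day in range(total_days):
        cum_cases += new_cases[day]
        if day >= DEATH_DELAY_DAYS:
            # Deaths today come from the people infected 20 days ago.
            cum_deaths += TRUE_FATALITY_RATE * new_cases[day - DEATH_DELAY_DAYS]
        cfr.append(cum_deaths / cum_cases)
    return cfr

cfr = naive_cfr_by_day(total_days=80)
print(f"Day 39 (outbreak still growing): {cfr[39]:.2%}")
print(f"Day 79 (all outcomes known):     {cfr[79]:.2%}")
```

While cases are still growing, the naive CFR sits far below the true rate. It only reaches the true rate once new cases stop and every outcome is known.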

One more note about IFR: it will vary across countries based on the age distribution of their population and the quality and capacity of their healthcare system relative to the scale of the outbreak.

The best way we have to estimate COVID-19’s IFR is based on a few instances where everyone in a high-risk population was tested for the virus. That way, we can identify all the cases, regardless of how bad people’s symptoms are, if they’re symptomatic at all. We have two good samples like this: 1) the Diamond Princess cruise ship and 2) foreigners who were repatriated from Wuhan.

Diamond Princess

This paper from March 5th was the first to estimate COVID-19’s IFR based on data from the Diamond Princess. Most of the 3,711 passengers and crew were tested. 704 were positive. After a 14-day quarantine, people who never showed symptoms and never had contact with someone who tested positive weren’t tested. Let’s assume 100% of the people who were infected on the ship were correctly identified. The paper uses the confirmation-to-death time distribution from China to estimate how many more of the positive cases would end in fatalities. At the time of publishing, 7 people had died and more were still sick. They estimate that the IFR on the ship was **1.2%**.

This paper from March 30th has updated outcomes for the Diamond Princess passengers and crew through March 25th. By then, the fatality count was up to 10 and there were still 11 people in critical condition. They estimate that the IFR on the ship was **1.4%** (with a confidence interval of 0.7-2.6%).
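
As a rough sanity check on those numbers, we can divide deaths so far by positive tests and put a simple binomial interval around the result. This is not the method either paper uses (they adjust for the delay between confirmation and death); the Wilson score interval here is just a standard way to express the uncertainty that comes from a small number of deaths.

```python
import math

def wilson_interval(deaths, cases, z=1.96):
    """Wilson score interval for a binomial proportion (95% when z=1.96)."""
    p = deaths / cases
    denom = 1 + z**2 / cases
    center = (p + z**2 / (2 * cases)) / denom
    half_width = z * math.sqrt(p * (1 - p) / cases + z**2 / (4 * cases**2)) / denom
    return center - half_width, center + half_width

CASES = 704  # positive tests on the ship, from the text above

for label, deaths in [("At the March 5th paper", 7), ("Through March 25th", 10)]:
    low, high = wilson_interval(deaths, CASES)
    print(f"{label}: {deaths}/{CASES} = {deaths / CASES:.1%} "
          f"(interval {low:.1%} to {high:.1%})")
```

Because some people were still in critical condition, the crude ratio is a floor on the ship’s eventual fatality rate, not a replacement for the papers’ delay-adjusted estimates.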

Note that the ship was docked in Japan. Everyone who needed to be hospitalized was cared for to a high standard. The IFR would be higher in an area with a poor healthcare system, or with a good healthcare system overwhelmed by too many cases.

The age demographics of cruise ships skew older. The median age was in the 60-69 bracket. The IFR would be lower in a younger population such as China’s, though neither of the above papers explicitly calculates IFR by age or adjusts the IFR to any other demographics.

Repatriation Flights

The authors collected data for six flights from Wuhan that took foreigners back to their countries of origin (Japan, Denmark, France, Germany, and Mongolia). The passengers on these flights were carefully screened and tested. As with the Diamond Princess, we can assume that just about everyone infected was identified. Instead of tracking outcomes for those specific people, the authors used these test results to estimate how many people in Wuhan had actually been infected, so they could correct the denominator in the equation IFR = total deaths / total infections. They do a bunch of other analysis to correct for factors like the delay from onset to death.

I’m most interested in that far-right column in the figure below. They’ve got IFR by age bracket. The ‘overall’ line is for the demographics of China. We can take those IFRs by age and adjust them to the United States. In this Google Sheet I do exactly that and calculate an IFR for the US of **0.8%**. That’s about eight times the fatality rate of the seasonal flu, for a virus with no vaccine or existing immunity. If the virus infected 2/3rds of the US population, that would be roughly 1.8 million fatalities (0.8% x 2/3 x about 330 million people). That’s a huge deal. And the fatalities would be higher if our healthcare system weren’t able to provide people with the treatment they need. But it’s also much lower than you’d get by applying Italy’s 11.9% CFR.
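
The age adjustment itself is just a weighted average: multiply each age bracket’s IFR by that bracket’s share of the population and add the results. Here is a minimal sketch of the idea. The brackets, IFRs, and population shares below are placeholders I made up to show the mechanics; they are not the paper’s figures or real census data, and the sketch assumes every age group is infected at the same rate.

```python
def age_adjusted_ifr(ifr_by_age, population_share_by_age):
    """Population-weighted average of age-specific IFRs."""
    total_share = sum(population_share_by_age.values())
    if abs(total_share - 1.0) > 1e-6:
        raise ValueError("population shares must sum to 1")
    return sum(ifr_by_age[age] * share
               for age, share in population_share_by_age.items())

# PLACEHOLDER values for illustration only; not from the paper or any census.
ifr_by_age = {"0-39": 0.001, "40-69": 0.01, "70+": 0.05}
younger_population = {"0-39": 0.60, "40-69": 0.32, "70+": 0.08}
older_population = {"0-39": 0.50, "40-69": 0.38, "70+": 0.12}

print(f"Younger population: {age_adjusted_ifr(ifr_by_age, younger_population):.2%}")
print(f"Older population:   {age_adjusted_ifr(ifr_by_age, older_population):.2%}")
```

With the same age-specific IFRs, the older population ends up with the higher overall IFR, which is the whole reason for re-weighting China’s numbers to US demographics.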