Comparing COVID-19 Mortality Rates Over Time By Country

As COVID-19 spreads its deadly effects around the world, many data analysts are struggling to track these effects in useful ways. Some attempts work better than others, however. Comparing these effects among various countries is particularly challenging. Some attempts that I’ve seen are confusing and difficult to read, even for statisticians. Here’s an example that was brought to my attention recently by a statistician who found it less than ideal:

I believe that the objectives of displays like this can be achieved in simpler, more accessible ways.

Before proposing an approach that works better, let’s acknowledge that country comparisons of deaths from COVID-19 are fraught with data problems that will never be remedied by any form of display. Even here in the United States, many deaths due to COVID-19 are never recorded. If someone with COVID-19 suffers from pneumonia as a result and then dies, what gets recorded as the cause on the death certificate: COVID-19 or pneumonia? Clear procedures aren’t currently in place. Medical personnel are focused on saving lives more than recording data in a particular way, which is understandable. This problem is no doubt occurring in every country. The integrity of the data from country to country differs to a significant degree and does so for many reasons. It’s important to recognize whenever we display this data that country comparisons will never be entirely reliable. Nevertheless, working with the best data that’s available, we must do what we can to make sense of it.

If we want to compare the number of deaths due to COVID-19 per country, both in terms of magnitudes and patterns of change over time, the following design choices seem appropriate:

  1. Assuming that we want to understand the proportional impact on countries, use a ratio such as the number of deaths per 1 million people rather than the raw number of deaths, to adjust for population differences.
  2. Aggregate the data to weekly values to eliminate the noise of day-to-day variation.
  3. Use rolling time (i.e., week 1 consists of days 1 through 7, week 2 consists of days 8 through 14, etc.) rather than calendar time, beginning with the date on which the first death occurred in each country.
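The three choices above can be sketched in a few lines of code. This is only a minimal illustration, not the procedure used to build the graphs in this post: the function name, the input layout (a list of date and daily-death pairs), and the decision to drop a trailing incomplete week are all assumptions made for the example, and the numbers are invented rather than real data.

```python
from datetime import date, timedelta


def weekly_deaths_per_million(daily_deaths, population):
    """Return weekly deaths per 1 million people, in rolling weeks.

    daily_deaths: list of (date, new_deaths) pairs (hypothetical layout).
    Week 1 starts on the date of the first recorded death.
    A trailing incomplete week is dropped (an assumption of this sketch).
    """
    records = sorted(daily_deaths)
    # Rolling time begins on the first day with at least one death.
    start = next((day for day, deaths in records if deaths > 0), None)
    if start is None:
        return []

    weekly_totals = {}
    for day, deaths in records:
        if day < start:
            continue
        week_index = (day - start).days // 7  # 0 = week 1, 1 = week 2, ...
        weekly_totals[week_index] = weekly_totals.get(week_index, 0) + deaths

    days_covered = (records[-1][0] - start).days + 1
    complete_weeks = days_covered // 7
    # Convert each weekly total to a rate per 1 million people.
    return [
        weekly_totals.get(week, 0) * 1_000_000 / population
        for week in range(complete_weeks)
    ]


# Invented example: two days with no deaths, then 16 days with 1, 2, ..., 16 deaths.
first_death = date(2020, 3, 1)
example = [(date(2020, 2, 28), 0), (date(2020, 2, 29), 0)]
example += [(first_death + timedelta(days=i), i + 1) for i in range(16)]

rates = weekly_deaths_per_million(example, population=2_000_000)
print(rates)  # [14.0, 38.5]
```

In this invented example, days 15 and 16 are left out because they don't yet form a complete week, which matches the idea of waiting for another complete week's worth of data before extending a line.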

The following line graph exhibits these design choices. To keep things simple for the purpose of illustrating this approach, I’ve included four countries only: the U.S., China, Italy, and Canada. Also, for the sake of convenience, I’ve relied on the most readily available data that I could find, which comes from www.ourworldindata.org.

Most people in the general public could make sense of this graph with only a little explanation. It’s important to recognize, however, that no single graph can represent the data in all the ways that are needed to make sense of the situation. Perhaps the biggest problem with this graph is that the number of weekly deaths per 1 million people varies greatly in magnitude from country to country, ranging from over 90 at the high end in Italy to less than 1 at its peak in China. As a result, the blue line representing China appears almost flat as it hugs the bottom of the graph, which makes its pattern of change unreadable. Assuming that the number of deaths in China is accurate (not a valid assumption for any country), this tells us that COVID-19 has had relatively little effect on China overall. The immensity of China in both population and geographical space is reflected in this low mortality rate. The picture would look much different if we considered Hubei Province, where Wuhan is located, alone.

Obviously, if we want to compare the patterns of change among these countries more easily, regardless of magnitude, we must solve this scaling problem. Some data analysts attempt to do this by using a logarithmic scale, but this isn’t appropriate for the general public because few people understand logarithmic scales and their effects on data. Another approach is to complement the graph above with a series of separate graphs, one per country, that have been independently scaled to more clearly feature the patterns of change. Here’s the same graph above, complemented in this manner:
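To make the scaling problem concrete, here is a rough sketch of how much of a graph's height each line's peak occupies under a shared scale versus independent scales. The function name and the two series are hypothetical, chosen only to echo the magnitudes mentioned above (a peak near 90 versus a peak below 1), and the sketch assumes that each vertical axis runs from zero to the largest value it must show.

```python
def peak_height_fraction(series_by_country, shared_axis=True):
    """Return the fraction of the vertical axis reached by each series' peak.

    Assumes every axis runs from 0 to the largest value it must display,
    and that every series is non-empty with a positive peak.
    """
    shared_max = max(max(values) for values in series_by_country.values())
    fractions = {}
    for country, values in series_by_country.items():
        # A shared axis uses the overall maximum; an independent axis
        # uses only that country's own maximum.
        axis_max = shared_max if shared_axis else max(values)
        fractions[country] = max(values) / axis_max
    return fractions


# Invented weekly rates per 1 million people, for illustration only.
example = {
    "Country A": [10, 40, 90, 60],
    "Country B": [0.2, 0.6, 0.9, 0.5],
}

print(peak_height_fraction(example, shared_axis=True))
print(peak_height_fraction(example, shared_axis=False))
```

On the shared scale, the smaller series peaks at roughly one percent of the graph's height, which is why its shape can't be read. With independent scales, every series fills its own graph, so the shapes can be compared even though the magnitudes no longer can.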

With this combination of graphs, there is now more that we can see. For instance, the pattern of change in China is now clearly represented. Notice how similar the patterns in China and Italy are. From weeks 1 through 7, which is all that’s reflected in Italy so far, the patterns are almost identical. Will their trajectories continue to match as time goes on? Time will tell. Notice also the subtle differences in the patterns of change in the U.S. versus Canada. In the beginning, mortality increased in Canada at a faster rate but started to decrease from the fourth to fifth week while the pattern in the U.S. does not yet exhibit a decrease as of the sixth week. Will mortality in the U.S. exhibit a decline by week 7 similar to China and Italy? When another complete week’s worth of data is added to the U.S. graph, we’ll be able to tell.

Clearly, there are many valid and useful ways to display this data. I propose this simple set of graphs as one of them.

3 Comments on “Comparing COVID-19 Mortality Rates Over Time By Country”


By Dale Lehman. April 14th, 2020 at 2:01 pm

Your separate country graphs are a vast improvement. The problem with the absolute levels is that they depend critically on the total population levels – China, as a nation, is huge, Hubei less so. Given that the pandemic breaks out in particular areas and may or may not be confined to those areas, I think any attempt to put the countries on a single graph is likely to obscure the meaningful patterns. Your final graphs avoid this and allow for potentially meaningful differences to emerge. My speculation (and it is only speculation at this point) is that Italy has a higher death rate than the US, but had a superior response to the crisis – they got more serious about quarantining, as did China (and Korea, etc.), while the delayed response in the US and Canada shows up more clearly. Expanding this to include more locations would be instructive. So, I think that ultimately it will not be the absolute mortality rate of COVID that we want to compare, but rather the shapes of these curves. Then, combining the shapes with some measures of health system capacity to deal with the crisis is what will be most meaningful.

By Stephen Few. April 14th, 2020 at 6:35 pm

Dale,

It’s tempting to speculate about causes, isn’t it? I’m inclined to believe that Italy’s higher mortality rate is tied to its greater degree of socializing and that its relatively quick turnaround is tied in part to a good national health care system in addition to social distancing, but this is pure speculation. My guess that their health care system deserves some credit is highly biased by some personal experiences that I had with it several years ago during visits to Italy, which impressed me and made me grateful.

By Catalin. April 22nd, 2020 at 3:44 pm

Finally some useful writing and examples in a sea of useless charts!

Not sure if links are allowed but this brought a smile to my face 🙂

https://xkcd.com/2294/
