Today, the Institute for Health Metrics and Evaluation (IHME, my employer) is launching 8 new interactive data visualizations that bring to life the results of the 5-year Global Burden of Disease (GBD) study at the country level. The GBD study compiled all available data on health outcomes for 187 countries in the world for 1990 and 2010, and provides estimates for the burden caused by different diseases and risk factors that are comparable across countries and over time. Regional results were published in a dedicated triple issue of The Lancet in December 2012 (see my related post here). As manager of the Data Team at IHME, I have been lucky enough to support the project by finding and managing data over the past 4 years, as well as overseeing the creation of these visualizations.
The data visualizations play a key role in the GBD project for several reasons. It started with IHME’s need to review the results of GBD. Tables and static graphs just don’t provide the flexibility to properly assess results and identify patterns and trends.
GBD uses four key metrics: number of deaths, years of life lost (YLL), years lived with disability (YLD), and disability-adjusted life years (DALY). The results datasets are massive, broken down by several dimensions:
- 291 causes of disease and injuries at the most granular end of a 5-level cause hierarchy
- 66 risk factors
- 1100 cause-risk factor attributions (i.e. burden caused by a given risk factor via a particular disease or injury)
- 187 countries, 21 GBD regions, global
- 27 age groups: early neonatal, late neonatal, post neonatal, 1-4 years, 5-9, 10-14 and so on until 75-79, 80+, as well as under 5, 5-14, 15-49, 50-69, 70+, all ages, and age-standardized
- Male, female, both
- 3 years: 1990, 2005, 2010
- Estimates expressed as total number, rate, and %, as well as ranked by country
- 95% uncertainty intervals: lower bound, mean, and upper bound (not strictly a dimension, but it adds to the size of the database)
In total, about 1 billion (!) results were calculated for the project, and then there are aggregations by cause, age, and geography. A nightmare to review, but a gold mine for visualizations. The results datasets are fully imputed for all dimensions, i.e. there are no gaps in the datasets. And consistent use of methods ensures comparability of results across all dimensions.
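To get a feel for how quickly these dimensions multiply, here is a rough back-of-the-envelope sketch in Python. It is not how the 1 billion figure was actually derived: it simply takes the full cross product of the sizes listed above, which is an assumption on my part. In practice not every metric applies to every cause or risk, the list mixes granular and aggregate categories, and rankings are left out. The variable names are made up for illustration.

```python
from math import prod

# Dimension sizes copied from the list above.
metrics = 4                   # deaths, YLL, YLD, DALY
outcomes = 291 + 66 + 1100    # causes + risk factors + cause-risk attributions
locations = 187 + 21 + 1      # countries + GBD regions + global
age_groups = 27
sexes = 3                     # male, female, both
years = 3                     # 1990, 2005, 2010
units = 3                     # total number, rate, percent

# Assumption: every combination exists (a naive upper bound, not the real count).
point_estimates = prod(
    [metrics, outcomes, locations, age_groups, sexes, years, units]
)

# Each estimate is stored with a lower bound, a mean, and an upper bound.
with_bounds = point_estimates * 3

print(f"point estimates:         {point_estimates:,}")
print(f"with uncertainty bounds: {with_bounds:,}")
```

Even this naive count lands in the high hundreds of millions before uncertainty bounds are added, the same order of magnitude as the figure quoted above, which is why tables and static graphs stop being practical for review.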
We improved the tools as we reviewed our results, then started using the tools to show the results to collaborators and country experts to obtain feedback, review our estimates, and discuss what data were used for analysis (and what data may be available to further inform and improve the estimates). Realizing how powerful these tools are for different audiences to explore the results of GBD, we decided to make them publicly available. In December 2012, we launched 5 visualization tools with the regional results of GBD (available here) with the publication of the GBD papers in The Lancet.
Updates for these tools are now available with country-level results. In addition, we created three new tools that allow users to review and explore the data from completely new angles. Here is a quick overview of the country-level visualizations:
- GBD Compare is a powerful platform that visualizes the data in treemaps, maps, time plots, age plots and stacked bar charts. The most powerful feature is the 2-panel view that allows users to review any two of these charts simultaneously to compare and review trends across causes, risks, countries, ages etc. The panels are interactive, e.g. the map can be used to select countries in the other panel and quickly explore countries around the world. It’s a powerful tool, but requires a bit of commitment to make use of all the features. My video tutorial for GBD Compare can be found here.
- GBD Cause Patterns provides results for 21 cause groups in stacked column charts. It allows quick exploration of trends across geographies, ages, gender and time (see options at the bottom of the screen).
- GBD Arrow Diagram shows very concisely the rank of causes and risks for a given country or region in 1990 and 2010, along with the related growth trend. The connecting arrows quickly show how fast causes and risks have grown or decreased between 1990 and 2010. A version of the GBD Arrow Diagram is embedded below.
- GBD Heatmap ranks causes and risks by burden within a country, but then allows comparisons of those ranks across countries and/or regions (you can compare the ranks within a country with the ranks for a given region or the world).
- GBD Uncertainty Visualization allows users to compare uncertainty bounds across causes and risks for all dimensions. Countries or causes/risks where the data were more sparse or inconsistent will have wide uncertainty intervals.
- HALE/LE Visualization shows the relationship between total life expectancy and healthy life expectancy, i.e. the number of years people can expect to spend in good health over their lifetime.
- Mortality Visualization provides an interesting addition to the results: users can look at all-cause mortality estimates and uncertainty bounds in the context of the underlying input data points. The hovers provide detailed metadata about the source of the data point.
- COD Visualization shows the input data points for cause of death data by country, cause, and sex, also with detailed metadata.
All visualizations also feature “share” functionality that creates a unique URL for the chosen settings that can be shared via email, Twitter, Facebook or other social media. This should be useful to bring up the tools in online conversations about the health situation in different countries, disease patterns and international comparisons.
These tools will be used extensively in policy and country consultations, and many of these conversations will be conducted in locations that have less than reliable internet connections. To facilitate use, we created offline versions of these tools as well. The sheer size of the data provided a substantial challenge, but the tools are now performing well offline.
If you are interested in building additional visualizations with the GBD results, you should start with the regional results of GBD, all available for download on the GHDx here. The country-level results will be made available via the GHDx in September 2013.
I would love to get your feedback on your experience with using the visualizations. Are they intuitive? Are there features that you like or don’t like? Are there things you would like to see or do with the data that aren’t possible yet? Leave suggestions in the comments, and I will make sure to include them in our discussions for future development.
Example: GBD Arrow Diagram