Saturday, December 6, 2008

CDC Mortality data: 3

This is the third in the series of posts on the CDC's (Centers for Disease Control and Prevention) public use mortality data sets. Many people are reluctant to get deeply involved in the nitty-gritty of medical informatics because they don't know how to program or they do not understand the algorithmic approaches to data analysis or the statistical methods for assessing the significance of these analyses. I hope to show, in this series of blogs on the CDC mortality data sets, that the technical steps in data organization, data integration, and data analysis are all relatively simple. The tough part of medical informatics comes from human socio-political-legal activities.

In today's blog, we'll see how the underlying causes of death are listed on a prototypical death certificate.

Much of what we think we know about the ways that Americans die comes from analyses of death certificates. Annual death certificate data for the entire U.S. population have been collected since 1935 by the Vital Statistics Program of the National Center for Health Statistics. Death certificate data are notoriously error-prone, and the problems seem to extend beyond national borders, as similar complaints have been voiced in the United States and the United Kingdom. The most common error occurs when a mode of death (e.g., cardiac arrest, cardiopulmonary arrest) is listed as the cause of death, thus nullifying the potential value of the death certificate. A recent survey of 49 national and international health atlases has shown that there is virtually no consistency in the way that death data are presented.

Members of the public may believe that death certificates are completed after a formal autopsy is conducted. This is seldom the case. Autopsies are conducted in only a small percentage of deaths, worldwide. Autopsies can take weeks before the final report is issued. Doctors who complete the death certificate, usually within minutes or hours of the patient's death, do so without the benefit of a pathologist's post-mortem examination. The death certificate contains a doctor's best guess of the patient's cause of death, but the best guess may be inaccurate.

Complicating the "cause of death data" are the rather strange ways we have come to think about the biological steps leading to death. For centuries, the cause of death has been encapsulated in a backwardly-sequential list of conditions.

Here are a few examples from:

Here is an example of an underlying cause of death leading to an immediate cause of death.

(a) Bleeding of esophageal varices
(b) Portal hypertension
(c) Liver cirrhosis
(d) Hepatitis B

Hepatitis B is the underlying cause of death. Hepatitis B led to the development of liver cirrhosis, which, in turn, produced portal hypertension. Portal hypertension led to the development of esophageal varices. The varices bled. The patient's immediate cause of death was internal bleeding (from esophageal varices). Hepatitis B was the antecedent for every condition listed.

How would this be entered on the patient's death certificate? Let's look at a blank form.

This figure, showing part of a death certificate form, is extracted from a U.S. government publication available at:

In Part 1 of Item 27, hepatitis B would be listed on line d; liver cirrhosis on line c; portal hypertension on line b; and bleeding of esophageal varices on line a. Additional significant medical conditions that did not cause the patient's death are listed in Part 2 of Item 27. Nothing could be easier!
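For readers who like to see these things in code, here is a minimal Python sketch of the example above. It treats Part 1 of Item 27 as four lettered lines and reads the immediate cause from line a and the underlying cause from the lowest completed line. The dictionary layout and the function names are my own assumptions, made up for illustration; they are not part of any CDC file format or software.

```python
# Part 1 of Item 27 for the example above, keyed by line letter.
# This layout is a hypothetical illustration, not a CDC data format.
part_1 = {
    "a": "Bleeding of esophageal varices",
    "b": "Portal hypertension",
    "c": "Liver cirrhosis",
    "d": "Hepatitis B",
}


def completed_lines(part):
    """Return the non-blank entries in form order (line a first)."""
    return [part[line] for line in sorted(part) if part[line].strip()]


def immediate_cause(part):
    """Line a: the condition that directly preceded death."""
    return completed_lines(part)[0]


def underlying_cause(part):
    """The lowest completed line: the condition that started the sequence."""
    return completed_lines(part)[-1]


def causal_chain(part):
    """Read the form bottom-up to get the conditions in chronological order."""
    return list(reversed(completed_lines(part)))


print("Immediate cause: ", immediate_cause(part_1))
print("Underlying cause:", underlying_cause(part_1))
print("Sequence:        ", " -> ".join(causal_chain(part_1)))
```

Because the form lists conditions in backward sequence, reversing the completed lines gives the order in which the conditions actually developed. A certificate with only lines a and b filled in is handled the same way, with line b taken as the underlying cause.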

Seemingly intractable problems arise when:

There are multiple, sometimes unrelated, conditions that contribute to the patient's death;

The doctor filling out the death certificate is not familiar with the patient's history;

The doctor has not been trained to complete the death certificate properly;

The doctor is lazy and does not make the effort to provide a complete and accurate death certificate;

The cause of death is obscure or contentious;

The doctor has reason to conceal conditions leading to the cause of death. The phrase "I did it" never appears in Item 27 of the death certificate, though we know that many deaths are iatrogenic.

Thousands of instructional pages have been written on the proper way to complete a death certificate. Though we strive to do our best, it is unlikely that humans can be expected to prepare consistent and accurate summaries of what has always been a phenomenon shrouded by ignorance.

In the next blog, we'll discuss how the data on every death certificate is transformed into a mortality record consisting of an alphanumeric sequence.

As I remind readers in almost every blog post, if you want to do your own creative data mining, you will need to learn a little computer programming.

For Perl and Ruby programmers, methods and scripts for using a wide range of publicly available biomedical databases are described in detail in my prior books:

Perl Programming for Medicine and Biology

Ruby Programming for Medicine and Biology

An overview of the many uses of biomedical information is available in my book,
Biomedical Informatics.

More information on cancer is available in my recently published book, Neoplasms.

© 2008 Jules Berman

As with all of my scripts, lists, web sites, and blog entries, the following disclaimer applies. This material is provided by its creator, Jules J. Berman, "as is", without warranty of any kind, expressed or implied, including but not limited to the warranties of merchantability, fitness for a particular purpose and noninfringement. In no event shall the author or copyright holder be liable for any claim, damages or other liability, whether in an action of contract, tort or otherwise, arising from, out of or in connection with the material or the use or other dealings in the material.
