How we calculated All Children’s surgical mortality rates

In April, the CEO of Johns Hopkins All Children's Hospital said the hospital’s heart surgery program had “challenges” and that its mortality rate had risen in 2017. The hospital declined to elaborate or provide 2017 data.

No one forces children’s hospitals to publicly report these outcomes. The Society of Thoracic Surgeons, the nonprofit that represents heart surgeons, publishes four-year rolling averages for almost every pediatric heart center in the United States, including All Children’s. But it does not publish data for individual years. Because each figure averages four years of data, recent problems can be masked.

To better understand what happened in 2017, we calculated the annual outcomes for All Children’s and compared its heart center to every children’s heart surgery program in Florida.

[ Read the full investigation: Heartbroken. ]

Below, we describe our methodology and some of the ways we tested our findings. Experts in hospital data or heart surgery can click on technical details for more specifics, including the computer code we used.

We started with each hospital's billing data, and then filtered to only heart surgeries.

Every hospital must report “discharge data” to the Florida Agency for Health Care Administration. The data, also called administrative or admissions data, includes information about every patient treated at the hospital. It lists the procedures the patient underwent, the doctors involved in the treatment and any diagnosed conditions.

We get the data from the state every quarter, stripped of information identifying specific patients.

We analyzed the past 10 years of data for the 10 Florida hospitals that run children’s heart surgery programs. We also analyzed the program at St. Mary’s Medical Center, which was open only from 2011 to 2015. We only compared complete years of data.

Then we identified children who had at least one congenital heart surgery.

For data before October 2015, we used a method created by the federal Agency for Healthcare Research and Quality and Boston Children’s Hospital.

We used the research agency's methodology for measuring pediatric heart surgery volumes (Pediatric Quality Indicator 07). That methodology applies to administrative data coded in the ICD-9-CM system.

We deviated from the methodology in only two ways.

  1. The methodology excludes babies less than 30 days old whose only heart-related diagnosis and procedure involve a patent ductus arteriosus. That’s a condition in which a natural opening in a newborn’s heart that is supposed to close soon after birth does not. Before 2018, Florida’s administrative data listed babies only as 0 years old, not by a number of days. After consulting with experts, we decided to exclude all babies less than 1 year old who match this description (a sketch of that check appears after this list).
  2. We did not exclude heart transplants. The Society of Thoracic Surgeons includes transplants in its mortality calculations, and we wanted to compare our results to the ones the society reported. At All Children’s, transplants are performed by the Heart Institute’s doctors.
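Here is a minimal sketch of how that first exclusion can be expressed in code. It is an illustration, not our exact script: the argument names and the code sets pda_diagnoses and pda_procedures are hypothetical stand-ins, and the codes shown are example ICD-9-CM codes rather than the full lists we used.

# Example code sets; 747.0 is the ICD-9-CM diagnosis code for patent ductus
# arteriosus, and the procedure code shown is illustrative only
pda_diagnoses = {'747.0'}
pda_procedures = {'38.85'}

def is_pda_only_newborn(age_years, heart_diagnoses, heart_procedures):
    # Excluded only if the baby is under 1 year old and a PDA code is both the
    # sole heart-related diagnosis and the sole heart-related procedure listed
    return (age_years < 1
            and len(heart_diagnoses) == 1 and heart_diagnoses[0] in pda_diagnoses
            and len(heart_procedures) == 1 and heart_procedures[0] in pda_procedures)

# Example: a newborn whose only heart-related codes are a PDA diagnosis and a
# PDA closure would be dropped from our surgery counts
print(is_pda_only_newborn(0, ['747.0'], ['38.85']))   # True -> excluded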

In October 2015, the codes hospitals use to represent procedures and diagnoses changed.

The healthcare research agency has not fully updated its method to identify heart surgeries using the new codes. It has published a draft, but new codes have been added to the coding system since the draft was written.

For data starting in October 2015, we started with the draft methodology, and then widened the net to capture a broader list of procedures categorized as open-heart surgeries.

Not every open-heart surgery is a congenital heart procedure, however. So we filtered to children who were treated by a congenital heart surgeon, to take out the irrelevant cases.

We gathered a list of heart surgeons who have operated at All Children’s since 2008. They were listed on 99 percent of the hospital’s cases identified by the pre-2015 method.

Florida data has been coded in ICD-10 since October 2015, with diagnoses in ICD-10-CM and procedures in ICD-10-PCS.

We started with the Agency for Healthcare Research and Quality’s proposed PDI 06 methodology.

Then we added additional heart surgeries. Heart operations start with the characters “02” in the ICD-10-PCS procedure codes, and open-approach surgeries have a “0” in the fifth spot. So we included patients who had at least one procedure code that matched a “02__0__” pattern and at least one diagnosis code for congenital heart disease.

import re

# diagnosis_headers, procedure_headers and ahrq_chd_codes are defined elsewhere
# in our scripts: the diagnosis and procedure columns in the state data, and the
# AHRQ congenital heart disease code lists keyed by coding system

# Checks if the discharge has a congenital heart disease listed as a diagnosis
def has_chd(row):
    icd_coding = row['ICD_CODING']

    # Loops through each column that lists diagnoses
    for diagnosis_header in diagnosis_headers:
        if any(chd_code == row[diagnosis_header] for chd_code in ahrq_chd_codes[icd_coding]):
            return True

    return False

# Checks if any procedure matches a 02__0__ pattern
p = re.compile('02..0..')

def has_open_heart(row):
    # Loops through each column that lists procedures
    for procedure_header in procedure_headers:
        if row['ICD_CODING'] == 'ICD10' and p.match(str(row[procedure_header])):
            return has_chd(row)

    return False

This net caught some discharges that wouldn’t have been counted under the research agency’s older method or by the Society of Thoracic Surgeons. So we filtered to cases that listed a congenital heart surgeon as either the attending physician or one of the operating doctors.

Nine of the 10 active hospitals report the names of their heart surgeons to the Society of Thoracic Surgeons. One hospital, Wolfson Children’s Hospital in Jacksonville, does not report to the society, so we got a list of surgeons from the hospital.

# hospital_surgeons is defined elsewhere in our scripts: it maps each hospital's
# facility number to the physician IDs of its congenital heart surgeons

# Checks if a heart surgeon is listed as one of the patient's doctors
def has_surgeon(row):
    if row['FACLNBR'] in hospital_surgeons:
        this_surgeons = hospital_surgeons[row['FACLNBR']]

        # Checks attending physician, operating physician, and other operating physician columns
        if row['ATTEN_PHYI'] in this_surgeons or row['OPER_PHYID'] in this_surgeons or row['OTHOPER_PH'] in this_surgeons:
            return True

    return False

This left us with a pool of surgeries performed by each children’s heart program for every year in the past decade.
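For readers following the code, here is a rough sketch of how those filters fit together for a single discharge. It is illustrative only: is_icd9_heart_surgery is a hypothetical stand-in for the pre-October 2015 AHRQ method described above, which we don’t reproduce here, and discharges stands for the state data loaded elsewhere in our scripts.

# Rough sketch of the combined per-discharge test; has_open_heart() and
# has_surgeon() are the functions shown earlier, and is_icd9_heart_surgery()
# is a stand-in for the AHRQ ICD-9-CM method
def is_congenital_heart_surgery(row):
    if row['ICD_CODING'] == 'ICD9':
        # Data before October 2015: AHRQ volume methodology
        return is_icd9_heart_surgery(row)
    else:
        # Data from October 2015 on: open-heart procedure with a congenital
        # heart diagnosis, treated by a congenital heart surgeon
        return has_open_heart(row) and has_surgeon(row)

# The pool of surgeries is every discharge that passes this test
surgeries = [row for row in discharges if is_congenital_heart_surgery(row)]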

There are many reasons to think the changes in coding did not influence our results:

  1. Both methods captured about the same percentage of All Children’s cases identified in the four-year numbers released by the Society of Thoracic Surgeons (more on that below).
  2. The change in methodology did not result in an immediate spike in mortality at All Children’s. The hospital’s rate in 2016 — the first full year of data coded the new way — was only 0.23 percentage points above the previous year.
  3. The mortality rate reported for All Children’s by the Society of Thoracic Surgeons from 2013 to 2016 was only 0.25 percentage points higher than the mortality we calculated for that same period.
  4. We hand-reviewed many of All Children’s 2017 cases, looking for discharges that listed a heart surgeon but no heart surgery, and for congenital heart operations that were excluded because no surgeon was listed. The review did not find any cases that were incorrectly excluded.

We found a big increase in deaths and complications at All Children’s in 2017.

We were able to count the number of deaths at each Florida heart program, and four additional indicators of complications.

We calculated the number of days it took patients to recover after surgery and the number of postoperative patients who acquired sepsis, a life-threatening condition in which the body’s response to an infection damages its own tissues. We also calculated the number of patients whose surgical wounds broke open, and the number of times surgeons put patients on a type of heart support machine called extracorporeal membrane oxygenation, or ECMO, after surgery.

Deaths were counted if a patient’s discharge status was labeled as “20,” or expired. This is a more limited criterion than the one used by the Society of Thoracic Surgeons, which also counts deaths that occur within 30 days of discharge. Administrative data does not give us that information.

# Checks if the discharge status is 20
def has_deceased(row):
    return row['DISCHSTAT'] == 20

The length of stay after heart surgery was calculated by taking the total length of stay and subtracting the number of days into the stay that the first heart procedure was performed.

# Helper function gets the number of days into a hospital stay that a procedure took place
def get_days_from_proc(surgery, procedure_header):
    # Gets the day a procedure took place from the appropriate column
    if procedure_header == 'PRINPROC':
        return surgery['DAYSPROC']
    else:
        proc_num = int(procedure_header.replace('OTHPROC', ''))

        if proc_num > 9:
            # d, defined elsewhere in our scripts, maps procedure numbers above 9
            # to the column-name suffixes used in the state data
            return surgery['DAYS_PRO_' + d[proc_num - 9]]
        else:
            return surgery['DAYS_PROC' + str(proc_num)]

# Calculates the length of stay after the first heart procedure by subtracting
# the day that procedure took place from the total length of stay
# (procs lists the procedure columns flagged as heart surgeries)
def get_los(row, procs):
    return row['LOSDAYS'] - min(get_days_from_proc(row, proc) for proc in procs)

The number of sepsis cases was calculated using the Agency for Healthcare Research and Quality’s methodology for postoperative sepsis (Pediatric Quality Indicator 10).
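The full PDI 10 definition includes exclusions we won’t reproduce here, but the core check, flagging a sepsis diagnosis on a surgical discharge, follows the same pattern as the other metrics. The code sets below are abbreviated examples for illustration, not the full AHRQ lists.

# Abbreviated example sepsis code sets; the AHRQ definition lists many more
# codes and additional exclusions that we don't reproduce here
sepsis_coding = {
    'ICD9': ['995.91', '995.92', '998.59'],
    'ICD10': ['A41.9', 'R65.20', 'R65.21']
}

# Checks if any of the diagnoses is a sepsis code
def has_sepsis(row):
    icd_coding = row['ICD_CODING']

    for diagnosis_header in diagnosis_headers:
        if row[diagnosis_header] in sepsis_coding[icd_coding]:
            return True

    return False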

The frequency of wound disruption was calculated by counting the number of cases with at least one diagnosis code that identifies disruption.

# Lists disruption code prefixes for ICD-9 and ICD-10 coding
disruption_coding = {
    'ICD9': '998.3',
    'ICD10': 'T81.3'
}

# Checks if any of the diagnoses starts with a disruption prefix
def has_disruption(row):
    icd_coding = row['ICD_CODING']

    for diagnosis_header in diagnosis_headers:
        if str(row[diagnosis_header]).startswith(disruption_coding[icd_coding]):
            return True

    return False

The frequency of postsurgical use of extracorporeal membrane oxygenation, or ECMO, was calculated by counting the number of discharges that include at least one procedure code that identifies ECMO use on the same day as or after the first heart surgery.

# Lists ECMO code for ICD-9 and ICD-10 coding
ecmo_coding = {
    'ICD9': ['39.65'],
    'ICD10': ['5A15223']
}

# Checks if any of the procedures are ECMO
def has_ecmo(row):
    icd_coding = row['ICD_CODING']

    for procedure_header in procedure_headers:
        if row[procedure_header] in ecmo_coding[icd_coding]:
            # Checks if ECMO happened on the same day as or after the first heart surgery
            if get_days_from_proc(row, procedure_header) >= (row['LOSDAYS'] - row['INITHPROC']):
                return True

    return False

We ran those numbers for every hospital with a children’s heart surgery program in Florida for every year from 2008 through 2017.
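For readers who want the mechanics: with the per-discharge flags in hand, the yearly rates reduce to a group-by. Below is a minimal pandas sketch; the column names HOSPITAL, YEAR and DECEASED are placeholders for the fields in our results file, not necessarily the exact names in our scripts.

import pandas as pd

# surgeries is the pool of flagged discharges, one row per surgery, with
# hypothetical columns HOSPITAL, YEAR and DECEASED (True/False)
def yearly_mortality(surgeries: pd.DataFrame) -> pd.DataFrame:
    grouped = surgeries.groupby(['HOSPITAL', 'YEAR'])

    summary = grouped.agg(
        cases=('DECEASED', 'size'),
        deaths=('DECEASED', 'sum')
    )
    summary['mortality_rate'] = summary['deaths'] / summary['cases'] * 100

    return summary.reset_index()

The other complication rates are computed the same way, swapping the relevant flag into the numerator.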

The results showed stark increases at All Children’s in 2017.

The mortality rate at the state's children's heart centers is typically around 3 percent. That also was the case at All Children’s before 2017. Last year, it increased to 9.6 percent.

The median length of stay after surgery at All Children’s was 14 days, twice the 10-year statewide figure of 7 days.

The rate of postoperative sepsis at All Children’s was 8.4 percent, more than three times the Florida average of 2.5 percent.

The rate at which surgical wounds broke open was 8.4 percent, almost five times the Florida average of 1.7 percent.

The rate of ECMO heart-support use went up to 9.6 percent; the statewide rate is around 2.3 percent most years.

For four of the five metrics, All Children’s in 2017 had the worst outcomes of any hospital during the 10-year period we analyzed. The only exception was ECMO use; the St. Mary’s Medical Center heart program had higher ECMO use in 2014, the last full year before it was closed. (A link to our results data file is at the bottom of this page.)

Billing data isn’t perfect, but it is a common way to measure hospital performance when other data isn’t available.

Some researchers and medical experts have raised concerns about the accuracy of discharge data. They point out that this data is primarily used for billing and is submitted by billing coders, not medical practitioners. These experts prefer clinical data sources, which are built from medical records.

That's how the Society of Thoracic Surgeons gets its data; hospitals collect clinical data for each heart surgery patient and provide it to the society directly. Dr. Jeffrey Jacobs, a pediatric surgeon and deputy director of All Children’s Heart Institute, is the chair of the society's Workforce on National Databases.

All Children’s declined to release the data it submitted to the society for 2017, pointing only to the four-year averages published online. Those show a small increase from 3.41 percent in 2016 to 3.55 percent in 2017.

The society declined to release one-year numbers for 2017, or to allow the Times to purchase them. It also wouldn’t provide the historical four-year averages that were previously published on its site.

There is a long history of using administrative data to measure hospital performance. Hospital Compare, a site run by the U.S. Centers for Medicare & Medicaid Services, provides dozens of quality measures for hospitals based on administrative data, many of which are also based on the metrics developed by the Agency for Healthcare Research and Quality. (Hospital Compare doesn’t have statistics on All Children’s because it does not cover children's hospitals.)

Administrative data has also been successfully used to calculate mortality rates for pediatric heart surgeries.

The method we followed was developed and used by the federal government and the Boston Children’s Hospital, a top program in pediatric heart surgery.

The National Quality Forum, a nonprofit organization that chooses and recommends health quality measures, endorsed the ICD-9-CM federal method that uses administrative data. The forum also endorsed the Society of Thoracic Surgeons’ method, and the president of the society has called the forum’s endorsement the “gold standard.”

Jacobs has written about the accuracy problems administrative data presents (we will address these below). Nonetheless, he told reporters at the Philadelphia Inquirer in 2016 that administrative data is a “reasonable alternative” to clinical outcomes data.

Big changes in mortality rates measured with administrative data are “at least concerning, and merit further investigation,” Jacobs told the Inquirer.

The increase in 2017 is not because our administrative data analysis is undercounting heart surgery cases.

One of the most common concerns about administrative data is that it does not capture as many heart surgery cases as clinical data.

A research paper, coauthored by Jacobs, on the differences between administrative and clinical data says clinical data captures 96.8 percent of cases while administrative data captures between 85.5 and 90.9 percent.

In our analysis, for the latest two cycles of the Society of Thoracic Surgeons' data, the number of cases captured by administrative data was 81.6 percent of the society's statewide totals.

For All Children’s, administrative data captured 88.6 percent of the society's totals over the past four reporting cycles, back to 2011.

The research paper notes that using administrative data generally lowers mortality rates by an average of 4.7 percent. But it can occasionally increase a hospital's rate if too many successful surgeries are missed.

All Children’s gave us the Heart Institute's internal surgical counts for each of the past three years. For 2015 and 2016, the administrative data captured more than 87 percent of the totals provided by All Children’s.

The only exception was in 2017, the year we are focusing on. That year, the hospital said it had 106 cases. Our method identified 83, or 78 percent of the hospital’s total.

Using the hospital’s numbers doesn't change the trend. If you divide the deaths in the administrative data by the hospital’s count of surgeries — which assumes every patient our method didn’t capture survived — the mortality rate still triples from 2015 to 2017.

Our mortality rates closely match other published figures, and administrative data does not generally overcount deaths.

To ensure our analysis wasn’t overcounting deaths, we compared our results to published clinical data.

We were able to gather the Society of Thoracic Surgeons’ 2016 and 2017 public data reports, but the society would not release older reports. We used the Wayback Machine, a site run by the nonprofit Internet Archive, to access older reports for All Children’s.

In no four-year period did our analysis show more deaths than the society's data.

The administrative death rates were on average 0.07 percentage points lower than the society’s. The largest underestimate was 0.81 percentage points.

Most overestimates were under 0.6 percentage points. Only once did the analysis overstate mortality by more than that: Jackson Memorial Hospital’s 2013 to 2016 mortality figures, which were 0.92 percentage points higher in our data.

In addition, Nicklaus Children’s Hospital in Miami releases real-time clinical data, which allowed us to compare the one-year results we calculated for its program to Nicklaus’ clinical data.

Our analysis underestimated Nicklaus’ self-reported mortality in seven out of 10 years. The largest one-year overestimate was 1.48 percentage points in 2009.

If we decreased our calculation of All Children’s mortality by the largest overestimate we found, the hospital's figure would drop from 9.6 to 8.12 percent — still 2.9 times higher than the rate we calculated in 2015.

Our reporting backs up the data.

Our contract with the state prohibits us from using the data to identify the eight patients who died.

However, through other reporting, including interviews and reviews of records provided by patients’ families, the Times was able to confirm at least six patients died in 2017. We have no reason to believe we found every patient.

The increase is not because the hospital was performing more difficult procedures.

Surgeons prefer to use risk-adjusted statistics, which take into account the difficulty of the procedure and the health and age of the patient. That helps avoid penalizing doctors who treat the sickest patients.

We did not factor in risk adjustments because the federal government has not yet published a method to do so that works with the new coding system. Instead, we used what’s called “raw” mortality figures — the actual percentage of patients who died.

Nonetheless, we do not believe this explains All Children’s increase, for two reasons:

  1. Dr. Jonathan Ellen, the All Children’s CEO, told us in April that the Heart Institute began purposefully avoiding complicated cases in the second or third month of 2017. Only one death identified by our analysis occurred in the first quarter of 2017.
  2. The four-year clinical reports by the Society of Thoracic Surgeons are split into categories by surgical difficulty. Comparing the 2017 report to the 2016 report shows All Children's didn't significantly increase its percentage of high-complexity cases. In fact, the hospital’s risk-adjusted “expected mortality” — based on the society’s estimation of its caseload difficulty — dropped.

You can see our full data and the computer code we wrote to produce it online.

You can download the computer scripts we used to produce these results and the calculations they yielded. You won’t be able to run the scripts without the underlying data from the Florida Agency for Health Care Administration, which we are not allowed to publish under the contract we signed with the agency to protect patient privacy.

Our analysis was conducted to better understand the outcomes All Children's patients experienced in 2017. The steps we listed to vet our conclusions are specific to All Children's. Similar work would be required to evaluate the quality of other Florida heart programs.
