MUSINGS ON IRAQ: Major Flaws With The Lancet Reports On Iraqi Deaths, Part II

The British medical journal The Lancet published two reports on the estimated deaths caused by the Iraq war, one in 2004 and one in 2006. They were generally well received by the public and media, but behind the scenes they started a long debate amongst academics and researchers about their results. This happened because the Lancet papers had casualty figures far above almost every other report or survey done before or after. Much of the controversy centered on statistical anomalies, but it also included how the papers presented other work on fatalities after the 2003 invasion. What this highlighted were major flaws in the two Lancet reports that largely discredited their findings.

Both Lancet reports were based upon cluster sample surveys. The first Lancet was conducted in September 2004 in Iraq, using 33 clusters made up of 30 households each across 11 of Iraq’s 18 provinces. Teams would break the governorates up into districts, and randomly select from them. They would then randomly select a neighborhood within each district, then a street, and then a house to start the poll on. Survey teams interviewed 988 households that included 7,868 people. Only 5 houses, or 0.5%, refused to participate. Afterward, the data was collated and extrapolated to cover Iraq’s entire population to determine a mortality rate for the country before and after the March 2003 U.S. invasion. An estimate for excess deaths after the fall of Saddam Hussein was determined by subtracting the pre-invasion mortality rate from the post-invasion one. The first Lancet came up with a rate of 5.0 per 1,000 people pre-March 2003, and 12.3 per 1,000 afterward. Because 71% of the deaths recorded were in Fallujah, that city was considered an outlier and not included in the final results. Without Fallujah, the authors came to a 7.9 per 1,000 post-invasion mortality rate. That meant there was an excess mortality rate of 2.9 per 1,000 due to the overthrow of the old regime. That equaled an estimated 98,000 excess deaths with a range of 8,000-194,000. The second Lancet, conducted in 2006, was slightly larger, covering 50 clusters with 40 households each. Teams interviewed 1,849 households in total covering 12,801 people in 16 provinces using the same random methods as the first. 16 houses had no one home, and 15 did not want to take the survey. Three clusters were also dropped because of problems, meaning the final results were based upon 47 in total. A pre-invasion estimated mortality rate of 5.5 per 1,000 was subtracted from 13.2 per 1,000 post-Saddam to come up with 7.8 per 1,000 excess deaths in the 40 months after the fall of the old regime.
That meant the most probable figure was 654,965 excess deaths from March 2003 to July 2006, with a range of 392,979-942,636. The Lancet authors claimed that their 2006 results were very close to their 2004 ones, proving their validity. Those two estimates were far and away the highest made about possible deaths caused by the Iraq War with one exception, a 2007 Opinion Research Business (ORB) survey. The Lancet reports were immediately picked up by the press, and became part of the debate about the costs of the war. Much of the public discussion ignored the heavy criticism the Lancet papers came under from academics and researchers.
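The subtraction behind these estimates can be sketched in a few lines of Python. This is only an illustration, not the authors’ actual calculation: it assumes the quoted rates are annual deaths per 1,000 people, and the `excess_deaths` helper and its population argument are hypothetical, since the population figures the studies used are not given here. Note that the rounded rates quoted above give 7.7 per 1,000 for the second survey rather than the reported 7.8, presumably because the published figure was computed from unrounded rates.

```python
def excess_rate(pre_rate, post_rate):
    """Excess mortality rate: post-invasion rate minus pre-invasion rate (per 1,000)."""
    return post_rate - pre_rate


def excess_deaths(rate_per_1000, population, months):
    """Scale an annual per-1,000 excess rate to a head count.

    Assumes a constant population over the period (a simplification).
    """
    return rate_per_1000 / 1000 * population * months / 12


# Rates quoted in the text (the first survey's post-invasion rate excludes Fallujah).
first = excess_rate(5.0, 7.9)
second = excess_rate(5.5, 13.2)

# Hypothetical population of 1,000,000 over 12 months, for illustration only.
example = excess_deaths(first, 1_000_000, 12)
print(round(first, 1), round(second, 1), round(example))
```

The point of the sketch is that the final death toll scales directly with the excess rate, so any error in either the pre-invasion or post-invasion rate carries straight through to the headline number.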

One problem was over the confidence interval the first Lancet came up with for estimated excess deaths after the U.S. invasion. Fallujah was not included in that paper, because it was considered an outlier with almost ¾ of the recorded deaths. David Kane, a Fellow at the Institute for Quantitative Social Science at Harvard, added the city back into the Lancet figures, and found that the range would expand from 8,000-194,000 to include 0. That would mean the “null hypothesis” could not be rejected, because there would be a very slight chance that there were no excess deaths caused by the U.S. invasion. That was obviously false, and would undermine the entire Lancet results. Kane speculated that the authors knew this, and therefore purposely did not include Fallujah in their findings to avoid this dilemma. To add to the controversy, the Lancet writers have consistently refused to let others know what their confidence interval would be if it included Fallujah. The sharing of data is a common practice in academic work so that others can re-create, test, and critique research. In the case of the two Lancet surveys, the authors have only partially released their data, and only to specific people upon request rather than making it all public. One Lancet researcher, Dr. Les Roberts, even said that he never wanted to share his work with anyone, and Gilbert Burnham was censured by the American Association for Public Opinion Research in 2009 for only giving partial answers about the studies. That has only raised concerns that there is something wrong with the Lancet surveys, since no one has ever gained full access to the data, and that the authors might be covering something up.

A second issue was the estimated pre-war mortality rate used in the Lancet studies. The first Lancet used an estimated pre-war mortality rate of 5 per 1,000 people, while the second used a 5.5 per 1,000 rate. Fred Kaplan of Slate talked with Osborne Daponte of Yale University’s Institution for Social and Policy Studies, who did research into Iraq’s population figures and estimated, based upon U.N. data, that the mortality rate in 2002 was higher than 5 per 1,000. The director of the Iraqi Living Conditions Survey, done in October 2004 for the United Nations, said that the U.N. estimated 9 deaths per 1,000 before the invasion. If they were correct, there would not have been as big a difference with the first Lancet’s post-invasion mortality rate of 7.9 per 1,000 or the second’s 13.2 per 1,000, meaning not as many people died as a result of the U.S. invasion as the Lancet papers suggested. In fact, if the pre-war rate was 9 deaths per 1,000, the first paper would have to use the mortality rate of 12.3 deaths per 1,000 that included the Fallujah outlier to even come up with a positive excess death figure. On the one hand, that could mean that the Lancet reports underestimated pre-war fatality rates, and thus could have underreported the post-war rates as well, meaning even more people died as a result of the Iraq War. On the other hand, given the lack of oversight and the various violations of protocol and methodology shown by the survey teams, it’s also possible that some of them faked their results, which inflated the post-March 2003 death figures.
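To make the sensitivity to the baseline concrete, here is a small sketch that recomputes the excess rate under each pre-war figure mentioned above. It uses only the rates quoted in this article; the names are illustrative, and the 9 per 1,000 figure is the U.N. estimate as reported here, not something verified independently.

```python
# Post-invasion mortality rates quoted in the article (deaths per 1,000).
post_rates = {
    "Lancet 1 without Fallujah": 7.9,
    "Lancet 1 with Fallujah": 12.3,
    "Lancet 2": 13.2,
}

# Candidate pre-war baselines: the two Lancet figures and the U.N. estimate.
baselines = [5.0, 5.5, 9.0]


def excess_table(post_rates, baselines):
    """Return {(label, baseline): excess rate} for every combination."""
    return {
        (label, base): round(post - base, 1)
        for label, post in post_rates.items()
        for base in baselines
    }


table = excess_table(post_rates, baselines)
for (label, base), excess in table.items():
    print(f"{label}: baseline {base} -> excess {excess:+.1f} per 1,000")
```

With a baseline of 9 per 1,000, the first survey’s rate without Fallujah yields a negative excess (7.9 minus 9 is -1.1), and only the rate that includes Fallujah stays positive.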

Several critics questioned the extremely high rates of death certificates the Lancet survey teams claimed they saw. The first Lancet said that 80% of the deaths recorded were backed by a death certificate, while the second Lancet had 92%. The Lancet authors wrote that this proved that the households were not lying about how many casualties they had suffered. This raised immediate questions. As of June 2006, the Iraqi Health Ministry and Baghdad morgue had only issued around 50,000 death certificates. If the second Lancet was correct, that would mean that approximately 500,000 certificates had been handed out, but never recorded. Roberts tried to counter this criticism by saying that the government was highly inefficient, and could not be counted on to accurately count the number of fatalities that had occurred. If the authorities were so unreliable, however, how could 80-92% of the deaths recorded by the Lancet teams have death certificates? This could be evidence that the survey workers faked their results or did not follow protocol. Kane of Harvard inquired about this issue with the Lancet authors, and was given access to some of their data. He used that to see whether there was a relation between when survey teams said they forgot to ask for a death certificate and the date and type of the casualty, and found a correlation. He believed that meant the Lancet workers asked for certificates sometimes and not others. These kinds of inquiries led the authors to release one disk of data from the second Lancet to a select group. It showed that 22 times the survey teams asked for a death certificate when a household said they had suffered a violent death, but were not provided with one. All of those cases occurred in Ninewa province. 23 other types of deaths did not have certificates, but were spread across eight governorates. It was suspicious that every missing record for a violent death was in one area.
Another anomaly was that 24 deaths were recorded for a car bombing in a Baghdad cluster. The survey team did not ask for death certificates for any of those deaths despite that being protocol, but did for one non-violent fatality in that same cluster. Those 24 casualties were split between 18 houses on the same strip of streets. That meant that within one continuous string of 40 houses, the survey workers found 18 that suffered deaths from a single bombing incident, for which they did not ask for documentation. Altogether these brought up questions about the conduct of the survey teams. First, the Lancet authors can’t have it both ways about the Iraqi government. Either it is incompetent and not a reliable source for deaths in the country, or it does an extremely good job in handing out death certificates. If the latter were true, then researchers could just check the government records to confirm or deny the Lancet findings, when in fact the official figures do not support them. Second, the limited data that the Lancet authors shared showed very suspicious trends. The writers have admitted that their published methodology was not always followed in the field, that they were actually in the dark about just what happened in Iraq, and have been censured for not following their own protocol. It’s for those reasons that many have challenged the validity of the fieldwork done in 2004 and 2006.

Both Lancet studies had very suspicious response rates. The response rate is the share of people contacted for a survey who then agreed to complete it. In 2004, 99.5% of the 988 households visited finished the surveys. In 2006, it was slightly lower at 98.3%. Kane of Harvard went through other surveys conducted around the world and in Iraq, and could not find any with such high response rates. For instance, a September 2006 World Public Opinion poll done in Iraq using face-to-face interviews with 1,150 adults in all 18 of Iraq’s provinces had a 67% response rate. Stephen Apfelroth of the Department of Pathology at Albert Einstein College of Medicine questioned whether 30 consecutive houses could be found with families at home willing to talk 99.5% of the time as the first Lancet claimed. He speculated that rather than following protocol, the teams just interviewed those houses where people were willing to talk, and thus achieved their extremely high response rate. Fritz Scheuren, the former president of the American Statistical Association, also questioned the credibility of such high participation recorded in the second Lancet. That survey was done in 2006 during the middle of the civil war, leading economist Michael Spagat of Royal Holloway, University of London to question how the teams were able to find 40 homes in a row with everyone home and willing to answer the survey questions when the sectarian fighting was going on. Finally, the survey teams in 2004 and 2006 were largely the same group of people from al-Mustansiriya University in Baghdad. They were fluent in Arabic and English, but none of them spoke Kurdish. In the first survey, Sulaymaniya was included, and in the second, Sulaymaniya and Irbil were polled. There are many areas of Kurdistan where people don’t speak Arabic, which raises the question of how the teams were able to complete the survey there. Again, these issues bring up the conduct of the survey teams.
Were these additional examples of the workers not following protocol, or were they simply making up some of their results? Since there was very little oversight of their work, and the Lancet authors claimed to have destroyed some of their records and have only selectively released other parts of their data, there is no way to tell. That last point is especially telling, because the Lancet writers could quell a lot of this controversy by simply sharing their statistics, but have largely refused to do so.
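The quoted response rates can be checked against the household counts given earlier. The sketch below is a rough reconstruction, not the authors’ own calculation: it assumes the 2004 figure of 988 is households visited (with 5 refusals) and the 2006 figure of 1,849 is households interviewed (with 16 absent and 15 refusing), and the `response_rate` helper is hypothetical.

```python
def response_rate(completed, approached):
    """Share of approached households that completed the survey, in percent."""
    return 100 * completed / approached


# 2004: 988 households visited, 5 refused.
rate_2004 = response_rate(988 - 5, 988)

# 2006: 1,849 interviewed, plus 16 empty houses and 15 refusals.
rate_2006 = response_rate(1849, 1849 + 16 + 15)

print(f"2004: {rate_2004:.2f}%  2006: {rate_2006:.2f}%")
```

Under those assumptions the rates come out at about 99.5% and 98.4%, close to the 99.5% and 98.3% quoted above, and far above the 67% of the World Public Opinion poll.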

Physicists Sean Gourley and Neil Johnson of Oxford University, and economist Michael Spagat, postulated that the second Lancet suffered from a “main street bias.” The second Lancet said that survey teams would randomly pick a main street, and then select a major cross street to start their polling work on. Main streets have markets, cafes, restaurants, military patrols, checkpoints, etc., all of which are common targets of attacks in Iraq. That would compare to streets far away from those major intersections, which would largely be residential, and would have no chance of being visited. That meant the survey workers would only interview households near a main thoroughfare, which would have a far greater chance of witnessing violence, and thus inflate the casualty figures recorded. One report by Gourley, Johnson, Spagat, and others speculated that this main street bias could increase the death rate of the Lancet study by three times. It also meant the survey was not completely random, which rebutted the claims that all houses in a selected area had an equal chance to be interviewed and that violent and non-violent areas were included in the same amounts in the study. Burnham and Roberts replied that non-main street areas were included in the survey, and Burnham added that the teams moved far away from the main streets. Burnham also wrote that most violence happened outside the home, so it didn’t matter where the survey teams went in a neighborhood. One Iraqi team member said that they did not start on main streets, but rather business streets, and then chose a residential cross street, which would include peripheral areas away from main streets. It was convenient for the Lancet authors and workers to claim that they avoided the main street bias after the fact. However, they did not write about it in the second Lancet paper, and have not shared any data to support their claims.
Gourley, Johnson, and Spagat’s argument is that the majority of violence, like car bombings, occurs on main streets, so what block the survey team decided to start on did matter. The closer to one of those main thoroughfares the teams began, the higher the likelihood that they would encounter households that experienced an attack. It didn’t appear that the teams had detailed maps with them when they selected a neighborhood, and the authors claimed that they were able to interview 30-40 houses in just a few hours a day, which meant they didn’t have the time to thoroughly map out an area beforehand. It was most likely that the published account of working off of main streets was correct after all, opening the possibility of greatly overreporting and overestimating the number of deaths that occurred in post-invasion Iraq.

A map showing how the "main street bias" misses entire sections of neighborhoods (Journal of Peace Research)

The Iraq Family Health Survey, done by the United Nations, directly challenged the findings of the second Lancet. The health study was published by the Iraq Family Health Survey Study Group in the New England Journal of Medicine in January 2008. The report was based upon a survey of 9,345 households in Iraq, and estimated 151,000 deaths from March 2003 to June 2006 with a possible range of 104,000-223,000. It compared those findings to the second Lancet and Iraq Body Count, and thought that the Lancet overestimated deaths considerably. For instance, the second Lancet broke up its figures into three time periods. The last was from June 2005 to June 2006, and had an average of 925 violent deaths per day. Compared to that figure, the Iraq Family Health Survey would have missed 87% of those fatalities, while Iraq Body Count would have missed 90%. The study group thought it was highly improbable for so many violent incidents to go unrecorded. The Family Health Survey also had much higher quality control to ensure rigor than the second Lancet, which had no real oversight of its fieldwork. That gave the Family Health Survey far more credibility.
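The gap between the sources can be restated as daily figures. This is a back-of-the-envelope sketch using only the numbers quoted above (925 violent deaths per day, with 87% and 90% missed); the implied daily counts are derived here for illustration and are not figures published by either source, and the helper name is hypothetical.

```python
LANCET_DAILY_VIOLENT_DEATHS = 925  # June 2005 to June 2006, per the second Lancet


def implied_daily_count(reference_daily, share_missed):
    """Daily deaths a source must have recorded if it missed `share_missed` of the reference."""
    return reference_daily * (1 - share_missed)


ifhs_daily = implied_daily_count(LANCET_DAILY_VIOLENT_DEATHS, 0.87)
ibc_daily = implied_daily_count(LANCET_DAILY_VIOLENT_DEATHS, 0.90)

print(f"Family Health Survey: about {ifhs_daily:.0f} per day")
print(f"Iraq Body Count: about {ibc_daily:.0f} per day")
```

In other words, the percentages quoted imply roughly 120 and just over 92 violent deaths per day for the two other sources, against the 925 per day from the second Lancet.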

The second Lancet and the Bloomberg School of Public Health at Johns Hopkins didn’t seem to agree, because they misrepresented the findings of the Iraq Family Health Survey Study Group and other reports that mentioned Iraqi casualties to support Roberts et al. The second Lancet included a chart that claimed its trends in deaths were in line with those of the Defense Department and Iraq Body Count even though the exact figures were different. First, there were no similarities with Iraq Body Count’s data. Second, the Pentagon’s numbers were for Iraqi casualties, which included both the dead and wounded, and therefore were not comparable to the Lancet either. The second Lancet also included questionable figures. It claimed the Defense Department recorded 12,000 Iraqi deaths from March 2003 to April 2004, when no such figure existed. It also mentioned a report of 37,000 civilian casualties from March to September 2003, but the source was a letter posted to a blog that was picked up by Al Jazeera, which was never confirmed. The Bloomberg School used a 2007 survey by the BBC, ABC, and NHK, which they claimed supported the Lancet findings. The poll did not include any questions about deaths. The closest to it was about whether anyone had been physically harmed by violence. 17% said yes. Physical harm could include deaths, being wounded, or kidnapping. The 2007 Opinion Research Business (ORB) survey, which estimated 1,033,000 Iraqi deaths, was quoted as well. That research was not peer reviewed, and the person in charge of it had little training or experience in fieldwork and no scientific standing. Finally, the Iraq Family Health Survey Study Group was used, even though it explicitly criticized the Lancet findings for overestimating the possible number of deaths after the U.S. invasion. At best, this was an example of shoddy research to try to prove the Lancet paper. At worst, it was the deliberate manipulation of others’ work.

This series of anomalies raises serious questions about the validity of both Lancet papers. There is evidence that the authors overestimated the number of post-invasion Iraq deaths, did not include data that contradicted their findings, and misrepresented other reports about casualties during the Iraq War to bolster their argument. The Iraqi survey teams might not have followed protocol and the methodology, and some may have faked their results. The responses of the Lancet writers to their critics have been a series of convenient and contradictory statements, and their refusal to openly share all of their data only adds to the suspicions that their work is deeply flawed. Given all that, there is little reason to give credence to the Lancet reports, especially because there are several other surveys of Iraq that covered a far larger number of people, and were much more rigorous, and thus are more believable. Those should be consulted first, while the two Lancet papers should be dismissed.

SOURCES

Apfelroth, Stephen, “Mortality in Iraq,” The Lancet, 3/26/05

BBC, “Iraqi death researcher censured,” 2/4/09

- “Lancet author answers your questions,” 10/30/06

Bohannon, John, “Iraqi Death Estimates Called Too High; Methods Faulted,” Science, 10/22/06

Boseley, Sarah, “UK scientists attack Lancet study over death toll,” Guardian, 10/23/06

Burnham, Gilbert, Doocy, Shannon, Dzeng, Elizabeth, Lafta, Riyadh, Roberts, Les, “The Human Cost of the War in Iraq, A Mortality Study, 2002-2006,” Bloomberg School of Public Health Johns Hopkins University, School of Medicine Al Mustansiriya University, 9/26/06

Burnham, Gilbert, Lafta, Riyadh, Doocy, Shannon, Roberts, Les, “Mortality after the 2003 invasion of Iraq: a cross-sectional cluster sample survey,” The Lancet, 10/11/06

Burnham, Gilbert and Roberts, Les, “Counting Corpses The Lancet number crunchers respond to Slate’s Fred Kaplan (And Kaplan replies),” Slate, 11/20/06

Daniel, “Lancet roundup and literature review,” Crooked Timber, 11/11/04

Dardagan, Hamit, Sloboda, John, and Dougherty, Josh, “Reality checks: some responses to the latest Lancet estimates,” Iraq Body Count, 10/16/06

Dissident 93 Blog, “Project Censored as censors?” 10/20/08

Giles, Jim, “Death toll in Iraq: survey team takes on its critics,” Nature, 3/1/07

Goldin, Rebecca, “The National Journal Takes on the Lancet Iraq Casualty Figures,” Stats Bloc, 1/8/08

Guha-Sapir, Debarati and Degomme, Olivier, “Estimating mortality in civil conflicts: lessons from Iraq,” Centre for Research on the Epidemiology of Disasters, June 2007

Iraq Body Count

Iraq Family Health Survey Study Group, “Violence-Related Mortality in Iraq from 2002 to 2006,” New England Journal of Medicine, 1/31/08

Johnson, Neil, Spagat, Michael, Gourley, Sean, Onnela, Jukka-Pekka, and Reinert, Gesine, “Bias in Epidemiological Studies of Conflict Mortality,” Journal of Peace Research, September 2008

Johns Hopkins Bloomberg School of Public Health, “Review Completed of 2006 Iraq

Kane, David, “Kurdish Speakers?” Lancet On Iraqi Mortality, 3/12/07

- “Old Posts,” Lancet On Iraqi Mortality, 6/4/09

- “Plausible Response Rates,” Lancet On Iraqi Mortality, 3/12/07

- “Thoughts From a Lancet Skeptic,” Ideas In Action, 12/7/07

- “Timing Comments,” Lancet On Iraqi Mortality, 3/16/07

- “Tweaked,” Lancet On Iraqi Mortality, 1/13/08

Kaplan, Fred, “100,000 Dead-or 8,000 How many Iraqi civilians have died as a result of the war?” Slate, 10/29/04

- “Number Crunching Taking another look at the Lancet’s Iraq study,” Slate, 10/20/06

Moore, Steven, “655,000 War Dead? A bogus study on Iraq casualties,” Wall Street Journal, 10/18/06

Morley, Jefferson, “Is Iraq’s Civilian Death Toll ‘Horrible’ – Or Worse?” Washington Post, 10/19/06

Munro, Neil, “Counting Corpses,” National Journal, 1/4/08

- “Data Bomb,” National Journal, 1/4/08

Onnela, J.-P., Johnson, N.F., Gourley, S., Reinert, G., and Spagat, M., “Sampling bias in systems with structural heterogeneity and limited internal diffusion,” EPL, January 2009

Roberts, Les, Lafta, Riyadh, Garfield, Richard, Khudhairi, Jamal, Burnham, Gilbert, “Mortality before and after the 2003 invasion of Iraq: cluster sample survey,” The Lancet, 10/29/04

Shone, Robert, “Dubious polls: How accurate are Iraq’s death counts?” The Comment Factory, 6/30/10

- “Scientists criticize Lancet 2006 study on Iraqi deaths,” Media Hell, 2007

Soldz, Stephen, “Nature on Iraq mortality study,” Psyche, Science, and Society, 7/1/07

Spagat, Michael, “Ethical and Data-Integrity Problems In The Second Lancet Survey of Mortality in Iraq,” Defense and Peace Economics, February 2010

- “Mainstreaming an Outlier: The Quest to Corroborate the Second Lancet Survey of Mortality in Iraq,” Department of Economics, University of London, February 2009

Van der Laan, Mark, “‘Mortality after the 2003 invasion of Iraq: A cross-sectional cluster sample survey’, by Burnham et al (2006, Lancet, www.thelancet.com): An Approximate Confidence Interval for Total Number of Violent Deaths in the Post Invasion Period,” Division of Biostatistics, University of California, Berkeley, 10/26/06

Van der Laan, Mark and de Winter, Leon, “Lancet,” November 2006

- “Statistical Illusionism,” U.C. Berkeley, 2006

Thursday, August 2, 2012
