Chapter 23. Deliberate bias: Conflict creates bad science

When proper scientific procedure is undermined by conflicting goals so that it results in deception, we say it is biased. This form of bias is prevalent in advertising - companies universally advocate their products, emphasizing product assets while concealing product faults and concealing the advantages of competitor products. You don't expect an advertisement from a private company to offer a fair appraisal of the commodity. But bias even exists in the way that states promote their lotteries by advertising the number of winners and money "given away" without telling you the number of losers and money taken in.

Deliberate bias occurs in science as well as in business and politics. The potential for bias arises when a scientist has some goal other than (or in addition to) finding an accurate model of nature, such as increasing profits, furthering a political cause, or protecting funding. As an example, the environmental consulting firm that renders an opinion about the environmental effects of the latest subdivision may not give a completely accurate assessment. The future work that this firm receives from developers is likely to depend on what it says. If the developers don't like the assessment, they will likely find another environmental firm next time. Thus there is a conflict between obtaining an unbiased model and making the largest profit possible. In these cases, the scientists are motivated to present biased arguments. There are many situations which present a conflict between scientific objectivity and some other goal.

Drugs and medicine

We take for granted an unlimited supply of medicinal drugs. If we get pneumonia, gonorrhea, HIV, or cancer, the drugs of choice are invariably available. They may not cure us, but whatever drugs we know about (that have been approved) are in abundant supply.

For the most part, this abundance of and reliance on drugs comes from public trust of health care. We don't imagine that our doctors try to prescribe us useless or unnecessary drugs (if anything, the patient often requests drugs when they are unnecessary). But in reality, many drugs ARE unnecessary, and some drugs are no better than cheaper alternatives. The Food and Drug Administration (FDA) is charged with approving new foods and drugs for the U.S. In 1992, it was approving about 20 new drugs a year but regarded only about 20% of those as true advances. So many drugs are no better than alternatives (many are obviously at least slightly worse than alternatives). And physicians often don't have the evidence to know which drugs are best.

The goals of consumers are in conflict with those of drug companies in some respects.

The consumer wants drugs that are

If two drugs are equally effective, we want the cheaper one. We may even not want the most effective drug if a cheaper one will do the trick.

But the goals of any drug company are different:

which may involve

It now costs up to hundreds of millions of dollars to get a drug approved by the FDA. Much of the cost is in research and trials, but even FDA consideration itself costs millions. So it is not cheap. Most important is time, because the sooner a drug hits the market, the sooner the company reaps the benefits. So drug companies have strong incentives to market any product that is approved by the FDA -- once approved, the major costs of money (and time) have already been borne.

Of course, it does not behoove a company to market a harmful product -- liability costs can be quite high. But most products that pass all the hurdles of FDA approval can be regarded as harmless at worst. The drug company has a very strong incentive to market its approved products regardless of whether the consumer benefits or not. One of the most economically successful drugs ever was an ulcer medicine that reduced suffering. It did not cure ulcers but instead had to be taken as long as the patient had the ulcer. Research later found that most ulcers were caused by a bacterium and that treatment with antibiotics cured the ulcer. So the original ulcer treatment was based on a misunderstanding of the cause of ulcers.

The drug industry is a relatively new one. Yet it is also a huge one economically. Perhaps as a consequence of this high-dollar label, its practices have come under scrutiny in the last decade or so. In spite of the public trust of health care in this country, many practices of the drug industry are not in the best interest of the patient. They instead seem to be directed at promoting and selling products, and for a prescription drug, the key player in drug sales is the physician. Drug companies have thus made huge efforts to influence physicians toward selling company products. Some of the major (legal) practices that have been uncovered include:

1) Drug companies have paid for university research on their products and have then blocked publication of unfavorable results and/or cut continued funding of the work when the results began to look bad.

2) Pharmacy sales people routinely visit physicians at work, offering them free lunches, free samples of medicines, gifts, information to promote their products, and notepads and pens with company logos. (Next time you visit a physician, look around the inner rooms for evidence of company logos on charts, pens, pads, and so on.)

3) To maintain their licenses, physicians are required to take courses in continuing medical education (CME). These courses are often sponsored and paid for by drug companies in exotic locations and with hand-picked speakers who provide favorable coverage of company products.

4) Drug companies publish ads in medical journals that look and read like research articles. These ads promote products.

These practices are not in the interest of good medicine, unless one assumes that what is good for the drug company is good for us.

DNA

A second, well-documented case in which conflict is manifested is over DNA typing. These examples may not reflect current debate over DNA technology, but one should use them to appreciate the strong potential for conflict over any scientific issue in the legal system.

DNA typing was rushed into the U.S. criminal system before guidelines were set for proper DNA typing procedures. Consequently, there were varying levels of uncertainty in the use of these methods by law enforcement agencies and commercial labs, and these agencies and labs were reluctant to admit the uncertainty. The manifestation of conflict over DNA evidence was thus heated and surfaced in the popular press on several occasions. We introduce this material in a prospective manner - by first imagining how the prosecution, defense, and forensic lab can be expected to behave to achieve their goals in ways that are contrary to fair scientific procedure. You have already been exposed to the nature of the models and data in DNA typing, so now the issue is how the different legal parties deal with the problems in evaluation and ideal data.

If a case that uses DNA typing has come to trial, we can assume that the DNA results support the prosecution's case. There are thus three parties in conflict:

Prosecution <<< (in conflict with) >>> Defense <<< (in conflict with) >>> DNA lab

We can assume that the DNA lab's results support the Prosecution's case, or there would not be a trial, so the conflict will lie between the Defense and the other two agencies. Now consider how this conflict might be manifested.

I) What might the prosecution do to improve the chances of reaching its goals?

II) With respect to the errors and uncertainties of DNA evidence in specific cases:

Evidence from DNA cases

The case histories available from the last 4-5 years of DNA forensics verify many of these expectations. In particular, DNA testing has omitted such basic elements as standards and blind procedures (I.1 above); the prosecution in the Castro case ignored inconsistencies in the evidence (I.5, I.6); the lab in the Castro case overstated the significance of a match and defended such practices as failing to include a male control for the male-specific probe (III.2, III.3). Defense and prosecution agencies definitely keep lists of sympathetic witnesses (I.3, II.1), and defense agencies indeed choose witnesses to challenge the nature of DNA evidence based on its (necessarily false) assumptions (II.3). And finally, harassment by the prosecution of experts who testify for the defense is well documented, both in the courtroom and outside (I.2). This harassment includes character assassination on the witness stand, implied threats to personal liberties for witnesses who were in the U.S. on visas, and contacts made to journal editors to prevent publication of papers submitted by the witness. Some of these cases have been described in the popular press, and others are known to us through contacts with our colleagues. These latter manifestations of conflict don't make sense in the context of a single trial, but they stem from the fact that networks exist in which prosecuting attorneys share information and parallel networks exist among defense attorneys. Expert witnesses are often used in multiple trials across the country, so any expert witness who is discouraged at the end of one trial will be less likely to participate in future trials, and conversely, an expert who does well in a trial may be more likely to participate in future trials.

The suggestions of harassment have even extended to scientists who merely publish criticisms of forensic applications of DNA typing. In the 1991-92 Christmas break, two papers were published in Science on opposite sides of the DNA fingerprinting conflict (Science 254: 1745-50, and 1735-39). At the same time, news items were also published in Science, Nature, The New York Times, and The Washington Post, in which this conflict was aired in full detail. The authors of the paper opposing use of current DNA typing methods (Lewontin & Hartl) were phoned by a Justice Department official and allegedly threatened with retaliation (having their federal funding jeopardized); the official denied the threats but did not deny the phone call. The Lewontin-Hartl article had apparently been leaked in advance of publication, and an editor for Science contacted the head editor to have a rebuttal published. However, it turned out that the editor requesting the rebuttal owns a patent for DNA fingerprinting and stands to benefit financially by forensic use of DNA typing methods. The two authors chosen for the rebuttal had been funded by the FBI to work on DNA fingerprinting methods. So, there appear to have been some serious conflicts of interest on at least one side of the issue.

This treatment of conflict in DNA trials has omitted a 4th category of people whose goals may conflict with the agencies above: the expert witnesses themselves. The goals of expert witnesses may be varied, including monetary (the standard rate is $1000/day in court), notoriety, and philosophical (e.g., some people volunteer to assist the defense on a no-cost basis, merely to ensure a fair trial).

EMFs

The last specific example to be discussed concerns some old examples of conflict between power companies and population exposure to high-intensity EMFs. The primary conflict has been between the people exposed to EMFs (e.g. those living under high-voltage power lines) and corporations or government agencies responsible for various aspects of electric power. In a series of three articles in The New Yorker (12, 19, and 26 June, 1989), Paul Brodeur described in a somewhat sensationalist manner how various agencies dealt with the potentially damaging evidence that EMF-emitting utilities may be harmful, the various extents to which they attempted to obscure and suppress evidence contrary to their views, along with the suspicious ill fates that befell the careers of scientists who testified in opposition to these organizations. For example, the Navy "classified" a report of such evidence to prevent its dissemination. The New York Power Company attempted to discredit scientists testifying against it. Witnesses testifying against the NYPC lost appointments and funding. The President of the National Academy threatened to sue the Saturday Review for an article criticizing its own investigation of this matter. Fortunately, the suppression of possible harmful effects of EMFs appears to have subsided, and power companies have even recently funded research into the problem.

The footprints of bias: Generalities

The attempt to deceive in science may take many specific forms. At a general level, arguments may avoid the scientific method entirely, or they may instead appear to follow scientific procedure but violate one or more elements of the scientific method (models, data, evaluation, revision). The next few sections of this chapter describe different kinds and levels of possible bias at a level that transcends specific cases. These generalities are useful in that they enable you to detect bias without knowing the specifics of the situation.

The standard scientific approach to evaluating a model is to gather data. If you suspect bias (e.g., you doubt the claim of support for a model), the ideal approach is to simply gather the relevant data yourself and evaluate the claim. But this approach requires time that none of us have (we can't research everything). In many cases, blatant examples of bias can be detected by noting some simple features of a situation.

Look for Conflict of Interest

The first and easiest clue to assist you in anticipating deliberate bias is conflict of interest. If another party's goal differs from your goal, and your goal is to seek the scientific "truth", then there is a good chance that that party is biased -- just as you may be biased if YOUR goal differs from seeking scientific truth. Service on many Federal panels requires a declaration of all conflicts of interest in advance (and you will be excused from consideration where those conflicts lie). That is, the government avoids the mere appearance of bias based on the existence of conflict, without looking for evidence of actual bias. However, in our daily lives, we are confronted with conflict at every turn, and we can't simply avoid bias by avoiding interactions involving conflict (e.g., every time you make a purchase, there is a conflict of interest between you and the seller). Thus, being aware of conflict is a first step in avoiding bias, but you can also benefit by watching for a few symptoms of bias.

Non-scientific arguments (blatant bias)

Sometimes, someone is so biased that they resort to lines of reasoning and argumentation that are clearly in violation of science. These cases are easy to expose, because they can be detected without even looking at data or analysis. And many of them are already familiar to you, as given in the following table:

Arguments in Violation of the Scientific Method

Appeal to authority

Appeal to authority is the defense of a model by indicating that the model is endorsed by someone well known (an authority). A model should stand on its own merits. The fact that a particular person supports the model is irrelevant, though the specifics of what they have to say may assist you in evaluating the model.

Character assassination of opponent

Character assassination is the attempt to discredit someone's character (e.g., point out that they associate with undesirable people, etc.). The character of somebody is irrelevant to the evidence they present that supports or refutes the model. We should evaluate the evidence, not the person presenting it.

Refusal to admit error

Refusal to admit error is the refusal to specify the conditions under which a model should be rejected or the refusal to accept its refutation in the face of solid evidence against it. All models are false, and anyone who refuses to discuss how their model could be seriously in error is obscuring a fair appraisal of their model (or is using an unfalsifiable model).

Identify trivial flaws in an opponent's model

This violation refers to the practice of searching for unimportant details about a model that are false, and using those minor limitations as the basis for refuting the model. The fact that all models are false does not mean that all are useless. Yet it is a common trick of lawyers to harp endlessly on the fact that a particular model advocated by their opponent is not perfect and thus should be abandoned.

Defend an unfalsifiable model

A model must be falsifiable to be useful. "Falsifiable" merely means that it could be refuted if the data turn out to be a certain way. An unfalsifiable model is one framed so that it cannot be refuted no matter how the data turn out: we could never gather data to show it is wrong. Creationists, for example, adopt and then defend an unfalsifiable model. By contrast, science is predicated on the assumption that all models will eventually be overturned.

Require refutation of all alternatives

A special case of defending an unfalsifiable model, this one is subtle. It takes the form of insisting that a class of models is correct until all variations of them have been rejected. As an example, we might refuse to accept that the probability of Heads in a coin flip is 0.5 unless we reject all alternatives to 0.5. Whereas it is possible to refute that the probability of Heads in a coin flip is 1/2, it is impossible to refute that the probability of Heads is anything other than 1/2, because that would mean showing it is exactly 1/2. (It would take an infinite number of flips to reject everything other than 1/2.) This argument also takes the form of claiming that there is some truth to a model until it has been shown that there is nothing to it at all.

Use anecdotes and post hoc observations

This category represents a non-systematic presentation of special cases made in defense of (rather than as a test of) a particular model. An anecdote is an isolated, often informal observation made without a systematic, thorough evaluation of the available evidence. As a selected observation, it is not necessarily representative of a systematic survey of the relevant observations. Post hoc observations are observations made after the fact, often to bolster a particular model. It is easy to select unrepresentative data that support almost any model.

 

Perhaps the most subtle but useful of these points is the refusal to admit error. In science, models are tested precisely because the scientist acknowledges that every model has imperfections which may warrant its abandonment. Someone who is trying to advocate a model may want to suppress all doubt about its imperfections and thus suggest that it can't be wrong. That attitude is a sure sign that the person is biased. Of course, in many cases you will already know that the person is biased (as with a car salesperson), and the best that you can hope for is to determine how much they deviate from objectivity.
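The coin-flip argument under "Require refutation of all alternatives" can be made concrete with a rough calculation. The sketch below is only an illustration: it assumes a normal approximation to the binomial and a conventional 95% level, neither of which is specified in the text, and the function name is hypothetical. It shows that the range of values for the probability of Heads that cannot be rejected shrinks as flips accumulate but never shrinks to zero.

```python
import math


def unrejected_half_width(n_flips, z=1.96):
    """Approximate half-width of the range of P(Heads) values that cannot be
    rejected at about the 95% level when roughly half of n_flips land heads.

    Assumption (not from the text): normal approximation to the binomial.
    """
    return z * math.sqrt(0.25 / n_flips)


for n in (100, 10_000, 1_000_000):
    hw = unrejected_half_width(n)
    print(f"{n:>9} flips: values within +/- {hw:.5f} of 0.5 remain unrejected")
```

However many flips are made, the half-width stays above zero, so some alternatives to exactly 1/2 always survive. Demanding that every alternative be rejected is therefore a demand that can never be met.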

Subtle violations of the scientific method: Experimental design

The template for ideal data presented earlier is a strategy for producing data with a minimum of bias. But the template can be applied in many ways, and someone with a goal of biasing data can nonetheless adhere to this template and still generate biased data. Let's consider a pharmaceutical company testing the efficacy of a new drug. How many ways can we imagine that the data reported from such a study might be deliberately biased, when the trials are undertaken by the company that would profit from marketing the drug? The following table lists a few of the possibilities.

Bogus designs

Violation of accepted procedure

Impact

Change design in mid-course

An investigator may terminate an experiment prematurely if it is producing unwanted results; if the experiment is never completed, it will not be reported.

Assay for a narrow spectrum of unlikely results

The public well being is many-faceted, and a product is unlikely to have a negative impact on more than a few facets. With advance knowledge of the likely negative effects (e.g., a drug causes brain cancer), a study can be designed to purposefully omit measuring those negative effects and focus on others (e.g., colon cancer). Were the subjects a fair sample of the relevant population? The medicine might be more effective on some age groups than others, so the study might be confined to the most responsive age groups (determined in preliminary trials). While the data would be accurate as reported, details of the age group might be omitted to encourage a broader interpretation of the results than is warranted.

Protocol concealed

It is easy to write a protocol that conceals how the study was actually conducted in some important respects. For example, was a blind design really used? Although a blind design exists on paper, it is possible to let patients and staff know which patients belong to which groups. Indeed, patients can sometimes determine whether they are receiving the drug or placebo. Were the controls treated in exactly the same manner as the group receiving the medicine? It is possible to describe countless ways in which the control group and treatment group were treated similarly, yet to omit ways in which they were treated differently. The medicine might be given along with some other substance that can affect patient response, with this additional substance being omitted from the placebo.

Small samples

Science often assumes "innocent until proven guilty" in interpreting experiments designed to determine if a product is hazardous. Small samples increase the difficulty of demonstrating that a compound is hazardous, even when it really is.

Non-random assignments

Most studies, especially those of humans, begin with enough variation among subjects that random assignment to control or treatment groups is essential to eliminate a multitude of confounding factors. Clever non-random assignments could produce a strong bias in favor of either outcome.

Pseudo controls

There are many dimensions to the proper establishment of controls, including assignment of the control groups and subsequent treatment of the controls. It is possible to describe many ways in which a control is treated properly while omitting other ways in which a control is treated differently.

 

We can obviously find additional ways to bias the outcome of tests. Short of undertaking the study yourself, or having a neutral organization conduct the study, there are always ways to present a biased model comparison while nonetheless being absolutely truthful about the experimental design.
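The "Small samples" entry above lends itself to a short calculation. The following is a minimal sketch under stated assumptions: the adverse-effect rates (1% without the product, 3% with it), the 5% threshold, and all function names are hypothetical choices made for illustration, not values from the text. The null model is "the product is safe", in keeping with the "innocent until proven guilty" convention, and the code asks how often a study of a given size would reject that null when the product really is hazardous.

```python
from math import comb


def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), computed as 1 - P(X < k)."""
    below = sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))
    return 1.0 - below


def critical_count(n, safe_rate, alpha=0.05):
    """Smallest number of adverse events that rejects the 'safe' null at level alpha."""
    k = 0
    while prob_at_least(k, n, safe_rate) > alpha:
        k += 1
    return k


def detection_probability(n, safe_rate, hazard_rate, alpha=0.05):
    """Chance that a study of n subjects flags a product that truly is hazardous."""
    return prob_at_least(critical_count(n, safe_rate, alpha), n, hazard_rate)


# Hypothetical rates: 1% adverse effects without the product, 3% with it.
for n in (50, 200, 1000):
    print(n, "subjects:", round(detection_probability(n, 0.01, 0.03), 3))
```

With these assumed rates, the 50-subject study flags the hazard in fewer than one trial in three, while the 1000-subject study flags it more than nine times in ten. An investigator who prefers a finding of "no demonstrated hazard" therefore benefits from keeping the sample small.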

Biased model evaluation

Even when the raw data themselves were gathered with the utmost care, there is still great opportunity for bias. Bias can arise as easily during data analysis, synthesis and interpretation, as during data gathering. This idea is captured in the title of a book published some years ago, "How to Lie With Statistics." Two methods of biasing evaluation are (i) throwing out some of the results, and (ii) searching for a statistical test to support a desired outcome.

Throwing out results. We often assume that a study reports all relevant results. But studies often have (valid) reasons for throwing out certain results. Throwing out results can also bias a study, however. If we flip a coin ten times, and we repeat this experiment enough times, we will eventually obtain ten heads in some trials and ten tails in others. We might then report that a random coin flip test produced ten heads (or tails), even though the entire set of results produced an equal number of heads and tails - by failing to report some results, we have biased those that we do report. For example, a product test may have been repeated many times, with the ones finally published being limited to those favoring the product.
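The coin-flip example can be sketched in a few lines. This is an illustration only, and the function names, the seed, and the choice of 2000 repeats are hypothetical. The first function gives the chance that at least one of many ten-flip experiments comes up all heads; the second simulates running many experiments with a fair coin and reporting only the most favorable one.

```python
import random


def prob_some_run_all_heads(repeats, flips_per_run=10):
    """Chance that at least one of `repeats` fair-coin experiments is all heads."""
    p_single = 0.5 ** flips_per_run
    return 1 - (1 - p_single) ** repeats


def selective_report(repeats, flips_per_run=10, seed=0):
    """Run many fair-coin experiments, then 'publish' only the best one."""
    rng = random.Random(seed)
    heads = [
        sum(rng.random() < 0.5 for _ in range(flips_per_run))
        for _ in range(repeats)
    ]
    return {
        "mean_heads_all_runs": sum(heads) / repeats,  # what the full record shows
        "heads_in_reported_run": max(heads),          # what the reader is told
    }


print(prob_some_run_all_heads(1))     # a single experiment: 1/1024
print(prob_some_run_all_heads(5000))  # with many repeats: above 0.99
print(selective_report(2000))
```

Every individual experiment in the simulation is honest. The bias enters only through the choice of which experiment to report.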

This principle applies widely. In a court case, the defense will only present the data that they have that tends to exonerate their client. In other cases, the models being tested during the evaluation and revision step of the scientific method are not representative of all those that could be compared. For example, the U.S. Forest Service, when writing management plans for National Forests, is often required to compare several alternative management options. The bias arises because the alternative options (e.g. models) that the U.S. Forest Service considers are not always representative of all possible management options. Hence the models being evaluated are a biased subset of all those conceivable.

Searching for the "right" test. There are hundreds of ways to conduct statistical tests. Some study designs fit cleanly into standardized statistical procedures, but in many cases, unexpected results dictate that statistical tests be modified to suit the circumstances. Thus, any one data set may have dozens to hundreds of ways of being analyzed. In reporting the results, someone may bias the evaluation step by reporting only those tests favorable to a particular goal. We should point out that this practice offers a limited opportunity to bias an evaluation. If the data strongly support a particular result, it won't be easy to find an acceptable test which obscures that result.
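A rough worked calculation shows why reporting only favorable tests matters. It rests on a simplifying assumption that is not in the text: that the k candidate tests are independent and each has probability \alpha of giving a "significant" result when there is no real effect. Tests run on the same data set are usually correlated, so the figure below should be read as an illustration rather than an exact rate.

```latex
\[
P(\text{at least one of } k \text{ tests is significant}) = 1 - (1 - \alpha)^{k},
\qquad
\alpha = 0.05,\; k = 20:\quad 1 - 0.95^{20} \approx 0.64 .
\]
```

Under these assumptions, someone free to choose among twenty analyses would find at least one "significant" result nearly two times in three even when nothing is there.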

Very subtle bias: Controlling the null model

As noted in a previous chapter, many evaluations are based on a null model approach: the null model is accepted until proven wrong. To "control" the null model means to "choose" the null model. Choice of the null model can have a big effect on the outcome of even the most unbiased scientific evaluation for the simple reason that a null model is accepted until proven guilty. Any uncertainty or inadequacy in the data will thus rule in favor of the null model. By choosing the null model, therefore, many of the studies testing the model will "accept" it, not because the evidence for it is strong, but because the evidence against it is weak. As a consequence, the null model enjoys a protected status, and it is to anyone's advantage to choose which model is adopted as the null model. Choice of the null model in this sense does not even mean developing it or proposing/inventing it. Given a set of alternatives decided upon in advance, controlling the null model means simply the selection of which model from that set is adopted.

Consider the two alternative models that might be used in approving a new food additive for baby formula:

(a) The additive is harmful.

(b) The additive is safe.

As the null model, (a) requires a rigorous demonstration of the safety of a food additive before it is approved. In contrast, (b) requires that an additive can be used until a harmful effect is demonstrated. As noted in the Data chapters, an enormous sample size might be required to demonstrate a mild harmful effect, so a harmful product could reach market much more easily under null model (b) than under (a).

Choice of the null model represents a powerful yet potentially subtle way in which an entire program of research can be biased. Every other aspect of design, models used, and evaluation could meet acceptable standards, yet choice of a null model favorable to one side in a conflict will bias many outcomes in favor of that side.
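The food-additive example can be turned into a small calculation showing the same data leading to opposite decisions. This is a sketch under loudly stated assumptions: the trial size (100 infants), the observed count (2 adverse events), the "safe" rate of 1%, the "harmful" rate of 3%, the 5% threshold, and all function names are hypothetical. Null model (a) assumes the additive is harmful until safety is demonstrated; null model (b) assumes it is safe until harm is demonstrated.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def p_value_if_safe(events, n, safe_rate):
    """Null (b): chance of seeing this many events or more if the additive is safe."""
    return 1.0 - sum(binom_pmf(i, n, safe_rate) for i in range(events))


def p_value_if_harmful(events, n, harmful_rate):
    """Null (a): chance of seeing this few events or fewer if the additive is harmful."""
    return sum(binom_pmf(i, n, harmful_rate) for i in range(events + 1))


def approved(events, n, null_model, safe_rate=0.01, harmful_rate=0.03, alpha=0.05):
    """Return True if the additive would be approved under the chosen null model."""
    if null_model == "a":  # presumed harmful: approve only if harm is rejected
        return p_value_if_harmful(events, n, harmful_rate) <= alpha
    if null_model == "b":  # presumed safe: approve unless safety is rejected
        return p_value_if_safe(events, n, safe_rate) > alpha
    raise ValueError("null_model must be 'a' or 'b'")


# Hypothetical trial: 2 adverse events among 100 infants.
print("approved under (a):", approved(2, 100, "a"))
print("approved under (b):", approved(2, 100, "b"))
```

With this modest, ambiguous data set neither null model can be rejected, so the additive is refused under (a) and approved under (b). The decision follows from which model was given protected status, not from the evidence.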

Minimizing the abuses

Recognizing the possible abuses of science is the simplest and most effective way to avoid being subjected to them. Beyond this, we can think of no single simple rule to follow that will minimize the opportunity for someone to violate the spirit of science and present misleading results -- there are countless ways to bias data. One strategy to avoid bias is to require detailed, explicit protocols. Another is to have the data gathered by an individual or company lacking a vested interest in the outcome. But even with these policies, there is no guarantee that deliberate biases can be weeded out. The following table gives a few pointers.

Ensuring legitimate science

Property of Study

Impact

Publish protocols in advance of the study

Prevents mid-course changes in response to results; enables requests for design modifications with little cost.

Publish the actual raw data

Enables an independent researcher to look objectively at the data, possibly uncovering any attempts to obfuscate certain results.

Specify evaluation criteria before obtaining results

Minimizes after-the-fact interpretation of data.

Anticipate vested interests

Conclusions of individuals, corporations, and political bodies can be predicted with remarkable accuracy by knowing their financial interests and their political and ideological leanings. Understanding these data helps immensely in understanding how they may have used biased (but perhaps, well-intentioned) methods in arriving at conclusions.