Specifically, such decisions and the institutions that make them are frequently criticized by the scientific community as irrational and not properly founded on good scientific evidence and theory. For example, tort litigation, which serves to compensate victims injured by others' activities and products, as well as to deter similar conduct in the future, has been criticized as making decisions, at least sometimes, on the basis of "junk science," i.e., scientific theories that have no basis in scientific fact or theory. 1 Also, regulatory institutions that attempt to prevent harm from technological products or processes before they injure the public or employees in the workplace have been criticized -- on the one hand for basing regulations, not on proper scientific risk assessments, but rather upon public sentiment and pressure, and on the other hand for being captured by industry's conception of the science.
As recounted elsewhere in this issue, Arthur Kantrowitz proposed a science court as an alternative or supplement to existing institutions. With advocates on both sides of a scientific debate and judges to evaluate their claims, a science court would evaluate only the scientific portion of mixed scientific-policy debates. His aim was to assure more accurate scientific information about technical issues, to limit the policy making powers of scientists (who are claimed to have too much policy influence), to prevent policy makers from hiding behind scientific conclusions for their policy decisions, and to identify and expose discredited scientific claims. 2
Kantrowitz originally suggested that a science court might work best when addressing relatively specific, but "big" or "large scale" questions, such as whether the SST would harm the ozone layer or how rapidly to reduce automobile emissions. Since then, however, writers have suggested that any such procedure should be directed at quite specific scientific and technological issues that are addressed by regulatory agencies or even those that may arise in tort cases.
Exactly what role science courts should have or the aims they should serve is unclear. Should they address only very general scientific questions, and thus serve to modify the procedures of institutions such as the National Academy of Sciences? Should they replace technical sections in agencies such as the Environmental Protection Agency (EPA), the Occupational Safety and Health Administration (OSHA), or the Food and Drug Administration (FDA)? As the symposium introduction asks: Would they constrain or enhance the influence of scientists in public policy debates? Moreover, would they facilitate separation of scientific and policy issues, and is this desirable? These are only a few of the questions that might be raised about science courts, but they are more than enough for this paper.
Many of Kantrowitz's concerns have merit. We should certainly resist scientists who propose their own projects and succeed because of special influence as may have happened with Star Wars. Also, even though scientists have technical expertise, that does not authorize them to make moral and political recommendations; their votes should count no more or less than others in a democracy. Neither is it desirable to have politicians avoid political responsibility for normative aspects of decisions nor for Washington science advisors to exercise inordinate influence. Finally, most would agree that it is difficult to comprehend and control technological developments. Any institutional procedure which adequately addressed these concerns is to be commended.
In what follows, I do not focus directly on these issues. Instead, I consider the strengths and weaknesses of science courts to address but one aspect of our technological revolution -- the identification and control of toxic substances. This group of mixed science-policy decisions may or may not be representative, but it serves to make issues specific, to illustrate some shortcomings of using science courts, and to illuminate some generic difficulties of relying upon scientific expertise to address sharply contested issues on the frontiers of knowledge.
With regard to control of toxic substances, it seems difficult, if not impossible, to separate socially useful scientific facts from normative judgments. It also appears that, because of different evidentiary requirements for scientific inquiry and public policy decisions, an attempt to separate the science from normative considerations and adhere to evidentiary standards typical of research science will poorly serve public debate on relevant issues. Thus, insofar as mixed science-policy problems raised by the identification, assessment and regulation of toxic substances are typical, science courts may not be the best way to address the scientific aspects of mixed science-policy issues. And if the problems raised here are atypical, this suggests limits to the applicability of science courts to technological problems.
Also important are the kinds of mistakes likely to result from imperfect institutional procedures; this involves the distribution of institutional mistakes. Decisions about potential toxic substances which rest on imperfect scientific procedures may result in false positives, also called type I errors (a substance is wrongly thought to cause harm or risks of harm), or false negatives, also called type II errors (a substance or product is wrongly believed to be "safe"). Which kind of mistake should be of most concern raises questions of the social consequences of different kinds of mistakes in the context of different kinds of institutions. For example, criminal procedures strongly protect against wrongly punishing an innocent person, the juridical equivalent of a false positive, but public health agencies tend to be more concerned with missing possibly dangerous products or diseases than is the criminal law or pure scientific research.
The costs of making one kind of mistake rather than another are especially important in designing and deciding upon evidentiary procedures. In assessing possible toxic substances, false positives may misdirect investment and impose costs on manufacturers, shareholders, and consumers of their products. The very existence of firms or product lines or the welfare of the public may be threatened. False negatives impose costs on those put at risk of death, disease, or compromised quality of life -- along with associated economic costs.
In science, peer review, demanding standards of evidence and research practices all protect against false positives to avoid mistakenly adding to the stock of scientific knowledge and misguiding future research. 3 In contrast, when scientific research is used for public health purposes, such as designing procedures to screen for rubella antibodies, to detect the AIDS virus, or to identify toxic substances, the costs of false negatives become much more important. Thus, in addition to accuracy, simpliciter, in evaluating institutional decisions which have substantial social consequences, the distribution of the kinds of mistakes and the magnitude of the costs of those mistakes are of substantial moment. These two considerations are critical in evaluating the use of science in identifying and regulating carcinogens. 4
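The interaction between the distribution of mistakes and their costs can be illustrated with a small numerical sketch. Every figure below (the share of substances that are truly harmful, the error rates of the two procedures, and the cost weights) is a hypothetical assumption chosen for illustration, not data from any study. The sketch shows only that the same two procedures can be ranked differently once false negatives are weighted more heavily than false positives.

```python
def expected_cost(p_harmful, fp_rate, fn_rate, cost_fp, cost_fn):
    """Expected cost per substance assessed, given error rates and error costs."""
    # A false positive can only occur for a substance that is in fact harmless.
    false_positive_cost = (1 - p_harmful) * fp_rate * cost_fp
    # A false negative can only occur for a substance that is in fact harmful.
    false_negative_cost = p_harmful * fn_rate * cost_fn
    return false_positive_cost + false_negative_cost


# Hypothetical share of assessed substances that are truly harmful.
P_HARMFUL = 0.2

# Hypothetical procedures: one guards mainly against false positives,
# the other mainly against false negatives.
PROCEDURES = {
    "strict": {"fp_rate": 0.05, "fn_rate": 0.40},
    "precautionary": {"fp_rate": 0.20, "fn_rate": 0.10},
}

# Hypothetical relative costs of each kind of mistake in two settings.
COST_WEIGHTS = {
    "research": {"cost_fp": 10, "cost_fn": 1},
    "public_health": {"cost_fp": 1, "cost_fn": 10},
}


def preferred_procedure(weights):
    """Return the name of the procedure with the lower expected cost."""
    return min(
        PROCEDURES,
        key=lambda name: expected_cost(P_HARMFUL, **PROCEDURES[name], **weights),
    )


for setting, weights in COST_WEIGHTS.items():
    print(setting, "->", preferred_procedure(weights))
```

Under these assumed numbers the strict procedure has the lower expected cost with the research weights, and the precautionary procedure has the lower expected cost with the public health weights. Neither procedure is more "accurate" in the abstract; the ranking depends on which mistake is judged worse.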
Third, for institutions that must function within a democratic society, to what extent should they articulate with the democratic procedures and reflect the democratic will? 5
Fourth, and related to the above point, institutions should be evaluated in terms of whom they politically empower. Kantrowitz seeks to limit the influence and power of scientists in institutions that must make mixed science-policy decisions and believes that science courts will achieve this goal. (Whether he is right on this is a further question to which we return.)
Fifth, in many cases the rate of decisions is also quite important; in others less so. In research science, the rate of scientific research is typically not a desideratum (except to the scientist concerned), unless the research has substantial social consequences. The rate of development of an atomic weapon was considered quite important during World War II and the rate of research on AIDS is presently of great concern. However, when institutional decisions based on scientific research have social consequences, such as those with which we are concerned, the rate of decision making may be very important.
Finally, it is important that the overall decision in mixed science-policy disputes be within the limits of acceptability to those affected by it. Presumably a science court would further this goal because at least the factual basis would be agreed to by the science court "judges" even if other aspects of the decision were subject to dispute. While science courts might preserve or enhance the standing of the scientific community, it is not clear that they will enhance the validity of overall mixed science-policy decisions.
To establish that a substance, X, is a human carcinogen, several policy decisions must be made. For epidemiological studies a number of judgments, not straightforwardly scientific, are appropriate: What are the relative weights of positive and negative studies? What are the relative weights of prospective and case-control studies? What statistical significance is required for studies to be considered positive? What is the scientific or public health significance of positive findings when the route of exposure is different from a population at risk? 8 And, finally, might these decisions be made differently for pure scientific research versus research for public health purposes?
Even the seemingly uncontroversial notion of statistical significance is infected with policy considerations. There is a scientific convention that a study must have a significance of .05 or lower to be scientifically valid, but this may not be decisive for either scientific or public health issues. One can show for an epidemiological study that on the same facts -- same sample sizes, same background disease rates, and same reported disease or death rate -- the choice of what constitutes a statistically significant result will affect whether the study is considered positive or negative. 9 The raw data are not decisive; interpretation and the purposes for which the study will be used are equally critical. Similar statistical and policy problems plague the interpretation of animal studies. The point is not just that scientists in particular areas of research need convention to guide their interpretation of results or to determine what scientific evidence is valid or credible. This is expected simply because they are engaged in a human activity. 10 Yet, the appropriate conventions may, and probably should, vary depending upon the purposes for which data will be used. I return to this in comparing burdens of proof in science and the law.
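A minimal sketch may make the point about significance cutoffs concrete. The counts and the background rate below are hypothetical, and the one-sided binomial test is merely one simple choice of test assumed here for illustration; it is not the analysis used in any study cited in this paper. The data are held fixed and only the cutoff varies.

```python
from math import comb


def upper_tail_p_value(n, k, p0):
    """P(X >= k) for X ~ Binomial(n, p0).

    This is the chance of seeing k or more cases among n exposed subjects
    if only the background rate p0 were operating.
    """
    return sum(comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(k, n + 1))


def classify(p_value, alpha):
    """Label a study 'positive' when its p-value falls at or below the cutoff."""
    return "positive" if p_value <= alpha else "negative"


# Hypothetical study: 20 exposed subjects, 10% background rate, 5 observed cases.
N_EXPOSED = 20
BACKGROUND_RATE = 0.10
OBSERVED_CASES = 5

p_value = upper_tail_p_value(N_EXPOSED, OBSERVED_CASES, BACKGROUND_RATE)

# Same data, three different conventions for what counts as significant.
verdicts = {alpha: classify(p_value, alpha) for alpha in (0.01, 0.05, 0.10)}

print(f"p-value = {p_value:.4f}")
for alpha, verdict in verdicts.items():
    print(f"cutoff {alpha:.2f}: {verdict}")
```

With these assumed numbers the p-value lies between .01 and .05, so the identical study counts as negative under a .01 cutoff and positive under a .05 or .10 cutoff. Which label is appropriate depends on the purpose for which the result will be used.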
The first point is related to a second, equally basic issue. Even something as fundamental to scientific judgments as hypothesis acceptance and rejection tests have implicit social policy judgments in them. These tests were developed: 11
in a certain decision theoretic framework where there would be a clear basis for the test specifications... tests are formulated as mechanical rules or recipes for reaching one of two possible decisions: `act as if H were true' or `act as if H were false', according to whether H is accepted or rejected.... By considering such consequences the scientist is, presumably, able to specify the risks he can `afford'.... However, this opens the door to the... relativist's concern. For such considerations of consequences -- in our case social, ethical and economic -- are clearly policy matters; so it appears that specifying a test is tantamount to making a policy decision....
The problem with such tests is "automatically equating rejections of H (statistically significant differences) with finding substantively important discrepancies from H, and failures to reject H with finding... unimportant discrepancies." 12
Moreover, this raises the concern that different statistical cutoff points may be appropriate for different scientific and social purposes and concerns about what constitutes appropriate "facts". If the conventional cutoff is met, scientific data are established as credible or valid, and if it is not, they are not. However, conventionally specified test cutoffs can be misleading for there may be perfectly good clues and evidence of harm even though a particular test of significance is not met. 13 And, such conventions may beg some social policy issues (I return to this later).
More neutral ways have been suggested for presenting the results of statistical studies, ones that do not rely upon conventions of either statistical significance or hypothesis acceptance and rejection. 14 These strategies avoid some of the problems of relying upon policy considerations, but they pose others. For one thing, the attempt to purge carcinogen risk assessment of all policy considerations is likely to make the resulting document or report nearly unintelligible to all but a small class of experts -- and of little use for the larger policy debates. This hardly serves social debate or understanding. For instance, in reporting the results from animal studies, should scientists report merely the number of tumors in the experimental and the control groups (which will not be terribly helpful to policy makers and can be controversial in contested cases) or go beyond this (relatively) raw data 15 and judge whether this increase is significant? If the latter, they then encounter the problems just indicated. Of course, scientists might agree that there was a statistically significant increase in animal cancers but disagree whether that was evidence that a substance caused human cancer. But these further disagreements will also likely depend upon implicit policy considerations. 16 Second, according to the science court model, science court judges would render a decision about which side is right. But this will pose other problems (discussed later). It seems, then, that advocates of science courts may be unduly optimistic about possible agreement in controversial cases.
Further, scientists and policy makers using animal studies as evidence for the claim that a carcinogen has a certain potency in humans face more problems. If scientists from opposing sides in a science court were to try to agree about what they knew, they might only agree on the experimental evidence, i.e., on the tumor counts in experimental and control groups of animals (and even that could be controversial). It is unlikely they would agree on the potency numbers at low doses for humans. 17 In fact, if one insisted on unanimity for a science court in sharply contested cases, it would have to report that "there was not agreed-upon scientific evidence of the potency at low doses."
Thus, contrary to assumptions favoring science courts, it is, at a minimum, difficult to separate scientific fact from policy considerations in carcinogen risk assessments. Or, if there is success in achieving the separation, the results may be of little use in public debate. Thus, it is not clear how many facts might be scientifically uncontroversial to judges of such a court. And separately, it is unclear in sharply contested cases how much agreement could emerge between opposing scientists. 18
For example, to have sufficient evidence that a substance is a carcinogen, some scientists have insisted not only on good epidemiological studies and good animal studies in several different species at exposure levels and routes of administration equivalent to that of humans, but also several positive short-term tests. 19 Requiring such evidence as necessary and sufficient to treat something as a human carcinogen will lead to disagreement between scientists and to disagreements between some scientists and the public and regulatory communities. Thus, adversarial scientists are unlikely to agree on such a point in sharply contested cases. Further, the use of such stringent standards of what counts as a human carcinogen would frustrate present regulatory policies even more than at present, 20 because for very few substances is all the required evidence available. 21
Moreover, disagreements about the sufficiency of evidence that X is a carcinogen show how evidentiary standards serve different policy aims depending upon the context. In many circumstances, there can be an inconsistency between scientific and legal or social norms of good evidence -- which is in turn a result of the consequences of different mistakes that are of concern in science and in protecting public health. Scientists primarily seek to prevent false positives, from falsely adding to scientific knowledge and mistakenly chasing research chimeras, while public health advocates are typically concerned to prevent false negatives, to avoid mistakenly treating a toxin as non-toxic. Even if both scientists and public health advocates were to agree as a matter of a decision principle on more general principles for addressing the costs of mistakes (as they might well), 22 they are likely to disagree about the relative weight of mistakes to avoid: false positives or false negatives. Thus, they could and probably will endorse different evidentiary standards to avoid the mistake they regard as worse. 23
To see this point, consider analogous issues in the law. For various policy reasons, differing evidentiary standards and burdens of proof 24 are assigned to one side in legal controversies. Those who fail to carry their burden to the satisfaction of fact finders such as juries lose, i.e., they fail to establish certain facts for legal purposes. Consider, for example, criminal insanity cases. If a defendant in a murder case wishes to argue that he is not guilty by reason of insanity, then the defendant must raise that issue. However, this does not dispose of the burden. As discussed in a legal treatise on evidence, there are several possibilities. 25 First, the defendant might have to establish insanity by a preponderance of the evidence, i.e., insanity is more probable than not. Or he might have to establish it by "clear and convincing evidence" or the criminal law's much more difficult "beyond a reasonable doubt" burden. Conversely, once a defendant raises the question of sanity, the prosecution might have to show "beyond a reasonable doubt" that the defendant is sane. Thus, whether an otherwise guilty defendant is adjudged not guilty by virtue of insanity is dependent upon which evidence is found credible in accordance with established burdens and levels of proof, i.e., upon who has the burden of proof, how demanding the burden is and whether the party with the burden has carried it. Thus, identical evidence can clearly lead to different legal outcomes in different jurisdictions.
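The dependence of outcome on the allocation and stringency of the burden can be sketched schematically. The sketch assumes, purely for illustration, that a fact finder's assessment of the evidence can be summarized as a single probability that the defendant was insane, and that each standard of proof corresponds to a numerical threshold. Only the "more probable than not" threshold of .5 follows from the description above; the thresholds assigned to "clear and convincing" and "beyond a reasonable doubt" are hypothetical, since the law fixes no such numbers.

```python
# Illustrative thresholds. Only the preponderance value reflects the
# "more probable than not" definition; the other two are assumptions.
STANDARDS = {
    "preponderance": 0.50,
    "clear_and_convincing": 0.75,
    "beyond_reasonable_doubt": 0.95,
}

NGRI = "not guilty by reason of insanity"


def verdict(p_insane, burden_on, standard):
    """Outcome on the insanity issue for a given allocation of the burden.

    p_insane  -- assumed probability, on the evidence, that the defendant was insane
    burden_on -- "defendant" (must prove insanity) or "prosecution" (must prove sanity)
    standard  -- key into STANDARDS
    """
    threshold = STANDARDS[standard]
    if burden_on == "defendant":
        # The defendant prevails only by proving insanity to the required level.
        return NGRI if p_insane > threshold else "guilty"
    if burden_on == "prosecution":
        # The prosecution prevails only by proving sanity to the required level.
        return "guilty" if (1 - p_insane) > threshold else NGRI
    raise ValueError("burden_on must be 'defendant' or 'prosecution'")


# Identical evidence evaluated under four hypothetical jurisdictional rules.
EVIDENCE = 0.6
RULES = [
    ("defendant", "preponderance"),
    ("defendant", "clear_and_convincing"),
    ("defendant", "beyond_reasonable_doubt"),
    ("prosecution", "beyond_reasonable_doubt"),
]

for burden_on, standard in RULES:
    print(burden_on, standard, "->", verdict(EVIDENCE, burden_on, standard))
```

On these assumptions the same evidence yields a verdict of not guilty by reason of insanity under the first and fourth rules and a verdict of guilty under the second and third.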
Similarly, implicit burdens of proof in science and in policy areas that must use scientific evidence determine how much evidence is sufficient for such areas. I have argued elsewhere that the scientific .05 test of significance is roughly equivalent to the criminal law's "beyond a reasonable doubt" burden of proof. If this standard is imposed in tort or regulatory proceedings that normally have less demanding burdens of proof, this will frustrate important social goals. Requiring proof beyond a reasonable doubt would be inappropriate when the "more likely than not" standard has been long established as sufficient. 26
Moreover, in carcinogen risk assessment and risk management, if there are disagreements about the sufficiency of evidence that a substance is a carcinogen and the more demanding standards of research science prevail, 27 this will tend to skew policy outcomes in favor of the status quo. In most cases, this will leave substances in commerce until they are found to pose risks of harm; in a few others, it may keep them out. 28 If scientists must have the same degree of confidence in results that they would if they were doing scientific research, this will result in several "no decision" or "not enough information for decision" judgments. 29 Such judgments, while appropriate for scientific research, may not be appropriate in regulating toxic substances for public health purposes, where scientists may have some information in the form of clues of harm. Failure to act may leave people at risk from exposure to a toxic substance.
Furthermore, a report by a science court about what is and is not known about the substance in question, even if this leaves open the decision about appropriate action, may be misleading and inadvertently influence the public debate and decision. Kantrowitz indicates that such reports would have "presumptive validity." 30 If "Does X cause cancer in humans?" is the question framed and answered by the court, the answer according to some will be negative on the evidence of a few positive animal studies on scientific grounds alone. 31 Thus, the court might answer that there is some evidence of carcinogenic activity in animals and in short-term tests but, as a matter of the science, it cannot be concluded that X is a human carcinogen. Such a decision could easily be arrived at by scientists indicating what they did and did not know about a substance. Yet, it might be quite misleading and unduly persuasive in public debates if the public fails to understand the implications of the claim about human carcinogenicity. That scientists do not have the kind, amount and quality of evidence needed for certain purposes to conclude that X is a human carcinogen should not determine whether it should nevertheless be treated as such for other purposes.
Moreover, this concern about the contents of a report from a science court raises further issues about the implicit burdens of proof in science and its evidentiary conservatism.
Implicit burdens of proof in science have been designed or have evolved to serve the aims of scientific understanding and research, not necessarily to serve public policy purposes. It is not surprising, then, that scientific and public health burdens of proof, for example, may be different. However, we should not have institutional procedures that may conflate the two and substitute a less appropriate burden. Proposed science courts risk doing just that. Thus, a finding that there is not sufficient evidence that a substance is a human carcinogen may merely be the claim that the scientific burdens of proof required to show that it is a human carcinogen have not been met. It does not follow that the substance ought not to be treated as a human carcinogen or that it ought not to be the object of precautionary regulatory action. In fact there may be sufficient evidence to take precautionary action, even if there is not sufficient evidence to establish on research scientific grounds that the substance is a human carcinogen.
If a science court implicitly endorses the evidentiary standards of research science, it inadvertently reinforces what is typically one side (the anti-regulatory position) of the larger public debate about identifying and regulating carcinogens, primarily because of the laws involved. Such an outcome may be inadvertent -- resulting from the design of science courts and the explicit or implicit evidentiary procedures that scientists bring to it. We should not permit such accidental results, but design institutional procedures to address the problems directly. Perhaps a better approach (to which I return at the end) is to have public debates about the wisdom of one mixed science-policy course of action versus others, with all sides being as explicit as possible about both the scientific and the policy aspects of their view and then resolve those in some appropriate way. 32 The science court proposal appears to predispose such debates in inappropriate ways.
The burdens of proof just described make scientific research an epistemically conservative institution; information is added cautiously and only after demanding scrutiny. In the regulation of toxic substances this can have substantial effects on public policy and the public health. There will be a tendency to evaluate each substance on a substance-by-substance basis, for otherwise one is not evaluating carefully each substance. Such analysis is slow, and several procedures for identifying carcinogens are insensitive (they may not detect a risk of harm even when it is present). This, combined with many laws that leave carcinogens in commerce until an agency has established that they pose risks of harm, indicates that few substances will be addressed. Both may leave human carcinogens unidentified and unaddressed as carcinogens for a considerable period of time. Science courts appear to reinforce this evidentiary conservatism, which raises questions about their appropriate use.
It seems clear that science courts should not become part of the ordinary regulatory procedures for evaluating toxic substances. Present procedures for identifying toxins, conducting potency and exposure assessments and coming to a regulatory decision are much too slow. Their slowness leaves a large universe of substances unassessed, and thus people at risk. We need faster, not slower and not more science-intensive procedures. 33 Science courts as part of ordinary regulatory processes at the EPA, OSHA or FDA seem likely to make regulatory processes even slower than they are at present. Proposed science court procedures seem roughly analogous to formal adjudicatory procedures within the agencies, the slowest and most cumbersome legal procedures that are most likely to frustrate more expeditious agency actions. Such considerations argue against incorporating science courts into ordinary agency procedures.
After reflecting upon some of the shortcomings of procedures aiming to separate scientific fact from policy and issues of "accuracy," I am not sure science courts would serve well to address even large scale, more general scientific issues. For one thing, such questions frequently are not posed except in the more ordinary processes of regulatory agencies. Questions about nuclear safety usually are raised only when there are regulatory questions: Should a new reactor be licensed? Should an old one be decommissioned? However, it is possible that science courts could have a role in addressing large-scale issues provided that they do not frustrate the charge of the agency they aim to serve. On occasions when there might be special inquiries independent of regulatory proceedings, science courts might appropriately be used, provided they have no other serious shortcomings.
If more general, as well as specific, scientific debates are infected with policy considerations, incorporate possibly misleading standards for the sufficiency of scientific evidence and such, so as to make it difficult to separate scientific facts from values, then I have reservations about using the courts to address even more general issues. But this is more open, depending upon the area of science and the issues at stake. Not all technological and scientific issues on the frontiers of scientific knowledge may be as unsettled, uncertain, and infected with social policy issues as carcinogen risk assessment. If not, then science courts may be more appropriately used.
The last point raises a fundamental psychological or sociological point underlying the felt need for a science court. Kantrowitz and others argue for the importance of accuracy, worrying that science may become tainted if it is not accurate or becomes too contaminated by public policy disputes. This is a legitimate concern, but there is more than one side to the issue. If science maintains its "purity," this ensures that, when scientists speak as scientists, they will have a certain automatic credibility. Their statements will be adjudged as beyond dispute, at least by lay persons. While it is easy to see the benefits for the scientific community, benefits for the general public are less clear. Scientific evidentiary standards preserve the knowledge status quo until evidence passes enough tests to overcome the implicit burdens of proof in the field, but this can frustrate good public health policy.
For the sake of public health protection, it is important that the public question both policies and their scientific bases. Thus, if the findings of a science court were to have a kind of automatic credibility which is epistemically conservative and which may beg policy questions, this is not necessarily desirable. It may be much better for the public to be skeptical toward both the scientific and policy judgments, so that experts in both areas are forced to defend their positions. Speaking as one who came to debates about carcinogen risk assessment from a non-biological background, it is clear that we should be skeptical of scientific claims for two major reasons. First, the "best science" conventionally conceived is epistemically and (typically) normatively conservative. Thus, it reinforces the regulatory status quo. Also, what counts as evidence in debates about carcinogens is so normatively laden that one's larger philosophic views about health and larger social policies are critical. Thus, scientific claims should not be free from public scrutiny and skepticism. It is easy for putative "scientific facts" to disguise either implicit or explicit world views that should properly be open to debate. The remedy is not to separate science from policy -- if this can even be done while providing useful information -- but to increase scrutiny of both scientific experts and policy makers.
This raises a final point about the science court proposal. The suggestion appears to be that the scientific issues related to a particular policy question can be separated and decided independently of the particular policy, much as scientists might decide issues in their lab or at a conference. And the metaphor of a science "court" further suggests that "judges" might issue a judgment about which side has the correct scientific view, independent of the policy considerations.
Enough has been said to indicate what is problematic about both suggestions. There is an important interaction and substantial subtle understanding needed between the scientific data, evidentiary standards for judging it, and the policy decisions to be made. In assessing contested science-policy issues, one must judge which composite data-evidence-policy decision would be best in the circumstances, rather than have a scientific judgment endorsed by a science court independent of evidentiary standards appropriate to the policy and independent of the policy goals followed by contested policy debates of the scientific decision. At least in carcinogen risk assessment and risk management, it appears better, recognizing the complex data-evidence-policy relationship, to subject the composite product to careful scrutiny, including the science, than to try to address the scientific and policy issues completely separately.
Throughout, I have used the assessment and regulation of toxic substances in examining mixed science-policy disputes. Problems in this area may or may not be representative of those in other areas. They appear representative, however, when the following features exist: Factual issues, plagued by considerable uncertainty and information gaps, are at the frontiers of scientific knowledge and infected by policy considerations; the kinds and costs of mistakes are quite different, in light of policy goals, from those posed in typical research science procedures; conservative evidentiary standards of science favor one of the obvious positions in the political debate; and the rate at which mixed science-policy decisions are made is important. In such cases, issue separation as proposed for science courts seems on balance undesirable. The extent to which this characterizes science-policy disputes elsewhere is for others to decide.
