Point-Counterpoint: Is the System for Awarding U.S. Government Basic Research Grants Scientifically Bankrupt?
“They left innovation out because it seemed a bad idea to suggest that every grant should strive for creativity.” This statement was made by:

(a) the leader of a mediocre team of scientists, attempting to justify the pedestrian research proposed by his group.
(b) the head of a National Institutes of Health (NIH) grant evaluation panel, sarcastically lampooning a pedestrian research proposal by a mediocre team of scientists.
(c) the director of research for the government of a tyrannical dictatorship famous for its suppression of dissent, describing research-award policies in his country.
(d) the chief of research for a company that has gone out of business, describing why his company failed after it fell far behind the competition in introducing innovative new products.
(e) the extramural research director at the National Institute of Mental Health, approvingly describing the recommendation of an internal NIH panel he co-chaired for restructuring the way grants are evaluated.
The correct answer to this multiple-choice question, believe it or not, is (e), as quoted in Science (Marshall, 1996, p. 1257). For years, many scientists have feared that the system of awarding grants at NIH and other government agencies covertly discourages creativity and risk-taking. The view of these scientists may or may not have been correct in the past, but it is not correct now, and only because the discouragement is no longer covert.
How could an internal NIH panel believe that creativity is unnecessary for grant proposals, in an era in which many panels make awards to single-digit percentages of applications? How could this panel have rejected the dissent and alternative proposal of a member of the panel, biologist Keith Yamamoto (1996), of the University of California-San Francisco, urging that “creativity or innovation” be explicitly recognized as important in grant proposals? How has a federal granting institution reached the point where high-level officials would assign a backseat to creative innovation in science?
I would like to suggest five things that have gone wrong, as well as what can be done to correct them. As a psychologist, I am particularly concerned about how these issues apply to psychology. In some cases, the problems in psychology are even worse than in other fields: Funding is particularly tight in psychology compared with, say, biology, and psychologists are reputed to be more critical of their colleagues’ work than are scientists in any other field. But the same issues apply in any science.
The important thing to remember is that although it is easy to point to federal agencies, the Congress, or anyone else as the enemy, the first fingers we need to point are at ourselves, because we are the ones who review grant proposals and serve on the panels that evaluate the reviews. Ultimately, the people in the federal government represent us. If there are changes to be made, we need to start making them ourselves. Thus, the five points made below apply to all of us, not just to those who serve in government. When it comes to the system for awarding grants, employees of funding agencies are not our worst enemies; we are our own worst enemies.
(1) Low selection ratios coupled with ceiling effects in a rating scale create a blackball system.
The problem: NIH currently uses a 5-point rating scale, with “1” the rating of highest priority. Because selection ratios for funding are so low, an averaged rating just slightly over 1.0, such as 1.2, can be marginal in terms of actual funding. The result of such a system is that even one negative appraisal can effectively veto funding of a grant proposal. Consequently, the reward system values proposals that offend no one, and devalues risky proposals that have a higher probability of being offensive to at least some vested interest.
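The arithmetic of the blackball can be made concrete in a few lines. The averaging rule, the 5-point scale with 1 best, and the roughly 1.2 cutoff come from the text; the panel size and the single dissenting score are illustrative assumptions:

```python
# Sketch of the blackball effect under the rating system described above.
# Assumed from the text: ratings are averaged, 1 is the best score on a
# 5-point scale, and ~1.2 is marginal for funding. The five-member panel
# and the dissenter's score of 3 are made-up illustrations.
def priority_score(ratings):
    """Average panel ratings (lower = higher priority)."""
    return sum(ratings) / len(ratings)

CUTOFF = 1.2  # assumed marginal funding threshold

unanimous     = [1, 1, 1, 1, 1]  # five raters, all top priority
one_blackball = [1, 1, 1, 1, 3]  # a single dissenter rates the proposal a 3

print(priority_score(unanimous))      # 1.0, safely fundable
print(priority_score(one_blackball))  # 1.4, past the cutoff
```

Because 1 is the floor of the scale, the four enthusiastic raters have no room to rate the proposal any more favorably to offset the dissenter; one vote can only move the average away from funding.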
A solution: First, a rating system is needed that does not have a ceiling effect, so that more highly positive ratings from some panel members can offset the current effectiveness of a blackball from even one member of a panel (or one very negative external review). For example, the current 5-point system could be expanded to 10 points, with instructions to raters to use the whole scale. Second, the motives of “blackballers” need to be scrupulously examined. Of course, these individuals may see things others do not see. Often, however, what they see is a threat to their vested interest, and their vested interest may even be rather obvious, with only social nicety preventing their motives from being called into question. At a personal level, we all need to question our own motivations in evaluating proposals. Third, ideally, funding would be improved, so that more proposals could be funded. Although we cannot directly control funding, we can, and, for the survival of research, must lobby intensively for improved funding.
(2) The emphasis on finding methodological flaws leads to the funding of grant proposals that may be technically flawless but scientifically vacuous.
The problem: As Yamamoto (as quoted in Marshall, 1996) has pointed out, the current system of evaluation leads reviewers not to say, “This grant is boring,” but rather to “write several pages describing technical flaws” (p. 1257). The applicant then goes back and writes revision after revision, fixing technical flaws instead of concentrating on writing a new proposal of greater scientific interest. This effect, in combination with that described in (1) above, can lead to the funding of research that is technically flawless but potentially of little scientific merit. It would be analogous to investing money in diamonds that are internally flawless but that have poor cut or color. Gemologists, by the way, almost never buy such stones: They know better.
A solution: Panels making recommendations on grant proposals should concentrate first and foremost on potential creative scientific contribution, not on methodological flaws. When proposals do not have a potential major creative scientific contribution to make, the feedback to the proposer should stress this fact rather than the methodological flaws, if any, of the proposal. Indeed, it is not even clear that methodological issues need to be examined in such proposals. What’s the sense of doing or even evaluating good experiments on bad ideas? At a personal level, we all have to ask whether we are evaluating first and foremost the scientific contribution of proposed work.
(3) Emphasis on reliability of the rating system has overshadowed more important issues of validity.
The problem: The tendency in evaluating rating systems can be to emphasize reliability, while de-emphasizing validity. Reliability appears to have been a main emphasis in the recommendations of the NIH panel to restructure the rating system. Reliability of a rating system is far easier to assess than is the system’s validity, and so it is human nature to concentrate on the problems that are easier to solve. But they are not necessarily the more important problems.
A solution: Explicit research is needed on the validity of the system of assigning priorities. It is well within the power of NIH and similar organizations to do such research. For example, they could fund proposals by the regular system, and then use a different method of evaluation for funding a separate group of proposals, say, high-risk ones that might not otherwise have been funded. Then, five and perhaps again 10 years later, the scientific impact of research funded under the two systems could be evaluated, for example, in terms of Science Citation Index citations as well as other measures (such as citations in textbooks or ratings of the impact of the research by scientists in the field). Such a proposal might or might not be accepted by the field. But clearly, we cannot wait five or 10 years to make changes. We need to start making them now.
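The comparison at the heart of such a validity study is simple to sketch. The two-track design and the use of citation counts come from the text; every number below is an illustrative placeholder, not real data:

```python
# Hypothetical sketch of the proposed validity study: compare the
# citation impact, some years out, of projects funded under the regular
# system vs. a separate high-risk track. All counts are made up.
from statistics import mean, median

# Citations accrued after five years by each funded project (illustrative).
regular_track   = [12, 8, 15, 10, 9, 14, 11]
high_risk_track = [3, 40, 2, 55, 6, 1, 30]

for name, counts in [("regular", regular_track),
                     ("high-risk", high_risk_track)]:
    print(f"{name:>9}: mean={mean(counts):.1f}, median={median(counts)}")
```

With these made-up numbers the high-risk track shows a higher mean but a lower median, the characteristic signature of a high-variance portfolio: most risky projects fizzle, while a few pay off disproportionately.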
At a personal level, we all have to ask ourselves whether we reward risk-taking, or discourage it, in evaluating proposals. We cannot put all the blame on evaluators or bureaucrats in Washington, but rather must start with ourselves.
(4) The mindset of review panels is too local, too narrow, and too focused on short-term payoffs.
The problem: Several factors conspire to create a mindset among panel members that emphasizes the local, the narrow, and the short-term.
First, given that very few proposals can be funded, it is tempting to fund those that are extremely likely to yield a payoff. Why risk funding a proposal that looks dicey, when there are so many other proposals that are almost certain to yield something?
Second, the number of proposals to be reviewed can become mind-numbing, especially when one considers revisions as well as new proposals. So, when faced with a staggering number of proposals to review, it is hard to concentrate on the scientific deep structure of the proposals, rather than on the less important surface-structural details that are less challenging to evaluate.
Third, many of the most creative and broad-minded people in the field do not want to serve on the panels, because doing so is very time-consuming and often not terribly rewarding. Finally, it is much easier to get consensus on issues of methodological rigor and flaws than on issues of creative contribution, and, given the desire of a group to reach some kind of consensus, there will be a temptation to steer away from conflict-producing issues.
A solution: First, grant panels should be explicitly directed to concentrate more on deep structure than on surface structure, that is, on the quality of the science rather than on minor points of methodology. Second, the number of proposals can be reduced by giving panels the same authority journal editors have: to reject outright, without possibility of revision, those proposals in which the science lacks merit, however strong the methodology may be. Third, when exceptionally creative scientists realize that these changes have been made, they will have an additional incentive to serve on review panels, because they will have fewer proposals to review, and they will be reviewing them according to the criteria that have guided their own work. At a personal level, we all have to ask whether we are ourselves being broad-minded and focusing on long-term payoffs when we evaluate proposals.
(5) Panelists need better to separate fashion from substance.
The problem: As many who have studied creativity have documented, scientists are no more immune from following the crowd than are others. They are as susceptible as anyone to jumping on bandwagons and to seeking only to confirm, rather than to disconfirm, their own beliefs (see Sternberg & Lubart, 1995, 1996). The result is that proposals that observe current fashions are more likely to be funded than are those that are less stylish.
A solution: Scientists need far more training in the philosophy as well as the history of science than they get. Many psychologists have never even read such basic works as Kuhn (1970), which points out the tendency of scientists to engage in normal science, and thereby to fill in holes in existing paradigms; or Popper (1959), which points out the need for disconfirmability of scientific theories and hypotheses. If scientists are not self-aware, it is in part because they have not been trained in a way that emphasizes first the scientific questions to be asked, and only second, the finding of answers to these questions (see Simonton, 1988; Zuckerman, 1977). The solution to this problem lies in education. At a personal level, we all have to ask whether we are rewarding proposers who defy the crowd, or only those who follow it.
In conclusion, our system for awarding grants is approaching financial bankruptcy. But our concern needs to focus as well on the danger of scientific bankruptcy. Can we afford to relegate creativity to a backseat in the scientific enterprise? I believe not. We need, therefore, to restructure our enterprise. We must question the assets and liabilities in our own set of scientific values. The problem is not just in the rating system, per se, but in the system of values underlying it. Again, it’s our own system of values we have to question, not just that of anonymous bureaucrats. Bureaucracies are slow to change. We as individuals don’t have to be. We can start now.
References and Further Reading:
Kuhn, T.S. (1970). The structure of scientific revolutions (2nd ed.). Chicago: University of Chicago Press.
Marshall, E. (1996). NIH panel urges overhaul of the rating system for grants. Science, 272, 1257.
Popper, K.R. (1959). The logic of scientific discovery. London: Hutchinson.
Simonton, D.K. (1988). Scientific genius. New York: Cambridge University Press.
Sternberg, R.J., & Lubart, T.I. (1995). Defying the crowd: Cultivating creativity in a culture of conformity. New York: Free Press.
Sternberg, R.J., & Lubart, T.I. (1996). Investing in creativity. American Psychologist, 51, 677–688.
Yamamoto, K.R. (1996). Rating of grant applications: A proposal for discussion by the DRO Advisory Committee. Unpublished document.
Zuckerman, H. (1977). Scientific elite: Nobel laureates in the United States. New York: Free Press.