Panel, Panel, on the Mall, Who's the Fairest of Them All?

Weighing in at 2.3 kilograms, Research-Doctorate Programs in the United States: Continuity and Change is the latest (and fairest?) report on institutional quality in graduate education. Produced by the National Research Council (NRC) (see page 9), this 750-page mammoth study is the most extensive of a series of recent publications (e.g., US News and World Report).

“One can criticize these [NRC] rankings, but I think they’re a good measure of the quality of doctoral training programs in the United States,” commented APS Fellow and NRC panel participant Richard Atkinson. “At the graduate level there’s just nothing comparable to this,” he said in response to a reference to the rankings published annually in US News and World Report. Referring to the latter rankings as “pretty darned annoying,” the University of California director believes “the journalism rankings do not begin to approach the thoroughness of the NRC rankings.”

Programs Studied

Released in September, the hard-bound NRC study presents a comprehensive and detailed report on PhD programs in the arts and humanities, biological sciences, physical sciences and mathematics, engineering, and social and behavioral sciences.

To warrant inclusion in the report, a field of study must have been robust enough to have about 50 programs active in the United States, and to have produced about 500 PhDs in the years 1986-1992.

Included in the evaluation are 3,634 graduate programs, representing about 78,000 faculty members in 41 fields at 274 universities. These 41 fields produce a total of about 23,000 PhDs per year in the United States; the 3,634 programs included in the NRC survey confer 90% of those degrees.

While fields like theater arts are not big enough to qualify for inclusion, and classics barely squeaks in, psychology is the largest single field represented, with over 3,000 PhDs produced annually. The next largest field, chemistry, graduates about 2,000 per year. The 185 psychology programs covered by the study graduate 91% of the PhDs in the field; the other 9% come from 43 programs that were not included. These latter programs were excluded by their own choice, because they were too small to qualify, or because they submitted information after NRC deadlines for inclusion. Some 1,916 programs appear in both the 1982 and the present study, allowing comparison of changes across the decade.

Ordered by Rank

Like its 1982 predecessor, the 1995 NRC study includes reputation scores generated by a faculty survey, along with extensive objective data on publication rates, and student characteristics. However, unlike the earlier study, which downplayed the results of the surveys of academic reputation, listing the schools in alphabetical order, the 1995 report organizes many of its tables (see table beginning on page 6) in descending order of the reputation of the faculty for scholarship and research. The table reproduced here was adapted from Appendix M-6, Selected Characteristics of Research-Doctorate Programs in Psychology. The Rank number and the last column, Standard Score of Overall Rating, were added in the Observer version. Another table summarizing psychology departments is Table P-37, Relative Rankings for Research-Doctorate Programs in Psychology (pp. 608-613). Too extensive to reproduce here, this table is also ordered by departmental overall rank, and provides information on departmental visibility, percent of raters who considered the department distinguished or not, trimmed averages of quality and effectiveness rankings, and comparative rankings from the 1982 survey.
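The article does not say how the Standard Score of Overall Rating column was computed. One common reading of "standard score" is a z-score: each program's rating expressed as its distance from the mean of all ratings, in units of standard deviation. The sketch below illustrates that reading only; the function name and the sample ratings are hypothetical and are not taken from the NRC report or the Observer table.

```python
from statistics import mean, pstdev


def standard_scores(ratings):
    """Convert raw ratings to z-scores (assumed meaning of 'standard score')."""
    mu = mean(ratings)
    sigma = pstdev(ratings)  # population standard deviation of the ratings
    if sigma == 0:
        raise ValueError("all ratings are identical; z-scores are undefined")
    return [(r - mu) / sigma for r in ratings]


# Hypothetical overall ratings on the report's 0-to-5 scale.
sample_ratings = [2.0, 3.0, 4.0]
scores = standard_scores(sample_ratings)
```

Under this reading, a program rated exactly at the mean receives a standard score of zero, and positive scores mark programs rated above the mean.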

APS William James Fellow Gardner Lindzey, who co-chaired the earlier study and also participated in the new one, told the Observer that in 1982 the graduate deans were opposed to rankings and that “one of the conditions for getting their participation was that we agreed to present the findings in such a way as to make it very difficult to make comparisons across institutions.”

“The data are much more accessible in [the current] report,” he explained. It is appropriate that the data be presented with “faculty quality” as the most visible variable, he said, because the “peer rating of quality is probably the most interesting variable” for the intended audience.


Who is the audience? Individuals making choices about graduate programs, including undergraduates preparing to go to graduate school, graduate students thinking about changing schools, faculty thinking about moving to another school, federal and private funding agencies, and internal administration officials who must deal with the allocation of limited funds. (Some data from the report are available online at the NAS Internet home page.)

The Survey

For each program to be evaluated, surveys were sent to a randomized sample of 200 academic raters. None were asked to rate the institution at which they worked.

The raters were stratified by academic rank, and three waves of follow-up surveys were sent, to bring the response rate to at least 50% for every program evaluated. Importantly, all raters were provided with lists of the current faculty in the programs, so they knew exactly which researchers and teachers they were being asked to evaluate.

Programs were rated on a scale of 0 (not sufficient for doctoral education) to 5 (distinguished), for two measures: faculty quality (93Q) and effectiveness in training students (93E). The “93” in these codes refers to the year of this NRC study.

Error Range

The average of the 93Q and 93E scores, as well as numerous other data, are reported (Table M-6). One caveat relates to the confidence interval in both Table M-6 and Appendix Q (not shown here) for the faculty quality scores. These intervals range from about plus or minus 0.1 to plus or minus 0.6, and serve to remind the reader that most small “differences” in rank are non-significant.
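A rough way to see why small gaps in rank may not mean much is to check whether two programs' plus-or-minus intervals overlap. The sketch below is an illustration only: the program names, ratings, and interval half-widths are hypothetical, and an overlap check is a rule of thumb rather than the formal procedure used in the NRC report.

```python
def intervals_overlap(mean_a, half_a, mean_b, half_b):
    """Return True when the two plus-or-minus intervals share at least one point."""
    low_a, high_a = mean_a - half_a, mean_a + half_a
    low_b, high_b = mean_b - half_b, mean_b + half_b
    return low_a <= high_b and low_b <= high_a


# Hypothetical faculty quality ratings: (name, mean rating, interval half-width).
programs = [
    ("Program A", 4.10, 0.20),
    ("Program B", 3.95, 0.20),
    ("Program C", 3.20, 0.10),
]

# A and B sit close together, so their intervals overlap.
a_vs_b = intervals_overlap(4.10, 0.20, 3.95, 0.20)
# A and C are far apart, so their intervals do not overlap.
a_vs_c = intervals_overlap(4.10, 0.20, 3.20, 0.10)
```

In this made-up example, Programs A and B would occupy adjacent ranks, yet their intervals overlap, so the ordering between them should not be read as a real difference.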

Throughout the report, both measures (93Q and 93E) are explicitly focused on the advancement of the field through research and scholarship, rather than on practical applications. While this focus is non-controversial in fields like classics and art history, it creates a problem for psychology, in which there are many programs, particularly professional schools of psychology, which aim to produce practicing clinical psychologists, as opposed to research-oriented clinical psychologists. Most such programs de-emphasize research.

On Size, Science, and Rank

In most fields, the number of graduates per program decreases steadily as one descends the rankings. In psychology, however, the lowest quartile of schools has more students than the second or third quartiles, reflecting the large number of students and programs aimed at clinical work.

APS Fellow Brendan Maher, co-chair of the NRC panel, pointed out that these programs are rated low clearly “because they are not focused on research.” Trained in the Boulder model, with a very strong tradition of basing therapy on a scientific foundation, Maher says “there has been an influx of people to the field who do not want to be scientists, but want to be clinicians. The original high standards promulgated by the Boulder model of training have since become eroded.” He says that the schools training clinicians ought to grant PsyD degrees instead, because by giving the “PhD” they are maintaining a claim to certain standards, and will be rated poorly by those in programs that award a research/science-oriented PhD.

Relative Position

Where do the social and behavioral sciences sit with regard to other disciplines in terms of “Scholarly Quality of Program Faculty”? According to ratings on the study’s 5-point scale (“Distinguished” to “Not Sufficient for Doctoral Study”), 56% of these programs were rated as “Distinguished,” “Strong,” or “Good.” By comparison, about 62% of the total 3,634 programs in the study were rated as such. Programs in arts and humanities had the highest share (68%) in this category. The other disciplines fared as follows: biological sciences (65%), engineering (63%), and physical sciences and mathematics (59%).

With regard to program “effectiveness in educating research scholars/scientists,” about two-thirds of all programs were considered to be extremely or reasonably effective on this 5-point scale, and fewer than 10% were considered to be not effective, according to the report. Only the top 54 psychology programs (29% of the 185 programs) had ratings above 3.00 on this scale.


It may seem somewhat perverse that reported reactions to the report (see September 22, 1995, Science) include the complaint that the ratings are biased against practical applications.

But psychologist and APS Fellow Gardner Lindzey makes a relevant distinction: “Actually, it’s not biased against applications. It’s biased against applications not seated within a research context.” An NRC study panelist, Lindzey illustrated the point with the research-oriented program at the University of Minnesota: “Minnesota has been one of the strongest in applied psychology for decades, and they’re always in the top ten.”

Nonetheless, even though the express purpose of the NRC work is to evaluate the pursuit and perpetuation of academic research and scholarship, the authors of the report would have evaluated employment histories for graduates of the various programs, including jobs in academe, government, business and industry, if not for a shortage of time and money. It is only natural to judge an education with reference to its ability to prepare one for a career other than research, even if the nominal focus is on research.

But, if future NRC reports include the provision for such evaluations, one can only speculate as to whether they will include assessment of the centrality of scientific and theoretical knowledge in the careers of those who chose non-research oriented employment.

Students Count

The faculty quality and program effectiveness rankings, although highly visible, are clearly not the whole story in the NRC report. Maher emphasizes the value to the potential graduate student of the multitude of other data included in the report. “If you’re a female student, would you choose a school with 30% women, or 3% women?” While this comment applies more to graduate study in engineering than to psychology, APS Member Patricia Gurin, chair of psychology at the University of Michigan, commented, “This department, from the 1960s, has had a special mission in the training of minorities.” The report’s tables are a rich source of data on gender and minority composition of departments as well as information on research and teaching assistantships and the average number of years required to obtain the PhD.

Not everyone will be happy with their ratings, and not only because the purely clinically oriented programs are low rated. As Gurin puts it, “Any national ratings will miss the departments that make a unique offering. They miss a lot of excellent things.” She also points out that many top-rated programs are so because they get outstanding students, while good to excellent education is, and should be, available to other populations, too. In the report, the panel discusses briefly the fact that in evaluating the quality of graduates from a program, they had no way to separate what the students would have achieved anyway, from the “value added” by the programs.

Additional Details

Brief as they are, and without statistical rigor, the US News and World Report ratings do attempt to go beyond the overall reputation rankings in another way. Specifically, they add a short list of top schools within subspecialties. And, for what it is worth, the top 27 psychology schools for overall reputation are to be found within the top 35 of the NRC list, with few major discrepancies in the rankings. The degree of correspondence between the US News ranking and NRC ranking varies across fields. In sociology, for example, the agreement is better than that for psychology. For the psychology data set, it appears that none of the US News ratings fall outside the confidence intervals in the NRC study.

Some of the schools listed as tops in the US News subspecialties are found much further down in the NRC rankings, but because the NRC report does not include comparable data on subdivisions of psychology, there is nothing to check these data against.

What is the National Academy of Sciences? It’s an independent quasi-governmental institution chartered by Congress in 1863 to provide scientific advice and guidance to Congress and the federal government. The National Research Council is the research arm of the National Academy of Sciences and comprises over 625 committees that research a wide range of science, engineering, medical, technology, and education topics and produce reports on their findings. Watch for the March 1996 Observer for a feature article on the Academy and its distinguished psychologist members.
