
Dating the Data

In current empirical psychological articles it is uncommon for the data collection to be dated. A typical experimental method section of a paper includes information about geographic location (e.g., “a large Midwestern university”), and the age, racial/ethnic, and sex makeup of the sample. Whether the data were collected in 1979, 1989, or 2004 is not part of that description. An increase in meta-analyses has led us to adhere strictly to standard reporting methods regarding standard deviations, means, and exact p-values. It would take relatively little effort to include data collection dates as well. It seems at least as relevant to know when as to know where data collection occurred.

Cook and Campbell (1979) address history as a threat to internal validity. Specifically, events that take place between a pretest and a posttest impact the ability to make causal inferences. They point out that history may be a particular threat for correlational data but may not be a threat to internal validity in experimental studies. However, it is important to expand our consideration beyond the issue of internal validity to a rather larger issue of understanding our work in its proper historical context. Part of this requires recognizing the discrepancy between publication dates and the actual date of data collection.

The most recent data points published in the most current journals are typically at least a year old (and likely older), even if we make very generous assumptions about how quickly we are able to collect, analyze, submit, review, revise, and resubmit. Journals have made some allowances for estimation by reporting when a manuscript was received. If a societal development occurs after the ‘received date’ then we may feel comforted that this societal development was not influencing the findings. If something happened between ‘received’ and ‘revision received’ we can hope the authors address any relevant new information.

However, it is unclear exactly when in the Great Before those data might have been collected. Certainly pressure to publish encourages us to publish data as soon as possible, but many data sets get shopped around to a couple of journals and go through several re-collections prior to publication. The data for Study 1 may well have been collected several semesters prior to those for Study 4, and most top-tier journals have begun to favor packages of studies over the sole empirical endeavor. The received dates are only when the data were received by that journal – that doesn’t mean the piece wasn’t received for the first time by another journal some years before. Further, those who find themselves at smaller colleges may need a year or even two to get different participants for each of three studies.

The issue is not that the data are not “new” but that we aren’t really sure when exactly they were collected.

There are several instances where dating data could be useful. Our understanding of some phenomena could be improved if we knew whether certain data sets were collected before or after 9/11, the Clinton administration’s redefinition of “sex,” or the airing of the King of the Hill episode exploring the Implicit Association Test (Greenwald, McGhee, & Schwartz, 1998).

Data collected circa 1987 that work out quite nicely in light of a 2003 theory seem quite publishable in 2005. In this case, dating our data would allow us to know that a theory worked even before it was articulated. Further, knowing the date of data collection would enable us to properly credit those who used a paradigm first (even if it wasn’t published until later); those interested in the history of the field might find this particularly enlightening.

Some psychological research is focused on processes (e.g., neuroscience, animal behavior, quantitative modeling, and cognitive processing) that are unlikely to be moved by global, national, or even regional situations. However, there are many topics that we study precisely because they are timely (e.g., political attitudes, work/family issues, and disaster responding). These topics are hot because there is a situation occurring in the world or nation that is changing the ways in which humans function. Some articles mention exactly these sorts of events in their introductions as catalysts for the research. Knowing the dates of data collection might be especially important as views toward a variety of issues (e.g., health care, government, and the economy) fluctuate even more rapidly in the information age.

There are some concerns about what it would mean to researchers if the date of data collection becomes a conventional part of the methods section. First, it is paramount that the reporting of the date of data collection not be taken to mean that “newer” is better. Nor is it necessarily reasonable that a reviewer should request newer data simply because they now know the “age” of the data. This would only cause a cycle of constant requests for ever more recent data, which wouldn’t really meet the spirit of dating our data. Data collected prior to a new theory, paradigm, or technological advance should not remain unpublished because the data are “too old.” Data from any point in time can be useful in enlarging our understanding of a phenomenon and should be considered for publication just as they are now, without regard for their age.

It might also be argued that if the study examines basic cognitive processes the temporal context is irrelevant. It is certainly true that for some work undertaken by psychologists we would not expect to see large differences in findings, theory, or participant functioning within a 20, 50, or perhaps even 100 year span. After all, a good theory will stand the test of time. Regardless of when the data are collected, the theory should work not only in spite of, but given, the variety of global or national situations that a participant might be experiencing. And what better way to demonstrate that a theory holds regardless of fluctuations in Zeitgeist than to include the date the data were collected in the methods section of the published paper?

Including temporal information will allow us to place studies in the larger context of our field’s history as well as in a global context when appropriate. More exact reporting of dates of data collection should prove quite useful as our science grows out of its relative youth. If we can’t see the future, we should at least accurately document the present.


Cook, T. D., & Campbell, D. T. (1979). Quasi-experimentation: Design and analysis issues for field settings. Boston: Houghton Mifflin.
Greenwald, A. G., McGhee, D. E., & Schwartz, J. L. K. (1998). Measuring individual differences in implicit cognition: The implicit association test. Journal of Personality and Social Psychology, 74, 1464-1480.
