Dating the Data

Traci Y. Craig

Member Article

September 3, 2004

In current empirical psychological articles it is uncommon for the data collection to be dated. A typical experimental method section of a paper includes information about geographic location (e.g., “a large Midwestern university”), and the age, race/ethnic, and sex makeup of the sample. Whether the data were collected in 1979, 1989, or 2004 is not part of that description. An increase in meta-analyses has led us to adhere strictly to standard reporting methods regarding standard deviations, means, and exact p-values. It would take relatively little effort to include data collection dates as well. It seems at least as relevant to know when as to know where data collection occurred.

Cook and Campbell (1979) address history as a threat to internal validity. Specifically, events that take place between a pretest and a posttest impact the ability to make causal inferences. They point out that history may be a particular threat for correlational data but may not be a threat to internal validity in experimental studies. However, it is important to expand our consideration beyond the issue of internal validity to a rather larger issue of understanding our work in its proper historical context. Part of this requires recognizing the discrepancy between publication dates and the actual date of data collection.

The most recent data points published in the most current journals are typically at least a year old (and likely older), even if we make very generous assumptions about how quickly we are able to collect, analyze, submit, review, revise, and resubmit. Journals have made some allowances for estimation. If a societal development occurs after the ‘received date,’ then we may feel comforted that this development was not influencing the findings. If something happened between ‘received’ and ‘revision received,’ we can hope the authors address any relevant new information.

However, it is unclear exactly when in the Great Before those data might have been collected. Certainly pressure to publish encourages us to publish data as soon as possible, but many data sets get shopped around to a couple of journals and go through several re-collections prior to publication. The data for Study 1 may well have been collected several semesters prior to those for Study 4, and most top-tier journals have begun to favor packages of studies over the sole empirical endeavor. The received date marks only when the manuscript was received by that journal; it does not mean the piece wasn’t received for the first time by another journal some years before. Further, those who find themselves at smaller colleges may need a year or even two to get different participants for each of three studies.

The issue is not that the data are not “new” but that we aren’t really sure when exactly they were collected.

There are several instances where dating data could be useful. Our understanding of some phenomena could be improved if we knew whether certain data sets were collected before or after 9/11, the Clinton administration’s redefinition of “sex,” or the airing of the King of the Hill episode exploring the Implicit Association Test (Greenwald, McGhee, & Schwartz, 1998).

Data collected circa 1987 that work out quite nicely in light of a 2003 theory seem quite publishable in 2005. In this case, dating our data would allow us to know that a theory worked even before it was articulated. Further, knowing the date of data collection would enable us to properly credit those who used a paradigm first (even if it wasn’t published until later); those interested in the history of the field might find this particularly enlightening.

Some psychological research is focused on processes (e.g., neuroscience, animal behavior, quantitative modeling, and cognitive processing) that are unlikely to be moved by global, national, or even regional situations. However, there are many topics that we study precisely because they are timely (e.g., political attitudes, work/family issues, and disaster responding). These topics are hot because there is a situation occurring in the world or nation that is changing the ways in which humans function. Some articles mention exactly these sorts of events in their introductions as catalysts for the research. Knowing the dates of data collection might be especially important as views toward a variety of issues (e.g., health care, government, and the economy) fluctuate even more rapidly in the information age.

There are some concerns about what it would mean to researchers if the date of data collection becomes a conventional part of the methods section. First, it is paramount that the reporting of the date of data collection not be taken to mean that “newer” is better. Nor is it necessarily reasonable that a reviewer should request newer data simply because they now know the “age” of the data. This would only cause a cycle of constant requests for more recent data, which wouldn’t really meet the spirit of dating our data. Data collected prior to a new theory, paradigm, or technological advance should not remain unpublished because the data are “too old.” Data from any point in time can be useful in enlarging our understanding of a phenomenon and should be considered for publication just as they are now, without regard for their age.

It might also be argued that if the study examines basic cognitive processes the temporal context is irrelevant. It is certainly true that for some work undertaken by psychologists we would not expect to see large differences in findings, theory, or participant functioning within a 20-, 50-, or perhaps even 100-year span. After all, a good theory will stand the test of time. Regardless of when the data are collected, the theory should work not only in spite of but given the variety of global and national situations that a participant might be experiencing. And what better way to demonstrate that a theory holds regardless of fluctuations in Zeitgeist than to include the date the data were collected in the methods section of the published paper?

Including temporal information will allow us to place studies in the larger context of our field’s history as well as in a global context when appropriate. More exact reporting of dates of data collection should prove quite useful as our science grows out of its relative youth. If we can’t see the future, we should at least accurately document the present.

References

Cook, T. D., & Campbell, D. T. (1979). Quasi-Experimentation: Design and Analysis Issues for Field Settings. Boston: Houghton Mifflin.
Greenwald, A. G., McGhee, D. E., & Schwartz, J. L. K. (1998). Measuring individual differences in implicit cognition: The implicit association test. Journal of Personality and Social Psychology, 74, 1464-1480.
