I reposted a quote on Twitter this morning from a paper entitled “The earth is flat (p > 0.05): Significance thresholds and the crisis of unreplicable research.” The quote, which is worth repeating, was “reliable conclusions on replicability…of a finding can only be drawn using cumulative evidence from multiple independent studies.”
An esteemed colleague (Daniël Lakens @lakens) responded “I just reviewed this paper for PeerJ. I didn’t think it was publishable. Lacks structure, nothing new.”
Setting aside the typical bromide that I mostly curate information on Twitter so that I can file and read things later, the last clause, “nothing new,” struck a nerve. It reminded me of some unappealing conclusions I have arrived at about the reproducibility movement, conclusions that point in a different direction: it is very, very important that we post and repost papers like this if we hope to move psychological science toward a more robust future.
From my current vantage point, producing new and innovative insights about reproducibility is not the point. There has been almost nothing new in the entire reproducibility discussion. And, that is okay. I mean, the methodologists (whether terroristic or not) have been telling us for decades that our typical approach to evaluating our research findings is problematic. Almost all of our blogs or papers have simply reiterated what those methodologists told us decades ago. Most of the papers and activities emerging from the reproducibility movement are not coming up with “novel, innovative” techniques for doing good science. Doing good science requires no novelty. It does not take deep thought or creativity to pre-register a study, do a power analysis, or replicate your research.
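To make that concrete, here is a minimal sketch of an a priori power analysis in Python, using the statsmodels library. The effect size, alpha, and power values are illustrative assumptions, not numbers from this post:

```python
# Minimal a priori power analysis: how many participants per group are
# needed to detect a given effect with adequate power? All inputs below
# are illustrative assumptions, not values taken from this post.
from statsmodels.stats.power import TTestIndPower

# Two-group design, expected effect size d = 0.4, alpha = .05, power = .80
n_per_group = TTestIndPower().solve_power(effect_size=0.4, alpha=0.05, power=0.80)
print(f"Required sample size per group: {n_per_group:.0f}")  # roughly 100
```

A handful of lines, and you know roughly what N you need before you collect a single participant.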
What is different this time is that we have more people’s attention than the earlier discussions did. That means we have a chance to make things better instead of letting psychology fester in a morass of ambiguous findings meant more for personal gain than for discovering and confirming facts about human nature.
The point is that we need to create an environment in which doing science well—producing cumulative evidence from multiple independent studies—is the norm. To make this the norm, we need to convince a critical mass of psychological scientists to change their behavior (I wonder what branch of psychology specializes in that?). We know from our initial efforts that many of our colleagues want nothing to do with this effort (the skeptics). And, these skeptical colleagues count in their ranks a disproportionate number of well-established, high-status researchers who have lopsided sway in the ongoing reproducibility discussion. We also know that another critical mass is trying to avoid the issue, but seems to be grudgingly okay with taking small steps like increasing their N or capitulating to new journal requirements (the agnostics). I would even guess that the majority of psychological scientists remain blithely unaware of the machinations of scientists concerned with reproducibility (the naïve) and think that it is only an issue for subgroups like social psychology (which we all know is not true). We know that many young people are entirely sympathetic to the effort to reform methods in psychological science (the sympathizers). But, these early career researchers face withering winds of contempt from their advisors or senior colleagues and problematic incentives for success that dictate that they continue to pursue poorly designed research (e.g., the prototypical underpowered series of conceptual replication studies, in which one roots around for p < .05 interaction effects).
So why post papers that reiterate these points, even if they are derivative or not as scintillating as we would like? Why write blogs that repeat what others have been saying for decades?
Because, change is hard.
We are not going to change the minds of the skeptics. They are lost to us. That so many of our most highly esteemed colleagues are in this group simply makes things more challenging. The agnostics are like political independents. Their position can be changed, but it takes a lot of lobbying, and they often have to be motivated through self-interest. I’ve seen an amazingly small number of agnostics come around after six years of blog posts, papers, presentations, and conversations. These folks come around one talk, one blog, or one paper at a time. And really, it takes multiple messages to get them to change. The naïve are not paying attention, so we need to repeat the same message over and over and over again in hopes that they might actually read the latest reiteration of Jacob Cohen. The early career researchers often see clearly what is going on but then must somehow negotiate the landmines that the skeptics and the reproducibility methodologists throw in their way. In this context, re-messaging, re-posting, and re-iterating serve the purpose of creating the perception that doing things well is supported by a critical mass of colleagues.
Here’s my working hypothesis. In the absence of wholesale changes to incentive structures (grants, tenure, publication requirements at journals), one of the few ways we will succeed in making it the norm to “produce cumulative evidence from multiple independent studies” is by repeating the reproducibility message. Loudly. By repeating these messages, we can drown out the skeptics, move a few agnostics, enlighten the naïve, and create an environment in which it is safe for early career researchers to do the right thing. Then, in a generation or two, psychological science might actually produce useful, cumulative knowledge.
So, send me your huddled, tired essays repeating the same messages about improving our approach to science that we’ve been making for years and I’ll post, repost, and blog about them every time.
Brent W. Roberts
The quote, which is worth repeating, was “reliable conclusions on replicability…of a finding can only be drawn using cumulative evidence from multiple independent studies.”
Perhaps StudySwap can play a role by having different labs collaborate on replicating each other’s studies: https://osf.io/view/studyswap/
I agree that whether something is new or not is irrelevant to science, just as whether an effect is statistically significant or a paper is easy/fun to read is irrelevant to science (see Giner-Sorolla, 2012, in Perspectives on Psychological Science).
I disagree, however, that everything has already been said and nothing IS new in methodology and statistics papers nowadays. Each year, more than a dozen papers appear with new insights and tools/methodology to assess and deal with ‘reproducibility’.
I don’t want to diminish the work that is being done on the reproducibility issues. And, there are numerous technologies that are new and quite helpful: the OSF, StudySwap, AsPredicted, TIVA, p-curves, R-statistics, etc. These things are great. But, the real issue is robustness. No statistic, platform, or shiny app replaces the mundane documentation that you can reproduce an effect with ease and even under duress; see Rolf Zwaan’s new paper for an example: https://osf.io/preprints/psyarxiv/rbz29. No new fancy statistic is necessary to evaluate whether you get the same effect over and over again.
Relatedly, I chafe at the obsession with innovation. I mean, I love creative work. But, a recent grant call for “innovative” methods to evaluate reproducibility was the height of silliness. We don’t need innovation. We just need replication. The problems with our science can largely be linked to the incentive system, which starts with the granting agencies demanding that you do something innovative rather than get it right. It has led to a perverse world in which we really don’t care whether what we do is right as long as it looks new and shiny.
A great rough taxonomy of the different types of researchers in relation to open science/reproducibility positions (skeptics, agnostics, naïve, sympathizers); I will start using these terms from now on, adding a few additional terms and subtypes.
That said, your term “skeptic” is somewhat suboptimal in this context, given that such folks are skeptical of the skeptics who adopt/advocate open science/reproducibility positions. Perhaps the term “dissenters” would be better/clearer?
Admittedly, I considered a few other labels for the skeptics and some additional categories. I don’t like “dissenters” because it is really the reproducibility group that is dissenting; we don’t want the status quo. As for alternatives for the skeptics, what would the German word for “methodological Luddite” be? That would capture the right flavor for the skeptics: let’s keep things like the old ways, etc.
I left out the reproducibility group, of course. There has been an excess of names used to describe them/us. I’d put that one up for a vote.
I also left out the “politicians.” The ones who say let’s all get along and that there’s some compromise position when, of course, there really isn’t. There aren’t many of them, but they are conspicuous.
Any other groups?
I guess we could call our tribe the “New Bad People” a la Heathers’ blog https://medium.com/@jamesheathers/meet-the-new-bad-people-4922137949a1
Good point about the suboptimality of the alternative term “dissenter.” A better, if clunkier, alternative I’ve used is “status quo defenders,” a group that can be further broken down into at least two subtypes, which will not be named at this time.
And yes, I agree the “politicians” group is an important one to point out.
For the “reproducibility group,” I’ve used the terms “reformers”/“open science proponents,” a group that can be further decomposed into “non-preachy adopters,” “zealous respectful reformers” (the subgroup I include myself in), and “militant/pro-public-shaming reformers” (Uli? ;-0)
One could use “the old guard” instead of “skeptics,” especially if trying to connote “Luddite.” I would say “old guard” sounds kind of insulting, but they are probably already insulted by the charge that status quo methods are insufficient.
Sorry, but Daniel Lakens is much more an adolescent trying to criticize everyone than a real scientist. His job consists of logging in to Facebook and chatting with other adolescents….
Dear FCA,
Consider several things. First, Daniel has been at this for a very long time. For the “top section” of the flagship journal of his field to continue to publish poor-quality work can get very distressing after a while. After all, the changes he and others, like myself, have recommended are not technically or resource intensive, despite the protests. To see colleagues continue to willfully design uninformative studies and then try to infer something from that work can get frustrating. It is analogous to being a parent who asks their child to clean their room. Said child then stuffs everything under the bed and in the closet and says, “Look dad, everything is clean!” At which point the father opens the closet door and the mess tumbles out. After five years of watching the mess tumble out of the closet, you can get exasperated. Second, there is no use saying these things here. Say them to Daniel on Twitter and Facebook. Some people have already done so, and to good effect. Third, calling him an “adolescent” and not a “real scientist” is the same as Daniel saying that his colleagues are incompetent. While it is always impressive to me how human we scientists are (as in, how frail we are in the face of what we know we should do, like not calling people names), it is still a good idea not to practice that which you criticize.
Brent