A Hands-On Introduction to Preregistration

Jason Geller

Princeton Univeristy

10-21-22

Outline

  • The replication crisis and how we got here

  • Preregistraion

    • What
    • Why
    • Where
    • How
  • Do’s and dont’s of preregistration

Poll

The Replication Crisis: A Timeline

Daniel Lakens blog post

Psi Demo

The Replication Crisis: A Timeline

The Replication Crisis: A Timeline

Open Science Collaboration (2015)

Open Science Collaboration (2015)

  • Replications used materials supplied by original authors and were high-powered

    • Results: 39% of the original studies were successfully replicated
      • 25% of social psychology studies replicated
      • 50% of cognitive psychology studies replicated

Potential Reasons

High false positive rate in original studies (Simmons, Nelson, & Simonsohn, 2011)?

Why do we have a problem?

  • Researcher Degrees of Freedom
  • Optional stopping (ending data collection when p<0.05)
  • Selective reporting of variables and conditions
  • Exclusion of participants (“Outlier”)
  • Conducting tons of analyses but only report the one that best fits the hypothesis (“Garden of forking paths”)
  • Adding covariates to the analysis
  • Transforming the data
  • Forming hypotheses post-hoc (HARKing)

Potential Soloutions

  • Preregistration

    • A formalized (time-stamped) document that specifies all hypotheses & methodological choices in writing prior to data collection

      • Reduces RDoF

      • Makes it hard to p-hack

      • Can’t HARK

Why Preregister?

  • Planning improves the quality and transparency of your research

  • Facilitates replication/reproducibility

  • Makes the research process more transparent

  • Allows to clearly distinguish between a priori and post-hoc decisions (confirmatory and exploratory analyses)

  • Strict documentation of studies

  • Makes it harder to fool yourself and others

What to preregister?

  • Everything you can specify beforehand
  • Fixed decisions (e.g., outcome measures)
  • Decision rules (e.g., “if we cannot collect 100 participants until June 5, 2020, we will stop the data collection that very day and…”)
  • Statistical models (ideally analysis scripts, including data-dependent decision rules)
  • Rules for interpretation we will interpret an effect size of X as support for our hypothesis that…“)
  • General rules of your lab regarding standard operation procedures (SOP) or outlier exclusion

Where to Preregister?

  • OSF
    • Large number of templates
    • All participating authors can cancel within 48 hours
    • Prereg can be kept private for up to four years but then gets published
    • Withdrawing possible but leaves behind basic metadata
    • Also allows to make scripts, materials, preprints public

Where to Preregister?

  • Aspredicted.org
    • Short: 9 questions

    • All authors approve

    • Does not become public until authors act to publish it

When to preregister?

  • Ideally before data collection starts

  • Before new round of data collection after peer review

  • Everything before analysis is ok as long as you’re transparent

  • Secondary data preregistration also possible (template on OSF)

Preregistration

  1. Create an OSF account

  2. Start a preregistration

  • First, sign in to the OSF, and go to https://osf.io/prereg/.

  • You will be taken to the “OSF Preregistration” landing page.

Preregistration

  • Click start a new preregisrtation

Preregistration

  • A textbox will appear

    • Enter a title for your preregistration into the textbox, then click Continue

Preregistration

  • An OSF project will be created, and you will be taken to the project’s Registrations page.
  • Click the New registration button

Preregistration

  • Select OSF Preregistration from the list, then click the Create Draft button.

Preregistration

  • The preregistration form will appear

Preregistration

  • The preregistration form will appear

What to preregister?

  1. Hypotheses

  2. Method

  3. Analysis - confirmatory

Do’s and Dont’s

  • Research Question or Hypothesis

    • Bad Answer: Building on the work of Picasso (1901-1904), we hypothesized that….

    • Good Answer: Does sadness increase preference for the color blue?

Do’s and Dont’s

  • Dependent Variable

    • Bad answer: Preference for the color blue

    • Good answer: Participants will rate their liking for red, blue, orange, and purple on 7-points scales (1 = not at all; 7 = an extreme amount). Preference for blue will be defined as the difference between a participant’s rating for blue and their average rating of the three non- blue.

Do’s and Dont’s

  • Manipulations

    • Bad answer: We will manipulate mood by having participants watch different videos.

    • Good answer: Before rating their color preferences, participants will be randomly assigned to one of three conditions in which they watch a clip from either a sad video (My Dog Skip), a happy video (Pitch Perfect), or a neutral video (Gone Curling).

Do’s and Dont’s

  • Analyses

    • Bad answer: We will regress preference for the color blue on mood condition.

    • Good answer: We will run an OLS regression predicting preference for the color blue with condition (coded 1 = sad video; 0 = happy or neutral video). We will control for gender (1=male; 0 = female).

Do’s and Dont’s

  • Outliers & Exclusions

    • Bad answer: We will exclude participants who are inattentive, and those who show an extreme preference for the color orange.

    • Good answer: We will exclude participants who fail at least two out of three attention checks that will be included at the beginning of our study (before the manipulation). We will also exclude participants whose rating of orange is higher than 5 on the 7-point scale.

Do’s and Dont’s

  • Sample size

    • Bad answer: We conducted a power analysis that showed that… And so we decided to collect between 100 and 200 observations

    • Good answer: We will stop data collection once 150 participants have submitted a response on MTurk. Deviations from this goal are entirely due to MTurk software and outside of our control.

Potential Solutions

  • Registered Reports

    • Review and acceptance prior to data collection

Potential Solutions

  • Registered Reports
    • Reduces RDoF
    • Can’t p-hack
    • Can’t HARK
    • Reduces publication bias