Saturday, March 19, 2016

I Was Wrong!

Yesterday, ResearchGate suggested that I read a new article reporting that ego depletion can cause aggressive behavior. This was a surprise to me because word has it that ego depletion does not exist, so surely it cannot be a cause of aggressive behavior.

The paper in question looks about like you'd expect: an unusual measure of aggression, a complicated 3 (within) × 2 (between) × 2 (between) design, a covariate tossed into the mix just for kicks, a heap of measures collected and mentioned in a footnote but not otherwise analyzed. It didn't exactly change my mind about ego depletion, much less its role in aggressive behavior.

But it'd be hypocritical of me to criticize this ill-timed paper without mentioning the time I reported an ego-depletion effect through effect-seeking, exploratory analysis. I've also been meaning to change up my blogging regimen a bit. It's time I switched from withering criticism to withering self-criticism.

The paper is Engelhardt, Hilgard, and Bartholow (2015), "Acute exposure to difficult (but not violent) video games dysregulates cognitive control." In this study, we collected a hearty sample (N = 238) and had them play one of four modified versions of a first-person shooter game, a 2 (Violence: low, high) × 2 (Difficulty: low, high) between-subjects design.

To manipulate violence, I modified the game's graphics. The violent version had demons and gore and arms bouncing across the floor, whereas the less violent version had silly-looking aliens being warped home. We also manipulated difficulty: Some participants played a normal version of the game in which monsters fought back, while other participants played a dumb-as-rocks version where the monsters walked slowly towards them and waited patiently to be shot.

After the game, participants performed a Spatial Stroop task. We measured the magnitude of the compatibility effect, figuring that larger compatibility effects would imply poorer control. We also threw in some no-go trials, on which participants were supposed to withhold a response.
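(For concreteness: the compatibility effect is just the mean response-time difference between incompatible and compatible trials, with no-go commissions as a separate index of response inhibition. Here's a minimal sketch of how such a task could be scored; the data frame and its column names are hypothetical stand-ins, not our actual pipeline.)

```python
# Minimal sketch of scoring a Spatial Stroop task from trial-level data.
# The columns (subject, compatible, no_go, responded, correct, rt) are
# hypothetical placeholders, not the variable names from our study.
import pandas as pd

def score_stroop(trials: pd.DataFrame) -> pd.DataFrame:
    """Return one row per subject: compatibility effect + no-go commissions."""
    go = trials[~trials["no_go"] & trials["correct"]]
    # Mean RT per subject, separately for compatible and incompatible go trials
    rt = go.groupby(["subject", "compatible"])["rt"].mean().unstack()
    nogo = trials[trials["no_go"]]
    return pd.DataFrame({
        # Incompatible minus compatible RT: larger = poorer interference control
        "compatibility_effect": rt[False] - rt[True],
        # Proportion of no-go trials on which a response was wrongly emitted
        "nogo_commission_rate": nogo.groupby("subject")["responded"].mean(),
    })
```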

Our hypothesis was that playing a difficult game would lead to ego depletion, causing poorer performance on the Spatial Stroop. This might have been an interesting refinement on the claim that violent video games teach their players poorer self-control.

We looked at Stroop compatibility and found nothing. We looked at the no-go trials and found nothing. No effects of violence, no effects of difficulty. So what did we do?

We needed some kind of effect to publish, so we reported an exploratory analysis, finding a moderated-mediation model that sounded plausible enough.

We figured that maybe the difficult game was still too easy. Maybe participants with more video game experience found the game easy and so never experienced ego depletion. So we split the data further, according to how much video game experience our participants had, figuring that the effect might emerge in the subgroup of inexperienced participants playing the difficult game.

The conditional indirect effect of game difficulty on Stroop compatibility, as moderated by previous game experience, wasn't even, strictly speaking, statistically significant: p = .0502. And as you can see from our Figure 1, the moderator is very lopsided: only 25 people out of the sample of 238 met the post-hoc definition of "experienced player".
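(If you haven't wrestled with PROCESS: a conditional indirect effect is the product of the X-to-mediator path, evaluated at a chosen value of the moderator, and the mediator-to-outcome path, typically tested with a bootstrap interval. Here's a minimal sketch of that computation; the arrays, variable names, and first-stage moderation layout are generic illustrations, not a reproduction of our exact model.)

```python
# Minimal sketch of bootstrapping a conditional indirect effect, the kind
# of quantity PROCESS reports. All names here are generic placeholders.
import numpy as np

rng = np.random.default_rng(1)

def ols(preds, y):
    """Least-squares coefficients, with an intercept column prepended."""
    X = np.column_stack([np.ones(len(y)), preds])
    return np.linalg.lstsq(X, y, rcond=None)[0]

def conditional_indirect_ci(x, w, m, y, w0, n_boot=5000):
    """95% bootstrap CI for (a1 + a3*w0) * b1, where
    M = a0 + a1*X + a2*W + a3*X*W   (X -> M path moderated by W)
    Y = b0 + b1*M + b2*X + b3*W + b4*X*W."""
    n = len(y)
    boots = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, n)            # resample whole cases
        xs, ws, ms, ys = x[idx], w[idx], m[idx], y[idx]
        a = ols(np.column_stack([xs, ws, xs * ws]), ms)
        b = ols(np.column_stack([ms, xs, ws, xs * ws]), ys)
        boots[i] = (a[1] + a[3] * w0) * b[1]   # indirect effect at W = w0
    return np.percentile(boots, [2.5, 97.5])
```

Whether that interval excludes zero at a given moderator value is what gets reported as a "significant" conditional indirect effect. With only 25 players on one side of the split, you can imagine how wobbly the estimate gets at the "experienced" end.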

And the no-go trials on the Stroop? Those were dropped from analysis: our footnote 1 says our manipulations failed to influence behavior on those trials, so we didn't bother talking about them in the text.

So to sum it all up, we ran a study, and the study told us nothing was going on. We shook the data a bit more until something slightly more newsworthy fell out of it. We dropped one outcome and presented a fancy PROCESS model of the other. (I remember at some point in the peer review process being scolded for finding nothing more interesting than ego depletion, which was accepted fact and old news!)

To our credit, we explicitly reported the exploratory analyses as being exploratory, and we reported p = .0502 instead of rounding it down to "statistically significant, p = .05." But at the same time, it's embarrassing that we structured the whole paper to be about the exploratory analysis, rather than the null results. 

In the end, I'm grateful that the ego-depletion Registered Replication Report (RRR) has set the record straight. It means our paper probably won't get cited much except as a methodological or rhetorical example, but it also means that it isn't going to clutter up the literature and confuse things in the future.

In the meantime, it's shown me how easily one can pursue a reasonable post-hoc hypothesis and still land far from the truth. And I still don't trust PROCESS.

8 comments:

  1. Joe,

    Just a quick note to say that there is something truly inspiring about this post. I really admire you for doing this.

    1. That's very kind of you to say, but there's really very little to it. We managed to squeeze out a paper on a thin, exploratory finding. Now, thanks to the ego depletion RRR, we know that our finding is more likely to be a false positive.

      Rather than try to defend the finding to the death, I think it's more sensible (and easier!) just to let it go. It was a speculative finding based on some post-hoc rummaging. It's beyond me why some researchers think this kind of flimsy exploratory result is worth defending so fiercely -- it's just not my style.

  2. If you now think the result is wrong, and if you don't think that people should cite it, have you given any thought to retracting the article?

    1. Someone asked me this on Facebook yesterday, too. Here's my reply:

      I don't think retraction is at all appropriate in our case because the exploratory finding is clearly labeled as exploratory. I don't think that exploratory or implausible results should be expunged from the literature. (Talk about your publication bias!)

      I do think it's important that everything be reported to give the reader the fullest sense of the strength of the evidence, and I think we were very direct about that. The exact p-value is reported (p = .0502, not rounded down to "significant, p = .05"), and the presented model is clearly labeled "exploratory." It's not a biased summary of our evidence, even if the conclusions we drew from that evidence may be wrong.

  3. Translation:

    "I realize that my admission will result in no one citing my paper, even though no one will cite it anyway once all this ego depletion stuff is revealed as bullshit. However, I can still use this to my advantage. If I admit to doing some slightly improper behavior that everyone else does but won't admit, engage in some self-castigation, and then give us all the moral lesson that comes out of it, I can get some public sympathy. Fame level +5."

  4. Very interesting. Thanks so much for sharing this info.

  5. Why don't you trust PROCESS? Or is it the process you don't trust? :)

  6. To me, the most telling piece is "We needed some kind of effect to publish". It isn't necessarily that you wanted an effect to publish, but that journals wanted you to have an effect to publish. By and large, journals still require this. Even as a "new" researcher (a few years into my PhD), I have had papers declined on the grounds that "your effects are all null, we see nothing publishable here". If journals don't change their ways, our ways cannot change. We still need incomes, and frankly, if we have to shake our data a bit to get one, so be it. Let us publish null results when the methodology is sound, or don't complain about QRPs.
