twelve gratitude rcts ranked by control rigour
twelve gratitude rcts ranked by what they controlled for. the effect collapses as rigour rises. the honest read on gratitude journaling research.
The gratitude prescription has hardened into a slogan. Three things, every morning, for happiness. Behind the slogan are roughly thirty years of randomised trials and three serious meta-analyses, and the meta-analyses do not say what the slogan says. Gratitude works, somewhat, against weak controls. Set it against any other writing exercise of the same length and most of the effect goes away. This post ranks twelve gratitude RCTs by the rigour of what they controlled for, then watches the effect size collapse as the controls get harder. The size of the claim should match the rigour of the test that produced it.
the question that breaks the gratitude story
The story descends from Emmons and McCullough's 2003 study of counting blessings versus burdens. [5] Three short trials with modest effects. The number that travelled was Study 1's d ≈ 0.42 against a daily-hassles control. Two decades of consumer wellness writing has cited Emmons and stopped there.
The two meta-analyses that should have ended the conversation are Davis 2016 with thirty-two samples and Cregg and Cheavens 2021 with twenty-seven RCTs and N of 3,675. [3][2] They ask one methodological question that almost no consumer writing engages with. Effective compared to what? The phrasing is from Wood, Froh and Geraghty's 2010 review, the canonical critique. [8] If the comparison is do nothing, the effect is medium. If the comparison is do anything else of the same length, the effect is small. If the comparison is do another positive-psychology exercise of equal expectancy, the effect is roughly zero.
the four tiers of control
Trials get ranked by what gratitude was tested against.
- Waitlist or measurement-only. The gratitude group writes. The control group does nothing. Any improvement might be attention, expectancy, or self-monitoring. The weakest comparison the field uses.
- Negative-event diary. The control writes about hassles or unpleasant events. Better than nothing, but the contrast is gratitude versus dwelling on what is wrong, not gratitude versus a fair alternative.
- Matched activity. The control writes about daily events, weekly activities, mood, or memories of similar length and structure. This is where the literature gets disciplined.
- Psychologically active. The control runs a different positive-psychology task with similar expectancy. Best-possible-self journaling, acts of kindness, early-memories writing, to-do lists for tomorrow. The hardest comparison, and the one closest to the choice a real reader actually faces.
The progression matters because the marketing claim, gratitude rewires your brain, is a tier-one claim made on tier-one evidence and quietly extended beyond what tier-three and tier-four trials support.
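The four tiers form an ordered scale, and one way to make that ordering explicit is a small sketch like the one below. The names `ControlTier` and `rank_by_rigour` are illustrative assumptions, not anything the reviewed papers define. The example trials and their tiers are taken from this post.

```python
from enum import IntEnum


class ControlTier(IntEnum):
    """Control conditions ordered from weakest (1) to hardest (4) comparison."""
    WAITLIST = 1                # control does nothing, or is only measured
    NEGATIVE_EVENT_DIARY = 2    # control writes about hassles
    MATCHED_ACTIVITY = 3        # control writes neutral text of similar length
    PSYCHOLOGICALLY_ACTIVE = 4  # control runs another positive-psychology task


def rank_by_rigour(trials):
    """Sort (name, tier) pairs from weakest to hardest control.

    sorted() is stable, so trials in the same tier keep their input order.
    """
    return sorted(trials, key=lambda trial: trial[1])


# Tier assignments as described in this post.
EXAMPLES = [
    ("Sergeant 2011", ControlTier.PSYCHOLOGICALLY_ACTIVE),
    ("Emmons & McCullough 2003, Study 1", ControlTier.NEGATIVE_EVENT_DIARY),
    ("Kerr 2015", ControlTier.MATCHED_ACTIVITY),
    ("Cheng 2015", ControlTier.WAITLIST),
]

if __name__ == "__main__":
    for name, tier in rank_by_rigour(EXAMPLES):
        print(f"tier {tier.value}: {name} ({tier.name.lower()})")
```

An integer-backed enum keeps the comparison honest: a claim tested only at tier one cannot be sorted above a tier-four result.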
the chart, the pattern
Twelve trials drawn from Cregg and Cheavens's 2021 corpus, ordered roughly by control rigour. Effect size is the absolute value of Hedges' g on depressive symptoms, so a larger number means a larger effect in favour of gratitude.
| study (control type) | absolute Hedges' g |
|---|---|
| Cheng 2015 (waitlist) | 0.64 |
| Booker 2017 (waitlist) | 0.45 |
| Southwell 2017 (waitlist) | 0.33 |
| O'Leary 2015 (waitlist) | 0.28 |
| Watkins 2015 (matched) | 0.6 |
| Lambert 2012 (matched) | 0.36 |
| Jackowska 2016 (matched) | 0.35 |
| Kerr 2015 (matched) | 0.34 |
| Manthey 2016 (active) | 0.22 |
| Mongrain 2012 (active) | 0.21 |
| Sergeant 2011 (active) | 0.05 |
| Lyubomirsky 2011 (active) | 0.02 |
The four largest trials by sample size sit in the active-control tier. Sergeant 2011 with N of 514, Manthey 2016 with N of 300, Lyubomirsky 2011 with N of 208, Mongrain 2012 with N of 190. All four return an absolute g of at most 0.22, and two return effects indistinguishable from zero. The smaller trials with weaker controls produce the headline numbers consumer wellness sites quote.
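A quick way to check the pattern is to average the table by tier. The sketch below uses only the twelve values listed above. The means are unweighted, so they are a rough descriptive summary and not the pooled estimates a meta-analysis would report. The names `TRIALS` and `tier_means` are illustrative.

```python
from statistics import mean

# Absolute Hedges' g values transcribed from the table in this post.
TRIALS = {
    "waitlist": {
        "Cheng 2015": 0.64,
        "Booker 2017": 0.45,
        "Southwell 2017": 0.33,
        "O'Leary 2015": 0.28,
    },
    "matched": {
        "Watkins 2015": 0.60,
        "Lambert 2012": 0.36,
        "Jackowska 2016": 0.35,
        "Kerr 2015": 0.34,
    },
    "active": {
        "Manthey 2016": 0.22,
        "Mongrain 2012": 0.21,
        "Sergeant 2011": 0.05,
        "Lyubomirsky 2011": 0.02,
    },
}


def tier_means(trials):
    """Unweighted mean of absolute g for each control tier."""
    return {tier: mean(values.values()) for tier, values in trials.items()}


if __name__ == "__main__":
    for tier, avg in tier_means(TRIALS).items():
        print(f"{tier:<9} mean |g| = {avg:.4f}")
```

By this arithmetic the tier means are 0.425 for waitlist, 0.4125 for matched, and 0.125 for active. In this particular sample of twelve, most of the drop shows up at the step from matched to active controls.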
Cregg's pooled finding follows the chart exactly. Against waitlist controls, gratitude reduced depressive symptoms by g of −0.51, a medium effect. Against active controls matched on time and structure, the effect collapsed to g of −0.18. Removing two outliers (Geraghty 2010, Ki 2009) shrank the depression effect by 26% and rendered the anxiety effect non-significant.
where the effect lives, where it doesn't
Davis 2016 reached the same conclusion five years earlier, with cleaner numbers. Across thirty-two samples, gratitude beat measurement-only controls on psychological well-being by d of 0.31. Against psychologically active comparisons it was d of −0.03. After trim-and-fill correction for publication bias, the matched-activity edge collapsed to d of 0.02. The authors wrote, in plain prose, that gratitude interventions may operate primarily through placebo effects.
> gratitude interventions had a medium effect when compared with waitlist-only conditions, but only a trivial effect when compared with putatively inert control conditions involving any kind of activity.
Dickens's 2017 series of fifty-six meta-analyses, drawing on a different study set, lands on the same conclusion. Well-being effects of d ≈ 0.31 against neutral controls drop to d ≈ 0.17 against active controls. [4] Three meta-analyses published between 2016 and 2021, with overlapping but not identical inclusion criteria, all converge on the same moderator. Control type explains most of the variance the consumer literature attributes to gratitude itself.
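To put the three meta-analyses side by side, one rough descriptive measure is the share of the weak-control effect that survives an active control. The sketch below uses only the pooled figures quoted in this post, with signs flipped where needed so that positive always favours gratitude. The ratio is an illustration of the arithmetic and not a statistic any of the authors report. `POOLED` and `fraction_retained` are illustrative names.

```python
# (weak-control effect, active-control effect) as quoted in this post.
# Cregg & Cheavens report g of -0.51 and -0.18 on depressive symptoms;
# the signs are flipped here so that positive favours gratitude.
POOLED = {
    "Cregg & Cheavens 2021 (depression, g)": (0.51, 0.18),
    "Davis 2016 (well-being, d)": (0.31, -0.03),
    "Dickens 2017 (well-being, d)": (0.31, 0.17),
}


def fraction_retained(weak, active):
    """Share of the weak-control effect that remains against an active control."""
    if weak == 0:
        raise ValueError("weak-control effect must be non-zero")
    return active / weak


if __name__ == "__main__":
    for label, (weak, active) in POOLED.items():
        share = fraction_retained(weak, active)
        print(f"{label}: {share:.0%} of the weak-control effect retained")
```

On these figures roughly a third of the effect survives in Cregg and Cheavens, about half in Dickens, and none in Davis. The three disagree on how much is left, but all agree that the direction is down.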
what about sleep, immunity, the other branch
The strongest non-psychological signal is sleep. Boggiss's 2020 review found subjective sleep quality improved in five of eight RCTs that measured it. [1] The other physical-health outcomes (inflammation, blood pressure, glycaemic control) were equivocal or under-powered. The sleep finding is the one corner of the gratitude literature where a competent placebo control would still leave a genuine signal, and it points at the mechanism. A short bedtime list of three things, gratitude or otherwise, displaces pre-sleep cognitive arousal. Scullin's 2018 polysomnography trial randomised young adults to spend five minutes writing either a specific to-do list or a list of tasks already completed. [6] The to-do group fell asleep about nine minutes faster (mean 15.8 minutes against 25.1), Cohen's d of 0.63. What does the work is structuring attention before bed. Gratitude at bedtime evicts whatever was about to loop, the same job a planning list does. The signal echoes what survives in the older expressive-writing literature.
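The Scullin figures can be read back against each other with a little arithmetic. The sketch below assumes the quoted Cohen's d is the raw difference in group means divided by a pooled standard deviation. That is the textbook definition, but this post does not say how the trial computed it, so the implied spread is a back-of-envelope number only. `implied_pooled_sd` is an illustrative name.

```python
def implied_pooled_sd(mean_a, mean_b, d):
    """Back out the pooled SD implied by two group means and a Cohen's d.

    Assumes d = |mean_a - mean_b| / pooled_sd (an assumption, see the lead-in).
    """
    if d == 0:
        raise ValueError("d must be non-zero")
    return abs(mean_a - mean_b) / abs(d)


if __name__ == "__main__":
    to_do_minutes = 15.8      # mean time to fall asleep, to-do list group
    completed_minutes = 25.1  # mean time to fall asleep, completed-list group
    reported_d = 0.63

    gap = completed_minutes - to_do_minutes
    sd = implied_pooled_sd(to_do_minutes, completed_minutes, reported_d)
    print(f"gap between groups: {gap:.1f} minutes")
    print(f"implied pooled SD:  {sd:.1f} minutes")
```

Under that assumption the 9.3-minute gap implies a spread of roughly fifteen minutes between individuals, which is a reminder that the average benefit sits inside a lot of person-to-person variation.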
a quieter case for keeping the practice
The reading of thirty years of trials is not that gratitude journaling does nothing. Against a do-nothing baseline, almost any structured positive writing exercise produces a small, real benefit. Gratitude is one structured exercise among several. It does not justify a daily ritual on the strength of a meta-analytic effect that disappears under a competent placebo.
What survives is humbler. Two minutes of attention to something that went well, kept as a minimum-effective practice, is worth doing. Pair it with the rest of a one-line log, and keep asking: compared to what? With the marketing claim quieter, the practice itself is what is left to argue about.
references.
- 1. Boggiss, A.L. et al. (2020). A systematic review of gratitude interventions: Effects on physical health and health behaviors. Journal of Psychosomatic Research 135, 110165. doi:10.1016/j.jpsychores.2020.110165
- 2. Cregg, D.R. & Cheavens, J.S. (2021). Gratitude interventions: Effective self-help? A meta-analysis of the impact on symptoms of depression and anxiety. Journal of Happiness Studies 22(1), 413–445. doi:10.1007/s10902-020-00236-6
- 3. Davis, D.E. et al. (2016). Thankful for the little things: A meta-analysis of gratitude interventions. Journal of Counseling Psychology 63(1), 20–31. doi:10.1037/cou0000107
- 4. Dickens, L.R. (2017). Using gratitude to promote positive change: A series of meta-analyses investigating the effectiveness of gratitude interventions. Basic and Applied Social Psychology 39(4), 193–208. doi:10.1080/01973533.2017.1323638
- 5. Emmons, R.A. & McCullough, M.E. (2003). Counting blessings versus burdens: An experimental investigation of gratitude and subjective well-being in daily life. Journal of Personality and Social Psychology 84(2), 377–389. doi:10.1037/0022-3514.84.2.377
- 6. Scullin, M.K. et al. (2018). The effects of bedtime writing on difficulty falling asleep: A polysomnographic study comparing to-do lists and completed activity lists. Journal of Experimental Psychology: General 147(1), 139–146. doi:10.1037/xge0000374
- 7. Seligman, M.E.P. et al. (2005). Positive psychology progress: Empirical validation of interventions. American Psychologist 60(5), 410–421. doi:10.1037/0003-066X.60.5.410
- 8. Wood, A.M. et al. (2010). Gratitude and well-being: A review and theoretical integration. Clinical Psychology Review 30(7), 890–905. doi:10.1016/j.cpr.2010.03.005