Distributions of baseline categorical variables were different from the expected distributions in randomized trials with integrity concerns

Mark J Bolland, Greg D Gamble, Alison Avenell, David J Cooper, Andrew Grey

Research output: Contribution to journal › Article › peer-review

Abstract

OBJECTIVE: Comparing observed and expected distributions of baseline continuous variables in randomized controlled trials (RCTs) can be used to assess publication integrity. We explored whether baseline categorical variables could also be used.

STUDY DESIGN AND SETTING: The observed and expected (binomial) distributions of all baseline categorical variables were compared in four sets of RCTs: two control sets and two sets with publication integrity concerns. We also compared the calculated and reported p-values for baseline variables.
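
As an illustration of the comparison described above, the following is a minimal sketch of how an expected distribution of between-group differences in frequency counts could be derived under a binomial model. The abstract does not give the authors' exact procedure, so the function names, the use of a single pooled proportion `p`, and the example group sizes are all assumptions made for illustration.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k events in n participants with event probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def expected_abs_diff_distribution(n1, n2, p):
    """Expected distribution of |X1 - X2| for one baseline categorical variable.

    Assumption (not from the source): under successful randomization the
    counts in the two arms are independent binomials, X1 ~ Bin(n1, p) and
    X2 ~ Bin(n2, p), sharing the same pooled proportion p.
    Returns a dict mapping each absolute difference to its probability.
    """
    dist = {}
    for x1 in range(n1 + 1):
        p1 = binom_pmf(x1, n1, p)
        for x2 in range(n2 + 1):
            d = abs(x1 - x2)
            dist[d] = dist.get(d, 0.0) + p1 * binom_pmf(x2, n2, p)
    return dist


# Hypothetical trial: 50 participants per arm, 30% with the characteristic.
dist = expected_abs_diff_distribution(50, 50, 0.3)
p_diff_1_or_2 = dist[1] + dist[2]
p_diff_over_4 = sum(prob for d, prob in dist.items() if d > 4)
print(f"P(difference of 1 or 2) = {p_diff_1_or_2:.3f}")
print(f"P(difference > 4)       = {p_diff_over_4:.3f}")
```

Summing such per-variable probabilities over every baseline variable in a body of trials would give expected counts in each difference category, which could then be compared with the observed counts.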

RESULTS: The observed and expected distributions of baseline categorical variables were similar in the control datasets, both for frequency counts (and percentages) and for between-group differences in frequency counts. However, in both sets of RCTs with publication integrity concerns, about twice as many variables as expected had between-group differences in frequency counts of 1 or 2, and far fewer variables than expected had between-group differences of >4 (P<0.001 for both datasets). Furthermore, in trials with publication integrity concerns, about 1 in 6 reported p-values for baseline categorical variables differed by >0.1 from the calculated p-value.
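
The p-value check reported above can be sketched as follows: recalculate a p-value from the reported baseline counts and flag it when it differs from the reported value by more than 0.1. The abstract does not say which test the authors used, so the two-sided Fisher's exact test here is an assumption, and the function names and example counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are trial arms; columns are participants with and without the
    characteristic. Sums the hypergeometric probabilities of all tables
    that are no more likely than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, total)


def p_value_discrepant(events1, n1, events2, n2, reported_p, threshold=0.1):
    """Return (flag, calculated_p); flag is True if the values differ by > threshold."""
    calculated = fisher_exact_two_sided(events1, n1 - events1, events2, n2 - events2)
    return abs(calculated - reported_p) > threshold, calculated


# Hypothetical baseline row: 12/40 vs 14/40 with a reported p-value of 0.20.
flag, calculated = p_value_discrepant(12, 40, 14, 40, reported_p=0.20)
print(f"calculated p = {calculated:.3f}, discrepant = {flag}")
```

A chi-square test could be substituted where trial reports state that it was used; the comparison against the reported value stays the same.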

CONCLUSION: Comparing the observed and expected distributions and reported and calculated p-values of baseline categorical variables may help in the assessment of publication integrity of a body of RCTs.

Original language: English
Pages (from-to): 117-124
Number of pages: 7
Journal: Journal of Clinical Epidemiology
Volume: 154
Early online date: 27 Dec 2022
Publication status: Published - Feb 2023

Keywords

  • statistical methods
  • research integrity
  • categorical variables
  • p-values
  • data integrity
  • randomization
  • fabricated data
