OBJECTIVE: Comparing observed and expected distributions of baseline continuous variables in randomized controlled trials (RCTs) can be used to assess publication integrity. We explored whether baseline categorical variables could also be used.
STUDY DESIGN AND SETTING: The observed and expected (binomial) distributions of all baseline categorical variables were compared in four sets of RCTs: two control sets, and two sets with publication integrity concerns. We also compared reported and calculated p-values for baseline categorical variables.
RESULTS: The observed and expected distributions of baseline categorical variables were similar in the control datasets, both for frequency counts (and percentages) and for between-group differences in frequency counts. However, in both sets of RCTs with publication integrity concerns, about twice as many variables as expected had between-group differences in frequency counts of 1 or 2, and far fewer variables than expected had between-group differences of >4 (P<0.001 for both datasets). Furthermore, in trials with publication integrity concerns, about 1 in 6 reported p-values for baseline categorical variables differed by >0.1 from the calculated p-value.
CONCLUSION: Comparing observed with expected distributions, and reported with calculated p-values, of baseline categorical variables may help in assessing the publication integrity of a body of RCTs.
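The comparison of observed and expected between-group differences can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes two-arm trials, treats each arm's count as an independent binomial draw with a common probability estimated by pooling both arms, and groups differences into bins (0, 1-2, 3-4, >4) chosen here only for illustration. The function names and the example data are hypothetical.

```python
from math import comb


def binom_pmf(k, n, p):
    """Binomial probability of k events in n participants."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def abs_diff_pmf(n1, n2, p):
    """P(|X1 - X2| = d) for independent X1 ~ Bin(n1, p), X2 ~ Bin(n2, p)."""
    pmf1 = [binom_pmf(k, n1, p) for k in range(n1 + 1)]
    pmf2 = [binom_pmf(k, n2, p) for k in range(n2 + 1)]
    out = [0.0] * (max(n1, n2) + 1)
    for a, pa in enumerate(pmf1):
        for b, pb in enumerate(pmf2):
            out[abs(a - b)] += pa * pb
    return out


def bin_label(d):
    """Illustrative bins for the absolute between-group difference."""
    if d == 0:
        return "0"
    if d <= 2:
        return "1-2"
    if d <= 4:
        return "3-4"
    return ">4"


def observed_vs_expected(variables):
    """Tally observed and expected numbers of variables per difference bin.

    Each variable is (x1, n1, x2, n2): event count and arm size per group.
    The common probability is assumed to be the pooled proportion.
    """
    labels = ["0", "1-2", "3-4", ">4"]
    observed = dict.fromkeys(labels, 0)
    expected = dict.fromkeys(labels, 0.0)
    for x1, n1, x2, n2 in variables:
        observed[bin_label(abs(x1 - x2))] += 1
        p = (x1 + x2) / (n1 + n2)  # pooled estimate (assumption)
        for d, prob in enumerate(abs_diff_pmf(n1, n2, p)):
            expected[bin_label(d)] += prob
    return observed, expected


# Hypothetical baseline variables, not data from the study.
variables = [(12, 50, 13, 50), (20, 50, 21, 50), (30, 60, 24, 60)]
observed, expected = observed_vs_expected(variables)
for label in observed:
    print(f"{label:>4}: observed={observed[label]}, expected={expected[label]:.2f}")
```

Across a large body of trials, the observed and expected tallies could then be compared with a goodness-of-fit test; an excess of small differences relative to the binomial expectation would match the pattern described in the results.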
|Number of pages|7|
|Journal|Journal of Clinical Epidemiology|
|Early online date|27 Dec 2022|
|Publication status|Published - Feb 2023|
- statistical methods
- research integrity
- categorical variables
- data integrity
- fabricated data