General Principles for Reporting of Statistical Methods in Journal Manuscripts

Introduction

Clear and accurate reporting of statistical methods is vital to the integrity, credibility, and reproducibility of scientific research (American Statistical Association [ASA], 2022; EQUATOR Network, 2024). Reporting on statistical analysis is more than an ethical or technical requirement; it is integral to the trustworthiness and interpretability of research findings. Empirical assessments provide clear evidence of deficiencies in how statistical analyses are documented, deficiencies that contribute to the broader reproducibility crisis in science (Gosselin, 2021; Xiong & Cribben, 2022).

In response, academic communities and journal editors have promoted standardised reporting guidelines such as CONSORT, STROBE, PRISMA, and ARRIVE (EQUATOR Network, 2024; ASA, 2022). Recent contributions by Keppler et al. (2022) and Giofrè et al. (2022) highlight the need for more structured and comprehensive reporting of statistical methods; they argue that transparency improves markedly when statistical assumptions, software, objectives, and open-data practices are explicitly reported.

This article offers best practices for reporting the statistical procedures conducted in journal manuscripts, drawing on current scholarly guidance to promote transparency of study methods and the integrity of science.

Preliminary Analyses

Formal reporting of evidence from complex datasets begins with a complete account of all pre-analysis steps, including data manipulation.

1. Data Transformation

Transformations such as the logarithm, square root, and Box–Cox are typically used to normalise skewed distributions. Researchers should explain the rationale, the mathematical method applied, and any caveats for interpreting the transformed data (Keppler et al., 2022).
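
As an illustration, a minimal R sketch (using the MASS package and a simulated right-skewed variable, both assumptions for demonstration) of how such transformations might be applied and documented:

    # Simulated right-skewed variable (for demonstration only)
    set.seed(42)
    x <- rexp(200, rate = 0.2)

    # Common transformations for positive skew
    x_log  <- log(x)    # logarithm (requires x > 0)
    x_sqrt <- sqrt(x)   # square root (requires x >= 0)

    # Box-Cox: estimate the power parameter lambda by maximum likelihood
    library(MASS)
    bc <- boxcox(x ~ 1, plotit = FALSE)
    lambda <- bc$x[which.max(bc$y)]
    x_bc <- if (abs(lambda) < 1e-8) log(x) else (x^lambda - 1) / lambda

    # The chosen transformation and lambda should be reported in the methods
    lambda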

2. Deriving Variables

When variables are derived (for example, body mass index or standardised scores), researchers should describe the derivation, including the formula used, any scaling applied, and the source variables.
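
A minimal R sketch, using hypothetical weight_kg and height_cm columns, of how a derived variable and its formula might be documented:

    # Hypothetical measurements; report formula and units exactly as used
    patients <- data.frame(weight_kg = c(70, 85, 60),
                           height_cm = c(175, 180, 162))

    # Body mass index: weight (kg) divided by height (m) squared
    patients$bmi <- patients$weight_kg / (patients$height_cm / 100)^2

    # Standardised (z) scores; report the reference mean and SD used
    patients$bmi_z <- as.numeric(scale(patients$bmi))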

3. Categorisation

When continuous variables have been categorised (for example, age into deciles), the cut points and the rationale for them, ideally clinically motivated or grounded in prior literature, should be openly reported (EQUATOR Network, 2024).
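
For instance, a brief base-R sketch using illustrative age cut points (the breaks shown are assumptions for demonstration, not recommendations):

    # Categorise age with pre-specified cut points; report the breaks and
    # whether intervals are closed on the left or the right
    age <- c(34, 47, 52, 61, 68, 75)
    age_group <- cut(age,
                     breaks = c(0, 45, 65, Inf),
                     labels = c("<45", "45-64", "65+"),
                     right  = FALSE)   # intervals are [lower, upper)
    table(age_group)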

4. Merging Categories

When categories of ordinal or nominal variables are merged, whether to achieve a minimum sample size or to enable more meaningful comparisons, the rationale should be stated.
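
A minimal base-R sketch, with an invented smoking variable, of how merged levels might be documented:

    # Merge sparse factor levels; report both original and merged levels
    smoking <- factor(c("never", "former", "current", "occasional", "never"))
    levels(smoking)[levels(smoking) %in% c("current", "occasional")] <-
      "current/occasional"
    table(smoking)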

5. Outlier Identification

The methods used to identify outliers, including z-scores and other approaches (for instance, Mahalanobis distance or Cook’s distance), along with any sensitivity analyses assessing their impact on the results, should be explicitly stated.
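
A base-R sketch using the built-in mtcars data; the thresholds shown (|z| > 3, 4/n for Cook's distance, the 0.999 chi-squared quantile) are common conventions rather than universal rules:

    # z-scores: flag observations far from the mean
    z <- as.numeric(scale(mtcars$mpg))
    which(abs(z) > 3)

    # Cook's distance from a fitted regression model
    fit <- lm(mpg ~ wt + hp, data = mtcars)
    which(cooks.distance(fit) > 4 / nrow(mtcars))

    # Mahalanobis distance for multivariate outliers
    X  <- mtcars[, c("mpg", "wt", "hp")]
    md <- mahalanobis(X, colMeans(X), cov(X))
    which(md > qchisq(0.999, df = ncol(X)))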

Where feasible, the practices above facilitate reproducibility, since they allow the reader to understand and repeat the original manipulation of the dataset (ASA, 2022; Keppler et al., 2022).

Main Analyses

1. Analytical Objectives

Every analysis should correspond to a specified research hypothesis; for example, one might write: “A t-test (a parametric test) was used to determine whether intervention A significantly reduced systolic blood pressure relative to the control.” Framing methods this way provides context (Giofrè et al., 2022).

2. Variable Definitions and Descriptive Statistics

Researchers must define the independent and dependent variables and report central tendency and variability using appropriate measures (mean ± SD, or median and interquartile range); for categorical variables, a frequency distribution should be included.
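
For example, a base-R sketch of descriptive statistics using the built-in mtcars data:

    # Continuous variable: mean and SD, or median and IQR if skewed
    mean(mtcars$mpg); sd(mtcars$mpg)
    median(mtcars$mpg); IQR(mtcars$mpg)

    # Categorical variable: frequencies and percentages
    counts <- table(mtcars$cyl)
    cbind(n = counts, percent = round(100 * prop.table(counts), 1))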

3. Clinically Meaningful Thresholds

In studies of health or clinical outcomes, researchers should, where relevant, report the minimal clinically important difference (MCID) (Heston, 2023).

4. Assumption Testing

Model assumptions should be explicitly stated and tested. For example, normality must be assessed before applying parametric tests; if the assumption is violated, an appropriate non-parametric alternative should be used. Regression analyses should be accompanied by linearity and multicollinearity diagnostics (Keppler et al., 2022).
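
A brief R sketch of such checks, again using mtcars (and assuming the car package mentioned later in this article is installed):

    # Normality check before a parametric test (suitable for small samples)
    shapiro.test(mtcars$mpg)
    # If violated, report a non-parametric alternative, e.g. wilcox.test()

    # Regression diagnostics: linearity and multicollinearity
    fit <- lm(mpg ~ wt + hp + disp, data = mtcars)
    plot(fit, which = 1)   # residuals vs fitted values (linearity)
    car::vif(fit)          # variance inflation factors (multicollinearity)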

5. Adjustment for Multiple Comparisons

When several hypotheses are tested, methods that control the family-wise error rate or the false discovery rate (e.g., Bonferroni or Benjamini–Hochberg procedures) should be applied and justified, to reduce the likelihood of Type I errors (Xiong & Cribben, 2022).
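
For example, a base-R sketch using p.adjust() on an invented set of p-values:

    # Adjust a set of p-values; report both raw and adjusted values
    p_raw <- c(0.003, 0.012, 0.040, 0.210, 0.049)
    p.adjust(p_raw, method = "bonferroni")   # family-wise error control
    p.adjust(p_raw, method = "BH")           # false discovery rate control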

6. Directionality and Significance Levels

Authors should state whether tests were one-tailed or two-tailed and, if one-tailed tests were used, provide a rationale. The significance level (usually α = 0.05) should be explicitly stated.
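
A minimal R sketch, on simulated blood-pressure data, contrasting the two choices:

    # Simulated systolic blood pressure for two groups (illustrative only)
    set.seed(1)
    intervention <- rnorm(30, mean = 120, sd = 10)
    control      <- rnorm(30, mean = 126, sd = 10)

    # Two-tailed test (the default) at alpha = 0.05
    t.test(intervention, control, alternative = "two.sided")

    # One-tailed test; its use requires an explicit a priori rationale
    t.test(intervention, control, alternative = "less")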

7. Statistical Software and Packages

All software, versions, and packages should be stated, for instance: “Analyses were conducted in R version 4.3.1 using the lme4 and car packages.” Transparent reporting of software and versions is essential for computational reproducibility (Giofrè et al., 2022).
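
Beyond a sentence in the methods, the environment can be captured directly in R; a minimal sketch (assuming the lme4 package from the example is installed):

    # Capture the exact computational environment for the methods section
    sessionInfo()             # R version, OS, and attached packages
    packageVersion("lme4")    # version of a specific package
    citation("lme4")          # how to cite the package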

Additional Analyses

Additional analyses can serve both as robustness checks and as exploratory analyses:

1. Sensitivity analysis

Sensitivity analyses (explorations of assumptions or model specifications, e.g., alternative ways of handling outliers or influential cases) should be reported transparently and completely (ASA, 2022).
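
One possible sketch in base R, refitting a regression with and without influential observations (the 4/n Cook's distance cutoff is a convention, not a rule):

    # Refit a model excluding influential observations and compare estimates
    fit_all <- lm(mpg ~ wt + hp, data = mtcars)
    keep    <- cooks.distance(fit_all) <= 4 / nrow(mtcars)
    fit_sub <- lm(mpg ~ wt + hp, data = mtcars[keep, ])

    # Report both sets of coefficients so readers can judge robustness
    round(cbind(all = coef(fit_all), excluded = coef(fit_sub)), 3)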

2. Missing Data Handling

The handling of missing data must be described (e.g., complete-case analysis, multiple imputation, or maximum likelihood estimation), together with any assumptions or reasoning about the missingness mechanism (EQUATOR Network, 2024).
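
As one illustration, multiple imputation with the mice package (assumed installed; the example also assumes data are missing at random, an assumption that should itself be justified):

    # Multiple imputation with the mice package
    library(mice)
    imp <- mice(nhanes, m = 5, seed = 1, printFlag = FALSE)  # nhanes ships with mice

    # Fit the analysis model in each imputed dataset and pool (Rubin's rules)
    fit <- with(imp, lm(bmi ~ age + chl))
    summary(pool(fit))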

3. Validation of Assumptions

Post hoc checks of model assumptions (e.g., checks for homoscedasticity or for potentially influential points in a regression) should be reported transparently.
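
A base-R sketch of common post hoc diagnostics (the Breusch–Pagan test assumes the lmtest package is installed):

    # Graphical residual diagnostics for a fitted regression model
    fit <- lm(mpg ~ wt + hp, data = mtcars)
    plot(fit, which = 2)   # Q-Q plot: normality of residuals
    plot(fit, which = 5)   # residuals vs leverage: influential points

    # A formal test of homoscedasticity (Breusch-Pagan)
    lmtest::bptest(fit)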

4. Post Hoc or Exploratory Analyses

Analyses not explicitly specified in a preregistered study plan should be identified as exploratory, and their findings interpreted with caution given the risk of data dredging and over-generalised conclusions (Keppler et al., 2022).

Ethical Standards, Guidelines, and Reproducibility

For statistical reporting, the ASA (2022) establishes a code of ethics grounded in honesty, accountability, and methodological rigour. Authors are advised against selective reporting and are expected to declare any potential conflicts of interest. Moreover, most leading journals now encourage or require standardised reporting guidelines such as CONSORT, PRISMA, STROBE, and ARRIVE (EQUATOR Network, 2024; Giofrè et al., 2022). These guidelines support transparent and replicable reporting across study types, from randomised trials to observational designs.

Reproducibility means that independent researchers can regenerate the reported results using the same dataset and code (Xiong & Cribben, 2022). Open science practices, such as sharing code via GitHub or providing statistical notebooks (e.g., R Markdown, Jupyter), are increasingly adopted to support this (ASA, 2022).

Journals that have required authors to report confidence intervals and share data have seen significant improvements in the quality of statistical reporting. Similarly, checks that compare reported p-values against values recomputed from the test statistics help reviewers and readers verify analytical accuracy (Heston, 2023).

Pre-registration and publication of analysis plans further enhance reproducibility by limiting undisclosed analytic flexibility within a data analysis (Pownall et al., 2023).

Discussion and Implications

Standardised reporting of statistical methods is critical to the continued development of any scientific discipline. In keeping with recent best practices and the empirical literature, we suggest that authors pre-register their analytic plans, label post hoc analyses as such, and separate confirmatory from exploratory analyses. Furthermore, we advocate that authors:
1. Report all steps of data processing (including transformation and outlier treatment) in thorough detail.

2. Justify every statistical method used, including adjustments for multiple testing and verification of model assumptions.

3. State the software used, including package version numbers.

4. Follow established reporting guidelines (e.g., CONSORT, STROBE, PRISMA).

5. Make their programming code available on public platforms (such as GitHub) or in supplementary material.

6. Adopt recognised ethical and scientific reporting standards that promote transparency and reproducibility.

By adhering to these principles, researchers will not only meet the requirements of journals but also strengthen the integrity and utility of the scientific literature more generally.


References

1. American Statistical Association. (2022). Ethical Guidelines for Statistical Practice. ASA.

2. EQUATOR Network. (2024). Reporting guidelines for health research.

3. Giofrè, D., Cumming, G., Fresc, L., Boedker, I., & Tressoldi, P. (2022). The impact of journal guidelines on statistical reporting and open science practices in psychology. Psychological Science, 33(2), 145–161.

4. Gosselin, R.-D. (2021). Poor transparency in preclinical statistical reporting: A systematic review. Scientific Reports, 11(1), 15129.

5. Heston, T. F. (2023). Statistics, ethics, and reproducible research. ResearchGate.

6. Keppler, A. M., Haslbeck, J. M. B., & Fried, E. I. (2022). A ten-point checklist for statistical reporting in psychology, medicine, and health. Advances in Methods and Practices in Psychological Science, 5(1), 1–14.

7. Sandoval-Lentisco, A., Tortajada, M., López-Nicolás, R., et al. (2025). Preregistration of psychology meta-analyses: A cross-sectional study of prevalence and practice. Advances in Methods and Practices in Psychological Science, 8(1). https://doi.org/10.1177/25152459241300113

8. Xiong, X., & Cribben, I. (2022). A reproducibility crisis in statistics? Empirical findings and recommendations. Journal of Statistical Theory and Practice, 16(4), 1–18.