Reporting Association Analysis

Best Practices for Reporting Association Analyses in Biomedical Research

Introduction

Association analyses are core methodologies in biomedical and public health research. They are commonly used to evaluate relationships between variables such as exposure and outcome, biomarker and disease, or policy context and behaviour. However, the potential of an association analysis to produce meaningful findings is closely tied to the transparency and rigor with which it is reported. Reporting can undermine reproducibility, credibility, and interpretability when it includes imprecise descriptions, omits necessary statistical details, or uses commonly misused terms such as “significant” or “strong” without qualification.
This article discusses best practices for reporting association analyses, drawing on methodological guidance and peer-reviewed examples from the recent literature. Following structured principles increases both confidence in, and the scientific transparency of, reports of association analyses.

Clearly Describe the Association of Interest

  • Every association study starts with a clearly defined research question. This requires specifying the exposure (independent variable), the outcome (dependent variable), and the hypothesized direction or type of association (Bahadoran et al., 2025). For example, Patterson et al. (2025) framed their research question in specific environmental and health terms when examining the association between takeaway food accessibility and adiposity, which improved interpretability and policy relevance.
  • Formulating a meaningful and testable question is not just a rite of passage; it is the conceptual foundation that guides the study design, the choice of inference method, and the interpretation of the findings (Bahadoran et al., 2025).
  • A comprehensive description of the variables is essential. Authors should state the type of each variable (categorical or continuous) and its measurement units, and provide at least one summary measure, such as the mean and standard deviation (SD) or, where warranted, the median and interquartile range (IQR). For categorical variables, frequencies or proportions should be provided.
  • Dwivedi et al. (2025) provided a good example of this in their examination of phthalate metabolites and sex hormones, reporting population distributions and estimates where appropriate. Goodman et al. (2025) likewise described the variables in their genome-wide association study of more than 400,000 participants with summary statistics for each variable, which is necessary to support sound statistical inference.
  • Authors should identify the test used to assess the association (e.g., chi-square test, regression model, Pearson’s correlation), state whether the test was one- or two-tailed, and explain the reasoning behind that choice.
  • In genetics, for example, Gouveia et al. (2025) used logistic regression models that controlled for ancestry and sex, reflecting the complexity of the data. Likewise, Abdul Rahman et al. (2025) provided guidance on choosing and conducting a chi-square test, including selecting the model type, determining the sample size, and avoiding Type I and Type II errors.
  • Ideally, authors should report the exact P value (e.g., P = 0.03) to convey the strength of the statistical evidence, rather than vague labels such as “NS” (non-significant). Reporting exact P values is standard practice in high-impact journals and removes ambiguity about statistical significance.
  • In evaluating cannabis policy and use, for example, Pessar et al. (2025) provided exact P values when comparing trends across states, so readers can judge for themselves which differences are significant. This level of explicitness builds confidence in a study’s findings and supports their later use in meta-analyses.
  • In addition to significance tests, researchers should estimate the strength of the association using one or more measures (e.g., odds ratios [ORs], relative risks [RRs], correlation coefficients, or beta coefficients) and report its precision (e.g., 95% confidence intervals [CIs]).
  • Larsen et al. (2024), who examined the relationship between lipopolysaccharide-type endotoxins and retinal neurodegeneration, illustrate this point: by reporting beta coefficients with 95% CIs, they gave a far more informative interpretation of their findings than would have been possible by reporting a P value alone (i.e., without consideration of the CI).
  • Dwivedi et al. (2025) recommended practising IFORS by including ORs, RRs, or other measures of effect size in reporting. Dwivedi et al. cautioned against excessive reliance on P values and encouraged researchers to use biological and clinical reasoning when judging the importance of effect sizes, while placing equal emphasis on reporting the actual effect sizes.
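
The reporting elements described above (a named test, an exact P value, and an effect size with its confidence interval) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 2×2 counts are hypothetical, the test is a Pearson chi-square without continuity correction, the interval is a Wald 95% CI, and the function names are invented for this example. None of it is taken from the studies cited.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for a 2x2 table.

    Layout: a = exposed cases, b = exposed non-cases,
            c = unexposed cases, d = unexposed non-cases.
    Returns (statistic, two-sided P value) with 1 degree of freedom.
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


def odds_ratio_wald_ci(a, b, c, d, z=1.959964):
    """Odds ratio with a Wald confidence interval (z = 1.96 gives 95%)."""
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ci_low = math.exp(math.log(odds_ratio) - z * se_log_or)
    ci_high = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, ci_low, ci_high


def format_p(p_value):
    """Report the exact P value instead of a label such as 'NS'."""
    return "P < 0.001" if p_value < 0.001 else f"P = {p_value:.3f}"


# Hypothetical counts, for illustration only.
a, b, c, d = 30, 70, 15, 85
stat, p_value = chi_square_2x2(a, b, c, d)
odds_ratio, ci_low, ci_high = odds_ratio_wald_ci(a, b, c, d)

print(
    f"OR = {odds_ratio:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f}); "
    f"chi-square test (1 df, two-sided) = {stat:.2f}, {format_p(p_value)}"
)
```

The printed sentence names the test, gives the exact P value, and pairs the effect size with its interval, which is the combination the guidance above asks for. For small cell counts, an exact test or a continuity correction may be more appropriate than this sketch.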

Present Supporting Data

  • For transparency, include contingency tables or other data displays when reporting associations, so that reviewers and readers can examine the raw associations themselves. This is especially needed for primary comparisons and for comparisons from subgroup analyses.
  • For example, Patterson et al. (2025) presented data stratified by time point and by levels of urban walkability and takeaway food density, to show how the environmental factors related to BMI when viewed by time point or demographic subgroup. Similarly, in the Alienor Study, Larsen et al. (2024) presented both contingency tables and adjusted regression models to show the pattern of associations between endotoxin exposure and structural ocular degeneration across age groups.
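
As a rough illustration of presenting supporting data, the sketch below builds one contingency table per stratum from individual-level records and prints counts with row percentages. The records, stratum labels, and function names are hypothetical and chosen only for this example; they do not reproduce data from any cited study.

```python
from collections import Counter, defaultdict

# Hypothetical individual-level records: (stratum, exposure, outcome).
records = (
    [("urban", "exposed", "case")] * 12
    + [("urban", "exposed", "non-case")] * 28
    + [("urban", "unexposed", "case")] * 6
    + [("urban", "unexposed", "non-case")] * 34
    + [("rural", "exposed", "case")] * 9
    + [("rural", "exposed", "non-case")] * 21
    + [("rural", "unexposed", "case")] * 7
    + [("rural", "unexposed", "non-case")] * 23
)


def stratified_crosstabs(rows):
    """Return {stratum: Counter mapping (exposure, outcome) to a count}."""
    tables = defaultdict(Counter)
    for stratum, exposure, outcome in rows:
        tables[stratum][(exposure, outcome)] += 1
    return dict(tables)


def format_table(counts, exposures=("exposed", "unexposed"),
                 outcomes=("case", "non-case")):
    """Render one table as text: counts with row percentages and row totals."""
    header = f"{'':<10}" + "".join(f"{o:>16}" for o in outcomes) + f"{'total':>8}"
    lines = [header]
    for exposure in exposures:
        row_total = sum(counts[(exposure, o)] for o in outcomes)
        cells = ""
        for outcome in outcomes:
            n = counts[(exposure, outcome)]
            pct = 100 * n / row_total if row_total else 0.0
            cells += f"{n:>8} ({pct:4.1f}%)"
        lines.append(f"{exposure:<10}{cells}{row_total:>8}")
    return "\n".join(lines)


tables = stratified_crosstabs(records)
for stratum, counts in tables.items():
    print(f"Stratum: {stratum}")
    print(format_table(counts))
    print()
```

Showing the per-stratum counts next to any adjusted estimate lets readers check the raw pattern behind a subgroup result for themselves.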

Identify the Statistical Software Used

  • Specifying the statistical software used (e.g., R, SPSS, SAS, Stata) increases transparency and supports replication. Software packages differ in their default handling of missing data, multicollinearity, and variance assumptions, and these differences can change results in small but important ways.
  • In their large-scale meta-analysis of GWASs, Goodman et al. (2025) stated that they analysed the data using PLINK and R, which supports computational replicability. Similarly, Abdul Rahman et al. (2025) reported using G*Power for sample size calculations, which made those methods transparent and clear.
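
One way to make a software statement easy to produce is to generate it from the analysis environment itself. The sketch below assumes a Python-based workflow, which is a choice made for this example; the studies cited above used other tools such as PLINK, R, and G*Power. The package names passed in are placeholders, and recording version numbers is a suggestion added here rather than a point made in the source.

```python
import platform
from importlib import metadata


def software_statement(packages):
    """Build a methods-section sentence naming the interpreter and package versions."""
    parts = [f"Python {platform.python_version()}"]
    for name in packages:
        try:
            parts.append(f"{name} {metadata.version(name)}")
        except metadata.PackageNotFoundError:
            # Flag missing packages rather than guessing a version.
            parts.append(f"{name} (not installed)")
    return "Analyses were run with " + ", ".join(parts) + "."


# Placeholder package names; replace with those actually used in the analysis.
print(software_statement(["numpy", "scipy", "statsmodels"]))
```

Generating the sentence at run time keeps the reported versions in step with what was actually executed.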

Conclusion

The accurate and complete reporting of association analyses improves the transparency, reproducibility, and impact of scientific research. Considerable work goes into a study, and reporting that follows an explicit line of reasoning, from defining variables and hypotheses to testing them and reporting effect sizes with confidence intervals, makes that work logically and scientifically sound.
As the field moves toward open science and data sharing, these best practices, supported by recent methodological and applied research, will help ensure that research findings are both trusted by and valuable to the broader scientific community.

    References

    1. Abdul Rahman, H., Noraidi, N., Hj Khalid, K., et al. (2025). Practical guide to calculate sample size for chi-square test in biomedical research. BMC Medical Research Methodology, 25, 144. https://doi.org/10.1186/s12874-025-02584-4

    2. Bahadoran, Z., Mirmiran, P., Kashfi, K., & Ghasemi, A. (2025). Biomedical research: Formulating a well-built and worth-answering research question. Addiction & Health, 17, 1564. https://doi.org/10.34172/ahj.1564

    3. Dwivedi, A. K., Ahmed, S., & Dubey, P. (2025). EXPRESS: Evidence-based biostatistics and value-based biostatistics practices in biomedical research: Application to evaluating the association between phthalate metabolites and sex hormones in US male adults. Journal of Investigative Medicine. Advance online publication. https://doi.org/10.1177/10815589251350922

    4. Goodman, M. O., Faquih, T., Paz, V., et al. (2025). Genome-wide association analysis of composite sleep health scores in 413,904 individuals. Communications Biology, 8, 115. https://doi.org/10.1038/s42003-025-07514-0

    5. Gouveia, M. H., et al. (2025). [Title not provided]. The American Journal of Human Genetics, 112(6), 1286–1301.

    6. Larsen, P. P., Féart, C., Pais de Barros, J. P., Gayraud, L., Delyfer, M.-N., Korobelnik, J. F., Schweitzer, C., & Delcourt, C. (2024). Association of lipopolysaccharide-type endotoxins with retinal neurodegeneration: The Alienor Study. Ophthalmology Science, 5(1), 100610. https://doi.org/10.1016/j.xops.2024.100610

    7. Patterson, R., Ogilvie, D., Hoenink, J. C., Burgoine, T., Sharp, S. J., Hajna, S., & Panter, J. (2025). Combined associations of takeaway food availability and walkability with adiposity: Cross-sectional and longitudinal analyses. Health & Place, 91, 103405. https://doi.org/10.1016/j.healthplace.2024.103405

    8. Pessar, S. C., Smart, R., Naimi, T., Lira, M., Blanchette, J., Boustead, A., & Pacula, R. L. (2025). The association between state cannabis policies and cannabis use among adults and youth, United States, 2002–2019. Addiction, 120(1), 164–170. https://doi.org/10.1111/add.16663