CohortSymmetry: An R package to perform sequence symmetry analysis using the OMOP common data model

Abstract

Background

Real-world data are valuable for detecting adverse drug events, and Sequence Symmetry Analysis (SSA) is a simple yet effective method frequently used for this purpose. However, heterogeneous implementations across studies limit reproducibility and scalability. To address this, we developed an open-source R package that standardises SSA analytics using data mapped to the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM).

Methods

We developed CohortSymmetry, an R package that implements SSA for OMOP CDM data. The package was validated through unit testing and evaluated empirically by estimating adjusted sequence ratios (ASRs) with 95% confidence intervals (CIs) for 23 positive and 10 negative controls across six European databases: CPRD GOLD (UK) and THIN® (Belgium, Italy, Romania, Spain, UK). Sensitivity and specificity were defined as the proportions of positive and negative controls, respectively, that SSA correctly identified. Sensitivity analyses varied key parameters, including the washout period.
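To make the sensitivity and specificity definitions concrete, the following is a minimal sketch in Python rather than the package's own R code. It assumes that a control "signals" when the lower bound of the 95% CI for its ASR exceeds 1, and that a negative control is correctly identified when it does not signal; the abstract does not state the decision rule for negative controls, so that part is an assumption. The names `ControlResult`, `signals`, and `sensitivity_specificity` are hypothetical, and the numbers are invented toy values, not study results.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ControlResult:
    """One index-marker pair evaluated in one database (hypothetical structure)."""
    name: str          # illustrative label for the drug pair
    is_positive: bool  # True for a positive control, False for a negative control
    ci_lower: float    # lower bound of the 95% CI for the ASR


def signals(result: ControlResult, threshold: float = 1.0) -> bool:
    # Assumed rule: a signal is raised when the lower CI bound exceeds 1.
    return result.ci_lower > threshold


def sensitivity_specificity(
    results: List[ControlResult],
) -> Tuple[Optional[float], Optional[float]]:
    positives = [r for r in results if r.is_positive]
    negatives = [r for r in results if not r.is_positive]
    # Sensitivity: share of positive controls that signal.
    sens = sum(signals(r) for r in positives) / len(positives) if positives else None
    # Specificity: share of negative controls that do not signal.
    spec = sum(not signals(r) for r in negatives) / len(negatives) if negatives else None
    return sens, spec


# Toy values for illustration only; these are not results from the study.
toy_results = [
    ControlResult("positive control A", True, 1.40),
    ControlResult("positive control B", True, 0.85),
    ControlResult("negative control A", False, 0.70),
    ControlResult("negative control B", False, 0.95),
]

sens, spec = sensitivity_specificity(toy_results)
print(f"sensitivity={sens}, specificity={spec}")
```

Under this rule, each database yields its own pair of values, since the same control can signal in one database and not in another.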

Results

CohortSymmetry passed high-coverage unit tests. Of 33 eligible controls, four showed results consistent with expectations across all databases; for example, the lower bound of the 95% CI for the amiodarone–levothyroxine pair exceeded 1 in every database. Sensitivity was moderate, whereas specificity was high in the primary analyses. Parameter variation influenced outcomes: a 365-day prior observation requirement reduced specificity in CPRD GOLD from 75% to 38%.

Conclusions

CohortSymmetry enables reproducible SSA using OMOP CDM data. Differences across databases likely reflect heterogeneity in data capture and prescribing patterns. Limitations include residual data variability and SSA’s susceptibility to time-varying confounding, underscoring the need for tailored analytic design in pharmacovigilance studies.

Key Messages

  • We developed CohortSymmetry, an open-source R package that standardises SSA analytics using OMOP CDM-mapped data, and verified the correctness of its functions through unit testing and application to real-world datasets.

  • CohortSymmetry passed high-coverage tests, and among 33 selected controls, four showed results consistent with expectations across all databases; varying analytical parameters affected results.

  • The package provides a reproducible and scalable framework for multi-database SSA studies, supporting robust pharmacovigilance, but careful specification of parameters is required to account for the characteristics of the medical domain under investigation.
