PI-118 - DRUG-INDUCED ADE PREDICTION MODEL PERFORMANCE DEPENDS ON STATISTICAL MODELS AND STATISTICALLY SIGNIFICANT THRESHOLDS FOR DRUG-INDUCED ADE DATA
Wednesday, May 28, 2025
5:00 PM - 6:30 PM East Coast USA Time
J. Ouyang1, P. Zhang2, L. Wang3, L. Li1; 1The Ohio State University, Columbus, OH, United States, 2Indiana University, Indianapolis, IN, United States, 3The Ohio State University, Columbus, OH, United States.
Graduate Research Associate, The Ohio State University, Columbus, Ohio, United States
Background: Drug-induced ADE prediction models typically rely on positive labels inferred from statistical methods. However, the impact of the choice of statistical model and significance threshold on prediction performance has not been extensively studied.
Methods: The study used a rigorously cleaned and structured FAERS dataset in which drug names were mapped using ChEMBL. We implemented six statistical methods to assess drug-ADE relationships: the Proportional Reporting Ratio (PRR), Reporting Odds Ratio (ROR), Information Component (IC), Bayesian Confidence Propagation Neural Network (BCPNN), Empirical Bayesian Geometric Mean (EBGM), and a novel Frequency and Risk-Aware Model (FARM) that assumes relative risks (RRs) follow a mixture distribution. For each drug-ADE pair, a contingency table was generated, from which the metric for each method was calculated. Drug-ADE pairs were then labeled as positive if they ranked in the top 10%, 30%, or 50% under a given method. Predictive models were then developed using logistic regression with sixteen molecular features from PubChem. Prediction performance was evaluated by the area under the receiver operating characteristic curve (AUC).
Results: Overlap analysis showed that PRR and ROR are almost identical, with over 99% similarity in their top 10% ranked drug-ADE pairs. FARM showed an 84% overlap with both PRR and ROR. BFDR and BCPNN also showed an 84% overlap, while other method pairs, such as IC with BFDR or BCPNN, had lower overlaps ranging from 11% to 34%. This variation demonstrates that signal detection is methodology-dependent. We then investigated how drug-induced ADE prediction performance depends on the statistical method and the significance threshold. As shown in Figure 1, BFDR leads to the highest AUC. Data generated from the top 10% threshold generally yielded better performance than data from the other thresholds.
Conclusion: The study shows that PRR/ROR and FARM are very similar in how they rank the top drug-ADE pairs, as are BFDR and BCPNN. The other method pairs differ considerably. Regardless of the statistical method used, drug-ADE data generated from the top 10% threshold generally led to better ADE prediction model performance than data from the other thresholds.
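Illustrative sketch: The labeling and overlap steps described in the Methods can be sketched for two of the six methods as follows. This is a minimal illustration, not the authors' implementation. It assumes the standard textbook definitions of PRR and ROR on a 2x2 contingency table with nonzero cells, an overlap defined as the shared fraction of two equally sized top-ranked sets, and ties broken by input order. The names Contingency, top_fraction, overlap, and toy_tables are hypothetical, and the counts are made up rather than taken from FAERS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contingency:
    """2x2 report counts for one drug-ADE pair (hypothetical layout)."""
    a: int  # reports with the drug and the ADE
    b: int  # reports with the drug, without the ADE
    c: int  # reports without the drug, with the ADE
    d: int  # reports with neither the drug nor the ADE


def prr(t: Contingency) -> float:
    """Proportional Reporting Ratio: [a / (a + b)] / [c / (c + d)]."""
    return (t.a / (t.a + t.b)) / (t.c / (t.c + t.d))


def ror(t: Contingency) -> float:
    """Reporting Odds Ratio: (a * d) / (b * c)."""
    return (t.a * t.d) / (t.b * t.c)


def top_fraction(scores: dict, fraction: float) -> set:
    """Keys ranked in the top `fraction` by score (at least one key)."""
    k = max(1, round(len(scores) * fraction))
    # sorted() is stable, so tied scores keep their input order.
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:k])


def overlap(x: set, y: set) -> float:
    """Fraction of pairs in x that also appear in y (assumed definition)."""
    return len(x & y) / len(x)


# Made-up counts for illustration only; these are not FAERS data.
toy_tables = {
    ("drugA", "nausea"): Contingency(20, 80, 100, 9800),
    ("drugA", "rash"): Contingency(5, 95, 200, 9700),
    ("drugB", "nausea"): Contingency(10, 190, 110, 9690),
    ("drugB", "rash"): Contingency(40, 160, 165, 9635),
}

prr_scores = {pair: prr(t) for pair, t in toy_tables.items()}
ror_scores = {pair: ror(t) for pair, t in toy_tables.items()}

# Label the top 50% of pairs as positive under each method, then compare.
prr_top = top_fraction(prr_scores, 0.5)
ror_top = top_fraction(ror_scores, 0.5)
print(sorted(prr_top))
print(overlap(prr_top, ror_top))
```

On these toy counts PRR and ROR select the same positive set, which mirrors the near-identical rankings reported in the Results. Applying the same labeling at 10%, 30%, and 50% would produce the alternative label sets on which a classifier could then be trained and compared by AUC.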