New research published on arXiv reports that drug targets backed by human genetic evidence are significantly more likely to gain regulatory approval. The observational analysis, conducted by Victoria Paterson and colleagues, examined 26,278 target-disease pairs from the Open Targets and ChEMBL databases to quantify the association between genetic support and successful drug development.
Study Design and Scope
The 26,278 target-disease pairs, drawn from Open Targets and ChEMBL, span a wide range of therapeutic areas. Each pair was classified by whether genetic evidence (e.g., genome-wide association studies, rare variant associations) supported the target-disease link. Approval status was determined using ChEMBL and additional sources. Odds ratios were reported at both the pair level and the target level, the latter to account for non-independence of pairs that share the same gene.
Key Findings: Genetic Evidence Roughly Triples the Odds of Approval
The primary result showed that target-disease pairs with any genetic association had 3.25-fold higher odds of approval than those without (OR = 3.25, 95% CI 2.79–3.79, p = 1.91e-42). When non-independence was accounted for at the target level, the odds ratio attenuated to 2.79 (bootstrap 95% CI 2.22–3.53). Genetic support is therefore strongly associated with approval, but treating each target-disease pair as independent overstates the effect.
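For readers who want to see how a pair-level odds ratio and its interval are typically obtained, here is a minimal sketch using the standard 2x2-table formula with a Wald interval on the log-odds scale. The summary does not say which interval method the authors used, so the Wald interval is an assumption, and the counts below are hypothetical, not the study's. The last lines use only the reported bounds: if the interval was built on the log scale, its geometric midpoint should land on the point estimate.

```python
import math

def odds_ratio_wald(a, b, c, d, z=1.96):
    """Odds ratio and Wald 95% CI for a 2x2 table.

    a: supported and approved      b: supported, not approved
    c: unsupported and approved    d: unsupported, not approved
    """
    or_ = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of log(OR)
    lo = math.exp(math.log(or_) - z * se)
    hi = math.exp(math.log(or_) + z * se)
    return or_, lo, hi

# Hypothetical counts for illustration only (NOT from the study).
or_, lo, hi = odds_ratio_wald(a=30, b=70, c=10, d=90)
print(f"toy OR = {or_:.2f}, 95% CI {lo:.2f}-{hi:.2f}")

# Consistency check on the reported interval (2.79 to 3.79).
reported_lo, reported_hi = 2.79, 3.79
log_midpoint = math.sqrt(reported_lo * reported_hi)  # geometric mean
implied_se = (math.log(reported_hi) - math.log(reported_lo)) / (2 * 1.96)
print(f"midpoint on log scale = {log_midpoint:.2f}")  # rounds to 3.25
print(f"implied SE of log(OR) = {implied_se:.3f}")
```

The geometric midpoint of the reported bounds rounds to 3.25, matching the reported point estimate, which is what a log-scale interval would produce.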
The study also reported area-specific estimates. For oncology, the pair-level odds ratio was 6.72, but after target-level adjustment it dropped to 2.71, illustrating how non-independence can inflate estimates in specific therapeutic areas.
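The attenuation from pair level to target level can be illustrated with a small sketch. It assumes one plausible procedure: collapse the pairs to one row per target (supported or approved if any of its pairs is), compute the odds ratio on those rows, and bootstrap by resampling targets. The summary does not describe the authors' exact method, so this is an assumption, and the gene names and data are hypothetical.

```python
import random

def odds_ratio(rows):
    """Odds ratio from (supported, approved) booleans, with a 0.5
    continuity correction so an empty cell never divides by zero."""
    a = b = c = d = 0.5
    for supported, approved in rows:
        if supported and approved:
            a += 1
        elif supported:
            b += 1
        elif approved:
            c += 1
        else:
            d += 1
    return (a * d) / (b * c)

def collapse_to_targets(pairs):
    """One row per target: supported/approved if ANY of its pairs is."""
    by_target = {}
    for target, supported, approved in pairs:
        s, a = by_target.get(target, (False, False))
        by_target[target] = (s or supported, a or approved)
    return list(by_target.values())

def bootstrap_ci(rows, n_boot=2000, seed=0):
    """Percentile bootstrap CI, resampling whole targets with replacement."""
    rng = random.Random(seed)
    stats = sorted(
        odds_ratio(rng.choices(rows, k=len(rows))) for _ in range(n_boot)
    )
    return stats[int(0.025 * n_boot)], stats[int(0.975 * n_boot) - 1]

# Hypothetical toy data: (target, has_genetic_support, approved).
# GENE_A contributes three approved pairs, inflating the pair-level estimate.
pairs = [
    ("GENE_A", True, True), ("GENE_A", True, True), ("GENE_A", True, True),
    ("GENE_B", True, False),
    ("GENE_C", False, True),
    ("GENE_D", False, False), ("GENE_E", False, False), ("GENE_F", False, False),
]

pair_or = odds_ratio([(s, a) for _, s, a in pairs])
target_rows = collapse_to_targets(pairs)
target_or = odds_ratio(target_rows)
ci_lo, ci_hi = bootstrap_ci(target_rows)
print(f"pair-level OR = {pair_or:.2f}, target-level OR = {target_or:.2f}")
print(f"bootstrap 95% CI = {ci_lo:.2f}-{ci_hi:.2f}")
```

In this toy data a single well-studied gene with several approved indications is counted three times at the pair level but once at the target level, which is the kind of inflation the oncology estimate appears to reflect.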
Temporal Validation: Association Persists in Recent Approvals
To test whether the enrichment is driven by historical data, the researchers repeated the analysis on post-2015 approvals alone. The enrichment remained significant, with an odds ratio of 3.51 (p = 1.72e-8), suggesting that the association is not an artifact of older drug development practices.
Feature Ablation and Predictive Value
A machine learning classifier was trained to predict approval from six types of evidence, including genetic associations, literature mining, and other biological data. Feature ablation showed that literature mining accounted for most of the predictive performance: the model using literature alone matched the full model (AUPRC = 0.099 in both cases). This is consistent with temporal leakage, in which publications that appear after approval, and that cite the genetic evidence, feed into the literature-mining feature.
When literature was excluded, the remaining evidence types still performed above baseline (AUPRC = 0.084, 1.63× baseline), indicating a genuine signal beyond publication bias. However, the overall predictive value was limited: the best model had poor calibration, and genetic evidence alone provided only a 1.0-percentage-point absolute AUPRC gain. The researchers caution that the classifier has limited practical predictive value.
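The AUPRC figures are easier to interpret against their baseline, which for a precision-recall curve is the fraction of positive examples (here, approved pairs). The sketch below backs out the baseline implied by the reported numbers and expresses the full model as a multiple of it. It also includes a small average-precision function, a common estimator of AUPRC; whether the authors used this estimator is not stated, and the labels and scores passed to it are hypothetical.

```python
# Figures reported in the text above.
auprc_all = 0.099            # all six evidence types
auprc_no_literature = 0.084  # literature mining excluded
lift_no_literature = 1.63    # reported multiple of baseline

# Baseline implied by the two reported numbers (roughly 0.05).
implied_baseline = auprc_no_literature / lift_no_literature
lift_all = auprc_all / implied_baseline
print(f"implied baseline AUPRC = {implied_baseline:.4f}")
print(f"all-features lift = {lift_all:.2f}x baseline")

def average_precision(labels, scores):
    """Average precision: mean of precision@k over the ranks k
    at which a positive example appears (highest score first)."""
    ranked = [y for _, y in sorted(zip(scores, labels), key=lambda t: -t[0])]
    hits, total = 0, 0.0
    for k, y in enumerate(ranked, start=1):
        if y:
            hits += 1
            total += hits / k
    return total / hits if hits else 0.0

# Hypothetical labels and scores, for illustration only.
toy_ap = average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])
print(f"toy average precision = {toy_ap:.3f}")
```

With a baseline near 0.05, even the best reported AUPRC of 0.099 means that precision stays low across most of the recall range, which fits the authors' caution about practical predictive value.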
Sensitivity Analyses and Catalog of Phase 1/2 Pairs
Sensitivity analyses bracketed the pair-level odds ratio between 3.25 and 4.93, depending on the analytical approach. The study also provides a catalog of 1,433 genetically supported Phase 1/2 target-disease pairs as a hypothesis-generating resource for future drug development.
Summary of Odds Ratios
| Analysis | Odds Ratio | 95% CI | p-value |
|---|---|---|---|
| Pair-level (primary) | 3.25 | 2.79–3.79 | 1.91e-42 |
| Target-level (bootstrap) | 2.79 | 2.22–3.53 | not stated |
| Post-2015 approvals | 3.51 | not stated | 1.72e-8 |
| Oncology, pair-level | 6.72 | not stated | not stated |
| Oncology, target-level | 2.71 | not stated | not stated |
| Sensitivity range (pair-level) | 3.25–4.93 | not stated | not stated |
Limitations
All findings are observational; no causal inference can be drawn. The reliance on literature mining introduces temporal leakage, and the classifier's calibration remained poor. The authors emphasize that genetic evidence is only one of many factors influencing drug approval.
Implications for Manufacturing Executives
While this study does not directly address manufacturing capacity or supply chain decisions, it highlights a trend: drug targets with genetic support are more likely to reach regulatory approval. For pharmaceutical manufacturing organizations, this suggests that investments aligned with genetically supported targets may carry a lower risk of failing before approval, which could affect long-term capacity planning and resource allocation. However, because the evidence is observational and the classifier's predictive value was limited, such decisions should be made cautiously and should integrate other predictive factors.