Ferolito BR et al., Human Genetics and Genomics Advances, 7 (2026) 100556. doi:10.1016/j.xhgg.2025.100556 - A meta-analysis of GWAS from MVP, UK Biobank, and FinnGen, combined with Mendelian randomization using cis-eQTL/pQTL instruments, implicates 6,447 genes in 69,669 putatively causal gene-trait links. Key terms: Mendelian randomization, biobank meta-analysis, pQTL, drug target discovery, machine learning ranking.
Study Highlights:
The authors meta-analyzed GWAS from MVP, UK Biobank, and FinnGen across 2,003 harmonized phenotypes and used cis-eQTLs and cis-pQTLs from GTEx, eQTLGen, ARIC, Fenland, and deCODE as instruments for two-sample Mendelian randomization. They identified 69,669 significant gene-trait pairs (p ≤ 1.6×10⁻⁹), representing 6,447 genes with strong causal evidence, and ran colocalization and sensitivity analyses to assess concordance. To rank MR hits by likelihood of clinical success, they trained an XGBoost classifier on ChEMBL-derived approved targets and engineered biological features; it achieved a precision-recall AUC of 0.79. The resource recovers known drug targets, surfaces repurposing leads (e.g., ANXA2 nominated for lipid regulation), and supplies a prioritized list for downstream target evaluation.
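For listeners who want a feel for what a two-sample MR estimate looks like in practice, here is a minimal Python sketch. It is not the authors' pipeline: the episode notes do not say which estimator was used, so the fixed-effect inverse-variance-weighted (IVW) combination of Wald ratios below is an assumption, as is the simplification that instruments are independent (cis variants are often correlated). The function names and the toy effect sizes are hypothetical; only the significance threshold of 1.6×10⁻⁹ comes from the article.

```python
import math

# Significance threshold reported in the article.
P_THRESHOLD = 1.6e-9


def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Single-variant MR estimate: outcome effect divided by exposure effect.

    The standard error uses a first-order approximation that ignores
    uncertainty in the exposure effect (an assumption of this sketch).
    """
    estimate = beta_outcome / beta_exposure
    se = abs(se_outcome / beta_exposure)
    return estimate, se


def ivw_estimate(instruments):
    """Fixed-effect IVW combination of per-variant Wald ratios.

    `instruments` is a list of (beta_exposure, beta_outcome, se_outcome)
    tuples, assumed here to be mutually independent variants.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for beta_x, beta_y, se_y in instruments:
        estimate, se = wald_ratio(beta_x, beta_y, se_y)
        weight = 1.0 / se ** 2
        weighted_sum += weight * estimate
        weight_total += weight
    return weighted_sum / weight_total, math.sqrt(1.0 / weight_total)


def two_sided_p(estimate, se):
    """Two-sided p-value from a normal approximation to estimate / se."""
    z = abs(estimate / se)
    return math.erfc(z / math.sqrt(2.0))


if __name__ == "__main__":
    # Hypothetical summary statistics for two variants near one gene.
    toy_instruments = [(0.5, 0.10, 0.02), (0.4, 0.08, 0.02)]
    effect, se = ivw_estimate(toy_instruments)
    p_value = two_sided_p(effect, se)
    print(f"effect={effect:.3f} se={se:.4f} p={p_value:.2e}")
    print("passes threshold:", p_value <= P_THRESHOLD)
```

The first sample supplies the variant-to-exposure effects (gene expression or protein level), the second supplies the variant-to-outcome effects (the GWAS trait); a gene-trait pair would count as a hit in this sketch only if its p-value falls at or below the threshold.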
Conclusion:
Integrating GWAS data from more than 1.2 million individuals across large biobanks with eQTL/pQTL-based Mendelian randomization and orthogonal annotations yields 69,669 candidate causal gene-trait links and a machine-learning ranking that prioritizes targets for drug development.
Music:
Enjoy the music based on this article at the end of the episode.
Article title:
Leveraging large-scale biobanks for therapeutic target discovery
Journal:
Human Genetics and Genomics Advances, 7 (2026) 100556. doi:10.1016/j.xhgg.2025.100556
DOI:
10.1016/j.xhgg.2025.100556
Reference:
Ferolito BR, Dashti H, Giambartolomei C, Peloso GM, Golden DJ, Gravel-Pucillo K, Rasooly D, Horimoto ARVR, Matty R, Gaziano L, Liu Y, Smit IA, Zdrazil B, Tsepilov Y, Costa L, Kosik N, Huffman JE, Tartaglia GG, Bini G, Proietti G, Ioannidis H, Karim MA, Hunter F, Hemani G, Butterworth AS, Di Angelantonio E, Langenberg C, Ghoussaini M, Leach AR, Liao KP, Damrauer S, Selva LE, Whitbourne S, Tsao PS, Moser J, Gaunt T, Cai T, Whittaker JC, Million Veteran Program, Casas JP, Muralidhar S, Gaziano JM, Cho K, Pereira AC. Leveraging large-scale biobanks for therapeutic target discovery. Human Genetics and Genomics Advances. 7 (2026) 100556. https://doi.org/10.1016/j.xhgg.2025.100556.
License:
This episode is based on an open-access article published under the Creative Commons Attribution 4.0 International License (CC BY 4.0) - https://creativecommons.org/licenses/by/4.0/
Support:
Base by Base – Stripe donations: https://donate.stripe.com/7sY4gz71B2sN3RWac5gEg00
Official website: https://basebybase.com
On PaperCast Base by Base you’ll discover the latest in genomics, functional genomics, structural genomics, and proteomics.
Episode link: https://basebybase.com/episodes/biobank-mendelian-randomization-targets
QC:
This episode was checked against the original article PDF and publication metadata for the episode release published on 2026-02-25.
QC Scope:
- article metadata and core scientific claims from the narration
- excludes analogies, intro/outro, and music
- transcript coverage: Audited the transcript portions describing Mendelian randomization (MR) as a nature-encoded trial, biobank meta-analysis (MVP/UKBB/FinnGen), molecular instruments (cis-eQTL/pQTL), concordant signal filtering, ML ranking, rediscovery/repurposing highlights, and lipid-target case study with ANXA2.
- transcript topics: Mendelian randomization as a natural clinical trial; Biobank meta-analysis (MVP, UKBB, FinnGen) and phenome-wide scope; cis-eQTL and cis-pQTL instrument sources; Two-sample MR and concordant-instrument filtering; XGBoost classifier for target prioritization; Drug rediscovery and repurposing (ANXA2, PCSK9, HMGCR)
QC Summary:
- factual score: 10/10
- metadata score: 10/10
- supported core claims: 8
- claims flagged for review: 0
- metadata checks passed: 4
- metadata issues found: 0
Metadata Audited:
- article_doi
- article_title
- article_journal
- license
Factual Items Audited:
- 69,669 gene-trait pairs with strong causal evidence (p ≤ 1.6×10⁻⁹)
- 6,447 genes with strong causal evidence across 2,003 phenotypes
- rediscovered ~9% of approved drug targets in ChEMBL34
- ML ranking classifier (XGBoost) with precision-recall AUC = 0.79
- ANXA2 nominated as a lipid-regulation target, acting via PCSK9 inhibition
- Trastuzumab-associated cardiotoxicity flagged as a risk