Oxford Nanopore Sequencing Multi-Modal Foundation Models for DNA and RNA Modified Base Detection
Monday, December 8, 2025
10:00 AM-12:00 PM
BIOMED PhD Research Proposal
Title:
Oxford Nanopore Sequencing Multi-Modal Foundation Models for DNA and RNA Modified Base Detection
Speaker:
Mian Umair Ahsan, PhD Candidate
School of Biomedical Engineering, Science and Health Systems
Drexel University
Advisors:
Kai Wang, PhD
Research Scientist
Children’s Hospital of Philadelphia (CHOP) Research Institute
Professor of Pathology and Laboratory Medicine
Perelman School of Medicine
University of Pennsylvania
Andres Kriete, PhD
Associate Dean for Academic Affairs
Teaching Professor
School of Biomedical Engineering, Science and Health Systems
Drexel University
Details:
Chemical modifications to DNA and RNA play critical roles in gene regulation, development, viral infection, and human disease, including cancer and neurological disorders. More than 150 epigenetic and epitranscriptomic modifications have been identified across living systems, yet their functions remain poorly understood because of limitations in current detection technologies. Oxford Nanopore Technologies (ONT) sequencing is a single-molecule platform that measures the ionic current signal as single DNA or RNA strands translocate through a nanoscale pore and translates that signal into a sequence of nucleotide bases. This technology offers a transformative opportunity to detect modified nucleotides directly from ionic current signals; however, current computational tools remain constrained by error-prone signal processing, limited adaptability to diverse modifications, and a lack of high-quality training labels, resulting in high false-positive rates, especially for rare modifications.
This study proposes to develop a deep-learning framework that enables robust and accurate detection of diverse DNA and RNA modifications from ONT signals without relying on conventional signal-to-reference alignment or large labeled datasets. We will 1) develop a new signal processing pipeline that uses the approximate signal-to-sequence alignment produced by the ONT sequencing platform, and train Bidirectional Long Short-Term Memory (BiLSTM) and Transformer neural network models on statistically derived features for accurate 5mC methylation detection; 2) train a multi-modal foundation model that jointly learns nanopore current and sequence representations using self-supervised cross-modality objectives across vast unlabeled ONT datasets, enabling robust transfer learning for downstream modification detection tasks; and 3) establish weakly supervised and de novo frameworks for identifying rare and unknown modifications by integrating semi-labeled data, anomaly detection architectures, and full raw-signal modeling.
The proposed research will deliver a versatile and generalizable platform for modification detection across organisms and modification types. By bypassing traditional alignment requirements and using foundation models to exploit unlabeled nanopore datasets at scale, this work addresses fundamental limitations of current tools and will accelerate discovery in epigenetics and epitranscriptomics while improving our understanding of disease mechanisms.
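For readers unfamiliar with the first aim, the idea of "statistically derived features" can be illustrated with a minimal sketch. This is not the speaker's pipeline. It assumes the per-base segment boundaries are already available (for example, from the basecaller's approximate signal-to-sequence alignment). The function name `per_base_features` and the choice of features (mean, median, standard deviation, dwell length) are hypothetical and used only for illustration.

```python
from statistics import mean, median, pstdev


def per_base_features(signal, boundaries):
    """Summarize the current samples assigned to each base.

    signal     -- list of floats (current samples, assumed already normalized)
    boundaries -- list of ints of length n_bases + 1; base i is assumed to
                  cover signal[boundaries[i]:boundaries[i + 1]]

    Returns one dict of summary statistics per base.
    """
    if len(boundaries) < 2:
        raise ValueError("need at least two boundaries to define one base")

    features = []
    for start, end in zip(boundaries, boundaries[1:]):
        # Every segment must be non-empty and lie inside the signal.
        if not 0 <= start < end <= len(signal):
            raise ValueError(f"invalid segment [{start}, {end})")
        segment = signal[start:end]
        features.append(
            {
                "mean": mean(segment),
                "median": median(segment),
                # Population standard deviation is defined for one sample too.
                "std": pstdev(segment),
                # Dwell: number of samples the base spent in the pore.
                "dwell": end - start,
            }
        )
    return features


# Hypothetical toy read: six samples split across three bases.
toy_signal = [0.0, 2.0, 3.0, 5.0, 5.0, 5.0]
toy_boundaries = [0, 2, 3, 6]
toy_features = per_base_features(toy_signal, toy_boundaries)
```

In a setup like the one described, a window of such per-base feature vectors around a candidate site would be the input to a sequence model such as a BiLSTM or Transformer; that modeling step is omitted here.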
Contact Information
Natalia Broz
njb33@drexel.edu