CYCLOPS 2.0: Empowering and Applying Molecular Rhythm Analysis in Clinical Data with Confounders
Thursday, September 28, 2023
10:00 AM-12:00 PM
BIOMED PhD Research Proposal
Title:
CYCLOPS 2.0: Empowering and Applying Molecular Rhythm Analysis in Clinical Data with Confounders
Speaker:
Jan A. Hammarlund, PhD Candidate
School of Biomedical Engineering, Science and Health Systems
Drexel University
Advisors:
Ron C. Anafi, MD, PhD
Assistant Professor
Perelman School of Medicine
University of Pennsylvania
Andres Kriete, PhD
Associate Dean for Academic Affairs
Teaching Professor
School of Biomedical Engineering, Science and Health Systems
Drexel University
Details:
Many genes show circadian rhythms across healthy mouse tissues. The same gene can display different cyclic behaviors depending on tissue type, disease state, age, or sex. These findings are informative, but they are based on animal models and can diverge from human physiology. Animal circadian studies carefully record sample collection times, whereas human samples often lack this information. As a result, human circadian analysis requires different methods. One promising approach is unsupervised machine learning.
CYCLOPS (CYCLic Ordering by Periodic Structure) is a machine learning approach developed to order unstructured data. CYCLOPS uses the intrinsic periodic structure of a dataset to assign a relative order to un-timed samples. CYCLOPS was validated on various mouse tissues and was successfully applied to human tissues collected from a single center. Thousands of transcripts were found to be rhythmic in multiple tissues, and some of them are targets of frequently prescribed medications. The algorithm has two drawbacks: it requires many samples, and non-circadian confounding influences can drastically alter the ordering. Both drawbacks are significant in clinical data. Human data feature higher intersubject genetic variability than mouse data. Furthermore, confounding variables that are controlled in mouse experiments are uncontrolled in human clinical data, and these data are often collected across many sites.
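The idea of recovering a relative order from periodic structure can be illustrated with a toy example. The sketch below is not the CYCLOPS algorithm; it is a deliberately simplified stand-in that assumes two hypothetical genes oscillating a quarter cycle apart and reads each sample's phase as an angle in the plane they span. All function names, offsets, and noise levels are illustrative assumptions.

```python
import math
import random

TWO_PI = 2 * math.pi

def simulate(n_samples=24, noise=0.05, seed=0):
    """Simulate two hypothetical genes a quarter cycle apart, sampled in shuffled order."""
    rng = random.Random(seed)
    true_phase = [TWO_PI * i / n_samples for i in range(n_samples)]
    rng.shuffle(true_phase)  # samples arrive without time stamps
    gene_a = [5.0 + math.cos(p) + rng.gauss(0, noise) for p in true_phase]
    gene_b = [3.0 + math.sin(p) + rng.gauss(0, noise) for p in true_phase]
    return true_phase, gene_a, gene_b

def estimate_phases(gene_a, gene_b):
    """Center each gene, then read every sample's phase as an angle in [0, 2*pi)."""
    mean_a = sum(gene_a) / len(gene_a)
    mean_b = sum(gene_b) / len(gene_b)
    return [math.atan2(b - mean_b, a - mean_a) % TWO_PI
            for a, b in zip(gene_a, gene_b)]

def max_circular_error(true_phase, est_phase):
    """Largest angular distance between true and estimated phase, in radians."""
    errors = []
    for t, e in zip(true_phase, est_phase):
        d = abs(t - e) % TWO_PI
        errors.append(min(d, TWO_PI - d))
    return max(errors)

if __name__ == "__main__":
    true_phase, gene_a, gene_b = simulate()
    est_phase = estimate_phases(gene_a, gene_b)
    print("max circular error (rad):", max_circular_error(true_phase, est_phase))
```

In this construction the estimated angles line up with the simulated phases directly. With real un-timed data, an ordering of this kind is only relative: the starting point and direction of the cycle are not determined by the expression data alone.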
It is common to normalize data during pre-processing. However, sample collection time and circadian traits may vary systematically with confounding variables such as collection center. As a result, normalizing for these confounders may remove the actual circadian signal and compromise the ordering. Therefore, new methods are required to identify cycling transcripts and proteins reliably in most clinical data.
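A small simulation can make this risk concrete. The sketch below assumes two hypothetical centers, one collecting around 08:00 and the other around 20:00, and a single gene whose expression depends only on time of day. The center labels, collection hours, amplitude, and noise level are all illustrative assumptions, and per-center mean centering stands in for normalization in general.

```python
import math
import random
from statistics import mean, pvariance

def simulate_gene(n_per_center=50, seed=1):
    """Return (center, hour, expression) tuples; expression depends only on time of day."""
    rng = random.Random(seed)
    samples = []
    for center, mid_hour in (("A", 8.0), ("B", 20.0)):
        for _ in range(n_per_center):
            hour = mid_hour + rng.uniform(-1.0, 1.0)  # each center collects in a narrow window
            rhythm = 2.0 * math.cos(2 * math.pi * (hour - 8.0) / 24.0)
            samples.append((center, hour, 10.0 + rhythm + rng.gauss(0, 0.1)))
    return samples

def center_by_group(samples):
    """Subtract each center's mean expression from that center's samples."""
    centers = sorted({c for c, _, _ in samples})
    group_mean = {c: mean([e for g, _, e in samples if g == c]) for c in centers}
    return [(c, h, e - group_mean[c]) for c, h, e in samples]

if __name__ == "__main__":
    raw = simulate_gene()
    adjusted = center_by_group(raw)
    print("variance before centering:", pvariance([e for _, _, e in raw]))
    print("variance after centering: ", pvariance([e for _, _, e in adjusted]))
```

Because collection time is almost fully determined by center in this setup, subtracting the per-center mean removes nearly all of the time-of-day variation, even though no center effect was simulated.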
To meet this challenge, I will extend CYCLOPS to handle labeled confounding variables. I will validate my approach using clinical, semi-synthetic, and synthetic data. I will then apply the improved CYCLOPS to aggregated data from major repositories to identify molecular rhythms in diseased and adjacent human female breast tissue. Finally, I will demonstrate a new use case for human circadian data by developing regression-based methods to identify circadian-adjusted expression quantitative trait loci.
Contact Information
Natalia Broz
njb33@drexel.edu