Protein-coding Hotspots in the Human Genome
Wednesday, March 10, 2021
2:00 PM-4:00 PM
BIOMED PhD Thesis Defense
Protein-coding Hotspots in the Human Genome: Annotation, Significance, and Their Conservation in Animal Models
Debra V. Klopfenstein, PhD Candidate
School of Biomedical Engineering, Science and Health Systems
Will Dampier, PhD
Department of Microbiology and Immunology
College of Medicine
Uncovering understudied genes that are not yet associated with disease, but that share functions with nearby genes outside their own gene family, can deepen understanding of the underlying molecular mechanisms and may reveal novel drug targets. Previous studies using population genetics approaches did not focus on the chromosomal topology of a large number of major diseases across the human genome.
Clustering algorithms, augmented for this thesis to run on the linear topology of the human genome, identified the densest clusters of ten or more genes, called hotspots. Enrichment analysis of the hotspots finds genes that share both a genomic hotspot and significant gene functions, highlighting genes that are understudied and/or not yet associated with disease as candidates for further study.
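The idea of clustering along a linear genome can be illustrated with a minimal Python sketch. It is not the algorithm developed in the thesis, which is not specified here; it substitutes a simple gap-based grouping on a single chromosome. The function name `find_hotspots`, the `max_gap` parameter, and its default value are hypothetical, chosen only for illustration. The threshold of ten genes comes from the text above.

```python
from typing import List, Tuple

# (gene symbol, midpoint position in base pairs on one chromosome)
Gene = Tuple[str, int]


def find_hotspots(
    genes: List[Gene],
    max_gap: int = 50_000,  # hypothetical distance threshold, not from the thesis
    min_genes: int = 10,    # "ten or more genes" per the abstract
) -> List[List[str]]:
    """Group genes whose neighbouring midpoints lie within max_gap,
    then keep only groups with at least min_genes members."""
    ordered = sorted(genes, key=lambda g: g[1])
    clusters: List[List[str]] = []
    current: List[str] = []
    last_pos = None
    for name, pos in ordered:
        # A gap larger than max_gap closes the current group.
        if last_pos is not None and pos - last_pos > max_gap:
            if len(current) >= min_genes:
                clusters.append(current)
            current = []
        current.append(name)
        last_pos = pos
    # Flush the final group.
    if len(current) >= min_genes:
        clusters.append(current)
    return clusters
```

Given twelve genes spaced 1 kb apart and three isolated genes megabases away, the sketch would report the twelve closely spaced genes as a single hotspot and discard the rest. A density-based method would replace the fixed gap rule in a fuller treatment.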
Methods developed for this thesis include: new ways to compare functions among a set of genes, even if the genes are in different species; a library for examining Gene Ontology relationships; and an augmented exploratory literature search experience using PubMed combined with public-access citation data from the National Institutes of Health’s Open Citation Collection (NIH-OCC).