Monday, October 21, 2019 from Noon to 12:50pm
One Capitol Square - 305, Off-Campus
Public Portion of the Final Defense for
Department of Biostatistics
Virginia Commonwealth University
The three-dimensional (3D) structure of the genome plays a crucial role in the regulation of gene expression. Chromatin conformation capture technologies such as Hi-C have revealed that the genome is organized into a hierarchy of topologically associating domains (TADs), sub-TADs, and chromatin loops that is relatively stable across cell lines and even across species. TADs nevertheless reorganize dynamically during disease development and exhibit cell- and condition-specific differences. Identifying these hierarchical structures and how they change between conditions is a critical step in understanding genome regulation and disease development. Despite their importance, there are relatively few tools for identifying TADs and even fewer for identifying TAD hierarchies. Additionally, there are no publicly available tools for comparing TADs across datasets. Such tools are necessary to conduct large-scale, genome-wide analysis and comparison of 3D structure.
To address the challenge of TAD identification, we developed a novel sliding window-based spectral clustering framework that uses gaps between consecutive eigenvectors to identify TAD boundaries. Our method, implemented in the R package SpectralTAD, selects its parameters automatically, is robust to the sequencing depth, resolution, and sparsity of Hi-C data, and detects hierarchical, biologically relevant TADs. SpectralTAD outperforms four state-of-the-art TAD callers in simulated and experimental settings. We demonstrate that TAD boundaries shared among multiple levels of the TAD hierarchy are more enriched in classical boundary marks and more conserved across cell lines and tissues. SpectralTAD is available at http://bioconductor.org/packages/SpectralTAD/.
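The eigenvector-gap idea can be illustrated with a minimal Python sketch. It is not the SpectralTAD implementation: it omits the sliding window, the automatic parameter selection, and the hierarchy detection, and it assumes the number of eigenvectors and boundaries is known in advance. The function names (eigenvector_gaps, call_boundaries, toy_contacts) and the toy contact values are hypothetical, chosen only for illustration.

```python
import numpy as np


def eigenvector_gaps(contacts, n_vectors):
    """Distance between consecutive loci in a spectral embedding (illustrative)."""
    contacts = np.asarray(contacts, dtype=float)
    degree = contacts.sum(axis=1)
    if np.any(degree <= 0):
        raise ValueError("every locus needs at least one contact")
    # Symmetric normalized Laplacian: L = I - D^{-1/2} A D^{-1/2}
    inv_sqrt = 1.0 / np.sqrt(degree)
    normalized = inv_sqrt[:, None] * contacts * inv_sqrt[None, :]
    laplacian = np.eye(len(degree)) - normalized
    # eigh returns eigenvalues in ascending order; keep the smallest n_vectors
    _, vectors = np.linalg.eigh(laplacian)
    embedding = vectors[:, :n_vectors]
    # Scale each locus (row) to unit length before comparing neighbours
    norms = np.linalg.norm(embedding, axis=1, keepdims=True)
    embedding = embedding / np.maximum(norms, 1e-12)
    # Gap i is the Euclidean distance between locus i and locus i + 1
    return np.linalg.norm(np.diff(embedding, axis=0), axis=1)


def call_boundaries(gaps, n_boundaries):
    """Indices of the n_boundaries largest gaps, in genomic order."""
    top = np.argsort(gaps)[::-1][:n_boundaries]
    return sorted(int(i) for i in top)


def toy_contacts(block_sizes, within=10.0, between=1.0):
    """Noise-free contact matrix with dense diagonal blocks (assumed values)."""
    n = sum(block_sizes)
    contacts = np.full((n, n), between)
    start = 0
    for size in block_sizes:
        contacts[start:start + size, start:start + size] = within
        start += size
    return contacts


if __name__ == "__main__":
    gaps = eigenvector_gaps(toy_contacts([5, 5, 5]), n_vectors=3)
    print(call_boundaries(gaps, n_boundaries=2))
```

On this noise-free toy matrix of three equally sized domains, the gaps are close to zero inside each domain and large only between loci 4 and 5 and between loci 9 and 10, so those two positions are reported as boundaries. Real Hi-C matrices are sparse and noisy, which is the situation the windowed approach described above is meant to handle.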
To address the problem of TAD comparison, we developed TADCompare. TADCompare is based on a spectral clustering-derived measure called the eigenvector gap, which enables a locus-by-locus comparison of TAD boundary differences between datasets. Using this measure, we introduce methods for identifying differential and consensus TAD boundaries and for tracking TAD boundary changes over time. We further propose a novel framework for the systematic classification of TAD boundary changes. Colocalization and gene enrichment analyses of the different types of TAD boundary changes revealed distinct biological functions associated with each type. TADCompare is available at https://github.com/dozmorovlab/TADCompare.
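A locus-by-locus comparison of this kind could be sketched as follows in Python. This is an illustrative assumption rather than the TADCompare algorithm: it takes two precomputed eigenvector-gap vectors of equal length, standardizes their difference, and flags loci whose standardized difference exceeds a threshold. The function name differential_boundaries, the use of a z-score, and the default threshold of 2.0 are all hypothetical choices.

```python
import numpy as np


def differential_boundaries(gaps_a, gaps_b, z_threshold=2.0):
    """Flag loci whose eigenvector gap differs between two datasets (illustrative)."""
    gaps_a = np.asarray(gaps_a, dtype=float)
    gaps_b = np.asarray(gaps_b, dtype=float)
    if gaps_a.shape != gaps_b.shape:
        raise ValueError("gap vectors must cover the same loci")
    diff = gaps_a - gaps_b
    spread = diff.std()
    if spread == 0:
        # Identical gap profiles: nothing is differential
        return np.array([], dtype=int), np.zeros_like(diff)
    # Standardize so the threshold is comparable across datasets
    z = (diff - diff.mean()) / spread
    return np.flatnonzero(np.abs(z) > z_threshold), z


if __name__ == "__main__":
    # Dataset A has boundaries at loci 4 and 9; dataset B at loci 4 and 11
    gaps_a = [0, 0, 0, 0, 1.4, 0, 0, 0, 0, 1.4, 0, 0, 0, 0]
    gaps_b = [0, 0, 0, 0, 1.4, 0, 0, 0, 0, 0, 0, 1.4, 0, 0]
    flagged, scores = differential_boundaries(gaps_a, gaps_b)
    print(flagged.tolist())
```

In this toy input the boundary at locus 4 is shared and scores zero, while loci 9 and 11 are flagged with opposite signs, since one boundary is present only in the first dataset and the other only in the second. The sign of the score is one simple way to begin distinguishing types of boundary change.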