On November 20th, Bjørn Bredesen successfully defended his PhD thesis with the title: Modelling the structure, function and evolution of Polycomb/Trithorax Response Elements.
“The broad repertoire of cell types in our bodies is enabled by gene regulatory systems. Polycomb/Trithorax Response Elements (PREs) are regulatory elements in DNA that recruit Polycomb/Trithorax group (PcG/TrxG) proteins and maintain an epigenetic memory of gene transcription states across cell division.
PREs were first discovered in the fruit fly, where they are enriched in a variety of sequence motifs. These motifs, together with a small set of known PREs, have previously been used for training predictive models. There have been discrepancies between computationally predicted PREs and experimentally mapped PcG/TrxG binding, as well as between different experimental data sets. We reconciled these differences by training models with genome-wide experimental data. We also developed a new method that improves the prediction of PREs: SVM-MOCCA.
Over the past decade, vertebrate PREs have also been identified. The sequence features of vertebrate PREs are less well understood. We developed a new reinforcement learning regimen that exploits genome-wide experimental data and machine learning methods that do not require prior motif knowledge. Using this, we predicted PcG/TrxG binding in the fruit fly, mouse and human genomes and gained insights into the evolution of the underlying DNA sequences.
I developed a suite of tools called MOCCA, which implements multiple methods, including SVM-MOCCA and a derivative method, RF-MOCCA. We demonstrated the broader applicability of SVM-MOCCA by training it on a different class of regulatory elements: boundary elements. I also developed Gnocis, a package for Python 3 that streamlines the reproducible analysis and modelling of regulatory element DNA sequences. Gnocis provides tools for data processing and a declarative syntax for feature set and model specification, and implements functionality for model evaluation and genome-wide prediction.”
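The paper list for this thesis mentions k-spectra as the basis for prediction without prior motif knowledge. As a rough illustration of that idea only, the sketch below turns a DNA sequence into a vector of k-mer frequencies, which a classifier could then use as features. The function name `k_spectrum` and the default `k=3` are assumptions made for this example; this is not the MOCCA or Gnocis API, and it is not the thesis implementation.

```python
from collections import Counter
from itertools import product


def k_spectrum(sequence, k=3):
    """Return relative frequencies of all DNA k-mers in a sequence.

    Hypothetical helper for illustration; not taken from MOCCA or Gnocis.
    """
    sequence = sequence.upper()
    # Count every overlapping window of length k.
    counts = Counter(
        sequence[i:i + k] for i in range(len(sequence) - k + 1)
    )
    # Enumerate all 4**k k-mers so every sequence maps to the same features.
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # Windows containing other symbols (for example N) are ignored.
    total = sum(counts[kmer] for kmer in kmers)
    return {kmer: (counts[kmer] / total if total else 0.0) for kmer in kmers}


if __name__ == "__main__":
    spectrum = k_spectrum("GAGAGAGCGCGC", k=2)
    # Print only the k-mers that actually occur in the sequence.
    for kmer, freq in spectrum.items():
        if freq > 0:
            print(kmer, round(freq, 3))
```

Because the feature set is every possible k-mer rather than a curated motif list, no prior knowledge of binding motifs is needed; the trade-off is that the number of features grows as 4 to the power of k.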
Bredesen B. A., Rehmsmeier M. DNA sequence models of genome-wide Drosophila melanogaster Polycomb binding sites improve generalization to independent Polycomb Response Elements. Nucleic Acids Research. 2019;47(15):7781-7797. The article is available in the main thesis and at: https://doi.org/10.1093/nar/gkz617
Bredesen B. A., Rehmsmeier M. Biomarker reinforcement learning with k-spectra enables precise Polycomb target site prediction without prior motif knowledge. Full text not available in BORA.
Bredesen B. A., Rehmsmeier M. MOCCA: A flexible suite for modelling DNA sequence motif occurrence combinatorics. Full text not available in BORA.
Bredesen B. A., Rehmsmeier M. Gnocis: An integrated system for interactive and reproducible analysis and modelling of cis-regulatory elements in Python 3. Full text not available in BORA.