Why are Read-Overlap Graphs Computationally Tractable? (Visiting Scholar)
Rapidly evolving viruses (REVs) manifest themselves in a host as a family of related but genetically distinct virions called a quasispecies. With the advent of next-generation sequencing (NGS) technologies, it has become possible to quickly sequence an enormous number of short reads from genetic samples. This has produced a growing literature on assembling quasispecies sequence fragments into the original set of haplotype sequences, along with their frequencies. HaploClique assembles quasispecies by analyzing the maximal cliques of the read overlap graph (ROG). Results in the HaploClique paper strongly suggest that the ROG from its experiments is often a chordal graph, which is unlikely to happen by chance. This raises a number of questions that we intend to answer: Is the ROG chordal or near-chordal? Is there an easy-to-find phylogenetic tree representing the quasispecies? Finally, is this an artifact of how the simulated data was generated?
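To make the first question concrete: a graph is chordal when every cycle of four or more vertices has a chord, or equivalently when its vertices can be removed one at a time so that each removed vertex is simplicial (its remaining neighbours form a clique). The sketch below tests this by greedy simplicial elimination. It is an illustration only, not HaploClique's method; the function name `is_chordal`, the adjacency-dictionary representation, and the toy read labels `r1`..`r4` are assumptions made for this example.

```python
from itertools import combinations


def is_chordal(adj):
    """Return True if the undirected graph is chordal.

    adj maps each vertex to the set of its neighbours (hypothetical
    representation chosen for this sketch).
    """
    # Work on a copy so the caller's graph is left untouched.
    g = {v: set(ns) for v, ns in adj.items()}
    while g:
        # A simplicial vertex is one whose neighbours are pairwise adjacent.
        simplicial = next(
            (v for v, ns in g.items()
             if all(b in g[a] for a, b in combinations(ns, 2))),
            None,
        )
        if simplicial is None:
            # A non-empty chordal graph always has a simplicial vertex,
            # so the remaining graph contains a chordless cycle.
            return False
        for n in g[simplicial]:
            g[n].discard(simplicial)
        del g[simplicial]
    return True


# Toy overlap graphs on four reads (labels are made up for illustration).
cycle4 = {
    "r1": {"r2", "r4"},
    "r2": {"r1", "r3"},
    "r3": {"r2", "r4"},
    "r4": {"r3", "r1"},
}
# Same cycle plus the chord r1-r3.
with_chord = {
    "r1": {"r2", "r3", "r4"},
    "r2": {"r1", "r3"},
    "r3": {"r1", "r2", "r4"},
    "r4": {"r3", "r1"},
}

print(is_chordal(cycle4))      # False: a chordless 4-cycle
print(is_chordal(with_chord))  # True: the chord splits it into two triangles
```

This simple version runs in polynomial time but is not linear; linear-time tests based on lexicographic breadth-first search or maximum cardinality search would be the natural choice for graphs with many reads.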