Graduation Date
5-2016
Document Type
Master's Thesis
Degree Name
Master of Science
Department or Program
Biological Sciences
Department or Program Chair
Maggie Louie, PhD
First Reader
Wyatt T. Clark, PhD
Second Reader
Randall Hall, PhD
Abstract
As with any complex biological pathway, the splicing process has both advantages and obstacles with respect to the diversity and fidelity of protein production. The potential benefit of being able to produce multiple versions of a gene (isoforms) must be weighed against the additional complexity introduced by the noisy and mechanistically complicated process of splicing. Indeed, research has implicated splicing errors in an increasing number of disorders. Variants that cause disease may operate by disrupting splicing; however, many such variants are annotated as disrupting function through a missense mutation or via an unknown mechanism. The objective of this study is to determine the ubiquity of splice-altering variants (SAVs) in the human genome, with a focus on coding missense and silent synonymous polymorphisms that may impact splicing. As a first step, we evaluated the ability of in silico prediction tools to predict whether a given variant will disrupt splicing. The top-performing tools were then used to predict splicing disruption for two sets of variants in the genome: one data set contained variants located anywhere in an exon, and the second restricted variants by location, focusing specifically on those annotated as being involved in disease. The results demonstrate that the output of some of these prediction tools is biased by variant proximity to the exon-intron junction. In addition, analysis of the data sets suggests that the variants listed as non-splice-affecting in the database include a considerable number of false negatives. These results may be useful for updating the information in widely used databases and so improving the usefulness of such resources. The efforts summarized in this thesis will hopefully bring insight into the mechanisms by which splicing errors contribute to disease development and thus facilitate improvements in disease treatment.