Question
There are 3 data sources for this assignment:
• Humvar – a list of variants with known impact (Benign/Pathogenic) across all diseases
• Lab Variants – a list of mutations in HGVS format and the phenotype the patients had
• Variants Annotated – the mutations in Lab Variants, annotated with variant effects
You must train a decision tree on the Humvar dataset and report its accuracy, true positive rate (TPR), and false positive rate (FPR) using a train/test experiment. Then, using the decision tree trained on Humvar, predict the outcome of mutations in epilepsy and muscular conditions, again reporting accuracy, TPR, and FPR.
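The three metrics requested above can be computed directly from true and predicted labels. The following is a minimal sketch in plain Python, assuming labels are the strings "Pathogenic" and "Benign" with "Pathogenic" as the positive class; the function names and the toy labels are hypothetical and not part of the assignment data. Training the tree itself (for example with scikit-learn's `DecisionTreeClassifier`) is not shown here.

```python
def confusion_counts(y_true, y_pred, positive="Pathogenic"):
    """Count true/false positives and negatives for a binary task."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def report(y_true, y_pred, positive="Pathogenic"):
    """Return accuracy, true positive rate and false positive rate."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    nan = float("nan")
    return {
        # Fraction of all predictions that match the known label.
        "accuracy": (tp + tn) / total if total else nan,
        # TPR = TP / (TP + FN): share of pathogenic variants that were caught.
        "tpr": tp / (tp + fn) if (tp + fn) else nan,
        # FPR = FP / (FP + TN): share of benign variants wrongly flagged.
        "fpr": fp / (fp + tn) if (fp + tn) else nan,
    }


# Toy labels for illustration only (not taken from Humvar).
P, B = "Pathogenic", "Benign"
y_true = [P, P, P, B, B, B, B, P]
y_pred = [P, P, B, B, P, B, B, P]
metrics = report(y_true, y_pred)
print(metrics)
```

The same `report` function could be applied once to the held-out Humvar test split and once to each disease subset, so that the three sets of numbers are directly comparable.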
Is it easier to predict whether a mutation will cause epilepsy than whether it will cause a muscular condition? How can we tell?
Finally, plot a Receiver Operating Characteristic (ROC) curve for each disease and explain how to interpret the curves.
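An ROC curve is traced by sweeping a decision threshold over the classifier's scores and recording the (FPR, TPR) pair at each step; the area under the curve (AUC) then gives one number for comparing the two diseases. The sketch below is one possible way to compute the points and the AUC in plain Python, assuming labels are encoded as 1 (pathogenic) and 0 (benign) and that a score such as the tree's predicted probability of the positive class is available. The function names and the toy scores are hypothetical. Tied scores are grouped into a single step, which matters for decision trees because all samples in the same leaf share one probability.

```python
def roc_points(y_true, scores):
    """Return ROC points as (fpr, tpr) pairs, from (0, 0) to (1, 1)."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("ROC needs at least one positive and one negative")
    # Rank samples from the highest score to the lowest.
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every sample tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Toy scores for illustration only (not taken from the assignment data).
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
points = roc_points(y_true, scores)
print(points)
print(round(auc(points), 4))
```

A curve that hugs the top-left corner (AUC near 1) indicates that pathogenic variants are ranked above benign ones, while a curve near the diagonal (AUC near 0.5) is no better than chance. Computing the points separately for the epilepsy and muscular subsets gives a threshold-independent way to judge which disease is easier to predict.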