Using Deep Learning to predict the mortality of Leukemia patients

Authors

Muthalaly, Reena Shaw

Type

thesis

Language

eng

Keyword

Deep Learning, H2O, Leukemia, Cancer, Acute Lymphoblastic Leukemia, ALL

Abstract

“If it were not for the great variability among individuals, medicine might as well be a science, not an art.” So wrote Sir William Osler, Canadian physician and McGill alumnus, in 1892. Personalized medicine tailors the treatment of patients to their unique genetic makeup and genetic environment. Equipped with details of individual variation, physicians can separate patients into subgroups and predict which patients must be treated aggressively and which would respond to one drug rather than another. Personalized medicine is now being applied to the prediction of mortality in childhood Acute Lymphoblastic Leukemia (ALL) patients, because individual children differ in the sensitivity of their leukemic cells and in their response to treatment-related toxicity. Currently, mortality prediction for childhood ALL is based on risk stratification performed by expert doctors. Genotypic information (the actual set of genes), phenotypic information (the expression of those genes in observable traits) and clinical information are used to stratify children into risk categories. The information collected by doctors for risk stratification includes response to certain drugs, clinical factors such as age and gender, and biological measurements such as white blood cell count. The goal of this thesis is to automate the prediction of childhood ALL mortality using the genotypic and phenotypic information of patients. We intend to achieve this with Deep Learning, which is known to analyze non-linear and complex information, such as cell interactions, effectively. We carry out this work with the Big Data software H2O, accessed through R. We first experimented with the standard Deep Learning parameters, and then adjusted the size and shape of the general network by optimizing the depth and the retention factor of hidden neurons. We improved the accuracy by using the techniques of Out-of-Bag Sampling, Bagging and Voting. We then tuned H2O’s platform-specific network parameters, and finally tuned the number of votes needed to obtain the highest prediction accuracy. We built confusion matrices for each of our Deep Learning models to evaluate how well the models perform. Our results are promising: they are corroborated by the clinical dataset, and the models perform better than the physicians’ predictions.
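
The abstract names bagging, out-of-bag sampling, voting, vote-count tuning and confusion matrices without showing how they fit together. The following is a minimal sketch of that general pattern in plain Python, not the thesis's actual H2O/R code. All function names are hypothetical, and `fit_stump` is an assumed stand-in for a trained Deep Learning model: it only thresholds a single numeric feature.

```python
import random
from collections import Counter

def bootstrap_indices(n, rng):
    """Draw n indices with replacement; return (in_bag, out_of_bag)."""
    in_bag = [rng.randrange(n) for _ in range(n)]
    chosen = set(in_bag)
    out_of_bag = [i for i in range(n) if i not in chosen]
    return in_bag, out_of_bag

def fit_stump(xs, ys):
    """Assumed stand-in for one trained network: a threshold placed
    midway between the mean feature value of each class."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    if not pos or not neg:
        only = 1 if pos else 0  # sample held one class: always predict it
        return lambda x: only
    pos_mean, neg_mean = sum(pos) / len(pos), sum(neg) / len(neg)
    cut = (pos_mean + neg_mean) / 2
    high_is_pos = pos_mean >= neg_mean
    return lambda x: int((x > cut) == high_is_pos)

def train_bagged(xs, ys, n_models, seed=0):
    """Bagging: fit each model on its own bootstrap sample and keep the
    out-of-bag indices, which that model never saw during training."""
    rng = random.Random(seed)
    models, oob_sets = [], []
    for _ in range(n_models):
        in_bag, oob = bootstrap_indices(len(xs), rng)
        models.append(fit_stump([xs[i] for i in in_bag],
                                [ys[i] for i in in_bag]))
        oob_sets.append(oob)
    return models, oob_sets

def vote(models, x, min_votes):
    """Predict class 1 when at least min_votes models predict 1."""
    return int(sum(m(x) for m in models) >= min_votes)

def tune_min_votes(models, xs, ys):
    """Try every vote threshold and return (best_threshold, accuracy)."""
    best_k, best_acc = 1, -1.0
    for k in range(1, len(models) + 1):
        preds = [vote(models, x, k) for x in xs]
        acc = sum(p == y for p, y in zip(preds, ys)) / len(ys)
        if acc > best_acc:
            best_k, best_acc = k, acc
    return best_k, best_acc

def confusion_matrix(y_true, y_pred):
    """Count true/false negatives and positives for binary labels."""
    counts = Counter(zip(y_true, y_pred))
    return {"tn": counts[(0, 0)], "fp": counts[(0, 1)],
            "fn": counts[(1, 0)], "tp": counts[(1, 1)]}
```

In this sketch the vote threshold would be tuned on data held out from training (for example, the out-of-bag rows), and the confusion matrix would be computed from the voted predictions at the chosen threshold.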

License

Queen's University's Thesis/Dissertation Non-Exclusive License for Deposit to QSpace and Library and Archives Canada
ProQuest PhD and Master's Theses International Dissemination Agreement
Intellectual Property Guidelines at Queen's University
Copying and Preserving Your Thesis
This publication is made available by the authority of the copyright owner solely for the purpose of private study and research and may not be copied or reproduced except as permitted by the copyright laws without written authority from the copyright owner.
CC0 1.0 Universal
