Data Mining the Genetics of Leukemia

Authors

Morton, Geoffrey

Date

2010-01-13T15:28:42Z

Type

thesis

Language

eng

Keyword

Data Mining, Acute Lymphoblastic Leukemia

Abstract

Acute Lymphoblastic Leukemia (ALL) is the most common cancer in children under the age of 15. At present, diagnosis, prognosis and treatment decisions are made based upon blood and bone marrow laboratory testing. With advances in microarray technology it is becoming more feasible to perform genetic assessment of individual patients as well. We used Singular Value Decomposition (SVD) on Illumina SNP, Affymetrix and cDNA gene-expression data and performed aggressive attribute selection using random forests to reduce the number of attributes to a manageable size. We then explored clustering and prediction of patient-specific properties such as disease sub-classification, and especially clinical outcome. We determined that integrating multiple types of data can provide more meaningful information than individual datasets, if combined properly. This method is able to capture the correlation between the attributes. The most striking result is an apparent connection between genetic background and patient mortality under existing treatment regimes. We find that we can cluster well using the mortality label of the patients. Also, using a Support Vector Machine (SVM) we can predict clinical outcome with high accuracy. This thesis will discuss the data-mining methods used and their application to biomedical research, as well as our results and how this will affect the diagnosis and treatment of ALL in the future.

Description

Thesis (Master, Computing) -- Queen's University, 2010-01-12 18:40:44.2

License

This publication is made available by the authority of the copyright owner solely for the purpose of private study and research and may not be copied or reproduced except as permitted by the copyright laws without written authority from the copyright owner.
