Biomedical Data Mining with Evolutionary Computing
Loading...
Authors
Sha, Chengyuan
Date
Type
thesis
Language
eng
Keyword
Linear Genetic Algorithm , Machine learning , Metabolomics , Evolutionary computing
Alternative Title
Abstract
Direct link between metabolism and cell and organism phenotype in health and disease makes metabolomics, a high throughput study of small molecular metabolites, an essential methodology for understanding and diagnosing disease development and progression. Machine learning methods have seen increasing adoptions in metabolomics thanks to their powerful prediction abilities. However, the “black-box” nature of many machine learning models remains a major challenge for wide acceptance and utility as it makes the interpretation of decision process difficult. This challenge is particularly predominant in biomedical research where understanding of the underlying decision making mechanism is essential for ensuring safety and gaining new knowledge.
In this thesis, we proposed 1) a novel computational framework, Systems Metabolomics using Interpretable Learning and Evolution (SMILE), for supervised metabolomics data analysis 2) an extension of SMILE on hidden class clustering (SMILE-clustering). Our methodology uses an evolutionary algorithm to learn interpretable predictive models and to identify the most influential metabolites and their interactions in association with disease. Moreover, we have developed a web application with a graphical user interface that can be used for easy analysis, interpretation and visualization of the results. Performance of the method and utilization of the web interface is shown using metabolomics data for Alzheimer’s disease (AD).
SMILE was able to identify several influential metabolites in AD and to provide interpretable predictive models that can be further used for a better understanding of the metabolic background of AD. In addition, SMILE-clustering enables us to discover two distinct classes in a community-based adolescent lipidomics dataset. SMILE addresses the emerging issue of interpretability and explainability in machine learning, and contributes to more transparent and powerful applications of machine learning in bioinformatics.
Description
Citation
Publisher
License
Queen's University's Thesis/Dissertation Non-Exclusive License for Deposit to QSpace and Library and Archives Canada
ProQuest PhD and Master's Theses International Dissemination Agreement
Intellectual Property Guidelines at Queen's University
Copying and Preserving Your Thesis
This publication is made available by the authority of the copyright owner solely for the purpose of private study and research and may not be copied or reproduced except as permitted by the copyright laws without written authority from the copyright owner.
Attribution-NonCommercial 3.0 United States
ProQuest PhD and Master's Theses International Dissemination Agreement
Intellectual Property Guidelines at Queen's University
Copying and Preserving Your Thesis
This publication is made available by the authority of the copyright owner solely for the purpose of private study and research and may not be copied or reproduced except as permitted by the copyright laws without written authority from the copyright owner.
Attribution-NonCommercial 3.0 United States