
dc.contributor.author: Nahlawi, Layan [en]
dc.date: 2010-12-15 16:38:05.612
dc.date: 2010-12-15 18:03:00.208
dc.date.accessioned: 2010-12-16T15:07:10Z
dc.date.available: 2010-12-16T15:07:10Z
dc.date.issued: 2010-12-16T15:07:10Z
dc.identifier.uri: http://hdl.handle.net/1974/6242
dc.description: Thesis (Master, Computing) -- Queen's University, 2010-12-15 18:03:00.208 [en]
dc.description.abstract: The recent decade has witnessed great advances in microarray and genotyping technologies, which allow genome-wide single nucleotide polymorphism (SNP) data to be captured on a single chip. As a consequence, genome-wide association studies require the development of algorithms capable of manipulating ultra-large-scale SNP datasets. Towards this goal, this thesis proposes two SNP selection methods: the first uses Independent Component Analysis (ICA) and the second is based on a modified version of Fast Orthogonal Search (mFOS). The first proposed technique, based on ICA, is a filtering technique; it reduces the number of SNPs in a dataset without the need for any class labels. The second proposed technique, orthogonal search based SNP selection, is a multivariate regression approach; it selects the most informative features in SNP data to accurately model the entire dataset. The proposed methods are evaluated by applying them to publicly available gene SNP datasets and comparing the accuracies of each method in reconstructing the datasets. In addition, the selection results are compared with those of another SNP selection method based on Principal Component Analysis (PCA), which was also applied to the same datasets. The results demonstrate the ability of orthogonal search to capture a higher amount of information than the ICA SNP selection approach, all while using a smaller number of SNPs. Furthermore, SNP reconstruction accuracies using the proposed ICA methodology demonstrated the ability to summarize a greater or equivalent amount of information in comparison with the amount of information captured by the PCA-based technique reported in the literature. The execution time of the second developed methodology, mFOS, has paved the way for its application to large-scale genome-wide datasets. [en]
dc.language.iso: eng [en]
dc.relation.ispartofseries: Canadian theses [en]
dc.rights: This publication is made available by the authority of the copyright owner solely for the purpose of private study and research and may not be copied or reproduced except as permitted by the copyright laws without written authority from the copyright owner. [en]
dc.subject: SNP Selection [en]
dc.subject: Fast Orthogonal Search [en]
dc.subject: Independent Component Analysis [en]
dc.subject: Genetic Data Analysis [en]
dc.title: Genetic Feature Selection Using Dimensionality Reduction Approaches: a Comparative Study [en]
dc.type: thesis [en]
dc.description.degree: M.Sc. [en]
dc.contributor.supervisor: Mousavi, Parvin [en]
dc.contributor.department: Computing [en]
dc.degree.grantor: Queen's University at Kingston [en]

