Using Cluster Analysis, Cluster Validation, and Consensus Clustering to Identify Subtypes
Shen, Jess Jiangsheng
Pervasive Developmental Disorders (PDDs) are neurodevelopmental disorders characterized by impairments in social interaction, communication and behaviour [Str04]. Given the diversity and varying severity of PDDs, diagnostic tools attempt to identify homogeneous subtypes within them. The Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV) divides PDDs into five subtypes, but several limitations have been identified with its categorical diagnostic criteria. The goal of this study is to identify putative subtypes in multidimensional data collected from a group of patients with PDDs by using cluster analysis. Cluster analysis is an unsupervised machine learning method that partitions a dataset into subsets sharing common patterns. We apply cluster analysis to data collected from 358 children with PDDs and validate the resulting clusters. There are many cluster analysis algorithms to choose from, each making its own assumptions about the data and about how clusters should be formed, so different algorithms can produce different partitions of the same data. One way to arrive at a meaningful solution is consensus clustering, which integrates the results of several clustering attempts (a cluster ensemble) into a unified consensus answer and can provide robust and accurate results [TJPA05]. In this study, using cluster analysis, cluster validation, and consensus clustering, we identify four clusters that are similar to, and further refine, three of the five subtypes defined in the DSM-IV. This study thus confirms the existence of these three subtypes among patients with PDDs.
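To make the idea of a cluster ensemble concrete, the following is a minimal sketch of one common consensus scheme, co-association (evidence accumulation). It is an illustrative assumption, not necessarily the method used in the study or in [TJPA05]. The function names `co_association` and `consensus_clusters`, the 0.5 threshold, and the toy partitions are all hypothetical. Each input partition is a list of cluster labels for the same items; two items are placed in the same consensus cluster when more than half of the partitions group them together.

```python
def co_association(partitions):
    """Return an n x n matrix whose (i, j) entry is the fraction of
    partitions that put items i and j in the same cluster."""
    n = len(partitions[0])
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / len(partitions) for c in row] for row in counts]


def consensus_clusters(partitions, threshold=0.5):
    """Link every pair of items whose co-association exceeds `threshold`
    and return the connected components as consensus labels."""
    matrix = co_association(partitions)
    n = len(matrix)
    parent = list(range(n))  # union-find forest over the items

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if matrix[i][j] > threshold:
                parent[find(i)] = find(j)

    # Relabel components 0, 1, 2, ... in order of first appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


# Toy ensemble (hypothetical data): three clusterings of six items.
# Label values are arbitrary; only co-membership matters.
partitions = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],  # same grouping as the first, labels permuted
    [1, 1, 1, 1, 0, 0],  # disagrees about the fourth item
]
print(consensus_clusters(partitions))
```

In this toy ensemble, two of the three partitions group the fourth item with the last two, so the consensus follows that majority. Because the scheme uses only pairwise co-membership, it is unaffected by how each algorithm happens to number its clusters.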