Inferring Systemic Functional Language Models

Authors

Alsadhan, Nasser

Date

2014-08-29

Type

thesis

Language

eng

Keyword

Text Mining

Abstract

Language production in the brain is a complicated process that is not yet fully understood. The bag-of-words model, which considers the frequency of each word in a document, is a useful approach in many text mining fields, but it provides no information about how language is produced. Systemic networks model language as a set of choices, where each choice operates in a particular context. Capturing the patterns of choices used to create a particular document provides useful information about the authors and what they were feeling and thinking when they created it. However, producing systemic networks manually is expensive. We define an automated way of producing systemic networks. Given a set of documents, we cluster words of interest into smaller groups using Non-Negative Matrix Factorization (NNMF). We create hierarchical clusters, which we interpret as systemic networks. We validate the produced systemic networks in a number of ways: we use them in an authorship prediction problem and compare their results with those of the bag-of-words model, and we assess how well they cluster the different choices made by the authors. We also generate random systemic networks and compare their performance with that of the produced systemic networks.
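
The clustering step described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not say how the hierarchy is built, so the recursive two-way split, the argmax assignment of words to components, the stopping rule, and the names `nmf` and `build_network` are all assumptions made for illustration. The factorization uses the standard multiplicative-update rules for the Frobenius objective, and the input is a toy document-by-word count matrix.

```python
import numpy as np

def nmf(X, k, iters=200, seed=0):
    """Approximate non-negative X (docs x words) as W @ H with k components,
    using multiplicative updates for the Frobenius-norm objective."""
    rng = np.random.default_rng(seed)
    n_docs, n_words = X.shape
    W = rng.random((n_docs, k)) + 1e-3
    H = rng.random((k, n_words)) + 1e-3
    eps = 1e-9  # guards against division by zero
    for _ in range(iters):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

def build_network(X, words, min_size=2, seed=0):
    """Hypothetical hierarchy builder: split the words into two groups with
    a rank-2 NMF, then recurse on each group. Leaves are lists of words."""
    if len(words) <= min_size:
        return list(words)
    _, H = nmf(X, 2, seed=seed)
    labels = H.argmax(axis=0)  # assign each word to its strongest component
    groups = [[i for i, lab in enumerate(labels) if lab == c] for c in (0, 1)]
    if not groups[0] or not groups[1]:
        return list(words)  # no useful split found; stop here
    return [
        build_network(X[:, idx], [words[i] for i in idx], min_size, seed)
        for idx in groups
    ]

# Toy data (made up): rows are documents, columns are counts of words of interest.
words = ["i", "we", "you", "might", "must", "should"]
X = np.array([
    [4, 0, 1, 3, 0, 0],
    [3, 1, 0, 2, 0, 1],
    [0, 5, 2, 0, 4, 1],
    [0, 3, 1, 0, 3, 2],
], dtype=float)

tree = build_network(X, words)
print(tree)
```

The nested lists returned by `build_network` play the role of a choice hierarchy: each internal node is a choice between two word groups, and each leaf is a set of words that tend to be used together in the same documents.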

Description

Thesis (Master, Computing) -- Queen's University, 2014-08-28 23:28:17.897

License

This publication is made available by the authority of the copyright owner solely for the purpose of private study and research and may not be copied or reproduced except as permitted by the copyright laws without written authority from the copyright owner.
