A Bootstrapped Approach for Abusive Intent Detection in Social Media Content
Authors
Simons, Benjemin
Date
Type
thesis
Language
eng
Keyword
natural language processing, intent detection, deep learning
Abstract
The proliferation of Internet-connected devices continues to produce massive collections of human-generated content on websites such as social media platforms. Unfortunately, some of these sites are used by criminal or terrorist organizations to recruit members or to spread rhetoric. Analyzing this content can give insight into the future actions of its writers, and this information can help organizations take proactive measures to modify or stop those actions. The textual feature of interest is the expression of abusive intent, which can be thought of as a plan to carry out a malicious action. The proposed approach detects abuse and intent in documents independently, then computes a joint prediction for each document. Abusive language detection is a well-studied problem, so the abuse model could be trained using supervised learning. The intent detection model requires a semi-supervised technique, since no labelled datasets exist. An initial collection of labels was therefore generated using a linguistic model, and these labels were then used to co-train a statistical model and a deep learning model. Evaluated against crowd-sourced labels, the abuse and intent models were found to have accuracies of 95% and 80%, respectively. The joint predictions were then used to prioritize documents for manual assessment.
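The final step described in the abstract, combining independent abuse and intent predictions and ranking documents for manual review, can be illustrated with a minimal sketch. The abstract does not state how the joint prediction is computed; the product of the two probabilities used here is an assumption made for illustration only, and the names `ScoredDocument`, `joint_score`, and `prioritize` are hypothetical rather than taken from the thesis.

```python
from typing import List, NamedTuple


class ScoredDocument(NamedTuple):
    """A document with per-model probabilities (hypothetical structure)."""
    text: str
    abuse_prob: float   # assumed output of a separately trained abuse model
    intent_prob: float  # assumed output of a separately trained intent model


def joint_score(doc: ScoredDocument) -> float:
    # Assumption: the joint prediction is the product of the two
    # probabilities. The thesis may combine them differently.
    return doc.abuse_prob * doc.intent_prob


def prioritize(docs: List[ScoredDocument]) -> List[ScoredDocument]:
    # Highest joint score first, so reviewers see the likeliest
    # expressions of abusive intent at the top of the queue.
    return sorted(docs, key=joint_score, reverse=True)


if __name__ == "__main__":
    queue = prioritize([
        ScoredDocument("document A", abuse_prob=0.95, intent_prob=0.10),
        ScoredDocument("document B", abuse_prob=0.90, intent_prob=0.80),
        ScoredDocument("document C", abuse_prob=0.20, intent_prob=0.90),
    ])
    for doc in queue:
        print(f"{joint_score(doc):.3f}  {doc.text}")
```

Under this assumption a document scores highly only when both models agree, so text that is abusive but expresses no intent, or that expresses intent without abuse, falls lower in the review queue.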
License
Queen's University's Thesis/Dissertation Non-Exclusive License for Deposit to QSpace and Library and Archives Canada
ProQuest PhD and Master's Theses International Dissemination Agreement
Intellectual Property Guidelines at Queen's University
Copying and Preserving Your Thesis
This publication is made available by the authority of the copyright owner solely for the purpose of private study and research and may not be copied or reproduced except as permitted by the copyright laws without written authority from the copyright owner.
CC0 1.0 Universal