Log File Categorization and Anomaly Analysis Using Grammar Inference
Authors
Memon, Ahmed Umar
Date
2008-05-28T18:32:07Z
Type
thesis
Language
eng
Keyword
Anomaly analysis, Log file analysis, Log categorization, Grammar inference, Log file reporting, Robust parsing, Island grammars, Program comprehension
Abstract
In today's information age, vast amounts of sensitive and confidential data are exchanged over an array of different media. This phenomenon has been accompanied by a comparable increase in the number and types of attacks aimed at acquiring this information. Information security and data consistency have hence become critically important. Log file analysis has proven to be a good defense mechanism, as logs provide an accessible record of network activities in the form of server-generated messages. However, manual analysis is tedious and prohibitively time-consuming. Traditional log analysis techniques, based on pattern matching and data mining approaches, are ad hoc and cannot readily adapt to different kinds of log files.
The goal of this research is to explore the use of grammar inference for log file analysis in order to build a more adaptive, flexible, and generic method for message categorization, anomaly detection, and reporting. The grammar inference process employs robust parsing, island grammars, and source transformation techniques.
We test the system on three different kinds of log file training sets, inferring a grammar and generating message categories for each set. We then detect anomalous messages in new log files by using the inferred grammar as a catalog of valid traces, and we present a reporting program that extracts instances of specified message categories from the log files.
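The workflow described above (learn message categories from a training log, treat them as a catalog of valid traces, flag messages outside the catalog, and report instances of a chosen category) can be illustrated with a much simpler stand-in. The following Python sketch does not use grammar inference, robust parsing, or island grammars; it assumes that masking IP addresses and numbers with placeholders is enough to form a category template. Every name in it (MASKS, template, infer_catalog, find_anomalies, report) and every sample log line is hypothetical.

```python
import re

# Hypothetical masking rules: variable fields become placeholder tokens.
# This regex masking is an assumed stand-in for the thesis's inferred grammar.
MASKS = [
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), "<IP>"),
    (re.compile(r"\b\d+\b"), "<NUM>"),
]


def template(message: str) -> str:
    """Reduce a log message to a category template by masking variable fields."""
    for pattern, placeholder in MASKS:
        message = pattern.sub(placeholder, message)
    return " ".join(message.split())  # normalize whitespace


def infer_catalog(training_lines):
    """Collect the set of templates (message categories) seen in training."""
    return {template(line) for line in training_lines if line.strip()}


def find_anomalies(catalog, new_lines):
    """Return messages whose template is absent from the catalog."""
    return [
        line for line in new_lines
        if line.strip() and template(line) not in catalog
    ]


def report(category, lines):
    """Extract every message that belongs to the given category template."""
    return [line for line in lines if template(line) == category]


# Made-up sample data, for illustration only.
training = [
    "Accepted connection from 10.0.0.5 port 22",
    "Accepted connection from 10.0.0.9 port 2222",
    "Disk usage at 81 percent",
]
new_log = [
    "Accepted connection from 192.168.1.7 port 22",
    "Segfault in worker 4",
]

catalog = infer_catalog(training)
anomalies = find_anomalies(catalog, new_log)
accepted = report("Accepted connection from <IP> port <NUM>", new_log)
```

Here the two "Accepted connection" training lines collapse into a single category, so the new connection message is treated as valid, while the "Segfault" message matches no known category and is flagged. A grammar-based approach, as pursued in the thesis, would replace the fixed masking rules with structure inferred from the training logs themselves.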
Description
Thesis (Master, Computing) -- Queen's University, 2008-05-22 14:12:30.199
License
This publication is made available by the authority of the copyright owner solely for the purpose of private study and research and may not be copied or reproduced except as permitted by the copyright laws without written authority from the copyright owner.