Detecting PDF Javascript Malware Using Clone Detection

Authors

Karademir, Saruhan

Date

2013-10-02

Type

thesis

Language

eng

Keyword

NiCad, Clone Detection, PDF, Security, Acrobat, Malware

Abstract

One common vector for malware is JavaScript embedded in Adobe Acrobat (PDF) files. In this thesis, we investigate using near-miss clone detectors to find this malware. We start by collecting a set of PDF files containing JavaScript malware and a set with clean JavaScript from the VirusTotal repository. We use the NiCad clone detector to find the classes of clones in a small subset of the malicious PDF files. We then evaluate how well these clone classes find similar malicious files in the rest of the malicious collection while avoiding files in the benign collection. Our results show that a training subset of 10% of the malicious files yields 75% detection of previously known malware with 0% false positives. We also use NiCad as a pattern matcher for reflexive calls common in JavaScript malware. This approach detects 57% of the malicious collection with no false positives. When the results of the two experiments are combined, the total coverage of malware rises to 85% while precision stays at 100%. The results are heavily affected by the third-party PDF-to-JavaScript extractor used. When only successfully extracted PDFs are considered, recall increases to 99% and precision remains at 100%.
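The way two detectors' results combine into a single coverage figure can be illustrated with a minimal sketch. It assumes each detector simply reports the set of files it flags and that the combined detector flags the union of the two sets; the function name, the file identifiers, and the toy counts below are hypothetical and are not taken from the thesis or its data.

```python
def precision_recall(flagged, malicious, benign):
    """Return (precision, recall) for a set of flagged file IDs.

    precision = flagged malicious files / all flagged files
    recall    = flagged malicious files / all malicious files
    """
    true_positives = len(flagged & malicious)
    false_positives = len(flagged & benign)
    total_flagged = true_positives + false_positives
    # Convention chosen for this sketch: precision is 0.0 when nothing is flagged.
    precision = true_positives / total_flagged if total_flagged else 0.0
    recall = true_positives / len(malicious) if malicious else 0.0
    return precision, recall


# Hypothetical toy collections (illustrative only, not the thesis data).
malicious = {"m1", "m2", "m3", "m4"}
benign = {"b1", "b2"}

# Hypothetical outputs of the two detectors.
clone_hits = {"m1", "m2"}       # files matched by trained clone classes
reflexive_hits = {"m2", "m3"}   # files matched by the reflexive-call pattern

# A file counts as detected if either detector flags it.
combined = clone_hits | reflexive_hits

precision, recall = precision_recall(combined, malicious, benign)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Because the two detectors can flag overlapping files, the combined recall is generally smaller than the sum of the individual recalls, and precision stays at 100% only if neither detector flags a benign file.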

Description

Thesis (Master, Electrical & Computer Engineering) -- Queen's University, 2013-09-30 11:50:15.156

License

This publication is made available by the authority of the copyright owner solely for the purpose of private study and research and may not be copied or reproduced except as permitted by the copyright laws without written authority from the copyright owner.
