Asm2Seq: Explainable Assembly Code Functional Summary Generation

Authors

Taviss, Scarlett

Type

thesis

Language

eng

Keyword

Machine Learning, Assembly, Code Summary, Attention, Long Short-Term Memory, Gated Recurrent Unit, Summary Generation, Sequence-to-Sequence Learning, Asm2Seq

Abstract

Technology is at the forefront of nearly every aspect of the modern world. Humans write code so that technology functions as desired, but the low-level code that actually controls the computer is often poorly understood without significant human analysis. This research aims to bridge this gap by producing human-readable summaries of the functionality of the assembly code needed for computer execution. Vulnerability datasets are used as the starting datasets for the model because finding and understanding vulnerabilities in a program is important for software maintenance, software analysis, and software development. Source code files exhibiting various vulnerabilities are compiled to produce their assembly code counterparts, which are used as input to the model. Descriptions of how the vulnerabilities function are extracted from the source code files and used as the desired output of the model. Encoder-decoder experiments with various neural network architectures are run to determine the best model. Each experiment undergoes significant training in order to produce accurate predictions. Attention was added in order to understand which aspects of the assembly code had the biggest effect on generating the summary. The models achieved high accuracy and high Bilingual Evaluation Understudy (BLEU) scores, both of which indicate a well-performing network. Comparisons between model predictions and the true descriptions showcase the favorable results.

License

Queen's University's Thesis/Dissertation Non-Exclusive License for Deposit to QSpace and Library and Archives Canada
ProQuest PhD and Master's Theses International Dissemination Agreement
Intellectual Property Guidelines at Queen's University
Copying and Preserving Your Thesis
This publication is made available by the authority of the copyright owner solely for the purpose of private study and research and may not be copied or reproduced except as permitted by the copyright laws without written authority from the copyright owner.
CC0 1.0 Universal