Mining development knowledge to understand and support software logging practices

Authors

Li, Heng

Type

thesis

Language

eng

Keyword

Software engineering, Mining software repositories, Software analytics, Log analytics, Data mining, Program analysis, Machine learning, Software logging, Natural language processing

Abstract

Developers insert logging statements in their source code to trace the runtime behaviors of software systems. Logging statements print runtime log messages, which play a critical role in monitoring system status, diagnosing field failures, and bookkeeping user activities. However, developers typically insert logging statements in an ad hoc manner, which usually results in fragile logging code, i.e., insufficient logging in some code snippets and excessive logging in others. Insufficient logging can significantly increase the difficulty of diagnosing field failures, while excessive logging can cause performance overhead and hide truly important information. The goal of this thesis is to help developers improve their logging practices and the quality of their logging code. We believe that development knowledge (i.e., source code, code change history, and issue reports) contains valuable information that explains developers' rationale for logging, which can help us understand existing logging practices and provide helpful tooling support for such practices. Therefore, this thesis proposes to mine different aspects of development knowledge to understand and support software logging practices. We mine issue reports to understand developers' logging concerns, i.e., the benefits and costs of logging from the developers' perspective. Our findings shed light on future research opportunities for helping developers leverage the benefits of logging while minimizing its costs. We mine source code to learn how developers distribute logging statements in their source code, and propose an approach to provide automated suggestions about where to log. We find that the semantic topics of a code snippet provide another dimension to explain the likelihood of logging a code snippet.
We mine code change history to understand how developers develop and maintain their logging code, and propose an automated approach that can provide developers with log change suggestions when they change their code. We also mine code change history to understand how developers choose log levels for their logging statements, and propose an automated approach that can help developers determine the appropriate log level when they add a new logging statement. This thesis highlights the need for standard logging guidelines and automated tooling support for logging.

License

Attribution-NonCommercial-NoDerivs 3.0 United States
Queen's University's Thesis/Dissertation Non-Exclusive License for Deposit to QSpace and Library and Archives Canada
ProQuest PhD and Master's Theses International Dissemination Agreement
Intellectual Property Guidelines at Queen's University
Copying and Preserving Your Thesis
This publication is made available by the authority of the copyright owner solely for the purpose of private study and research and may not be copied or reproduced except as permitted by the copyright laws without written authority from the copyright owner.
