Software Architecture-Based Failure Prediction
MetadataShow full item record
Depending on the role of software in everyday life, the cost of a software failure can sometimes be unaffordable. During system execution, errors may occur in system components and failures may be manifested due to these errors. These errors differ with respect to their effects on system behavior and consequent failure manifestation manners. Predicting failures before their manifestation is important to assure system resilience. It helps avoid the cost of failures and enables systems to perform corrective actions prior to failure occurrences. However, effective runtime error detection and failure prediction techniques encounter a prohibitive challenge with respect to the control flow representation of large software systems with intricate control flow structures. In this thesis, we provide a technique for failure prediction from runtime errors of large software systems. Aiming to avoid the possible difficulties and inaccuracies of the existing Control Flow Graph (CFG) structures, we first propose a Connection Dependence Graph (CDG) for control flow representation of large software systems. We describe the CDG structure and explain how to derive it from program source code. Second, we utilize the proposed CDG to provide a connection-based signature approach for control flow error detection. We describe the monitor structure and present the error checking algorithm. Finally, we utilize the detected errors and erroneous state parameters to predict failure occurrences and modes during system runtime. We craft runtime signatures based on these errors and state parameters. Using system error and failure history, we determine a predictive function (an estimator) for each failure mode based on these signatures. Our experimental evaluation for these techniques uses a large open-source software (PostgreSQL 8.4.4 database system). The results show highly efficient control flow representation, error detection, and failure prediction techniques. This work contributes to software reliability by providing a simple and accurate control flow representation and utilizing it to detect runtime errors and predict failure occurrences and modes with high accuracy.