Automated Analysis of Load Tests Using Performance Counter Logs

Malik, Haroon
Performance Counter, Load Test
Load testing remains the most integral part of testing and measuring the performance of Large Scale Software Systems (LSS). During a load test, the system under test is closely monitored, which produces an extremely large amount of logging data, e.g., performance counter logs. A performance counter log captures run-time system properties such as CPU utilization, disk I/O, queues, and network traffic. Such information is of vital interest to performance analysts. It helps them observe the system’s behavior under load by comparing it against the system’s documented or expected behavior. In practice, for an LSS, it is impossible for an analyst to skim through the large number of performance counters to find the required information. Instead, analysts often use ‘rules of thumb’. In an LSS, there is no single person with complete system knowledge. In this thesis, we present methodologies that help performance analysts 1) compare load tests more effectively to detect performance deviations, which may lead to Service Level Agreement (SLA) violations, and 2) obtain a smaller, manageable set of important performance counters to assist in the root cause analysis of the detected deviations. We demonstrate our methodologies through case studies based on load test data obtained from both a large-scale industrial system and an open source benchmark system. Our proposed methodologies can provide up to an 89% reduction in the set of performance counters while detecting performance deviations with few false positives (i.e., 95% average precision).