Lightweight Top-K Analysis in DBMSs Using Data Stream Analysis Techniques

dc.contributor.author: Huang, Jing
dc.contributor.supervisor: Martin, Patrick
dc.date: 2009-08-31 12:42:48.944
Queen's University at Kingston
dc.description: Thesis (Master, Computing) -- Queen's University, 2009-08-31 12:42:48.944
dc.description.abstract: Problem determination is the identification of problems and performance issues that occur in an observed system and the discovery of solutions to resolve them. Top-k analysis is a common task in problem determination in database management systems. It involves the identification of the set of most frequently occurring objects according to some criteria, such as the top-k most frequently used tables or most frequent queries, or the top-k queries with respect to CPU usage or amount of I/O. Effective problem determination requires sufficient monitoring and rapid analysis of the collected monitoring statistics. System monitoring often incurs a great deal of overhead and can interfere with the performance of the observed system. Processing vast amounts of data may require several passes through the analysis system and thus be very time-consuming. In this thesis, we present our lightweight top-k analysis framework, in which lightweight monitoring tools continuously poll system statistics, producing several continuous data streams that are then processed by stream mining techniques. The results produced by our tool are the “top-k” values for the observed statistics. This information can be valuable to an administrator in determining the source of a problem. We implement the framework as a prototype system called Tempo. Tempo uses IBM DB2’s snapshot API and a lightweight monitoring tool called DB2PD to generate the data streams. The system reports the top-k executed SQL statements and the top-k most frequently accessed tables in an on-line fashion. Several experiments are conducted to verify the feasibility and effectiveness of our approach. The experimental results show that our approach achieves low system overhead.
dc.format.extent: 1834799 bytes
dc.relation.ispartofseries: Canadian theses
dc.rights: This publication is made available by the authority of the copyright owner solely for the purpose of private study and research and may not be copied or reproduced except as permitted by the copyright laws without written authority from the copyright owner.
dc.subject: Data Stream Analysis
dc.subject: Top-k Analysis
dc.title: Lightweight Top-K Analysis in DBMSs Using Data Stream Analysis Techniques