|Title: ||Streaming Random Forests|
|Authors: ||Abdulsalam, Hanady|
|Keywords: ||Data mining|
|Issue Date: ||2008|
|Series/Report no.: ||Canadian theses|
|Abstract: ||Recent research addresses the problem of data-stream mining
to deal with applications that require processing huge amounts of data
such as sensor data analysis and financial applications.
Data-stream mining algorithms incorporate special provisions to meet
the requirements of stream-management systems: that is, stream
algorithms must be online and incremental, processing each data
record only once (or a few times); adaptive to distribution changes;
and fast enough to accommodate high arrival rates.
We consider the problem of data-stream classification,
introducing an online and incremental stream-classification
ensemble algorithm, Streaming Random Forests,
an extension of the Random Forests algorithm
by Breiman, which is a standard classification algorithm.
Our algorithm is designed to handle multi-class classification problems.
It is able to deal with
data streams having an evolving nature and
a random arrival rate of training/test data records.
The algorithm, in addition, automatically adjusts its
parameters based on the data seen so far.
Experimental results on real and synthetic data
demonstrate that the algorithm performs successfully.
Without losing classification accuracy, our algorithm
is able to handle multi-class problems for which the
underlying class boundaries drift, and to cope when blocks of training
records are too small to build or update the classification model.|
|Description: ||Thesis (Ph.D, Computing) -- Queen's University, 2008-07-15 16:12:33.221|
|Appears in Collections:||Queen's Theses & Dissertations|
Computing Graduate Theses