Please use this identifier to cite or link to this item:
http://hdl.handle.net/1974/1321
| Title: | Streaming Random Forests |
| Authors: | Abdulsalam, Hanady |
| Keywords: | Data mining; Streams; Classification algorithms; Streaming algorithms |
| Issue Date: | 2008 |
| Series/Report no.: | Canadian theses |
| Abstract: | Recent research addresses the problem of data-stream mining
to deal with applications that require processing huge amounts of data,
such as sensor data analysis and financial applications.
Data-stream mining algorithms incorporate special provisions to meet
the requirements of stream-management systems: stream
algorithms must be online and incremental, processing each data
record only once (or a few times); adaptive to distribution changes;
and fast enough to accommodate high arrival rates.
We consider the problem of data-stream classification,
introducing an online and incremental stream-classification
ensemble algorithm, Streaming Random Forests,
an extension of the Random Forests algorithm
by Breiman, which is a standard classification algorithm.
Our algorithm is designed to handle multi-class classification
problems.
It is able to deal with
data streams having an evolving nature and
a random arrival rate of training/test data records.
The algorithm, in addition, automatically adjusts its
parameters based on the data seen so far.
Experimental results on real and synthetic data
demonstrate that the algorithm performs successfully.
Without losing classification accuracy, our algorithm
is able to handle multi-class problems for which the
underlying class boundaries drift, and to handle the case in which blocks of training
records are too small to build or update the classification model. |
| Description: | Thesis (Ph.D., Computing) -- Queen's University, 2008-07-15 |
| URI: | http://hdl.handle.net/1974/1321 |
| Appears in Collections: | Queen's Theses & Dissertations; Computing Graduate Theses |
Items in QSpace are protected by copyright, with all rights reserved, unless otherwise indicated.