A Comparative Study of Instance Reduction Techniques

Pooja Arora

Abstract


Dealing with very large databases is one of the
major challenges in data mining research and development.
However powerful computers are or become, data mining
algorithms must still cope with ever-growing data sets that
can be too large (for example, terabytes of data) to
process. We therefore consider instance reduction an
important task in the data preparation phase of knowledge
discovery and data mining. The major approaches to instance
reduction are supervised instance filters (resample, spread
subsample, stratified remove folds) and unsupervised
instance filters (remove with values, reservoir sample,
remove percentage). This paper provides a comparative study
of existing techniques for instance reduction.
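
As an illustration of what two of the unsupervised filters named above do, the following is a minimal Python sketch. It is not the implementation evaluated in the paper. The function names `reservoir_sample` and `remove_percentage`, their parameters, and the fixed seed are assumptions made for this example. The first function uses the standard reservoir sampling scheme (Algorithm R), and the second assumes that "remove percentage" means dropping a leading share of the instances.

```python
import random


def reservoir_sample(instances, k, seed=0):
    """Keep a uniform random sample of at most k instances from an
    iterable whose length need not be known in advance (Algorithm R)."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    reservoir = []
    for i, inst in enumerate(instances):
        if i < k:
            # Fill the reservoir with the first k instances.
            reservoir.append(inst)
        else:
            # Replace a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = inst
    return reservoir


def remove_percentage(instances, percent):
    """Drop the first `percent` percent of a list and keep the rest
    (assumed behaviour for this sketch)."""
    cut = round(len(instances) * percent / 100)
    return instances[cut:]


if __name__ == "__main__":
    data = list(range(1000))
    print(len(reservoir_sample(data, 50)))
    print(len(remove_percentage(data, 30)))
```

Reservoir sampling reads the data once and holds only k instances in memory, which is why it suits data sets too large to load in full.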

Keywords


Data Mining, Data Preprocessing, Data Reduction, Instance Reduction.
