Machine learning ensemble method for discovering knowledge from big data

Farrash, Majed (2016) Machine learning ensemble method for discovering knowledge from big data. Doctoral thesis, University of East Anglia.



Big data, generated by business, internet and social media activities, has
become a major challenge for researchers in machine learning and data mining,
who must develop new methods and techniques for analysing it effectively and
efficiently. Ensemble methods are an attractive approach to mining large
datasets because of their accuracy and their ability to exploit the
divide-and-conquer mechanism in parallel computing environments.

This research proposes a machine learning ensemble framework and implements it
in a high performance computing environment. It begins by identifying
and categorising the effects of partitioned data subset size on ensemble accuracy when
dealing with very large training datasets. An algorithm is then developed to ascertain
the patterns in the relationship between ensemble accuracy and the size of partitioned
data subsets. The research concludes with the development of a selective modelling
algorithm, which is an efficient alternative to static model selection methods for big
data.

The results show that maximising the size of partitioned data subsets does not
necessarily improve the performance of an ensemble of classifiers trained on large
datasets. Identifying the patterns in the relationship between ensemble
accuracy and partitioned data subset size makes it easier to determine the best subset
size for partitioning very large training datasets. Finally, traditional model selection is
inefficient when large datasets are involved.
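
The kind of experiment the abstract describes (partition a large training set into subsets of a given size, train one classifier per subset, combine them by voting, and compare ensemble accuracy across subset sizes) can be illustrated with a small sketch. This is not the thesis's framework or algorithm: the synthetic data, the nearest-centroid base learner, the majority vote, and every function name below are assumptions chosen only to keep the example self-contained.

```python
import random
from collections import Counter

def make_data(n, rng):
    """Synthetic two-class data: two overlapping 2-D Gaussian blobs (illustrative only)."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        centre = 1.0 if label else -1.0
        data.append(((rng.gauss(centre, 1.5), rng.gauss(centre, 1.5)), label))
    return data

def train_centroid(subset):
    """Hypothetical base learner: nearest-centroid model stored as {label: centroid}."""
    sums, counts = {}, Counter()
    for (x, y), label in subset:
        sx, sy = sums.get(label, (0.0, 0.0))
        sums[label] = (sx + x, sy + y)
        counts[label] += 1
    return {lab: (sx / counts[lab], sy / counts[lab]) for lab, (sx, sy) in sums.items()}

def predict_one(model, point):
    """Return the label whose centroid is closest to the point."""
    px, py = point
    return min(model, key=lambda lab: (model[lab][0] - px) ** 2 + (model[lab][1] - py) ** 2)

def ensemble_predict(models, point):
    """Combine the base models by simple majority vote."""
    votes = Counter(predict_one(m, point) for m in models)
    return votes.most_common(1)[0][0]

def partition(train, subset_size):
    """Split into disjoint consecutive subsets, dropping a short trailing remainder."""
    return [train[i:i + subset_size]
            for i in range(0, len(train) - subset_size + 1, subset_size)]

def accuracy_by_subset_size(train, test, sizes):
    """Train one ensemble per subset size and record its test accuracy."""
    results = {}
    for size in sizes:
        models = [train_centroid(s) for s in partition(train, size)]
        correct = sum(ensemble_predict(models, p) == lab for p, lab in test)
        results[size] = correct / len(test)
    return results

rng = random.Random(0)
train, test = make_data(2000, rng), make_data(500, rng)
results = accuracy_by_subset_size(train, test, [50, 200, 1000, 2000])
for size, acc in results.items():
    print(f"subset size {size:>5}: {len(train) // size:>3} models, accuracy {acc:.3f}")
```

Each subset's model is independent of the others, so the training step could in principle be distributed across workers, which is the divide-and-conquer property the abstract refers to. The numbers printed by this toy example say nothing about the thesis's actual results.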

Item Type: Thesis (Doctoral)
Faculty \ School: Faculty of Science > School of Computing Sciences
Depositing User: Jackie Webb
Date Deposited: 15 Jun 2016 15:18
Last Modified: 15 Jun 2016 15:18

