Multistage feature selection methods for data classification

Mohamad, Masurah

dc.contributor.author	Mohamad, Masurah
dc.date.accessioned	2024-02-14T04:08:51Z
dc.date.available	2024-02-14T04:08:51Z
dc.date.issued	2021
dc.description	Thesis (PhD.)
dc.description.abstract	In data analysis, good decisions depend on several sub-processes and methods, the most common being feature selection and classification. Various methods have been proposed to address issues faced by decision-makers, such as low classification accuracy and long processing time. The analysis becomes more complicated when dealing with complex datasets, that is, datasets that are large and problematic. One solution is to employ an effective feature selection method to reduce data processing time, decrease memory usage, and increase the accuracy of decisions. However, not all existing methods are capable of dealing with these issues. The aim of this research was to help the classifier perform better on problematic datasets by generating an optimised attribute set. The proposed method comprises two stages of feature selection: a correlation-based feature selection method using a best-first search algorithm (CFS-BFS) and a soft set and rough set parameter selection method (SSRS). CFS-BFS was used to eliminate uncorrelated attributes in a dataset, while SSRS was used to manage problematic values such as uncertainty. Several benchmark feature selection methods, such as classifier subset evaluation (CSE) and principal component analysis (PCA), and different classifiers, such as support vector machine (SVM) and neural network (NN), were used to validate the results. ANOVA and t-tests were also conducted to verify the results. The averages obtained in two experimental works showed that the proposed method matched the performance of the benchmark methods in helping the classifier achieve high classification performance on complex datasets. The average obtained in another experimental work showed that the proposed method outperformed the benchmark methods. In conclusion, the proposed method is a viable alternative feature selection method that can help classifiers achieve better classification accuracy, especially when dealing with problematic datasets.
dc.description.sponsorship	Faculty of Engineering - School of Computing
dc.identifier.uri	http://openscience.utm.my/handle/123456789/977
dc.language.iso	en
dc.publisher	Universiti Teknologi Malaysia
dc.subject	Automatic classification
dc.subject	Algorithms
dc.subject	Information storage and retrieval systems—Research
dc.title	Multistage feature selection methods for data classification
dc.type	Thesis
dc.type	Dataset
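
The abstract describes the first stage, CFS-BFS, only at a high level. As an illustration, the Python sketch below computes the standard CFS merit score, which rewards features that correlate with the class and penalises features that correlate with each other. It is not the thesis implementation: it assumes numeric, non-constant features and a numeric class label, uses absolute Pearson correlation as the correlation measure, and substitutes a simple greedy forward search for best-first search. The names `cfs_merit` and `greedy_cfs` are hypothetical.

```python
import numpy as np


def cfs_merit(X, y, subset):
    """CFS merit of a feature subset: k*r_cf / sqrt(k + k*(k-1)*r_ff).

    Assumption: correlation is measured with absolute Pearson correlation,
    so every column in X and the label y must be numeric and non-constant.
    """
    k = len(subset)
    if k == 0:
        return 0.0
    # Mean absolute feature-class correlation.
    r_cf = np.mean([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in subset])
    # Mean absolute feature-feature correlation over all pairs in the subset.
    if k == 1:
        r_ff = 0.0
    else:
        r_ff = np.mean([
            abs(np.corrcoef(X[:, a], X[:, b])[0, 1])
            for i, a in enumerate(subset)
            for b in subset[i + 1:]
        ])
    return k * r_cf / np.sqrt(k + k * (k - 1) * r_ff)


def greedy_cfs(X, y):
    """Greedy forward search (a simplification of best-first search).

    Repeatedly adds the feature that gives the highest merit and stops
    as soon as no remaining feature improves the current best merit.
    """
    selected, best = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining:
        merit, j = max((cfs_merit(X, y, selected + [j]), j) for j in remaining)
        if merit <= best:
            break
        selected.append(j)
        remaining.remove(j)
        best = merit
    return selected, best
```

Calling `greedy_cfs(X, y)` on a NumPy feature matrix `X` (rows are samples, columns are attributes) returns the indices of the retained attributes and their merit. A full best-first search would additionally keep a queue of partially explored subsets and allow backtracking, which this sketch omits.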

Files

Original bundle

Name: MasurahMohamadPSC2021_A.pdf
Size: 263.05 KB
Format: Adobe Portable Document Format
Description: Additional experimental works

Collections

Computer Science, Information Technology and Telecommunications