Assessing the Stability and Selection Performance of Feature Selection Methods Under Different Data Complexity

Omaimah Saif Al Hosni, Andrew Starkey

Research output: Contribution to journal › Article › peer-review

Abstract

Our study investigates the stability and selection accuracy of feature selection methods under different levels of data complexity. The research community has contributed significantly to examining how complex data characteristics, such as overlapping classes or non-linear decision boundaries, affect the performance of classification algorithms; however, relatively few studies have investigated the stability and selection accuracy of feature selection methods under such characteristics. This study also examines the interactive effects of class overlap with other data challenges, such as small sample size, high dimensionality, and imbalanced classes, to provide meaningful insight into the root causes of feature selection methods misidentifying the relevant features across different real-world data challenges. The analysis will be extended to real-world data to guide practitioners and researchers in choosing the feature selection methods most appropriate for a particular dataset. Our results indicate that feature selection techniques applied to datasets with different characteristics may generate different feature subsets under variations to the training data, and that, among the data challenges investigated in this study, small sample size and overlapping classes have the greatest impact on the stability and selection accuracy of feature selection. We also provide a survey of the current state of research on feature selection stability to highlight the areas that require more attention from researchers.

Original language: English
Pages (from-to): 442-455
Number of pages: 14
Journal: International Arab Journal of Information Technology
Volume: 19
Issue number: Special Issue 3A
Publication status: Published - 30 Jun 2022

Keywords

  • class overlapping
  • complex data
  • data characteristics
  • stability of feature selection
