Implicit Feature Selection with the Value Difference Metric

T R Payne; Peter Edwards

Computing Science

Research output: Chapter in Book/Report/Conference proceeding › Published conference contribution

Abstract

The nearest neighbour paradigm provides an effective approach to supervised learning. However, it is especially susceptible to the presence of irrelevant attributes. Whilst many approaches have been proposed that select only the most relevant attributes within a data set, these approaches involve pre-processing the data in some way, and can often be computationally complex. The Value Difference Metric (VDM) is a symbolic distance metric used by a number of different nearest neighbour learning algorithms. This paper demonstrates how the VDM can be used to reduce the impact of irrelevant attributes on classification accuracy without the need for pre-processing the data. We illustrate how this metric uses simple probabilistic techniques to weight features in the instance space, and then apply this weighting technique to an alternative symbolic distance metric. The resulting distance metrics are compared in terms of classification accuracy, on a number of real-world and artificial data sets.
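
As a rough illustration of the probabilistic weighting the abstract describes, the sketch below uses a simplified form of the value difference commonly given in the nearest-neighbour literature: the difference between two symbolic values of an attribute is the sum, over classes, of |P(c | v1) - P(c | v2)|^q. This is an assumption and may differ from the exact formulation used in the paper; the function names, the exponent `q`, and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def class_conditionals(rows, labels, attr):
    """Estimate P(class | attribute value) from raw counts for one attribute."""
    counts = defaultdict(Counter)
    for row, label in zip(rows, labels):
        counts[row[attr]][label] += 1
    return {
        value: {c: n / sum(by_class.values()) for c, n in by_class.items()}
        for value, by_class in counts.items()
    }


def value_difference(probs, v1, v2, classes, q=2):
    """Simplified value difference: sum over classes of |P(c|v1) - P(c|v2)|^q."""
    p1, p2 = probs.get(v1, {}), probs.get(v2, {})
    return sum(abs(p1.get(c, 0.0) - p2.get(c, 0.0)) ** q for c in classes)


def vdm_distance(tables, x, y, classes, q=2):
    """Distance between two instances: sum of per-attribute value differences."""
    return sum(
        value_difference(tables[a], x[a], y[a], classes, q) for a in range(len(x))
    )


# Toy data (hypothetical): attribute 0 determines the class,
# attribute 1 is statistically independent of it.
rows = [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y")]
labels = ["pos", "pos", "neg", "neg"]
classes = sorted(set(labels))
tables = [class_conditionals(rows, labels, a) for a in range(2)]

relevant = value_difference(tables[0], "a", "b", classes)
irrelevant = value_difference(tables[1], "x", "y", classes)
total = vdm_distance(tables, ("a", "x"), ("b", "y"), classes)
```

In this toy data the values of the second attribute have identical class distributions, so that attribute contributes nothing to the distance, while the first attribute contributes the maximum. No separate feature-selection pass is involved, which is the sense in which the selection is implicit.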

Original language	English
Title of host publication	Proceedings of the European Conference on Artificial Intelligence - ECAI-98
Editors	Henri Prade
Publisher	Wiley
Pages	450-454
Publication status	Published - 1998

Keywords

machine learning
nearest-neighbour
feature selection

Access to Document

payne-ecai-98: Final published version, 177 KB. Licence: Unspecified

Cite this

@inproceedings{27bc366190074a79ae3cfa5c2a32253b,
title = "Implicit Feature Selection with the Value Difference Metric",
abstract = "The nearest neighbour paradigm provides an effective approach to supervised learning. However, it is especially susceptible to the presence of irrelevant attributes. Whilst many approaches have been proposed that select only the most relevant attributes within a data set, these approaches involve pre-processing the data in some way, and can often be computationally complex. The Value Difference Metric (VDM) is a symbolic distance metric used by a number of different nearest neighbour learning algorithms. This paper demonstrates how the VDM can be used to reduce the impact of irrelevant attributes on classification accuracy without the need for pre-processing the data. We illustrate how this metric uses simple probabilistic techniques to weight features in the instance space, and then apply this weighting technique to an alternative symbolic distance metric. The resulting distance metrics are compared in terms of classification accuracy, on a number of real-world and artificial data sets.",
keywords = "machine learning, nearest-neighbour, feature selection",
author = "Payne, {T R} and Peter Edwards",
year = "1998",
language = "English",
pages = "450--454",
editor = "Henri Prade",
booktitle = "Proceedings of the European Conference on Artificial Intelligence - ECAI-98",
publisher = "Wiley",
}

TY - GEN
T1 - Implicit Feature Selection with the Value Difference Metric
AU - Payne, T R
AU - Edwards, Peter
PY - 1998
Y1 - 1998
N2 - The nearest neighbour paradigm provides an effective approach to supervised learning. However, it is especially susceptible to the presence of irrelevant attributes. Whilst many approaches have been proposed that select only the most relevant attributes within a data set, these approaches involve pre-processing the data in some way, and can often be computationally complex. The Value Difference Metric (VDM) is a symbolic distance metric used by a number of different nearest neighbour learning algorithms. This paper demonstrates how the VDM can be used to reduce the impact of irrelevant attributes on classification accuracy without the need for pre-processing the data. We illustrate how this metric uses simple probabilistic techniques to weight features in the instance space, and then apply this weighting technique to an alternative symbolic distance metric. The resulting distance metrics are compared in terms of classification accuracy, on a number of real-world and artificial data sets.
AB - The nearest neighbour paradigm provides an effective approach to supervised learning. However, it is especially susceptible to the presence of irrelevant attributes. Whilst many approaches have been proposed that select only the most relevant attributes within a data set, these approaches involve pre-processing the data in some way, and can often be computationally complex. The Value Difference Metric (VDM) is a symbolic distance metric used by a number of different nearest neighbour learning algorithms. This paper demonstrates how the VDM can be used to reduce the impact of irrelevant attributes on classification accuracy without the need for pre-processing the data. We illustrate how this metric uses simple probabilistic techniques to weight features in the instance space, and then apply this weighting technique to an alternative symbolic distance metric. The resulting distance metrics are compared in terms of classification accuracy, on a number of real-world and artificial data sets.
KW - machine learning
KW - nearest-neighbour
KW - feature selection
M3 - Published conference contribution
SP - 450
EP - 454
BT - Proceedings of the European Conference on Artificial Intelligence - ECAI-98
A2 - Prade, Henri
PB - Wiley
ER -
