Instance Based Clustering of Semantic Web Resources

Gunnar Aastrand Grimnes, Pete Edwards, Alun David Preece

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

31 Citations (Scopus)

Abstract

The original Semantic Web vision was explicit in the need for intelligent autonomous agents that would represent users and help them navigate the Semantic Web. We argue that an essential feature for such agents is the capability to analyse data and learn. In this paper we outline the challenges and issues surrounding the application of clustering algorithms to Semantic Web data. We present several ways to extract instances from a large RDF graph and to compute the distance between them. We evaluate our approaches on three different data-sets, one representing a typical relational database to RDF conversion, one based on data from an ontologically rich Semantic Web enabled application, and one consisting of a crawl of FOAF documents, applying both supervised and unsupervised evaluation metrics. Our evaluation did not support choosing a single combination of instance extraction method and similarity metric as superior in all cases, and as expected the behaviour depends greatly on the data being clustered. Instead, we attempt to identify characteristics of data that make particular methods more suitable.
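
As a rough illustration of the two steps the abstract names (extracting an instance from an RDF graph, then computing a distance between instances), the sketch below uses one simple possibility: an instance is the set of (predicate, object) pairs directly attached to a resource, and the distance is the Jaccard distance between those sets. Both choices, the toy triples, and the function names `extract_instance` and `jaccard_distance` are assumptions made for this example; the record does not say which extraction methods or metrics the paper actually uses.

```python
# Hypothetical toy graph: (subject, predicate, object) triples.
# The resources and prefixes are invented for illustration only.
TRIPLES = [
    ("ex:alice", "rdf:type", "foaf:Person"),
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:alice", "foaf:interest", "ex:rdf"),
    ("ex:bob", "rdf:type", "foaf:Person"),
    ("ex:bob", "foaf:knows", "ex:alice"),
    ("ex:bob", "foaf:interest", "ex:rdf"),
    ("ex:paper1", "rdf:type", "ex:Paper"),
    ("ex:paper1", "dc:creator", "ex:alice"),
]


def extract_instance(triples, resource):
    """Return the (predicate, object) pairs whose subject is `resource`.

    This is a depth-1 extraction: only triples directly attached to the
    resource are kept (an assumed strategy, not taken from the paper).
    """
    return {(p, o) for s, p, o in triples if s == resource}


def jaccard_distance(a, b):
    """Jaccard distance between two feature sets: 1 - |A & B| / |A | B|."""
    if not a and not b:
        return 0.0  # two empty instances are treated as identical
    return 1.0 - len(a & b) / len(a | b)


alice = extract_instance(TRIPLES, "ex:alice")
bob = extract_instance(TRIPLES, "ex:bob")
paper = extract_instance(TRIPLES, "ex:paper1")

print(jaccard_distance(alice, bob))    # two people sharing type and interest
print(jaccard_distance(alice, paper))  # a person and a paper share nothing
```

A pairwise distance of this kind could then be fed to any clustering algorithm that accepts a distance matrix. How well it works would depend on the data, in line with the abstract's conclusion that no single combination is best in all cases.
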
Original language: English
Title of host publication: The Semantic Web: Research and Applications
Subtitle of host publication: Proceedings of Fifth European Semantic Web Conference (ESWC 2008)
Editors: Sean Bechhofer, Manfred Hauswirth, Jorg Hoffmann, Manolis Koubarakis
Place of Publication: Heidelberg, Germany
Publisher: Springer-Verlag
Pages: 303-317
Number of pages: 15
ISBN (Electronic): 978-3-540-68234-9
ISBN (Print): 978-3-540-68233-2
DOIs: https://doi.org/10.1007/978-3-540-68234-9_24
Publication status: Published - 24 May 2008
Event: 5th European Semantic Web Conference, ESWC 2008 - Tenerife, Spain
Duration: 1 Jun 2008 to 5 Jun 2008

Publication series

Name: Lecture Notes in Computer Science
Publisher: Springer-Verlag
Volume: 5021
ISSN (Print): 0302-9743

Conference

Conference: 5th European Semantic Web Conference, ESWC 2008
Country: Spain
City: Tenerife
Period: 1/06/08 to 5/06/08

Fingerprint

Semantic Web
Autonomous agents
Clustering algorithms

Keywords

  • semantic web
  • data-mining
  • clustering

Cite this

Grimnes, G. A., Edwards, P., & Preece, A. D. (2008). Instance Based Clustering of Semantic Web Resources. In S. Bechhofer, M. Hauswirth, J. Hoffmann, & M. Koubarakis (Eds.), The Semantic Web: Research and Applications: Proceedings of Fifth European Semantic Web Conference (ESWC 2008) (pp. 303-317). (Lecture Notes in Computer Science; Vol. 5021). Heidelberg, Germany: Springer-Verlag. https://doi.org/10.1007/978-3-540-68234-9_24

@inproceedings{c4e81db8292549d2b0a0d635116a4ad8,
title = "Instance Based Clustering of Semantic Web Resources",
abstract = "The original Semantic Web vision was explicit in the need for intelligent autonomous agents that would represent users and help them navigate the Semantic Web. We argue that an essential feature for such agents is the capability to analyse data and learn. In this paper we outline the challenges and issues surrounding the application of clustering algorithms to Semantic Web data. We present several ways to extract instances from a large RDF graph and to compute the distance between them. We evaluate our approaches on three different data-sets, one representing a typical relational database to RDF conversion, one based on data from an ontologically rich Semantic Web enabled application, and one consisting of a crawl of FOAF documents, applying both supervised and unsupervised evaluation metrics. Our evaluation did not support choosing a single combination of instance extraction method and similarity metric as superior in all cases, and as expected the behaviour depends greatly on the data being clustered. Instead, we attempt to identify characteristics of data that make particular methods more suitable.",
keywords = "semantic web, data-mining, clustering",
author = "Grimnes, {Gunnar Aastrand} and Pete Edwards and Preece, {Alun David}",
year = "2008",
month = "5",
day = "24",
doi = "10.1007/978-3-540-68234-9_24",
language = "English",
isbn = "978-3-540-68233-2",
series = "Lecture Notes in Computer Science",
publisher = "Springer-Verlag",
volume = "5021",
pages = "303--317",
editor = "Sean Bechhofer and Manfred Hauswirth and Jorg Hoffmann and Manolis Koubarakis",
booktitle = "The Semantic Web: Research and Applications",

}

TY - GEN

T1 - Instance Based Clustering of Semantic Web Resources

AU - Grimnes, Gunnar Aastrand

AU - Edwards, Pete

AU - Preece, Alun David

PY - 2008/5/24

Y1 - 2008/5/24

AB - The original Semantic Web vision was explicit in the need for intelligent autonomous agents that would represent users and help them navigate the Semantic Web. We argue that an essential feature for such agents is the capability to analyse data and learn. In this paper we outline the challenges and issues surrounding the application of clustering algorithms to Semantic Web data. We present several ways to extract instances from a large RDF graph and to compute the distance between them. We evaluate our approaches on three different data-sets, one representing a typical relational database to RDF conversion, one based on data from an ontologically rich Semantic Web enabled application, and one consisting of a crawl of FOAF documents, applying both supervised and unsupervised evaluation metrics. Our evaluation did not support choosing a single combination of instance extraction method and similarity metric as superior in all cases, and as expected the behaviour depends greatly on the data being clustered. Instead, we attempt to identify characteristics of data that make particular methods more suitable.

KW - semantic web

KW - data-mining

KW - clustering

U2 - 10.1007/978-3-540-68234-9_24

DO - 10.1007/978-3-540-68234-9_24

M3 - Conference contribution

SN - 978-3-540-68233-2

T3 - Lecture Notes in Computer Science

SP - 303

EP - 317

BT - The Semantic Web: Research and Applications

A2 - Bechhofer, Sean

A2 - Hauswirth, Manfred

A2 - Hoffmann, Jorg

A2 - Koubarakis, Manolis

PB - Springer-Verlag

CY - Heidelberg, Germany

ER -