Graph pattern based RDF data compression

Jeff Z. Pan*, José Manuel Gómez Pérez, Yuan Ren, Honghan Wu, Haofen Wang, Man Zhu

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

6 Citations (Scopus)

Abstract

The growing volume of RDF documents and their inter-linking raise a challenge for the storage and transfer of such documents. One solution to this problem is to reduce the size of RDF documents via compression. Existing approaches either apply well-known generic compression technologies but seldom exploit the graph structure of RDF documents, or they focus on minimised compact serialisations that leave the graph nature implicit, which creates obstacles to applying higher-level compression techniques. In this paper we propose graph pattern based technologies, which on the one hand can reduce the number of triples in RDF documents and on the other hand can serialise RDF graphs in a data pattern based way, which can deal with syntactic redundancies that existing techniques cannot eliminate. Evaluation on real-world datasets shows that our approach can substantially reduce the size of RDF documents by complementing the abilities of existing approaches. Furthermore, the evaluation results on rule mining operations show the potential of the proposed serialisation format in supporting efficient data access.
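
As a rough illustration of the pattern-based serialisation idea described in the abstract, the minimal Python sketch below groups triples that share the same predicate and object, so the shared part is written once rather than once per subject. It is not the authors' algorithm or serialisation format; the function names `group_by_pattern` and `expand` and the sample triples are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def group_by_pattern(triples):
    """Group subjects by the (predicate, object) pair they share.

    Hypothetical helper: each shared pair is stored once as a key,
    with the list of subjects that instantiate it as the value.
    """
    groups = defaultdict(list)
    for s, p, o in triples:
        groups[(p, o)].append(s)
    return dict(groups)


def expand(groups):
    """Inverse operation: rebuild the set of triples from the grouped form."""
    return {(s, p, o) for (p, o), subjects in groups.items() for s in subjects}


# Made-up sample data, not taken from the paper's evaluation datasets.
triples = [
    ("ex:alice", "rdf:type", "ex:Person"),
    ("ex:bob", "rdf:type", "ex:Person"),
    ("ex:carol", "rdf:type", "ex:Person"),
    ("ex:alice", "ex:worksAt", "ex:Acme"),
    ("ex:bob", "ex:worksAt", "ex:Acme"),
]

grouped = group_by_pattern(triples)
```

Here five triples collapse into two (predicate, object) groups, and `expand(grouped)` recovers the original triple set, so the grouping is lossless for this toy input.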

Original language: English
Title of host publication: Semantic Technology
Subtitle of host publication: 4th Joint International Conference, JIST 2014, Revised Selected Papers
Publisher: Springer-Verlag
Pages: 239-256
Number of pages: 18
Volume: 8943
ISBN (Electronic): 978-3-319-15615-6
ISBN (Print): 978-3-319-15614-9
DOI: https://doi.org/10.1007/978-3-319-15615-6_18
Publication status: Published - 2015
Event: 4th Joint International Conference on Semantic Technology, JIST 2014 - Chiang Mai, Thailand
Duration: 9 Nov 2014 to 11 Nov 2014

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 8943
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 4th Joint International Conference on Semantic Technology, JIST 2014
Country: Thailand
City: Chiang Mai
Period: 9/11/14 to 11/11/14

Fingerprint

Data compression
Compression
Syntactics
Graph in graph theory
Redundancy
Evaluation
Linking
Mining

ASJC Scopus subject areas

  • Computer Science (all)
  • Theoretical Computer Science

Cite this

Pan, J. Z., Pérez, J. M. G., Ren, Y., Wu, H., Wang, H., & Zhu, M. (2015). Graph pattern based RDF data compression. In Semantic Technology: 4th Joint International Conference, JIST 2014, Revised Selected Papers (pp. 239-256). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 8943). Springer-Verlag. https://doi.org/10.1007/978-3-319-15615-6_18
