Predicate invention based RDF data compression

Man Zhu*, Weixin Wu, Jeff Z. Pan, Jingyu Han, Pengfei Huang, Qian Liu

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

1 Citation (Scopus)

Abstract

RDF is a data representation format for schema-free structured information that is gaining momentum in contexts such as the Semantic Web and the life sciences. With the continuing proliferation of structured data, RDF compression is becoming increasingly important. In this study, we introduce a novel lossless compression technique for RDF datasets (triples), called PIC (Predicate Invention based Compression). By generating informative predicates and constructing an effective mapping to the original predicates, PIC needs to store only a dramatically reduced number of triples expressed with the newly created predicates, and it restores the original triples efficiently using the mapping. These predicates are generated automatically by a decomposable forward-backward procedure, which consequently supports very fast parallel bit computation. As a semantic compression method for structured data, PIC not only reduces syntactic verbosity and data redundancy but also draws on the semantics in the RDF datasets. Experiments on various datasets show competitive results in terms of compression ratio.
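The abstract does not spell out the forward-backward procedure, so the following is only a toy sketch of the general idea of storing triples under invented predicates plus a mapping, not the PIC algorithm itself. It assumes a deliberately simple rule that is not taken from the paper: subjects that share exactly the same set of (predicate, object) pairs are stored under one invented predicate. The names `compress`, `decompress`, and the `inv:` prefix are hypothetical.

```python
from collections import defaultdict


def compress(triples):
    """Toy predicate invention (an assumption, not the paper's procedure):
    each distinct set of (predicate, object) pairs becomes one invented predicate."""
    by_subject = defaultdict(set)
    for s, p, o in triples:
        by_subject[s].add((p, o))

    ids = {}         # frozenset of (predicate, object) pairs -> invented predicate
    mapping = {}     # invented predicate -> frozenset of (predicate, object) pairs
    compressed = []  # one (subject, invented predicate) entry per subject
    for s in sorted(by_subject):
        key = frozenset(by_subject[s])
        if key not in ids:
            ids[key] = f"inv:{len(ids)}"
            mapping[ids[key]] = key
        compressed.append((s, ids[key]))
    return compressed, mapping


def decompress(compressed, mapping):
    """Expand every invented predicate back into the original triples."""
    return {(s, p, o) for s, inv in compressed for p, o in mapping[inv]}


triples = {
    ("alice", "rdf:type", "Person"), ("alice", "livesIn", "UK"),
    ("bob", "rdf:type", "Person"), ("bob", "livesIn", "UK"),
    ("carol", "rdf:type", "Person"), ("carol", "livesIn", "JP"),
}
compressed, mapping = compress(triples)

# The round trip is lossless: expanding the mapping recovers every original triple.
assert decompress(compressed, mapping) == triples
```

In this toy, six triples become three subject-level entries plus a two-entry mapping. The mapping is overhead, so any real saving would only appear when many subjects share the same pattern.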

Original language: English
Title of host publication: Semantic Technology - 8th Joint International Conference, JIST 2018, Proceedings
Editors: R Ichise, F Lecue, T Kawamura, D Zhao, S Muggleton, K Kozaki
Publisher: Springer Verlag
Pages: 153-161
Number of pages: 9
ISBN (Electronic): 9783030042844
ISBN (Print): 9783030042837
DOI: https://doi.org/10.1007/978-3-030-04284-4_11
Publication status: Published - 14 Nov 2018
Event: 8th Joint International Semantic Technology Conference, JIST 2018 - Awaji, Japan
Duration: 26 Nov 2018 to 28 Nov 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11341 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 8th Joint International Semantic Technology Conference, JIST 2018
Country: Japan
City: Awaji
Period: 26/11/18 to 28/11/18

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Zhu, M., Wu, W., Pan, J. Z., Han, J., Huang, P., & Liu, Q. (2018). Predicate invention based RDF data compression. In R. Ichise, F. Lecue, T. Kawamura, D. Zhao, S. Muggleton, & K. Kozaki (Eds.), Semantic Technology - 8th Joint International Conference, JIST 2018, Proceedings (pp. 153-161). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11341 LNCS). Springer Verlag. https://doi.org/10.1007/978-3-030-04284-4_11