Towards a context-dependent numerical data quality evaluation framework

Milen S. Marev, Ernesto Compatangelo, Wamberto Vasconcelos

Research output: Working paper

Abstract

This paper focuses on numeric data, with emphasis on distinctive characteristics such as varying significance, unstructured format, large volume and real-time processing. We propose a novel, context-dependent evaluation framework specifically devised to assess quality in numeric datasets. Our framework uses eight relevant data quality dimensions and provides a simple metric to evaluate dataset quality along each dimension. We argue that the proposed set of dimensions and corresponding metrics adequately captures the unique quality antipatterns that are typically associated with numerical data. The introduction of our framework is part of a wider research effort that aims to develop an articulated, artificial-intelligence-based approach to numerical data quality improvement for Oil and Gas exploration and production workflows.
Original language: English
Publisher: ArXiv
Number of pages: 12
Publication status: Submitted - 22 Oct 2018

Keywords

  • cs.DB
  • data quality
  • numerical data
  • evaluation framework
