A FAIR Perspective on Data Quality Frameworks
Abstract
Despite considerable effort and analysis over the last two to three decades, no integrated framework for data quality yet exists. Currently, the choice is between several frameworks, depending on the type and use of the data. While these frameworks are appropriate to their specific purposes, they are generally prescriptive about the quality dimensions they cover. We reappraise the basis for measuring data quality by laying out a concept for a framework that addresses data quality from the foundation of the FAIR data guiding principles. We advocate a federated data contextualisation framework that handles the FAIR-related quality dimensions in general data contextualisation descriptions and the remaining intrinsic data quality dimensions in associated, dedicated context spaces, without being overly prescriptive. A framework designed along these lines provides several advantages, not least of which is its ability to encapsulate most other data quality frameworks. Moreover, contextualising data according to the FAIR data principles manages many subjective quality measures automatically and can even quantify them to a degree, whereas objective intrinsic quality measures can be handled at any level of granularity for any data type. This avoids blurring quality dimensions between the data and data-application perspectives and supports data quality provenance by providing traceability over a chain of data processing operations. We show by example how some of these concepts can be implemented at a practical level.
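To make the separation of concerns concrete, the following is a minimal, hypothetical sketch (not the authors' implementation) of how a dataset's FAIR-related contextualisation descriptions could be kept apart from its intrinsic quality dimensions, held in a dedicated context space, while a provenance chain records the processing operations applied to the data. All class and field names here are illustrative assumptions, not terms defined in the paper.

```python
# Illustrative sketch: FAIR contextualisation vs. intrinsic quality dimensions,
# plus a provenance chain for traceability. Names are hypothetical.
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FairContext:
    """FAIR-related quality dimensions captured as general contextualisation."""
    identifier: str            # Findable: globally unique, persistent identifier
    access_url: str            # Accessible: retrievable via a standard protocol
    metadata_standard: str     # Interoperable: shared vocabulary or schema
    licence: str               # Reusable: clear usage licence


@dataclass
class ContextSpace:
    """Dedicated context space for intrinsic, data-type-specific quality measures."""
    dimensions: Dict[str, float] = field(default_factory=dict)

    def add(self, dimension: str, value: float) -> None:
        # e.g. completeness, accuracy, measured at whatever granularity suits the data type
        self.dimensions[dimension] = value


@dataclass
class ProvenanceStep:
    """One operation in the chain of data processing, supporting quality provenance."""
    operation: str
    agent: str


@dataclass
class ContextualisedDataset:
    fair: FairContext
    intrinsic: ContextSpace
    provenance: List[ProvenanceStep] = field(default_factory=list)

    def record(self, operation: str, agent: str) -> None:
        self.provenance.append(ProvenanceStep(operation, agent))


if __name__ == "__main__":
    dataset = ContextualisedDataset(
        fair=FairContext(
            identifier="doi:10.0000/example",          # placeholder identifier
            access_url="https://example.org/data/42",  # placeholder URL
            metadata_standard="DCAT",
            licence="CC-BY-4.0",
        ),
        intrinsic=ContextSpace(),
    )
    dataset.intrinsic.add("completeness", 0.97)
    dataset.intrinsic.add("accuracy", 0.92)
    dataset.record("unit harmonisation", agent="pipeline-v1")
    dataset.record("outlier filtering", agent="pipeline-v1")
    print(dataset.fair.identifier, dataset.intrinsic.dimensions, len(dataset.provenance))
```

In this sketch the FAIR descriptors stay generic and machine-actionable, the context space absorbs whatever intrinsic measures a given data type needs, and the provenance list gives the traceability over successive processing operations that the abstract refers to.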