Research Data Leeds Repository

A Practical Guide to Characterising Data and Investigating Data Quality

Citation

Ruddle, Roy and Cheshire, James and Fernstad, Sara Johansson (2024) A Practical Guide to Characterising Data and Investigating Data Quality. University of Leeds. [Dataset] https://doi.org/10.5518/1481

Dataset description

This guide is designed for data scientists to use in their day-to-day work. It describes a comprehensive list of tasks to perform when profiling data and investigating data quality, together with a recommended six-step workflow. Each of the 62 tasks is framed as one or more questions to answer about your data. The guide also points to a Python package (vizdataquality) that implements the workflow, a film about visualizing data quality, and other useful resources.
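
To give a flavour of what answering a question about your data can look like in practice, here is a minimal sketch in plain Python. It is not taken from the guide and does not use the vizdataquality API; the function name profile_column, the chosen questions (how many values are missing, how many are distinct, which is most frequent) and the sample data are all illustrative assumptions.

```python
from collections import Counter


def profile_column(values):
    """Answer a few basic data-quality questions about one column.

    Hypothetical helper: treats None and the empty string as missing.
    """
    present = [v for v in values if v is not None and v != ""]
    counts = Counter(present)
    return {
        "n_values": len(values),                  # How many values are there?
        "n_missing": len(values) - len(present),  # How many are missing?
        "n_unique": len(counts),                  # How many distinct values?
        # Which value occurs most often, and how many times?
        "most_common": counts.most_common(1)[0] if counts else None,
    }


# Made-up example column with two missing entries and one suspicious value.
ages = ["34", "", "41", "34", None, "abc"]
summary = profile_column(ages)
print(summary)
```

Each entry in the returned dictionary corresponds to one question about the column, which mirrors the guide's approach of phrasing each task as a question. The guide and the vizdataquality package cover the full set of tasks and the workflow.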

Subjects: I000 - Computer sciences
Divisions: Faculty of Engineering and Physical Sciences > School of Computing
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Date deposited: 06 Feb 2024 14:23
URI: https://archive.researchdata.leeds.ac.uk/id/eprint/1235

Files

Documentation
