Assessing Data Quality and Disclosure Risk in Numeric Data

February 20 @ 8:00 am - 5:00 pm

Course Code HUB-05-18/19-P-R
Organised by NCRM, University of Southampton
Presenter Louise Corti, Sharon Bolton, Cristina Magder, Anca Vlad and Myles Offord
Date 20/02/2019
Venue St Clements Building, London School of Economics, St Clements Lane, London
Postcode WC2A 2AB
Contact Jacqui Thorp
Training and Capacity Building Co-ordinator
National Centre for Research Methods
Tel: 02380594069
Email: jmh6@soton.ac.uk
Description In this hands-on one-day course you will learn about the principles of, and tools for, assessing data quality and reviewing disclosure risk in numeric data sources. Data assessment is useful whether you want to create high-quality data for publishing, thereby supporting the transparency and replication agenda (e.g. to meet funder or journal policy), or simply to check unfamiliar data that you have accessed for reuse. Processing and de-identifying data in line with the requirements of the GDPR also benefits from quick examination, using tools where possible.

The course will introduce the key elements of data quality and disclosure risk, including file checks, data and metadata checks, and direct and indirect identifiers. The day uses two tools to carry out the review. The first is QAMyData, which automatically assesses elements of quality such as missingness, duplication, outliers and direct identifiers. A user can set thresholds in QAMyData to indicate what they are prepared to accept (e.g. no missing data, or data must be fully labelled). Issues are identified in both a summary report and a detailed report. The second tool is SDCMicro, a practical tool for checking disclosure risk by examining combinations of key variables.
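
To give a flavour of the kinds of checks described above, here is a minimal sketch in plain Python. It does not use QAMyData or SDCMicro and does not reflect their interfaces; the toy records, function names and threshold values are all hypothetical and chosen only for illustration. It measures missingness against an accepted threshold, flags duplicated identifiers, and counts how many records share each combination of key variables.

```python
from collections import Counter

# Hypothetical toy survey records; None marks a missing value.
records = [
    {"id": 1, "age": 34, "sex": "F", "region": "London", "income": 28000},
    {"id": 2, "age": 34, "sex": "F", "region": "London", "income": None},
    {"id": 3, "age": 51, "sex": "M", "region": "Wales", "income": 41000},
    {"id": 3, "age": 51, "sex": "M", "region": "Wales", "income": 41000},
    {"id": 4, "age": 87, "sex": "M", "region": "Wales", "income": 19000},
]


def missing_share(rows, variable):
    """Fraction of rows in which the variable is missing (None)."""
    return sum(r[variable] is None for r in rows) / len(rows)


def duplicate_ids(rows, id_variable="id"):
    """Identifier values that occur in more than one row."""
    counts = Counter(r[id_variable] for r in rows)
    return sorted(value for value, n in counts.items() if n > 1)


def rare_combinations(rows, key_variables, k=2):
    """Combinations of key-variable values shared by fewer than k rows."""
    counts = Counter(tuple(r[v] for v in key_variables) for r in rows)
    return {combo: n for combo, n in counts.items() if n < k}


# Assumed threshold: the user accepts no missing data at all.
MAX_MISSING_SHARE = 0.0

failed_missingness = [
    variable
    for variable in ("age", "sex", "region", "income")
    if missing_share(records, variable) > MAX_MISSING_SHARE
]
repeated_ids = duplicate_ids(records)
at_risk = rare_combinations(records, ("age", "sex", "region"), k=2)

print("Variables failing the missingness threshold:", failed_missingness)
print("Duplicated identifiers:", repeated_ids)
print("Key-variable combinations held by fewer than 2 records:", at_risk)
```

In this toy data the `income` variable fails the missingness threshold, identifier 3 appears twice, and one respondent is unique on the combination of age, sex and region, which is the sort of record a disclosure review would examine more closely.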

Practical demonstrations and hands-on exercises will be used throughout. The course will be held in a lab where the software will be installed. However, both tools are easily downloaded to a laptop, can be used straight after the workshop, and can be integrated into data cleaning and processing pipelines by data creators, users, reviewers and publishers.

The course covers:

  • Short presentations on the principles and practicalities of assessing data quality and undertaking disclosure review in numeric data;
  • Practical demonstrations and hands-on exercises for assessing data quality and undertaking disclosure review in numeric data;
  • A short surgery session on installing the tools for onward use.

By the end of the course participants will:

  • Appreciate the principles and practicalities involved in assessing data quality and undertaking disclosure review in numeric data;
  • Gain hands-on experience with two pieces of software: QAMyData, for assessing data and metadata quality, and SDCMicro, for disclosure review in numeric data;
  • Know how to install QAMyData and SDCMicro on their own computers.

Target Audience

Academics, lecturers, researchers and data publishers from all sectors who are interested in the practical elements of assessing numeric data for quality and disclosure risk.

Pre-requisites

Some knowledge of the creation and QA of survey or numeric data is expected, as is familiarity with a statistics software package, e.g. SPSS, Stata or R.

Level Intermediate (some prior knowledge)
Cost The fee per teaching day is:

• £30 per day for UK/EU registered students.
• £60 per day for staff at UK/EU academic institutions, UK/EU Research Councils researchers, UK/EU public sector staff and staff at UK/EU registered charity organisations and recognised UK/EU research institutions.
• £220 per day for all other participants.

All fees include event materials, lunch, morning and afternoon tea. They do not include travel and accommodation costs.

Full refunds are available up to two weeks before the course; no refunds are available after this date.

Website and registration
Region Greater London
Keywords Research Skills, Communication and Dissemination, Disclosure review, Privacy Impact Assessment, SDCMicro, Open source tools, Transparency, Replication, Data Publishing
Related publications and presentations Research Skills, Communication and Dissemination
