Sensitive data refers to data that needs to be protected from unauthorised access or unwarranted disclosure. It is generally considered to be:
Adapted from "Research Data Management: Sensitive Data" by NTU Library.
It is recommended that you adopt a proportionate risk-based approach in handling sensitive data, throughout the data lifecycle:
For related information, see below:
Research data should, wherever possible, be made available for use by others in a manner consistent with relevant legal, ethical and disciplinary frameworks and norms. For sensitive data, however, please follow the step-by-step guidance below before sharing:
Consider data ownership, ethical concerns, intellectual property rights and other legal terms for the data.
Factor in data sharing when obtaining consent from research participants:
In the information sheet, describe clearly who has access to the data during and after the project.
In the consent form, offer clear choices to participants on whether they agree with archiving and/or reuse of the data from the project. Note that a participant can opt out of these activities, but still participate in your study.
Consider ways to protect the sensitive data by:
Anonymisation or de-identification of identifiable data: For example, remove all direct identifiers, and remove or modify indirect identifiers until the risk of re-identification is negligible.
Redacting research data to remove confidential information or third party intellectual property.
Encrypting data during upload, download, and storage on secure platforms.
Managed access to ensure that only bona fide researchers bound by professional obligations and specific agreements have access to the data.
Do not share or store the 'keys' to re-identification with de-identified datasets.
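The de-identification steps above can be sketched in a few lines of Python. This is a minimal illustration only, not a complete anonymisation procedure: the field names (`name`, `email`, `age`), the 10-year age bands and the helper names are hypothetical choices made for this example, and real datasets usually need a case-by-case assessment of which fields are direct or indirect identifiers.

```python
import secrets

# Hypothetical field names, chosen only for illustration.
DIRECT_IDENTIFIERS = {"name", "email"}


def generalise_age(age: int) -> str:
    """Replace an exact age (an indirect identifier) with a 10-year band."""
    lower = (age // 10) * 10
    return f"{lower}-{lower + 9}"


def deidentify(records):
    """Return (deidentified_records, key_table).

    key_table maps each random pseudonym back to the removed direct
    identifiers. It is the 'key' to re-identification and should be
    stored separately from the de-identified records, and never shared
    with them.
    """
    deidentified, key_table = [], {}
    for record in records:
        pseudonym = secrets.token_hex(8)  # random, not derived from the data
        key_table[pseudonym] = {
            k: record[k] for k in DIRECT_IDENTIFIERS if k in record
        }
        # Remove direct identifiers.
        cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
        # Modify an indirect identifier by generalising it.
        if "age" in cleaned:
            cleaned["age_band"] = generalise_age(cleaned.pop("age"))
        cleaned["participant_id"] = pseudonym
        deidentified.append(cleaned)
    return deidentified, key_table
```

Only the first return value would be considered for sharing. The key table stays with the research team under its own access controls, or is destroyed if re-identification will never be needed.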
Consider whether embargoes need to be placed on the data. This would mean that the description of the dataset is published but the embargoed data files remain restricted until a specified time.
Publish your data and metadata according to participant consent and ethics approval, and apply an appropriate license, taking into account any limitations on re-use, redistribution, commercial use, etc.
If your data cannot be anonymised, consider publishing a description (i.e. the metadata) which enables you to place conditions around access to the data.
See example 1 and example 2 of de-identified datasets and how they are shared in real life.
Anonymisation and aggregation render sensitive data non-linkable and therefore shareable. However, this may remove valuable information from the data. In some cases, therefore, instead of making sensitive data openly available, it may be preferable to release the data, on request, to other bona fide researchers under non-disclosure data sharing agreements, in addition to applying access control.
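To make the trade-off concrete, here is a small Python sketch of aggregation. The function name and the fields `age_band` and `response` are hypothetical. The output contains only group counts, so individual rows can no longer be linked, but every field not used for grouping is lost.

```python
from collections import Counter


def aggregate(records, group_fields):
    """Count records per combination of group_fields.

    All other fields are dropped, which is what makes the result
    non-linkable and also what removes detail from the data.
    """
    return Counter(tuple(r[f] for f in group_fields) for r in records)


# Illustrative, made-up records.
rows = [
    {"age_band": "30-39", "response": "yes", "free_text": "..."},
    {"age_band": "30-39", "response": "yes", "free_text": "..."},
    {"age_band": "40-49", "response": "no", "free_text": "..."},
]
summary = aggregate(rows, ["age_band", "response"])
```

Here `summary` keeps the counts per age band and response but none of the `free_text` answers, which is the kind of loss that may justify controlled, on-request access to the fuller data instead.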
References:
Below is a compilation of resources for working with sensitive data: