LibGuides: Research Data Management: Sensitive Data

Definition

Sensitive data refers to data that needs to be protected from unauthorised access or unwarranted disclosure. It generally falls into one of the following categories:

Identifiable data are data that can be used to identify an individual (i.e. personal data), species, object, or location, where that identification introduces a risk of discrimination, harm, or unwanted attention. In Singapore, an individual's identifiable data is protected under the Personal Data Protection Act (PDPA) 2012.
Proprietary data are data that are not generally known or accessible and that give a competitive advantage to their owner. This includes research data with commercialisation potential, and information that has terms of use attached to it. Proprietary data may be protected under copyright, patent, or trade secret laws.
Restricted or confidential data are data subject to contractual obligations (e.g. Research Collaboration Agreements, Project Agreements, Material Transfer Agreements, Non-Disclosure Agreements) or legal obligations (e.g. Official Secrets Act).

Adapted from "Research Data Management: Sensitive Data" by NTU Library.

Best practices for handling sensitive data

Adopt a proportionate, risk-based approach to handling sensitive data throughout the data lifecycle:

Assess potential risks and consequences before commencing research
Outline data protection practices in the Data Management Plan (DMP)
Implement best practices within project teams/collaborators at the start and throughout research

For related information, see below:

NTU guidance on handling of digital and non-digital data at different levels of sensitivity.
Personal Data Protection Commission (PDPC) Singapore, Guide to securing Personal Data in Electronic Medium and Guide to Data Protection Impact Assessments
MANTRA, The University of Edinburgh, Protecting sensitive data, [Online course], CC-BY.

Best practices for sharing sensitive data

Research data should, wherever possible, be made available for use by others in a manner consistent with relevant legal, ethical and disciplinary frameworks and norms. For sensitive data, however, follow the step-by-step guidance below before sharing:

Consider data ownership, ethical concerns, intellectual property rights and other legal terms for the data.
Factor in data sharing when obtaining consent from research participants:
- In the information sheet, describe clearly who has access to the data during and after the project.
- In the consent form, offer clear choices to participants on whether they agree with archiving and/or reuse of the data from the project. Note that a participant can opt out of these activities, but still participate in your study.
Consider ways to protect the sensitive data by:
- Anonymisation or de-identification of identifiable data: For example, remove all direct identifiers, and remove or modify indirect identifiers until the risk of re-identification is negligible.
- Redacting research data to remove confidential information or third party intellectual property.
- Data encryption during data upload, download, and storage over secure platforms.
- Managed access to ensure that only bona fide researchers bound by professional obligations and specific agreements have access to the data.
Do not share or store the 'keys' to re-identification with de-identified datasets.
Consider whether embargoes need to be placed on the data. An embargo means that the description of the dataset is published but the embargoed data files remain restricted until a specified time.
Publish your data and metadata according to participant consent and ethics approval, and apply an appropriate licence, taking into account any limitations on re-use, redistribution, commercial use, etc.
If your data cannot be anonymised, consider publishing a description (i.e. the metadata) which enables you to place conditions around access to the data.
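
The de-identification steps above can be sketched in code. The following is a minimal illustration only, not a compliance tool: the column names (name, email, age, postcode), the 10-year age bands, the postcode truncation, and the sample records are all assumptions made up for this example. It removes direct identifiers, coarsens indirect identifiers, and returns the re-identification key as a separate object so that it can be stored apart from the shared dataset.

```python
import secrets

# Hypothetical column names, chosen only for illustration.
DIRECT_IDENTIFIERS = {"name", "email"}


def generalise_age(age: int, width: int = 10) -> str:
    """Replace an exact age with a band such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def deidentify(records):
    """Return (shareable_rows, key_table).

    shareable_rows carry a random pseudonym instead of direct identifiers.
    key_table maps each pseudonym back to those identifiers and must be
    stored separately from the shared dataset.
    """
    shareable_rows, key_table = [], {}
    for record in records:
        pseudonym = secrets.token_hex(8)  # random, not derived from the data
        key_table[pseudonym] = {
            k: record[k] for k in DIRECT_IDENTIFIERS if k in record
        }
        row = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
        row["age"] = generalise_age(row["age"])         # indirect identifier, coarsened
        row["postcode"] = row["postcode"][:2] + "****"  # keep only the first two digits
        row["participant_id"] = pseudonym
        shareable_rows.append(row)
    return shareable_rows, key_table


# Made-up records for illustration only.
participants = [
    {"name": "Participant A", "email": "a@example.org", "age": 34,
     "postcode": "123456", "score": 7},
    {"name": "Participant B", "email": "b@example.org", "age": 58,
     "postcode": "654321", "score": 4},
]
shared, key = deidentify(participants)
```

Here `shared` is the candidate dataset for deposit, while `key` belongs in separate, access-controlled storage. Whether this level of generalisation is sufficient depends on the dataset and should be judged against the re-identification risk, not assumed from the code.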

See example 1 and example 2 of de-identified datasets and how they are shared in real life.

Anonymisation and aggregation can render sensitive data non-linkable and therefore shareable. However, this may remove valuable information from the data. In some cases, therefore, instead of making sensitive data openly available, it may be preferable to release the data on request to other bona fide researchers under non-disclosure data sharing agreements, in addition to applying access control.
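
One common way to gauge this trade-off, offered here as an assumption rather than a method prescribed by this guide, is to count how many records share each combination of indirect identifiers (the idea behind k-anonymity). A smallest group of size 1 means at least one record is unique on those attributes. The field names and rows below are hypothetical.

```python
from collections import Counter


def smallest_group_size(rows, quasi_identifiers):
    """Size of the smallest group of rows sharing the same quasi-identifier values.

    Returns 0 for an empty dataset.
    """
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(groups.values(), default=0)


# Made-up rows for illustration only.
rows = [
    {"age_band": "30-39", "region": "West", "score": 7},
    {"age_band": "30-39", "region": "West", "score": 5},
    {"age_band": "50-59", "region": "West", "score": 4},
]

# Keeping age bands leaves one record unique on (age_band, region).
k_fine = smallest_group_size(rows, ["age_band", "region"])

# Dropping age bands removes the unique record but also removes information.
k_coarse = smallest_group_size(rows, ["region"])
```

With these rows, `k_fine` is 1 and `k_coarse` is 3: the coarser release is harder to link back to an individual but less useful for analysis, which is when managed access to the fuller dataset may be the better option.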

References:

Australian Research Data Commons (2022), Publishing sensitive data, ARDC Guide.
Inter-university Consortium for Political and Social Research (n.d.), Guide to Social Science Data Preparation and Archiving: Best Practice Throughout the Data Life Cycle, 6th Edition.
Chapman and Grafton (2008), Guide to Best Practices for Generalising Sensitive Species Occurrence Data. Copenhagen: GBIF Secretariat. https://doi.org/10.15468/doc-b02j-gt10

Resources for sensitive data

Below is a compilation of resources for working with sensitive data:

NTU Research Integrity and Ethics Office (RIEO)
- IRB Guidelines
- NTU Position on the Use of Generative Artificial Intelligence in Research
NTU Institutional Review Board (IRB)
NTU Office of Enterprise Risk Management (ERM)
- Training video and resources on PDPA
NTU Legal and Secretarial Office (LSO)
- Templates for Research Agreements
NTUitive
- For questions on commercialisation and IP, please contact NTUitive at this email
Personal Data Protection Commission (PDPC) Singapore
- Guide to securing Personal Data in Electronic Medium
- Guide to Data Protection Impact Assessments