Guideline on securing human research participant information and data

Applicability | Purpose | Responsibilities of the research team | Securing and safeguarding through the data lifecycle | Safeguarding data through de-identification | Direct identifiers | Indirect identifiers | Security measures and requirements | Resources


Applicability

This guideline applies to all research involving human participants and their information and data conducted under the auspices of the University of Waterloo. The data may be, or may have been, collected and/or stored in paper or electronic form. This could include mobile devices, personal computers, portable media, and online storage. These can be privately- or university-owned and located on or off university premises.

Note: This guideline is not written for research with Indigenous Peoples as various considerations must be accounted for including Indigenous data sovereignty principles, respectful Indigenous community engagement, and reciprocal relationship building. 

Purpose

This guideline is intended to help researchers consider the welfare of participants throughout the lifecycle of their research. Properly safeguarding and securing research participant information and data demonstrates concern for welfare, one of the guiding principles of the Tri-Council Policy Statement: Ethical Conduct for Research Involving Humans, 2nd edition (TCPS2). This includes personal information such as names, addresses, telephone numbers, emails, and data that can be derived from biological materials, as well as digital and physical data collected during research. 

Researchers must establish procedures to protect data and information obtained or collected from participants. Procedures must also be established when the research involves secondary use of data. Secondary use is defined as “information that was collected for a purpose other than the current research purpose” (TCPS 2, Chapter 5).

Definitions

The following general definitions of participant information and data are applicable for this guideline. 

Human participant information: Personal information collected and used to run a research study, such as contact information (names, phone numbers, email addresses). This information is not required to answer the research question(s).

Human participant data: Data from or about humans that are collected, obtained, and/or used as part of the research processes and outputs and/or used to answer the research question(s).


Responsibilities of the research team

Researchers have a responsibility to consider the whole lifecycle when safeguarding and securing participant information and data as, “…privacy risks arise at all stages of the research life cycle, including initial collection of information, use and analysis to address research questions, dissemination of findings, storage and retention of information, and disposal of records or devices on which information is stored.” (TCPS 2, Chapter 5) 

Policy 46 – Information Management outlines different roles and responsibilities for research team members. Under this policy, the Principal Investigator or Faculty Supervisor is the information steward for the research data while other research team members for the research project are information custodians. If there are alternative expectations, consult with the Office of Research Ethics for additional information. All research team members are expected to be aware of and understand their roles and responsibilities. Team members may include:

  • Co-investigators
  • Students engaged in research activities
  • Research staff with access to or involved in collection of information and data
  • Technical support staff involved in the deployment, maintenance, and administration of information technology where research information and data are stored and/or transmitted

As a reminder, researchers should complete cyber awareness and research security training.

For additional information about the responsibilities of principal investigators, please review the Principal Investigator’s Handbook.


Securing and safeguarding throughout the lifecycle

Throughout the research lifecycle consider:

  • What data/information is being collected? Is any of this identifying or potentially identifying information (e.g., name, date of birth, address, a combination of indirect identifiers, etc.)?
  • How is this information being collected (email, paper, in an interview audio recording, etc.)?
  • Is the data or research context sensitive, or will it potentially pose a risk to participant safety or well-being if disclosed to others?
  • Have data management plans and research project cybersecurity planning been incorporated?

Recruitment phase

During participant recruitment, researchers may obtain participant contact information (e.g., email, phone number, mailing address). It is important for researchers to maintain effective security measures when doing so and respect the privacy of participants (e.g., not sharing their contact information, destroying it once no longer needed, etc.).

Data collection

It is important for researchers to consider what information and data are collected and how these would be classified so that the research team is aware of the security measures that are necessary over the lifetime of the research project. Things to consider include: identifiability, confidentiality, sensitivity, and whether there is the potential risk of re-identification of participants. Researchers can use the privacy and security research risk assessment tool to help identify and consider these details and determine how to best put safeguards in place throughout the lifecycle and research processes.

“The easiest way to protect participants is through the collection and use of anonymous or anonymized data, although this is not always possible or desirable… A ‘next best’ alternative is to use de-identified data… the last alternative is for researchers to collect data in identifiable form and take measures to de-identify the data as soon as possible.” – Chapter 5, TCPS2 2022

Examples of collection and de-identification methods that can be used to safeguard participant data and information are:  

  • Collect the minimum information needed to conduct the study.
  • Be aware of what information and data you are collecting and the context in which it is collected.
  • When possible, try to collect data in an anonymous or de-identified manner.
  • If you must collect data in an identifiable form, de-identify data as soon as possible after collection and/or separate identifiable variables (e.g., create an identity code).
  • Consider how the information is being collected, and if the methods of collection are appropriate for the research context and the type of data being collected.
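
As one illustration of separating identifiable variables with an identity code, the sketch below splits each collected record into a linking key (code plus direct identifiers) and a coded research record. The field names, the "P-" code format, and the helper `split_records` are hypothetical and are not prescribed by this guideline. The linking key would still need to be stored securely and apart from the coded data.

```python
import secrets

# Hypothetical field names treated as direct identifiers in this example.
DIRECT_IDENTIFIERS = ("name", "email", "phone")


def split_records(records):
    """Split raw records into a linking key and coded research data.

    Returns (linking_key, coded_data). The linking key maps each random
    participant code to that person's direct identifiers; the coded data
    keeps only the code and the remaining study variables.
    """
    linking_key = {}
    coded_data = []
    for record in records:
        # Use a random code so the code itself reveals nothing about the person.
        code = "P-" + secrets.token_hex(4)
        while code in linking_key:  # collisions are very unlikely; retry if one occurs
            code = "P-" + secrets.token_hex(4)
        linking_key[code] = {
            field: record[field] for field in DIRECT_IDENTIFIERS if field in record
        }
        coded = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
        coded["participant_code"] = code
        coded_data.append(coded)
    return linking_key, coded_data


# Made-up example records.
raw = [
    {"name": "Participant A", "email": "a@example.org", "phone": "555-0100", "age": 34, "score": 7},
    {"name": "Participant B", "email": "b@example.org", "phone": "555-0101", "age": 58, "score": 4},
]
linking_key, coded_data = split_records(raw)
```

In this sketch, analysis would proceed on `coded_data` only, and `linking_key` would be consulted only when re-contacting or withdrawing a participant.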

Storage and preservation

Depending on how information and data are collected, shared, stored, or preserved, there may be specific security measures researchers should consider; for example, when using apps, portable devices, and smartphones. If paper copies are involved, attention should be given to the storage during data collection as well as longer term storage (e.g., do not leave paper copies in unattended vehicles, keep them stored in locked cabinets or rooms only accessible to researchers, etc.). When considering management, storage, and preservation options, please also consult the research data management services (RDMS) at the University of Waterloo Library for additional resources. 
 

Sharing

When human participant data and information are transferred or shared, consideration should be given to who is sending and receiving the information and data, and how they can be kept secure in the transfer. Sharing can occur within a research group and/or externally, such as with partners, other research groups, or processing services (e.g., transcription services). Where possible, transfers should be kept to a minimum and security measures appropriate to the type and sensitivity of the data should be maintained. Examples of ways to provide access or share data securely with other researchers are SharePoint, OneDrive, SendIt, or a secure file transfer protocol (SFTP) site. In some cases, sharing agreements may be formulated; these often describe expectations of how data should be transferred, maintained, and disposed of or returned. The privacy and security research risk assessment tool can help guide you and advise when a data transfer agreement may be recommended. The electronic transmission of data must also use adequate and secure protocols approved by IST. Please review their guidelines for secure data exchange.

Disposal

The TCPS2 does not stipulate a data destruction timeline. There may be times when human participant data will be destroyed or disposed of as a final step in the data lifecycle or as part of research practices (e.g., deleting recordings after transcription has been completed). As part of this step, researchers should consider how to safeguard participant data, including how contact information is handled, whether data are anonymized, and how data/information are deleted. Data and media should be disposed of in a responsible manner, such as by following guidance on disposing of electronic media or following confidential shredding procedures.

For additional information about how long information and data should be kept, please review the guideline on retention of study information and data.


Safeguarding data through de-identification

Researchers must consider potential risks that may be associated with identification of participants and their data – these risks will vary with research contexts. The purpose of de-identification is to protect the privacy of research participants and to minimize harm in the event of a breach which could occur during data collection, storage, or transfer.

Steps to de-identify data

  1. Describe the de-identification process as part of the documentation/data management process.
  2. Identify direct and indirect identifiers required for your research: Identity can be disclosed through direct and indirect identifiers applying to individuals, groups, and organizations (De-identification Guidelines for Structured Data, June 2016; Biometrics, n.d.).

Direct identifiers

Examples that may act as direct identifiers in a dataset: 

  • Name
  • Telephone number
  • Email address
  • Social Insurance number/Health Card number/Medical Record number
  • Certificate/license numbers
  • Vehicle identifiers and serial numbers, including license plate numbers
  • Device identifiers and serial numbers
  • Biometric identifiers (fingerprints, iris patterns, facial features, DNA, voice signatures)
  • Internet protocol (IP) address
  • Any other unique identifying number, characteristic, or code

Indirect identifiers

Examples that in combination could lead to identification:

  • Demographic information (e.g., age, ethnicity, race, gender, sex, religion, marital status, education level)
  • Elements of dates related to an individual (e.g., full birthdate, full date of death, admission dates, immigration dates)
  • Uncommon characteristics of the individual (e.g., rare health condition)
  • Geographic/regional location
  • Named facility and/or service provider
  • Institutional affiliations
  • Highly visible characteristics/distinctive features of an individual (e.g., tattoos, scars, birthmarks, etc.)
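
Because indirect identifiers become identifying in combination, one rough way to gauge risk is to count how many records share each combination of values. A combination held by a single record points to one person. The sketch below is illustrative only: the column names, sample records, and helpers `combination_counts` and `unique_combinations` are hypothetical, and a low count is a prompt for closer review rather than a formal risk measure.

```python
from collections import Counter


def combination_counts(records, columns):
    """Count how many records share each combination of values in `columns`."""
    return Counter(tuple(record[c] for c in columns) for record in records)


def unique_combinations(records, columns):
    """Return the value combinations that occur in exactly one record."""
    counts = combination_counts(records, columns)
    return [combo for combo, n in counts.items() if n == 1]


# Made-up example records containing only indirect identifiers.
records = [
    {"age_range": "30-39", "region": "North"},
    {"age_range": "30-39", "region": "North"},
    {"age_range": "30-39", "region": "South"},
    {"age_range": "60-69", "region": "North"},
    {"age_range": "60-69", "region": "South"},
]

# No single value is unique on its own...
alone_age = unique_combinations(records, ["age_range"])
alone_region = unique_combinations(records, ["region"])

# ...but several age/region pairs each describe exactly one record.
combined = unique_combinations(records, ["age_range", "region"])
```

Here neither age range nor region singles anyone out alone, yet three of the four age/region pairs match only one record each.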

Apply methods used to de-identify data

If a variable might act as an indirect identifier, it can be treated in a few ways to minimize re-identification risk (De-identification Guidelines for Structured Data, 2016; Guide to Social Science Data Preparation and Archiving, 6th Edition).  For example:

  • Coding – replace identifiers with a code or pseudonym
  • Removal – eliminate the variable from the data set (during the project design phase researchers should carefully consider variables to be collected)
  • Aggregate – reduce detail (e.g., report at region instead of village; report age range instead of age) 
  • Generalize detailed variables (e.g., report position level such as "manager" instead of identifying job title).
  • Top-code – restrict the upper and lower ranges of a variable to hide outliers (e.g., an 81-year-old could be grouped as “people in their 80s”).
  • Avoid presenting data or tables with small cell sizes (e.g., fewer than 5 respondents)
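
Three of the methods above (aggregating, top-coding, and avoiding small cells) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a prescribed procedure: the helpers `age_range` and `suppress_small_cells`, the 10-year band width, and the cut-off of 90 for the top category are all hypothetical choices. The threshold of 5 follows the small-cell example in the list.

```python
from collections import Counter


def age_range(age, width=10, top=90):
    """Aggregate an exact age into a band; ages at or above `top` share one top category."""
    if age >= top:
        return f"{top}+"
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def suppress_small_cells(values, minimum=5):
    """Tabulate values, replacing any count below `minimum` with a marker."""
    counts = Counter(values)
    return {
        value: (n if n >= minimum else f"<{minimum}")
        for value, n in counts.items()
    }


# Made-up ages for illustration.
ages = [23, 27, 25, 21, 29, 34, 81, 97]
bands = [age_range(a) for a in ages]
table = suppress_small_cells(bands)
```

With these assumed settings, the 81-year-old is reported only as "80-89", the 97-year-old falls into the top category "90+", and any band with fewer than 5 respondents is shown as "<5" instead of an exact count.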

Security measures and requirements

Minimum security requirements are expected for certain types of information and data. These requirements differ depending on the information/data and the possible layers of protection. Security should include organizational, technological, and physical measures. Researchers can also review further information and guidance about managing data.

When possible, researchers should start by using tools and services already vetted by IST or request a review from IST of any third-party services for security issues (e.g., use of cloud-based services, sharing with a research partner). 

Key points and tips

  • Collect the minimum identifying information and data needed to conduct the study.
  • De-identify as soon as possible after collection and/or separate identifiable variables (e.g., consider when and how you may be collecting certain demographic information, create an identity code, etc.)
  • Direct identifiers and identity-only data sets must always be stored in a secure location in a data-encrypted form.
  • When data sets cannot be completely de-identified, the original data set must be considered an identified data set.
  • The relative risk posed to participants if the data are inadvertently or intentionally released or exposed must be considered when determining the level of security necessary for personally identifiable information.

Encryption

Secure encryption scrambles information and data into an unrecognizable form which can only be unscrambled and read by providing the password (encryption key). There are several storage options available and ways to encrypt, such as using OneDrive or SharePoint, encrypting individual files on a computer, and/or encrypting devices as well. 

Password-Protected Access

A computer or server requiring a password to access it is considered password protected. Data should be stored on campus servers that meet security guidelines for sharing and storage of research data, and use two-factor authentication for access when possible. For multi-site, multi-country, or multi-investigator research projects a non-Waterloo server solution, such as the Scholars Portal Dataverse, may be required. IST also provides information about security standards for desktops and laptops that researchers should be aware of.

Secure location

A secure location is a place (e.g., office, laboratory, filing cabinet) for storing a portable medium, computer, or equipment on which information and data are kept. The principal (or lead) investigator should always have access to the secure location through lock and key (either physical or electronic keys are acceptable). Access may be provided to other parties (e.g., co-investigators, post-doctoral fellows, research assistants, etc.) with a legitimate need and as outlined in the research ethics application.

Minimum security requirements for electronic data/information

Table 1 provides some examples of how different types of information/data should be secured at a minimum using some of the described methods such as encryption, password protection, and storage in a secure location. The "X" in a table cell indicates a required element. 

Table 1: Minimum security requirements for electronic data/information

| Type of information/data | Example | Encryption | Password protections | Secure location | Additional notes |
| --- | --- | --- | --- | --- | --- |
| Contains direct identifiers; OR indirect identifiers with a risk of re-identification | Participants with rare health conditions may be more readily identifiable if other information is available such as rural or urban location | X | X | X | If identifiable and sensitive also store at high level of security (e.g., stand-alone servers, special protection for remote electronic access, etc.) |
| Coded information/data | Identifiers are removed and participant code or pseudonyms are used in their place | X | X | X |  |
| Anonymized personal health information (PHI) | Receiving an anonymized dataset from a hospital data custodian for data analysis | X | X | X | Health information custodians may have additional requirements. |
| Anonymized or anonymous information (not PHI) | An anonymous online survey |  | X | X |  |
| Identified information, with permission for identification | A virtual interview that participants want attributed |  | X |  |  |

For additional information related to encrypting, protecting, and securing the data and devices, please visit the Information Security Services (ISS) page from Information Systems & Technology (IST).  

Privacy and the use of smartphones and apps

Protection of personal privacy on social media platforms, as described by IST, is important. When using smartphones and devices during interactions with research participants, it is important to protect their information and data. At a minimum, if you are working with confidential or sensitive information on your device, including research participant information and data, you should not use social media apps on the same device. The research participants’ personal information and data must be held in a secure location and deleted when no longer needed.

Using laptops and personal devices

Laptops and portable devices must be secured; they pose a significant risk for identifiable and sensitive information/data because of the increased possibility of theft. Identifiable information/data collected on a laptop or portable equipment must be encrypted and de-identified as soon as possible or moved to secured, non-portable equipment or storage space. Consider reviewing the knowledge base on USB storage devices and their alternatives for additional information. 

Backing up data

Active research data should be backed up regularly. Researchers are advised to keep an encrypted backup that is stored off-line when the data is not in use. Safeguarding data from loss respects the contributions of research participants.


University of Waterloo resources

Information Systems & Technology 

Cyber Awareness and Security

Information security for research

Information Security Services (ISS): For information on:

  • Data and device encryption
  • Cybersecurity
  • Password standards
  • Much more

Password Standards

Protection of personal privacy on social media platforms

Secure file transfer


External resources


References

Canadian Institutes of Health Research, Natural Sciences and Engineering Research Council of Canada, and Social Sciences and Humanities Research Council of Canada. (December 2022). Tri-Council Policy Statement: Ethical Conduct for Research Involving Humans. Accessed June 28, 2024.

Information and Privacy Commissioner of Ontario. (n.d.). Biometrics. Accessed June 28, 2024.

Information and Privacy Commissioner of Ontario. (June 2016). De-identification Guidelines for Structured Data. Accessed June 28, 2024.

Inter-university Consortium for Political and Social Research (ICPSR). (n.d.). Guide to Social Science Data Preparation and Archiving (6th Edition). Accessed June 28, 2024.

Updated September 2024