Landmark’s Help Center

Join our community of researchers who have used Landmark for over 100,000 transcripts

Request a free Demo. One of our Account Executives will be pleased to guide you through our Platform.

You can cancel your account at any time.

Quick Guide to Data De-Identification

De-identification involves the removal of personally identifying information in order to protect personal privacy.

In terms of health information, data is considered de-identified under the Health Insurance Portability and Accountability Act (HIPAA) Privacy Rule when a number of specified data elements are removed. 


What is De-Identified Data?

Data is de-identified when either of the following is true:

●  All 18 HIPAA-specified direct and indirect identifiers have been removed (Safe Harbor method), or

●  A qualified expert has determined that the risk of re-identification is very small (Expert Determination method).


Example of a de-identified transcript excerpt:

Interviewer:  When was the first time you heard about the [Organization]?

Interviewee:   Ten years ago. I was a student at [University] and one of my professors told us about [Organization] and their work.

To view a complete example click on the link below.

What is PHI?

Protected Health Information (PHI) is individually identifiable health information, including demographic information, that is created or received by a covered entity and that relates to the past, present, or future physical or mental health of an individual, the provision of healthcare to an individual, or the past, present, or future payment for the provision of healthcare to an individual.

The presence of at least one of 18 HIPAA-designated direct and indirect identifiers in a data set makes the whole data set Protected Health Information.

1. Names
2. Social Security numbers
3. Telephone numbers
4. Addresses and all geographic information smaller than a state
5. All elements of dates (except year), including date of: birth, admission, discharge, and death; and all ages over 89
6. Fax numbers
7. E-mail addresses
8. Medical record numbers
9. Health Plan Beneficiary numbers
10. Account numbers
11. Certificate/license numbers
12. Vehicle identifiers and serial numbers, including license plate numbers
13. Device identifiers and serial numbers
14. Web Universal Resource Locators (URLs)
15. Internet Protocol (IP) addresses
16. Biometric identifiers, including finger and voice prints
17. Full face photographic images and comparable images
18. Any other unique identifying number, characteristic, or code. Any code or other means of record identification that is derived from PHI must be removed for the data to be considered de-identified under the Safe Harbor method.
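As a rough illustration of how a few of the identifiers above might be flagged, and of how dates and ages are generalized under Safe Harbor, here is a minimal Python sketch. The names PATTERNS, flag_identifiers, generalize_date and generalize_age are hypothetical, the "90+" label is an assumed way of writing the "90 or older" category, and the regular expressions are simplistic assumptions that cover only common US-style formats. Pattern matching cannot find names or free-text identifiers, so this is an aid to manual review and not a replacement for it.

```python
import re
from datetime import date

# Hypothetical, deliberately simple patterns for a few identifier types.
# Real data will contain formats these patterns miss.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-. ]\d{3}[-. ]\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "url": re.compile(r"\bhttps?://\S+"),
    "ip": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def flag_identifiers(text):
    """Return {identifier type: matches} for every pattern found in text."""
    found = {}
    for name, pattern in PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            found[name] = matches
    return found


def generalize_date(d):
    """Keep only the year of a date; all other date elements are dropped."""
    return str(d.year)


def generalize_age(age):
    """Report ages over 89 as one category (written here as '90+')."""
    return "90+" if age > 89 else str(age)
```

For example, flag_identifiers("Call 555-123-4567 or email jane@example.org") flags one phone number and one e-mail address, and generalize_date(date(1984, 3, 7)) keeps only "1984". Any non-empty result from flag_identifiers means the text still needs review before it can be treated as de-identified.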

My research involves the use of PHI. What steps do I take?

Entities covered by HIPAA may share a limited data set for research purposes permitted by the Privacy Rule under one condition: all recipients must be bound by a data use agreement with the originator of the data.

If you are a researcher seeking to access, obtain, or use PHI from a HIPAA covered entity for research purposes, you may need to obtain a signed authorization for that use from the patient/participant, or otherwise justify an exception to that requirement.

In either case, you will be required to have an IRB-approved protocol.

Tips for de-identifying participants

• Plan de-identification in advance or apply it at the time of transcription. Exception: in longitudinal studies, de-identify once data collection is complete, so that linkages between data are preserved

• Avoid blanking out: use pseudonyms or replacements

• Avoid over-anonymising: removing or aggregating information in text can distort data and make them unusable, unreliable or misleading

• Be consistent within the research team and throughout the project

• Show replacements, e.g. with [brackets]

• Keep a log of all replacements, aggregations or removals made – keep separate from de-identified data files

• A text anonymisation helper tool can help you find disclosive information to remove or pseudonymise in text files

• An MS Word macro can find and highlight numbers and words starting with capital letters in text, which are often disclosive, e.g. names, companies, birth dates, addresses, educational institutions and countries
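Several of the tips above (flagging capitalised words and numbers, showing replacements in brackets, and keeping a separate log) can be sketched in a few lines of Python. This is a minimal illustration and not the helper tool or macro mentioned above: find_candidates, pseudonymise and save_log are hypothetical names, and the mapping of terms to replacements is assumed to be agreed by the research team beforehand.

```python
import csv
import re

# Capitalised words and numbers are often disclosive, so flag them for review.
CANDIDATE = re.compile(r"\b(?:[A-Z][a-z]+|\d+)\b")


def find_candidates(text):
    """Return capitalised words and numbers for a human to review."""
    return CANDIDATE.findall(text)


def pseudonymise(text, replacements):
    """Swap each listed term for its [bracketed] replacement.

    Returns the edited text and a log of (original, replacement, count) rows.
    """
    if not replacements:
        return text, []
    # Longest terms first, so a longer phrase wins over a word inside it.
    terms = sorted(replacements, key=len, reverse=True)
    pattern = re.compile(r"\b(?:" + "|".join(map(re.escape, terms)) + r")\b")
    counts = {}

    def swap(match):
        original = match.group(0)
        counts[original] = counts.get(original, 0) + 1
        return f"[{replacements[original]}]"

    new_text = pattern.sub(swap, text)
    log = [(term, f"[{replacements[term]}]", n) for term, n in counts.items()]
    return new_text, log


def save_log(log, path):
    """Write the log to its own CSV file, kept apart from the data files."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["original", "replacement", "count"])
        writer.writerows(log)
```

With the made-up sentence "I was a student at Ridgemont and heard about Acme in 2010." and the mapping {"Ridgemont": "University", "Acme": "Organization"}, pseudonymise returns the sentence with [University] and [Organization] in place of the two names, plus one log row per term. Because the log maps replacements back to the originals, it should be stored separately from the de-identified files, as the tips recommend.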

Regulatory Resources

Health Insurance Portability and Accountability Act of 1996 (HIPAA), Pub. L. No. 104-191, § 264 (1996), codified at 42 U.S.C. § 1320d; Standards for Privacy of Individually Identifiable Health Information, 45 C.F.R. § 160 (2002), 45 C.F.R. § 164 subpts. A, E (2002).

Other Resources

Guidelines for Data De-Identification or Anonymization

Quick Guide to HIPAA – Stanford Medicine. Research Informatics Center

De-identification and pseudonymisation of qualitative data

© 2009-2020 Landmark Associates, Inc.