Pseudonymisation

From Voror_Wiki

Revision as of 07:48, 14 May 2022 by JackBarker (talk | contribs)
Discovery uses the NHS-standard Open Pseudonymiser to create pseudonymised data. It takes one or more inputs (NHS number, date of birth) and a salt file, and generates a pseudo ID that looks like "A541CAF13D376B9AD1072C3096AE141CFF1E67B027CEB632D194D3C6577AB8BF".

By using different salt files you can generate different pseudo IDs from the same input data. Each research project uses its own salt, so the pseudo IDs generated for the patient records it uses are specific to that project.

Pseudo IDs are generated in DDS subscriber databases according to subscriber-specific configuration, from either the NHS number alone or the NHS number and date of birth.
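
The relationship between inputs, salt and pseudo ID can be illustrated with a minimal sketch. This is not the Open Pseudonymiser implementation: it assumes a plain salted SHA-256 over the input values (the example ID above is 64 hex characters, which is consistent with a 256-bit digest). The function name pseudo_id, the field names, the salts and the NHS number are all made up for illustration.

```python
import hashlib


def pseudo_id(fields: dict[str, str], salt: bytes) -> str:
    """Return an upper-case hex SHA-256 digest of the field values plus the salt.

    Illustrative only: the real Open Pseudonymiser may combine inputs differently.
    """
    # Sort by field name so the same inputs always hash in the same order.
    joined = "".join(fields[name] for name in sorted(fields))
    digest = hashlib.sha256(joined.encode("utf-8") + salt).hexdigest()
    return digest.upper()


# Hypothetical salts; in practice each project supplies its own salt file.
salt_project_a = b"example-salt-a"
salt_project_b = b"example-salt-b"

# Hypothetical inputs for the two configurations described above.
nhs_only = {"NHSNumber": "9434765919"}
nhs_and_dob = {"NHSNumber": "9434765919", "DateOfBirth": "1980-05-17"}

id_a = pseudo_id(nhs_only, salt_project_a)         # project A, NHS number only
id_b = pseudo_id(nhs_only, salt_project_b)         # project B, same input, different salt
id_a_dob = pseudo_id(nhs_and_dob, salt_project_a)  # project A, NHS number and date of birth
```

In this sketch the same patient gets a different ID under each salt, so IDs from two projects cannot be matched without access to both salts, while repeating the call with the same inputs and salt always returns the same ID.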

Salt files

A salt file is used to generate a pseudo ID from a given input such as an NHS number.

Each customer/project can supply their own salt files, and each subscriber database supports an unlimited number of pseudo IDs generated for each patient. For example, the CEG database has 30 pseudo IDs for each patient, each generated from the NHS number and a separate salt file.

For more information see https://www.openpseudonymiser.org/FAQ.aspx

Key server

A key server is a website or API that securely stores and shares salt files.

Discovery doesn't currently provide a key server and instead uses a manual approach to managing salt files. A true automated key server is planned for the future.

De-identification of the record

This is how the Discovery Compass v2 databases are de-identified.

For each patient demographic record:

if Subscriber Configuration -> isPseudonymised = true:

  • NHS number is blanked
  • Title, Firstname and Surname are blanked
  • Date of Birth is set to 01/MM/yyyy. NOTE: This can be extended further with a DOB mask to keep only the year, i.e. 01/01/yyyy
  • Telecom/fax/email values are blanked
  • All address lines are blanked and only the postcode prefix is kept, e.g. LS1
  • All UPRN address coordinates are blanked
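
The rules above can be sketched as a single function. This is a minimal illustration, not the Compass v2 code: the record layout, the key names, the deidentify function and its year_only_dob flag are all assumptions, dates are assumed to be dd/MM/yyyy strings, and the postcode prefix is assumed to be the outward code of a complete postcode.

```python
import copy


def deidentify(patient: dict, is_pseudonymised: bool, year_only_dob: bool = False) -> dict:
    """Return a de-identified copy of a patient demographic record (hypothetical layout)."""
    out = copy.deepcopy(patient)
    if not is_pseudonymised:
        return out  # subscriber is not configured for pseudonymisation

    # Blank direct identifiers and contact details.
    for key in ("nhs_number", "title", "first_name", "surname", "phone", "fax", "email"):
        out[key] = None

    # Keep month and year (or year only, when the DOB mask is enabled).
    _day, month, year = patient["date_of_birth"].split("/")
    out["date_of_birth"] = f"01/01/{year}" if year_only_dob else f"01/{month}/{year}"

    # Blank address lines; keep only the postcode prefix.
    # The inward part of a full UK postcode is always its last three characters.
    out["address_lines"] = []
    out["postcode"] = patient["postcode"].replace(" ", "")[:-3]

    # Blank UPRN coordinates.
    out["uprn_coordinates"] = None
    return out


# Made-up example record.
patient = {
    "nhs_number": "9434765919", "title": "Ms", "first_name": "Jane", "surname": "Doe",
    "date_of_birth": "17/05/1980", "phone": "0113 496 0000", "fax": None,
    "email": "jane@example.org", "address_lines": ["1 Example Street", "Leeds"],
    "postcode": "LS1 4AP", "uprn_coordinates": (53.8, -1.55),
}
result = deidentify(patient, is_pseudonymised=True)
```

The function works on a copy, so the source record is left unchanged, which makes it easy to compare the two in a test.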