Stream: implementers
Topic: Hashed identifier
Niek van Galen (May 14 2021 at 12:09):
In the Dutch realm, senders sometimes do not want (or are not allowed) to include a person's national identification number (BSN in Dutch, like the US SSN). Organizations like quality registries, however, do require a unique identifier to be sent. Often hashed identifiers are used. What would be the recommended way of representing a hashed identifier in FHIR?
I would expect using the .type element makes sense, but the referenced ValueSet does not contain a code that matches anything like "hashed identifier". This ValueSet is however extensible so I could just add my own, but I am looking for a more international approach.
John Moehrke (May 14 2021 at 12:12):
Is there a reason to decorate these with a type? Seems to me the intent of these pseudo-identifiers is to serve as a placeholder to fulfil a requirement that an identifier exists. To indicate that it is a hashed identifier is to invite a brute-force attack. What downstream actor needs to know it was a hashed real identifier vs just knowing it is a globally unique id?
Niek van Galen (May 14 2021 at 12:14):
Good point @John Moehrke. The receiver of the data wants to know what is a real identifier (BSN) and what is not (hashed), because a check-digit validation is performed on the real identifiers to confirm validity of the number.
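A minimal sketch of the kind of check-digit validation mentioned here, assuming the commonly documented BSN "11-proef" (nine digits, weights 9 down to 2 and then -1, weighted sum divisible by 11). The function name `is_valid_bsn` is hypothetical and the rule is an assumption, not something stated in this thread:

```python
def is_valid_bsn(value: str) -> bool:
    """Return True if value passes the assumed BSN 11-proef."""
    # A BSN is assumed to be exactly nine ASCII digits.
    if len(value) != 9 or not (value.isascii() and value.isdigit()):
        return False
    # Assumed weights: 9..2 for the first eight digits, -1 for the last.
    weights = (9, 8, 7, 6, 5, 4, 3, 2, -1)
    total = sum(int(digit) * weight for digit, weight in zip(value, weights))
    return total % 11 == 0
```

A hex-encoded hash fails the length and digit test, so a receiver running such a check cannot mistake it for a real BSN. A hash truncated to nine digits, though, could pass by chance.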
John Moehrke (May 14 2021 at 12:16):
That is easy... those identifiers that it can recognize are what they are claiming to be... those that it can't recognize are something it is not intended to recognize.
John Moehrke (May 14 2021 at 12:17):
imagine a USA SSN being dropped in there... it would not be useful or recognized.... but it would be carried just the same.
John Moehrke (May 14 2021 at 12:18):
and to carry on with that USA SSN example... it is useless for it to be tagged so that it could be understood, as it is a number of no use in the Dutch environment... much like the pseudo-identifier created from the hash.
Niek van Galen (May 14 2021 at 12:26):
I see your point, which does raise another question. When including a Dutch BSN in conformance with the nl-core-patient profile, the .system element has a fixed value. In that way, receiving systems know that it's a BSN. I would expect that one would have to use a different .system value (maybe even "hospital specific") for a hashed BSN/identifier?
Alexander Henket (May 14 2021 at 12:29):
I would expect the same: an identifier with .system "BSN" SHALL have a value that actually is a BSN. A hashed thing therefore SHALL NOT have a "BSN" system.
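A rough sketch of how a receiver could tell the two apart purely by .system. The BSN system URI shown is assumed to be the one fixed by nl-core-patient, and `PSEUDONYM_SYSTEM`, `find_identifier` and the sample value are hypothetical:

```python
from typing import Optional

# Assumed to be the fixed .system for BSN in nl-core-patient.
BSN_SYSTEM = "http://fhir.nl/fhir/NamingSystem/bsn"
# Hypothetical sender-specific system for a pseudonymous identifier.
PSEUDONYM_SYSTEM = "https://example.org/fhir/NamingSystem/patient-pseudonym"


def find_identifier(patient: dict, system: str) -> Optional[str]:
    """Return the first identifier value with the given system, or None."""
    for identifier in patient.get("identifier", []):
        if identifier.get("system") == system:
            return identifier.get("value")
    return None


# A Patient (as a plain dict) that carries only a pseudonym, no BSN.
patient = {
    "resourceType": "Patient",
    "identifier": [{"system": PSEUDONYM_SYSTEM, "value": "7c1e4b9a02d3"}],
}
```

With this shape, the receiver runs BSN validation only on values it finds under the BSN system and simply carries everything else.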
Alexander Henket (May 14 2021 at 12:30):
The main requirement here is likely that upon sending the same person/patient, you have the same identifier, regardless of what that identifier is.
Daniel Venton (May 14 2021 at 12:32):
You have conflicting requirements then, sounds like. Cannot put BSN in, Must put BSN in so we can do patient matching. Which requirements group has force of law? That's who I'd go with.
Niek van Galen (May 14 2021 at 12:35):
@Daniel Venton The receiver is entitled to process BSN by Dutch regulations, but cannot enforce that. Senders are encouraged to include a real BSN but are not always willing to because of their organization's policy.
Daniel Venton (May 14 2021 at 12:46):
If the organization doesn't want to send the BSN and the consumer can't force you to send the BSN but can force you to send a unique identifier, send any ole unique identifier you have lying around. Patient ID. MRN. Why even hash BSN unless BSN is the only unique value you have? With other unique items you get to set the system value to yourself. One problem with hashing the BSN, I suspect, is that the BSN is probably a relatively simple number, say 9-digit numeric. If your hash formula was ever "discovered" then creating a rainbow table of all possible hashes would be easy. If you then decide to prevent a rainbow-table attack, you salt the value. If you salt the value then you have to track which salt you used with each BSN so you can use the same salt again tomorrow.
Seems simplest just to send a different unique value, perhaps hospital/facility assigned, and be done with it.
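To make the enumeration risk concrete, here is a hedged sketch. It assumes SHA-256 as the "discovered" hash formula. For the mitigation it swaps the per-BSN salt described above for a single secret key via HMAC-SHA-256, which is a different technique chosen only because it needs no per-value bookkeeping. All function names are hypothetical:

```python
import hashlib
import hmac
from typing import Iterable, Optional


def plain_hash(bsn: str) -> str:
    """Unkeyed SHA-256 of the BSN (the assumed 'discovered' formula)."""
    return hashlib.sha256(bsn.encode("ascii")).hexdigest()


def brute_force(target_hash: str, candidates: Iterable[str]) -> Optional[str]:
    """Recover the input of an unkeyed hash by trying every candidate."""
    for candidate in candidates:
        if plain_hash(candidate) == target_hash:
            return candidate
    return None


def keyed_pseudonym(bsn: str, secret_key: bytes) -> str:
    """HMAC-SHA-256 pseudonym: stable for a given key, and not
    enumerable by someone who does not hold the key."""
    return hmac.new(secret_key, bsn.encode("ascii"), hashlib.sha256).hexdigest()


# Demo over a small slice of the 9-digit space to keep it fast;
# the full space is only 10**9 values.
small_space = (f"{n:09d}" for n in range(111222000, 111223000))
recovered = brute_force(plain_hash("111222333"), small_space)
```

The keyed variant still yields the same pseudonym for the same patient on every send, but anyone holding the key can rebuild the mapping, so the key has to be protected like the BSN itself.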
Craig McClendon (May 14 2021 at 14:08):
This seems potentially dangerous: if the receiving systems will sometimes receive the hashed and sometimes the unhashed identifiers for the same patient and they can match on either, does that imply they are storing both identifiers? If that were the case, then anyone who stole the data would have the crosswalk between the two identifiers.
Last updated: Apr 12 2022 at 19:14 UTC