Stream: cds hooks
Topic: Anonymized CDS
Elliot Silver (May 25 2018 at 17:53):
Has there been any consideration for doing decision support without exchange of PHI? I believe some decision systems and some provider organizations would be more comfortable in this case. Can CDS Hooks support this?
Kevin Olbrich (May 25 2018 at 20:44):
I know @Brian Alper is interested in this as well. There isn't anything in the standard that prevents it. Some hooks might send over a Patient's FHIR id, which could be an issue in some cases. You would want to avoid prefetches unless you use the _elements param to subset the data. You could also propose a new hook that only sent over certain specific bits of data (like just age perhaps). I have also been thinking about using something like @Chris Grenz 's profile filtering work to actually help communicate the data requirements and possibly even restrict the data sent.
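A minimal sketch of the _elements idea above, assuming the CDS Hooks 1.0 prefetch template syntax (`{{context.patientId}}`). The service id, title, and prefetch key are hypothetical, and a FHIR server is not obliged to return only the requested elements, so this narrows what is sent rather than guaranteeing it.

```python
import json


def build_service_definition(elements=("gender", "birthDate")):
    """Build a discovery entry whose prefetch asks for a subset of Patient.

    The id, title, and description are illustrative placeholders, not
    values taken from any real CDS Service.
    """
    # Search form of the query so that _elements applies as a search parameter.
    query = "Patient?_id={{context.patientId}}&_elements=" + ",".join(elements)
    return {
        "hook": "patient-view",
        "id": "minimal-data-example",
        "title": "Minimal-data example service",
        "description": "Requests only the listed Patient elements.",
        "prefetch": {"patient": query},
    }


if __name__ == "__main__":
    print(json.dumps({"services": [build_service_definition()]}, indent=2))
```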
Kevin Shekleton (May 27 2018 at 10:02):
@Kevin Olbrich is correct -- the context field in each CDS Service request is what may contain PHI. Each hook defines its own context so you could design a hook (or set of hooks) that do not send any PHI.
Alternatively, you could use the existing hooks and ensure that data like the Patient.id is mapped to some other value (eg, using a security tokenization solution).
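As a rough illustration of the tokenization option, the sketch below swaps the patientId in a request for a random UUID before the request leaves the EHR. The `TokenVault` class and `tokenize_request` function are hypothetical names, the in-memory mapping stands in for whatever tokenization product is actually used, and dropping fhirServer and fhirAuthorization is an assumption made so the token cannot be used to fetch the real record.

```python
import copy
import uuid


class TokenVault:
    """In-memory stand-in for a tokenization service (illustration only)."""

    def __init__(self):
        self._token_by_id = {}
        self._id_by_token = {}

    def tokenize(self, real_id):
        # Random tokens reveal nothing about the original id.
        if real_id not in self._token_by_id:
            token = str(uuid.uuid4())
            self._token_by_id[real_id] = token
            self._id_by_token[token] = real_id
        return self._token_by_id[real_id]

    def detokenize(self, token):
        return self._id_by_token[token]


def tokenize_request(request, vault):
    """Return a copy of a CDS Hooks request with context.patientId replaced."""
    out = copy.deepcopy(request)
    out["context"]["patientId"] = vault.tokenize(request["context"]["patientId"])
    # Assumption: no callback access is offered to the CDS Service.
    out.pop("fhirServer", None)
    out.pop("fhirAuthorization", None)
    return out
```

The EHR keeps the vault private and uses `detokenize` only to route returned cards back to the right chart.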
John Moehrke (May 29 2018 at 13:55):
Just removing the link to a Patient will reduce the risk that the data are identifiable, but doesn't bring it down much, as there are many quasi-identifiers lying around in health data. I am not saying this isn't a useful step; I think it is a very useful step. I am just trying to keep us fully aware of proper anonymization. Note that even the algorithm used to replace the Patient can be revealing...
Elliot Silver (May 29 2018 at 21:46):
OK. So it should be possible, it's just a matter of defining the hook and setting expectations correctly. Thanks.
Roeland Luykx (May 30 2018 at 06:23):
Shouldn't a new hook template be defined for anonymized CDS, since patient-view has patientId as a required field in the context?
John Moehrke (May 30 2018 at 13:01):
The anonymous case would likely still need a Patient resource, to carry some useful values like age (offset by a random value), gender, other?
John Moehrke (May 30 2018 at 13:02):
If done right, the cds-hook should not need to be implemented differently. The difference would be on the requesting side to anonymize or not, and when anonymizing what method/algorithm.
John Moehrke (May 30 2018 at 13:03):
I guess one would need a profiled Patient to indicate the critical few elements, and the tolerance each element holds.
Kevin Shekleton (May 30 2018 at 21:03):
The patient-view hook does have patientId as a required hook context field, but that doesn't mean that patientId can't be tokenized to some UUID. Also, the FHIR server doesn't have to be accessible so that tokenized patientId doesn't have to provide access to the actual Patient resource.
But it is interesting to think about what value a CDS Service would be able to provide without any identifying information about a patient. As John mentioned, the fields defined as PII/PHI are too narrow. Studies have shown you can identify patients with data outside of PII/PHI. :shrug:
Kevin Olbrich (May 30 2018 at 21:24):
In some cases, you might be able to provide some value with just gender and age, but if you couple that with other information like the location of the customer and the time of the request, it wouldn't be too hard to identify the patient.
John Moehrke (Jun 02 2018 at 16:21):
So, what elements in Patient are critical to the cds-hook doing the job? I would be glad to help define an anonymized Patient for this purpose. It would not be 100% privacy protecting, but nothing ever is. As long as we are clear that it is intended to lower the risk of exposure, and we indicate the potential residual risk, then we are being consistent with standards on de-identification. (gender, birthDate, communication, generalPractitioner?). Even better if we can define a tolerance on birthDate (±30 days).
John Moehrke (Jun 02 2018 at 16:27):
Thus forbid (name, telecom, address, maritalStatus, photo, contact). Unclear to me if cds-hooks wants (link, managingOrganization, multipleBirth, deceased[x])... Lastly, forbid ALL extensions.
John Moehrke (Jun 02 2018 at 16:29):
The identifier (and id) would need to be specified as a pseudonym (e.g., a UUID assigned per transaction).
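One way to picture the anonymized Patient described in the last few messages is an allow-list filter on the requesting side. This is only a sketch: the `ALLOWED` tuple is a tentative reading of the candidate elements above rather than an agreed profile, `anonymize_patient` is a hypothetical name, and partial birth dates (year or year-month) are passed through unshifted.

```python
import random
import uuid
from datetime import date, timedelta

# Tentative allow-list; everything else (name, address, ...) is dropped.
ALLOWED = ("gender", "birthDate", "communication")
MAX_SHIFT_DAYS = 30  # the +/-30 day tolerance suggested for birthDate


def _strip_extensions(value):
    """Recursively drop extensions, including primitive '_field' siblings."""
    if isinstance(value, dict):
        return {
            k: _strip_extensions(v)
            for k, v in value.items()
            if k not in ("extension", "modifierExtension") and not k.startswith("_")
        }
    if isinstance(value, list):
        return [_strip_extensions(v) for v in value]
    return value


def anonymize_patient(patient, rng=None):
    """Return a reduced copy of a Patient dict with a fresh pseudonymous id."""
    if rng is None:
        rng = random.SystemRandom()
    # New UUID per call, i.e. per transaction; no identifier is carried over.
    out = {"resourceType": "Patient", "id": str(uuid.uuid4())}
    for key in ALLOWED:
        if key in patient:
            out[key] = _strip_extensions(patient[key])
    birth = out.get("birthDate")
    if isinstance(birth, str) and len(birth) == 10:  # full YYYY-MM-DD only
        shift = timedelta(days=rng.randint(-MAX_SHIFT_DAYS, MAX_SHIFT_DAYS))
        out["birthDate"] = (date.fromisoformat(birth) + shift).isoformat()
    return out
```

An allow-list is used instead of a deny-list so that any element nobody thought about is excluded by default.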
Lloyd McKenzie (Jun 02 2018 at 19:06):
deceasedBoolean would be relevant most of the time as most decision support only applies if it is false. Can't think of use-cases where much of the rest would matter.
Grahame Grieve (Jun 02 2018 at 19:59):
I think link would be relevant all the time
John Moehrke (Jun 02 2018 at 20:01):
In what way would link be relevant in cds-hooks, when the Patient is anonymized? In the case where it is not anonymized, then I understand the use.
Lloyd McKenzie (Jun 02 2018 at 20:20):
It would be relevant for retrieving associated other records - conditions, observations, etc.
Grahame Grieve (Jun 02 2018 at 20:21):
ok I agree with that
John Moehrke (Jun 02 2018 at 20:21):
That would mean your resource server would need to have a complete shadow copy of all data
John Moehrke (Jun 02 2018 at 20:25):
Ah, I am assuming the model where the EHR passes the data to the cds-hooks. I might be out-of-date, as there is the model where the cds-hooks calls back to the EHR to retrieve the data. I guess in this case there would need to be a shadow copy of the patient data that is only associated with the pseudo Patient resource copy. This callback model exposes more data, thus increasing risk of re-identification...
John Moehrke (Jun 02 2018 at 20:43):
The richest and most pervasive quasi-identifier is the set of dates found in all FHIR Resources. The more Resources, the richer the date cloud for that pseudo-Patient. If someone knows key dates on which patient Paul has sought care, they can correlate the dates they know with the dates found in a pseudo-Patient to determine that the pseudo-Patient is highly likely to be Paul. -- The shadow copy of the data would need to add some random offset to all dates, yet not invalidate the clinical significance. This is usually implemented as a single random offset for that pseudo-Patient that is applied consistently to all dates. Interrelationships are maintained, but the absolute date can't be determined. The most at-risk patients are the well-known (VIP) or the target of a motivated attacker. -- Thus giving full access to all the data on a given pseudo-Patient, vs. a subset, increases risk; hence my assertion that the cds-hook calling back to the EHR is more dangerous than the EHR giving only select resources to the cds-hooks.
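The single-offset approach described here could look roughly like the following. It is a sketch under several assumptions: `shift_dates` and `new_offset` are hypothetical names, the one-year range is arbitrary, any string that looks like a full date or dateTime is treated as one, and partial dates (YYYY, YYYY-MM) are left alone. Shifting by whole days keeps intervals and times of day intact.

```python
import random
import re
from datetime import date, timedelta

# Matches a full date, optionally followed by a time part that is kept as-is.
_DATE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2})(T.+)?$")


def new_offset(max_days=365):
    """Pick one secret offset (in days) for a pseudo-Patient."""
    return random.SystemRandom().randint(-max_days, max_days)


def shift_dates(value, offset_days):
    """Recursively shift every full date/dateTime string by offset_days."""
    if isinstance(value, dict):
        return {k: shift_dates(v, offset_days) for k, v in value.items()}
    if isinstance(value, list):
        return [shift_dates(v, offset_days) for v in value]
    if isinstance(value, str):
        match = _DATE_RE.match(value)
        if match:
            try:
                day = date.fromisoformat(match.group(1))
            except ValueError:
                return value  # looked like a date but was not a valid one
            shifted = day + timedelta(days=offset_days)
            return shifted.isoformat() + (match.group(2) or "")
    return value
```

Applying the same `offset_days` to every resource handed out for one pseudo-Patient preserves the spacing between events while hiding the absolute dates.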
Lloyd McKenzie (Jun 02 2018 at 21:26):
One thing you could do is have the random offset be for a specific hook invocation. So long as there are constraints on the queries to require a reasonable degree of filtering (specific lab tests rather than all lab tests, for example), then the hook can't grab too much. If they see the same patient again, it'll have a different id and a different date offset.
John Moehrke (Jun 02 2018 at 21:41):
yes. The hard part is that the offset needs to be consistent for that whole pseudo-patient without exposing what the offset actually is. There is fun design work to protect privacy -- Privacy by Design...
Last updated: Apr 12 2022 at 19:14 UTC