FHIR Chat · How to represent fictitious, invalid or suspect data

Stream: implementers

Topic: How to represent fictitious, invalid or suspect data

Bruce Tietjen (Feb 14 2017 at 20:14):

We are trying to determine how to represent Patient field data (in any field) that should be considered fictitious, invalid, or at least suspect. The data must be kept, but marked in some way so that current or future processing can treat it as suspect and handle it appropriately.

Examples include, but are not limited to: fictitious names (John Doe, Mickey Mouse, Trauma Trauma); invalid values (a U.S. SSN of 000-00-0000, a DOB of Jan 1, 1800 or a date years in the future, a phone number of 000-000-0000); and inconsistent combinations such as a death date before the birth date.
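A minimal sketch of how the examples above might be detected before any marking takes place. The flat dictionary layout, the field names, and the function `find_suspect_fields` are hypothetical and chosen for illustration only; they are not FHIR structures, and real rules would be site-specific.

```python
import re
from datetime import date

# Placeholder names taken from the examples in the question.
PLACEHOLDER_NAMES = {"john doe", "mickey mouse", "trauma trauma"}

def find_suspect_fields(patient: dict, today: date) -> list:
    """Return the names of fields whose values look like placeholders."""
    suspect = []
    if patient.get("name", "").strip().lower() in PLACEHOLDER_NAMES:
        suspect.append("name")
    # An SSN or phone number made up entirely of zeros.
    for field in ("ssn", "phone"):
        digits = re.sub(r"\D", "", patient.get(field, ""))
        if digits and set(digits) == {"0"}:
            suspect.append(field)
    birth = patient.get("birthDate")
    death = patient.get("deceasedDate")
    # A birth date in 1800 or earlier, or in the future.
    if birth is not None and (birth.year <= 1800 or birth > today):
        suspect.append("birthDate")
    # A death date that precedes the birth date.
    if birth is not None and death is not None and death < birth:
        suspect.append("deceasedDate")
    return suspect
```

The reference date is passed in rather than read from the clock so that the check is repeatable.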

A few fields currently define a 'use' that could be set to 'temp', but most fields do not. Is this best handled by an extension to mark the fields as such (and if so, is there an already defined extension) or is there something already defined?

Grahame Grieve (Feb 14 2017 at 20:21):

sounds like an extension, and I don't see one already defined
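A minimal sketch of what an element-level extension could look like in FHIR JSON, assuming a locally defined extension. The URL `http://example.org/fhir/StructureDefinition/data-suspect`, the use of `valueString` for the reason, and the helper `mark_primitive_suspect` are all hypothetical; as noted above, no such extension was defined.

```python
import copy

# Hypothetical extension URL; not defined by the FHIR specification.
SUSPECT_URL = "http://example.org/fhir/StructureDefinition/data-suspect"

def mark_primitive_suspect(resource: dict, field: str, reason: str) -> dict:
    """Return a copy of the resource with the extension attached to a primitive field."""
    marked = copy.deepcopy(resource)
    # In FHIR JSON, extensions on a primitive element live in a sibling
    # property whose name is the element name prefixed with "_".
    sibling = marked.setdefault("_" + field, {})
    sibling.setdefault("extension", []).append(
        {"url": SUSPECT_URL, "valueString": reason}
    )
    return marked

patient = {"resourceType": "Patient", "birthDate": "1800-01-01"}
marked = mark_primitive_suspect(patient, "birthDate", "placeholder date")
```

The original value stays in place and is taken at face value; the extension only adds the doubt. This helper covers single-valued primitives only; repeating primitives and complex types such as HumanName would need their own handling.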

Grahame Grieve (Feb 14 2017 at 20:22):

you could also use provenance to point at the information and say its provenance is [??], but I don't see language in the provenance resource for indicating that information itself is suspect

Lloyd McKenzie (Feb 14 2017 at 20:54):

@Grahame Grieve Would this not need a modifier extension? The semantics seem a lot like "entered in error" or other modifiers. Saying "this isn't real data" certainly ought to change how systems would process or interpret it. (E.g. not looking for/enforcing a match on the SSN or birthDate). If it is a modifier, then we'd have to do something funky like have a modifier extension with pointers to the "id" elements of data that's invalid/suspect because we can't actually put modifiers on data types.
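A rough sketch of the "modifier extension with pointers" idea, under stated assumptions: the extension URL and the choice of `valueString` to carry the target element id are hypothetical, and `suspect_element_ids` is an illustrative helper, not part of any FHIR library.

```python
# Hypothetical URL; not a defined FHIR extension.
SUSPECT_ELEMENTS_URL = "http://example.org/fhir/StructureDefinition/suspect-elements"

patient = {
    "resourceType": "Patient",
    # Resource-level modifier extensions, each pointing at an element id below.
    "modifierExtension": [
        {"url": SUSPECT_ELEMENTS_URL, "valueString": "ssn"},
        {"url": SUSPECT_ELEMENTS_URL, "valueString": "dob"},
    ],
    "identifier": [
        {"id": "ssn", "system": "http://hl7.org/fhir/sid/us-ssn", "value": "000-00-0000"}
    ],
    "birthDate": "1800-01-01",
    # The id of a primitive element is carried in its "_"-prefixed sibling.
    "_birthDate": {"id": "dob"},
}

def suspect_element_ids(resource: dict) -> set:
    """Collect the element ids that the modifier extensions point at."""
    return {
        ext["valueString"]
        for ext in resource.get("modifierExtension", [])
        if ext.get("url") == SUSPECT_ELEMENTS_URL and "valueString" in ext
    }
```

A consumer would have to resolve each id against the resource before deciding whether to match on the SSN or birthDate, which is the indirection described above as "funky".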

Grahame Grieve (Feb 14 2017 at 20:56):

I don't think it's a modifier. It doesn't change the meaning of the element that contains it; it just indicates that you have some reason to doubt the truth of the data.

Grahame Grieve (Feb 14 2017 at 20:57):

if it's necessary to keep it, you have to take it at face value

John Moehrke (Feb 14 2017 at 20:59):

We have GF#10580, where we are asked to clarify one or more methods for this. We didn't get to that prior to the deadline for STU3, so we will pick up the work next month.

Grahame Grieve (Feb 14 2017 at 21:00):

k thx

John Moehrke (Feb 14 2017 at 21:01):

We are approaching it as a way to tag a patient record, or Resource record... not elements within an otherwise valid Resource.

Grahame Grieve (Feb 14 2017 at 21:04):

might need a finer granularity

John Moehrke (Feb 14 2017 at 21:22):

What is the use-case for finer granularity? I don't understand a Resource that is partially valid, partially made-up...

Grahame Grieve (Feb 14 2017 at 22:19):

indicating that you believe that the patient's supplied birthdate is suspicious doesn't mean that you think that the existence of the patient is under debate

John Moehrke (Feb 15 2017 at 01:53):

understood. That is indeed not the topic of our work item. Is this something that current systems do? I understand the need, an integrity evaluation on the element level, but it seems unusual. We do have the codes. Just not the method of applying the code to an element within a Resource. http://build.fhir.org/v3/SecurityIntegrityObservationValue/vs.html
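A minimal sketch of the resource-level approach: attaching an integrity code as a security label in `meta.security`. The system URI and the example code `UNCERTREL` are assumptions here and should be checked against the value set linked above; `tag_resource` is a hypothetical helper.

```python
import copy

# Assumed system URI for the integrity codes; verify against the linked value set.
INTEGRITY_SYSTEM = "http://terminology.hl7.org/CodeSystem/v3-ObservationValue"

def tag_resource(resource: dict, code: str) -> dict:
    """Return a copy of the resource carrying the code as a security label."""
    tagged = copy.deepcopy(resource)
    labels = tagged.setdefault("meta", {}).setdefault("security", [])
    coding = {"system": INTEGRITY_SYSTEM, "code": code}
    # Avoid adding the same label twice.
    if coding not in labels:
        labels.append(coding)
    return tagged

patient = {"resourceType": "Patient", "birthDate": "1800-01-01"}
tagged = tag_resource(patient, "UNCERTREL")  # assumed code for uncertain reliability
```

The label applies to the whole resource, which is exactly the granularity limit raised above: it cannot say that only the birth date is in doubt.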

Grahame Grieve (Feb 15 2017 at 02:01):

In Australia, I've seen systems that explicitly track the uncertainty of parts of the birthdate. In fact, there's a standard for that (AS 5017). But it's rare even then. And I've otherwise seen expressions of disbelief (Munchausen's, for instance) done in text

Jose Costa Teixeira (Feb 15 2017 at 07:23):

do we need to inform that some of the elements (patient's name, SSN...) are pseudonymized / masked somehow? or that applies to the whole resource?

John Moehrke (Feb 15 2017 at 12:59):

Jose, I was presuming that if a record is pseudonymized, that the record is marked as such. To indicate which elements have been 'modified' is to give away knowledge that can be used to attack the pseudonymization protection. We do have a tag to mark a Resource as having been pseudonymized. That is in our scope for making more clear in the coming months.

Jose Costa Teixeira (Feb 15 2017 at 13:12):

Thanks. So it applies to the whole resource. I was poking to see if there was a need to mark which of the fields are impacted by pseudonymization (thinking of some break the glass scenario).

Michelle (Moseman) Miller (Feb 15 2017 at 15:42):

For pseudo-anonymized data, such as a masked Social Security Number, the extension http://build.fhir.org/extension-rendered-value.html should be used (this extension was created in response to GF#8665)

John Moehrke (Feb 15 2017 at 16:39):

Michelle, that is an interesting extension... but from a Privacy or Security perspective it is not useful. As I understand it, this is a string to be displayed, rather than the value it is associated with. Which tells me (assumption) that the value is still the full, accurate identifier? This is 'obscurity', not 'security'.

Grahame Grieve (Feb 15 2017 at 18:07):

obscurity is not the same as security, but it's useful from a security perspective even still

John Moehrke (Feb 15 2017 at 18:40):

Grahame, I understand the need for this extension. There is a need to define a display string for an element that has been de-identified, as the de-identified element often is expressly not displayable. This is especially true of non-string elements (like dates and numbers), which is what this extension explains. I just want the reader to understand what it is useful for, and what it is not useful for. I simply seek to explain that obscurity is not a solution for security or privacy. Obscurity does not provide any security; the underlying value really does need to be redacted. What I want to prevent is simply putting this extension over non-redacted values.

Lloyd McKenzie (Feb 15 2017 at 18:48):

@John Moehrke You have a choice about whether the unobscured field is in the instance or not - it certainly doesn't have to be.
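A minimal sketch of the two options for the rendered-value extension on an identifier. The URL `http://hl7.org/fhir/StructureDefinition/rendered-value` is taken to be the canonical form of the extension linked earlier and should be treated as an assumption; the helper `masked_identifier` and its masking rule are hypothetical.

```python
# Assumed canonical URL of the rendered-value extension.
RENDERED_VALUE_URL = "http://hl7.org/fhir/StructureDefinition/rendered-value"

def masked_identifier(system: str, value: str, keep_value: bool) -> dict:
    """Build an Identifier whose value carries a masked display string."""
    # Illustrative mask: hide every digit except those in the last four characters.
    rendered = "".join("*" if ch.isdigit() else ch for ch in value[:-4]) + value[-4:]
    identifier = {
        "system": system,
        # The extension sits on the primitive "value" via its "_value" sibling.
        "_value": {
            "extension": [{"url": RENDERED_VALUE_URL, "valueString": rendered}]
        },
    }
    if keep_value:
        # The full value travels with the instance: obscured for display only.
        identifier["value"] = value
    return identifier

ssn_system = "http://hl7.org/fhir/sid/us-ssn"
display_only = masked_identifier(ssn_system, "123-45-6789", keep_value=True)
redacted = masked_identifier(ssn_system, "123-45-6789", keep_value=False)
```

With `keep_value=True` the instance still contains the real identifier, so the mask protects only what is shown on screen. With `keep_value=False` the value is absent and only the display string remains, although this particular mask still exposes the last four digits.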

John Moehrke (Feb 15 2017 at 18:48):

Would it be so bad to include guidance?

Lloyd McKenzie (Feb 15 2017 at 20:06):

There are use-cases for including and excluding. If the guidance acknowledges both sets of use-cases, guide away :)

Last updated: Apr 12 2022 at 19:14 UTC
