Stream: argonaut
Topic: US Core: Extensible and Required bindings for historical data
Cooper Thompson (Oct 28 2019 at 21:13):
This was discussed a bit in this other topic, but I thought it deserved its own topic. US Core defines some extensible and required valueset bindings. For example, DiagnosticReport (labs) binds the code property to LOINC, presumably due to USCDI. However, this is a problem when dealing with historical or 3rd-party (e.g. payor) provided data. For example, we might have lab results in our system where we only have a CPT code (i.e. we got the lab data from a payer). Per the definition of extensible, that means we cannot send the CPT code we have in our system if there is an applicable concept for that same thing in the bound valueset. Similarly if the lab was performed in some other country that doesn't use LOINC for their labs, etc. For a few data types, including labs specifically, we stamp the codes on the record based on what we get from the source system (e.g. the lab). That "stamping" may have happened 10 years ago or more, and the data may be from a lab system that has been replaced.
I totally understand the desire to have data coded in a consistent terminology. But it seems like the basic concepts of required and extensible bindings don't fit well with the real world of historical data, patient-provided data, 3rd-party data, international data exchange, and end user behavior, all of which can contribute to data that may be coded in a different code system than desired by a particular IG.
So I'm struggling with how to reconcile the valueset requirements from the US Core spec (and really any IG) with the real-world limitations of the *data* that HIT systems have. While our EHR system can handle data with the desired terminologies, when we aren't the source of the data there isn't much we can do. And specifically for 21st Century Cures (which we expect will include options for "real world testing" for certification), if real world data is coded in such a way that is incompatible with the US Core valueset binding requirements, what should we do?
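To make the problem concrete, here is a minimal sketch of the data shape Cooper describes (the system URLs are the standard FHIR identifiers for CPT and LOINC; the specific code value is an illustrative example, not taken from the thread): a lab result that carries only a CPT code, checked against an extensible binding to LOINC.

```python
LOINC = "http://loinc.org"

# The DiagnosticReport.code as received from a payer claim: CPT only,
# no LOINC coding was ever captured at the source.
report_code = {
    "coding": [
        {"system": "http://www.ama-assn.org/go/cpt", "code": "80053",
         "display": "Comprehensive metabolic panel"}
    ]
}

def uses_bound_system(codeable_concept, bound_system):
    """True if any coding in the concept comes from the bound code system."""
    return any(c.get("system") == bound_system
               for c in codeable_concept.get("coding", []))

# An extensible binding only permits an out-of-value-set code when no
# applicable in-value-set concept exists -- something a validator cannot
# decide from the instance alone.
print(uses_bound_system(report_code, LOINC))  # False
```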
Cooper Thompson (Oct 28 2019 at 21:14):
@Danielle Friend
Cooper Thompson (Oct 28 2019 at 21:20):
Also, do ONC or Inferno folks have input on how real-world testing will interact with (for example) international data? I.e. how strict will the terminology validation be? Will all data for a select patient need to pass terminology validation, regardless of its provenance? Or at least one record for each data class? I'd suggest the latter, as that would mean real world systems could actually pass :grinning_face_with_smiling_eyes: .
Rob Hausam (Oct 28 2019 at 22:14):
I think we definitely still need clarification on the rules for this. Limiting (if possible) testing and certification requirements to current "locally owned" data could help with that aspect, but if sending this type of historical or 3rd party data is still technically non-compliant with the specification, then that is still a problem.
Grahame Grieve (Oct 29 2019 at 05:47):
with regard to international data - other countries will write their own coding requirements. US coding rules won't apply outside USA. Or are you asking about international data in US context?
Grahame Grieve (Oct 29 2019 at 05:48):
More generally, what I propose on this is that we change the bindings in US Core to allow for what terminologies are known to occur in the real world, and define an extension that expresses the regulatory rules for new data, with an extensible binding that says what policy says
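A rough illustration of the extension Grahame proposes here (the extension URL, value set URLs, and structure are hypothetical, invented for this sketch; nothing like this is defined in the thread): the element's own binding is loose enough to admit real-world data, while a nested extension records the stricter rule that applies to newly created data.

```python
element_binding = {
    "strength": "extensible",  # what is actually valid on the interface
    "valueSet": "http://example.org/ValueSet/any-lab-code",
    "extension": [{
        "url": "http://example.org/StructureDefinition/new-data-binding",
        "extension": [
            {"url": "strength", "valueCode": "required"},
            {"url": "valueSet",
             "valueCanonical": "http://example.org/ValueSet/regulated-lab-codes"},
        ],
    }],
}

def new_data_binding(binding):
    """Extract the hypothetical 'binding for new data' from the extension."""
    for ext in binding.get("extension", []):
        if ext["url"].endswith("new-data-binding"):
            parts = {}
            for part in ext["extension"]:
                value = next(v for k, v in part.items() if k.startswith("value"))
                parts[part["url"]] = value
            return parts
    return None

print(new_data_binding(element_binding))
# {'strength': 'required', 'valueSet': 'http://example.org/ValueSet/regulated-lab-codes'}
```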
Josh Mandel (Oct 29 2019 at 13:11):
When you say "define an extension," what do you have in mind?
Josh Mandel (Oct 29 2019 at 13:12):
For me, real-world testing isn't a process with a binary outcome; it's a way for organizations to see "how are we doing overall" -- with the idea that once you can measure, you're inclined to think about opportunities to improve.
Josh Mandel (Oct 29 2019 at 13:15):
To be useful in the long run, you'd want to reveal things like, "for patients we send to lab X, or specialist Y, we see 80% have improperly coded procedures, vs our baseline of 20%".
Lloyd McKenzie (Oct 29 2019 at 13:20):
Should we introduce a new type of binding called "best practice" where you get a warning if you don't follow it, but you're not non-conformant if the code you send is expressible in the value set but not mapped?
John Moehrke (Oct 29 2019 at 13:38):
IETF has normative words for that documented in IETF RFC 6919 " Further Key Words for Use in RFCs to Indicate Requirement Levels"
The key words "MUST (BUT WE KNOW YOU WON'T)", "SHOULD CONSIDER",
"REALLY SHOULD NOT", "OUGHT TO", "WOULD PROBABLY", "MAY WISH TO",
"COULD", "POSSIBLE", and "MIGHT" in this document are to be
interpreted as described in RFC 6919.
an April 1st RFC -- April Fools
Cooper Thompson (Oct 29 2019 at 16:09):
@Grahame Grieve The international use case I'm thinking of is something like this:
1) A US patient is traveling overseas, receives care in some other country.
2) A CDA (for example) is sent from that other country back to the patient's home health system in the US.
3) Data from that international treatment is reconciled into the US-based system.
4) Patient then accesses their data from their US provider via FHIR / US Core, where some of the data the US system has originated from outside the US.
Step #3 is a little hand-wavy, since I don't think the international codes are normally retained. I mostly meant this to illustrate that data capture can be all over the map (pun intended).
Brett Marquard (Oct 29 2019 at 16:55):
The idea that ONC will invent a way to test extensible for historical data when Crucible/HL7 validators can't even enforce it seems cruel and unusual
Eric Haas (Oct 29 2019 at 17:59):
So Cooper what are you suggesting? Give up on standard terminology - on the basis that there will always be legacy data or some outliers?
Brett Marquard (Oct 29 2019 at 18:43):
nah, I think Cooper is saying be realistic that legacy data exists
Grahame Grieve (Oct 29 2019 at 19:13):
The idea that ONC will invent a way to test extensible for historical data when Crucible/HL7 validators can't even enforce it seems cruel and unusual
The idea that the documentation won't say what actually happens, because of a combination of the confusion around extensible per se and the confusion around regulation for current practices and catering for legacy data, is also cruel and unusual
Grahame Grieve (Oct 29 2019 at 19:14):
@Cooper Thompson that's certainly a possible scenario in the future but
A CDA (for example) is sent from that other country back to the patient's home health system in the US
could only be via IPS right now? (either CDA or FHIR, but I predict that outside Europe, the FHIR variant will be more likely to get traction)
Eric Haas (Oct 29 2019 at 19:41):
(deleted)
Cooper Thompson (Oct 29 2019 at 19:44):
@Eric Haas I'm saying that we need a binding that both communicates what we *hope* is true now and in the future, but also allows for what was true in the past. I don't know that I see the current definition of extensible being practically useful in many systems, so it might be that we just update that definition to allow for messy data.
Eric Haas (Oct 29 2019 at 19:45):
the confusion around extensible
I don't think there is confusion around what happens in the real world. you may get a warning. There is no team of elves in the back room checking to see if the concepts overlap. so no harm no foul.
Grahame Grieve (Oct 29 2019 at 19:46):
right. that's the confusion right there
Grahame Grieve (Oct 29 2019 at 19:46):
the definition of extensible is precise and has real meaning. What is going on in real life is something different.
Eric Haas (Oct 29 2019 at 19:47):
the only confusion is the disconnect with reality and the theory
Grahame Grieve (Oct 29 2019 at 19:47):
Saying that the difference doesn't matter because the definition of extensible isn't computably enforceable is exactly my problem
Eric Haas (Oct 29 2019 at 19:47):
they are just words that are unenforceable.
Grahame Grieve (Oct 29 2019 at 19:48):
that is wrong. They may be unenforceable in a validator, but that doesn't make them just words. We should mean what we say, and say what we mean.
Eric Haas (Oct 29 2019 at 19:50):
I am hearing what we want is the definition to match the reality
Grahame Grieve (Oct 29 2019 at 19:50):
I think that, yes. But not by changing the meaning of 'extensible' but by reworking the way we define the binding
Josh Mandel (Oct 29 2019 at 20:41):
From my perspective, real world testing should be able to demonstrate a percentage of failures, and should be able to show that those failures are due to legacy data. Maybe even able to show those legacy failures are shrinking year over year. I don't know that we need to bend our conformance expectations to say legacy data meets them.
Grahame Grieve (Oct 29 2019 at 20:42):
I don't understand this perspective. Is legacy data wrong data?
Josh Mandel (Oct 29 2019 at 20:42):
Well, it's data that might not meet all the validation requirements of newly generated data.
Josh Mandel (Oct 29 2019 at 20:43):
We should have a way to talk about those elevated validation requirements for new data -- otherwise all the same exceptions and laxities that apply to legacy data will also apply to new data
Josh Mandel (Oct 29 2019 at 20:43):
The question is how we can improve over time and demonstrate that improvement
Grahame Grieve (Oct 29 2019 at 20:43):
that's what I actually proposed: the binding should describe what data is valid on the interface, and we should have an extension to describe what binding applies for new data
Grahame Grieve (Oct 29 2019 at 20:45):
or, alternatively, 2 different profiles.
Lloyd McKenzie (Oct 29 2019 at 20:50):
The definition of 'extensible' is entirely enforceable. It requires human review, but that doesn't mean it's not enforceable. What's desired here has nothing to do with extensible. Instead it's saying "this is the value set that should be used, but we recognize that due to legacy reasons that won't always happen". That too, is pretty useless from a conformance testing perspective. The real question is "What, based on the regulation, is allowed to happen in real systems?" Ideally, that would take into account the real world, but not always. And if regulation doesn't take into account the real world, we then need to decide what happens when the real world and regulation don't agree - do we send non-compliant data and accept that it's not compliant or do we not send the data?
Cooper Thompson (Oct 29 2019 at 20:51):
@Josh Mandel I don't think it is just legacy data. These are the data categories I can think of that expose this same issue:
- Historical data
- Patient-provided data
- 3rd-party data (e.g. data from paid claims from payers, or uncoded data from an HIE)
- Data from international domains
- Data entered by non-standard end user workflow (e.g. scanned or manually transcribed lab reports)
- etc.
Grahame Grieve (Oct 29 2019 at 20:54):
so... either we decide, based on regulation, that such data is banned from being shared on the interface (yuck) or we agree that the data is shared anyway. In which case, the specification should say that that's what will happen
Lloyd McKenzie (Oct 29 2019 at 21:04):
And then the technical spec should have a binding declaration that reflects the type of validation that should happen - which I think is that any code at all is permitted, but if you don't have a code from the regulated value set, you get a warning. This would be similar to what happens with Extensible, but the difference is that what extensible is saying with the warning is "have a human look at this and if your concept falls within any of the codes in the value set, you're not conformant". What we actually want is a warning that says "if this data is 'current' data that didn't arrive via one of the 'exception' data sources, you are non-conformant"
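A sketch of the validation behaviour Lloyd describes (the function, codes, and severity labels are invented for illustration): a code outside the regulated value set is an error for current, locally created data, only informational for data from an exception source, and a warning when provenance is unknown, which is the usual case for a validator looking at a single instance.

```python
REGULATED_CODES = {"2345-7", "2093-3"}  # stand-in for the bound value set

def check_code(code, from_exception_source=None):
    """Return an issue severity for the code, or None if it is fine.

    from_exception_source is None when provenance is unknown.
    """
    if code in REGULATED_CODES:
        return None
    if from_exception_source is None:
        return "warning"      # "only valid if this is 'exception' data"
    if from_exception_source:
        return "information"  # allowed: legacy / third-party / etc.
    return "error"            # current data must use the regulated codes

print(check_code("2345-7"))        # None: in the value set
print(check_code("80053"))         # warning: provenance unknown
print(check_code("80053", True))   # information
print(check_code("80053", False))  # error
```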
Lloyd McKenzie (Oct 29 2019 at 21:05):
Which is why I'm wondering whether the solution shouldn't be a new binding type
Rob Hausam (Oct 29 2019 at 21:16):
I don't necessarily disagree with adding a new binding type. But we don't want them to proliferate (and if we do this, I don't know whether this would be the only one we would ever add). And if we add a new one that matches (somehow) what systems actually need in the real world, will the 'extensible' binding (which we've decided doesn't quite do what's needed in the "real" world) just continue to be there but mostly sit on the shelf and gather dust (once the specs have a chance to be updated)? That doesn't seem like a very satisfactory outcome (if I'm not mischaracterizing it).
Eric Haas (Oct 29 2019 at 21:16):
I think Josh's point is more that with the IG we are setting a bar for the testers and the testees. Real world usage may not reflect that because the real world is messier.
Lloyd McKenzie (Oct 29 2019 at 21:41):
In the real world, sometimes systems don't conform. However, we shouldn't create specifications where that's an expected outcome for most/all systems. If we do that, then the specification becomes next to useless. The whole point of a specification is to set expectations for what participants will do and can expect. If we write a specification that no one expects to adhere to and that SMART apps and others can't rely on, then we haven't accomplished anything useful.
Lloyd McKenzie (Oct 29 2019 at 21:44):
@Rob Hausam I disagree. Extensible has a purpose. The purpose is where the value set does not include high level generic codes that cover the space and where there's a need to require consistent use of standard codes, but allow wiggle room for other codes. As an example, the OperationOutcome codes. Those are not regulatorily driven. But we absolutely expect systems to use the defined codes if one applies. We also allow for completely custom codes if there are issues that come up that fall outside the high-level codes we've identified.
Rob Hausam (Oct 29 2019 at 21:58):
@Lloyd McKenzie If we identify and explain (as you did) where we expect 'extensible' to continue to be applicable, then no problem. I assume you probably would agree that binding types shouldn't proliferate?
John Moehrke (Oct 29 2019 at 22:19):
seems the problem is a theory at this point. Reality is that data are messy; that is not the theory. The theory is that the regulation will deem a system non-compliant simply because that system produces a small number of non-compliant data. In other settings a system is evaluated by its ability to deliver compliant data, and the data that are not compliant are seen as exceptional situations where those data themselves are seen as non-compliant. So this needs a regulatory expectation of reality, and everyone to stop thinking that there are only two states (compliant vs non-compliant).
Grahame Grieve (Oct 29 2019 at 22:47):
there's a methodology issue here - discussion at https://chat.fhir.org/#narrow/stream/211987-methodology/topic/Multiple.20Bindings
Grahame Grieve (Oct 29 2019 at 22:48):
But it seems we have general agreement on the overall picture:
- regulation says to use a particular code system in a particular context
- but there's data on the API that comes from other contexts where the regulation doesn't apply (particularly, but not limited to, legacy data)
Grahame Grieve (Oct 29 2019 at 22:49):
- we want the specification to describe the situation clearly so that people know what is going on
- we want conformance testing to... encourage.. the use of the correct codings. but we don't want conformance testing to call valid data from the other use cases non-conformant
Grahame Grieve (Oct 29 2019 at 22:50):
it seems the last point is the one where contention arises....
Josh Mandel (Oct 30 2019 at 00:52):
I don't think it is just legacy data. These are the data categories I can think of that expose this same issue:
These are all great examples @Cooper Thompson -- and I didn't mean to single out legacy data. My point is that whenever we define conformance expectations that target a "good level of consistency" for interop, we shouldn't assume that all data will meet this level in the real world. Scorecard style, we should be able to track quality across a variety of situations, and less than 100% conformance is a fact of life. (And this might be achieved with multiple profiles as Grahame suggests... I'm not entirely sure what that would look like.)
Lloyd McKenzie (Oct 30 2019 at 02:10):
@Grahame Grieve I'm not sure it's terribly clear that regulation doesn't apply in those situations, but I think there's consensus that it probably shouldn't apply - at least in most of them.
@John Moehrke Compliance is generally a binary thing. Either you're fully compliant or you're not. It's possible to itemize the areas of non-compliance, but if you're non-compliant anywhere, you're non-compliant. That's how the process works. And in most industries, non-compliance of any sort is a fail. (Try telling an electrical inspector that your wiring is "mostly compliant" :>) I don't think we want to change the standard meaning of the word 'compliance' here. What we do want to do is to ensure that the specifications are clearly defining the expectations we have for compliant systems. Which means that if it's totally fine for legacy data to have a non-SNOMED code, then the validation process shouldn't identify the presence of a non-SNOMED code as an error. If the expectation is that all 'new' data is SNOMED and 'old' data doesn't have to be and the validator has no clue what's new vs. not, then the validator should spit out a warning when it sees non-SNOMED and say "this is only valid if it's 'old' data" - and leave it to human validation to evaluate whether the instance is conformant or not. However, it should always be possible to tell whether the system is conformant if you dig deep enough. And if the system is non-conformant in a single instance, then it's non-conformant. (What action regulators take in response to non-conformance and whether they have a sliding scale of ramifications based on 'degree' of non-conformance is up to the regulations.)
Josh Mandel (Oct 30 2019 at 02:18):
Lloyd, I'm assuming you meant to mention me rather than John in your comments above. I'm pretty sure I disagree with your perspective on this; healthcare data aren't all going to be "valid" but we can't let that prevent us from defining good validation expectations. It's absolutely not a binary proposition when you talk about all your data, across populations and organizations. What, you're saying one badly coded lab means your whole system is a failure? That's silly.
Josh Mandel (Oct 30 2019 at 02:18):
We need to be realistic when we're assessing real-world data.
Lloyd McKenzie (Oct 30 2019 at 02:23):
I did mean to mention John. He was arguing that compliance isn't binary. I'm saying it must be. However, compliance should take into account reasonable expectations. When we say that Patient.gender must be male, female, other or unknown, that's a firm requirement. If you don't follow that, you're non-conformant - full stop. And having that as a clear rule is important. The rules around condition and procedure codes should be no less clear - but they may well be much less strict. If the best we can do is a 'SHOULD', then that's what we should do. If we can do a conditional SHALL, then that's even better (so long as the condition is actually testable). If the conditional SHALL is violated, then the instance is non-conformant.
Lloyd McKenzie (Oct 30 2019 at 02:50):
Our reasonable expectations for Patient.gender are that all systems will translate what they have to one of the allowed 4 values or that they'll omit the element. If they want to convey additional detail, they'll do that with an extension or Observation. Some systems won't meet those expectations. Validators will find them non-conformant and that's an appropriate label for them to have.
Our challenge with some of the elements in U.S. Core is that there was never super-clear discussion about exactly what would constitute a "conformant" vs. a "non-conformant" instance. Without clarity in the requirements, we can't have clarity in the specs.
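A minimal check of the required binding Lloyd uses as his example (the helper function is invented; the four codes are the FHIR administrative-gender codes he lists): any value outside the allowed set makes the instance non-conformant, while omitting the element entirely is fine.

```python
ALLOWED_GENDER = {"male", "female", "other", "unknown"}

def gender_conformant(patient):
    """Required binding: a present value must come from the value set."""
    gender = patient.get("gender")
    return gender is None or gender in ALLOWED_GENDER

print(gender_conformant({"gender": "female"}))  # True
print(gender_conformant({"gender": "F"}))       # False: non-conformant
print(gender_conformant({}))                    # True: element omitted
```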
Richard Townley-O'Neill (Oct 30 2019 at 04:41):
@Josh Mandel
(And this might be achieved with multiple profiles as Grahame suggests... I'm not entirely sure what that would look like.)
One way is to replace a single profile (e.g. us-core-allergyintolerance) with two, one with the same bindings (strict) and one with weaker bindings (lax). Then requirements statements say you SHOULD conform to the strict one and SHALL conform to the lax one.
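A toy rendering of this two-profile idea (the profile contents and codes are invented): conformance testing can report the SHALL (lax) and SHOULD (strict) results separately, so legacy data fails only the aspirational bar rather than the whole certification.

```python
STRICT = {"2345-7", "2093-3"}          # regulated codes only
LAX = STRICT | {"80053", "LOCAL-42"}   # also admits known legacy codings

def conforms(codes, allowed):
    """True if every code in the instance is in the allowed set."""
    return all(c in allowed for c in codes)

def assess(instance_codes):
    return {
        "shall_lax": conforms(instance_codes, LAX),
        "should_strict": conforms(instance_codes, STRICT),
    }

print(assess(["2345-7"]))  # {'shall_lax': True, 'should_strict': True}
print(assess(["80053"]))   # {'shall_lax': True, 'should_strict': False}
```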
Grahame Grieve (Oct 30 2019 at 05:47):
or a best practice profile
Cooper Thompson (Nov 06 2019 at 14:58):
I submitted GF#25184 to track this issue.
Lloyd McKenzie (Jan 22 2022 at 20:44):
@Michael Donnelly - you were going to follow up on this thread to discuss the option of using a meta.tag to flag non-conformant data. (Actual link is FHIR#25183)
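A sketch of the meta.tag idea mentioned above (the tag system and code are hypothetical): the server labels instances known not to meet the strict binding, so clients and testers can filter or score them instead of failing the whole system.

```python
legacy_report = {
    "resourceType": "DiagnosticReport",
    "meta": {"tag": [{"system": "http://example.org/tags",
                      "code": "legacy-coding",
                      "display": "Coded before current terminology rules"}]},
    "code": {"coding": [{"system": "http://www.ama-assn.org/go/cpt",
                         "code": "80053"}]},
}

def is_flagged(resource, tag_code="legacy-coding"):
    """True if the resource carries the non-conformant-data tag."""
    tags = resource.get("meta", {}).get("tag", [])
    return any(t.get("code") == tag_code for t in tags)

print(is_flagged(legacy_report))                         # True
print(is_flagged({"resourceType": "DiagnosticReport"}))  # False
```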
Cooper Thompson (Jan 24 2022 at 15:00):
FYI - this was discussed recently again in the Period Datatype Invariant thread.
Last updated: Apr 12 2022 at 19:14 UTC