Stream: inferno
Topic: BDGV-21 - Provenance inclusion (by default) in Bulk FHIR
Cooper Thompson (Jul 07 2021 at 14:25):
In the v1.0.0 version of Bulk FHIR, there isn't much said about how Provenance should be handled. Technically, only the Provenance resources that pertain to the Patient resource are in the patient compartment. Provenance resources for all other resource types are not in the patient compartment (which maybe is an issue with the FHIR definition of the patient compartment?). However, their inclusion is probably allowed because of the "other helpful resources" bit, so the compartment part doesn't matter I suppose.
Anyway, because v1.0.0 doesn't have any way to identify which set of Provenance resources you want in the output, we've backported the includeAssociatedData parameter from v1.1.0 in our implementation. This seems like the best option from an operational utility perspective, even though it does blur the lines of what version of the Bulk FHIR standard is used. However, since Inferno isn't checking anything from v1.1.0 (which is being balloted and is not SVAPed (yet)), we're stuck having to develop directly to v1.0.0, which is crappy for both our FHIR server (complexity and performance) and clients (possibly getting a ton of Provenances they don't want, if they only want the most recent, or none at all).
Given that v1.0.0 includes support for some experimental query parameters, could Inferno add a field to the Multi-Patient API tester for developers to provide additional query parameters? And if so, would providing the v1.1.0 includeAssociatedData be a violation of the spirit/intent of the g10 criteria? Given that v1.1.0 added it, I think it would be in line with the spirit of g10, but I don't know if this is the sort of thing that would require a CCG update or if Inferno could just update BDGV-21.
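For readers following along, here is a minimal sketch of what a Group-level kick-off request with the backported parameter might look like. The base URL and group id are placeholders, and `build_export_request` is a hypothetical helper, not part of Inferno or any server. The `Accept` and `Prefer` headers and the `LatestProvenanceResources` value come from the Bulk Data specification.

```python
from urllib.parse import urlencode


def build_export_request(base_url, group_id, types=None, include_associated_data=None):
    """Return (url, headers) for a Group-level $export kick-off request.

    Hypothetical helper for illustration only.
    """
    params = {}
    if types:
        # _type is a comma-delimited list of resource types
        params["_type"] = ",".join(types)
    if include_associated_data:
        # includeAssociatedData is the v1.1.0 (later 2.0.0) parameter discussed above
        params["includeAssociatedData"] = ",".join(include_associated_data)

    url = f"{base_url.rstrip('/')}/Group/{group_id}/$export"
    if params:
        # Keep commas literal so the list stays readable in the query string
        url += "?" + urlencode(params, safe=",")

    headers = {"Accept": "application/fhir+json", "Prefer": "respond-async"}
    return url, headers


# Placeholder server and group, assumed for the example
url, headers = build_export_request(
    "https://fhir.example.org/R4",
    "example-group",
    types=["Patient", "Observation"],
    include_associated_data=["LatestProvenanceResources"],
)
print(url)
```

A client that omits `include_associated_data` produces a plain v1.0.0 request, which is the case the question to Inferno is about.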
Robert Scanlon (Jul 07 2021 at 16:34):
Unfortunately I don't have a quick answer here. I'd recommend submitting formal feedback to ONC on this, just to get this started through the proper channels. Since this parameter isn't mentioned anywhere in v1, I think we'd need somewhere authoritative to point to that says it's OK to add this as a requirement to clients that want this data, and if v1.1.0 is too far off then the CCG is where it would have to be. We have avoided allowing any vendor-specific requirements placed on the clients to access data (e.g. you must add this extra header or this extra query parameter otherwise we won't give this data) for obvious reasons. I understand that this is almost standard, so calling this vendor-specific isn't quite accurate. But if I add this, will I also need to add all the other experimental fields (e.g. can some servers require the _since field because they cannot support an unbounded dump of data)? And when a v1.2 draft is created, add those as options under v1.0 testing too?
Josh Mandel (Jul 07 2021 at 17:25):
Note that the "v1.1" or "1.2" is actually what we'll be calling "Bulk Data 2.0.0" at publication time -- we're well on the way to reconciling ballot comments, but at least a couple of months away from being ready to publish.
For provenance in particular @Cooper Thompson, one way to deal with this in the near term might be to follow the V2 guidance for RelevantProvenanceResources by default when performing an export, or when performing an export that explicitly includes Provenance in the _type array. (@Dan Gottlieb, I'm subscribing you here FYSA.)
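One way to read this suggestion on the server side is sketched below. `provenance_mode` and its return labels are hypothetical names chosen for illustration. The behavior when the backported `includeAssociatedData` parameter is present is an assumption about how an implementer might combine it with the default, not something the spec or this thread prescribes.

```python
def provenance_mode(type_param=None, include_associated_data=None):
    """Decide which Provenance resources an export should carry.

    Returns one of the hypothetical labels "relevant", "latest", or "none".
    """
    # An explicit includeAssociatedData value (backported) wins when present.
    if include_associated_data:
        values = include_associated_data.split(",")
        if "LatestProvenanceResources" in values:
            return "latest"
        if "RelevantProvenanceResources" in values:
            return "relevant"

    # No _type filter: behave like RelevantProvenanceResources by default.
    if type_param is None:
        return "relevant"

    # _type filter present: include Provenance only if it is explicitly listed.
    types = type_param.split(",")
    return "relevant" if "Provenance" in types else "none"


print(provenance_mode())                         # unfiltered export
print(provenance_mode("Patient,Observation"))    # Provenance not requested
print(provenance_mode("Patient,Provenance"))     # Provenance requested via _type
```

Under this reading, a v1.0.0 client opts out of Provenance only by sending a `_type` list that omits it.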
Robert Scanlon (Jul 07 2021 at 18:55):
If the default is to behave like RelevantProvenanceResources, how would a client signal it didn't want that data? Would the server define something like _NoProvenanceResources?
Cooper Thompson (Jul 07 2021 at 19:01):
I think in 1.0.0, there isn't really any way for a client to specify which Provenance they want other than _type.
Josh Mandel (Jul 07 2021 at 19:02):
Agreed, Cooper. Precisely why we've introduced this feature in V2.
Cooper Thompson (Jul 07 2021 at 19:09):
Given that v1.0.0 says the server SHOULD include resources in the patient compartment, then the spec-compliant-but-not-really-in-the-spirit option is to only include the Provenance for the patient resource. At least for certification. We could then have the backported includeAssociatedData option for operational use.
Cooper Thompson (Jul 07 2021 at 19:10):
That has the downside of having certification behavior be out of line with real-world intended use cases, but I think because of what is missing in v1.0.0, I don't see a way around that no matter what option we do.
Josh Mandel (Jul 07 2021 at 19:20):
Are you referring to the following language from the Bulk Export spec?
For Patient- and Group-level requests, the Patient Compartment SHOULD be used as a point of reference for recommended resources to be returned. However, other resources outside of the patient compartment that are helpful in interpreting the patient data (such as Organization and Practitioner) may also be returned.
I'm not sure how this imposes a testable requirement.
Re: the suggestion to
only include the Provenance for the patient resource. At least for certification
My reading would be: to meet https://www.healthit.gov/node/133856#test_procedure you'd show the USCDI data flowing into your $export outputs. This would include USCDI provenance. Of course, it's brittle to do this by default and as your only option, which is why the V2 includeAssociatedData feature exists.
Josh Mandel (Jul 07 2021 at 19:22):
Re: BDGV-21, am I misreading the code when I infer that you could pass this with no Provenance? This seems beyond the point.
Robert Scanlon (Jul 07 2021 at 19:27):
You need to have at least one Provenance resource (and also at least one example of every Must Support element as defined in the US Core Provenance profile), otherwise the test is marked as "Skip" which we've defined as "not failed but not enough data to pass -- please use a patient record with more data".
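The pass/skip rule described here can be sketched roughly as follows. This is not Inferno's code. `MUST_SUPPORT_PATHS` is an illustrative subset chosen for the example rather than the full Must Support list from the US Core Provenance profile, and profile validation (which the real test would also perform) is left out.

```python
import json

# Illustrative subset only (assumption); consult the US Core Provenance
# profile for the actual Must Support elements.
MUST_SUPPORT_PATHS = ["target", "recorded", "agent.type", "agent.who"]


def has_path(node, path):
    """Return True if a dotted path resolves to a non-empty value, descending into lists."""
    if isinstance(node, list):
        return any(has_path(item, path) for item in node)
    head, _, rest = path.partition(".")
    if not isinstance(node, dict) or head not in node:
        return False
    value = node[head]
    if not rest:
        return value not in (None, [], {}, "")
    return has_path(value, rest)


def check_provenance_ndjson(ndjson_text):
    """Return ("pass", []) or ("skip", missing_paths) for a Provenance NDJSON payload."""
    resources = [json.loads(line) for line in ndjson_text.splitlines() if line.strip()]
    provenances = [r for r in resources if r.get("resourceType") == "Provenance"]
    if not provenances:
        # No Provenance at all: not a failure, but not enough data to pass
        return "skip", list(MUST_SUPPORT_PATHS)
    # Each element only needs to appear once across the whole set of resources
    missing = [
        path for path in MUST_SUPPORT_PATHS
        if not any(has_path(r, path) for r in provenances)
    ]
    return ("skip", missing) if missing else ("pass", [])
```

The key design point is that coverage is judged across all Provenance resources together, so no single resource has to carry every element.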
Robert Scanlon (Jul 07 2021 at 19:29):
It's a fairly low bar on the Bulk Data side. We do more thorough checking on the single patient api side (US Core reads/searches).
Josh Mandel (Jul 07 2021 at 20:25):
For the single patient API side, do you check for Provenance? For bulk, could you perhaps check that the "single patient" exported is part of the example group exported? That'd allow you to compare the breadth/depth of resources.
Josh Mandel (Jul 07 2021 at 20:25):
(From my perspective, this particular kind of Provenance is pretty low-value; I think "check that there's one" is actually just fine. I'm more thinking about how to avoid ad-hoc decisions.)
Robert Scanlon (Jul 07 2021 at 20:51):
On the single patient API side, we verify that every (applicable) US Core Resource type supports the Provenance revinclude query as required by the US Core Server Capability Statement. Each query must return at least one provenance resource that validates against the US Core Provenance profile in order for the tests to pass. Across all provenance resources provided, all Must Support elements must be present at least once.
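As a rough illustration of the single-patient check, the sketch below builds a `_revinclude=Provenance:target` search and pulls the Provenance entries out of the returned Bundle. The base URL, resource type, and patient id are placeholders, and both helper functions are hypothetical. Profile validation and Must Support coverage are not shown.

```python
from urllib.parse import urlencode


def revinclude_search_url(base_url, resource_type, patient_id):
    """Build a patient-scoped search that also asks for Provenance targeting the matches."""
    query = urlencode(
        {"patient": patient_id, "_revinclude": "Provenance:target"},
        safe=":",  # keep the colon in Provenance:target literal
    )
    return f"{base_url.rstrip('/')}/{resource_type}?{query}"


def provenance_entries(bundle):
    """Return the Provenance resources found in a search Bundle (a parsed JSON dict)."""
    return [
        entry["resource"]
        for entry in bundle.get("entry", [])
        if entry.get("resource", {}).get("resourceType") == "Provenance"
    ]


# Placeholder values, assumed for the example
print(revinclude_search_url("https://fhir.example.org/R4", "AllergyIntolerance", "123"))
```

A test harness along these lines would treat an empty `provenance_entries` result for an applicable resource type as insufficient to pass.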
Lakshmi Bhamidipati (Sep 30 2021 at 21:44):
Robert Scanlon said:
You need to have at least one Provenance resource (and also at least one example of every Must Support element as defined in the US Core Provenance profile), otherwise the test is marked as "Skip" which we've defined as "not failed but not enough data to pass -- please use a patient record with more data".
Hello @Robert Scanlon,
I am trying to figure out how to include provenance data in the bulk API response (for certification). From what I am reading here, can the bulk API server just include provenance data for Patient resources in an export group? Say if my group has 3 resources, can the Provenance ndjson just include 3 records and pass certification? Or am I supposed to get all the provenance records for all the resources by making a call to <FHIR URL>/R4/<Resource>?_id=<resourceId>&_revinclude=Provenance:target?
@Cooper Thompson - I am assuming in your question above, you are referring to "For Patient- and Group-level requests, the Patient Compartment SHOULD be used as a point of reference for recommended resources to be returned. However, other resources outside of the patient compartment that are helpful in interpreting the patient data (such as Organization and Practitioner) may also be returned."
Robert Scanlon (Oct 01 2021 at 21:17):
You would pass today's tests by providing only the provenances for the Patient resource, because we do not probe very deeply into this view of the data. But just because we aren't checking for provenance of other resource types in bulk data doesn't mean you shouldn't be providing them. In general, I think the rule intends for equivalency between the Single Patient Data API and Multi-Patient APIs from a data-coverage perspective. @Lakshmi Bhamidipati
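An implementer who wants to see how far an export is from that equivalency could list the exported resources that no Provenance points at. The sketch below is a hypothetical self-check, not something Inferno runs. It assumes unversioned references such as `Patient/p1` or an absolute URL ending in that form, and that every exported resource has an `id`.

```python
import json


def uncovered_resources(resources_ndjson, provenance_ndjson):
    """Return sorted "Type/id" keys of exported resources with no Provenance targeting them."""
    exported = set()
    for line in resources_ndjson.splitlines():
        if line.strip():
            resource = json.loads(line)
            exported.add(f"{resource['resourceType']}/{resource['id']}")

    targeted = set()
    for line in provenance_ndjson.splitlines():
        if line.strip():
            provenance = json.loads(line)
            for target in provenance.get("target", []):
                # Keep the last two path segments so absolute URLs also match.
                # Versioned references (.../_history/n) are not handled here.
                parts = target.get("reference", "").split("/")
                if len(parts) >= 2:
                    targeted.add("/".join(parts[-2:]))

    return sorted(exported - targeted)
```

An empty result means every exported resource has at least one Provenance, which is closer to what the single-patient `_revinclude` tests expect.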
Last updated: Apr 12 2022 at 19:14 UTC