Stream: bulk data
Topic: Large Volume For Patient/System Export
Nihar (Mar 07 2022 at 11:04):
Hello:
We would like guidance and suggestions on Patient/System Export for all resources. When we allow an open export query, we run into technical trouble with the large number of transactions that get picked up for export, even though we cap the file size at 100000 transactions per file.
Is there any mechanism to restrict systems to exporting a certain amount of data per request, i.e. using _typeFilter or _since or anything else that is in line with the implementation requirements or legal requirements? Please suggest.
Josh Mandel (Mar 07 2022 at 14:32):
Can you provide more context for the questions? Are you developing a bulk data API server? Are you leveraging an existing server technology and trying to configure it? Or are you wondering about the standard itself?
Nihar (Mar 07 2022 at 17:05):
Hi @Josh Mandel : We are actually leveraging an existing server technology and trying to configure it for Bulk support. We are looking at the kick-off request for all patient data across all resources, and it causes technical trouble because of the volume of data that has to be written to the ndjson files.
Josh Mandel (Mar 07 2022 at 19:08):
OK -- it sounds like you may want to reach out to whoever is providing your server technology to ask if they support these features (_since, _typeFilter, etc).
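For readers unfamiliar with the parameters mentioned here, the following is a minimal sketch of how a client might build a kick-off URL that narrows an export with `_since`, `_type`, and `_typeFilter`. The base URL `https://ehr.example.org/fhir` and the helper name `build_kickoff_url` are hypothetical, and whether a given server honors these parameters depends on that server.

```python
from urllib.parse import urlencode


def build_kickoff_url(base_url, since=None, types=None, type_filters=None):
    """Build a system-level $export kick-off URL with optional narrowing parameters.

    Hypothetical helper for illustration; it is not part of any library.
    """
    params = {}
    if since:
        # Only resources updated after this FHIR instant are requested.
        params["_since"] = since
    if types:
        # Comma-delimited list of resource types to include.
        params["_type"] = ",".join(types)
    if type_filters:
        # Comma-delimited list of FHIR search queries, one per filter.
        params["_typeFilter"] = ",".join(type_filters)
    url = base_url.rstrip("/") + "/$export"
    if params:
        url += "?" + urlencode(params)
    return url


# Headers a client sends with the kick-off request for an asynchronous export.
KICKOFF_HEADERS = {"Accept": "application/fhir+json", "Prefer": "respond-async"}

url = build_kickoff_url(
    "https://ehr.example.org/fhir",  # assumed base URL
    since="2022-01-01T00:00:00Z",
    types=["Patient", "Observation"],
    type_filters=["Observation?category=laboratory"],
)
print(url)
```

These parameters are chosen by the client. As the rest of the thread explains, a server claiming conformance cannot require clients to send them.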
Nihar (Mar 08 2022 at 06:31):
Hi @Josh Mandel : I think I made a mistake in my explanation. We are an EHR in the Oncology specialty and are trying to build the Bulk FHIR functionality. We are facing technical trouble when serving a client's bulk request for all patients and all resources, because of the volume it tries to collect from the database (70 - 100 million).
To resolve this we need your suggestions on the following queries:
- Can we mandate that other systems seeking bulk patient data from our EHR system use _typeFilter OR _since, so that we can limit the number of transactions exchanged per request? Can we declare this mandate in the server CapabilityStatement?
- If point 1 is not appropriate, is there any other way we can handle this technical concern? We need your suggestion.
Josh Mandel (Mar 08 2022 at 14:20):
Ah. No, you can't force clients to add parameters that limit their data request (at least, if you need to claim conformance to the bulk data IG).
Are you going for ONC certification? That requirement does allow you to support only group-level export rather than sitewide export. But my best advice is that you should engineer your system to work for all data (even if it's slow).
Cooper Thompson (Mar 08 2022 at 14:26):
Throttling a request like this such that it doesn't impact database performance seems very reasonable. That would have implications on the run-time for a request for 100 million patients, but if someone is requesting 100 million patients I think they should have the expectation that it would take a while (personally I'd expect days or longer).
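As an illustration of the throttling idea, here is a minimal sketch that reads resources in fixed-size batches and pauses after each full batch so the export does not monopolize the database. It is an assumption-laden toy and not how any particular vendor implements throttling: `throttled_batches`, `to_ndjson`, the batch size, and the pause length are all hypothetical.

```python
import json
import time


def throttled_batches(resources, batch_size, pause_seconds, sleep=time.sleep):
    """Yield lists of up to batch_size resources, pausing after each full batch."""
    batch = []
    for resource in resources:
        batch.append(resource)
        if len(batch) == batch_size:
            yield batch
            batch = []
            # Give the database room to serve other traffic before continuing.
            sleep(pause_seconds)
    if batch:
        # Final partial batch; no pause is needed after it.
        yield batch


def to_ndjson(batch):
    """Serialize one batch as NDJSON: one JSON object per line."""
    return "".join(json.dumps(r, separators=(",", ":")) + "\n" for r in batch)


# Usage with fake data; sleep is replaced by a recorder so the demo runs instantly.
pauses = []
fake_resources = ({"resourceType": "Observation", "id": str(i)} for i in range(5))
files = [
    to_ndjson(b)
    for b in throttled_batches(fake_resources, batch_size=2, pause_seconds=0.5, sleep=pauses.append)
]
print(len(files), "files,", len(pauses), "pauses")
```

The trade-off is the one described above: the complete data set is still exported, but the job takes longer to finish.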
Josh Mandel (Mar 08 2022 at 14:27):
I think the request was about 100M resources not 100M patients, though I may have misunderstood.
Nihar (Mar 09 2022 at 06:33):
@Josh Mandel : Yes, we are going for ONC Certification. Do you mean that ONC Certification only requires us to comply with Group Export? Would it be accepted if we don't support System Export and Patient Export? Please guide.
And understood, we cannot mandate specific filters, and we don't have any other options to deal with this. We need to configure the server/database to handle those open requests in order to conform with the Bulk Data IG.
Also, the request is for 100M resources and not 100M patients, but even 100M resources across all the patients is causing database/server trouble for us. Shouldn't we have some mechanism to limit the data PER REQUEST as the Server?
Josh Mandel (Mar 09 2022 at 13:48):
https://www.healthit.gov/test-method/standardized-api-patient-and-population-services#test_procedure has ONC's test methods, which document group-export as the focus.
"Shouldn't we have some mechanism to limit the data PER REQUEST as the Server?"
A key purpose of bulk export is to enable exporting all the US Core data that is known about a group of patients. If you impose limits, this purpose would not be met.
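For context, a group-level export follows the asynchronous pattern in the Bulk Data IG: the client kicks off `[base]/Group/[id]/$export`, then polls the status URL from the `Content-Location` header until the server returns a manifest. The sketch below shows only the client-side polling logic, with the HTTP call injected as a function so it runs without a network. The names `group_export_url`, `wait_for_manifest`, and `poll`, and all URLs, are hypothetical, and the sketch assumes `Retry-After` is given in seconds.

```python
import time


def group_export_url(base_url, group_id):
    """Build the kick-off URL for a group-level export."""
    return f"{base_url.rstrip('/')}/Group/{group_id}/$export"


def wait_for_manifest(poll, status_url, sleep=time.sleep, max_polls=1000):
    """Poll status_url until the export completes, then return the manifest.

    poll is any callable returning (status_code, headers, body); in real use
    it would wrap an HTTP GET. 202 means in progress, 200 means complete.
    """
    for _ in range(max_polls):
        status, headers, body = poll(status_url)
        if status == 200:
            return body
        if status != 202:
            raise RuntimeError(f"export failed with HTTP {status}")
        # Respect the server's pacing hint (assumed here to be in seconds).
        sleep(float(headers.get("Retry-After", 10)))
    raise TimeoutError("export did not finish within max_polls")


# Usage with a fake server that reports "in progress" twice, then completes.
responses = iter([
    (202, {"Retry-After": "120"}, None),
    (202, {"Retry-After": "120"}, None),
    (200, {}, {"output": [{"type": "Patient", "url": "https://ehr.example.org/files/1.ndjson"}]}),
])
waits = []
manifest = wait_for_manifest(lambda url: next(responses), "https://ehr.example.org/status/1", sleep=waits.append)
print(group_export_url("https://ehr.example.org/fhir", "oncology-panel"))
print(manifest["output"][0]["type"], waits)
```

Because the client waits for as long as the server indicates, a server can take the time it needs to produce the full data set for the group rather than truncating it.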
Nihar (Mar 09 2022 at 14:36):
@Josh Mandel: Thank you so much for the clarification and suggestions under this topic.
A follow-up question on the ONC Test Methods link that you sent. They state:
FOR SEARCH - "The health IT developer demonstrates the ability of the Health IT Module to support the “capabilities” interaction as specified in the standard adopted in § 170.215(a)(1), including support for a “CapabilityStatement” as specified in the standard adopted in § 170.215(a)(1) and implementation specification adopted in § 170.215(a)(4)."
FOR RESPONSE - "The health IT developer demonstrates the ability of the Health IT Module to support a successful data response according to the implementation adopted in § 170.215(a)(4)."
Does it mean that ideally we need to comply with the § 170.215(a)(4) HL7® FHIR® Bulk Data Access (Flat FHIR®) (V1.0.1:STU 1), i.e. Patient, System and Group Export as described in the IG, along with the focused method for Group Export that will definitely be tested? Please guide.
Josh Mandel (Mar 09 2022 at 14:46):
The testing process will evaluate your server on its ability to export the full data set for a population (group) of patients. You can try out the Inferno testing tool to see exactly what API surface area is covered: https://inferno.healthit.gov/
Cooper Thompson (Mar 09 2022 at 14:47):
I'll add to Josh's comments above that you can and probably should provide ways for the healthcare organization to control what content is returned about a group. The EHR vendor shouldn't be making decisions about limiting data, but the data holder (healthcare organizations) can limit data based on their policies, agreements and use cases. The EHR vendor can provide tools that empower the healthcare organizations to manage the exchange of the data they hold.
Nihar (Mar 10 2022 at 17:31):
Thank you so much @Josh Mandel for the clarity and suggestion.
Also, @Cooper Thompson : Thank you so much. Understood your point too: we as the vendor should not be making decisions about data limits, and should instead leave it to our clients to decide what data volume they want to exchange with third-party systems.
Really appreciate all your support.
Brendan Keeler (Mar 14 2022 at 04:46):
@Cooper Thompson the Epic bulk data implementation has a lot of references to not going over 1000 patients. Do your comments here mean the scale in this thread is supported by Epic?
Cooper Thompson (Mar 14 2022 at 13:40):
The "1000 patient" number is an order of magnitude recommendation, not a technical limit. That recommendation is also based only on internal testing (since we don't have much real-world data yet). We may revise those recommendations as we get real-world performance numbers. But ultimately, the larger your patient population, the longer your job will take to run (for those who are familiar with Epic, we are using fairly conservative throttle settings for the export process). I would expect that exporting millions of patients would take longer than a client is likely to want to wait, but that depends on a lot of factors.
Brendan Keeler (Mar 14 2022 at 17:08):
Awesome, makes sense
Last updated: Apr 12 2022 at 19:14 UTC