Stream: implementers
Topic: Patient/$everything, paging and Binaries
Patrick Haren (May 04 2021 at 14:47):
Is there any guidance in place on Patient/$everything responses for Binary and DocumentReference resources when they contain a large amount of encoded content? It seems like it would be neither pragmatic nor in keeping with the intent of a page-size limit on the bundle response (e.g. 10 resources per page) if one or more of the resources in a page contained a large encoded document.
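For concreteness, here is a minimal client-side sketch of the paged interaction being described, assuming a hypothetical R4 endpoint (`https://fhir.example.org/r4`) that honors `_count` on `$everything` and returns standard `Bundle.link` paging entries:

```python
import requests

BASE_URL = "https://fhir.example.org/r4"  # hypothetical endpoint
PATIENT_ID = "example"                    # hypothetical patient id

url = f"{BASE_URL}/Patient/{PATIENT_ID}/$everything"
params = {"_count": 10}
headers = {"Accept": "application/fhir+json"}

while url:
    resp = requests.get(url, params=params, headers=headers)
    resp.raise_for_status()
    bundle = resp.json()
    for entry in bundle.get("entry", []):
        resource = entry.get("resource", {})
        print(resource.get("resourceType"), resource.get("id"))
    # Follow the server-supplied "next" link; _count only needs to be sent once.
    url = next((link["url"] for link in bundle.get("link", [])
                if link.get("relation") == "next"), None)
    params = None
```

The open question is what happens when a single entry in one of these pages carries tens of megabytes of inline base64 content.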
Patrick Haren (May 04 2021 at 14:47):
Any pointers would be appreciated.
Rob Reynolds (May 04 2021 at 18:35):
I'm not sure if this matches your use case, but have you looked into bulk data to see if there's a fit there? https://hl7.org/fhir/uv/bulkdata/
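A hedged sketch of the asynchronous kick-off and status-poll flow defined by that Bulk Data Access IG, against a hypothetical endpoint:

```python
import time
import requests

BASE_URL = "https://fhir.example.org/r4"  # hypothetical endpoint

# Kick-off: request an asynchronous export of all patient data.
kickoff = requests.get(
    f"{BASE_URL}/Patient/$export",
    headers={"Accept": "application/fhir+json", "Prefer": "respond-async"},
)
kickoff.raise_for_status()
status_url = kickoff.headers["Content-Location"]  # polling location from the 202 response

# Poll until the export manifest is ready.
while True:
    status = requests.get(status_url, headers={"Accept": "application/json"})
    if status.status_code == 200:
        manifest = status.json()
        for output in manifest.get("output", []):
            print(output["type"], output["url"])  # NDJSON files, downloaded separately
        break
    time.sleep(int(status.headers.get("Retry-After", "5")))
```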
Patrick Haren (May 04 2021 at 19:31):
Rob Reynolds said:
I'm not sure if this matches your use case, but have you looked into bulk data to see if there's a fit there? https://hl7.org/fhir/uv/bulkdata/
Thanks Rob. This is more about the regular APIs and implementing Patient/$everything operation (https://www.hl7.org/fhir/operation-patient-everything.html).
From the specification: 'The server SHOULD return at least all resources that it has that are in the patient compartment for the identified patient(s), and any resource referenced from those, including binaries and attachments.'... However, without paging or chunking, a regular API call could suffer from network limitations -- e.g. how reliable and responsive would this be if the data includes five encoded attachments of 30 MB each, along with the structured resources?
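One mitigation (not mandated by the operation definition) is for the server to return large attachments by reference rather than inline, so the client fetches and streams them separately. A hedged client-side sketch handling either case, with a hypothetical endpoint:

```python
import base64
import requests

BASE_URL = "https://fhir.example.org/r4"  # hypothetical endpoint

def save_attachment(attachment: dict, out_path: str) -> None:
    """Write an Attachment's content to disk, streaming when it is referenced by url."""
    if "data" in attachment:  # small inline content, base64-encoded
        with open(out_path, "wb") as f:
            f.write(base64.b64decode(attachment["data"]))
        return
    url = attachment["url"]   # e.g. "Binary/abc" or an absolute URL
    if not url.startswith("http"):
        url = f"{BASE_URL}/{url}"
    with requests.get(url, headers={"Accept": attachment.get("contentType", "*/*")},
                      stream=True) as resp:
        resp.raise_for_status()
        with open(out_path, "wb") as f:
            for chunk in resp.iter_content(chunk_size=1 << 20):
                f.write(chunk)
```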
Keela Shatzkin (Jun 09 2021 at 16:19):
Hi! We're having the same issue because of the risk of a query that _includes a TON of results, so even with pagination the response would be enormous. How do we protect against this blowing up our own server and/or resulting in a horribly large response? I don't think it's proper to just set an arbitrary limit on the _includes content without providing a way for the end user to then ask for everything. Has anyone solved this?
Merlyn Albery-Speyer (Jun 09 2021 at 17:13):
I'm curious. What content is making your payloads huge?
John Silva (Jun 09 2021 at 17:59):
I worked on a FHIR server (home-brew - before bulk-data was defined) that had "high frequency" data (as Observations) coming from patient monitors for all patients in a (critical care) unit; that data has a frequency of 1/5 sec (can be more but we throttled it). I can imagine high frequency data if a FHIR server was trying to store "all" the data from fitness devices, pedometers, etc.
Of course, if any FHIR server was storing Binary or Attachment content (docs like PDFs), it could get huge in a hurry. I suppose this raises the question: when you have large data files (and lots of them), maybe a FHIR server isn't the right place to store them; maybe an external DMS (Document Management System) or document DB, with a DocumentReference in FHIR, is more appropriate.
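A hedged sketch of that pattern: the large file stays in an external document store, and only a DocumentReference whose attachment.url points at it is registered in FHIR. The URLs, ids, and sizes below are illustrative:

```python
import requests

BASE_URL = "https://fhir.example.org/r4"                          # hypothetical FHIR server
EXTERNAL_DOC_URL = "https://dms.example.org/docs/report-42.pdf"   # hypothetical DMS location

doc_ref = {
    "resourceType": "DocumentReference",
    "status": "current",
    "subject": {"reference": "Patient/example"},
    "content": [{
        "attachment": {
            "contentType": "application/pdf",
            "url": EXTERNAL_DOC_URL,   # reference, not inline base64 data
            "size": 31457280,          # ~30 MB, kept out of the FHIR payload
            "title": "Imaging report",
        }
    }],
}

resp = requests.post(f"{BASE_URL}/DocumentReference", json=doc_ref,
                     headers={"Content-Type": "application/fhir+json"})
resp.raise_for_status()
```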
Josh Mandel (Jun 09 2021 at 20:05):
These kinds of challenges are why we've focused most SMART on FHIR implementations -- and US Core -- on a set of more granular interactions using FHIR search semantics; apps can use standard, well-supported pagination capabilities, and can issue follow-up queries in parallel or even (if the server supports it) via Batch.
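A hedged sketch of that more granular style: targeted, individually pageable searches, optionally combined into a single batch Bundle where the server supports it (endpoint and patient id are hypothetical):

```python
import requests

BASE_URL = "https://fhir.example.org/r4"  # hypothetical endpoint

batch = {
    "resourceType": "Bundle",
    "type": "batch",
    "entry": [
        {"request": {"method": "GET", "url": "Condition?patient=example&_count=50"}},
        {"request": {"method": "GET", "url": "MedicationRequest?patient=example&_count=50"}},
        {"request": {"method": "GET", "url": "DocumentReference?patient=example&_count=20"}},
    ],
}

resp = requests.post(BASE_URL, json=batch,
                     headers={"Content-Type": "application/fhir+json"})
resp.raise_for_status()
# Each entry in the response is itself a searchset Bundle that can be paged
# independently via its own "next" links.
```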
Elliot Silver (Jun 09 2021 at 20:09):
Does Patient/$everything support _summary? That would let you get back the resources without the actual binary content.
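Whether $everything honors _summary is server-specific (it is not listed as a formal parameter of the operation), but where it is supported, a request like the sketch below should omit non-summary elements such as inline Attachment.data. Endpoint and id are hypothetical:

```python
import requests

BASE_URL = "https://fhir.example.org/r4"  # hypothetical endpoint

resp = requests.get(
    f"{BASE_URL}/Patient/example/$everything",
    params={"_summary": "true", "_count": 10},
    headers={"Accept": "application/fhir+json"},
)
resp.raise_for_status()
bundle = resp.json()
```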
Lloyd McKenzie (Jun 10 2021 at 15:15):
You're free to reject a query that's going to result in too large a response. You're also free to decide what _include and _revinclude filters you allow and what you don't. (Don't ever do a _revinclude for Observation on Patient :>)
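One hedged way a server might express such a refusal is an OperationOutcome using the standard "too-costly" issue code; the threshold and the HTTP status choice (403 here) are illustrative policy, not spec requirements:

```python
MAX_MATCHES = 10_000  # illustrative server policy

def reject_if_too_costly(total_matches: int):
    """Return (http_status, body) for a refusal, or None if the query may proceed."""
    if total_matches <= MAX_MATCHES:
        return None
    outcome = {
        "resourceType": "OperationOutcome",
        "issue": [{
            "severity": "error",
            "code": "too-costly",
            "diagnostics": (
                f"Query matched {total_matches} resources; "
                "narrow the search or drop _include/_revinclude parameters."
            ),
        }],
    }
    return 403, outcome
```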
Keela Shatzkin (Jun 11 2021 at 22:05):
@Lloyd McKenzie the issue with that approach is that you won't know whether the query would result in an unreasonable volume of data until you check against that specific person's results... which means you have to run it in order to decide whether you should NOT return the results because the volume would be unreasonable... Think about that one patient who is a high utilizer with 20,000 lab results. If you do an _include of Observations, you can't paginate those labs because Observation isn't the primary resource.
Lloyd McKenzie (Jun 25 2021 at 00:17):
When you're hitting your internal database, you're presumably paging the results but also seeing a total number of matches. If the total number of matches is too big, just reject the request. There's definitely no mechanism to paginate through includes - your choice is either to return a really large Bundle for the page or to return a failure. (Where it gets a bit fun is if the first couple of pages return a reasonable number of includes and the third page blows up and fails, but there's not much you can do about that...)
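A hedged sketch of that trade-off: each page's "match" entries are bounded by _count, but the includes for that page are not, so about all a server can do is count them and fail the page against a policy limit (names and thresholds below are illustrative):

```python
PAGE_SIZE = 10
MAX_INCLUDES_PER_PAGE = 500  # illustrative policy

def build_page(matches_for_page, resolve_includes):
    """matches_for_page: the resources for one page of results.
    resolve_includes: callable returning the _include/_revinclude resources for them."""
    includes = resolve_includes(matches_for_page)
    if len(includes) > MAX_INCLUDES_PER_PAGE:
        raise RuntimeError("too-costly: includes for this page exceed server policy")
    entries = (
        [{"resource": r, "search": {"mode": "match"}} for r in matches_for_page]
        + [{"resource": r, "search": {"mode": "include"}} for r in includes]
    )
    return {"resourceType": "Bundle", "type": "searchset", "entry": entries}
```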
Last updated: Apr 12 2022 at 19:14 UTC