FHIR Chat · Bulk Export Implementation · bulk data

Stream: bulk data

Topic: Bulk Export Implementation


view this post on Zulip Mohammad Jafari (Apr 19 2019 at 23:11):

For one of our demonstrations we needed a bulk server supporting _typeFilter in $export, and since I could not find any publicly available servers with that capability, I hastily put together a bulk export service over the past few days which supports that parameter. This is highly experimental, but I thought I'd share it here FWIW in case other groups need something like this for the May Connectathon.
Note that this is not a FHIR server; it works from a configurable list of backend FHIR servers from which it fetches resources using the FHIR REST API. So, effectively, it's a proxy that can be set in front of one or more existing FHIR servers to provide bulk export functionality.
Source code: https://github.com/mojitoholic/hotaru-swarm
There's a test server deployed on Heroku here: https://hotaru-swarm.herokuapp.com/
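For reference, kicking off an export with _typeFilter follows the standard bulk-data async pattern. The sketch below is illustrative only: it assumes the HTTPoison client, and the base path and filter value are made up rather than taken from the proxy's actual configuration.

```elixir
# Illustrative kick-off sketch (HTTPoison assumed; base path and filter are
# examples, not the proxy's documented endpoint).
base = "https://hotaru-swarm.herokuapp.com/fhir"
filter = URI.encode_www_form("Observation?category=laboratory")

{:ok, resp} =
  HTTPoison.get(
    "#{base}/$export?_type=Observation&_typeFilter=#{filter}",
    [{"Accept", "application/fhir+json"}, {"Prefer", "respond-async"}]
  )

# A conformant server answers 202 Accepted and points at a status endpoint
# via the Content-Location header (header casing may vary by server).
{_, status_url} = List.keyfind(resp.headers, "Content-Location", 0)
```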

view this post on Zulip Josh Mandel (Apr 19 2019 at 23:30):

How cool! I really like the proxy architecture for this kind of prototype. (Also I think this is the first Elixir project I've seen for FHIR.)

view this post on Zulip Josh Mandel (Apr 19 2019 at 23:37):

It'd be interesting to think about how to get results incrementally into a db without having to accumulate everything in memory across all requests/servers.
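One way to sketch that in Elixir (hypothetical helper names, not Hotaru Swarm's actual code): build a lazy stream over all servers and pages, and insert each resource as it arrives instead of collecting everything first.

```elixir
# Sketch only: fetch_page.(server, url) is assumed to return
# {resources, next_url}, with next_url nil on the last page.
defmodule ExportStreamSketch do
  def stream_resources(servers, first_url, fetch_page) do
    Stream.flat_map(servers, fn server ->
      first_url
      |> Stream.unfold(fn
        nil -> nil
        url -> fetch_page.(server, url)
      end)
      |> Stream.flat_map(& &1)
    end)
  end
end

# Usage: rows reach the database one at a time; insert_resource/1 is a placeholder.
# ExportStreamSketch.stream_resources(servers, "/Patient?_count=100", &fetch_page/2)
# |> Stream.each(&insert_resource/1)
# |> Stream.run()
```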

view this post on Zulip nicola (RIO/SS) (Apr 20 2019 at 08:33):

We use chunked encoding for this: https://docs.aidbox.app/api/bulk-api. Server and client implementations can be done with constant memory usage. Maybe something like this could be part of the spec?!
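As a generic illustration of the server side of that idea (not Aidbox's actual implementation), a Plug endpoint can flush each NDJSON line with chunked transfer encoding so that memory use stays constant regardless of result size:

```elixir
# Generic Plug sketch of chunked NDJSON output; Jason is assumed for encoding.
defmodule ChunkedNdjsonSketch do
  import Plug.Conn

  def send_ndjson(conn, resource_stream) do
    conn = send_chunked(conn, 200)

    Enum.reduce_while(resource_stream, conn, fn resource, conn ->
      case chunk(conn, Jason.encode!(resource) <> "\n") do
        {:ok, conn} -> {:cont, conn}
        {:error, :closed} -> {:halt, conn}
      end
    end)
  end
end
```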

view this post on Zulip nicola (RIO/SS) (Apr 20 2019 at 08:38):

I can also implement the bulk API with filtering in Aidbox for the Connectathon.

view this post on Zulip Dan Gottlieb (Apr 20 2019 at 14:03):

Very neat project, @Mohammad Jafari! @Vladimir Ignatov, weren't you also working on a bulk data proxy server?

view this post on Zulip Mohammad Jafari (Apr 20 2019 at 17:16):

Controlled buffering (and even file size, for that matter) was not a priority for me at this point, given the purpose of this rapid prototype. But I can definitely add that if there's interest in the community in using Hotaru Swarm in the future; Elixir has some very good tools, like Stream, GenStage, and Flow, which make that easy to implement.
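For instance, a Stream-based take on controlled buffering might cap output file size by writing fixed-size batches of NDJSON lines. The module name and batch size below are made up; it is a sketch of the technique, not code from the project.

```elixir
# Illustrative sketch: only one batch of encoded resources is in memory at a
# time, and each batch becomes its own NDJSON file.
defmodule BufferedWriterSketch do
  def write_in_batches(resource_stream, dir, batch_size \\ 10_000) do
    resource_stream
    |> Stream.map(&Jason.encode!/1)
    |> Stream.chunk_every(batch_size)
    |> Stream.with_index(1)
    |> Enum.each(fn {lines, index} ->
      File.write!(Path.join(dir, "export_#{index}.ndjson"), Enum.join(lines, "\n") <> "\n")
    end)
  end
end
```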

view this post on Zulip Mohammad Jafari (Apr 20 2019 at 17:27):

My main feedback at this point (and I'm going to write this up in a blog post) is that we'd probably be better off using the existing FHIR REST API for CRUD-ing export jobs. A FHIR resource, say named BatchJob, could be created to trigger an export, fetched to check the status of the export, and eventually updated by the server to reflect the results once the task has completed. The client can read and delete this resource. All of this can take place through existing FHIR REST API operations (plus more, e.g. versioning and search) and is subject to existing FHIR authorization and audit mechanisms; this way the spec would only need to define the BatchJob resource structure and the expected behaviour under the hood for fulfilling it. Moreover, there is much less cognitive overhead for developers in terms of how jobs are stored and tracked and what the access API looks like. @Josh Mandel
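To make the proposal concrete, the flow might look like the sketch below. BatchJob is a hypothetical resource (it exists neither in FHIR nor in the ballot spec), and the element names and URLs are invented for illustration; HTTPoison and Jason are assumed.

```elixir
# Hypothetical BatchJob flow: create the job, then use ordinary reads to
# poll it; the server later updates it with output locations, and DELETE
# would cancel it. Everything here is made up for illustration.
base = "https://example.org/fhir"

job = %{
  "resourceType" => "BatchJob",
  "status" => "requested",
  "type" => ["Patient", "Observation"],
  "typeFilter" => ["Observation?category=laboratory"]
}

{:ok, %HTTPoison.Response{status_code: 201, body: body}} =
  HTTPoison.post(base <> "/BatchJob", Jason.encode!(job), [
    {"Content-Type", "application/fhir+json"}
  ])

created = Jason.decode!(body)

# The status check is just a FHIR read, covered by existing auth, audit,
# versioning, and search machinery.
{:ok, %HTTPoison.Response{body: status_body}} =
  HTTPoison.get("#{base}/BatchJob/#{created["id"]}", [
    {"Accept", "application/fhir+json"}
  ])
```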

view this post on Zulip Josh Mandel (Apr 20 2019 at 17:32):

Thanks @Mohammad Jafari. This is definitely something we talked about as we were defining the export API, though in the end having an async pattern for issuing any FHIR API call felt more general-purpose. It's worth continuing to discuss as we review ballot comments, so I hope you'll share a link to your blog post when it's up.
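For context, the async pattern referred to here boils down to polling the returned status URL until the server swaps its 202 for a 200 carrying the completion manifest. A rough client-side sketch follows (module name and poll interval are invented; HTTPoison and Jason assumed):

```elixir
# Rough polling sketch for the FHIR async pattern (not normative code).
defmodule AsyncPollSketch do
  def poll(status_url, headers \\ [{"Accept", "application/fhir+json"}]) do
    {:ok, resp} = HTTPoison.get(status_url, headers)

    case resp.status_code do
      # Still in progress: wait and try again.
      202 ->
        Process.sleep(5_000)
        poll(status_url, headers)

      # Done: the body is the completion manifest listing the output files.
      200 ->
        Jason.decode!(resp.body)
    end
  end
end
```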


Last updated: Apr 12 2022 at 19:14 UTC