FHIR Chat · Changes to Bulk Data Proposal · bulk data

Stream: bulk data

Topic: Changes to Bulk Data Proposal


view this post on Zulip Dan Gottlieb (Jan 27 2018 at 20:56):

Two changes to bulk data proposal based on our breakout session:

view this post on Zulip Dan Gottlieb (Jan 27 2018 at 20:57):

1. On the initial kick-off request, the server should accept an "output-format" parameter indicating the format for the bulk data files. Currently, this must be "application/fhir+ndjson". The Accept header will indicate the format for an OperationOutcome response to the kick-off request itself (e.g., in the case of a missing parameter).

view this post on Zulip Dan Gottlieb (Jan 27 2018 at 20:57):

2. On the final status request (response status of 200), return a body with the following JSON structure:

{
  "transactionTime": "[instant]",  //the server's time when the query is run (no resources with a modified date after this instant should be in the response)
  "request" : "Patient/$everything?_type=Patient,Observation", //GET request that kicked off the bulk data response
  "secure" : true, //authentication is required to retrieve the files
  "output" : [{
    "type" : "Patient", //resource type contained in the file
    "url" : "http://serverpath2/patient_file_1.ndjson"
  },{
    "type" : "Patient",
    "url" : "http://serverpath2/patient_file_2.ndjson"
  },{
    "type" : "Observation",
    "url" : "http://serverpath2/observation_file_1.ndjson"
  }]
}
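
As a rough illustration of how a client might consume this body, here is a minimal Python sketch that groups the file URLs by resource type. It uses a comment-free copy of the example above (the `//` comments are not valid JSON), the `transactionTime` value is a made-up placeholder, and `files_by_type` is a hypothetical helper name, not part of the proposal.

```python
import json
from collections import defaultdict

# Comment-free copy of the example body. The transactionTime value is a
# made-up placeholder instant; everything else mirrors the example above.
EXAMPLE_BODY = """
{
  "transactionTime": "2018-01-27T20:57:00Z",
  "request": "Patient/$everything?_type=Patient,Observation",
  "secure": true,
  "output": [
    {"type": "Patient", "url": "http://serverpath2/patient_file_1.ndjson"},
    {"type": "Patient", "url": "http://serverpath2/patient_file_2.ndjson"},
    {"type": "Observation", "url": "http://serverpath2/observation_file_1.ndjson"}
  ]
}
"""


def files_by_type(body_text):
    """Return (secure, {resource type: [file URLs]}) from a status response body."""
    body = json.loads(body_text)
    grouped = defaultdict(list)
    for entry in body["output"]:
        grouped[entry["type"]].append(entry["url"])
    return body["secure"], dict(grouped)


secure, files = files_by_type(EXAMPLE_BODY)
print("authentication required:", secure)
for resource_type, urls in files.items():
    print(resource_type, len(urls), "file(s)")
```

Because each entry carries its own type, a client can route files to per-resource loaders without inspecting the file contents first.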

view this post on Zulip Nagesh Bashyam (Jan 27 2018 at 21:37):

Dan - I am assuming the output-format is a parameter to the operation and is not part of the header?

view this post on Zulip Nagesh Bashyam (Jan 27 2018 at 21:39):

Also, for the next breakout, it might be worth talking about a polling interval for the status request. A few people mentioned last week at ONC that, since it is an async request, it might take more than a few minutes to get the data ready. So it might be good not to poll continuously, but to poll at regular intervals that the server specifies in its initial response with the content location.

view this post on Zulip Grahame Grieve (Jan 27 2018 at 21:39):

thanks @Dan Gottlieb - when did we decide to make those changes?

view this post on Zulip Dan Gottlieb (Jan 27 2018 at 21:41):

@Nagesh Bashyam Yup, on the query string. Also, it would be good to discuss the polling. My initial thought is that we probably want to recommend exponential backoff: https://en.wikipedia.org/wiki/Exponential_backoff
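
To make the query string placement concrete, here is a minimal Python sketch of building the kick-off request. The base URL `https://example.org/fhir`, the helper name `build_kickoff_request`, and the `application/fhir+json` Accept value are assumptions for illustration; only the "output-format" parameter and its "application/fhir+ndjson" value come from the proposal.

```python
from urllib.parse import urlencode

NDJSON = "application/fhir+ndjson"  # the only output-format value allowed so far


def build_kickoff_request(base_url, resource_types=None, output_format=NDJSON):
    """Return (url, headers) for a Patient/$everything kick-off request.

    output-format goes on the query string. The Accept header only describes
    the format wanted for an OperationOutcome about the kick-off itself
    (assumed here to be FHIR JSON).
    """
    params = {"output-format": output_format}
    if resource_types:
        params["_type"] = ",".join(resource_types)
    # urlencode percent-encodes "+" and "/", so the "+" in the media type
    # is not misread as a space by the server.
    url = f"{base_url.rstrip('/')}/Patient/$everything?{urlencode(params)}"
    headers = {"Accept": "application/fhir+json"}
    return url, headers


url, headers = build_kickoff_request(
    "https://example.org/fhir", ["Patient", "Observation"]
)
print(url)
print(headers)
```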

view this post on Zulip Dan Gottlieb (Jan 27 2018 at 21:43):

@Grahame Grieve asap, but while leaving the link header in place and continuing to accept ndjson in the Accept header so they're non-breaking for now

view this post on Zulip Grahame Grieve (Jan 27 2018 at 21:43):

I'm not sure the change is non-breaking for me

view this post on Zulip Dan Gottlieb (Jan 27 2018 at 21:44):

In what way?

view this post on Zulip Grahame Grieve (Jan 27 2018 at 21:44):

the accept header change

view this post on Zulip Grahame Grieve (Jan 27 2018 at 21:44):

I'm going to investigate

view this post on Zulip Grahame Grieve (Jan 27 2018 at 21:47):

(deleted)

view this post on Zulip Vladimir Ignatov (Jan 27 2018 at 22:34):

The server at https://bulk-data.smarthealthit.org was updated to implement these changes. Feedback is appreciated

view this post on Zulip Dan Gottlieb (Jan 27 2018 at 22:35):

:thumbs_up:

view this post on Zulip Josh Mandel (Jan 27 2018 at 22:54):

Having a response body with clear type info for each resulting file is working really nicely here.

view this post on Zulip Jason Walonoski (Jan 27 2018 at 23:28):

Exponential backoff is fine as a recommendation, but I don't think this should be part of the spec. Different clients in different environments will poll their own way.
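
For reference, one way a client could apply exponential backoff when polling is sketched below in Python. This is only an illustration, not spec text: it assumes the status endpoint answers 202 while the export is still running and 200 when it is done, and the names `backoff_delays`, `poll_until_ready`, `check_status`, and the default delays are all hypothetical.

```python
def backoff_delays(initial=1.0, factor=2.0, maximum=60.0):
    """Yield an endless series of wait times that grow by `factor` up to `maximum`."""
    delay = initial
    while True:
        yield min(delay, maximum)
        delay *= factor


def poll_until_ready(check_status, sleep, max_attempts=10):
    """Call check_status() until it returns 200; return the number of calls made.

    check_status: callable returning the HTTP status of the status request.
    sleep: callable taking seconds (time.sleep in real use).
    """
    for attempt, delay in zip(range(max_attempts), backoff_delays()):
        status = check_status()
        if status == 200:
            return attempt + 1
        if status != 202:
            raise RuntimeError(f"unexpected status {status}")
        sleep(delay)  # wait longer after each "still in progress" answer
    raise TimeoutError("export not ready after max_attempts polls")


# Dry run with canned statuses and a recording sleep, so nothing actually waits.
statuses = iter([202, 202, 202, 200])
waits = []
calls = poll_until_ready(lambda: next(statuses), waits.append)
print(calls, waits)
```

Passing `sleep` in as a parameter keeps the policy on the client side, so each client can plug in its own timing.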

view this post on Zulip Danielle Friend (Jan 27 2018 at 23:29):

Quoting @Vladimir Ignatov: "The server at https://bulk-data.smarthealthit.org was updated to implement these changes. Feedback is appreciated"

@Vladimir Ignatov how do you handle multiple requests for $everything? When I make the $everything request multiple times (minutes apart or in quick succession), the response for each returns the same content-location. Are these generated per request/on the fly?

view this post on Zulip Vladimir Ignatov (Jan 27 2018 at 23:42):

@Danielle Friend Yes. This is a demo server. It does not really generate any files on its file system. It just makes you wait a while and then gives you a list of files to download... In other words, multiple calls to $everything with the same parameters will result in the same content-location

view this post on Zulip Dan Gottlieb (Jan 27 2018 at 23:45):

@Danielle Friend @Vladimir Ignatov - it seems like this may be the correct approach since on subsequent requests you're asking for files that have already been generated...

view this post on Zulip Nagesh Bashyam (Jan 27 2018 at 23:47):

I don't think that would work, because in the case of a targeted extract each request may be for a different set of patients. So some kind of request/response tracking is necessary on the server side, which is what we implemented. Even in the case of Patient/$everything, new patients may be added or their observations may change in the system, so the previously generated data is no longer appropriate, unless I am misunderstanding.
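
A minimal Python sketch of the kind of per-request tracking described here, for illustration only. The class name `ExportJobRegistry`, the status URL layout, and the stored fields are all hypothetical and are not taken from any of the servers mentioned in this thread.

```python
import uuid


class ExportJobRegistry:
    """Give every kick-off request its own job, even when requests are identical."""

    def __init__(self, status_base):
        self.status_base = status_base.rstrip("/")
        self.jobs = {}

    def kick_off(self, request):
        """Record the request and return a unique content-location for it."""
        job_id = uuid.uuid4().hex
        self.jobs[job_id] = {"request": request, "done": False}
        return f"{self.status_base}/{job_id}"


registry = ExportJobRegistry("https://example.org/fhir/bulkstatus")
first = registry.kick_off("Patient/$everything?_type=Patient,Observation")
second = registry.kick_off("Patient/$everything?_type=Patient,Observation")
print(first != second)  # same request text, separate jobs
```

With this approach two identical requests made minutes apart are exported separately, so the second one can pick up data that changed in between.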

view this post on Zulip Nagesh Bashyam (Jan 27 2018 at 23:48):

Quick comment on the Root URL recommended by Jason: this is how we had defined it in DAF-Research, before bulk-api came into being. It was also defined on the Root URL.

http://hl7.org/fhir/us/daf-research/STU2/OperationDefinition-daf-extract.html

view this post on Zulip Nagesh Bashyam (Jan 28 2018 at 15:19):

I updated my server with the recommended changes. Feel free to give it a shot.

http://52.70.192.201/open-fhir/fhir/Patient/$everything

view this post on Zulip Toby Hu (Jan 28 2018 at 19:33):

@Nagesh Bashyam for the initial query, I think the expected response status code should be 202; your server returns 200.
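
A small Python sketch of the client-side check implied here, assuming the kick-off response is expected to be a 202 that carries the status URL in a Content-Location header. The function name `status_url_from_kickoff` and the example URL are hypothetical.

```python
def status_url_from_kickoff(status_code, headers):
    """Return the polling URL from a kick-off response, or raise ValueError."""
    if status_code != 202:
        raise ValueError(f"expected 202 Accepted for kick-off, got {status_code}")
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    location = normalized.get("content-location")
    if not location:
        raise ValueError("kick-off response has no Content-Location header")
    return location


print(status_url_from_kickoff(
    202, {"Content-Location": "https://example.org/fhir/bulkstatus/abc"}
))
```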

view this post on Zulip Nagesh Bashyam (Jan 28 2018 at 19:39):

Toby, I looked into it and I will have to get a fix from James (for HAPI libraries) to override that status. I will take care of it on the next version.

view this post on Zulip Toby Hu (Jan 28 2018 at 19:40):

Sounds good. Thanks.

view this post on Zulip Toby Hu (Jan 28 2018 at 19:45):

I registered my client at http://snapp.clinfhir.com:4000/ and the code can be found at https://github.com/toby-hu/test/tree/master/client
Would any server like to try connecting?


Last updated: Apr 12 2022 at 19:14 UTC