FHIR Chat · Changes to Bulk Data Proposal

Stream: bulk data

Topic: Changes to Bulk Data Proposal

Dan Gottlieb (Jan 27 2018 at 20:56):

Two changes to bulk data proposal based on our breakout session:

Dan Gottlieb (Jan 27 2018 at 20:57):

1. On the initial kick-off request, the server should accept an "output-format" parameter indicating the format for the bulk data files. Currently, this must be "application/fhir+ndjson". The Accept header will indicate the format for an OperationOutcome response to the kick-off request itself (e.g., in the case of a missing parameter).
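
A minimal sketch of what such a kick-off request could look like from a client, assuming the parameter goes on the query string. The function name `build_kickoff_request`, the base URL, and the "application/fhir+json" Accept value are illustrative assumptions, not part of the proposal text.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_kickoff_request(base_url, resource_types,
                          output_format="application/fhir+ndjson"):
    """Build (but do not send) a kick-off request.

    `output-format` selects the format of the bulk data files, while the
    Accept header only covers the kick-off response itself, e.g. an
    OperationOutcome returned when a parameter is missing.
    """
    query = urlencode({
        "_type": ",".join(resource_types),
        "output-format": output_format,
    })
    return Request(
        f"{base_url}/Patient/$everything?{query}",
        # Assumed media type for the OperationOutcome; adjust to the server.
        headers={"Accept": "application/fhir+json"},
        method="GET",
    )


# Hypothetical base URL, used only for illustration.
req = build_kickoff_request("https://example.org/fhir", ["Patient", "Observation"])
print(req.full_url)
print(req.get_header("Accept"))
```

Note that `urlencode` percent-encodes the "/" and "+" in the media type, which keeps the "+" from being read as a space on the server side.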

Dan Gottlieb (Jan 27 2018 at 20:57):

2. On the final status request (response type of 200), return a body with the following json structure:

{
  "transactionTime": "[instant]",  //the server's time when the query is run (no resources that have a modified date after this instant should be in the response)
  "request" : "Patient/$everything?_type=Patient,Observation", //GET request that kicked-off the bulk data response
  "secure" : true, //authentication is required to retrieve the files
  "output" : [{
    "type" : "Patient", //resource type contained in the file
    "url" : "http://serverpath2/patient_file_1.ndjson"
  },{
    "type" : "Patient",
    "url" : "http://serverpath2/patient_file_2.ndjson"
  },{
    "type" : "Observation",
    "url" : "http://serverpath2/observation_file_1.ndjson"
  }]
}
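
As a rough illustration of how a client might consume this body, the sketch below groups the file URLs by resource type. The helper name `group_output_by_type` is hypothetical, the transactionTime value is a made-up instant, and the inline comments from the example above are dropped because strict JSON parsers reject them.

```python
import json
from collections import defaultdict


def group_output_by_type(status_body: str) -> dict[str, list[str]]:
    """Map each resource type to the list of file URLs that contain it."""
    manifest = json.loads(status_body)
    files = defaultdict(list)
    for entry in manifest.get("output", []):
        files[entry["type"]].append(entry["url"])
    return dict(files)


# Comment-free copy of the example body; the instant is a placeholder value.
example_body = """
{
  "transactionTime": "2018-01-27T20:57:00Z",
  "request": "Patient/$everything?_type=Patient,Observation",
  "secure": true,
  "output": [
    {"type": "Patient", "url": "http://serverpath2/patient_file_1.ndjson"},
    {"type": "Patient", "url": "http://serverpath2/patient_file_2.ndjson"},
    {"type": "Observation", "url": "http://serverpath2/observation_file_1.ndjson"}
  ]
}
"""

files_by_type = group_output_by_type(example_body)
for resource_type, urls in files_by_type.items():
    print(resource_type, len(urls))
```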

Nagesh Bashyam (Jan 27 2018 at 21:37):

Dan - I am assuming the output-format is a parameter to the operation and is not part of the header?

Nagesh Bashyam (Jan 27 2018 at 21:39):

Also - for the next breakout, it might be worth talking about a polling interval for the status. I heard a few people mention last week at ONC that, since it is an async request, it might take more than a few minutes to get the data ready. So it might be good not to poll continuously, but to poll at regular intervals specified by the server in its initial response with the content location.

Grahame Grieve (Jan 27 2018 at 21:39):

thanks @Dan Gottlieb - when did we decide to make those changes?

Dan Gottlieb (Jan 27 2018 at 21:41):

@Nagesh Bashyam Yup, on the query string. Also, it would be good to discuss the polling. My initial thought is that we probably want to recommend https://en.wikipedia.org/wiki/Exponential_backoff
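
For readers unfamiliar with the idea, here is a minimal sketch of exponential backoff applied to status polling. It is not part of the proposal: the function names, the default delays, and the assumption that the status endpoint answers 202 while the export is still in progress are all illustrative. The status check and the sleep function are passed in, so the loop can be tried without a network.

```python
import itertools


def backoff_delays(initial=1.0, factor=2.0, cap=60.0):
    """Yield delays that grow by `factor` each time, never exceeding `cap`."""
    delay = min(initial, cap)
    while True:
        yield delay
        delay = min(delay * factor, cap)


def poll_until_ready(check_status, sleep, delays):
    """Call `check_status()` until it reports 200, waiting between attempts.

    `check_status` returns a (status_code, body) tuple. 202 is assumed to
    mean "still working"; any other non-200 code is treated as an error.
    """
    for delay in delays:
        status, body = check_status()
        if status == 200:
            return body
        if status != 202:
            raise RuntimeError(f"unexpected status {status}")
        sleep(delay)


# Simulated server: two "in progress" answers, then the final body.
responses = iter([(202, None), (202, None), (200, "done")])
slept = []
result = poll_until_ready(lambda: next(responses), slept.append, backoff_delays())
print(result, slept)
print(list(itertools.islice(backoff_delays(1, 2, 5), 5)))
```

In real use, `sleep` would be `time.sleep` and `check_status` would issue a GET against the content location.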

Dan Gottlieb (Jan 27 2018 at 21:43):

@Grahame Grieve asap, but while leaving the link header in place and continuing to accept ndjson in the Accept header so they're non-breaking for now

Grahame Grieve (Jan 27 2018 at 21:43):

I'm not sure the change is non-breaking for me

Dan Gottlieb (Jan 27 2018 at 21:44):

In what way?

Grahame Grieve (Jan 27 2018 at 21:44):

the accept header change

Grahame Grieve (Jan 27 2018 at 21:44):

I'm going to investigate

Grahame Grieve (Jan 27 2018 at 21:47):

(deleted)

Vladimir Ignatov (Jan 27 2018 at 22:34):

The server at https://bulk-data.smarthealthit.org was updated to implement these changes. Feedback is appreciated

Dan Gottlieb (Jan 27 2018 at 22:35):

:thumbs_up:

Josh Mandel (Jan 27 2018 at 22:54):

Having a response body with clear type info for each resulting file is working really nicely here.

Jason Walonoski (Jan 27 2018 at 23:28):

Exponential backoff is fine as a recommendation, but I don't think this should be part of the spec. Different clients in different environments will poll their own way.

Danielle Friend (Jan 27 2018 at 23:29):

Vladimir Ignatov said: "The server at https://bulk-data.smarthealthit.org was updated to implement these changes. Feedback is appreciated"

@Vladimir Ignatov how do you handle multiple requests for $everything? When I make the $everything request multiple times (minutes apart or in quick succession), the response for each returns the same content-location. Are these generated per request/on the fly?

Vladimir Ignatov (Jan 27 2018 at 23:42):

@Danielle Friend Yes. This is a demo server. It does not really generate any files on its file system. It just makes you wait a while and then gives you a list of files to download... In other words, multiple calls to $everything with the same parameters will result in the same content-location

Dan Gottlieb (Jan 27 2018 at 23:45):

@Danielle Friend @Vladimir Ignatov - it seems like this may be the correct approach since on subsequent requests you're asking for files that have already been generated...

Nagesh Bashyam (Jan 27 2018 at 23:47):

I don't think that would work, because in the case of a Targeted extract each request is for a different set of patients. So some kind of request/response tracking is necessary on the server side, which is what we implemented. Even in the case of Patient/$everything, new patients may be added or their observations may change in the system, so the previously generated data is no longer appropriate, unless I am misunderstanding.
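
One way to picture the request/response tracking described here is a small in-memory registry that hands out a distinct content-location per kick-off. This is only a sketch under assumed names (`ExportJobRegistry`, the status base URL) and does not describe how any of the servers in this thread are actually implemented.

```python
import uuid
from datetime import datetime, timezone


class ExportJobRegistry:
    """Track each kick-off request separately, keyed by a random job id."""

    def __init__(self, status_base_url):
        self._status_base_url = status_base_url.rstrip("/")
        self._jobs = {}

    def kick_off(self, request_url):
        """Record a new job and return its content-location URL."""
        job_id = uuid.uuid4().hex
        self._jobs[job_id] = {
            "request": request_url,
            # Captured per job, so later data changes get a new snapshot time.
            "transactionTime": datetime.now(timezone.utc).isoformat(),
            "output": None,  # filled in once the files have been generated
        }
        return f"{self._status_base_url}/{job_id}"

    def lookup(self, content_location):
        """Return the job record for a content-location, or None if unknown."""
        job_id = content_location.rsplit("/", 1)[-1]
        return self._jobs.get(job_id)


# Hypothetical status endpoint, used only for illustration.
registry = ExportJobRegistry("https://example.org/fhir/export-status/")
first = registry.kick_off("Patient/$everything?_type=Patient,Observation")
second = registry.kick_off("Patient/$everything?_type=Patient,Observation")
print(first != second)
```

With this approach, two identical requests made minutes apart get separate locations and separate transaction times, at the cost of generating the files twice.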

Nagesh Bashyam (Jan 27 2018 at 23:48):

Quick comment on the Root URL recommended by Jason: this is how we had defined it in DAF-Research, before bulk-api came into being. It was also defined on the Root URL.

http://hl7.org/fhir/us/daf-research/STU2/OperationDefinition-daf-extract.html

Nagesh Bashyam (Jan 28 2018 at 15:19):

I updated my server with the changes recommended...Feel free to give it a shot.

http://52.70.192.201/open-fhir/fhir/Patient/$everything

Toby Hu (Jan 28 2018 at 19:33):

@Nagesh Bashyam for the initial query, I think the expected response status code should be 202, but your server returns 200.
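
A client-side check along these lines might look like the sketch below. The function name is hypothetical, and it assumes the status URL arrives in a Content-Location header, as mentioned earlier in the thread.

```python
def content_location_from_kickoff(status_code, headers):
    """Return the status URL from a kick-off response, or raise ValueError.

    A kick-off is expected to answer 202 (accepted, still processing)
    rather than 200.
    """
    if status_code != 202:
        raise ValueError(f"expected 202 from kick-off, got {status_code}")
    # Header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    if "content-location" not in normalized:
        raise ValueError("kick-off response has no Content-Location header")
    return normalized["content-location"]


# Hypothetical response values, used only for illustration.
location = content_location_from_kickoff(
    202, {"Content-Location": "https://example.org/fhir/export-status/abc"}
)
print(location)
```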

Nagesh Bashyam (Jan 28 2018 at 19:39):

Toby, I looked into it and I will have to get a fix from James (for HAPI libraries) to override that status. I will take care of it on the next version.

Toby Hu (Jan 28 2018 at 19:40):

Sounds good. Thanks.

Toby Hu (Jan 28 2018 at 19:45):

I registered my client at http://snapp.clinfhir.com:4000/ and the code can be found at https://github.com/toby-hu/test/tree/master/client .
Would any server like to try connecting?

Last updated: Apr 12 2022 at 19:14 UTC
