FHIR Chat · Bulk Export Headers - required for servers or clients? · bulk data

Stream: bulk data

Topic: Bulk Export Headers - required for servers or clients?


view this post on Zulip Cooper Thompson (May 03 2021 at 14:37):

In the export section, we list some required headers. Are those requirements intended to be imposed on the client (the client is required to provide those values) or on the server (the server must reject requests that do not provide those headers)? I'd assume the former, where if the client doesn't provide them, the server could still accept the request and use server-defined default behavior.

view this post on Zulip Josh Mandel (May 03 2021 at 14:56):

I think your interpretation is correct but we should say something about this. We should also explain what it means that only ndjson is "supported" (I think this means that it is the only format that a server is required to be able to produce).

view this post on Zulip Josh Mandel (May 03 2021 at 14:56):

Have you submitted a ballot comment on this?

view this post on Zulip Cooper Thompson (May 03 2021 at 15:06):

Yes - I'll submit a ballot comment. I was also asking because Inferno is validating that servers reject requests that don't have those headers, so while the ballot comment will apply to 1.1.0, I also wanted to get community input on the topic in general so that we have more information to inform my github issue for Inferno which applies to the 1.0.0 version.

view this post on Zulip Cooper Thompson (May 03 2021 at 15:10):

FHIR#32062

view this post on Zulip Dan Gottlieb (May 03 2021 at 15:14):

I'm curious what others think, but since in the case of bulk export these headers are both fixed values that are there to align with FHIR patterns, it seems to me that the right behavior on the server side would be for the server to error if they're different from expected, but treat them as the defaults if omitted.

view this post on Zulip Yunwei Wang (May 04 2021 at 15:23):

My understanding of a client-side requirement is that the server MUST reject a client request that fails it. If we say the server COULD still accept an "invalid" client request, then the client-side requirement itself is NOT required any more. For example, if we agree that when a client sends Accept: application/xml the server shall still return a response in JSON format, then the requirement that the client must send Accept: application/json does not stand anymore, because the server's behavior is not directly related to the validity of the client request.

view this post on Zulip Yunwei Wang (May 04 2021 at 15:31):

In that case, we should remove these two requirements and add a requirement to the kick-off response section that any response must be in JSON format.

view this post on Zulip Dan Gottlieb (May 04 2021 at 16:17):

@Yunwei Wang I certainly agree that a server returning a response that's counter to the provided Accept header would be confusing.

view this post on Zulip Dan Gottlieb (May 04 2021 at 16:17):

I'm proposing something a little different:

view this post on Zulip Dan Gottlieb (May 04 2021 at 16:18):

  1. If an Accept: application/fhir+json header is provided, the server should return a json response as requested
  2. If no Accept header is provided, the server should assume the defaults and return a json response
  3. If an Accept: application/fhir+xml header is provided and the server supports xml (not required or described by the IG, but I don't think precluded?), the server should return an xml response
  4. If an Accept: application/fhir+xml header is provided and the server does not support xml, the server should return an error response
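The four cases above could be sketched roughly as follows. This is only an illustration: the function name and the `supports_xml` flag are hypothetical, the 406 status for the error case is an assumption (the list only says "an error response"), and the sketch handles a single media type rather than a full comma-separated Accept list with q-values.

```python
from typing import Optional, Tuple

FHIR_JSON = "application/fhir+json"
FHIR_XML = "application/fhir+xml"


def negotiate_kickoff_format(
    accept: Optional[str], supports_xml: bool = False
) -> Tuple[int, str]:
    """Return (HTTP status, response content type) for a kick-off request."""
    if accept is None or accept.strip() == "":
        # Case 2: no Accept header, assume the default and answer in JSON.
        return 202, FHIR_JSON

    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = accept.split(";")[0].strip().lower()

    if media_type == FHIR_JSON:
        # Case 1: JSON requested, JSON returned.
        return 202, FHIR_JSON
    if media_type == FHIR_XML and supports_xml:
        # Case 3: XML requested and the server chooses to support it.
        return 202, FHIR_XML

    # Case 4 (and anything else unrecognized): reject. The 406 status is an
    # assumption; the error body stays in JSON.
    return 406, FHIR_JSON
```

A server that does not support XML would call this with the default `supports_xml=False` and get a 406 for `application/fhir+xml`.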

view this post on Zulip Dan Gottlieb (May 04 2021 at 16:18):

On #2, I guess the question is if there's a downside from following Postel's law here :).

view this post on Zulip Dan Gottlieb (May 04 2021 at 16:21):

Also, for all of these cases, the documentation could be clearer!

view this post on Zulip Gino Canessa (May 04 2021 at 16:23):

I like Dan's tree, but I do have a hesitation on 2; bulk export is an expensive operation - if the data format at the end isn't usable, I think it would be better to fail early.

view this post on Zulip Dan Gottlieb (May 04 2021 at 16:32):

@Gino Canessa to clarify, this header should only impact the format for the kickoff response's optional OperationOutcome message. The output format for the bulk export is defined through the _outputFormat parameter.

view this post on Zulip Gino Canessa (May 04 2021 at 16:34):

@Dan Gottlieb d'oh. I blame it on being Tuesday? Withdrawn =)

view this post on Zulip Dan Gottlieb (May 04 2021 at 16:35):

Ha, of course, the _outputFormat param is explicitly defined in the IG as defaulting to application/fhir+ndjson so your comment is still valid :).

view this post on Zulip Dan Gottlieb (May 04 2021 at 16:35):

I think a distinction may be that given the emphasis placed on the ndjson format in the IG and early implementations, it's reasonable to assume that if a client can't handle that format it's going to explicitly submit a different _outputFormat value that the server can then reject if it's unsupported.
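A minimal sketch of that _outputFormat handling, assuming the kick-off parameters have already been parsed into a dict. The function name and the `supported` set are hypothetical, and exact string matching is a simplification.

```python
DEFAULT_OUTPUT_FORMAT = "application/fhir+ndjson"


def resolve_output_format(
    params: dict, supported: frozenset = frozenset({DEFAULT_OUTPUT_FORMAT})
) -> str:
    """Pick the export output format, defaulting to ndjson when omitted."""
    requested = params.get("_outputFormat")
    if requested is None:
        # Parameter omitted: fall back to the default named in the thread.
        return DEFAULT_OUTPUT_FORMAT
    if requested not in supported:
        # An explicitly requested format the server cannot produce is
        # rejected up front, before any expensive export work starts.
        raise ValueError(f"Unsupported _outputFormat: {requested}")
    return requested
```

Failing at kick-off time addresses the "fail early" concern raised above, since the client learns about an unusable format before the export runs.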

view this post on Zulip Yunwei Wang (May 04 2021 at 17:20):

The response sections

Response - Success

HTTP Status Code of 202 Accepted
Content-Location header with the absolute URL of an endpoint for subsequent status requests (polling location)
Optionally, a FHIR OperationOutcome resource in the body

Response - Error (e.g., unsupported search parameter)

HTTP Status Code of 4XX or 5XX
The body SHALL be a FHIR OperationOutcome resource in JSON format

states that Error MUST have a JSON formatted body but it does not say format for optional body if Success.

@Dan Gottlieb Your decision tree is good, but I think it only makes things more complicated. The only requirement here is that the server SHALL send the kick-off response with an (optional) JSON body.
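The two response shapes quoted above could be built like this. It is a sketch only: the helper names are hypothetical, and the `issue` contents (severity `error`, code `processing`) are just one plausible choice for an OperationOutcome.

```python
import json
from typing import Dict, Optional, Tuple

Response = Tuple[int, Dict[str, str], Optional[str]]


def kickoff_success(status_url: str) -> Response:
    """202 Accepted with the polling location; the body is optional, so omitted."""
    return 202, {"Content-Location": status_url}, None


def kickoff_error(status: int, message: str) -> Response:
    """4XX/5XX with a FHIR OperationOutcome serialized as JSON."""
    outcome = {
        "resourceType": "OperationOutcome",
        "issue": [
            {"severity": "error", "code": "processing", "diagnostics": message}
        ],
    }
    headers = {"Content-Type": "application/fhir+json"}
    return status, headers, json.dumps(outcome)
```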

view this post on Zulip Dan Gottlieb (May 04 2021 at 18:18):

Good point - if we want to allow flexibility here we'd have to change that language to something like "The body SHALL be a FHIR OperationOutcome resource in the format requested".

view this post on Zulip Dan Gottlieb (May 04 2021 at 18:18):

I don't feel strongly about this since I think it's reasonable to require clients and servers to only use json, but allowing different optional serialization formats if servers and clients both want to support them seems a little more in line with how other FHIR transactions work.

view this post on Zulip Yunwei Wang (May 04 2021 at 19:07):

I believe the original intention is to simplify implementation by dealing with only one content format. We could say that clearly, as we did for the .well-known endpoint, which states that the server response SHALL be in JSON format.

view this post on Zulip Dan Gottlieb (May 04 2021 at 19:45):

Agreed, and definitely don't want to describe broad support for non-json formats in the IG or remove the requirement for bulk servers to support json.

view this post on Zulip Dan Gottlieb (May 04 2021 at 19:46):

If we can do some light tweaking to the docs though, it might be worth continuing to require json support and explicitly default to json, while still allowing other formats if both the client and server choose to support them. Whether or not we allow xml, it seems like servers that don't support it will need to reject requests for it so don't think there's any additional burden for implementors.

view this post on Zulip Robert Scanlon (May 05 2021 at 16:00):

Whether or not we allow xml, it seems like servers that don't support it will need to reject requests for it so don't think there's any additional burden for implementors

Do they need to reject it, or can they simply ignore it and send back application/fhir+json? FHIR uses imprecise language here "406 Not Acceptable is the appropriate response when the Accept header requests a format that the server does not support". Is "appropriate" a SHALL or a SHOULD? It points to RFC7231, but that has language in multiple places stating that a server may decide to send back content that doesn't match what was requested in the Accept header, so the client needs to be prepared for that.
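On the client side, being "prepared for that" might look like checking the response Content-Type before parsing, rather than trusting that it matches the Accept header that was sent. A hedged sketch follows; the function name is hypothetical, and treating plain `application/json` as parseable is an assumption.

```python
import json
from typing import Optional

JSON_TYPES = ("application/fhir+json", "application/json")


def parse_error_body(content_type: Optional[str], body: str) -> Optional[dict]:
    """Parse the body as JSON only if the server says it is JSON."""
    media_type = (content_type or "").split(";")[0].strip().lower()
    if media_type in JSON_TYPES:
        return json.loads(body)
    # Some other format came back (or none was declared); the caller can
    # fall back to logging the raw text.
    return None
```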

view this post on Zulip Cooper Thompson (Jun 15 2021 at 17:43):

With my EHR developer hat on, I wonder if it is reasonable to have some relaxed checks in Inferno (BDE-04 and BDE-05) for this section of the Bulk FHIR spec, given its ambiguity? This thread so far has focused on what the right long-term solution is, but that would involve an update to a future version (and SVAP). For the purposes of validating Bulk FHIR 1.0.1, could/should we just use the most permissive interpretation of the ambiguous parts of the spec?

view this post on Zulip Robert Scanlon (Jun 16 2021 at 16:56):

For the purposes of validating Bulk FHIR 1.0.1, could/should we just use the most permissive interpretation of the ambiguous parts of the spec?

This is something we had intended to do when we started writing tests, but have found that there are quite a few ambiguities in the specs when you look extremely closely at them. And if you backed off testing anything that somebody thinks is ambiguous, you end up with “passing” implementations that clearly do not meet the intention of the spec. So I think there needs to be some judgement involved.

view this post on Zulip Robert Scanlon (Jun 16 2021 at 16:56):

I’m not saying we shouldn’t remove these tests in this case, just that as a general practice we don’t immediately remove tests when presented different interpretations of what is written in the spec.

view this post on Zulip Robert Scanlon (Jun 16 2021 at 16:57):

For this case in particular, if there is consensus on the correct behavior here, and that differs from what we are doing, then as long as it doesn’t impact ONC’s needs for the spec we can probably just remove the test. I’m not seeing consensus though.

view this post on Zulip Robert Scanlon (Jun 16 2021 at 16:57):

The consensus could be “it isn’t a requirement in this version, and will consider it for the next version”. It helps us justify removing tests if there is some clarifying language in a later version of a spec, but I don’t think that is needed if there is an obvious flaw to our interpretation.

view this post on Zulip Robert Scanlon (Jun 16 2021 at 16:58):

Without looking too closely, I do think that us failing servers for not rejecting Accept: application/fhir+xml means that the Bulk Data IG provides that requirement (whether it is derived based on other content in the IG or explicitly stated), because I believe that this isn’t a requirement in base FHIR. And it feels weird to me for Bulk Data to add a constraint like this -- what is special about that operation that makes it different than a normal FHIR operation? So by that logic, I’d think we would want to assume Bulk Data didn’t mean to add that constraint by default (unless it is very explicitly stated), which would have us remove the test. But Dan’s comment up top about how it might be important makes me want to look closer.

view this post on Zulip Yunwei Wang (Jul 13 2021 at 18:02):

Never mind, I found the ticket: https://jira.hl7.org/browse/FHIR-32062

view this post on Zulip Yunwei Wang (Jul 13 2021 at 18:16):

I am wondering if this resolution conflicts with FHIR Http spec that "The correct mime type SHALL be used by clients and servers:" http://hl7.org/fhir/http.html#mime-type @Dan Gottlieb @Josh Mandel

view this post on Zulip Dan Gottlieb (Jul 15 2021 at 13:36):

That's a good point. Perhaps it would be helpful to include a sentence in the bulk IG clarifying that while the base FHIR specification requires use of a mime type, since this operation only allows application/fhir+json, servers may treat that as intended if a client doesn't include the header?

view this post on Zulip Yunwei Wang (Jul 16 2021 at 13:54):

@Dan Gottlieb I know this was block voted last Monday, but I'm wondering if the team wants to discuss this to get consensus? I don't have strong feelings about any changes, but I also don't want to introduce conflicts between two IGs. Two questions need to be clarified:
1) Does Bulk Data export implementation have to follow FHIR HTTP spec, or say FHIR compliant?
2) What does the "correct mime type SHALL be used by clients" mean? Must request with correct Accept header or Must send payload with correct Content-type or something else?
@Lloyd McKenzie

view this post on Zulip Josh Mandel (Jul 16 2021 at 15:22):

"correct mime type shall be used by clients" can't possibly mean that clients are required to provide a fhir specific accept header on all requests; we literally have conventions baked into the API for clients that can't or won't be able to do so (i.e., _format param, which exists specifically because we know clients won't always send an Accept header).

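A sketch of how a server might combine the two conventions mentioned above. It assumes, based on my reading of the FHIR HTTP page, that `_format` takes precedence over the Accept header when both are present; the function name and the two short codes in the mapping are illustrative.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit

# Short codes a client might pass in _format (illustrative subset).
_SHORT_CODES = {
    "json": "application/fhir+json",
    "xml": "application/fhir+xml",
}


def requested_format(url: str, accept: Optional[str] = None) -> Optional[str]:
    """Return the format the client asked for, or None if it did not say."""
    # Note: parse_qs decodes a literal "+" as a space, so a full mime type
    # in the query string has to be sent as "application/fhir%2Bjson".
    values = parse_qs(urlsplit(url).query).get("_format")
    if values:
        return _SHORT_CODES.get(values[0], values[0])
    if accept:
        # Strip parameters such as "; charset=utf-8".
        return accept.split(";")[0].strip()
    return None
```

Returning `None` when neither is supplied leaves the default up to the caller, which matches the point that clients will not always send an Accept header.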
view this post on Zulip Mahesh Dabi (Aug 19 2021 at 13:00):

So will it not be mandatory check for BDE-04 and BDE-05?

Currently our FHIR server is sending 202 (Success) when the client set accept='application/fhir+xml' and prefer=return=representation for BDE-04 and BDE-05 respectively. Will this be acceptable after the new fix?

Thanks
Mahesh

view this post on Zulip Mahesh Dabi (Aug 19 2021 at 13:05):

Hi

Another scenario for bulk data status.

A scenario while testing in Inferno

Request URL:
https://fhir.secureit.co.in:9443/fhir-server/api/v4/$bulkdata-status?job=HRheZqFY_KT5TXhdXFuY7g

Response Body:
{
  "transactionTime": "2021-08-18T10:21:53.82Z",
  "request": "https://fhir.secureit.co.in:9443/fhir-server/api/v4/Group/17b58c31108-77f6763e-008b-4010-b428-c42ee061aa57/$export",
  "requiresAccessToken": false,
  "output": [
    {
      "type": "AllergyIntolerance",
      "url": "_jVMAj-5zOenYBhcTN5pa5b6DvR2rsx2Cpv4l9wuXOY/AllergyIntolerance_1.ndjson",
      "count": 10
    },
    {
      "type": "CarePlan",
      "url": "_jVMAj-5zOenYBhcTN5pa5b6DvR2rsx2Cpv4l9wuXOY/CarePlan_1.ndjson",
      "count": 2
    }
  ]
}

This test case is a pass, except that it needs few extra things in the response as mentioned below:

1 - "requiresAccessToken": false. This parameter in the response is false, we need to return this in response as true. We have done all the required changes but are not able to get this.
Could you please let us know how to achieve this?

2 - "error": [] . This request is a pass, but it is expected here that "error":[], should be a part of the response, even if there is no error in this request. Please help us in this.

Also, we are getting "An unexpected error has ocurred while deleting the job", when a DELETE is initiated. Is it that we are missing any configuration?

We are close to passing the Bulkdata test in Inferno, after we get all of these resolved.

Please let us know your thoughts.
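For point 2 above, a quick local check of the manifest can help before re-running the test suite. This is a minimal sketch: the helper name is hypothetical, and the key list is simply the fields visible in the posted response plus the `error` array the post says is expected.

```python
REQUIRED_MANIFEST_FIELDS = (
    "transactionTime",
    "request",
    "requiresAccessToken",
    "output",
    "error",
)


def missing_manifest_fields(manifest: dict) -> list:
    """List the expected top-level keys that are absent from the manifest."""
    return [key for key in REQUIRED_MANIFEST_FIELDS if key not in manifest]


# Trimmed copy of the response body posted above (one output entry kept).
posted = {
    "transactionTime": "2021-08-18T10:21:53.82Z",
    "request": "https://fhir.secureit.co.in:9443/fhir-server/api/v4/Group/17b58c31108-77f6763e-008b-4010-b428-c42ee061aa57/$export",
    "requiresAccessToken": False,
    "output": [
        {
            "type": "AllergyIntolerance",
            "url": "_jVMAj-5zOenYBhcTN5pa5b6DvR2rsx2Cpv4l9wuXOY/AllergyIntolerance_1.ndjson",
            "count": 10,
        }
    ],
}

print(missing_manifest_fields(posted))  # prints ['error']
```

Adding an empty `"error": []` array to the response makes the list come back empty.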

view this post on Zulip Michelle Vondercrone (Aug 24 2021 at 17:50):

Coming in late to the party :)

I see that @Dan Gottlieb mentioned the documentation could be clearer, but wondering where we landed on Accept header values.
Right now for Bulk Data Kick-Off Request, the documentation says for Accept header "Currently only application/fhir+json is supported." Does that mean if blank - reject request with 406 and OperationOutcome? What if the Accept header is 'application/json' or 'application/json+fhir'? Should both of those values also be rejected with 406 and OperationOutcome? I'm taking the documentation with 'only' verbiage literally. But should I be?

As for Status Request, the documentation says "When requesting status, the client SHOULD use an Accept header indicating a content type of application/json." Does this mean if blank, assume this is the default? Does this mean that other forms of the value (application/fhir+json or application/json+fhir) are acceptable? Shouldn't this follow the same requirements as expected for the bulk data kick-off request? Does this also apply to the Accept header on the Delete Request?

As for the File Request - this one is clearer - or so I think. Documentation states for Accept header "(optional, defaults to application/fhir+json)". So, if blank default in this value; if valued to application/fhir+json- use it; if valued to anything else, reject with 406. Right?

view this post on Zulip Michele Mottini (Aug 24 2021 at 18:06):

The status request does not return FHIR, it returns a non-FHIR JSON response, hence application/json for the accept header

view this post on Zulip Michele Mottini (Aug 24 2021 at 18:06):

Delete requests do not return anything, so accept is not really relevant

view this post on Zulip Michele Mottini (Aug 24 2021 at 18:08):

Our server is strict: wants application/fhir+json for the kick off request, but I think a looser interpretation would be fine as well - default to that if not specified, accept also application/json etc

view this post on Zulip Michele Mottini (Aug 24 2021 at 18:12):

File requests accept defaults to application/fhir+ndjson, not application/fhir+json
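Taken together, the four replies above amount to a small lookup from request type to Accept value. A client-side sketch follows; the dictionary keys and the helper name are made up for illustration.

```python
from typing import Dict, Optional

# Accept value per request type, as described in the replies above.
EXPECTED_ACCEPT: Dict[str, Optional[str]] = {
    "kickoff": "application/fhir+json",  # FHIR OperationOutcome on error
    "status": "application/json",        # non-FHIR JSON manifest
    "delete": None,                      # nothing returned, Accept not relevant
    "file": "application/fhir+ndjson",   # the exported ndjson files
}


def accept_headers_for(request_kind: str) -> Dict[str, str]:
    """Build the Accept header (if any) a client would send for this request."""
    accept = EXPECTED_ACCEPT[request_kind]
    return {"Accept": accept} if accept is not None else {}
```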

view this post on Zulip Yunwei Wang (Aug 24 2021 at 18:28):

Michelle Vondercrone said:

Does that mean if blank - reject request with 406 and OperationOutcome? What if the Accept header is 'application/json' or 'application/json+fhir'? Should both of those values also be rejected with 406 and OperationOutcome? I'm taking the documentation with 'only' verbiage literally. But should I be?

https://jira.hl7.org/browse/FHIR-32062 clarifies that

A client SHOULD provide this header. If omitted, a server MAY return an error or MAY process the request as if application/fhir+json was supplied.

So it is up to the server to decide what to do. I think the same applies to other client requirement

view this post on Zulip Vladimir Ignatov (Aug 24 2021 at 18:52):

Generally speaking, the purpose of the accept request header is to tell the server something like: "I only understand this format(s), so if you can reply in variety of ways please pick something that I will understand". In Bulk Data this is typically about the responses in case of error. The server may return an error as text, or as OperationOutcome or anything else. It should consider the incoming accept header while deciding how to reply. However, a server that only knows how to reply with OperationOutcome errors might as well completely ignore the accept header because there is nothing to choose from.

Some servers may decide to validate the accept header and reject unknown values (I'm not sure if that would be a correct behavior though).

Finally, this applies to every request, except for the actual file download. There are endpoints that may not return anything, like the status endpoint while the job is in progress, or the delete request to cancel the job. Those endpoints may still have to reply with an actual payload in case of error, or may even choose to reply with an informational OperationOutcome if everything is OK. In these cases they should respect the accept header.

view this post on Zulip Yunwei Wang (Aug 24 2021 at 19:47):

These two client requirements do not make sense. As @Vladimir Ignatov mentioned, the Accept header is there to let the server know the preferred format that the client can parse. But a successful response requires a header only, and the body of a failed response SHALL be a FHIR OperationOutcome resource in JSON format. So it doesn't matter what format the client prefers; the IG already says that it shall be application/fhir+json.

The Prefer header is the same. The IG already says that a successful response shall have a Content-Location header for polling. So the server has to use the async response as required by the IG. The client request header does not make any difference.

view this post on Zulip Gino Canessa (Aug 24 2021 at 20:00):

Vladimir, I think you are mixing the prefer header and the accept header. Prefer is used to specify what type of data you want back (when applicable, see FHIR HTTP), while Accept is used to specify the MIME type.

I can say that I would prefer an OperationOutcome, and I would accept either application/fhir+json or application/fhir+xml, or that I would prefer nothing (minimal), and that I only accept text/plain.

It is up to the server to determine if it can service the request or if it should be rejected. Expectations for the Accept header are documented on the same page, in the Content Types and encodings section (including responding with a 406: Not Acceptable if the server cannot fulfil a client's request).

view this post on Zulip Yunwei Wang (Aug 24 2021 at 20:09):

That is expectation for FHIR server and Bulk Data server is not a FHIR server

view this post on Zulip Gino Canessa (Aug 24 2021 at 20:41):

Probably I'm missing something, but I cannot figure out what. The specification lays out what values each of these need in the different headers under each type of request, compliant with the relevant RFCs:

I pointed to the FHIR HTTP documentation since it's a bit less dense, given that it's around a specific use case. That said, the intention behind the headers cannot be changed and still claim compliance with standards.

For interoperability, Bulk Data restricts acceptable values in these headers. Some guidance around acceptable behaviors for coercing or rejecting is also provided, but is a bit vague since it will vary by implementation (e.g., if your gateway filters for acceptable MIME types, your server will never even see the request to respond with an OperationOutcome).

What am I missing?

view this post on Zulip Vladimir Ignatov (Aug 24 2021 at 21:35):

I am probably wrong but this is how I understand it:

  • If $export operation is invoked on a FHIR server AND if that server is Bulk Data capable AND if the prefer header is "respond-async", then we enter into the Bulk Data spec
  • A bulk data client can only send the prefer header to the export kick-off endpoint and its value must be "respond-async" (optionally followed by handling=lenient). I personally find this part of the spec a little odd because it is basically saying that clients "must prefer" something specific...
  • For the other requests in the flow the prefer header is not applicable
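The first two bullets could be checked on the server with a small Prefer parser. This is a sketch under assumptions: it splits on commas only and ignores the optional `;`-separated parameters that the Prefer header syntax allows, and both function names are hypothetical.

```python
from typing import Dict, Optional


def parse_prefer(value: Optional[str]) -> Dict[str, Optional[str]]:
    """Split a Prefer header into {preference: value or None}."""
    preferences: Dict[str, Optional[str]] = {}
    for part in (value or "").split(","):
        part = part.strip()
        if not part:
            continue
        name, _, pref_value = part.partition("=")
        preferences[name.strip().lower()] = pref_value.strip() or None
    return preferences


def is_bulk_kickoff(prefer: Optional[str]) -> bool:
    """True when the client asked for asynchronous processing."""
    return "respond-async" in parse_prefer(prefer)
```

For example, `parse_prefer("respond-async, handling=lenient")` yields both preferences, while `is_bulk_kickoff("return=representation")` is false, which is the case the Inferno test discussed earlier exercises.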

view this post on Zulip Gino Canessa (Aug 24 2021 at 21:54):

I don't see anything wrong with your understanding, though I don't claim to be an expert on Bulk Data either =). Requiring specific values is common to increase interoperability and restrict use cases. Someone involved in writing the spec would be needed to know the exact details on this particular example.

That said, I see it less as "a client must prefer something" and more of "this is a thing clients can specify, and clients need to specify it in this particular way for this to work".

A similar type of example is if you are working with encryption. There is often a field to specify an algorithm. The lower level specifications may just say "a value goes here", but an interoperable specification would pick a very reduced set to support, likely one.

edit: to finish the example, even if the algorithm is known, many formats require being explicit (e.g., JWT), so the value would be fixed.

view this post on Zulip Vladimir Ignatov (Aug 24 2021 at 22:21):

Yes, you said it much better. I guess, because it does not make a lot of sense to require a header with fixed value, I have always imagined that this is just an extension point allowing other implementations. For example, if prefer is not "respond-async", then the server might process that synchronously somehow. I don't know if that is a thing but it would definitely justify the required header.

view this post on Zulip Yunwei Wang (Aug 25 2021 at 01:46):

If a server processes the request synchronously, does it still conform to the Bulk Data IG?

view this post on Zulip Vladimir Ignatov (Aug 25 2021 at 12:13):

No. That was just an example from imagination. I don't think that is really possible right now. Maybe something similar to bulk data can be done with $everything and pagination, but that is completely different discussion... I only wanted to say that a server might offer alternative behavior if prefer is not respond-async.

view this post on Zulip Josh Mandel (Aug 25 2021 at 14:10):

I'm reviewing this thread; lots of good discussion, and I'm trying to figure out if there's a specific set of improvements to the spec that someone would like to propose, beyond what you see in the CI build (http://build.fhir.org/ig/HL7/bulk-data/) reflecting ballot reconciliation.

view this post on Zulip Michele Mottini (Aug 25 2021 at 14:16):

Specs are fine in my opinion

view this post on Zulip Yunwei Wang (Aug 25 2021 at 14:49):

My concern is whether such a strong requirement on the client request header has any effect, since the server's response is already regulated by the IG itself. As an example, no matter whether the client prefers respond-async or not, a Bulk Data server MUST respond asynchronously. I think it is more like: client SHOULD request xxx and server MAY respond differently.

view this post on Zulip Dan Gottlieb (Aug 25 2021 at 15:09):

@Yunwei Wang think that's where we landed for STU2 - http://build.fhir.org/ig/HL7/bulk-data/export.html#headers ?

view this post on Zulip Vladimir Ignatov (Aug 25 2021 at 16:42):

I don't know if this is the right place but I would propose small addition to the Backend Services spec. Where it says:

For consistency in implementation, the client’s JWK SHALL be shared with the FHIR server using one of the following techniques:
URL to JWK Set (strongly preferred). This URL communicates the TLS-protected endpoint where the client’s public JWK Set can be found. This
endpoint SHALL be accessible via TLS without authentication or authorization. Advantages of this approach are that it allows a client to rotate
its own keys by updating the hosted content at the JWK Set URL, assures that the public key used by the FHIR server is current, and avoids the
need for the FHIR server to maintain and protect the JWK Set.

should we add a sentence saying that the JWKS URL can also be anywhere in the cloud? That would allow CLI clients to also use that type of auth if needed. Right now it sounds like it should be a client-hosted "endpoint" (assuming the client is also a server and is accessible on the same network).

view this post on Zulip Josh Mandel (Aug 25 2021 at 17:26):

should we add a sentence saying that the JWKS URL can also be anywhere in the cloud?

I think the language you quoted:

This URL communicates the TLS-protected endpoint where the client’s public JWK Set can be found. This
endpoint SHALL be accessible via TLS without authentication or authorization.

is unambiguous on this point -- the word "endpoint" here just means ... a TLS-protected URL. https://security.stackexchange.com/questions/242382/what-is-an-endpoint-in-the-context-of-tls captures the sense pretty well.

view this post on Zulip Vladimir Ignatov (Aug 25 2021 at 18:15):

I wonder if I am the only one confused by this. Is it clear for everybody?


Last updated: Apr 12 2022 at 19:14 UTC