Stream: implementers
Topic: HTTP, Idempotency, and FHIR
Karl M. Davis (May 26 2017 at 17:43):
"Idempotent": An idempotent operation is one that can be retried/replayed without affecting the outcome; the result will be the same if the operation is applied one time or a hundred times.
By my reading, there are 4 "writey" operations in the FHIR spec:
1. PUT [base]/[type]/[id]
2. PUT [base]/[type]?[search parameters]
3. POST [base]/[type]
4. POST [base]/[type] with If-None-Exist: [search parameters]
In HTTP, PUT operations are generally expected to be idempotent. I'm not a huge "thou must perfectly follow the letter and spirit of the spec" kind of person, but that particular assumption is baked into a lot of things. For instance, the Apache HTTP client library is (by default) configured to allow retry of GETs and PUTs, but not allow retry of POSTs. No huge deal there: I can override the defaults. But things get a little bit more tricky with things that users don't directly control, such as HTTP proxies -- some of those will be configured to automatically retry things or not based on the HTTP operation.
Unfortunately, of those 4 FHIR "writey" operations, only the last one (POST [base]/[type] with If-None-Exist: [search parameters]) is truly idempotent. The two PUT operations, per the spec, will create duplicate versions if they're retried.
Why do I care? Because I'm pushing millions of FHIR write operations, a non-zero number of them fail for transport reasons, and I need to automatically retry them when that happens. So why don't I just use option #4? Because unfortunately, that option doesn't allow me to specify the resources' logical IDs as I'm creating them, which I'd really really like to do (makes maintaining the system MUCH simpler down the road).
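To make the difference concrete, here is a rough sketch using a hypothetical in-memory store (`ToyFhirStore` is not modeled on any real server). It assumes, as the message states, that every accepted PUT appends a version, and it follows the conditional create rules of no match creating, one match returning the existing resource, and several matches being rejected. The search criteria are simplified to a Python predicate and the identifier to a plain string.

```python
class ToyFhirStore:
    """Hypothetical in-memory store; not modeled on any real FHIR server."""

    def __init__(self):
        self.history = {}   # logical id -> list of stored versions
        self._next_id = 1

    def put(self, logical_id, body):
        """Update-as-create: every accepted PUT appends a new version."""
        versions = self.history.setdefault(logical_id, [])
        versions.append(dict(body))
        return (201 if len(versions) == 1 else 200), logical_id

    def post(self, body, if_none_exist=None):
        """Create with a server-assigned id, optionally conditional."""
        if if_none_exist is not None:
            matches = [rid for rid, versions in self.history.items()
                       if if_none_exist(versions[-1])]
            if len(matches) == 1:
                return 200, matches[0]   # already there: nothing is created
            if len(matches) > 1:
                return 412, None         # ambiguous criteria
        logical_id = str(self._next_id)
        self._next_id += 1
        self.history[logical_id] = [dict(body)]
        return 201, logical_id


store = ToyFhirStore()

# Replaying the same PUT leaves two identical versions behind.
patient = {"resourceType": "Patient", "identifier": "mrn-42"}
store.put("abc", patient)
store.put("abc", patient)

# Replaying the same conditional POST leaves exactly one resource,
# but the client never gets to choose its logical id.
same_mrn = lambda resource: resource.get("identifier") == "mrn-7"
first = store.post({"resourceType": "Patient", "identifier": "mrn-7"}, same_mrn)
second = store.post({"resourceType": "Patient", "identifier": "mrn-7"}, same_mrn)
```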
Karl M. Davis (May 26 2017 at 17:46):
Anyone geeky/nerdy enough to have thoughts on this? I've checked and #hapi doesn't honor the If-None-Exist header on PUTs -- and it shouldn't, per the spec.
Karl M. Davis (May 26 2017 at 17:47):
What I'd really love is to be able to add an If-None-Exist: true or somesuch header to option 1 submissions, and have it be honored as a different type of conditional create operation.
Karl M. Davis (May 26 2017 at 17:49):
Well, that's the quick and dirty solution, anyways. If you guys really wanted to ensure that PUTs were idempotent (per the HTTP spec), the FHIR spec would state that retrying the exact same PUT request with the exact same content would not result in duplicate versions.
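One way a server could behave like that, sketched with a hypothetical in-memory store. Comparing canonical JSON is an assumption made for illustration; a real server would also have to decide how to treat fields such as meta.

```python
import json


class NoOpAwareStore:
    """Hypothetical store that skips a PUT whose content matches the current version."""

    def __init__(self):
        self.history = {}   # logical id -> list of canonical JSON strings

    def put(self, logical_id, body):
        canonical = json.dumps(body, sort_keys=True)
        versions = self.history.setdefault(logical_id, [])
        if versions and versions[-1] == canonical:
            return 200, len(versions)        # replay: no new version written
        versions.append(canonical)
        return (201 if len(versions) == 1 else 200), len(versions)


store = NoOpAwareStore()
body = {"resourceType": "Patient", "active": True}
created = store.put("abc", body)             # first attempt
replayed = store.put("abc", dict(body))      # retry with identical content
changed = store.put("abc", {"resourceType": "Patient", "active": False})
```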
Joel Schneider (May 26 2017 at 17:56):
Wouldn't this (version aware updates) meet your requirement for a PUT operation that doesn't result in duplicate versions?
http://build.fhir.org/http.html#versionaware
Joel Schneider (May 26 2017 at 17:57):
Assuming the server supports it, that is.
Karl M. Davis (May 26 2017 at 18:13):
Unfortunately not, because If-Match: null isn't supported. I'd need something like that to say "create this resource that shouldn't exist yet, but don't add a second version of it if I have to retry this request."
Karl M. Davis (May 26 2017 at 18:17):
Put another way: yeah, that could work for updates, but it wouldn't work for creates.
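A rough sketch of why version-aware updates cover the update half but not the create half. `VersionedStore` is a hypothetical in-memory model; it assumes a version mismatch is answered with 412 and, as a further assumption, that a resource which does not exist yet fails any If-Match check.

```python
class VersionedStore:
    """Hypothetical store that honors If-Match on PUT."""

    def __init__(self):
        self.history = {}   # logical id -> list of stored versions

    def etag(self, logical_id):
        """Weak ETag carrying the current version number, e.g. W/"1"."""
        return 'W/"%d"' % len(self.history[logical_id])

    def put(self, logical_id, body, if_match=None):
        versions = self.history.get(logical_id, [])
        if if_match is not None:
            # Assumption: a missing resource fails any If-Match check.
            if not versions or if_match != self.etag(logical_id):
                return 412
        versions.append(dict(body))
        self.history[logical_id] = versions
        return 201 if len(versions) == 1 else 200


store = VersionedStore()
store.put("abc", {"resourceType": "Patient", "active": False})   # version 1
tag = store.etag("abc")

# The update and its replay carry the same If-Match value.
first = store.put("abc", {"resourceType": "Patient", "active": True}, if_match=tag)
retry = store.put("abc", {"resourceType": "Patient", "active": True}, if_match=tag)

# For a create there is no existing version to name in If-Match.
create = store.put("new", {"resourceType": "Patient"}, if_match='W/"0"')
```

In this model the replay is rejected instead of adding a third version, so the client has to read a 412 on a retry as a sign that the first attempt probably landed.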
Grahame Grieve (May 26 2017 at 18:22):
I'm not following this exactly - why does it matter if it creates a new version?
Karl M. Davis (May 26 2017 at 18:24):
Wastes space with versions that are exact dupes.
Joel Schneider (May 26 2017 at 18:25):
It seems to me you may be looking for a custom operation.
For resource creation, the standard RESTful API uses a different HTTP verb (POST), which may support the If-None-Exist header.
Karl M. Davis (May 26 2017 at 18:25):
Is more or less of a problem depending on how many resources one includes in each Bundle.
Karl M. Davis (May 26 2017 at 18:26):
Joel: Yup. But that POST operation doesn't allow clients to specify the logical ID.
Joel Schneider (May 26 2017 at 18:30):
Why would the client need the ability to specify the logical id for a newly created resource? Isn't that id supposed to be local to (and assigned by) the server?
Karl M. Davis (May 26 2017 at 18:31):
Quoting the spec:
Note that servers MAY choose to allow clients to PUT a resource to a location that does not yet exist on the server - effectively, allowing the client to define the id of the resource. Whether a server allows this is a deployment choice based on the nature of its relationships with the clients. While many servers will not allow clients to define their ids, there are several reasons why it may be necessary in some configurations:
- client is reproducing an existing data model on the server, and needs to keep original ids in order to retain ongoing integrity
- client is a server doing push based pub/sub (this is a special case of the first reason)
- multiple clients doing push in the context of agreed data model shared across multiple servers where ids are shared across servers
Alternatively, clients may be sharing an agreed identification model (e.g. key server, scoped identifiers, or UUIDs) where clashes do not arise.
Servers can choose whether or not to support client defined ids, and indicate such to the clients using CapabilityStatement.rest.resource.updateCreate.
Grahame Grieve (May 26 2017 at 18:33):
there's a number of reasons why we would not state that " retrying the exact same PUT request with the exact same content would not result in duplicate versions"
Grahame Grieve (May 26 2017 at 18:34):
we might say that a server was allowed to not create duplicate versions - that is, to ignore a change that isn't a change. that does seem possible
Grahame Grieve (May 26 2017 at 18:34):
But I'm wondering about the record-keeping requirements / provenance aspects of that.
Grahame Grieve (May 26 2017 at 18:35):
and I'm also wondering what % of time this occurs - it seems sufficiently rare to me that the loss of resources is not overly strong
Karl M. Davis (May 26 2017 at 18:35):
What about a smaller change, like a mechanism to specify If-Match: null or something equivalent?
Karl M. Davis (May 26 2017 at 18:37):
It's a bit frustrating that all the _pieces_ I need are there: client-specified logical IDs, a mechanism to make creates idempotent, a mechanism to make updates idempotent, etc. But I can't combine them in the specific configuration that I need.
Joel Schneider (May 26 2017 at 18:44):
Propagating logical IDs seems like a special case to me, involving systems which are not loosely coupled.
Does the data travel in only one direction between these systems? Is the downstream system a read-only copy of the upstream one?
Karl M. Davis (May 26 2017 at 18:55):
Yes and yes.
Karl M. Davis (May 30 2017 at 00:23):
This is an odd debate to me:
- Grahame is arguing that duplicate versions/idempotency shouldn't matter, but there's already If-Match support for updates, so the spec has conceded that issue.
- Joel is arguing that client-specified logical IDs shouldn't be needed, but there's already support for that with PUT [base]/[type]/[id] submissions, so the spec has conceded that issue.
Are these features that folks disagree with and are hoping to pull from the spec eventually? Or is there some other reason to oppose adding a way to avoid duplicate versions for resource creates (instead of just updates, as things exist now)?
Grahame Grieve (May 30 2017 at 00:26):
I'm not arguing anything - just exploring the issues. But adding if-match because we need to support version aware updates is not the same as conceding that idempotency at this level matters
Grahame Grieve (May 30 2017 at 00:27):
(nor is it saying it doesn't)
Karl M. Davis (May 30 2017 at 00:38):
For the Blue Button FHIR server, idempotent creates get us two things:
- Allows us to retry the creates when they fail. We're seeing more socket/timeout errors in AWS than I've seen other places. Still not "a lot", but it's nice not having to worry about polluting the version history when it happens.
- Allows us to avoid building a lot of additional progress-tracking logic into our ETL service. It already checkpoints our initial load every .1% or so, so we can now stop in the middle of those .1% chunks (which can still take hours) and safely restart later without worrying about polluting the version history or wasting disk space.
These are more concrete benefits than we were getting from predictable logical IDs, so I've just finished updating our code to handle creates by POSTing with If-None-Exist. It'd be nice to not have to make that tradeoff, but oh well for now.
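For reference, a conditional create request of this kind might be built as follows. This is only a sketch: the base URL, identifier system, and search criteria are made-up values, and the request is constructed with Python's standard library but never sent.

```python
import json
import urllib.request


def build_conditional_create(base_url, resource, criteria):
    """Build (but do not send) a POST carrying an If-None-Exist header."""
    return urllib.request.Request(
        url="%s/%s" % (base_url.rstrip("/"), resource["resourceType"]),
        data=json.dumps(resource).encode("utf-8"),
        headers={
            "Content-Type": "application/fhir+json",
            # Search parameters the server evaluates before creating anything.
            "If-None-Exist": criteria,
        },
        method="POST",
    )


# Hypothetical values throughout; note that the body carries no logical id.
request = build_conditional_create(
    "https://fhir.example.org/base",
    {"resourceType": "Patient",
     "identifier": [{"system": "urn:example:mrn", "value": "42"}]},
    "identifier=urn:example:mrn|42",
)
```

Because the same criteria travel with every replay, the request can be retried after a timeout; the cost is that the server, not the client, picks the logical ID.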
Joel Schneider (May 30 2017 at 01:58):
I'm not arguing anything either. Just observing that using the FHIR API to replicate logical IDs across multiple servers looks like a new use case.
Grahame Grieve (May 30 2017 at 01:59):
that's not new - we've had this from the beginning for many reasons
Joel Schneider (May 30 2017 at 02:18):
In that case, I must be missing something. Is there a way to specify the logical ID when creating a new resource via the REST API?
Grahame Grieve (May 30 2017 at 02:18):
yes. Just PUT it to the place you want it at
Joel Schneider (May 30 2017 at 07:01):
Thanks, that's helpful. Now I see the update (PUT) interaction "creates an initial version if no resource already exists for the given id."
Michael Lawley (Jun 01 2017 at 21:12):
HTTP's If-None-Match: * would seem to match this use-case exactly (https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.26)
Jens Villadsen (Jun 02 2017 at 07:25):
What about If-None-Match with the client-generated logical ID as etag in the POST? Would that be considered a violation of the standard?
Michael Lawley (Jun 02 2017 at 11:32):
PUT with If-None-Match: * gives you an idempotent create operation. Idempotent update is also do-able, but you need to know the E-Tag of the resource that you're updating. I'm not clear if @Karl M. Davis's use case requires update or just create
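A small sketch of the create half of that idea, using a hypothetical in-memory store. It follows the HTTP rule that If-None-Match: * fails with 412 when the target already exists; this is not behavior the FHIR spec defined for PUT at the time of this thread.

```python
class CreateOnlyStore:
    """Hypothetical store that honors If-None-Match: * on PUT."""

    def __init__(self):
        self.history = {}   # logical id -> list of stored versions

    def put(self, logical_id, body, if_none_match=None):
        exists = logical_id in self.history
        if if_none_match == "*" and exists:
            return 412   # precondition failed: stored versions are left alone
        self.history.setdefault(logical_id, []).append(dict(body))
        return 200 if exists else 201


store = CreateOnlyStore()
body = {"resourceType": "Patient", "active": True}
first = store.put("abc", body, if_none_match="*")   # creates version 1
retry = store.put("abc", body, if_none_match="*")   # replay after a timeout
```

The client keeps control of the logical ID, and a replay ends in 412 with a single stored version rather than a duplicate.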
Jens Villadsen (Jun 02 2017 at 11:44):
@Michael Lawley but don't you then leave it up to server to decide if it has seen the resource before - meaning it will be (entirely) proprietary? If the identifiers issued by the client can be embedded in the etag (using weak validation), it would be up to the client to decide what it sees as duplicate resources at the moment of creation?
Karl M. Davis (Jun 02 2017 at 11:49):
Our use case requires both create and update, but we can use different options for each, if need be.
Joel Schneider (Jun 02 2017 at 16:38):
I agree, PUT with If-None-Match: * looks reasonable for idempotent creates.
The version aware update mechanism (if supported) could perform idempotent updates.
http://build.fhir.org/http.html#versionaware
Joel Schneider (Jun 07 2017 at 01:32):
Has FHIR guidance regarding usage of PUT with If-None-Match: * yet been considered, or decided? (e.g. to support updateCreate combined with conditionalCreate)
Notably, as pointed out here by @Michael Lawley, the If-None-Match section of RFC 2616 indicates If-None-Match: * is "intended to be useful in preventing races between PUT operations." (a.k.a. the "lost update" problem)
https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.26
Per @Karl M. Davis, a header such as If-None-Match: * (or maybe If-None-Exist: true) would enable precise data replication between version-aware servers, where the downstream server is an exact, read-only copy of the upstream one.
The FHIR spec presently describes usage of If-None-Match in conjunction with GET, but is apparently silent on its usage (if any) with PUT. The spec additionally describes usage of a HL7-defined extension header, If-None-Exist, in conjunction with POST.
http://build.fhir.org/http.html
With this in mind, would it be reasonable to add guidance for using PUT with If-None-Match: * to the FHIR spec?
Grahame Grieve (Jun 07 2017 at 01:35):
we document use of the if-match header. I don't understand what extra is needed?
Joel Schneider (Jun 07 2017 at 02:17):
Using If-Match, is there a way to require that the target resource does not exist? E.g., in a PUT request which attempts to create a new resource with an assigned id.
Michael Lawley (Jul 27 2020 at 00:34):
This is an old thread, but I'm now coming around to implement PUT and If-Match but this doesn't help with the case of using PUT to do a create -- I am planning to use If-None-Match: * rather than the FHIR-invented If-None-Exist.
It would be really nice if the spec were updated to deal with this case and provide canonical advice.
Aaron Nash (Sep 11 2020 at 18:32):
Is there any consensus here? I'm running into a similar issue. There is no standard way to create a resource with a client specified ID AND ensure an existing resource is not updated, if one already exists. If-None-Match (https://tools.ietf.org/html/rfc7232#section-3.2) seems like the right solution.
Paul Church (Sep 11 2020 at 18:53):
I think we should add "If-None-Match: *" to PUT for this case.
I think there is also some additional clarification that would be useful on If-Match. Can I If-Match on a deleted version (tombstone)? If the resource does not exist and never existed, and I PUT with "If-Match: xyz", is that still a 412 Precondition Failed? Should we call out "If-Match: *" as allowed by RFC 7232 to match any existing version?
Aaron Nash (Sep 11 2020 at 19:26):
I created https://jira.hl7.org/browse/FHIR-28495 for supporting "If-None-Match" for update operations.
@Paul Church those seem like good clarifications to make. I haven't yet developed an opinion on the specific questions you've raised. Do you have any thoughts on the correct behaviors for those instances? I do believe it would be best if FHIR matched the behaviors described in RFC 7232 for the sections that FHIR supports.
Paul Church (Sep 11 2020 at 19:51):
I think it should be possible to If-Match on tombstones, the non-version should not If-Match any version, and If-Match: * would be useful.
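The three behaviors proposed in that last message could be sketched as a single precondition check. The function and parameter names are hypothetical, and the handling of If-Match: * against a tombstone is an assumption borrowed from RFC 7232's "current representation" wording, since the thread does not settle it.

```python
def if_match_passes(if_match, current_version, is_tombstone=False):
    """Evaluate an If-Match header under the behaviors proposed above.

    current_version is None when the resource never existed; is_tombstone
    marks a resource whose latest version is a deletion.
    """
    if current_version is None:
        return False   # never existed: matches no version, not even "*"
    if if_match == "*":
        # Assumption: "*" requires a live (non-deleted) resource.
        return not is_tombstone
    # A specific version id can match live versions and tombstones alike.
    return if_match == 'W/"%s"' % current_version


live = if_match_passes('W/"3"', "3")
tombstone = if_match_passes('W/"3"', "3", is_tombstone=True)
never_existed = if_match_passes('W/"3"', None)
any_live = if_match_passes("*", "3")
```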
Last updated: Apr 12 2022 at 19:14 UTC