Stream: implementers
Topic: Resource.id and endpoint STU3/R4
Alexander Henket (Jan 20 2022 at 11:26):
Suppose an Observation with .id 123 on base example.org/fhir/stu3. Suppose you move to R4 with or without changing the base:
Is it expected that you keep the same Resource.id in R4 as you had in STU3? I've always assumed yes.
What could be valid reasons for assigning a new id, effectively copying the resource?
- different endpoint?
- different version of FHIR?
- same version of FHIR but different version of underlying profile?
- because you have a FHIR store and optimize retrieval using the right FHIR version/profile this way
- none of the above?
René Spronk (Jan 20 2022 at 12:28):
I'd say: Same set of backend data (whether legacy or native FHIR), same id. Patient Smith is patient Smith (with the same id) regardless of what versions/profiles the API supports. That would be the most RESTful way of thinking about ids. However, I expect the spec to be silent on this (and no, I didn't look), thus effectively allowing you to do whatever you want.
Rik Smithies (Jan 20 2022 at 12:55):
yes I think it depends if you think of it as the same data. If it is not a copy, but a new way to access the same data then the same id makes sense (and the "old" copy is no longer available - same id, different url, would be technically different but confusing).
Side issue: "The logical id is unique within the space of all resources of the same type on the same server. Once assigned by the server, the id is never changed." But what makes it the same resource, and does it matter if it is not?
Is it acceptable to randomly reassign all the ids on a server (say, overnight, when no transactions are in progress)? If not, why not? Who has the right to keep my ids?
Cooper Thompson (Jan 20 2022 at 14:01):
I think the combination of namespace (base URL) and id should consistently identify the same resource. However, if you change the namespace (base URL), then there isn't any implication that the ID stays the same. So example.org/fhir/stu3/Patient/1234 and example.org/fhir/r4/Patient/1234 can be different patients. However if you start hosting an R4 resource on example.org/fhir/stu3/Patient/1234, then you'd need to keep the same identity as the previous STU3 resource at that same URL.
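A minimal sketch of that rule from the client's side, assuming resource URLs of the plain form [base]/[type]/[id] (no _history or query parts). ResourceKey and parse_resource_url are hypothetical names, not part of any FHIR library:

```python
from typing import NamedTuple


class ResourceKey(NamedTuple):
    """Client-side identity of a resource: base URL + type + logical id."""
    base: str
    resource_type: str
    logical_id: str


def parse_resource_url(url: str) -> ResourceKey:
    """Split a resource URL of the assumed form [base]/[type]/[id]."""
    base, resource_type, logical_id = url.rstrip("/").rsplit("/", 2)
    return ResourceKey(base, resource_type, logical_id)


stu3 = parse_resource_url("example.org/fhir/stu3/Patient/1234")
r4 = parse_resource_url("example.org/fhir/r4/Patient/1234")

# Same type and logical id, but different bases: the keys differ, so a
# client cannot conclude that these are the same patient.
print(stu3 == r4)
print(stu3.logical_id == r4.logical_id)
```

Keying a local cache on the whole tuple rather than on the id alone keeps data from two bases from being merged by accident.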
Cooper Thompson (Jan 20 2022 at 14:04):
Assigning new IDs in the same namespace can be really dangerous. It basically makes it look like all the data is "duplicated", which can be bad for clinical data. If a client had queried you before the upgrade, and has stored a local copy, then queries you after, and pulls that down into their local copy, then everything is either duplicated, or possibly assigned to the wrong patient. It would be a giant mess unless every client were super conservative in their processing (which, in my experience, they aren't).
Alexander Henket (Jan 20 2022 at 14:13):
I see a lot of support for my ground thought: same information, same id, regardless of technical representation.
I did not necessarily realize that while the server may know that example.org/fhir/stu3/Patient/123 and example.org/fhir/r4/Patient/123 are the same patient, this cannot be assumed or inferred by a client. This imo makes multiple end points per version of FHIR less desirable as it creates virtual copies of data.
I did note the imprecision in "The logical id is unique within the space of all resources of the same type on the same server. Once assigned by the server, the id is never changed." as it leaves open what defines a server. The only thing that makes sense to me is that 'server' is equal to endpoint.
Cooper Thompson (Jan 20 2022 at 14:38):
Alexander Henket said:
This imo makes multiple end points per version of FHIR less desirable as it creates virtual copies of data.
I'm not sure I agree here. Since two endpoints are logically different, it isn't really any different than having endpoints for different hospitals. Several years ago we had a lot of discussion (at a WGM) about how to handle multiple FHIR versions, whether to use HTTP headers, different endpoints, etc. It was all kinda a wash, but the common practice so far has been multiple endpoints and that is working fine as far as I know.
John Moehrke (Jan 20 2022 at 14:41):
the id should only be seen as a unique identifier at that endpoint for that resource type. It should not be expected that an id on one endpoint is the same thing at another endpoint. This is very important. The .identifier is there to carry identifiers that might cross many endpoints (and business systems).
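A rough sketch of comparing resources from two endpoints by business identifier instead of by id. The helper names, the identifier system http://example.org/mrn, and the sample values are made up for illustration; only the identifier.system / identifier.value element names come from FHIR:

```python
from typing import Set, Tuple


def identifier_tokens(resource: dict) -> Set[Tuple[str, str]]:
    """Collect (system, value) pairs from a resource's identifier list."""
    return {
        (ident["system"], ident["value"])
        for ident in resource.get("identifier", [])
        if ident.get("system") and ident.get("value")
    }


def may_be_same(a: dict, b: dict) -> bool:
    """True if both resources have one type and share a business identifier.

    A shared identifier is a hint, not proof, and a resource without any
    identifier can never be matched this way.
    """
    if a.get("resourceType") != b.get("resourceType"):
        return False
    return bool(identifier_tokens(a) & identifier_tokens(b))


mrn = {"system": "http://example.org/mrn", "value": "A-001"}  # hypothetical
from_stu3 = {"resourceType": "Patient", "id": "123", "identifier": [mrn]}
from_r4 = {"resourceType": "Patient", "id": "9f2c", "identifier": [mrn]}
no_identifier = {"resourceType": "Patient", "id": "123"}

print(may_be_same(from_stu3, from_r4))        # ids differ, identifier matches
print(may_be_same(from_stu3, no_identifier))  # ids match, nothing to compare
```

Note that the second comparison fails even though both ids are "123": the sketch deliberately ignores id, since the two resources are assumed to come from different endpoints.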
John Moehrke (Jan 20 2022 at 14:45):
can the same id be used for the same data on two different endpoints? Yes, but a client that expects this is WRONG. That client will likely work fine in the initial setting, and totally fail in other settings.
Alexander Henket (Jan 20 2022 at 19:37):
Different hospitals with different endpoints makes sense, unless they share a common backend. example.org/fhir/stu3/Patient/123 and example.org/fhir/r4/Patient/123 MAY be the same Patient and one may use the Patient.identifier to attest to that. Unless that proves that these are in fact the same Patient, one should assume they are different. I know for a fact that a lot of resources do not have an identifier in our Dutch context so unless you are willing to venture into other deduplication schemes like code + date, having multiple endpoints almost unavoidably leads to virtual duplicates of data. Technically there is no issue in multiple endpoints. Functionally/clinically I'm not so sure.
Would the discussion from all those years ago turn out the same today I wonder.
Derek Ritz (Jan 20 2022 at 20:25):
Alexander Henket said:
I see a lot of support for my ground thought: same information, same id, regardless of technical representation.
@Alexander Henket -- I like this idea a lot. I believe that our lives would be fundamentally simpler if the same exact thing had the same exact id, no matter at which endpoint it was discoverable. Where there is an overarching governance over the health data, this is eminently doable.
Cooper Thompson (Jan 20 2022 at 21:13):
@Derek Ritz I'm not sure if that is what Alexander meant, but it is not practical to have the same ID across endpoints. You can have the same business identifier but not the same FHIR ID. Using the same FHIR ID is impossible in practice, and not nearly as useful as you probably think (because of merges, and identity issues).
Lloyd McKenzie (Jan 20 2022 at 21:26):
Resource id is a primary key. If you have multiple interfaces to the same database, then it's reasonable to surface that same key over each of them. If you have different databases that aren't explicitly synchronized/replicated, then it's exceptionally unlikely that the keys will be the same. When you're hitting an interface, you won't know what lies underneath and will thus have to always assume that different url + id == potentially different instance and behave accordingly.
Vassil Peytchev (Jan 21 2022 at 00:55):
I think the foundational concept of FHIR base is sufficiently clearly described in the specification:
The Service Base URL is the address where all of the resources defined by this interface are found. The Service Base URL takes the form of
http{s}://server{/path}
and further down
For example:
http://myserver.com/Patient/1 and https://myserver.com/Patient/1 refer to the same underlying object, while http://myserver.com:81/Patient/1 is a distinct entity from either of the above.
Using different ports on the same server makes the same kind of difference as using different path components on the same server.
The Service Base URL cannot contain any semantics, therefore all discussion on implying equivalence between resources with different Service Base URLs would be moot.
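The comparison described in the quoted text can be sketched as follows. This is one possible reading, not a normative algorithm: the scheme is ignored, while host, explicit port, and path are compared. url_identity is a hypothetical helper, and an explicit default port such as :80 is not normalized here:

```python
from urllib.parse import urlsplit


def url_identity(url: str) -> tuple:
    """Reduce a URL to the parts that distinguish it in this sketch.

    The scheme is dropped, because the quoted text treats http and https
    as the same object. Host, explicit port, and path are kept.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    return (host, parts.port, parts.path.rstrip("/"))


plain = url_identity("http://myserver.com/Patient/1")
secure = url_identity("https://myserver.com/Patient/1")
other_port = url_identity("http://myserver.com:81/Patient/1")

print(plain == secure)      # scheme alone does not change identity
print(plain == other_port)  # a different port does

# A different path component is treated like a different port.
stu3 = url_identity("http://example.org/fhir/stu3/Patient/123")
r4 = url_identity("http://example.org/fhir/r4/Patient/123")
print(stu3 == r4)
```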
Tjerk Drouen (Jan 25 2022 at 08:54):
Technical storage challenge
There is a technical challenge for the storage-system when objects change due to model-changes and versioning.
The structureDefinition of the stored object may have changed significantly across profile-meta-versions and/or fhir-versions.
Freedom of assigning logical ids
https://build.fhir.org/resource.html#id states
Each resource has an id element which contains the logical id of the resource assigned by the server responsible for storing it. ... Logical Ids are always opaque, and external systems need not and should not attempt to determine their internal structure.
IMO: FHIR Resource Servers are free to create new logical instances if object-storage-requirements call for this
IMO: The business identifier should be used to indicate the same object across multiple endpoints.
http{s}://{server}{/fhir-version}{/path}?identifier={business identifier}|{StructureDefinition.version}
Not the logical id
http{s}://{server}{/fhir-version}{/path}{logical-id}|{StructureDefinition.version}
How to indicate a specific version
Based on https://build.fhir.org/structuredefinition-definitions.html#StructureDefinition.fhirVersion
There may be different structure definition instances that have the same identifier but different versions. The version can be appended to the url in a reference to allow a reference to a particular business version of the structure definition with the format [url]|[version].
Possibility to use logical-id uniquely as joint agreement
An additional constraint should be imposed if logical-id is to be used as unique key.
See the unless phrase below.
https://www.hl7.org/fhir/managing.html#distributed
... the logical id of the resource is not guaranteed to be unique (unless all resources have a UUID for the logical id, which is allowed but not required).
Lloyd McKenzie (Jan 25 2022 at 18:09):
Across endpoints, business identifier (i.e. [x].identifier) is the only safe way to decide that something is the "same", and even that isn't necessarily always "safe" - there are lots of situations where the same business identifier might exist on more than one record. However, if a system exposes a single endpoint and uses HTTP content negotiation to determine the version, then id would be the same regardless of what version you ask for - even if the data is represented quite differently between versions.
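A small sketch of the single-endpoint case, using the fhirVersion MIME-type parameter that the FHIR specification defines for the Accept header. The base https://example.org/fhir and the helper versioned_read are assumptions for illustration, and whether a given server honors the parameter is not guaranteed. The request objects are only built, not sent:

```python
from urllib.request import Request


def versioned_read(base: str, resource_type: str, logical_id: str,
                   fhir_version: str) -> Request:
    """Build a read request that asks for one FHIR version via Accept."""
    url = f"{base.rstrip('/')}/{resource_type}/{logical_id}"
    accept = f"application/fhir+json; fhirVersion={fhir_version}"
    return Request(url, headers={"Accept": accept})


base = "https://example.org/fhir"  # hypothetical single endpoint
as_stu3 = versioned_read(base, "Patient", "123", "3.0")
as_r4 = versioned_read(base, "Patient", "123", "4.0")

# Same URL, hence same base + id; only the requested representation differs.
print(as_stu3.full_url == as_r4.full_url)
print(as_stu3.get_header("Accept"))
print(as_r4.get_header("Accept"))
```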
Derek Ritz (Jan 26 2022 at 16:18):
Where each resource.id is a UUID, it would seem that the usefully simplifying expectation that "same information, same id" becomes reasonable... and doable.
Lloyd McKenzie (Jan 26 2022 at 17:17):
Except that id isn't required to be a UUID - and it often isn't (and can't be made to be). Agree that if the id is a UUID, then it ought to represent the same record. However, there's no guarantee that it'll be the same information. The instance on one server might be years - or even decades - out of date.
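As a sketch of the check a client might make before even considering "same id, same record" across endpoints: test whether the id is a canonical UUID at all. is_uuid_id is a hypothetical helper, and a positive result says nothing about how current the data behind the id is:

```python
import uuid


def is_uuid_id(logical_id: str) -> bool:
    """True only if the id is a UUID in canonical hyphenated form."""
    try:
        return str(uuid.UUID(logical_id)) == logical_id.lower()
    except ValueError:
        return False


print(is_uuid_id("123"))
print(is_uuid_id("3f2b8c1e-5d47-4a9b-8e21-0c6d9a7b1f34"))
```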
Last updated: Apr 12 2022 at 19:14 UTC