FHIR Chat · clarity on id, reference, fullURI, identifier

Stream: implementers

Topic: clarity on id, reference, fullURI, identifier

John Moehrke (Aug 06 2020 at 13:05):

I need a clear explainer of the relationship and translation rules between an id, Reference, fullUrl, and Identifier. I think there are times at which a value can be all four. I understand there are points in the lifecycle of a resource when they are all different. I am just not clear. Most of the examples always are relative, so it is not clear when a full url is used, and is a full url path to the resource instance the same as a fullUrl, and what are each of these flavors called?

Lloyd McKenzie (Aug 06 2020 at 14:58):

id is the server-specific version-independent record identifier. There are strict rules around format and length. It must be unique for a given resource type on a given server. It can be the same on distinct servers that host the 'same' information, but there's no expectation it will be - and the same resource id on different servers could point to completely different data (on different patients). I find it useful to think of 'id' as equivalent to a database primary key.
Reference is how you generally point from one resource to another resource. Ideally, you use Reference.reference which is either an absolute or relative reference to the target resource. Relative reference means you specify only the resource type and 'id'. This can only be used when the target resource is on the same server as the referencing resource. With a complete reference, you specify the server URL as well, which means the reference can point to an instance on a different server. The Reference always contains the id. It can, however, be version-specific, meaning that following the 'id' is '/_history/' plus the server-specific instance version. The Reference type also allows an 'identifier' element. That can be used to convey a business identifier for the target record and is typically used when the desired resource can't be referred to by a RESTful URL. Unlike Reference.reference, there's no expectation that Reference.identifier will be resolved by servers and there's no ability to do chaining or includes across references that use them. Finally, Reference includes a display element. If you think of Reference as being like an HTML anchor tag, the 'display' is the display text for the link while the Reference.reference is the URL. However, display can also be used on its own, for example if you just have the descriptive name for the target and nothing else.
fullUrl is used in Bundles to communicate the full URL for a resource because, in the context of a Bundle, there's no other way to know it. (With a single resource RESTful call, the full URL is conveyed either in the target URL for the call or in the location header for a response). As well, in certain Bundles like messages and documents, the fullUrl can be a urn:uuid which is used in situations where the resource in question doesn't have existence on a RESTful server. The fullUrl will always end with the id. FullUrls are never version-specific - which means in some Bundles they won't be unique.
Identifier is used to convey business identifiers. They include a 'system' which defines the server-independent namespace for the identifier and a 'value' which is the portion of the identifier typically exposed to a system. These can include a server's resource id, but generally shouldn't - for the same reason you don't generally want the primary key for a Patient in a database to be the patient's SSN or MRN. Identifiers are useful for matching on equivalent entities across servers. Different servers might have distinct Patient instances for the same patient, but they would typically share identifier values such as social security number, health card number, etc.
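
A minimal sketch of how the four notions can sit side by side, assuming resources represented as plain Python dicts. The base URL, the MRN system URI, and all ids and values are made up for illustration; `full_url` is a hypothetical helper, not a FHIR library call.

```python
# Hypothetical server base; not a real endpoint.
BASE = "https://example.org/fhir"

patient = {
    "resourceType": "Patient",
    "id": "123",  # server-specific logical id, comparable to a primary key
    "identifier": [  # business identifier, meaningful across servers
        {"system": "https://example.org/mrn", "value": "MRN-0042"}
    ],
}

observation = {
    "resourceType": "Observation",
    "id": "987",
    "status": "final",
    "code": {"text": "Heart rate"},
    # Reference.reference in relative form: [type]/[id], same server assumed
    "subject": {"reference": "Patient/123", "display": "Example Patient"},
}


def full_url(base, resource):
    """Build the non-versioned RESTful URL [base]/[type]/[id]."""
    return f"{base}/{resource['resourceType']}/{resource['id']}"


# Inside a Bundle, fullUrl carries the URL that a single RESTful call
# would otherwise convey through the request URL or Location header.
bundle = {
    "resourceType": "Bundle",
    "type": "collection",
    "entry": [
        {"fullUrl": full_url(BASE, r), "resource": r}
        for r in (patient, observation)
    ],
}

for entry in bundle["entry"]:
    print(entry["fullUrl"])
```

Here the id "123" appears in the id element, at the end of the relative reference, and at the end of the fullUrl, while the MRN lives only in Patient.identifier.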

John Moehrke (Aug 06 2020 at 15:38):

Yup I understand that... and that is all carefully crafted text... The thing I am looking for is the reality in the overlap. And I am looking for something that would be understood by someone who is not yet a FHIR expert.

Lloyd McKenzie (Aug 06 2020 at 15:56):

I was hoping the above would work for a non-FHIR expert. And I tried to address the overlap?

John Moehrke (Aug 06 2020 at 15:57):

you did an excellent geek job. I think I need a @René Spronk to come up with the text for my audience need

René Spronk (Aug 08 2020 at 08:36):

The first thing to understand is the difference between an identifier (which identifies a thing) and an id (where that thing can be found).

Imagine a luggage storage room in a grand hotel - you drop off your luggage with the concierge, and they provide you with a 'luggage claim ticket' that mostly has some unique number on it. Presumably the hotel staff will use that number to store luggage in a particular way, in numeric order perhaps, but that's something we as hotel guests neither care nor know about. The only thing we care about is the ability to retrieve our luggage using the claim ticket number - perhaps to only take some clean socks out of our suitcase and then to have it stored yet again. Again, how they deal with the storage and retrieval is not of interest to us, they could (for all we care) use a subatomic vaporiser on our luggage followed by a reverse-blockchain osmosis process upon retrieval. The 'claim ticket number' is what is known in FHIR as the 'id', it aids a party that stores stuff to retrieve a specific item.

Now what if you were storing a passport instead of just a suitcase? The passport itself has a unique identifier, the passport number. It gets stored just like a suitcase, and you'll receive a luggage ticket. In FHIR-speak: the luggage ticket (the 'where') is the id, the passport number is the identifier (the 'what'). Many objects have an inherent property that acts as their unique identifier, e.g. product serial numbers, national person identifiers, bar codes on test tubes, credit card numbers, insurance policy numbers.

Now what if the grand hotel has multiple luggage rooms? Let's say we have one on the ground floor that hands out blue tickets, and a VIP one on the top floor which uses red tickets. If one works in the blue luggage room, it's safe to assume (as long as the ticket presented is blue) that the ticket number (let's say 15252) is sufficient to locate a specific piece of luggage. But from a hotel perspective, using a URL style string for identification, we now have /blue/15252 or /red/15252. This URL is relative to the hotel we're in. If they're part of a chain of hotels, we (depending on the context) may need to specify a full URL, e.g. http://theGrand.Amsterdam.com/blue/15252 - all of these are still the 'id' of a piece of luggage, but depending on the context of use one may choose not to explicitly mention certain parts. For example: most people don't write down the name of the country of the addressee of a letter (although one could), because the 'default' is the country of the sender.

Id versus identifier is a surprisingly hard thing to understand for an audience not familiar with REST interfaces. Full URLs in bundles are an edge case, which is best not explained to a newbie audience.

John Moehrke (Aug 12 2020 at 14:05):

right away... "id (where that thing can be found)" is not true, and everything you say after supports that this statement was not true. The id value in FHIR is just the trailing unique identifier, it explicitly does NOT include WHERE it can be found. You must know that implicitly. The id is just the unique number within some where.
But as you mention the implied "WHERE" is important.

John Moehrke (Aug 12 2020 at 14:06):

What I am looking at is when I have an id, and I put it into an Identifier, I know that the id value goes into the Identifier.value; if I know the WHERE I would put that into the Identifier.system... right?
Thus a search/query using a token would equally process an id in this way. the id would go into the token code; and the WHERE (if known) would go into the token system. right?

John Moehrke (Aug 12 2020 at 14:07):

And this becomes problematic when one tries to carry resources that were originally created in different WHERE(s).

René Spronk (Aug 16 2020 at 06:18):

To me, id is about WHERE, as the remainder of the WHERE context is implied by the resource type, and the base_url. If you talk about an id, the others have to be known implicitly in any given context. In my above example: those that manage the luggage room will have their own way of mapping id to a shelf/corner or whatever other locations they have. It's still 'location', albeit that we as clients don't care about this, but the internals of servers do care about such locations.
So what you call WHERE is any implicit location information external to the server storage layer, whereas my WHERE includes that layer as well (and thus id being about WHERE something is).

Charlie McCay (Oct 01 2020 at 15:55):

Lloyd McKenzie said:

Hi Lloyd - It's been a while... I have a query on the use of UUIDs for identifiers for resources and how to reference them. It seems that I have to do the references differently depending upon whether the resources are posted in a bundle or separately.

In both cases the sending system is specifying the resource.id as a uuid.

I can send a bundle with the following identifiers in the fullurl, id and reference.value attributes
fullUrl.value = urn:uuid:1f0f66e6-f3b9-4092-a493-d635dbf4dacf
id.value = 1f0f66e6-f3b9-4092-a493-d635dbf4dacf
Sender.reference.value = urn:uuid:1f0f66e6-f3b9-4092-a493-d635dbf4dacf

If I post the same resources separately, there is no fullUrl attribute, and the reference.value and id.value have to match.
The following works:
id.value = 1f0f66e6-f3b9-4092-a493-d635dbf4dacf
Sender.reference.value = 1f0f66e6-f3b9-4092-a493-d635dbf4dacf
but this does not:
id.value = 1f0f66e6-f3b9-4092-a493-d635dbf4dacf
Sender.reference.value = urn:uuid:1f0f66e6-f3b9-4092-a493-d635dbf4dacf

Am I missing something here?
many thanks - Charlie

Lloyd McKenzie (Oct 01 2020 at 16:25):

@Charlie McCay It's been a while indeed! Good to see you here. If you're not referencing within a Bundle, then specifying a UUID as urn:uuid:.... in a reference is not legal. References have to be able to be resolved. Inside a Bundle, you can resolve a urn:uuid so long as it's defined within the Bundle. However, if you're doing a straight RESTful POST/PUT, the only type of URI that can be resolved is either a relative or full URL. If you specify Sender.reference.value = 1f06...., what that means is that the target of the reference is https://yourserver..../SomeResource/1f06...

Charlie McCay (Oct 01 2020 at 16:57):

Hi @Lloyd McKenzie many thanks for the swift reply ... I sort of see the rationale, though as a naive fhir user this was surprisingly confusing - I expected a common syntax for referencing resources that are identified by uuid whether in the bundle or server.... if the urn:uuid uri can be resolved in a bundle it seems odd that it is not also resolved by looking in https://yourserver..../SomeResource/ .... or alternatively, if a bare uuid is treated as acceptable for referencing a resource on the server, it seems odd that it is not acceptable for referencing one within a bundle (ie why is this not acceptable as a relative url in the bundle looking across the union of the bundle and the server at https://yourserver..../SomeResource/ . )

Lloyd McKenzie (Oct 01 2020 at 17:44):

Reference.reference is expected to be computably resolvable. It can contain either a full URL - which specifies the specific server location of the resource following FHIR's URL conventions [base]/[resource type]/[id], or specifies a relative URI - which has the format [resourcetype]/[id] when the base is the same as that of the referencing resource. Using UUID URIs can't be resolved this way. They can only be resolved in the context of a Bundle - and they're intended to be used within messages, documents and transactions where the referenced resource doesn't have an independent RESTful identity.

Also, in your example above, Sender.reference.value = 1f0f6... is not actually legal. Reference.value must either be an absolute URI or a relative reference - and relative references must include the resource type - because resource ids aren't expected to be unique across type within the same server.
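
The distinction between these reference forms can be sketched as a small classifier. This is a rough illustration only: the regular expressions are simplified assumptions (the specification's own pattern enumerates the resource type names), and `classify_reference` and its labels are hypothetical.

```python
import re

# FHIR ids: letters, digits, '-' and '.', 1 to 64 characters.
ID = r"[A-Za-z0-9\-.]{1,64}"
# Simplified patterns; any capitalised word is accepted as a resource type.
RELATIVE = re.compile(rf"^[A-Z][A-Za-z]+/{ID}(/_history/{ID})?$")
ABSOLUTE = re.compile(rf"^https?://.+/[A-Z][A-Za-z]+/{ID}(/_history/{ID})?$")
URN_UUID = re.compile(
    r"^urn:uuid:[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$"
)


def classify_reference(value):
    """Label a Reference.reference string by the form it takes."""
    if URN_UUID.match(value):
        return "urn-uuid"  # only resolvable against fullUrls inside a Bundle
    if ABSOLUTE.match(value):
        return "absolute"  # [base]/[type]/[id]
    if RELATIVE.match(value):
        return "relative"  # [type]/[id], same base as the referencing resource
    return "not-resolvable"  # e.g. a bare id without a resource type


for candidate in (
    "urn:uuid:1f0f66e6-f3b9-4092-a493-d635dbf4dacf",
    "https://example.org/fhir/Patient/123",
    "Observation/5e761190-2f9c-4cd3-b69d-6a77f8a735d3",
    "1f0f66e6-f3b9-4092-a493-d635dbf4dacf",
):
    print(candidate, "->", classify_reference(candidate))
```

The bare UUID in the last case falls through every pattern, which matches the point that a relative reference has to include the resource type.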

Kevin Mayfield (Oct 02 2020 at 08:14):

@Charlie McCay [This is a non official reply] Just so you are aware, Manchester LHCRE have their own pathology spec: https://simplifier.net/guide/GreaterManchesterLHCRE/UnsolicitedObservations
I'll see if I can put up a pathology example.

This regional IG is not that different from England/national, however it is being driven bottom up and by https://build.fhir.org/ig/HL7/v2-to-fhir/branches/master/ConceptMap-message-oru-r01-to-bundle.html

Charlie McCay (Oct 02 2020 at 08:55):

Hi @Lloyd McKenzie many thanks -- that really does make sense - and implies that reference.reference should only be used to point to other resources from within a resource that is unlikely to move from one context to another, and is not appropriate in a loosely coupled environment where the RESTful identity is less robust than an identity that will work across more than one FHIR server. I was led astray by the description of reference.identifier in https://www.hl7.org/fhir/references.html#Reference as "Logical reference, when literal reference is not known", which would be better saying "Logical reference, when literal reference is not known or may not be stable across the anticipated lifetime of the resource that contains the reference", which would helpfully indicate a situation where the literal reference is not appropriate.

Charlie McCay (Oct 02 2020 at 09:48):

Hi @Kevin Mayfield many thanks for the awareness raising - I look forwards to more non official (and maybe even official) open discussion on this - will be exciting if we can have bottom, top, and sides seeing what each other are doing and helping each other - including looking at how we work with specifications that are (not that) different across pathology (for example) - fun times ahead :)

Lloyd McKenzie (Oct 02 2020 at 13:05):

FHIR doesn't expect strict referential integrity. It also doesn't generally expect content to 'move'. Think of it like a web page pointing to other web pages. As a rule, you're much better using Reference.reference than Reference.identifier. References that use .identifier aren't expected to be computably resolvable. You can't do chained searches through them, you can't use them for includes, etc. If your RESTful 'id' is not stable, you're going to find that your RESTful system just doesn't work terribly well...

Michele Mottini (Oct 02 2020 at 13:32):

Note also that when POSTing a single resource to a server you are not necessarily able to assign its id; most servers will assign their own id (that is not necessarily a UUID, it can be a sequential number for example), and then you have to use the resulting URL of that new resource (as returned in the Location header) to reference it from other resources POSTed afterwards.
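
A minimal sketch of that last step, assuming the Location header follows the [base]/[type]/[id]/_history/[vid] shape. The header value is made up, and `reference_from_location` is a hypothetical helper, not part of any FHIR client library.

```python
from urllib.parse import urlsplit


def reference_from_location(location):
    """Turn a Location header value into a relative [type]/[id] reference."""
    segments = [s for s in urlsplit(location).path.split("/") if s]
    # Drop the version-specific tail, if the server included one.
    if "_history" in segments:
        segments = segments[: segments.index("_history")]
    if len(segments) < 2:
        raise ValueError(f"cannot find [type]/[id] in {location!r}")
    resource_type, resource_id = segments[-2], segments[-1]
    return f"{resource_type}/{resource_id}"


# Example Location header a server might return after a Patient create.
location = "https://example.org/fhir/Patient/4711/_history/1"

observation = {
    "resourceType": "Observation",
    "status": "final",
    "code": {"text": "Heart rate"},
    # Use the server-assigned id, not an id chosen by the client.
    "subject": {"reference": reference_from_location(location)},
}
print(observation["subject"]["reference"])
```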

Kevin Mayfield (Oct 02 2020 at 13:32):

Wouldn't that depend on interaction pattern? I would expect to see high use of reference.identifier in messaging scenarios but in REST, yes, do not use reference.identifier.

Kevin Mayfield (Oct 02 2020 at 13:35):

In the scenario @Charlie McCay is working with, the message is very similar to HL7v2.x ORU_R01 and as many hospitals will be converting from V2 they will be able to support reference.identifier (especially for national identifiers). They are unlikely to be able to include either external or internal bundle references.

Charlie McCay (Oct 02 2020 at 14:43):

It would be nice to have a pattern that works for distributed, loosely coupled RESTful environments where there is messaging between FHIR servers. This would require that the data-source can determine the identifier that is to be used (so would have to be globally unique such as a UUID or OID) so that a collection of resources that contain references to each other can be accessed RESTfully and shared in bundles between servers. It would seem that this pattern requires the use of resource.identifier

Lloyd McKenzie (Oct 02 2020 at 15:22):

The expectation is that if you're using REST, you drive by Resource.id - and more specifically resource URL. That's how REST works. If you're not interested in REST, then yes, you'd be managing linkages via identifiers.

Charlie McCay (Oct 02 2020 at 16:02):

Hi lloyd ... at risk of labouring a point - it would be much easier to have a mixed economy of RESTful FHIR servers and messaging between them if:
[1] resource.id could be determined by the provider of the information (as a globally unique id eg uuid or oid) with HTTP PUT
[2] there was a consistent way of using a relative path to reference resource.id across bundles and FHIR servers
As it is what I am hearing is that this is not seen to be in scope as a core FHIR implementation pattern, and that if you want to use REST with FHIR it is designed to be used with a single (logical) FHIR server. There is nothing in the REST religion that I am aware of (as a novice) that says that PUT cannot be used to create resources with a known identifier - indeed since it is idempotent it seems marginally more robust (see https://restfulapi.net/rest-put-vs-post/) .
It therefore seems to me that this is a FHIR design choice that has been made for other reasons. The usecase that I am thinking of initially is pathology information that may be stored in a regional shared record FHIR server and/or a locally hosted FHIR server within a hospital.

Lloyd McKenzie (Oct 02 2020 at 16:48):

It can. FHIR supports "upserts" using PUT. It's defined here: https://www.hl7.org/fhir/http.html#upsert. Servers can choose whether to support that or not. (Allowing external entities to define your record id isn't necessarily possible in some architectures.)
There's no way to reference information across Bundles other than URL. Given an arbitrary UUID, there's no way for any system to have a clue what server that record lives on. If you want to reference something, you need to know where it is.
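
A minimal sketch of what such an upsert request could look like, assuming a server that allows client-assigned ids. The base URL is hypothetical, `build_upsert` is an illustrative helper, and the request is only constructed here, never sent.

```python
import json
import uuid
from urllib.request import Request

# Hypothetical server base; not a real endpoint.
BASE = "https://example.org/fhir"


def build_upsert(base, resource):
    """Build a PUT to [base]/[type]/[id] carrying a client-chosen id."""
    url = f"{base}/{resource['resourceType']}/{resource['id']}"
    body = json.dumps(resource).encode("utf-8")
    return Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/fhir+json"},
    )


# The sender picks the id; a UUID string fits the id format rules.
patient = {"resourceType": "Patient", "id": str(uuid.uuid4())}
request = build_upsert(BASE, patient)
print(request.get_method(), request.full_url)
```

Whether the server accepts the PUT for an id it has not seen before is its own choice, as noted above.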

Vassil Peytchev (Oct 02 2020 at 17:51):

Hi Charlie, a "mixed economy" of RESTful FHIR servers and data exchange between them still has to support a RESTful ecosystem. Note that I changed "messaging" to "data exchange" - messaging is only one way to enable data exchange between servers. Using light-weight notifications is another way to enable data exchange among servers, which doesn't suffer from potentially creating incompatible modes of exchange.

In other words, if you must use FHIR messaging, you need to have the resources in the message be accessible via the RESTful API (in addition to being in the message). Otherwise, I don't think you can get a working "mixed economy".

Grahame Grieve (Oct 02 2020 at 22:10):

@Charlie McCay hi my old friend. it's been a while....

Grahame Grieve (Oct 02 2020 at 22:11):

I don't understand this thread. both these things are true:

[1] resource.id could be determined by the provider of the information (as a globally unique id eg uuid or oid) with HTTP PUT
[2] there was a consistent way of using a relative path to reference resource.id across bundles and FHIR servers

Use an id with a relative path. Perhaps I have missed something in the nuance here...

Kevin Mayfield (Oct 05 2020 at 07:22):

We do have a mixed economy but I think it is around interaction requirements:

Provider to Provider (server to server)
App to System/EPR (client to server)
Device to System (client to server)

First one is going to use identifiers and will often use HL7 FHIR(/v2) messaging.
Second one: ids, and will most often be (FHIR) RESTful.
Last one: probably a mix of the two, probably messaging, and may not be FHIR. If the system is a mobile phone, it probably ends up as FHIR Messaging (to an Organisation).

Charlie McCay (Oct 15 2020 at 09:36):

Grahame Grieve said:

I don't understand this thread. both these things are true:

[1] resource.id could be determined by the provider of the information (as a globally unique id eg uuid or oid) with HTTP PUT
[2] there was a consistent way of using a relative path to reference resource.id across bundles and FHIR servers

Use an id with a relative path. Perhaps I have missed something in the nuance here...

Hi @Grahame Grieve ... it has indeed been a while - and nice to be back... it seems to me that relative paths do not work in a bundle, but do work in a fhir server - it seems that within a bundle the reference has to point to a fullUrl, and that cannot be a relative path .... for now I have conditional logic to create the reference as either Resource/nnnnn or urn:uuid:nnnnn depending upon whether the resource that contains the reference is being PUT directly to a fhir server, or is being bundled up in a bundle. It works but is ugly :(

Grahame Grieve (Oct 15 2020 at 09:37):

I don't understand that bit. There's no reason to make the reference any different

Grahame Grieve (Oct 15 2020 at 09:38):

unless the resource has no id

Charlie McCay (Oct 15 2020 at 10:01):

Does that mean that <reference value="Observation/5e761190-2f9c-4cd3-b69d-6a77f8a735d3"/> should be OK if the observation is a sibling entry in a bundle with that uuid as its observation.id.value??

Grahame Grieve (Oct 15 2020 at 10:31):

yes

Charlie McCay (Oct 15 2020 at 10:46):

glad to hear that should work - I have been using a hapi server to validate and it didn't seem to like that

Grahame Grieve (Oct 15 2020 at 10:47):

oh?

Charlie McCay (Oct 15 2020 at 10:50):

seems to want the reference to point to a fullurl, and that cannot be a relative path.... if I remove the fullurl it fails to validate ... will keep playing and get a couple of examples together and test with some other tools... many thanks for swift responses

Grahame Grieve (Oct 15 2020 at 11:20):

the most authoritative validator is the java validator

ryan moehrke (Oct 15 2020 at 15:27):

I was under the impression that a relative path in say a transaction POST implied the relative path on the receiving server and you had to explicitly state you were referencing in-bundle with the full uuid prefix + id. If that's not the case and you reference a resource in the same bundle how is the server supposed to know you want the reference to resolve in-bundle vs in their database? are all servers just expected to try and resolve every reference in the bundle first and just default to in DB if that fails? or is there some other nuance I'm missing?

Josh Mandel (Oct 15 2020 at 16:32):

If you're handling a bundle, you're supposed to follow the rules at http://build.fhir.org/bundle.html#references which include looking inside the bundle first, and looking on a server if that fails.
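
A simplified reading of that lookup order can be sketched as follows. This is an assumption-laden illustration rather than the normative algorithm: it ignores version-specific references, the regular expressions are rough, and `resolve` and its return labels are hypothetical.

```python
import re

ID = r"[A-Za-z0-9\-.]{1,64}"
# A RESTful fullUrl: [root]/[type]/[id]; 'root' is captured for reuse.
RESTFUL = re.compile(rf"^(?P<root>https?://.+)/[A-Z][A-Za-z]+/{ID}$")
RELATIVE = re.compile(rf"^[A-Z][A-Za-z]+/{ID}$")


def resolve(bundle, source_entry, reference):
    """Return ("bundle", resource), ("server", url) or ("unresolved", None)."""
    target = reference
    if RELATIVE.match(reference):
        # Relative references borrow the root of the containing entry's fullUrl.
        restful = RESTFUL.match(source_entry.get("fullUrl", ""))
        if restful is None:
            return ("unresolved", None)
        target = f"{restful.group('root')}/{reference}"
    # First look inside the Bundle for an entry with a matching fullUrl.
    for entry in bundle.get("entry", []):
        if entry.get("fullUrl") == target:
            return ("bundle", entry["resource"])
    # Otherwise fall back to the server, if the target is a real URL.
    if target.startswith(("http://", "https://")):
        return ("server", target)
    return ("unresolved", None)


patient_entry = {
    "fullUrl": "https://example.org/fhir/Patient/p1",
    "resource": {"resourceType": "Patient", "id": "p1"},
}
observation_entry = {
    "fullUrl": "https://example.org/fhir/Observation/o1",
    "resource": {
        "resourceType": "Observation",
        "id": "o1",
        "subject": {"reference": "Patient/p1"},
    },
}
bundle = {
    "resourceType": "Bundle",
    "type": "collection",
    "entry": [patient_entry, observation_entry],
}

print(resolve(bundle, observation_entry, "Patient/p1")[0])
print(resolve(bundle, observation_entry, "Patient/p2"))
```

Under this reading, a relative reference such as Observation/5e76... works in a Bundle only when the containing entry's fullUrl is itself a RESTful URL; with urn:uuid fullUrls there is no root to borrow.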

Josh Mandel (Oct 15 2020 at 16:33):

(That said: it looks like the nesting on these bullets is a bit wonky. I'll plan to review tomorrow.)

Michele Mottini (Oct 15 2020 at 16:35):

@Charlie McCay the link Josh just posted explains how you convert the relative reference to an absolute one to match fullUrls that are absolute URLs

Paul Church (Oct 15 2020 at 16:35):

Those nested if/else blocks are definitely hard to interpret. Would benefit from some editing.

Charlie McCay (Oct 15 2020 at 17:15):

many thanks all... I will continue playing with this...

Josh Mandel (Oct 15 2020 at 17:16):

Agreed. I'm going to block off 1-2p CT tomorrow to work on reviewing "2.37.5.1 Resolving references in Bundles". To continue my weekly recording experiment, I'll plan to do it in a live screencast where I try to review the intended meaning, make tweaks, and submit a Jira issue.

https://youtu.be/ZK0AKB5PqGM

Charlie McCay (Oct 15 2020 at 18:49):

Thanks Josh - I will not be able to join you live tomorrow, but look forwards to the recording - my challenge is to support "send it all bundle", "skinny bundle with references" or "direct push of resources into FHIR server without a bundle" without changing what is inside the resources (ie without having to change the references).
It seems that the FHIR specification supports this with fullUrl in bundles by requiring a common "base URL" for the resource being referenced and the one that contains the reference. This begs the question "what should that base URL be?" to which there is not an elegant answer when the system that generates the ID and the FHIR resources is not the one that will be a persistent store. Any choice of base url is inelegant because it will confuse casual implementers who may expect it to be stable, persistent and usable - which it is not. Alternatively the sending system needs to know what the final resting place for the resource will be, which is a significant deployment overhead.

Lloyd McKenzie (Oct 15 2020 at 19:10):

Direct push means that the resources must be available at a RESTful endpoint. You can use that identification approach consistently everywhere, but it means you can't use uuids - you have to use proper URLs. If you can't use proper URLs, there's no way to point to other resources except by 'logical' identifiers - i.e. Reference.identifier. And if you use those, none of the RESTful resolution mechanisms (_include, _revinclude, query chaining, etc.) will function.

Lloyd McKenzie (Oct 15 2020 at 19:11):

In short, if you don't have persistent identity on a server, you can't use REST

Lloyd McKenzie (Oct 15 2020 at 19:17):

(that's not completely true - there are some limited things you can do, but posting resources with links to other resources isn't one of them.)

Vassil Peytchev (Oct 15 2020 at 21:44):

my challenge is to support "send it all bundle", "skinny bundle with references" or "direct push of resources into FHIR server without a bundle" without changing what is inside the resources (ie without having to change the references).

These are such different things, that I would suspect someone has given you a Sisyphean task without realizing it. This can only work in very specific circumstances.

I would suggest to start with where the data is supposed to live. If, for example, one of your use cases is that the data lives at the initiator, and they need to share it with the recipient, then you can look at the following:

The data is available via the FHIR RESTful API
- "send it all bundle" contains the full URLs from the initiator's server, and all references look like when they are accessed via the RESTful API from the initiator's server. There is no expectation that the recipient will create any of the resources (as they belong to the initiator).
- "skinny bundle with references" - similarly, references point to the initiator's server
- "direct push of resources" - here is a catch: since the assumption is that the data belongs to the initiator, you can only really send a notification (e.g. R5 Subscription Notification or the R4 equivalent) for the recipient to GET the resources. Doing a POST or PUT to the recipient's server creates a copy of the resource, with a different ownership, and its own lifecycle, and the references may or may not still point back to the initiator's server.
The data is not available via the RESTful API
- "send it all bundle" has to use (at least in some cases) UUIDs as full URLs, and references are based on them. Again, there is no expectation that the recipient will create any of the resources (as they belong to the initiator).
- "skinny bundle with references" - This could fit a case where some resources are available via the RESTful API, and others are not. External references point to the initiator's server, while internal references (presumably not accessible via the RESTful API) need to use UUIDs
- "direct push of resources" - This doesn't seem to work with the assumption that all data belongs to the initiator, and yet it is not available via the RESTful API.

Second use case: The data lives at the recipient (including data that is OK to be duplicated, and exists as different resources at the initiator and recipient):

The data is/will be available via the RESTful API (on the recipient server)
- "send it all bundle" - if the bundle contains only changes to data that already exists on the recipient server, the full URLs and reference can reflect that, and will look the same if you do the updates via direct PUT of the resources. If, however, there are new resources to be created, there is no way to dictate to the server how the id component of the URL will look like, and the full URLs of these entries will have to use UUIDs. The bundles in this case are most likely to be of type transaction or batch since the assumption is that the resources live on the recipient server. A message bundle makes less sense here, because a message does not imply that any resources would be created or updated on the recipient server.
- "skinny bundle with references" - In this case, the references will be to existing resources on the recipient's server, so they will look the same as when you do any updates via the RESTful API. New resources will have to be in the bundle, and their full URL will need to use UUIDs. For this case the use of message bundles may make some sense.
- "direct push of resources" - These are the natural C[R]UD operations for this use case.
The data is not available via the RESTful API (on the recipient server) - this is an edge case that can't really be broken down in different parts, as the assumption that the data belongs to the recipient leads to pretty much only having a message bundle with UUIDs for full URLs
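
The "new resources in a transaction" case above can be sketched like this, assuming plain dicts for the resources. The helper `new_entry` is hypothetical and the clinical content is made up; the point is only that new entries carry urn:uuid fullUrls and refer to each other through them.

```python
import uuid


def new_entry(resource):
    """Wrap a not-yet-stored resource in a transaction entry."""
    return {
        # No server id exists yet, so the entry is identified by a UUID URN.
        "fullUrl": f"urn:uuid:{uuid.uuid4()}",
        "resource": resource,
        "request": {"method": "POST", "url": resource["resourceType"]},
    }


patient_entry = new_entry(
    {"resourceType": "Patient", "name": [{"family": "Example"}]}
)
observation_entry = new_entry(
    {
        "resourceType": "Observation",
        "status": "final",
        "code": {"text": "Heart rate"},
        # Points at the sibling entry by its fullUrl, not by [type]/[id].
        "subject": {"reference": patient_entry["fullUrl"]},
    }
)

transaction = {
    "resourceType": "Bundle",
    "type": "transaction",
    "entry": [patient_entry, observation_entry],
}
print(observation_entry["resource"]["subject"]["reference"])
```

When the recipient processes such a transaction, it assigns its own ids and is expected to rewrite the urn:uuid references accordingly, so the stored resources end up with ordinary RESTful references.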

Third use case: Some of the resources belong to the initiator, and some belong to the recipient. These are usually cases that are parts of a workflow. An example is a ServiceRequest, which belongs to the initiator, with the corresponding links to patient and requester, a Task, which is POSTed to the recipient, with links to the ServiceRequest at the initiator, and to the task owner at the recipient. In cases like this if you don't have the resources exposed via the RESTful API on both sides, you can't come even close to having the references be the same for the three conditions from the top.

In my view, encouraging data owners to expose the information via the RESTful API is the necessary first step towards a common FHIR-based exchange. Rushing to replace existing messaging integrations with FHIR bundles being tossed around will not necessarily bring you closer to a better situation, as pointed out :

Any choice of base url is inelegant because it will confuse casual implementers who may expect it to be stable, persistent and usable - which it is not. Alternatively the sending system needs to know what the final resting place for the resource will be, which is a significant deployment overhead.

Charlie McCay (Oct 16 2020 at 08:56):

Hi @Vassil Peytchev ... delightful to meet again ... many thanks for that very thorough and useful analysis.... I see that designing to give every resource a home makes life easier, and that is a good thing. I can see that the FHIR resource.id is an identifier scoped by the base url, and that applies even if the identifier happens to be from a scheme that ensures global uniqueness.
I should have done this earlier - but I have just looked at the FHIR DICOM resource. I had expected the DICOM SUID to be used as the ImagingStudy.id.value as it is the primary and unique identifier for the information object. But I see that this is included in the example as ImagingStudy.identifier http://www.hl7.org/fhir/imagingstudy-example.xml.html. This is an example of an information object that is assigned an identifier that should be used in references to it, but the information object does not have a single obvious home.

John Moehrke (Oct 16 2020 at 14:08):

in ImagingStudy we didn't want to assume that the ImagingStudy FHIR Resource was exactly identical to the DICOM suid. Hence why the split as you indicate. If it is identical, that is a systems design fact that can result in it being duplicated in both id and identifier --- Right @Elliot Silver ?

Vassil Peytchev (Oct 16 2020 at 14:32):

Just to be clear, the fact that two resources are at [Base1]/[Type]/123 and [Base2]/[Type]/123 respectively has no bearing on whether these two resources have anything at all in common, much less that they might be two copies of the same resource.
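A small sketch of that point: a relative reference only becomes a complete address once it is combined with the base URL of the server it came from. The base URLs and the function name below are hypothetical, and the real resolution rules inside Bundles are more involved than this.

```python
def absolute_url(base, reference):
    """Resolve a relative reference against the base URL of the server it was read from."""
    if reference.startswith(("http://", "https://", "urn:")):
        return reference  # already absolute, leave untouched
    return base.rstrip("/") + "/" + reference


# Hypothetical servers that both happen to host a Patient with id 123.
BASE1 = "https://hospital-a.example.org/fhir"
BASE2 = "https://hospital-b.example.org/fhir"

a = absolute_url(BASE1, "Patient/123")
b = absolute_url(BASE2, "Patient/123")
print(a == b)  # False: same type and id, different resources
```

Comparing only the "Patient/123" part would wrongly treat the two as the same record.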

Elliot Silver (Oct 16 2020 at 22:31):

John Moehrke said:

in ImagingStudy we didn't want to assume that the ImagingStudy FHIR Resource was exactly identical to the DICOM suid. Hence why the split as you indicate.

Correct. For example, since some of a study could be stored on one PACS and other parts of a study on another PACS, we allow multiple ImagingStudy resources each reflecting one source's view of the study. Thus study UID needs to be independent of resource id.

We also considered that a FHIR server may have its own mechanism for generating resource id such as GUID or timestamp, and couldn't assume that it would support using study uid for ImagingStudy resources.
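To illustrate the split, a sketch of two ImagingStudy resources that describe the same study from two sources. The ids and the UID value are made up, and the same_study helper is an assumption rather than anything defined by FHIR; the urn:dicom:uid system with a urn:oid: value follows the pattern in the ImagingStudy example.

```python
# Made-up Study Instance UID, carried as a business identifier.
STUDY_UID = {"system": "urn:dicom:uid", "value": "urn:oid:1.2.3.4.5"}

# Each server assigns its own id; the study UID travels in identifier.
study_on_pacs_a = {
    "resourceType": "ImagingStudy",
    "id": "a1b2c3",
    "identifier": [STUDY_UID],
}
study_on_pacs_b = {
    "resourceType": "ImagingStudy",
    "id": "20201016-0042",
    "identifier": [STUDY_UID],
}


def same_study(x, y):
    """True when two resources share at least one (system, value) identifier pair."""
    def keys(resource):
        return {(i.get("system"), i.get("value")) for i in resource.get("identifier", [])}

    return bool(keys(x) & keys(y))


print(same_study(study_on_pacs_a, study_on_pacs_b))  # True, although the ids differ
```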

Derek Ritz (Oct 17 2020 at 21:46):

Vassil Peytchev said:

my challenge is to support "send it all bundle", "skinny bundle with references" or "direct push of resources into FHIR server without a bundle" without changing what is inside the resources (ie without having to change the references).

These are such different things, that I would suspect someone has given you a Sisyphean task without realizing it. This can only work in very specific circumstances.

I would suggest to start with where the data is supposed to live. If, for example, one of your use cases is that the data lives at the initiator, and they need to share it with the recipient, then you can look at the following:

The data is available via the FHIR RESTful API

"send it all bundle" contains the full URLs from the initiator's server, and all references look like when they are accessed via the RESTful API from the initiator's server. There is no expectation that the recipient will create any of the resources (as they belong to the initiator).

"skinny bundle with references" - similarly, references point to the initiator's server

"direct push of resources" - here is a catch: since the assumption is that the data belongs to the initiator, you can only really send a notification (e.g. R5 Subscription Notification or the R4 equivalent) for the recipient to GET the resources. Doing a POST or PUT to the recipient's server creates a copy of the resource, with a different ownership, and its own lifecycle, and the references may or may not still point back to the initiator's server.

The data is not available via the RESTful API

"send it all bundle" has to use (at least in some cases) UUIDs as full URLs, and references are based on them. Again, there is no expectation that the recipient will create any of the resources (as they belong to the initiator).

"skinny bundle with references" - This could fit a case where some resources are available via the RESTful API, and others are not. External references point to the initiator's server, while internal references (presumably not accessible via the RESTful API) need to use UUIDs

"direct push of resources" - This doesn't seem to work with the assumption that all data belongs to the initiator, and yet it is not available via the RESTful API.

Second use case: The data lives at the recipient (including data that is OK to be duplicated, and exists as different resources at the initiator and recipient):

The data is/will be available via the RESTful API (on the recipient server)

"send it all bundle" - if the bundle contains only changes to data that already exists on the recipient server, the full URLs and reference can reflect that, and will look the same if you do the updates via direct PUT of the resources. If, however, there are new resources to be created, there is no way to dictate to the server how the id component of the URL will look like, and the full URLs of these entries will have to use UUIDs. The bundles in this case are most likely to be of type transaction or batch since the assumption is that the resources live on the recipient server. A message bundle makes less sense here, because a message does not imply that any resources would be created or updated on the recipient server.

"skinny bundle with references" - In this case, the references will be to existing resources on the recipient's server, so they will look the same as when you do any updates via the RESTful API. New resources will have to be in the bundle, and their full URL will need to use UUIDs. For this case the use of message bundles may make some sense.

"direct push of resources" - These are the natural C[R]UD operations for this use case.

The data is not available via the RESTful API (on the recipient server) - this is an edge case that can't really be broken down in different parts, as the assumption that the data belongs to the recipient leads to pretty much only having a message bundle with UUIDs for full URLs

Third use case: Some of the resources belong to the initiator, and some belong to the recipient. These are usually cases that are parts of a workflow. An example is a ServiceRequest, which belongs to the initiator, with the corresponding links to patient and requester, a Task, which is POSTed to the recipient, with links to the ServiceRequest at the initiator, and to the task owner at the recipient. In cases like this if you don't have the resources exposed via the RESTful API on both sides, you can't come even close to having the references be the same for the three conditions from the top.

In my view, encouraging data owners to expose the information via the RESTful API is the necessary first step towards a common FHIR-based exchange. Rushing to replace existing messaging integrations with FHIR bundles being tossed around will not necessarily bring you closer to a better situation, as pointed out:

Any choice of base url is inelegant because it will confuse casual implementers who may expect it to be stable, persistent and usable - which it is not. Alternatively the sending system needs to know what the final resting place for the resource will be, which is a significant deployment overhead.

This is a hugely insightful analysis... and super-helpful. Thank you @Vassil Peytchev ! My radar is going off a bit, I must admit. It sounds like an HIE based entirely on FHIR would be well-served by some kind of façade that made multiple FHIR servers "look like" a single, logical FHIR server. Am I correctly surmising that such a façade would help many aspects of the FHIR spec "just work" -- and that otherwise, it'd make for a significantly more complicated implementation?

Lloyd McKenzie (Oct 18 2020 at 02:27):

Making multiple servers look like a single server might be straight-forward. It depends on whether they have duplicate records, whether there's a need to interpolate records from multiple servers in search responses, etc. I think the major question is whether the data sources have reliable identity for the records (and often whether they can differentiate that data has changed).

Derek Ritz (Oct 18 2020 at 04:04):

@Lloyd McKenzie I guess the "a-ha" moment was realizing that a federation of FHIR servers playing various roles in an HIE (e.g. one is the Client Registry... another one is the Shared Health Record repository... etc.) will not easily work together unless we can make it seem like they're all one big server. There seem, apparently, to be challenges and complexities related to identifying where the resources are. Or am I misunderstanding the arc of the thread so far?

Lloyd McKenzie (Oct 18 2020 at 04:35):

I don't think that's true. It's totally fine having the patients on one server and the providers on a different server

Lloyd McKenzie (Oct 18 2020 at 04:36):

Charlie's issue is with systems that don't have permanent RESTful ids - but that he still wants to interact RESTfully. That doesn't work so well...

Vassil Peytchev (Oct 18 2020 at 04:58):

"make it seem like they're all one big server" is one way to look at it. Another, I think, is to consider how Content Delivery Networks (CDNs) function today on the web - when you visit cnn.com, you are not really getting data from the CNN server, it's cached on a different server somewhere close to you, so you get it fast.

I think there can be a similar role for HIEs (and intermediaries in general) where record location, caching of resources, and negotiating authorization and authentication are the services provided, while the data is accessible RESTfully at the source.

Derek Ritz (Oct 18 2020 at 20:30):

Thanks @Lloyd McKenzie and @Vassil Peytchev -- this is helpful. Lloyd, I think the permalink issue is a bigger one than folks realize. As Vassil has said... even though you always go to cnn.com... cnn.com just then points you to all the stuff it serves you (including all the ads!) which comes from various other places. We need, however, to be much more deterministic about things when it relates to patient-specific data. Even if it is "above the interoperability layer", somehow there still needs to be a way to tell which server the source-of-truth resides on... and there need to be permalinks to those locations so that, no matter how much rebalancing has been done between multiple federated servers, the right resource can still be found. That's non-trivial, in my experience.

Lloyd McKenzie (Oct 18 2020 at 21:04):

The load balancing/behind the scenes stuff isn't something the interface needs to worry about. So long as all of the servers pushing web pages claim that they're cnn.com (and don't serve conflicting content claiming to be the same page), you're fine. FHIR has no notion of "source of truth" other than Provenance - allowing you to trace back where data came from. An assertion of a Condition on one server might be deemed more reliable than one from a different server, but FHIR can't pick an official source of truth any more than Google can choose cnn.com over fox.com or cnn.com

Derek Ritz (Oct 18 2020 at 21:08):

I'm just thinking of source of truth as in: this is the definitive patient resource for Derek. that kind of thing. I'm counting on a FHIR-based HIE to be able to do that. Is that not a reasonable expectation??

Lloyd McKenzie (Oct 18 2020 at 22:43):

There'll be a whole lot of 'definitive' Patient resources for Derek - one per organization at minimum. They might include a linkage to the HIE one, but they're certainly not going to give up having their own. And each will deem their own as "source of truth".

Derek Ritz (Oct 18 2020 at 23:24):

Thanks, @Lloyd McKenzie -- what I'm thinking of here is the location of (in my case) the OHIP resource for Derek. In most of the countries I'm working in there is an expectation that there will be a "golden demographic record" in a central client registry... like the provincial CRs here in Canada. That CR entry for Derek is the source of truth regarding my demographic details, and my ECID. So... related to the previous comments in the thread... there will need to be, I'm surmising, a permalink to this "golden" resource.... right? And my point was that this is (potentially) a non-trivial thing to operationalize, in practice.

Lloyd McKenzie (Oct 18 2020 at 23:38):

That registry will have a base URL and your record will have a specific id. That shouldn't be terribly hard to operationalize I wouldn't think...
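As a sketch of what that could look like, the registry record's address is just base URL plus resource type plus id, and a local Patient can point at it through Patient.link. The registry base URL, both ids, and the helper name are hypothetical; "seealso" is one of the link types FHIR defines for Patient.link.type.

```python
REGISTRY_BASE = "https://cr.example.org/fhir"  # hypothetical client registry


def registry_patient_url(base, resource_id):
    """Compose the absolute URL of a Patient held by the registry."""
    return f"{base.rstrip('/')}/Patient/{resource_id}"


# A local organization keeps its own Patient and links to the registry's record.
local_patient = {
    "resourceType": "Patient",
    "id": "local-42",
    "link": [
        {
            "other": {"reference": registry_patient_url(REGISTRY_BASE, "golden-7")},
            "type": "seealso",
        }
    ],
}

print(local_patient["link"][0]["other"]["reference"])
# https://cr.example.org/fhir/Patient/golden-7
```

The link stays valid only as long as the registry keeps both its base URL and the record id stable, which is the operational commitment discussed above.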

Last updated: Apr 12 2022 at 19:14 UTC
