Stream: implementers
Topic: reverse search for any referencing resources
Sean McIlvenna (Oct 03 2019 at 00:05):
If I have resources A and B that reference resource C, is there a way to search for any resources that reference resource C? In other words, if I only know of resource C and want to find anything that references it, is that possible?
Lloyd McKenzie (Oct 03 2019 at 01:39):
I don't think so. Both _revinclude and _has require specifying the specific target class and relationship.
Sean McIlvenna (Oct 03 2019 at 01:44):
That was my conclusion as well, @Lloyd McKenzie ... I ended up submitting a tracker (24888) for this.
Lloyd McKenzie (Oct 03 2019 at 01:47):
It's a really expensive search to run - you'd need to hit every resource and every element that could have extensions that could possibly point to the resource. What's the use-case?
René Spronk (Oct 03 2019 at 04:16):
If you were to supply a GraphDefinition as the intended response structure, it could be defined (well, not right now, but it's within its scope) to be an "any resource with a reference to C" graph. That won't make the query less expensive, but it just goes to show we could already support the use case (once we finalise the GraphDefinition resource, and if/when implemented by servers :-) )
Lloyd McKenzie (Oct 03 2019 at 04:25):
Wow. Didn't realize that 'path' was optional in 'link'. That's pretty scary given that you'd then have to check all extensions within the source resource as well as all elements that have outbound relationships.
Sean McIlvenna (Oct 03 2019 at 14:28):
To support that, I would think you would want to create an index of references. When a resource is POST/PUT to a server, extensions would be traversed and any references extracted.
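A minimal sketch of the reference index described above, in Python rather than any real server's code. It assumes resources arrive as parsed FHIR JSON and that only literal relative references (such as "Patient/123") matter; absolute URLs, logical identifiers and contained resources are not handled. The names `extract_references` and `ReferenceIndex` are hypothetical.

```python
from collections import defaultdict


def extract_references(node):
    """Yield every literal reference string found anywhere in a FHIR JSON tree.

    Walks core elements and extensions alike, since both are plain nested
    dicts and lists once the resource is parsed.
    """
    if isinstance(node, dict):
        ref = node.get("reference")
        if isinstance(ref, str):
            yield ref
        for value in node.values():
            yield from extract_references(value)
    elif isinstance(node, list):
        for item in node:
            yield from extract_references(item)


class ReferenceIndex:
    """Hypothetical reverse index: target reference -> set of referring resources."""

    def __init__(self):
        self._referrers = defaultdict(set)

    def store(self, resource):
        """Call on every POST/PUT so the index follows the stored content."""
        source = f"{resource['resourceType']}/{resource['id']}"
        # Drop entries left over from a previous version of this resource.
        for sources in self._referrers.values():
            sources.discard(source)
        for target in extract_references(resource):
            self._referrers[target].add(source)

    def referrers_of(self, target):
        """Return the resources currently referencing `target`, sorted."""
        return sorted(self._referrers.get(target, ()))
```

Answering "what references Patient/123?" then becomes a single lookup, at the cost of extra work on every write.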
Sean McIlvenna (Oct 03 2019 at 14:29):
But really, I'm less concerned about extensions than I am about the core spec's References.
Lloyd McKenzie (Oct 03 2019 at 14:29):
The challenge is that most people will care about "the relationships I care about" - which will likely be a subset of both core elements and extensions
Sean McIlvenna (Oct 03 2019 at 14:33):
I think the main benefit of something like this is to ensure referential integrity.
Sean McIlvenna (Oct 03 2019 at 14:33):
for example, HAPI enforces referential integrity by default
Sean McIlvenna (Oct 03 2019 at 14:34):
so, if I wanted to delete a resource that is referenced by other things, right now I have no way of determining what resources reference it
Sean McIlvenna (Oct 03 2019 at 14:34):
short of a TON of queries
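To illustrate why this takes so many queries: both a plain search and _revinclude need the source resource type and the search parameter spelled out, so a client has to enumerate every pair it knows about. The Python sketch below only builds the URLs; the base URL and the three-entry `REFERENCE_PARAMS` list are assumptions for illustration, and a real list would be far longer and still would not cover references that have no search parameter.

```python
from urllib.parse import urlencode

# Hypothetical, deliberately tiny subset of (resource type, search parameter)
# pairs that can point at a Patient.
REFERENCE_PARAMS = [
    ("Observation", "subject"),
    ("Encounter", "subject"),
    ("Condition", "subject"),
]


def per_type_searches(base, target, params=REFERENCE_PARAMS):
    """One search URL per (type, parameter) pair that might reference `target`."""
    return [
        f"{base}/{rtype}?{urlencode({name: target}, safe='/')}"
        for rtype, name in params
    ]


def revinclude_search(base, target, params=REFERENCE_PARAMS):
    """A single search on the target that lists every pair as a _revinclude."""
    target_type, target_id = target.split("/", 1)
    query = [("_id", target_id)]
    query += [("_revinclude", f"{rtype}:{name}") for rtype, name in params]
    return f"{base}/{target_type}?{urlencode(query, safe=':')}"
```

For example, `per_type_searches("http://example.org/fhir", "Patient/123")` yields three URLs, starting with `http://example.org/fhir/Observation?subject=Patient/123`.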
Sean McIlvenna (Oct 03 2019 at 14:35):
it is less likely that a system will enforce referential integrity on references in extensions. if it does, then it likely processes references in extensions like I mentioned above (via an index)
Lloyd McKenzie (Oct 03 2019 at 16:02):
Enforcing referential integrity sounds like something that should be handled at the persistence layer rather than as a query capability?
Sean McIlvenna (Oct 03 2019 at 16:13):
In HAPI, for example, referential integrity IS enforced at the persistence layer. So, you can't delete a resource that is referenced by something else. But, how do you find out what that "something else" is?? you can't...
James Agnew (Oct 03 2019 at 16:21):
you could always write a custom interceptor on the STORAGE_PRESTORAGE_DELETE_CONFLICTS hook.. it gets called when a conflict is about to prevent deletion in order to allow you to resolve it
Sean McIlvenna (Oct 03 2019 at 16:33):
we actually like the way referential integrity is enforced right now... we wouldn't want to change that
Sean McIlvenna (Oct 03 2019 at 16:33):
but to an end-user trying to delete a resource, they need a way of knowing what is referencing it, that may be affected by the change
Sean McIlvenna (Oct 03 2019 at 16:34):
but right now there is no way to know that
Paul Church (Oct 03 2019 at 16:41):
GCP does handle this in the persistence layer and not in the search index...sort of. Not all references are the subject of search parameters; the ones that are do appear in the search index so there are tricks that can be used to find them. It turns out that "?_content=Patient/123" on the base URL works, but that's not guaranteed by the spec. I don't think the search index is the right place to solve this problem.
Even if we give you a nice error message "can't delete Patient/123 because it's referred to by Observation/456, Observation/789, ..." (and we currently don't) this is not parseable in an interoperable way, and there could be an unbounded number of references.
For the specific case of any resource that has a compartment definition, retrieving the compartment might help.
Define a new operation? It would be relatively easy for us to page through the references from storage and return them.
Sean McIlvenna (Oct 03 2019 at 16:42):
If the error response from the delete request included a computable list of references that are stopping the delete, that would solve the problem for me...
Sean McIlvenna (Oct 03 2019 at 16:43):
for example, perhaps an OperationOutcome with a specific code, and multiple OperationOutcome.issue.expression entries for each of the referenced resources
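A rough sketch of what such a response body could look like, built as plain JSON in Python. This is not an agreed or standardized format: the choice of the "conflict" issue code, the single-issue layout, and the function name `delete_conflict_outcome` are assumptions, and issue.expression is normally meant for FHIRPath element paths, so listing resource references there follows the suggestion above rather than the element's usual meaning.

```python
def delete_conflict_outcome(target, referrers):
    """Build an OperationOutcome naming the resources that block a delete.

    `target` is the resource that could not be deleted (e.g. "Patient/123");
    `referrers` is an iterable of references that still point at it.
    """
    referrers = list(referrers)
    return {
        "resourceType": "OperationOutcome",
        "issue": [
            {
                "severity": "error",
                "code": "conflict",  # assumed code; a dedicated one might be defined instead
                "diagnostics": (
                    f"Cannot delete {target}: referenced by "
                    f"{len(referrers)} other resource(s)"
                ),
                # One entry per referring resource, as proposed in the thread.
                "expression": referrers,
            }
        ],
    }
```

A client could then read `issue[0]["expression"]` to decide which resources to update or delete first.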
Paul Church (Oct 03 2019 at 16:47):
I think the biggest risk there is that there are too many of them and you can't paginate an OperationOutcome response.
Paul Church (Oct 03 2019 at 17:09):
@James Agnew does HAPI enforce referential integrity on references that are in extensions?
James Agnew (Oct 03 2019 at 17:19):
@Paul Church We do, but only if there is a search parameter on that extension, and the server is configured to enforce referential integrity.
James Agnew (Oct 03 2019 at 17:34):
...and Sean, you could definitely implement that logic (the OO resource containing the conflicts) using an interceptor on the hook I mentioned. Honestly that sounds pretty useful to me.
There is a catch though: If you can't delete Patient/1 because Observation/2 links to Patient/1, you'd get that in the response.. but there is no guarantee you would be able to delete Observation/2, since DiagnosticReport/3 might link to it.
Sean McIlvenna (Oct 03 2019 at 17:40):
No, but you could delete just the reference inside Observation/2, effectively leaving it blank...
Sean McIlvenna (Oct 03 2019 at 17:41):
@James Agnew is that logic you mentioned something you would want in core hapi-fhir, or as part of hapi-fhir-jpaserver-starter
Sean McIlvenna (Oct 03 2019 at 17:41):
if in core hapi-fhir, I would think you would want that logic built into framework, not implemented via hooks... could be wrong though
James Agnew (Oct 03 2019 at 18:09):
I think it would make sense in core hapi fhir... but as a hook. These days our philosophy is increasingly that if something can be done with interceptors, that's how we do it. It enforces a nice separation of concerns, and means it's really easy to turn on and off features (you'll notice that a lot of the new flags being added to jpaserver-starter actually just toggle interceptors).
Sean McIlvenna (Oct 03 2019 at 19:38):
no idea if/when I will be able to get to this...
Sean McIlvenna (Oct 03 2019 at 19:38):
but if I ever do, that's the approach I'll try
Brian Postlethwaite (Oct 04 2019 at 04:35):
You could craft a search parameter that does this (which I noted when considering how to locate all the resources for processing a patient merge).
Only way around it is to scan everything on the way in and tag a special search parameter patient-for-merge
search expression: descendants.ofType(ResourceReference).resolve().ofType(Patient)
I know probably wrong and expensive but that gives the idea.
And maybe use it on the system level search?
Lloyd McKenzie (Dec 07 2020 at 20:51):
There's a proposal to support _include and _revinclude syntaxes that allow "*" as a relationship type. Questions to consider:
- would this only match on the specific search criteria the system supports, or would it handle all references, even if they aren't supported as independent search criteria? (The latter is necessary to safely use this for referential integrity purposes)
- what are the performance ramifications of supporting this?
- If we introduce this, would the syntax allow constraining what resource types relationships are considered for?
- if we don't do this in search, should we support a custom operation instead?
Lloyd McKenzie (Dec 07 2020 at 20:52):
It may be we don't actually want a search set, but rather just the resource ids that have references.
Bas van den Heuvel (Dec 07 2020 at 20:53):
We encountered this issue also in FHIRcast, where we want to find all resources that relate to an "Anchor" - all referred to and all referred from. Having this as a search criterion would be great, as we can use R5 subscriptions for it.
Paul Church (Dec 07 2020 at 20:55):
That gets at the idea of subscribing to changes to a patient, but more generally. I believe this is an area where search-based subscription definitions become quite awkward, but they are the only tool we currently have.
Gino Canessa (Dec 07 2020 at 21:05):
Paul Church said:
That gets at the idea of subscribing to changes to a patient, but more generally. I believe this is an area where search-based subscription definitions becomes quite awkward, but is the only tool we currently have.
The R4 backport of topic-based subscriptions is up for ballot in January, so it will hopefully be available Soon(tm)
Christiaan Knaap (Dec 08 2020 at 10:50):
I can't really judge the usefulness of it, but implementing it is quite doable for Vonk.
For _include, both options are feasible (all reference searchparameters or all reference elements), but for efficient evaluation of _revInclude it would have to be on searchparameters only.
Having a type modifier seems useful to me, e.g. for Observation that can have many basedOn or partOf references, most of which may not be of interest to the search at hand.
But I do expect problems with :iterate.
Lloyd McKenzie (Jan 04 2021 at 20:17):
@Grahame Grieve @James Agnew - thoughts on this?
René Spronk (Jan 05 2021 at 07:47):
Mind you, we already have wildcards for 'reference-type search parameters', to _(rev)include=Encounter:* - this isn't actually supported on (the public version of) HAPI, Vonk, test.fhir.org last I tested.
Lloyd McKenzie (Jan 27 2021 at 19:28):
Question for @James Agnew @Grahame Grieve: are we in favor of adding support for _revInclude=*?
James Agnew (Jan 27 2021 at 19:48):
HAPI already supports _revInclude=*, at the time it was implemented I don't think we knew it wasn't in the spec :)
Paul Church (Jan 27 2021 at 19:49):
_revinclude=* has been in the spec for a while, I thought this was about being able to revinclude on incoming references that aren't the subject of any search parameter?
Gino Canessa (Jan 27 2021 at 20:08):
As far as I can tell, it's not specified. The section in search says that you can use an * in place of the search parameter name. It does not indicate anything about the resource name, or being able to override the <resource>:<search param> format with a single *.
(btw, I am for the change - just stating why I think a change is necessary)
James Agnew (Jan 27 2021 at 20:59):
Ah, didn't realize this was about supporting it for non-searchparameter revincludes.
I'm indifferent to that idea. It wouldn't be hard to implement, but it'd certainly have an impact on indexing speed so I'd probably want it to be configurable at the server level. Doable though, if there was a good use case.
Gino Canessa (Jan 27 2021 at 21:17):
I'm pretty sure that's part of the question. FHIR#24888 is asking for a way to search for all references to a resource. Since the behavior of _include=* and _revInclude=* isn't defined, does/should it:
- Match every resource with a search parameter of the correct linkage.
- Match every resource with a link.
If it's agreed on the latter, then that fills the need of the request (with additional docs, etc.). Otherwise... an operation? another parameter?
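The difference between the two readings can be sketched in Python as follows. Nothing here reflects how any particular server works: the `INDEXED` map stands in for "elements that have a reference search parameter" and is a made-up, top-level-only simplification, and both function names are hypothetical.

```python
# Hypothetical stand-in for the server's reference search parameters:
# resource type -> top-level elements covered by a search parameter.
INDEXED = {"Observation": ["subject"], "DiagnosticReport": ["subject"]}


def all_references(node):
    """Yield every literal reference string anywhere in the resource."""
    if isinstance(node, dict):
        ref = node.get("reference")
        if isinstance(ref, str):
            yield ref
        for value in node.values():
            yield from all_references(value)
    elif isinstance(node, list):
        for item in node:
            yield from all_references(item)


def matches_via_search_params(resource, target, indexed=INDEXED):
    """Reading 1: only elements backed by a search parameter count."""
    for element in indexed.get(resource["resourceType"], []):
        value = resource.get(element)
        values = value if isinstance(value, list) else [value]
        if any(isinstance(v, dict) and v.get("reference") == target for v in values):
            return True
    return False


def matches_via_any_link(resource, target):
    """Reading 2: any reference counts, including those inside extensions."""
    return target in all_references(resource)
```

A resource whose only link to the target sits in an extension matches under the second reading but not the first, which is the gap that matters for referential integrity.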
Paul Church (Jan 27 2021 at 22:23):
I have always been in favour of an operation for this, Resource/$references-to. I don't think something that explicitly isn't limited to search parameters is a good fit for search.
René Spronk (Jan 28 2021 at 08:51):
If _include=* were to mean "it's up to the server to include as many as possible (without any guarantees of completeness) resources that have a reference to the focal resource" then I'd be in favor of such a suggestion. Note that even $everything doesn't guarantee you get 'everything'. If you want a guaranteed 100% set of resources - which is harder to implement - it sounds like an operation to me, which a server may or may not wish to support. Note that we already have a wildcard on the searching of compartments, adoption of which is low AFAIK.
René Spronk (Jan 28 2021 at 09:16):
Reading https://www.pewtrusts.org/en/research-and-analysis/reports/2021/01/standard-technology-presents-opportunities-for-medical-record-data-extraction I see lots of / some of US EHR/EMR vendors are going to charge "per API call", which means clients will try and minimize API calls, which in FHIR terms means they'll try and squeeze as much data out of a server in one API call as they can, a) thus subverting the nature of REST [multiple more atomic exchanges when one really needs the data], and b) probably leading to a client duplicating the data on their own system rather than requesting a new fresh copy of that data each and every time they need it. (b) has patient safety issues associated with it, one would be working with potentially outdated data.
Whilst I understand that some would like to use a fee per call scheme (reminds me of the fee per message scheme in the 1990s) - such a choice has negative repercussions. IMHO It would be best if US regulators were to move away from this specific payment scheme.
Last updated: Apr 12 2022 at 19:14 UTC