Stream: implementers
Topic: Chained search on multiple types
Paul Church (Dec 01 2021 at 22:19):
I've been taking a close look at some chained search cases recently. In the search spec there's a section:
Advanced Search Note: Where a chained parameter searches a resource reference that may have more than one type of resource as its target, the parameter chain may end up referring to search parameters with the same name on more than one kind of resource at once. Servers SHOULD reject a search where the logical id refers to more than one matching resource across different types. For example, the client has to specify the type explicitly using the syntax in the second example above.
I think what this is saying is that if I search DiagnosticReport?subject.name=Peter, and the inner part of the chain finds matching resources for name=Peter that have colliding IDs, e.g. Patient/123 and Location/123, it SHOULD reject the search.
But why? It could simply return all DiagnosticReport resources that refer to either of those two distinct resources. What's problematic about this case?
Josh Mandel (Dec 01 2021 at 22:24):
Good question. We have similar language earlier on the search page:
servers SHOULD reject a search where the logical id refers to more than one matching resource across different types. In order to allow the client to perform a search in these situations the type is specified explicitly:
which applies even to non-chaining scenarios. I assume you have the same concern with this too?
Paul Church (Dec 01 2021 at 22:29):
Yes, I assume this must have come from the same motivation - in this case we're telling people not to do ?my-reference-param=123 without specifying either :type=123 or type/123. This query cannot be validated up front - whether there are colliding resource IDs depends on the data. But if we've successfully executed the query to the point of realizing that this is the case, why not return the results?
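(Editor's sketch, not part of the thread.) The data dependence described above can be illustrated with a small in-memory model. Everything here is an assumption made for illustration: the `Store` type, the function name `check_reference_search`, and the list of target types do not come from the spec or from any real server.

```python
from typing import List, Set, Tuple

# Hypothetical store: the (resource type, logical id) pairs that currently exist.
Store = Set[Tuple[str, str]]


def check_reference_search(store: Store, target_types: List[str], value: str) -> List[str]:
    """Resolve a reference search value to 'Type/id' references.

    Follows the spec's SHOULD as quoted in the thread: reject only when a
    bare logical id matches resources of more than one type. The outcome
    depends on the data in the store, not on the query text alone.
    """
    if "/" in value:
        # The type was given explicitly (e.g. "Patient/123"): nothing to disambiguate.
        return [value]
    matches = [f"{t}/{value}" for t in target_types if (t, value) in store]
    if len(matches) > 1:
        raise ValueError(f"ambiguous id {value!r}: matches {matches}")
    return matches


store = {("Patient", "123"), ("Location", "123"), ("Patient", "456")}
targets = ["Patient", "Location"]
unambiguous = check_reference_search(store, targets, "456")
explicit = check_reference_search(store, targets, "Patient/123")
```

With this store, the bare id `123` is rejected only because both `Patient/123` and `Location/123` happen to exist. Deleting either resource would make the identical query succeed, which is why it cannot be validated up front.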
Josh Mandel (Dec 01 2021 at 22:32):
The original language (earliest git history I have on search.html is 2014) was designed to allow static evaluation (not dependent on server state):
Where a chained parameter searches a resource reference that may have more than one
different type of resource as its target, the parameter chain may end up referring
to search parameters with the same name on more than one kind of resource at once.
The parameter names defined in FHIR have consistent types wherever they are used.
Implementers defining their own names need to be sure that they do not create
unprocessable combinations. Servers SHOULD reject queries chained queries
that lead to disjoint types along the path (e.g. the client has to specify
the type explicitly using the syntax in the second example above).
I'm trying to figure out where we changed this...
Paul Church (Dec 01 2021 at 22:33):
or alternately, don't let queries cross resource types like this regardless of whether colliding IDs actually exist - but that ship has presumably sailed long ago
Josh Mandel (Dec 01 2021 at 22:35):
Yeah, "regardless of whether colliding IDs exist" was the original language and intent. https://github.com/HL7/fhir/commit/144919d370f07e3796a40643a18b1c10d029fb0b is where the spec changed, based on...
Josh Mandel (Dec 01 2021 at 22:36):
I don't think the spec updates in that commit actually reflected the documented resolution in FHIR-8370, FWIW.
Josh Mandel (Dec 01 2021 at 22:43):
I agree we're in a weird place right now @Paul Church. Pragmatically I think:
- We should allow Observation?subject.lastUpdated=gt2021 and stuff like it, no matter what type subject has. There are legit use cases for this.
- We should disallow Observation?subject=123 -- it's not really a coherent query, and likely to cause surprises.
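(Editor's sketch, not part of the thread.) The two rules above can be checked from the query text alone, with no lookup against server data. The following is a minimal illustration under stated assumptions: `REFERENCE_TARGETS` is a hypothetical lookup table and `is_statically_allowed` is a made-up name, not part of any FHIR library.

```python
from urllib.parse import parse_qsl

# Hypothetical table: target resource types per (resource type, reference parameter).
REFERENCE_TARGETS = {
    ("Observation", "subject"): ["Patient", "Group", "Device", "Location"],
}


def is_statically_allowed(query: str) -> bool:
    """Apply the two proposed rules using only the query text."""
    resource_type, _, query_string = query.partition("?")
    for name, value in parse_qsl(query_string):
        if "." in name:
            # Chained parameter (e.g. subject.lastUpdated): allowed for any target type.
            continue
        base, _, type_modifier = name.partition(":")
        targets = REFERENCE_TARGETS.get((resource_type, base), [])
        # A bare id on a multi-type reference, with no :Type modifier and no
        # Type/ prefix, is the case proposed for rejection.
        if len(targets) > 1 and not type_modifier and "/" not in value:
            return False
    return True


chained_ok = is_statically_allowed("Observation?subject.lastUpdated=gt2021")
bare_id_ok = is_statically_allowed("Observation?subject=123")
```

Unlike the collision rule in the current spec text, this check gives the same answer regardless of which resources exist on the server.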
Josh Mandel (Dec 01 2021 at 22:44):
But extra pragmatically: we should remove both prohibitions, because they're confusing and hard to disentangle.
René Spronk (Dec 02 2021 at 07:59):
Hmm - I thought the wording was intending to state that one can't assume a parameter (even if it has the same name) to have the same semantics across different resource types. As such referring to e.g. subject.name when there are multiple resource types that match subject is problematic.
Paul Church (Dec 02 2021 at 18:43):
Searching across multiple types by parameter(s) having the same name is explicitly allowed when using parameters that list multiple types in SearchParameter.base, see http://hl7.org/fhir/http.html#vsearch - the Google implementation doesn't do this but it's a precedent for why subject.name might be allowed.
I think the most plausible path forward is to remove both prohibitions because they're confusing and not what appears to have been intended (because queries can't be statically validated). Switching to a different set of prohibitions would be nice but potentially breaking.
Craig McClendon (Dec 08 2021 at 15:41):
I had long assumed (from the original wording) that the intent was to avoid the potential complexity of dealing with diverging branches for multi-chained queries. You could reach a point where a specified search param is not valid for one branch/path where it is for another.
Josh Mandel (Dec 08 2021 at 15:45):
Agree on the problem statement and proposed solution, Paul! Are you willing to create a Jira issue for this and link back to discussion here?
Craig McClendon (Dec 08 2021 at 15:50):
i.e. - what do you do with this?
Observation.subject.birthdate=2011
Josh Mandel (Dec 08 2021 at 19:41):
Assuming you mean Observation?subject.birthdate=2011: With the prohibitions removed, servers would be free to process that and return results for any linked resources with a birthdate search param that matches. (Servers could also refuse, and require clients to ask for something explicit like Observation?subject:Patient.birthdate=2021.)
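(Editor's sketch, not part of the thread.) The two server behaviours described above, processing the chain for every target type that defines the parameter or refusing until the client names a type, might look roughly like this. `SEARCH_PARAMS` is an illustrative subset chosen for the example, and `expand_chain` and its `strict` flag are hypothetical names.

```python
from typing import List, Optional

# Illustrative subset of search parameters per resource type (an assumption for this sketch).
SEARCH_PARAMS = {
    "Patient": {"birthdate", "name"},
    "Group": set(),
    "Device": set(),
    "Location": {"name"},
}


def expand_chain(targets: List[str], chain_param: str,
                 explicit_type: Optional[str] = None,
                 strict: bool = False) -> List[str]:
    """Return the target types a chained parameter would be evaluated against.

    Permissive mode: every target type that defines chain_param.
    Strict mode: refuse unless the client named one type (subject:Patient.birthdate).
    """
    candidates = [explicit_type] if explicit_type else list(targets)
    if strict and explicit_type is None and len(candidates) > 1:
        raise ValueError("ambiguous chain; specify a type, e.g. subject:Patient")
    # Types that do not define the parameter simply contribute no matches.
    return [t for t in candidates if chain_param in SEARCH_PARAMS.get(t, set())]


subject_targets = ["Patient", "Group", "Device", "Location"]
birthdate_types = expand_chain(subject_targets, "birthdate")
name_types = expand_chain(subject_targets, "name")
```

In permissive mode, `subject.birthdate` resolves to Patient only, while `subject.name` spans Patient and Location. The second case is where the concern about differing semantics for same-named parameters would apply.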
Craig McClendon (Dec 08 2021 at 22:56):
Correct, Josh - that's what I meant.
René has a good point too regarding differing semantics for search parameters with the same name. I don't know if anything like that exists - i.e. two resources that have the same search parameter name with a different search parameter type.
I'm also trying to conceive if there are cases where multi-chained paths/branches could diverge and rejoin (requiring de-duplication), create a loop, or other such things. Again, not sure if that's possible.
Anyway, I don't disagree with loosening the language - just thinking out loud where there may be difficult cases which led to the restriction.
Last updated: Apr 12 2022 at 19:14 UTC