FHIR Chat · Search with limit of returned results · implementers

Stream: implementers

Topic: Search with limit of returned results


Shlomy Reinstein (May 10 2017 at 12:00):

Is there a way to limit the number of search results (excluding "_include" and "_revinclude")? We typically want to get only the most recent resource of a specific type, so we often use "_sort:desc=date&_count=1", but according to the FHIR paging specification, _count does not limit the number of resources returned, only the number of resources returned in a single page.
For example, we would like to get only the latest Composition of a specific type for a patient, with all the resources that it references (using _include). How can we do that? Using sorting and _count=1, we get a paged result, where the first bundle contains only what we requested, but it also contains a link to the next bundle, which provides yet another Composition. How do we know if we should fetch the next page or not? Can we know if the next page is a Composition that is irrelevant for us (since we only need the latest one), or if the next page contains some of the included resources?
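
For readers following along, the query described in this question could be assembled roughly as below. This is only a sketch: the base URL, patient id and type code are made-up placeholders, and `latest_composition_url` is a hypothetical helper, not part of any FHIR client library. The `_sort:desc=date` form is the one quoted in the question; STU3 servers may expect `_sort=-date` instead.

```python
from urllib.parse import urlencode


def latest_composition_url(base, patient_id, type_code):
    """Build a search URL asking for the newest Composition plus its includes.

    All argument values are placeholders chosen by the caller.
    """
    params = [
        ("patient", patient_id),   # Composition search parameter
        ("type", type_code),       # Composition search parameter
        ("_sort:desc", "date"),    # sort syntax as quoted in the question
        ("_count", "1"),           # page size, not an overall limit
        ("_include", "*"),         # pull in referenced resources
    ]
    # Keep ':' and '*' readable instead of percent-encoding them.
    return f"{base}/Composition?{urlencode(params, safe=':*')}"


print(latest_composition_url("http://example.org/fhir", "123", "11503-0"))
```

Note that `_count=1` here only bounds the focal resources per page, which is exactly the limitation the question is about.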

Lloyd McKenzie (May 10 2017 at 12:11):

Each page contains up to _count of the focal resources for the search (the resource type in the search URL), plus all included resources for all of the resources on the page.

Lloyd McKenzie (May 10 2017 at 12:13):

So [base]/Patient?_count=10&_revinclude=Observation:subject could come back with a Bundle that contained 10k+ entries - 10 patients plus every observation ever captured for any of those patients. (_revinclude should be used with caution . . .)

Lloyd McKenzie (May 10 2017 at 12:15):

As well, when retrieving subsequent pages, you may well see _included and _revincluded resources that you saw on previous pages - there's no expectation that you remember "included" resources you saw on previous pages. (In part because there's no presumption you'll navigate sequentially. You might hit the "last" link after the first page and navigate backwards.)

Shlomy Reinstein (May 10 2017 at 12:20):

Thanks! Does this mean that the size of a page is not limited in any way? If I ask for a Composition with _include=*, and it contains 1000 references, they will all be returned in a single page?

Grahame Grieve (May 10 2017 at 12:20):

could be, yes

Lloyd McKenzie (May 10 2017 at 12:27):

Right. Page count doesn't limit number of entries in the Bundle, only the number of primary resources in the Bundle. There is no way to say you don't want 10000+ rows in a single response from a query with _revinclude. (Cardinalities are typically such that this is less of an issue for _include)

Jenni Syed (May 10 2017 at 12:59):

Observation implemented this "only give me the last X" concept as a new operation: http://hl7.org/fhir/STU3/observation.html#lastn
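
A rough sketch of what calling that operation might look like from a client, assuming the STU3 `$lastn` definition with its optional `max` parameter (maximum Observations per group). The base URL and parameter values are placeholders, and `lastn_url` is a hypothetical helper written only for illustration.

```python
from urllib.parse import urlencode


def lastn_url(base, patient_ref, category, max_per_group=1):
    """Build an Observation/$lastn URL (placeholder values, illustrative only)."""
    query = urlencode(
        {"patient": patient_ref, "category": category, "max": max_per_group},
        safe="/",  # keep 'Patient/123' readable
    )
    return f"{base}/Observation/$lastn?{query}"


print(lastn_url("http://example.org/fhir", "Patient/123", "vital-signs", 3))
```

Whether a given server supports the operation at all would still need to be checked against its capability statement.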

Shlomy Reinstein (May 10 2017 at 13:36):

Here is my understanding:
1. There is no way to tell FHIR how many focal resources to return overall. I can just tell it the max number of focal resources to return in a page, but it will provide a "next" link with more resources (which I don't have to fetch). This in itself can be a performance problem, if the server wants the query result to be coherent and provide a "next page" which doesn't take into account changes made between the original query and the fetching of the next page.
2. _count specifies only the max number of focal resources to return in a page, so the server may return a page with fewer focal resources and provide a "next" link to fetch the rest. Hence, if I wanted to get the 2 "latest" Compositions, the server may return just one of them in the first page, and the next one in the next page.
Is the above true?

Shlomy Reinstein (May 10 2017 at 13:37):

(as an elaboration of (1) above, if the server wants to compute the entire query results at one time, and then store them and return one page at a time, it will have to find all the resources in the database that match the search criteria, since nothing in the query provides a limit to the overall query result)

Shlomy Reinstein (May 10 2017 at 13:39):

3. There's another issue that may come into play here. How do I know, from the search results, which are the focal resources and which are included? E.g. if I ask for a Composition with _include=*, and some of the included resources are Composition resources.

Grahame Grieve (May 10 2017 at 13:50):

1. yes. server writers have not generally raised this as an issue.
2. yes, it may, but I have not seen any servers actually do that
3. see Bundle.entry.search.mode
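
Taken together, the three answers suggest a client-side approach: count only entries whose `search.mode` is `match`, and follow the "next" link only while fewer than the wanted number of focal resources have been seen. A minimal sketch, assuming bundles are already parsed JSON dicts; `fetch` is a hypothetical callable supplied by the caller (for example a thin wrapper around an HTTP GET), and entries without a `search.mode` are treated as non-matches, which is an assumption of this sketch rather than a rule from the spec.

```python
def matches(bundle):
    """Return the focal resources of a searchset Bundle (search.mode == 'match')."""
    return [
        entry["resource"]
        for entry in bundle.get("entry", [])
        if entry.get("search", {}).get("mode") == "match"
    ]


def next_link(bundle):
    """Return the URL of the 'next' page, or None if there is none."""
    for link in bundle.get("link", []):
        if link.get("relation") == "next":
            return link.get("url")
    return None


def first_n_matches(fetch, url, n):
    """Collect up to n focal resources, fetching further pages only if needed.

    `fetch` is a caller-supplied function mapping a URL to a parsed Bundle.
    """
    found = []
    while url and len(found) < n:
        bundle = fetch(url)
        found.extend(matches(bundle))
        url = next_link(bundle)
    return found[:n]
```

With this shape, a page that returns fewer focal resources than `_count` (point 2) is handled by the loop, and included Compositions (point 3) are never mistaken for the focal one.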

Shlomy Reinstein (May 10 2017 at 13:55):

Re (1) - HAPI FHIR computes the results of a query at one time, sends back the first page to the client, but persists the entire result set in the DB to serve the other pages later, without having to compute the query again (and maintaining a coherent set of results, no matter what happened in FHIR meanwhile). We used to get a lot of issues due to that, and only now I understand the reason (because most of our queries are with _count=1).

Grahame Grieve (May 10 2017 at 13:55):

why did you get issues with that?

Shlomy Reinstein (May 10 2017 at 13:58):

Because we used to send many concurrent queries, most of which were expected to return a single result (due to _count), when actually many resources were included in the result set, which were sent to storage.

Grahame Grieve (May 10 2017 at 13:58):

my server artificially limits to fairly short lists - about 1000.

Grahame Grieve (May 10 2017 at 13:59):

no one has ever noticed ;-)

Jenni Syed (May 10 2017 at 14:00):

re (1) Yes, that is one of the reasons we added lastN for Observation (there were many other reasons, like how hard it can be for a client to find a server that has implemented _count and _sort etc.)
re (2) Our server does this, mostly for performance. I.e.: you asked for 10,000 and we're only going to get 50 at a time. I don't know that we would stop "2" or a low number from coming back, but we absolutely have a max behind the scenes.
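
The server-side cap described in (2) can be pictured as a simple clamp on the requested `_count`. This is only an illustration of the idea, not how any particular server does it; the function name and the numbers 50 and 20 are placeholders (50 echoes the figure in the message above).

```python
def effective_page_size(requested, server_max=50, default=20):
    """Clamp a client's _count to a server-side maximum (illustrative values)."""
    if requested is None:
        # No _count supplied: fall back to the server default, still capped.
        return min(default, server_max)
    return max(0, min(requested, server_max))


print(effective_page_size(10000))  # clamped to the server maximum
print(effective_page_size(2))      # small requests pass through unchanged
```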

Eric Haas (May 11 2017 at 12:06):

I would like to see lastN as a connectathon topic. This is something that was on top of the list for Jenni and right now is only in the spec for Observation, but I can see it being used for other resources to address this issue.

Grahame Grieve (May 11 2017 at 12:07):

Do any servers support it at this time? I don't think mine does

Eric Haas (May 11 2017 at 12:07):

Hence the need for a connectathon. :-) to socialize the operation.

Jenni Syed (May 11 2017 at 14:35):

We will support it as soon as we get the time :) We have many people asking for this.

Jenni Syed (May 11 2017 at 14:35):

Up for a connectathon

Grahame Grieve (May 11 2017 at 14:58):

I'll support it for San Diego

Jenni Syed (May 11 2017 at 15:04):

I can as well

Eric Haas (May 11 2017 at 15:05):

I'll create the wiki and be the titular lead.

Anand Mohan Tumuluri (May 11 2017 at 22:48):

The grouping defined in Observation.$lastn is very specific to Observation and may only make sense there. It seems to utilize the search parameters, and there is potential for a conflict between the explicit max and sorting specified for $lastn and general search parameters like _count and _sort.

Elliot Silver (May 11 2017 at 23:22):

On a tangentially related topic, does anyone see a need for something like _summary=id? This would return within the bundle only the resource ids. The client could then retrieve the individual resources separately.
It is an idea present in XDS, and it seems useful to me, but I don't recall ever seeing it discussed. Is there something else about FHIR that makes it not needed?

Grahame Grieve (May 12 2017 at 04:57):

no one has asked for it before

Jim Steel (May 12 2017 at 06:25):

_elements=id

Anand Mohan Tumuluri (May 12 2017 at 06:53):

Nope, the spec says clients SHOULD list all mandatory elements in a resource as part of the list of elements. So this isn't possible unless there are no mandatory elements in the resource.

Elliot Silver (May 12 2017 at 08:20):

OK, so it looks like I could get close using _elements. Anyone have a thought about why there hasn't been a demand? (I'm trying to figure out if I'm thinking of the issue the wrong way.)

Grahame Grieve (May 12 2017 at 11:03):

Usually want text as well for display purposes

Lloyd McKenzie (May 12 2017 at 11:49):

We can't omit minOccurs=1 elements. We could return Bundles that only include fullUrl but no resource, but we'd need to change some of the constraints. If we include the resource, then there's no way to have that without all the resource mandatories present. So you're not going to get a different behavior than _elements=id.
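
On the client side, the "ids first, fetch later" pattern discussed here could be approximated with `_elements=id` roughly as follows. This is a sketch under the assumption that the response is a parsed JSON Bundle whose resources still carry `resourceType` and `id` (plus whatever mandatory elements the server keeps); `references_from_bundle` is a hypothetical helper name.

```python
def references_from_bundle(bundle):
    """Return 'Type/id' references for every resource in a parsed Bundle."""
    refs = []
    for entry in bundle.get("entry", []):
        resource = entry.get("resource", {})
        # Skip entries that lack a resource or an id.
        if "resourceType" in resource and "id" in resource:
            refs.append(f'{resource["resourceType"]}/{resource["id"]}')
    return refs
```

Each returned reference could then be read individually (for example `[base]/Composition/c1`), at the cost of one extra round trip per resource.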

John Moehrke (May 15 2017 at 20:41):

see GF#10001


Last updated: Apr 12 2022 at 19:14 UTC