Stream: implementers
Topic: Bundle - add faceting
natus (Sep 26 2019 at 21:20):
Bundle provides the optional total, which is helpful for pagination.
In the same spirit, I'd like to add facet counts (as in Lucene faceting). Say the Patient resource has 10 million entries:
the bundle would have total=10M. It could also ship a facet on gender (Male=5M, Female=5M) and
a facet on year of birth.
The search parameter could be _facet=gender,deceasedBoolean. I wonder what FHIR type could
represent facets flexibly. I also guess a Bundle extension could work, with a kind of key/value map
data type.
Any thoughts?
Brian Postlethwaite (Sep 27 2019 at 07:21):
That could be put into an operation outcome in the bundle with info messages if you really wanted to.
natus (Sep 27 2019 at 10:25):
@Brian Postlethwaite this makes sense. Do you mean putting the facets as strings in the display element of the OperationOutcome resource in the Bundle resource?
Brian Postlethwaite (Sep 27 2019 at 10:29):
Or info messages in it yes.
natus (Sep 27 2019 at 10:32):
Is there no way to have a complex type such as Map[key -> intValue] instead of a string?
Lloyd McKenzie (Sep 27 2019 at 13:23):
It should be possible to create an operation that did this, but I don't think it's appropriate to build it into the query mechanism. It would add a lot of complexity and wouldn't be relevant to paging. You could submit a change request if you'd like us to define such an operation as part of R5.
natus (Sep 27 2019 at 20:15):
@Lloyd McKenzie I'm not sure I understand why it wouldn't be relevant to paging. A facet is as static as the total.
Lloyd McKenzie (Sep 28 2019 at 01:56):
Yes, but 'total' lets you know how many times you'll have to page forward to get everything. The others are just statistics - and there's pretty much an unlimited set of statistics you might want. For example, you might want males and females, but you might also want males and females by age range or males and females who meet a particular criteria vs. not. And most often when you're looking for the statistics, you won't actually care to page through the details. (If you were going to, there'd be no reason to retrieve the statistics in the first place.)
Lloyd McKenzie (Sep 28 2019 at 01:58):
@natus Note that the above just reflects my personal opinion. If you feel strongly it should be part of search, feel free to submit a change request for that instead.
Grahame Grieve (Sep 28 2019 at 10:52):
no one else has expressed interest in the idea yet?
natus (Sep 28 2019 at 14:01):
@Grahame Grieve I didn't find any mention of facet in the zulip history.
@Lloyd McKenzie I admit faceting can be broad. However, it might be possible to specify the kind and level of facet that could be shared, exactly as search parameters and _elements are limited in scope.
Any search engine provides those statistics natively and easily.
Michele Mottini (Sep 28 2019 at 14:46):
"Any search engine provide those statistics natively and easily"
Uh....no. In our system (SQL Server based), producing male / female stats would involve an extra non-trivial query, because the query would have to deal with mapping our internal values to FHIR male / female.
natus (Sep 28 2019 at 14:56):
@Michele Mottini sure this may not apply to every implementation
Lloyd McKenzie (Sep 28 2019 at 15:37):
The database will typically return a count of total matching records. To do facet counts, you either need to do sub-queries or iterate through all the data (both of which have a significant impact on performance). I'm not aware of any database technologies that would provide facet counts 'for free' the way they provide a count of total records.
natus (Sep 28 2019 at 15:42):
"I'm not aware of any database technologies that would provide facet counts 'for free' the way they provide a count of total records."
@Lloyd McKenzie the same applies to total matching records: no relational database will know that information without scanning the whole table. However, search engines such as Solr/Elasticsearch provide this for free.
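To make the cost argument above concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table layout, the internal sex codes, and the mapping to FHIR gender codes are all hypothetical, chosen only to illustrate the point raised about internal values. It shows that the total is one aggregate query, while a gender facet needs an additional grouped query over the same filter plus a code mapping.

```python
import sqlite3

# Hypothetical internal codes that differ from FHIR's administrative gender codes.
INTERNAL_TO_FHIR = {"M": "male", "F": "female", "U": "unknown"}

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE patient (id INTEGER PRIMARY KEY, sex TEXT, birth_year INTEGER)"
)
conn.executemany(
    "INSERT INTO patient (sex, birth_year) VALUES (?, ?)",
    [("M", 1980), ("F", 1975), ("F", 1990), ("U", 2001), ("M", 1990)],
)

# Total: a single aggregate over the rows matching the search filter.
total = conn.execute(
    "SELECT COUNT(*) FROM patient WHERE birth_year >= 1980"
).fetchone()[0]

# Facet: a second, grouped query over the same filter, followed by the
# mapping from internal codes to FHIR codes.
rows = conn.execute(
    "SELECT sex, COUNT(*) FROM patient WHERE birth_year >= 1980 GROUP BY sex"
).fetchall()
gender_facet = {INTERNAL_TO_FHIR[sex]: count for sex, count in rows}

print(total, gender_facet)
```

Each extra facet adds another grouped query of this kind, which is the overhead being discussed; a search engine with native faceting may compute these counts in the same pass as the search.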
Grahame Grieve (Sep 28 2019 at 18:37):
"I didn't find any mention of facet in the zulip history"
Yes, to my memory no one has brought this up. There's nothing stopping you from doing this in a server, but we generally only standardise something if enough people express interest in the idea.
Lloyd McKenzie (Sep 28 2019 at 19:25):
@Grahame Grieve - the mechanism for doing this on a local basis would be to put extensions in Bundle.meta?
Grahame Grieve (Sep 28 2019 at 19:26):
yes I think so
natus (Sep 28 2019 at 21:08):
So what about the value type? Is string the best option (with JSON in it)?
Lloyd McKenzie (Sep 28 2019 at 22:14):
Your mechanism should ideally work whether the system asks for results in XML, JSON or RDF. So I would go with a complex extension that indicates the facet type and value
natus (Nov 18 2019 at 15:22):
@Lloyd McKenzie you mean a Bundle extension? Apparently Bundle cannot have extensions (from the documentation):
"Although there are no extensions on the Bundle itself, link, entry, and search/request/response can all have extensions"
Lloyd McKenzie (Nov 18 2019 at 16:14):
True, but you can stick them inside Bundle.meta
natus (Nov 18 2019 at 17:06):
what about this:
{
  "resourceType": "Bundle",
  "id": "3ec28c39-cc93-42fa-924e-46dd6ee9c65d",
  "meta": {
    "extension": [
      {
        "url": "facet-encounter-length",
        "extension": [
          { "url": "1 days", "valueInteger": 2000 },
          { "url": "2 days", "valueInteger": 4000 }
        ]
      }
    ],
    ...
}
natus (Nov 18 2019 at 17:18):
@Lloyd McKenzie the general idea is to propose one complex extension per facet, and to limit the facets to a few FHIR elements,
say patient.age, encounter.type, encounter.length, documentReference.type ...
natus (Nov 18 2019 at 17:20):
The sub-extensions would follow a pattern in order to allow range faceting, such as
"[1, 20[":2000
Lloyd McKenzie (Nov 18 2019 at 17:20):
The base URL would have to be a full URL and the child URLs would have to be valid URL syntax (e.g. change the space to %20), but otherwise I think that should work.
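The two corrections above (a full base URL, and child URLs that are valid URL syntax) can be sketched as follows. This is only an illustration, not a defined FHIR extension: the base URL http://example.org/fhir/StructureDefinition/facet- and the helper name facet_extension are hypothetical, and valueInteger is used because that is the FHIR name for an integer extension value.

```python
from urllib.parse import quote

# Hypothetical base URL; any full URL controlled by the implementer would do.
FACET_BASE_URL = "http://example.org/fhir/StructureDefinition/facet-"


def facet_extension(name, counts):
    """Build one complex extension for a facet, with one child per bucket."""
    return {
        "url": FACET_BASE_URL + quote(name, safe=""),
        "extension": [
            # Percent-encode the bucket label so that it is valid URL syntax,
            # e.g. "1 days" becomes "1%20days".
            {"url": quote(label, safe=""), "valueInteger": count}
            for label, count in counts.items()
        ],
    }


bundle = {
    "resourceType": "Bundle",
    "type": "searchset",
    "meta": {
        "extension": [
            facet_extension("encounter-length", {"1 days": 2000, "2 days": 4000})
        ]
    },
}
```

The same encoding step would also cover range labels such as "[1, 20[", whose brackets, comma, and space are not valid in a URL as written.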
natus (Nov 23 2019 at 11:33):
Finally I have something working with complex extensions in the meta of bundles.
The client is able to ask for different kinds of facets:
- &facet=gender : returns a histogram of codes as a complex extension
- &rangeDateFacet=birthdate : returns a histogram of date ranges as a complex extension
- &rangeNumFacet=length : returns a histogram of numeric ranges as a complex extension
- &uniqueFacet=patient,encounter : returns the number of unique patients/encounters that match the filter as a simple extension
This has been implemented with HAPI FHIR and Apache Solr. Thanks @Lloyd McKenzie and @James Agnew
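A client for the facet parameters listed above might look roughly like this. It is a sketch under assumptions: the parameter names come from the message, but the server URL, the helper names facet_query and read_facets, and the exact shape of the returned extensions (percent-encoded child URLs carrying valueInteger counts) are guesses based on the earlier example in this thread, not the actual implementation.

```python
from urllib.parse import unquote, urlencode


def facet_query(base, resource, **params):
    """Build a search URL carrying the facet parameters."""
    return f"{base}/{resource}?{urlencode(params)}"


def read_facets(bundle):
    """Collect facet counts from the extensions in Bundle.meta."""
    facets = {}
    for ext in bundle.get("meta", {}).get("extension", []):
        children = ext.get("extension")
        if children:
            # Complex extension: one child per histogram bucket (assumed shape).
            facets[ext["url"]] = {
                unquote(child["url"]): child["valueInteger"]
                for child in children
                if "valueInteger" in child
            }
        elif "valueInteger" in ext:
            # Simple extension, e.g. a unique count (assumed shape).
            facets[ext["url"]] = ext["valueInteger"]
    return facets


# Hypothetical server base URL.
url = facet_query(
    "http://example.org/fhir", "Patient", facet="gender", rangeDateFacet="birthdate"
)

# Hand-written sample response fragment, not real server output.
sample = {
    "resourceType": "Bundle",
    "meta": {
        "extension": [
            {
                "url": "http://example.org/facet-gender",
                "extension": [
                    {"url": "male", "valueInteger": 12},
                    {"url": "female", "valueInteger": 15},
                ],
            },
            {"url": "http://example.org/unique-patient", "valueInteger": 27},
        ]
    },
}
print(url)
print(read_facets(sample))
```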
Last updated: Apr 12 2022 at 19:14 UTC