Stream: implementers
Topic: Gender discrimination
Simone Heckmann (Oct 16 2018 at 14:41):
The HAPI test server currently resturns a Bundle.total value for search on female Patients but not for male Patients. :smile:
How does that happen!?
Simone Heckmann (Oct 16 2018 at 14:41):
@James Agnew
nicola (RIO/SS) (Oct 16 2018 at 14:45):
Looks like Gender Discrimination :) - let me check on our server
Lloyd McKenzie (Oct 16 2018 at 14:49):
@James Agnew ?
Simone Heckmann (Oct 16 2018 at 17:06):
I think it has to do with the total number of results. Apparently there‘s a threshold at whIch HAPI stops counting... :)
James Agnew (Oct 16 2018 at 18:23):
Lol, this is kinda hilarious.
The reason is definitely around the number of results. The JPA server has some optimizations in it, including the fact that it doesn't try to calculate a total right away for searches that return a lot of results.
We should really make this configurable, the public server never really gets to a volume of data where this calculation is onerous so it doesn't make sense to skip the total and people do find this ind of thing a bit weird. :)
Lloyd McKenzie (Oct 16 2018 at 18:34):
Perhaps send an extension on the total that says "more than 1000" or something like that?
James Agnew (Oct 16 2018 at 22:19):
That's a good idea!
Lloyd McKenzie (Oct 16 2018 at 22:25):
And it'll help teach people about extensions on simple types ;)
nicola (RIO/SS) (Oct 17 2018 at 08:57):
We have similar optimisation - i think, it is worth to discuss and make count in search bundle optional or estimated. We use _countMethod custom parameter to configure this behaviour. @James Agnew @Grahame Grieve what is your opinion about _countMethod param?
Simone Heckmann (Oct 17 2018 at 09:08):
I think total is already optional. The invariant bdl-1 only forbids the use of total for anything other than history or searchset Bundles, but it doesn't enforce it's presence for searchset/history
nicola (RIO/SS) (Oct 17 2018 at 09:16):
I mean - make it configurable per request
James Agnew (Oct 17 2018 at 09:33):
@nicola (RIO/SS) what are the values you accept for that parameter?
We were actually going down the road of letting people use combinations of _summary
, such as _summary=count,data
to let the user request that a count always happen for a big search. Turns out the .net client refuses to do this though because the spec actually prohibits it so that's a nonstarter.
Brian Postlethwaite (Oct 17 2018 at 09:35):
I was actually considering an option to explicitly request _nocount=true so I can skip the calc
Brian Postlethwaite (Oct 17 2018 at 09:36):
Almost sounding like _total=none|estimate|accurate would be better.
nicola (RIO/SS) (Oct 17 2018 at 09:37):
I like naming by @Brian Postlethwaite . We have _countMethod=none|estimated|count :)
James Agnew (Oct 17 2018 at 09:39):
Yeah that's not bad. And I'm assuming there is an easy way to get the .net client to output that parameter?
Brian Postlethwaite (Oct 17 2018 at 09:42):
Yeah, can just appendix as any other search parameter.
Brian Postlethwaite (Oct 17 2018 at 09:43):
_sum.ary has special behaviour
Brian Postlethwaite (Oct 17 2018 at 09:48):
Quiz for other servers, how do you calc an estimate?
I can easy see none or accurate, or maybe trim to a specific size, but other form of estimate?
Vadim Peretokin (Oct 17 2018 at 09:50):
I think that's all of them.
nicola (RIO/SS) (Oct 17 2018 at 09:58):
For postgresql there are stats tables and execution plan estimation - which can be used for that
Brian Postlethwaite (Oct 17 2018 at 10:05):
Ok, thanks. Wondered if might be something like that.
Michele Mottini (Oct 17 2018 at 13:39):
..so _noCount=true|false
or _countMethod=none|estimated|count
or something else? We have to implement the same option in our server - I'd rather go with the 'standard'!
Michele Mottini (Oct 17 2018 at 13:40):
(in our server we can choose between no count or an estimate, there is really no way to get an exact count)
James Agnew (Oct 17 2018 at 13:46):
I'm going to trial the option that @Brian Postlethwaite suggested (_total)
Michele Mottini (Oct 17 2018 at 14:09):
(deleted)
Grahame Grieve (Oct 17 2018 at 18:56):
I'm not entirely sure that I follow. The point here is for the client to save the server work? why would any client bother?
James Agnew (Oct 17 2018 at 19:37):
No, it's the opposite. The server defaults to saving work and the client is requesting that the server do more.
Put another way, if a user is searching for patients named smith, there's a good chance that they really just to page through results and don't actually care how many there are. So issuing a SQL count query is extra overhead. So a server might opt to not include it. But the client might actually care, and this allows them to express that.
nicola (RIO/SS) (Oct 17 2018 at 19:37):
Servers bother - count is a second query, which has a cost comparable with query cost
Grahame Grieve (Oct 17 2018 at 19:39):
so this is a breaking change then?
nicola (RIO/SS) (Oct 17 2018 at 19:39):
no - if default is count
Grahame Grieve (Oct 17 2018 at 19:40):
which means I was right and James is not...
nicola (RIO/SS) (Oct 17 2018 at 19:40):
but client can choose to not count to get data faster
Jenni Syed (Oct 17 2018 at 19:42):
It can be pretty difficult to supply a count (estimated or not) when you start having to cross different tables/data structures under the covers. As a server, I don't want to see this become required as we have places where I don't think we can even produce an estimate
Jenni Syed (Oct 17 2018 at 19:42):
I'm good if the client can "suggest" a behavior - I think we have other places in the spec where that's possible (but the server can ignore it...)
nicola (RIO/SS) (Oct 17 2018 at 19:42):
or we can move default count method into Conformance
James Agnew (Oct 17 2018 at 19:43):
I don't see this as a breaking change. Servers are already not required to supply a count. And HAPI's JPA server has often been opting not to supply one for the last year or so.
Grahame Grieve (Oct 17 2018 at 19:45):
interesting. ok. I agree we don't require it. We don't even say anything about it, which we should
Grahame Grieve (Oct 17 2018 at 19:45):
so we can add a client suggestion. Either as a parameter or a prefer header
nicola (RIO/SS) (Oct 17 2018 at 19:45):
parameter is better :)
Grahame Grieve (Oct 17 2018 at 19:48):
why?
nicola (RIO/SS) (Oct 17 2018 at 19:48):
You can use it in browser :)
Grahame Grieve (Oct 17 2018 at 19:49):
surely you can set the prefer header in the browser?
nicola (RIO/SS) (Oct 17 2018 at 19:49):
How?
nicola (RIO/SS) (Oct 17 2018 at 19:50):
I mean, by manually entering url in browser location string (not js xhr)
Grahame Grieve (Oct 17 2018 at 19:50):
like https://developer.mozilla.org/en-US/docs/Web/API/Fetch_API/Using_Fetch#Headers
Brian Postlethwaite (Oct 17 2018 at 23:38):
I also prefer a parameter rather than header, just easier to handle everywhere...
and do think this is a hint/suggestion rather than requirement (not changing what it is now)
Michele Mottini (Oct 17 2018 at 23:56):
Agreed- parameter is easier
Michele Mottini (Oct 17 2018 at 23:57):
In our case I think we'll continue to return the total, but giving the client the option to disable it could improve performances - and we definitely have clients that could / would use that
Grahame Grieve (Oct 18 2018 at 00:05):
time for someone to create a task. not sure whether it will make it into R4 though
Michele Mottini (Oct 18 2018 at 00:16):
Last updated: Apr 12 2022 at 19:14 UTC