Stream: implementers
Topic: Searching against numbers
Ivan Dubrov (Mar 14 2018 at 04:09):
I'm confused by how prefixes work for numbers in the search.
Is it correct that value "1.9" is "gt1.91"? The range of the value "1.9" would be [1.85, 1.95], which overlaps with the "above" range of "1.91", which is [1.91, +inf).
It is sort of awkward that any number would be "greater than" itself, as the "above" range will always overlap with the range of the number itself, given the definition of the "above" range.
Or maybe I'm mistaken and the "above" range should actually be [1.915, +inf) for the "1.91" rather than [1.91, +inf)?
So, basically, which one is correct for the ranges "below" and "above":
1) (-inf, value] and [value, +inf)
or
2) (-inf, low_value] and [high_value, +inf)
?
(where "low_value" and "high_value" are computed by subtracting/adding 0.(n0)5, where "(n0)" is the right amount of zeroes according to the number scale)
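The "low_value" and "high_value" described above can be sketched as follows. This is only an illustration of the half-unit-in-the-last-place arithmetic from the question, using Python's standard `decimal` module; the function name `implicit_range` is hypothetical, and whether the bounds are inclusive or exclusive is left open here, as it is in the question.

```python
from decimal import Decimal


def implicit_range(text: str) -> tuple[Decimal, Decimal]:
    """Return (low_value, high_value) for a decimal literal such as "1.9"."""
    value = Decimal(text)
    # Exponent of the last written digit: "1.9" -> -1, "1.91" -> -2.
    exponent = value.as_tuple().exponent
    # Half a unit in the last place: 0.05 for "1.9", 0.005 for "1.91".
    half_step = Decimal(5).scaleb(exponent - 1)
    return value - half_step, value + half_step


low, high = implicit_range("1.9")
print(low, high)                # 1.85 1.95
print(*implicit_range("1.91"))  # 1.905 1.915

# Option 1: the "above" range of 1.91 starts at the value itself.
print(high >= Decimal("1.91"))            # True
# Option 2: the "above" range of 1.91 starts at its high_value.
print(high >= implicit_range("1.91")[1])  # True
```

Under a pure range-overlap reading, both options let the range of "1.9" reach into the "above" range of "1.91"; the two options only differ in where the boundary sits.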
Grahame Grieve (Mar 14 2018 at 04:15):
"Note: Uncertainty does not factor in evaluations. The precision of the numbers is considered arbitrarily high. (The way search parameters operate in resources is not the same as whether two numbers are equal to each other in a mathematical sense)."
Ivan Dubrov (Mar 14 2018 at 04:25):
I've read that many times, but still cannot quite grasp it. I understand the part about mathematical sense, sure, but I don't really understand the part "uncertainty does not factor in evaluations".
Similarly, if a date is given as just the year, let's say "2018", what would be the "above" and "below" ranges for it? Before 2018-01-01T00:00:00 and after 2018-01-01T00:00:00? This is what it looks like according to the specification, but "2018" sort of represents the whole year. Shouldn't those ranges rather be "before 2018-01-01T00:00:00" and "after 2018-12-31T23:59:59"?
I feel that this part of the specification is hard to understand without more "interesting" examples (example given uses "2015-08-12T05:23:45" which is not very interesting as it has exact time specified). Also, examples like "1.9 is considered greater than 1.91" would also help as they would clearly demonstrate the point about the mathematical sense.
Grahame Grieve (Mar 14 2018 at 04:26):
so with numbers, when searching, precision does not play a part. 1.91 > 1.9. As is 1.9000000000001
Grahame Grieve (Mar 14 2018 at 04:27):
your search engine should just add 0s to the number till you get to the working precision limit - or behave as if that has happened
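A minimal sketch of the "add 0s" reading, assuming the values are compared as exact decimals with no precision-based range at all. Python's `decimal.Decimal` is used here because it compares values exactly regardless of trailing zeros; the helper name `compare_as_is` is hypothetical.

```python
from decimal import Decimal


def compare_as_is(target: str, search: str) -> int:
    """Return 1, 0 or -1 if target is greater than, equal to or less than search."""
    a, b = Decimal(target), Decimal(search)
    return (a > b) - (a < b)


print(compare_as_is("1.91", "1.9"))             # 1
print(compare_as_is("1.9000000000001", "1.9"))  # 1
print(compare_as_is("1.90", "1.9"))             # 0
```

Padding "1.9" with zeros never changes the outcome, so the comparison behaves as if both numbers had been extended to the working precision limit.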
Grahame Grieve (Mar 14 2018 at 04:27):
for dates the rules are different.
Ivan Dubrov (Mar 14 2018 at 04:30):
Ok. Does that mean that for numbers I just use "normal" mathematical operators? Like, a > b when a > b, digit by digit?
What confuses me is the section saying "The range interpretation is provided for decimals and dates. Searches are always performed on values that are implicitly or explicitly a range.", so I tried to make sense of all those "range" rules for decimals.
Grahame Grieve (Mar 14 2018 at 04:33):
well, those two statements are not compatible with each other
Ivan Dubrov (Mar 14 2018 at 17:11):
Let me formulate where I think the specification has some issues. Apologies for the long text, but I struggle to make total sense of the specification in this area.
1. There is an inconsistency around number comparisons. One part of the specification calls for "range" semantics ("The range interpretation is provided for decimals and dates. Searches are always performed on values that are implicitly or explicitly a range") and another part of the specification calls for something else ("Uncertainty does not factor in evaluations. The precision of the numbers is considered arbitrarily high").
Based on the previous discussion it seems that for relative comparisons (gt/lt/ge/le) the comparison should be against numbers as they are, without using range semantics. For example, "1.0 < 1.0001" and "1.0001 > 1.0".
It is also not so clear for the equality. For example, when checking value "1.001" against criteria "eq1.0", should it be "true" or "false"? According to the "range" semantics, it should be "true", because range of "1.0" ("0.95" - "1.05") includes range of "1.001" ("1.0005" - "1.0015"). However, according to the "arbitrarily high" precision semantics and the discussion above ("your search engine should just add 0s to the number till you get to the working precision limit - or behave as if that has happened"), it should be "false", as "1.001" != "1.000".
Also, to fully nitpick on this case, this "The precision of the numbers is considered arbitrarily high." is somewhat inconsistent with "The way search parameters operate in resources is not the same as whether two numbers are equal to each other in a mathematical sense". The "mathematical sense" (to me) is exactly about numbers having arbitrary precision (no rounding at all).
2. There is uncertainty around "above" and "below" ranges for the dates. It is not clear what to use as "high" value for the "below" range and "low" value for the "above" range. For example, for the "range above the value" the specification says "The specified value and up". However, it is not clear what is the "value" in this context for dates which are not fully specified (for example, "2015").
Intuitively, I would split into three intervals:
* "below" interval: up to the _minimum_ value ("2015-01-01T00:00:00Z" in case of "2015")
* "value" interval: from _minimum_ value to the _maximum_ value ("2015-01-01T00:00:00Z" to "2015-12-31T23:59:59" in case of "2015")
* "above" interval: from the _maximum_ value ("2015-12-31T23:59:59" in case of "2015").
However, I don't think this is what the specification says. For both "below" and "above" it does not distinguish between the _minimum_ and _maximum_ values and only refers to the value itself. Which, again, is unclear: what is the value of "2015"? The whole year or just the point in time "2015-01-01T00:00:00"?
3. Finally, the examples provided in the table explaining ranges (https://build.fhir.org/search.html#prefix) are not very helpful, as the cases they use are not very interesting. For example, for dates they use a date with time, for which both the _minimum_ and _maximum_ values are the same as the value itself (because it includes month, day and time).
Also, an example for the date "2015-08-12 has a range from 00:00 to 00:00 exclusive" seems to specify an empty range ("00:00 to 00:00 exclusive"). It (probably) should say "2015-08-12 has a range from 2015-08-12T00:00 to 2015-08-13T00:00" (different day, "13").
Hopefully that explains my frustration with the specification. Unfortunately, the discussion above did not help me to understand what is the desired semantics.
I would really appreciate it if other implementers could point me to any publicly available tests they have, so I can compare them to each other and to the specification.
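The _minimum_ and _maximum_ values from point 2 can be made concrete with a small sketch. It assumes a half-open interval [start, end), which avoids having to pick "23:59:59" as the last instant, and it ignores time zones entirely; the function name `date_interval` is hypothetical. It illustrates the intuitive reading in the message above and is not a statement of what the specification requires.

```python
from datetime import datetime, timedelta


def date_interval(text: str) -> tuple[datetime, datetime]:
    """Return [start, end) for a partial date: "YYYY", "YYYY-MM" or "YYYY-MM-DD"."""
    parts = [int(p) for p in text.split("-")]
    if len(parts) == 1:
        (year,) = parts
        return datetime(year, 1, 1), datetime(year + 1, 1, 1)
    if len(parts) == 2:
        year, month = parts
        start = datetime(year, month, 1)
        # Roll over to January of the next year after December.
        end = datetime(year + 1, 1, 1) if month == 12 else datetime(year, month + 1, 1)
        return start, end
    if len(parts) == 3:
        start = datetime(*parts)
        return start, start + timedelta(days=1)
    raise ValueError(f"unsupported date literal: {text!r}")


print(date_interval("2015"))        # 2015-01-01T00:00 up to 2016-01-01T00:00
print(date_interval("2015-08-12"))  # 2015-08-12T00:00 up to 2015-08-13T00:00
```

With this reading, the "below" interval would end at `start` and the "above" interval would begin at `end`, which is the three-way split proposed in point 2.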
Grahame Grieve (Mar 14 2018 at 19:16):
don't know if @James Agnew has tests. I don't have tests for this part in my server. But agree that the specification says 2 incompatible things here
Christiaan Knaap (Apr 03 2018 at 06:57):
In Vonk we compare Number searchparams as-is, but the values of Quantities are treated as ranges, accounting for their precision (both the value in the resource and the value in the search).
Given the nature of the searchparams of type number (http://www.hl7.org/implement/standards/fhir/searchparameter-registry.html), this seems reasonable for most of them. I'm just not sure about ChargeItem.factor-Override and RiskAssessment.probability.
Craig McClendon (Feb 07 2019 at 19:55):
I think the spec needs clarification around these cases. It seems that everyone is interpreting it differently.
I've spent hours reading it and am still not certain on some cases.
To illustrate, I set up some tests to see how different servers are handling some cases around searching numbers and ranges.
I could only find three public servers to test against at the time.
For each of these, I created a Condition resource with either the onsetAge or onsetRange set.
Posted to each server, then executed a search to see if the search returned the resource. Then I deleted the resource.
Here are the cases that I ran where various servers behaved differently:
*********
Test age: age:0.99; searchParam:=eq1.0; found:true; server:HAPI STU3
Test age: age:0.99; searchParam:=eq1.0; found:false; server:Aegis STU3
Test age: age:0.99; searchParam:=eq1.0; found:true; server:Pyro STU3
*********
Test age: age:0.99; searchParam:=ge0.99; found:true; server:HAPI STU3
Test age: age:0.99; searchParam:=ge0.99; found:false; server:Aegis STU3
Test age: age:0.99; searchParam:=ge0.99; found:true; server:Pyro STU3
*********
Test range: low:5.0; high:6.0; searchParam:=eq6.0; found:true; server:HAPI STU3
Test range: low:5.0; high:6.0; searchParam:=eq6.0; found:false; server:Aegis STU3
Test range: low:5.0; high:6.0; searchParam:=eq6.0; found:false; server:Pyro STU3
*********
Test range: low:5.0; high:6.0; searchParam:=gt5.999999; found:true; server:HAPI STU3
Test range: low:5.0; high:6.0; searchParam:=gt5.999999; found:false; server:Aegis STU3
Test range: low:5.0; high:6.0; searchParam:=gt5.999999; found:false; server:Pyro STU3
*********
Test range: low:5.0; high:6.0; searchParam:=ge6.0; found:true; server:HAPI STU3
Test range: low:5.0; high:6.0; searchParam:=ge6.0; found:false; server:Aegis STU3
Test range: low:5.0; high:6.0; searchParam:=ge6.0; found:false; server:Pyro STU3
Test details:
For instance, this output:
Test age: age:0.99; searchParam:=eq1.0; found:true; server:HAPI STU3
I posted a Condition like:
{
  "resourceType": "Condition",
  "contained": [
    {
      "resourceType": "Patient",
      "id": "xx1"
    }
  ],
  "identifier": [
    {
      "system": "http://example.org/testXX",
      "value": "0618b587-a842-4298-af74-d41e6c31df4e"
    }
  ],
  "clinicalStatus": "inactive",
  "subject": {
    "reference": "#xx1"
  },
  "onsetAge": {
    "value": 0.99,
    "system": "http://unitsofmeasure.org",
    "code": "{years}"
  }
}
Then performed a search like:
{baseurl}/Condition/?identifier=0618b587-a842-4298-af74-d41e6c31df4e&onset-age=eq1.0
Then printed whether the search found the resource or not.
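For anyone wanting to repeat the comparison, the search URL and result line shown above can be generated along these lines. This is only a sketch: the helper names and the base URL `http://example.org/fhir` are hypothetical, and the HTTP calls (POST, search, DELETE) are deliberately left out.

```python
from urllib.parse import urlencode


def build_search_url(base_url: str, identifier: str, prefixed_value: str) -> str:
    """Build the onset-age search URL used to probe one server."""
    query = urlencode({"identifier": identifier, "onset-age": prefixed_value})
    return f"{base_url}/Condition/?{query}"


def format_result(age: str, prefixed_value: str, found: bool, server: str) -> str:
    """Format one result line in the same shape as the results listed above."""
    return (
        f"Test age: age:{age}; searchParam:={prefixed_value}; "
        f"found:{str(found).lower()}; server:{server}"
    )


url = build_search_url(
    "http://example.org/fhir",  # hypothetical base URL
    "0618b587-a842-4298-af74-d41e6c31df4e",
    "eq1.0",
)
print(url)
print(format_result("0.99", "eq1.0", True, "HAPI STU3"))
```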
Grahame Grieve (Feb 08 2019 at 04:51):
did you test test.fhir.org?
Craig McClendon (Feb 08 2019 at 14:19):
@Grahame Grieve - I'm receiving errors POSTing to test.fhir.org.
At test.fhir.org/r3 and test.fhir.org/r4:
500, responseBody->Not done yet @ TFHIRTextComposer.Compose
At the base url test.fhir.org:
404, responseBody->Document /Condition not found
NOTE: I have these tests automated, so I can add servers and test cases and re-compare easily.
Grahame Grieve (Feb 08 2019 at 15:25):
what are you posting?
Craig McClendon (Feb 08 2019 at 17:04):
Updated results with test.fhir.org included:
*********
Test age: age:0.99; searchParam:=eq1.0; found:true; server:test.fhir.org STU3
Test age: age:0.99; searchParam:=eq1.0; found:true; server:HAPI STU3
Test age: age:0.99; searchParam:=eq1.0; found:false; server:Aegis STU3
Test age: age:0.99; searchParam:=eq1.0; found:true; server:Pyro STU3
*********
Test age: age:0.99; searchParam:=ge0.99; found:true; server:test.fhir.org STU3
Test age: age:0.99; searchParam:=ge0.99; found:true; server:HAPI STU3
Test age: age:0.99; searchParam:=ge0.99; found:false; server:Aegis STU3
Test age: age:0.99; searchParam:=ge0.99; found:true; server:Pyro STU3
*********
Test range: low:5.0; high:6.0; searchParam:=eq5.0; found:false; server:test.fhir.org STU3
Test range: low:5.0; high:6.0; searchParam:=eq5.0; found:true; server:HAPI STU3
Test range: low:5.0; high:6.0; searchParam:=eq5.0; found:true; server:Aegis STU3
Test range: low:5.0; high:6.0; searchParam:=eq5.0; found:true; server:Pyro STU3
*********
Test range: low:5.0; high:6.0; searchParam:=eq6.0; found:false; server:test.fhir.org STU3
Test range: low:5.0; high:6.0; searchParam:=eq6.0; found:true; server:HAPI STU3
Test range: low:5.0; high:6.0; searchParam:=eq6.0; found:false; server:Aegis STU3
Test range: low:5.0; high:6.0; searchParam:=eq6.0; found:false; server:Pyro STU3
*********
Test range: low:5.0; high:6.0; searchParam:=gt5.999999; found:true; server:test.fhir.org STU3
Test range: low:5.0; high:6.0; searchParam:=gt5.999999; found:true; server:HAPI STU3
Test range: low:5.0; high:6.0; searchParam:=gt5.999999; found:false; server:Aegis STU3
Test range: low:5.0; high:6.0; searchParam:=gt5.999999; found:false; server:Pyro STU3
*********
Test range: low:5.0; high:6.0; searchParam:=ge6.0; found:true; server:test.fhir.org STU3
Test range: low:5.0; high:6.0; searchParam:=ge6.0; found:true; server:HAPI STU3
Test range: low:5.0; high:6.0; searchParam:=ge6.0; found:false; server:Aegis STU3
Test range: low:5.0; high:6.0; searchParam:=ge6.0; found:false; server:Pyro STU3
Paul Church (Feb 08 2019 at 19:02):
After re-reading that part of the search spec I have to agree with test.fhir.org - but it's a little bit confusing that significant figures precision applies to eq5.0 => [4.95,5.05) while the values in the explicit range (low=5.0, high=6.0) are treated with no distinction between setting low=5.0 and low=5.0000000. This confusion is specific to eq/ne against a range target value, I think?
Michael Donnelly (Feb 08 2019 at 19:18):
When we discussed this for eq at the May 2018 WGM (Gforge #16369) I don't remember whether we discussed ge and le.
Craig McClendon (Feb 08 2019 at 20:47):
I think test.fhir.org has the "eq" against an explicit target range correct, as per the spec. But I can't say the spec makes intuitive sense to me on this case.
The spec says: the range of the search value fully contains the range of the target value
So for this case: Test range: low:5.0; high:6.0; searchParam:=eq5.0;
range of search value = [4.95,5.05)
range of target value = [5.0, 6.0]
So while the search value is within the target range, the (implicit) range of the search value does not fully contain the (explicit) range of the target. And it seems like it rarely would, unless the target range had a fine precision with a relatively narrow range.
- Assuming I am reading the spec correctly.
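The reading discussed above can be written down as a short sketch. It assumes that "eq" means the implicit range of the search value fully contains the target range, that the range "above" a search value starts at the value itself, and that "ge" is "gt or eq"; boundary inclusivity is a guess. All function names are hypothetical, and nothing here describes how any of the tested servers is actually implemented.

```python
from decimal import Decimal


def implicit_range(text: str) -> tuple[Decimal, Decimal]:
    """Implicit range of a decimal literal, e.g. "5.0" -> (4.95, 5.05)."""
    value = Decimal(text)
    half = Decimal(5).scaleb(value.as_tuple().exponent - 1)
    return value - half, value + half


def eq_matches(search: str, t_low: Decimal, t_high: Decimal) -> bool:
    """eq: the range of the search value fully contains the target range."""
    low, high = implicit_range(search)
    return low <= t_low and t_high <= high


def gt_matches(search: str, t_low: Decimal, t_high: Decimal) -> bool:
    """gt: the range above the search value overlaps the target range."""
    return t_high > Decimal(search)


def ge_matches(search: str, t_low: Decimal, t_high: Decimal) -> bool:
    """ge: the target range reaches the search value, or eq matches."""
    return t_high >= Decimal(search) or eq_matches(search, t_low, t_high)


explicit = (Decimal("5.0"), Decimal("6.0"))  # onsetRange low/high, taken as-is
age = implicit_range("0.99")                 # onsetAge 0.99 -> (0.985, 0.995)

print(eq_matches("1.0", *age))            # True
print(ge_matches("0.99", *age))           # True
print(eq_matches("5.0", *explicit))       # False
print(eq_matches("6.0", *explicit))       # False
print(gt_matches("5.999999", *explicit))  # True
print(ge_matches("6.0", *explicit))       # True
```

For the six cases listed, this reading gives the same answers as the test.fhir.org column in the results above.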
Alexander Kiel (Feb 17 2020 at 17:01):
I opened this issue to investigate this further: https://jira.hl7.org/browse/FHIR-26311
Last updated: Apr 12 2022 at 19:14 UTC