Stream: implementers
Topic: Extending $lastn
Paul Lynch (Apr 29 2020 at 23:48):
NLM has been working with SmileCDR to get $lastn implemented in HAPI FHIR, and we have some extra capabilities we would like to add to the $lastn specification:
1) $lastn takes a category or a code. We would like to be able to specify multiple codes (and get back n Observations per code).
2) $lastn takes a patient. We would like to be able to specify multiple patients, and get back n Observations per patient/code combination.
3) If no codes are specified, return n of each code per patient.
4) If no patients or codes are specified, return n of each distinct code (across all patients). This is one we are less certain about, but it would be useful for getting a list of unique codes actually used in the database. The more consistent interpretation would be n of each code/patient combination, but we think that would not be useful to anyone working with a realistically-sized data set.
There is also the issue that Observation takes a CodeableConcept, so it can have more than one code, but for simplicity in considering the above, let's assume one code per Observation which (I am told) is usually the case.
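A minimal sketch of the grouping in points 1 to 4, assuming one code per Observation as stated above. The flat dict shape (`patient`, `code`, `effective`) and the function name `last_n` are hypothetical simplifications, not FHIR resources or HAPI FHIR code.

```python
from collections import defaultdict


def last_n(observations, n=1, patients=None, codes=None):
    """Return the n most recent observations per group (illustrative only).

    observations: iterable of dicts with hypothetical keys
    "patient", "code" and "effective" (an ISO 8601 date string).
    """
    # Point 4: with no patients and no codes, group by code across all patients.
    by_code_only = patients is None and codes is None

    groups = defaultdict(list)
    for obs in observations:
        if patients is not None and obs["patient"] not in patients:
            continue
        if codes is not None and obs["code"] not in codes:
            continue
        key = (None, obs["code"]) if by_code_only else (obs["patient"], obs["code"])
        groups[key].append(obs)

    result = []
    # Order groups by patient, then code, so the output is grouped predictably.
    for key in sorted(groups, key=lambda k: (k[0] or "", k[1])):
        newest_first = sorted(groups[key], key=lambda o: o["effective"], reverse=True)
        result.extend(newest_first[:n])
    return result
```

Passing only `patients` gives point 3 (n of every code found for those patients), and passing both gives n per patient/code combination as in points 1 and 2.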
Grahame Grieve (Apr 29 2020 at 23:49):
I don't understand 3 - if no codes are specified, return n of the codes that aren't specified?
Lloyd McKenzie (Apr 30 2020 at 02:04):
3 would be of all the codes that exist for that patient on that server
Lloyd McKenzie (Apr 30 2020 at 02:05):
For the multiple patients and multiple codes, it might be nice to point to a group or a value set, respectively
Grahame Grieve (Apr 30 2020 at 02:11):
what if there's 100s of codes? What if they are grouped or not by CodeableConcept?
Lloyd McKenzie (Apr 30 2020 at 02:18):
If there are 100s of codes, you'd get a lot of data - probably bulk data retrieval time if you did it for all patients
Paul Lynch (Apr 30 2020 at 14:37):
In our flowsheet demo (https://lhcflowsheet.nlm.nih.gov/, click "select patient") we typically pull in 1000 or more observations at once, so 100s of codes is not necessarily an issue, depending on "n" and the number of patients.
James Agnew (Apr 30 2020 at 15:26):
With regard to 3: I don't know if this is even technically feasible, but a reasonable approach might be for the server to be allowed to reject the request if there are so many codes that it can't reasonably answer, but to answer otherwise. This feels consistent with how other things work to me.
In terms of "what about CodeableConcept with multiple codes": we could perhaps allow a syntax similar to that used in search, where you could ask for |http://loinc.org
which means that you'd get any LOINC codes but not any equivalents?
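A rough sketch of that kind of system-only filter over a CodeableConcept. Note that FHIR token search writes "any code in this system" as `http://loinc.org|` (system first, then the bar), so the sketch assumes that form. The helper names `matches_token` and `codings_matching` are hypothetical.

```python
def matches_token(coding, token):
    """Check one Coding dict against a search-style token (illustrative only)."""
    system, sep, code = token.rpartition("|")
    if not sep:
        # Bare code: match the code regardless of system.
        return coding.get("code") == token
    if system and not code:
        # "system|": any code from that system.
        return coding.get("system") == system
    if not system:
        # "|code": the code, only where no system is present.
        return "system" not in coding and coding.get("code") == code
    # "system|code": both must match.
    return coding.get("system") == system and coding.get("code") == code


def codings_matching(codeable_concept, token):
    """Return the codings of a CodeableConcept dict that match the token."""
    return [c for c in codeable_concept.get("coding", []) if matches_token(c, token)]
```

With a concept carrying both a LOINC coding and a local equivalent, `codings_matching(concept, "http://loinc.org|")` would keep only the LOINC coding, so grouping could then be done on that code alone.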
Paul Lynch (Apr 30 2020 at 15:44):
I would think the amount of data returned at once would be constrained by the server's page size (possibly set by _count).
Xiaocheng Luan (Apr 30 2020 at 16:10):
For (3), the number of possible (patient, code) combinations can be a real concern (and thus limit its usefulness), as implied in @James Agnew's post. One possible approach is to interpret the "subject" parameter as a "filter", that is, it limits the code set to those that appear in the records of the given subject(s). When no subject is provided, it includes all the codes in all observations. This interpretation is consistent across all four use cases and should be practical for implementation as well. Use case (2) can be achieved using a batch/bundle of operations based on (1), without incurring too much overhead. For example, for 100 randomly selected patients in the (moderate) Regenstrief 10K data set, there are about 20K patient-code combinations, which is probably the practical limit for what you can do in real-time applications.
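A small sketch of the "subject as filter" reading, combined with the idea of rejecting requests that are too large. The flat dict shape, the function names, and the default limit of 20,000 (borrowed from the figure above purely for illustration) are all assumptions, not part of any specification.

```python
class TooManyCombinations(Exception):
    """Raised when a request would span more groups than the server allows."""


def codes_in_scope(observations, subjects=None):
    """Codes that appear for the given subjects, or for everyone if None."""
    return {
        o["code"]
        for o in observations
        if subjects is None or o["patient"] in subjects
    }


def count_combinations(observations, subjects=None):
    """Number of distinct (patient, code) pairs within the subject filter."""
    return len({
        (o["patient"], o["code"])
        for o in observations
        if subjects is None or o["patient"] in subjects
    })


def check_request(observations, subjects=None, limit=20_000):
    """Return the group count, or raise if it exceeds the hypothetical limit."""
    total = count_combinations(observations, subjects)
    if total > limit:
        raise TooManyCombinations(f"{total} combinations exceeds limit {limit}")
    return total
```

A real server would presumably get these counts from an index rather than by scanning observations in memory.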
Paul Church (Apr 30 2020 at 16:47):
It's not easy to paginate the response if the number of combinations gets out of control. You can constrain the amount of data returned per page but the server probably needs to be able to marshal the entire result set.
Ian Marshall (May 04 2020 at 22:09):
Just to add to Paul Lynch's and Xiaocheng's comments, for numbers 1, 2 and 3, I believe the idea is that $lastn would return a list of observations ordered and grouped first by Subject (if more than one is specified) and then Observation Code (if more than one is specified). For 3, if no Observation Codes are provided, then all Observation codes applicable for the patient(s) would be included in the result set.
It's understood that the result set could potentially be extremely large depending on the number of patients and observation codes. This is intentional, as one of the intended purposes would be to generate large data sets for research. Assuming that the server is able to handle the query and that the observations in the response are ordered by Subject and Code, it should be possible to paginate the response if needed.
For number 4, the use case that NLM is trying to address is that they need a way to retrieve the list of distinct Observation codes for all Observations in the repository. The idea here would be to adapt the $lastn operation so that if it is invoked with no subject params and no code params, it would simply return the N most recent Observation resource(s) for each distinct Observation code. If the max param is set to 1, it would simply return the most recent Observation resource for each distinct Observation code in the database.
Paul and Xiaocheng - feel free to correct or add to my comments above.
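For the number 4 use case, a client could derive the distinct code list from the returned Bundle roughly as follows. This is only a sketch over plain FHIR JSON loaded as dicts; the function name `distinct_codes` is hypothetical, and it assumes the response is a Bundle whose entries hold Observation resources.

```python
def distinct_codes(bundle):
    """Return distinct (system, code) pairs in first-seen order (illustrative)."""
    seen = set()
    ordered = []
    for entry in bundle.get("entry", []):
        resource = entry.get("resource", {})
        if resource.get("resourceType") != "Observation":
            continue
        # Observation.code is a CodeableConcept, so it may carry several codings.
        for coding in resource.get("code", {}).get("coding", []):
            key = (coding.get("system"), coding.get("code"))
            if key not in seen:
                seen.add(key)
                ordered.append(key)
    return ordered
```

With max set to 1 there would be one Observation per distinct code, so the list length would match the number of entries whenever each Observation carries a single coding.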
Last updated: Apr 12 2022 at 19:14 UTC