Stream: implementers
Topic: Questions on $document
Richard Kavanagh (Sep 30 2017 at 13:27):
We have a use case for a "document on demand" operation so I have been looking at the $document operation. This operation will not meet our needs but we'd like to keep as close to it as possible.
So a few questions (based on http://build.fhir.org/composition-operations.html):
1) The operation is defined to be valid for the URL [base]/Composition/$document. There are no parameters to refine the focus of the generation, so how would this create anything other than a document for "all" compositions?
2) The operation is described as idempotent, but is it? Assuming the intent is to derive the document from a focal composition, any resource in the document bundle could change at any time (including being re-versioned), so whilst you will probably get the same result, can this be guaranteed?
3) The documentation states this is a search operation, which means that the document bundle will be returned within a search bundle (i.e. a bundle within a bundle):
Note that since this is a search operation, the document bundle is wrapped inside the search bundle
Why is this the pattern chosen when there can only ever be one document returned from the operation? The example on the page does not imply that pattern.
4) Why is there no output parameter defined in the operation definition? Without an output parameter named 'return', would the correct behaviour not be to return a Parameters resource (as per http://hl7.org/fhir/STU3/operations.html#executing)?
René Spronk (Oct 01 2017 at 07:17):
As to (1) - the operation also allows for URL: [base]/Composition/[id]/$document which mimics the IHE XDS On Demand option. There is a persist (0..1 boolean) operation parameter to mimic the XDS persistence option.
Ad (2) idempotent - if one were to use a $document on a composition X that is already a document, the result would be document X
Ad (3,4) AFAIK the operation just returns the generated document resource.
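For illustration, the instance-level URL pattern and the optional persist parameter described above could be assembled like this. It is only a sketch: the helper name `document_url` and the example base URL are hypothetical, and it assumes the parameter is passed as a query string on a GET request.

```python
from urllib.parse import urlencode


def document_url(base, composition_id, persist=None):
    """Build [base]/Composition/[id]/$document, optionally with persist."""
    url = f"{base.rstrip('/')}/Composition/{composition_id}/$document"
    if persist is not None:
        # persist is the optional (0..1) boolean parameter mentioned above
        url += "?" + urlencode({"persist": "true" if persist else "false"})
    return url


# Hypothetical server base URL and Composition id
print(document_url("http://example.org/fhir", "123", persist=True))
```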
Richard Kavanagh (Oct 01 2017 at 19:24):
@René Spronk with regards to [1] then I'm aware of the other URL pattern. I'm puzzled about the one I mentioned though - what is its purpose?
Richard Kavanagh (Oct 01 2017 at 19:27):
with regards to [2] then I'm still not convinced but perhaps it's because we see a composition as not necessarily being a document, or in the case of "document on demand" the entities existed before they became a document not as a result of the decomposition of a document.
with regards to [3,4] then the spec appears confusing and perhaps inconsistent with other operations.
Either way the operation does not work for us as we have a different use case - we'll create a different operation
Eric Haas (Oct 01 2017 at 22:45):
I thought I posted this before - but I can't find it - so forgive me if duplicative. We did a retrieve a document on demand operation using the DocumentReference operation $docref for Argonaut and plan to update for STU3 in US Core. Is that what you mean by Document on Demand?
René Spronk (Oct 02 2017 at 06:43):
Agree that /Composition/$document seems fairly useless, at best an edge use case.
I was thinking about (2 - Idempotence), and the current wording is questionable. If you were to use $document to turn the Composition and all its referenced resources into a document, would the $document operation check if the instance of Composition is already part of a Bundle of type document? If the $document operation does that (it really has no other way of knowing whether the Composition is a "Definition" or a "Document-Instance"), then the operation is idempotent. http://build.fhir.org/composition-operations.html#2.41.13.1 would have to be updated to describe this, as it currently is silent on the expected behavior of the operation in that case.
To go back to your NHS use case: are you in effect creating a $message operation on Composition instances, to turn Compositions into messages, with some of the data needed in the message header as IN parameters ? Makes sense to me..
René Spronk (Oct 02 2017 at 06:53):
@Eric Haas It's a related use-case. Yours is actually closer to what IHE calls the 'on demand option' in XDS - the metadata of the [future] document is known, whenever it's queried it's instantiated.
The current discussion is about the latter part of the instantiation process, part of "whenever it's queried it's instantiated". How does the server accomplish that? That's the $document operation on a Composition resource. The Composition resource effectively holds "the metadata of the [future] document".
Richard Kavanagh (Oct 02 2017 at 12:31):
@Eric Haas I think our use case is different. For us it is more about requesting a document to be constructed following a request to do so. By "document" it's a loose term more looking for a BUNDLE with a COMPOSITION resource as the initial entry with associated entry resources depending on the content. The rules for the content of the document, the sections of the composition, the "entries" required etc are inherent properties of the operation. It probably will not be focused around a single ENCOUNTER and as such is different from any core operations defined.
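The loose notion of "document" described here (a Bundle whose initial entry is a Composition) can be expressed as a small check. This is a sketch over plain JSON-style dicts; the function name `looks_like_document` and the `strict` flag are made up for illustration, with `strict=True` additionally requiring the Bundle type to be "document".

```python
def looks_like_document(bundle, strict=True):
    """Return True if the bundle has a Composition as its first entry."""
    if bundle.get("resourceType") != "Bundle":
        return False
    entries = bundle.get("entry") or []
    if not entries:
        return False
    first = entries[0].get("resource", {})
    if first.get("resourceType") != "Composition":
        return False
    # In strict mode also insist on Bundle.type = "document"
    return bundle.get("type") == "document" if strict else True


example = {
    "resourceType": "Bundle",
    "type": "collection",
    "entry": [
        {"resource": {"resourceType": "Composition"}},
        {"resource": {"resourceType": "Patient"}},
    ],
}
print(looks_like_document(example, strict=False))
```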
John Moehrke (Oct 02 2017 at 13:06):
There are three different concepts that IHE has defined. ( a ) On-Demand, this is a pattern published by the source where the pattern is usually something like "Current Medical Summary". The On-Demand usually creates something that is 'current'. There is an option that keeps any document that was created as a snapshot for records keeping. ( b ) Deferred-Creation, this is a specific document that could be created but has not yet been made. Usually a "Discharge Summary". That is, it has a well defined timeframe. When this one is created it replaces the deferred-creation entry (DocumentReference) (or should; not everyone keeps it). ( c ) Fetch (don't have a good name), this is where the requesting system defines the attributes they desire in a document, such as the timeframe the document should cover.
Eric Haas (Oct 02 2017 at 13:57):
deleted
Richard Kavanagh (Oct 02 2017 at 17:12):
@John Moehrke Where is the IHE work detailed - it would be good to see if we could plan an intercept. We would be looking at either (a) or (c) from your list.
John Moehrke (Oct 02 2017 at 17:42):
The IHE definitions are today in XDS terms, not FHIR.... So far no one has asked for on-demand or delayed-assembly, or fetch from a FHIR api, or for a FHIR document... both are logical, but IHE has not yet received the request for a new work item.
John Moehrke (Oct 02 2017 at 17:43):
But I did want to clarify the meaning of the concepts, so that we can have similar use of the concept-name... and not have something like "on-demand" mean something totally different in IHE-XDS as it does in FHIR...
John Moehrke (Oct 02 2017 at 17:44):
Here is Fetch (very immature system, few want to implement this method. There are uses of on-demand that act like I have described for Fetch) http://wiki.ihe.net/index.php/Cross_Community_Fetch
John Moehrke (Oct 02 2017 at 17:46):
delayed assembly http://www.ihe.net/uploadedFiles/Documents/ITI/IHE_ITI_Suppl_Delayed_Doc_Assem.pdf
John Moehrke (Oct 02 2017 at 17:50):
On-Demand is part of Final-Text XDS http://www.ihe.net/uploadedFiles/Documents/ITI/IHE_ITI_TF_Vol1.pdf#nameddest=10_1_1_7_On_Demand_Document_Sou
Eric Haas (Oct 02 2017 at 18:29):
@John Moehrke how does use case (a) differ from the Argonaut $docref - I thought they were the same thing?
John Moehrke (Oct 02 2017 at 18:30):
I don't know. I am not on the Argonaut project. I am not sure even where it is documented
Eric Haas (Oct 02 2017 at 18:35):
Never mind, I reread René's post above - it is the same (thanks René); the link is above.
Grahame Grieve (Oct 03 2017 at 01:53):
back to @Richard Kavanagh's original questions:
1) - i'm pretty sure that's an error in the spec. If an operation allows execution at the type level, it must explain what that means, and this doesn't, and I can't imagine how it could. So let's assume it will be removed as an option
2) from a definition of "idempotence": 'Note that while idempotent operations produce the same result on the server (no side effects), the response itself may not be the same (e.g. a resource's state may change between requests)'. - so, yes, the operation is idempotent in that sense. However, if persist=true then the operation is not strictly idempotent. I'm not sure whether that's a problem or not.
3) the sentence "Note that since this is a search operation, the document bundle is wrapped inside the search bundle" is bizarre and confusing. it's just wrong. Not one of my prouder moments. A big clue that it's wrong is that the example doesn't conform to it, nor does the definition
4) The definition is also incorrect not to define a return parameter.
Richard - it would be good if you create a task to correct the definition here
Lloyd McKenzie (Oct 03 2017 at 06:20):
Richard Kavanagh (Oct 03 2017 at 20:34):
I see @Lloyd McKenzie got there first :-)
Rick Geimer (Oct 05 2017 at 15:48):
Had some discussion about adding a url param to $document to resolve this. Please review
https://gforge.hl7.org/gf/project/fhir/tracker/?action=TrackerItemEdit&tracker_item_id=13987
Rick Geimer (Oct 05 2017 at 16:05):
Also we discussed adding a graph parameter to pass a GraphDefinition, and during that discussion there was near unanimous agreement within SDWG that the default behavior for $document should be to pull in the full transitive closure of all resources referenced from Composition, and that one would use GraphDefinition if they wish to restrict this behavior. We are looking for feedback on this though before approving since I know some (@Grahame Grieve ) have other opinions..
Grahame Grieve (Oct 05 2017 at 19:52):
I think the default behaviour should be to include all the resources that a document is required to contain, since the full transitive closure can be very long indeed
Lloyd McKenzie (Oct 05 2017 at 20:54):
Transitive closure isn't likely what you want. For example: List-MedicationPrescription->Medication->Product(as ingredient)->Organization(as manufacturer)->Organization(parent)->EndPoint
Lloyd McKenzie (Oct 05 2017 at 20:55):
You certainly want the prescription. You might want the medication and maybe the product. You almost certainly don't want the manufacturer's parent's endpoint
Lloyd McKenzie (Oct 05 2017 at 20:56):
In some cases, transitive closure could get you half the patient's EHR. As well, there'll be times when you want Provenance or other records that point to a record (e.g. procedures that point to an Encounter) and you definitely don't want to do transitive closure of reverse includes - that definitively will give you the entire Patient's EHR. GraphDefinition is the logical way to define the boundaries.
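To make the concern concrete, here is a sketch of a forward-only transitive closure over an in-memory store. The store contents, keys and element names are toy data loosely following the chain in the example above (with the Product step left out), not real records, and the helper names are hypothetical.

```python
from collections import deque


def iter_references(node):
    """Yield every 'reference' string found anywhere inside a resource."""
    if isinstance(node, dict):
        ref = node.get("reference")
        if isinstance(ref, str):
            yield ref
        for value in node.values():
            yield from iter_references(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_references(item)


def forward_closure(start, store):
    """Follow forward references breadth-first until nothing new is found."""
    order, seen, queue = [], set(), deque([start])
    while queue:
        key = queue.popleft()
        if key in seen or key not in store:
            continue
        seen.add(key)
        order.append(key)
        queue.extend(iter_references(store[key]))
    return order


# Toy store: each hop is one more link in the chain
store = {
    "List/meds": {"entry": [{"item": {"reference": "MedicationPrescription/rx1"}}]},
    "MedicationPrescription/rx1": {"medication": {"reference": "Medication/m1"}},
    "Medication/m1": {"manufacturer": {"reference": "Organization/maker"}},
    "Organization/maker": {"partOf": {"reference": "Organization/parent"}},
    "Organization/parent": {"endpoint": [{"reference": "Endpoint/e1"}]},
    "Endpoint/e1": {},
}
print(forward_closure("List/meds", store))
```

Starting from the list, the closure reaches the manufacturer's parent's endpoint, which is the kind of content the example says you almost certainly don't want.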
Rick Geimer (Oct 05 2017 at 21:23):
I disagree. Transitive closure is what you want in general when you are crafting a document. Sometimes you want to exclude stuff if you find you are getting too much. My point is the exclusions are edge cases, and usually you want the entire document as authored.
Grahame Grieve (Oct 05 2017 at 21:24):
I think it depends on how well indexed and extensive the source you are generating from is
Grahame Grieve (Oct 05 2017 at 21:25):
I think you've brushed some of Lloyd's concerns under the table too easily there
Rick Geimer (Oct 05 2017 at 21:25):
The worry about getting too much does not actually play out in practice. It is the typical (and sometimes valid) programmer fear of infinite recursions/loops. In practice, especially with documents, it doesn't actually play out that way, and I would encourage you to find examples of existing documents that actually cause problems instead of worrying about the possibility that they might.
Rick Geimer (Oct 05 2017 at 21:26):
Because documents that are missing data are an immediate problem right now (not hypothetical).
Grahame Grieve (Oct 05 2017 at 21:26):
'existing documents'..... not sure what you mean. how does existing documents factor into the discussion?
Rick Geimer (Oct 05 2017 at 21:27):
Real world documents. Maybe try out the transitive closure approach on every Composition on all FHIR servers today and see how many (if any) actually pull in half the patient record.
Grahame Grieve (Oct 05 2017 at 21:30):
from our point of view, this test is inadequate. To do a proper test, you would need to
- find an EHR system that supported $document
- find a production instance that had genuine records
- choose a patient with a chronic disease
Rick Geimer (Oct 05 2017 at 21:30):
I suggest using ClinFhir to look at the resource graph for documents. It does not only pull in "all the resources that a document is required to contain" (which is what SDWG wants to redefine by the way), it does what $document should actually do. And I have never seen it explode.
Grahame Grieve (Oct 05 2017 at 21:30):
the nearest we have to this now is the synthea data set, but while it's very wide, it's also quite shallow compared to real records
Grahame Grieve (Oct 05 2017 at 21:31):
do you understand why Lloyd and I are not comforted by arguments based on mock data?
Rick Geimer (Oct 05 2017 at 21:33):
Yes, but do you understand why I am not comforted by arbitrarily cutting off documents based on the potential of getting too much?
Rick Geimer (Oct 05 2017 at 21:33):
Without even mock data to back it up?
Rick Geimer (Oct 05 2017 at 21:35):
Simple example, what happens today when Composition.author points to PractitionerRole? What happens when a Medications section points to a List of MedicationStatement resources?
Grahame Grieve (Oct 05 2017 at 21:36):
seems to me that you want to say more here for a start: http://build.fhir.org/documents.html#content
Rick Geimer (Oct 05 2017 at 21:38):
From that page: The Composition resource, and any resources directly or indirectly (e.g. recursively) referenced from it
Rick Geimer (Oct 05 2017 at 21:38):
Seems to say what we want already.
Grahame Grieve (Oct 05 2017 at 21:39):
this is the bit I was interested in:
Grahame Grieve (Oct 05 2017 at 21:39):
Any resource referenced directly in the Composition SHALL be included in the bundle when the document is assembled. Specifically, this means the following resource references:
Composition.subject
Composition.encounter
Composition.author
Composition.attester.party
Composition.custodian
Composition.event.detail
Composition.section.entry
Other resources that these referenced resources refer to may also be included in the bundle if the document construction system chooses to do so. Including these additional resources will make the document bigger, but will save applications from needing to retrieve the linked resources if they need them while processing the document. Thus, whether these linked resources should be included or not depends on the implementation environment.
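As a sketch of the "inner limit" quoted above, the directly referenced resources could be collected from a Composition held as a JSON-style dict. The function names are hypothetical, and descending into nested sections is an assumption that goes beyond the quoted list.

```python
def _refs(value):
    """Return reference strings from a single Reference or a list of them."""
    items = value if isinstance(value, list) else [value]
    return [i["reference"] for i in items if isinstance(i, dict) and "reference" in i]


def mandatory_references(composition):
    """Collect the references the quoted text says SHALL be in the bundle."""
    found = []
    for name in ("subject", "encounter", "author", "custodian"):
        found += _refs(composition.get(name, []))
    for attester in composition.get("attester", []):
        found += _refs(attester.get("party", []))
    for event in composition.get("event", []):
        found += _refs(event.get("detail", []))

    def walk(sections):
        for section in sections:
            found.extend(_refs(section.get("entry", [])))
            # Assumption: nested sections are treated like top-level ones
            walk(section.get("section", []))

    walk(composition.get("section", []))
    return found


composition = {
    "resourceType": "Composition",
    "subject": {"reference": "Patient/p1"},
    "author": [{"reference": "PractitionerRole/pr1"}],
    "section": [{"entry": [{"reference": "List/meds"}]}],
}
print(mandatory_references(composition))
```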
Grahame Grieve (Oct 05 2017 at 21:39):
you are quoting the outer limits, I am quoting the inner limits
Grahame Grieve (Oct 05 2017 at 21:40):
I think your comments imply that the inner limits should say a little more...
Rick Geimer (Oct 05 2017 at 21:41):
That last paragraph is what SDWG is considering clarifying (edited: we did not actually discuss that paragraph specifically, we discussed what documents need to contain at a minimum in general), because that's just not how documents work today. I refer to my comments above about PractitionerRole and List as examples.
Rick Geimer (Oct 05 2017 at 21:43):
In my opinion, inner limits = outer limits by default. You need to limit by exclusion. Today it is the opposite, you need to explicitly add what you want via a graph (assuming we add that param) vs. respecting what someone actually authored (or a system generated).
Grahame Grieve (Oct 05 2017 at 21:44):
I thought we were talking about how $document works. What is SDWG considering clarifying exactly?
Rick Geimer (Oct 05 2017 at 21:45):
Both. $document is not including enough, and the minimum requirements of a document do not require enough.
Rick Geimer (Oct 05 2017 at 21:45):
And this is very valuable feedback to bring back to SDWG by the way, thanks.
Grahame Grieve (Oct 05 2017 at 21:46):
ok. so the minimum requirements are the bit I quoted. You quoted the maximum allowed
Grahame Grieve (Oct 05 2017 at 21:46):
what changes to the minimum limits for a document (as opposed to the default for $document) are we considering?
Rick Geimer (Oct 05 2017 at 21:47):
The consensus from today's discussion is that all referenced resources (recursive) is the desired default behavior for $document. I agree we have more work to do on the minimum allowable, which we did not discuss today.
Rick Geimer (Oct 05 2017 at 21:49):
And by the way on the SDWG call I was actually suggesting that we just implement the graph parameter, and find a simple way to create a GraphDefinition that represents the transitive closure, and the backlash was overwhelming.
Grahame Grieve (Oct 05 2017 at 21:51):
well, ok. servers will just have to return an error if they think the graph size is getting out of hand. I typically call it a day at around 500 resources
Rick Geimer (Oct 05 2017 at 21:53):
I think a limit like that is reasonable. Maybe make that limit a parameter, and if unspecified it is a server-defined value in the CapabilityStatement somewhere?
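A sketch of how such a cut-off might look on the server side: the traversal stops with an error once it has gathered more than a configurable number of resources. The names `GraphTooLarge`, `bounded_closure` and `fetch_refs` are hypothetical, and the default of 500 simply echoes the figure mentioned above.

```python
class GraphTooLarge(Exception):
    """Raised when the reference graph exceeds the configured limit."""


def bounded_closure(start, fetch_refs, max_resources=500):
    """Follow references from start, giving up once the limit is exceeded.

    fetch_refs is a caller-supplied function mapping a resource key to the
    keys it references (an assumption made for this sketch).
    """
    seen, stack = set(), [start]
    while stack:
        key = stack.pop()
        if key in seen:
            continue
        seen.add(key)
        if len(seen) > max_resources:
            raise GraphTooLarge(f"more than {max_resources} resources reached")
        stack.extend(fetch_refs(key))
    return seen


# Toy graph: a chain of ten resources, each pointing at the next
chain = {f"Observation/{i}": [f"Observation/{i + 1}"] for i in range(9)}
chain["Observation/9"] = []
print(len(bounded_closure("Observation/0", chain.__getitem__)))
```

A client-supplied limit could be passed as `max_resources`, with the server default applying when it is omitted.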
Rick Geimer (Oct 05 2017 at 22:03):
And if you have a really large document (I once wrote a publishing system to handle A380 maintenance manuals if you want to talk big), then an implementer would need to write their own code to retrieve/traverse the resources individually and assemble the Bundle they want.
Grahame Grieve (Oct 05 2017 at 22:04):
or they can ask the server manager out of band and get special approvals. I don't think that there's a lot of call for clinical documents that are that big
Rick Geimer (Oct 05 2017 at 22:04):
Agreed
Rick Geimer (Oct 05 2017 at 22:05):
But then again, you should take a look at some of the CCDs that I have seen. 60+ page "summary" documents come to mind. Would be happy to see them go away.
Rick Geimer (Oct 05 2017 at 22:06):
Anyway, signing off for now. Will check into this thread later tonight or tomorrow.
Grahame Grieve (Oct 05 2017 at 22:07):
I've heard from users about CCD documents that are that big ;-)
René Spronk (Oct 06 2017 at 09:33):
Reminds me of a discussion in RIMBAA 7 years ago: http://wiki.hl7.org/index.php?title=Safe_querying_of_a_RIM-based_data_model : (ok, v3 stuff, so some reading between the lines is necessary) who gets to decide what should be included in a document, is it the document creator, or the client requesting that a document be created? And how does one create 'clinically safe' ObjectGraphs, if such a thing is possible at all?
René Spronk (Oct 06 2017 at 09:38):
IMHO including everything and the kitchen sink (as default behavior) is not the way to go, one would simply be flooded by too much data. When a server executes $document FHIR should provide a reasonable minimalistic definition as to what should be included, but leave the details to the server and its context. If a client explicitly provides a parameter to $document to indicate a specific ObjectGraph, or sets kitchensink=true, then that's the client's decision, which may or may not be supported by the server.
Rick Geimer (Oct 06 2017 at 13:40):
@René Spronk Remember that we are not talking about automated documents here. We are talking about bundling the references in the Composition as constructed. Someone could certainly construct a Composition that represents the kitchen sink (i.e. many CCDs), but there are other documents like a History and Physical, an Op Note, etc. that will very likely have deeply nested references with a finite (and rather small) closure. For what "should" be in a CCD or other document when you construct the Composition, I will refer to the output of the Relevant and Pertinent project (talk to Bob Dieterle and Keith Boone about that).
What I am arguing here for is that the server not cut off a document after an arbitrary number of reference levels. But to Grahame's point, I think it is perfectly acceptable for a server to cut off document processing when it consumes too many server resources. This could be defined as using too much RAM, taking too much time, or as Grahame suggested when it has Bundled up 500+ resources and shows no sign of stopping.
In other words, stop an actual runaway train, don't stop all trains arbitrarily because any train "could" run away.
John Moehrke (Oct 06 2017 at 14:22):
I think the point is that the server gets to define the pattern they are willing to fulfill. Thus the pattern should be clear. The pattern should not cause a server to get to 'too big'. This would be poor planning. However it will happen, simply because reality....
John Moehrke (Oct 06 2017 at 14:27):
We have run into cases where the requesting system can tell that the result is going to be 'too big' for the client to process. This seems to be a topic here. How does the client express some limits that the client can predict. Clearly the client must handle gracefully whatever comes back, but if it can indicate a 'too big' limit, then the document creation can be more graceful. These should be marked as being a subset by client request. That said, some documents simply can't or should not be arbitrarily shortened based on a client request. These documents would not be 'complete' or 'whole' or 'authenticatible'..... They are intended to be statements of a 'document' and thus must be taken whole or not-at-all... so graceful degradation is needed. right?
Lloyd McKenzie (Oct 06 2017 at 17:08):
When FHIR resources are created, they get interlinked with all sorts of other content. And none of that linking gets done with any thought as to how it will impact on document generation. I think the notion of just arbitrarily traversing all links - especially if we do reverse links - is a really bad idea. GraphDefinition is a much better solution. It allows you to approach the problem from a design perspective - you consciously decide what should be included vs. not based on what relationships are relevant.
Lloyd McKenzie (Oct 06 2017 at 17:09):
Even if traversing all links isn't always a problem, it's definitely going to be a problem in some systems and for some patients. And reverse links are going to be relevant in some situations and there's no way you can traverse all of those without including the whole EHR.
Elliot Silver (Oct 06 2017 at 17:47):
How would document generation with or without use of GraphDefinition work with resources spread across servers?
Lloyd McKenzie (Oct 06 2017 at 18:44):
Whether you're guided by transitive closure or graph definition, nothing in $document prevents you from traversing servers as you gather relevant information - and in many environments, it may be necessary to span servers even to include the "mandatory" resources (those directly referenced from Composition).
Elliot Silver (Oct 06 2017 at 18:48):
Right, but traversing other servers to obtain resources to put in a response is not a "normal" FHIR behavior. In many ways this is a fancy search operation, and standard search doesn't traverse servers.
Elliot Silver (Oct 06 2017 at 18:50):
Should traversal, where necessary, be a requirement for this operation? Or should there be an explicit statement that "there is no expectation of traversal"? Or should we just remind people "some document resources may live on other servers" and let them figure out the approach they want?
Grahame Grieve (Oct 06 2017 at 18:51):
I don't think of it like a search operation. My implementation is completely separate too. There is an engine that aggregates resources by following links in the resources. My implementation will not follow links to another server - but that is principally a management choice; following links on another server is conceptually equivalent
Rick Geimer (Oct 09 2017 at 14:38):
@Lloyd McKenzie I'm not proposing doing reverse links. I agree that is a bad idea. As for forward links I think it is absolutely necessary if you want to be able to move a document between servers and persist the entire content for legal retention periods, especially when the original server may disappear over the course of seven or more years. I think requiring someone to first author a document, then create a GraphDefinition showing what they want in it is redundant. It's like authoring the document twice. Better to author the document once when you create the Composition, then if you find $document generates more than you want, use GraphDefinition to reduce the scope.
Lloyd McKenzie (Oct 09 2017 at 15:26):
If you want to persist a document for legal retention, you store the binary. If you split the Bundle up into its constituent resources, you will almost certainly lose data and will lose the ability to validate any signatures. Because when you store a resource RESTfully, there's no expectation you must be able to store all extensions or even core elements, nor is there any expectation that you'll order the resources in a reassembled Bundle in the same order they originally were.
Lloyd McKenzie (Oct 09 2017 at 15:27):
In any case, that's orthogonal to whether you want the transitive closure of inclusion. And it doesn't address the fact that sometimes in a document you'll want to include a few _revinclude resources - e.g. Provenance.
Lloyd McKenzie (Oct 09 2017 at 15:28):
If you want to be able to generate a document, then you need a document definition that defines what goes in the document - which would be both a Composition and a GraphDefinition that sets out what content should be included vs. not. The GraphDefinition allows the server to include the relevant data (and only the relevant data). If you're manually crafting a document, then there's no need for GraphDefinition.
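The idea of a definition that says which links to follow can be sketched as below. Note that the `allowed` mapping (resource type to the element names worth following) is a deliberately simplified stand-in invented for this example; it is not the actual structure of the GraphDefinition resource, and the store is toy data.

```python
def guided_closure(start, store, allowed):
    """Follow only the reference elements listed for each resource type."""
    included, queue = [], [start]
    while queue:
        key = queue.pop(0)
        if key in included or key not in store:
            continue
        included.append(key)
        resource = store[key]
        for element in allowed.get(resource["resourceType"], ()):
            value = resource.get(element)
            targets = value if isinstance(value, list) else [value]
            queue.extend(
                t["reference"] for t in targets
                if isinstance(t, dict) and "reference" in t
            )
    return included


store = {
    "MedicationPrescription/rx1": {
        "resourceType": "MedicationPrescription",
        "medication": {"reference": "Medication/m1"},
    },
    "Medication/m1": {
        "resourceType": "Medication",
        "manufacturer": {"reference": "Organization/maker"},
    },
    "Organization/maker": {"resourceType": "Organization"},
}
# Follow the prescription's medication, but nothing beyond the Medication
allowed = {"MedicationPrescription": ["medication"]}
print(guided_closure("MedicationPrescription/rx1", store, allowed))
```

With this mapping the prescription and the medication are included, while the manufacturer is left out because no link from Medication is listed.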
John Moehrke (Oct 10 2017 at 14:59):
Note if you do decompose a document into Resources, then the resulting Resources could have a Provenance record pointing to the original document received. Thus you both save the original document and you make the content available as FHIR resources. THIS is what IHE has defined in the mXDE profile http://wiki.ihe.net/index.php/Mobile_Cross-Enterprise_Document_Data_Element_Extraction Initially documented as doing this for CDA documents, but would work equally well for FHIR documents, or any document that one can derive FHIR Resources from... The 'value-add' is the Provenance linkage back to the source 'document'....
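A rough sketch of that linkage, as a JSON-style dict: the extracted resources are the Provenance targets and the original document is recorded as a source entity. The helper name and the example references are hypothetical, the element names assume the R4 shape of Provenance, and the result is deliberately incomplete (for example it omits the agent element), so it should not be read as what mXDE itself specifies.

```python
def provenance_for_extracted(extracted_refs, source_document_ref, recorded):
    """Build a minimal Provenance linking extracted resources to their source."""
    return {
        "resourceType": "Provenance",
        # The resources that were derived from the document
        "target": [{"reference": ref} for ref in extracted_refs],
        "recorded": recorded,
        # The original document, kept as received
        "entity": [{"role": "source", "what": {"reference": source_document_ref}}],
    }


prov = provenance_for_extracted(
    ["Observation/o1", "Condition/c1"],
    "DocumentReference/doc1",
    "2017-10-10T14:59:00Z",
)
print(prov["entity"][0]["what"]["reference"])
```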
Last updated: Apr 12 2022 at 19:14 UTC