Stream: implementers
Topic: Provenance - understanding what happened
Morten Ernebjerg (Aug 04 2021 at 08:55):
We are looking at creating Provenance instances for resources we ingest into our system, mainly to document when and from where they were imported. After reviewing Provenance, it seemed activity was the key element for indicating what actually happened (creation, update, signing...). So a mapping like this seemed appropriate for us:
- activity = "CREATE" (from http://terminology.hl7.org/CodeSystem/v3-DataOperation)
- agent.type = "performer" (from http://terminology.hl7.org/CodeSystem/provenance-participant-type)
- agent.who = Reference to the (customer-specific) system doing the ingestion
But then I noticed that the simplest Provenance example in the spec does not use activity, and that the US Core IG Provenance profile does not declare activity must-support (and does not use it in this official example). Hence the following questions:
- How is one supposed to understand what happened with the target resource if activity is not used (as in the two examples linked above)? E.g. should it be inferred from agent.type (if present) or does it implicitly default to resource creation? I am thinking particularly of scenarios in which one cannot reference specific versions of the target resource (not possible on our system), so that provenance instances for creation and update would have exactly the same target.
- Following on that, is the usage of activity and agent I suggested above appropriate for our case?
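For illustration, the mapping proposed in the message above could be written out roughly as follows. This is only a minimal sketch that builds the resource as a plain Python dict, not a validated or recommended Provenance. The helper name build_ingestion_provenance, the references Observation/example and Device/ingestion-pipeline-1, and the timestamp are hypothetical placeholders; the two code systems are the ones named in the message.

```python
import json


def build_ingestion_provenance(target_ref, device_ref, recorded):
    """Build a Provenance dict using the mapping proposed in the thread.

    Nothing here is validated against a FHIR server or profile.
    """
    return {
        "resourceType": "Provenance",
        "target": [{"reference": target_ref}],
        "recorded": recorded,
        # activity = "CREATE" from v3-DataOperation, as proposed above.
        "activity": {
            "coding": [{
                "system": "http://terminology.hl7.org/CodeSystem/v3-DataOperation",
                "code": "CREATE",
            }]
        },
        "agent": [{
            # agent.type = "performer" from provenance-participant-type.
            "type": {
                "coding": [{
                    "system": "http://terminology.hl7.org/CodeSystem/provenance-participant-type",
                    "code": "performer",
                }]
            },
            # agent.who = the system doing the ingestion.
            "who": {"reference": device_ref},
        }],
    }


# Hypothetical identifiers, for illustration only.
provenance = build_ingestion_provenance(
    target_ref="Observation/example",
    device_ref="Device/ingestion-pipeline-1",
    recorded="2021-08-04T08:55:00Z",
)
print(json.dumps(provenance, indent=2))
```

Whether "CREATE" and "performer" are the right codes for an import is exactly what the rest of the thread discusses.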
Vivian Neilley (Aug 04 2021 at 12:14):
We interestingly use agent.who as the reference to the document it came from, not necessarily the source system. Would be interesting to see how others are implementing (https://github.com/GoogleCloudPlatform/healthcare-data-harmonization/blob/master/mapping_configs/hl7v2_fhir_r4/projector_library/datatypes.wstl)
John Moehrke (Aug 04 2021 at 12:20):
Noted. The security wg is addressing some fantastic Change Requests from the last ballot. The issues you are pointing at are clearly within this set of questions. The use of the Change Request system is very important to improving the specification.
John Moehrke (Aug 04 2021 at 12:23):
The vocabulary binding to .activity is going to radically change. It should be only recommending event type codes, and the .activity is intended to hold a code that expresses the type of event that caused the change. The current vocabulary has a large number of inappropriate codes.
John Moehrke (Aug 04 2021 at 12:39):
@Morten Ernebjerg it seems you are describing an activity of importing data, so you would want an .activity that says importing. -- There does not seem to me to be a good code in the valueSet given (as I said above, this is known to be a useless valueset).
I don't think that you would be doing a "CREATE" as you are importing, not creating.
We should find a good code for Import. In IHE I used http://hl7.org/fhir/w3c-provenance-activity-type#Derivation
I am not sure that is the right code to use either.
John Moehrke (Aug 04 2021 at 12:42):
so you can see why few have used .activity... it is indeed poorly defined.
John Moehrke (Aug 04 2021 at 12:47):
as to the question (1)... I am not sure I fully understand your question. If the question is about the .target and how to handle the case where version-specific targets are not supported, this is not something we address. If you have creation times, you can correlate them with the provenance recorded at that time, and thus all other provenance are updates, in order of time. You can even just sort the provenance by time, and the oldest one is most likely a create. Time is the best key you have if you don't have a versioning FHIR server.
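The time-ordering idea in the message above can be sketched like this. It is a heuristic illustration only: the function names parse_instant and label_by_time, the labels, and the sample records are hypothetical, and the code assumes every record carries a recorded timestamp in a form that Python's datetime.fromisoformat accepts once a trailing "Z" is rewritten.

```python
from datetime import datetime


def parse_instant(value):
    """Parse a timestamp such as 2021-08-04T12:47:00Z or one with an offset."""
    # datetime.fromisoformat rejects a trailing "Z" before Python 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def label_by_time(provenances):
    """Return (label, provenance) pairs ordered by `recorded`.

    The oldest record is labelled "likely create", all later ones
    "likely update". This is a guess based on time, not a guarantee.
    """
    ordered = sorted(provenances, key=lambda p: parse_instant(p["recorded"]))
    return [
        ("likely create" if index == 0 else "likely update", prov)
        for index, prov in enumerate(ordered)
    ]


# Hypothetical records that all point at the same unversioned target.
records = [
    {"id": "prov-b", "recorded": "2021-08-05T10:00:00Z"},
    {"id": "prov-a", "recorded": "2021-08-04T12:47:00+02:00"},
    {"id": "prov-c", "recorded": "2021-08-06T09:30:00Z"},
]
labelled = label_by_time(records)
for label, prov in labelled:
    print(prov["id"], label)
```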
John Moehrke (Aug 04 2021 at 12:51):
the agent.type would differentiate multiple agents from each other. So you might have an agent for the source that sent you the data, and one for the system doing the import.
Again, this vocabulary is poor, and we are working on it. Would be good to get feedback on the kinds of things you want to say about agents.
John Moehrke (Aug 04 2021 at 12:52):
Vivian Neilley said:
We interestingly use agent.who as the reference to the document it came from, not necessarily the source system. Would be interesting to see how others are implementing (https://github.com/GoogleCloudPlatform/healthcare-data-harmonization/blob/master/mapping_configs/hl7v2_fhir_r4/projector_library/datatypes.wstl)
This is wrong. The reference document would go in an .entity. A document can't be an .agent.
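One way to read this correction is sketched below: the importing system stays in agent.who, and the document the data came from moves to an entity. The sketch assumes the R4 element names entity.role and entity.what and the role code "source"; the helper name build_provenance_with_source and all references are hypothetical.

```python
def build_provenance_with_source(target_ref, importer_ref, source_doc_ref, recorded):
    """Build a Provenance dict that keeps the agent and the source document apart."""
    return {
        "resourceType": "Provenance",
        "target": [{"reference": target_ref}],
        "recorded": recorded,
        # The agent is the actor, here the system doing the import.
        "agent": [{"who": {"reference": importer_ref}}],
        # The document the data came from is an entity, not an agent.
        "entity": [{"role": "source", "what": {"reference": source_doc_ref}}],
    }


# Hypothetical references, for illustration only.
with_source = build_provenance_with_source(
    target_ref="Patient/example",
    importer_ref="Device/hl7v2-mapper",
    source_doc_ref="DocumentReference/incoming-message-1",
    recorded="2021-08-04T12:52:00Z",
)
print(with_source["entity"])
```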
John Moehrke (Aug 04 2021 at 12:55):
I thank you both for bringing these forward. It is inspiring me to make R5 better. I might also work on an informal IG on Provenance similar to what I have done for Basic Audit http://build.fhir.org/ig/JohnMoehrke/BasicAudit/branches/main/index.html
Morten Ernebjerg (Aug 04 2021 at 13:34):
Hi @John Moehrke, thanks for all the feedback and for picking up all these themes! There are several interesting issues here so I will try to separate them a bit:
Value sets bound to activity and other elements
You hit the nail on that one, I was indeed very confused about them and what would be an appropriate choice for a given case :smile: (e.g. having things like "analyte" or "origin" as possible codes for an activity). Great you are working on those!
Appropriate code for activity in our case
This was indeed something I was unsure about, not only because of the bound value set. I think my key confusion is whether the Provenance is talking about (1) the data itself, independent of the specific technical incarnation, or (2) the specific technical artefact I am creating on my system. If it's talking about (2), then "CREATE" seems appropriate since I am indeed creating a new artefact on my system (the one I point to in target). If it's talking about (1), then "CREATE" is clearly not the full story, since the activity that actually happened was that data flowing in from one system was piped into a new artefact on my system. I would certainly be happy with a code saying "importing"; only, as you say, there is no obvious candidate.
Interpreting Provenance without explicit activity
What I meant by my question (1) was this: If I am given a Provenance resource with only target = A and agent = B (and recorded = T), it seems all I know is "Agent B did something to resource A and recorded it at time T". Even if I can time-order different Provenance instances, this seems quite vague. E.g. there would be no way to distinguish updating, reviewing, signing, adding a security label etc. I suppose I'm simply struggling to understand what useful information I can extract from this. A more concrete way of asking this would be: Suppose I have a UI and the users can click on a resource to see a brief summary of its provenance in a pop-up (this is essentially our use case). If I have an "activity-less" Provenance, what could I display to users (that would bring them value) in that pop-up?
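To make the pop-up question concrete, here is a rough sketch of a summary function. It is not a recommendation, only an illustration of how little can be said when activity is missing; the function name summarize_provenance, the wording of the output, and the sample resources are all hypothetical.

```python
def summarize_provenance(prov):
    """Return a one-line, human-readable summary of a Provenance dict."""
    who_ref = prov["agent"][0]["who"]
    who = who_ref.get("display") or who_ref.get("reference", "unknown agent")
    targets = ", ".join(t.get("reference", "?") for t in prov["target"])
    codings = prov.get("activity", {}).get("coding", [])
    if codings:
        what = codings[0].get("display") or codings[0].get("code") or "unspecified activity"
        action = f"performed '{what}' on"
    else:
        # Without activity there is nothing more specific to report.
        action = "did something to"
    return f"{who} {action} {targets} (recorded {prov['recorded']})"


# Hypothetical resources: one without activity, one with.
bare = {
    "target": [{"reference": "Observation/A"}],
    "agent": [{"who": {"reference": "Device/B"}}],
    "recorded": "2021-08-04T13:34:00Z",
}
with_activity = dict(bare, activity={"coding": [{"code": "CREATE"}]})

print(summarize_provenance(bare))
print(summarize_provenance(with_activity))
```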
What I would like to say about agents in our current use case
In our case, we basically want to say that a particular instance of our ingestion pipeline (so basically a Device) moved this piece of data onto our system from somewhere else (we will typically not know what that "somewhere else" is). So it is a non-human actor that blindly passes the data on without looking into or modifying it. For agent.type, I had a hard time finding a good code to communicate that the ingestion system is just piping the data into a different system without modifying it. Most of the existing codes other than the very generic "performer" seem to indicate more active involvement than in my case - in general, the value set seems mainly targeted at human actors. I see that the US Core Provenance profile adds the participation type code "transmitter", which is a good match for our use case, so that seems like a very sensible addition.
Morten Ernebjerg (Aug 04 2021 at 13:40):
BTW just let me know if you need a Jira issue for something specific.
Vivian Neilley (Aug 04 2021 at 14:06):
Agreed. Activity depends on reconciliation of the data - which is handling fhir to fhir after initial mapping - thus - activity wouldn't be as relevant (it would always create, then be reconciled).
John Moehrke (Aug 04 2021 at 16:36):
so the current plan for the activity valueset is to change to Example binding, and include just a handful of codes. So would love to hear of codes being used, as those would be a good handful. -- https://jira.hl7.org/browse/FHIR-33020
John Moehrke (Aug 04 2021 at 16:37):
more specifically on Provenance.activity - https://jira.hl7.org/browse/FHIR-32517
John Moehrke (Aug 04 2021 at 16:38):
I think you might be looking to .activity when you should be looking for .reason... but the current model does not make this clear. There are big changes coming to .reason -- https://jira.hl7.org/browse/FHIR-32354
John Moehrke (Aug 04 2021 at 16:42):
I agree with your assessment that the Provenance is more about the activity than the RESTful action. Although recording the RESTful action is sometimes all that can be done, and it is not uncommon for a FHIR Server to be able to do this automatically. Thus an initial instance of some target resource might have a simple Provenance, as recorded by the FHIR Server, and a more comprehensive Provenance as recorded by the session level application/service that knows more. Might even have others.
John Moehrke (Aug 04 2021 at 16:45):
We do have a request for clarity for systems that are just transforming - https://jira.hl7.org/browse/FHIR-26313
John Moehrke (Aug 04 2021 at 16:47):
Security workgroup meets on Mondays - Here is the agenda/minutes from this week with the details on how to participate.
https://confluence.hl7.org/display/SEC/2021-08-02+FHIR-Security+Meeting+Agenda
Morten Ernebjerg (Aug 06 2021 at 07:56):
Thanks for the JIRA links @John Moehrke - reviewing the issues I realized that "assembler" is probably the more appropriate agent.type for us.
so the current plan for the activity valueset is to change to Example binding, and include just a handful of codes. So would love to hear of codes being used, as those would be a good handful. -- https://jira.hl7.org/browse/FHIR-33020
I just saw the activity codes from http://terminology.hl7.org/CodeSystem/iso-21089-lifecycle (part of the bound VS in the latest build). On a quick review, these look like a very useful selection for our world (if they are free to use, cf. this thread). One curious thing about that code system, though, is that it does not contain an obvious code for a plain create ("originate" seems close, but the definition sounds more convoluted).
John Moehrke (Aug 06 2021 at 11:53):
almost all valuesets in Provenance will be moving to example and just have a handful of useful codes.
Last updated: Apr 12 2022 at 19:14 UTC