Stream: Security and Privacy
Topic: Meaning of Security Labels on Bundles
Mohammad Jafari (May 06 2020 at 17:41):
This was brought up at the Security WG meeting and I wanted to raise it within the broader community here. Are there any use-cases or rationale for putting high-watermark labels on FHIR bundles? I am wondering, since a bundle is not quite the same as an outer envelope and does not bear routing information (at least as far as I understand), in what scenario the client could benefit from high-watermark labeling given that the client has full access to the labeled resources in the bundle.
This takes me to my next question: isn't it more useful to interpret the semantics of a security label on the bundle as "applies to all of the entries" instead of high watermark? There is a clear rationale for this since it enables the server to avoid repeating labels on a potentially large number of resources within the bundle, especially when those labels are transaction-dependent and not inherent to the resources within the bundle. For example, if there is a PoU restriction or Delete-after-use handling instruction that only pertains to the transaction at hand, it can be assigned to the bundle instead of repeating it on every single resource within the bundle.
John Moehrke (May 06 2020 at 20:12):
I am not sure I understand the distinction you are making...
John Moehrke (May 06 2020 at 20:19):
the concept of high water can ONLY be applied to the confidentialityCode, which has a linear relationship
John Moehrke (May 06 2020 at 20:20):
all other codes are independent facts or actions
John Moehrke (May 06 2020 at 20:21):
that said.. I think if you are suggesting there be guidance on what the bundle.meta.security should have related to the content; very much agree it would be great if there was an algorithm. I suspect that algorithm will be highly influenced by Policy, which is another way of saying that the algorithm will not be standardized.
Mohammad Jafari (May 06 2020 at 20:49):
Let me clarify my point:
There are two ways to interpret meta.security on a bundle.
1. HWM, i.e. we will look at the resources in the bundle, compute a HWM, and assign it to the meta.security. For example, if a bundle contains two resources, one R and one M, the bundle is set to R.
2. Grouping, i.e. when we assign a label to the bundle, it means that the consumer must assume that that label applies to all the resources in the bundle. For example, we assign R to a bundle (instead of assigning it to every single resource in the bundle) and it means all resources in the bundle are R.
I'm suggesting (2) is useful especially for transaction-specific labels because it saves the server from repeating labels on potentially a large number of resources. I am also asking if there is any real use-case or rationale for (1).
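The two readings above can be contrasted in a minimal Python sketch. It assumes resources are plain dicts shaped like FHIR JSON, that confidentiality codes come from the HL7 v3 Confidentiality code system, and that the codes are ordered U < L < M < N < R < V. The function names are hypothetical and nothing here is a standardized algorithm.

```python
# Assumption: HL7 v3 Confidentiality codes, least to most restrictive.
CONF_SYSTEM = "http://terminology.hl7.org/CodeSystem/v3-Confidentiality"
CONF_ORDER = ["U", "L", "M", "N", "R", "V"]


def confidentiality_codes(resource):
    """Return the known confidentiality codes in resource.meta.security."""
    codings = resource.get("meta", {}).get("security", [])
    return [
        c["code"]
        for c in codings
        if c.get("system") == CONF_SYSTEM and c.get("code") in CONF_ORDER
    ]


def high_water_mark(bundle):
    """Reading 1 (HWM): most restrictive confidentiality code among the entries."""
    codes = [
        code
        for entry in bundle.get("entry", [])
        for code in confidentiality_codes(entry.get("resource", {}))
    ]
    return max(codes, key=CONF_ORDER.index) if codes else None


def grouped_labels(bundle, resource):
    """Reading 2 (grouping): bundle labels are treated as applying to the entry too."""
    combined = set(confidentiality_codes(bundle)) | set(confidentiality_codes(resource))
    return sorted(combined, key=CONF_ORDER.index)


def labeled(code):
    """Hypothetical helper: a resource carrying one confidentiality code."""
    return {"resourceType": "Observation",
            "meta": {"security": [{"system": CONF_SYSTEM, "code": code}]}}


mixed_bundle = {"resourceType": "Bundle",
                "entry": [{"resource": labeled("R")}, {"resource": labeled("M")}]}
print(high_water_mark(mixed_bundle))  # R
```

Under reading 1 the bundle label is derived from the entries; under reading 2 the consumer has to combine the bundle label with each entry's own labels, which is the extra work on the client side that the question is weighing.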
Mohammad Jafari (May 06 2020 at 20:49):
PS: At least theoretically, HWM for unordered valuesets is defined based on set-theoretic conjunction. For example, if there is an HIV resource and an SCA resource in a collection, the HWM would be {HIV, SCA}.
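As a small illustration of the unordered case, the sketch below combines labels as a set union, which is the reading that matches the {HIV, SCA} example. The function name is hypothetical, and labels are represented as plain strings for brevity rather than full codings.

```python
def unordered_hwm(label_sets):
    """Combine unordered labels (e.g. sensitivity codes) across a collection.

    Each element of label_sets holds the labels of one resource.
    The result is the union of all of them.
    """
    combined = set()
    for labels in label_sets:
        combined |= set(labels)
    return combined


collection = [{"HIV"}, {"SCA"}]
print(sorted(unordered_hwm(collection)))  # ['HIV', 'SCA']
```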
Lloyd McKenzie (May 06 2020 at 21:32):
I'm not in favor of 2 as it would be a type of context inheritance which we strenuously work to avoid in FHIR. If you want to know the characteristics of a resource, you should only ever have to look at the resource. Putting stuff on the Bundle that carries over to the resource means you'd have to transform the resources when you extract them as individual instances. Much better to declare the same thing on 1000 different resources.
John Moehrke (May 07 2020 at 23:50):
I have always expected the Bundle.meta.security would be metadata about the bundle, not necessarily the data contained within. As Lloyd points out each resource contained in a bundle has a .meta.security you can inspect. Thus I expect the bundle.meta.security to mainly hold PurposeOfUse for which the data are allowed, obligations that must be imposed because of the conditions of the creation of the bundle, and refrains that must be respected because of the conditions of the creation. It would seem a confidentiality code is useful, but it is not clear it is solely based on the content.
Mohammad Jafari (May 08 2020 at 17:57):
I think I find @Lloyd McKenzie 's argument about context inheritance compelling here. It seems like the burden and complexity imposed on the client in computing resource labels outweighs the minuscule comfort for the server in such a grouping strategy.
Although I agree with you @John Moehrke that at a conceptual level the bundle and the resources are separate entities, any obligations (like restricted purpose of use or delete after use) associated with the bundle as a result of the context of the transaction ultimately apply to the individual resources in the bundle. Since the bundle is an ephemeral data structure and is usually not persisted, the client still needs to apply such obligations to individual resources within the bundle before persisting them (for example, in an exchange scenario).
In any case, it will be helpful if we can clarify this in the security labeling specs with examples.
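For illustration only, here is a minimal Python sketch of what "applying bundle obligations to individual resources before persisting" could look like on the recipient side. It assumes dict-shaped FHIR JSON; the function name is hypothetical, and the DELAU (delete after use) code from the v3 ActCode system is used purely as an example label. This is one possible recipient behavior, not something a sender could rely on.

```python
import copy

# Assumption: obligation codes such as DELAU are drawn from v3 ActCode.
ACT_CODE = "http://terminology.hl7.org/CodeSystem/v3-ActCode"
CONF_SYSTEM = "http://terminology.hl7.org/CodeSystem/v3-Confidentiality"


def extract_with_bundle_labels(bundle):
    """Return copies of the entry resources with the bundle's labels added.

    The incoming bundle is left unmodified. Labels already present on a
    resource are not duplicated.
    """
    bundle_labels = bundle.get("meta", {}).get("security", [])
    extracted = []
    for entry in bundle.get("entry", []):
        resource = copy.deepcopy(entry["resource"])
        security = resource.setdefault("meta", {}).setdefault("security", [])
        seen = {(c.get("system"), c.get("code")) for c in security}
        for coding in bundle_labels:
            key = (coding.get("system"), coding.get("code"))
            if key not in seen:
                security.append(copy.deepcopy(coding))
                seen.add(key)
        extracted.append(resource)
    return extracted


incoming = {
    "resourceType": "Bundle",
    "meta": {"security": [{"system": ACT_CODE, "code": "DELAU"}]},
    "entry": [{"resource": {
        "resourceType": "Observation",
        "id": "obs1",
        "meta": {"security": [{"system": CONF_SYSTEM, "code": "R"}]},
    }}],
}
to_persist = extract_with_bundle_labels(incoming)
```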
John Moehrke (May 08 2020 at 19:12):
@Mohammad Jafari You are imposing a storage requirement as an Interop requirement. We need to focus on interop communications, and not define storage requirements.
John Moehrke (May 08 2020 at 19:17):
That is to say that a recipient can store the obligations however they need to store them in order to enforce them. To presume that the data resources must be tagged by the sender is not necessary. I really don't think that these kinds of communications policies should be updated in the data object, as they are not meta about the data; they are meta about the communications. This separation of layers is an important systems design. This separation of layers does not forbid a recipient from doing as you are suggesting and updating the data .meta.security tagging, but by separating these layers we enable recipients to keep these policy statements elsewhere (like in an XACML database).
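The layering described above can be sketched as follows: the recipient records the bundle-level obligations against each received resource in a separate store and leaves the resources themselves untouched. This is a rough sketch under assumptions: the in-memory dict stands in for whatever policy store a recipient actually uses (an XACML database, for instance), the function name is hypothetical, and ENCRYPT is used only as an example obligation code.

```python
# Assumption: obligation codes such as ENCRYPT are drawn from v3 ActCode.
ACT_CODE = "http://terminology.hl7.org/CodeSystem/v3-ActCode"


def record_release_obligations(bundle, policy_store):
    """Record bundle-level labels per contained resource, outside the resource.

    policy_store maps "ResourceType/id" to a set of (system, code) pairs.
    The resources in the bundle are not modified.
    """
    labels = {
        (c.get("system"), c.get("code"))
        for c in bundle.get("meta", {}).get("security", [])
    }
    for entry in bundle.get("entry", []):
        resource = entry["resource"]
        key = f'{resource["resourceType"]}/{resource["id"]}'
        policy_store.setdefault(key, set()).update(labels)
    return policy_store


released = {
    "resourceType": "Bundle",
    "meta": {"security": [{"system": ACT_CODE, "code": "ENCRYPT"}]},
    "entry": [{"resource": {"resourceType": "Observation", "id": "obs1"}}],
}
store = record_release_obligations(released, {})
```

Keeping the obligations out of the resource means a later change in policy only touches the store, not every stored resource.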
Mohammad Jafari (May 08 2020 at 21:58):
Storage is just one example of processing and handling. Any processing of resources received in a bundle depends on those bundle labels.
To be clear, again, I don't disagree with your point from a conceptual perspective, but I am just pointing out that this directly ties to the concern that Lloyd raised earlier about the self-sufficiency of resources. When labels on a bundle apply to the resources in the bundle, this _is_ context inheritance.
Lloyd McKenzie (May 09 2020 at 00:09):
Labels on a Bundle apply to the Bundle. They can't apply to the resources within the Bundle. If there are rules for the individual resources, they need to be declared on the individual resources.
John Moehrke (May 11 2020 at 13:38):
Lloyd McKenzie said:
Labels on a Bundle apply to the Bundle. They can't apply to the resources within the Bundle. If there are rules for the individual resources, they need to be declared on the individual resources.
I agree with this mostly, but want to be very careful about the meaning of your point. I am mostly very concerned about "..They can't apply to the resources within the bundle". The word I am concerned with is "apply". There are uses of the English word "apply" that I would definitely agree with, but others that I would disagree with. My point is that Bundle.meta.security is different from the resources' .meta.security. The tags on the resources are tags about that data, independent of policies that can change dynamically (e.g. current consent status, as that can change over time totally independent of tags on data). Tags on the bundle level are about (meta) the bundle, which is highly influenced by the data within the bundle, the context of the request that caused the bundle to exist, and current policies (e.g. consent status).
John Moehrke (May 11 2020 at 13:40):
This is not to say that data can't point at policies that affect future uses of that data. Hence why data can have tags that indicate that the data are sensitive to HIV status. This is a meta statement about the data. This fact is used during a future decision, based on dynamic policy like consent status related to HIV status disclosure.
John Moehrke (May 11 2020 at 13:43):
but data should not be tagged with transitory policy rules that could change dynamically. This is simply good systems design. Security standards have recognized this for decades; this is not new to healthcare. All domains do tend to occasionally forget this, tag data with dynamic information, and then realize it was a really bad idea. This was a huge mistake that IHE made with the original release of the BPPC profile, which was painful but eventually changed. I don't want FHIR to make this mistake as it will be far more painful.
John Moehrke (May 11 2020 at 13:46):
If we look at all the codes in the HCS, there are some codes that we expect to see on data (sensitivity, confidentiality, and integrity). Sometimes purposeOfUse (and broader compartment) are also on data when the database is itself not dedicated to a singular purpose (e.g. an EHR), and in this case there is a need to record why the data was collected (note that this can also be done through other means like Provenance).
John Moehrke (May 11 2020 at 13:54):
but there are other codes like Obligations, Refrain, and Policy that are not appropriate for data. They are appropriate as part of a release of data (e.g. I am releasing this data to you but you must Encrypt this data wherever you store it). This obligation can be inspected by the recipient, who can decide they can't abide by it and so will discard the data, or recognize that their system can deal with encryption. This recipient can choose to record this obligation however it wants. It could put this obligation into the data. But it doesn't have to put it into the data. This obligation was about the release action that produced the Bundle. This release included the data, so the obligation does 'apply' to the data; but the release obligation is not a fundamental 'meta' about the data, it was about the release.
Lloyd McKenzie (May 11 2020 at 19:51):
The basic point is if I receive a resource in a Bundle and nothing about the Bundle disallows me from removing a resource from the Bundle (that I'm contractually obligated to follow), then I can yank the resource from the Bundle and only pay attention to the tags on the resource and totally ignore the Bundle tags once the resource has been extracted.
John Moehrke (May 11 2020 at 20:05):
there is always some contractual obligation that causes you to look at anything. There is nothing in any data-model that compels one to enforce anything. So this distinction you are trying to make between a bundle and the resources within is not a distinction with any difference.
The data holder would need to assess if you were trustworthy to give the data to in the first place. That decision to provide data is what I am referencing as where the obligations and refrains come from. That decision, in this example, came to the conclusion that these residual obligations and refrains were possible for the recipient to enforce. If the recipient is not trustworthy to get the data or enforce the obligations/refrains, then the bundle would not be created at all. I am simply making the recommendation that when the recipient can be trusted with these residual obligations and refrains, that they would be best carried in the bundle.meta.security and not replicated across all the data within the bundle (unless they were already marked that way).
Lloyd McKenzie (May 11 2020 at 20:12):
"always" is a statement I always reject ;)
Lloyd McKenzie (May 11 2020 at 20:13):
The key point is you cannot expect a receiver to propagate any tags on the Bundle to be tags on the content within the Bundle. If you want a policy to apply to a resource within the Bundle, you need to declare it on that resource.
John Moehrke (May 11 2020 at 22:21):
so what do you do when there are two policies that apply to the data. Something like "do not print" for payers, and "do not re-disclose for clinicians", while the patient has no restrictions????? These are policies that are dynamic and NOT about the data. They are NOT META.
John Moehrke (May 11 2020 at 22:22):
You can give this a try. In one or two years you will realize your mistake. I can't stop you. But I will continue to try to prevent you from doing a bad thing.
John Moehrke (May 11 2020 at 22:23):
PLEASE look at the codes I am talking about... Obligations and Refrain. I am NOT making any comments about sensitivity, confidentiality, or integrity codes. I am all for marking data with meta tags about the data (confidentiality, sensitivity, and integrity). I am simply against putting obligations and refrain codes on data.
Jerry Goodnough (May 16 2020 at 04:28):
As I understand the concept of a security label, it is an advisement to the consumer that extraordinary handling of received content is in order... the nature and scope of such handling is specific to the business relationships between the parties and the associated obligations as a result. If this basic presumption is correct then there is a distinct role for labeling aggregate information (i.e. the bundle). There are many cases where the association of data in a single unit raises the overall sensitivity of the unit. One simple use case is the result of queries for subjects with a particular condition. More complex cases exist, for example a bundle of vital signs from a platoon in the field. The vitals themselves may have some sensitivity, but the key concern for a consumer is to be aware that the composition of the platoon is also sensitive information.
Mohammad Jafari (May 19 2020 at 17:07):
Thanks Jerry for this very subtle and helpful use-case! I think using the meta.security on the bundle in such a use-case is perfectly aligned with the earlier conversations in this thread about the conceptual difference between a bundle and its contents.
John Moehrke (May 19 2020 at 17:17):
I especially like the point that a search set can be more sensitive than the data it contains. Such as the example of a search set that might contain only normal data, but for which the search criteria was to return the patients with a sensitive observation. Thus the data contained in the bundle would be normal patient resources, but clearly their association within a bundle on a sensitive query is sensitive. This is a concrete example of my distinction of the bundle having a reason to exist that must be factored into the bundle meta.security values in a way that is related but independent to purely the contents of the bundle.
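A minimal Python sketch of this search-set case: the bundle is labeled from the sensitivity of the query that produced it, while the contained resources keep their own (here, absent) labels. The function name and the way the query's sensitivity is supplied are hypothetical, and HIV from the v3 ActCode system is used only as an example sensitivity code.

```python
# Assumption: sensitivity codes such as HIV are drawn from v3 ActCode.
ACT_CODE = "http://terminology.hl7.org/CodeSystem/v3-ActCode"


def build_searchset(resources, query_sensitivity):
    """Build a searchset Bundle labeled from the query, not from the content.

    query_sensitivity is the set of sensitivity codes implied by the search
    criteria. The contained resources are included as they are.
    """
    security = [
        {"system": ACT_CODE, "code": code} for code in sorted(query_sensitivity)
    ]
    return {
        "resourceType": "Bundle",
        "type": "searchset",
        "meta": {"security": security},
        "entry": [{"resource": r} for r in resources],
    }


patients = [{"resourceType": "Patient", "id": "p1"},
            {"resourceType": "Patient", "id": "p2"}]
result = build_searchset(patients, {"HIV"})
```

Here the Patient resources carry no sensitivity label of their own, yet the bundle does, because membership in the result set is itself the sensitive fact.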
k connor (May 20 2020 at 23:05):
Wouldn't the sender of the platoon's vital signs be smarter to group those in another container type resource - e.g., List and label the list as "restricted" confidentiality and perhaps a sensitivity code related to the risk of revealing military intelligence because the list contents could reveal the composition of the platoon?
Then the Bundle.meta.security would also have "restricted" confidentiality, but not the sensitivity code for military sensitivity. The List would remain appropriately labeled even when the Bundle gets dropped.
However, that gets us back to where Mohammad started this discussion: do high water marks work in FHIR? I think there needs to be a way to encrypt the contents of a bundle so that only authorized recipients are able to access the information. But there doesn't seem to be a way to do this with FHIR.
John Moehrke (May 21 2020 at 14:25):
the high water concept is different than a functionality to encrypt. Generally all communications of FHIR are over encrypted channels, which so far has been seen as sufficient.
Mohammad Jafari (May 21 2020 at 15:06):
John, I think the encryption Kathleen is referring to here is beyond the channel encryption; for example, if one particular resource is masked for a specific end-user on the recipient's side. My understanding of such use-cases, though, is that we can have some attributes in a resource masked without having to mask the entire resource, especially the resource metadata including meta.security.
It's also unclear to me how masking an entire resource would work considering that it will break the bundle schema so we need some IG if there is a compelling use-case for this, but that's a separate discussion.
John Moehrke (May 21 2020 at 15:08):
I understand that.. but that is not the topic. I am not against talking about that, but it is orthogonal. It is very different than common REST use. And would definitely be a different type of Bundle that does not yet exist
Mohammad Jafari (May 21 2020 at 15:10):
so my understanding is that Kathleen is suggesting that HWM on bundle makes sense if the content of the bundle resources (including the resource security labels) are hidden from the client by encryption. Essentially when the bundle resembles the concept of outer envelope.
John Moehrke (May 21 2020 at 15:13):
I don't agree. The bundle is a resource, so the .meta.security should represent the content of the bundle. The concept of high-water-mark is that there is some algorithm (yet to be defined, right?) that can be used to calculate the high-water-mark for the bundle given the content.
John Moehrke (May 21 2020 at 15:14):
Now I don't think that the high-water-mark is as critical as a well defined meaning of the bundle.meta.security to hold obligations and refrains.. but if this topic will be addressed elsewhere, then that discussion can happen elsewhere. (much like a new type of bundle that has encrypted parts)
Last updated: Apr 12 2022 at 19:14 UTC