FHIR Chat · security labels for dummies? · Security and Privacy

Stream: Security and Privacy

Topic: security labels for dummies?


view this post on Zulip Isaac Vetter (Nov 27 2019 at 04:34):

I'm trying (really hard!) to understand how security labels would actually work in practice. While OAuth2/SMART scopes are widely used in prod, it seems like tagging resources with a security label is a possibility in the future. I don't understand how security labels are created, nor how they interact with OAuth2 scopes. Are there any basic, educational articles that I could read to understand this? @John Moehrke @Mohammad Jafari

view this post on Zulip Grahame Grieve (Nov 27 2019 at 04:50):

I think a few of them are used in practice. VIP tag on Patient, for instance, is a widely adopted practice in Australia

view this post on Zulip Mohammad Jafari (Nov 27 2019 at 04:55):

@Isaac Vetter I'll try to provide a quick summary:
Generally, security labels are assigned by a Security Labeling Service (SLS) based on rules (e.g. originating from policies or patient consent). For example, based on the appearance of certain clinical codes in medications or diagnoses, the SLS may decide to assign the label HIV. Or, based on organizational policies or patient consent, the SLS may assign the confidentiality label R to a resource.
The SLS can operate in real time by labeling resources on-the-fly as they are requested by a client, or it can operate as a batch service. In practice, probably a combination of both is desirable depending on the type of labels and the type of processing (e.g. if you are using natural language processing to scan unstructured text in a resource, it's usually not feasible to implement that in real time).

OAuth 2.0 scopes, on the other hand, record the client's level of access, i.e. clearance, which is determined by the OAuth server based on policies and patient consent. For example, a patient can grant an app the clearance to access resources with confidentiality level N (but not R).

When a client requests a resource, the FHIR Server needs to ensure that the client's clearances match the labels of the resource. So for example, a client with confidentiality clearance N should not be able to view a resource labeled as R.
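Mohammad's matching rule can be sketched in a few lines of Python. The ordering of the HL7 Confidentiality codes (U < L < M < N < R < V) is standard, but the function name and enforcement shape here are purely an illustration, not any real server's API:

```python
# Hypothetical sketch of the enforcement step: the server compares the
# client's confidentiality clearance (from its OAuth token) against the
# resource's confidentiality label.
# Ordering per the HL7 Confidentiality code system: U < L < M < N < R < V.
CONFIDENTIALITY_ORDER = ["U", "L", "M", "N", "R", "V"]

def clearance_covers(clearance: str, label: str) -> bool:
    """True if a client cleared to `clearance` may see a resource labeled `label`."""
    return CONFIDENTIALITY_ORDER.index(clearance) >= CONFIDENTIALITY_ORDER.index(label)

# A client cleared for N can read N-labeled data but not R-labeled data.
assert clearance_covers("N", "N")
assert not clearance_covers("N", "R")
```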

view this post on Zulip Grahame Grieve (Nov 27 2019 at 04:59):

.. I've never seen an SLS in live practice. But I have seen security flags - usually assigned straight out of the API. Or sometimes in config tables associated with particular kinds of STD-related tests

view this post on Zulip Isaac Vetter (Nov 27 2019 at 05:13):

Mohammad - thank you for the explanation!

view this post on Zulip Isaac Vetter (Nov 27 2019 at 05:13):

OAuth 2.0 scopes on the other hand, record the client's level of access ... For example, a patient can grant to an app the clearance to access resources with confidentiality level N (but not R).

view this post on Zulip Isaac Vetter (Nov 27 2019 at 05:14):

Do you have a recommendation for representing N, but not R as an OAuth2 scope? I'm trying to figure out how this could interact with an overly simple SMART scope.

view this post on Zulip Isaac Vetter (Nov 27 2019 at 05:14):

security labels are assigned by a Security Labeling Service (SLS) based on rules (e.g. originating from policies or patient consent)

view this post on Zulip Isaac Vetter (Nov 27 2019 at 05:14):

I do struggle to understand exactly how an SLS should be implemented by a developer. It feels a little loosey-goosey to say that based on "certain clinical codes in medications or diagnoses, the SLS may decide to assign" a label. I imagine the worst case -- a poor HIM analyst in a basement assigning security labels to conditions 8x265. Exactly who is involved in the workflow of assigning a security label?

view this post on Zulip John Moehrke (Nov 27 2019 at 13:39):

note the SLS is a service. We security geeks knew we couldn't differentiate between the various medical conditions; we are security geeks, after all. So we see that there are clinical-decision-support services, and decide that we want to use that huge brain. So we define a service that would look at the data and label it, likely based on the same brain that also drives CDS. So the SLS is a form of CDS... if you believe that CDS can exist to some degree, then so can an SLS.

view this post on Zulip John Moehrke (Nov 27 2019 at 13:42):

Here are some blog articles where I have tried to provide a simplified view of security labeling used for access control
https://healthcaresecprivacy.blogspot.com/2019/02/basic-ds4p-how-to-set.html
https://healthcaresecprivacy.blogspot.com/2019/02/segmenting-sensitive-health-topics.html
https://healthcaresecprivacy.blogspot.com/2019/02/what-is-ds4p.html

view this post on Zulip John Moehrke (Nov 27 2019 at 13:46):

Here is my three-year-old proposal for adjusting SMART scopes to add purposeOfUse (how will this data be used if you give it to me) and _confidentiality (the highest confidentiality code authorized for access)
https://healthcaresecprivacy.blogspot.com/2016/01/fhir-oauth-scope.html
I don't claim that this three-year-old proposal still holds up, but it is readily available to read
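For readers who don't follow the link, one hypothetical way such extended scopes could look is to append the extra vectors to a SMART-style scope string. This is an illustration only, not the exact syntax from the blog post:

```python
from urllib.parse import parse_qs

# Hypothetical scope grammar (illustrative, not from any spec): a SMART-style
# resource scope with extra vectors carried as query-style parameters, e.g.
#   "user/Observation.read?purposeOfUse=TREAT&_confidentiality=N"
def parse_scope(scope: str):
    base, _, params = scope.partition("?")
    extras = {k: v[0] for k, v in parse_qs(params).items()}
    return base, extras

base, extras = parse_scope("user/Observation.read?purposeOfUse=TREAT&_confidentiality=N")
assert base == "user/Observation.read"
assert extras["_confidentiality"] == "N"  # highest confidentiality authorized
```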

view this post on Zulip John Moehrke (Nov 27 2019 at 13:48):

HEART has a scopes definition that includes many of these vectors in a very different arrangement. HEART is intended to be cascaded with a business-level OAuth (e.g. SMART), so the thing actually authorized is the overlap of the HEART token and the SMART token. Thus the SMART token doesn't need to address all vectors, just the business-level ones.
https://healthcaresecprivacy.blogspot.com/2015/11/heart-profiles-for-review-comment-and.html

view this post on Zulip Mohammad Jafari (Nov 27 2019 at 17:00):

Isaac:
- The simple SMART scopes are currently unable to record security labels, but I am aware that an update is being discussed by them to accommodate more advanced scopes. Meanwhile, there are other proposals out there, including the one John mentioned above. I have an experimental implementation which I have demoed at past connectathons, discussed here and here.

- As for the SLS, the VHA team has demoed a simple labeling service at past connectathons, working from simple rules that map clinical codes (e.g. SNOMED and RxNorm) to sensitivity labels; the rules were essentially of the form if clinical codes x and y and z exist in the content of the resource, then mark it as HIV. A second sweep would then assign confidentiality codes based on the sensitivity codes assigned earlier. Rules in this stage were of the form if the resource is labeled as HIV then label it with R, and were based on organizational policies or patient consent. There was also a demo of preliminary natural language processing of unstructured text to infer clinical codes implied by the text but not explicitly specified as a structured code.
Note that this is admittedly a simple implementation and does not accommodate other considerations such as related resources.
From an implementation perspective, the SLS can either be its own microservice with an API that accepts a resource or a bundle and returns the labels, or it can be implemented as a module inside the FHIR server --invoked by addition of new resources for batch processing, and by a response interceptor to label the resource/bundle in real time before they're sent to the client.

I will see if I can find and share some of the reports in this thread.
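The two-pass rule structure Mohammad describes can be sketched like this. The trigger codes and rule tables are made up for illustration; a real SLS would use curated clinical value sets:

```python
# Illustrative two-pass labeling rules; "code-x" etc. are made-up stand-ins
# for real SNOMED/RxNorm codes.
SENSITIVITY_RULES = [
    # pass 1: if clinical codes x AND y AND z all appear, mark as HIV
    ({"code-x", "code-y", "code-z"}, "HIV"),
]
CONFIDENTIALITY_RULES = {
    # pass 2: driven by organizational policy or patient consent
    "HIV": "R",
}

def label_resource(clinical_codes):
    labels = []
    for trigger, sensitivity in SENSITIVITY_RULES:
        if trigger <= clinical_codes:  # all trigger codes present in the resource
            labels.append(sensitivity)
    for sensitivity in list(labels):
        conf = CONFIDENTIALITY_RULES.get(sensitivity)
        if conf:
            labels.append(conf)
    return labels

assert label_resource({"code-x", "code-y", "code-z"}) == ["HIV", "R"]
assert label_resource({"code-x"}) == []
```

As the thread notes, the same function could back either a batch sweep over stored resources or a response interceptor labeling on the fly.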

view this post on Zulip Grahame Grieve (Nov 27 2019 at 17:01):

but I believe it's true to say that no one has pushed SLS into production and dealt with the policy issues that result from that ?

view this post on Zulip k connor (Nov 29 2019 at 06:38):

Everyone participating in eHealth Exchange in the US who auto-magically puts a confidentiality code on a CDA at the header Document class - either N, R, or V - has demonstrated that they are taking responsibility for security labeling. Not that there are many confidentiality-code police - however, if an eHealth Exchange participant were to put the default N (normal for HIPAA) on a CDA that contained 42 CFR Part 2 information (i.e., didn't conform to the DS4P CDA IG guidance for security labeling of Part 2 information), then they could be fined if that CDA information were disclosed to an eHX participant who did not have Part 2 consent. Since there aren't a lot of those fines being discussed, it seems like there must be a lot of eHX participants doing the right security labeling to segregate Part 2 from HIPAA CDA information flows.

view this post on Zulip Lloyd McKenzie (Nov 29 2019 at 14:37):

Lack of fines could be an indication of correct security labeling, or it could be an indication of negligible enforcement. If the former, I'd have expected to see a bump of fines as the ignorant/lazy were informed of the error of their ways, followed by a decrease in fines as everyone learned the expectations. Have we seen that? If not, my leaning would be lax enforcement (and quite possibly poor compliance under the covers)

view this post on Zulip k connor (Nov 29 2019 at 16:58):

Another theory is that there's not a lot of Part 2 information being exchanged to avoid the compliance issues. My point is that those exchanging CDA are security labeling with the confidentiality code, whether selected intentionally based on policy or by default.

view this post on Zulip Lloyd McKenzie (Nov 29 2019 at 18:13):

Those exchanging CDA could be security labeling. Do we know how many systems actually support doing so (and are doing so in practice)? Also, protecting a document is rather different from protecting individual records.

view this post on Zulip k connor (Nov 29 2019 at 22:10):

Well, given that a conformant CDA requires a confidentiality code, which is hopefully accurate, the number of systems is at a minimum the number of participants in the US eHealth Exchange. Guess we'd need the principals from Sequoia and Commonwell to give us some numbers in the US. Then the EU also exchanges conformant CDAs, so that's another batch. Then there are the other realms that exchange CDAs. Determining the volume of CDA usage may be a good exercise for HQ.

view this post on Zulip Lloyd McKenzie (Nov 29 2019 at 22:52):

Volume of systems that produce CDAs with a confidentialityCode other than N would be most relevant.

view this post on Zulip John Moehrke (Nov 30 2019 at 16:27):

Unfortunately (or fortunately) these documents are protected resources, and therefore we don't have access to the statistics to prove or disprove. Given my short view into the workings of the eHEX, I would say that it is both ( a ) no one cares what the tags are on an incoming document (it was released or denied, so they are just going to use it for why they asked for it), AND ( b ) there are some organizations that continue to not release their 42 CFR Part 2 data for many reasons, possibly including not trusting their recipient to care (see ( a ))...
@Matt Blackmon can you comment on to what level eHEX 'tests' for this vector (security tagging N vs R)?
Note that there are changes going on in eHEX policy that would give them visibility to the metadata (aka hub architecture), thus they will soon have the ability to give us this statistic (and be exposed to a lot of PHI that normally wouldn't be knowable to them).

view this post on Zulip Didi Davis (Dec 10 2019 at 17:28):

Well, given that a conformant CDA requires a confidentiality code, which is hopefully accurate, the number of systems is at a minimum the number of participants in the US eHealth Exchange. Guess we'd need the principals from Sequoia and Commonwell to give us some numbers in the US. Then the EU also exchanges conformant CDAs, so that's another batch. Then there are the other realms that exchange CDAs. Determining the volume of CDA usage may be a good exercise for HQ.

So to start this thread, I wanted to point out that The Sequoia Project is a separate non-profit 501(c)(3) company from the eHealth Exchange and Carequality, which are also non-profits. Commonwell is an implementer of the Carequality Framework. The eHealth Exchange and Carequality leverage the C-CDA specifications, which have the following conformance:

SHALL contain exactly one [1..1] confidentialityCode, which SHOULD be selected from ValueSet HL7 BasicConfidentialityKind urn:oid:2.16.840.1.113883.1.11.16926 DYNAMIC (CONF:1198-5259).

This requires that ALL documents exchanged contain a confidentialityCode that SHOULD be selected from those provided in the spec and available in VSAC, found in Table 5: HL7 BasicConfidentialityKind Value Set: HL7 BasicConfidentialityKind urn:oid:2.16.840.1.113883.1.11.16926

N = normal
R = restricted
V = very restricted
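Concretely, the element in question is a one-liner in the CDA header. Here it is parsed with Python's stdlib just to show the shape; 2.16.840.1.113883.5.25 is the HL7 Confidentiality code system OID, and the BasicConfidentialityKind value set allows N, R, and V:

```python
import xml.etree.ElementTree as ET

# Minimal illustration of the document-level confidentialityCode a
# conformant C-CDA must carry (shape only, not a full CDA header).
elem = ET.fromstring(
    '<confidentialityCode code="N" '
    'codeSystem="2.16.840.1.113883.5.25" displayName="normal"/>'
)
assert elem.get("code") in {"N", "R", "V"}  # BasicConfidentialityKind values
assert elem.get("codeSystem") == "2.16.840.1.113883.5.25"
```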

We have no numbers to provide for the eHealth Exchange, but I can state that Carequality is exchanging 80 million documents per month that should all contain this code. I hope this helps and thanks to Kathleen for calling my attention to this thread.

view this post on Zulip John Moehrke (Dec 10 2019 at 17:31):

Hi Didi.. that doesn't help... The assertion is that no one is actually giving these codes any thought and everyone is just using "N" because they must put something into the element. So you are just confirming that indeed there is a requirement to fill out the element.

view this post on Zulip John Moehrke (Dec 10 2019 at 17:34):

Second, the 1..1 restriction is a CDA requirement. The XCA metadata element confidentialityCode is not 1..1; it allows many codes, and IHE recommends the value sets from HL7 HCS (the same value sets as in FHIR). This metadata capability to carry 0..* security/privacy codes enables more expressive obligations and restrictions. This is the scope of DS4P...

view this post on Zulip Didi Davis (Dec 10 2019 at 17:39):

Unfortunately (or fortunately) these documents are protected resources, and therefore we don't have access to the statistics to prove or disprove. Given my short view into the workings of the eHEX, I would say that it is both ( a ) no one cares what the tags are on an incoming document (it was released or denied, so they are just going to use it for why they asked for it), AND ( b ) there are some organizations that continue to not release their 42 CFR Part 2 data for many reasons, possibly including not trusting their recipient to care (see ( a ))...
Matt Blackmon can you comment on to what level eHEX 'tests' for this vector (security tagging N vs R)?
Note that there are changes going on in eHEX policy that would give them visibility to the metadata (aka hub architecture), thus they will soon have the ability to give us this statistic (and be exposed to a lot of PHI that normally wouldn't be knowable to them).

eHealth Exchange has been testing content conformance to the 2011 (HL7 CCD), 2014 (HL7 R1.1 + Companion Guide) and 2015 (HL7 R2.1 + Companion Guide) Editions of the Meaningful Use Requirements. The testing tooling produces an error if no code is included, and a warning if one of the three recommended codes is not used. Carequality is expected to formally require testing against these specifications plus the added requirements specified by the Joint Carequality/Commonwell Content Workgroup, published in 2018, found here: https://s3.amazonaws.com/ceq-project/wp-content/uploads/2018/10/03211340/Carequality_CommonWell_Improve_C-CDA_06-15-2018_V1.pdf. As was mentioned, once the Hub is deployed in production by the eHealth Exchange, more insight and statistics will be available on the number of documents exchanged.

view this post on Zulip Grahame Grieve (Dec 10 2019 at 20:51):

the real question is, how many documents do you have that don't have "N" as their confidentiality code, and what would it mean if they don't?

view this post on Zulip John Moehrke (Dec 11 2019 at 13:31):

the real question is, how many documents do you have that don't have "N" as their confidentiality code, and what would it mean if they don't?

exactly the question

view this post on Zulip John Moehrke (Dec 11 2019 at 13:41):

Didi, I suspect the only way to answer the question we really want answered is to specifically ask the endpoints in the exchange for their statistics (by percentage, so as to protect the data): percent of documents sent with just "N", percent sent with just "R", other; percent of documents received with just "N", percent received with just "R", other.

view this post on Zulip John Moehrke (Dec 11 2019 at 13:43):

Note that another factor in the exchange is that there is not a clear security assertion to express a need for highly-sensitive (R) type data. Thus it would not be unreasonable for (R) data to never be sent, thus skewing the results.

view this post on Zulip John Moehrke (Dec 11 2019 at 13:44):

And most data that would be marked (R) is marked because it falls under 42 CFR Part 2, which requires explicit consent. And it is a well-established problem that explicit consent for these data is hard to obtain (for good privacy reasons, and for user-interface and policy-language reasons)

view this post on Zulip John Moehrke (Dec 11 2019 at 13:45):

so we will have skewed data... but I expect the results to be statistically 100% "N" alone.

view this post on Zulip Lloyd McKenzie (Dec 11 2019 at 14:14):

What I'm really interested in is how many systems that create documents have ever created production documents with values other than "N" and how many systems that consume documents have behavior that's in any way different based on what the code is.

view this post on Zulip Lloyd McKenzie (Dec 11 2019 at 14:15):

I.e. Not what's the percentage of sensitive data, but rather what systems actually differentiate sensitive data and behavior around it.

view this post on Zulip John Moehrke (Dec 11 2019 at 14:51):

yes. I would also like a reality check on "N" vs anything-else... But there are various reasons that "N" might be the only thing that is live. So I am interested in "why not", because that helps us understand the likelihood for improvement.

view this post on Zulip John Moehrke (Dec 11 2019 at 14:51):

OUTBOUND:
Let me know if your experience on documents published (outbound) from your system is that :

  1. "N" is the only thing your system would ever send, as you have no instrumentation for anything other than "N"
  2. "N" is the only thing your system would ever send, as all policies restrict anything more sensitive. (that is to say your system can do it, but mostly is never configured to do it)
  3. "N" is mostly what is sent, as there is a small set of circumstances when restricted "R" could be sent. (an example is 42 CFR Part 2 data where no consent is given)
  4. You have a system that is more complex than just "N" vs "R".

INBOUND
Let me know your inbound experience:

  1. 100% of documents inbound are marked "N"
  2. Our request for data forces only "N" to be requested
  3. We see both "N" and "R" data but nothing else
  4. We see more than just "N" and "R" data

so just respond with your number for inbound and your number for outbound. Feel free to elaborate, but all I want is these two numbers. You should not need to do statistical assessment, as the difference between these buckets should be stark.

view this post on Zulip Lloyd McKenzie (Dec 11 2019 at 15:57):

Also on INBOUND, do you treat "N" vs. "R" vs. other codes differently in any way?

view this post on Zulip John Moehrke (Dec 11 2019 at 16:59):

I didn't expect anyone would be willing to say they ignore all tags.

view this post on Zulip Lloyd McKenzie (Dec 11 2019 at 17:49):

True. But they might choose to not answer a question about whether they have logic that differentiates. If few people say they differentiate, that would be telling

view this post on Zulip Jenni Syed (Dec 11 2019 at 20:10):

We have logic that differentiates on CDAs, would have to see if we have data on actual production %. We don't send anything other than N out.

view this post on Zulip Jenni Syed (Dec 11 2019 at 20:11):

If it's inbound marked R/V etc, it gets stored (and accessed) in a different way.

view this post on Zulip Jenni Syed (Dec 11 2019 at 20:18):

It sounds like the majority inbound in prod is N. We could get more detailed if needed :)

view this post on Zulip Jenni Syed (Dec 11 2019 at 20:19):

For outbound, if there's any sensitive info, the CDA doesn't get created.

view this post on Zulip Jenni Syed (Dec 11 2019 at 20:19):

as mentioned, because of consent complications

view this post on Zulip John Moehrke (Dec 11 2019 at 20:20):

thanks @Jenni Syed

view this post on Zulip k connor (Dec 11 2019 at 20:58):

The ONC 21st Century Cures Act NPRM Draft Test Procedures - see https://www.healthit.gov/sites/default/files/page/2019-03/170_315b_13_Data_segmentation_for_privacy_-_receive.pdf. The received C-CDA tagged as a restricted document includes the following data elements: the originating document Individual Author or Organization; and the Confidentiality Code constrained in accordance with the standard specified in § 170.205(o)(1) - which is "R" (restricted) for sensitive information governed by privacy laws more stringent than HIPAA (versus "N", since HIPAA is the norm of protection).
See https://www.healthit.gov/sites/default/files/page/2019-03/170_315b_12_Data_segmentation_for_privacy_-_send.pdf for the proposed send capability: the user will generate a summary record document(s) from the Health IT Module and submit the document(s) to the tester for verification. The generated summary record includes the following data elements: Document Level Confidentiality Code constrained in accordance with the standard specified in § 170.205(o)(1), HL7 Version 3 Implementation Guide: Data Segmentation for Privacy (DS4P), Release 1 - which is "R" (restricted) for sensitive information governed by privacy laws more stringent than HIPAA.

view this post on Zulip René Spronk (Dec 18 2020 at 11:02):

I'm in the process of creating training material to cover 'security labels'. It'll first cover the wider access control context, SLS, and then focus on the FHIR aspects thereof. Let's see whether I understand the context of the labeling process correctly:

Let's say system A receives a request for a resource R: R could already have security labels associated with it, created by some other system (which A imported into its persistence layer), and A might run an SLS (once upon storing R, or whenever R is modified) to assign new labels or update old labels on R.
The access control engine could also use an SLS at run-time (upon querying) to assign run-time labels (especially dynamic labels) and stick them onto R, to inform the querying application (let's call that application Q) of such labels. Q would at least have to persist R with its non-dynamic labels.

In XACML terms, an SLS (if present) would effectively be grouped with a PDP, or could be grouped with a PAP.

Correct?

view this post on Zulip René Spronk (Dec 18 2020 at 11:23):

The big question is obviously: if one has a XACML-based policy registry, what access control rules would one support using a privacy label (=labeling based on a XACML policy) versus a SDS (=access control decisions based on policies, and labels related thereto)? The key differentiator is the stickiness of the labels when storing a resource, in case one assumes that a retrieving system has the capability to support security labels, but does not support SDS, or does not support (the same set of) policies.

view this post on Zulip René Spronk (Dec 18 2020 at 11:26):

@k connor @Mohammad Jafari

view this post on Zulip René Spronk (Dec 18 2020 at 12:47):

On the latter topic, John also blogged about such questions: https://healthcaresecprivacy.blogspot.com/2019/02/segmenting-sensitive-health-topics.html

view this post on Zulip René Spronk (Dec 18 2020 at 12:54):

Looking at the HCS label categories - Confidentiality, Sensitivity, Privacy Law, Integrity, Compartment, Handling caveats - in what order would you list those in terms of actual use (from high to low)? Either what systems are using right now, or what you think the usage will be within the next few years. It would seem that Confidentiality, Sensitivity and Handling caveats are the top 3. Would you agree with that assessment?

view this post on Zulip k connor (Dec 18 2020 at 14:43):

Hi Rene - a lot of questions. Starting with your last one. See http://hl7.org/implement/standards/fhir/uv/security-label-ds4p/2020May/spec.html for the FHIR version of HCS. Also see https://confluence.hl7.org/display/SEC/Security+Labels and its leaf pages. There are 3 Named Tag Sets: Security Classification, Security Category, and Security Control. Security Classification, which is the level of confidentiality protection, has only the Confidentiality Tag Set, and a label must have exactly one (1..1). Security Category, which is a security/privacy attribute of the labeled content, includes sensitivity and, importantly, the policy tag, as well as others. Labeled content may have 0..* Security Category tags. Security Control tags convey the permissible purposes of use, obligations, refrains, and privacy marks. Labeled content may have 0..* Security Control tags.
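That tag-set structure can be made concrete with a small sketch of a FHIR meta.security array (written as Python for checkability; the codes R, HIV, and NORDSCLCD are real HL7 codes, but the particular combination is a made-up example):

```python
# Sketch of a FHIR Resource.meta.security carrying all three tag sets:
# exactly one confidentiality tag (Security Classification), plus optional
# sensitivity (Security Category) and refrain (Security Control) tags.
CONF_SYSTEM = "http://terminology.hl7.org/CodeSystem/v3-Confidentiality"
ACT_CODE_SYSTEM = "http://terminology.hl7.org/CodeSystem/v3-ActCode"

meta_security = [
    {"system": CONF_SYSTEM, "code": "R", "display": "restricted"},                    # 1..1
    {"system": ACT_CODE_SYSTEM, "code": "HIV", "display": "HIV/AIDS information sensitivity"},  # 0..*
    {"system": ACT_CODE_SYSTEM, "code": "NORDSCLCD",
     "display": "no redisclosure without consent directive"},                         # 0..*
]

# the classification tag set requires exactly one confidentiality tag
conf_tags = [t for t in meta_security if t["system"] == CONF_SYSTEM]
assert len(conf_tags) == 1
```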

view this post on Zulip k connor (Dec 18 2020 at 14:51):

RE 1st set of questions - See HL7 SLS spec http://www.hl7.org/implement/standards/la.cfm?file=/documentcenter/private/standards/v3/V3_SECURITY_LABELSRV_R1_2014JUN_R2019JUN.zip The component that runs at access decision time along with PAP is the SLS. The component that runs after the PDP to assign a label to the authorized disclosure is a PPS (Privacy Protective Service). In the FHIR DS4P IG, we've added an extension that allows a label to include a reference to an external artifact, which could be the XACML policy, so it too can "stick" to the disclosed content.

view this post on Zulip k connor (Dec 18 2020 at 15:12):

RE Which tags are used most and in what order: Order doesn't matter as long as the label is differentiated by policy - which is what we use the sec-label-basis extension for in the FHIR DS4P IG http://hl7.org/implement/standards/fhir/uv/security-label-ds4p/2020May/StructureDefinition-extension-sec-label-basis.html. Cardinality matters for Confidentiality - it must be 1..1 per label. The tags used should be sufficient to convey the policy. Sometimes part of the policy is conveyed outside of the label, e.g., NHIN requests contain the purpose of use (POU). Since the prevailing policy is HIPAA, a Confidentiality code = N is sufficient. However, with Part 2 (current, not the To Be revision) governed content where the POU is set outside of the label, it is important to include the refrain tag NORDSCLCD (Prohibition on redisclosure without patient consent directive). In some cases, it is important to include the Sensitivity tag so that the recipient knows which information is governed by the label. E.g., if a C-CDA includes service information governed by HIPAA which the patient has paid for in full, a provider will need to know which information may not be disclosed to a payer. The safest practice is to include all of the tags for the policy governing the information. If a policy domain decides on a consensus security label for each privacy/security policy important to the domain, then recipients would be able to configure their access control systems to recognize an entire label. There doesn't have to be fine-grain parsing of each label to figure out how to comply with the policy.

view this post on Zulip k connor (Dec 18 2020 at 15:36):

@Jenni Syed RE "Jenni Syed: For outbound, if there's any sensitive info, the CDA doesn't get created; as mentioned, because of consent complications" - Curious how that's going to work with CURES Information Blocking: if the patient consented to disclosure of restricted sensitive information, or the requester is authorized to receive restricted information (where consent is not required), then not sending the C-CDA would not meet the privacy exemption, from what I can tell. Another thought is that not being able to computably segment/label data may not meet the Information Blocking infeasibility exemption if the sender could have manually redacted/labeled the data, e.g., in a UI. Interestingly, entities required to label CUI are now examining the labeling capabilities of commonly deployed content/security management and data loss protection products such as Varonis, McAfee, and Microsoft Information Protection. Some of these are able to label computably, and some allow end users to label using labeling templates.

view this post on Zulip René Spronk (Dec 18 2020 at 16:03):

Right - that's a lot of details, which I'll read up on. This conversation serves as input in the creation of international (non-US specific) training material, and we'll probably talk about security labels for 20-30 minutes. The training material won't mention CDA at all (it is a FHIR training, one can't assume any prior knowledge of CDA - which most attendees in FHIR courses won't have).

One of the things to explain is which types of security labels are actually used, and to show attendees a list of which types of security labels are most commonly implemented. That question is not about how we would like projects to use labels, but about actual use.

Re: 1st set of questions: thanks, that's helpful, I'll read the SLS spec.

view this post on Zulip John Moehrke (Dec 22 2020 at 15:16):

@René Spronk I presume you got the answers to your questions that I would give by reading my blog... Let me know what I might need to clarify

view this post on Zulip John Moehrke (Dec 22 2020 at 15:17):

As you indicate, the "when" the SLS is applied is a design choice, usually policy driven. Doing it as a PIP on a runtime request for data assures that the data are most accurately assessed against current clinical knowledge and policy norms. BUT doing this at runtime will have a performance impact. Doing this at import, and just using existing tags at runtime, might be more energy efficient; but it will not catch cases where clinical facts change over time from sensitive to non-sensitive (like HIV did), or from non-sensitive to sensitive.

view this post on Zulip John Moehrke (Dec 22 2020 at 15:19):

Note that the SLS tends to just mark the data with sensitivity codes, classifying the data into various sensitivity buckets. This is a PIP. These sensitivity codes tend to get converted into access decisions, and those decisions tend to apply obligations and refrains, and scope use to purposes... with the resulting released data not including the sensitivity codes that the SLS came up with.

view this post on Zulip John Moehrke (Dec 22 2020 at 15:23):

First priority for an organization to implement is Policy... that policy might declare that all data released has implied tags of X + Y + Z. Thus the data at runtime does not need to have these tags, limiting accidents and bandwidth overhead. Where X + Y + Z tends to be a group of authorized purpose of use (Treatment + Payment), a group of refrains (do not re-disclose without further consent, persist encrypted, attribution to source), and a confidentiality code of Normal. -- which is also my proposal for the priority of the use of the valuesets in the HCS vocabulary.

view this post on Zulip John Moehrke (Dec 22 2020 at 15:25):

More mature is an environment that recognizes multiple levels of access, where purposeOfUse must be declared to communicate intentions. Where obligations must be declared on data as they can't be implied by policy. Where multiple level access controlled data are carried in the same bundle because the recipient system is trusted, but where the users at that recipient system are known to include legitimate access by lesser-authorized users. These are the reasons for inline tagging, vs implied tagging.

view this post on Zulip René Spronk (Dec 22 2020 at 16:16):

I can see that SLS would be PIP, and I'll certainly cover the considerations of assigning labels at run-time (upon disclosure) or batch-like. Policies should be defined first (and this is mostly lacking in 'affinity domains'). There may be implicit defaults, these may have to be made explicit when disclosing data.

view this post on Zulip René Spronk (Dec 23 2020 at 08:23):

When it comes to guidance for FHIR servers, they should A) enforce the handling caveats associated with security labels, and B) restrict the ability of clients to modify security labels, to ensure their integrity, and support the $meta operations.

Is there any additional guidance as to 'category B' ?
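For reference, the $meta operations mentioned under B) are the standard FHIR operations $meta, $meta-add, and $meta-delete, which read or change a resource's meta without a full update. A minimal sketch of building the Parameters payload that $meta-add expects (it would be POSTed to a URL like `[base]/Patient/123/$meta-add`; the base URL and the chosen label are placeholders):

```python
def build_meta_add(security_codings: list) -> dict:
    """Build the Parameters resource that FHIR's $meta-add operation expects:
    a single parameter named "meta" carrying a Meta with the labels to add."""
    return {
        "resourceType": "Parameters",
        "parameter": [{
            "name": "meta",
            "valueMeta": {"security": security_codings},
        }],
    }

# Example: ask the server to add a Restricted confidentiality label.
payload = build_meta_add([{
    "system": "http://terminology.hl7.org/CodeSystem/v3-Confidentiality",
    "code": "R",
}])
```

Whether a server should honor such a request from a given client is exactly the authorization question being discussed here.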

view this post on Zulip John Moehrke (Dec 23 2020 at 13:11):

Are you asking about how servers "functionally" handle the data persisted within themselves? We generally stay away from that. We define the requirements on import (create/update) and on export (read/delete), letting you handle persistence however works for you.

view this post on Zulip John Moehrke (Dec 23 2020 at 13:25):

Note that there are authorized reasons to change .meta.security (a patient explicitly tagging data they want handled with higher confidentiality, a privacy office needing to put a hold on data under investigation, a police investigation, public health reporting). The $meta operations are only helpful because they were designed so that using them to modify .meta.security could function without changing _lastUpdated. Changing that might be desired by policy, or might not.

However, (B) can only be examined by the server on an update, where the server has knowledge of the previous state of the tags. So it is important to cast your (B) carefully, as there are many considerations. I am not convinced that the $meta operations will be found as important/useful as originally envisioned.

Lastly, the (B) effect might simply happen because of policy-driven design. That is to say, if on a read a client might get a filtered .meta.security, then on an update the server can't trust the .meta.security values the client offers; the server must dump them and keep what it has internally. An example might be where the tags are more privacy-exposing than the data, so that tag always gets removed; or where a runtime SLS is used to apply tags that are never persisted; or where obligation tags are applied specific to that client/user. These round-trip issues make all imports (create/update) difficult, not just imports from a local trusted app. For example, when importing from another organization, their tags might not apply at all, yet they do need to be considered relative to the trust-framework agreements and patient consents.
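The round-trip problem described above can be sketched in a few lines: if clients may receive a filtered .meta.security on read, the server cannot trust the tags a client sends back on update. One minimal policy (of many possible ones) is to discard client-supplied labels and keep the server's persisted ones; the function and data shapes here are illustrative.

```python
def apply_update(persisted: dict, incoming: dict) -> dict:
    """Accept the client's new resource content, but ignore whatever
    security labels the client offered and keep the server's own labels.
    This is one possible policy; merging is another (see the discussion)."""
    updated = dict(incoming)                       # take the client's content
    updated["meta"] = dict(updated.get("meta", {}))
    server_security = persisted.get("meta", {}).get("security", [])
    updated["meta"]["security"] = list(server_security)
    return updated
```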

view this post on Zulip John Moehrke (Dec 23 2020 at 13:27):

Which gets to your (A) relative to a server importing data from another organization: it needs to inspect the incoming .meta.security and decide what to do. I would like to think that the other organization didn't apply tags that have no value or carry no expectation, but it is possible. And they might have been overly tag-happy, tagging things that could have been implied under the trust-framework.

view this post on Zulip John Moehrke (Dec 23 2020 at 13:36):

Relative to Obligations, Refrains, and PurposeOfUse... I think these should only appear in Bundle.meta.security, as they are contextually related to the authorization decision for the request/response and generally are not specific to the data itself (they are not strictly "meta" about the "data"; they are "meta" about the "communication"). Thus I would tend to recommend against putting these value sets on data. This is not to say that a system upon import couldn't put them on its persisted data; that is its design choice. I might choose to put these into the Provenance, though I would rather see us get Permission done so I could put them there. I don't think this is in violation of the context-conduction forbiddance principle that FHIR has, but I recognize there are perspectives that differ from mine.

view this post on Zulip Peter Jordan (Dec 23 2020 at 20:10):

John Moehrke said:

Relative to Obligations, Refrains, and PurposeOfUse... I think these should only appear in Bundle.meta.security, as they are contextually related to the authorization decision for the request/response and generally are not specific to the data itself (they are not strictly "meta" about the "data"; they are "meta" about the "communication"). Thus I would tend to recommend against putting these value sets on data. This is not to say that a system upon import couldn't put them on its persisted data; that is its design choice. I might choose to put these into the Provenance, though I would rather see us get Permission done so I could put them there. I don't think this is in violation of the context-conduction forbiddance principle that FHIR has, but I recognize there are perspectives that differ from mine.

Looking forward to progress on the Permission Resource! I'll see if I can get the Security SMEs here in NZ involved.

view this post on Zulip René Spronk (Dec 24 2020 at 14:14):

Sigh - the security topic is overloaded with all sorts of careful considerations. But in a training course (where a topic like security labels already falls into the 'advanced topics' category) I have at most 30 minutes to spend on it ;-) .. so whilst I appreciate the considered and detailed feedback, which provides me (as a trainer) with a lot of context, I still have to condense the material and drop some of the nuances/subtleties that a newbie to this subject won't understand anyway. That's my job as a trainer, so no complaints there. But I will try not to stray too far from what the experts on this forum regard as the 'minimal viable content' ..

Nowadays we have to make statements about server-side implementation aspects, albeit not in a normative manner, but as guidance, or as a statement of observed best practices. Hopefully we'll be able to do that; otherwise those who study this subject will simply drown in the level of detail, without us providing guidance as to e.g. how this is currently used in projects, or how it should be used in practice. And I'm aware that many of you work on this stuff in real-life projects, so I value your suggestions and hints.

The current version of my server-side-supply slide (talking points, so I'll elaborate when presenting this):

  • Enforce the handling caveats associated with security labels
  • Process/filter labels on imported data
    ** On update, “merge” inbound labels with pre-existing ones

  • Restrict ability of clients to modify security labels to ensure their integrity
    ** May be helpful to support the $meta operations

  • Upon disclosure, scrub security labels
    ** Tags themselves may be sensitive
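The "scrub security labels upon disclosure" talking point can be sketched as follows: remove labels that are more privacy-exposing than the data itself, assuming a policy-defined set of codes that must never leave the server. The code set here is hypothetical; what to scrub is entirely a local policy decision.

```python
# Hypothetical set of sensitivity codes that policy says stay server-side,
# because the tag itself reveals what the data is about.
INTERNAL_ONLY = {"HIV", "ETH", "PSY"}

def scrub_for_disclosure(resource: dict) -> dict:
    """Return a copy of the resource with internal-only security labels removed."""
    out = dict(resource)
    meta = dict(out.get("meta", {}))
    meta["security"] = [
        coding for coding in meta.get("security", [])
        if coding.get("code") not in INTERNAL_ONLY
    ]
    out["meta"] = meta
    return out
```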

view this post on Zulip John Moehrke (Dec 24 2020 at 15:16):

Understood. Any progress at educating is good. I just want to express that the area is very broad and very deep. I am happy if they go away understanding the tea-cup you offer, and understanding that it was just a tea-cup.


Last updated: Apr 12 2022 at 19:14 UTC