FHIR Chat · Payload Encryption · bulk data

Stream: bulk data

Topic: Payload Encryption


view this post on Zulip Karl M. Davis (Dec 17 2018 at 16:36):

Has anyone from the CMS side mentioned our work on adding in an additional layer of payload encryption to our bulk data server implementation?

view this post on Zulip Karl M. Davis (Dec 17 2018 at 16:36):

cc @Sutton, @Ryan Nagle

view this post on Zulip Ryan Nagle (Dec 18 2018 at 17:28):

to add some color, we've been working on an encryption strategy that uses a public key provided by the API client to secure a payload so that it is protected on both sides of the transfer, only accessible to the party that possesses the corresponding private key.

view this post on Zulip Ryan Nagle (Dec 18 2018 at 17:31):

the initial request/download process matches the current spec, but we've added some fields to responses from our job status endpoint to indicate the key used to protect each file

view this post on Zulip Ryan Nagle (Dec 18 2018 at 17:32):

we're planning on providing additional documentation to API consumers to explain how to handle the files once they're in their possession

view this post on Zulip Ryan Nagle (Dec 18 2018 at 17:36):

anyway, thought we'd share the idea here in the event anyone has thoughts/feedback for us

view this post on Zulip Ryan Nagle (Dec 18 2018 at 17:38):

happy to share more here or arrange a call, feel free to @-msg me :thumbs_up:

view this post on Zulip Michele Mottini (Dec 18 2018 at 18:42):

Do you have a test server?

view this post on Zulip John Moehrke (Dec 18 2018 at 18:43):

what is the risk that this payload encryption is protecting against? I am starting with the requirement for TLS, and OAuth client authentication/authorization. So I am not clear what risk this payload encryption is protecting against. Without understanding of the risk, it is hard to determine if this solution has reduced that risk. I am a fan of encryption, but it needs to be used properly. Adding more encryption does not always lower risk. For example a payload encryption presents an "Availability" risk caused by loss of keys.

view this post on Zulip Ryan Nagle (Dec 18 2018 at 18:55):

currently, we do not have a public test server

view this post on Zulip Michele Mottini (Dec 18 2018 at 18:56):

ok, thanks

view this post on Zulip Ryan Nagle (Dec 18 2018 at 20:04):

one risk we've discussed is the exposure of client credentials. if a client's credentials are compromised and an attacker successfully queries for their data, the payload is effectively useless to them without also gaining access to the private key.

view this post on Zulip Ryan Nagle (Dec 18 2018 at 20:06):

it happens that this approach also satisfies some compliance with CMS acceptable risk safeguards with regard to protecting sensitive data at-rest

view this post on Zulip Ryan Nagle (Dec 18 2018 at 20:08):

that's the gist of it, would love to hear from you all. thanks for the time!

view this post on Zulip Grahame Grieve (Dec 18 2018 at 20:41):

if the client credentials are exposed, you can just repeat the query?

view this post on Zulip Ryan Nagle (Dec 18 2018 at 21:32):

right -- for example, if the client is not aware their credentials have been compromised / they have not been revoked, an attacker could use the creds to initiate a new request

view this post on Zulip Isaac Vetter (Dec 18 2018 at 21:51):

Hey Ryan, I thought the intent of the Backend Services spec was to limit the use of OAuth2 client_credentials to only "asymmetric" authentication mechanisms (e.g. public/private key pairs), which are further, optionally, protected by rotating the keys referenced in the JSON Web Key Set.

view this post on Zulip Isaac Vetter (Dec 18 2018 at 21:51):

The rotating JWK Set not only provides public/private key pairs as authentication within the existing spec, but also includes the ability to rotate these keys pretty quickly.
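
A minimal sketch of the rotation idea, for readers following along: a server verifying a signed client assertion picks the public key whose `kid` matches the JWT header, so a client can rotate by publishing a new key and dropping the old one. The JWK Set contents, the `kid` values, and the helper names are hypothetical, and signature verification itself is left out.

```python
import base64
import json


def b64url_decode(segment: str) -> bytes:
    """Decode unpadded base64url, as used in JWT segments."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def b64url_encode(raw: bytes) -> str:
    """Encode bytes as unpadded base64url."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def select_verification_key(jwt: str, jwks: dict) -> dict:
    """Return the JWK whose `kid` matches the JWT header, or raise KeyError."""
    header = json.loads(b64url_decode(jwt.split(".")[0]))
    kid = header.get("kid")
    if kid is None:
        raise KeyError("JWT header carries no kid")
    for key in jwks.get("keys", []):
        if key.get("kid") == kid:
            return key
    raise KeyError(f"no key with kid={kid!r} in JWK Set")


def fake_jwt(kid: str) -> str:
    """Build an unsigned, JWT-shaped string for the demo (hypothetical helper)."""
    header = b64url_encode(json.dumps({"alg": "RS384", "kid": kid}).encode())
    return f"{header}.e30.signature-not-checked-here"


# Hypothetical JWK Set after a rotation: only the newer key is still published.
# The modulus "n" is a placeholder, not a real key.
jwks_after_rotation = {
    "keys": [{"kty": "RSA", "kid": "2018-12", "n": "placeholder", "e": "AQAB"}]
}

current = select_verification_key(fake_jwt("2018-12"), jwks_after_rotation)
```

An assertion signed with a retired key (say `kid` "2018-06") raises `KeyError` here, which is the point of rotating: a leaked old private key stops being useful once its public half leaves the set.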

view this post on Zulip Isaac Vetter (Dec 18 2018 at 21:51):

Do you think we should modify this section of the spec to also encrypt the payload?

view this post on Zulip Ryan Nagle (Dec 18 2018 at 22:38):

hey isaac, thanks for the reply

view this post on Zulip Ryan Nagle (Dec 18 2018 at 22:39):

i should've mentioned up front -- we're still exploring options for our auth/n/z strategy and don't know for sure if the backend services spec will 100% work for us.

view this post on Zulip Ryan Nagle (Dec 18 2018 at 22:40):

in my example above, i was assuming the typical client_credentials workflow (i.e., using a client_id and client_secret)

view this post on Zulip Ryan Nagle (Dec 18 2018 at 22:41):

in all events, i'm happy to share more re: our experience working with the backend services spec, but it'll probably be a while before we have coherent feedback :-)

view this post on Zulip Ryan Nagle (Dec 18 2018 at 22:46):

to answer your question, i don't think the spec should necessarily require payload encryption. it makes sense given our requirements, but might not work for others.

view this post on Zulip Ryan Nagle (Dec 18 2018 at 22:54):

i suppose i'm wondering where you all see these sorts of modifications/addenda for specific use cases fitting in with the spec in general, if at all

view this post on Zulip Grahame Grieve (Dec 19 2018 at 03:15):

Ryan Nagle said: "if the client is not aware their credentials have been compromised / they have not been revoked, an attacker could use the creds to initiate a new request"

So what does encryption achieve? not much....

view this post on Zulip Josh Mandel (Dec 19 2018 at 03:22):

I'm just catching up here -- thanks for the discussion. I'm on the same page as John: would definitely help to understand specific threats.

view this post on Zulip Josh Mandel (Dec 19 2018 at 03:23):

A few quick notes:

  • The current Bulk Data spec is written independent of the Backend Services security spec (but they're designed to work well together).

  • Modifications/addenda are always the trick when it comes to standardizing an end-to-end system for data exchange. If you introduce a modification to a protocol, and it's required for clients who talk to your system, then your system is no longer compatible with generic clients.

  • This means that, especially for modifications/addenda that are general-purpose or important (or likely to be made mandatory by some servers), we'd like to understand these up-front as much as possible, and incorporate them into our initial design.

view this post on Zulip Josh Mandel (Dec 19 2018 at 03:24):

As Isaac points out, the client's credentials never go over the wire in our current Backend Service spec; they're used only for asymmetric signatures, and can be rotated by the client.

view this post on Zulip Karl M. Davis (Dec 19 2018 at 16:50):

It's also worth pointing out the reason we have compliance issues here that we're trying to address: In general, our risk mitigation policies aren't thrilled with the idea of "we have a server on the open internet sitting around with a ton of bulk data PII/PHI on it."

That leaves us only one remote code execution vulnerability away from "oops, someone just scraped all the waiting-for-delivery payloads off our disk and sent them to <bad-actor-of-your-choice>." And while that absolutely is a low probability event (those vulns don't come along very often), the pain if it happens is high enough to warrant extra mitigation effort. The reasonablest approach we could get agreement on was ensuring that files weren't sitting effectively-unencrypted at rest (disk-level or single-key encryption aren't helpful here).

@Ryan Nagle Is that an accurate summary of the compliance concerns?

view this post on Zulip Josh Mandel (Dec 19 2018 at 16:54):

It would be good to describe technically how the approach you're outlining (re: per-client encryption) could be built on top of the current specs (which have the potential advantage that a client's public key is already known by a server at the time an export request is issued).

view this post on Zulip Josh Mandel (Dec 19 2018 at 16:56):

(BTW, I really appreciate the background here -- thanks for engaging in the discussion!)

view this post on Zulip Ryan Nagle (Dec 19 2018 at 17:01):

@Karl M. Davis accurate in my opinion, thanks for chiming in. :thumbs_up:

view this post on Zulip John Moehrke (Dec 19 2018 at 17:01):

Isn't that risk usually addressed by operational security technology such as encrypting databases, encrypting filesystems, etc? Things that don't need to affect the Interoperability layer?
Further by indicating it is per-client encryption, this means that the data is available on that server in non-encrypted form so that it can be encrypted to a targeted client.
What I am worried about is added complexity at the Interoperability layer caused by security-theater.

view this post on Zulip Josh Mandel (Dec 19 2018 at 17:03):

Is there a write-up on how you're approaching this from a technical perspective, @Ryan Nagle ?

view this post on Zulip Josh Mandel (Dec 19 2018 at 17:07):

@John Moehrke I agree it's important to nail down these details, with clear discussion about threats. In practice, one very common error (outside of remote code execution vulnerabilities on cloud VMs) is for a server to push files to a cloud storage bucket (for a client to fetch from) and then someone accidentally makes that bucket public. Plenty of stories in the news on this kind of misconfiguration, and the additional encryption does mitigate the risk. It all comes back to the threats we're most concerned about.

view this post on Zulip Ryan Nagle (Dec 19 2018 at 17:07):

@Josh Mandel we're working on that as we speak, we should have a solid draft in the next few weeks

view this post on Zulip Karl M. Davis (Dec 19 2018 at 17:09):

@John Moehrke Those aren't effective mitigation strategies, because -- from the standpoint of an attacker with a code execution exploit -- the encryption doesn't exist. Filesystem, DB, etc. encryption only really protects from the, "someone robbed the data center and walked away with the disks," risk, which... okay, but isn't a real big concern.

view this post on Zulip Josh Mandel (Dec 19 2018 at 17:09):

@Ryan Nagle ok -- on the timing here, is there an intention to describe your approach as a "small diff" building on top of the specification we have today? I think it could be a useful exercise, but if you want to go this route it would be good to think about that from the beginning.

view this post on Zulip Karl M. Davis (Dec 19 2018 at 17:13):

It's also worth mentioning that we here at CMS (and the VA and the DoD) have a very different risk profile from your average provider or payer considering use of the bulk data API.

view this post on Zulip John Moehrke (Dec 19 2018 at 17:13):

Josh Mandel said: "@John Moehrke I agree it's important to nail down these details, with clear discussion about threats. In practice, one very common error (outside of remote code execution vulnerabilities on cloud VMs) is for a server to push files to a cloud storage bucket (for a client to fetch from) and then someone accidentally makes that bucket public. Plenty of stories in the news on this kind of misconfiguration, and the additional encryption does mitigate the risk. It all comes back to the threats we're most concerned about."

For this I would agree that encrypting the blob is a good and proper solution.

view this post on Zulip Karl M. Davis (Dec 19 2018 at 17:15):

(Actually, KMS is pretty awesome and does protect against the, "we just oopsed our S3 ACLs," risk. But the point, in general, is still a good one: there's lots of ways to footgun yourself and we have to take a defense-in-depth approach, which the payload encryption is a part of.)

view this post on Zulip Ryan Nagle (Dec 19 2018 at 17:17):

@Josh Mandel that's more or less correct. in short, everything about the request and delivery process matches the current state of the bulk data spec, the difference comes in the "post-processing" of the payload.

view this post on Zulip Josh Mandel (Dec 19 2018 at 17:18):

Is there a negotiation or some protocol over the wire to determine what sort of encryption will be used, or to communicate to the client what has been used?

view this post on Zulip Ryan Nagle (Dec 19 2018 at 17:22):

that's a good question. at present, we're using AES GCM and planned on specifying that in our documentation. i can't see us offering other options for encryption, since we're limited in what we can use cipher-wise for compliance. but, if it were generalized, i think communicating that as part of file delivery makes sense.

view this post on Zulip Ryan Nagle (Dec 19 2018 at 17:26):

you mentioned the advantage of having a client pub key already in-hand -- this is exactly what we're hoping to leverage. but, like i said, we're still working on our story for client registration, client credentials workflow and so on. we'll have more we can share in the coming weeks.
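
For readers following along, a minimal sketch of the general shape being described: the server encrypts a file with a fresh AES-GCM key and wraps that key with the client's public key, so only the holder of the private key can read the payload. This is not BCDA's implementation. It assumes the third-party `cryptography` package, an RSA client key, and RSA-OAEP for key wrapping, none of which are stated in the thread; the function names are made up.

```python
import os

from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

# Assumption: RSA-OAEP with SHA-256 is used to wrap the symmetric key.
OAEP = padding.OAEP(
    mgf=padding.MGF1(algorithm=hashes.SHA256()),
    algorithm=hashes.SHA256(),
    label=None,
)


def server_encrypt(ndjson: bytes, client_public_key):
    """Encrypt one export file; return (wrapped_key, nonce, ciphertext)."""
    file_key = AESGCM.generate_key(bit_length=256)  # fresh symmetric key
    nonce = os.urandom(12)  # 96-bit GCM nonce, never reused with this key
    ciphertext = AESGCM(file_key).encrypt(nonce, ndjson, None)
    wrapped_key = client_public_key.encrypt(file_key, OAEP)
    return wrapped_key, nonce, ciphertext


def client_decrypt(wrapped_key, nonce, ciphertext, client_private_key) -> bytes:
    """Unwrap the symmetric key with the private key, then decrypt the file."""
    file_key = client_private_key.decrypt(wrapped_key, OAEP)
    return AESGCM(file_key).decrypt(nonce, ciphertext, None)


# Demo with a throwaway key pair standing in for a registered client key.
client_private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
payload = b'{"resourceType":"Patient","id":"example"}\n'
wrapped, nonce, sealed = server_encrypt(payload, client_private_key.public_key())
recovered = client_decrypt(wrapped, nonce, sealed, client_private_key)
```

With this shape the server never needs to keep the plaintext symmetric key once the file is written, which is what makes the at-rest copy useless to someone who only scrapes the disk or the bucket.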

view this post on Zulip Josh Mandel (Dec 19 2018 at 17:40):

Interesting ; so for the AES approach, you'll define some way to communicate a symmetrically shared AES key to the client. Will it be one AES key per client, or per export job, or per exported file? Given the threat model you outlined, I'd expect one AES key per job, which could be communicated directly as part of the JSON manifest file for the job when it's complete. If that's vaguely right, we'd be talking about something like adding a parameter to the manifest payload to communicate this key, which would be populated at the server's discretion when exported-file-encryption was enabled.
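
A rough sketch of what "adding a parameter to the manifest" could look like, for illustration only. The `encryption` object and its field names are hypothetical and not part of the Bulk Data spec, and the sketch assumes the per-job key is carried in wrapped (already encrypted) form rather than in the clear.

```python
import base64
import json
from typing import List, Optional, Tuple


def build_manifest(request_url: str, transaction_time: str,
                   files: List[Tuple[str, str]],
                   wrapped_job_key: Optional[bytes] = None) -> dict:
    """Build a completion manifest, optionally with a per-job key entry."""
    manifest = {
        "transactionTime": transaction_time,
        "request": request_url,
        "requiresAccessToken": True,
        "output": [{"type": rtype, "url": url} for rtype, url in files],
        "error": [],
    }
    if wrapped_job_key is not None:
        # Hypothetical extension: populated only when the server has
        # exported-file encryption enabled.
        manifest["encryption"] = {
            "algorithm": "AES-256-GCM",
            "keyScope": "job",
            "wrappedKey": base64.b64encode(wrapped_job_key).decode("ascii"),
        }
    return manifest


def read_wrapped_key(manifest_json: str) -> Optional[bytes]:
    """Client side: return the wrapped key, or None for an unencrypted job."""
    encryption = json.loads(manifest_json).get("encryption")
    if encryption is None:
        return None
    return base64.b64decode(encryption["wrappedKey"])


example = build_manifest(
    "https://bulk.example.org/fhir/Patient/$export",
    "2019-01-03T14:45:00Z",
    [("Patient", "https://bulk.example.org/files/patient_1.ndjson")],
    wrapped_job_key=b"\x00\x01\x02\x03",
)
```

Because the field is optional, a generic client that ignores unknown manifest members still parses the document, although it would not be able to read the files themselves.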

view this post on Zulip Karl M. Davis (Dec 19 2018 at 18:23):

If I was reading the spec right however many months ago, one area of potential non-compliance is that it specifically says the payload must be an NDJSON file, and doesn't it also call out a MIME type or two?

view this post on Zulip Karl M. Davis (Dec 19 2018 at 18:23):

That's not hard to fix, I'm sure, but still worth noting.

view this post on Zulip Karl M. Davis (Dec 19 2018 at 18:27):

@Ryan Nagle Are you still thinking of moving to streaming encryption at some point? Does that impact this conversation at all?

view this post on Zulip Ryan Nagle (Dec 19 2018 at 18:48):

@Karl M. Davis yep, i believe you're right about the payload format. good point.

view this post on Zulip Ryan Nagle (Dec 19 2018 at 18:49):

we're taking a pretty naive approach to payload encryption to start, encrypting the entire file as a single message

view this post on Zulip Ryan Nagle (Dec 19 2018 at 18:50):

we're going to have to look at other options, including chunking large files into smaller contiguous messages so that both the server and client don't have to hold potentially very large files in memory to decrypt

view this post on Zulip Ryan Nagle (Dec 19 2018 at 18:56):

i should say server side to encrypt, client side to decrypt
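
One possible shape for the chunked approach, sketched for readers following along: each chunk is sealed separately with AES-GCM and written as a length-prefixed frame, so neither side holds the whole file in memory. The frame layout, chunk size, counter-based nonce, and the final-chunk marker are all assumptions for illustration (the thread does not specify a format), and the sketch relies on the third-party `cryptography` package.

```python
import io
import struct

from cryptography.hazmat.primitives.ciphers.aead import AESGCM

CHUNK_SIZE = 64 * 1024  # plaintext bytes per chunk; an arbitrary choice


def _nonce(index: int) -> bytes:
    # Counter-based 96-bit nonce: unique only if the key is used for one file.
    return index.to_bytes(12, "big")


def encrypt_stream(key: bytes, src, dst, chunk_size: int = CHUNK_SIZE) -> None:
    """Read src in chunks and write [4-byte length][sealed chunk] frames."""
    aead = AESGCM(key)
    index = 0
    chunk = src.read(chunk_size)
    while True:
        next_chunk = src.read(chunk_size)
        final = not next_chunk
        # The final-chunk flag is authenticated so truncation is detectable.
        sealed = aead.encrypt(_nonce(index), chunk, b"\x01" if final else b"\x00")
        dst.write(struct.pack(">I", len(sealed)))
        dst.write(sealed)
        if final:
            return
        chunk, index = next_chunk, index + 1


def decrypt_stream(key: bytes, src, dst) -> None:
    """Reverse encrypt_stream; raises InvalidTag on tampering or truncation."""
    aead = AESGCM(key)
    index = 0
    header = src.read(4)
    if len(header) != 4:
        raise ValueError("missing first frame header")
    while True:
        (length,) = struct.unpack(">I", header)
        sealed = src.read(length)
        if len(sealed) != length:
            raise ValueError("truncated frame")
        header = src.read(4)
        final = len(header) == 0
        if not final and len(header) != 4:
            raise ValueError("truncated frame header")
        dst.write(aead.decrypt(_nonce(index), sealed, b"\x01" if final else b"\x00"))
        if final:
            return
        index += 1


demo_key = AESGCM.generate_key(bit_length=256)
demo_out = io.BytesIO()
encrypt_stream(demo_key, io.BytesIO(b"a" * 2500), demo_out, chunk_size=1024)
```

Note that a streaming reader releases plaintext chunk by chunk before the last frame has been checked, so a client would want to discard its output if a later chunk fails to authenticate.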

view this post on Zulip John Moehrke (Dec 19 2018 at 18:57):

I would recommend the use of standards based enclosures that already have key management and algorithm communication. The standard I would recommend is CMS. I have written profiles in IHE that use CMS for end-to-end security. In the case of IHE (DEN profile), the mechanism fits within a Document Sharing environment, or used on portable media (CD-ROM, USB-Memory). This CMS standard is very flexible.

view this post on Zulip John Moehrke (Dec 19 2018 at 18:59):

this is the same standard that makes up the cryptography portion of S/MIME, leveraged by the USA specific Direct Project.

view this post on Zulip Ryan Nagle (Dec 19 2018 at 18:59):

@John Moehrke awesome, thank you. i'll read up!

view this post on Zulip John Moehrke (Dec 19 2018 at 19:01):

High level intro to the IHE DEN Profile https://wiki.ihe.net/index.php/Document_Encryption
Formal normative text http://www.ihe.net/uploadedFiles/Documents/ITI/IHE_ITI_Suppl_DEN.pdf
These are global standards. The standards are available to use... one must just select them.

view this post on Zulip John Moehrke (Dec 19 2018 at 19:03):

so my ask about the risk is not because I am against encryption... I authored the IHE, and was the author of the security assessment of the Direct Project... I simply am making sure that the risk you are trying to address is worthy of the 'costs' of applying end-to-end encryption. I have too often seen excited people define end-to-end encryption only to find that it is very costly in terms of technology, management, and failure-modes.

view this post on Zulip John Moehrke (Dec 19 2018 at 19:06):

I am also worried when we start to re-invent protocols such as SOAP which has end-to-end encryption and authenticity as a major benefit of using that technology over the more simple http/REST. One can do REST over SOAP. SOAP also has asynchronous modes and many other features. If that is desired, accessing the FHIR REST resources can be done leveraging those existing and well-implemented standards.

view this post on Zulip Isaac Vetter (Dec 19 2018 at 19:08):

(and in this specific case, the use of end-to-end encryption is unique and therefore breaks current Backend Services / bulk data clients. )

view this post on Zulip Ryan Nagle (Dec 19 2018 at 19:08):

i understand and i appreciate your questions and comments, this has been a tremendously valuable conversation for me.

view this post on Zulip Isaac Vetter (Dec 19 2018 at 19:09):

Ryan, the FHIR connectathon is a whole weekend of this type of feedback! Think you'll make it?

view this post on Zulip Ryan Nagle (Dec 19 2018 at 19:15):

@Isaac Vetter i _might_ be able to, i'm not sure at the moment

view this post on Zulip Ryan Nagle (Dec 19 2018 at 19:17):

i only know that it is happening, but don't recall the dates

view this post on Zulip Josh Mandel (Dec 19 2018 at 19:18):

January 12-13, 2019 (http://www.hl7.org/events/working_group_meeting/2019/01/)

view this post on Zulip Ryan Nagle (Dec 19 2018 at 19:19):

thank you

view this post on Zulip Josh Mandel (Dec 19 2018 at 19:25):

Re: the mime type question for encrypted payloads, it's a good one. I think our protocol would still want to convey/negotiate what's inside (confirming NDJSON vs Parquet or whatever), as well as an indication that the content is encrypted.

view this post on Zulip John Moehrke (Dec 19 2018 at 20:03):

In the IHE DEN profile it is made clear that the contained object(s) that is encrypted can be of any mime-type, and that mime-type is recorded in the CMS header, while the CMS object itself is described using the mime-type of "application/pkcs7-mime".
That doesn't answer the question Josh is asking, which is more about how the client identifies what type of payload they want. It seems to me that could simply use the http negotiate mechanism. Use the fhir type you see in the negotiation, and also see that application/pkcs7-mime is in negotiate.. right?

view this post on Zulip Josh Mandel (Dec 19 2018 at 20:04):

Well, we don't use straight-up HTTP content-type negotiation because we're not negotiating the content-type of the response manifest file -- we're negotiating the content-type of the eventually-to-be-exported data files. But it's the same basic idea (i.e., the client passes along a parameter to the kick-off request saying "I want NDJSON out in the end, please").
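
As a concrete illustration of that kick-off parameter, here is a small sketch that builds (but does not send) an export request carrying `_outputFormat`. The base URL and token are placeholders, and the helper name is made up; the headers follow the Bulk Data kick-off request as I understand it.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_kickoff_request(base_url: str, access_token: str,
                          output_format: str = "application/fhir+ndjson") -> Request:
    """Build a bulk export kick-off request without sending it."""
    # _outputFormat names the format of the files to be exported later,
    # not the format of the immediate response.
    query = urlencode({"_outputFormat": output_format})
    return Request(
        url=f"{base_url.rstrip('/')}/Patient/$export?{query}",
        headers={
            "Accept": "application/fhir+json",
            "Prefer": "respond-async",
            "Authorization": f"Bearer {access_token}",
        },
        method="GET",
    )


# Placeholder server and token, for illustration only.
kickoff = build_kickoff_request("https://bulk.example.org/fhir", "example-token")
```

If encrypted output were ever negotiated the same way, it would presumably be another kick-off parameter alongside this one, though nothing in the thread settles what it would be called.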

view this post on Zulip Ryan Nagle (Jan 03 2019 at 14:43):

happy new year, all :wave:

view this post on Zulip Ryan Nagle (Jan 03 2019 at 14:44):

as promised, a (draft) doc explaining how we implement per-client payload encryption: https://github.com/CMSgov/bcda-app/blob/master/ENCRYPTION.md

view this post on Zulip Ryan Nagle (Jan 03 2019 at 14:45):

appreciate any critique/feedback you are willing to offer


Last updated: Apr 12 2022 at 19:14 UTC