FHIR Chat · content compression · implementers

Stream: implementers

Topic: content compression


Brian Postlethwaite (Oct 13 2016 at 12:52):

I'm interested in the servers out there whose maintainers believe they support gzip (or deflate) content compression, and in which of those support receiving compressed content versus only returning compressed output.
'Accept-Encoding: gzip' and 'Content-Encoding: gzip' support is what I'm looking for.
We're adding support for this in the .NET client, and want to know which other servers support it so that we can test against them (sqlonfhir supports both read and write).
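
A minimal Python sketch of the two headers in question, using only the standard library. The function names are hypothetical, and the `application/fhir+json` media type is an assumption (older FHIR releases used a different one). It builds a gzip-compressed request and decodes a body according to its `Content-Encoding`; it does not talk to any particular server.

```python
import gzip
import json
from typing import Dict, Tuple


def build_gzip_request(resource: dict) -> Tuple[Dict[str, str], bytes]:
    """Return (headers, body) for a gzip-compressed JSON request (hypothetical helper)."""
    raw = json.dumps(resource).encode("utf-8")
    body = gzip.compress(raw)
    headers = {
        "Content-Type": "application/fhir+json",  # assumed media type
        "Content-Encoding": "gzip",  # the request body is gzip-compressed
        "Accept-Encoding": "gzip",   # the client can decode a gzip response
        "Content-Length": str(len(body)),
    }
    return headers, body


def decode_body(headers: Dict[str, str], body: bytes) -> dict:
    """Decode a JSON body, decompressing first if Content-Encoding is gzip."""
    if headers.get("Content-Encoding", "").lower() == "gzip":
        body = gzip.decompress(body)
    return json.loads(body.decode("utf-8"))


headers, body = build_gzip_request({"resourceType": "Patient", "id": "example"})
print(headers["Content-Encoding"], len(body))
```

A server that supports compressed writes would apply the same check as `decode_body` to incoming PUT and POST bodies; a server that only supports compressed reads would look at `Accept-Encoding` and ignore the rest.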

Brian Postlethwaite (Oct 13 2016 at 12:59):

I just checked HAPI, and it looks like it supports gzip encoding of both the request body and the response body.

John Moehrke (Oct 13 2016 at 16:28):

You will also get automatic and transparent compression from using the new http protocol -- http/2

Grahame Grieve (Oct 14 2016 at 19:15):

do you think there is any other advantage from http/2?

Travis Cummings (Apr 07 2020 at 17:50):

Resurrecting this thread... We are interested in the 'Accept-Encoding: gzip' and 'Content-Encoding: gzip' headers to reduce the size of a very repetitive FHIR query response (400 KB compressed to 12 KB). I see security considerations regarding BREACH but don't fully understand whether a FHIR server supporting gzip and SMART/OAuth tokens would be vulnerable. Assuming the app is on a different origin than the FHIR server and the app uses an OAuth token, would the response from the server be subject to BREACH if it uses gzip?
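
To illustrate why a repetitive search response shrinks so much, here is a small sketch that builds a synthetic Bundle and measures the gzip ratio. The Bundle contents and the helper names are made up for illustration; the sizes it prints are not the figures quoted above.

```python
import gzip
import json


def make_bundle(n: int) -> bytes:
    """Build a synthetic, highly repetitive searchset Bundle (illustrative data only)."""
    entries = [
        {
            "resource": {
                "resourceType": "Observation",
                "id": str(i),
                "status": "final",
                "code": {"coding": [{"system": "http://loinc.org", "code": "8867-4"}]},
            }
        }
        for i in range(n)
    ]
    bundle = {"resourceType": "Bundle", "type": "searchset", "entry": entries}
    return json.dumps(bundle).encode("utf-8")


def compression_ratio(payload: bytes) -> float:
    """Uncompressed size divided by gzip-compressed size."""
    return len(payload) / len(gzip.compress(payload))


payload = make_bundle(2000)
print(len(payload), len(gzip.compress(payload)), round(compression_ratio(payload), 1))
```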

Grahame Grieve (Apr 07 2020 at 21:14):

reading the details of the attack... it's in the incoming request, not the outgoing one, where the problem is, right? @John Moehrke can you confirm that? My read is that you should prohibit incoming gzip compression, but server -> client content is variable anyway, so it isn't useful for BREACH

John Moehrke (Apr 08 2020 at 13:06):

This is my understanding as well. The vulnerability is upon the recipient of the gzip content. So a server would be at risk when accepting a PUT or POST with gzip; a client would be at risk when accepting a gzip response to a GET. The vulnerability arises from the gzip containing malicious encoding or content. A mitigation, which is likely already in place, is the authentication of the sender of that gzip, which lowers the risk of a malicious sender providing content that leverages gzip vulnerabilities. That is to say that your clients are authenticating the server, hopefully strongly and not just with generic https; and your servers are authenticating the clients, likely through OAuth, strongly and not just looking for an auth header without validating it. Thus you have a contained community, something that general internet use of http doesn't have as a pre-condition.

John Moehrke (Apr 08 2020 at 13:17):

Note that http/2 compression is broader than gzip. The compression lookup table is maintained across a persistent TCP session for many sub-sessions (http interactions, aka multiplexing). Thus the compression benefits from all of the FHIR modeling consistency across all sessions between that client and server, rather than needing to rebuild that lookup table for each http interaction as gzip does.
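
The effect of keeping compression state across interactions can be sketched by analogy with a persistent zlib stream. This is only an analogy: in HTTP/2 the per-connection table belongs to HPACK header compression, which is a different algorithm from DEFLATE. The request text and token below are invented.

```python
import zlib
from typing import List


def per_message_sizes(messages: List[bytes]) -> List[int]:
    """Compress every message independently, as gzip does per HTTP interaction."""
    return [len(zlib.compress(m)) for m in messages]


def shared_stream_sizes(messages: List[bytes]) -> List[int]:
    """Compress all messages through one stream so later ones reuse earlier context."""
    comp = zlib.compressobj()
    sizes = []
    for m in messages:
        # Z_SYNC_FLUSH emits everything buffered so far but keeps the history window.
        chunk = comp.compress(m) + comp.flush(zlib.Z_SYNC_FLUSH)
        sizes.append(len(chunk))
    return sizes


# Hypothetical, nearly identical request heads.
messages = [
    (
        "GET /fhir/Patient/%d HTTP/1.1\r\n"
        "Accept: application/fhir+json\r\n"
        "Authorization: Bearer hypothetical-token\r\n\r\n" % i
    ).encode("ascii")
    for i in range(3)
]

print(per_message_sizes(messages))
print(shared_stream_sizes(messages))
```

With the shared stream, the second and third messages come out much smaller than when each is compressed on its own, because they are mostly back-references into what was already sent.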

Travis Cummings (Apr 08 2020 at 16:54):

@John Moehrke I don't understand the BREACH attack to be as you describe. I believe it is an attack by a third party that observes a message encoded with HTTP compression, where the compressed message contains secrets such as a CSRF token. The malicious party sends repeated requests while guessing common secret strings like "request_token=a", "request_token=b", and observes the length of each reply. When the length decreases, the malicious party knows they've guessed a matching substring, because a repeated value compresses better. This attack isn't mitigated by authentication, from what I understand. It seems one proper mitigation involves hiding the exact length of the response (e.g. TLS 1.3 record padding).
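
The length side channel described here can be sketched in a few lines. The secret, the page layout, and the function name are all hypothetical, and the sketch compares one fully correct guess against one wrong guess of the same length; a real attack works a character at a time and has to cope with noise from Huffman coding.

```python
import zlib

SECRET = "7f3a9c1e5b2d8f60a1b2c3d4"  # hypothetical token embedded in every response


def response_length(reflected: str) -> int:
    """Compressed size of a response that echoes attacker input next to the secret.

    The compressed size is the only thing the observer is assumed to see.
    """
    body = (
        "<p>No results for " + reflected + "</p>"
        "<form><input type='hidden' value='request_token=" + SECRET + "'></form>"
    )
    return len(zlib.compress(body.encode("utf-8")))


right = response_length("request_token=" + SECRET)
wrong = response_length("request_token=0f1e2d3c4b5a69788796a5b4")

# Both bodies have the same uncompressed length, but the matching guess lets
# DEFLATE replace the second occurrence with a single back-reference.
print(right, wrong, right < wrong)
```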

John Moehrke (Apr 08 2020 at 16:57):

where is that one documented? I guess I could understand it, but it is even more esoteric... I think client authentication does mitigate (lower risk), right?

Travis Cummings (Apr 08 2020 at 17:08):

Here is the best reference I have seen: https://arstechnica.com/information-technology/2013/08/gone-in-30-seconds-new-attack-plucks-secrets-from-https-protected-pages/

Travis Cummings (Apr 08 2020 at 19:11):

We are thinking that a FHIR server should only support compression when it can do so securely (or has no sensitive data). For us, this likely means only returning compressed responses when we trust the client referrer value. TLS 1.3 sounds good but it seems the EHR community allows TLS 1.0, 1.1, 1.2 and IE 10, 11.

Grahame Grieve (Apr 08 2020 at 21:09):

the key is that the BREACH attack isn't by a random observing 3rd party - it's only from someone inside the process who can orchestrate it - it's a layer break. So if you can orchestrate the sender, but are not trusted by the sender, you can break the sender.

so:

  • an application running inside a browser is obviously susceptible.
  • A native application requires that the application be reverse engineered, and any encryption is probably reverse engineerable - if it's not, and it's supposed to still be resistant to that, BREACH is a factor.
  • a closed server is only susceptible if someone can hack into the server
  • a server running extensible code, but offering encryption to that code while that code is not trusted, that's also susceptible.

so the takeaway: if you have a closed server, then you can safely return gzip content without worrying about BREACH (just all the standard worries about general server security)

Grahame Grieve (Apr 08 2020 at 21:11):

I cannot see how the server needs to worry about the state of the client. But if you really do, just add some random value somewhere in every response - a useless header or something. Then there's no chance that an agent on the client can orchestrate the server response enough. But really, on a FHIR interface... how could you do that? I suppose there could be some PUT patient / GET patient cycle that made it susceptible...
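
A rough sketch of the "random value in every response" idea, assuming a made-up header name (`X-Padding`) and an arbitrary size limit. It only shows how such a header could be generated; whether this is enough to defeat a length-based attack is not established in this thread.

```python
import secrets
from typing import Tuple


def padding_header(max_bytes: int = 64) -> Tuple[str, str]:
    """Return a (name, value) pair with a random value of random length.

    "X-Padding" is a hypothetical header name, not a standard one.
    """
    n = 1 + secrets.randbelow(max_bytes)  # between 1 and max_bytes random bytes
    return ("X-Padding", secrets.token_hex(n))  # hex string of length 2 * n


name, value = padding_header()
print(name, len(value))
```

A server would add the returned pair to the headers of every response, so that both the content and the length of that header change from one response to the next.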


Last updated: Apr 12 2022 at 19:14 UTC