Stream: Security and Privacy
Topic: Status of GET for search
Grahame Grieve (Oct 22 2019 at 19:33):
It's been brought to my attention that some members of the community are seeking to prohibit the use of GET for search requests in particular IGs, and possibly more broadly.
I think we should have a clear view on this across the community. Some background:
- mostly, this is a question of logging - HTTP logs offer an access route to PHI through whatever PHI the URLs contain
- in general, HTTP logs do not log the body. At least, not by default. But they can, and sometimes do. There is not (and never will be) any normative rule that they can't
- any organization collecting logs of FHIR RESTful calls is a party to PHI and needs to manage their logs accordingly. Given a secure connection to the app server, only the organisation should have access to the call to log it
- I think we should assume that if the logging agent is malicious (MITM etc) they'll be logging the body as well
- other arguments for not using GET relate to web pages, not web applications (referrer, visible in the address bar for other users)
For these reasons, the specification has never made any rule against using GET, but it does say that servers SHALL accept POST searches, so that clients can use POST instead of GET if they want.
Individual IGs are welcome to make their own rules, but in general, I would recommend against a hard "SHALL NOT use GET", since this is actually based on bad threat analysis (as outlined above). I would recommend that
- we explain this a little better in the specification
- IGs repeat the guidance in the specification
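To make the two options concrete, here is a minimal sketch of the same search expressed as a GET and as a POST to `_search`. The base URL, the helper names, and the search parameters are hypothetical and chosen only for illustration. The sketch just builds the request shapes; it sends nothing.

```python
from urllib.parse import urlencode

# Hypothetical server base URL, used only for illustration.
BASE = "https://fhir.example.org/r4"


def search_as_get(resource_type: str, params: dict) -> dict:
    """Search parameters travel in the URL query string."""
    return {
        "method": "GET",
        "url": f"{BASE}/{resource_type}?{urlencode(params)}",
        "headers": {},
        "body": None,
    }


def search_as_post(resource_type: str, params: dict) -> dict:
    """Same parameters, form-encoded in the body of a POST to _search."""
    return {
        "method": "POST",
        "url": f"{BASE}/{resource_type}/_search",
        "headers": {"Content-Type": "application/x-www-form-urlencoded"},
        "body": urlencode(params),
    }


# Hypothetical search parameters.
params = {"family": "Smith", "birthdate": "1970-01-01"}
print(search_as_get("Patient", params)["url"])
print(search_as_post("Patient", params)["url"])
```

Either way the server receives the same parameters. The only difference is whether they sit in the URL or in the body, and either location can end up in a log.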
John Moehrke (Oct 22 2019 at 19:51):
I think it is appropriate for us to state that there exists a risk of un-managed audit logging (managed is by definition managed). I do not see that this risk of un-managed audit logs is a rationale to forbid the use of GET and mandate the use of POST (especially as Grahame points out, there is nothing stopping an un-managed audit log from logging the POST body).
Matt Blackmon (Oct 22 2019 at 20:33):
I would agree with the above.
I agree that there is a risk of propagating the misunderstanding that "SHALL NOT use GET" mitigates a certain risk.
It also eliminates a potentially useful action (namely, GET) that may not have been fully explored or utilized. I don't have anything particularly in mind for this, but it seems dreadfully overly restrictive for what such a prohibition would actually deliver.
Finally, the existing recommendation to utilize https/secure transport mitigates much of this concern while addressing security in turn by measures at the proper layers of the OSI model.
Jenni Syed (Oct 22 2019 at 21:11):
Do we have anything about this in a security guide? It's such a common misunderstanding, I'm curious if we have a similar list of bullets and considerations for reference
Jenni Syed (Oct 22 2019 at 21:11):
I know we've talked about this quite a bit over the years
Lloyd McKenzie (Oct 22 2019 at 21:15):
We could also add guidance in the IG- guidance IG.
Mohammad Jafari (Oct 22 2019 at 21:35):
Adding my comment for the Security WG RE this:
https://confluence.hl7.org/display/SEC/Share+with+Protection+-+Technical+Approaches?refresh=1571761141503&refresh=1571761208716&focusedCommentId=66926111&refresh=1571770267086#comment-66926111
There's also some draft language if we want to address and mention this in the security/privacy page.
I also think that instead of a SHALL NOT, we should suggest this as a possible design choice which mitigates some risks.
Jenni Syed (Oct 22 2019 at 21:37):
Interesting on assumption of TLS terminating. In public clouds, we've been given guidance that TLS in transit is recommended for all sensitive health data (which would matter depending on how deep the stack/logging goes)
Jenni Syed (Oct 22 2019 at 21:38):
I know of a few frameworks that default to logging bodies on POSTs for failure
Jenni Syed (Oct 22 2019 at 21:38):
And a few reverse proxies that parse bodies for similar TLS or URL manipulation issues for architectures referenced there
Jenni Syed (Oct 22 2019 at 21:39):
(to make sure the URL going outbound matches what the caller used inbound when behind a termination or proxy of some sort)
Jenni Syed (Oct 22 2019 at 21:39):
Essentially: everything your data transits through unencrypted has to be considered part of the model to protect data
Grahame Grieve (Oct 22 2019 at 21:55):
"Do we have anything about this in a security guide?"
No. Hence my proposal above to say something explicitly
k connor (Oct 22 2019 at 23:01):
RE GG: "any organization collecting logs of FHIR RESTful calls is a party to PHI and needs to manage their logs accordingly. Given a secure connection to the app server, only the organisation should have access to the call to log it"
Unfortunately, it doesn't matter that the organization has access to the logs in scenarios where the entities accessing the PHI in logs have no laws restricting how they use PHI, in particular, where a US consumer releases PHI to entities not governed by HIPAA (and now no longer PHI and perhaps vaguely governed by Consumer Protection laws). Perhaps it makes sense for IG authors to restrict queries to POST, especially when their IGs are about "Consumer-centric control of their information".
Thankfully, some are developing API consumer privacy protective guidelines such as Xcertia Guidelines https://xcertia.org/wp-content/uploads/2019/08/xcertia-guidelines-2019-final.pdf and CARIN Alliance Code of Conduct https://www.carinalliance.com/wp-content/uploads/2019/05/2019_CARIN_Code_of_Conduct_05082019.pdf, which may decide that restrictive conformance requirements are appropriate for the policy context in which they operate. Providing approaches to Sharing Consumer Information with Protections would be a useful FHIR P&S, to which IGs wishing to distinguish their Consumer Centric Capabilities could assert conformance.
Grahame Grieve (Oct 22 2019 at 23:20):
"in scenarios where the entities accessing the PHI in logs have no laws restricting how they use PHI, in particular, where a US consumer releases PHI to entities not governed by HIPAA (and now no longer PHI and perhaps vaguely governed by Consumer Protection laws), perhaps it makes sense for IG authors to restrict queries to POST"
That's the core confusion here: if the organization is collecting logs, it doesn't matter whether you use a post or a get since they may collect the bodies.
k connor (Oct 22 2019 at 23:35):
Those able to log PHI because they're the end point is one problem that can't be solved with POST vs GET. It's the intermediaries logging the GET HTTP headers, who aren't even covered by the Consumer Protection Laws that are the other issue.
Grahame Grieve (Oct 22 2019 at 23:40):
- There should be no intermediaries not covered by consumer protection laws. How would that happen?
- If there are such intermediaries, they can log the POST bodies too, and sometimes do.
The whole idea that intermediaries only log the URL is bad security analysis that we shouldn't allow
k connor (Oct 23 2019 at 00:53):
The US FTC has some consumer protection laws if the terms of agreement in Apps are not followed. No protections against intermediaries for the most part. Even HIPAA X12 transactions have to be opened beyond the envelope by whatever Clearinghouse among any number of Clearinghouses (and not necessarily with Business Associate Agreements with the Source/Recipient Covered Entity) happens to need to find out the transaction end points. Likely there needs to be more analysis/research on existing analysis about the lack of protections by intermediaries. I don't think any of us has enough research to make any assertions about what intermediaries are doing with GET Header or POST Header/Body. Doesn't hurt to try to minimize damage. Suggest that a more comprehensive review of the literature/practices be done so that we can give appropriately tuned guidance about the risks HL7 implementers should be considering/mitigating against.
k connor (Oct 23 2019 at 00:58):
May want to check out Data sharing practices of medicines related apps and the mobile ecosystem: traffic, content, and network analysis BMJ 2019; 364 doi: https://doi.org/10.1136/bmj.l920 (Published 20 March 2019)
k connor (Oct 23 2019 at 01:01):
Here's FTC Medical App guidance https://www.ftc.gov/tips-advice/business-center/guidance/mobile-health-app-developers-ftc-best-practices
Laura Hoffman (Oct 23 2019 at 01:05):
The ONC and CMS interoperability rules are going to require providers to push health information to consumers via apps at a pace never before seen. There are no substantive privacy protections for consumers using apps and since the information is leaving the covered entity space, HIPAA obviously won't apply. Certified APIs should need to check for app attestation to whether they adhere to industry guidelines around app development and data use (Xcertia Guidelines, CARIN Code of Conduct, FTC Best Practices), as well as whether they provide consumers with easy-to-understand, model privacy notices (e.g., ONC's Model Privacy Notice). Otherwise apps are free to do what they want with health information on patients and their families.
Grahame Grieve (Oct 23 2019 at 01:24):
Right. So obviously there are concerns here, and only using POST is a mitigation... but my point is that it's a very minor mitigation, far less than people think. Doing your client crypto well is a far better protection, to reduce the prospect of accidental leakage: you will be connected directly to the source, who can log whatever they want
Mohammad Jafari (Oct 23 2019 at 01:24):
Intermediaries can indeed log POST bodies, but in most cases they don't because that would lead to a large volume of log data. It's much more likely for intermediaries to log URLs than full HTTP bodies. Note that this is about the customary configurations rather than what each entity can technically see and log.
As I mentioned in the writeup for the security WG, we can't deny that URLs are usually more exposed in logs compared to POST bodies.
I copy the example from a heroku default router log which is created after TLS termination. We see the URL parameters and the source IP which is very common to log.
2019-10-21T14:20:41-07:00 test heroku/router at=info method=GET path="/endpoint?parameter1=value1" host=test.herokuapp.com request_id=87eef4da-3eb2-4cdc-a9ee-914607fbbaa9 fwd="70.66.172.102" dyno=web.1 connect=0ms service=42ms status=200 bytes=515 protocol=https
Once again, people can configure things to cease generating these logs or filter them. What we are saying is that using POST is also an additional mechanism to subject sensitive parameters to less exposure and mitigate the risk of exposure via logs due to possible misconfiguration. This is not about a definitive prescription; it's just bringing to attention a possible mitigating measure.
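As an illustration of the "filter them" option, here is a minimal sketch that redacts the query string from a router-style `path="..."` field before the line is stored. The regular expression, the function name, and the `[REDACTED]` marker are assumptions for this sketch, not a Heroku feature. The sample line is an abbreviated copy of the one above.

```python
import re

# Matches the query string inside a router-style path="..." field.
_QUERY_IN_PATH = re.compile(r'(path="[^"?]*)\?[^"]*"')


def redact_query(log_line: str) -> str:
    """Replace everything after '?' in the path field with a fixed marker."""
    return _QUERY_IN_PATH.sub(r'\1?[REDACTED]"', log_line)


# Abbreviated copy of the router log line quoted above.
line = (
    '2019-10-21T14:20:41-07:00 test heroku/router at=info method=GET '
    'path="/endpoint?parameter1=value1" host=test.herokuapp.com '
    'fwd="70.66.172.102" status=200'
)
print(redact_query(line))
```

Note that the source IP in `fwd` and the path itself are left untouched, so a filtered line can still carry identifying information.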
Grahame Grieve (Oct 23 2019 at 03:38):
So I agree to bring it to people's attention as a mitigating factor. But not to support that by bad analysis.
Jenni Syed (Oct 23 2019 at 14:05):
In these cases, who are the intermediaries that have access to the unencrypted payload but are NOT covered by your BA/within your 4 walls? If they're part of the consumer side app only (which is a concern we all have re:how a patient will choose a good/well behaving app that they can trust with their health data), how does doing a POST help if the data is unencrypted and traveling through this system?
Jenni Syed (Oct 23 2019 at 14:06):
Even if you ignore the body and URL, there are sensitive authentication headers passing through these systems unencrypted... which is even scarier
Gino Canessa (Oct 23 2019 at 15:42):
I'm confused.
If third parties are logging details from inside requests, the channel isn't secure and the data is compromised (e.g., they are in the channel, so request parameters are the least of your concerns - someone has access to your server's responses, can impersonate clients, etc.).
If an implementer is logging their own requests, I would assume the guidance should be to treat logs as if they contain/are PHI (e.g., having a sequence of resource/terminology requests from a particular IP Address can likely be considered PHI). The GET/POST discussion is then just about how much PHI you tend to leak.
I worry that framing it as anything other than that would give the impression that logging GET requests results in clean logs.
Mohammad Jafari (Oct 23 2019 at 15:42):
@Grahame Grieve I agree.
@Jenni Syed When an app is deployed on the cloud provider, arguably, it's already out of the organization's "4 walls" and you're trusting the cloud provider. The traffic is in plaintext after TLS termination until it hits the webservers (original traffic attributes are added in X-forwarded headers to inform the app that the traffic was originally https). The argument is that the traffic is plaintext inside the cloud provider network and between their servers (between router/loadbalancers and the app containers) and you already have trusted the cloud provider (they already have your entire app and its secrets hosted on their servers and could technically access its traffic from within the app/container). I agree with you nonetheless that this deployment model, especially TLS termination, is risky.
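For readers less familiar with this deployment model, here is a minimal sketch of how an app behind a TLS-terminating router might recover the original request attributes. It assumes the router sets the widely used `X-Forwarded-Proto` and `X-Forwarded-For` headers; the function name and the fallback to `http` are hypothetical choices for this sketch.

```python
def original_request_info(headers: dict) -> dict:
    """Recover the caller's scheme and IP from forwarding headers, if present."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    # X-Forwarded-For may hold a comma-separated chain; the first entry is the client.
    forwarded_for = lowered.get("x-forwarded-for", "")
    client_ip = forwarded_for.split(",")[0].strip() or None
    return {
        # Assumption: without the header, treat the hop as plain http.
        "scheme": lowered.get("x-forwarded-proto", "http"),
        "client_ip": client_ip,
    }


info = original_request_info(
    {"X-Forwarded-Proto": "https", "X-Forwarded-For": "70.66.172.102, 10.0.0.1"}
)
print(info)
```

The app only learns that the original hop was https. The request it actually receives, including URL, headers, and body, arrived in plaintext from the router.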
Mohammad Jafari (Oct 23 2019 at 15:47):
@Gino Canessa I definitely agree that this should not be framed to give the impression that using POST makes logs secure. Like I said earlier, this is just to highlight one possible mitigating measure with respect to common logging configurations.
Jenni Syed (Oct 23 2019 at 16:05):
@Mohammad Jafari The cloud providers we've worked with tell us that we need/should/must encrypt traffic in transit within their systems if it contains any sensitive data. Any logs written during that are typically secured and what goes in them is governed by sensitivity rules.
Jenni Syed (Oct 23 2019 at 16:07):
The URLs are the least of your worries here
Mohammad Jafari (Oct 23 2019 at 16:08):
I understand that and I agree. What I'm saying is that, again, we're not saying this is a solution for keeping traffic confidential on a cloud provider's network. It's a mitigating measure with respect to common logging configurations and practices.
Jenni Syed (Oct 23 2019 at 16:09):
But another common configuration in most public clouds is a way to "trace" or "debug" those calls. This typically logs all data
Jenni Syed (Oct 23 2019 at 16:09):
And while some provide ways to filter out those logs by blacklisting certain fields, that won't help for bodies (where almost everything is sensitive or PII)
Jenni Syed (Oct 23 2019 at 16:10):
We could take this to the extreme and talk about buckets left misconfigured and open to the public, which is more common than most would want as well
Jenni Syed (Oct 23 2019 at 16:10):
But the result is the same: you have to secure your system with intention and confirm
Gino Canessa (Oct 23 2019 at 16:10):
@Mohammad Jafari, I understand what you are saying, but I disagree.
In your Confluence post, you say:
It is always emphasized to developers to make sure they do not log any sensitive data in the application logs because logs are usually subject to fewer protective measures ... So developers must be constantly reminded to ensure they remove sensitive parameters if they are logging the HTTP URL ...
I think we need the opposite guidance, saying that logs are sensitive data and MUST be treated as such. If that means changing the vetting process for a cloud provider or changing how the system is designed, that is what a developer should do.
If we consider a sequence: POST-search, GET-Patient/xyz, GET-snomed/103412005, from an IP address that resolves to a physical address (not uncommon where I live), this log gives a strong hint that someone at that location is patient 'xyz' and they have an HIV result. Implicit in this data are also items like which facility this patient receives care from, etc.
If the chain includes something like a medication request or terminology lookup for a medication, it is easy to tell the result is positive.
Other chains reveal plenty of information as well, like receiving requests for four different patients (GET by ID) in the same session to reveal relationships, etc.
In my mind, this is all PHI. What is gained by providing/following that guidance?
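The inference described above can be sketched in a few lines. The log entries, paths, and IP addresses here are hypothetical and contain no query strings or bodies. Grouping by source IP is enough to reconstruct the chain.

```python
from collections import defaultdict

# Hypothetical access-log entries: (source IP, method, path).
# No query strings and no bodies are recorded.
entries = [
    ("203.0.113.7", "POST", "/Patient/_search"),
    ("203.0.113.7", "GET", "/Patient/xyz"),
    ("203.0.113.7", "GET", "/snomed/103412005"),
    ("198.51.100.9", "GET", "/metadata"),
]


def chains_by_ip(log):
    """Group requests by source IP, preserving the order they were logged in."""
    chains = defaultdict(list)
    for ip, method, path in log:
        chains[ip].append(f"{method} {path}")
    return dict(chains)


chains = chains_by_ip(entries)
for ip, chain in chains.items():
    print(ip, "->", ", ".join(chain))
```

Even with the search parameters hidden in a POST body, the chain for the first address still links a location to a patient id and a terminology code.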
Jenni Syed (Oct 23 2019 at 16:11):
We should definitely have guidance that calls all of this out, and not send the wrong signal that POST is helping someone not need to think about the intricacies here.
Mohammad Jafari (Oct 23 2019 at 16:14):
@Gino Canessa I agree. These are addressed in my initial write-up at the top of that Confluence page (above the comment).
Mohammad Jafari (Oct 23 2019 at 16:16):
I think the intention of the POST vs. GET suggestion has been misunderstood here. This is not intended to solve one or all of the confidentiality and inference problems. It's one possible mitigating measure which we propose should be noted. That's all.
Mohammad Jafari (Oct 23 2019 at 16:20):
But I understand from this conversation that the framing is crucial and it can be counterproductive if this is framed incorrectly to give such an impression.
Last updated: Apr 12 2022 at 19:14 UTC