Stream: implementers
Topic: XML parsing and XXE attacks
Michael Lawley (Apr 19 2016 at 05:09):
For any server implementers, it's probably worth looking at this issue just opened on HAPI
https://github.com/jamesagnew/hapi-fhir/issues/339 and the associated XXE links about the issue and a mitigation
Ewout Kramer (Apr 19 2016 at 09:06):
Thanks Michael, i have added this as a todo on the .NET XML parser as well.
Brian Postlethwaite (Apr 19 2016 at 09:41):
Have you looked into it yet Ewout?
Brian Postlethwaite (Apr 19 2016 at 09:50):
Just checked the dotnet XML Serializer
settings.IgnoreProcessingInstructions = true;
Grahame Grieve (Apr 19 2016 at 10:40):
Seems that msxml is not affected? It's hard to find good information about this!
Brian Postlethwaite (Apr 19 2016 at 11:07):
this is in the dotnet System.Xml.XmlReader class that we use.
Brian Postlethwaite (Apr 19 2016 at 11:07):
(not necessarily the msxml parser)
Ewout Kramer (Apr 19 2016 at 12:15):
See https://github.com/ewoutkramer/fhir-net-api/issues/190.
Verified resilience with a unit-test, .NET parser refuses to parse this DTD
Michael Lawley (Apr 19 2016 at 13:50):
For a test case: POST this to [base]/ValueSet/$expand
<!DOCTYPE foo [ <!ENTITY xxe SYSTEM "file:///etc/">]> <Parameters xmlns="http://hl7.org/fhir" ><parameter> <name value="identifier" /><valueUri value="http://snomed.info/sct/32506021000036107/version/20120531?fhir_vs=isa/900000000000455006" />&xxe; </parameter></Parameters>
Grahame Grieve (Apr 19 2016 at 20:05):
so, the maintainers of the reference implementations just discussed this. In order to protect against this, we have to turn off all processing of entities - we can't just turn off some of them
Grahame Grieve (Apr 19 2016 at 20:06):
the upshot of this is that we propose to make a new rule: no entity definitions in the xml. Only pre-defined entities allowed.
Ewout Kramer (Apr 20 2016 at 09:03):
The "pre-defined ones" including the "normal" character entities. Now to decide which those are, I didn't know there were this much: https://dev.w3.org/html5/html-author/charref
Michel Rutten (Apr 20 2016 at 10:16):
DTDs for XHTML entities: https://www.w3.org/TR/xhtml1/dtds.html#h-A2
Ewout Kramer (Apr 20 2016 at 10:55):
Y. That's the list I used. The link I sent has many more, like 𝕕
Michel Rutten (Apr 20 2016 at 11:07):
Aha, apparently HTML5 introduced some new entities.
Grahame Grieve (Apr 20 2016 at 11:53):
we're using xhtml, not html. the only entities defined are those in xml, not xhtml
Ewout Kramer (Apr 21 2016 at 07:49):
Well, there might be some in the narrative too, which is xhtml. It's a different namespace, but I'd think it would be surprising to the user if it's allowed in one place but not in the other. If I look at this: https://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references, it seems there are only a few xml entities. I'd really like to support stuff like ê for common diacritics in (european) languages...
James Agnew (Apr 21 2016 at 13:42):
FYI to any HAPI users, a new release is now in the Maven stable repos (1.5) which addresses this issue.
Lloyd McKenzie (Apr 21 2016 at 13:45):
@Ewout Kramer I'm not thrilled with the notion of supporting things that aren't valid XHTML. I don't think we can count on support for those in a lot of off-the-shelf tools. Plus it creates technical issues with the spec.
Ewout Kramer (Apr 26 2016 at 14:10):
Well, I had to support it (just like the Java implementation I guess), since our examples had those entities in them. But sure, we could discuss removing this support....
Grahame Grieve (Apr 26 2016 at 14:13):
I don't think you'll find them in any examples any more
Grahame Grieve (Apr 26 2016 at 14:13):
they're not valid, and I don't think anything like this still escapes testing
Grahame Grieve (Apr 26 2016 at 14:14):
though I eliminated some from the json examples while on the plane
Last updated: Apr 12 2022 at 19:14 UTC