Stream: implementers
Topic: XML Version
Grahame Grieve (Mar 16 2020 at 06:58):
We have never said whether we require a particular XML version (1.0 vs 1.1). Does anyone have any opinion about this?
Jose Costa Teixeira (Mar 16 2020 at 07:28):
I thought we were only looking at XML 1.0
approx 3 years ago we wanted to put the GS1 field separator in device.udiCarrier.carrierAIDC
Jose Costa Teixeira (Mar 16 2020 at 07:28):
and we needed to use base64Binary.
Jose Costa Teixeira (Mar 16 2020 at 07:29):
I think the reason was that one of the formats (JSON or XML) did not support the 0x1D character
Vassil Peytchev (Mar 16 2020 at 16:32):
Are there XML processors that do not support XML 1.1?
Is it easier to manage JSON <-> XML conversions if only XML 1.0 is supported?
Grahame Grieve (Mar 16 2020 at 21:36):
it's harder, in fact, because there are valid characters in json that are not valid in XML 1.0 but are in XML 1.1
Grahame Grieve (Mar 16 2020 at 21:37):
and it is 0x1D. But I don't remember any link between XML version and AIDC coming up in discussion
Vassil Peytchev (Mar 16 2020 at 21:44):
Sounds like a strong case to require XML 1.1
Grahame Grieve (Mar 17 2020 at 11:25):
working on this... the validator uses the standard Java SAX-based parser. I thought, following https://xerces.apache.org/xerces2-j/faq-sax.html#faq-6, that this code:
public void setDocumentLocator(Locator locator) {
  super.setDocumentLocator(locator);
  this.locator = locator;
  this.locator2 = (org.xml.sax.ext.Locator2) locator;
  xmlVer = locator2.getXMLVersion();
}
Grahame Grieve (Mar 17 2020 at 11:25):
on this XML:
Grahame Grieve (Mar 17 2020 at 19:45):
<?xml version="1.1" encoding="UTF-8"?>
<CodeSystem xmlns="http://hl7.org/fhir"/>
the locator2 interface reports the version as 1.0, not 1.1. Any ideas as to why?
Vassil Peytchev (Mar 17 2020 at 20:38):
Take a look at how it is done in Writer.java under the SAX samples of Xerces. You need to get the locator during the very first startElement...
Grahame Grieve (Mar 17 2020 at 20:46):
doesn't work; don't know why. I'm just going to parse the XML header myself first - I have other reasons to do so
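A minimal sketch of what "parse the XML header myself" could look like, assuming the document has already been decoded into a String. This is not the validator's actual code; the class name `XmlDeclSniffer` and the method `sniffVersion` are hypothetical. A document with no XML declaration is treated as XML 1.0.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: reads the version from the XML declaration without a full parse.
final class XmlDeclSniffer {
    // An optional byte order mark, then '<?xml', whitespace, and version="..." or version='...'.
    private static final Pattern XML_DECL = Pattern.compile(
        "^\\uFEFF?<\\?xml\\s+version\\s*=\\s*(?:\"([0-9.]+)\"|'([0-9.]+)')");

    // Returns the declared version, or "1.0" when there is no XML declaration.
    static String sniffVersion(String xml) {
        Matcher m = XML_DECL.matcher(xml);
        if (!m.find()) {
            return "1.0";
        }
        return m.group(1) != null ? m.group(1) : m.group(2);
    }

    public static void main(String[] args) {
        String sample = "<?xml version=\"1.1\" encoding=\"UTF-8\"?>\n"
            + "<CodeSystem xmlns=\"http://hl7.org/fhir\"/>";
        System.out.println(sniffVersion(sample)); // prints 1.1
    }
}
```

The declaration has to be the very first thing in the document, so anchoring the pattern at the start is enough for this purpose.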
Vassil Peytchev (Mar 17 2020 at 21:01):
FWIW,
java -cp ./xercesSamples.jar:./xercesImpl.jar:./serializer.jar sax.Writer ./xml_1_1_sample.xml
<?xml version="1.1" encoding="UTF-8"?>
<CodeSystem></CodeSystem>
Using your example as the sample file... Since namespaces are not enabled, the namespace is not being printed...
Marc de Graauw (Mar 24 2020 at 08:47):
I would not go for XML 1.1. See these comments by two XML experts: https://norman.walsh.name/2004/09/30/xml11 and http://www.cafeconleche.org/books/effectivexml/chapters/03.html. Still valid, I think. Especially what Elliotte says: "Whereas XML 1.0 was conservative (Everything not permitted is forbidden) XML 1.1 is liberal (Everything not forbidden is permitted.) XML 1.0 listed the characters you could use in names. XML 1.1 lists the characters you can't use in names." An application expecting XML 1.0 but getting XML 1.1 would surely break. Of course, you could make <?xml version="1.1" ... required in all FHIR instances, but that means all existing XML-based applications would have to change (unlike the rest of the world, we have quite some XML FHIR in the Netherlands).
Vassil Peytchev (Mar 24 2020 at 14:46):
Given that the JSON format allows characters that are forbidden in XML 1.0, not using XML 1.1 limits interoperability. The concern about characters in names doesn't seem to apply, because the names are part of the specification, and that can be controlled.
Marc de Graauw (Mar 24 2020 at 16:47):
XML 1.1 is mostly about characters in names, of which you correctly say those should not be a problem. JSON text is Unicode, but doesn't allow control characters (https://www.json.org/). XML 1.1 does allow some extra control characters in text - looks like it makes interoperability worse, not better. Do you - or anybody else - have examples of content valid in JSON but not in XML 1.0, but valid in XML 1.1? I'm not convinced requiring XML 1.1 solves a real-world problem, and I'm pretty sure it introduces a lot of problems - having everybody upgrade to 1.1, for instance. I've seen a lot of production XML (v3, FHIR, other) and I've never yet encountered 1.1 in the wild.
Vassil Peytchev (Mar 24 2020 at 17:44):
What I understood the issue to be was:
- in JSON you can send \u001d in the text
- in XML 1.0 this character is not allowed
- in XML 1.1 this character is allowed.
If the above is correct, you have at least one use case where it is an issue. If I misunderstood, then XML 1.1 is probably not necessary.
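The three points above can be checked against the character rules in the specifications. The sketch below transcribes the `Char` production from XML 1.0, the `Char` and `RestrictedChar` productions from XML 1.1, and the JSON rule that control characters U+0000 to U+001F must be escaped inside strings. The class and method names are hypothetical.

```java
// Hypothetical helper: character rules transcribed from the XML 1.0, XML 1.1 and JSON specifications.
final class CharRules {
    // XML 1.0 Char: #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    static boolean isXml10Char(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // XML 1.1 Char: [#x1-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    static boolean isXml11Char(int cp) {
        return (cp >= 0x1 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // XML 1.1 RestrictedChar: legal in a document only when written as a character reference.
    static boolean isXml11Restricted(int cp) {
        return (cp >= 0x1 && cp <= 0x8) || cp == 0xB || cp == 0xC
            || (cp >= 0xE && cp <= 0x1F)
            || (cp >= 0x7F && cp <= 0x84)
            || (cp >= 0x86 && cp <= 0x9F);
    }

    // JSON strings: quotation mark, backslash and U+0000..U+001F cannot appear raw, only escaped.
    static boolean jsonMustEscape(int cp) {
        return cp < 0x20 || cp == '"' || cp == '\\';
    }

    public static void main(String[] args) {
        int gs = 0x1D; // the group separator discussed in this thread
        System.out.println(String.format(
            "U+%04X xml1.0=%b xml1.1=%b restricted=%b jsonMustEscape=%b",
            gs, isXml10Char(gs), isXml11Char(gs), isXml11Restricted(gs), jsonMustEscape(gs)));
    }
}
```

For U+001D this means: JSON can carry it as the escape `\u001d`, XML 1.0 cannot carry it at all (not even as `&#x1D;`), and XML 1.1 can carry it only as a character reference.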
Marc de Graauw (Mar 25 2020 at 11:16):
It's a real case, but in general I'd say if some texts can contain control characters other than TAB, NL, CR, it's best to use base64Binary as @Jose Costa Teixeira has done - after all, from an XML or JSON viewpoint, this is more binary than textual content, and all issues around reserved or invalid characters are bypassed.
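As an illustration of the base64Binary route, here is a minimal sketch using the JDK's `java.util.Base64`. The barcode content is made up for the example, and treating the scanned text as UTF-8 bytes is an assumption, not something stated in this thread.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper: carries a scanned barcode containing control characters as base64.
final class CarrierAidcBase64 {
    // Encodes the scanned text as base64 (assumption: the text is serialized as UTF-8 bytes).
    static String encode(String scanned) {
        return Base64.getEncoder().encodeToString(scanned.getBytes(StandardCharsets.UTF_8));
    }

    // Reverses encode().
    static String decode(String base64) {
        return new String(Base64.getDecoder().decode(base64), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Made-up content: two fields separated by the 0x1D group separator.
        String scanned = "10ABC123" + '\u001D' + "17250101";
        String encoded = encode(scanned);
        System.out.println(encoded);
        System.out.println(decode(encoded).equals(scanned)); // prints true
    }
}
```

The encoded value contains only base64 alphabet characters, so it is safe in JSON and in either XML version, at the cost of no longer being readable without a decoding step.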
Jose Costa Teixeira (Mar 25 2020 at 11:24):
well, we did base64 encoding because we had no choice... it hurts to tell vendors: when you scan a barcode, you either have a GS1 decoding algorithm, or a base64 encoding, or you cannot transmit the scanned barcode
Grahame Grieve (Mar 27 2020 at 04:00):
ICD-11 contains terms that have \u001d in their definition. I think we do not want to change code system to force the definition to be base64Binary
Marc de Graauw (Mar 27 2020 at 09:18):
@Grahame Grieve That makes it a mess indeed. No, you can't change the code system - it's possible though to base64 the definition when serializing it in FHIR. Would require the definition in FHIR to be (string | base64Binary) though, does not seem the way forward either. Another route: since FHIR says nothing about 1.0 vs 1.1, the default assumption should be either is allowed. So if you need to serialize ICD-11 in FHIR, use XML 1.1 - that should be allowed today w/o any changes in specs anywhere. The one thing I see as problematic is requiring XML 1.1 for FHIR serialization - say "if you move to R5, you must move to XML 1.1".
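A sketch of the "use XML 1.1 for this content" route, assuming a hand-rolled serializer. In XML 1.1 the restricted control characters may appear only as character references, so the writer has to emit `&#x1D;` rather than the raw character. The class name, the method names, and the sample text are hypothetical, and the output is an illustrative fragment, not a complete FHIR resource.

```java
// Hypothetical helper: escapes text for use in an attribute value of an XML 1.1 document.
final class Xml11Text {
    static String escapeAttr(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            boolean isChar = (cp >= 0x1 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
            if (!isChar) {
                // U+0000, unpaired surrogates, U+FFFE and U+FFFF are not allowed even in XML 1.1.
                throw new IllegalArgumentException(
                    String.format("U+%04X is not allowed in XML 1.1", cp));
            }
            if (cp == '<') {
                sb.append("&lt;");
            } else if (cp == '&') {
                sb.append("&amp;");
            } else if (cp == '"') {
                sb.append("&quot;");
            } else if (cp < 0x20 || (cp >= 0x7F && cp <= 0x9F)) {
                // Covers every RestrictedChar, plus TAB, LF, CR and NEL so that
                // they survive attribute-value normalization.
                sb.append("&#x").append(Integer.toHexString(cp).toUpperCase()).append(';');
            } else {
                sb.appendCodePoint(cp);
            }
        }
        return sb.toString();
    }

    // Wraps the escaped text in an illustrative fragment with an XML 1.1 declaration.
    static String toFragment(String definition) {
        return "<?xml version=\"1.1\" encoding=\"UTF-8\"?>\n"
            + "<definition xmlns=\"http://hl7.org/fhir\" value=\"" + escapeAttr(definition) + "\"/>";
    }

    public static void main(String[] args) {
        // Made-up text containing the 0x1D group separator.
        System.out.println(toFragment("first part" + '\u001D' + "second part"));
    }
}
```

A receiver that only understands XML 1.0 would reject such a document at the declaration, which is the upgrade cost raised earlier in the thread.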
Grahame Grieve (Mar 27 2020 at 19:19):
no that would certainly not be possible
Last updated: Apr 12 2022 at 19:14 UTC