FHIR Chat · JSON vs XML · social

Stream: social

Topic: JSON vs XML


Yunwei Wang (May 26 2021 at 19:01):

Are there statistics on the percentage of FHIR data transferred using JSON vs XML?

Lloyd McKenzie (May 26 2021 at 22:56):

@James Agnew ?

James Agnew (May 28 2021 at 00:34):

@Yunwei Wang @Lloyd McKenzie interesting question.

It's hard to really get a fair answer, since HAPI's server answers in JSON by default unless the client expresses that it only wants a specific encoding.

So with that out of the way, I just did a quick grep of the request/response logs. It's actually kinda crazy just how skewed to JSON it is. Over the length of the most recent non-archived log file:

  • Client didn't specify a preference and got JSON: 71307 requests
  • Client requested JSON: 18435 requests
  • Client requested XML: 95 requests
  • Client requested Turtle: 64 requests
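
As a rough illustration of how skewed these counts are, the shares can be worked out with a few lines of Python. This is only a sketch of the arithmetic: the four counts are copied from the list above, while the labels and the `shares` helper are made up for the example.

```python
# Counts copied from the log summary above; the labels are illustrative.
counts = {
    "JSON (no preference stated)": 71307,
    "JSON (requested)": 18435,
    "XML (requested)": 95,
    "Turtle (requested)": 64,
}


def shares(table):
    """Return each entry's percentage of the table's total."""
    total = sum(table.values())
    return {label: 100 * n / total for label, n in table.items()}


all_requests = shares(counts)

# Restrict to requests where the client actually stated a preference.
explicit = {label: n for label, n in counts.items() if "requested" in label}
explicit_only = shares(explicit)

for label, pct in all_requests.items():
    print(f"{label}: {pct:.2f}% of all requests")
for label, pct in explicit_only.items():
    print(f"{label}: {pct:.2f}% of explicit requests")
```

Out of 89,901 requests in total, roughly 79% stated no preference, and even among the 18,594 explicit requests XML accounts for only about 0.5%.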

James Agnew (May 28 2021 at 00:36):

Really, a more fair way to do this would be to only count POST/PUT, since the client is forced to make a choice in that case for what payload type to send. Sadly I don't log the content type there.. I think I'll add that. :)
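
A minimal sketch of what counting payload types for POST/PUT might look like, assuming the log can be reduced to (method, Content-Type) pairs. That record shape, the `payload_format` and `tally_writes` names, and the sample data are all hypothetical; this is not how HAPI's logging actually works.

```python
from collections import Counter

# MIME types mapped to the labels used in this thread. The "+fhir" suffix
# forms are the older DSTU2-era spellings.
FORMATS = {
    "application/fhir+json": "JSON",
    "application/json+fhir": "JSON",
    "application/json": "JSON",
    "application/fhir+xml": "XML",
    "application/xml+fhir": "XML",
    "application/xml": "XML",
    "text/xml": "XML",
    "application/fhir+turtle": "RDF",
    "text/turtle": "RDF",
}


def payload_format(content_type):
    """Map a Content-Type header value to JSON, XML, RDF, or OTHER."""
    if not content_type:
        return "OTHER"
    # Drop parameters such as "; charset=utf-8" and normalise case.
    mime = content_type.split(";", 1)[0].strip().lower()
    return FORMATS.get(mime, "OTHER")


def tally_writes(requests):
    """Count payload formats for POST and PUT requests only."""
    return Counter(
        payload_format(content_type)
        for method, content_type in requests
        if method.upper() in ("POST", "PUT")
    )


# Made-up sample records, not real log data.
sample = [
    ("POST", "application/fhir+json; charset=utf-8"),
    ("PUT", "application/fhir+xml"),
    ("GET", None),
    ("POST", "application/xml"),
]
print(tally_writes(sample))
```

Reads are skipped on purpose: only a request with a body forces the client to pick an encoding.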

John Moehrke (May 28 2021 at 10:31):

those stats make XML look like an accident. I am not surprised.

Grahame Grieve (May 28 2021 at 10:32):

that is surprising.

John Moehrke (May 28 2021 at 10:34):

I wonder if Fire.Ly has skewed the other way?

Grahame Grieve (May 28 2021 at 10:34):

i do not track on my server. maybe I'll start

Yunwei Wang (May 28 2021 at 12:35):

I am surprised that XML is at the same level as RDF. It would be interesting to see which format clients specifically sent to the server.

James Agnew (May 28 2021 at 12:40):

I've added that logging as of this morning. Will keep everyone posted. I'm pretty curious too :)

Brendan Keeler (May 28 2021 at 13:01):

XML is an accident

John Silva (May 28 2021 at 14:22):

Historically though XML came before JSON --- and had a lot of traction in the early days of SOAP (WSDL, etc.) and distributed computing models, before the RESTful paradigm started to gain a footing.

(To me JSON is 'just' another 'wire representation', maybe with fewer characters, but still ... JSON Schema is a copycat attempt at XML Schema and, I believe, it still hasn't caught up to its functionality. Is there an equivalent to Schematron for JSON?)

Grahame Grieve (May 28 2021 at 14:31):

yes but it's not widely adopted

Grahame Grieve (May 28 2021 at 14:31):

https://schematron.com/2018/11/schematron-validation-of-json-data/

John Silva (May 28 2021 at 15:04):

Good to know it exists. I guess the question becomes, is anyone using this for FHIR validation yet or do most use the reference Java validator (what to do if your FHIR server isn't Java?). I remember using JSON Schema (from Newtonsoft) to try to do some FHIR JSON validation, and it was 'relatively fragile' if I remember right and, of course, it can only do syntactic validation.

[OK, a little off-topic from JSON vs XML but there's a whole ecosystem that had been developed in the XML world; not sure if the JSON world has 'caught up' yet.]

James Agnew (May 28 2021 at 15:13):

The plot thickens.... Here's the payload type counts for creates/updates since I added the logging about 3 hours ago:

James Agnew (May 28 2021 at 15:13):

XML: 37

James Agnew (May 28 2021 at 15:14):

JSON: 23
RDF: 0

James Agnew (May 28 2021 at 15:15):

And here's the read/search count where an explicit type was requested:
XML: 103
JSON: 568
RDF: 0

James Agnew (May 28 2021 at 15:15):

This is fun, I can't wait to get more data... :)

Gino Canessa (May 28 2021 at 15:16):

Who's out there artificially inflating XML? ;-)

James Agnew (May 28 2021 at 15:19):

One thing I just realized, my original counts did not account for people who requested XML explicitly via an Accept header (as opposed to a URL parameter). The new numbers do though.
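
For readers unfamiliar with the two mechanisms: a FHIR client can ask for a response encoding either with the `_format` URL parameter or with the HTTP Accept header, and the FHIR specification lets `_format` override the header. The following is a simplified sketch of classifying a request both ways. The function names are hypothetical, q-values in the Accept header are ignored, and it is not meant to mirror HAPI's actual negotiation logic.

```python
from urllib.parse import urlsplit, parse_qs

# Short codes and MIME types accepted for each encoding.
XML = {"xml", "text/xml", "application/xml", "application/fhir+xml"}
JSON = {"json", "application/json", "application/fhir+json"}
TURTLE = {"ttl", "text/turtle", "application/fhir+turtle"}


def _label(token):
    """Return XML, JSON, or Turtle for a recognised token, else None."""
    # Drop parameters such as ";q=0.9". An unescaped "+" in a query string
    # is decoded as a space, so turn it back into "+".
    token = token.split(";", 1)[0].strip().lower().replace(" ", "+")
    if token in XML:
        return "XML"
    if token in JSON:
        return "JSON"
    if token in TURTLE:
        return "Turtle"
    return None


def requested_format(url, accept=None):
    """Return (format, source); the _format parameter wins over Accept."""
    query = parse_qs(urlsplit(url).query)
    for value in query.get("_format", []):
        label = _label(value)
        if label:
            return label, "_format"
    # Simplification: take the first recognised type and ignore q-values.
    for item in (accept or "").split(","):
        label = _label(item)
        if label:
            return label, "Accept"
    return None, "none"


print(requested_format("http://example.org/fhir/Patient?_format=xml"))
print(requested_format("http://example.org/fhir/Patient", "application/fhir+xml"))
```

A count that looks only at the URL parameter would miss the second request entirely, which is the gap described above.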

Vassil Peytchev (May 28 2021 at 15:21):

Can you also distinguish between Postman-type "trying things out" queries vs. other (presumably programmatically created) requests?
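
One crude way to approximate this split would be to look at the User-Agent header. The sketch below assumes the header is available in the logs; the prefix list and the `client_kind` helper are illustrative assumptions, and the result is only a heuristic, since the header is easy to change and tools like curl are also used from scripts.

```python
# Hypothetical list of User-Agent prefixes for tools typically driven by hand.
INTERACTIVE_PREFIXES = ("postmanruntime/", "insomnia/", "curl/", "mozilla/")


def client_kind(user_agent):
    """Guess whether a request came from an interactive tool or from code."""
    if not user_agent:
        return "unknown"
    ua = user_agent.strip().lower()
    # str.startswith accepts a tuple and matches any of its entries.
    if ua.startswith(INTERACTIVE_PREFIXES):
        return "interactive"
    return "programmatic"


print(client_kind("PostmanRuntime/7.28.0"))
print(client_kind("Apache-HttpClient/4.5.13 (Java/11.0.11)"))
```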

Frank Oemig (May 28 2021 at 15:37):

We do have some FHIR specs here in Germany that only allow for XML...

Grahame Grieve (May 28 2021 at 15:43):

"is anyone using this for FHIR validation yet or do most use the reference Java validator (what to do if your FHIR server isn't Java?)"

Most people use either the Java validator or the DotNet validator. The DotNet validator isn't as comprehensively reliable as the Java validator, but it's coming along quickly. You can run the Java validator as a local web service if you don't want inline Java, but you can also load it through a JVM interface.

John Silva (May 28 2021 at 16:15):

Makes sense (options to run validators). My related question: what is the 'CPU overhead' of running the Java or DotNet validator vs running something like JSON Schema (or XML Schema)? (I realize Schema and even Schematron don't give as much coverage as the Java validator, but at what expense?)

Lloyd McKenzie (May 28 2021 at 17:39):

The Java or .NET validators will consume considerably more memory - because they do full checking of terminology, while neither of the schema validators has that capability beyond codes for 'code' elements (and thus no need to load terminologies into memory).

Lloyd McKenzie (May 28 2021 at 17:39):

Also, recursively resolving references and checking validity against target profiles definitely takes more time/processing. Again something that the schema validators can't really do.

Yunwei Wang (May 28 2021 at 17:41):

I am not sure if schema validation could validate things such as slicing and invariants.

Brendan Keeler (May 28 2021 at 19:19):

Carequality's FHIR directory only half supports JSON for the FHIR Organization calls, so everyone uses XML there. I imagine other servers might be similar.

Lloyd McKenzie (May 28 2021 at 19:22):

I think JSON schema can do some slicing. Schematron can do invariants expressed as XPath. But for full validation, you need more than what any of the 'schema' languages can provide.
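
To make the distinction concrete, here is a small sketch contrasting a purely structural check with an invariant check, using the FHIR Period rule per-1 (start must not be later than end) as the example. Both functions are hand-rolled stand-ins written for illustration, not any real validator's API, and the sketch assumes full `YYYY-MM-DD` dates only, whereas real FHIR dateTime values can also be partial or carry times and time zones.

```python
import re
from datetime import date

DATE = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def structurally_valid(period):
    """Schema-style check: only known keys, each a date-shaped string."""
    return (
        isinstance(period, dict)
        and set(period) <= {"start", "end"}
        and all(isinstance(v, str) and DATE.match(v) for v in period.values())
    )


def satisfies_per_1(period):
    """Invariant check: if both are present, start must not be after end."""
    if "start" not in period or "end" not in period:
        return True
    return date.fromisoformat(period["start"]) <= date.fromisoformat(period["end"])


# Well-formed on the surface, but the period ends before it starts.
period = {"start": "2021-05-28", "end": "2021-05-26"}
print(structurally_valid(period), satisfies_per_1(period))
```

The sample period passes the shape check and fails the invariant, because the rule compares two values rather than describing the form of either one.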

Brendan Keeler (May 28 2021 at 19:23):

I kid about XML. Simple XMLs can be great and the tooling is undoubtedly better right now.

I just have a bad association from IHE and Nictiz specs that have a bunch of wrapped XML (MTOM-encoded CDA in an XCA with SAML, for example). XML documentation was also generally distributed as PDFs, or it was all that WSDL stuff.

Open HTML docs, OpenAPI specs, etc are all associated with the JSON era, even if they're usable with XML

Brian Postlethwaite (Jun 07 2021 at 00:47):

Any update on this @James Agnew ?
(Note that the dotnet FHIR client defaults to XML in the Accept header, if my memory serves me right, so that could skew things that way, but any web clients are more likely just doing JSON.)

Ewout Kramer (Jun 14 2021 at 12:05):

I'll see if we measure this on the Firely servers...

Brian Postlethwaite (Jun 14 2021 at 14:03):

That will be interesting, as the dotnet fhir client is xml by default


Last updated: Apr 12 2022 at 19:14 UTC