Stream: implementers
Topic: JSON Format for Primitives & extensions
nicola (RIO/SS) (May 29 2018 at 20:46):
Are we sure we want to see this in real systems - https://gist.github.com/niquola/6c826639a7f0061a9e3274c3f7272cae#file-ups-yaml-L74 ?
nicola (RIO/SS) (May 29 2018 at 20:53):
If this is really required and FHIR primitives are not primitives, let's encode them in the terrible xmlish style: code: "string"
=> code: {value: "string"}
nicola (RIO/SS) (May 29 2018 at 21:01):
I'm pretty sure that the non-FHIR developer community will be really confused by such corner cases.
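To make the two shapes concrete, here is a minimal Python sketch. It assumes the current FHIR JSON convention in which a primitive's id and extensions live in a sibling property prefixed with an underscore (for example `_code` next to `code`). The function name `wrap_primitive` and the extension URL are made up for illustration and are not part of any specification.

```python
def wrap_primitive(resource: dict, name: str) -> dict:
    """Merge `name` and its `_name` sibling into one {"value": ...} object.

    Handles only a single (non-repeating) primitive element.
    """
    wrapped = {}
    if name in resource:
        wrapped["value"] = resource[name]
    # The underscore sibling carries "id" and/or "extension" for the primitive.
    wrapped.update(resource.get("_" + name, {}))
    return wrapped


# Current wire format: value and extensions are split across two properties.
current = {
    "code": "string",
    "_code": {
        "extension": [
            # Hypothetical extension, for illustration only.
            {"url": "http://example.org/fhir/StructureDefinition/demo", "valueString": "x"}
        ]
    },
}

# Proposed "xmlish" shape: one object per primitive.
print(wrap_primitive(current, "code"))
```

In the wrapped form a reader cannot pick up the value without also seeing that extensions are present, at the cost of an extra `.value` hop in the simple case.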
Grahame Grieve (May 29 2018 at 21:05):
yes. we had a big long argument about this in the past. What you propose is exactly what we still had and what I wish we still had. But we decided that it was a better balance to make the simple cases very simple and the corner cases very hard
Grahame Grieve (May 29 2018 at 21:05):
but it's too late to change this now
nicola (RIO/SS) (May 29 2018 at 21:06):
Let's create a second JSON serialization format - this is not a breaking change!
nicola (RIO/SS) (May 29 2018 at 21:07):
It could be optional - pretty sure most users will switch to it eventually
nicola (RIO/SS) (May 29 2018 at 21:08):
I can work on it :) .
Grahame Grieve (May 29 2018 at 21:14):
I'm pretty sure that's a bad idea. I don't think people will be interested. But we could have a straw poll - indicate by choosing :thumbs_up: or :thumbs_down: as a response to the message above
Lloyd McKenzie (May 29 2018 at 21:28):
Competing versions of the same syntax - especially at this point in the standard life cycle - would not be doing implementers any favors. Most implementers will be using one of the reference implementations, meaning they don't care. The only thing new to the argument here is JSON schema and that was in its infancy at the time the decision was made. I don't think there's anything about the syntax that says it can't work. It just makes certain edge-case usecases harder.
nicola (RIO/SS) (May 29 2018 at 21:32):
Hiding complexity in reference implementation lib does not eliminate it :(
Dmitri Sotnikov (May 29 2018 at 21:33):
@Lloyd McKenzie I'm not sure what you're basing the idea that most implementors will be using a reference implementation on. My team certainly isn't, and we will be affected by these edge cases. Furthermore, it seems to me that the protocol itself should be usable without having to rely on third party libraries. If the design is such that the wire format itself is not directly usable that's a fundamental failure of the FHIR specification.
Grahame Grieve (May 29 2018 at 21:50):
no it's not necessary. We argued about this a lot back when we made this decision. Json schema wasn't a factor then, but I generate json schema ok for this now - I don't think it's that complex.
Grahame Grieve (May 29 2018 at 21:52):
I don't like the decision we made, but it seems way too late to change it. and it does make the simple cases much simpler. we rarely encounter extensions on primitives. except for translation but we've never got into gear about a good json representation of translations
Dmitri Sotnikov (May 29 2018 at 22:03):
The fact that JSON is a second class citizen in FHIR is incredibly problematic in my opinion. This is the most commonly used data format today, and many platforms support it natively. Currently, FHIR wire format is pretty much unusable without complex and fragile libraries. I think that focusing on XML as the primary representation is really misguided.
Grahame Grieve (May 29 2018 at 22:04):
the only respect in which JSON is a second class citizen is that we prefer XML for the base examples since JSON doesn't have comments. In every other respect, JSON is an equal citizen
Dmitri Sotnikov (May 29 2018 at 22:11):
The way the data is structured in FHIR is not how idiomatic JSON is structured. For example, many instances where you have collections of maps appear to be an artifact of directly translating XML nodes into JSON. Meanwhile, idiomatic JSON would favor use of maps as the top level element keyed on the id of each node. Another example is that collections are named with singular names, where idiomatic JSON would use plural to indicate that they are collections.
From what I see, JSON representation is basically just XML format dumped into JSON. Having JSON as a first class citizen would mean structuring data in JSON idiomatic fashion, and FHIR does not do that.
In practical terms it means that you have to go through a complex, error prone, and expensive parsing step before you're able to work with the data effectively.
Grahame Grieve (May 29 2018 at 22:25):
we talked about pluralising the names - that's something we decided not to do. I don't know how significant that is
Grahame Grieve (May 29 2018 at 22:26):
the issue of maps - that's come up before. That's got nothing to do with xml vs json, and everything to do with complexities in the data. We try for as simple as possible, but sometimes it's just not possible
Patrik Sundberg (May 29 2018 at 22:31):
I suspect this will remain a hot topic for years to come. :) Maybe a compromise here is what Grahame referred to in the past as a "database representation". In the proto representation we already unroll some of these extensions into named tuples, based on the profiles of various resources in use, which makes the db setup and queries easier. I don't think such a representation has to be tied to json at all, and i'm not sure if it has any place being sent over the wire directly, but i do think there is value in talking through what it would look like.
Dmitri Sotnikov (May 29 2018 at 22:38):
I end up taking the data that comes over the wire and then transforming it into a usable form. The question is not of data complexity, but of the use cases the format is optimized for. I also get the impression that nobody on the core team works with JSON data directly, and therefore the problems are not apparent because nobody is dogfooding the protocol.
The decisions appear to often be based on assumptions, such as the case with @Lloyd McKenzie suggesting that people are predominantly using 3rd party libraries. I see very little evidence that there's any effort to validate such assumptions based on the broader community usage. This results in many key decisions being made by a small closed group, and then when people find them problematic they're told that it's too late to change anything.
Lloyd McKenzie (May 30 2018 at 03:29):
"Too late to change" comes from where we are in the standards process. If we change the syntax now, it has a huge impact on implementers - of which there are now thousands. It also means we can't go normative this cycle, which impacts even more. The decisions weren't made by a small closed group - the nature of the JSON protocol were made with a great deal of open discussion. The final shape of the syntax was landed on in a room with 40+ people in it and lots of active discussion - and that was after lots of beating on it in connectathon. It's certainly possible (likely) that for certain uses, the syntax is not ideal. The same would be true of the XML and RDF syntaxes. The focus of the syntax is not persistance or internal data operations, it's just exchange. If those other uses had been primary considerations, it's entirely possible that any of the syntaxes would look different than they do now.
Lloyd McKenzie (May 30 2018 at 03:31):
The assertion of "it's too late" also isn't a veto. You can vote. If there are enough people who agree, then the change will be made. I think that the number of people invested in the current syntax make that an unlikely outcome, but you're absolutely welcome to make the effort. Grahame's statement should not be taken as a "you can't", but rather "based on where the community is at, it's probably not a good use of your time".
Ewout Kramer (May 30 2018 at 07:30):
Correct. There are things in the XML syntax that I don't like either - and I am pretty sure the designers of HTTP or actually any piece of spec or software would love to roll back some of their decisions. At some point however, the cost of change is larger than the benefit - as software engineers I am sure we all recognize this. Personally I think we passed that point quite some time ago.
Grahame Grieve (May 30 2018 at 08:05):
"I also get the impression that nobody on the core team works with JSON data directly" - very much not the case.
Grahame Grieve (May 30 2018 at 08:06):
btw, restructuring the data for easy consumption for a particular use case is pretty common. I've run into quite a few people using graphql for that... I'm trying to encourage them to standardise on the work documented at build.fhir.org/graphql.html
nicola (RIO/SS) (May 30 2018 at 09:11):
We can start experimenting with a new json format - if it fails, that's ok, but if we get real interest ... As a second format, this will be a non-breaking addition, which can be abandoned at any time :)
Grahame Grieve (May 30 2018 at 09:13):
well, ok. how's it going to be different? going to the direct representation of primitives, and extensions, ok. what else?
nicola (RIO/SS) (May 30 2018 at 09:14):
- polymorphic elements
Grahame Grieve (May 30 2018 at 09:15):
oh? so how are you proposing to do that?
nicola (RIO/SS) (May 30 2018 at 09:16):
Something as discussed here https://github.com/fhir-fuel/fhir-fuel.github.io/issues/2
nicola (RIO/SS) (May 30 2018 at 09:17):
It should be similar to what you have in ref.lib. - i.e. Observation.value.string
nicola (RIO/SS) (May 30 2018 at 09:17):
As well as local references as {id: "...", resourceType: "Resource"}
Grahame Grieve (May 30 2018 at 09:18):
i don't know what you mean about the reflib.
nicola (RIO/SS) (May 30 2018 at 09:18):
Java/ C#
Grahame Grieve (May 30 2018 at 09:19):
I assumed that was what you mean but I don't know how that's relevant. And that page has 2 solutions.
Grahame Grieve (May 30 2018 at 09:19):
local references?
nicola (RIO/SS) (May 30 2018 at 09:20):
Local means on the same server/db - https://github.com/fhir-fuel/fhir-fuel.github.io/issues/4
Ewout Kramer (May 30 2018 at 10:51):
Hi Nicolai, the json example for solution 2 is exactly the same as the first - I cannot quite grasp what solution 2 is!
nicola (RIO/SS) (May 30 2018 at 10:56):
@Ewout Kramer do you mean union/polymorphic representation? this is open for discussion on GitHub
Ewout Kramer (May 30 2018 at 10:58):
I mean the example under the heading "Solution 2" here: https://github.com/fhir-fuel/fhir-fuel.github.io/issues/2. That snippet of json is exactly the same as the example under Solution 1!
nicola (RIO/SS) (May 30 2018 at 10:59):
The difference: in solution 1 we have a value key plus a type key: {"value": ".....", "type": "string"}
and in solution 2 the key is named after the type: {"string": ".........."}
nicola (RIO/SS) (May 30 2018 at 11:00):
This can be unified with primitives representation: we can choose or {"value": ....} or {"<type>": .......}
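The two candidate shapes can be compared side by side with a short Python sketch. It assumes the current flat form in which the type is appended to the element name (`valueString`); the helper names `to_solution_1` and `to_solution_2` and the sample value are hypothetical.

```python
def _flat_key(element: str, type_name: str) -> str:
    # Current format: element name followed by the capitalised type name.
    return element + type_name[0].upper() + type_name[1:]


def to_solution_1(resource: dict, element: str, type_name: str) -> dict:
    """Rewrite element[x] as {"value": ..., "type": ...}."""
    key = _flat_key(element, type_name)
    out = {k: v for k, v in resource.items() if k != key}
    out[element] = {"value": resource[key], "type": type_name}
    return out


def to_solution_2(resource: dict, element: str, type_name: str) -> dict:
    """Rewrite element[x] as an object keyed by the type name."""
    key = _flat_key(element, type_name)
    out = {k: v for k, v in resource.items() if k != key}
    out[element] = {type_name: resource[key]}
    return out


current = {"resourceType": "Observation", "valueString": "positive"}
print(to_solution_1(current, "value", "string"))
print(to_solution_2(current, "value", "string"))
```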
Ewout Kramer (May 30 2018 at 11:02):
Oh ah, thanks, I failed to notice that ;-)
nicola (RIO/SS) (May 30 2018 at 11:06):
What are the use cases for primitive extensions? nullFlavors?
nicola (RIO/SS) (May 30 2018 at 11:09):
Why can't we just create an extension one level up to represent the same logic?
nicola (RIO/SS) (May 30 2018 at 11:16):
How do you handle extended primitives in the ref. impl.? I still think this will be a source of pain in many technologies, without real benefit.
Dmitri Sotnikov (May 30 2018 at 12:33):
I think graphql would address the problem, and the existing work looks very promising.
Grahame Grieve (May 30 2018 at 13:30):
extension 1 level up fails for elements that repeat
Grahame Grieve (May 30 2018 at 13:31):
and is messy because now you can't interpret primitives without their context. Not that the json is any different now... my real unhappiness with it....
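One way to read the "fails for elements that repeat" point: in the current format a repeating primitive carries its extensions in a parallel `_name` array, padded with null, so that each extension stays attached to one specific repeat. An extension placed one level up would have no position to point at. A minimal Python sketch, assuming that parallel-array convention; `pair_repeats` and the extension URL are hypothetical.

```python
from itertools import zip_longest


def pair_repeats(parent: dict, name: str) -> list:
    """Pair each repeat of a primitive with the entry at the same index in `_name`."""
    values = parent.get(name, [])
    extras = parent.get("_" + name, [])
    return [
        # A null (None) entry means "no id or extension for this repeat".
        {"value": value, **(extra or {})}
        for value, extra in zip_longest(values, extras)
    ]


human_name = {
    "given": ["Peter", "James"],
    "_given": [
        None,
        {"extension": [
            # Hypothetical extension, for illustration only.
            {"url": "http://example.org/fhir/StructureDefinition/demo", "valueString": "Jim"}
        ]},
    ],
}

for item in pair_repeats(human_name, "given"):
    print(item)
```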
nicola (RIO/SS) (May 30 2018 at 13:33):
@Grahame Grieve What are real use cases for primitive extensions? May we find another solution for it?
nicola (RIO/SS) (May 30 2018 at 13:33):
I think nullFlavor should be handled in a different way, not by extensions
Grahame Grieve (May 30 2018 at 13:36):
we didn't define it for null flavor. There's a number of primitive data type extensions in the spec. you should have a look at those
nicola (RIO/SS) (May 30 2018 at 13:46):
Do you mean code extensions in basic structure-definitions? Why do you spec the core features by extensions?
Grahame Grieve (May 30 2018 at 13:48):
no I mean the extensions defined here:
Grahame Grieve (May 30 2018 at 13:48):
http://build.fhir.org/extensibility-registry.html
Grahame Grieve (May 30 2018 at 13:49):
the code extensions in the structure definitions are a special case - they are effectively the fhir equivalent of compiler magic
Lloyd McKenzie (May 30 2018 at 14:07):
A few of the use-cases we've hit so far for primitive extensions:
- express a date in an alternate calendaring system
- provide "original text" for any data subsequently encoded as a date, number or boolean (i.e. what did the user actually see/type)
- provide translations of a string
- provide a calculation instead of a value (e.g. express an offset from some specified time-point instead of a date, provide a formula rather than a numeric element)
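As an illustration of the translation use case, here is a hedged Python sketch that collects translations attached to a string primitive. It assumes the core translation extension uses the URL below with nested `lang` and `content` parts; that shape should be checked against the extension registry before relying on it. The function name `translations` and the sample document are hypothetical.

```python
# Assumed URL and structure of the core translation extension; verify before use.
TRANSLATION_URL = "http://hl7.org/fhir/StructureDefinition/translation"


def translations(resource: dict, name: str) -> dict:
    """Return {language: text} from translation extensions on primitive `name`."""
    result = {}
    for ext in resource.get("_" + name, {}).get("extension", []):
        if ext.get("url") != TRANSLATION_URL:
            continue
        # Index the nested parts ("lang", "content") by their url.
        parts = {part["url"]: part for part in ext.get("extension", [])}
        lang = parts.get("lang", {}).get("valueCode")
        content = parts.get("content", {}).get("valueString")
        if lang and content:
            result[lang] = content
    return result


doc = {
    "title": "Blood pressure",
    "_title": {
        "extension": [
            {
                "url": TRANSLATION_URL,
                "extension": [
                    {"url": "lang", "valueCode": "nl"},
                    {"url": "content", "valueString": "Bloeddruk"},
                ],
            }
        ]
    },
}

print(translations(doc, "title"))
```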
Josh Mandel (May 30 2018 at 14:54):
The interesting thing about FHIR's JSON syntax is that 5 years ago I was making almost the same argument ("nobody on the core team uses JSON", "this is just thinly transformed XML", and "it's not idiomatic") to promote the opposite change (from "all primitives are behind .value properties" --> "just make FHIR primitives be JSON primitives").
In hindsight, I'm sorry I was persuasive -- because what I was trying to do was save JSON-centric developers from what I considered the line noise of extra properties everywhere... but the real pain was the fact that primitives in FHIR just aren't primitives. Trying to optimize for what I considered the extremely common case of "unadorned primitives" leaves a whole lot of pain in the less common cases, and makes the underlying model way less elegant.
nicola (RIO/SS) (May 30 2018 at 14:59):
I'm still confused by this central point - primitives which are not primitives -
and suspect that we are trying to solve orthogonal problems like i18n and alternative calendars with the wrong tool: extended primitives.
nicola (RIO/SS) (May 30 2018 at 15:00):
If a primitive is not enough - use a complex type!
Lloyd McKenzie (May 30 2018 at 15:07):
Every single place primitives appear, there's the potential for them to not be enough
nicola (RIO/SS) (May 30 2018 at 15:07):
- Date is a complex type by definition - let's just model it like {calendar: '', date: '', time: ''}
- to preserve original info we can have extensions - this is not common case to document it, in some resources, where this is common case we can create dedicated elements to record this
- for translation we can have complex i18n text {lang: '...', text: ''}, like Narrative
Lloyd McKenzie (May 30 2018 at 15:07):
We don't want to add the complexity of those use-cases for everyone
Lloyd McKenzie (May 30 2018 at 15:08):
The whole point of extensions is to keep the data simple unless use-cases drive a need for complexity
Lloyd McKenzie (May 30 2018 at 15:09):
You can't possibly model all the potential use-cases ahead of time. Extension is for "all the weird use-cases implementers will come up with"
Lloyd McKenzie (May 30 2018 at 15:09):
They will always come up with more than you can imagine.
nicola (RIO/SS) (May 30 2018 at 15:10):
That's true :/
Dmitri Sotnikov (May 30 2018 at 15:10):
@Lloyd McKenzie while I understand that there is impact on implementors due to changes, the impact on millions of people who will ultimately be using the protocol once it's normative outweighs that by orders of magnitude. Any decisions that are made now will affect many people for decades to come, and favoring convenience for current implementors over that seems rather short sighted to me. Having ~40 people make decisions that will impact the community at large is not really a convincing argument. The decision should be based on collecting feedback from the experience of the wider community using the protocol in the wild.
While the protocol focuses on the exchange, the fact that the data has to be extracted from the exchange protocol is still an important consideration. Ultimately, if the protocol is difficult to work with that will lead to errors when people attempt to extract the data from it, and that directly impacts patient safety.
I'd also like to note that UHN has been one of the early adopters of FHIR, and we have tried to make suggestions for the protocol from the start. The message has always been the same. So, while technically we are able to make suggestions, very little feedback ends up being integrated back into the standard in practice. I've been hearing the line that people have already implemented the current version and therefore it's not worth changing from the start.
nicola (RIO/SS) (May 30 2018 at 15:15):
@Lloyd McKenzie so we can say that FHIR does not have primitive types - only containers for primitives with extension points?
Lloyd McKenzie (May 30 2018 at 15:22):
@nicola (RIO/SS) Correct. FHIR has no true primitives. Every 'primitive' has extension points.
nicola (RIO/SS) (May 30 2018 at 15:22):
I still have doubts about abuse and misinterpretation of primitive extensions in real use.
Things like
"_active": {"extension": [{ "url": "http://example.org/fhir/StructureDefinition/recordStatus", "valueCode": "archived" }]},
look like they change semantics. This kind of power looks dangerous.
nicola (RIO/SS) (May 30 2018 at 15:25):
And with extending primitives people will always have this temptation
nicola (RIO/SS) (May 30 2018 at 15:27):
There are some programming languages which allow extension of primitives (js, ruby), and this is considered bad practice and a source of bugs & confusion, because primitives are one of the semantic foundations :)
Lloyd McKenzie (May 30 2018 at 15:29):
@Dmitri Sotnikov The process you describe is what we've been doing for the past 6 years. Change requests and ballot submissions have been coming in. The specification reflects the evolution that's resulted from that feedback and the consensus of the community. In the end, the decisions about proposed changes are made by people who show up on the calls and who vote in the ballot. There comes a point where the community who's engaged (and who makes the decisions) decide that the cost of further change isn't worth it. We try to make that community as broad as we can, but we can't second guess them when they make a decision.
Dmitri Sotnikov (May 30 2018 at 16:25):
I've worked with many open source projects, and FHIR specification process is very opaque in comparison. The fact that people have to show up to the calls to vote is a perfect example of that.
Dmitri Sotnikov (May 30 2018 at 16:31):
One of the main issues is that a lot of the discussions are ephemeral in nature. There needs to be a list of proposed features/changes that is public and easily accessible making it possible for the community to meaningfully engage in the process.
Lloyd McKenzie (May 30 2018 at 18:22):
gForge is intended as the public, easily accessible description of proposed features/changes. Sometimes they start out as ephemeral, but discussion within the tracker usually tries to make them concrete. In terms of decision-making, there are two types, both inherited from the HL7 organization - work group decisions and ballots. We have no ability to change how the balloting process works. That's driven by ANSI. In terms of work groups, major decisions are generally put out for discussion on Zulip prior to voting on a call. I don't think there's ever been a situation where there was extensive support for something on Zulip only to have the decision reversed by a small group on the call. (Though it's super common for the Zulip conversation to be totally silent or only have 1-2 people chime in.) There's certainly lots of ability for those who can't attend the calls to share their perspective and to sway opinion.
Dmitri Sotnikov (May 30 2018 at 19:04):
"Though it's super common for the Zulip conversation to be totally silent or only have 1-2 people chime in."
This is precisely the problem, because nobody just sits and watches Zulip all day just on an off chance that an issue that affects them comes up for discussion. Being able to participate and affect meaningful change requires a large commitment. This ends up excluding a lot of users who simply don't have the time to do this. A typical approach is to have a public forum for discussions that's persistent, and have separate discussion threads for each issue. This allows people to see what the proposals are easily, see the state of the current discussion, and contribute their experience.
Josh Mandel (May 30 2018 at 19:55):
I think our upcoming move to developing the spec on GitHub, with pull requests driving review, will help in this regard. But @Dmitri Sotnikov your observations here are very much on the mark.
My assessment is that FHIR runs in a very meritocratic, pragmatic way compared with other healthcare standards efforts... but falls short of the mark compared with well-managed open-source projects.
Grahame Grieve (May 30 2018 at 19:59):
"A typical approach is to have a public forum for discussions that's persistent, and have separate discussion threads for each issue" - in fact, I thought we did have that
Grahame Grieve (May 30 2018 at 20:01):
I am not so confident that pull requests will make things better. There's a language gulf between 'the process of changing the standard' and 'the meaning of the standard' - that's the one I live every day. So I'm worried that they'll make things worse, not better.
Grahame Grieve (May 30 2018 at 20:02):
and this is the reason why FHIR is more difficult to engage with than a standard open source project - this language gap, and because changes have to deal with 'what's possible' not 'what happens'
Grahame Grieve (May 30 2018 at 20:05):
@Dmitri Sotnikov the competing priorities in the json format have probably attracted the most broad comment of all the parts of the spec over the last few years. We had extensive discussions about this for over a year. My perspective is that the change was a case of aiming for 'too simple' but I wasn't able to be persuasive on this
Grahame Grieve (May 30 2018 at 20:16):
"I'd also like to note that UHN has been one of the early adopters of FHIR, and we have tried to make suggestions for the protocol from the start. The message has always been the same. So, while technically we are able to make suggestions, very little feedback ends up being integrated back into the standard in practice. I've been hearing the line that people have already implemented the current version and therefore it's not worth changing from the start." - I'm sorry you feel that. There's a tension between open discussion and revisiting a subject again and again because different people have different priorities, and actually making some decision, and growing more and more weight around it.
In this case, Nicola has consistently been unhappy with the json format, primarily because of storage concerns, and he's had one or 2 voices supporting him. These numbers are much less compared to the number of people who weighed in originally, or who would comment if we tried to change this (or, in this case, document a second format). That's driven the outcome in this case.
Grahame Grieve (May 30 2018 at 20:17):
you'll say that straw poll above says different - that's why I'm still in this thread (the only reason)
Grahame Grieve (May 30 2018 at 20:19):
back to json... if I was going to change the way we handle polymorphics, I'd do this:
{ "resourceType" : "Observation", "value": { "_type" : "string", "value" : "the-value" } }
Grahame Grieve (May 30 2018 at 20:21):
or, for a complex data type:
{ "resourceType" : "Observation", "value": { "_type" : "Coding", "system" : "the-uri", "code" : "the code" } }
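The two `_type` examples above can be produced from the current flat form with a short Python sketch. It assumes the current `valueString` / `valueCoding` naming; the helper name `to_underscore_type` and the `is_primitive` flag are hypothetical (a real converter would take that information from the type definitions).

```python
def to_underscore_type(resource: dict, element: str, type_name: str,
                       is_primitive: bool) -> dict:
    """Rewrite element[x] as an object tagged with "_type"."""
    flat_key = element + type_name[0].upper() + type_name[1:]
    out = {k: v for k, v in resource.items() if k != flat_key}
    if is_primitive:
        # Primitives get wrapped: the value sits next to the type tag.
        out[element] = {"_type": type_name, "value": resource[flat_key]}
    else:
        # Complex types already are objects: just add the type tag.
        out[element] = {"_type": type_name, **resource[flat_key]}
    return out


as_string = {"resourceType": "Observation", "valueString": "the-value"}
as_coding = {
    "resourceType": "Observation",
    "valueCoding": {"system": "the-uri", "code": "the code"},
}

print(to_underscore_type(as_string, "value", "string", is_primitive=True))
print(to_underscore_type(as_coding, "value", "Coding", is_primitive=False))
```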
nicola (RIO/SS) (May 30 2018 at 20:29):
@Grahame Grieve this looks prettier. Maybe use @type instead of _type (like json-ld). Also, for unification, we could change resourceType to @type too
Grahame Grieve (May 30 2018 at 20:31):
I deliberately didn't propose @type because it's overloaded by json-ld and we would not be using it consistently with json-ld
Grahame Grieve (May 30 2018 at 20:32):
I don't feel strongly about that. json-ld is problematic, I think... I don't think we'll ever go there
nicola (RIO/SS) (May 30 2018 at 20:32):
Another alternative is use type name as key - this will simplify exact access (at least in database) without filtering by type: resource.value.string or resource.value.Coding
Grahame Grieve (May 30 2018 at 20:33):
resource.value.string.value ? yay!
nicola (RIO/SS) (May 30 2018 at 20:33):
with primitive extensions - yes :(
Grahame Grieve (May 30 2018 at 20:33):
but I don't see how resource.value.string is better than resource.valueString
Grahame Grieve (May 30 2018 at 20:34):
it's an additional access point without any semantic improvement
nicola (RIO/SS) (May 30 2018 at 20:34):
It's more consistent, simple logical check - you will be able to have collection of polymorphics
nicola (RIO/SS) (May 30 2018 at 20:35):
As well check that value exists will be resource.value is not null
instead of resource.valueString || resource.valueCoding
nicola (RIO/SS) (May 30 2018 at 20:36):
Another check - required constraints on it in json-schema using required key
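The difference in schema terms can be sketched in Python. The two dictionaries below are JSON Schema fragments written as Python literals; the list of allowed types is truncated to two for illustration. The `satisfies` function is a deliberately tiny, hypothetical evaluator that understands only the `required` and `anyOf` keywords, and is not a substitute for a real JSON Schema validator.

```python
# Nested form: the element name is fixed, so one `required` entry is enough.
NESTED_SCHEMA = {"type": "object", "required": ["value"]}

# Flat form: one alternative per allowed type of value[x].
FLAT_SCHEMA = {
    "type": "object",
    "anyOf": [
        {"required": ["valueString"]},
        {"required": ["valueCoding"]},
    ],
}


def satisfies(instance: dict, schema: dict) -> bool:
    """Evaluate only the `required` and `anyOf` keywords (illustration only)."""
    if any(key not in instance for key in schema.get("required", [])):
        return False
    if "anyOf" in schema:
        return any(satisfies(instance, sub) for sub in schema["anyOf"])
    return True


print(satisfies({"value": {"string": "x"}}, NESTED_SCHEMA))
print(satisfies({"valueCoding": {"code": "x"}}, FLAT_SCHEMA))
print(satisfies({"resourceType": "Observation"}, FLAT_SCHEMA))
```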
Lloyd McKenzie (May 30 2018 at 21:20):
I'm a bit disconcerted by Josh (who was one of the strongest proponents of the current JSON syntax) indicating that he feels it was a mistake and Grahame (lead architect for FHIR) also saying it's a mistake. And the sole arguments against change being "it impacts too many people".
We are not yet normative. If we're going to make this change, it would likely mean that the JSON couldn't go normative this cycle (though XML still could). I suspect a lot of people will vote against this change because they use JSON for exchange, but don't use JSON internally and don't really care what the JSON looks like so long as it carries the data. For them, change of any sort is bad. However, I'm curious whether any of those who actually use JSON for storage or internal manipulation would vote against this change. I'd hate for us to lock down on something where the community consensus is already "we did it wrong", but we feel we're too far along to change.
Grahame Grieve (May 30 2018 at 21:26):
well... specifically: the way we do primitives is a mistake. On balance, I don't agree with the other changes being discussed here.
Grahame Grieve (May 30 2018 at 21:26):
but I do think it's too late to change, personally
Grahame Grieve (May 30 2018 at 21:27):
and those people who think that we don't value json as much as xml... there's nothing that could say that as strongly as not making json normative
Dmitri Sotnikov (May 30 2018 at 22:20):
Again, I'd like to point out that once the spec becomes normative everybody using it will be affected by it for a very long time to come. If there is a consensus that the current design is a mistake, it really should be changed. The inconvenience for the few early adopters is insignificant when considered against the burden this would put on literally every healthcare IT organization that will be using this standard. I can't stress this point enough.
Grahame Grieve (May 30 2018 at 22:23):
it's not a question of how much you stress the point, but how people will vote.
Grahame Grieve (May 30 2018 at 22:25):
and I don't think that this is that big a deal. In fact, I think that many json implementers will vote for the current format, given a choice, simply because 100% of the primitives they deal with are simpler in the current format
Ivan Dubrov (May 31 2018 at 00:48):
In our current system we are very much JSON-oriented and use it internally as our data structure (which was the simplest way to start). However, we are again and again hitting all the same issues mentioned above, with extensions and with type variants. In every place we work with data in a generic way we have to remember to handle extensions and type variants properly.
At this point it looks like restructuring the data would have been much better approach (and we already do that in one part of the system). It was deceptively "easy" to use JSON representation directly (I didn't think about extensions and type choices until I got familiar enough with FHIR).
On the other hand, as much as I would love to have an alternative JSON representation, supporting both XML and JSON means some conversion is still needed (at very least, arrays are represented differently, etc) -- and if you do it, you might as well do it for JSON, too.
Ivan Dubrov (May 31 2018 at 03:27):
Oh, forgot about the GraphQL. The current JSON representation is also leaking into GraphQL API, which is also non ideal (polymorphic types are better represented using GraphQL unions, for example).
Grahame Grieve (May 31 2018 at 07:51):
i certainly considered that. But there was some reason I didn't.... I think because I wanted standard graphql logic over existing json to work as expected....
Ewout Kramer (May 31 2018 at 13:20):
"but I don't see how resource.value.string is better than resource.valueString"
When I mentioned the thing I wanted to change about our Xml representation, it was actually very similar to this: the fact that we have not separated clearly the type from the name has proven a disadvantage in a few spots. For one, I'd like to offer my users a consistent path to a property, so my API's offer access using the name, not name+type - this requires access to metadata, even at the faster lower-levels of my parser. Second, instead of a hashed lookup of the name, I need to compare prefixes in the list of elements, which is measurably slower again.
Which is not to say I am in favor of changing the serialization format, that would cause me way more pain than this. Having an "optional" alternative (as Nicolai suggested) would of course turn out into just another obligation on the server writers, cause now you need to support both. I would do it, but I can see not everyone would jump for joy...
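The lookup cost described above can be sketched in Python. With the name and type fused into one property, finding `value[x]` means scanning keys for a prefix and then consulting metadata to map the suffix back to a type name; with a type-keyed object, the element name alone is a direct lookup. `VALUE_TYPES` is a hypothetical, tiny stand-in for what a real parser would read from the structure definitions, and both function names are made up.

```python
# Hypothetical slice of metadata: some types allowed for Observation.value[x].
VALUE_TYPES = ["string", "Coding", "dateTime", "Quantity"]


def get_choice_flat(resource: dict, element: str, allowed_types: list):
    """Prefix-scan the keys, then use metadata to recover the type name."""
    by_suffix = {t[0].upper() + t[1:]: t for t in allowed_types}
    for key, value in resource.items():  # linear scan, not a hashed lookup
        if key.startswith(element):
            type_name = by_suffix.get(key[len(element):])
            if type_name:
                return type_name, value
    return None


def get_choice_nested(resource: dict, element: str):
    """With a type-keyed object, the element name is an ordinary dict lookup."""
    nested = resource.get(element)
    if not nested:
        return None
    type_name, value = next(iter(nested.items()))
    return type_name, value


flat = {"resourceType": "Observation", "valueDateTime": "2018-05-31"}
nested = {"resourceType": "Observation", "value": {"dateTime": "2018-05-31"}}

print(get_choice_flat(flat, "value", VALUE_TYPES))
print(get_choice_nested(nested, "value"))
```

Note that the suffix `DateTime` cannot be turned back into `dateTime` (as opposed to a complex type named `DateTime`) without the metadata table, which is the dependency described above.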
Grahame Grieve (May 31 2018 at 20:48):
the HAPI POJOs create both paths... fine for the RI I know. but changing to that would have other strong disadvantages...
Grahame Grieve (Jun 06 2018 at 04:16):
Blog about this: http://www.healthintersections.com.au/?p=2824
Peter Jordan (Jun 06 2018 at 05:24):
If one JSON format is better for exchange and another for persistence, then the answer may be just for implementations to convert from one format to another where necessary and I'm guessing that the relevant class/utility already exists, somewhere. Agree that changing the exchange (FHIR) format at this stage looks bad from a product perspective and two exchange formats for JSON will hinder interoperability.
Grahame Grieve (Jun 06 2018 at 11:22):
so we would not offer any advice about a single internal-suitable format?
Paul Knapp (Jun 06 2018 at 11:31):
@Lloyd McKenzie I don't think you can make the assertion that "most implementers will use one of the reference implementations" given that A) many implement a façade and therefore don't require one of the reference implementations and B) most mature development shops will have to take financial and legal responsibility for their systems and that the reference systems are unlikely to accept that risk on their behalf.
Grahame Grieve (Jun 06 2018 at 11:32):
no. Lloyd was wrong about that. Many applications do, but many don't
Nick Rupley (Jun 06 2018 at 13:06):
As a sort of side-topic, I'd like to add that the JSON Schema backing all resources / data types is important to my organization. We develop a FHIR Connector Extension for our open-source product called Mirth Connect that is used by thousands of healthcare implementers globally. The FHIR Connector uses the JSON Schema to create a dynamic GUI resource builder.
It's important that if the JSON format changes, that the JSON Schema is also kept up to date. If the format "branches" into two possibilities, I think that can be expressed with a "oneOf" property. Or it may make sense to branch the entire thing into two separate schemas?
Nick Rupley (Jun 06 2018 at 13:07):
EDIT: Sorry this Zulip interface is very confusing, can't tell if I'm replying to a thread or creating a new thread in this topic haha...
Sean McIlvenna (Jun 06 2018 at 16:18):
I agree with @Nick Rupley... it is important to keep the json schema updated.
I also, however, believe that the current JSON format is very good for automatically generated GUIs
Sean McIlvenna (Jun 06 2018 at 16:19):
Frameworks like Angular have problems binding to simple data types, and require binding to a primitive within an object
Sean McIlvenna (Jun 06 2018 at 16:20):
Which is exactly what we do now. So I think we will equally run into issues with binding UIs (with frameworks like Angular) if we change the JSON format for primitives
Pascal Pfiffner (Jun 06 2018 at 16:40):
Just to also put this on the table: another (well documented) problem with the current JSON format is the disconnect between FHIR's decimal data type and how it's represented in JSON. Many JSON libraries have precision issues since they use float or double representations for numbers, and almost all lose trailing zeroes. The only solution to this particular problem I believe would be to represent decimals as string, which like all the other proposals here would make the JSON format "less native" to use IMHuO – tradeoffs vs. tradeoffs.
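A minimal sketch of the trailing-zero problem described above, using only Python's standard `json` and `decimal` modules. The sample payload is made up for illustration; the point is only that the default number handling drops the trailing zero, while `parse_float=Decimal` keeps the digits as written.

```python
import json
from decimal import Decimal

# Illustrative payload (not taken from any real resource).
raw = '{"value": 1.00}'

# Default parsing maps JSON real numbers to binary floats,
# so the trailing zero is lost.
as_float = json.loads(raw)

# parse_float lets the caller choose the type built from the
# original number text, so the written precision survives.
as_decimal = json.loads(raw, parse_float=Decimal)

print(as_float["value"])    # 1.0
print(as_decimal["value"])  # 1.00
```

Writing the value back out is a separate problem: `json.dumps` does not accept `Decimal` without extra handling, which is part of why the format feels "less native" in this corner.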
nicola (RIO/SS) (Jun 06 2018 at 18:45):
Current json format can not be accurately described by json schema, because of primitive extensions and polymorphics. @Pascal Pfiffner if we make primitives json objects, I think decimal notation will be clearer: {decimal: "..."}
Grahame Grieve (Jun 06 2018 at 19:53):
we already comment about precision here: http://build.fhir.org/json.html#primitive
Grahame Grieve (Jun 06 2018 at 19:53):
this was also discussed at length... it wasn't simple
Grahame Grieve (Jun 06 2018 at 19:54):
the current json format can be described completely with JSON schema but the schema doesn't tell you about the relationships e.g. between id and _id
nicola (RIO/SS) (Jun 06 2018 at 19:57):
you can not express required constraints on primitives and only one for valueX or valueY on polymorphics
Grahame Grieve (Jun 06 2018 at 20:01):
there are no constraints on primitives. And I think I do express the "only one" thing. but it gets real complex if you mix primitives in the choice...
Pascal Pfiffner (Jun 06 2018 at 20:14):
Yes, this is documented and was discussed, just trying to ensure that it's mentioned in this thread in case JSON format redesign is indeed being considered.
nicola (RIO/SS) (Jun 06 2018 at 20:44):
@Grahame Grieve ‘required’ means min=1 in profile
nicola (RIO/SS) (Jun 06 2018 at 20:45):
json schema does not have a ‘require property a or b’ keyword
nicola (RIO/SS) (Jun 06 2018 at 20:46):
and ‘one of property’
Grahame Grieve (Jun 06 2018 at 20:47):
ah yes, that is correct
Grahame Grieve (Jun 06 2018 at 20:57):
json schema probably should have that...
Nick Hatt (Jun 06 2018 at 21:38):
I voted :thumbs_up: on revisiting the issue, only because I think it's going to be a recurring theme and a little more documentation around XML->JSON would help. I'll wager that people coming to FHIR/healthcare in 2018 are overwhelmingly coming from JSON backgrounds. They will ask the same question as @nicola (RIO/SS) , and I think they deserve an answer. Reading through this thread has been difficult, so pointing people back to it would not be helpful.
Nick Hatt (Jun 06 2018 at 21:41):
For example, this in the XML format documentation has always bugged me:
Given the way extensions work, applications reading XML resources will never encounter unknown elements.
If a core design principle is that applications reading XML resources will never encounter unknown elements, I'm fine with that, but it's helpful to know that that's a principle to quell my frustration with iterating through extensions in JSON.
Grahame Grieve (Jun 06 2018 at 21:50):
so you think we should repeat that on the JSON page?
Lloyd McKenzie (Jun 06 2018 at 22:00):
We wouldn't necessarily need to repeat it on the JSON page, but having a separate page that provides background and context for why things are the way they are (perhaps with a link from the main JSON and XML pages) wouldn't be a bad thing. Implementers could ignore, but those who have questions could read it or be directed there.
Grahame Grieve (Jun 06 2018 at 22:01):
might be wiki content...
nicola (RIO/SS) (Jun 06 2018 at 22:11):
i still do not think current design is good and predict a lot of negative feedback in the future from json land :(
nicola (RIO/SS) (Jun 06 2018 at 22:15):
can we represent primitives as primitives in the simple case and switch to object notation in case of extensions: attr: "..." or attr: { ext: ..., value: ... }?
Grahame Grieve (Jun 06 2018 at 22:16):
I kind of like that but I think that it would cause a lot of run time errors
nicola (RIO/SS) (Jun 06 2018 at 22:17):
this is more regular and these errors look reasonable
Grahame Grieve (Jun 06 2018 at 22:17):
I cannot recall that specific proposal ever being discussed before
nicola (RIO/SS) (Jun 06 2018 at 22:20):
at least in databases we will use “res.attr.value or res.attr” expression and this can be unified with polymorphics res.attr.string and res.attr.Quantity
nicola (RIO/SS) (Jun 06 2018 at 22:26):
also polymorphics representation as res.attr.Quantity simplify fhirpath impl res.attr.as(Quantity)
Ewout Kramer (Jun 07 2018 at 06:41):
also polymorphics representation as res.attr.Quantity simplify fhirpath impl res.attr.as(Quantity)
Yes, that's one of the lovely aspects of separating the path from the type. If we'd consider that, we should do the same for XML though. I would not necessarily do it with a "." though, as it becomes ambiguous whether you are talking about a path or a type, so I would prefer another separator.
Ewout Kramer (Jun 07 2018 at 06:45):
We've discussed alternatives like the one Nicolay describes (having both attr: and attr: { ext: value: }), but we shied away from it, because the path to the value would change depending on whether the value has extensions (or an id). So, you'd have to try both "attr" and "attr.value" if the first is null.
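A rough sketch of what "try both paths" would mean for a reader of the proposed mixed notation. This is not part of any FHIR specification: the `ext` and `value` keys follow the shorthand used in this thread, and `primitive_value` is a hypothetical helper name.

```python
from typing import Any

def primitive_value(node: Any) -> Any:
    """Return the value of a primitive that may be bare or wrapped.

    Hypothetical mixed notation from this thread:
      "attr": "x"                            (no extensions)
      "attr": {"value": "x", "ext": [...]}   (with extensions)
    """
    if isinstance(node, dict):
        # Wrapped form: the value sits one level down.
        return node.get("value")
    # Bare form: the node is the value itself.
    return node

plain = {"gender": "male"}
extended = {
    "gender": {
        "value": "male",
        # Example extension; the URL is a placeholder.
        "ext": [{"url": "http://example.org/ext", "valueString": "note"}],
    }
}

print(primitive_value(plain["gender"]))     # male
print(primitive_value(extended["gender"]))  # male
```

Every consumer would need a check like this on each primitive, which is the cost being weighed here against the current `attr` / `_attr` split.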
Grahame Grieve (Jun 07 2018 at 06:55):
actually the first would be an object, not null. json-ld does this with the @context
Grahame Grieve (Jun 07 2018 at 06:56):
though it's either an object or a string that's a reference to the object. I don't remember discussing it
Ewout Kramer (Jun 07 2018 at 08:12):
actually the first would be an object, not null. json-ld does this with the @context
Indeed, object. So, is that a concern, @nicola (RIO/SS) ?
Michele Mottini (Jun 07 2018 at 17:39):
In the US DSTU2 is already widely used, and so here to stay for years - so changes to a new JSON format would mean that everyone in the US would have to support _both_ the new and the old formats
Grahame Grieve (Jun 07 2018 at 19:48):
yes that's true
nicola (RIO/SS) (Jun 08 2018 at 16:29):
@Michele Mottini it will be used after normative release by many more users for decades
Michele Mottini (Jun 08 2018 at 17:08):
Maybe - but everyone would still have to support both formats for a good part of those decades
nicola (RIO/SS) (Jun 08 2018 at 19:00):
Some servers do not support xml or json, so this is not a big problem :). I think in a later phase FHIR will be driven by implementations, the way HTTP, HTML and CSS are driven by browsers.
Grahame Grieve (Jun 08 2018 at 23:37):
I have to say, having reviewed some of my code.... I would strongly prefer
represent primitives as primitives in simple case and switch to object notation in case of extensions: attr: "..." or attr: { ext: ..., value: ... }?
There are existing unreported bugs in my code in the corner cases around the existing approach
Grahame Grieve (Jun 08 2018 at 23:41):
but it seems, based on feedback to this point, there's just not enough interest to support a change this size. (lots of interest in a very small community...)
Abbie Watson (Jun 21 2018 at 19:21):
However, I'm curious whether any of those who actually use JSON for storage or internal manipulation would vote against this change. I'd hate for us to lock down on something where the community consensus is already "we did it wrong", but we feel we're too far along to change.
So many thoughts.
+1 for Nicola’s concept of ‘inlining’, which I’ve referred to in the past as ‘flattening’ or ‘resolving’. It would be great to see official rules and guidance on this. Internally, FHIR resources will get globbed together into phylogenetic trees, cladograms, graphs, and other complex data structures. The idea that resources are always exchanged in a Bundle is probably the largest discrepancy between the official spec and our use of FHIR in internal storage and data models.
The FHIR wire format being unusable without complex and fragile 3rd party libraries is just a matter of time. Once the standard goes normative, we can begin adding more tooling and moving those libraries into database drivers, network drivers, preprocessors, build tools, IDEs, etc. That becomes possible when the standard goes normative and we can point other organizations to a spec that has the backing of HL7 and ANSI.
Also, if the wire format doesn’t support FHIR natively, change the wire protocol! Which is essentially what we’ve begun to do by defining the application/json+fhir mime type. HL7 and ANSI have enough political clout and long-term process management that they can think in these terms. Let’s keep in mind that we’re defining a new standard through formal standards development processes, rather than simply writing software that conforms to existing standards.
That we have to go through a complex, error prone, and expensive parsing step before we’re able to work with the data effectively isn’t a convincing argument to me. That’s basically the experience of anybody who has used or written their own network driver. This issue has been present since the dawn of telecommunications. Let’s just be glad that our complex, error prone, and expensive encoding/decoding is human readable in JSON and English and is well documented in a robust API. And that it’s not pipe delimited, or requires us parsing ZMODEM or Kermit parameters.
But once we receive the data, we have to write it to the database. Besides decimal representation, date and geolocation lat/long are the things we have to regularly accommodate for. We’re slowly refining and refactoring our database mapping functions, and will eventually get them into a differential package that we can submit to Mongo.
Date usage, and the need to handle partial knowledge about birthdates, is probably the $64K healthcare use case that drives the need to actually depart from existing JSON primitives. We currently cast to a Date() object for internal storage, but that’s clinically problematic for a bunch of reasons. We make heavy use of the Moment.js library, which comes close to being able to parse the HL7 rules on date management; and will probably wind up attaching a Provenance resource to the object if there’s not enough information to resolve to an ISO Date. Like... if we get a birthdate that is simply “1950”, we’re currently storing it as “Jan 1st, 1950, 00:00:00”, and will probably add a meta tag or extension that points to a provenance value set or Provenance resource type that says ‘this date is fuzzy and unknown’.
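One way to avoid silently turning “1950” into a full timestamp is to keep the precision next to the stored date. The sketch below assumes the three FHIR `date` shapes (YYYY, YYYY-MM, YYYY-MM-DD); `PartialDate` and `parse_fhir_date` are illustrative names, not part of any library.

```python
import re
from datetime import date
from typing import NamedTuple

class PartialDate(NamedTuple):
    value: date      # earliest calendar date consistent with the input
    precision: str   # "year", "month", or "day"

# Year, optionally followed by month, optionally followed by day.
_DATE_RE = re.compile(r"(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?")

def parse_fhir_date(text: str) -> PartialDate:
    match = _DATE_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a YYYY, YYYY-MM or YYYY-MM-DD date: {text!r}")
    year, month, day = match.groups()
    if day is not None:
        precision = "day"
    elif month is not None:
        precision = "month"
    else:
        precision = "year"
    # Missing parts default to 1 only for storage; `precision` records
    # that they were never supplied.
    return PartialDate(date(int(year), int(month or 1), int(day or 1)), precision)

print(parse_fhir_date("1950"))
print(parse_fhir_date("1950-06-15"))
```

Storing the pair lets a reader tell a real 1 January birthdate from a year-only one without consulting a separate Provenance record, though that remains an option.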
Trivia fact... the first video monitor was actually a cathode ray tube, which was used as a storage device at the time. So Sean’s comments about Angular’s problems with binding to simple data types is actually quite relevant and has a long history. We deal with something similar, in that we’re using isomorphic code, and have a library called minimongo that implements our server side storage API in the web browser. So, those storage issues with decimal, date, and lat/long are on both server and client. But, by the same token, we have an isomorphic library that can box/unbox the values as needed.
Regarding disk storage, we’d like to see the application/json+fhir mime type extended to a file extension. Do we simply save a FHIR file as a .json file? I quite like having .fhir, .ccd, and .bundle extensions.
tl;dr - Personally, we’re fine with the approach to primitives that’s been developed to date, and wouldn’t want to hold up R4 normative with a last minute change. Would encourage people to think about the life cycle of a standard, and how once things go normative, it will lead to tooling, drivers, etc that support the standard - even if they don’t currently exist.
Apologies for the wall of text.
Lloyd McKenzie (Jun 21 2018 at 20:15):
Why both .fhir and .bundle extensions?
Abbie Watson (Jun 21 2018 at 21:35):
We have different import rules based on the file extension. We’re still a bit undecided if clinical workflow is better supported by defining the extension as data content type or top level resource. There’s an argument for either (or even both, as in the case of TestPatient.bundle.fhir vs TestPatient.Medication.fhir). The biggest use case by far is import/export functionality. Which is a type of file transfer and interoperability if you think about it... data transfer via persistent data storage.
Abbie Watson (Jun 21 2018 at 22:05):
And if you really want to run with the idea, consider how DICOM works as a container mechanism for JPG files (analogous to the manila patient record folders). Maybe a .fhir extension shouldn’t be a modified .json file at all. Maybe it should be a password protected .zip file, which would enforce encrypted data at rest and HIPAA compliance. Just a thought.
Lloyd McKenzie (Jun 22 2018 at 00:03):
Different types of Bundles would be handled differently though. And we may eventually have other FHIR resource types (or environments where other existing FHIR resource types) are handled differently. I think that if the system knows it's FHIR, that's enough information to look inside and decide how it wants to process it.
Abbie Watson (Jun 22 2018 at 00:28):
Totally agreed the resourceType field has all the logic we need, and using .bundle or other .resourceType extensions probably isn’t the way to go. If it’s a .fhir extension, expect to be able to open up the file and find a resourceType field. That alone would be hugely useful to draft into the specification.
Abbie Watson (Jun 22 2018 at 00:30):
The one thing that does, though, is lock .fhir into being a .json analog (totally reasonable), so using it as a .zip file is out. Which makes me then think about .sfhir
Abbie Watson (Jun 22 2018 at 00:31):
SFHIR = Secure Fast Healthcare Interoperability Resource = password encrypted zip file containing a .fhir file
Lloyd McKenzie (Jun 22 2018 at 13:33):
That does raise the question of whether we need a different extension for the JSON, XML, RDF and eventually other syntaxes used with fhir.
Ivan Dubrov (Jul 16 2020 at 18:07):
I apologize for necroposting, but I need to vent a bit since I'm facing this issue once again.
After almost three years working with FHIR (which included implementing both a FHIR-native server and frontend libraries), I can say that current JSON representation is one of the biggest obstacles (mostly, for frontend / JavaScript-based platforms).
On backend, where we use non-JavaScript based language (Rust), we designed our static types in a way that they reflect FHIR semantics (where primitive types are actual composite objects and type choices are choice types, enums). It did require us to write our custom serializers / deserializers for JSON since standard one wouldn't map data the way we wanted, but there were other reasons to do that anyway, and in the end, it's not that complicated. It worked out pretty well in the end, I think. However, it was possible primarily because we were able to restrict wire format issue to deserialization / serialization logic, which happens early.
However, frontend part, where everything is "JSON-native", and there is really no need for "deserialization" to happen, current JSON representation is a huge problem:
- It's hard to express types (say, TypeScript) over existing JSON representation that are accurate.
- Being not able to pass primitive value as a single value makes all kinds of auto-generated or metadata-driven forms really hard.
- Same for type choices
- Writing data back to resource (updating its fields) requires knowledge of parent of item being edited: you cannot just "return" updated value. Somewhat manageable for primitives which could be special-cased (because you have two places to write at most, the value itself, and its "element" counterpart), but worse for type choices (where you would need to know all the other possible fields to look at). Typechoices with primitive values? Oh, well. Actually, primitives are also quite difficult for the case when arrays of primitives are involved.
- As an extension of the previous problem, (React) state management for any kind of forms based on FHIR type (say, editor for OperationDefinition) is _really_ hard because state management works well when you can structure your components such that children "select" a subset of state from their parent, and it is not easy / possible with current scheme.
- Since extensions are relatively rare on primitives, it encourages yolo approach ("you only live once") to development where extensions are simply ignored.
These issues are not impossible to solve, but they really throw a wrench into any kind of metadata-driven approach to FHIR resources.
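To make the "needs knowledge of the parent" point concrete, here is a small sketch of reading and writing a single non-repeating primitive in the current JSON format, where the value lives under `name` and its id/extensions live under the sibling `_name`. The helper names and the extension URL are made up for illustration, and arrays of primitives (which use parallel arrays) are not handled.

```python
from typing import Any, Dict

def read_primitive(parent: Dict[str, Any], name: str) -> Dict[str, Any]:
    """Combine `name` and `_name` from the parent into one object."""
    element = dict(parent.get("_" + name) or {})
    if name in parent:
        element["value"] = parent[name]
    return element

def write_primitive(parent: Dict[str, Any], name: str, element: Dict[str, Any]) -> None:
    """Split a combined object back into `name` and `_name` on the parent."""
    rest = {k: v for k, v in element.items() if k != "value"}
    if "value" in element:
        parent[name] = element["value"]
    else:
        parent.pop(name, None)
    if rest:
        parent["_" + name] = rest
    else:
        parent.pop("_" + name, None)

patient = {
    "resourceType": "Patient",
    "birthDate": "1970-03-30",
    # Placeholder extension, for illustration only.
    "_birthDate": {
        "extension": [{"url": "http://example.org/fhir/birth-time", "valueString": "10:15"}]
    },
}

field = read_primitive(patient, "birthDate")
field["value"] = "1970-03-31"
write_primitive(patient, "birthDate", field)
print(patient["birthDate"])  # 1970-03-31
```

Note that both helpers take the parent object and the field name: an editor component cannot simply return the updated value, because the update has to land in two sibling keys.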
I understand concerns about the need to support yet another wire format. I would expect that upgrading backend systems should be less of an issue (although, will incur some cost), but will impact existing frontend / JavaScript-based libraries.
I'm optimistic, though, that in the end it would be a net positive change.
Grahame Grieve (Jul 16 2020 at 23:58):
These are all the same arguments I made back when we settled on the current format. But it really is too late to change
Last updated: Apr 12 2022 at 19:14 UTC