FHIR Chat · Low bandwith serialisation

Stream: implementers

Topic: Low bandwith serialisation

Richard Kavanagh (Mar 07 2016 at 09:34):

Is there currently any work going on to look at more efficient serialisation options for FHIR? For systems with high transaction throughput, there is a perception that the current XML/JSON ITS is a little on the heavy side.

Grahame Grieve (Mar 07 2016 at 09:34):

there's been some occasional talk about it, but nothing is currently active.

Grahame Grieve (Mar 07 2016 at 09:35):

what about it is on the heavy side?

Grahame Grieve (Mar 07 2016 at 09:35):

however, there's no silver bullet. Each option where people vary manifests as a choice that must appear in the instance

Richard Kavanagh (Mar 07 2016 at 09:45):

Predominantly it's message size that is causing concern. We have systems that currently have very high transaction rates and we are looking to expose some FHIR end points - we would expect in due course to hit equally high transaction rates on them as well. The size of the message has an impact on server memory, CPU, bandwidth, storage etc.

Grahame Grieve (Mar 07 2016 at 09:49):

well, there's a tension between a general purpose interop format and a high volume transaction system.

Grahame Grieve (Mar 07 2016 at 09:50):

FHIR resources tend to be less granular than many bespoke restful interfaces, so I think that the difference isn't as big as people think

Grahame Grieve (Mar 07 2016 at 09:50):

still, there is room for a more dense form - but we have been thinking of making it extremely dense indeed

Grahame Grieve (Mar 07 2016 at 09:52):

at the cost of making it exquisitely fragile with regard to change. Is that really a good deal in the 1st world?

Grahame Grieve (Mar 07 2016 at 09:52):

however, that's the equation you have to deal with

Richard Kavanagh (Mar 07 2016 at 09:53):

Yes - one of our suppliers has proposed a very dense alternative - consisting predominantly of abbreviated element names and collapsed child elements

Grahame Grieve (Mar 07 2016 at 09:53):

abbreviated element names will really make that much difference?

Richard Kavanagh (Mar 07 2016 at 09:55):

potentially some, but we need to quantify what this would achieve in real world scenarios.

Eric Haas (Mar 07 2016 at 09:57):

Do you mean message size as in the Message paradigm or the REST payload size?

Paul Knapp (Mar 07 2016 at 10:21):

I'm seeing the FHIR equivalents to V3 messages to be significantly smaller. The next reduction I think would need to come from element names and system values.

Paul Knapp (Mar 07 2016 at 10:23):

@Eric Prud'hommeaux: I think Richard means the number of bytes exchanged.

Jason Walonoski (Mar 07 2016 at 12:42):

I've thought about using Google Protocol Buffers to serialize FHIR messages into tiny packages in situations where there is significantly large message throughput... but I don't really have a need to do that right now. Plus, I figured it would be frowned upon.

Richard Kavanagh (Mar 07 2016 at 12:45):

Sorry yes, I should be more precise - My issue is around the number of bytes exchanged. Messaging vs. REST is not really the issue.

Simone Heckmann (Mar 07 2016 at 14:07):

We have a use case in which we need to squeeze a medication plan into a 2D barcode. Here's what we came up with: http://wiki.hl7.de/index.php?title=IG:Ultrakurzformat_Patientenbezogener_Medikationsplan

Simone Heckmann (Mar 07 2016 at 14:08):

Please ignore the German yadda yadda. Just look at the code examples comparing actual FHIR code with the collapsed format

Simone Heckmann (Mar 07 2016 at 14:10):

At least in our scenario, that is sufficient shrinkage. However, in order to be able to inflate/deflate without losing information, the resources have to validate against profiles that are very restrictive

Simone Heckmann (Mar 07 2016 at 14:11):

E.g. Some values are fixed in the profile and thus omitted in the short format.
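
A minimal sketch of the inflate/deflate idea described above, assuming a made-up profile table in which each element path either has a fixed value or varies per instance. The paths, the `PROFILE` table and the function names are illustrative only and are not taken from the German specification.

```python
# Hypothetical profile: element path -> fixed value (None means the value varies).
PROFILE = {
    "name.use": "official",
    "name.family": None,
    "name.given": None,
}

def deflate(resource: dict) -> dict:
    """Drop every element whose value is fixed by the profile."""
    short = {}
    for path, value in resource.items():
        if path not in PROFILE:
            raise ValueError(f"{path} is not allowed by the profile")
        fixed = PROFILE[path]
        if fixed is None:
            short[path] = value
        elif value != fixed:
            raise ValueError(f"{path} must be {fixed!r}")
    return short

def inflate(short: dict) -> dict:
    """Restore the fixed values, walking the profile in order."""
    full = {}
    for path, fixed in PROFILE.items():
        if fixed is not None:
            full[path] = fixed
        elif path in short:
            full[path] = short[path]
    return full

patient = {"name.use": "official", "name.family": "Heckmann", "name.given": "Simone"}
short = deflate(patient)
print(short)
print(inflate(short) == patient)
```

Note that `inflate` always puts the fixed values back, so the round trip is only lossless if the profile makes those elements mandatory. That is one reason the profile has to be so restrictive.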

Richard Kavanagh (Mar 07 2016 at 17:07):

@Simone Heckmann From what I understand of the document this is very much along the lines of what we have been discussing here in England. Is this document a proposal for Germany? Is it being implemented?

Simone Heckmann (Mar 07 2016 at 17:58):

Yes, it's a proposal for Germany. "Is it going to be implemented?" Well, hopefully yes. At least, by proving that there's a WAY to squeeze a FHIR Medication Plan into a barcode, we have been able to make FHIR a candidate for the nationwide mandatory implementation of the medication plan!

Josh Mandel (Mar 07 2016 at 18:14):

If you have a real use case for small payloads, formats like protocol buffers (or the many alternatives) seem like a good fit. I'd be surprised if simply shortening element names in JSON or XML has a big effect after gzip compression.
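
One way to check that expectation is to serialise the same records twice, once with full element names and once with abbreviated ones, and compare sizes before and after gzip. The sketch below uses synthetic, highly repetitive data and made-up abbreviations, so the numbers it prints only illustrate the method. Real payloads would need to be measured separately.

```python
import gzip
import json

LONG_KEYS = ("resourceType", "family", "given", "birthDate", "gender")
SHORT_KEYS = ("rt", "f", "g", "b", "s")  # made-up abbreviations

def records(keys, count=200):
    """Serialise `count` synthetic patient-like records using the given key names."""
    rows = []
    for i in range(count):
        values = ("Patient", f"Family{i}", f"Given{i}", "1980-01-01", "female")
        rows.append(dict(zip(keys, values)))
    return json.dumps(rows).encode("utf-8")

raw_long, raw_short = records(LONG_KEYS), records(SHORT_KEYS)
gz_long, gz_short = gzip.compress(raw_long), gzip.compress(raw_short)

print("raw  :", len(raw_long), "vs", len(raw_short))
print("gzip :", len(gz_long), "vs", len(gz_short))
```

Repeated element names are exactly what a dictionary-based compressor removes, which is why the gap between the two variants tends to shrink once gzip is applied.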

Richard Kavanagh (Mar 07 2016 at 18:19):

That's part of what we hope to explore in a future Proof of Concept exercise. I think GZIP would only give a bandwidth benefit. @Simone is this going to be adopted as a German localisation of FHIR?

Brian Postlethwaite (Mar 07 2016 at 21:18):

Just looking specifically at <P g="Michaela"/>
This format can't handle multiple given names.

Grahame Grieve (Mar 07 2016 at 21:19):

no. that's the point; you define in your profile what is possible, and then eliminate every redundant syntax

Grahame Grieve (Mar 07 2016 at 21:19):

we called this 'profiled fhir' - it's written up in http://hl7-fhir.github.io/profiling.html but commented out

Brian Postlethwaite (Mar 07 2016 at 21:24):

ok, thanks.

David Hay (Mar 08 2016 at 01:01):

I don't think that's allowed is it? - ie 'defaulting' a value from the profile (at least in the general case)

Grahame Grieve (Mar 08 2016 at 01:09):

not generally. but it is if you're doing 'profiled FHIR', though I see that the write up of that completely disappeared. Perhaps a topic for Montreal

Richard Kavanagh (Mar 08 2016 at 07:22):

+1 on that Grahame - I'd like to be in a position where I can have more informed conversations with suppliers in this area.

Stefan Lang (Mar 08 2016 at 10:33):

I'd just like to throw in IoT / connected devices. In mobile health devices the message/payload size is not only an issue of bandwidth but also of battery drain.

Paul Knapp (Mar 08 2016 at 11:45):

@Grahame Grieve: I'm getting an internal error 500 on your web server.

Grahame Grieve (Mar 08 2016 at 11:46):

(deleted)

Grahame Grieve (Mar 08 2016 at 11:46):

what URL?

Paul Knapp (Mar 08 2016 at 11:46):

http://www.healthintersections.com.au/

Grahame Grieve (Mar 08 2016 at 11:47):

oh.

Grahame Grieve (Mar 08 2016 at 11:47):

cpanel hosted. I'll pass that along to the host

David Hay (Mar 08 2016 at 17:54):

Can I confirm my understanding of the current state of play - when I think of 'profiled FHIR' I think of how profiles are currently applied in that they don't support defaults and don't change the wire syntax...

Grahame Grieve (Mar 08 2016 at 18:59):

profiles currently do not change the wire format. We've always been aware that you could use them to do so, and previously used the name 'profiled fhir' to describe doing so

David Hay (Mar 08 2016 at 19:01):

hmmm. confusing terminology I think...

Jason Careless (Mar 09 2016 at 19:08):

But in essence, the use of an 'ultra-short format' profile as described in the German link above would fit with accepted standards and, if well defined and agreed of course, could offer the 'minimal textual form' of the information we may look to exchange within a high-transaction-rate / limited-bandwidth environment?

Lloyd McKenzie (Mar 09 2016 at 19:10):

If we were talking about ultra-low bandwidth, then a profile would define what elements were in vs. out and all such elements (extension or core) would be identified positionally or with a simple sequence byte or something.

Lloyd McKenzie (Mar 09 2016 at 19:11):

It would be extremely non-human friendly...

Jason Careless (Mar 09 2016 at 19:14):

I totally agree on the friendliness for plain-text reading by humans

Stefan Lang (Mar 09 2016 at 19:44):

The rationale behind the German short format is to minimize required storage for usage with QR codes or chip cards (the German insurance cards have only a few KB of data storage for all applications), while still using well formed XML to keep it smooth for implementers in terms of data validation.

Stefan Lang (Mar 09 2016 at 19:50):

There is some proprietary format out there that has a strange delimiter based format and the "ultra short format" is an alternative draft to avoid that proprietary thing

Stefan Lang (Mar 09 2016 at 19:51):

Concerning human-friendliness: I don't think this is of any importance in a low bandwidth / low capacity scenario

Simone Heckmann (Mar 10 2016 at 10:31):

I agree that human readability should not be a concern here. ...as long as one can bijectively transform back and forth from "proper" FHIR to lbs.

Paul Knapp (Mar 10 2016 at 10:35):

I think this would be an alternate serialization, an ITS, rather than a profile, and that the profile should be separate from the serialization: xml, reduced-xml (the short name form), json, reduced-json, etc.

Grahame Grieve (Mar 10 2016 at 10:37):

the link with profile is that it's only when you have a useful profile that you can usefully reduce the wire format

Stefan Lang (Mar 10 2016 at 12:26):

@Simone Heckmann Since the ultra short format as proposed follows a kind of "convention over configuration" paradigm, transformation from that format to proper FHIR would require injecting information about the convention used (e.g. there would be no information about any code system in the short format)

Paul Knapp (Mar 10 2016 at 13:00):

@Grahame Grieve: Agreed. And if it is defaulting and it is a non-formulaic approach then that may move away from an ITS to a custom wire format tightly, maybe solely, attached to a profile.

Patrick Werner (Apr 12 2016 at 11:57):

my current considerations about the USF mapping:
- building profiles for the resources which should be mapped
- elements which should be mapped will contain an XPath string in mapping.map with its destination in the UltraShortFormat
- to get the long form from USF you just have to map each element to the corresponding XPath from the profile
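
The mapping-driven expansion described in the list above could be sketched roughly as follows. This assumes a hand-written table from short attribute names to element paths. In the proposal that table would be read from `mapping.map` in the profile, and the short names `P`, `f` and `g` are used here purely for illustration.

```python
import xml.etree.ElementTree as ET

FHIR_NS = "http://hl7.org/fhir"

# Hypothetical mapping: short attribute -> element path below the resource root.
# Dict order decides the order of the generated elements.
SHORT_TO_LONG = {
    "f": ["name", "family"],
    "g": ["name", "given"],
}
ROOT_NAMES = {"P": "Patient"}  # short root element -> resource name

def inflate(short_xml: str) -> ET.Element:
    """Expand a short-form element into namespaced long-form FHIR XML."""
    short = ET.fromstring(short_xml)
    root = ET.Element(f"{{{FHIR_NS}}}{ROOT_NAMES[short.tag]}")
    for attr, path in SHORT_TO_LONG.items():
        if attr not in short.attrib:
            continue
        node = root
        for step in path:
            tag = f"{{{FHIR_NS}}}{step}"
            child = node.find(tag)
            # Reuse existing parents, but always create a fresh leaf element.
            if child is None or step == path[-1]:
                child = ET.SubElement(node, tag)
            node = child
        node.set("value", short.attrib[attr])
    return root

ET.register_namespace("", FHIR_NS)
long_form = inflate('<P g="Simone" f="Heckmann"/>')
print(ET.tostring(long_form, encoding="unicode"))
```

A one-to-one table like this keeps the transformation reversible. Merging several elements into one attribute would need more than a path lookup.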

Patrick Werner (Apr 12 2016 at 11:59):

the only downside if you just use XPath strings is that you are restricted in what you can do. String concatenation to merge different elements into one element or attribute wouldn't be possible

Patrick Werner (Apr 12 2016 at 12:02):

if we would use XSLT instead of pure XPath, concatenations and other fancy stuff would be possible

Simone Heckmann (Apr 12 2016 at 12:05):

Quoting Martijn's idea:

Martijn: The thought I had yesterday (from the top of my head, so not researched) was this:
- we have to store a resource instance in as few bytes as possible.
- besides regular compression, there is stuff that we (the reader of the QR code) know. And we can reference that instead.

Martijn: for example: a resource always matches some profile, which can be found online

Martijn: We could reference each field in that profile by index. This index will probably fit in one byte. (less than 256 possible fields)

Martijn: Since the order in FHIR is mandatory.

Martijn: So Patient.humanName would be: 10.

Simone Heckmann (Apr 12 2016 at 12:06):

Martijn: For known data types you can think of a default format. Or use sub indexing.

Martijn: So Patient.humanName.family would be 2 bytes: (10)(3)
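
A small sketch of that indexing scheme, with made-up index tables. The values 10 and 3 are taken from the example in the messages above, everything else (including the use of `Patient.name` as the element name) is an assumption for illustration.

```python
# Hypothetical index tables; real ones would be derived from the profile.
FIELD_INDEX = {
    "Patient": {"name": 10},
    "HumanName": {"family": 3, "given": 4},
}
# Datatype of each complex child, needed to pick the next index table.
CHILD_TYPE = {("Patient", "name"): "HumanName"}

def path_to_bytes(path: str) -> bytes:
    """Encode a path such as 'Patient.name.family' as one index byte per step."""
    root, *steps = path.split(".")
    current, out = root, bytearray()
    for step in steps:
        out.append(FIELD_INDEX[current][step])
        current = CHILD_TYPE.get((current, step), current)
    return bytes(out)

print(list(path_to_bytes("Patient.name")))
print(list(path_to_bytes("Patient.name.family")))
```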

Simone Heckmann (Apr 12 2016 at 12:07):

Simone Heckmann: ...using the mapping mechanism to assign index to fields...?

Simone Heckmann: Sort of like this:
pasted image

Simone Heckmann: "USF" meaning "ultra short format" :)

Simone Heckmann: The implication of a field NOT having an associated index would be that this field is either not supported in the USF or has a fixed value that can be added back by default when translating from USF to full FHIR. Does that make sense?

Simone Heckmann: We'd have to add the information somewhere in the Barcode, what profile the USF is based on...

Simone Heckmann (Apr 12 2016 at 12:07):

Simone Heckmann 1:44 PM
"Patient.humanName.family would be 2 bytes: (10)(3)"
...that would mean that we'd have to profile every complex datatype to add the mappings to indices. Then the IG Property "Profiles that apply globally" would need to redirect all complex types to the profiled types...
Feasible or messy?

Patrick Werner (Apr 12 2016 at 12:11):

with xslt transformations we would possibly run into problems in terms of bijection so i would prefer a plain XPATH approach

Patrick Werner (Apr 12 2016 at 12:12):

merging data fields would lead to bijection problems anyway

Patrick Werner (Apr 12 2016 at 12:13):

i personally prefer the proposal of HL7 Germany in the USF Format. It is still human readable (kind of) still XML -> you can validate it against a schema

Patrick Werner (Apr 12 2016 at 12:16):

and you still can use the mappings if the resource or profile changes. If we would just count the data fields every minimal change of a resource structure leads to a complete remapping

Simone Heckmann (Apr 12 2016 at 12:20):

Considering that especially in our QR scenario, size is crucial, i think that the byte addressing could shrink the USF considerably more than the "short XML" approach...

Simone Heckmann (Apr 12 2016 at 12:30):

I am just trying to wrap my head around what that means in terms of implementation. Assuming we have an instance of Patient like

<?xml version="1.0" encoding="UTF-8"?>
<Patient xmlns="http://hl7.org/fhir">
    <name>
        <use value = "official"/>
        <family value = "Heckmann"/>
        <given value = "Simone"/>
    </name>    
</Patient>

and a profile that maps
name -> 11 (being the eleventh field of Patient, counting in all inherited fields)
use->fixed to "official"
family->1103 (family being the third field of the complex type Human Name)
given->1104 (given being the fourth field of the complex type Human Name)

What would the USF look like?
1103Heckmann1104Simone

How'd we parse that?

Patrick Werner (Apr 12 2016 at 12:32):

i think you would have to have a separator for elementN°|Data

Patrick Werner (Apr 12 2016 at 12:32):

otherwise it would be hard to differentiate between element numbers and numeric values

Patrick Werner (Apr 12 2016 at 12:33):

maybe "|" ;) V2 here we come ;)
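
To illustrate why the separator matters, here is a minimal sketch of a delimited encoding using "|" between element numbers and values. The layout and function names are assumptions for illustration, and escaping of the separator inside values is left out.

```python
SEPARATOR = "|"  # assumed delimiter; values must not contain it unescaped

def encode(fields):
    """Join (element number, value) pairs into one delimited string."""
    for number, value in fields:
        if SEPARATOR in number or SEPARATOR in value:
            raise ValueError("separator inside a field is not supported in this sketch")
    return SEPARATOR.join(f"{number}{SEPARATOR}{value}" for number, value in fields)

def decode(text):
    """Split the string back into (element number, value) pairs."""
    parts = text.split(SEPARATOR)
    if len(parts) % 2:
        raise ValueError("expected alternating element numbers and values")
    return [(parts[i], parts[i + 1]) for i in range(0, len(parts), 2)]

usf = encode([("1103", "Heckmann"), ("1104", "Simone")])
print(usf)
print(decode(usf))
```

With the delimiter in place a purely numeric value, for example `("1103", "4711")`, can no longer be confused with an element number.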

Patrick Werner (Apr 12 2016 at 12:35):

1103Heckmann1104Simone in the shortened XML ->
<p g="Simone" f="Heckmann"/>

Patrick Werner (Apr 12 2016 at 12:37):

it would be interesting to compare the zipped size of both approaches, if i have some free time @conhit i'll give it a go

Simone Heckmann (Apr 12 2016 at 12:38):

In any case we'd have to have the USF indicating the underlying profile, which would be 0203 (meta.profile) by my counting for every resource. So that would be the first thing to look for when decompressing a USF

Patrick Werner (Apr 12 2016 at 12:40):

the profile definitely must be part of a USF resource

Simone Heckmann (Apr 12 2016 at 12:43):

I still wonder how messy all of this gets when we count in repeatable/optional fields and all the implications of having hierarchical structures like a document bundle (which is essentially what the Medication plan is)...

Patrick Werner (Apr 12 2016 at 12:44):

i just was thinking about the same problem. in particular repeatable fields would be a huge problem

Simone Heckmann (Apr 12 2016 at 12:48):

Unless the Profile uses slices to assign unique indices / shorthands to every slice and thereby also restrict the number of repetitions (which probably makes sense for a USF...) If we estimate that a QR can hold a maximum of 15 medications it would make sense for the profile to allow only a maximum of 15 slices and validation failing if someone tried to squeeze more data than possible into it.

Martijn Harthoorn (Apr 12 2016 at 12:56):

An index of course would refer to the index in the profile (or base resource). So repetitive elements would have the same index.

Martijn Harthoorn (Apr 12 2016 at 12:56):

And Patrick is of course right, that regardless of format, you will need a separator.

Martijn Harthoorn (Apr 12 2016 at 12:57):

(binary or not)

Martijn Harthoorn (Apr 12 2016 at 12:58):

The other suggestion I had yesterday, is that instead of a profile reference like a url, or canonical url, you could use a 2 or maybe 4 byte index number that can be found on some registry.

Martijn Harthoorn (Apr 12 2016 at 13:01):

So, let's say you have a Patient with only a name.
(4 bytes: profile reference id to Patient base profile)
(1 byte: value 0, field start code)
(1 byte: value: 10, index of Patient.name in the base profile)
(12 bytes: value: f:John,l:Doe)

Martijn Harthoorn (Apr 12 2016 at 13:02):

(or something like that)
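
The byte layout above can be written out as a short sketch. The profile id value, the big-endian byte order and the rule that values must not contain a zero byte are assumptions added here so that the example can be decoded again.

```python
import struct

FIELD_START = 0  # field start code from the layout above

def encode(profile_id: int, fields: list[tuple[int, bytes]]) -> bytes:
    """4-byte profile reference, then (start code, index byte, value) per field."""
    out = bytearray(struct.pack(">I", profile_id))
    for index, value in fields:
        if not 1 <= index <= 255 or bytes([FIELD_START]) in value:
            raise ValueError("index must be 1..255 and values must not contain 0x00")
        out += bytes([FIELD_START, index]) + value
    return bytes(out)

def decode(data: bytes) -> tuple[int, list[tuple[int, bytes]]]:
    """Read the profile id, then split the rest on the field start code."""
    (profile_id,) = struct.unpack(">I", data[:4])
    fields = []
    for chunk in data[4:].split(bytes([FIELD_START])):
        if chunk:  # the first chunk is empty because the body starts with 0x00
            fields.append((chunk[0], chunk[1:]))
    return profile_id, fields

packet = encode(1, [(10, b"f:John,l:Doe")])
print(len(packet), packet.hex())
print(decode(packet))
```

The packet length matches the 4 + 1 + 1 + 12 bytes listed in the message.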

Lloyd McKenzie (Apr 12 2016 at 16:05):

If you want to be even tighter, you can do something like this: Algorithmically flatten the model so that you have no nesting of anything except repeating elements. For example, if you limit Patient.name to only 1 repetition and support only family and given, then you would treat Patient as having birthdate, gender, name.family and name.given as "root" level elements. Then in the first bytes (after profile identification), identify which optional elements in the instance are present using bit level flags. Represent everything as binary, using the leading byte to identify the length of variable-length elements like strings & date-times. When you hit repeating elements, the first byte indicates the number of repetitions and the first few bytes in each repetition again indicate which optional elements are present. Doing this would mean the data would be binary (non-human readable), but it would be as tight as it's possible to be. You can then encode the binary as necessary to flow over SMS, bar-codes or other low-bandwidth situations. Depending on the amount of data repetition you're expecting to occur in numeric and string data, you might be able to benefit from compressing the instance too.
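
A rough sketch of the flattened, flag-based layout described above, covering only the non-repeating part: one flag byte marks which optional elements are present, and each present value is stored as a length byte followed by UTF-8 text. The element list, the single flag byte and keeping dates as strings are simplifying assumptions. Profile identification and repeating elements are left out.

```python
# Hypothetical flattened profile: optional string elements in a fixed order.
ELEMENTS = ["birthDate", "gender", "name.family", "name.given"]

def encode(values: dict) -> bytes:
    """One flag byte, then a length-prefixed value for every element present."""
    flags, body = 0, bytearray()
    for bit, name in enumerate(ELEMENTS):
        if name in values:
            data = values[name].encode("utf-8")
            if len(data) > 255:
                raise ValueError("one length byte limits values to 255 bytes")
            flags |= 1 << bit
            body += bytes([len(data)]) + data
    return bytes([flags]) + bytes(body)

def decode(packet: bytes) -> dict:
    """Use the flag byte to decide which elements to read, in profile order."""
    flags, pos, values = packet[0], 1, {}
    for bit, name in enumerate(ELEMENTS):
        if flags & (1 << bit):
            length = packet[pos]
            values[name] = packet[pos + 1 : pos + 1 + length].decode("utf-8")
            pos += 1 + length
    return values

patient = {"gender": "female", "name.family": "Heckmann", "name.given": "Simone"}
packet = encode(patient)
print(len(packet), packet.hex())
print(decode(packet) == patient)
```

No element names or indices appear in the output at all. The receiver can only interpret the bytes if it has exactly the same element list, which is the fragility trade-off raised earlier in the thread.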

Grahame Grieve (Apr 12 2016 at 19:22):

the imaging formats have well established binary encoding formats, but for this use, I'd use ASN.1 BER. So the process would be:
- profile the resource
- assign the field ids in the profile (so they persist)
- write a converter that converts the profile to an ASN.1 syntax
- use ASN.1 BER for the exchange of the instances
- (optional) write a converter that interconverts instances between the ASN.1 syntax and the normal FHIR format based on the profile
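
To give a feel for what BER looks like on the wire, here is a hand-rolled sketch of its tag-length-value encoding for a SEQUENCE of context-tagged strings. It does not use an ASN.1 compiler or library. The field ids 3 and 4 are borrowed from the HumanName example earlier in the thread and the schema is purely hypothetical. A real implementation would generate the ASN.1 module from the profile as described in the steps above.

```python
def ber_length(n: int) -> bytes:
    """BER definite length: short form below 128, long form otherwise."""
    if n < 128:
        return bytes([n])
    raw = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(raw)]) + raw

def tlv(tag: int, value: bytes) -> bytes:
    return bytes([tag]) + ber_length(len(value)) + value

def encode(fields: dict[int, str]) -> bytes:
    """SEQUENCE of context-specific primitive fields, ordered by field id."""
    body = bytearray()
    for field_id, text in sorted(fields.items()):
        if not 0 <= field_id <= 30:
            raise ValueError("single-byte tags only cover field ids 0..30")
        body += tlv(0x80 | field_id, text.encode("utf-8"))
    return tlv(0x30, bytes(body))  # 0x30 = universal, constructed SEQUENCE

def read_tlv(data: bytes, pos: int) -> tuple[int, bytes, int]:
    """Return (tag, value, position after this TLV)."""
    tag, first = data[pos], data[pos + 1]
    pos += 2
    if first < 128:
        length = first
    else:
        count = first & 0x7F
        length = int.from_bytes(data[pos : pos + count], "big")
        pos += count
    return tag, data[pos : pos + length], pos + length

def decode(data: bytes) -> dict[int, str]:
    tag, body, _ = read_tlv(data, 0)
    if tag != 0x30:
        raise ValueError("expected a SEQUENCE")
    fields, pos = {}, 0
    while pos < len(body):
        tag, value, pos = read_tlv(body, pos)
        fields[tag & 0x1F] = value.decode("utf-8")
    return fields

encoded = encode({3: "Heckmann", 4: "Simone"})
print(len(encoded), encoded.hex())
print(decode(encoded))
```

Each field costs two bytes of overhead (tag and length) for values shorter than 128 bytes.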

John Moehrke (Apr 12 2016 at 20:41):

Seems we should evaluate this against the compression built into http/2

Grahame Grieve (Apr 12 2016 at 21:09):

sure, but not a lot of the use of this form will be over http

Patrick Werner (Apr 12 2016 at 21:19):

compression alone won't help for the imagined use cases (QR code, SMS, etc). You have to have a procedure where you get rid of information which is already defined in a profile

Patrick Werner (Apr 12 2016 at 21:24):

thanks @Grahame Grieve for bringing up ASN.1, seems to be a good idea if you just focus on size. Maybe ASN.1 PER could lead to even smaller files.
Simone and i will start profiling a medication plan use case on friday.

Patrick Werner (Apr 12 2016 at 21:26):

A comparison of sizes between compressed resources, shortened XML, ASN.1 BER and PER will be interesting

Brian Postlethwaite (Apr 12 2016 at 21:32):

Just a note that I pushed support for the gzip/deflate stuff onto our server overnight, and for the content I was testing with, 600k => 43k (xml), which is pretty good.

Brian Postlethwaite (Apr 12 2016 at 21:33):

Have the other reference servers got this already built in?

Grahame Grieve (Apr 12 2016 at 21:34):

yeah I have gzip support

Brian Postlethwaite (Apr 12 2016 at 21:35):

Do you support that on processing the request also?

Grahame Grieve (Apr 12 2016 at 21:50):

yeah I think I do

Grahame Grieve (Apr 12 2016 at 21:51):

it's pretty transparent for me

Simone Heckmann (Apr 13 2016 at 08:23):

Grahame Grieve 9:22 PM
- assign the field ids in the profile (so they persist)

Is anyone opposed to the idea of using StructureDefinition.mapping for this?

Patrick Werner (Apr 13 2016 at 08:34):

+ ElementDefinition.mapping. For XML flattening & shrinking i would put the XPath in mapping.map. For ASN.1, mapping.map would contain the ASN.1 datatype

John Moehrke (Apr 13 2016 at 12:03):

The gzip compression you have experimented with is what has been put into http/2; it is not only for images. It also uses the same dynamically built lookup table for all http sessions within the http/2 tunnel. So over many interactions with a FHIR server, you would find compression better than you are seeing with your gzip experiment. So much so that it even compresses out the http protocol overhead (and duplicate oauth credentials) too.

John Moehrke (Apr 13 2016 at 12:05):

this automatic compression was designed by people who are infrastructure experts, not healthcare data-modeling experts... re-inventing the wheel is not productive. You will end up at something generally round with a central pivot.

Patrick Werner (Apr 13 2016 at 12:06):

this isn't about HTTP, it's about serialising FHIR resources for barcodes, RFID, SMS transmissions etc

John Moehrke (Apr 13 2016 at 12:09):

so you want to make a fixed (pre-coordinated) compression lookup table?

Patrick Werner (Apr 13 2016 at 12:12):

not just compression. Compression would be lossless (like FLAC). We want to get rid of elements which are already defined in a profile, e.g. code systems, so it would be a "lossy" compression (mp3, kind of....)
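
For comparison, the purely lossless "pre-coordinated lookup table" option raised in the question above can be approximated with a zlib preset dictionary that sender and receiver agree on in advance. This is only a sketch of that one option. The shared dictionary here is simply an empty sample instance, and unlike the profile-based approach it does not drop anything from the instance.

```python
import zlib

# Boilerplate that both sides agree on in advance (here: an empty sample instance).
SHARED = (
    b'<Patient xmlns="http://hl7.org/fhir"><name><use value="official"/>'
    b'<family value=""/><given value=""/></name></Patient>'
)

def compress(data: bytes, zdict: bytes = b"") -> bytes:
    """Deflate `data`, optionally primed with a preset dictionary."""
    comp = zlib.compressobj(level=9, zdict=zdict) if zdict else zlib.compressobj(level=9)
    return comp.compress(data) + comp.flush()

def decompress(blob: bytes, zdict: bytes = b"") -> bytes:
    """Inflate `blob`; the same dictionary must be supplied if one was used."""
    decomp = zlib.decompressobj(zdict=zdict) if zdict else zlib.decompressobj()
    return decomp.decompress(blob) + decomp.flush()

instance = (
    b'<Patient xmlns="http://hl7.org/fhir"><name><use value="official"/>'
    b'<family value="Heckmann"/><given value="Simone"/></name></Patient>'
)
plain = compress(instance)
primed = compress(instance, SHARED)
print(len(instance), len(plain), len(primed))
```

The dictionary helps most on small payloads such as a single resource, where ordinary compression has little repetition to work with.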

Last updated: Apr 12 2022 at 19:14 UTC
