FHIR Chat · how about scaling? · implementers

Stream: implementers

Topic: how about scaling?


view this post on Zulip Jens Villadsen (Nov 17 2017 at 12:56):

I would like to start a discussion about how to make FHIR scale. It's not related to any specific use case - it is simply about hardcore horizontal scaling and shipping events instead of querying like crazy and hitting the RESTful servers constantly (we don't want stale clients, right?). If FHIR is to scale, event mechanisms (like subscriptions over web sockets, MQ, server-sent events, REST/web-hooks and whatnot) must be in use, notifying clients about changes in data. Clients will need to set up subscriptions dynamically and need to be notified whenever that initial set of data changes - when stuff enters (currently supported) and when stuff leaves.
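
For readers less familiar with the subscription mechanism mentioned above, here is a minimal sketch of a Subscription resource that asks a server to push notifications over a rest-hook channel. The element names follow the STU3-era Subscription resource; the criteria string, the reason text and the endpoint URL are made-up placeholders, and whether a given server accepts this channel type is an assumption.

```python
import json


def build_subscription(criteria: str, endpoint: str) -> dict:
    """Build a Subscription resource (STU3-style elements) as a plain dict."""
    return {
        "resourceType": "Subscription",
        "status": "requested",  # the client requests; the server decides whether to activate
        "reason": "Keep a client-side view of matching data fresh",  # placeholder text
        "criteria": criteria,  # search string that selects the resources of interest
        "channel": {
            "type": "rest-hook",  # websocket is another defined channel type
            "endpoint": endpoint,
            "payload": "application/fhir+json",
        },
    }


subscription = build_subscription(
    "Observation?subject=Patient/example",     # hypothetical criteria
    "https://client.example.org/fhir-notify",  # hypothetical client endpoint
)
print(json.dumps(subscription, indent=2))
```

A client would POST this to the server's Subscription endpoint. Note that criteria of this kind describe what enters the result set, which is the "currently supported" half of the request above.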

Also, if clients are to detect whether they are in fact stale or not, versioning is of paramount importance - and versions should be used as Lamport timestamps (https://en.wikipedia.org/wiki/Lamport_timestamps) or the like. We also need to establish that data cannot change without having its version bumped (this is currently not the case for metadata), because otherwise it is impossible to detect whether data has changed.
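
As a rough illustration of the Lamport timestamp idea referenced above, the sketch below shows a logical clock and a staleness check built on it. This is the textbook algorithm, not anything the FHIR specification defines; the class and function names are invented for the example.

```python
class LamportClock:
    """Textbook Lamport logical clock: a counter that only moves forward."""

    def __init__(self) -> None:
        self.time = 0

    def tick(self) -> int:
        """Advance the clock for a local event (for example a write)."""
        self.time += 1
        return self.time

    def receive(self, remote_time: int) -> int:
        """Merge a timestamp carried by an incoming message, then advance."""
        self.time = max(self.time, remote_time) + 1
        return self.time


def is_stale(cached_version: int, notified_version: int) -> bool:
    """A cached copy is stale if a strictly newer version has been announced."""
    return cached_version < notified_version


server = LamportClock()
v1 = server.tick()  # first write
v2 = server.tick()  # second write, including a metadata-only change

client = LamportClock()
client_time = client.receive(v2)  # client learns about the second write
```

The staleness check only works if every change, metadata included, bumps the counter, which is the point being argued above.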

I've created two tickets that point towards some requests that seem necessary to me in order to get closer to something that can eventually offer consistency and scalability.

https://gforge.hl7.org/gf/project/fhir/tracker/?action=TrackerItemEdit&tracker_item_id=14197
https://gforge.hl7.org/gf/project/fhir/tracker/?action=TrackerItemEdit&tracker_item_id=14198

Please join in on this discussion if you would like FHIR to scale and use eventing mechanisms.

view this post on Zulip Jens Villadsen (Nov 17 2017 at 12:57):

@Michael Christensen please join me on this crusade

view this post on Zulip Jens Villadsen (Nov 17 2017 at 13:10):

@Vadim Peretokin you're welcome to chip in :)

view this post on Zulip shivnath shelake (Nov 17 2017 at 13:10):

Hi..
I have C# code in which I created a patient with pat.Id = "12345";
I want to reference this patient from an observation. How can I achieve that?
Also, how can I search for all resource information related to the patient?
Help is greatly appreciated!!
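
The question above went unanswered in this thread. A minimal sketch of the usual approach, shown at the JSON level in Python rather than in C#: the observation points at the patient through Observation.subject, and related data can be found either by searching on that reference or, where the server supports it, through the optional $everything operation. The base URL, the status and the code text are placeholders.

```python
def build_observation(patient_id: str) -> dict:
    """Build a bare-bones Observation whose subject references the patient."""
    return {
        "resourceType": "Observation",
        "status": "final",                      # placeholder status
        "code": {"text": "Example observation"},  # placeholder code
        "subject": {"reference": f"Patient/{patient_id}"},
    }


def patient_search_urls(base: str, patient_id: str) -> dict:
    """Return two common ways of retrieving data related to a patient."""
    return {
        # search one resource type by its subject reference
        "observations": f"{base}/Observation?subject=Patient/{patient_id}",
        # optional operation; not every server implements it
        "everything": f"{base}/Patient/{patient_id}/$everything",
    }


observation = build_observation("12345")
urls = patient_search_urls("https://fhir.example.org/base", "12345")  # hypothetical base URL
```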

view this post on Zulip Jens Villadsen (Nov 17 2017 at 15:04):

hmmm .... I need to look into http://build.fhir.org/eventdefinition.html it seems

view this post on Zulip John Calvin Young (Nov 17 2017 at 16:29):

Very interested in this topic, as well as the Bulk Data Access proposal. I've run into scaling challenges alongside flexibility challenges--using the query API to its full flexibility (or beyond!) isn't performant at scale, but it's also hard to run performantly enough to get less-filtered data and handle it client-side. Subscriptions seem very promising if developed carefully and implemented widely.

view this post on Zulip Vadim Peretokin (Nov 21 2017 at 07:00):

+1 to this, while some systems will fit in one server and database, we need to account for horizontal scalability as well

view this post on Zulip nicola (RIO/SS) (Nov 22 2017 at 10:32):

I think, transaction level (or consistency level) can be different and specified for each server.

view this post on Zulip Christiaan Knaap (Nov 22 2017 at 10:33):

Interested.
It may help to define a few different use cases:
1. scaling out to multiple instances of the same FHIR Server ( / RESTful API implementation) on a single database
2. scaling out to multiple database instances (like sharding or replication), still using multiple instances of the same FHIR Server
3. scaling out across different FHIR Servers (or even clients)
I think 1 and 2 can often be handled by the storage mechanisms underlying the FHIR Server, and at least Vonk keeps no state in the web API layer, so that can be scaled independently. Also, the FHIR Server implementation may assure that changes on metadata don't go unnoticed (Vonk will assign a new version upon an update-interaction no matter what you changed).
With 3 it becomes important to 'replay transactions' somehow as we discussed at the BoF at DevDays. And in this case you cannot rely on all parties detecting changes to metadata that you might want to know about. Or even support versioning at all. How to communicate about the capabilities in this respect? Should that be computable?

view this post on Zulip Brian Postlethwaite (Nov 22 2017 at 21:38):

The $meta-add operations can be performed on older versions of the resource (often to put a security/privacy tag on them to prevent future access)
This is not replicated through the history mechanism.
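
To make the special case concrete, here is a hedged sketch of what a $meta-add call against a historical version might look like: a POST to the version-specific URL with a Parameters resource carrying the meta to add. The base URL, ids and the security label are illustrative assumptions (the code system URI shown is the STU3-era one), and nothing here is specific to any particular server.

```python
def build_meta_add_request(base, resource_type, resource_id, version_id, system, code):
    """Return (url, body) for a $meta-add call on one specific historical version."""
    url = f"{base}/{resource_type}/{resource_id}/_history/{version_id}/$meta-add"
    body = {
        "resourceType": "Parameters",
        "parameter": [
            {
                "name": "meta",
                # the meta elements to merge into the targeted version
                "valueMeta": {"security": [{"system": system, "code": code}]},
            }
        ],
    }
    return url, body


url, body = build_meta_add_request(
    "https://fhir.example.org/base",           # hypothetical base URL
    "Patient", "12345", "2",                   # hypothetical resource id and version
    "http://hl7.org/fhir/v3/Confidentiality",  # STU3-era code system URI (assumption)
    "R",                                       # "restricted"
)
```

Because the call tags version 2 in place, no new version appears in the history, which is exactly why other systems cannot see the change by polling _history.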

view this post on Zulip Jens Villadsen (Nov 22 2017 at 22:39):

@Brian Postlethwaite - the fact that you can add metadata without bumping the version of the resource as well makes my skin crawl

view this post on Zulip Brian Postlethwaite (Nov 22 2017 at 22:44):

Unfortunately, for this use-case, it's the only way.
And it's the only thing like it in the FHIR spec too.

view this post on Zulip Jens Villadsen (Nov 22 2017 at 22:44):

@Christiaan Knaap regarding the fact that Vonk does not keep state ... does that also go for the subscription resource? does it mean that if I set up two Vonk servers up against the same database, that they automatically vote on who is to ship out the notification to a client

view this post on Zulip Jens Villadsen (Nov 22 2017 at 22:49):

@Brian Postlethwaite well let's just hope that no-one finds it important to track changes or handle concurrency when messing with metadata ...:unamused:

view this post on Zulip Jens Villadsen (Nov 22 2017 at 22:49):

I guess that was not part of the use case ... :astonished:

view this post on Zulip Lloyd McKenzie (Nov 22 2017 at 23:24):

Should we add a "lastMetaUpdated" to meta (including as a search criteria when querying history)? That way there'd be a way to retrieve past versions that have been tweaked since the last time you looked.

view this post on Zulip Brian Postlethwaite (Nov 22 2017 at 23:31):

Or just a note on how to represent the meta change for the history representation

view this post on Zulip Lloyd McKenzie (Nov 22 2017 at 23:32):

Well, if you don't track when the meta changed - or expose it, it's going to be hard for other systems to figure out what happened.

view this post on Zulip Christiaan Knaap (Nov 23 2017 at 09:14):

@Jens Villadsen Currently the subscription evaluation process is part of the Vonk server process, so the notification would be handled by the server that processes the request triggering it. But I expect the implementation will have several makeovers in the future (since the spec will change on this subject and we will gain experience).

view this post on Zulip Jens Villadsen (Nov 23 2017 at 10:15):

@Lloyd McKenzie @Brian Postlethwaite - I'd favor a logical clock - like Meta.metaVersionId ... for concurrency reasons

view this post on Zulip Jens Villadsen (Nov 23 2017 at 10:19):

@Christiaan Knaap - ... I get what you are saying, but my question relates to parallel Vonk servers against the same db. If a client submits a subscription to one Vonk server and that server goes down, does the other Vonk server then take over the responsibility to send notifications automatically?

view this post on Zulip Lloyd McKenzie (Nov 23 2017 at 19:43):

Note that versionId isn't necessarily sequential - it can easily be a random GUID or something like that. The only rule is that it's unique within the scope of the resource id.

view this post on Zulip Jens Villadsen (Nov 23 2017 at 21:37):

what happened to using a sound, well-proven concept such as Lamport timestamps?

view this post on Zulip Lloyd McKenzie (Nov 23 2017 at 23:53):

FHIR systems aren't necessarily built using FHIR-based persistence technologies. Systems can expose whatever version id they have - in some cases it might even be a generated hash. The sequential element is lastUpdated, not versionId
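
A small sketch of what this means for a client: treat versionId as an opaque token and order history entries by meta.lastUpdated instead. The resources and version ids below are invented sample data, and the timestamps use explicit UTC offsets so that Python's standard parser accepts them.

```python
from datetime import datetime


def sort_history(resources):
    """Order history entries by meta.lastUpdated; versionId is never compared."""
    return sorted(
        resources,
        key=lambda r: datetime.fromisoformat(r["meta"]["lastUpdated"]),
    )


# Invented sample data: the version ids are opaque and deliberately non-sequential.
history = [
    {"resourceType": "Patient", "id": "12345",
     "meta": {"versionId": "f3a1c2", "lastUpdated": "2017-11-23T10:15:00+00:00"}},
    {"resourceType": "Patient", "id": "12345",
     "meta": {"versionId": "0b9d77", "lastUpdated": "2017-11-22T09:00:00+00:00"}},
    {"resourceType": "Patient", "id": "12345",
     "meta": {"versionId": "c4e8aa", "lastUpdated": "2017-11-24T13:52:00+01:00"}},
]

ordered_ids = [r["meta"]["versionId"] for r in sort_history(history)]
```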

view this post on Zulip Jens Villadsen (Nov 24 2017 at 11:30):

:thinking_face:

view this post on Zulip Jens Villadsen (Nov 24 2017 at 13:52):

that sounds like a bet on true distributed time ... it however requires that all systems are in total sync regarding time and that no two updates arrive at the same time (especially if versionId) - then again ... none of it is required, right?

view this post on Zulip Lloyd McKenzie (Nov 24 2017 at 14:08):

Versioning is always specific to a single server instance - because that's the scope of the resource id. If you synchronize data across multiple servers, the version ids could be totally distinct - integers on one server, timestamps on another, guids on another. There's no time synchronization issues involved.

view this post on Zulip John Moehrke (Nov 24 2017 at 18:16):

Well the time used is always the server time. My past results had a time that I use in my next query...

view this post on Zulip John Moehrke (Nov 24 2017 at 18:20):

my understanding of the lastUpdated behavior is that we allow servers to have policy on what kind of a change can be excluded. Thus it is policy on if a metadata change will update lastUpdated. right? There are realistic policies that could exist where almost any element change might be considered a non-change. Less so with clinical data vs metadata. But reasonable policy expectations do exist. If this is the case then the theoretical problem mentioned here can be 'managed' in an implementation guide or deployment policy. right?

view this post on Zulip Lloyd McKenzie (Nov 24 2017 at 18:25):

I think lastUpdated can't change when meta changes because doing so would cause history to become out-of-order. lastUpdated is the only way to reliably sort history records

view this post on Zulip Jens Villadsen (Nov 25 2017 at 21:35):

Where is it stated that lastUpdated can't change when meta changes?

view this post on Zulip Jens Villadsen (Nov 27 2017 at 07:57):

> my understanding of the lastUpdated behavior is that we allow servers to have policy on what kind of a change can be excluded. Thus it is policy on if a metadata change will update lastUpdated. right? There are realistic policies that could exist where almost any element change might be considered a non-change. Less so with clinical data vs metadata. But reasonable policy expectations do exist. If this is the case then the theoretical problem mentioned here can be 'managed' in an implementation guide or deployment policy. right?

I agree - I think it might be worth mentioning that the doc could be more clear on some topics here (http://build.fhir.org/resource.html#Meta) like what fields are encouraged to be updated only by the server (@Grahame Grieve) like versionId and lastUpdated.

view this post on Zulip Grahame Grieve (Nov 27 2017 at 08:40):

sure you can make a task for that

view this post on Zulip Jens Villadsen (Nov 27 2017 at 12:41):

> sure you can make a task for that

Never mind ... I found it reading a combination of http://build.fhir.org/http.html#create , http://build.fhir.org/http.html#update , http://build.fhir.org/http.html#versionaware and http://build.fhir.org/resource.html#Meta
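
For context, the version-aware update described on the linked page works through HTTP headers: the server returns the version as a weak ETag, and the client echoes it in If-Match when updating. The helper names below are invented for this sketch, and it only builds and parses header values; it does not contact a server.

```python
def version_from_etag(etag: str) -> str:
    """Extract the versionId from a weak ETag such as W/"23"."""
    value = etag.strip()
    if value.startswith("W/"):
        value = value[2:]
    return value.strip('"')


def version_aware_headers(version_id: str) -> dict:
    """Headers for an update that should only succeed against this version."""
    return {
        "Content-Type": "application/fhir+json",
        "If-Match": f'W/"{version_id}"',
    }


last_seen = version_from_etag('W/"23"')     # value taken from an earlier read
headers = version_aware_headers(last_seen)  # sent along with the PUT
```

If another writer has changed the resource in the meantime, the server rejects the update and the client has to re-read and retry.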

view this post on Zulip Jens Villadsen (Nov 27 2017 at 13:06):

> Where is it stated that lastUpdated can't change when meta changes?

@Lloyd McKenzie found it: http://build.fhir.org/http.html#update

view this post on Zulip Lloyd McKenzie (Nov 27 2017 at 17:03):

@Grahame Grieve Do you see utility in adding metaLastUpdated? I'm not sure how else client systems would know when to grab old history records that have had their meta tweaked.

view this post on Zulip Grahame Grieve (Nov 27 2017 at 22:42):

No, there's no way to get them. And I really don't see much value in it. If anything, we should remove the special case $meta operations on old records. I've never heard of anyone using them, only complaints around the special case that they represent

view this post on Zulip Lloyd McKenzie (Nov 28 2017 at 04:05):

Couldn't you query for _history and filter on the history entries whose meta has changed since you last looked?
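
One way to read this suggestion, sketched under the assumption that the client keeps the meta it saw for each version: re-fetch the history and compare each entry's meta with the cached copy. No server-side filter for "meta changed" is assumed here, so the comparison happens entirely on the client; the function name and sample data are invented.

```python
def changed_meta_versions(cached_meta: dict, fresh_resources: list) -> list:
    """Return versionIds whose meta differs from what the client saw before.

    cached_meta maps versionId -> the meta dict recorded on the previous poll.
    Versions the client has never seen are new, not changed, and are skipped.
    """
    changed = []
    for resource in fresh_resources:
        meta = resource.get("meta", {})
        version_id = meta.get("versionId")
        if version_id in cached_meta and cached_meta[version_id] != meta:
            changed.append(version_id)
    return changed


cached = {
    "1": {"versionId": "1", "lastUpdated": "2017-11-20T08:00:00+00:00"},
    "2": {"versionId": "2", "lastUpdated": "2017-11-21T08:00:00+00:00"},
}
fresh = [
    # version 1 has gained a security tag since the last poll
    {"resourceType": "Patient", "id": "12345",
     "meta": {"versionId": "1", "lastUpdated": "2017-11-20T08:00:00+00:00",
              "security": [{"code": "R"}]}},
    {"resourceType": "Patient", "id": "12345",
     "meta": {"versionId": "2", "lastUpdated": "2017-11-21T08:00:00+00:00"}},
    {"resourceType": "Patient", "id": "12345",
     "meta": {"versionId": "3", "lastUpdated": "2017-11-22T08:00:00+00:00"}},
]

tweaked = changed_meta_versions(cached, fresh)
```

The cost is that the client must re-read the full history on every poll, which is the scaling concern this thread started from.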

view this post on Zulip Lloyd McKenzie (Nov 28 2017 at 04:06):

The only time changing historical records has any use-case is for history and I agree it's unlikely anyone will do that. Should we say that the security tags on the current record should apply to all historical versions?

view this post on Zulip John Moehrke (Nov 28 2017 at 13:32):

That is a policy decision. The tag may have been applied for a version specific reason. The tag may have been applied for a broad reason. It is better with security tags to never have implied conduction. But that is a policy, not a model thing. The FHIR model lets you do either policy. Yet if you put a policy into FHIR model, then you forbid the other reasonable policy from being done.

view this post on Zulip Christiaan Knaap (Dec 03 2017 at 21:01):

@Jens Villadsen I never tried it, but it should. Subscriptions are shared through the Administration database in Vonk, so all instances have access to them.

view this post on Zulip Jens Villadsen (Dec 03 2017 at 22:45):

@Christiaan Knaap let me know if you are sure on this part


Last updated: Apr 12 2022 at 19:14 UTC