FHIR Chat · docs / Issue #44 Make it clear what the Hub should do wit... · fhircast-github

Stream: fhircast-github

Topic: docs / Issue #44 Make it clear what the Hub should do wit...


Github Notifications (FHIRcast) (Nov 15 2018 at 11:50):

lbergnehr opened Issue #44

If the notification to a client fails, what should happen? What should happen with other clients? Should they be notified or not? What happens when clients/hubs cannot communicate with each other?

Github Notifications (FHIRcast) (Nov 15 2018 at 11:54):

lbergnehr commented on Issue #44

So there are two types of errors:

1. Communication error (the message didn't get there)
2. Application error (the message got there)

We should probably address (1) but not (2).

Github Notifications (FHIRcast) (Nov 15 2018 at 12:05):

lbergnehr commented on Issue #44

The important thing here is of course to not be able to show different contexts but perhaps to accept that we don't know about a specific client's context and then being able to convey that information?

Github Notifications (FHIRcast) (Nov 17 2018 at 18:54):

jeremysrichardson commented on Issue #44

> The important thing here is of course to not be able to show different contexts but perhaps to accept that we don't know about a specific client's context and then being able to convey that information?

I agree. IMHO: Clients should post their context to the hub's session. So, if the hub sends out an update, and the client doesn't respond, it can be seen by someone looking at the hub's state, that the client didn't switch context. Unlike CCOW, though, the hub should own context and shouldn't wait for all clients to acknowledge a change to commit a change.

Practically, I would warn the user in these cases, by having a lightweight desktop app that listens to context and shows a warning when any client doesn't match the hub context after a set timeout. The listener would also query/poll the hub as a heartbeat on some interval, just in case it is missing context messages.
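
The warning check described above might look something like the following. This is only a sketch under stated assumptions: the function name `stale_clients`, the idea that each client's last posted context is available as a plain string, and the 5 second default timeout are all hypothetical and not defined by FHIRcast.

```python
from typing import Dict, List


def stale_clients(
    hub_context: str,
    hub_changed_at: float,
    client_contexts: Dict[str, str],
    now: float,
    timeout: float = 5.0,  # hypothetical grace period, in seconds
) -> List[str]:
    """Return the clients whose posted context still differs from the hub's.

    Nothing is reported until `timeout` seconds have passed since the hub's
    last context change, so clients get a chance to catch up first.
    """
    if now - hub_changed_at < timeout:
        return []  # still inside the grace period, do not warn yet
    return sorted(
        name
        for name, context in client_contexts.items()
        if context != hub_context
    )
```

A listener could call this on every heartbeat poll of the hub and show a warning whenever the returned list is not empty.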

Github Notifications (FHIRcast) (Nov 19 2018 at 16:07):

isaacvetter commented on Issue #44

Hey Guys,

Here are the thoughts that I've been collecting on this. Overall, the appropriate error handling is specific to the synchronization scenario, user expectations, and implementer.


FHIRcast describes a mechanism for synchronizing distinct applications. Sometimes things go wrong and applications fail to synchronize or become out of sync. For example, the user within the EHR opens a new patient's record, but the app fails to process the update and continues displaying the initial patient. Depending upon the expectations of the user and the error handling of the applications in use, this scenario is potentially risky. Identified below are four distinct synchronization scenarios, ranging from lowest level of expected synchronization to highest.

Overall, FHIRcast does not dictate how applications should react to synchronization failure. You should design your product to meet your customers' expectations and needs. Also note that synchronization failure is a worst-case scenario and should rarely occur in production.

## Machine-to-machine-to-machine: Different machines, different times
Scenario: Clinician walks away from her desktop EHR and accesses an app on her mobile device which synchronizes to the EHR's hibernated session.
Summary: This serial or sequential use-case is a convenience synchronization and the clinical risk for synchronization failure is low.

Synchronization failure significance: low
Performance expectations: negligible
User distinction between synchronized applications: high

## Cross device: Different machines, same time
Scenario: Clinician accesses her desktop EHR as well as an app on her mobile device at the same time. Mobile device synchronizes with the EHR desktop session.
Summary: The user clearly distinguishes between the applications synchronized on multiple devices and therefore clinical risk for a synchronization failure depends upon the workflow and implementer's goals. User manual action may be appropriate when synchronization fails.

Synchronization failure significance: medium
Performance expectations: low
User distinction between synchronized applications: medium-high

## Same machine, same time
Scenario: Clinician is accessing two or more applications on the same machine in a single workflow.
Summary: Although disparate applications are distinguishable from one another, the workflow requires rapidly accessing one application and then another. Application responsivity to synchronization is particularly important. Synchronization failure may introduce clinical risk and therefore user notification of synchronization failure may be appropriate.

Synchronization failure significance: medium
Performance expectations: high
User distinction between synchronized applications: medium

## Embedded apps: Same machine, same time, same UI
Scenario: Clinician accesses multiple applications within a single user interface.
Summary: Disparate applications indistinguishable from one another require the greatest amount of context synchronization. Clinical risk of synchronization failure is critical. Application responsivity to synchronization should be high.

Synchronization failure significance: critical
Performance expectations: high
User distinction between synchronized applications: none

Github Notifications (FHIRcast) (Nov 19 2018 at 16:42):

gkustas commented on Issue #44

Hi guys,

Yes, @isaacvetter , our users will always be somewhat exposed to applications getting out of sync. We have many forms of integrations that can fail intermittently, like an XML file drop to a network folder that is momentarily down. To minimize risk, it's always important that the current patient context is visible front and center in all applications, and where POSSIBLE, the user should be notified of the potential problem.

But I think we can make the FHIRCast specification a lot more bullet-proof than those old XML integrations, even though the systems are very loosely coupled and asynchronous.

The current WebSub spec states that a notification (publication) will be broadcast to all subscribers, including the publisher itself. If we extend the specification just a bit more, we can let the publisher know if the notification didn't get to all of the subscribers. For example:
1. User joerad is logged on to a dictation system and a PACS. Both are subscribed to topic "joerad", which we will assume is a valid session identifier.
2. joerad opens a study for patient Sally Smith on the PACS system, and posts this context change to the hub.
3. The hub in turn broadcasts to each subscriber, in this case joerad on the dictation system.
4. After sending the notification to all the other subscribers, the hub (as defined in WebSub) sends the notification back to the original publisher. In this message, the hub can include _a new element_ containing a list of all the subscribers notified and the status of the notification.
5. So in this scenario, if the dictation system did not positively respond, the PACS would find out and it could warn the user of the potential synchronization failure.
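
The five steps above can be sketched roughly as follows. This is a minimal in-memory illustration, not part of the FHIRcast or WebSub specifications: the `publish` function, the `delivery` element name, and the status strings are all hypothetical, and subscribers are modelled as plain callables that return an HTTP-like status code instead of real HTTP endpoints.

```python
from typing import Callable, Dict

# A subscriber is modelled as a callable that takes the event and returns
# an HTTP-like status code (an assumption made to keep the sketch local).
Subscriber = Callable[[dict], int]


def publish(event: dict, publisher: str, subscribers: Dict[str, Subscriber]) -> dict:
    """Broadcast `event` to every other subscriber, then echo it back to the
    publisher together with a per-subscriber delivery status."""
    statuses: Dict[str, str] = {}
    for name, deliver in subscribers.items():
        if name == publisher:
            continue  # the publisher is notified last, see below
        try:
            code = deliver(event)
            statuses[name] = "delivered" if 200 <= code < 300 else f"failed ({code})"
        except Exception as exc:  # e.g. the subscriber is unreachable
            statuses[name] = f"failed ({type(exc).__name__})"

    echo = dict(event)
    echo["delivery"] = statuses  # hypothetical "new element"
    subscribers[publisher](echo)
    return echo
```

With this shape, the PACS in the example would receive something like `{"delivery": {"dictation": "failed (ConnectionError)"}}` in its echoed notification and could warn the user.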

The only potential problem I see here is the latency if there are multiple subscribers (more than two), but I think that is a rare use case - no?

Github Notifications (FHIRcast) (Nov 30 2018 at 16:45):

wmaethner commented on Issue #44

I know we at some point discussed an error or a result of the context switch returned to the hub by the subscribing app, but it doesn't seem to be currently defined in the specs. It sounds like that would be useful whether it is required or not.

@gkustas one thing I would be worried about in the example you propose, which I don't think would be as much of an issue if the hub was originating the context change, is the potential for latency like you mention. In that scenario we are essentially requiring the subscribing apps to respond to the notification with a status of that context switch, which will then be included in the response to the app requesting the change. How does this situation work in the case where a notification doesn't reach the subscriber (maybe they crashed or there was some network issue) or the subscriber can't respond immediately with their status because some actions need to occur in their system first (saving, prompts to the user, etc.) before they actually know whether the context switch was successful? In either of those cases the hub will either need to wait for the subscribing apps to respond or will be left with incomplete information to give back to the app requesting the change.

If this is purely just for informing apps of the status of the context change, and doesn't actually hold up the notifications or context changes in any system (which as I was typing this I think would be the case), then maybe it isn't that big of a deal for the status notification to come in slightly delayed from the actual context change notification. In this situation the app would receive the status and then put up some display or alert to the user in the case that an error occurred, and that doesn't necessarily need to happen immediately, but it should happen fairly soon after the context change to prevent any mis-documentation or confusion.

Github Notifications (FHIRcast) (Nov 30 2018 at 17:13):

gkustas commented on Issue #44

Hi @wmaethner

Let me see if I understand the problem...

When the hub receives a notification from client "A", it then posts the same notification to all subscribers, let's say clients "B" and "C". I am assuming each subscriber will return a 200 level (accepted) HttpResponse (or something else over the websocket) as soon as they receive the notification. If there is a network error on client "B" or client "C" crashes (and is therefore not listening), the hub should know almost immediately, right? I would think a short wait period like a few seconds may be adequate before echoing the notification (and new status element) back to the originator. Worst case scenario, the hub is premature in indicating an error on a client. In that case, the user would probably be presented with a warning message by the originating client which can be ignored if the subscribing client appears in the correct patient/study context.

Or am I missing something?
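
The "short wait period" idea can be sketched with a per-subscriber timeout. This is a hedged illustration only: `notify_all`, the `wait_seconds` parameter, and the status strings are hypothetical names, and each subscriber is assumed to be an async callable returning an HTTP-like status code in place of a real network call.

```python
import asyncio
from typing import Awaitable, Callable, Dict

# Assumption: delivering to a subscriber is an async call that returns an
# HTTP-like status code, or raises an OSError when the client is unreachable.
Deliver = Callable[[dict], Awaitable[int]]


async def notify_all(
    event: dict, subscribers: Dict[str, Deliver], wait_seconds: float = 2.0
) -> Dict[str, str]:
    """Notify all subscribers concurrently and give each one at most
    `wait_seconds` to acknowledge before it is recorded as a timeout."""

    async def attempt(deliver: Deliver) -> str:
        try:
            code = await asyncio.wait_for(deliver(event), timeout=wait_seconds)
        except asyncio.TimeoutError:
            return "timeout"  # may be premature, the client might still switch
        except OSError:
            return "unreachable"  # e.g. connection refused, client crashed
        return "ok" if 200 <= code < 300 else "error"

    names = list(subscribers)
    results = await asyncio.gather(*(attempt(subscribers[n]) for n in names))
    return dict(zip(names, results))
```

Because the deliveries run concurrently, the total wait is bounded by `wait_seconds` regardless of how many subscribers there are, which also limits the latency concern raised earlier in the thread.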

Github Notifications (FHIRcast) (Nov 30 2018 at 18:37):

wmaethner commented on Issue #44

Hey @gkustas yeah that sounds about right. I guess the point I was trying to make is that the status portion should be separated in a way that it doesn't cause delays in actually performing the context change. It is an issue that can come up in current integrations (mainly synchronous ones like COM) where the driving system is waiting on the application to handle the context switch and is essentially frozen at that point. It comes up mainly with saving prompts where we will tell a system to close a study and that application prompts the user to save changes. At that point that app can't really reply with their status back to the hub until the user addresses that popup (unless they respond that they successfully received the notification I guess), and that could delay the rest of the workflow. That is the situation I want to avoid basically.

Github Notifications (FHIRcast) (Nov 30 2018 at 19:22):

gkustas commented on Issue #44

Yes @wmaethner , I know exactly the situation you describe. Our current COM integrations do all of the actual processing asynchronously and fire events when they are done. This is to avoid just what you describe.

What I propose is the same thing for notifications. Control should be returned immediately and the processing should be done asynchronously (in another thread or however the implementer wants to handle it).
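
One way a subscriber could return control immediately and process in another thread is sketched below. It is an assumption-laden illustration, not an API from the FHIRcast spec: the `Subscriber` class, its method names, and the `apply_context` callback are all hypothetical.

```python
import queue
import threading
from typing import Callable


class Subscriber:
    """Acknowledges notifications at once and applies them on a worker thread."""

    def __init__(self, apply_context: Callable[[dict], None]):
        # `apply_context` stands in for slow, app-specific work such as
        # saving data or prompting the user (hypothetical callback).
        self._apply_context = apply_context
        self._pending: "queue.Queue[dict]" = queue.Queue()
        threading.Thread(target=self._worker, daemon=True).start()

    def on_notification(self, event: dict) -> int:
        """Queue the event and return 200 right away.

        The status only means "received", not "context switched"."""
        self._pending.put(event)
        return 200

    def _worker(self) -> None:
        while True:
            event = self._pending.get()
            try:
                self._apply_context(event)
            except Exception:
                pass  # a real application would log or report the failure
            finally:
                self._pending.task_done()

    def wait_until_idle(self) -> None:
        """Block until every queued event has been processed."""
        self._pending.join()
```

The hub is never held up by a save prompt in this arrangement, at the cost that a 200 response no longer says whether the context switch actually succeeded.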

Github Notifications (FHIRcast) (Dec 12 2018 at 23:07):

isaacvetter commented on Issue #44

Great discussion!

There's a few different threads here. As a start, I see two undisputed next steps:

1) Ultimately, different integrations have different capacities for synchronization failure. We should better describe the considerations for synchronization failure vs performance in light of the user's expectations for the specific use-case that a developer is working on.

2) Update the spec to note that a subscriber should respond to a notification with an appropriate HTTP status (an HTTP 200 in the case of a successful notification), and that the Hub MAY use these statuses to track synchronization state.

HTTP/1.1 200 Success

Any concerns?

Github Notifications (FHIRcast) (Dec 12 2018 at 23:25):

isaacvetter commented on Issue #44

Also, see PR #53.

Github Notifications (FHIRcast) (Dec 13 2018 at 13:32):

gkustas commented on Issue #44

Yes, there is no mention of item 2 in the FHIRCast docs (return status from notification). I'm fine with 200, but there are a series of codes in the 200s that some applications may prefer, and that possibly would make sense. Not important though. Thanks @isaacvetter

Github Notifications (FHIRcast) (Feb 05 2019 at 14:22):

isaacvetter closed Issue #44



Last updated: Apr 12 2022 at 19:14 UTC