FHIR Chat · Prod Tx Server: FHIRServer - 7/19/2021 8:40:37 PM · tx.fhir.org/notification

Stream: tx.fhir.org/notification

Topic: Prod Tx Server: FHIRServer - 7/19/2021 8:40:37 PM


view this post on Zulip Terminology Service Monitor Bot (Jul 19 2021 at 20:40):

7/19/2021 8:40:37 PM: Issued start request, service: FHIRServer

view this post on Zulip Terminology Service Monitor Bot (Jul 19 2021 at 20:41):

7/19/2021 8:41:10 PM: Waiting on service (FHIRServer) after start...

view this post on Zulip Mark Iantorno (Jul 19 2021 at 20:47):

Oh nice

view this post on Zulip Terminology Service Monitor Bot (Jul 19 2021 at 20:48):

7/19/2021 8:48:07 PM: Service is up!

view this post on Zulip Rob Hausam (Jul 19 2021 at 20:48):

Yes, yikes! The service is now re-starting - so that seems to have worked, but there was a lot of churn beforehand. I don't know if @Mark Iantorno or @Gino Canessa may have intervened manually (I haven't seen anything that said so)? The monitor still doesn't seem to quite be there yet, but we are making progress.

view this post on Zulip Mark Iantorno (Jul 19 2021 at 20:48):

I was actually trying to code today, so I was ignoring Zulip. So...wasn't me

view this post on Zulip Gino Canessa (Jul 19 2021 at 20:49):

Yes, I'll update it to post messages more frequently as well - right now it only posts a message during a state change.
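A minimal sketch of the posting rule described above, extended with a periodic heartbeat so that messages also appear between state changes. The function name `should_post` and the `heartbeat_seconds` parameter are hypothetical; nothing here is taken from the actual monitor.

```python
from typing import Optional


def should_post(
    previous_state: Optional[str],
    current_state: str,
    seconds_since_last_post: float,
    heartbeat_seconds: float = 300.0,  # hypothetical default, not from the monitor
) -> bool:
    """Decide whether the monitor should post a message to the stream."""
    # Current behavior as described: post whenever the state changes.
    if current_state != previous_state:
        return True
    # Assumed extension: also post if the stream has been quiet for too long.
    return seconds_since_last_post >= heartbeat_seconds
```

With this rule an unchanged state stays quiet until the heartbeat interval has elapsed, which limits the number of repeated messages.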

view this post on Zulip Gino Canessa (Jul 19 2021 at 20:49):

I haven't touched it

view this post on Zulip Rob Hausam (Jul 19 2021 at 20:49):

Ok. So that's good - it did work.

view this post on Zulip Gino Canessa (Jul 19 2021 at 20:50):

That's probably a fair estimate on time... It issues a service stop request, waits until the service is reported as stopped, then issues a start request (which I believe takes some number of minutes to come back up as well).
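The stop, wait, start sequence described above could be sketched roughly as follows. The callables `stop`, `start`, and `get_state`, the state string `"stopped"`, and the timeout values are all assumptions for illustration; the real monitor talks to the Windows Service Control Manager and its code is not shown in this thread.

```python
import time
from typing import Callable


def restart_service(
    stop: Callable[[], None],
    start: Callable[[], None],
    get_state: Callable[[], str],
    poll_interval: float = 1.0,   # hypothetical values
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Stop the service, wait until it reports 'stopped', then start it.

    Returns False if the service never reports 'stopped' before the timeout,
    in which case no start request is issued.
    """
    stop()
    deadline = clock() + timeout
    while get_state() != "stopped":
        if clock() >= deadline:
            return False
        sleep(poll_interval)
    start()
    return True
```

Passing `sleep` and `clock` as parameters keeps the sketch testable without real delays.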

view this post on Zulip Rob Hausam (Jul 19 2021 at 20:52):

Actually, I was reading the time wrong - had looked at the one from a few days ago, too. Seems like it only took a few minutes from when the downtime was noticed. It did send quite a few messages, though. :)

view this post on Zulip Rob Hausam (Jul 19 2021 at 20:53):

The four stop request messages (are they all real?) seem of the most potential concern to me.

view this post on Zulip Gino Canessa (Jul 19 2021 at 20:53):

Ahh, I see that now. I think it's because the service reports to the Service Control Manager that it failed to start (so I issue another start request)... I guess I need to put in some logic to only issue a single start request and wait, regardless of what the SCM says the state is.
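One way the "single start request, then wait" logic could look, as a sketch only. It assumes a hypothetical `is_healthy` probe (for example, an HTTP check against the server) that is used instead of the state reported by the SCM; the names and timeout values are not from the actual monitor.

```python
import time
from typing import Callable


def start_once_and_wait(
    start: Callable[[], None],
    is_healthy: Callable[[], bool],
    timeout: float = 600.0,       # hypothetical values
    poll_interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Issue exactly one start request, then poll until healthy or timed out."""
    start()  # issued once; never re-issued, whatever state is reported
    deadline = clock() + timeout
    while clock() < deadline:
        if is_healthy():
            return True
        sleep(poll_interval)
    return False
```

Because the start request sits outside the loop, a service that starts successfully but reports a failure does not trigger further start requests.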

view this post on Zulip Rob Hausam (Jul 19 2021 at 20:54):

Yes, I think so.

view this post on Zulip Gino Canessa (Jul 19 2021 at 20:55):

Yeah, I didn't have anything that simulates that process (successfully starting but reporting a failure). I don't really want to try and figure out what I would need to write to simulate that =)

view this post on Zulip Terminology Service Monitor Bot (Jul 20 2021 at 02:38):

7/20/2021 2:38:17 AM: Service is down! Will restart...

view this post on Zulip Terminology Service Monitor Bot (Jul 20 2021 at 02:38):

7/20/2021 2:38:18 AM: Issued stop request, service: FHIRServer

view this post on Zulip Terminology Service Monitor Bot (Jul 20 2021 at 02:38):

7/20/2021 2:38:25 AM: Waiting on service (FHIRServer) after start...

view this post on Zulip Terminology Service Monitor Bot (Jul 20 2021 at 02:38):

7/20/2021 2:38:25 AM: Waiting on service (FHIRServer) after start...

view this post on Zulip Terminology Service Monitor Bot (Jul 20 2021 at 02:38):

7/20/2021 2:38:25 AM: Waiting on service (FHIRServer) after start...


Last updated: Apr 12 2022 at 19:14 UTC