FHIR Chat · ci-build length limit · committers

Stream: committers

Topic: ci-build length limit


view this post on Zulip Grahame Grieve (Sep 05 2019 at 22:54):

hey @Josh Mandel - check http://build.fhir.org/ig/HL7/UTG/branches/master/build.log - what is the timeout limit for an IG build on the ci-build?

view this post on Zulip Josh Mandel (Sep 05 2019 at 23:27):

https://github.com/FHIR/auto-ig-builder/blob/master/triggers/ig-commit-trigger/job.json#L11

view this post on Zulip Josh Mandel (Sep 05 2019 at 23:27):

Looks like 30min

view this post on Zulip Grahame Grieve (Sep 05 2019 at 23:43):

hmmm but the log seems to be about 11 min... I'm investigating

view this post on Zulip Grahame Grieve (Sep 06 2019 at 01:04):

@Josh Mandel so I fixed something up, and committed it back to the repo, which should have kicked off another build.. but it did not. is there any way I can investigate why nothing happened?

view this post on Zulip Grahame Grieve (Sep 06 2019 at 01:45):

well, it didn't appear in the notification stream here, but it did try, and the log is here: https://build.fhir.org/ig/HL7/UTG/branches/master/build.log

view this post on Zulip Grahame Grieve (Sep 06 2019 at 01:46):

there's something off about it - I cut the build time in half, but it still timed out....

view this post on Zulip Josh Mandel (Sep 06 2019 at 16:16):

It seems to have actually taken >30min and been killed, per the time limit. Is there a reason to think something else happened?

view this post on Zulip Rob Hausam (Sep 06 2019 at 17:08):

That's what I saw, too. Do we just extend the time limit? Currently the UTG CI build is still failed.

view this post on Zulip Grahame Grieve (Sep 06 2019 at 19:55):

well, we didn't get notification in the committers channel. that's the first issue. The second is what's going on, I'll investigate. but can we get notification?

view this post on Zulip Josh Mandel (Sep 06 2019 at 21:58):

Notifications: yes, this is doable but requires a bit of a re-write. I won't get to it this weekend unless someone thinks it's urgent.

view this post on Zulip Josh Mandel (Sep 06 2019 at 21:59):

Re: time limit, sure I'll update to 1h now.

view this post on Zulip Josh Mandel (Sep 06 2019 at 22:02):

Just updated at https://github.com/FHIR/auto-ig-builder/commit/1c341475da149452395eba71718b7d9f9248af92

view this post on Zulip Grahame Grieve (Sep 06 2019 at 22:02):

ok thanks

view this post on Zulip Josh Mandel (Sep 06 2019 at 22:03):

... and kicked off a build job for UTG.

view this post on Zulip Grahame Grieve (Sep 06 2019 at 22:11):

thx. 12 min for me. I just added late logging so I can see where it's going wrong
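
For readers following along, a minimal sketch of the kind of elapsed-time logging mentioned above, so that each build step shows how far into the run it started. The `StepTimer` class, its method names, and the `mm:ss.SSS` format are all assumptions made for illustration; this is not the IG Publisher's actual logging code.

```java
import java.util.Locale;

// Hypothetical helper, not part of the IG Publisher: stamps each step with elapsed time.
public class StepTimer {
    private final long startNanos;

    StepTimer(long startNanos) {
        this.startNanos = startNanos;
    }

    // Formats elapsed milliseconds as mm:ss.SSS (an illustrative format choice).
    static String format(long elapsedMillis) {
        long minutes = elapsedMillis / 60_000;
        long seconds = (elapsedMillis % 60_000) / 1000;
        long millis = elapsedMillis % 1000;
        return String.format(Locale.ROOT, "%02d:%02d.%03d", minutes, seconds, millis);
    }

    // Builds one log line: the step name padded to 40 columns, then the elapsed time.
    String line(String step, long nowNanos) {
        long elapsedMillis = (nowNanos - startNanos) / 1_000_000;
        return String.format(Locale.ROOT, "%-40s (%s)", step, format(elapsedMillis));
    }

    public static void main(String[] args) {
        StepTimer timer = new StepTimer(System.nanoTime());
        System.out.println(timer.line("Load Content", System.nanoTime()));
    }
}
```

Comparing these stamps between a local run and the ci-build run would show which step the extra time goes to.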

view this post on Zulip Josh Mandel (Sep 06 2019 at 22:14):

Hmm, that's a big difference in build times.

view this post on Zulip Josh Mandel (Sep 06 2019 at 22:14):

12 vs >30.

view this post on Zulip Grahame Grieve (Sep 06 2019 at 22:15):

y

view this post on Zulip Josh Mandel (Sep 06 2019 at 22:21):

Let's see about throwing some faster CPUs at it.

view this post on Zulip Josh Mandel (Sep 06 2019 at 22:21):

Just added a new node pool and over the weekend will switch over to it.

view this post on Zulip Grahame Grieve (Sep 06 2019 at 22:22):

ok thanks. btw, I just learnt something.... maven will happily release a snapshot when it doesn't compile...

view this post on Zulip Josh Mandel (Sep 06 2019 at 22:22):

Oh, that's delightful.

view this post on Zulip Grahame Grieve (Sep 06 2019 at 22:23):

this log from running UTG:

Connect to Terminology Server at http://tx.fhir.org                              (00:30.0066)
Load Content                                                                     (00:30.0577)
Exception in thread "main" java.lang.Error: Unresolved compilation problems:
    The method Log(String) is undefined for the type Publisher
    The method Log(String) is undefined for the type Publisher

    at org.hl7.fhir.igtools.publisher.Publisher.load(Publisher.java:2541)
    at org.hl7.fhir.igtools.publisher.Publisher.createIg(Publisher.java:716)
    at org.hl7.fhir.igtools.publisher.Publisher.execute(Publisher.java:609)
    at org.hl7.fhir.igtools.publisher.Publisher.main(Publisher.java:6398)

view this post on Zulip Grahame Grieve (Sep 06 2019 at 23:23):

can we make more memory available?

view this post on Zulip Grahame Grieve (Sep 06 2019 at 23:32):

ok I have a log now. It runs ok for 15 min or so, and then it runs out of RAM and slows down to a crawl (managing memory). Now that you've extended the time out to 60 minutes, it finally runs out of heap space at 45 minutes. On the ci-build it has 4GB, but I'm running it locally with 7GB
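
A minimal sketch of how a heap ceiling like the one described above can be observed from inside a JVM, using the standard `Runtime` memory methods. The `HeapReport` class and its method names are hypothetical, and how the ci-build actually sets its heap size is not shown in this thread.

```java
import java.util.Locale;

// Hypothetical helper for illustration: reports how close the JVM is to its heap limit.
public class HeapReport {

    // Fraction of the maximum heap currently in use (0.0 to 1.0).
    static double usedFraction(long usedBytes, long maxBytes) {
        if (maxBytes <= 0) {
            throw new IllegalArgumentException("maxBytes must be positive");
        }
        return (double) usedBytes / (double) maxBytes;
    }

    // Human-readable summary in whole megabytes.
    static String describe(long usedBytes, long maxBytes) {
        long mb = 1024L * 1024L;
        return String.format(Locale.ROOT, "heap used %d MB of %d MB (%.0f%%)",
                usedBytes / mb, maxBytes / mb, 100.0 * usedFraction(usedBytes, maxBytes));
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long max = rt.maxMemory();                       // upper bound the JVM will try to use
        long used = rt.totalMemory() - rt.freeMemory();  // heap currently occupied
        System.out.println(describe(used, max));
    }
}
```

Run with, for example, `java -Xmx4g HeapReport` to see the limit the JVM reports. Logging this periodically during a long build would show usage climbing toward the limit before the garbage collector starts to dominate the run time.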

view this post on Zulip Grahame Grieve (Sep 06 2019 at 23:32):

you might ask whether we need all of that RAM... I'll consider that, but it would be a major rework to change that

view this post on Zulip Rob Hausam (Sep 06 2019 at 23:35):

I've been running UTG locally with 8GB (6GB seemed not quite enough) - and that was before Ted made his latest updates

view this post on Zulip Ted Klein (Sep 06 2019 at 23:45):

When I added the value sets for CDA (>325 of them), my local build slowed down very considerably. My laptop howls and can do little else for about 25 minutes until it finishes. I have a 4-core i9 with 32GB of RAM. If we cannot get this working over the weekend, I will update the CDA content manifest to include only a few dozen value sets and see if that will run to completion. Or, if it will help to debug, I can do that tonight.

view this post on Zulip Grahame Grieve (Sep 07 2019 at 00:28):

I'm still running in 12 min with all the CDA value sets. And the build is about 1/2 that speed until it hits the RAM wall

view this post on Zulip Josh Mandel (Sep 07 2019 at 01:09):

I can give it as much RAM as we want.. will have a go at quadrupling RAM tomorrow.

view this post on Zulip Grahame Grieve (Sep 07 2019 at 02:45):

wooah thanks

view this post on Zulip Rob Hausam (Sep 07 2019 at 02:59):

built successfully in 19 min with 8GB RAM

view this post on Zulip Josh Mandel (Sep 07 2019 at 16:36):

Okay, with 20GB RAM on the latest cascade lake CPU, builds in 13min (that includes the time for the cluster manager to notice that it needed to boot a new VM, and boot it).

view this post on Zulip Josh Mandel (Sep 07 2019 at 16:54):

(Er, 11 min when I actually set the right machine type, of which 9 are actual container running time.)


Last updated: Apr 12 2022 at 19:14 UTC