FHIR Chat · Bash exited with code '137'. · committers/git-help

Stream: committers/git-help

Topic: Bash exited with code '137'.


view this post on Zulip John Moehrke (Oct 31 2018 at 21:45):

my pull failed with Bash exited with code 137 -- Please help https://github.com/HL7/fhir/pull/195

view this post on Zulip Josh Mandel (Oct 31 2018 at 21:47):

https://fhir-build.visualstudio.com/build.fhir.org/_build/results?buildId=1154&view=logs has the full logs -- did you review?

view this post on Zulip Josh Mandel (Oct 31 2018 at 21:48):

Looks like a terminology cache issue

view this post on Zulip John Moehrke (Oct 31 2018 at 21:48):

could be. I looked and can't see anything I can act upon.

view this post on Zulip John Moehrke (Oct 31 2018 at 21:48):

so, if it is a term cache issue... is there a way to kick it to try again?

view this post on Zulip Josh Mandel (Oct 31 2018 at 21:48):

@Grahame Grieve does this mean anything to you?

view this post on Zulip Josh Mandel (Oct 31 2018 at 21:49):

Well, if there's a terminology issue, rerunning won't change the outcome

view this post on Zulip John Moehrke (Oct 31 2018 at 21:49):

2018-10-31T21:40:36.5102961Z [java] -tx cache miss: $validate {null#logic-library: "null"}: "null" for Include All codes from http://terminology.hl7.org/CodeSystem/library-type

view this post on Zulip Josh Mandel (Oct 31 2018 at 21:49):

But FYI you can always kick off another build by pushing another commit.

view this post on Zulip John Moehrke (Oct 31 2018 at 21:57):

so, make a small change that I push in another commit on the branch? (sorry I am struggling with Git terms)

view this post on Zulip Josh Mandel (Oct 31 2018 at 21:58):

That'd do it, yeah. I can also kick this off for you, but I don't see how a rebuild would change anything here.

view this post on Zulip John Moehrke (Oct 31 2018 at 21:59):

the change I made had nothing to do with terminology... so I expect the problem was not foundational to my change

view this post on Zulip Rob Hausam (Oct 31 2018 at 22:00):

I'm not sure if the -tx cache miss is the issue. There are several of these -tx cache miss notifications that I typically see in the build that don't normally seem to be an issue. The build got to [java] ...validate library-exclusive-breastfeeding-cds-logic before it failed.

view this post on Zulip Josh Mandel (Oct 31 2018 at 22:00):

Huh, okay. That's interesting then.

view this post on Zulip Josh Mandel (Oct 31 2018 at 22:00):

Maybe a memory error.

view this post on Zulip John Moehrke (Oct 31 2018 at 22:01):

the end of the log is rather abrupt

view this post on Zulip Josh Mandel (Oct 31 2018 at 22:01):

I'll kick off a rebuild right now

view this post on Zulip John Moehrke (Oct 31 2018 at 22:01):

thanks

view this post on Zulip Rob Hausam (Oct 31 2018 at 22:01):

sounds reasonable

view this post on Zulip Josh Mandel (Oct 31 2018 at 22:02):

building.

view this post on Zulip Josh Mandel (Oct 31 2018 at 22:03):

In general, we have a lot of trouble constraining the amount of memory the build process uses. We have a _JAVA_OPTIONS=-Xmx3200m env var which (I thought) was supposed to limit usage, but even with just two builds running at the same time, our 16 GB VM sometimes kills one.

view this post on Zulip Josh Mandel (Oct 31 2018 at 22:06):

16999 ubuntu    20   0 7266128 3.931g  24152 S 335.5 25.1   6:51.21 java
12247 ubuntu    20   0 7480120 4.743g  24044 S 199.3 30.3  82:16.41 java

I'm not sure how it gets to 4.7 GB with the limits that should be imposed.

view this post on Zulip Josh Mandel (Oct 31 2018 at 22:08):

Yeah, these two jobs are pushing very close to 12 GB, which is the amount of free RAM, even though they should be constrained to <6.4 GB between them.

view this post on Zulip Josh Mandel (Oct 31 2018 at 22:10):

Java helpfully notes:

2018-10-31T21:41:30.5755733Z      [java] Picked up _JAVA_OPTIONS: -Xmx3200m

in its output -- but then doesn't respect the -Xmx limit. Am I misunderstanding how this limit is fundamentally supposed to work? @Grahame Grieve @James Agnew ? Is there a different way to prevent Java from eating more than the available RAM?

view this post on Zulip Rob Hausam (Oct 31 2018 at 22:15):

I always thought that limit did work properly when I've used it (as when I need a lot of RAM for an OWL reasoner to classify a large ontology). In those cases for me the JVM (normally using Oracle) hasn't seemed to exceed the set limit. So I don't know why it would in this case, since it says that it successfully "picked up" the option.

view this post on Zulip Rob Hausam (Oct 31 2018 at 22:23):

The -Xmx option applies to only the heap space, so the process can use more RAM than that (not sure how much more is possible). And if you exceed the heap space you'll get an out of memory error, so that by itself wouldn't actually solve the problem.

view this post on Zulip Grahame Grieve (Oct 31 2018 at 22:42):

sig137 = running out of heap
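
For reference, shells such as Bash report an exit status of 128 + N when a process is terminated by signal N, so 137 decodes to signal 9 (SIGKILL on Linux). That fits the earlier remark that the VM sometimes kills one of the builds. A minimal sketch of the arithmetic, where the class and method names are hypothetical and only there for illustration:

```java
public class ExitStatus {
    /**
     * Returns the signal number encoded in a shell exit status,
     * or -1 if the status does not follow the 128 + N convention.
     */
    static int signalFromExitStatus(int status) {
        return status > 128 ? status - 128 : -1;
    }

    public static void main(String[] args) {
        int status = 137; // the code reported by the failed build
        int signal = signalFromExitStatus(status);
        // prints "exit 137 -> signal 9"
        System.out.println("exit " + status + " -> signal " + signal);
    }
}
```

A JVM that merely exhausts its own heap would normally raise java.lang.OutOfMemoryError rather than disappear with a SIGKILL, which may be why the log ends so abruptly here.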

view this post on Zulip Grahame Grieve (Oct 31 2018 at 22:42):

what's running out of memory? local, or the build?

view this post on Zulip John Moehrke (Oct 31 2018 at 22:47):

the second attempt built successfully. so it seems it was a temporary infrastructure failure.

view this post on Zulip Josh Mandel (Oct 31 2018 at 23:33):

The build is running out, when two competing Java builds run simultaneously. There's plenty of RAM on the machine (16 GB) to accommodate both, but I can't figure out how to limit each process effectively.

view this post on Zulip Josh Mandel (Oct 31 2018 at 23:33):

I can always just throw more VMs at the problem, but I'd like to understand this.

view this post on Zulip Josh Mandel (Nov 01 2018 at 00:52):

jinfo on a running build (currently using 5.5 GB RAM) shows me:

VM Flags:
Non-default VM flags: -XX:CICompilerCount=4 -XX:CompressedClassSpaceSize=796917760 -XX:InitialHeapSize=1468006400 -XX:MaxHeapSize=3355443200 -XX:MaxMetaspaceSize=805306368 -XX:MaxNewSize=1118306304 -XX:MinHeapDeltaBytes=524288 -XX:NewSize=489160704 -XX:OldSize=978845696 -XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseParallelGC
Command line:  -Xmx2000m -Xms1400m -XX:MaxMetaspaceSize=768m -Djava.awt.headless=true -Djava.util.logging.config.file=logging.properties -Xmx10000m -Xmx3200m

... whereas I would have thought this should be limited to Xmx + MaxMetaspaceSize == 3200 + 768 == ~4000 MB.
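
The byte values in the jinfo output can be converted to check that estimate. The sketch below uses only the two flags quoted above (MaxHeapSize and MaxMetaspaceSize); the class name and helper are hypothetical. It also shows that the last -Xmx on the command line (3200m) is the one in effect, since MaxHeapSize works out to exactly 3200 MiB.

```java
public class MemoryBudget {
    static final long MIB = 1024L * 1024L;

    /** Converts a byte count to whole mebibytes. */
    static long toMiB(long bytes) {
        return bytes / MIB;
    }

    public static void main(String[] args) {
        long maxHeap = 3355443200L;     // -XX:MaxHeapSize from the jinfo output
        long maxMetaspace = 805306368L; // -XX:MaxMetaspaceSize from the jinfo output

        System.out.println("heap cap      = " + toMiB(maxHeap) + " MiB");      // 3200 MiB
        System.out.println("metaspace cap = " + toMiB(maxMetaspace) + " MiB"); // 768 MiB
        System.out.println("sum           = " + toMiB(maxHeap + maxMetaspace) + " MiB"); // 3968 MiB
    }
}
```

This sum covers only the heap and metaspace. Other JVM regions, such as thread stacks, the JIT code cache, direct buffers, and garbage collector bookkeeping, are not bounded by these two flags, so a resident size above roughly 4000 MB is not by itself a sign that -Xmx is being ignored.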

view this post on Zulip Josh Mandel (Nov 01 2018 at 01:03):

Looking at the running process with https://github.com/patric-r/jvmtop I see "NHMAX" is 1700 MB, so that would account for total usage approaching 5 GB, though I'm not sure what factors contribute to that NHMAX value.

view this post on Zulip Josh Mandel (Nov 01 2018 at 01:05):

It's getNonHeapMemoryUsage ----> getMax()
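
The call named above belongs to the standard java.lang.management API. As a rough sketch, the same numbers can be read from inside a JVM as follows; note that this inspects the JVM it runs in, not a separate build process the way jvmtop does, and the class name and describe helper are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

public class NonHeapMax {
    /** Formats a byte count in MiB; getMax() returns -1 when no maximum is defined. */
    static String describe(long bytes) {
        return bytes < 0 ? "undefined" : (bytes / (1024L * 1024L)) + " MiB";
    }

    public static void main(String[] args) {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        MemoryUsage nonHeap = memory.getNonHeapMemoryUsage();

        // The heap maximum is what -Xmx controls.
        System.out.println("heap max:      " + describe(heap.getMax()));
        // The non-heap maximum is reported separately from the heap.
        System.out.println("non-heap max:  " + describe(nonHeap.getMax()));
        System.out.println("non-heap used: " + describe(nonHeap.getUsed()));
    }
}
```

Running it with the same options as the build (for example -Xmx3200m -XX:MaxMetaspaceSize=768m) would show the heap and non-heap maxima as separate figures, which is consistent with total usage exceeding the -Xmx value.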


Last updated: Apr 12 2022 at 19:14 UTC