Stream: cql
Topic: Memory Hog R4FhirModelResolver.initialize
Vitor Pamplona (Dec 14 2021 at 17:06):
After some debugging, I realized that initializing ResourceTypes one by one causes multiple Maps and Lists within FhirContext to be rebuilt, triggering several GC cycles during loading.
It would be nice if the method made a list of all ResourceTypes to initialize and only then called scanResourceTypes with the entire enum-based array. This would remove some of the memory churn the lib currently struggles with.
Does that make sense? The necessary FhirContext methods don't seem to be public, however, which leads me to believe we might need to change HAPI if we really need this upfront loading of all ResourceTypes.
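To illustrate the cost being described, here is a minimal toy model. It is not HAPI code: `ToyRegistry` and `BatchVsOneByOne` are hypothetical names, and the assumption (taken from the description above) is only that each scan call rebuilds the lookup map from scratch. Under that assumption, registering N types one at a time re-copies on the order of N²/2 entries, while a single batched scan re-copies none.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy stand-in for a context that rebuilds its lookup map on every scan (not HAPI code). */
class ToyRegistry {
    private Map<String, String> byName = new HashMap<>();
    long entriesCopied = 0; // entries re-copied across all scan calls

    /** Registers a batch of names; the whole map is rebuilt once per call. */
    void scan(Collection<String> names) {
        entriesCopied += byName.size();
        Map<String, String> rebuilt = new HashMap<>(byName); // copy everything seen so far
        for (String n : names) {
            rebuilt.put(n.toLowerCase(), n);
        }
        byName = rebuilt;
    }

    int size() {
        return byName.size();
    }
}

class BatchVsOneByOne {
    static List<String> names(int count) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            out.add("Type" + i);
        }
        return out;
    }

    /** One scan call per type, as in the loop-based initialization. */
    static long copiedOneByOne(int count) {
        ToyRegistry r = new ToyRegistry();
        for (String n : names(count)) {
            r.scan(Collections.singletonList(n));
        }
        return r.entriesCopied;
    }

    /** A single scan call with the full list. */
    static long copiedInBatch(int count) {
        ToyRegistry r = new ToyRegistry();
        r.scan(names(count));
        return r.entriesCopied;
    }

    public static void main(String[] args) {
        System.out.println("one by one: " + copiedOneByOne(150));
        System.out.println("batched:    " + copiedInBatch(150));
    }
}
```

The real FhirContext does more work per scan than copying one map, so the toy only shows the shape of the problem, not its actual size.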
Vitor Pamplona (Dec 14 2021 at 17:12):
Removing the initialization cuts the time for a fresh evaluation (loading fhir and everything else) by 50%
JP (Dec 14 2021 at 21:32):
Unfortunately, because the FhirContext is sealed post-initialization, we have to do all initialization upfront. From the perspective of the cql-engine, any arbitrary CQL can be executed, so it doesn't know the types beforehand. There are some limited scenarios where, if your CQL content were completely static, you could select a set of Resource types to initialize, but most scenarios are currently the opposite: the CQL is dynamic. We also expect the ModelResolver to be long-lived, so the initialization is a one-time cost.
JP (Dec 14 2021 at 21:36):
IOW, exposing such an API would both require upstream changes and not be usable in most scenarios.
JP (Dec 14 2021 at 21:38):
An alternative approach would be to update the FhirContext so that new types can be added at runtime. In that case only the types you actually use would be loaded, and we wouldn't need to do the initialization upfront. I haven't dug into it to see if that would be possible.
JP (Dec 14 2021 at 21:39):
If the FhirContext supported that approach we wouldn't have to pre-init either and the startup time would be quicker.
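A minimal sketch of the on-demand idea, assuming a context that could accept new types after startup (which, as noted above, FhirContext is not confirmed to support). `LazyTypeRegistry` and its `loader` function are hypothetical and stand in for whatever would actually build a type definition.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical registry that loads a type definition the first time it is asked for. */
class LazyTypeRegistry<T> {
    private final Map<String, T> loaded = new ConcurrentHashMap<>();
    private final Function<String, T> loader;
    final AtomicInteger loads = new AtomicInteger(); // counts real loads, for illustration

    LazyTypeRegistry(Function<String, T> loader) {
        this.loader = loader;
    }

    /** Returns the cached definition, loading it on first use only. */
    T resolve(String typeName) {
        return loaded.computeIfAbsent(typeName, name -> {
            loads.incrementAndGet();
            return loader.apply(name);
        });
    }
}
```

With this shape, startup cost is paid per type actually used, at the price of a slower first resolution for each type.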
Vitor Pamplona (Dec 14 2021 at 21:42):
Agreed, but the question is: if we have to pre-init, can we do it all in one load? Because right now the procedure is just extremely wasteful.
JP (Dec 14 2021 at 21:42):
As far as I'm aware, there's no publicly available API that supports that.
Vitor Pamplona (Dec 14 2021 at 21:43):
I am ready to use reflection and force that method to be public :)
JP (Dec 14 2021 at 21:45):
There are already a couple of places in the engine where we do that, so I'm not opposed outright. Obviously an official API would be better. :smile:
JP (Dec 14 2021 at 21:46):
As for the resource type list, for now it needs to be everything plus the few custom types the engine adds to patch a couple of gaps.
Vitor Pamplona (Dec 14 2021 at 23:16):
It's ugly but it works!
The previous method is 43% slower than the new one, on the desktop.
Time to load R4 on desktop, one by one (previous method):
3.305 s
4.006 s
3.301 s
3.152 s
3.311 s
Average 3.41 +/- 0.24 seconds
Time to load R4 on desktop, by reflection (new method):
2.345 s
2.293 s
2.410 s
2.382 s
2.438 s
Average 2.37 +/- 0.04 seconds
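For anyone wanting to re-derive the summary figures, here is a small sketch that recomputes them from the ten timings listed above. The class and method names are made up for illustration. The "+/-" values in the post look consistent with a mean absolute deviation rather than a standard deviation; that is an inference from the numbers, not something stated in the thread.

```java
/** Recomputes the averages and spreads from the timings posted above. */
class LoadTimes {
    static final double[] ONE_BY_ONE = {3.305, 4.006, 3.301, 3.152, 3.311};
    static final double[] REFLECTION = {2.345, 2.293, 2.410, 2.382, 2.438};

    static double mean(double[] xs) {
        double sum = 0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.length;
    }

    /** Mean absolute deviation from the mean (assumed to be what "+/-" denotes). */
    static double meanAbsDev(double[] xs) {
        double mu = mean(xs);
        double sum = 0;
        for (double x : xs) {
            sum += Math.abs(x - mu);
        }
        return sum / xs.length;
    }

    public static void main(String[] args) {
        double oldMean = mean(ONE_BY_ONE);
        double newMean = mean(REFLECTION);
        System.out.printf("one by one: %.3f +/- %.3f s%n", oldMean, meanAbsDev(ONE_BY_ONE));
        System.out.printf("reflection: %.3f +/- %.3f s%n", newMean, meanAbsDev(REFLECTION));
        // Ratio above 1 means the one-by-one method takes that many times longer.
        System.out.printf("old / new:  %.3f%n", oldMean / newMean);
    }
}
```

The old-to-new ratio comes out a little above 1.4, in line with the "43% slower" figure quoted, which is the same thing as the new method taking about 30% less time.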
Final impl below
protected void initialize() {
    // HAPI has some bugs where it's missing annotations on certain types. This patches that.
    this.fhirContext.registerCustomType(AnnotatedUuidType.class);

    // The context loads Resources on demand which can cause resolution to fail in certain cases
    // This forces all Resource types to be loaded.
    /*
    for (Enumerations.ResourceType type : Enumerations.ResourceType.values()) {
        // These are abstract types that should never be resolved directly.
        switch (type) {
            case DOMAINRESOURCE:
            case RESOURCE:
            case NULL:
                continue;
            default:
        }
        this.fhirContext.getResourceDefinition(type.toCode());
    }
    */

    // force calling of validateInitialized();
    this.fhirContext.getResourceDefinition(Enumerations.ResourceType.ACCOUNT.toCode());

    try {
        // Read the private name-to-class map so every type can be collected up front.
        Field f = this.fhirContext.getClass().getDeclaredField("myNameToResourceType");
        f.setAccessible(true);
        @SuppressWarnings("unchecked")
        Map<String, Class<? extends IBaseResource>> myNameToResourceType =
            (Map<String, Class<? extends IBaseResource>>) f.get(this.fhirContext);

        List<Class<? extends IBaseResource>> toLoad = new ArrayList<>(myNameToResourceType.size());
        for (Enumerations.ResourceType type : Enumerations.ResourceType.values()) {
            // These are abstract types that should never be resolved directly.
            switch (type) {
                case DOMAINRESOURCE:
                case RESOURCE:
                case NULL:
                    continue;
                default:
            }
            // The map is keyed by lower-cased resource name.
            String key = type.toCode().toLowerCase();
            if (myNameToResourceType.containsKey(key)) {
                toLoad.add(myNameToResourceType.get(key));
            }
        }

        // Scan all collected types in a single call instead of one call per type.
        Method m = this.fhirContext.getClass().getDeclaredMethod("scanResourceTypes", Collection.class);
        m.setAccessible(true);
        m.invoke(this.fhirContext, toLoad);
    } catch (ReflectiveOperationException | IllegalArgumentException | SecurityException e) {
        // Covers the same exception types the original handled in separate catch blocks.
        e.printStackTrace();
    }
}
Vitor Pamplona (Dec 15 2021 at 01:21):
PR up: https://github.com/DBCG/cql_engine/pull/521
Michael Riley (Dec 20 2021 at 17:28):
- It's very possible with the HAPI FhirContext to create your own "Context" with only the types you want to support. You have to create your own context that implements the base FhirContext and manually add classes one at a time.
- The CQL execution is dynamic, but aren't the retrieval contexts themselves very static? In the CQL grammar I've only been able to retrieve exactly one resource type per retrieval, which makes sense: one resource domain per retrieval. A simple collection of the retrieval contexts in a script would reveal which types were needed (right?)
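As a rough illustration of "collecting the retrieval contexts in a script", the sketch below scans CQL source text for retrieve expressions such as `[Condition: "Diabetes"]` and gathers the type names. `RetrieveTypeScanner` is a hypothetical helper, and the regex is a deliberate simplification: it can be fooled by comments, string literals, or indexers whose index starts with a capitalized identifier. A sturdier approach would presumably walk the Retrieve nodes of the translated ELM instead of the raw text.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: collects type names from CQL retrieves by scanning the text. */
class RetrieveTypeScanner {
    // "[", optional model qualifier such as "FHIR.", a capitalized type name,
    // then either ":" (terminology filter follows) or "]" (plain retrieve).
    private static final Pattern RETRIEVE =
        Pattern.compile("\\[\\s*(?:\\w+\\.)?([A-Z]\\w*)\\s*(?::|\\])");

    static Set<String> retrievedTypes(String cql) {
        Set<String> types = new TreeSet<>();
        Matcher m = RETRIEVE.matcher(cql);
        while (m.find()) {
            types.add(m.group(1));
        }
        return types;
    }

    public static void main(String[] args) {
        String cql =
            "define \"Diabetes\": [Condition: \"Diabetes\"]\n"
            + "define \"Labs\": [FHIR.Observation: \"HbA1c\"]\n"
            + "define \"Me\": [Patient]\n";
        System.out.println(retrievedTypes(cql));
    }
}
```

This only helps when the full set of CQL is known before the engine is initialized.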
JP (Dec 20 2021 at 17:44):
Yes, if your content is static (i.e. you know the full set of CQL you will ever execute ahead of time, prior to initializing the CQL engine) then you can know the full set of requirements ahead of time. In most of the deployments we're currently supporting the content is not static. For example, the cqf-ruler allows execution of arbitrary CQL via the $cql operation. Or an Android SDK could download the required Libraries for a guideline at runtime.
Last updated: Apr 12 2022 at 19:14 UTC