Stream: implementers
Topic: FHIR R4 create large resource
Pavel Pilar (Aug 04 2021 at 12:25):
Hello,
we are struggling with large payloads on FHIR resource update. One of our CodeSystems is roughly 200 MB in size, and although we updated the settings on the Elasticsearch side as well as on the FHIR server and database side, we still cannot create the resource.
Currently it fails somewhere between Elasticsearch, the FHIR server and the database, probably on another timeout:
[Hibernate Search: Elasticsearch transport thread-2] ERROR o.h.s.exception.impl.LogErrorHandler [LogErrorHandler.java:71] HSEARCH000058: Exception occurred org.hibernate.search.exception.SearchException: HSEARCH400007: Elasticsearch request failed.
Request: PUT /ca.uhn.fhir.jpa.model.entity.resourcetable/ca.uhn.fhir.jpa.model.entity.ResourceTable/3102 with parameters {}
Response: null
...
at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:588)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Connection reset by peer
We are using HAPI FHIR 5.1 (R4) and the FHIR client's update method to create the resource. Our solution is deployed in AWS, and we are using RDS for the database.
Are there any guidelines for creating such a large resource (CodeSystem) on a FHIR server?
Thanks
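One workaround, if the server supports incremental terminology loading (recent HAPI versions offer delta-style CodeSystem operations such as $apply-codesystem-delta-add; check the docs for your version), is to avoid a single 200 MB request by splitting the concept list into batches. A minimal Python sketch of the splitting step, using a synthetic CodeSystem and a hypothetical URL:

```python
def chunk_codesystem(codesystem, batch_size=1000):
    """Split one large CodeSystem into smaller payloads.

    Each payload keeps the CodeSystem metadata but carries only a
    slice of the concepts, so no single HTTP request is huge.
    """
    meta = {k: v for k, v in codesystem.items() if k != "concept"}
    concepts = codesystem.get("concept", [])
    for start in range(0, len(concepts), batch_size):
        payload = dict(meta)
        payload["concept"] = concepts[start:start + batch_size]
        yield payload

# Small synthetic example (2500 concepts, batches of 1000):
cs = {
    "resourceType": "CodeSystem",
    "url": "http://example.org/fhir/CodeSystem/hmdb-subset",  # hypothetical
    "content": "complete",
    "concept": [{"code": f"code-{i}"} for i in range(2500)],
}
batches = list(chunk_codesystem(cs, batch_size=1000))
print(len(batches))                 # 3
print(len(batches[-1]["concept"]))  # 500
```

Each batch would then be posted separately; whether the server merges them into one code system depends on the operation used, so this is a sketch of the client side only.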
Alexander Kiel (Aug 04 2021 at 16:47):
Such big resources are rather unusual in FHIR. You may need a terminology server with special support for this CodeSystem, so that you don't need it in literal form. Which CodeSystem is it?
Nevertheless, although I can't help you with HAPI, I would be pleased if you gave Blaze a try. Blaze uses an embedded RocksDB database with a focus on performance.
I generated a CodeSystem using this Clojure code:

(require '[jsonista.core :as j])

(->> {:resourceType "CodeSystem"
      :concept
      (for [i (range 500000)]
        {:code (format "code-%d" i)
         :property
         (for [j (range 10)]
           {:code (format "property-code-%d" j)
            :valueString (format "property-value-%d" j)})})}
     (j/write-value-as-string)
     (spit "test.json"))
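For readers without a Clojure toolchain, the same synthetic CodeSystem can be generated in Python (a sketch mirroring the Clojure shape; small counts shown here, raise them to 500000 and 10 to approximate the ~300 MB file):

```python
import json

def make_codesystem(n_concepts, n_properties):
    """Build a synthetic CodeSystem with n_concepts concepts,
    each carrying n_properties string-valued properties."""
    return {
        "resourceType": "CodeSystem",
        "concept": [
            {
                "code": f"code-{i}",
                "property": [
                    {"code": f"property-code-{j}",
                     "valueString": f"property-value-{j}"}
                    for j in range(n_properties)
                ],
            }
            for i in range(n_concepts)
        ],
    }

# Small example; write it out the same way the Clojure code does.
cs = make_codesystem(3, 2)
print(json.dumps(cs)[:60])
```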
started Blaze with:

docker run -p 8080:8080 -v blaze-data:/app/data --rm -e JAVA_TOOL_OPTIONS=-Xmx4g samply/blaze:0.11.0

on a Linux desktop machine with 32 GB RAM and a quite slow SSD, and uploaded the CodeSystem from my Mac over LAN using:

% curl -d @test.json -H 'content-type:application/json' http://localhost:8080/fhir/CodeSystem > response.json
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  606M  100  303M  100  303M  4684k  4684k  0:01:06  0:01:06 --:--:-- 56.4M

As you can see, my CodeSystem is 303 MB in size, and the upload, including the complete download of the response, took about one minute.
John Silva (Aug 04 2021 at 18:40):
@Pavel Pilar - Are you trying to upload SNOMED or LOINC? We had trouble trying to load these LARGE CodeSystems into HAPI (an earlier version, though). I seem to remember @James Agnew mentioning that there is (or will be) built-in support for loading some of these large CodeSystems, in a way that works around their size and the resulting network timeouts. It's probably a good idea to ask this question in the #hapi channel.
Peter Jordan (Aug 04 2021 at 22:26):
@John Silva - most of the FHIR Terminology Servers don't place all the concepts from large Code Systems (e.g. SNOMED CT and LOINC) in CodeSystem resources - i.e. they set the value of CodeSystem.content to not-present. In these cases, the content is provided via ValueSets and, even then, some of the Servers impose restrictions on the number of concepts that can be returned from any given request, even allowing for paging. My approach is to persist SNOMED CT, LOINC, and others in a database and use a 'standard' multi-layered architecture to service requests; others might use the file system or even place the entire code system in memory.
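The not-present pattern described above can be illustrated with a minimal stub: the CodeSystem resource advertises the code system without embedding any concepts, so it stays tiny regardless of the terminology's true size, and clients fetch content via ValueSet operations instead. A sketch (field names follow the FHIR R4 CodeSystem definition; the URL and count are made up):

```python
import json

codesystem_stub = {
    "resourceType": "CodeSystem",
    "url": "http://example.org/fhir/CodeSystem/big-terminology",  # hypothetical
    "status": "active",
    "content": "not-present",  # concepts are NOT embedded in this resource
    "count": 350000,           # optional hint at the real number of concepts
}
print(json.dumps(codesystem_stub, indent=2))
```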
Pavel Pilar (Aug 05 2021 at 07:28):
@John Silva - We are trying to upload part of HMDB, about 120,000 concepts. Thanks @Alexander Kiel for the tips on Blaze; I'll give it a try. I'll also ask the question on the #hapi channel.
Last updated: Apr 12 2022 at 19:14 UTC