Stream: bulk data
Topic: 5 years of claims for 8 states: O(10^9) observations
Dan Connolly (Jun 11 2019 at 15:55):
Help me find how people get data into FHIR at scale?
In the GROUSE project at the University of Kansas Medical Center, we're dealing with Medicare and Medicaid claims for everyone in 8 states (and growing... 20M+ beneficiaries) from 2011-2017 (and growing). We've loaded this into an i2b2 datamart; the observation_fact table has over 70B facts.
https://github.com/kumc-bmi/grouse
Before that, we developed ETL from the KU Hospital's Epic installation into i2b2. More like 3B observations on O(100K) patients.
I'm trying to find related work with FHIR. The focus seems to be on writing mobile apps that query a FHIR service about one patient at a time. But how does the data get into the FHIR service? I'm looking at the docs for http://hapifhir.io/ and https://github.com/Microsoft/fhir-server and if there's documentation about how to get data in there in bulk, I don't see it.
What am I missing? Clue me in, please?
Grahame Grieve (Jun 11 2019 at 15:57):
we have not defined a standard way for bulk import. It just so happens that there's a pop-up discussion here at DevDays today about exactly that
Dan Connolly (Jun 11 2019 at 16:00):
where / when? On https://www.devdays.com/us/schedule/ , I see something Wednesday, but it's not a pop-up.
Michele Mottini (Jun 11 2019 at 16:00):
"What am I missing?"
If you have a database / system that already contains the data, the way to go is to add a FHIR API on top of that, not to try to duplicate all the data into another system that happens to have a FHIR API.
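To make the "FHIR API on top of the existing store" idea concrete, here is a minimal sketch of the mapping step such a layer would need: turning one i2b2 observation_fact row into a FHIR Observation. Nothing here comes from the thread or from the GROUSE code. The function name, the `CODE_SYSTEMS` table, the `urn:local:` fallback, the `PREFIX:code` convention for `concept_cd`, and the choice of `status` are all assumptions made for illustration.

```python
from datetime import date

# Hypothetical prefix-to-system table; a real deployment would need its own.
CODE_SYSTEMS = {"LOINC": "http://loinc.org"}


def fact_to_observation(fact: dict) -> dict:
    """Map one observation_fact-style row (as a dict) to a FHIR Observation dict.

    Assumes concept_cd looks like "PREFIX:code" and start_date is a datetime.date.
    """
    prefix, _, code = fact["concept_cd"].partition(":")
    # Unknown prefixes fall back to a made-up local URN (an assumption).
    system = CODE_SYSTEMS.get(prefix, "urn:local:" + prefix)
    obs = {
        "resourceType": "Observation",
        "status": "final",  # assumption: loaded facts are treated as final
        "code": {"coding": [{"system": system, "code": code}]},
        "subject": {"reference": f"Patient/{fact['patient_num']}"},
        # Date-only value, so no time zone is needed.
        "effectiveDateTime": fact["start_date"].isoformat(),
    }
    # Only numeric facts carry a quantity; skip the unit when it is absent.
    if fact.get("valtype_cd") == "N" and fact.get("nval_num") is not None:
        quantity = {"value": fact["nval_num"]}
        if fact.get("units_cd"):
            quantity["unit"] = fact["units_cd"]
        obs["valueQuantity"] = quantity
    return obs


example = fact_to_observation({
    "patient_num": 42,
    "concept_cd": "LOINC:2345-7",
    "start_date": date(2015, 3, 1),
    "valtype_cd": "N",
    "nval_num": 98.0,
    "units_cd": "mg/dL",
})
```

A facade server would call something like this per row at query time, rather than copying the facts into a second database.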
Dan Connolly (Jun 11 2019 at 16:00):
And meanwhile, while there's no standard, what are people doing?
Grahame Grieve (Jun 11 2019 at 16:01):
the popup was announced this morning - @Dan Gottlieb can you provide details...
Dan Connolly (Jun 11 2019 at 16:02):
"add a FHIR API on top of that"
any pointers on related experience? Do I write something that has the whole scope of HAPI? Or can I plug in somewhere?
Michele Mottini (Jun 11 2019 at 16:03):
HAPI is pluggable into an existing data store
Dan Connolly (Jun 11 2019 at 16:03):
pointer to details, please? I'm looking without luck.
Grahame Grieve (Jun 11 2019 at 16:04):
you should ask on the HAPI stream. Are you here at DevDays?
Dan Connolly (Jun 11 2019 at 16:05):
yes, I'm here
Michele Mottini (Jun 11 2019 at 16:05):
http://hapifhir.io/doc_intro.html
Grahame Grieve (Jun 11 2019 at 16:05):
ask @James Agnew on the HAPI stream
Dan Gottlieb (Jun 11 2019 at 16:06):
The bulk import discussion is at 2:10pm in Sonora - it’s on the online schedule now
Dan Connolly (Jun 11 2019 at 16:07):
HAPI stream... I don't see it. Ugh.
Michele Mottini (Jun 11 2019 at 16:10):
HAPI server doc: http://hapifhir.io/doc_rest_server.html
Michele Mottini (Jun 11 2019 at 16:11):
If you are not a Java shop there are similar solutions for .NET
Dan Connolly (Jun 11 2019 at 16:19):
2:10 thanks. (schedule = https://www.devdays.com/us/schedule/ ?)
Dan Connolly (Jun 11 2019 at 16:19):
found hapi stream
Isaac Vetter (Jun 11 2019 at 16:23):
Dan's $import pop-up session: https://www.devdays.com/us/schedule/#event-459
Paul Church (Jun 11 2019 at 18:06):
The FHIR store in Google Cloud Healthcare has a bulk import operation that can load quite large datasets. The largest load that I know of was about 2.4 billion resources. SyntheticMass is around 400 million resources, which makes a nice load test.
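For anyone preparing files for a bulk load, a rough sketch of the serialization step may help. It writes resources as NDJSON (one JSON resource per line), which is the format the FHIR Bulk Data export specification uses. Whether a particular server's import accepts NDJSON, and how the files are handed over, is an assumption to check against that server's documentation; the `write_ndjson` helper is hypothetical.

```python
import io
import json
from typing import Iterable, TextIO


def write_ndjson(resources: Iterable[dict], out: TextIO) -> int:
    """Write each resource as one compact JSON line; return the number written."""
    count = 0
    for resource in resources:
        # json.dumps escapes newlines inside strings, so each resource stays on one line.
        out.write(json.dumps(resource, separators=(",", ":")))
        out.write("\n")
        count += 1
    return count


buffer = io.StringIO()
written = write_ndjson(
    [
        {"resourceType": "Patient", "id": "1"},
        {"resourceType": "Observation", "id": "2", "status": "final"},
    ],
    buffer,
)
lines = buffer.getvalue().splitlines()
```

Because the function takes any iterable, it can stream rows from a database cursor without holding billions of resources in memory.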
nicola (RIO/SS) (Jun 11 2019 at 18:29):
fhirbase and aidbox can handle such scale as well
Last updated: Apr 12 2022 at 19:14 UTC