FHIR Chat · reference patients in EHR FHIR sandboxes · patient empowerment

Stream: patient empowerment

Topic: reference patients in EHR FHIR sandboxes


Virginia Lorenzi (Jul 16 2020 at 04:41):

"Also, I've advocated in the past that we need reference patients with detailed medical histories that we can register in each of the EHR test systems, and I think this dovetails with that concern. If you actually go look at each of the test sandboxes, we'll see the Jason and Jessica Argonaut in Epic, we'll see Nancy SMART and family in Cerner; the Society of Imaging Informatics in Medicine has the Siim family, and so forth." - said by @Abigail Watson

@Danielle Friend @Hans Buitendijk @Jeffrey Danford @Josh Mandel @kevin - This doesn't sound that hard to do and would be really useful for the patient apps that are aggregating data - could the EHRs create this? The idea is for all of the EHR sandboxes to have information about the same set of patients, emulating a real-world scenario where a patient has visits at multiple providers which use different EHRs. This would be good for Argonaut as well as for CARIN.

Virginia Lorenzi (Jul 16 2020 at 04:42):

@Ryan Howells All the CARIN sandboxes should have information on the same patient - see the thread.

Abbie Watson (Jul 16 2020 at 04:43):

(deleted)

John Moehrke (Jul 16 2020 at 12:25):

I think the way to make this work is to define the patients and their medical history in technology-independent ways. Providing these histories as FHIR resources is "leading the witness". But if we define the medical history in human-readable terms, making clear what facts are important to record, then we also prove that each system gets really close to the same results. Some variation needs to be allowed for, but the interoperable aspects that there are standards implementation guides for (e.g. FHIR US Core, C-CDA, IPS, etc.) should come out of each of these the same.

Josh Mandel (Jul 16 2020 at 14:28):

@David Kreda I'm mentioning you here because your work on the Sync for Science "Discovery" app touches on the same needs, using SMART on FHIR. It involved creating a simulated, fragmented-across-multiple-providers SMART on FHIR data set, and tools for viewing a reassembled view.

Cooper Thompson (Jul 16 2020 at 15:50):

One option to look at is using the test data used for 2015-era ONC certification. Since most EHRs will go through certification, at least some of their environments probably have those patients and associated data already - just maybe not the public sandboxes.

Vassil Peytchev (Jul 16 2020 at 15:53):

Will future Inferno-based certification include such sample patients? @Reece Adamson

Abbie Watson (Jul 16 2020 at 18:13):

Ideally, we would want this to be computable though, so that utilities like Inferno and Touchstone can leverage it. Maybe we don't specify the FHIR resources, but if we specify the LOINC and SNOMED codes, then we can reasonably expect Observations, Procedures, and other resources built around those codes in a consistent way.
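The code-level spec Abbie describes could be as small as a table of terminology bindings. As a rough sketch (the LOINC and SNOMED codes below are real concepts, though the display text is approximate; the spec format and function names are hypothetical, not from any of the projects in this thread):

```python
# Hypothetical code-level reference spec: pin the terminology, not the full
# FHIR resources, and let each EHR build its own resources around the codes.
REFERENCE_SPEC = [
    # (resource type, code system, code, display)
    ("Observation", "http://loinc.org", "94531-1",
     "SARS-CoV-2 RNA panel - Respiratory specimen by NAA with probe detection"),
    ("Condition", "http://snomed.info/sct", "44054006",
     "Diabetes mellitus type 2"),
]

def expected_skeleton(resource_type, system, code, display):
    """Minimal shape each sandbox would be expected to surface for a code."""
    return {
        "resourceType": resource_type,
        "code": {
            "coding": [{"system": system, "code": code, "display": display}]
        },
    }
```

A validation utility could then check each sandbox's output against the spec by code alone, without dictating the rest of the resource's contents.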

Reece Adamson (Jul 16 2020 at 18:56):

I can't speak to future certification efforts, but the current Inferno tests are designed to allow systems to bring their own data. That being said, we do offer a standard set of data, and scripts to create your own unique set, which systems can load to meet the certification criteria.

  • The exact set loaded into the reference server is here
  • The script that runs Synthea, down-selects to a minimal set, and makes sure all the must-supports are covered is here
  • And a separate reference set created from that script is here

Based on the discussion I don't think these resources are the solution, but they might be useful. I do think it's a really cool idea and would be useful! Particularly showing resources fragmented across systems would be really cool IMO. @Jason Walonoski would probably be interested in this thread based on his experience with Synthea.

Jeffrey Danford (Jul 17 2020 at 13:06):

The idea is good, but the hard part is coming up with the data set. Sandboxes are usually populated with a template used to create new internal testing systems, and the data in them is usually quite sparse. We've been looking at loading Synthea data on all our public R4 sandboxes. Is there a better (defined as more comprehensive and more realistic) set of data we should be looking into?

Cooper Thompson (Jul 17 2020 at 13:53):

I think the ONC Cures-era question is more for the ONC-ATLs that will be using Inferno to certify EHRs, rather than for Inferno itself. I believe for the MU3 era, the ONC-ATLs had a pre-defined set of patient data to use. It was probably different data across the ONC-ATLs, though. However, that might be good, as it would mean we'd have a few different data sets.

Virginia Lorenzi (Jul 17 2020 at 17:56):

I think @Abigail Watson had some ideas on the data, and perhaps we can find people to resource this if there is a non-techie-friendly template!

Abbie Watson (Jul 18 2020 at 12:49):

While there’s a clear use case for using Synthea, the one concern that I have with that approach is that the data is statistically generated, so it can be difficult to ensure that a specific reference patient exists with specific conditions.

My personal take is that the issue at hand is essentially an issue of patient matching and longitudinal records; but it then is applicable to a bunch of other tasks. For what I’ve been advocating, I’d be more inclined to take something like the ONC-ATL reference patients, give them to the Clinicians on FHIR and Patient Empowerment groups to develop out a detailed master longitudinal record for one (or a few) of the patients involved, and then figure out some way to break the record up into parts so that different groups can load up some portion of the record.

On the other hand, Synthea uses seeds to generate patients, and can ensure a consistent cohort of patients is generated, even if specific resources vary. We might get some mileage out of selecting a specific reference city and seed value - R4 resources using a ‘Washington DC’ location and a seed of ‘12345’, for example. Then vendors could use whichever pipeline they want, and we might get some semblance of repeatability and consistent patients.

@Jason Walonoski - Does the above match up to your understanding of how Synthea works? Is there a way to force Synthea to use a specific patient across pipelines, so as to generate a longitudinal record?
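For reference, Synthea's run script does accept a seed on the command line (a `-s <seed>` flag in the builds I've seen, alongside the state/city arguments), which is roughly the invocation Abbie describes. The repeatability property itself is easy to illustrate with a toy seeded generator - the names and structure below are purely illustrative, not Synthea's:

```python
import random

# Toy stand-in for a seeded synthetic-patient generator (NOT Synthea itself).
# The point: fixing the seed fixes the cohort, no matter who runs the pipeline
# or how many times it runs.
FIRST_NAMES = ["Martha", "Jason", "Jessica", "Nancy"]
LAST_NAMES = ["Argonaut", "Smart", "Kildare", "Willis"]

def generate_cohort(seed, size=3):
    """Return a reproducible list of patient names for a given seed."""
    rng = random.Random(seed)  # local RNG; does not disturb global state
    return [f"{rng.choice(FIRST_NAMES)} {rng.choice(LAST_NAMES)}"
            for _ in range(size)]
```

Here `generate_cohort(12345)` yields the same names on every run and every machine, which is the cross-vendor repeatability we'd want a pinned Synthea seed and settings file to provide.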

Ryan Howells (Jul 18 2020 at 21:58):

100% agree @Virginia Lorenzi although it's unclear what the solution is. Hey @Aaron Seib @Dave Hill : Who at MITRE was working on this at one of our past connectathons? Any progress you'd like to bring the group up to speed on?

Dave deBronkart (Jul 18 2020 at 23:05):

Hi @Ryan Howells - which of Virginia's posts were you giving a 100 to? Top of thread or yesterday's note to Abbie?

Ryan Howells (Jul 19 2020 at 00:37):

Top of thread. We have been in need of real synthetic data for quite some time. Unclear to me what else is being recommended. Hoping others can chime in.

John Moehrke (Jul 20 2020 at 13:37):

Many have tried. It is mostly a momentum problem. See Google's Simulated Hospital - https://github.com/google/simhospital

Debi Willis (Jul 21 2020 at 18:59):

I like @Jeffrey Danford's idea of using Synthea data. One of the issues we are seeing is that different EHRs give us data in different ways. I would like to pull data from the same patient in each EHR sandbox to see where the data are the same/different. One example is device data: we are getting data that is in different formats, and we can't really tell whether the data was entered wrong in the EHR or whether it was incomplete. Recently, we received COVID results in formats that are not standard. We could not tell if the data was not entered correctly in the EHR or if for some weird reason a COVID test was handled differently than any other positive/negative test. Having the same data in all sandboxes would enable app developers to more quickly identify problems in conformity.

I know Abigail's objection was because she wanted a longitudinal record. Are there any other objections to using Synthea?
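The same-patient-across-sandboxes comparison Debi describes could be automated once every sandbox carries the reference data. A minimal sketch, assuming each sandbox's data has already been fetched and parsed into a list of FHIR Observation dicts (the function names are hypothetical):

```python
def observation_signature(obs):
    """Collapse a FHIR Observation (parsed JSON dict) down to the parts that
    should match across sandboxes: code system, code, value, and unit."""
    coding = obs["code"]["coding"][0]
    qty = obs.get("valueQuantity", {})
    return (coding["system"], coding["code"], qty.get("value"), qty.get("unit"))

def sandbox_diff(observations_a, observations_b):
    """Symmetric difference of signatures: everything one sandbox reports
    differently from the other for the same reference patient."""
    sigs_a = {observation_signature(o) for o in observations_a}
    sigs_b = {observation_signature(o) for o in observations_b}
    return sigs_a ^ sigs_b
```

With identical reference data loaded everywhere, a non-empty diff points at a conformance problem (e.g. the same heart-rate result coded with unit "/min" in one sandbox and "beats/minute" in another) rather than at a genuine clinical difference.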

Dave deBronkart (Jul 22 2020 at 11:44):

Debi Willis said:

I would like to pull data from the same patient in each EHR sandbox to see where the data are the same/different.

Oo oo!

Abbie Watson (Jul 22 2020 at 14:05):

My concerns aren't so much an objection - happy to use Synthea if we can. I'm just concerned that the default settings of Synthea will produce statistical data and we won't be able to guarantee a consistent cohort of patients. When we've tried doing this before with HAPI and Node servers, we wound up having to write a whole extra dashboard to do data validation because the data turned out a lot fuzzier than we expected. It's not as simple as just 'use Synthea' - a pipeline and settings file have to be defined to make this all happen.

I suspect that we'll need to turn off some of the statistical functionality and have a more deterministic pipeline for this particular task. Doable, but not Synthea's default config.

John Moehrke (Jul 22 2020 at 16:53):

Using Synthea simply to produce THE dataset, where everyone uses the same dataset, might address that problem. We leverage the benefit of Synthea to generate pseudo data, so that we don't get people worried about 'real patient data', but we have a single dataset so that we can prove data quality, because we know what we asked them to record for patient Martha.

Debi Willis (Jul 22 2020 at 22:33):

So how do we go about making that happen? It seems like it would also help the EHR vendors: they would not have to come up with the dataset, AND there would be many people able to pull data into their apps to test consistency of format among the EHR vendors. (Which is the whole idea of standardization and interoperability.) Any ideas on how to get that moving among the EHR vendors? @Jeffrey Danford, thanks for making that suggestion. Would Allscripts be able to do that to start the momentum?

John Moehrke (Jul 23 2020 at 13:13):

I think the problem is not technical. This thread has uncovered plenty of historic efforts to do this. I think the problem is motivating the EHRs (and PHRs and registries) to spend the energy entering the given data set into their systems. This might be forced by government (ONC) -- aka the "stick". I might recommend that the Patient Empowerment group produce a campaign that makes this something the EHR vendors want to do -- aka the "carrot". I don't know what that carrot is, but we would be far more capable of doing carrot things.

Debi Willis (Jul 26 2020 at 20:19):

Could the carrot be that they would be able to really test their conformance, with a large group of 3rd-party apps pulling in the same data from multiple EHRs? It would seem that this would benefit everyone.

Brendan Keeler (Jul 27 2020 at 14:22):

Carequality is moving down this path, in that Implementers will be required to set up test environments (and test patients). That seems to align with the ask here, especially as the FHIR-based exchange implementation guide comes into use.


Last updated: Apr 12 2022 at 19:14 UTC