FHIR Chat · real-world data · implementers

siwei zhou (Mar 02 2022 at 16:39):

Is there any real-world FHIR JSON data that I can download from the Internet?

John Silva (Mar 02 2022 at 19:19):

Yes, if you go to the FHIR specification pages, every resource page has an Examples tab, with examples in JSON for that particular resource.
e.g. Patient examples: https://hl7.org/fhir/patient-examples.html

If you are looking for "complete FHIR datasets" that represent "real world" clinical data you probably want to look at Synthea:
https://synthetichealth.github.io/synthea/
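
For readers who download such a dataset, a minimal sketch of how one might inspect a FHIR Bundle in Python. The sample bundle here is a hand-written stand-in, not actual Synthea output, and the helper name `count_resource_types` is hypothetical.

```python
import json
from collections import Counter

# Hand-written stand-in for a FHIR Bundle (an assumption for illustration);
# a real exported bundle has many more entries and fields.
SAMPLE_BUNDLE = """
{
  "resourceType": "Bundle",
  "type": "transaction",
  "entry": [
    {"resource": {"resourceType": "Patient", "id": "example-1"}},
    {"resource": {"resourceType": "Encounter", "id": "example-2"}},
    {"resource": {"resourceType": "Observation", "id": "example-3"}},
    {"resource": {"resourceType": "Observation", "id": "example-4"}}
  ]
}
"""


def count_resource_types(bundle: dict) -> Counter:
    """Count the resources in a FHIR Bundle by their resourceType."""
    if bundle.get("resourceType") != "Bundle":
        raise ValueError("expected a FHIR Bundle")
    return Counter(
        entry["resource"]["resourceType"]
        for entry in bundle.get("entry", [])
        if "resource" in entry
    )


counts = count_resource_types(json.loads(SAMPLE_BUNDLE))
print(dict(counts))
```

To use it on a downloaded file, replace `json.loads(SAMPLE_BUNDLE)` with `json.load(...)` on the opened file.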

siwei zhou (Mar 03 2022 at 03:25):

Thanks for your help!
I am still concerned that Synthea data will not be convincing enough to use in the paper.

Lloyd McKenzie (Mar 03 2022 at 03:38):

If you find 'real' FHIR data sitting on a server anywhere, that's an indication of a serious security violation. Generally the only way to have access to 'real' data is to work within an organization that owns the data - and even then, it might go through a de-identification process first. Healthcare data is sensitive. It's not made publicly available.
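
To make the de-identification step concrete, here is a rough sketch only: it drops a few obvious direct identifiers from a FHIR Patient resource and coarsens the birth date. The field list and the helper name `strip_direct_identifiers` are illustrative assumptions. This is not a compliant de-identification procedure; a real process (for example under HIPAA) covers many more identifiers and every other resource type.

```python
import copy

# Element names come from the FHIR Patient resource; which ones to drop
# is an illustrative assumption, not a complete or compliant list.
DIRECT_IDENTIFIER_FIELDS = ("identifier", "name", "telecom", "address", "photo", "contact")


def strip_direct_identifiers(patient: dict) -> dict:
    """Return a copy of a Patient resource without obvious direct identifiers."""
    cleaned = copy.deepcopy(patient)  # leave the caller's data untouched
    for field in DIRECT_IDENTIFIER_FIELDS:
        cleaned.pop(field, None)
    # Coarsen the birth date to the year only (FHIR dates allow "YYYY").
    if "birthDate" in cleaned:
        cleaned["birthDate"] = cleaned["birthDate"][:4]
    return cleaned


# Made-up example record, not real patient data.
patient = {
    "resourceType": "Patient",
    "id": "example",
    "name": [{"family": "Example", "given": ["Pat"]}],
    "telecom": [{"system": "phone", "value": "555-0100"}],
    "gender": "female",
    "birthDate": "1970-04-12",
}
cleaned = strip_direct_identifiers(patient)
print(cleaned)
```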

Bret H (Mar 03 2022 at 03:46):

You'd need a bunch of patients who were no longer living, and some good longitudinal data on them, to get a real-world database. Maybe there's a foundation that would ask people to donate their data as part of a will? The Synthea data offers the state of the art and has been used in a large number of publications. The appropriateness of the data depends on your use case.

Look here: https://synthea.mitre.org/about
You might be interested in SyntheticMass.

Lloyd McKenzie (Mar 03 2022 at 03:55):

No longer living doesn't necessarily mean the patients' privacy rights are waived. Patients can certainly elect to make their data publicly available or available for research, but their data is unlikely to be representative, as those willing to share publicly are the ones least likely to have information in their records they consider 'sensitive'.

Bret H (Mar 03 2022 at 04:01):

Well... when they've been deceased for 50 years the rights change. However, your point on representation is accurate. Better to go with a synthetic database that is 'statistically similar' to some large group of people.

Bret H (Mar 03 2022 at 04:02):

Correction: 'greater than' 50 years. And I'm speaking from a US perspective (https://www.hhs.gov/hipaa/for-professionals/faq/1500/do-hipaa-protections-apply-to-the-health-information-of-individuals/index.html).

Lloyd McKenzie (Mar 03 2022 at 04:05):

Sure - but there tends to not be much discrete healthcare data on people who've been dead for 50+ years, let alone data exposed as FHIR :)

Lloyd McKenzie (Mar 03 2022 at 04:06):

Also, even if it existed, the data wouldn't necessarily be relevant to training AIs, as so much has changed in terms of risks, testing approaches, meds, etc.

Bret H (Mar 03 2022 at 04:08):

Indeed. You'd want an AI that was constantly learning as the system changed. And the AI itself, if widely providing input into clinical care, will alter the associations between action and data. It's quite a thing to attempt. If an AI is what you're working on, better to get a good collaboration.

Bret H (Mar 03 2022 at 04:09):

Unless you're 'simply' looking for the AI to find current associations to report, without an intervention, I suppose.

siwei zhou (Mar 03 2022 at 04:48):

I have already used Synthea data in my recent work. Thanks anyway.
BTW, I found this paper on PubMed: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6416981/

Bret H (Mar 03 2022 at 15:03):

That is a great reference for those who wander to this chat!

Irene Joseph (Mar 04 2022 at 02:14):

Has anyone reviewed or used the open-access MIMIC-III database? https://physionet.org/content/mimiciii-demo/1.4/

MIMIC-III is a large, freely-available database comprising deidentified health-related data associated with over 40,000 patients who stayed in critical care units of the Beth Israel Deaconess Medical Center between 2001 and 2012 [1]. The MIMIC-III Clinical Database is available on PhysioNet (doi: 10.13026/C2XW26).

John Silva (Mar 04 2022 at 15:19):

That's an interesting article - at least the overview. Yes, there are many variables in exactly how patient care is delivered; that's part of the reason for 'standards of care', 'care guidelines', and metrics to check that those guidelines are being followed and actually improving care (the PDCA loop). I suspect, though, that at this point Synthea data is the best freely available synthetic patient dataset you can find. (As Lloyd points out, the exception is working with a provider who gives you access under HIPAA or other PHI protection rules.)


Last updated: Apr 12 2022 at 19:14 UTC