FHIR Chat · Hl7 FHIR JSON file convert to CSV file · implementers

Stream: implementers

Topic: Hl7 FHIR JSON file convert to CSV file


Er.Madhav Adhikari (Mar 31 2022 at 08:24):

Hello all, I am new to FHIR files. I am looking for help converting them into a normal CSV file.

René Spronk (Mar 31 2022 at 09:21):

Welcome to this forum. Could you tell us what you've already tried? And please read https://confluence.hl7.org/display/FHIR/FHIR+Rules+for+Asking+Questions - thanks.

Lloyd McKenzie (Mar 31 2022 at 15:22):

First, there is no 'normal' CSV file - all CSV files are a custom set of columns. Second, CSV is a flat list of values for each record. FHIR is a nested hierarchy, so you'll have to accommodate that. For example, a patient can have an unlimited number of addresses, each of which can have various parts. Same deal with names (and each name can have multiple given names). You'll have to figure out how you want to 'flatten' that structure to correspond to your desired set of CSV columns.
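One way to picture the "flatten" decision described above is a minimal Python sketch that squeezes one Patient into a single CSV row. The sample patient, the column names, and the limits `MAX_NAMES` and `MAX_ADDRESSES` are assumptions made for illustration; only the element names (`name`, `given`, `family`, `address`, `line`, `city`, `postalCode`) come from the FHIR Patient resource.

```python
import csv
import io

# Hypothetical sample Patient; element names follow the FHIR Patient resource.
patient = {
    "resourceType": "Patient",
    "id": "p1",
    "birthDate": "1980-02-01",
    "name": [
        {"use": "official", "family": "Smith", "given": ["Anna", "Maria"]},
        {"use": "nickname", "given": ["Annie"]},
    ],
    "address": [
        {"line": ["1 Main St", "Apt 2"], "city": "Springfield", "postalCode": "12345"},
    ],
}

MAX_NAMES = 2      # assumption: keep at most two names per patient
MAX_ADDRESSES = 1  # assumption: keep only the first address


def flatten_patient(resource, max_names=MAX_NAMES, max_addresses=MAX_ADDRESSES):
    """Flatten one Patient into a single dict of column name -> value."""
    row = {"id": resource.get("id", ""), "birthDate": resource.get("birthDate", "")}
    names = resource.get("name", [])
    for i in range(max_names):
        name = names[i] if i < len(names) else {}
        row[f"name{i + 1}_family"] = name.get("family", "")
        # Multiple given names are joined with a space inside one cell.
        row[f"name{i + 1}_given"] = " ".join(name.get("given", []))
    addresses = resource.get("address", [])
    for i in range(max_addresses):
        addr = addresses[i] if i < len(addresses) else {}
        row[f"address{i + 1}_line"] = ", ".join(addr.get("line", []))
        row[f"address{i + 1}_city"] = addr.get("city", "")
        row[f"address{i + 1}_postalCode"] = addr.get("postalCode", "")
    return row


row = flatten_patient(patient)
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(row))
writer.writeheader()
writer.writerow(row)
csv_text = buffer.getvalue()
print(csv_text)
```

Anything beyond the chosen limits (a third name, a second address) is silently dropped in this sketch, which is exactly the kind of trade-off that has to be decided per column set.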

Er.Madhav Adhikari (Mar 31 2022 at 15:22):

Hello René, yes, I have read the instructions at the link you provided. I heard that HAPI FHIR will work, and I am looking into it. Could you please share any links or resources you have on converting FHIR JSON files to CSV? I am new to FHIR, so I am also looking for training or documentation to learn about this file format.

Er.Madhav Adhikari (Mar 31 2022 at 15:31):

@Lloyd McKenzie sounds good. Do you know whether the JSON for one patient or member arrives as multiple resource files, or as one combined JSON file covering all modules for that member?

Nabin Pandey (Mar 31 2022 at 15:39):

Hello everyone, I am also looking for the same thing: how to turn FHIR content into a flattened structure. Can anyone point me to links or resources to read up on this?

Lloyd McKenzie (Mar 31 2022 at 15:44):

First, you'll need to figure out how you want to retrieve the data - are you performing a simple search, a search with _include or _revinclude, a Batch with a bunch of searches, a Bulk Data retrieval? (You'll need to go read about all of those things and understand what the options are.) Then you'll have to look at the data structures and map them to your own CSV data structure and figure out how to manage the possibility that there might be more repetitions of things than your data structure knows how to handle (by filtering, by choosing one, by modifying your data structure). There's no one-size-fits-all solution.
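Whichever retrieval option is chosen, the data tends to arrive either as a Bundle (search results, including `_include`/`_revinclude` matches, sit under `entry[].resource`) or, for Bulk Data, as NDJSON with one resource per line. A minimal sketch of reading both shapes and grouping resources by `resourceType` before any mapping is attempted follows; the sample payloads and helper names are made up for illustration.

```python
import json
from collections import defaultdict

# Hypothetical searchset Bundle, e.g. a Patient search with _revinclude.
bundle_json = """
{
  "resourceType": "Bundle",
  "type": "searchset",
  "entry": [
    {"resource": {"resourceType": "Patient", "id": "p1"}},
    {"resource": {"resourceType": "Coverage", "id": "c1",
                  "beneficiary": {"reference": "Patient/p1"}}},
    {"resource": {"resourceType": "ExplanationOfBenefit", "id": "e1",
                  "patient": {"reference": "Patient/p1"}}}
  ]
}
"""

# Hypothetical Bulk Data style input: NDJSON, one resource per line.
ndjson_text = (
    '{"resourceType": "Patient", "id": "p2"}\n'
    '{"resourceType": "Patient", "id": "p3"}\n'
)


def resources_from_bundle(text):
    """Return the resources carried in a Bundle's entry list."""
    bundle = json.loads(text)
    return [e["resource"] for e in bundle.get("entry", []) if "resource" in e]


def resources_from_ndjson(text):
    """Return one resource per non-empty NDJSON line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def group_by_type(resources):
    """Group resources into lists keyed by resourceType."""
    grouped = defaultdict(list)
    for resource in resources:
        grouped[resource["resourceType"]].append(resource)
    return dict(grouped)


all_resources = resources_from_bundle(bundle_json) + resources_from_ndjson(ndjson_text)
grouped = group_by_type(all_resources)
for resource_type, items in grouped.items():
    print(resource_type, [r["id"] for r in items])
```

Each per-type list can then be mapped to its own CSV layout, which is where the decisions about repetitions come in.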

Nabin Pandey (Mar 31 2022 at 15:48):

Thanks @Lloyd McKenzie for detailing on this.

Elliot Silver (Mar 31 2022 at 16:48):

@Er.Madhav Adhikari, as @Lloyd McKenzie indicates, a major challenge in converting a FHIR resource to a CSV file (or any flat representation, or relational database) is that elements can repeat and are hierarchical. If you're converting a patient resource to CSV, do you want one row per patient, with additional columns for each name (and how many name columns do you allocate)? Or do you repeat the patient info on multiple lines, one for each name, and somehow link all those lines together? Or do you create some sort of structure that lets you have the non-repeating info in one CSV file, and the repeating info in another CSV file with "links" back to the main file (essentially re-inventing a relational database in CSV)? And that's just for one attribute in one resource; now repeat that evaluation for other attributes and in other resources.
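As a rough illustration of the third option (non-repeating info in one file, repeating info in another with a link back), here is a minimal sketch. The sample patients and the column names `patient_id` and `name_index` are assumptions, not anything defined by FHIR.

```python
import csv
import io

# Hypothetical sample data; element names follow the FHIR Patient resource.
patients = [
    {"resourceType": "Patient", "id": "p1", "gender": "female",
     "name": [{"use": "official", "family": "Smith", "given": ["Anna", "Maria"]},
              {"use": "maiden", "family": "Jones", "given": ["Anna"]}]},
    {"resourceType": "Patient", "id": "p2", "gender": "male",
     "name": [{"family": "Lee", "given": ["Sam"]}]},
]

patient_rows = []  # one row per patient (non-repeating elements)
name_rows = []     # one row per name, linked back by patient_id
for p in patients:
    patient_rows.append({"patient_id": p["id"], "gender": p.get("gender", "")})
    for index, name in enumerate(p.get("name", [])):
        name_rows.append({
            "patient_id": p["id"],
            "name_index": index,
            "use": name.get("use", ""),
            "family": name.get("family", ""),
            "given": " ".join(name.get("given", [])),
        })


def to_csv(rows):
    """Serialize a non-empty list of dicts that share the same keys."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


patient_csv = to_csv(patient_rows)
name_csv = to_csv(name_rows)
print(patient_csv)
print(name_csv)
```

Nothing is lost this way, but the consumer has to join the two files on `patient_id`, and the same split is needed for every other repeating element.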

Beyond that, resources are heavily cross-linked. How do you want to represent that? Does the full information about a patient's preferred doctor show up in the patient CSV, or is there somehow a "link" to the practitioner CSV? What if multiple patients have the same preferred doctor?
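A minimal sketch of the "link" approach, assuming the preferred doctor is carried in `Patient.generalPractitioner` and that references are simple relative ones such as `Practitioner/dr1`. Absolute URLs, contained resources, and logical references are not handled; the sample data and column names are made up.

```python
# Hypothetical sample data; element names follow the FHIR Patient and
# Practitioner resources.
practitioners = [
    {"resourceType": "Practitioner", "id": "dr1",
     "name": [{"family": "Grey", "given": ["Meredith"]}]},
]
patients = [
    {"resourceType": "Patient", "id": "p1",
     "generalPractitioner": [{"reference": "Practitioner/dr1"}]},
    {"resourceType": "Patient", "id": "p2",
     "generalPractitioner": [{"reference": "Practitioner/dr1"}]},
    {"resourceType": "Patient", "id": "p3"},
]


def split_reference(reference):
    """Split a relative reference 'Type/id' into (type, id)."""
    resource_type, _, resource_id = reference.partition("/")
    return resource_type, resource_id


# The practitioner appears once here, no matter how many patients point at it.
practitioner_rows = [
    {"practitioner_id": pr["id"],
     "family": pr["name"][0].get("family", "") if pr.get("name") else ""}
    for pr in practitioners
]

# One link row per (patient, referenced resource) pair.
link_rows = []
for patient in patients:
    for ref in patient.get("generalPractitioner", []):
        target_type, target_id = split_reference(ref.get("reference", ""))
        link_rows.append({
            "patient_id": patient["id"],
            "target_type": target_type,
            "target_id": target_id,
        })

print(practitioner_rows)
print(link_rows)
```

The alternative, copying the practitioner's details into every patient row, avoids the join but duplicates data whenever a doctor is shared.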

Finally, FHIR supports extensions on most elements--which can be thought of as a standard way of adding non-standard information. Does the data you're using have/will have/might at some point have any extensions? Do you have control over which extensions might be encountered? Do you care about expressing the extension content in the CSV or are you willing to drop that information? How will your CSV approach support arbitrary additional (potentially hierarchical) information at any place in the model?
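One possible answer, sketched under heavy assumptions, is a separate "long" table with one row per extension value. The sketch only walks extensions at the root of a resource (not those on nested elements or primitives), the `example.org` URLs are invented, and `extension_rows` is a hypothetical helper.

```python
import json

# Hypothetical Patient with one simple and one nested (complex) extension.
patient = {
    "resourceType": "Patient",
    "id": "p1",
    "extension": [
        {"url": "http://example.org/fhir/StructureDefinition/favorite-color",
         "valueString": "blue"},
        {"url": "http://example.org/fhir/StructureDefinition/trial",
         "extension": [
             {"url": "arm", "valueCode": "A"},
             {"url": "enrolled", "valueBoolean": True},
         ]},
    ],
}


def extension_rows(resource_id, extensions, parent_path=""):
    """Return one row per extension value, recursing into nested extensions."""
    rows = []
    for ext in extensions:
        url = ext.get("url", "")
        path = f"{parent_path}|{url}" if parent_path else url
        for key, value in ext.items():
            # Values live under a single value[x] key, e.g. valueString.
            if key.startswith("value"):
                rows.append({
                    "resource_id": resource_id,
                    "extension_path": path,
                    "value_type": key[len("value"):],
                    # Non-string values are kept as JSON text in the cell.
                    "value": value if isinstance(value, str) else json.dumps(value),
                })
        rows.extend(extension_rows(resource_id, ext.get("extension", []), path))
    return rows


rows = extension_rows(patient["id"], patient.get("extension", []))
for r in rows:
    print(r)
```

This keeps arbitrary extension content without adding columns to the main CSV, at the price of pushing the interpretation of each `extension_path` onto whoever reads the file.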

As you can see, there is no simple one-size-fits-all FHIR (JSON or XML) to CSV transform.

Elliot Silver (Mar 31 2022 at 16:49):

Actually, the discussion should have started not with how to do a JSON to CSV transform, but why do a JSON to CSV transform. What are you trying to accomplish?

Er.Madhav Adhikari (Mar 31 2022 at 17:02):

@Elliot Silver yes, I agree with you and @Lloyd McKenzie.
First, my requirements: I have a healthcare application that supports only Parquet or CSV files; it transforms the raw data files into our standard format by running on a distributed system (Hadoop/Spark). Some of my vendors send FHIR data as JSON files, so I have to convert them to CSV.

I want one row per patient for each module (patient info, coverage, EOB, medication, etc.), and later to link or merge all the modules into one flattened file.
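A rough sketch of what "one row per patient per module, then merge" could look like once each module has been flattened. It is written in plain Python rather than Spark to stay self-contained. The per-module rows, the column names, and the choice to collapse repeats into a count plus a `;`-joined list are all assumptions for illustration.

```python
from collections import defaultdict

# Hypothetical flattened module rows, each already carrying a patient_id.
patient_rows = [
    {"patient_id": "p1", "family": "Smith"},
    {"patient_id": "p2", "family": "Lee"},
]
coverage_rows = [{"patient_id": "p1", "id": "c1", "status": "active"}]
eob_rows = [
    {"patient_id": "p1", "id": "e1"},
    {"patient_id": "p1", "id": "e2"},
    {"patient_id": "p2", "id": "e3"},
]


def one_row_per_patient(rows, prefix):
    """Collapse a module to one row per patient: a count plus ';'-joined values."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["patient_id"]].append(row)
    collapsed = {}
    for patient_id, items in grouped.items():
        columns = {f"{prefix}_count": len(items)}
        for key in items[0]:
            if key != "patient_id":
                columns[f"{prefix}_{key}"] = ";".join(
                    str(item.get(key, "")) for item in items
                )
        collapsed[patient_id] = columns
    return collapsed


modules = [
    one_row_per_patient(coverage_rows, "coverage"),
    one_row_per_patient(eob_rows, "eob"),
]

# Left-join every module onto the patient rows by patient_id.
merged = []
for patient in patient_rows:
    row = dict(patient)
    for collapsed in modules:
        row.update(collapsed.get(patient["patient_id"], {}))
    merged.append(row)

for row in merged:
    print(row)
```

Patients with no rows in a module simply lack that module's columns here, so a writer would need a default for missing cells. Joining repeats into one cell is lossy, which is the trade-off discussed earlier in the thread.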

John Silva (Mar 31 2022 at 17:21):

This sounds like the classic object-to-relational mapping problem, which is not easy, as already stated.
You can 'sort of' think of the FHIR 'data model' as an object model; in fact, it is represented as such in UML on each FHIR resource spec page. The CSV is 'sort of' the relational, row/column, flattened model.

Elliot Silver (Mar 31 2022 at 18:05):

If parquet is an option, I suggest browsing the #bulk data stream. I know there has been discussion about parquet export, but don't know where that landed.

Er.Madhav Adhikari (Apr 01 2022 at 12:16):

@Elliot Silver Thank you for the suggestion; however, I can't find any solution there.

Lloyd McKenzie (Apr 01 2022 at 14:08):

There won't be a solution that doesn't require a bunch of analysis and custom development on your end.


Last updated: Apr 12 2022 at 19:14 UTC