Stream: implementers
Topic: AllergyIntolerance code
Chris Gibson (Sep 21 2021 at 22:23):
Hi group, a question on the AllergyIntolerance resource that I can't seem to figure out: for medication allergies, our EHR can capture both the medication that caused the allergy and the allergy group/class of substance it belongs to.
Since the FHIR spec's cardinality permits only a single value in the "code" element, which of the two values should go there: the specific medication, or the drug class if/when known?
Lloyd McKenzie (Sep 21 2021 at 23:10):
You can specify multiple codings within AllergyIntolerance.code to communicate the substance at varying levels of granularity.
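A minimal sketch of what that could look like, written as a Python dict mirroring the FHIR JSON. The two code values and display strings are hypothetical placeholders, not real RxNorm or SNOMED CT codes; only the system URIs are the standard ones. The helper `codings_by_system` is an illustrative name, not part of any FHIR library.

```python
# One AllergyIntolerance.code (a single CodeableConcept) carrying two codings:
# the specific medication and its drug class.
# NOTE: the "code" and "display" values are hypothetical placeholders.
allergy = {
    "resourceType": "AllergyIntolerance",
    "patient": {"reference": "Patient/example"},
    "code": {
        "coding": [
            {
                "system": "http://www.nlm.nih.gov/research/umls/rxnorm",
                "code": "MED-CODE-PLACEHOLDER",
                "display": "Specific medication (placeholder)",
            },
            {
                "system": "http://snomed.info/sct",
                "code": "CLASS-CODE-PLACEHOLDER",
                "display": "Drug class (placeholder)",
            },
        ],
        "text": "Specific medication / drug class as entered in the EHR",
    },
}


def codings_by_system(resource: dict) -> dict:
    """Map each coding's system URI to its code for quick lookup."""
    return {c["system"]: c["code"] for c in resource["code"]["coding"]}
```

The cardinality of `code` stays 0..1; the granularity lives in the repeating `coding` element inside it.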
Morten Ernebjerg (Sep 22 2021 at 05:26):
Also, if specific medications are being captured as documentation of particular instances of allergic reactions, they may fit into AllergyIntolerance.reaction
("Details about each adverse reaction event linked to exposure to the identified substance.").
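A rough sketch of that option, assuming the R4 shape of `reaction` (where `substance` is an optional CodeableConcept and `manifestation` is a required list of CodeableConcepts). The codes are hypothetical placeholders and `add_reaction` is an illustrative helper, not a library function.

```python
# Record the specific medication on an individual reaction event, leaving the
# broader class on AllergyIntolerance.code. Assumes the FHIR R4 structure.
# NOTE: all code values below are hypothetical placeholders.
def add_reaction(resource: dict, substance_coding: dict, manifestation_text: str) -> dict:
    """Append one reaction event to an AllergyIntolerance dict and return it."""
    reaction = {
        "substance": {"coding": [substance_coding]},
        "manifestation": [{"text": manifestation_text}],
    }
    resource.setdefault("reaction", []).append(reaction)
    return resource


class_level_allergy = {
    "resourceType": "AllergyIntolerance",
    "patient": {"reference": "Patient/example"},
    "code": {
        "coding": [
            {"system": "http://snomed.info/sct", "code": "CLASS-CODE-PLACEHOLDER"}
        ]
    },
}

add_reaction(
    class_level_allergy,
    {
        "system": "http://www.nlm.nih.gov/research/umls/rxnorm",
        "code": "MED-CODE-PLACEHOLDER",
    },
    "Hives",
)
```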
Anoop Bhat (Sep 23 2021 at 17:15):
Hello there, I am pretty new to ML. I got this FHIR dataset of AllergyIntolerance resources from a public test server. The data does not look like it has those features and labels type. How do I do machine learning on it?
Lloyd McKenzie (Sep 23 2021 at 17:33):
First, NEVER train an ML algorithm on a public test server. Public test servers will, by definition, be full of random, fake and quite often totally unrealistic and non-representative data. If you want to train an ML algorithm, you'll need to get access to real data from real servers (which will require establishing agreements with holders of such data).
Not totally clear what you mean by "those features and labels type" - what are you looking for?
John Moehrke (Sep 23 2021 at 17:35):
unless you are looking to create ML rules that would sort garbage in to garbage out...
Anoop Bhat (Sep 23 2021 at 17:42):
I am just using it to learn ML.
The data was JSON. I had to clean it and put it into tabular columns.
Now it looks something like this -- image.png
Anoop Bhat (Sep 23 2021 at 17:43):
I have done ML on healthcare datasets from Kaggle,
but this was something new to me.
Anoop Bhat (Sep 23 2021 at 18:12):
Can I get some advice on performing ML on this data? Yes, it's from a public test server, but that's fine for me. It's only for local use.
Lloyd McKenzie (Sep 23 2021 at 23:35):
If it's just to mess around with and will never have a hint of going near real patients, ok. What are your issues with the data you have? If it's the fact that some columns are missing values, that's normal - and your ML algorithm will have to learn to deal with it. It's extremely rare for all data to be available all the time. And in the base FHIR specification, very few elements are mandatory.
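One way to see why missing values are normal: a rough sketch of flattening an AllergyIntolerance into a table row, where every optional element that is absent simply becomes None. This assumes the R4 shape (e.g. `type` and `criticality` as plain codes); the column names and the helpers `first_coding` and `flatten_allergy` are illustrative choices, not a standard mapping.

```python
from typing import Any, Dict, Optional


def first_coding(concept: Optional[dict]) -> dict:
    """Return the first coding of a CodeableConcept, or {} if there is none."""
    codings = (concept or {}).get("coding") or []
    return codings[0] if codings else {}


def flatten_allergy(resource: dict) -> Dict[str, Any]:
    """Flatten one AllergyIntolerance dict into a row; absent elements become None."""
    code = resource.get("code")
    coding = first_coding(code)
    return {
        "id": resource.get("id"),
        "patient": (resource.get("patient") or {}).get("reference"),
        "code_system": coding.get("system"),
        "code": coding.get("code"),
        "code_text": (code or {}).get("text"),
        "type": resource.get("type"),
        "criticality": resource.get("criticality"),
        "reaction_count": len(resource.get("reaction") or []),
    }


# A sparse resource: `code`, `type` and `criticality` are all absent,
# so the corresponding columns come out as None.
sparse = {"resourceType": "AllergyIntolerance", "patient": {"reference": "Patient/1"}}
row = flatten_allergy(sparse)
```

Taking only the first coding is a simplification; a resource with several codings would need one column per system or a separate child table.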
Anoop Bhat (Sep 24 2021 at 03:12):
I can't work out which columns I should train on.
Anoop Bhat (Sep 24 2021 at 13:25):
I can't figure out whether it's a predictive or a classification problem.
Lloyd McKenzie (Sep 24 2021 at 14:54):
Well, given that it's largely junk data, I'm not sure it much matters? Figuring out which elements are relevant to your process depends on what you're trying to predict and isn't an answer we can give you.
Anoop Bhat (Sep 24 2021 at 19:00):
How can I get access to real FHIR data?
Can I get it to practice my ML skills?
Daniel Venton (Sep 24 2021 at 19:28):
There is a product called Synthea that generates realistic but fake medical data (in FHIR format). But if you are a private individual, no steward of real patient data is likely to share actual live data with you (it would be illegal). If you are part of a research group, you might be able to negotiate a de-identified view of real data. For learning ML, I would think Synthea would be enough.
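A minimal sketch of pulling AllergyIntolerance resources out of such data, assuming the generator's FHIR export has been saved as Bundle JSON files in a local directory. The directory argument and the function names `resources_of_type` and `load_allergies` are hypothetical; only the Python standard library is used.

```python
import json
from pathlib import Path
from typing import List


def resources_of_type(bundle: dict, resource_type: str) -> List[dict]:
    """Return every resource in a FHIR Bundle dict with the given resourceType."""
    return [
        entry["resource"]
        for entry in bundle.get("entry", [])
        if entry.get("resource", {}).get("resourceType") == resource_type
    ]


def load_allergies(directory: str) -> List[dict]:
    """Collect AllergyIntolerance resources from all *.json Bundles in a directory."""
    allergies: List[dict] = []
    for path in sorted(Path(directory).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            bundle = json.load(handle)
        allergies.extend(resources_of_type(bundle, "AllergyIntolerance"))
    return allergies
```

Usage would be something like `load_allergies("path/to/exported/bundles")`, with the path pointing at wherever the exported files were written.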
Lloyd McKenzie (Sep 24 2021 at 19:36):
Keep in mind that, because Synthea is synthetic data, your ML algorithm will likely just end up predicting the algorithms Synthea uses to generate its fake data. If you're just interested in building ML skills, that's fine. But don't presume that any algorithms you produce by training against synthetic data will ever be valid against real data.
Jason Walonoski (Sep 29 2021 at 12:28):
As the developer of Synthea, I agree with Daniel and Lloyd. Synthea can be used for ML/algorithm development (especially for educational purposes), but mostly for pipeline development, integration, etc. Once you want to use something in production, you had better be using real data. As George Box famously said, "All models are wrong, some are useful." Synthea models are therefore wrong. They can still be useful, but you do need to understand where the edges are.
John Silva (Sep 29 2021 at 15:42):
There is also the possibility that you can "train biases" into your ML/AI model if you only use a dataset like Synthea. Of course, the same bias can arise with 'real data' too if your dataset does not represent a varied population.
Last updated: Apr 12 2022 at 19:14 UTC