Stream: IG creation
Topic: Data Dictionary
Noemi Deppenwiese (Sep 22 2020 at 09:16):
Have there been any ideas / any work on adding a kind of "data dictionary" to the publisher output? E.g. a flat list on a single page of all elements that are marked as "MS" in all profiles in the IG and e.g. links to the ValueSets for elements with binding. In one of my projects there is a requirement for such a list and I wanted to check before developing my own solution.
Grahame Grieve (Sep 22 2020 at 11:02):
it sounds like generically a very long list in many cases
Noemi Deppenwiese (Sep 22 2020 at 11:28):
Yes, but a list that could be printed as PDF, which is unfortunately still a requirement for some reports/proposals.
(BTW, is there some "total IG as pdf" function in the publisher?)
Grahame Grieve (Sep 22 2020 at 11:31):
no. You can design a template that produces a single page output, which can then be printed to a PDF. but... yuck.
Grahame Grieve (Sep 22 2020 at 11:32):
this doesn't mean that we can't produce a summary of something, but I would only do that if there was a clear definition of what it contained, and there was interest from more than a single party
Noemi Deppenwiese (Sep 22 2020 at 11:37):
Thanks for the template idea!
I'll try to develop a more detailed specification for an "IG data dictionary" in my project. If someone else could also use such a list and has some requirements, please let me know.
Jose Costa Teixeira (Sep 22 2020 at 11:37):
To me, a data dictionary is the list of elements that we have in a logical model
Noemi Deppenwiese (Sep 22 2020 at 11:39):
Maybe data dictionary is the wrong term. Data overview?
Jose Costa Teixeira (Sep 22 2020 at 11:40):
depends. what do you want to capture in a data dictionary? Concepts (terms) and their definitions?
Noemi Deppenwiese (Sep 22 2020 at 11:46):
More like "what (data) do you actually use in your project". This may be somewhat different from the initial information model developed some time ago... I hope an overview of FHIR resources and attributes will work, but maybe going back to a more theoretical view is what is needed.
Jose Costa Teixeira (Sep 22 2020 at 11:57):
yes, that is what i'd call a data dictionary (as in a list of data elements used, and their characteristics) which hopefully also correspond to the elements in a logical model.
Noemi Deppenwiese (Sep 22 2020 at 12:02):
I'll check if this is what they actually want. Thanks for the input!
Jose Costa Teixeira (Nov 11 2021 at 07:15):
A Data Dictionary would be:
a List / Composition / ...
linking to
ElementDefinitions / StructureDefinitions / CodeSystems
?
Jose Costa Teixeira (Nov 11 2021 at 07:19):
are there thoughts? I'm thinking Composition of either/any of the resources
Diana_Ovelgoenne (Nov 11 2021 at 08:15):
I would see it as a List or a composition, so you can potentially have the hierarchy of the elements, of elementDefinitions linked to one or more CodeSystems
Jose Costa Teixeira (Nov 11 2021 at 10:39):
Hierarchy of elements - what do you mean?
Diana_Ovelgoenne (Nov 11 2021 at 10:57):
hmm... if I think of an example, let's say you have a Creatinine Value as a definition (represented as an Observation if you would go for StructureDefinition instead); this would still have children depending on whether the value is calculated from a urine or a blood / plasma / serum collection. That is where I would see the need to build with hierarchies.
Jose Costa Teixeira (Nov 11 2021 at 11:27):
Thanks. I was wondering if this would be something about the structure hierarchy (e.g. Lab result contains patient, patient contains patient ID) - for those, a StructureDefinition would be OK
Jose Costa Teixeira (Nov 11 2021 at 12:15):
If the data dictionary is "managed" then I think a Composition makes more sense. And Composition does have sections which may support hierarchy
Jose Costa Teixeira (Nov 11 2021 at 12:26):
Oops. I think we cannot make a Composition of ElementDefinitions (because it is a data type, not a resource).
@Grahame Grieve any ideas?
Jose Costa Teixeira (Nov 11 2021 at 12:29):
If we want to make a data dictionary, I think a Composition of ElementDefinitions would be OK, but since that is not possible, should we have a Composition of StructureDefs, each with one Element? I think that would require us to redefine elements instead of reusing them.
Lloyd McKenzie (Nov 14 2021 at 20:01):
I would expect a data dictionary to be a StructureDefinition. Hierarchy is reflected by the element hierarchy. You can list the corresponding codes if necessary. If you need distinct metadata (e.g. author, etc.) for each you need a distinct StructureDefinition.
Jose Costa Teixeira (Nov 14 2021 at 20:02):
A data dictionary is (usually) a flat list.
Jose Costa Teixeira (Nov 14 2021 at 20:05):
If we put all elementdefinitions in a single structure definition,
then when we use the same definition in a logical model, we are going to redefine it...
Jose Costa Teixeira (Nov 14 2021 at 20:07):
E.g. if we have Patient model that contains "firstName" then we are defining it in the LM. If then we want to put it in a dictionary, we need a second StructureDef that redefines the Element
Jose Costa Teixeira (Nov 14 2021 at 20:11):
If this is the only way, I guess it is possible to do something with it, but it seems a technical workaround. Valid, but still a workaround. Is it the only way?
Lloyd McKenzie (Nov 14 2021 at 20:22):
Non-flatness shows up in complex structures like 'address'. I don't understand what you mean by "re-defines" the element. We've defined an ability to propagate element definitions to Questionnaires. We haven't done that for StructureDefinition elements though. We could, I suppose.
Jose Costa Teixeira (Nov 15 2021 at 10:14):
Lloyd McKenzie said:
We've defined an ability to propagate element definitions to Questionnaires. We haven't done that for StructureDefinition elements though.
Where's that?
Jose Costa Teixeira (Nov 15 2021 at 10:14):
Redefining is: If we write things like "gender: The gender administratively assigned to the patient" in the dictionary, and then in the StructureDefinition for a given logical model for a patient, this definition should point to the previous one, not allowing it to be redefined as "the social gender identity of the patient"
Brian Postlethwaite (Nov 15 2021 at 11:45):
Gotta be careful there, gender admin assigned to a patient, is therefore associated with patient, and not person etc. Hence that element is a part of a patient logical model, and not on its own. The separate concept admin assigned gender itself that could be in a SD by itself would not be isolated.
As for a sample of the definition reference in questionnaire, here's a sample
https://sqlonfhir-r4.azurewebsites.net/fhir/Questionnaire/prac-demo/_history/9?_format=html
Jose Costa Teixeira (Nov 15 2021 at 13:54):
Brian Postlethwaite said:
Gotta be careful there, gender admin assigned to a patient, is therefore associated with patient, and not person etc. Hence that element is a part of a patient logical model, and not on its own. The separate concept admin assigned gender itself that could be in a SD by itself would not be isolated.
Right, this was just an example. We'll have other elements that may have context-sensitive meaning and others that are not context-sensitive (or not expressed in a specific context)
Jose Costa Teixeira (Nov 15 2021 at 13:55):
I don't find anything strange with having in a dictionary 4 entries: "gender" , "patient gender", "practitioner gender" and "contact gender"
Jose Costa Teixeira (Nov 15 2021 at 13:58):
Brian Postlethwaite said:
As for a sample of the definition reference in questionnaire, here's a sample
https://sqlonfhir-r4.azurewebsites.net/fhir/Questionnaire/prac-demo/_history/9?_format=html
Oh, what Lloyd was mentioning is the .definition. I thought it was something different (I was thinking of this only for the extraction)
Jose Costa Teixeira (Nov 15 2021 at 13:59):
I guess it's not very inappropriate to use an element called .definition to convey, well, the definition :innocent:
Jose Costa Teixeira (Nov 21 2021 at 18:07):
If we use this solution (pointing to an external definition in any StructureDefinition elements), what definition prevails? The inherited one or the local one?
Lloyd McKenzie (Nov 22 2021 at 15:36):
If you specify something local, local overrides.
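The "local overrides" rule can be illustrated with a minimal sketch. It assumes a hypothetical `dictionaryRef` key on an element that points at a shared dictionary entry using the `[canonical URL]#[element id]` form that Questionnaire `item.definition` uses; as noted earlier in the thread, no such pointer is defined for StructureDefinition elements today. The function name and the example URL are also illustrative only.

```python
from typing import Dict, Optional


def effective_definition(element: dict, dictionary: Dict[str, str]) -> Optional[str]:
    """Return the definition text that applies to an element.

    A locally specified definition wins; otherwise fall back to the
    entry the element points at in the shared dictionary.
    """
    local = element.get("definition")
    if local:
        return local
    # "dictionaryRef" is a hypothetical key, not a FHIR element.
    ref = element.get("dictionaryRef")
    return dictionary.get(ref) if ref else None


# Hypothetical shared dictionary keyed by "<canonical>#<element id>".
shared_dictionary = {
    "http://example.org/StructureDefinition/dictionary#gender":
        "The gender administratively assigned to the patient",
}

inherits = {
    "path": "PatientModel.gender",
    "dictionaryRef": "http://example.org/StructureDefinition/dictionary#gender",
}
overrides = dict(inherits, definition="The social gender identity of the patient")
```

Here `effective_definition(inherits, shared_dictionary)` yields the shared text, while `effective_definition(overrides, shared_dictionary)` yields the local one, which is exactly the redefinition risk raised above.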
Mark Kramer (Nov 28 2021 at 23:54):
mCODE has a data dictionary. It took some doing and it is not simple code. Take a look at http://build.fhir.org/ig/HL7/fhir-mCODE-ig/dictionary.html. In spite of the investment of time, I am not fully convinced that it is a useful format. It is true that many clinicians and domain expert types have zero comprehension of IGs and think they want a data dictionary. FHIR doesn’t flatten nicely into a list of elements, and pity those who don’t grok structures.
Max Masnick (Nov 29 2021 at 01:42):
We are working on open-sourcing the code for the Excel data dictionary generation used in mCODE.
In addition to making a data dictionary, this code will also produce a "diff" comparing two arbitrary versions of an IG using a similar Excel/tabular format. I personally find this quite useful as a compact summary of key changes to profiles and value sets between versions.
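To make the "diff" idea concrete, here is a minimal sketch of comparing two flat dictionaries. It is not the mCODE tool's implementation (that code is TypeScript and is not shown in this thread); the row shape (`profile`, `path`, `mustSupport`, `valueSet`), the function name, and the example URLs are assumptions made for illustration.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def diff_dictionaries(old_rows: List[Row], new_rows: List[Row]) -> Dict[str, list]:
    """Compare two flat dictionaries keyed by (profile, path)."""
    old = {(r["profile"], r["path"]): r for r in old_rows}
    new = {(r["profile"], r["path"]): r for r in new_rows}

    added = sorted(k for k in new if k not in old)
    removed = sorted(k for k in old if k not in new)

    changed = []
    for key in sorted(k for k in new if k in old):
        # Collect the names of the fields whose values differ.
        fields = sorted(
            f for f in set(old[key]) | set(new[key])
            if old[key].get(f) != new[key].get(f)
        )
        if fields:
            changed.append((key, fields))
    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical rows for two versions of the same IG.
old_version = [
    {"profile": "ExamplePatient", "path": "Patient.gender", "mustSupport": True,
     "valueSet": "http://example.org/ValueSet/gender-v1"},
    {"profile": "ExamplePatient", "path": "Patient.birthDate", "mustSupport": True,
     "valueSet": None},
]
new_version = [
    {"profile": "ExamplePatient", "path": "Patient.gender", "mustSupport": True,
     "valueSet": "http://example.org/ValueSet/gender-v2"},
    {"profile": "ExamplePatient", "path": "Patient.address", "mustSupport": True,
     "valueSet": None},
]

summary = diff_dictionaries(old_version, new_version)
```

With this input, `summary` reports `Patient.address` as added, `Patient.birthDate` as removed, and `Patient.gender` as changed in its `valueSet` field.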
Grahame Grieve (Nov 29 2021 at 02:00):
what language is it in?
Jose Costa Teixeira (Nov 29 2021 at 16:30):
I'm presuming that Excel should not be the preferred source format for a data dictionary
Jose Costa Teixeira (Nov 29 2021 at 16:32):
(agreeing with @Mark Kramer here)
Max Masnick (Nov 29 2021 at 18:37):
@Grahame Grieve it's written in TypeScript (it uses SUSHI as a dependency for reading FHIR resources).
In addition to the Excel format, it also outputs a .json version of the data dictionary that has essentially the same content as the Excel file.
Max Masnick (Nov 29 2021 at 18:38):
The SUSHI dependency does not mean it is limited to FSH IGs -- the JSON snapshots from the IG Publisher are the main input
Grahame Grieve (Nov 29 2021 at 18:44):
I could run it as part of a standard IG build then
Jose Costa Teixeira (Nov 29 2021 at 23:58):
What does that "data dictionary" consist of?
Jose Costa Teixeira (Nov 30 2021 at 00:00):
I am not sure from the discussion if it is a list of all elements that are used in the profiles, or those defined in logical models, or both
Max Masnick (Nov 30 2021 at 01:44):
@Jose Costa Teixeira it consists of either all the elements in a profile snapshot, or just the MustSupport elements (depending on settings). It also includes value sets and extensions.
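A minimal sketch of that kind of flattening, assuming the input is a profile's StructureDefinition JSON with a snapshot (as produced by the IG Publisher) already loaded into a Python dict. This is an illustration, not the mCODE code; the function name, the row layout, and the example profile are assumptions.

```python
from typing import Any, Dict, List


def flatten_profile(structure_definition: Dict[str, Any],
                    must_support_only: bool = False) -> List[Dict[str, Any]]:
    """Return one flat row per snapshot element of a profile."""
    rows = []
    elements = structure_definition.get("snapshot", {}).get("element", [])
    for element in elements:
        if must_support_only and not element.get("mustSupport", False):
            continue
        binding = element.get("binding", {})
        rows.append({
            "profile": structure_definition.get("name"),
            "path": element.get("path"),
            "short": element.get("short", ""),
            "cardinality": f'{element.get("min", 0)}..{element.get("max", "*")}',
            "mustSupport": element.get("mustSupport", False),
            # Canonical URL of the bound ValueSet, if the element has a binding.
            "valueSet": binding.get("valueSet"),
        })
    return rows


# Hypothetical, heavily trimmed profile used only to exercise the function.
example_profile = {
    "resourceType": "StructureDefinition",
    "name": "ExamplePatient",
    "snapshot": {"element": [
        {"path": "Patient", "min": 0, "max": "*"},
        {"path": "Patient.gender", "min": 0, "max": "1", "mustSupport": True,
         "binding": {"strength": "required",
                     "valueSet": "http://hl7.org/fhir/ValueSet/administrative-gender"}},
        {"path": "Patient.birthDate", "min": 0, "max": "1"},
    ]},
}

ms_rows = flatten_profile(example_profile, must_support_only=True)
all_rows = flatten_profile(example_profile)
```

Rows like these can then be written to CSV, Excel, or JSON. Nested structures such as `Patient.address` simply appear as dotted paths, which is where the flat view starts to lose the hierarchy discussed above.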
Jose Costa Teixeira (Nov 30 2021 at 03:43):
I would not call that a Data dictionary
Jose Costa Teixeira (Nov 30 2021 at 09:32):
Can we call that Data Catalog or something?
Jose Costa Teixeira (Nov 30 2021 at 09:32):
I just checked and I think I can confirm that from a Data Management perspective, the Data Dictionary is at the Logical Model level, not at the physical level. So the list of all data elements in profiles should not be confused with a data dictionary.
Max Masnick (Dec 14 2021 at 02:04):
I'm not opposed to changing the name. Is FHIR-I the appropriate place to figure out how to best name/describe this?
Grahame Grieve (Dec 14 2021 at 04:04):
yes
Last updated: Apr 12 2022 at 19:14 UTC