Stream: terminology / utg
Topic: Duplicates in UTG File
Clayton Daley (Jan 21 2020 at 22:48):
I'm cross-posting this from /terminology/ (original at https://chat.fhir.org/#narrow/stream/179202-terminology/topic/OID.20for.200568.20and.200930) since I was told this was the correct place for this kind of information/discussion.
When processing the current UTG file for use in our product (hopefully in lieu of manual entry of various codes), I ran into a bunch of duplicates in different fields. I'm not clear whether the duplicates are simply confusing or semantically incorrect.
The following CodeSystem OIDs are duplicated:
- 2.16.840.1.113883.21.452: v2-0567 and v2-0929
- 2.16.840.1.113883.21.453: v2-0568 and v2-0930
- 2.16.840.1.113883.1.11.10228: v2-0952 and v3-Confidentiality
- 2.16.840.1.113883.1.11.20428: v2-0719, v3-ConfidentialityModifiers, and v3-InformationSensitivityPolicy
- 2.16.840.1.113883.1.11.78: v2-0078 and v3-ObservationInterpretation
- 2.16.840.1.113883.1.11.20560: v2-0959 and v3-WorkClassificationODH
The following Names are duplicated (in both the ValueSet and CodeSystem):
- AllowSubstitution: 0161 and 0279
- ProviderRole: 0286 and 0443
- OrganizationUnitType: 0406 and 0474
- ItemStatus: 0625 and 0776
This CodeSystem name is also duplicated:
- AlternativeCodeKind: Codesystem-composition-altcode-kind and CodeSystem-codesystem-altcode-kind
As an implementer, I would expect uniqueness for any computer-processable field (and most non-processable) within a type (CodeSystem, ValueSet, NamingSystem).
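The kind of uniqueness check described above can be sketched roughly as follows. This is a minimal illustration, not the actual processing code: it assumes the UTG resources have already been loaded as FHIR-style JSON dictionaries, that OIDs appear as `urn:oid:` identifier values, and the helper names `find_duplicates` and `first_oid` and the hand-written sample records are hypothetical.

```python
from collections import defaultdict


def find_duplicates(resources, key):
    """Group resource ids by (resourceType, key(resource)).

    Only groups with more than one id are returned, so uniqueness is
    checked within a type (CodeSystem, ValueSet, NamingSystem).
    """
    groups = defaultdict(list)
    for res in resources:
        value = key(res)
        if value is not None:
            groups[(res["resourceType"], value)].append(res["id"])
    return {k: ids for k, ids in groups.items() if len(ids) > 1}


def first_oid(res):
    """Return the first identifier carried as 'urn:oid:...', or None."""
    for ident in res.get("identifier", []):
        value = ident.get("value", "")
        if value.startswith("urn:oid:"):
            return value[len("urn:oid:"):]
    return None


# Hand-written sample for illustration only, not extracted from the UTG package.
sample = [
    {"resourceType": "ValueSet", "id": "v2-0567",
     "identifier": [{"value": "urn:oid:2.16.840.1.113883.21.452"}]},
    {"resourceType": "ValueSet", "id": "v2-0929",
     "identifier": [{"value": "urn:oid:2.16.840.1.113883.21.452"}]},
    {"resourceType": "ValueSet", "id": "v2-0161", "name": "AllowSubstitution"},
    {"resourceType": "ValueSet", "id": "v2-0279", "name": "AllowSubstitution"},
    {"resourceType": "CodeSystem", "id": "v2-0161", "name": "AllowSubstitution"},
]

duplicate_oids = find_duplicates(sample, first_oid)
duplicate_names = find_duplicates(sample, lambda res: res.get("name"))
```

Because the grouping key includes the resource type, a ValueSet and a CodeSystem that share a name are not reported as duplicates of each other.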
Ted Klein (Jan 26 2020 at 13:19):
These are not really problems...a v2 table has its content informed by one or more value sets. Any given value set may inform more than one v2 table. The problem is that the 'tables' are not in a 1:1 alignment with the modern terminology model, i.e. code systems and value sets. On top of that, over the decades, the V2 Standard was published with more than one table having the same name. What this means is that the human readable names are NOT necessarily unique, and should NOT be used for computer processing to determine uniqueness. In all but a couple of cases, each table has a unique OID. The canonical URIs of the code systems and value sets in UTG are all also unique. The v2 tables in UTG are modeled as a single code system where each concept represents a v2 table, with many extended properties to reference the code systems and value sets, WHICH ARE NOT THE TABLES. They provide the underlying content for the tables, which are an artifact of how v2 has been published for over 30 years as a Word or PDF document.
Clayton Daley (Jan 27 2020 at 22:35):
I'm under the (perhaps incorrect) impression that the UTG is following the semantics of e.g. a FHIR CodeSystem where "name" is "Name for this code system (computer friendly)" and "title" is "Name for this code system (human friendly)". Is this not the case?
Clayton Daley (Jan 28 2020 at 00:01):
===
It may not matter, but I incorrectly identified the duplicate OIDs as CodeSystems when I tried to summarize the other thread. They are actually ValueSets. I was finding them by OID, so I also grabbed extra names in at least one case, i.e. 2.16.840.1.113883.1.11.20428 is only the identifier for v2-0719 and v3-InformationSensitivityPolicy.
I'm still unclear on the presence/interpretation of duplicate OIDs within the ValueSet space, but here's how I react to parts of the explanation:
- "Any given value set may inform more than one v2 table". I see how this is possible, but (to me) that implies an external definition of a table where two tables reference the same ValueSet/OID/URI in the UTG namespace.
- "a v2 table has its content informed by one or more value sets". By this I assume you mean that a Table could combine two or more concepts (as a contrived example, race and ethnicity) that are separate ValueSets in v3. This would make an external definition of tables even more critical.
I realize that this may be a fait accompli, but (speaking as a user) the UTG would be easier to process if each v2 table mapped to a single ValueSet and ValueSets (when appropriate) shared CodeSystems. As an outsider, this is how I naturally interpret the existing UTG infrastructure. Building on the contrived example above:
- Contextualized ValueSets "v2-HumanWeightUnits", "v2-MedicineWeightUnits", and "v3-WeightUnits" (all with distinct ValueSet OIDs) could share the exact same set of codes from a UnitsOfWeight CodeSystem.
- A v2 ValueSet "Race and Ethnicity" could coexist with v3 ValueSets "Race" and "Ethnicity". They could even be composed of the exact same codes drawn from the exact same CodeSystem.
In this world, it's very easy for me to express the mapping from v2 "Race and Ethnicity" to v3 "Race" because the two concepts are part of the same logical space. I have an OID for both ValueSets and can create mappings in my database. If the v2 Table is an external definition that maps to one or more ValueSets, it significantly increases my implementation complexity.
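The contrived Race/Ethnicity example might be modeled along these lines. This is only a sketch of the proposal, not how UTG is organized: the code system URL, the placeholder OIDs, the codes `R1`, `R2`, `E1`, and the helper `map_value_set` are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Coding:
    system: str
    code: str


@dataclass(frozen=True)
class ValueSet:
    oid: str            # placeholder OIDs below, not real registrations
    name: str
    codes: frozenset    # set of Coding drawn from one or more code systems


SYSTEM = "http://example.org/CodeSystem/race-ethnicity"  # hypothetical URL

race_codes = frozenset({Coding(SYSTEM, "R1"), Coding(SYSTEM, "R2")})
ethnicity_codes = frozenset({Coding(SYSTEM, "E1")})

# Each value set has its own OID, but the codes come from the same code system.
v2_race_ethnicity = ValueSet("1.2.3.1", "v2-RaceAndEthnicity",
                             race_codes | ethnicity_codes)
v3_race = ValueSet("1.2.3.2", "v3-Race", race_codes)


def map_value_set(source, target):
    """Map each source code to itself if the target allows it, else None."""
    return {c: (c if c in target.codes else None) for c in source.codes}


mapping = map_value_set(v2_race_ethnicity, v3_race)
```

Here the mapping can be stored against the pair of ValueSet OIDs, and codes with no counterpart in the target (the ethnicity code in this sample) show up explicitly as unmapped.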
Frank Oemig (Feb 02 2020 at 20:28):
IMHO, you should not map ValueSets, but CodeSystems. So it does not matter how many codesystems a value set uses.
Lloyd McKenzie (Feb 02 2020 at 21:29):
Mapping needs to be done at the value set level because what codes are available sets context for what mappings are possible.
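One way to read this point is that the same source code can land on different targets depending on which codes the target value set permits. The sketch below is an assumption-laden illustration: the function `map_code`, the candidate table, and the codes `A`, `A1`, and `B` are hypothetical.

```python
def map_code(source_code, candidates, target_codes):
    """Return the first candidate that the target value set allows, else None.

    `candidates` lists possible targets per source code in order of preference.
    """
    for candidate in candidates.get(source_code, []):
        if candidate in target_codes:
            return candidate
    return None


# Preferred target first (a specific code), then a more general fallback.
candidates = {"A": ["A1", "A"]}

fine_grained_target = {"A1", "B"}   # value set that includes the specific code
coarse_target = {"A", "B"}          # value set that only has the general code

to_fine = map_code("A", candidates, fine_grained_target)
to_coarse = map_code("A", candidates, coarse_target)
unmapped = map_code("A", candidates, {"B"})
```

A mapping defined purely between code systems has no view of which of these target sets applies, which is why the result here depends on the value set passed in.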
Last updated: Apr 12 2022 at 19:14 UTC