Stream: genomics
Topic: variation-code component
Larry Babb (Nov 05 2019 at 18:13):
From the v0.4 IG spec...
- Observation.component:variation-code
SliceName : variation-code
Definition : This term is used to report the unique identifier of the simple variant found in this study.
Control : 0..1
I assume this is a placeholder for any "unique identifier" that the producer wishes to use to identify the variant that the observation is about. If so, I'm not sure why it is constrained to only 1. I also think the wording of the definition should be adjusted to clarify whether this is only for simple variants found, or whether it is for the variant that the observation is being made about, since observations can be used to share information about variants that are "absent" or "present". And, I assume not only simple but complex variants may be registered somewhere that have unique identifiers as well - and these presumably could be used here.
I would consider making this component a little more useful by allowing folks to share any identifiers or codes that they feel are unique and identifying for their purposes. Since there is no universal standard for variant codes or identifiers, this would be more of a cross-referenceable set of codes and identifiers that would help people find and understand more about the actual variant that the observation was referencing.
Maybe it should be renamed to variation-xref-identifiers (or variation-xref-codes if you must) with a 0..* cardinality.
The added benefit here would be that you could combine dbSNP-id component here, which many folks use to cross reference and identify variation.
Anyway. The basic ask is "why the 0..1 constraint" when there may be multiple identifiers that are useful to share (e.g. ClinVar variation id, ClinGen canonical allele id, GA4GH variant digest id, hgvs nomenclature, COSMIC id, gnomAD id, etc...)
Jamie Jones (Nov 05 2019 at 18:35):
I like the approach of having a single component for all relevant identifiers, though currently we split out dbSNP and HGVS because they are treated very differently. Systems can easily slice this component to require specific codings to be present (clinvar, COSMIC, etc).
I also agree the textual guidance on it is currently misleading, I believe "found" here pertains to in the study, not necessarily in the patient.
Jamie Jones (Nov 05 2019 at 18:36):
Note that even with one component you could have 0..* codings, not sure if that makes a difference without considering examples.
Larry Babb (Nov 05 2019 at 18:43):
@James Jones the coding within the code's CodeableConcept that has a 0..* cardinality only allows you to provide multiple equivalent code(s) but does not allow you to provide multiple code-value pairs.
Jamie Jones (Nov 05 2019 at 18:46):
I believe the code would be sent with the required LOINC coding but the value's CodeableConcept could come with multiple codings (each with a system and value)
Larry Babb (Nov 05 2019 at 18:48):
Jamie Jones (Nov 05 2019 at 18:49):
Within CodeableConcept there are 0..* codings and 0..1 text. That differential table does not expand the value[x] element because the profile does not define any differences from the base element on Observation
Larry Babb (Nov 05 2019 at 18:49):
I see now. there's a fixed code. and then you would put as many codings as you want which would all be presumed to be equivalent concepts for the fixed code.
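
For illustration, the shape described above might look like the following minimal sketch: one component, one fixed code, and several codings inside the value's CodeableConcept. The LOINC code 81252-9 is the one quoted in the summary later in this thread. The helper name, the ClinVar system URI, the allele registry URI, and both identifier values are placeholders assumed for the example, not values taken from the IG.

```python
import json

LOINC = "http://loinc.org"

def variation_code_component(codings):
    """Build one variation-code component whose value carries several codings.

    Hypothetical helper for illustration only; it is not part of any FHIR library.
    """
    return {
        # Fixed code identifying the component (LOINC 81252-9, per this thread).
        "code": {"coding": [{"system": LOINC, "code": "81252-9"}]},
        # All codings here are presumed to be equivalent identifiers
        # for the same variant.
        "valueCodeableConcept": {"coding": list(codings)},
    }

component = variation_code_component([
    # Assumed system URIs and made-up identifier values.
    {"system": "http://www.ncbi.nlm.nih.gov/clinvar", "code": "EXAMPLE-CLINVAR-ID"},
    {"system": "http://example.org/allele-registry", "code": "EXAMPLE-ALLELE-ID"},
])

print(json.dumps(component, indent=2))
```

The component itself stays at 0..1; the multiplicity lives in value[x].coding.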
Larry Babb (Nov 05 2019 at 18:50):
my bad. pretty straight forward considering.
Jamie Jones (Nov 05 2019 at 18:51):
Not something I understood until I got in with a lot of examples. The graphical representation of the tables is very dense. Hope they improve in the future (and our text surrounding the concept could certainly be improved).
Kevin Power (Nov 05 2019 at 18:52):
If you look at the 'Snapshot Table' tab instead of the "Differential Table" tab, you see all the things - which in a case like this makes it easier to see what you can do. Of course, you have to wade through lots of other stuff as well.
edit - Not as much as I thought, still doesn't show all the value[x] : CodeableConcept options. :frown:
Larry Babb (Nov 05 2019 at 18:53):
it is interesting that for some components, you would replicate the entire component if you want to provide multiple values, but for situations where the component value is a CodeableConcept we have to only provide one component instance with multiple codings buried in the value's CodeableConcept. It's stuff like this that seems to slow down innovation IMO. But this is a model that is pretty well established and part of learning how to play with FHIR.
Jamie Jones (Nov 05 2019 at 18:54):
yes, the distribution of 'list' elements vs singletons is the real headache of FHIR in my experience
Kevin Power (Nov 05 2019 at 18:56):
The distinction we try to make is what you said above: 'which would all be presumed to be equivalent concepts', so we keep the component as 0..1, but when the thing we are representing is not equivalent concepts, we allow the component to be 0..* (now I am hoping we don't have a mistake which shows we don't do that right all the time :slight_smile:)
Larry Babb (Nov 05 2019 at 18:57):
nesting multiple answers in a value's CodeableConcept feels right in some cases, for example when they are truly alternative codings for the same exact thing. But when they are somewhat different, from differing authorities and not so exact, it feels like they should be separate component copies. Just saying.
Larry Babb (Nov 05 2019 at 18:58):
i'm not sure a clinvar variationId is equivalent to a CIVIC variant id, which is equivalent to a gnomAD representation or id which is equivalent to a clingen allele registry id. And if you go that far, then some would say that hgvs and dbsnp are equivalent "enough" to play in that equation.
Larry Babb (Nov 05 2019 at 18:59):
i'm good on this. I don't think it's a battle worth fighting. In the end, there are much bigger fish to fry.
Jamie Jones (Nov 05 2019 at 19:00):
I don't know much about gnomAD or the allele registry, but I would hope they are at least equivalent in terms of being 'unique identifiers for the variant' (I agree 'simple' needs to be removed). Otherwise they may be better suited elsewhere, as a different concept
Kevin Power (Nov 05 2019 at 19:00):
All fair points, and ones we quibbled over for a bit. The discussion on multiple codings describes it like this:
https://hl7.org/fhir/datatypes.html#CodeableConcept
Additional Codes
More than one code may be used in CodeableConcept. The concept may be coded multiple times in different code systems (or even multiple times in the same code systems, where multiple forms are possible, such as with SNOMED CT). Each coding (also referred to as a 'translation') is a representation of the concept as described above and may have slightly different granularity due to the differences in the definitions of the underlying codes. There is no meaning associated with the ordering of coding within a CodeableConcept. A typical use of CodeableConcept is to send the local code that the concept was coded with, and also one or more translations to publicly defined code systems such as LOINC or SNOMED CT. Sending local codes is useful and important for the purposes of debugging and integrity auditing.
Jamie Jones (Nov 05 2019 at 19:03):
I might consider dbSNP/allele ids as a less granular unique id, so could lump it in here already it seems? :thinking:
Kevin Power (Nov 05 2019 at 19:12):
Another thing that was considered, but there was significant pushback on including dbSNP, as it is really a definition of the 'location', not the 'variant'
Not sure what you are referring to as 'allele ids' ?
Jamie Jones (Nov 05 2019 at 19:20):
I simply meant 'allele' here as 'location' (eg dbSNP). I understand the decision to pull it out of variation code--even though it may be used in some spaces as a less granular identifier, us giving it a separate component implies it should not be used as a unique identifier, which I agree with. We may even want to suggest an invariant requiring at least alt-allele if dbSNP is populated, from my limited experience
Kevin Power (Nov 05 2019 at 19:24):
Got ya. FWIW - I was all for combining in dbSNP (with appropriate caveats in the documentation), but if I recall I was well in the minority on that thinking. Might be some good implementer feedback to gather.
Jamie Jones (Nov 05 2019 at 19:27):
It was a discussion before I started with the group and we haven't really reviewed the component since. We may be able to put dbSNP in as a sliced coding with an invariant at that level, which may be very cool (would then have to populate data-absent-reason on alt-allele, for example) though I am unsure of how to get that into the structureDef with the spreadsheet method... ;)
Jamie Jones (Nov 05 2019 at 19:33):
Summary for the topic seems to be that the current approach for variant-code (code=LOINC|81252-9, multiple entries on value[x].coding) works for everyone's needs, but dbSNP and/or HGVS may be lumped into the same component in the future if there is enough call for it.
Bret H (Nov 18 2019 at 17:27):
the dbSNP component is historical. I think we'll be able to remove it. Clem was the strongest proponent at the time, but that was before our more recent work. My suggestion is to raise a proposal on the next call to remove it, make sure Clem and anyone else have a chance to raise objections, then cast a vote. Likely it will be deprecated in favor of putting dbSNP into variant ID.
Larry Babb (Nov 18 2019 at 17:54):
I would say the one issue here is that a dbSNP ID without the precise nucleotide change is somewhat ambiguous, so calling it a variation code is a bit of a stretch. Like many of the variation codes that I think folks would use, this field should be more along the lines of a "cross-reference" code. XRefs or cross-references are useful in communicating other identifiers that precisely or closely relate to the concept in question.
If you truly want a field of equivalent or precise cross-referenceable codes (or ids) to the variation in question then dbSNP should not go with variation-code. It's a tradeoff for sure. I can see the case for an xref-codes field so that dbSNP and any others like it can be shared versus having a dedicated field for each type that exists out there, historical or future.
Jamie Jones (Nov 18 2019 at 18:04):
An invariant on dbSNP prompting alt-allele seems in order, regardless of whether the system shows up as its own component or lumped into one with others
Patrick Werner (Nov 19 2019 at 20:00):
you could also populate c.hgvs, or g.hgvs , right?
Bret H (Nov 21 2019 at 11:27):
@Jamie Jones expound please on 'an invariant...prompting'
Kevin Power (Nov 21 2019 at 14:35):
I think he means a constraint that would say "if you provide a dbSNP ID, you should also provide the alt-allele"
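
A rough sketch of what such a constraint would check, written as a plain Python function over an Observation represented as a dict. This is not how an invariant is expressed in a StructureDefinition; it only illustrates the rule "dbSNP ID present implies alt-allele present". The LOINC codes used for the dbSNP-id and alt-allele components (81255-2 and 69551-0) are assumptions here and should be verified against the IG, and the function names are hypothetical.

```python
LOINC = "http://loinc.org"
DBSNP_ID_CODE = "81255-2"    # assumed LOINC code for the dbSNP-id component
ALT_ALLELE_CODE = "69551-0"  # assumed LOINC code for the alt-allele component

def has_component(observation, loinc_code):
    """Return True if any component is coded with the given LOINC code."""
    for comp in observation.get("component", []):
        for coding in comp.get("code", {}).get("coding", []):
            if coding.get("system") == LOINC and coding.get("code") == loinc_code:
                return True
    return False

def dbsnp_alt_allele_warnings(observation):
    """Return a warning if a dbSNP ID is sent without an alt-allele."""
    if has_component(observation, DBSNP_ID_CODE) and not has_component(
        observation, ALT_ALLELE_CODE
    ):
        return ["dbSNP ID provided without alt-allele"]
    return []

def _comp(code):
    # Tiny helper that builds a component with only its code populated.
    return {"code": {"coding": [{"system": LOINC, "code": code}]}}

only_dbsnp = {"resourceType": "Observation", "component": [_comp(DBSNP_ID_CODE)]}
both = {
    "resourceType": "Observation",
    "component": [_comp(DBSNP_ID_CODE), _comp(ALT_ALLELE_CODE)],
}

print(dbsnp_alt_allele_warnings(only_dbsnp))
print(dbsnp_alt_allele_warnings(both))
```

Returning a list of warnings rather than raising an error matches the "should" wording: the check flags the instance without rejecting it.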
Patrick Werner (Nov 21 2019 at 15:01):
Sorry for repeating myself. But wouldn't c.hgvs or g.hgvs also be sufficient to provide with a dbSNP id?
Jamie Jones (Nov 21 2019 at 15:10):
Yes, I think it would be hard to submit either of those without them saying what the alt-allele is.
Patrick Werner (Nov 21 2019 at 15:16):
I'm thinking about a use case where only hgvs is populated and the alt-allele isn't filled out. As the hgvs gives you the alt allele this should be ok to use with dbSnp
Kevin Power (Nov 21 2019 at 15:17):
That would require someone to parse out the HGVS string to find the alt-allele if they needed it though.
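
To give a sense of the parsing burden, here is a deliberately narrow sketch that pulls the alt allele out of an HGVS string, assuming the variant is a simple single-nucleotide substitution written as `<ref>><alt>` at the end of the expression. The function name and the example strings are made up. Real HGVS covers deletions, insertions, duplications, and more, none of which this handles, and for c.HGVS the allele is stated relative to the transcript, so it may not match a genomic alt-allele directly.

```python
import re

# Matches only a trailing single-base substitution such as "A>G".
_SUBSTITUTION = re.compile(r"(?P<ref>[ACGT])>(?P<alt>[ACGT])$")

def alt_from_hgvs(hgvs):
    """Return the alt base for a simple substitution, or None otherwise.

    Hypothetical helper for illustration; not a general HGVS parser.
    """
    match = _SUBSTITUTION.search(hgvs)
    return match.group("alt") if match else None

# Made-up expressions on a placeholder reference sequence.
print(alt_from_hgvs("NM_000000.0:c.123A>G"))
print(alt_from_hgvs("NM_000000.0:c.123del"))
```

Anything beyond this simple case returns None, which is the situation where a receiver would have to fall back on a full HGVS parser or on an explicitly populated alt-allele component.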
Jamie Jones (Nov 21 2019 at 15:17):
This is similar to the cytogenetic-location / chromosome id overlap
Jamie Jones (Nov 21 2019 at 15:18):
I guess the difference is there is more rigor around hgvs
Bob Dolin (Nov 21 2019 at 15:19):
Probably not going to fly, but would be nice to say "you must include SPDI, and you can also include whatever else you want"
Patrick Werner (Nov 21 2019 at 15:20):
"That would require someone to parse out the HGVS string to find the alt-allele if they needed it though."
That's actually a very valid point.
+1 for an invariant which enforces an alt-allele if the system of the variant ID equals dbSNP
Patrick Werner (Nov 21 2019 at 15:22):
i'd love to have only a single variant_ID component. (We had at least 3 at some point; I remember discussing with Clem, which led to the removal of the cosmicID component.)
Jamie Jones (Nov 21 2019 at 15:22):
I still don't know that we need to support dbSNP based on its limitations
Patrick Werner (Nov 21 2019 at 15:23):
me neither. That's why I want to get rid of its component (even without constraint)
Kevin Power (Nov 21 2019 at 15:23):
I am willing to bet someone will want to send it, at least for a while.
Kevin Power (Nov 21 2019 at 15:26):
In my experience, when people are hooked on something, you have to get them off of the drug slowly. In this case, have a way to support dbSNP, but make it easy to not need dbSNP going forward.
Jamie Jones (Nov 21 2019 at 15:28):
Well we could have the alt-allele prompt/warning mention that dbSNP will be deprecated in a future version ;)
Jamie Jones (Nov 21 2019 at 15:29):
There are some ways to get warnings in, but I'm not sure on the limitations of it
Last updated: Apr 12 2022 at 19:14 UTC