Stream: genomics
Topic: Sequence.coordinateSystem cardinality
Bob Milius (Jul 08 2016 at 19:46):
We added coordinateSystem to the Sequence resource to force users to declare if they are using 0- or 1-based coordinates, so we made the cardinality 1..1.
But it's possible to send a sequence without using coordinates, so this doesn't make sense for them, e.g.,
<Sequence>
  <patient>
    <reference value="Patient/123"/>
  </patient>
  <specimen>
    <reference value="Specimen/456"/>
  </specimen>
  <type value="DNA"/>
  <coordinateSystem value="0"/>
  <observedSeq value="GCTCCCACTCCAT"/>
</Sequence>
So I'm inclined to make it optional. But if they are using start and end inside of referenceSeq or variant or quality, then they should be required to declare which coordinate system they are using, and preferably the same coordinateSystem for all of those.
Not sure how to handle this.
btw, all the start and end definitions say to use inclusive/exclusive (respectively), but if we are forcing them to declare the coordinateSystem, then we should remove 'inclusive' and 'exclusive' from the definitions.
thoughts?
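The conditional rule floated above (coordinateSystem optional in general, but required as soon as start or end appears in referenceSeq, variant, or quality) could be checked along these lines. This is only a sketch: it assumes the Sequence resource has been parsed into a plain Python dict, and the function names `uses_coordinates` and `check_coordinate_system` are hypothetical, not part of any FHIR library.

```python
# Sketch only: assumes a Sequence resource already parsed into a plain dict.
# The helper names below are hypothetical, not a FHIR library API.

COORDINATE_BEARING_ELEMENTS = ("referenceSeq", "variant", "quality")


def uses_coordinates(sequence: dict) -> bool:
    """Return True if any coordinate-bearing element carries start or end."""
    for key in COORDINATE_BEARING_ELEMENTS:
        value = sequence.get(key)
        if value is None:
            continue
        # Accept either a single element or a repeating list of elements.
        entries = value if isinstance(value, list) else [value]
        for entry in entries:
            if "start" in entry or "end" in entry:
                return True
    return False


def check_coordinate_system(sequence: dict) -> list:
    """Return a list of problems; an empty list means the rule is satisfied."""
    problems = []
    if uses_coordinates(sequence) and "coordinateSystem" not in sequence:
        problems.append("coordinateSystem is required when start/end are present")
    return problems


# A sequence with no coordinates passes without coordinateSystem.
print(check_coordinate_system({"type": "DNA", "observedSeq": "GCTCCCACTCCAT"}))
# A variant with start/end but no coordinateSystem is flagged.
print(check_coordinate_system({"variant": [{"start": 0, "end": 1}]}))
```

Keeping a single coordinateSystem at the resource level, as in this sketch, also covers the preference that referenceSeq, variant, and quality all share the same system.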
Kevin Power (Jul 08 2016 at 20:23):
+1 to making it optional. But inclusive/exclusive still makes sense regardless of the system, right? The coordinate system just determines whether you write [0,1) or [1,2), or so I think.
Bob Milius (Jul 08 2016 at 20:43):
Yes, but I think people who use 1-based coordinates often also assume the end is inclusive. So selecting just the first nucleotide in a sequence would be start=0, end=1, i.e., [0,1), in 0-based, and start=1, end=1, i.e., [1,1], in 1-based.
Am I wrong? In any case, there should not be any ambiguity about it. I know HGVS uses 1-based. Does it use inclusive/exclusive [ ), or inclusive/inclusive [ ]?
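To make the two conventions concrete, here is a small sketch that converts between 0-based inclusive/exclusive and 1-based inclusive/inclusive ranges, using the observedSeq from the earlier example. The function names are made up for illustration.

```python
# Sketch only: hypothetical helper names, plain Python integers and strings.

def zero_based_half_open_to_one_based_closed(start: int, end: int) -> tuple:
    """[start, end) in 0-based becomes [start + 1, end] in 1-based."""
    return start + 1, end


def one_based_closed_to_zero_based_half_open(start: int, end: int) -> tuple:
    """[start, end] in 1-based becomes [start - 1, end) in 0-based."""
    return start - 1, end


observed_seq = "GCTCCCACTCCAT"

# First nucleotide, 0-based inclusive/exclusive: [0, 1).
# Python slices follow this same convention.
first = observed_seq[0:1]

# The same single nucleotide in 1-based inclusive/inclusive is [1, 1].
one_based = zero_based_half_open_to_one_based_closed(0, 1)

print(first, one_based)
```

Only the start shifts by one between the two conventions; the end value is numerically identical, which is part of why the two are easy to confuse.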
Bob Milius (Jul 08 2016 at 20:48):
Never mind about HGVS; I don't think they use an end position, just a start. See the section on nucleotide numbering in http://www.hgvs.org/mutnomen/recs.html
Kevin Power (Jul 08 2016 at 20:54):
Actually, reading about deletions, I think you are right (both positions are inclusive): http://www.hgvs.org/mutnomen/recs-DNA.html#del
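As a quick sanity check on the inclusive/inclusive reading, the HGVS deletion example c.76_78del (an ACT deletion) spans three nucleotides only if both positions are counted. A minimal sketch, with hypothetical helper names:

```python
# Sketch only: hypothetical helpers comparing range lengths under each convention.

def closed_length(start: int, end: int) -> int:
    """Number of positions in an inclusive/inclusive range [start, end]."""
    return end - start + 1


def half_open_length(start: int, end: int) -> int:
    """Number of positions in an inclusive/exclusive range [start, end)."""
    return end - start


deleted = "ACT"

# HGVS-style 76_78 read as inclusive/inclusive covers all three deleted bases.
hgvs_length = closed_length(76, 78)

# The equivalent 0-based inclusive/exclusive range is [75, 78).
zero_based_length = half_open_length(75, 78)

print(hgvs_length, zero_based_length, len(deleted))
```

Reading 76_78 as inclusive/exclusive instead would give a length of 2, which would not match the three deleted bases.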
Kevin Power (Jul 08 2016 at 20:55):
" c.76_78del (alternatively c.76_78delACT) denotes a ACT deletion from nucleotides 76 to 78 "
Bret H (Jul 11 2016 at 06:29):
Re: enforcing use of coordinateSystem only when coordinates are used.
How about moving coordinateSystem into a required field of begin and end?
Along the lines of:
<begin value="value">
<coordinateSystem value="0"/>
</begin>
Also, I think that even if a decision were reached to use only "0" or only "1", specifying the index start is an explicit way to signal it to a downstream system... it is still up to a great implementation guide to ensure proper use.
Best,
Bret
Last updated: Apr 12 2022 at 19:14 UTC