FHIR Chat · variant location · genomics

Gaston Fiore (May 27 2016 at 14:42):

Hi @Bob Milius, if variant, referenceSeq, and structureVariant were all moved from the Sequence resource to the Observation profile as @Amnon Shvo proposes, how would you carry out the workflow that you described at yesterday's meeting? Will it be affected? If so, how? @Jonathan Holt, if you have any input, please chime in as well. Thanks a lot!

Jonathan Holt (May 27 2016 at 15:50):

@Gaston Fiore, I think that we need to flesh out the goals of the Sequence resource to better understand the placement of attributes. I will also need to defer to @Amnon Shvo for his vision of his proposed changes. My take is that we need to separate the genomic data from the observation and from the interpretation. The genetic testing report (the interpretation) refers to multiple observations (the Observation profile), which reference the Sequence resource (the data) in a "show your work" approach that makes the result unambiguous. The more granular bioinformatics files (VCF, SAM/BAM) are in separate repositories. The second goal of the Sequence resource is to enable storage and querying of raw data without having to dive into the VCF or alignment files.

Gaston Fiore (May 27 2016 at 18:30):

I agree, @Jonathan Holt. What is not clear to me is whether it is necessary to have variant and structureVariant in Sequence for querying raw data. @Bob Milius, in your HLA cases, when querying variants, it is necessary to have those elements in Sequence, right? I'm also trying to understand, with specific examples, whether moving those elements to the Observation profile would adversely affect querying.

Bob Milius (May 29 2016 at 17:11):

@Gaston Fiore Short answer: I don't know. I need to see how our HLA use case works when variant is moved from Sequence to Observation, and I need to see it mocked up with examples. In our use case, we need to represent a consensus sequence from a lab that includes one or more sequence blocks (e.g. some but not all of the exons in the gene). Within the group of sequence blocks, I need to see whether the lab has determined phase between any of them, or if it is unknown. I also need to know whether each sequence block is a novel variant as compared to the IPD-IMGT/HLA allele database. This level of detail is somewhere between primary data and interpretation. The sequence blocks are a result of aligning/assembling the raw reads, which could change depending on the tools/algorithms used, but are then used to make an allele assignment. One person's interpretation is another's primary data. In the end, I'll have to mock it up with examples and see if it works, and even if it does, whether it's more awkward/cumbersome.

Gaston Fiore (May 30 2016 at 13:27):

Thanks @Bob Milius. As a software engineer, I completely agree that we need examples. I was going to go back to an email you sent a couple of months ago with background on your use cases. Let me know when/as you work on this. I want to participate. Working through examples is key to the bottom-up approach that I want to pursue. Thanks Bob!


Last updated: Apr 12 2022 at 19:14 UTC