FHIR Chat · strategy for phasing sequence blocks · genomics

Stream: genomics

Topic: strategy for phasing sequence blocks


Bob Milius (Jun 01 2016 at 17:21):

I've been thinking of how to group sequences into phased sets and am looking at Observation.related for this. Here's a diagram to illustrate what I'm thinking. Does this seem reasonable?
[pasted image: diagram of sequences grouped into phased sets using Observation.related]

Gaston Fiore (Jun 01 2016 at 19:00):

This looks fine. What do you mean by Observation/Sequence on the leaves of the tree? @Josh Mandel, do you see any issue with this type of nested structure?

Kevin Power (Jun 01 2016 at 19:24):

@Bob Milius I assume this nesting would only be used when the phasing is known?

Bob Milius (Jun 01 2016 at 19:52):

@Gaston Fiore Observation/Sequence indicates that the data sits either in the Observation for Genetics profile, which can point to Sequence, or preferably in the Sequence resource itself. Right now Observation.related.target can only point to another Observation (or presumably a profile of Observation) or a QuestionnaireResponse, so it can't point to Sequence directly. We'll have to propose adding Sequence as another allowed target.
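
To make the nesting concrete, here is a minimal sketch of the grouping described above, written as plain Python dicts shaped like Observation resources. The resource ids, the placeholder `code` text, and the choice of the `has-member` relationship type are assumptions for illustration; this is not a validated FHIR instance, and the leaves are Observations because `related.target` cannot reference Sequence directly.

```python
import json


def phase_group(group_id, member_ids):
    """Build an Observation-shaped dict whose `related` entries point at member Observations."""
    return {
        "resourceType": "Observation",
        "id": group_id,  # hypothetical id
        "status": "final",
        "code": {"text": "phased sequence group"},  # placeholder, no real code system implied
        "related": [
            {"type": "has-member", "target": {"reference": f"Observation/{member}"}}
            for member in member_ids
        ],
    }


# One group per phase; each points at the Observations for the sequence blocks in that phase.
phase_a = phase_group("phase-a", ["exon2-a", "exon3-a"])
phase_b = phase_group("phase-b", ["exon2-b", "exon3-b"])

# The top-level Observation groups both phases.
genotype = phase_group("genotype", ["phase-a", "phase-b"])

print(json.dumps(genotype, indent=2))
```

Each level of the tree is just another Observation, so a consumer has to walk `related` recursively to recover which sequence blocks share a phase.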

Bob Milius (Jun 01 2016 at 19:55):

@Kevin Power Yes. There are three ways for it to be inferred: 1) direct physical determination, e.g., clonal amplification/single molecule sequencing/long reads, 2) imputation using allele/haplotype frequencies, and 3) family history.

Kevin Power (Jun 01 2016 at 20:37):

That makes sense, but when you have optional nesting like that, it makes implementation more difficult. That isn't a reason not to do it. I just want to make sure we understand the requirements and/or what we will miss out on if we don't represent phasing. I just haven't seen many examples from our clients where they report or send phasing information. But maybe that's because there isn't a good way to do so today. :)

Bob Milius (Jun 01 2016 at 20:46):

@Kevin Power Yes, it would make implementation more complicated. At NMDP, we ask our labs to send us phasing information if they have it. Another way to do it is to add a phaseset element to the sequence resource or observation for genetics profile and use a unique label (e.g., uuid) to tag those sequences that are in phase with each other. Just need to make sure they aren't used outside of the lab result.
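
The tagging alternative can be sketched in a few lines of Python. The `phaseSet` key and the record ids below are hypothetical stand-ins for whatever element ends up on the Sequence resource or the Observation for Genetics profile; the point is only that a shared unique label replaces the nested structure.

```python
import uuid
from collections import defaultdict


def group_by_phase_set(sequences):
    """Map each phase-set tag to the ids of the sequences that carry it."""
    groups = defaultdict(list)
    for seq in sequences:
        tag = seq.get("phaseSet")  # hypothetical element name
        if tag is not None:        # untagged sequences have unknown phase
            groups[tag].append(seq["id"])
    return dict(groups)


# One fresh tag per phase set, generated within a single lab result.
tag_1 = str(uuid.uuid4())
tag_2 = str(uuid.uuid4())

sequences = [
    {"id": "exon2-a", "phaseSet": tag_1},
    {"id": "exon3-a", "phaseSet": tag_1},
    {"id": "exon2-b", "phaseSet": tag_2},
    {"id": "exon3-b", "phaseSet": tag_2},
    {"id": "exon4-unphased"},  # no tag: phase not determined
]

groups = group_by_phase_set(sequences)
for tag, members in groups.items():
    print(tag, members)
```

Because the tags are random, they carry no meaning outside the result that generated them, which matches the caveat about not using them outside of the lab result.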

Kevin Power (Jun 01 2016 at 20:58):

I like the additional phaseset element, as long as we don't lose something important.

Bob Milius (Jun 01 2016 at 21:01):

The nice thing about grouping the sequences into phase sets is that it simplifies reporting the allele assignments for genotyping. For HLA-A, for example, one phase set could represent one allele name assignment (e.g., HLA-A*01:02), another phase set would represent the other (e.g., HLA-A*01:03), and the final genotype would be reported at the observation where both are included (e.g., HLA-A*01:02 + HLA-A*01:03).
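
As a rough illustration of that roll-up, the sketch below maps each phase set to one allele assignment and joins them into the genotype string reported at the top level. The phase-set labels and the dict layout are assumptions made for the example, not part of any profile.

```python
def genotype_string(allele_by_phase_set):
    """Join the per-phase-set allele names into a single genotype string."""
    return " + ".join(sorted(allele_by_phase_set.values()))


# Hypothetical phase-set labels, one allele assignment per phase set.
allele_by_phase_set = {
    "phase-set-1": "HLA-A*01:02",
    "phase-set-2": "HLA-A*01:03",
}

genotype = genotype_string(allele_by_phase_set)
print(genotype)  # HLA-A*01:02 + HLA-A*01:03
```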

Bob Milius (Jun 01 2016 at 21:03):

But you may be able to get the same thing using a phaseset element with a unique tag. I'll have to mock up some examples of both ways.

Joel Schneider (Jun 01 2016 at 22:34):

For HLA interpretation, for example, it's generally useful to know which exon 2 and exon 3 go together (have the same phase). Information about phase may also be useful for other scenarios involving exomic data.

In terms of the SO, I'm not sure whether we're talking about a representation for a sequence_collection (SO:0001260), or maybe a sequence_assembly (SO:0000353).

Joel Schneider (Jun 01 2016 at 22:47):

The most relevant reference I'm seeing is consensus-sequence-block, described in the HML manuscript.

In that case, the consensus-sequence-block is an interpretation of the primary sequence data, and phase-set is acting like a shorthand for some (presumably omitted) sequence data which was used to resolve phase.

Bob Milius (Jun 02 2016 at 13:22):

Yes, when we developed HML, we used the phase-set attribute of consensus-sequence-block to tag sequences that are in chromosomal phase with each other. We can do something similar in the Sequence resource or Observation for Genetics and/or group them with Observation.related as well.

Kevin Power (Jun 02 2016 at 13:56):

For others listening: I'd be curious whether representing phasing is important for non-HLA use cases. Meaning, could this be part of one of the HLA-specific profiles? I get the sense it is good to represent, but I'd like to see the use cases to support it. Or maybe the best question here: is representing phase part of the 80% or the 20%?

Bob Milius (Jun 02 2016 at 14:34):

Yes, the Consensus-sequence-block profile of Sequence was created for the HLA use case, and includes phaseSet. We can use that. But if it can be used for other use cases, it could be elevated to the Sequence resource, or included in the Observation for Genetics Profile.

Kevin Power (Jun 02 2016 at 14:42):

Thanks Bob. I didn't check the profile before asking the question. Should have known you had it covered. :)

Gaston Fiore (Jun 06 2016 at 14:30):

Hi @Bob Milius, would you be able to present on the 2 different ideas (complex Observations and new attribute in Sequence) on Thursday? Is there anything specific you would like me to add to the Sequence resource and show on the staging site on Thursday?

Bob Milius (Jun 06 2016 at 15:25):

Sure, I can present that. I'll think about the staging site and let you know.

Gaston Fiore (Jun 07 2016 at 18:34):

@Bob Milius, you mentioned you wanted a uri for the phase attribute in Sequence? Let me know your idea and I'll make the change. You can talk about it on Thursday, and that way people might better understand what you propose. Thanks a lot.

Bob Milius (Jun 07 2016 at 18:44):

I was thinking along the lines of how resources in bundles are identified, using the Bundle.entry.fullUrl element which is a uri data type (http://hl7-fhir.github.io/datatypes.html#uri). So it would be a uuid/guid in the form of
urn:uuid:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx (using a unique uuid for each phase-set).
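
For reference, a minimal Python sketch of producing and checking identifiers in that form. The helper names are made up for this example; the only library behavior relied on is that the standard `uuid` module's `UUID.urn` attribute yields a `urn:uuid:` string.

```python
import re
import uuid

# Lowercase urn:uuid form, 8-4-4-4-12 hex digits.
URN_UUID = re.compile(
    r"^urn:uuid:[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$"
)


def new_phase_set_id():
    """Return a fresh random identifier in urn:uuid form (hypothetical helper)."""
    return uuid.uuid4().urn


def is_phase_set_id(value):
    """Check that a string has the urn:uuid shape (hypothetical helper)."""
    return URN_UUID.match(value) is not None


phase_set_a = new_phase_set_id()
phase_set_b = new_phase_set_id()
print(phase_set_a)
print(is_phase_set_id(phase_set_a), is_phase_set_id("phase-1"))
```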

Gaston Fiore (Jun 07 2016 at 20:47):

Okay. I'll have it on the staging site tomorrow morning. I'll let you know so you can take a look. Thanks Bob!

Gaston Fiore (Jun 08 2016 at 14:05):

Hi @Bob Milius, I've added a phase set identifier to Sequence:
http://genomics-advisor.smartplatforms.org:4000/sequence.html
Please let me know if you'd like to change anything, including the definition of the element, or anything else you see fit. Thanks!

Gaston Fiore (Jun 08 2016 at 14:08):

I've been following the discussion on nested Observations in the implementers stream. I'll take a more detailed look this afternoon. I feel we'll have to explain this tomorrow so people can understand how phasing could be handled in FHIR.

Gaston Fiore (Jun 09 2016 at 16:13):

Very informative presentation, @Bob Milius! Thank you!

Bob Milius (Jun 13 2016 at 20:50):

FYI, I uploaded my presentation from last week to the HL7 CG document center:
http://www.hl7.org/documentcenter/public/wg/clingenomics/FHIR_ObservationRelated_phaseset.pdf

Gaston Fiore (Jun 14 2016 at 14:40):

Great, thanks! I'll be adding some of that to the IG. We should find a way to handle references.


Last updated: Apr 12 2022 at 19:14 UTC