Stream: fhir/infrastructure-wg
Topic: NLP Derived Elements
Josh Mandel (Jan 10 2022 at 17:54):
Also for discussion today: https://chat.fhir.org/#narrow/stream/179166-implementers/topic/Specs.20and.20Unicode if Grahame is able to join.
Josh Mandel (Jan 10 2022 at 21:03):
@Guy Becker @Paul Church @Rick Geimer @Dan Gottlieb -- We've made progress on FHIR-34475 and agreed to add this to the spec, with some adjustments:
- renamed to "derivation-reference"
- added a way to point to "text documents" (via DocumentReference, or Binary) as well as arbitrary other text content in a FHIR resource (e.g. Composition.section narrative)
In the discussion, Paul mentioned some other use cases including 2D imaging annotations, more metadata about the ML model / algorithms, etc. I'd like to schedule follow-up discussion to explore how we can enhance the model. Please thumbs-up here if you want to join that discussion.
Dan Gottlieb (Jan 10 2022 at 21:05):
I think @Bin Mao on the BCH NLP team will be interested as well.
Josh Mandel (Jan 11 2022 at 17:10):
@Lloyd McKenzie in the resolution yesterday it looks like you changed context from "Resource, Element" to "Element", but I don't think this was discussed. I don't think a context of "Element" allows for use at the resource level, does it? In which case we probably need to update to support both.
Lloyd McKenzie (Jan 11 2022 at 21:11):
There's an Element that represents the overall resource level in the snapshot. However, if we need Resource too, that's fine. Just note it in the comments.
Josh Mandel (Jan 11 2022 at 22:51):
I don't quite follow your note about snapshot, but I think we did mean to write "Resource, Element". I added a comment to this effect at https://jira.hl7.org/browse/FHIR-34475?focusedCommentId=195083&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-195083 -- thanks. (FYI @Guy Becker.)
Josh Mandel (Jan 13 2022 at 18:32):
OK, we've scheduled a call to accommodate everyone who responded with a "thumbs up" to my message above. I'll include details here in case others are interested in joining.
Call is 11a-12p CT on Jan 26: Microsoft Teams meeting
Gino Canessa (Jan 26 2022 at 17:03):
Bump since the meeting starts in a few minutes, cheers!
Josh Mandel (Jan 26 2022 at 20:00):
Thanks to all who joined today's discussion! I've posted the recording at https://youtu.be/kPye_GjWrBQ (may take YouTube a few minutes to complete processing).
Josh Mandel (Jan 26 2022 at 20:00):
Call notes are available here.
Josh Mandel (Jan 31 2022 at 19:33):
From our next steps on extensions: there's interest in communicating coordinates or bounding boxes for data derived from 2D images (e.g. OCR'd documents). Looking at real-world examples (e.g. Google, Microsoft, and AWS), there's quite a lot of capability in contemporary OCR toolchains -- e.g.
- hierarchical relationships between items like pages, paragraphs, and words
- orientation markers, adjustment rotations, and other pre-processing applied in order to "rectify" images
My take here, though, is that we're not trying to develop a meta-model for full OCR capabilities. Rather, we just want a pointer from FHIR data back to original source materials. Being able to draw a simple "bounding box" would be the rough 2D-image equivalent of our text-based "offset and length" annotations, and would be a good place to start. As such, I'd suggest adding to our existing extension a set of optional properties for use with 2D image sources:
- xmin: 0..1 positiveInt
- xmax: 0..1 positiveInt
- ymin: 0..1 positiveInt
- ymax: 0..1 positiveInt
A constraint would ensure that these four properties are all present or all absent -- something like (xmin or xmax or ymin or ymax) implies (xmin and xmax and ymin and ymax).
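The all-or-none constraint above can be sketched as a small validation check. This is a hedged illustration, not the spec: it assumes the four bounds would be serialized as sub-extensions of derivation-reference, each keyed by a url of "xmin"/"xmax"/"ymin"/"ymax" with a valuePositiveInt, which is one plausible complex-extension shape rather than anything published.

```python
# Sketch of the proposed all-or-none bounding-box invariant.
# Assumption: each bound is a sub-extension {"url": "<name>", "valuePositiveInt": n}
# inside the derivation-reference extension; this shape is illustrative only.

BBOX_KEYS = ("xmin", "xmax", "ymin", "ymax")

def bbox_values(extension):
    """Collect bounding-box sub-extension values, keyed by their url."""
    return {
        sub["url"]: sub.get("valuePositiveInt")
        for sub in extension.get("extension", [])
        if sub.get("url") in BBOX_KEYS
    }

def bbox_invariant_ok(extension):
    """(xmin or xmax or ymin or ymax) implies (xmin and xmax and ymin and ymax)."""
    values = bbox_values(extension)
    present = [k for k in BBOX_KEYS if values.get(k) is not None]
    return len(present) in (0, len(BBOX_KEYS))

# Hypothetical example with all four bounds present (URL assumed for illustration):
ext = {
    "url": "http://hl7.org/fhir/StructureDefinition/derivation-reference",
    "extension": [
        {"url": "xmin", "valuePositiveInt": 10},
        {"url": "xmax", "valuePositiveInt": 120},
        {"url": "ymin", "valuePositiveInt": 40},
        {"url": "ymax", "valuePositiveInt": 80},
    ],
}
```

In a real profile this would be expressed as a FHIRPath invariant on the extension rather than application code; the sketch just shows the intended logic.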
Josh Mandel (Jan 31 2022 at 19:37):
Should we take this forward, or is there an alternative design folks would like to suggest as an enhancement to https://build.fhir.org/extension-derivation-reference.html ?
Last updated: Apr 12 2022 at 19:14 UTC