Stream: social
Topic: Discussion/Design focused Connectathon Tracks?
Cooper Thompson (Sep 13 2021 at 13:45):
We have a massive number of connectathon tracks this cycle. If previous connectathons are any indication, at least some of these tracks will be discussion focused, and involve little, if any, actual connection testing. That style of track seems problematic, as if you classify a design call as a connectathon activity, then you need to exclude anyone not signed up for the connectathon.
It seems like these sorts of connectathon tracks are really just a mini use-case focused WGM vs. the committee-focused WGM we have next week.
I wonder what folks think of this. Should we make more of an effort to limit connectathon tracks to only those that are going to do actual testing? Or should we open up the connectathon discussion sessions for non-registered participants somehow? Or maybe split the connectathon into two parts, one that is connection testing, and one that is design focused (i.e. a mini, use-case focused WGM)?
Yunwei Wang (Sep 13 2021 at 13:55):
I think the trend to use connectathon as a discussion forum is concerning. The purpose of a connectathon is to connect client and server implementers to test the feasibility of an IG. Discussion/design sessions fit better in regular WGM quarters or weekly conference calls.
Yunwei Wang (Sep 13 2021 at 14:01):
FMM2 requires an artifact to be tested in a semi-realistic setting, such as a connectathon. Developers should not use a "discussion" connectathon as an easy path to work around the FMM requirement.
Cooper Thompson (Sep 13 2021 at 14:02):
I do like the use-case focused part of the connectathon though. Moving that into WGM has a similar problem, where you have to exclude folks who aren't registered. Having a set of dedicated days for focusing on a project that typically only has weekly calls is nice.
Yunwei Wang (Sep 13 2021 at 14:04):
One or two breakout sessions are helpful.
Cooper Thompson (Sep 13 2021 at 14:08):
Yup, but are you allowed to invite non-registered folks to breakout sessions?
Josh Mandel (Sep 13 2021 at 15:46):
It seems like these sorts of connectathon tracks are really just a mini use-case focused WGM vs. the committee-focused WGM we have next week.
@Cooper Thompson this is extremely well put, and you're right to bring attention to it. (And the corollary which you hint at as well: there are so many tracks that it's impractical for many people to participate in the set of things they might really want/need to.)
Josh Mandel (Sep 13 2021 at 15:50):
I'm not sure what to do with this at the "macro" level. In discussions with @Gino Canessa we've mused about what a different "connectathon" process might look like, perhaps similar to what we did (informally!) on SMART Scheduling Links (i.e., running a "connectathon" on our own schedule, taking a few days to bring the community together for testing/implementation during a week when there weren't dozens of other tracks competing for time).
At the "micro" level it might mean we should be scheduling more WGM quarters for these kinds of discussions, for workgroups/specs. For example, FHIR-I has some quarters to spare that'd be suitable for higher-level design discussions if there's community interest or topics identified.
Lloyd McKenzie (Sep 13 2021 at 16:46):
The FMG definitely intends connectathon to be used for 'testing', and any discussion should be limited to the technical merits of the portions of the spec being tested - by those who are testing it. Sandy has tried to give direction to that effect. Our shift to requiring earlier registration was in part to ensure attendees had the opportunity to ramp up and come prepared to test, as last-minute sign-ups tended to be primarily people who were interested in discussion/presentation, not actually implementing. We could go further and say "design decisions SHALL NOT be made at connectathon", but that might be overly strict, as there are certainly situations where the group of people involved in the testing is exactly the right group to make decisions. (And the reality is that all decisions are subject to vetting through Jira and voting.) Suggestions on how to ensure that connectathons remain true to their primary purpose of testing are welcome...
Gino Canessa (Sep 13 2021 at 17:10):
Yes - I am planning on writing something up with a bit more detail, but the primary issue I was focusing on is that with so many tracks, people are not able to participate well. In large part, I think this is a symptom of changing to the virtual format but still using the structure of the physical meetings.
The starting point for me is my experience: there are at least 3 tracks that I would like to participate in this week, beyond the one I'm hosting. At the same time, most of the participants in my track are attending multiple tracks... up to seven according to the survey. This means they'll have far less attention and time on any given track.
In person, this wasn't really an issue. People weren't interested in spending a weekend in a ballroom just for a couple of hours of meetings that could be scheduled during the WGM. It was also easy to grab someone for an hour or two from another track because we were all in the same room.
For a proposal, I was thinking about an 'event' that is several weeks long, but capped at one or two tracks on any given day. For many/most people, it is still just a few full days - either focusing deeply on the reduced set of tracks they are interested in or spending an hour or two across many tracks.
A setup like that takes advantage of the fact we are virtual by letting people pop in and out of what they need.
I think that there should be very little extra overhead in running it like that, since the actual number of attendees shouldn't change much from the current format. It spreads out time on some types of issues (e.g., I can't connect), but also lowers the overall load at any given time so would hopefully be a wash.
Thoughts?
Lloyd McKenzie (Sep 13 2021 at 18:41):
Hmm. That would lessen the load on some, but significantly increase the load on others. It would also make cross-track joint sessions harder. I think the biggest challenge would be HQ - would love to see a screenshot of Sandy's face when she sees the proposal for a several-week connectathon :)
I'm not sure that the issue is only about virtual. The reality is that as the size of the community grows, the number of interesting things happening also grows, making it harder and harder to participate in everything you care about - and prioritizing can get hard. My guess is that at least 1/3 of the current participants suffer from the same problem you're having Gino. But if you asked them "Would your boss be willing to give up 20 work days of your year for connectathon instead of the 6 you give up now", few would get a positive answer. (Keeping in mind that we've also got 15+ days a year for WGMs and DevDays too.)
That said, more 'smaller' events might be an option.
Mary Ann Boyle (Sep 13 2021 at 18:43):
@Lloyd McKenzie my face too. :smiling_face:
Yunwei Wang (Sep 13 2021 at 18:51):
That shows the difference between the testing part of connectathon and the discussion part of connectathon. My own experience is that when I join as a software developer, my goal is to test my system as thoroughly as possible and fix as many bugs as I can. There is no time to think about other tracks. But when I put my designer hat on, my goal is to exchange ideas which may affect my product. In that case, I would like to participate in as many related tracks as possible.
I think my experience echoes the original questions raised by Cooper:
1) Should we separate the connectathon for developers from the connectathon for designers, since they target different people (or people wearing different hats)?
2) If we do separate them, how do we include broader attendees in the discussion/design part? I think one reason to have design tracks at connectathon is that connectathon has some advantages over weekly WG meetings: preset topics, a dedicated time frame, and broader participation from beyond the regular WG attendees.
Lloyd McKenzie (Sep 13 2021 at 18:57):
Our connectathons have long had the implementers being the designers. I.e. you try it out, you see what breaks, you brainstorm how to get around the break, then you try that. I don't think we want to change that ethos. And the reality is that if you want to be part of that process, you pay to participate. There are lots of other ways to participate in general design discussions outside connectathon. What we should be trying to minimize is design discussions that aren't accompanied by code that exercises the designs. We shouldn't be targeting any tracks except introductory ones at people who aren't bringing code.
Gino Canessa (Sep 13 2021 at 19:09):
Yeah, it's not a proposal without challenges =). That said, I would hope that the same content over a longer time period wouldn't cause Sandy and Mary Ann much (or any? =) additional work. If the same number of tracks are happening, theoretically it's roughly the same amount of 'admin'. Even with all things equal, I realize there would be additional context switching (e.g., 10 minutes a day vs. 1 hour).
To offset any additional work, I propose sending them chocolate and/or alcohol =)
Josh Mandel (Sep 13 2021 at 20:12):
(I think a side benefit of smaller and more discrete tracks spread over time would be more of a self-service administration model as well; when I ran the SMART Scheduling Links event earlier this year, I didn't ask for any administrative assistance --- although I did get some from ONC, which was most welcome :-))
John Moehrke (Sep 13 2021 at 20:17):
I think the current experience shows a "new" need. That new need has found its best fit in coming to FHIR Connectathon and observing. This new need used to be satisfied by the HIMSS Interop Showcase; but (a) HIMSS hasn't been able to happen, (b) HIMSS never tried to create an Interop Showcase, and (c) the HIMSS Interop Showcase became overly polished and thus stopped feeling like reality.
Lloyd McKenzie (Sep 13 2021 at 20:25):
We very much don't want FHIR connectathons to turn into any kind of showcase. One of the key notions is "What happens at FHIR connectathon stays at FHIR connectathon". We don't want polish, we want bleeding edge and open to being refactored over a weekend. If our participants become afraid of sharing something that might break, we're in deep trouble...
John Moehrke (Sep 13 2021 at 20:32):
I fully agree, but I don't think we have that under the current situation.
John Silva (Sep 14 2021 at 01:14):
Does the IHE Connectathon model have any practices that would be good to emulate and/or borrow? The time(s) I attended it seemed like there were very good "rules" for testing and validating against whatever "WIP specs" a particular track was working towards and the read-out was pretty much "automatic" (by the s/w that kept track of the testing results). Yes, there were people (including myself at one time) that were there more as a 'designer/onlooker' than coding participant but that was useful as well.
Grahame Grieve (Sep 14 2021 at 06:07):
the IHE connectathons have a different and higher purpose, so we don't always copy what they do, but we're sure not averse to learning from other people's experiences
Eric Haas (Sep 14 2021 at 06:29):
The obvious fix, to me at least, is to limit the number of tracks and require more skin in the game to participate. Josh's suggestion is a lot more workable, less drastic, and easier to manage than my "survivor" ideas (proof of work to participate, and a lottery or shark-tank approach to limit track numbers :-) ).
Michele Mottini (Sep 14 2021 at 12:01):
FHIR connectathons to turn into any kind of showcase.
This is already happening: I got strong pushback on reporting negative results, and people wanted a 'nice' end report (with polished demos, back when we were still in-person).
Paul Church (Sep 14 2021 at 15:19):
And your testing is typically the most useful because you actually report negative results! As far as I'm concerned the track leads need to get their priorities straight.
Lloyd McKenzie (Sep 14 2021 at 19:46):
@Sandra Vance - @Michele Mottini's issue needs to be part of track lead training. Negative results are encouraged and welcomed. It's good to find out if/where things fail so we can fix them. Connectathon is not about having a pretty report-out, and it's expected and appropriate for some of what's demonstrated to have bugs or be wobbly. Also, we should provide caution about recording everything. Recording should be optional, and anyone who wants to demo without being recorded should be free to do so.
Sandy Vance (Sep 14 2021 at 20:50):
Gino Canessa said:
Yes - I am planning on writing something up with a bit more detail, but the primary issue I was focusing on is that with so many tracks, people are not able to participate well. In large part, I think this is a symptom of changing to the virtual format but still using the structure of the physical meetings.
The starting point for me is my experience: there are at least 3 tracks that I would like to participate in this week, beyond the one I'm hosting. At the same time, most of the participants in my track are attending multiple tracks... up to seven according to the survey. This means they'll have far less attention and time on any given track.
In person, this wasn't really an issue. People weren't interested in spending a weekend in a ballroom just for a couple of hours of meetings that could be scheduled during the WGM. It was also easy to grab someone for an hour or two from another track because we were all in the same room.
For a proposal, I was thinking about an 'event' that is several weeks long, but capped at one or two tracks on any given day. For many/most people, it is still just a few full days - either focusing deeply on the reduced set of tracks they are interested in or spending an hour or two across many tracks.
A setup like that takes advantage of the fact we are virtual by letting people pop in and out of what they need.
I think that there should be very little extra overhead in running it like that, since the actual number of attendees shouldn't change much from the current format. It spreads out time on some types of issues (e.g., I can't connect), but also lowers the overall load at any given time so would hopefully be a wash.
Thoughts?
Gino I would be interested in hearing your thoughts on this. I have brought up the idea of a more distributed schedule multiple times but the limiting factors are typically bandwidth of those participating (stepping away from their 9-5 gigs for a week or more at a time) and that the community of track leads / FHIR experts is limited. This population is growing so maybe there is a way. And with more advanced testing tools now there is a lot that can be done in one's own time without the presentation style that we use today. Let's talk soon!
Matt Varghese (Sep 14 2021 at 20:53):
On a similar note, I want to ask what really qualifies as connectathon testing.
Some part of me wants to propose that testing against a reference implementation does NOT qualify as connectathon testing for balloting purposes. A reference implementation is made by the creators of the spec, or is created in accordance with how the spec proposes / assumes the workflow is conducted. And so discrepancies between those assumptions and how things exist in the real world get missed?
Sandy Vance (Sep 14 2021 at 20:54):
Lloyd McKenzie said:
Hmm. That would lessen the load on some, but significantly increase the load on others. It would also make cross-track joint sessions harder. I think the biggest challenge would be HQ - would love to see a screenshot of Sandy's face when she sees the proposal for a several-week connectathon :)
I'm not sure that the issue is only about virtual. The reality is that as the size of the community grows, the number of interesting things happening also grows, making it harder and harder to participate in everything you care about - and prioritizing can get hard. My guess is that at least 1/3 of the current participants suffer from the same problem you're having Gino. But if you asked them "Would your boss be willing to give up 20 work days of your year for connectathon instead of the 6 you give up now", few would get a positive answer. (Keeping in mind that we've also got 15+ days a year for WGMs and DevDays too.)
That said, more 'smaller' events might be an option.
Ha! Lloyd - you're right. I LOVE connectathons as an event but the current structure is breaking the backs of the meetings services folk. We need to find a way that doesn't require so much manual intervention to get people what they need to 1. Access and understand the scenario to be tested and 2. Connect to partners to do meaningful testing. I believe a distributed schedule COULD work - if we re-invent the way we connect people so that we do not require as much administrative minutia to keep the testing going.
Michele Mottini (Sep 14 2021 at 20:56):
Some part of me wants to propose that testing against a reference implementation does NOT qualify as connectathon testing for balloting purposes.
Testing of a non-reference implementation client/server against a reference implementation server/client should definitely count
Matt Varghese (Sep 14 2021 at 20:57):
Michele Mottini said:
Testing of a non-reference implementation client/server against a reference implementation server/client should definitely count
It counts for the non-reference implementation validating their implementation against the spec. But it doesn't really count for the validation of the spec itself, for the reason I stated?
Michele Mottini (Sep 14 2021 at 20:59):
It counts - some real-world system was able to implement the specs and use them
Sandy Vance (Sep 14 2021 at 20:59):
John Moehrke said:
I fully agree, but I don't think we have that under the current situation.
Which parts preclude this John? Track Highlights? Recorded Sessions? IG Overviews? Other things? I agree - as FHIR evolves we need a place to demo things but this isn't it. So perhaps if we work on that part we could shift the pieces that hinder the grit to a different "venue".
Lloyd McKenzie (Sep 14 2021 at 21:01):
@Matt Varghese I presume you're asking about FMM2. For those purposes, a reference implementation absolutely counts. What we're trying to check at that level is "Have 3 independent sets of implementers been able to take the specification, write code that implements it, and have those implementations talk to each other?" None of the implementations need to be 'real' or even be drawn from the community that's intended to implement in the end, though we'd certainly raise eyebrows at a project where there are no 'real' implementers willing to step up to the table at that point in the maturity. However, if a single author creates both a client and a server reference implementation, that counts as only '1' implementation. All implementations need to be developed independently of each other by separate people without close coordination during development. Otherwise, you're not actually testing the spec.
Lloyd McKenzie (Sep 14 2021 at 21:02):
The expectation for 'real' implementation manifests in FMM4 and FMM5 of the maturity scale.
Matt Varghese (Sep 14 2021 at 21:03):
I'll give the example of the struggle I am having with CRD.
The CRD reference implementation has rather simple rules that can respond to an order-sign hook with whether prior authorization is required or not. However, in evaluating whether to implement CRD, we realized issues similar to what another EHR implementer also independently realized in this thread: https://chat.fhir.org/#narrow/stream/180803-Da-Vinci.20CRD/topic/CDS.20Hooks.20for.20CRD . This means that in the real world, an order-sign hook could not realistically answer whether prior authorization is required.
I think these are thorny issues with the workflow as proposed by the spec, which connectathon testing should have found, but testing with reference implementation probably did not?
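(For readers who haven't worked with CDS Hooks, here is a rough sketch of the kind of exchange being discussed. The endpoint, identifiers, and card wording are illustrative assumptions only; they are not taken from the CRD specification or its reference implementation.)

```typescript
// Illustrative sketch only: rough shape of a CDS Hooks "order-sign" exchange.
// The endpoint, IDs, and card text below are made up for illustration.

// What an EHR might POST to a CRD service when a clinician is signing orders:
const orderSignRequest = {
  hook: "order-sign",
  hookInstance: "d1577c69-dfbe-44ad-ba6d-3e05e953b2ea", // hypothetical call instance ID
  fhirServer: "https://ehr.example.org/fhir",           // hypothetical EHR FHIR endpoint
  context: {
    userId: "Practitioner/123",
    patientId: "456",
    draftOrders: {
      resourceType: "Bundle",
      type: "collection",
      entry: [{ resource: { resourceType: "ServiceRequest", status: "draft", intent: "order" } }],
    },
  },
};

// The kind of card a CRD service could return. Whether the service can reliably
// decide "prior authorization required" at order-sign time is exactly the concern raised above.
const orderSignResponse = {
  cards: [
    {
      summary: "Prior authorization may be required",
      indicator: "warning",
      source: { label: "Example payer CRD service" }, // hypothetical service name
    },
  ],
};
```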
Lloyd McKenzie (Sep 14 2021 at 21:07):
It's absolutely true that testing with real-world systems will identify more issues than testing with proof-of-concept implementations. However, asking 'real' systems to implement something that's at a low level of maturity often isn't super realistic. Doing a first pass for "proof of concept", then getting ballot review and approval, then getting 'real' implementation (and the feedback that comes from it), then iterating seems to be the path that works best for most parts of the community. That doesn't mean that some specs might not have real implementations from day 1 - that's what happened with some of the terminology services stuff, for example. It really depends on the readiness/interest level/resources of the community.
Matt Varghese (Sep 14 2021 at 21:09):
I may be phrasing incorrectly / using terminology wrong here. But my concern is, how can we make sure these kinds of issues are found earlier? My understanding was, that was the purpose of the connectathon?
Otherwise, a spec gets marked as mature to certain degrees without that maturity actually being there / gets regulated prematurely etc.?
John Moehrke (Sep 14 2021 at 21:10):
Sandra Vance said:
John Moehrke said:
I fully agree, but I don't think we have that under the current situation.
Which parts preclude this John? Track Highlights? Recorded Sessions? IG Overviews? Other things? I agree - as FHIR evolves we need a place to demo things but this isn't it. So perhaps if we work on that part we could shift the pieces that hinder the grit to a different "venue".
For purely connectathon-focused sessions, recording should be forbidden. That does not mean there is no place for recorded sessions, but those should be explicitly defined - things like the kickoff or the report-out. During connectathon, the presumption should be that no one wants recording. I was on a call this week where the leader asked if anyone had a problem with recording - a typical thing to do... I pointed out that anyone who is uncomfortable will simply not speak at all, so asking this is not helpful during connectathon. Asking is a fine thing to do during a normal meeting, but it's just not appropriate during connectathon.
Lloyd McKenzie (Sep 14 2021 at 21:11):
IGs will soon start surfacing their maturity on an overall and per artifact level - so you'll be able to distinguish "implemented as proof of concept only" vs. "implemented in the wild".
John Moehrke (Sep 14 2021 at 21:12):
Reference servers should be part of the testing, and testing against them should count... but they must also not be presumed to have been perfectly implemented. I think the struggle many have is presuming that the reference implementation is more normative than the specification text.
Josh Mandel (Sep 14 2021 at 21:14):
To Matt's point, I think spec editors absolutely need to shoot for real-world implementation before considering their specs ready for the real world. That said, editors can't always get this to happen -- e.g., in your example Matt, I can't tell if the failure mode is 1) no real-world systems participated, and so broken stuff moved ahead in the process, or 2) real-world systems shared feedback about stuff that was broken, and the editors ignored it.
If it's (1), then... well, that's an incentive to get involved and share feedback early.
If it's (2), then.... that sounds like a consensus process failure.
Matt Varghese (Sep 14 2021 at 21:20):
We had a discussion about this at the CRD track. And the Track leads showed me some of the use-cases they considered when creating the spec. For those use-cases, what was done with the reference implementation was somewhat reasonable. However, those use-cases were simpler and easier than what is found elsewhere.
And so that is where I think the risk with reference implementations lies:
we're only trying to show it works, not trying to find what could go wrong with it, through a reference implementation
Lloyd McKenzie (Sep 14 2021 at 21:22):
I think Matt's issue may also be "regulators mandated implementation of content that only had FMM2 maturity". And that's a regulation challenge. Regulation and standards are always an uneasy mix, particularly with the FHIR process's STU iteration model that can produce repeated breaking changes over several years, while regulation wants to lock things in early - ready or not.
Lloyd McKenzie (Sep 14 2021 at 21:23):
Perhaps surfacing the maturity will help mitigate this problem some, but the notion that it may take 4-5 years from initial conception of an idea to when it's truly locked down as normative doesn't fly too well either. And then there's the challenge of "will anyone actually implement and detect the places the spec is broken if there's no regulation to make them try?" It's a hard problem.
Josh Mandel (Sep 14 2021 at 21:26):
There is a misperception about regulations here. No regulations demand use of any specific technology implementation guide in this space today. Sometimes specification editors like to use regulatory intent as a motivation to speed up the process, and some amount of this can be healthy but it's important to understand what the actual constraints are. I strongly agree with Matt's point that if reference implementations aren't trying to explore the way specs break in addition to exploring the way specs work, there's something missing.
Josh Mandel (Sep 14 2021 at 21:27):
(I disagree that this is a fundamental limitation of reference implementations. The best reference implementations are designed and deployed in the context of early stage real world use.)
Matt Varghese (Sep 14 2021 at 21:30):
I strongly agree with Matt's point that if reference implementations aren't trying to explore the way specs break in addition to exploring the way specs work, there's something missing.
I agree with this Josh. However, I feel like the process may be set up to incentivize reference implementations that only show it works, and not ones that try to find how it breaks as well?
Josh Mandel (Sep 14 2021 at 21:32):
Ideally, yes. That's a really hard thing to do; I think it's probably important for other real-world implementers to get in the mix and share their feedback in order to complement what we learned from the reference implementations here.
Matt Varghese (Sep 14 2021 at 21:34):
Ok, so is there something we can do here with process to make sure that connectathons actually find these kinds of issues early? (Trying to tie this back to Cooper's original post)
Josh Mandel (Sep 14 2021 at 21:35):
I'd like to understand this better. What's the situation where you participated in a connectathon, generated feedback, and shared it, but it was ignored and things moved ahead despite this, based on faulty or limited conclusions derived entirely from a reference implementation?
Matt Varghese (Sep 14 2021 at 21:40):
No, that is not the case. Rather, the CRD Spec has been through a few connectathons. So I expected a certain level of maturity (especially also given it was almost regulated). I only started coming to the Burden Reduction track last connectathon, and I am evaluating this for implementation, but I'm finding issues that I would have expected to have been already identified. This surprised me, especially since I feel like this has been through a few connectathons?
Josh Mandel (Sep 14 2021 at 21:43):
Realistically, "going through a connectathon" isn't a hard or reproducible metric. You can't just count cycles or participants (sometimes a small track generates fabulous critical feedback). But you can and should raise concerns about maturity during the subsequent consensus process (conference calls, ballot feedback etc) because all of those are gating functions beyond connectathons. Connectathons are just one piece of the maturity story.
Matt Varghese (Sep 14 2021 at 21:45):
Agreed. The reason I made my original point was that I feel like testing against a reference implementation puts you in the mindset of "showing our stuff works", not one of being particularly keen on the question of what could break...
Gino Canessa (Sep 14 2021 at 21:52):
Trying to catch up here, so sorry about jumping around =).
I agree about not recording connectathon sessions - I prerecord 'useful' information and then, during sessions, take notes on what may be relevant to a wider audience. In my experience, people do not share as freely if something is recorded. To that end, I don't even record the 'walkthrough' sessions, since I want people to ask questions instead of worrying about 'how will this look recorded' (again, this can only be done if you are willing and able to record that content separately).
I'm not sure what can realistically be done regarding 'reference' vs. 'prototype' vs. 'real' implementations, other than perhaps noting on report-outs? The point of implementing the spec is to find issues converting the spec to code. Obviously there are some issues that get overlooked if it's the same author doing both (see: me on Subscriptions), but the hope is that the other implementations using the RI will expose those deficiencies. If Reference Implementations no longer 'count' for anything - are people going to invest as heavily in them? Without reference implementations to work against, prototyping is going to be significantly harder for others, and without both I'm not sure how you get to a production system.
Regarding the attitude of trying to break things vs. trying to confirm things... I don't think it's an issue of a specific implementation 'type'. E.g., if you have a group that has a specific use case and has written the spec backwards based on what they are doing, being a production implementation is no higher bar than being an RI. Circling all the way back around to the earlier discussion - this is one of the points I worry about when we're stretched so thin. Having the ability to get broader and more concentrated effort on tracks will improve the amount and quality of feedback.
Notification Bot (Sep 15 2021 at 12:22):
This topic was moved by Josh Mandel to #implementers > v2 to fhir in GCP
Grahame Grieve (Sep 15 2021 at 20:54):
I'm not sure what can realistically be done regarding 'reference' vs. 'prototype' vs. 'real' implementations, other than perhaps noting on report-outs?
a key quality metric for reference implementations is how close to the real world the maintainers are. Reference implementations are a great way to surface engineering issues, but not a good way to surface real-world implementation experience; that very much depends on the leaders, and the choices they make.