FHIR Chat · Feedback Needed: Array Syntax

Stream: shorthand

Topic: Feedback Needed: Array Syntax

Chris Moesel (Jul 15 2020 at 14:34):

FHIR Shorthand uses indexes to support building multi-item arrays. For example, consider this person with three variations of his given names:

* name[0].given[0] = "William"
* name[0].given[1] = "Robert"
* name[1].given[0] = "Willy"
* name[1].given[1] = "Robby"
* name[2].given[0] = "Will"
* name[2].given[1] = "Rob"

This approach works but it is brittle because you (the author) must keep track of the indexes.

If you want to add "Billy Bob" as the second name, you need to add it with index 1 and then manually increment the index of every name after it.
Similarly, if you want to remove one of the names, you then need to decrement every index after it.
God help you if it's an array with 50 items and you realize you mistakenly repeated index 7 (it's happened before).

As part of ballot reconciliation, we have the opportunity to fix this. Here is an approach we've come up with. We welcome your thoughts and suggestions. If you have a better idea, we want to hear it!

KEY:
list[ ] --> last item in list; when list is empty, the equivalent of [+]
list[+] --> add a new item to list
list[n] --> the (n+1)th item ([0] is the first element)

* name[0].given[0] = "William"
* name[ ].given[+] = "Robert"
* name[+].given[0] = "Willy"
* name[ ].given[+] = "Robby"
* name[+].given[0] = "Will"
* name[ ].given[+] = "Rob"

This then allows you to add or remove an element in the middle without having to modify all the elements after it. We would still support omitting the index as a shortcut for [0], but I've used [0] above because it actually makes for better alignment (and easier reading).
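
The KEY above can be sketched as a small path rewriter. This is only an illustration in Python, not SUSHI's implementation: the function name `resolve`, the regular expression, and the choice to treat "last item" as the highest index seen so far under the same parent are all assumptions made for the sketch. It works on rule paths only and ignores the assigned values.

```python
import re

# One path segment: "name[0]", "given[+]", "name[ ]", or a plain "family".
SEGMENT = re.compile(r"^(\w+)(?:\[\s*(\+|\d*)\s*\])?$")

def resolve(paths):
    """Rewrite [ ] and [+] to numeric indexes (hypothetical helper)."""
    highest = {}  # "resolved parent path + element name" -> highest index so far
    resolved = []
    for path in paths:
        parent = ""
        for segment in path.split("."):
            match = SEGMENT.match(segment)
            if match is None:
                raise ValueError(f"cannot parse segment: {segment!r}")
            name, index = match.groups()
            key = f"{parent}.{name}" if parent else name
            if index is None:          # no brackets: leave the segment alone
                parent = key
                continue
            if index == "+":           # add a new item
                n = highest.get(key, -1) + 1
            elif index == "":          # [ ]: last item, or 0 when empty
                n = highest.get(key, 0)
            else:                      # explicit numeric index
                n = int(index)
            highest[key] = max(highest.get(key, -1), n)
            parent = f"{key}[{n}]"
        resolved.append(parent)
    return resolved

soft = [
    "name[0].given[0]", "name[ ].given[+]",
    "name[+].given[0]", "name[ ].given[+]",
    "name[+].given[0]", "name[ ].given[+]",
]
for line in resolve(soft):
    print(line)
```

Because the counters are keyed by the already-resolved parent path, a new `name` item starts with a fresh `given` counter, so the soft-indexed rules resolve to the same paths as the explicit version at the top of the message.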

Chris Moesel (Jul 15 2020 at 14:36):

I'm attaching a slightly more complex example with a snippet of a CapabilityStatement (attaching before and after).
index-example-before.fsh
index-example-after.fsh

Chris Moesel (Jul 15 2020 at 14:38):

Also note that eventually we would want to marry this with a syntax for more easily working with deeply nested paths (such as the with syntax that @Jose Costa Teixeira suggested) -- but that seemed a bit too far from what was balloted to justify slipping it in as a reconciliation.

Gino Canessa (Jul 15 2020 at 15:08):

My concern is trading a compiler error and annoyance for a run-time error and annoyance. Specifically, if you swap any two lines in the example you get incorrect output without any warnings:

* name[+].given[0] = "Willy"
* name[+].given[0] = "Will"
* name[ ].given[+] = "Robby"

In this short example, it's easy to see because of the pattern and formatting - but if there are more fields (or an inconsistent number of fields) you now have incorrect output that builds without errors.

John Moehrke (Jul 15 2020 at 15:21):

I don't like that this proposal is adding procedural logic to a statement based language.
How about you do this with aliases? Have a different alias for each name the person has, manage the increments in the alias file.
Even if we had to add procedural increment support ONLY in the alias declaration file, that would be less of a violation of statement based language from procedural based language.

Chris Moesel (Jul 15 2020 at 15:52):

@Gino Canessa -- I understand your concern, but to be clear -- it's also quite possible to mess up explicit indices and not receive any compiler errors. I guess it's a question of which type of mistake authors are more likely to make.

@John Moehrke -- this is probably silly, but does it feel less procedural if we define item[ ] as "the same item as the previous rule" and item[+] as "the next item" (essentially removing the "verbs" from the descriptions)?

Chris Moesel (Jul 15 2020 at 15:54):

@John Moehrke -- I'm not sure I understand what you're suggesting regarding the aliases. If you have a moment, could you flesh that out for me?

Gino Canessa (Jul 15 2020 at 15:59):

I agree that it is possible, but I believe it is less likely. I believe something in the toolchain now flags if you skip an element. As for redefining one twice, I'm not sure if anything does today, but if nothing does, it should be straightforward to add.

These checks can even be implemented with UI notifications (e.g., flag element redefinition, flag array indexes out of order (check for same index or higher in a block)).

There are still issues from copy pasta, since we know that's how repetitive blocks get added, but with everything implicit I don't believe these errors can be detected anymore.
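
The kind of check described here (flag a redefined element, flag a skipped index) could look roughly like the following Python sketch. It is not taken from any existing toolchain: `lint`, the warning texts, and the decision to report rather than fail are assumptions for illustration, and it only understands explicit numeric indexes.

```python
import re
from collections import defaultdict

INDEXED = re.compile(r"(\w+)\[(\d+)\]")

def lint(paths):
    """Return warnings for repeated paths and skipped indexes (hypothetical)."""
    warnings = []
    seen_paths = set()
    used = defaultdict(set)  # array path without its index -> indexes used
    for path in paths:
        if path in seen_paths:
            warnings.append(f"duplicate assignment: {path}")
        seen_paths.add(path)
        for match in INDEXED.finditer(path):
            # e.g. "name[1].given[0]" contributes to "name" and "name[1].given"
            array = path[:match.start()] + match.group(1)
            used[array].add(int(match.group(2)))
    for array, indexes in used.items():
        missing = sorted(set(range(max(indexes) + 1)) - indexes)
        if missing:
            warnings.append(f"skipped index in {array}: {missing}")
    return warnings

rules = [
    "name[0].given[0]",
    "name[0].given[1]",
    "name[2].given[0]",
    "name[2].given[0]",  # copy/paste slip: same path assigned twice
]
for warning in lint(rules):
    print(warning)
```

With soft indexes there is no explicit number to compare against, which is why checks of this shape have nothing to work with once everything is implicit.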

Jose Costa Teixeira (Jul 15 2020 at 16:16):

One thing that I like about SUSHI is that today I can say (at the top of the file) name[0]=Alice name[1]=Bob name[2]=Chuck and then 300 lines later I can go "forget that, name[1]=Bart".

Jose Costa Teixeira (Jul 15 2020 at 16:18):

If we want to keep that, I'd think that this approach is a special case.
Some suggestions:

Jose Costa Teixeira (Jul 15 2020 at 16:19):

* name[0].given[0] = "William"
* name[ ].given[+] = "Robert"
* name[+].given[0] = "Willy"
* name[ ].given[+] = "Robby"
* name[+].given[0] = "Will"
* name[ ].given[+] = "Rob"

this should not be allowed. Either you're iterating the inner loop or the outer loop.

Jose Costa Teixeira (Jul 15 2020 at 16:39):

I can imagine you could do this

* with name[] do
**   given[] = "Robert"
**   given[] = "Robby"
**   given[] = "Rob"

* with name[] do
**   given[] = "William"
**   given[] = "Willy"
**   given[] = "Will"

Jose Costa Teixeira (Jul 15 2020 at 16:40):

(notice the subtle suggestion to remove the + or the space inside the [])

Keith Boone (Jul 16 2020 at 00:23):

I like Jose's syntax. FHIR Shorthand is supposed to be easy to write, which also means it needs to be easy to read. Having to keep track of the numbers makes it challenging to write, and I can't tell you how many times I've had to add complex logic to keep track of where I started and where I finished in code that should be compartmentalized. I have a set of things to add (extensions) which are for different reasons, and have to pass along how many extensions I've already added to another routine that, depending on the situation, could add more, and both of these separate routines have to somehow return to the caller the total count of what they've added. That's impossible to do in some programming constructs (e.g., declarative languages like XSLT don't let you pass back anything other than the result of the transform, and you cannot modify state). [Yes, I've written FSH from an XSLT.] This would make my life so much simpler.

Chris Moesel (Jul 16 2020 at 01:09):

Yeah, I like the with syntax as well... I just don't think we can/should introduce something like that in the reconciliation phase without balloting it. I thought the proposed array syntax would not be very controversial (and is similar enough to existing syntax) that it might be appropriate for STU1... but I'm starting to have 2nd thoughts about that as well! It may be that we just work with you (the community) on getting these right early on and put them in the next ballot (and have them available in SUSHI prior to that -- just like we did w/ pre-STU1 features). I'll have to talk with @Mark Kramer and the team about this.

Keith Boone (Jul 16 2020 at 15:44):

Chris, while YOU CAN do something like that in the reconciliation phase, it introduces a new requirement for a system interpreting the language. As a result, according to existing HL7 rules, you would need to reballot that particular item (and any other breaking change). If the syntax change were small, you might successfully argue it was a technical correction, but I don't think the with syntax is small as you say. So I'd probably classify that as "consider for future release", even though I like it very much.

Lloyd McKenzie (Jul 16 2020 at 15:48):

That's only true if you're going normative. It's totally fine to make substantive changes during reconciliation and not go back to ballot. The base question is whether the changes are so significant that the validity of the ballot result would be called into question. (I.e. you can't ballot one spec, publish something that looks totally different, and say it passed STU.)

Keith Boone (Jul 16 2020 at 16:07):

Thank you for that clarification. I wasn't aware of that subtlety. That actually helps me in another DSTU ballot where we added a requirement.

Chris Moesel (Jul 16 2020 at 16:11):

> The base question is whether the changes are so significant that the validity of the ballot result would be called into question.

Right, and that's why I felt comfortable considering [ ]/[+] array indexers in reconciliation, but less so something more significant like a nested syntax. But as I noted above, we're leaning toward holding off on the array indexers too... We don't want to introduce something that is controversial.

Kirstine Rosenbeck Gøeg (Aug 27 2020 at 14:47):

What is the status of the proposal? I would certainly support the use of [ ]/[+] array indexers. I am currently implementing questionnaire instances, so some implementation of auto-increment would come in handy :-)

Chris Moesel (Aug 27 2020 at 15:36):

Hi @Kirstine Rosenbeck Gøeg -- that proposal has been folded into an updated proposal that includes support for parameterized RuleSets. See the slides from the most recent FSH Telecon for details: https://confluence.hl7.org/display/FHIRI/FHIR+Shorthand+Minutes+2020-08-13?preview=/91981964/91981967/FHIR%20Shorthand%20Conference%20Call-2020-08-13.pptx

If you don't want to go through the slides, the array syntax proposal is almost the same as what was proposed, except instead of [ ] to represent the current element, you use [=] instead. So it's [+] and [=]. As for implementation, we're currently focused on the FHIR Shorthand 1.0.0 spec finalization and a corresponding release (in the next 2 weeks) and then we'll move on to new features like the array syntax.

Elliot Silver (Aug 27 2020 at 16:48):

@Chris Moesel , how does the proposal handle “resetting” sub indexes? Does it recognize that if we have a[0].b[7], when we go to a[1], that we need to go back to b[0]?

Elliot Silver (Aug 27 2020 at 16:49):

Also, for the parameterized macros proposal, are there concerns with the $ notation conflicting with the code system $ notation?

Chris Moesel (Aug 27 2020 at 17:04):

@Elliot Silver Yes... bumping a higher-level index resets all indexes on childpaths to 0. As for the argument notation ($), that was discussed while I was on vacation, but I assume that we'll change it to something else if we determine it's an issue. But it's probably ok because you would never use an alias in a RuleSet declaration (only in the rules inside the RuleSet itself).

Jose Costa Teixeira (Aug 27 2020 at 17:05):

what happened to the with syntax?

Chris Moesel (Aug 27 2020 at 17:06):

@Jose Costa Teixeira -- nothing yet. I think the with syntax is complementary to these other proposals -- not a replacement for them.

Jose Costa Teixeira (Aug 27 2020 at 17:07):

I think the fact that you can change indexing on several levels in one line makes it complicated: a[+].b[+] is confusing, perhaps only to me...

Elliot Silver (Aug 27 2020 at 17:08):

Ok, so that means we don’t need an explicit “set to zero”.
You wouldn’t define an alias inside a rule set, but I can certainly see you using one, or passing it as a parameter. Can one rule set invoke another?

Jose Costa Teixeira (Aug 27 2020 at 17:09):

the reason for me asking about with is that i'd feel safer nesting withs than this a[+].b[+] thing.

Elliot Silver (Aug 27 2020 at 17:10):

I can see how it’s two different issues. One doesn’t eliminate the need for the other.

Jose Costa Teixeira (Aug 27 2020 at 17:44):

It's not the syntax, but the fact that we have two increments to different levels in one line.

Jose Costa Teixeira (Aug 27 2020 at 17:45):

I think a[+] is fine, a[1].b[+] is fine. just a[+].b[+] (and possible variations) may be too flexible and hard to debug when something goes wrong.

Elliot Silver (Aug 27 2020 at 17:45):

No, that wouldn’t happen, I assume. Only one index gets incremented on each line.

Jose Costa Teixeira (Aug 27 2020 at 17:46):

I'm looking at the ppt, page 9

Elliot Silver (Aug 27 2020 at 17:46):

Agree. a[].b[+] is ok.

Elliot Silver (Aug 27 2020 at 17:49):

Ah, I see. No, we shouldn’t have that, it’s confusing.

Elliot Silver (Aug 27 2020 at 17:51):

I’d rather see an empty array mean 0 or last value, and + be increment (along with reset sub counters).

Jose Costa Teixeira (Aug 27 2020 at 17:51):

Thanks, that is my feedback indeed (that it is confusing).

Elliot Silver (Aug 27 2020 at 17:53):

But it is a separate issue from ‘with’.

Jose Costa Teixeira (Aug 27 2020 at 17:53):

yes.

Chris Moesel (Aug 27 2020 at 18:17):

I think that [+] means [0] if the array is empty because you are essentially adding an element to the array. You had a zero-length array before, so using [ ] or [=] to access the current element is meaningless when there are zero elements. You need the [+] just to create that first element at index 0.

Chris Moesel (Aug 27 2020 at 18:19):

It is also what allows a RuleSet to be used to define arbitrary elements on an array via inserts (see the PPT I linked above if you're not familiar w/ this use). The [+] has to work equally as well at initializing an array and referencing the first element as it does at appending a new element to an existing array.

Elliot Silver (Aug 27 2020 at 18:20):

hmm, right.

Chris Moesel (Aug 27 2020 at 18:20):

Also note that you can still use numeric indexers. We don't need a "reset" indexer; we already have one: [0].

Chris Moesel (Aug 27 2020 at 18:21):

So aside from the RuleSet use case I noted above, if authors are more comfortable explicitly using [0] on the first element of an array and then using [+] for subsequent elements, that's totally OK.

Elliot Silver (Aug 27 2020 at 18:22):

A line with multiple + on it needs to be interpreted as "increment the first counter, reset the others"? it doesn't look intuitive, but I can understand the logic.

Chris Moesel (Aug 27 2020 at 18:24):

I understand @Jose Costa Teixeira's concern that managing the subcontext of each nested array can be confusing -- but I'm not sure there's a great way around that aside from using [0] explicitly if you want to and/or using his with idea. But even with with, I think we'd want to allow the more compact syntax.

Chris Moesel (Aug 27 2020 at 18:28):

> A line with multiple + on it needs to be interpreted as "increment the first counter, reset the others"?

The way I think of a[+].b[+].c[+] is that once you increment a to n, then a[n].b is empty -- so the b[+] adds that first a[n].b element. And once you have a new a[n].b[0] element, its c array is also empty, so you need the c[+] to add and reference its first element. That's probably a terrible way to describe it, but it's actually quite consistent if you think about it.
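
That mental model can be made concrete by building real nested lists instead of tracking counters. The sketch below is an illustration in Python under stated assumptions: `assign` is a made-up helper, every element is modelled as a dict, the leaf value is stored under a made-up "value" key, and only the [+] and [=] indexers are handled.

```python
import re

SEGMENT = re.compile(r"^(\w+)\[(\+|=)\]$")

def assign(root, path, value):
    """Apply one rule such as a[+].b[+].c[+] = value (hypothetical helper)."""
    node = root
    for segment in path.split("."):
        match = SEGMENT.match(segment)
        if match is None:
            raise ValueError(f"cannot parse segment: {segment!r}")
        name, index = match.groups()
        items = node.setdefault(name, [])  # a brand-new parent has no children yet
        if index == "+":
            items.append({})               # add an element and step into it
        elif not items:
            raise ValueError(f"{name}[=] used on an empty array")
        node = items[-1]                   # [=]: stay on the current element
    node["value"] = value

root = {}
assign(root, "a[+].b[+].c[+]", "x")  # creates a[0].b[0].c[0]
assign(root, "a[=].b[=].c[+]", "y")  # appends c[1] under the same a[0].b[0]
assign(root, "a[+].b[+].c[+]", "z")  # new a[1]; its b and c arrays start empty
print(root)
```

No explicit reset step appears anywhere: the freshly appended `a` element is an empty dict, so its `b` array does not exist until `b[+]` creates it.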

Elliot Silver (Aug 27 2020 at 18:28):

Actually, rereading the above -- I don't think RuleSets depend on the + notation. I think my suggestion of [] meaning zero or last would address the issue equally.

Elliot Silver (Aug 27 2020 at 18:29):

Right, I agree that it makes a certain sense, but isn't intuitive.

Chris Moesel (Aug 27 2020 at 18:32):

RuleSet: Foo
* bar[ ] = "hello"

Instance: Something
InstanceOf: SomeResource
* insert Foo
* insert Foo
* insert Foo

If [ ] means 0 or last, then you keep overwriting bar[0] -- because the first time it sets bar[0], then the next two times the last is still bar[0] (since there is nothing to increment it).
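
The difference between the two readings can be checked with a few lines of Python. This is only a sketch of the argument above: the two helper names are invented, and a plain list stands in for the `bar` array.

```python
def set_zero_or_last(array, value):
    """[ ] read as "index 0 if the array is empty, otherwise the last item"."""
    if array:
        array[-1] = value   # overwrite the existing last item
    else:
        array.append(value)

def set_plus(array, value):
    """[+] read as "always add a new item"."""
    array.append(value)

bar_zero_or_last, bar_plus = [], []
for _ in range(3):          # stands in for the three `* insert Foo` rules
    set_zero_or_last(bar_zero_or_last, "hello")
    set_plus(bar_plus, "hello")

print(bar_zero_or_last)
print(bar_plus)
```

Under the "0 or last" reading the array never grows past one element, while the [+] reading yields one element per insert.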

Elliot Silver (Aug 27 2020 at 18:34):

Ah, I hadn’t considered that use.

Chris Moesel (Aug 27 2020 at 18:34):

> Right, I agree that it makes a certain sense, but isn't intuitive.

OK. So then maybe best practice is: a[+].b[0].c[0].

Elliot Silver (Aug 27 2020 at 18:35):

Best practise, but not required, otherwise it messes with your use case.

Chris Moesel (Aug 27 2020 at 18:37):

Or maybe we could make a rule that once you declare a [+] in a path, you can't declare another one in the same path. We'd have to think about if that excludes any meaningful use cases. It might be fine -- and if so, then we need to think about if it is a restriction that makes things more user-friendly or less user-friendly. It sounds like you and @Jose Costa Teixeira would consider it more user-friendly.

Elliot Silver (Aug 27 2020 at 18:43):

Hmm gets interesting when you get to ‘insert RuleSet(a[+])’.

Chris Moesel (Aug 27 2020 at 18:59):

Ha. Yeah. I guess so. But given that parameters are currently defined as doing just straight up string substitution, that would be kind of weird. But I guess you could write your ruleset that way.

Elliot Silver (Aug 27 2020 at 19:27):

RuleSet foo(x):
* $x$[=].value1 = "bar"
* $x$[=].value2 = "baz"

Profile: ...

* insert foo(a[+])
* a[=].value3 = "first value"
* insert foo(a[+])
* a[=].value3 = "second value"

Doesn't look all that weird to me.

Chris Moesel (Aug 27 2020 at 19:28):

No, just weirder. It's all relative I guess. ;-)

Jean Duteau (Aug 27 2020 at 19:33):

That ends up being this...

a[+][=].value1 = "bar"
a[+][=].value2 = "baz"
a[=].value3 = "first value"
a[+][=].value1 = "bar"
a[+][=].value2 = "baz"
a[=].value3 = "second value"

Elliot Silver (Aug 27 2020 at 19:45):

Ah, oops, try without the “[=]” inside the ruleset.

Jean Duteau (Aug 27 2020 at 19:50):

i liken RuleSets to C macros - there are going to be some best practices that even come directly over from the world of macros. One of the rules that I remember is be careful of having operators in your macro parameters. Even without the '[=]', that still might not be doing what you think it's doing:

a[+].value1 = "bar"
a[+].value2 = "baz"
a[=].value3 = "first value"
a[+].value1 = "bar"
a[+].value2 = "baz"
a[=].value3 = "second value"

This would end up with (I used XML notation to show that there are four a's in the array):
<a><value1 value="bar"/></a>
<a><value2 value="baz"/><value3 value="first value"/></a>
<a><value1 value="bar"/></a>
<a><value2 value="baz"/><value3 value="second value"/></a>
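
The expansion can be reproduced with a short Python sketch, assuming (as stated earlier in the thread) that parameters are substituted as plain strings. The names `expand` and `apply_rules`, the `$x$` placeholder handling, and the hard-coded array name `a` are assumptions for illustration only.

```python
import re

# The RuleSet body without the "[=]", with $x$ as the parameter placeholder.
RULESET_FOO = ['$x$.value1 = "bar"', '$x$.value2 = "baz"']

def expand(template, **params):
    """Substitute parameters into each rule by plain string replacement."""
    rules = []
    for line in template:
        for name, value in params.items():
            line = line.replace(f"${name}$", value)
        rules.append(line)
    return rules

RULE = re.compile(r'^a\[(\+|=)\]\.(\w+) = "(.*)"$')

def apply_rules(rules):
    """Build the `a` array from rules that use only a[+] and a[=]."""
    a = []
    for rule in rules:
        match = RULE.match(rule)
        if match is None:
            raise ValueError(f"cannot parse rule: {rule!r}")
        index, field, value = match.groups()
        if index == "+":
            a.append({})
        elif not a:
            raise ValueError("a[=] used before any element exists")
        a[-1][field] = value
    return a

rules = (
    expand(RULESET_FOO, x="a[+]") + ['a[=].value3 = "first value"']
    + expand(RULESET_FOO, x="a[+]") + ['a[=].value3 = "second value"']
)
result = apply_rules(rules)
for element in result:
    print(element)
```

Each substituted `a[+]` is evaluated again on its own line, so every insert contributes two elements rather than one, giving the four-element array shown in the XML.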

Chris Moesel (Aug 27 2020 at 20:28):

Good point, @Jean Duteau. I hadn't caught that, but you're right (of course)!

Jean Duteau (Aug 27 2020 at 20:33):

btw, my point is not that we should add in checks to detect this. I think that, like C macros, we are providing a powerful mechanism for IG FSH authors, but with "great power comes great responsibility".

Keith Boone (Sep 08 2020 at 05:51):

Having written about 5 pages of IG text illustrating a Measure definition using @Jose Costa Teixeira's with syntax, I find myself much in favor of it over the [+][=] syntax. The reason for this is that it creates much more human readable content with less dense code, and shorter lines. It has a hugely meaningful impact on my productivity to a) not have to read all the repetitive gunge, and b) to NOT have to type it.

I'll post a link later today showing how it looks.

Chris Moesel (Sep 08 2020 at 12:59):

Thanks, @Keith Boone. I imagine you still need to use some [+]/[=] syntax, right? I mean, the with construct doesn't fix the issue regarding needing to add or remove something in the middle of an existing long array... But it does make you not need to repeat [=] so many times (or maybe allows you to avoid [=] altogether).

Keith Boone (Sep 08 2020 at 16:30):

Yeah, somewhat, but not often. If I had to prioritize the two, with is more important than [+][=]. I think [=] is still useful b/c copy/paste is faster than setting up a with for a two-three line case.

Mark Kramer (Sep 09 2020 at 16:58):

@Keith Boone when we get to macros, the "with" doesn't have as much power as [+] and [=]. If you consider the CapabilityStatement example in the PPT that Chris mentions above, the ability to combine soft indexing with macros is when serious magic happens. I don't think you can get the same power with "with" but I'm willing to be proven wrong.

Jose Costa Teixeira (Nov 25 2020 at 21:05):

I don't know what is the status for this, but I'd suggest:

using element[0] and element[+] makes sense, I am not so sure we need element[=] or if that can be omitted as default.
for the first time a repeating element is used, element[+] is equivalent to element
the with syntax or a syntax like

* (with) identifier :
** system  =
** value =

would be most interesting for example when handling logical models - it allows us to write exactly what we mean (i.e. indented elements).
Combining both we'd have a powerful way to add a lot of content just by copy-pasting and only changing what is needed:

* with identifier[+] :
** system = http://snomed.info/sct
** value = 12345

* with identifier[+] :
** system = http://snomed.info/sct
** value = 12345

* with identifier[+] :
** system = http://snomed.info/sct
** value = 23456

* with identifier[+] :
** system = http://snomed.info/sct
** value = 34567

Jose Costa Teixeira (Nov 25 2020 at 21:07):

basically the with is an inline macro declaration but cleaner (IMO).

Jose Costa Teixeira (Nov 25 2020 at 21:08):

the example above could perhaps be made leaner, i'm not sure. I don't see what could make my copy-paste easier than this nested/indented syntax

Chris Moesel (Nov 25 2020 at 21:13):

@Jose Costa Teixeira (and anyone else playing along), check out the "Wicked FHIR" presentation over in the FSH School Downloads to see the latest on the soft-indexing, parameterized rulesets, and "with" proposals.

Elliot Silver (Nov 25 2020 at 22:34):

I didn't realize you still had all three of auto-indexing, parameterized rulesets and "with" under consideration. I thought you had decided that parameterized rulesets eliminated the need for at least one of the others.

One thing that may potentially be useful for parameterized rulesets is the ability to specify (part of) the item in the parameters. Consider a Questionnaire where you can have item[] or item[].item[] or item[].item[].item[]. It would be helpful to be able to say:

Ruleset ChooseFromValueSet(myItem, myValueSet)
* {myItem}.type = #choice
* {myItem}.answerValueSet = Canonical({myValueSet})

We should be explicit about how soft indexing works in this case. (Is the index evaluated only once on invocation, or does it repeat every time?)

Your capability statement example in the presentation increments the index in the first ruleset, and all other rulesets just reuse that index. Is there a need to be able to increment the index without using it?

RuleSet SupportResource(resource, expectation)
* rest.resource[=].type = {resource} ...

...
rest.resource[+]
SupportResource(MyResource, MyExpectation)

Last updated: Apr 12 2022 at 19:14 UTC
