Stream: questionnaire
Topic: multi-language questionnaires
Kenneth Myhra (Nov 13 2019 at 07:34):
It was pointed out to us that this section in the specification http://hl7.org/fhir/questionnaire.html#sdc refers to support for multi-language questionnaires, but I'm having trouble locating the multi-language part in the SDC implementation guide. If someone could guide me to it, I would greatly appreciate it.
Also saw this fairly recent post on Translation packs for Questionnaire so maybe there currently isn't any support for that in the SDC? https://chat.fhir.org/#narrow/stream/179255-questionnaire/topic/Translation.20Packs.20for.20Questionnaire
Lloyd McKenzie (Nov 13 2019 at 12:55):
The sdc-questionnaire profile includes a notes section that speaks to the use of translations, however it would be appropriate to at least point to this language from the rendering page. Can you submit a change request? We've also agreed to add support for valueset-based translations of question and other item text.
Kenneth Myhra (Nov 14 2019 at 08:15):
@Lloyd McKenzie with the Translation Packs for Questionnaire mentioned, and some other complaints I noticed about questionnaires getting quite big, does that mean you will move away from the language extension in later versions of the SDC implementation guide?
Lloyd McKenzie (Nov 14 2019 at 13:13):
It's two different design approaches to the same problem. The specification will be driven by the implementer community. If the community consolidates on a single approach, we'll drop the other option. If it turns out that both approaches are useful (presumably in different circumstances/environments), then we'll likely keep both but provide guidance about which should be used when.
Kenneth Myhra (Nov 14 2019 at 13:35):
Thanks, that helps in our discussions regarding aligning with sdc.
I could just briefly add that we have previously solved this by creating a separate Questionnaire for each language, using the language attribute to identify the language plus a unique identifier to tie them together.
Brian Postlethwaite (Nov 14 2019 at 20:52):
With R4 you can use the based on link instead of the identifier association.
Giorgio Cangioli (May 12 2021 at 13:53):
I've been asked how multi-language support to questionnaire can be provided...
.. Am I correct if I suggest to create 'specialized' questionnaire instances derived from a common parent questionnaire
and not to try to rely on the 'translation' extensions, translated designations for coded concepts , and so on ?
Morten Ernebjerg (May 12 2021 at 14:38):
@Giorgio Cangioli That is the approach we are taking, e.g. in this COVID-related IG. For us, it has the advantage that the client using the Questionnaire doesn't have to understand the various separate elements/extensions carrying multi-language info - it just has to get the Questionnaire in the right language and can be language-agnostic from that point on. We typically generate the different language versions from a single template & dictionaries using super-minimal custom templating code.
One possible difficulty with this approach when using a standard FHIR server is that it is currently impossible to search for resources by language - see this thread and the FHIR-32020 JIRA ticket suggesting such functionality.
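A minimal sketch of the template-plus-dictionaries approach described above, assuming the template Questionnaire carries translation keys in item.text and each language has a plain key-to-text dictionary. The function name `localize`, the key names, and the sample content are hypothetical and not taken from the IG mentioned.

```python
import copy

def localize(template, dictionary, language):
    """Return a copy of the template with item.text keys replaced by translations."""
    result = copy.deepcopy(template)   # leave the shared template untouched
    result["language"] = language      # Resource.language identifies the variant

    def fill(items):
        for item in items:
            key = item.get("text")
            if key in dictionary:
                item["text"] = dictionary[key]
            fill(item.get("item", []))  # nested items use the same rule

    fill(result.get("item", []))
    return result

# Hypothetical template: item.text holds a dictionary key, not display text.
template = {
    "resourceType": "Questionnaire",
    "item": [
        {"linkId": "1", "text": "q.symptoms", "type": "group", "item": [
            {"linkId": "1.1", "text": "q.fever", "type": "boolean"},
        ]},
    ],
}
german = {"q.symptoms": "Symptome", "q.fever": "Haben Sie Fieber?"}

questionnaire_de = localize(template, german, "de")
```

Each generated Questionnaire is single-language, so a client only needs to fetch the right one and can stay language-agnostic afterwards.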
Lloyd McKenzie (May 12 2021 at 15:16):
We've talked about ConceptMap as a mechanism to provide item text translation, but I don't know that we've taken it very far
Jose Costa Teixeira (May 12 2021 at 15:25):
that means more questionnaires to maintain...
Jose Costa Teixeira (May 12 2021 at 15:26):
Q1: what's the downside of translation extensions?
Jose Costa Teixeira (May 12 2021 at 15:28):
Q2: Doesn't multiplying questionnaire instances (in Belgium we have 3 languages, IHE Volume may have more languages) make mapping etc much more complicated?
Jose Costa Teixeira (May 12 2021 at 15:29):
Thanks @Giorgio Cangioli and @Morten Ernebjerg for bringing it up, I was not aware of that choice, so this is my first reaction
Lloyd McKenzie (May 12 2021 at 15:53):
The downside of translation extensions is that the people creating the translations are often not the people who have authority to update the Questionnaire.
Jose Costa Teixeira (May 12 2021 at 20:34):
correct.
I think we should come up with a more friendly way to handle translations. Like a resource file and the corresponding mechanisms associated with it..?
Jose Costa Teixeira (May 12 2021 at 20:35):
(I was also thinking that we could have an extension to the questionnaire to list the other questionnaires that are translations of the first one, but that only makes the problem easier to track, not easier to avoid...)
Lloyd McKenzie (May 12 2021 at 21:41):
I think the notion is that ConceptMap would work something like the resource file you're envisioning - though you'd need a set - one for the item text and one for each of the answer text codes.
Carl Leitner (May 13 2021 at 02:07):
Jose Costa Teixeira said:
Q2: Doesn't multiplying questionnaire instances (in Belgium we have 3 languages, IHE Volume may have more languages) make mapping etc much more complicated?
WHO has six official languages
Carl Leitner (May 13 2021 at 02:12):
It would be great to be able to take advantage of the existing tooling for translations (e.g. .po/.pot files)
Grahame Grieve (May 13 2021 at 02:24):
what are those?
Carl Leitner (May 13 2021 at 03:00):
See https://en.wikipedia.org/wiki/Gettext What's nice is that they get the grammar right for plurals across multiple languages (look at discussion of Plural Form). List of implementations is here https://www.gnu.org/software/gettext/manual/html_node/List-of-Programming-Languages.html#List-of-Programming-Languages. I have used https://translations.launchpad.net/ to manage .po and .pot curation for a few projects.
Carl Leitner (May 13 2021 at 03:02):
Supported in transifex too https://docs.transifex.com/formats/gettext (though I don't have direct experience)
Grahame Grieve (May 13 2021 at 05:03):
@Mark Iantorno
Tilo Christ (May 13 2021 at 07:15):
Carl Leitner said:
It would be great to be able to take advantage of the existing tooling for translations (e.g. .po/.pot files)
Yes! +1 for .po/.pot (or XLIFF)! It would be great if questionnaire translation was directly compatible with industry standard tooling (not conceptually similar, but really identical). po/pot and XLIFF files can be successfully translated by lay people with POEdit and are also supported by numerous other tool chains, plus all the home-grown toolchains in enterprises (for all the steps like grammar checking, style checking, English-English translation, English->Target translation, verification and clinical validation by native speakers, etc.). po/pot also has very mature mechanisms for pluralization (it can deal with more complex rules than just 0, 1, >1. Many languages have a concept of "few", some have special plurals for anything that ends with the digit 3, etc.).
Oliver Egger (May 13 2021 at 08:56):
thanks for these links! that looks interesting. did you already use that in a questionnaire setup? use a base language for the questionnaire and then on the fly translate it with the corresponding .po/pot/XLIFF file? another possibility would be that such a translation could be offered from the ig publisher I assume.
Jose Costa Teixeira (May 13 2021 at 09:06):
I think that once the translation mechanism is there, we could add it to the IG Publisher.
Jose Costa Teixeira (May 13 2021 at 09:16):
Draft idea (with the information I have now - can someone give feedback/improve this?):
- One questionnaire
- Multi-language = using a standard translation extension.
- For external translations: include other translations in .po/.pot / XLIFF or similar mappings
- The IG Publisher could gain one feature that takes such translations and adds the corresponding translation extension, therefore resulting in the same standard approach for translations (which means we only change the publisher and templates in one way).
Mark Iantorno (May 13 2021 at 13:18):
I'll make a point to look into this next.
Carl Leitner (May 13 2021 at 15:57):
@Oliver Egger have not used that for questionnaires. Right now we're preparing a multilingual questionnaire in Vietnam. @Jose Costa Teixeira anything here that would make sense to look at yet?
Jose Costa Teixeira (May 13 2021 at 16:00):
The VN questionnaires are using the translation extension. Not one questionnaire per language.
Perhaps those questionnaires could be pilots for an upcoming translation feature..
Elliot Silver (May 13 2021 at 16:15):
There are also multi-language discussions today in #terminology. It might be helpful to bring questionnaire, terminology, and potentially other focus areas together for a more general discussion of the issues rather than each coming up with a different solution.
Jose Costa Teixeira (May 13 2021 at 16:31):
Agree. The questionnaire does contain ValueSets, so hopefully there's not a gap in approaches there.
And for all purposes, my asymptote is multi-language IGs with questionnaires and forms and all
Elliot Silver (May 13 2021 at 16:37):
So you mention multi-language IGs, which brings up the question of general multi-language resources. This is why I wonder if a broader solution is needed. Do we want a new "Translation" resource that can be applied to any other resource in a way that allows editing of the translation independent of editing of the translated resource? What are the implications of this? (Details of multi-language support is not my forté, but I see the discussions swirling and think that a unified approach might be needed.)
Jose Costa Teixeira (May 13 2021 at 16:47):
https://chat.fhir.org/#narrow/stream/179166-implementers/topic/Representing.20translations
Jose Costa Teixeira (May 13 2021 at 16:50):
I think the idea is to leverage from current tools and practices, and not ask people to create translated resources, but to just provide the translation
Jose Costa Teixeira (May 13 2021 at 16:51):
i.e. a Dutch author would author a resource in Dutch (sushi or JSON or XML), and a French translator would not need to author anything. They would get a list of things to translate and provide the translations. The system would create the final artifact (which I presume would be the same questionnaire but with extensions)
Jose Costa Teixeira (May 13 2021 at 16:51):
The same thing could work for valuesets and codesystems
Jose Costa Teixeira (May 13 2021 at 16:52):
And IG narrative content.
Jose Costa Teixeira (May 13 2021 at 16:52):
and IG templates
Morten Ernebjerg (May 13 2021 at 19:41):
I like the idea of having a standard way of supplying & applying translations (we also used a version of this approach). From my POV, if there are standard (i.e. non-FHIR) solutions with good existing tooling, that would be preferable to starting from scratch with a new resource. From my experience, I think it would make a lot of difference to the technical writers/translators if they have tools to work with without having to know much about FHIR. Existing tooling could perhaps also relatively easily be built into servers like HAPI so that one could do the translation behind the scene & just request the resource in the required language (speaking of an integrated approach: I think the need to be able to query for a resource by language is also part of this).
I suppose one open question is whether one could use Questionnaires out-of-the-box as templates with the chosen approach, e.g. by simply putting keys in the required text fields (we did that). For value sets, I guess one question is how to combine this approach with the current approach of designations (or code system supplements) for different languages & doing language-specific expansions of VSs (cf. @Elliot Silver's call for a unified discussion).
Elliot Silver (May 13 2021 at 20:14):
How about putting the translation files into a DocumentReference unaltered, and an extension or careful use of existing elements in DocRef pointing back to the questionnaire. Then extensions in questionnaire that identify the key to look up in that file. That way we can add translations without modifying the questionnaire. I wonder if a pattern like this could be reused elsewhere, like vocab.
Lloyd McKenzie (May 13 2021 at 22:01):
Why a DocumentReference? Why not just a Binary?
Elliot Silver (May 13 2021 at 22:05):
Fair enough. Because DocRef came to mind and Binary didn't? Because there might be some use to additional metadata that I can't yet think of? We need to have the file indicate the canonical URL of the questionnaire; can I do this with extensions on Binary?
Elliot Silver (May 13 2021 at 22:12):
How do I find the relevant Binary? I can do reverse search using context.related on the questionnaire if it is DocRef.
Lloyd McKenzie (May 13 2021 at 22:53):
If the Questionnaire points to the Binary, you can just do a 'read'.
Elliot Silver (May 13 2021 at 22:54):
My thought was that the translation file point to the Questionnaire. That way the person doing the translations doesn't need to modify the Questionnaire. Additionally, you can have multiple translations all pointing to the same Questionnaire.
Tilo Christ (May 14 2021 at 09:59):
I agree that the person doing the translation should never have to touch the Questionnaire. Their skills (e.g. "Certified translator for French medical devices" - a non-technical liberal arts degree) and their tooling (AcroLinx, Cosima, translation memory, etc.) would not allow them to do that.
One commonly employed mechanism is that the "thing to be translated" knows the base name of the translation file and then augments that with the desired locale at runtime in order to resolve this to the actual file. In essence this is what browsers are also doing with the good old "Accept-Language" header. That way you would only ever encode the base name into the Questionnaire and not touch it again after creation.
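A minimal sketch of that base-name-plus-locale lookup, assuming a hypothetical naming scheme of `<base>.<locale>.po` and a fallback from a regional locale such as fr-BE to its bare language fr. The file names and both function names are illustrative only.

```python
def candidate_files(base_name, locale, extension=".po"):
    """List translation file names for a locale, most specific first."""
    parts = locale.split("-")
    names = []
    while parts:
        names.append(f"{base_name}.{'-'.join(parts)}{extension}")
        parts.pop()  # drop the last subtag: fr-BE -> fr
    return names

def resolve(base_name, locale, available):
    """Return the first candidate that exists, or None to keep the original text."""
    for name in candidate_files(base_name, locale):
        if name in available:
            return name
    return None

# Only the base name would be stored with the Questionnaire.
available = {"prapare.fr.po", "prapare.nl-BE.po"}
chosen = resolve("prapare", "fr-BE", available)
```

The Questionnaire never needs to change when a new language is added; only a new file appears next to the existing ones.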
John Manning (May 14 2021 at 11:39):
@Tilo Christ, that's similar to what we did with our Flutter remake of the PRAPARE Questionnaire.
Granted, we connected our translation to the question based on its linkId, everything was preset at compile time.
I know this is a bit more Flutter specific, but the package we used connects to a single Google spreadsheet so that, from the perspective of the translator, all they need to review is the content within a single column.
There are probably many ways to make this more extensible for other approaches and tech stacks. The spreadsheet approach gave us access to =GOOGLETRANSLATE(), which was super useful in retrospect.
Jose Costa Teixeira (May 14 2021 at 12:11):
Once that translation is done, I propose it should be quickly integrated in the publication mechanism.
Jose Costa Teixeira (May 14 2021 at 12:13):
the people that do the translation should not need to edit the questionnaire JSON / XML... but they should be able to see the translation to see if the translation is fit for the context
Tilo Christ (May 14 2021 at 12:17):
John Manning said:
Tilo Christ, that's similar to what we did with our Flutter remake of the PRAPARE Questionnaire.
Granted, we connected our translation to the question based on its linkId, everything was preset at compile time.
I know this is a bit more Flutter specific, but the package we used connects to a single Google spreadsheet so that, from the perspective of the translator, all they need to review is the content within a single column.
There are probably many ways to make this more extensible for other approaches and tech stacks. The spreadsheet approach gave us access to =GOOGLETRANSLATE(), which was super useful in retrospect.
Across all the different technology stacks the two formats that are universally being used are gettext (po/pot) and XLIFF (typically the older 1.2 version). The Flutter library for gettext would be "https://pub.dev/packages/i18n_extension".
The machine-based pre-translation certainly shaves time off the translation effort. You can use tools like "POEdit" or CrowdIn which will tie in with Google Translate, DeepL, etc.
Tilo Christ (May 14 2021 at 12:19):
Jose Costa Teixeira said:
the people that do the translation should not need to edit the questionnaire JSON / XML... but they should be able to see the translation to see if the translation is fit for the context
With commonly used tooling they would see the original text (e.g. English), their translation, the order of the original texts, and potentially a human-readable hint (such as "needs to fit on a small button, keep it short")
Lloyd McKenzie (May 14 2021 at 13:28):
I presume that context matters for translation - what would we need to do to ensure that the hierarchy of questions, association of answers to questions, etc. manifests appropriately in the translation tools?
Tilo Christ (May 14 2021 at 13:39):
Lloyd McKenzie said:
I presume that context matters for translation - what would we need to do to ensure that the hierarchy of questions, association of answers to questions, etc. manifests appropriately in the translation tools?
Translation tools are inherently linear (think "glorified Excel sheet"). When I had the challenge of presenting a hierarchical questionnaire in a linear fashion for my vertically scrolling form filler I used the traversal order called "pre-order" in order to linearise it. I think the same would work for translations.
See: https://en.wikipedia.org/wiki/Tree_traversal#Pre-order,_NLR
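A minimal sketch of that pre-order linearisation, assuming the Questionnaire is available as parsed JSON with nested `item` arrays. The function name `preorder` and the sample items are hypothetical.

```python
def preorder(items, depth=0):
    """Yield (depth, linkId, text) for each item, parent before its children."""
    for item in items:
        yield depth, item["linkId"], item.get("text", "")
        yield from preorder(item.get("item", []), depth + 1)

questionnaire = {
    "resourceType": "Questionnaire",
    "item": [
        {"linkId": "1", "text": "Allergies", "item": [
            {"linkId": "1.1", "text": "Substance"},
            {"linkId": "1.2", "text": "Reaction"},
        ]},
        {"linkId": "2", "text": "Medication"},
    ],
}

rows = list(preorder(questionnaire["item"]))
```

The resulting rows come out in the same top-to-bottom order as a vertically scrolling form, which is the order a translator would see in a linear tool.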
Jose Costa Teixeira (May 14 2021 at 15:31):
My understanding is that the translation is done by expression where the text is not unique. For example, let's say the text "done" appears as a caption in a button, or in a question as a status of a procedure. This will translate into different expressions in other languages. So these would be different entries to translate
Jose Costa Teixeira (May 14 2021 at 15:39):
Lloyd McKenzie said:
I presume that context matters for translation - what would we need to do to ensure that the hierarchy of questions, association of answers to questions, etc. manifests appropriately in the translation tools?
My preference would be for those people to immediately see the result in the form (+1 for the form preview). At least context mismatches will be spotted that way.
Tilo Christ (May 14 2021 at 17:17):
Jose Costa Teixeira said:
Lloyd McKenzie said:
I presume that context matters for translation - what would we need to do to ensure that the hierarchy of questions, association of answers to questions, etc. manifests appropriately in the translation tools?
My preference would be for those people to immediately see the result in the form (+1 for the form preview). At least context mismatches will be spotted that way.
"Those people" get paid by the word and won't be bothered to learn/use somebody's tool. They have built up muscle memory for their existing highly streamlined tool chain.
Regarding the context mismatches (same word needing different translation depending on context): that is indeed a real-world issue. The solution depends on the format you are using:
- gettext (po/pot): Each text string can be annotated with an optional context. context + string will have to be unique.
- XLIFF: Each text string is accompanied by a unique identifier
For informal context (contributing to background knowledge, rather than key uniqueness), both formats support adding a human-readable comment to each translation unit.
I think maybe this extension http://hl7.org/fhir/StructureDefinition/valueset-comments could be used to fill such a human-readable comment.
Carl Leitner (May 14 2021 at 18:24):
agreed with @Elliot Silver that we should only have one Questionnaire here. In the .po/.pot world, we would use the Questionnaire to generate a .pot file (translation template). Then each language (ideally locale) would have its own .po file containing the translations. Note, that one way that I have dealt with the context issue is to group the translatable text based on software module - the analog to this here would be at the Questionnaire level.
Translation services such as launchpad.net have in-built suggestion / curation features. See here for some examples:
https://translations.launchpad.net/ihris-manage/trunk/+pots/ihris-manage-person-position
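A minimal sketch of generating a .pot translation template from a Questionnaire, assuming item.text is the only translatable element and that the item's linkId is used as the gettext context (msgctxt), so the same source string can appear twice with different translations. The function names and linkIds are hypothetical; answer options and contained value sets are left out.

```python
def po_escape(text):
    """Escape a string for use inside a double-quoted PO string."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")

def walk(items):
    """Visit items in document order, parents before children."""
    for item in items:
        yield item
        yield from walk(item.get("item", []))

def to_pot(questionnaire):
    """Build the text of a .pot file with one entry per item that has text."""
    lines = ['msgid ""', 'msgstr ""',
             '"Content-Type: text/plain; charset=UTF-8\\n"', ""]
    for item in walk(questionnaire.get("item", [])):
        if "text" not in item:
            continue
        lines.append(f'#. Questionnaire item {item["linkId"]}')  # hint for translators
        lines.append(f'msgctxt "{po_escape(item["linkId"])}"')   # keeps entries unique
        lines.append(f'msgid "{po_escape(item["text"])}"')
        lines.append('msgstr ""')                                # filled in per .po file
        lines.append("")
    return "\n".join(lines)

questionnaire = {
    "resourceType": "Questionnaire",
    "item": [
        {"linkId": "confirm.done", "text": "Done"},
        {"linkId": "procedure.status", "text": "Done"},
    ],
}
pot = to_pot(questionnaire)
```

Each language would then get its own .po file derived from this template, with the msgstr lines filled in by the translator.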
Lloyd McKenzie (May 14 2021 at 18:36):
Would we then update the Questionnaire from the .po files to embed the translation extensions for use by tools that render and allow filling out the Questionnaire?
Elliot Silver (May 14 2021 at 18:42):
In my vision, the Questionnaire would be single language as created by the author. Translation-aware clients would retrieve the Questionnaire and the translation file, and use the translation file when displaying the Questionnaire.
Lloyd McKenzie (May 14 2021 at 18:53):
That seems like a big ask for questionnaire clients to know how to do... (Understanding the translation extensions is a bit of an ask, but it's less and would work using standard FHIR libraries.)
Elliot Silver (May 14 2021 at 18:55):
Maybe it's a capability of the server -- if you ask for a questionnaire, and provide a locale in the HTTP header, the server will do the translation if available. Also, we've got other significant expectations of SDC actors, so is this actually too much of an ask?
Tilo Christ (May 14 2021 at 19:14):
I would agree with the preference for an approach to keep the client "dumb" and ask the server to provide a translated questionnaire.
I am not sure that supporting non-FHIR translation files is inherently harder than supporting the FHIR translation extensions. When you are building a localised client you already have to pull in the libraries to translate the UI anyhow (all the buttons, confirmation dialogs, etc. which do not live in FHIR). In my current UI framework of choice it is 25 LOC to evaluate the FHIR translation extension vs. 4 characters of code to pull in a translation from a .po file. But it does depend a bit on the framework. The Linux, Java, Flutter camps would probably agree with me, whereas the Android colleagues would prefer if it was XLIFF.
Tilo Christ (May 14 2021 at 19:16):
Carl Leitner said:
Note, that one way that I have dealt with the context issue is to group the translatable text based on software module - the analog to this here would be at the Questionnaire level.
I think this is going to work quite well up until the point when you start putting "contained" value sets/code systems into the questionnaire. I can imagine some clashes between those.
Elliot Silver (May 14 2021 at 19:44):
What if translation of the questionnaire is an operation with the target language as a parameter? That would allow easy retrieval of the unmodified resource, as well as a way that the server can indicate support for translation in its capability statement. The place where this runs into difficulties is with other operations on Questionnaire, like defined by SDC. How would you ask for a pre-populated questionnaire in a specific language?
Elliot Silver (May 14 2021 at 19:54):
Lloyd McKenzie said:
That seems like a big ask for questionnaire clients to know how to do... (Understanding the translation extensions is a bit of an ask, but it's less and would work using standard FHIR libraries.)
Tilo Christ said:
I am not sure that supporting non-FHIR translation files is inherently harder than supporting the FHIR translation extensions.
Absolutely, multilingual clients are going to be more effort than unilingual. I disagree with Lloyd that translation extensions are inherently easier than translation files because they are supported by the standard FHIR libraries. The standard FHIR libraries support what the library authors choose to support. If we settle on a mechanism that is viewed as helpful, they'll support it, particularly if it is a pattern that can be reused for things other than questionnaires. It would be great to come up with a solution that could be reused for other definitional resources (ooh, a multilingual IG) and for vocabulary--anything with a canonical url. (I don't think it works as well for other types of resources, but maybe.)
Lloyd McKenzie (May 14 2021 at 19:57):
Translation as an operation makes sense. We'd need to think about validation rules.
Tilo Christ (May 14 2021 at 20:37):
Lloyd McKenzie said:
Translation as an operation makes sense. We'd need to think about validation rules.
How would you see validation rules change with a translated questionnaire? I would assume that all the coded info would be identical; order, cardinalities, etc. identical. Only things like "item.text", "display", etc. would change from one language to another one?
Lloyd McKenzie (May 14 2021 at 20:43):
There's an expectation in a QuestionnaireResponse that item.text in the response matches item.text in the Questionnaire. Similarly 'display' values for codes are expected to match those in the code system (though it's only a warning if they don't match).
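A minimal sketch of that text-matching expectation, assuming both resources are parsed JSON and that items are matched by linkId. The function names are hypothetical, and items nested under answers in the QuestionnaireResponse are ignored for brevity.

```python
def index_text(items, out=None):
    """Map linkId -> text for all nested items."""
    out = {} if out is None else out
    for item in items:
        out[item["linkId"]] = item.get("text")
        index_text(item.get("item", []), out)
    return out

def text_mismatches(questionnaire, response):
    """Return linkIds whose response text differs from the Questionnaire text."""
    expected = index_text(questionnaire.get("item", []))
    found = index_text(response.get("item", []))
    return sorted(
        link_id for link_id, text in found.items()
        if text is not None and expected.get(link_id) != text
    )

original = {"resourceType": "Questionnaire",
            "item": [{"linkId": "1", "text": "Do you have a fever?"}]}
translated = {"resourceType": "Questionnaire",
              "item": [{"linkId": "1", "text": "Avez-vous de la fièvre ?"}]}
response = {"resourceType": "QuestionnaireResponse",
            "item": [{"linkId": "1", "text": "Avez-vous de la fièvre ?"}]}

against_original = text_mismatches(original, response)
against_translated = text_mismatches(translated, response)
```

The same response passes or fails depending on which language variant of the Questionnaire the validator compares it with.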
Tilo Christ (May 14 2021 at 20:45):
Elliot Silver said:
What if translation of the questionnaire is an operation with the target language as a parameter? That would allow easy retrieval of the unmodified resource, as well as a way that the server can indicate support for translation in its capability statement. The place where this runs into difficulties is with other operations on Questionnaire, like defined by SDC. How would you ask for a pre-populated questionnaire in a specific language?
Regarding the proposed "target language parameter": Would this existing part of the spec already suffice? https://hl7.org/fhir/2018May/languages.html##http
This would in essence be a best-effort approach. I am asking the server what I want and it returns the best match that it can and tells me what that is. I think this would not strictly require an enhanced capability statement.
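A minimal sketch of the best-effort matching a server could apply to an Accept-Language header, assuming it knows which languages a Questionnaire is available in. The function names and the fallback rule (strip subtags from the right, fr-BE to fr) are assumptions, not behaviour defined by the specification.

```python
def parse_accept_language(header):
    """Return acceptable language tags from the header, highest q-value first."""
    ranked = []
    for position, part in enumerate(header.split(",")):
        pieces = part.strip().split(";")
        tag = pieces[0].strip()
        if not tag:
            continue
        quality = 1.0  # default weight when no q parameter is given
        for param in pieces[1:]:
            name, _, value = param.strip().partition("=")
            if name == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        if quality > 0:  # q=0 means "not acceptable"
            ranked.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(ranked)]

def best_language(header, available, default):
    """Pick the best available language, falling back to the default."""
    lookup = {lang.lower(): lang for lang in available}
    for tag in parse_accept_language(header):
        tag = tag.lower()
        if tag == "*":
            return default
        while tag:
            if tag in lookup:
                return lookup[tag]
            tag = tag.rpartition("-")[0]  # fr-be -> fr -> ""
    return default

chosen = best_language("nl-BE, fr;q=0.8, en;q=0.5", ["en", "fr"], "en")
```

The server would then return the Questionnaire in the chosen language and could report that language back to the client.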
Elliot Silver (May 14 2021 at 21:06):
Lloyd McKenzie said:
There's an expectation in a QuestionnaireResponse that item.text in the response matches item.text in the Questionnaire. Similarly 'display' values for codes are expected to match those in the code system (though it's only a warning if they don't match).
They still would -- an uploaded QR includes the language of the response, which is validated against the translated Questionnaire; translated display values are validated against translated code systems.
Where I see difficulty is with "enableWhen" and similar elements on the client side, however if they focus only on code values, rather than the display values, it should still work.
Tilo Christ said:
Regarding the proposed "target language parameter": Would this existing part of the spec already suffice? https://hl7.org/fhir/2018May/languages.html##http
This would in essence be a best-effort approach. I am asking the server what I want and it returns the best match that it can and tells me what that is. I think this would not strictly require an enhanced capability statement.
I thought that was in there, thanks for finding it. My reason for suggesting an operation was the ability to control when a translated questionnaire is retrieved, and when the original resource is. I don't know if we are concerned about clients with little control over their HTTP headers.
I'd still like to see something in a capabilitystatement to let you know the server supports this feature (which I think is, if we use separate translation files, beyond what is described in the languages page).
Lloyd McKenzie (May 14 2021 at 22:01):
The issue is that the canonical URL of the questionnaire doesn't change when you translate it, so if a validator retrieves the questionnaire by URL, it's going to get the original version, not the translated version - and the text won't match.
Elliot Silver (May 14 2021 at 22:03):
Or, looking at it a different way, it gets whatever version of the questionnaire matches the language the request asks for, if any. If you didn't want it in russian, don't ask for it in russian.
Lloyd McKenzie (May 14 2021 at 22:16):
Versioning and translations are different things. The version shouldn't change when you translate.
Grahame Grieve (May 14 2021 at 22:20):
this sounds like a great thing for a connectathon, but not this week! I'd like a SDC appendix describing how to register translation files with a forms server, and explaining how to ask for it in a particular language
Elliot Silver (May 14 2021 at 22:21):
Sorry, sloppy language. I meant "it gets the questionnaire translated into whatever language the request asks for, if any." My point was if you want the questionnaire in the original language, don't specify that you want it in another language.
Elliot Silver (May 14 2021 at 22:39):
OK. FHIR#32379
Lloyd McKenzie (May 15 2021 at 03:19):
Is this a solution that would be specific for Questionnaire or is this something we'd want for other resources - CodeSystems and Value Sets seem like obvious candidates, though maybe others too?
Elliot Silver (May 15 2021 at 03:22):
I’m hoping it could be useful for others, including IGs and profiles.
Jose Costa Teixeira (May 15 2021 at 07:19):
My proposal was to use 1 questionnaire and the language extension.
Jose Costa Teixeira (May 15 2021 at 07:21):
Also, the point of context-sensitive translation - my answer to that is to simply include it in the IG Publication tool
Jose Costa Teixeira (May 15 2021 at 07:23):
Tilo Christ said:
"Those people" get paid by the word and won't be bothered to learn/use somebody's tool. They have built up muscle memory for their existing highly streamlined tool chain.
Right, so their work gets imported into the publication/build process (which I believe is normal for software translators) and they get to see the final preview of their work
Jose Costa Teixeira (May 15 2021 at 07:28):
for me the solution should not rely on external translation operations - just the questionnaire get published with the necessary content in several languages. Those translations are also "official" deliverables.
Jose Costa Teixeira (May 15 2021 at 07:45):
Sorry i missed the discussion, but are we thinking of putting the translations file on the forms server, and not in the IG?
Jose Costa Teixeira (May 15 2021 at 07:46):
to me they belong in the publication process . I'm thinking of Belgium w 3 languages, but those translators are not spinning off - they are part of the official process
Jose Costa Teixeira (May 15 2021 at 08:29):
I'd propose this:
image.png
Jose Costa Teixeira (May 15 2021 at 08:31):
This requires (only?) 2 additions:
- Generate an XLIFF file based on the content identified as "translatable" (with the "translatable" extension or simply empty translations)
- Import the translation from the XLIFF into the "translation" extension
Jose Costa Teixeira (May 15 2021 at 08:32):
This means the same process for internal or external translations, leaving the translators with a standard workflow, I believe
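A minimal sketch of the import step, assuming the translated strings have already been read from the XLIFF file into a dictionary keyed by linkId (the XLIFF parsing itself is omitted). It writes the standard translation extension with `lang` and `content` sub-extensions onto `_text`; the function names and sample data are hypothetical.

```python
TRANSLATION_URL = "http://hl7.org/fhir/StructureDefinition/translation"

def set_translation(item, lang, content):
    """Add or update the translation extension for one language on item._text."""
    extensions = item.setdefault("_text", {}).setdefault("extension", [])
    for ext in extensions:
        parts = {p.get("url"): p for p in ext.get("extension", [])}
        if ext.get("url") == TRANSLATION_URL and parts.get("lang", {}).get("valueCode") == lang:
            if "content" in parts:
                parts["content"]["valueString"] = content  # overwrite placeholder
            else:
                ext["extension"].append({"url": "content", "valueString": content})
            return
    extensions.append({
        "url": TRANSLATION_URL,
        "extension": [
            {"url": "lang", "valueCode": lang},
            {"url": "content", "valueString": content},
        ],
    })

def import_translations(questionnaire, lang, translations):
    """Apply a linkId -> translated text mapping to all nested items."""
    def visit(items):
        for item in items:
            if item["linkId"] in translations:
                set_translation(item, lang, translations[item["linkId"]])
            visit(item.get("item", []))
    visit(questionnaire.get("item", []))

questionnaire = {
    "resourceType": "Questionnaire",
    "item": [{
        "linkId": "allergy", "text": "Allergy",
        "_text": {"extension": [{
            "url": TRANSLATION_URL,
            "extension": [{"url": "lang", "valueCode": "fr-BE"},
                          {"url": "content", "valueString": "."}],
        }]},
    }],
}
import_translations(questionnaire, "fr-BE", {"allergy": "Allergie"})
import_translations(questionnaire, "nl-BE", {"allergy": "Allergie"})
```

Translators only ever touch the external file; the Questionnaire JSON is rewritten by the build step.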
Grahame Grieve (May 15 2021 at 11:09):
I don't think it's different to any other resource, which is why I said explain, rather than define
Jose Costa Teixeira (May 15 2021 at 11:17):
right. I think all resources should benefit from such an approach
Jose Costa Teixeira (May 15 2021 at 11:17):
I drafted this:
image.png
Jose Costa Teixeira (May 15 2021 at 11:36):
this repo contains a few XLF files for a questionnaire:
https://github.com/hl7-be/riziv-inami/tree/master/input/translations
Jose Costa Teixeira (May 15 2021 at 12:09):
I remember it contains the word "active" which, depending on the context can be translated in different ways. In this case, the translator could quickly look at the translated result in its context (the ig publisher, or the generated form) and spot any issues.
Jose Costa Teixeira (May 15 2021 at 12:22):
and the sushi to generate (almost) empty placeholders for translation:
https://github.com/hl7-be/riziv-inami/blob/master/input/fsh/BeAllergyIntolerance.fsh#L193:L194
Jose Costa Teixeira (May 15 2021 at 12:41):
the FHIR resource (which has the content to create the XLF) looks like this:
"item": [
  {
    "linkId": "patient",
    "text": "Patient",
    "_text": {
      "extension": [
        {
          "url": "http://hl7.org/fhir/StructureDefinition/translation",
          "extension": [
            {
              "url": "lang",
              "valueCode": "fr-BE"
            },
            {
              "url": "content",
              "valueString": "."
            }
          ]
        },
        {
          "url": "http://hl7.org/fhir/StructureDefinition/translation",
          "extension": [
            {
              "url": "lang",
              "valueCode": "nl-BE"
            },
            {
              "url": "content",
              "valueString": "."
            }
          ]
        }
      ]
    },
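For illustration, a renderer could resolve the display text for a requested language from a structure like the one above roughly as follows. This is a sketch, not an existing API: `localized_text` is a hypothetical helper, and treating "." as an untranslated placeholder is an assumption based on the placeholders shown here.

```python
TRANSLATION_URL = "http://hl7.org/fhir/StructureDefinition/translation"
PLACEHOLDER = "."  # assumption: "." marks a translation that is still missing


def localized_text(item, lang):
    """Return item.text in `lang`, falling back to the untranslated text."""
    for ext in item.get("_text", {}).get("extension", []):
        if ext.get("url") != TRANSLATION_URL:
            continue
        # Index the sub-extensions ("lang", "content") by their url.
        parts = {sub.get("url"): sub for sub in ext.get("extension", [])}
        code = parts.get("lang", {}).get("valueCode")
        content = parts.get("content", {}).get("valueString")
        if code == lang and content and content != PLACEHOLDER:
            return content
    return item.get("text")
```

With the placeholders still in place, any language falls back to the base text "Patient"; once a real translation is imported, the matching language returns it instead.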
Jose Costa Teixeira (May 17 2021 at 13:51):
is this a topic we could address during this week?
Jose Costa Teixeira (May 17 2021 at 13:51):
@Tilo Christ @Brian Postlethwaite @Paul Lynch @Oliver Egger what do you think of the approach?
Paul Lynch (May 17 2021 at 18:14):
Jose Costa Teixeira said:
is this a topic we could address during this week?
If you would like me to add something to the Questionnaire track agenda for a discussion about this, I would be happy to do so. Just pick an open slot and let me know.
Elliot Silver (May 17 2021 at 18:44):
I see a few groups of translation requirements:
- IG and profiles -- these are likely to be viewed on a website, and the translation will be done in advance of publication using standard translation tooling. The IG publisher and templates should be able to generate one website with all the translated languages, and allow selection either through the HTTP language header or explicit choice by the user (picking a language from a drop-down on every page).
- Definitional resources like Questionnaire, and likely terminology resources -- these are likely to be retrieved from a FHIR server, and the translation will be done in advance of publication using standard translation tooling. FHIR servers should have a way of retrieving the resource in the original or a selected language.
- "Dynamic" resources - These are the "one-off" resources like specific Observations where we want the translation to be embedded in the resource. They are likely to be retrieved from a FHIR server, and translation is done along with instance creation. For this, I see using the existing translation extensions.
Is this a reasonable enumeration of the use cases? Are there more or fewer categories?
Brian Postlethwaite (May 18 2021 at 08:52):
(Sorry, only just read this.) I think there are 2-3 parts here: the authoring, the storage and the execution.
For authoring, I agree with those saying it should be done with whatever tools people natively use. However, I think ValueSets/ConceptMaps, StructureDefinitions and Questionnaires are the only resource types this applies to, as they are about user interaction (the labels a user sees/selects), not about storing clinical data as a result of the interaction.
Brian Postlethwaite (May 18 2021 at 08:53):
Questionnaire has a concept of derivation and a re-assembly operation that this could also fit into. (The original use case was component questionnaires.)
Jose Costa Teixeira (May 19 2021 at 18:36):
I think this discussion should be taken outside of Questionnaire, right?
Jose Costa Teixeira (May 19 2021 at 18:37):
@Giorgio Cangioli FYI
Carl Leitner (May 19 2021 at 21:43):
Jose Costa Teixeira said:
I think this discussion should be taken outside of Questionnaire, right?
where to?
Jose Costa Teixeira (May 19 2021 at 22:49):
https://chat.fhir.org/#narrow/stream/179166-implementers/topic/Representing.20translations
?
Elliot Silver (Jun 25 2021 at 01:21):
I see an update to J#32379. Any interesting points come up in discussion @Lloyd McKenzie?
Lloyd McKenzie (Jun 25 2021 at 02:32):
We started with a plan for Questionnaire in SDC in R4, with the intent to push it into R5 as a generic solution. Specific questions for the community:
- Is it ok if we choose to support XLIFF only - with an expectation that anyone who wants to use PO/POT can convert to and from if necessary?
- We know that browsers populate the Accept-Language header for regular browser calls, however we believe that RESTful calls by apps running inside the browser will NOT automatically have such an Accept-Language header specified.
Feel free to look at the other thoughts here: https://jira.hl7.org/browse/FHIR-32379
Much more discussion to come
Brian Postlethwaite (Jun 25 2021 at 02:33):
My concern was that relying only on the content negotiation language would mean that, from a browser, it might be hard to "not" provide it and just get the raw base questionnaire
Elliot Silver (Jun 25 2021 at 02:38):
DICOM faced a similar issue -- although you can do content negotiation in an HTTP GET, a URL in a PDF doesn't give an opportunity to specify an Accept header; so they ended up adding a format parameter to the URL to provide more explicit control.
Elliot Silver (Jun 25 2021 at 02:39):
Was there strong feeling that translation support was needed?
Lloyd McKenzie (Jun 25 2021 at 02:39):
We could define a similar parameter to _format for language if we needed
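To make the two options concrete, here is a rough sketch of how a client might ask for a language either through an explicit Accept-Language header or through a URL parameter. The parameter name passed as `lang_param` is purely hypothetical (no such parameter had been defined at this point in the thread), as are the function name and the example server URL.

```python
from urllib.parse import urlencode
from urllib.request import Request


def questionnaire_request(base_url, questionnaire_id, lang=None, lang_param=None):
    """Build (but do not send) a GET request for one Questionnaire.

    If `lang_param` is given, the language travels in the URL, which also
    works for plain links; otherwise it goes into Accept-Language, which an
    app has to set explicitly on its own RESTful calls.
    """
    url = f"{base_url.rstrip('/')}/Questionnaire/{questionnaire_id}"
    headers = {"Accept": "application/fhir+json"}
    if lang and lang_param:
        url += "?" + urlencode({lang_param: lang})  # hypothetical parameter
    elif lang:
        headers["Accept-Language"] = lang
    return Request(url, headers=headers)


# Header-based negotiation, set explicitly by the app:
by_header = questionnaire_request("https://example.org/fhir", "q1", lang="fr-BE")

# URL-based selection with a made-up parameter name:
by_param = questionnaire_request("https://example.org/fhir", "q1",
                                 lang="fr-BE", lang_param="_lang")

# No language at all: the request for the raw base questionnaire.
raw = questionnaire_request("https://example.org/fhir", "q1")
```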
Lloyd McKenzie (Jun 25 2021 at 02:40):
Don't understand your last question @Elliot Silver?
Elliot Silver (Jun 25 2021 at 02:42):
Did the discussion go "Wow, how did we get to R4 without supporting translation?" or was it "Sure, add it, I guess someone might need it, but I'm not going to bother implementing it"?
Elliot Silver (Jun 25 2021 at 02:43):
(Sorry, I didn't realize the topic was up for discussion or I would have tried to join the call.)
Brian Postlethwaite (Jun 25 2021 at 03:34):
Continuing next week I believe.
Jose Costa Teixeira (Jun 25 2021 at 06:48):
I missed the questionnaire calls. Can someone please point me to their schedule?
Ilya Beda (Jun 25 2021 at 12:07):
Hi!
It seems that we are going to use XLIFF as the primary tool for Questionnaire translation. I don't think that is a good idea.
With all due respect, XML is a somewhat obsolete format these days.
PO looks like the more universal tool for translations. There are various open source and commercial tools that support PO.
PO is a well-known translation standard, and I am quite confused that we are not going to support it.
I used it to translate questionnaires from English to Hebrew and it worked just fine.
I can share my experience on the workgroup call next week.
Paul Lynch (Jun 25 2021 at 12:25):
Jose Costa Teixeira said:
I missed the questionnaire calls. Can someone please point me to their schedule?
They are Thursdays 5-6 p.m. ET. https://www.hl7.org/concalls/CallDetails.cfm?concall=55578
Paul Lynch (Jun 25 2021 at 12:34):
Ilya Beda said:
With all due respect, XML is a somewhat obsolete format these days.
I am not sure that is a good argument for PO, since PO files are written in their own custom format (neither JSON, nor YAML, nor anything standard). They might be easier to read, though.
That being said, your input on the workgroup call would be much appreciated. I didn't get the impression that anyone on the call yesterday had experience with XLIFF or PO, but we were hoping that since there seemed to be conversion tools in both directions, it would not be necessary to directly support both.
Lloyd McKenzie (Jun 25 2021 at 13:29):
The basic driver for the decision is that with XLIFF there's only one file that defines both source and translations, so only one thing to point at. Also, a (very brief) Google search seemed to suggest that XLIFF could do more than POT/PO and there seemed to be community tools that converted between the two. Also, in general, it'll be simpler for everyone if there's only one supported approach. I guess the last answer is that the XML file will be somewhat easier to read and write for FHIR-specific tools interested in supporting the implementations to generate the "to translate" file or to consume the "translation" files.
Lloyd McKenzie (Jun 25 2021 at 13:30):
All that said, if it's not true that the community can manage either XLIFF or POT/PO and convert between them and there's much deeper industry support for POT/PO than XLIFF, we could adjust. (The notion that XML is obsolete doesn't fly because XML trumps custom syntax any day in terms of ease of use.)
Tilo Christ (Jun 25 2021 at 20:43):
A couple of inputs (since the calls will be in the middle of the night from where I am residing):
The basic driver for the decision is that with XLIFF there's only one file that defines both source and translations, so only one thing to point at.
The same is true for both formats. The .po (aka "gettext") format is only using the .po file for the actual translation. The .pot file (po-template) is a .po file that has not been translated to a target language yet, but only contains the source strings. XLIFF simply uses the same file extension for either case.
Also, a (very brief) Google search seemed to suggest that XLIFF could do more than POT/PO and there seemed to be community tools that converted between the two.
XLIFF can indeed "do more" than PO, which at the same time is its blessing and its curse. While the PO spec fits on a page, XLIFF's is a novel, and comes in multiple flavors (Angular XLIFF, Apple XLIFF). I am using both commercially to translate content and software into a dozen languages and the "more" of XLIFF is commonly not needed. Your finding that they can be converted is correct. Though with a loss of fidelity in either direction (meta info lost when converting XLIFF -> PO, fine-grained pluralisations lost in PO -> XLIFF). Both losses typically not relevant.
Also, in general, it'll be simpler for everyone if there's only one supported approach.
That one I would agree on. PO and XLIFF are both the predominant translation formats (PO more in the Unix/Linux and Python/Django camp, XLIFF more in the Apple, Angular, Android camp - Java and .NET are doing their own thing but have libraries to process either). Choosing either one of the two and running with it will provide a very solid set of libraries and tools and translation services.
I guess the last answer is that the XML file will be somewhat easier to read and write for FHIR-specific tools interested in supporting the implementations to generate the "to translate" file or to consume the "translation" files.
Not necessarily. Every programming language comes with an excellent PO library. The challenge with XLIFF again is that you can parse them with any XML parser, but you will get a lot of different nested tags back in comparison to PO and need to know if and how to handle them.
Regarding @Elliot Silver's point around the need: At least for my particular use-case - Patient-facing surveys - the need for translation is very pressing. Many questionnaires exist in numerous languages to target various communities (English, Spanish, Russian, etc.). The current approach with the translation extension is incompatible with commonly used translation tooling. Official adoption of an industry standard would be very helpful.
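As a rough illustration of the XLIFF -> PO direction mentioned above, the sketch below converts a minimal XLIFF 1.2 file into PO entries using only the standard library. It assumes plain-text `<source>`/`<target>` elements without inline tags, and it drops all other metadata, which is the kind of fidelity loss described. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}


def po_quote(text):
    """Escape a string for use as a PO double-quoted literal."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"'


def xliff_to_po(xliff_text):
    """Convert trans-units to PO entries; the unit id becomes a reference comment."""
    root = ET.fromstring(xliff_text)
    entries = []
    for unit in root.iterfind(".//x:trans-unit", XLIFF_NS):
        source = unit.findtext("x:source", default="", namespaces=XLIFF_NS)
        # A missing <target> yields an empty msgstr, i.e. "not yet translated".
        target = unit.findtext("x:target", default="", namespaces=XLIFF_NS)
        entries.append(
            f"#: {unit.get('id')}\n"
            f"msgid {po_quote(source)}\n"
            f"msgstr {po_quote(target)}\n"
        )
    return "\n".join(entries)
```

A real converter would also need to decide what to do with inline markup, notes and state attributes; this sketch simply ignores them.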
Lloyd McKenzie (Jul 08 2021 at 22:05):
We talked briefly about XLIFF vs. PO again on today's SDC call. Gut instinct is that XLIFF is easier to read/work with for newbies. We would presumably define a 'minimal' subset of XLIFF that would get generated by the "prepare for translation" operation so flavors might not matter much. However, we'll still take input from the community. We'd really rather not support both approaches as that significantly increases complexity for everyone.
Also talked about format for ids within the file. Proposal is to use FHIRPath. Further discussion (and some questions) in the comment on the tracker item here: https://jira.hl7.org/browse/FHIR-32379
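To give a feel for what path-based ids could look like, here is a small sketch that walks a Questionnaire and emits one path per translatable string. The exact id format was still being discussed on the tracker item, so the index-based paths below are only one possible shape, and `translatable_paths` is a hypothetical helper.

```python
def translatable_paths(questionnaire):
    """Yield (path, text) pairs for some translatable strings, depth-first.

    Assumption: index-based paths such as Questionnaire.item[0].text are
    used; a linkId-based form would be an equally plausible choice.
    """
    def visit(items, prefix):
        for index, item in enumerate(items):
            path = f"{prefix}.item[{index}]"
            if "text" in item:
                yield f"{path}.text", item["text"]
            # Recurse into nested items, extending the path.
            yield from visit(item.get("item", []), path)

    # A few root-level strings that may also need translation.
    for name in ("title", "description", "copyright"):
        if name in questionnaire:
            yield f"Questionnaire.{name}", questionnaire[name]
    yield from visit(questionnaire.get("item", []), "Questionnaire")
```

Index-based paths break when items are reordered, which is one reason the id format deserves the further discussion mentioned above.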
Ilya Beda (Jul 15 2021 at 18:39):
Hi!
Here is how I approach Questionnaire translation.
I started with the PO format and wrote a tool that extracts all text from the Questionnaire into a PO file. The PO file is translated via 3rd-party software. When I render a questionnaire, all text elements are wrapped with gettext and get their translations from the PO file.
This approach works well when you use Questionnaires defined as part of your source code.
You can create a fully automated CI/CD pipeline that extracts text to PO and posts it to the translation tool. Once the translation is done, the translation tool pushes the updated PO file back to the repository.
The same approach can be used with XLIFF, since it is just another translation file format.
However, when you need to create questionnaires at runtime, it stopped working for me.
After a few iterations with a custom JSON format for translations, I decided that the easiest way is to use the Questionnaire resource. This translation-dedicated Questionnaire contains Questionnaire.item only at the top level.
During questionnaire rendering, I load the translation-dedicated Questionnaire, and for each Questionnaire.item of the original questionnaire I load its version from the translation-dedicated Questionnaire by linkId.
Then I do a deep merge between the two Questionnaire.item elements, so attributes from the translation-dedicated Questionnaire.item override the original ones.
This translation technique worked very well for me. I hope somebody could find it useful as well.
I would like to hear your thoughts.
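A minimal sketch of the linkId lookup plus deep merge described above, assuming both Questionnaires are plain dicts parsed from JSON. The function names are hypothetical, and lists (for example answerOption) are replaced wholesale rather than merged element by element, which is a simplification.

```python
import copy


def deep_merge(base, override):
    """Recursively merge two dicts; values from `override` win.

    Non-dict values (including lists) are replaced wholesale.
    """
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def apply_translation(questionnaire, translation):
    """Return a translated copy; `translation` holds flat, top-level items."""
    by_link_id = {item["linkId"]: item for item in translation.get("item", [])}

    def translate(items):
        result = []
        for item in items:
            merged = deep_merge(item, by_link_id.get(item["linkId"], {}))
            if "item" in item:
                # Nested items keep their structure; each is looked up by linkId.
                merged["item"] = translate(item["item"])
            result.append(merged)
        return result

    translated = copy.deepcopy(questionnaire)  # leave the original untouched
    if "item" in translated:
        translated["item"] = translate(translated["item"])
    return translated
```

Because the lookup is by `linkId`, the translation Questionnaire can stay flat even when the original nests groups several levels deep.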
Lloyd McKenzie (Jul 15 2021 at 18:55):
Limiting Questionnaires to only having items at the root level isn't feasible. The need to translate shouldn't impact the design of the Questionnaire in any way. Also, the item.text isn't the only thing in a Questionnaire you might need to translate - description, copyright and other metadata - and maybe even certain extensions might also need translation.
Ilya Beda (Jul 16 2021 at 07:39):
There is no limitation on the Questionnaire that defines a form. The limitation on nested items applies only to the Questionnaire that stores translations.
I created an example.
The original questionnaire that I want to translate:
https://gist.github.com/ir4y/88e386263f1d2c5aad3826aa86f074ec#file-acc-edit-diagnoses-yaml
The questionnaire that stores translation data:
https://gist.github.com/ir4y/88e386263f1d2c5aad3826aa86f074ec#file-acc-edit-diagnoses-fr-translation-yaml
The same technique with deep merge can be used for description, copyright, and other metadata on Questionnaire root.
Lloyd McKenzie (Jul 16 2021 at 14:59):
What do you use for canonical URL and version for the translation Questionnaire (given that servers should restrict to only having one instance with a given canonical + version combination)?
Ilya Beda (Jul 19 2021 at 09:10):
Sorry, I am not sure that I understood your question, Lloyd.
A questionnaire for translation is not used as a real Questionnaire. In my case, it is just a convenient format for storing translations.
Translation versioning is separate from the original Questionnaire.
Lloyd McKenzie (Jul 19 2021 at 12:49):
If a Questionnaire is being persisted and you're using the Questionnaire resource structure, then it's "real" from the perspective of the system storing it, so considerations around unique identity still matter.
Last updated: Apr 12 2022 at 19:14 UTC