Stream: implementers
Topic: HL7 Validator for Runtime Validation
Ken Sinn (Jan 31 2020 at 14:29):
What are implementers using for runtime validation for FHIR resource profile compliance? Are implementers using the HL7 FHIR Validator JAR at runtime, or using the validation included as part of HAPI?
Interested to hear what folks are using, thanks!
Ken Sinn (Jan 31 2020 at 14:35):
(in the context of runtime validation as messages are received, not for conformance testing)
Morten Ernebjerg (Jan 31 2020 at 15:20):
HAPI offers validation against profiles that can be used quite easily standalone or as part of a HAPI server:
https://hapifhir.io/hapi-fhir/docs/validation/introduction.html
The core Java validator can also be used without HAPI, see the comment starting on line 66 in this class:
Ken Sinn (Jan 31 2020 at 17:05):
Is the hapi validator (with or without HAPI) intended to be used to support $validate operations, or is it designed/implemented with system performance in mind, e.g. runtime validation on resources during production RESTful interactions?
Grahame Grieve (Jan 31 2020 at 19:17):
I'm not sure what you mean by 'designed with system performance in mind'. It's designed to validate, and it does as thorough a job as it knows how to.
Most of the performance issues involve 4 things:
- where's the terminology server, and how fast is it?
- how do you manage the cache of conformance resources the validator uses?
- how do you look up other resources (transaction boundaries!)
- how much validation do you actually want to do?
The more profiles involved, and the more constraints involved in the profiles, the slower profile processing is.
When you host the validator in HAPI, you make the choices for the first 3 (to some degree). We could consider changes on the last.
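One way to soften the first point (terminology server latency) is to memoize code lookups so that repeated codes do not trigger repeated round trips. The following is only a minimal sketch: `CodeCheck` and `withCache` are hypothetical names, not part of HAPI or the HL7 validator, and the real validators manage their own terminology caches.

```typescript
// Hypothetical signature for "ask the terminology server whether this code is valid".
type CodeCheck = (system: string, code: string) => Promise<boolean>;

// Wrap a slow lookup so each system|code pair is only requested once.
// The pending Promise itself is cached, so concurrent callers share one request.
function withCache(check: CodeCheck): CodeCheck {
  const cache = new Map<string, Promise<boolean>>();
  return (system, code) => {
    const key = `${system}|${code}`;
    let pending = cache.get(key);
    if (!pending) {
      pending = check(system, code).catch((err) => {
        // Do not keep failed lookups; the next call retries.
        cache.delete(key);
        throw err;
      });
      cache.set(key, pending);
    }
    return pending;
  };
}
```

This sketch never expires entries, so a production version would also need an eviction or time-to-live policy.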
Grahame Grieve (Jan 31 2020 at 19:19):
however, generally, my focus is on complete validation over fast validation. The main internal thing I could do to speed things up is parallelization, or to keep validating the instance while some other thread is off doing the terminology stuff. But when it comes to slicing, I need to wait for terminology validation anyway, and also if the terminology server is close and fast, all I'm doing is introducing threading bugs....
Ken Sinn (Jan 31 2020 at 19:41):
Thanks Grahame, maybe I'm not being clear enough with the question.
Are implementers using the validator as part of pre-processing HTTP Create requests, to check for errors/etc? Or are they using some other sort of validation, e.g. JSON Schema validator or some bespoke validation code?
Grahame Grieve (Jan 31 2020 at 19:53):
some are, some aren't - I've run into people using schema
Grahame Grieve (Jan 31 2020 at 19:56):
my recommendation in practice is: in general, only run the business rules that you need to protect the system. These are typically different from validation things anyway.
but that advice depends on the trading environment. A server that is relatively public needs to at least always run the security relevant validation features. I should make it possible to run just the security relevant features of the validator...
Lloyd McKenzie (Jan 31 2020 at 22:38):
I know of one that's using it for messaging in production.
Kevin Mayfield (Feb 01 2020 at 15:45):
I've used it. I'm intending to fully validate on test instances but on live, I may not validate terminology - performance of core validation is good.
Arthur Krughkov (Feb 04 2020 at 16:33):
Hi guys, I understand that there is a 'validator' and that is something that is advised to use. What if our code is not Java based? We have an application that is running on Node.js (TypeScript app) and we need to validate the FHIR resources that are coming in. So what we have done is use a fhir-json-schema-validator to validate against the R4 JSON schema that we pulled down from the hl7.org site, but are struggling to find a library or a method to validate against StructureDefinition that is defined in our implementation guide (SIMPLIFIER). We were able to export a StructureDefinition resource in JSON format from SIMPLIFIER, but it is not a JSON schema that we can use to validate against. Are there tools that would allow us to convert the StructureDefinition resource into a standard JSON schema that we could just use a json-schema-validator for? Or are there standard/approved fhir libraries for Node.js/TypeScript that we could use to validate against our implementation guide specifications?
Grahame Grieve (Feb 04 2020 at 21:36):
you can host the validator inside your node code if you need to.
the structure definitions can say (and often do) things that can't be stated in json schema at all, and very often when I push json schema I find that the tools are not consistent.
Grahame Grieve (Feb 04 2020 at 21:37):
there's been some talk about a javascript validator, but it's a very large amount of work, so I don't think it's really happened
Grahame Grieve (Feb 04 2020 at 21:37):
if you want to host the validator, we can talk about how to do that (JNI + NativeHostServices.java)
Arthur Krughkov (Feb 04 2020 at 23:01):
Thanks Grahame, we thought about hosting the validator as a service, but that would add a lot of performance overhead to the application. Do you have a different way of hosting it so that it doesn't have this type of performance impact?
Also, about the tool that you are talking about that you run against the structure definition. If our implementation Guide is not that complicated, can we run it and see if it covers everything? Could we use it?
Ward Weistra (Feb 05 2020 at 04:46):
Hi @Arthur Krughkov, you could also set up a FHIR server like Vonk based on the resources in your Simplifier.net project (http://docs.simplifier.net/vonk/features/conformanceresources.html#conformance-fromsimplifier) and validate your data against the server with the $validate endpoint (http://docs.simplifier.net/vonk/features/validation.html). Keeps it nicely separated from the rest of your code and offloads the validation to a separate server. Would that work?
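For a Node.js/TypeScript client, calling a server's `$validate` operation could look roughly like the sketch below. It assumes an R4 server that accepts a `Parameters` body with `resource` and `profile` parameters, as the FHIR `$validate` operation defines. The helper names and the injected `post` function are hypothetical, and nothing here is specific to Vonk or HAPI.

```typescript
interface FhirResource { resourceType: string; [key: string]: unknown; }

interface OperationOutcomeIssue {
  severity: "fatal" | "error" | "warning" | "information";
  code: string;
  diagnostics?: string;
}

interface OperationOutcome { resourceType: "OperationOutcome"; issue: OperationOutcomeIssue[]; }

interface ParametersParameter { name: string; resource?: FhirResource; valueUri?: string; }

interface Parameters { resourceType: "Parameters"; parameter: ParametersParameter[]; }

// Wrap the resource (and optionally a profile canonical URL) in a Parameters resource.
function buildValidateParameters(resource: FhirResource, profile?: string): Parameters {
  const parameter: ParametersParameter[] = [{ name: "resource", resource }];
  if (profile) parameter.push({ name: "profile", valueUri: profile });
  return { resourceType: "Parameters", parameter };
}

// Keep only the issues that should cause the incoming resource to be rejected.
function blockingIssues(outcome: OperationOutcome): OperationOutcomeIssue[] {
  return outcome.issue.filter((i) => i.severity === "error" || i.severity === "fatal");
}

// Hypothetical transport: POSTs JSON to a URL and resolves with the parsed response body.
type PostJson = (url: string, body: unknown) => Promise<unknown>;

async function validateRemotely(
  post: PostJson,
  baseUrl: string,
  resource: FhirResource,
  profile?: string,
): Promise<OperationOutcomeIssue[]> {
  const url = `${baseUrl.replace(/\/+$/, "")}/${resource.resourceType}/$validate`;
  const outcome = (await post(url, buildValidateParameters(resource, profile))) as OperationOutcome;
  return blockingIssues(outcome);
}
```

In a real application `post` would wrap whatever HTTP client is in use and send `Content-Type: application/fhir+json`. The network round trip is the cost of keeping validation in a separate server.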
Arthur Krughkov (Feb 05 2020 at 12:41):
Hi Weistra, I see that working for development or for testing purposes, but for runtime I am trying to optimize the performance as much as possible. The most efficient way that I see possible is to use lightweight/native libraries to do schema validation (please let me know if there are other ways that I may not be thinking of). So I can see this option working alongside runtime validation:
- setup a FHIR resource in Simplifier and expose $validate endpoint
- since our StructureDefinition is derived from base R4 FHIR schema, use a tool (from @Grahame Grieve ?) to generate a custom JSON schema
- manually verify and run tests to validate our own FHIR JSON schema
- at runtime use JSON schema validation class to validate against:
2a. R4 JSON FHIR schema from hl7.org
2b. Our custom schema
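The runtime part of this plan (check the base R4 schema first, then the custom schema) can be sketched as a chain of validators. Everything here is hypothetical: `Validator`, `validateInStages`, and the two stand-in checks are illustrative names. In practice each stage would wrap a compiled JSON Schema validator rather than the hand-written checks below.

```typescript
// A stage takes a parsed resource and returns a list of error messages (empty means valid).
type Validator = (resource: unknown) => string[];

// Run the stages in order and stop at the first one that reports errors,
// so later (profile-specific) checks only see resources that passed the base check.
function validateInStages(resource: unknown, stages: Validator[]): string[] {
  for (const stage of stages) {
    const errors = stage(resource);
    if (errors.length > 0) return errors;
  }
  return [];
}

// Stand-in for "validate against the base R4 JSON schema".
const baseCheck: Validator = (r) =>
  typeof r === "object" && r !== null &&
  typeof (r as { resourceType?: unknown }).resourceType === "string"
    ? []
    : ["missing resourceType"];

// Stand-in for "validate against our custom schema"; assumes baseCheck already passed.
const profileCheck: Validator = (r) => {
  const identifier = (r as { identifier?: unknown }).identifier;
  return Array.isArray(identifier) && identifier.length > 0
    ? []
    : ["profile requires at least one identifier"];
};
```

Calling `validateInStages(resource, [baseCheck, profileCheck])` returns an empty array only when both stages pass.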
Grahame Grieve (Feb 05 2020 at 13:13):
you can also set up a HAPI Server much closer to your system and get it to do the validation much faster. Or you can host the java validator inside your project too.
Or yes, it's true I have code that generates json schema from structure definitions. It has to drop a lot of nuance etc to do so, so it's a much less effective schema validation tool. And, btw, I challenge the notion that the json schema engines are lighter weight than the FHIR specific validator
Arthur Krughkov (Feb 05 2020 at 16:27):
@Grahame Grieve it's good to hear that you think that the FHIR validator is just as quick if not quicker than a JS schema validator. We are using @asymmetrik/fhir-json-schema-validator, which is built on ajv (Currently Ajv is the fastest and the most standard compliant JS validator - according to a set of benchmarks) ... are there any performance metrics or details on the Java Validator? (I haven't looked at how to run Java Validator on my Node.JS server as of yet)
So if that is true and using the ajv based schema validator is just as quick as FHIR validator, which of the following is better/quicker:
- HAPI FHIR server with a $validate endpoint
- Vonk FHIR server with a $validate endpoint (as per Ward's suggestion)
Outside of performance, is one validator more complete than the other?
I'm open to all solutions, as long as I can meet my performance requirements :)
Grahame Grieve (Feb 05 2020 at 19:29):
I haven't done metrics, and I don't know about 'quicker'. So I'm not making promises. But also, the validator is much more thorough than json schema; in particular, it gets the terminology correct in a way that schema can't.
but that also takes a performance hit - validating terminology is slow compared to anything else, so people sometimes turn that off - but therefore if json schema doesn't do it, it won't take that hit.
As for which of the local servers is faster - vonk or hapi... I've not seen any comparison and haven't tested it. Nor can I comment on which validator is more thorough with authority, though I can tell you that it looks like the dotnet validator has 6 test cases, while the java validator has 248 test cases
Michael Lawley (Feb 05 2020 at 22:03):
The other dimension here is the level of Terminology support available to the validator. Ontoserver provides a $validate endpoint and very complete terminology support including for SNOMED CT with ECL. Under the hood, the core validation code is the same as that for HAPI, except for terminology.
Arthur Krughkov (Feb 10 2020 at 21:25):
@Michael Lawley are you saying that HAPI FHIR validator does not have complete terminology support and the hl7.com/fhir/validator does have full terminology support?
Michael Lawley (Feb 10 2020 at 21:40):
I'm not up to date on the current state of HAPI's support as there were major improvements in either 4.0 or 4.1 IIRC.
However, I'm fairly certain that neither service has support for Snomed's ECL (which gets a lot of use in Australia).
Other common "edge cases" in terminology servers that I'm aware of include:
* Snomed postcoordination
* Infinite code systems (eg UCUM)
* Display Language support
* CodeSystem supplement support
* Versioning
* Dynamic ValueSets
* Implicit ValueSets for various CodeSystem (LOINC, Snomed, )
Arthur Krughkov (Feb 10 2020 at 21:51):
@Grahame Grieve , I have been looking into more details today and found a Node library to use https://github.com/joeferner/node-java. Essentially I will have to create, host, and manage a JVM and from reading the https://github.com/hapifhir/org.hl7.fhir.core/blob/master/org.hl7.fhir.validation/src/main/java/org/hl7/fhir/r5/validation/NativeHostServices.java , it states that "this solutions uses lots of RAM". If this is the only way for me to embed the validator in my code then we may need to run some tests and see if it's worth doing it vs validation against a JSON schema.
Meanwhile, has anyone else asked about the conversion tool that you use to convert the StructureDefinition into a JSON schema? The reason I am asking is that I see the benefit of both applications. For data submission, I would want to make sure that there is more validation done to ensure that we have minimal data issues (and use the validator); but for cases where I may be building and just storing the data temporarily (such as CMS), I would want to pay more attention to performance and stability (hence use JSON Schema validation only).
Grahame Grieve (Feb 10 2020 at 22:47):
so I have code that converts profiles to json schemas. But so little of the things people do in real profiles can be done in JSON schema:
- invariants
- value set bindings
- slicing by those (or by profile on a reference)
Grahame Grieve (Feb 10 2020 at 22:48):
these are core things that profiles do.
Grahame Grieve (Feb 10 2020 at 22:48):
the validator does all these things. If I try to generate JSON schema I have to strip all these things out, and all I get left with is a few cardinality constraints. It barely seems worthwhile to me
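To make the value set binding point concrete: for a required binding, the allowed codes come from a ValueSet that is resolved at validation time (often via a terminology server) rather than being fixed in the schema. The sketch below checks a CodeableConcept's codings against an already expanded ValueSet. The function names are hypothetical and the logic is far simpler than what the real validator does.

```typescript
interface Coding { system?: string; code?: string; }

// Subset of ValueSet.expansion.contains from the FHIR specification.
interface ExpansionContains {
  system?: string;
  code?: string;
  abstract?: boolean;            // abstract entries are for navigation only, not selectable
  contains?: ExpansionContains[]; // expansions may be nested
}

interface ExpandedValueSet {
  resourceType: "ValueSet";
  expansion?: { contains?: ExpansionContains[] };
}

// Flatten a (possibly nested) expansion into a set of "system|code" keys.
function collectCodes(entries: ExpansionContains[] = [], into = new Set<string>()): Set<string> {
  for (const entry of entries) {
    if (entry.system && entry.code && !entry.abstract) into.add(`${entry.system}|${entry.code}`);
    collectCodes(entry.contains, into);
  }
  return into;
}

// A required binding on a CodeableConcept is satisfied if at least one coding is in the value set.
function satisfiesRequiredBinding(codings: Coding[], valueSet: ExpandedValueSet): boolean {
  const allowed = collectCodes(valueSet.expansion?.contains);
  return codings.some(
    (c) => c.system !== undefined && c.code !== undefined && allowed.has(`${c.system}|${c.code}`),
  );
}
```

Invariants and slicing need comparable logic outside the schema, which fits the point that a generated schema keeps little more than cardinality.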
Grahame Grieve (Feb 10 2020 at 22:49):
just use the base json schemas. If they work, that is. My experience with json schema tooling is that they are very unreliable for the kind of schemas that we produce
Grahame Grieve (Feb 10 2020 at 22:50):
I did write that 'this uses lots of RAM'. "lots" is a relative term; you'd have to try it out and see.
Jose Costa Teixeira (Feb 19 2020 at 15:02):
is that json schema generator available? I have some RAM left, can I give it a try?
(and I want to see if we can generate json schemas for our profiles, to see if it is worth the effort)
Grahame Grieve (Feb 20 2020 at 00:23):
I'm sure I remember writing something that collapsed slicing out - for this purpose. but I can't find it now
Misha Kaletsky (Feb 24 2020 at 16:40):
Grahame Grieve said:
so I have code that converts profiles to json schemas. But so little of the things people do in real profiles can be done in JSON schema:
- invariants
- value set bindings
- slicing by those (or by profile on a reference)
Is this the code that generates https://www.hl7.org/fhir/fhir.schema.json.zip ? If so, is it available somewhere?
Grahame Grieve (Feb 24 2020 at 19:58):
yes org.hl7.fhir.definitions.generators.specification.json.SchemaGenerator
Last updated: Apr 12 2022 at 19:14 UTC