FHIR Chat · Progress improvements proposal · bulk data

Stream: bulk data

Topic: Progress improvements proposal


Vladimir Ignatov (May 14 2020 at 15:26):

I have a proposal for a small addition to the export (and import) spec that
attempts to improve how progress is reported.

The problem is that the value reported in the x-progress is implementation-dependent
and could be literally anything. While that is understandable, it also means that it
is not really possible to create a client that can do anything with that information,
other than simply displaying it. In other words, I may have a client that is capable of
rendering a progress indicator, but it wouldn't know for sure whether that x-progress value is
a percentage, a remaining time, or something else...

My idea is to add another optional custom header that describes the type of data
that x-progress represents. That new header could be called x-progress-type and
could contain a value from a predefined list (e.g.: "percent-complete", "estimate-time-remaining", "resources-handled", "message"...).
Perhaps think of it as a ValueSet for progress information types.

The clients can then behave as follows:

  • If the x-progress-type is not present or contains an unknown value, treat it as if the value was "message".
  • If the x-progress-type value is "message", the clients can simply display the value of x-progress as is.
  • If the x-progress-type value is "percent-complete", the clients can render a progress bar.
  • Other values are TBD, but clients can always fall back to "message".
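
The client rules above can be sketched roughly as follows. This is only an illustration of the proposal, not part of the bulk data spec: the x-progress-type header and its values come from this thread, the function name interpret_progress is hypothetical, and the assumption that a "percent-complete" value is a plain number with an optional trailing "%" is mine.

```python
import math
from typing import Mapping, Optional, Tuple, Union


def interpret_progress(
    headers: Mapping[str, str],
) -> Optional[Tuple[str, Union[str, float]]]:
    """Return (kind, value) for display, or None if no progress was sent."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}

    progress = lowered.get("x-progress")
    if progress is None:
        return None

    # A missing type is treated as "message", as the proposal suggests.
    progress_type = lowered.get("x-progress-type", "message").strip().lower()

    if progress_type == "percent-complete":
        try:
            # Assumption: the value looks like "46" or "46%".
            percent = float(progress.strip().rstrip("%"))
        except ValueError:
            return ("message", progress)
        if not math.isfinite(percent):
            return ("message", progress)
        # Clamp so a progress bar never under- or overflows.
        return ("percent-complete", max(0.0, min(100.0, percent)))

    # "message" and every unknown type: display the raw value as is.
    return ("message", progress)
```

A client could render a progress bar when the first element is "percent-complete" and show the raw text otherwise, so servers that never send the extra header keep working unchanged.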

Christiaan Knaap (May 15 2020 at 19:25):

Suggestion: standardize x-progress (if present) to the format "x/y" where y can be "?" if the server does not know the total number of resources to process. If x and y are unknown to the server, just don't send the x-progress at all.
If you want to express a percentage, use y = 100.
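
As a rough sketch of what parsing the suggested "x/y" format might look like on the client: the names parse_fraction and fraction_complete are hypothetical, and tolerating whitespace around the slash is an assumption not stated in the suggestion.

```python
import re
from typing import Optional, Tuple

# "x/y", where y is either a non-negative integer or "?" (total unknown).
_FRACTION = re.compile(r"^\s*([0-9]+)\s*/\s*([0-9]+|\?)\s*$")


def parse_fraction(value: str) -> Optional[Tuple[int, Optional[int]]]:
    """Return (done, total); total is None when the server sent "?"."""
    match = _FRACTION.match(value)
    if match is None:
        return None
    done = int(match.group(1))
    total = None if match.group(2) == "?" else int(match.group(2))
    return done, total


def fraction_complete(value: str) -> Optional[float]:
    """Return progress in the range 0.0 to 1.0, or None if it cannot be computed."""
    parsed = parse_fraction(value)
    if parsed is None:
        return None
    done, total = parsed
    if total is None or total == 0:
        return None
    return min(1.0, done / total)
```

With this shape, "350/?" still tells the client how many resources were handled, while only values with a known total can drive a progress bar.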

Vladimir Ignatov (May 15 2020 at 20:45):

Seems doable, but it implies parsing on the client side. Also, that might restrict the servers by not allowing them to send a free-text message.

BTW, I just realized that a client should be able to inspect the size of the downloaded files and that way determine that the process is not stuck. This means that the only remaining issue is that the client would like to know how to interpret the progress information. Consider these examples:

  • 350/4600 is usable for progress bars if parsed. Can also be displayed as is, although it is not very descriptive.
  • 350/4600 resources is usable for progress bars if parsed. Can also be displayed as is and is slightly easier to understand if rendered somewhere.
  • 46% is usable for progress bars if parsed. Can also be displayed (even though it is a little short).
  • number - Not usable! The client wouldn't know what it is. It can be displayed as is, if that makes any sense.
  • string - anything else can only be displayed as is.

If we decide to specify the format (instead of the two-header solution), then perhaps we need to make sure that this format can cover every single use case. For example, the following formats can probably be used and parsed with a simple RegExp:

  • ^{number}{unit} {optional suffix} - for example "33% done" or "5 min remaining"
  • ^{numerator}/{denominator} {subject} {optional suffix} - for example "34/45 resources handled" or "1024/45678 bytes exported..."
  • anything else as free message - for example "Export in progress. Please try again in 3 minutes."
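
To illustrate, the three formats above could be told apart with two regular expressions and a fallback. This is a sketch under assumptions: the function name classify_progress and the returned dictionary keys are hypothetical, optional whitespace between the number and the unit is allowed so that "5 min remaining" matches, and a bare "350/4600" without a subject is treated as a free message.

```python
import re

# {numerator}/{denominator} {subject} {optional suffix}
_FRACTION = re.compile(r"^(\d+)/(\d+)\s+(\S+)(?:\s+(.*))?$")
# {number}{unit} {optional suffix}; the unit is "%" or a word such as "min".
_QUANTITY = re.compile(r"^(\d+(?:\.\d+)?)\s*(%|[A-Za-z]+)(?:\s+(.*))?$")


def classify_progress(value: str) -> dict:
    """Classify an x-progress value as a fraction, a quantity, or a message."""
    text = value.strip()

    match = _FRACTION.match(text)
    if match:
        return {
            "kind": "fraction",
            "numerator": int(match.group(1)),
            "denominator": int(match.group(2)),
            "subject": match.group(3),
            "suffix": match.group(4) or "",
        }

    match = _QUANTITY.match(text)
    if match:
        return {
            "kind": "quantity",
            "number": float(match.group(1)),
            "unit": match.group(2),
            "suffix": match.group(3) or "",
        }

    # Anything else can only be displayed as is.
    return {"kind": "message", "text": text}
```

Only "fraction" results and "quantity" results with the unit "%" carry enough information for a progress bar; everything else would still be shown as text.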

We can predict most of it, but I don't feel confident that we can have a format that covers everything. At least not until we have a good number of servers and EHRs that support bulk-data in production, so that we have a reasonably good idea of what the possible x-progress values might be.

Christiaan Knaap (May 15 2020 at 20:56):

As per https://hl7.org/fhir/uv/bulkdata/export/index.html#response---complete-status, the file download can only start once the server has finished exporting the data. By then, progress reporting is no longer useful.

Vladimir Ignatov (May 15 2020 at 21:00):

Yes, that is correct. Please disregard the part about the progress being stuck :)


Last updated: Apr 12 2022 at 19:14 UTC