FHIR Chat · Synthea · tooling

Stream: tooling

Topic: Synthea


view this post on Zulip Jason Walonoski (Sep 27 2018 at 18:33):

Synthea 2.2 is now available: https://github.com/synthetichealth/synthea/releases/tag/v2.2.0

Significant new features:

  • New Support for FHIR R4 (v3.5.0)
  • FHIR ExplanationOfBenefit Resource (STU3 only) using Blue Button 2.0 Implementation Guide profiles.

view this post on Zulip Jason Walonoski (Jan 11 2019 at 23:25):

I updated Synthea today for R4 (v4.0.0) in case anyone wants to generate and use synthetic records for the latest R4 version at the connectathon this weekend in San Antonio.

view this post on Zulip Jason Walonoski (Sep 13 2019 at 13:46):

Synthea has been updated for US Core STU3 (FHIR R4) v3.0.0 for the FHIR Connectathon in Atlanta this weekend.

This is the first time that Synthea covers all of the US Core resources, including Device and clinical notes. US Core is turned off by default, so if you want to turn it on, edit synthea.properties and set exporter.fhir.use_us_core_ig = true.

view this post on Zulip Jason Walonoski (Nov 14 2019 at 14:07):

Synthea 2.5 is now available: https://github.com/synthetichealth/synthea/releases/tag/v2.5.0

With this release, you can now use Synthea as a Java library dependency, and not just as a Java Application. See https://github.com/synthetichealth/synthea/wiki/Embedding for details on how to include synthetic patient generation in your Java project.

view this post on Zulip Eric Haas (Dec 28 2019 at 03:18):

Hey J is there a way to use Synthea to get a snapshot in time like for my use case after an admission? so the encounter is still in progress?

view this post on Zulip Eric Haas (Dec 29 2019 at 15:28):

also, how do I get only the modules I want? I removed all the modules except for my example module, but I am still generating a bunch of immunizations and wellness visit encounters.

view this post on Zulip Eric Haas (Jan 01 2020 at 20:38):

OK, so my hack is to just rebundle the resources, leaving out the immunizations and wellness stuff. I also created Orgs from the Payor's CSV and made a standalone Coverage for the Patient which references the Payor Org. I think those would be nice additions to have out of the box (instead of a contained Coverage inside the EOB).

view this post on Zulip Eric Haas (Jan 01 2020 at 20:40):

Just want to say how nice it is to be able to generate a few to a bunch of US Core resources with minimal effort. :-)

view this post on Zulip Jason Walonoski (Jan 07 2020 at 13:06):

Hey J is there a way to use Synthea to get a snapshot in time like for my use case after an admission? so the encounter is still in progress?

Yes, but not easily. A few options:

  1. Manual Edit. Edit the patient Bundle and remove the second half (or whatever) of the encounter and change the status.
  2. Search. Generate a lot of patients. Search them until you find a patient Bundle where the last encounter is actually still in progress. Needle in a haystack.
  3. Create a new module with the module-builder. Or edit an existing one. Make it so the encounter and the associated activity (e.g. Labs, Observations, Procedures) are only partially complete, and it never finishes.
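
Option 1 (the manual edit) can be partly scripted. The following is a minimal sketch, not part of Synthea: the helper name `mark_last_encounter_in_progress` is hypothetical, and it assumes each Encounter has a `period.start` timestamp with a numeric UTC offset, as in typical Synthea FHIR output. It only flips the status and drops the end time; removing the resources that belong to the "second half" of the encounter is still left to you.

```python
from datetime import datetime


def mark_last_encounter_in_progress(bundle):
    """Mark the most recent Encounter in a patient Bundle as in-progress.

    Hypothetical helper: returns the modified Encounter resource, or None
    if the Bundle contains no Encounter. The Bundle is edited in place.
    """
    encounters = [
        entry["resource"]
        for entry in bundle.get("entry", [])
        if entry["resource"]["resourceType"] == "Encounter"
    ]
    if not encounters:
        return None

    # Assumes ISO 8601 timestamps with a numeric offset, e.g. 2019-12-20T10:00:00-05:00.
    last = max(
        encounters,
        key=lambda enc: datetime.fromisoformat(enc["period"]["start"]),
    )
    last["status"] = "in-progress"   # valid Encounter.status code in R4
    last["period"].pop("end", None)  # an ongoing encounter has no end time yet
    return last
```

Load a patient file with `json.load`, call the helper, and write the Bundle back out before posting it to a server.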

view this post on Zulip Jason Walonoski (Jan 07 2020 at 13:14):

also, how do I get only the modules I want? I removed all the modules except for my example module, but I am still generating a bunch of immunizations and wellness visit encounters.

Immunizations and Wellness visits are built in (hard-coded, if you will) and can't be disabled at this moment. If you really want to do that, comment out these lines:

https://github.com/synthetichealth/synthea/blob/65d032592b992f5f09fee03173ebbd47e9a7e1e6/src/main/java/org/mitre/synthea/modules/EncounterModule.java#L110-L113

and change the return interval below to some huge number of years (e.g. return 900;):

https://github.com/synthetichealth/synthea/blob/65d032592b992f5f09fee03173ebbd47e9a7e1e6/src/main/java/org/mitre/synthea/modules/EncounterModule.java#L204

I haven't tried those steps though, so buyer beware...

view this post on Zulip Jason Walonoski (Feb 11 2020 at 21:19):

Synthea is now soliciting contributions to our new Synthea-international repository: https://github.com/synthetichealth/synthea-international

Thanks to an open-source contributor, our first non-US location is Finland!

This is a work in progress, and there are probably some bugs. If you'd like synthetic data outside of the US, please use or contribute to Synthea-international!

view this post on Zulip Jason Walonoski (Jun 04 2020 at 14:44):

The latest Synthea code on GitHub (the master branch) now has a fully functional COVID19 model. This data has been used in several hackathon events already, and is now being featured in this VHA and FDA challenge: https://precision.fda.gov/challenges/11

The model features tests, diagnoses, treatments, lab work, use of ventilators and critical supplies. We hope this helps the FHIR community work towards developing Health IT solutions to address the pandemic!

view this post on Zulip Michael van der Zel (Jul 23 2020 at 15:53):

@Jason Walonoski We have forked synthea here https://github.com/dHealthNL/synthea to add Dutch specific config. We also need to add some switches in the sources to disable ethnicity and race stuff. Where do you want those changes?
UPDATE: I think I have the answer. I put them in my synthea-international fork.

view this post on Zulip Michael van der Zel (Jul 23 2020 at 15:56):

@Jason Walonoski Another question: We created a module that creates a sequence of Encounters. We expect the result to be patients that have Encounters, but not all of them. The first Encounter is a GP visit, the next is a Diagnostics Consult, then an Operation, then a Follow Up. We then expect some patients to only have a GP visit and others to already be in the Follow Up. How can we do this? Thanks.

view this post on Zulip Jason Walonoski (Jul 27 2020 at 12:59):

@Michael van der Zel You've probably read these, but let me point you to the documentation: https://github.com/synthetichealth/synthea/wiki/Generic-Module-Framework%3A-States#encounter and https://github.com/synthetichealth/synthea/wiki/Generic-Module-Framework%3A-States#encounterend and https://github.com/synthetichealth/synthea/wiki/Generic-Module-Framework%3A-Transitions.

So, what all the documentation boils down to is this: each encounter should start with an Encounter state and end with an EncounterEnd state. So, put the EncounterEnd states between the encounter starts. Also note that the "Wellness" encounter waits until a regularly scheduled GP visit, whereas all the other types of encounters occur immediately.

Also, you can use the different types of transitions to select where patients go. For example, you can use a Distributed transition to make 80% go to the Consult, and 20% skip it. Or only 70% go to the follow-up encounter, or something like that. You also have the ability to use conditional logic that takes into account patient data or attributes (which are like variables that you read and write on each patient).

Let me know if you have questions. Happy to explain it all and walk you through it.

view this post on Zulip Jason Walonoski (Jul 27 2020 at 13:00):

If you don't want the next encounter to occur immediately use a Delay state.
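
To make the Encounter / EncounterEnd / Delay / Distributed pattern concrete, here is a rough sketch that builds such a module as a Python dict and dumps it to JSON. The state names, the 80/20 split, the two-week delay, and the reuse of one SNOMED check-up code for both encounters are illustrative assumptions, and the field names follow my reading of the Generic Module Framework wiki, so check the result in the module-builder. `check_module` is a hypothetical helper, not part of Synthea.

```python
import json

# Placeholder code reused for both encounters; substitute real codes per visit type.
PLACEHOLDER_CODE = {
    "system": "SNOMED-CT",
    "code": "185349003",
    "display": "Encounter for check up (procedure)",
}


def encounter(next_state):
    """Build an Encounter state that transitions directly to next_state."""
    return {
        "type": "Encounter",
        "encounter_class": "ambulatory",
        "codes": [PLACEHOLDER_CODE],
        "direct_transition": next_state,
    }


module = {
    "name": "Example Encounter Sequence",
    "states": {
        "Initial": {"type": "Initial", "direct_transition": "GP_Visit"},
        "GP_Visit": encounter("End_GP_Visit"),
        "End_GP_Visit": {
            "type": "EncounterEnd",
            # 80% continue to the consult, 20% stop after the GP visit.
            "distributed_transition": [
                {"distribution": 0.8, "transition": "Wait_For_Consult"},
                {"distribution": 0.2, "transition": "Terminal"},
            ],
        },
        # Without this Delay the consult would start immediately.
        "Wait_For_Consult": {
            "type": "Delay",
            "exact": {"quantity": 2, "unit": "weeks"},
            "direct_transition": "Consult",
        },
        "Consult": encounter("End_Consult"),
        "End_Consult": {"type": "EncounterEnd", "direct_transition": "Terminal"},
        "Terminal": {"type": "Terminal"},
    },
}


def check_module(mod):
    """Sanity-check that transitions point at real states and weights sum to 1."""
    states = mod["states"]
    for name, state in states.items():
        options = state.get("distributed_transition", [])
        targets = [opt["transition"] for opt in options]
        if "direct_transition" in state:
            targets.append(state["direct_transition"])
        if options:
            total = sum(opt["distribution"] for opt in options)
            assert abs(total - 1.0) < 1e-9, f"{name}: distributions sum to {total}"
        for target in targets:
            assert target in states, f"{name}: unknown state {target}"


check_module(module)
module_json = json.dumps(module, indent=2)
```

Writing `module_json` to a file in the modules directory should give a starting point that can then be refined visually in the module-builder.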

view this post on Zulip Jason Walonoski (Jul 27 2020 at 13:02):

Also, most modules have a Delay or Guard state in the beginning so they do not run as soon as the "baby" is born. If data does not appear that you think should be in the record, this could be the reason why. The encounters are happening to newborns, and Synthea's default is to only output the last 10 years of history.

view this post on Zulip Michael van der Zel (Aug 24 2020 at 14:47):

I don't think that answers my question. Will try to clarify.
We define a module that creates 2 Encounters with a delay: 1 for the GP visit and 1 after that for Surgery. Everybody does both visits. The challenge is that for some patients the Surgery is planned, but not done yet. This happens simply because the delay after the GP visit takes us past "today", so the Surgery encounter is not there yet.

view this post on Zulip Jason Walonoski (Aug 25 2020 at 13:34):

@Michael van der Zel I'm not sure what your question is, so I made you this module. Please take a look at it in the module-builder. I'm willing to have a Microsoft Teams call or something if you'd like. example.json

view this post on Zulip Carl Anderson (Sep 15 2020 at 18:25):

I have a question about loading Synthea-generated data into a local FHIR server. I'm happy to start a new stream for this, but for starters I thought I would glom on to this one. If anyone feels strongly that this should be a new stream, please don't hesitate to suggest it now.

So, here goes.

I have generated 1000 random patients using the us-core profile option, all geographically centered around the Madison, WI metropolitan area. That all worked beautifully.

From those files, I want to load the data into a local dockerized hapi server for my test application. Setting out to do so, I started small, loading the first 5 patients and right away I noticed something odd in the server data. There were duplicate organizations, locations, practitioners, etc. Any resource that was referenced by another resource would be duplicated in the server.

Doing a little digging, I see that each patient file declares all its dependent resources, so each individual patient file can be loaded and it gets its dependencies created. This is great if I just wanted to load one patient - but I want a fake EHR database. So, what I need is for each patient to reference a resource rather than include a copy of one.

Is there a way to configure Synthea to produce output files that I can load in a particular order to get what I want? I assume not, having spent some time hunting through the config properties looking for this. I'm happy to be wrong here. The generate.database_type option caught my eye, but I haven't experimented with the non-default setting yet. Anyway, insights into configuration options are welcome.

Assuming that I can't configure Synthea to generate a more loader-friendly format, I experimented with parsing the server response to load bundles and manually mapping the response 'server IDs' to the Synthea-generated 'file IDs'. I partially implemented this in a loader script, which attempts to save the ID mappings and even update all references before processing the bundle - but I ran into several problems doing this and abandoned the work several times after several false starts.

Before I dive into providing example data - let me stop myself here and ask if this makes sense and if there's an existing solution. Thanks!

view this post on Zulip Jason Walonoski (Sep 16 2020 at 00:16):

Are you using a recent version of Synthea? Provider organizations should all have the same UUID across the patient files.

view this post on Zulip Jason Walonoski (Sep 16 2020 at 00:17):

For example, with the latest code, I see a bunch of patients with this exact reference:

"organization": {
"reference": "urn:uuid:b1ddf812-1fdd-3adf-b1d5-32cc8bd07ebb",
"display": "BETH ISRAEL DEACONESS HOSPITAL - PLYMOUTH"
  }

view this post on Zulip Michael van der Zel (Sep 16 2020 at 09:03):

@Jason Walonoski I am trying to figure out how assigning Providers to Encounters works. Depending on the reason for the Encounter we need to assign a certain Provider. For example, the initial contact is always through a GP. Then for simple procedures the patient is referred to a normal hospital, and for more complex procedures the Encounter should be with an academic/specialist hospital. Can I configure this?

view this post on Zulip Jason Walonoski (Sep 16 2020 at 12:56):

In synthea.properties there is an option called generate.providers.selection_behavior with a few different options.

# Provider selection behavior
# How patients select a provider organization:
#  nearest - select the closest provider. See generate.providers.maximum_search_distance
#  quality - select the best provider if quality is known. Otherwise nearest.
#  random  - select randomly.
#  network - select the nearest provider in your insurance network. same as random except it changes every time the patient switches insurance provider.
generate.providers.selection_behavior = nearest

The behavior works separately for each type of provider. For instance, the GP/PCP is selected separately from an urgent care facility, which is selected separately from an emergency facility, and so on. Once a provider is selected, it often does not reset unless you are using network and a change in insurance causes the patients to "shop around." (this is US-centric and also a lie, since the US health system is so opaque and anti-patient that "shopping around" is effectively a fool's errand, or time pressure of care effectively eliminates choice).

If none of those options work for you, you can override the behavior by implementing https://github.com/synthetichealth/synthea/blob/master/src/main/java/org/mitre/synthea/world/agents/behaviors/IProviderFinder.java and then editing Provider.buildProviderFinder().

view this post on Zulip Carl Anderson (Sep 16 2020 at 15:05):

Yes, I've checked out the repo and built it from master (although, it's a month old by now). But I think the issue is that when I PUT the resources into my hapi server, they get a numeric ID assigned to them, throwing the uuid away. That breaks all the references.

Maybe this is a hapi-specific issue, then?

view this post on Zulip Jason Walonoski (Sep 16 2020 at 16:38):

Hmmm.... not really a HAPI issue. I wonder if we should change the transaction bundle for shared resources to use conditional create, and use Bundle.entry.request.ifNoneExist in those instances... we'd then have to change the references to all the shared resources to use "conditional references" (see https://www.hl7.org/fhir/http.html#trules).

view this post on Zulip Jason Walonoski (Sep 16 2020 at 16:39):

Yeah, @Carl Anderson your issue is definitely related to how we output the Bundle. The only issue with making that change is that I'm not sure how widely conditional create and conditional references are supported.

view this post on Zulip Carl Anderson (Sep 16 2020 at 19:47):

Ah, so something like this?

{
  "resourceType": "Bundle",
  "type": "transaction",
  "entry": [
    {
      "fullUrl": "urn:uuid:678f72e9-9de7-59d5-37ce-13a92d481dfe",
      "resource": {
        "resourceType": "Patient",
        "id": "678f72e9-9de7-59d5-37ce-13a92d481dfe",
        "identifier": [
          {
            "system": "https://github.com/synthetichealth/synthea",
            "value": "678f72e9-9de7-59d5-37ce-13a92d481dfe"
          }
        ]
      },
      "request": {
        "method": "POST",
        "url": "Patient",
        "ifNoneExist": "identifier=https://github.com/synthetichealth/synthea|678f72e9-9de7-59d5-37ce-13a92d481dfe"
      }
    }
  ]
}

view this post on Zulip Carl Anderson (Sep 16 2020 at 19:50):

This is assuming that the server accepts the uuids as valid resource IDs, I suppose, too. I'm still not sure how to approach this when I have to map the synthea uuids to server-issued IDs, but I feel this may be one of the missing pieces.

view this post on Zulip Carl Anderson (Sep 16 2020 at 20:05):

Oh, and I would need to update all references, too, to use conditional references. So, would that be something like this?

From:

        "managingOrganization": {
          "reference": "urn:uuid:e6c5d179-370a-3659-9ce6-3d09da3c3ad0",
          "display": "UNITYPOINT HEALTH - MERITER"
        }

To:

        "managingOrganization": {
          "reference": "Organization?identifier=e6c5d179-370a-3659-9ce6-3d09da3c3ad0",
          "display": "UNITYPOINT HEALTH - MERITER"
        }

view this post on Zulip Jason Walonoski (Sep 17 2020 at 13:46):

Regarding the first example, yes, except I would only do that for the shared resources (e.g. Organization). It doesn't matter if the server does not accept the UUIDs as ids, because the UUIDs will be in the identifiers (which is different). For your second/third examples, yes, but again only for the shared resources (not all resources).
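
A minimal sketch of that transformation, applied only to shared resource types, might look like the following. The function names and the `SHARED_TYPES` set are assumptions made for illustration. The sketch uses each shared resource's first identifier, assumes every entry already has a `request` element (as in a transaction Bundle), and leaves the search string unescaped, as in the examples above.

```python
SHARED_TYPES = {"Organization", "Practitioner", "Location"}  # assumed set of shared types


def search_url(resource):
    """Build 'Type?identifier=system|value' from the resource's first identifier."""
    ident = resource["identifier"][0]
    return f"{resource['resourceType']}?identifier={ident['system']}|{ident['value']}"


def rewrite_references(node, replacements):
    """Recursively replace Reference.reference values listed in replacements."""
    if isinstance(node, dict):
        ref = node.get("reference")
        if isinstance(ref, str) and ref in replacements:
            node["reference"] = replacements[ref]
        for value in node.values():
            rewrite_references(value, replacements)
    elif isinstance(node, list):
        for item in node:
            rewrite_references(item, replacements)


def make_shared_conditional(bundle, shared_types=SHARED_TYPES):
    """Add ifNoneExist to shared resources and point references at a search URL."""
    replacements = {}
    for entry in bundle["entry"]:
        resource = entry["resource"]
        if resource["resourceType"] in shared_types and resource.get("identifier"):
            url = search_url(resource)
            # ifNoneExist carries only the query part of the search URL.
            entry["request"]["ifNoneExist"] = url.split("?", 1)[1]
            replacements[entry["fullUrl"]] = url
    # Non-shared resources keep their urn:uuid references untouched.
    for entry in bundle["entry"]:
        rewrite_references(entry["resource"], replacements)
    return bundle
```

References between non-shared resources (for example Encounter to Patient) stay as `urn:uuid:` values, which the server resolves within the same transaction.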

view this post on Zulip Carl Anderson (Sep 17 2020 at 14:04):

Thanks, @Jason Walonoski , that's essentially what I found in my testing. By adding the ifNoneExist to the bundle.entry.request - I've avoided creating duplicates in the data I've loaded. By changing the reference values to use the Synthea identifier, I've created the correct linkings in the loaded data, too. :-D

However, I don't understand why I wouldn't want to apply this transformation to references of non-shared resources. At the point I'm loading a file, how would I know if a resource would be shared in a different file that I haven't opened yet? Can you say more about this?

view this post on Zulip Josh Mandel (Sep 17 2020 at 20:03):

I'm trying to remember how this syntax works...

{
          "reference": "Organization?identifier=e6c5d179-370a-3659-9ce6-3d09da3c3ad0",
          "display": "UNITYPOINT HEALTH - MERITER"
}

... and it doesn't seem documented at http://build.fhir.org/references-definitions.html#Reference.reference or https://www.hl7.org/fhir/references.html or http://build.fhir.org/bundle.html#references . Do you know the right place to look @Jason Walonoski ?

view this post on Zulip Carl Anderson (Sep 17 2020 at 20:45):

@Josh Mandel https://www.hl7.org/fhir/http.html#trules has it.

Also, what I'm seeing is that if the referenced resource has no identifier, this type of reference won't work. The other catch seems to be that, with the Synthea data, some resources have an identifier attribute but the uuid does not always appear as a valid value. So, in order to make my data load work I need to:

  • if there's an identifier attribute:
    • if there is no identifier with value="my uuid"
      • include the synthea identifier in the list of identifiers

This is true of Practitioners, FWIW, in the data set I generated. They have an MRN identifier, but (at least in some cases) nothing else.

view this post on Zulip Josh Mandel (Sep 17 2020 at 20:53):

Thanks for the link! So this approach is interesting, and will work in environments that support FHIR's transaction interaction.

view this post on Zulip Josh Mandel (Sep 17 2020 at 20:55):

In synthea's data, I'd assume all Practitioners and Organizations will have an identifier -- so the generator could have a "create transaction bundles as outputs" mode that takes advantage of these identifiers for all references. I'm not sure I follow the nested bullets above.

view this post on Zulip Carl Anderson (Sep 17 2020 at 21:07):

In code I have this:

class EntryReferenceUpdater:
    """Updates references to use an ID and creates Resources conditionally."""

    _id_type = {}
    _identifier = 'https://github.com/synthetichealth/synthea'

    def update_identifier(self, resource):
        """Ensure the resource's own id also appears among its identifiers."""
        # Resources without an identifier list are left untouched.
        if 'identifier' not in resource:
            return
        uuid = resource['id']
        synthea_id = {
            'system': self._identifier,
            'value': uuid,
        }
        # Append the Synthea identifier only if no identifier carries the uuid yet.
        if not any(x.get('value') == uuid for x in resource['identifier']):
            resource['identifier'].append(synthea_id)

view this post on Zulip Josh Mandel (Sep 17 2020 at 22:00):

But the important thing should not only be identifiers that match the Resource.id; like for example in https://r4.smarthealthit.org/Practitioner (synthea generated data) every practitioner has an identifier in the NPI namespace that should work just as well.
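
One way to act on that is to pick whichever identifier is most useful for a search-based reference, instead of always appending the resource id. This is a hedged sketch: the helper names are hypothetical, `http://hl7.org/fhir/sid/us-npi` is the standard NPI identifier system, and whether a given Synthea version emits it for every Practitioner is an assumption to verify against your own data.

```python
NPI_SYSTEM = "http://hl7.org/fhir/sid/us-npi"
SYNTHEA_SYSTEM = "https://github.com/synthetichealth/synthea"


def pick_identifier(resource, preferred=(NPI_SYSTEM, SYNTHEA_SYSTEM)):
    """Return the first identifier from a preferred system, else any usable one."""
    usable = [
        ident for ident in resource.get("identifier", [])
        if ident.get("system") and ident.get("value")
    ]
    for system in preferred:
        for ident in usable:
            if ident["system"] == system:
                return ident
    return usable[0] if usable else None


def conditional_reference(resource, preferred=(NPI_SYSTEM, SYNTHEA_SYSTEM)):
    """Build 'Type?identifier=system|value', or None if nothing is searchable."""
    ident = pick_identifier(resource, preferred)
    if ident is None:
        return None
    return f"{resource['resourceType']}?identifier={ident['system']}|{ident['value']}"
```

A resource for which `conditional_reference` returns None is the case where an identifier still has to be added before a search-based reference can work.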

view this post on Zulip Carl Anderson (Sep 17 2020 at 23:13):

Yeah, and I'm starting to understand that even this (my experimental) approach may not work.

The problem is with forward references.

I can't simply replace all references with a conditional reference because at the time of load - the target resource doesn't exist yet. So, it appears I've solved the duplicate resource problem and replaced it with a different one. Right now I'm getting 'stub' resources which, I believe, are being created by forward conditional references.

I could try to create a partial ordering of the resources in the bundle to eliminate forward references, but I don't believe I can rely on the server to create the resources mid bundle-transaction. I'm wondering if I would have to create entire separate bundles...

view this post on Zulip Josh Mandel (Sep 17 2020 at 23:21):

It's not clear whether FHIR transaction processors should be able (or required) to discover these dependencies (forward reference via search + conditional creation of a resource that will satisfy this search) within a transaction bundle.

view this post on Zulip Josh Mandel (Sep 17 2020 at 23:22):

Pragmatically, creating a partial ordering by type and then loading type specific bundles would probably work fine. You wouldn't get true transactional behavior across the bundles of course.
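
A rough sketch of that type-by-type split follows. The function name and the choice to deduplicate on `fullUrl` are assumptions. Because `urn:uuid:` references only resolve inside a single Bundle, this only helps when cross-bundle references have already been rewritten as search-based references (or the server preserves the ids).

```python
def split_by_type(bundles, type_order):
    """Regroup entries from many patient Bundles into one Bundle per resource type.

    Types named in type_order come first, in that order; any other types
    follow in the order they were first seen. Entries repeated across files
    (same fullUrl) are kept once.
    """
    by_type = {}
    seen = set()
    for bundle in bundles:
        for entry in bundle.get("entry", []):
            if entry["fullUrl"] in seen:
                continue  # shared resource already collected from another file
            seen.add(entry["fullUrl"])
            by_type.setdefault(entry["resource"]["resourceType"], []).append(entry)

    ordered = [t for t in type_order if t in by_type]
    ordered += [t for t in by_type if t not in type_order]
    return [
        {"resourceType": "Bundle", "type": "transaction", "entry": by_type[t]}
        for t in ordered
    ]
```

Each returned Bundle is loaded as its own transaction, so a failure partway through leaves the earlier types already committed.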

view this post on Zulip Jason Walonoski (Sep 18 2020 at 13:16):

There are options in Synthea to output the Organizations and Practitioners in separate bundles: exporter.hospital.fhir.export and exporter.practitioner.fhir.export. The only issue with this is that the data duplicates what is exported in the patient bundles.

view this post on Zulip Jason Walonoski (Sep 18 2020 at 13:17):

We could make a change, where if those options are enabled, that data is not duplicated in the patient bundles. Then the user would have to load the Organizations and Practitioners before loading any of the patient bundles.

view this post on Zulip Josh Mandel (Sep 18 2020 at 13:43):

That would be sweet! (Even with duplication, these type-specific bundles plus the pattern of search-based references and ifNoneExist conditions would make loading a lot easier.)

view this post on Zulip Josh Mandel (Sep 18 2020 at 13:44):

I'm kind of surprised that this hasn't come up before; guess folks are generally happy to have isolated graphs per patient, and willing to tolerate duplicate demo data for the benefits of isolation!

view this post on Zulip Jason Walonoski (Sep 18 2020 at 14:13):

New Synthea issue created https://github.com/synthetichealth/synthea/issues/795. Feel free to express implementation preferences there.

view this post on Zulip Eric Haas (Sep 21 2020 at 17:46):

to create a partial ordering of the resources in the bundle to eliminate forward references,

That is what I do

order based on resource type... putting patients, practitioners, orgs first since they are all referenced by Groups, etc.

here is an example

view this post on Zulip Eric Haas (Sep 21 2020 at 17:50):

# re-sort the list so it will load nicely as a transaction

# will need to customize this list for each case

# pprint([f"{x['resourceType']}/{x['id']}" for x in r_list])

keyorder = ('Patient', 'Practitioner', 'Organization', 'Questionnaire', 'QuestionnaireResponse', 'Group')

# note: index() raises ValueError for any resourceType missing from keyorder
r_list.sort(key=lambda x: keyorder.index(x['resourceType']))

# print('---')
# pprint([f"{x['resourceType']}/{x['id']}" for x in r_list])
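
If the resource list may contain types that are not in the ordering, a tolerant sort key avoids the ValueError from `index()`. This is a small variant sketch; `type_rank` is a hypothetical name, and sending unlisted types to the end is an assumption about what is wanted.

```python
def type_rank(resource, keyorder):
    """Position of the resource's type in keyorder; unlisted types sort last."""
    try:
        return keyorder.index(resource["resourceType"])
    except ValueError:
        return len(keyorder)


keyorder = ("Patient", "Practitioner", "Organization", "Group")
r_list = [
    {"resourceType": "Group", "id": "g1"},
    {"resourceType": "Observation", "id": "obs1"},  # not in keyorder
    {"resourceType": "Organization", "id": "o1"},
    {"resourceType": "Patient", "id": "p1"},
]
# list.sort is stable, so resources of the same type keep their relative order.
r_list.sort(key=lambda res: type_rank(res, keyorder))
```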

view this post on Zulip Eric Haas (Sep 21 2020 at 17:55):

@Carl Anderson , @Josh Mandel For Argo PL we are strongly leaning toward requiring reference.reference, so only providing reference.identifier won't be conformant. It looks like when processing a transaction the server will attach an id to all resources. Is that what happens?

view this post on Zulip Josh Mandel (Sep 21 2020 at 18:50):

That is correct. This discussion is just about having a convenient format for quickly getting data into a server; the data in a server would be expected to always populate reference.reference

view this post on Zulip Marc Hadley (Oct 12 2020 at 15:20):

I'm working on addressing this in Synthea and my testing against hapi.fhir.org has revealed a couple of things:

  1. If the bundle is of type transaction then the post fails for all entries in the bundle if any of the entries fail the ifNoneExist precondition
  2. If you add a resource using a reference.identifier, the server does not populate the reference.reference in the resource. For example see the managingOrganization in this location that references this organization

I can get around 1 to some extent by using a batch bundle instead of a transaction, still noodling on what to do about 2.

view this post on Zulip Marc Hadley (Oct 13 2020 at 21:10):

Some additional findings:

  • If the bundle is of type transaction then backward reference.reference to entry.fullUrl are resolved and converted to ResourceType/CreatedResourceID references
  • If the bundle is of type batch then even backward reference.reference to entry.fullUrl fail to resolve and cause an error when trying to create the referencing resource
  • Synthea generated FHIR Organization resources will have the same identifier.value and id across runs and this will soon be the case for Location resources

This makes it tricky to create a file or set of files that will:

  • Load without error into a server that already has some existing Synthea generated data, and
  • Uses reference.reference throughout instead of reference.identifier, and
  • Not create duplicate resources

view this post on Zulip Josh Mandel (Oct 13 2020 at 21:14):

Thanks for the experimentation and analysis here! Have you assessed which of these behaviors are "working according to the fhir spec" vs "undefined by the fhir spec / grey areas" vs "hapi quirks/bugs"?

view this post on Zulip Marc Hadley (Oct 13 2020 at 21:28):

I haven't been able to find a clear treatment of how preconditions within a transaction bundle should work. It seems that HAPI treats a positive ifNoneExist match as a failure, but you could argue it the other way. Here's a link to the conditional create text: https://www.hl7.org/fhir/http.html#ccreate.

view this post on Zulip Marc Hadley (Oct 13 2020 at 21:35):

I did just find this, though: "in a transaction (and only in a transaction), references to resources may be replaced by a search URI that describes how to find the correct reference" (in this section: https://www.hl7.org/fhir/http.html#trules ). That would help with references across multiple files, but since it only applies to transactions there's still the issue of avoiding duplicates, and it doesn't help with the batch approach, where duplicates wouldn't be an issue.

view this post on Zulip Marc Hadley (Oct 14 2020 at 12:55):

Here is an approach I think could work:

  • Output the Organization, Location and Practitioner in separate bundles using the exporter.hospital.fhir.export and exporter.practitioner.fhir.export options. We will modify the exporter so that these are output as batch bundles with ifNoneExist preconditions. We will also modify the Location export so that the managingOrganization reference uses the Organization identifier to allow it to import without reference errors.
  • We will modify the per-patient transaction bundle exporter so that, when the exporter.hospital.fhir.export and exporter.practitioner.fhir.export options are specified, Organization, Location and Practitioner resources will be omitted. We will utilize search URI references in any resource in the patient bundle that needs to refer to an Organization, Location or Practitioner.

You would first load the hospital and practitioner bundles, then each of the patient bundles. The only reference that wouldn't be fully resolved would be the Location.managingOrganization. There should be no duplicate Organization, Location or Practitioner resources once the load is completed.
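
The load order described here can be sketched as below. The file-name prefixes `hospitalInformation` and `practitionerInformation` are my assumption about how Synthea names those exports, the function names are hypothetical, and `post_bundle` simply POSTs a Bundle file to the server base URL with no retry or error handling.

```python
import os
import urllib.request


def load_order(paths):
    """Sort Bundle files so hospital and practitioner exports load before patients."""
    def rank(path):
        name = os.path.basename(path)
        if name.startswith("hospitalInformation"):      # assumed file-name prefix
            return 0
        if name.startswith("practitionerInformation"):  # assumed file-name prefix
            return 1
        return 2  # everything else is treated as a patient bundle
    return sorted(paths, key=lambda p: (rank(p), p))


def post_bundle(base_url, path):
    """POST one Bundle file to the FHIR server base URL and return the HTTP status."""
    with open(path, "rb") as handle:
        body = handle.read()
    request = urllib.request.Request(
        base_url,
        data=body,
        headers={"Content-Type": "application/fhir+json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

Usage would be along the lines of `for path in load_order(glob.glob("output/fhir/*.json")): post_bundle(base_url, path)`, where `base_url` is your own server's FHIR endpoint.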

Thoughts?

view this post on Zulip Josh Mandel (Oct 14 2020 at 13:28):

your proposal sounds great and should definitely address the concern that began this thread.

I also think there are a couple of finer points that we should hash out in the fhir specification if we can reach consensus about correct behaviors.

  1. is it correct that a failed precondition should abort a transaction bundle rather than skipping a specific entry as not applicable? also do we have clarity on whether these preconditions apply to the state of the server only before the transaction is processed or whether it's okay if a sibling entry in the bundle would be the one to rescue the precondition? This echoes some of the authorization logic that @Paul Church is proposing in #smart

  2. do servers have (or should they have) a way to indicate whether they will process search-based references versus failing on them?

view this post on Zulip Julius Severin (Jan 15 2021 at 13:21):

I am using Synthea to generate test data for my master's thesis.
As I want to see how my code performs over different sizes of datasets, I wanted to generate data for the same patient while changing years_of_history.

  1. I noticed that the IDs are not the same in the different datasets. Is there a way to achieve this?
  2. To achieve the same IDs across datasets I wanted to generate the dataset with most years and then generate the others from it. When going through the dataset I noticed that it also contains entries that are not within the range defined in years_of_history. Can somebody tell me why this is the case?
  3. This article states that Synthea can generate up to 10 years of medical history. Is this true or can it actually generate more? I went up to the 72 years of my patient and the size of the dataset definitely increased.

Thank you in advance.

view this post on Zulip Jason Walonoski (Jan 15 2021 at 14:33):

In src/main/resources/synthea.properties the property defaults to exporter.years_of_history = 10

view this post on Zulip Jason Walonoski (Jan 15 2021 at 14:33):

Change the value to 0 if you want all patient history, or 50 if you want 50 years, etc.

view this post on Zulip Jason Walonoski (Jan 15 2021 at 14:35):

Some items are unaffected by the years of history filter. For example, if a patient was diagnosed with Diabetes or another chronic condition prior to the filtering period (say diagnosed 11 years ago, if the filter is 10 years) and that condition was not resolved (for example, with a condition end date) then it will be included.

view this post on Zulip Jason Walonoski (Jan 15 2021 at 14:44):

If you want IDs the same across datasets use seeds (e.g. -s 123 and -cs 456) and a reference date (e.g. -r 20200115).

For example:

./run_synthea -s 123 -cs 456 -p 1 -r 20200115 --exporter.years_of_history=10
./run_synthea -s 123 -cs 456 -p 1 -r 20200115 --exporter.years_of_history=20

The data across the two files (you should move or rename them between runs because it will try to rewrite the files in /output) will be identical in the overlapping years.
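
To check the overlap between two such runs, a small comparison like the following could be used once both output files have been loaded with `json.load`. The helper names are hypothetical, and whether every id in the shorter history really reappears in the longer one depends on the Synthea version, so treat the result as a check rather than a guarantee.

```python
def resource_ids(bundle):
    """Collect (resourceType, id) pairs for every entry in a Bundle."""
    return {
        (entry["resource"]["resourceType"], entry["resource"]["id"])
        for entry in bundle.get("entry", [])
    }


def compare_runs(shorter, longer):
    """Summarise how the shorter-history Bundle overlaps with the longer one."""
    short_ids = resource_ids(shorter)
    long_ids = resource_ids(longer)
    return {
        "shared": len(short_ids & long_ids),
        "only_in_shorter": sorted(short_ids - long_ids),
        "only_in_longer": len(long_ids - short_ids),
    }
```

An empty `only_in_shorter` list means every resource from the 10-year run was found, with the same id, in the 20-year run.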


Last updated: Apr 12 2022 at 19:14 UTC