Support synthetic data import #63

minrk · 2025-04-12T15:16:46Z

Both for testing and demo purposes, it would be extremely useful to support synthetic data import, e.g. from Synthea. With that data, we could produce tests and example notebooks that we can actually display publicly, without needing to populate 'real' data, which we then can't use as examples in documentation.

We'd have to do our own part to handle e.g. consents of fake accounts, etc. but I think that should not be hard.

s1monj · 2025-04-12T21:24:24Z

@minrk sounds good, but I don't see any examples of generating device/wearable Observations over a time frame. I came across this from 2018, but that's per day. Were you thinking of creating a custom module?

minrk · 2025-04-13T06:08:23Z

I’m not sure what the best way is. If there’s appropriate sample data online that someone else has already published, slotting that in would work.

For BP, I was considering just generating data with Synthea and inserting the values into what we have to create records. CGM has much more characteristic curves, which that approach wouldn’t work for. Even sampling a hand-drawn curve would be okay.
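As a rough illustration of the "sample a hand-drawn curve" idea, here is a minimal sketch that linearly interpolates between a few control points and samples them every 5 minutes. The control points, the 5-minute interval, and the output field names (`time`, `glucose_mg_dl`) are all made up for illustration; they are not the JHE schema.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical hand-drawn control points for one day:
# (minutes since midnight, glucose in mg/dL). Values are illustrative only.
CONTROL_POINTS = [
    (0, 95), (420, 90), (480, 150), (600, 105), (750, 160),
    (900, 100), (1140, 170), (1320, 110), (1440, 95),
]


def interpolate(points, x):
    """Linearly interpolate the curve defined by sorted (x, y) points at x."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError(f"{x} is outside the range of the control points")


def sample_curve(points, interval_minutes=5):
    """Sample the curve at a fixed interval, returning timestamped readings."""
    start = datetime(2025, 1, 1, tzinfo=timezone.utc)  # arbitrary start date
    first, last = points[0][0], points[-1][0]
    return [
        {
            "time": (start + timedelta(minutes=m)).isoformat(),
            "glucose_mg_dl": round(interpolate(points, m), 1),
        }
        for m in range(first, last, interval_minutes)
    ]


samples = sample_curve(CONTROL_POINTS)
```

Adding a little random noise per sample, or shifting the control points per subject, would be one way to avoid every fake patient having an identical day.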

minrk · 2025-04-15T14:00:47Z

@maryamv brought up iglu, which has some sample data, which appears to originate from https://doi.org/10.1371/journal.pbio.2005143.s010 in this paper (105k records, 57 subjects, ~2k records per subject; most for about a week, some for much longer but similar sample count). We could use that to populate synthetic CGM data.

I think for BP, a random walk within range really ought to be enough, or we could extract numbers from Synthea, like I did here.
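A minimal sketch of the random-walk idea for BP, assuming one reading per day. The ranges, step sizes, starting values, and field names (`time`, `systolic`, `diastolic`) are assumptions for illustration, not clinical guidance and not the JHE schema.

```python
import random
from datetime import datetime, timedelta, timezone


def bounded_random_walk(start, low, high, step, n, rng):
    """Return n integer values from a random walk clamped to [low, high]."""
    values = []
    current = start
    for _ in range(n):
        current += rng.uniform(-step, step)
        current = min(high, max(low, current))  # keep the walk within range
        values.append(round(current))
    return values


def synthetic_bp_readings(n=14, seed=0):
    """Generate n daily blood pressure readings (illustrative ranges, mmHg)."""
    rng = random.Random(seed)  # seeded so tests and demos are reproducible
    systolic = bounded_random_walk(120, 100, 140, 5, n, rng)
    diastolic = bounded_random_walk(80, 60, 90, 4, n, rng)
    start = datetime(2025, 1, 1, 8, 0, tzinfo=timezone.utc)  # arbitrary start
    return [
        {
            "time": (start + timedelta(days=i)).isoformat(),
            "systolic": s,
            "diastolic": d,
        }
        for i, (s, d) in enumerate(zip(systolic, diastolic))
    ]


readings = synthetic_bp_readings(n=14, seed=42)
```

Seeding the generator keeps the output stable between runs, which matters if the same data backs both tests and published example notebooks.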

The main thing is:

- generating all the right fields for our schema
- loading it into JHE so we can run a test or demo 'for real' against JHE, but with non-sensitive output
