Support synthetic data import #63
Comments
I’m not sure what the best way is. If someone else has published appropriate sample data online, slotting that in would work. For BP, I was considering just generating data with Synthea and inserting the values into what we have to create records. CGM has much more characteristic curves, so that approach wouldn’t work for it. Even sampling a hand-drawn curve would be okay.
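To make the "sampling a hand-drawn curve" idea concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the control points, the 5-minute sampling interval, and the function names are hypothetical and not taken from any dataset or from the existing codebase.

```python
from datetime import datetime, timedelta

# Hypothetical hand-drawn control points: (hours since midnight, glucose in mg/dL).
# The values are illustrative only, not taken from any dataset.
CONTROL_POINTS = [
    (0.0, 95.0),
    (7.0, 90.0),
    (8.0, 150.0),   # rough breakfast peak
    (10.0, 100.0),
    (13.0, 160.0),  # rough lunch peak
    (16.0, 105.0),
    (19.5, 170.0),  # rough dinner peak
    (24.0, 95.0),
]


def interpolate(points, x):
    """Linearly interpolate y at x from sorted (x, y) control points."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError(f"x={x} is outside the curve")


def sample_curve(points, start, interval_minutes=5):
    """Return (timestamp, value) pairs covering one day at a fixed interval."""
    samples = []
    for i in range(24 * 60 // interval_minutes):
        minutes = i * interval_minutes
        value = interpolate(points, minutes / 60)
        samples.append((start + timedelta(minutes=minutes), round(value, 1)))
    return samples


# One day of synthetic readings starting at an arbitrary date.
readings = sample_curve(CONTROL_POINTS, datetime(2024, 1, 1))
```

Linear interpolation keeps the sketch dependency-free; a smoother curve or added noise would probably look more realistic, but that is a separate choice.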
@maryamv brought up iglu, which has some sample data that appears to originate from https://doi.org/10.1371/journal.pbio.2005143.s010 in this paper (105k records, 57 subjects, ~2k records per subject; most cover about a week, some span much longer with a similar sample count). We could use that to populate synthetic CGM data. For BP, I think a random walk within range really ought to be enough, or we could extract numbers from Synthea, like I did here. The main thing is:
Both for testing and demo purposes, it would be extremely useful to support synthetic data import, e.g. from Synthea. With that data, we could produce tests and example notebooks that we can display publicly, without having to populate 'real' data, which we then can't use as examples in documentation.
We'd have to do our own part to handle e.g. consents for fake accounts, but I think that should not be hard.
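As a rough illustration of the "random walk within range" idea for BP raised in the comments, a sketch could look like the following. The function names, the value ranges, the step sizes, the daily spacing, and the record fields are all assumptions made for this example; none of them reflect the project's actual schema or clinically validated bounds.

```python
import random
from datetime import datetime, timedelta


def bounded_walk(start, low, high, step, n, rng):
    """Random walk of n integer values, clamped to [low, high]."""
    values = []
    current = start
    for _ in range(n):
        current += rng.uniform(-step, step)
        current = max(low, min(high, current))  # keep the walk inside the range
        values.append(round(current))
    return values


def synthetic_bp(n, start_time, seed=0):
    """Generate n hypothetical daily BP records (field names are assumed)."""
    rng = random.Random(seed)  # seeded so tests and notebooks are reproducible
    # Assumed ranges in mmHg, chosen only so the two series never overlap.
    systolic = bounded_walk(120, 100, 140, 4, n, rng)
    diastolic = bounded_walk(80, 60, 90, 3, n, rng)
    return [
        {"timestamp": start_time + timedelta(days=i), "systolic": s, "diastolic": d}
        for i, (s, d) in enumerate(zip(systolic, diastolic))
    ]


records = synthetic_bp(30, datetime(2024, 1, 1))
```

Seeding the generator means the same records come out on every run, which matters if the data backs tests or published example notebooks.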