Change references to "Synthetic Data" #1036
For a 0.18.6 release
For a 0.19 release
Document changes and sundries. No new release.
For a 0.19.1 release
Fix TransformedTarget example in manual (no new release)
Some small documentations improvements. Not to trigger a new release.
Add new auto-generated Model Browser section to the manual. Not to trigger new release.
Update documentation
Doc fixes. No new release.
Update to the manual. No new release.
For a 0.19.2 release
The term "synthetic data" is ambiguous here, because in statistics, it usually refers to "fake data" created by fitting a generative model to a dataset, then drawing random data with the correct statistical properties from it. (Usually to preserve the privacy of participants while still providing a dataset for replication.)
Thanks @ParadaCarleton for this. Sure, within some statistical communities there might be some confusion. But looking around (see e.g. https://en.wikipedia.org/wiki/Synthetic_data), it seems a more general conception of "synthetic data" is pretty common. How about we just add a clarifying sentence at the top of the "Generating Synthetic Data" section; something like
? I think "example data" is too broad a term. This could be anything, real or imagined.
Mostly I suggested this because when I tried to look up "Synthetic Data Generation Julia" or "Synthetic Population Julia" I kept getting this as a result 😅
@ParadaCarleton Are you not happy with the smaller clarification I suggested? Your latest commit does not reflect it.
Codecov Report
@@           Coverage Diff           @@
##              dev    #1036   +/-   ##
=======================================
  Coverage   60.97%   60.97%
=======================================
  Files           2        2
  Lines          41       41
=======================================
  Hits           25       25
  Misses         16       16
Closed as rendered redundant by 2178c10