GrowthViz was developed in partnership between the Health FFRDC and CDC, with feedback from leading health researchers, to support post-processing and data visualization of growthcleanr output.
3
+
GrowthViz was developed in partnership between the Health FFRDC and CDC, with
4
+
feedback from leading health researchers, to support post-processing and data
5
+
visualization of growthcleanr output.
4
6
5
-
The objective of this tool is to allow users to conduct post-processing and data visualization of growthcleanr output. [growthcleanr](https://github.com/carriedaymont/growthcleanr) is an automated method for cleaning longitudinal pediatric growth data from EHRs. It provides an environment that includes graphical user interfaces as well as interactive software development to explore data.
7
+
The objective of this tool is to allow users to conduct post-processing and data
8
+
visualization of growthcleanr output.
9
+
[growthcleanr](https://github.com/carriedaymont/growthcleanr) is an automated
10
+
method for cleaning longitudinal anthropometric data from EHRs. GrowthViz
11
+
provides an environment that includes graphical user interfaces as well as
12
+
interactive software development to explore data.
6
13
7
14
## Contents
8
15
@@ -15,84 +22,201 @@ The objective of this tool is to allow users to conduct post-processing and data
15
22
16
23
## Git Repository Information
17
24
18
-
The latest code for this project should run `GrowthViz-pediatrics.ipynb` or `GrowthViz-adults.ipynb`, depending on the user's patient population.
25
+
To use the latest code for this project, run `GrowthViz-pediatrics.ipynb` or
26
+
`GrowthViz-adults.ipynb`, depending on the user's patient population.
19
27
20
-
The notebook requires Python 3, Jupyter Notebook, Pandas, Matplotlib and Seaborn. Some widgets also require the Qgrid extension enabled in Jupyter. The `.csv` files in the repository are the source data required to run the notebook. Custom data should replace these files in the same format. For more details see [the simple install instructions below.](#simple-install)
28
+
The notebook requires Python 3, Jupyter Notebook, Pandas, Matplotlib and
29
+
Seaborn. Some widgets also require the Qgrid extension enabled in Jupyter. The
30
+
`.csv` files in the repository are the source data required to run the notebook.
31
+
Custom data should replace these files in the same format. For more details see
The objective of this tool is to allow users to conduct post-processing and data visualization of growthcleanr output. [growthcleanr](https://github.com/carriedaymont/growthcleanr) is an automated method for cleaning longitudinal pediatric growth data from EHRs. It is available as open source software. GrowthViz is to be used **after** a data set has been run through growthcleanr.
36
+
The objective of this tool is to allow users to conduct post-processing and data
37
+
visualization of growthcleanr output.
38
+
[growthcleanr](https://github.com/carriedaymont/growthcleanr) is an automated
39
+
method for cleaning longitudinal anthropometric growth data from EHRs. It is
40
+
available as open source software. GrowthViz is to be used **after** a data set
41
+
has been run through growthcleanr.
25
42
26
43
### Background
27
44
28
-
As stated in [Automated identification of implausible values in growth data from pediatric electronic health records](https://academic.oup.com/jamia/article/24/6/1080/3767271):
45
+
As stated in
46
+
[Automated identification of implausible values in growth data from pediatric electronic health records](https://academic.oup.com/jamia/article/24/6/1080/3767271):
47
+
48
+
> In pediatrics, evaluation of growth is fundamental, and many pediatric
49
+
> research studies include some aspect of growth as an outcome or other
50
+
> variable. The clinical growth measurements obtained in day-to-day care are
51
+
> susceptible to error beyond the imprecision inherent in any anthropometric
52
+
> measurement. Some errors result from minor problems with measurement
53
+
> technique. While these errors can be important in certain analyses, they are
54
+
> often small and generally impossible to detect after measurements are
55
+
> recorded. Larger measurement technique errors can result in values that are
56
+
> biologically implausible and can cause problems for many analyses.
57
+
58
+
GrowthViz uses data sets that were evaluated with growthcleanr. The tool expects
59
+
the output to be in a CSV format that is described later on in the notebook.
60
+
61
+
GrowthViz is a [Jupyter Notebook](https://jupyter.org/). It provides an
62
+
environment that includes graphical user interfaces as well as interactive
63
+
software development to explore data. To achieve this, GrowthViz references
64
+
different software languages and packages:
65
+
66
+
- [Python programming language](https://www.python.org/) is used to import,
67
+
transform, visualize and analyze the output of growthcleanr. Some of the code
68
+
for the tool is directly included in this notebook. Other functions have been
69
+
placed in an external file to minimize the amount of code that users see in
70
+
order to let them focus on the actual data.
71
+
72
+
- Data analysis is performed using [NumPy](https://numpy.org/) and
73
+
[Pandas](https://pandas.pydata.org/). The output of growthcleanr will be
GrowthViz provides functions for transforming DataFrames to support
77
+
calculation of some values, such as BMI, as well as supporting visualizations.
78
+
It is expected that users will create views into or copies of the DataFrames
79
+
built initially by this tool. Adding columns to the DataFrames created by this
80
+
tool is unlikely to cause problems. Removing columns is likely to break some
81
+
of the tool's functionality.
82
+
83
+
- Visualization in the tool is provided by [Matplotlib](https://matplotlib.org/)
84
+
and [Seaborn](http://seaborn.pydata.org/). Users may generate their own charts
85
+
with these utilities.
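
The BMI calculation mentioned above follows the standard definition: weight in kilograms divided by the square of height in meters. The sketch below is a minimal, standalone illustration of that formula. The function name `bmi` and its parameter names are assumptions made for this example; they are not GrowthViz's own helper functions.

```python
def bmi(weight_kg: float, height_cm: float) -> float:
    """Return body mass index (kg/m^2) from weight in kg and height in cm."""
    if weight_kg <= 0 or height_cm <= 0:
        raise ValueError("weight and height must be positive")
    height_m = height_cm / 100.0  # convert centimeters to meters
    return weight_kg / (height_m ** 2)


# Example: a 70 kg person who is 175 cm tall
print(round(bmi(70, 175), 2))  # 22.86
```

The same arithmetic can be applied column-wise to a copy of a DataFrame, which fits the guidance above that adding columns is unlikely to cause problems.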
29
86
30
-
> In pediatrics, evaluation of growth is fundamental, and many pediatric research studies include some aspect of growth as an outcome or other variable. The clinical growth measurements obtained in day-to-day care are susceptible to error beyond the imprecision inherent in any anthropometric measurement. Some errors result from minor problems with measurement technique. While these errors can be important in certain analyses, they are often small and generally impossible to detect after measurements are recorded. Larger measurement technique errors can result in values that are biologically implausible and can cause problems for many analyses.
87
+
## Simple Install
31
88
32
-
GrowthViz uses data sets that were produced by growthcleanr. The tool expects the output to be in a CSV format that is described later on in the notebook.
89
+
Anaconda is an all-in-one package installer for setting up dependencies needed
90
+
to run and view GrowthViz.
33
91
34
-
GrowthViz is a [Jupyter Notebook](https://jupyter.org/). It provides an environment that includes graphical user interfaces as well as interactive software development to explore data. To achieve this, GrowthViz references different software languages and packages:
35
-
- [Python programming language](https://www.python.org/) is used to import, transform, visualize and analyze the output of growthcleanr. Some of the code for the tool is directly included in this notebook. Other functions have been placed in an external file to minimize the amount of code that users see in order to let them focus on the actual data.
36
-
- Data analysis is performed using [NumPy](https://numpy.org/) and [Pandas](https://pandas.pydata.org/). The output of growthcleanr will be loaded into a [pandas DataFrame](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html). GrowthViz provides functions for transforming DataFrames to support calculation of some values, such as BMI, as well as supporting visualizations. It is expected that users will create views into or copies of the DataFrames built initially by this tool. Adding columns to the DataFrames created by this tool is unlikely to cause problems. Removing columns is likely to break some of the tool's functionality.
37
-
- Visualization in the tool is provided by [Matplotlib](https://matplotlib.org/) and [Seaborn](http://seaborn.pydata.org/). Users may generate their own charts with these utilities.
instructions](https://docs.anaconda.com/anaconda/install/windows/) are
99
+
step-by-step and will get everything set up properly for the project.
40
100
41
-
Anaconda is an all-in-one package installer for setting up dependencies needed to run and view GrowthViz.
101
+
2. Download the [GrowthViz project](https://github.com/mitre/GrowthViz) as a zip
102
+
file using the "Clone or download" button on GitHub.
42
103
43
-
1. Install Anaconda
44
-
- Follow install instructions [found here for installation.](https://docs.anaconda.com/anaconda/install/)
45
-
- Opt for the Python 3.7 version
46
-
- The [windows install instructions](https://docs.anaconda.com/anaconda/install/windows/) are step-by-step and will get everything set up properly for the project.
47
-
2. Download the [GrowthViz project](https://github.com/mitre/GrowthViz) as a zip file using the "Clone or download" button on GitHub.
48
-
3. Unzip the GrowthViz zip file to have access to all of the source files for the Jupyter notebook.
49
-
4. Run the Anaconda Navigator that was installed during Step 1 (go to Start>Anaconda Navigator). This may take a while to load.
50
-
5. Before Launching the Jupyter Notebook application (shown on the home page), download one additional dependency "Qgrid". To do this:
51
-
- Click 'Environments' on the left.
52
-
- Type 'Qgrid' in the `Search Packages` text box in the top center of the screen. If it shows up with a green checkbox, proceed to Step 6.
53
-
- If it does not appear:
54
-
- Change the 'Installed' drop down in the top center of the application to 'Not Installed' and type in 'Qgrid' in the search bar on the right.
55
-
- If Qgrid still does not show up click 'Update Index...' button next to the search bar. This may take several minutes. Once it is done search for Qgrid again.
56
-
- Check the box to the left of Qgrid in the list and click the green 'Apply' button in the lower right corner.
57
-
- Confirm the installation dialog. Installation may again take several minutes.
58
-
- Once installation is successful, click on the 'Home' in the upper left navigation panel and proceed to Step 6.
59
-
6. Click ‘Launch’ under the ‘Jupyter Notebook’ icon. This will open the Jupyter Notebook interface in your default browser.
60
-
7. Within the browser, navigate to the `GrowthViz-master` folder you downloaded and unzipped in Step 2 (likely found in your Downloads/ folder). Click on `GrowthViz.ipynb` to run the Python notebook.
61
-
8. **[Optional step for testing the notebook]** Once the notebook is open, click the 'Run' button to step through the various blocks (cells) of the document, OR click the 'Cell' dropdown in the menu bar and select 'Run all' to test the entire notebook all at once.
62
-
63
-
If not using Anaconda, specific versions of packages can be found in requirements.txt.
104
+
3. Unzip the GrowthViz zip file to have access to all of the source files for
105
+
the Jupyter notebook.
106
+
107
+
4. Run the Anaconda Navigator that was installed during Step 1 (go to
108
+
Start > Anaconda Navigator). This may take a while to load.
109
+
110
+
5. Before launching the Jupyter Notebook application (shown on the home page),
111
+
download one additional dependency "Qgrid". To do this:
112
+
113
+
- Click 'Environments' on the left.
114
+
115
+
- Type 'Qgrid' in the `Search Packages` text box in the top center of the
116
+
screen. If it shows up with a green checkbox, proceed to Step 6.
117
+
118
+
- If it does not appear:
119
+
120
+
- Change the 'Installed' drop down in the top center of the application to
121
+
'Not Installed' and type in 'Qgrid' in the search bar on the right.
122
+
123
+
- If Qgrid still does not show up, click the 'Update Index...' button next to the
124
+
search bar. This may take several minutes. Once it is done, search for
125
+
Qgrid again.
126
+
127
+
- Check the box to the left of Qgrid in the list and click the green 'Apply'
128
+
button in the lower right corner.
129
+
130
+
- Confirm the installation dialog. Installation may again take several
131
+
minutes.
132
+
133
+
- Once installation is successful, click 'Home' in the upper-left
134
+
navigation panel and proceed to Step 6.
135
+
136
+
6. Click ‘Launch’ under the ‘Jupyter Notebook’ icon. This will open the Jupyter
137
+
Notebook interface in your default browser.
138
+
139
+
7. Within the browser, navigate to the `GrowthViz-master` folder you downloaded
140
+
and unzipped in Step 2 (likely found in your Downloads/ folder). Click on
141
+
`GrowthViz.ipynb` to run the Python notebook.
142
+
143
+
8. **[Optional step for testing the notebook]** Once the notebook is open, click
144
+
the 'Run' button to step through the various blocks (cells) of the document,
145
+
OR click the 'Cell' dropdown in the menu bar and select 'Run all' to test the
146
+
entire notebook all at once.
147
+
148
+
If not using Anaconda, specific versions of packages can be found in `requirements.txt`.
64
149
65
150
## Sample data and first run testing
66
151
67
-
By default when you reach Step 6 of the [Simple Install](#simple-install) instructions above the notebook will use sample data loaded from the `.csv` files located in the GrowthViz-master project.
152
+
By default, when you reach Step 6 of the [Simple Install](#simple-install)
153
+
instructions above, the notebook will use sample data loaded from the `.csv`
154
+
files located in the GrowthViz-master project.
68
155
69
-
To ensure that all of the necessary example files are present, run the `check_setup.py` script.
156
+
To ensure that all of the necessary example files are present, run the
157
+
`check_setup.py` script.
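
The kind of check such a script performs can be sketched as follows. This is not the contents of `check_setup.py`. The helper name `find_missing_files` and the list of required files are assumptions for illustration, using file names mentioned elsewhere in this README.

```python
from pathlib import Path

# Hypothetical list of files to look for; the real check_setup.py may differ.
REQUIRED_FILES = [
    "GrowthViz-pediatrics.ipynb",
    "GrowthViz-adults.ipynb",
    "sample-data-cleaned.csv",
]


def find_missing_files(project_dir, required=REQUIRED_FILES):
    """Return the names in `required` that do not exist under `project_dir`."""
    root = Path(project_dir)
    return [name for name in required if not (root / name).is_file()]


if __name__ == "__main__":
    missing = find_missing_files(".")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All expected files are present.")
```

Returning the list of missing names, rather than stopping at the first one, lets a user fix every problem in a single pass.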
70
158
71
159
## Docker Install
72
160
73
-
Docker allows for the ability to download GrowthViz and its dependencies in an environment. To use this method, [download and install Docker Desktop](https://www.docker.com/products/docker-desktop)
161
+
Docker lets you download GrowthViz and its dependencies together in a single
162
+
environment. To use this method,
163
+
[download and install Docker Desktop](https://www.docker.com/products/docker-desktop)
74
164
75
165
1. Download GrowthViz-Docker with the following command:
76
-
- `docker run -it -p 8888:8888 -v [data-path]/growthviz-data:/usr/src/app/growthviz-data mitre/growthviz`
77
-
- Replace the `[data-path]` with a directory path you choose on your local computer. For instance, I chose `~/Documents`, which means that a folder named `/growthviz-data` will be created in my Documents folder. When I want to input my own data into GrowthViz, I can simply drop my CSV files into this `/growthviz-data` folder.
78
-
- Note also that when mapping a folder on Windows, you may be prompted to confirm that you indeed want to "Share" the folder. This is a standard Windows security practice, and it is okay to confirm and proceed.
79
-
2. View GrowthViz
80
-
- After running the above command, several lines of text will appear. Choose the third URL in this text and navigate to it in a web browser.
81
-
- The URL should be in the format: `http://X.X.X.X:8888/?token=XXX...`
82
-
- Within the browser, click on the file `GrowthViz.ipynb`. This will open a new window with the GrowthViz Jupyter Notebook.
83
-
3. Run GrowthViz
84
-
- You can choose to either click the `Run` button to step through the various blocks (cells) of the document, OR click the 'Cell' dropdown in the menu bar and select 'Run all' to test the entire notebook all at once. However, this will run with the default sample data. Step 4 will explain how to use your own data.
85
-
4. Input Your Own Dataset CSVs
86
-
- To input your own data, drop a file `[name-of-your-file.csv]` into the `/growthviz-data` folder you created in step 1.
87
-
- Then, navigate to Cells 7 and 28 and replace:
88
-
- `cleaned_obs = pd.read_csv("sample-data-cleaned.csv")` with
- Where [name-of-your-file.csv] is the input CSV file you placed in your
208
+
`/growthviz-data` folder.
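
The `[data-path]` substitution in the `docker run` command from step 1 can be sketched as below. The helper `build_docker_command` is hypothetical; it only assembles the arguments already shown in this README and adds no other Docker options.

```python
from pathlib import Path


def build_docker_command(data_path):
    """Assemble the README's `docker run` command for a chosen host directory."""
    # Host folder that will hold your CSV files, e.g. ~/Documents/growthviz-data
    host_dir = Path(data_path).expanduser() / "growthviz-data"
    return [
        "docker", "run", "-it",
        "-p", "8888:8888",  # publish container port 8888 on host port 8888
        "-v", f"{host_dir}:/usr/src/app/growthviz-data",  # map the data folder
        "mitre/growthviz",
    ]


if __name__ == "__main__":
    # Print the command for review; run it yourself in a terminal.
    print(" ".join(build_docker_command("~/Documents")))
```

Any CSV file placed in the host `growthviz-data` folder then appears inside the container at `/usr/src/app/growthviz-data`.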
91
209
92
210
#### Output boxes
93
-
When you run all cells (see Step 8 above) `Out[#]:` boxes will appear in the notebook below the `In[#]:` code cells. These outputs are the result of the functioning code blocks on the data. The out blocks will often be interactive charts and graphs used to explore the growthcleanr data. Descriptions of each `Out[#]:` block can be found in the text sections above the `In[#]:` blocks.
211
+
212
+
When you run all cells (see Step 8 above), `Out[#]:` boxes will appear in the
213
+
notebook below the `In[#]:` code cells. These outputs are the result of the
214
+
functioning code blocks on the data. The "Out" blocks will often be interactive
215
+
charts and graphs used to explore the growthcleanr data. Descriptions of each
216
+
`Out[#]:` block can be found in the text sections above the `In[#]:` blocks.
94
217
95
218
## Notice
219
+
96
220
Copyright 2020-2021 The MITRE Corporation.
97
221
98
222
Approved for Public Release; Distribution Unlimited. Case Number 19-2008