Update project to latest libraries and add devcontainer #9

Open: wants to merge 11 commits into base: main
37 changes: 37 additions & 0 deletions .devcontainer/devcontainer.json
@@ -0,0 +1,37 @@
{
  "name": "data-science",
  "image": "mcr.microsoft.com/devcontainers/universal:2-linux",
  "hostRequirements": {
    "memory": "32gb"
  },
  "forwardPorts": [
    6006,
    8888
  ],
  "customizations": {
    "codespaces": {
      "openFiles": [
        "README.md",
        "Titanic.ipynb"
      ]
    },
    "vscode": {
      "settings": {
        "editor.minimap.enabled" : false,
        "python.defaultInterpreterPath": "/opt/conda/envs/golden_scenario_env",
        "workbench.editorLargeFileConfirmation": 1024
      },
      "extensions": [
        "ms-python.python",
        "ms-toolsai.jupyter"
      ]
    }
  },
  "remoteEnv": {
    "ENABLE_ORYX_BUILD": "false"
  },
  // There is not a feature for LaTeX in VS Code yet, so we need to install it manually
  "onCreateCommand": "sudo apt-get update && sudo apt install -y texlive-xetex pandoc && conda env create -f environment.yml",
  "updateContentCommand": "conda init"
  // "postCreateCommand": "/opt/conda/envs/golden_scenario_env/bin/jupyter lab --notebook-dir=/workspaces/data-science --ip='0.0.0.0' --port=8888 --no-browser"
}
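Note on the file above: devcontainer.json is JSON with comments, so a strict JSON parser rejects its two `//` lines. The following is a minimal Python sketch for loading and sanity-checking such a file, assuming comments only ever occupy whole lines (as they do in this diff). `load_jsonc` is a hypothetical helper name, not part of any library, and the sample text is a shortened stand-in for the real file.

```python
import json


def load_jsonc(text: str) -> dict:
    """Parse JSON that may contain whole-line // comments.

    Assumption: a comment never shares a line with JSON content,
    which holds for the devcontainer.json in this diff.
    """
    kept = [
        line for line in text.splitlines()
        if not line.lstrip().startswith("//")
    ]
    return json.loads("\n".join(kept))


# Shortened stand-in for .devcontainer/devcontainer.json
sample = """
{
  "name": "data-science",
  "forwardPorts": [6006, 8888],
  // whole-line comment, dropped before parsing
  "updateContentCommand": "conda init"
  // trailing comment before the closing brace
}
"""

config = load_jsonc(sample)
print(config["name"], config["forwardPorts"])
```

The line-based filter is deliberately naive: it would not handle a `//` comment placed after a value on the same line, so a real JSONC parser is the safer choice if the file grows.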
6 changes: 5 additions & 1 deletion .gitignore
@@ -1,2 +1,6 @@
.DS_Store

.idea/
.ipynb_checkpoints/
log/
.venv/
venv/
4 changes: 2 additions & 2 deletions .vscode/extensions.json
@@ -1,6 +1,6 @@
 {
   "recommendations": [
-    "ms-toolsai.jupyter",
-    "ms-python.python"
+    "ms-python.python",
+    "ms-toolsai.jupyter"
   ]
 }
14 changes: 8 additions & 6 deletions README.md
@@ -8,15 +8,17 @@ This notebook uses machine learning algorithms to get the best accuracy of predi
- Set the channel priority to strict to avoid issues with the environment creation taking forever.
- `conda config --set channel_priority strict`
- Run the following commands (in either the terminal or an Anaconda Prompt):
- `conda env create -f golden_scenario_env.yml`
- `conda env create -f environment.yml`
- `conda activate golden_scenario_env`
- `conda install python=3.7`
- In VS Code, open the [Titanic.ipynb](Titanic.ipynb) file and connect to the golden_scenario_env kernel

You need to setup the environment as an `ipykernel` to use it from the Jupyter notebook. To do it run inside of the conda activated environment:

`python -m ipykernel install --user --name golden_scenario_env --display-name "Golden Scenario Env"`

Also if you want to support PDF export from jupyter you need to setup LaTeX:

`sudo apt-get install texlive-xetex texlive-fonts-recommended texlive-plain-generic`

## Dev Containers

You can also run the notebooks inside dev containers:

* [![Open in Visual Studio Code](https://img.shields.io/static/v1?label=&message=Open%20in%20Visual%20Studio%20Code&color=blue&logo=visualstudiocode&style=flat)](https://vscode.dev/redirect?url=vscode://ms-vscode-remote.remote-containers/cloneInVolume?url=https://github.com/claudiaregio/data-science)
* [![Open in Github Codespaces](https://img.shields.io/static/v1?label=&message=Open%20in%20Github%20Codespaces&color=2f362d&logo=github)](https://codespaces.new/claudiaregio/data-science?quickstart=1&hide_repo_select=true)
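
Note on the README steps above: after `conda activate golden_scenario_env` (or after selecting the kernel in VS Code), it can be useful to confirm from inside the notebook that the interpreter really belongs to that environment. The following is a minimal sketch in Python. `running_in_env` is a hypothetical helper, and the check assumes the environment directory is named after the environment, as in the `/opt/conda/envs/golden_scenario_env` path used in devcontainer.json.

```python
import os
import sys
from pathlib import Path

EXPECTED_ENV = "golden_scenario_env"  # name taken from environment.yml


def running_in_env(expected, prefix=sys.prefix, environ=os.environ):
    """Return True if the interpreter prefix or CONDA_DEFAULT_ENV names `expected`.

    Assumption: the env directory is named after the env. CONDA_DEFAULT_ENV
    is set by `conda activate`, but may be absent inside a Jupyter kernel,
    so the prefix check comes first.
    """
    return (
        Path(prefix).name == expected
        or environ.get("CONDA_DEFAULT_ENV") == expected
    )


print("interpreter:", sys.executable)
print("in expected env:", running_in_env(EXPECTED_ENV))
```

Run in a notebook cell, this prints the interpreter path and whether it matches the expected environment; a `False` result usually means a different kernel is selected.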
18,710 changes: 7,705 additions & 11,005 deletions Titanic.ipynb

Large diffs are not rendered by default.

18 changes: 18 additions & 0 deletions environment.yml
@@ -0,0 +1,18 @@
name: golden_scenario_env
channels:
- conda-forge
- defaults
dependencies:
- ipywidgets==8.1.0
- matplotlib==3.7.2
- notebook==7.0.3
- numpy==1.23.5
- pandas==2.0.3
- pydotplus==2.0.2
- scikit-learn==1.3.0
- tensorflow==2.12.1
- ydata-profiling==4.5.1
# pin typeguard to 2.13.3 to avoid a runtime issue with ydata-profiling
- typeguard==2.13.3
# auto register conda environments as ipython kernels in jupyter
- nb_conda_kernels==2.3.1
19 changes: 0 additions & 19 deletions golden_scenario_env.yml

This file was deleted.

10 changes: 10 additions & 0 deletions requirements.txt
@@ -0,0 +1,10 @@
ipywidgets==8.1.0
matplotlib==3.7.2
notebook==7.0.3
numpy==1.23.5
pandas==2.0.3
pydotplus==2.0.2
scikit-learn==1.3.0
# it is version 2.13.0 to be compatible with mac too
tensorflow==2.13.0
ydata-profiling==4.5.1
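
Note on the two dependency files above: environment.yml and requirements.txt pin overlapping packages, and the pins are not identical (tensorflow is 2.12.1 in one and 2.13.0 in the other, and typeguard and nb_conda_kernels appear only in environment.yml). The following is a minimal sketch for listing such differences, assuming every pin has the `name==version` form used in this diff. `parse_pins` and `pin_drift` are hypothetical helper names, and the sample strings are excerpts rather than the full files.

```python
def parse_pins(text: str) -> dict:
    """Collect `name==version` pins from environment.yml or requirements.txt text."""
    pins = {}
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("- "):  # YAML list item in environment.yml
            line = line[2:].strip()
        if line.startswith("#") or "==" not in line:
            continue  # skip comments, keys such as `channels:`, channel names
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def pin_drift(conda_pins: dict, pip_pins: dict) -> dict:
    """Map each package whose pins differ to (conda_version, pip_version).

    None marks a package that is missing from that file.
    """
    names = sorted(set(conda_pins) | set(pip_pins))
    return {
        name: (conda_pins.get(name), pip_pins.get(name))
        for name in names
        if conda_pins.get(name) != pip_pins.get(name)
    }


# Excerpts from the two files in this diff
env_text = """dependencies:
- pandas==2.0.3
- tensorflow==2.12.1
# pin typeguard to 2.13.3 to avoid a runtime issue with ydata-profiling
- typeguard==2.13.3
"""
req_text = """pandas==2.0.3
# it is version 2.13.0 to be compatible with mac too
tensorflow==2.13.0
"""

drift = pin_drift(parse_pins(env_text), parse_pins(req_text))
for name, (conda_version, pip_version) in drift.items():
    print(f"{name}: environment.yml={conda_version} requirements.txt={pip_version}")
```

A difference is not necessarily a mistake; the comment in requirements.txt states that the tensorflow pin there is intentional. The sketch only makes such differences visible when either file changes.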