@@ -45,25 +46,25 @@ We use [MOFid](https://github.com/snurr-group/mofid) for preprocessing and analy
Configure the `.env` file to set correct paths to various directories, depending on the desired functionality. An [example](./.env) `.env` file is provided in the repository.

-
For model training, please set the learning-related paths.
+
For [model training](#training), please set the learning-related paths.
- PROJECT_ROOT: the parent MOFDiff directory
- DATASET_DIR: the directory containing the .lmdb file produced by processing the data
- LOG_DIR: the directory to which logs will be written
- HYDRA_JOBS: the directory to which Hydra output will be written
- WANDB_DIR: the directory to which WandB output will be written

-
For MOF relaxation and structural property calculations, please additionally set the Zeo++ path.
+
For [MOF relaxation and structural property calculations](#relax-mofs-and-compute-structural-properties), please additionally set the Zeo++ path.
- ZEO_PATH: path to the Zeo++ "network" binary

-
For GCMC simulations, please additionally set the GCMC-related paths.
+
For [GCMC simulations](#gcmc-simulation-for-gas-adsorption), please additionally set the GCMC-related paths.
- RASPA_PATH: the RASPA2 parent directory
- RASPA_SIM_PATH: path to the RASPA2 "simulate" binary
- EGULP_PATH: path to the eGULP "egulp" binary
- EGULP_PARAMETER_PATH: the directory containing the eGULP "MEPO.param" file

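Before launching a job, it can help to confirm that the variables listed above are actually set. The following is a minimal sketch and not part of MOFDiff: `check_env_vars` is a hypothetical helper name, and it assumes the variables are already defined in the current shell (for example by sourcing `.env`, if that file uses shell-compatible `KEY=value` lines).

```shell
# Hypothetical helper (not part of MOFDiff): report which of the named
# variables are unset or empty. Returns 0 if all are set, 1 otherwise.
check_env_vars() {
  missing=0
  for name in "$@"; do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `check_env_vars PROJECT_ROOT DATASET_DIR LOG_DIR HYDRA_JOBS WANDB_DIR` checks the training-related paths; append `ZEO_PATH` or the GCMC-related names when those features are needed.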
## Process data

-
You can download the preprocessed `BW-DB` data from [Zenodo](https://zenodo.org/uploads/10467288) (recommended).
+
You can download the preprocessed `BW-DB` data from [Zenodo](https://zenodo.org/uploads/10467288) (recommended). To use the preprocessed data, please extract `bw_db.tar.gz` into `${oc.env:DATASET_DIR}`.

Alternatively, you can download the `BW-DB` raw data from [Materials Cloud](https://archive.materialscloud.org/record/2018.0016/v3) to `${raw_path}` and preprocess with the following command. This step requires MOFid.
@@ -96,7 +97,7 @@ The default output directory is `${oc.env:HYDRA_JOBS}/bb/${expname}/`. `oc.env:H
-
Logging is done with [wandb](https://wandb.ai/site) by default. You need to log in to wandb with `wandb login` before training. The training logs will be saved to the wandb project `mofdiff`. You can also override the wandb project with command line arguments. You can also disable wandb logging by removing the `wandb` entry in the [config](./conf/logging/default.yaml) as demonstrated [here](./conf/logging/no_wandb_logging.yaml).
+
Logging is done with [wandb](https://wandb.ai/site) by default. You need to log in to wandb with `wandb login` before training. The training logs will be saved to the wandb project `mofdiff`. You can also override the wandb project with command line arguments or disable wandb logging by removing the `wandb` entry in the [config](./conf/logging/default.yaml) as demonstrated [here](./conf/logging/no_wandb_logging.yaml).

### training coarse-grained diffusion model for MOFs
@@ -110,15 +111,15 @@ For BW-DB, training the building block encoder takes roughly 3 days and training

## Generating CG MOF structures

-
Pretrained models can be found [here](https://zenodo.org/record/10467288).
+
Pretrained models can be found [here](https://zenodo.org/record/10467288). To use the pretrained models, please extract `pretrained.tar.gz` and `bb_emb_space.tar.gz` into `${oc.env:PROJECT_ROOT}/pretrained`.
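
The extraction step can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: `extract_into` and `find_file` are hypothetical helper names, and `${oc.env:PROJECT_ROOT}` is taken to resolve to the `PROJECT_ROOT` environment variable (the `oc.env` resolver comes from OmegaConf). The internal layout of the archives is not described here, so the sketch searches for `bb_emb_space.pt` instead of assuming where it lands.

```shell
# Hypothetical helper: extract each given .tar.gz archive into a target directory.
# Usage: extract_into DEST ARCHIVE [ARCHIVE...]
extract_into() {
  dest=$1
  shift
  mkdir -p "$dest" || return 1
  for archive in "$@"; do
    tar -xzf "$archive" -C "$dest" || return 1
  done
}

# Hypothetical helper: print the path of the first regular file with the
# given name found under a directory (prints nothing if there is none).
find_file() {
  find "$1" -type f -name "$2" | head -n 1
}
```

With both archives downloaded to the current directory, `extract_into "$PROJECT_ROOT/pretrained" pretrained.tar.gz bb_emb_space.tar.gz` unpacks them, and `find_file "$PROJECT_ROOT/pretrained" bb_emb_space.pt` prints a candidate value for `${bb_cache_path}`.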

-
With a trained CG diffusion model `${diffusion_model_path}`, generate random CG MOF structures with the following command, where `${bb_cache_path}` is the path to the trained building block encoder, as described [above](#training-the-building-block-encoder).
+
With a trained CG diffusion model `${diffusion_model_path}`, generate random CG MOF structures with the following command, where `${bb_cache_path}` is the path to the trained building block encoder `bb_emb_space.pt`, either sourced from the pretrained models or generated as described [above](#training-the-building-block-encoder).
-
To optimize MOF structures for a property defined in BW-DB (e.g., CO2 adsorption working capacity), use the following command:
+
To optimize MOF structures for a property defined in BW-DB (e.g., CO2 adsorption working capacity), use the following command, where `${data_path}` is the path to the processed data `data.lmdb`, either sourced from the pretrained models or generated as described [above](#process-data).
-
Install [eGULP](https://github.com/danieleongari/egulp) following the instructions in the repository. The following commands install eGULP in `/usr/local/bin/egulp`:
+
Install [eGULP](https://github.com/danieleongari/egulp) following the instructions in the repository. The following commands install eGULP in `/usr/local/bin/egulp-master`:

```
-
mkdir /usr/local/bin/egulp && tar -xf egulp.tar -C /usr/local/bin/egulp
-
cd /usr/local/bin/egulp/src && make && cd -
+
unzip egulp-master.zip -d /usr/local/bin
+
cd /usr/local/bin/egulp-master/src && make
```

-
Finally, RASPA2 requires a set of forcefield parameters with which to run the simulations. To use our default simulation settings, copy the UFF parameter set from [ForceFields](https://github.com/lipelopesoliveira/ForceFields/tree/main) into the RASPA2 forcefield definition directory, typically located at `$RASPA_PATH/share/raspa/forcefield`.
+
Finally, RASPA2 requires a set of forcefield parameters with which to run the simulations. To use our default simulation settings, copy the UFF parameter set from [ForceFields](https://github.com/lipelopesoliveira/ForceFields/tree/main) into the RASPA2 forcefield definition directory, typically located at `${oc.env:RASPA_PATH}/share/raspa/forcefield`.
The GCMC simulation results will be saved in `${sample_path}/gcmc/screening_results.json`.

+
We have found that RASPA2 may occasionally have trouble reading input files generated by Python. If you encounter errors of the general form `Creating molecules for more systems than the maximum allowed`, please set the `rewrite_raspa_input` flag.
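
To check whether a finished run hit this error, one option is to search the captured simulation output for the message. The sketch below assumes the RASPA2 output or logs were written to files under some directory; where those files live is not specified here, and `find_raspa_read_errors` is a hypothetical helper name.

```shell
# Hypothetical helper: list files under a directory that contain the
# RASPA2 error message quoted above. Exits non-zero if no file matches.
find_raspa_read_errors() {
  grep -rlF "Creating molecules for more systems than the maximum allowed" "$1"
}
```

If the helper prints any file names, rerunning the affected structures with the `rewrite_raspa_input` flag set is the remedy suggested above.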