Commit 9917b81

Author: Jake Smith
updated README
1 parent 870eab0 commit 9917b81

3 files changed, +25 -15 lines changed

.env

+2 -2
@@ -16,5 +16,5 @@ export ZEO_PATH="/usr/local/bin/zeo++-0.3/network"
 # MOF-related software paths
 export RASPA_PATH="/anaconda/envs/mofdiff/lib/python3.8/site-packages/RASPA2"
 export RASPA_SIM_PATH="/anaconda/envs/mofdiff/bin/simulate"
-export EGULP_PATH="/usr/local/bin/egulp/src/egulp"
-export EGULP_PARAMETER_PATH="/usr/local/bin/egulp/data"
+export EGULP_PATH="/usr/local/bin/egulp-master/src/egulp"
+export EGULP_PARAMETER_PATH="/usr/local/bin/egulp-master/data"
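
Because this change moves both eGULP paths, a quick check that the new values point at real files may be useful after sourcing `.env`. The following is a minimal sketch, not part of MOFDiff: `missing_egulp_paths` is a hypothetical helper, and it assumes the variables have already been exported into the environment. The `MEPO.param` file name is taken from the README description of `EGULP_PARAMETER_PATH`.

```python
import os
from pathlib import Path


def missing_egulp_paths(env=None):
    """Return the names of eGULP variables whose targets do not exist.

    Hypothetical helper for illustration; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    problems = []

    # EGULP_PATH should point at the compiled "egulp" binary.
    binary = env.get("EGULP_PATH")
    if not binary or not Path(binary).is_file():
        problems.append("EGULP_PATH")

    # EGULP_PARAMETER_PATH should be the directory holding "MEPO.param".
    param_dir = env.get("EGULP_PARAMETER_PATH")
    if not param_dir or not (Path(param_dir) / "MEPO.param").is_file():
        problems.append("EGULP_PARAMETER_PATH")

    return problems


# Example usage: an empty list means both paths resolved.
# print(missing_egulp_paths())
```

An empty result only shows that the files exist; it does not confirm that the binary was built correctly.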

.gitignore

+4 -1
@@ -157,4 +157,7 @@ cython_debug/
 # be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
 # and can be added to the global gitignore or merged into this file. For a more nuclear
 # option (not recommended) you can uncomment the following to ignore the entire idea folder.
-#.idea/
+#.idea
+
+# pretrained models folder
+pretrained/

README.md

+19 -12
@@ -23,6 +23,7 @@ If you find this code useful, please consider referencing our paper:
 - [Generating MOF structures](#generating-cg-mof-structures)
 - [Assemble all-atom MOFs](#assemble-all-atom-mofs)
 - [Relax MOFs](#relax-mofs-and-compute-structural-properties)
+- [GCMC simulations](#gcmc-simulation-for-gas-adsorption)
 
 ## Installation
 
@@ -45,25 +46,25 @@ We use [MOFid](https://github.com/snurr-group/mofid) for preprocessing and analy
 
 Configure the `.env` file to set correct paths to various directories, dependent on the desired functionality. An [example](./.env) `.env` file is provided in the repository.
 
-For model training, please set the learning-related paths.
+For [model training](#training), please set the learning-related paths.
 - PROJECT_ROOT: the parent MOFDiff directory
 - DATASET_DIR: the directory containing the .lmdb file produced by processing the data
 - LOG_DIR: the directory to which logs will by written
 - HYDRA_JOBS: the directory to which Hydra output will be written
 - WANDB_DIR: the directory to which WandB output will be written
 
-For MOF relaxation and structureal property calculations, please additionally set the Zeo++ path.
+For [MOF relaxation and structureal property calculations](#relax-mofs-and-compute-structural-properties), please additionally set the Zeo++ path.
 - ZEO_PATH: path to the Zeo++ "network" binary
 
-For GCMC simulations, please additionally set the GCMC-related paths.
+For [GCMC simulations](#gcmc-simulation-for-gas-adsorption), please additionally set the GCMC-related paths.
 - RASPA_PATH: the RASPA2 parent directory
 - RASPA_SIM_PATH: path to the RASPA2 "simulate" binary
 - EGULP_PATH: path to the eGULP "egulp" binary
 - EGULP_PARAMETER_PATH: the directory containing the eGULP "MEPO.param" file
 
 ## Process data
 
-You can download the preprocessed `BW-DB` data from [Zenodo](https://zenodo.org/uploads/10467288) (recommended).
+You can download the preprocessed `BW-DB` data from [Zenodo](https://zenodo.org/uploads/10467288) (recommended). To use the preprocessed data, please extract `bw_db.tar.gz` into `${oc.env:DATASET_DIR}`.
 
 Alternatively, you can download the `BW-DB` raw data from [Materials Cloud](https://archive.materialscloud.org/record/2018.0016/v3) to `${raw_path}` and preprocess with the following command. This step requires MOFid.
 
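
The extraction step added to the README in this hunk could be scripted roughly as follows. This is a sketch under stated assumptions: `extract_archive` is a hypothetical helper, the archive is assumed to be a gzip-compressed tarball in the working directory, and `${oc.env:DATASET_DIR}` is assumed to resolve to the `DATASET_DIR` environment variable set in `.env`. Whether the files land directly in the target or in a subfolder depends on the archive's internal layout, which is not shown here.

```python
import os
import tarfile
from pathlib import Path


def extract_archive(archive, dest):
    """Extract a .tar.gz archive into dest, creating dest if needed.

    Hypothetical helper for illustration; returns the destination path.
    """
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(dest)
    return dest


# Example usage (assumes DATASET_DIR is exported and the file is present):
# extract_archive("bw_db.tar.gz", os.environ["DATASET_DIR"])
```

The same helper would apply to `pretrained.tar.gz` and `bb_emb_space.tar.gz`, with `${oc.env:PROJECT_ROOT}/pretrained` as the destination.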
@@ -96,7 +97,7 @@ The default output directory is `${oc.env:HYDRA_JOBS}/bb/${expname}/`. `oc.env:H
 python mofdiff/scripts/train.py --config-name=bb expname=bwdb_bb_dim_64 model.latent_dim=64
 ```
 
-Logging is done with [wandb](https://wandb.ai/site) by default. You need to login to wandb with `wandb login` before training. The training logs will be saved to the wandb project `mofdiff`. You can also override the wandb project with command line arguments. You can also disable wandb logging by removing the `wandb` entry in the [config](./conf/logging/default.yaml) as demonstrated [here](./conf/logging/no_wandb_logging.yaml).
+Logging is done with [wandb](https://wandb.ai/site) by default. You need to login to wandb with `wandb login` before training. The training logs will be saved to the wandb project `mofdiff`. You can also override the wandb project with command line arguments or disable wandb logging by removing the `wandb` entry in the [config](./conf/logging/default.yaml) as demonstrated [here](./conf/logging/no_wandb_logging.yaml).
 
 ### training coarse-grained diffusion model for MOFs
 
@@ -110,15 +111,15 @@ For BW-DB, training the building block encoder takes roughly 3 days and training
 
 ## Generating CG MOF structures
 
-Pretrained models can be found [here](https://zenodo.org/record/10467288).
+Pretrained models can be found [here](https://zenodo.org/record/10467288). To use the pretrained models, please extract `pretrained.tar.gz` and `bb_emb_space.tar.gz` into `${oc.env:PROJECT_ROOT}/pretrained`.
 
-With a trained CG diffusion model `${diffusion_model_path}`, generate random CG MOF structures with the following command, where `${bb_cache_path}` is the path to the trained building encoder, as described [above](#training-the-building-block-encoder).
+With a trained CG diffusion model `${diffusion_model_path}`, generate random CG MOF structures with the following command, where `${bb_cache_path}` is the path to the trained building encoder `bb_emb_space.pt`, either sourced from the pretrained models or generated as described [above](#training-the-building-block-encoder).
 
 ```
 python mofdiff/scripts/sample.py --model_path ${diffusion_model_path} --bb_cache_path ${bb_cache_path}
 ```
 
-To optimize MOF structures for a property defined in BW-DB (e.g., CO2 adsorption working capacity) use the following command:
+To optimize MOF structures for a property defined in BW-DB (e.g., CO2 adsorption working capacity) use the following command, where `${data_path}` is the path to the processed data `data.lmdb`, either sourced from the pretrained models or generated as described [above](process-data).
 
 ```
 python mofdiff/scripts/optimize.py --model_path ${diffusion_model_path} --bb_cache_path ${bb_cache_path} --data_path ${data_path} --property "working_capacity_vacuum_swing [mmol/g]" --target_v 15.0
@@ -170,14 +171,14 @@ apt-get update
 apt-get install -yq libgsl0-dev pkg-config libxrender-dev
 ```
 
-Install [eGULP](https://github.com/danieleongari/egulp) following the instruction in the repository. The following commands install eGULP in `/usr/local/bin/egulp`:
+Install [eGULP](https://github.com/danieleongari/egulp) following the instruction in the repository. The following commands install eGULP in `/usr/local/bin/egulp-master`:
 
 ```
-mkdir /usr/local/bin/egulp && tar -xf egulp.tar -C /usr/local/bin/egulp
-cd /usr/local/bin/egulp/src && make && cd -
+unzip egulp-master.zip -d /usr/local/bin
+cd /usr/local/bin/egulp-master/src && make
 ```
 
-Finally, RASPA2 requires a set of forcefield parameters with which to run the simulations. To use our default simulation settings, copy the UFF parameter set from [ForceFields](https://github.com/lipelopesoliveira/ForceFields/tree/main) into the RASPA2 forcefield definition directory, typically located at `$RASPA_PATH/share/raspa/forcefield`.
+Finally, RASPA2 requires a set of forcefield parameters with which to run the simulations. To use our default simulation settings, copy the UFF parameter set from [ForceFields](https://github.com/lipelopesoliveira/ForceFields/tree/main) into the RASPA2 forcefield definition directory, typically located at `${oc.env:RASPA_PATH}/share/raspa/forcefield`.
 
 ### running simulations
 
@@ -198,6 +199,12 @@ python mofdiff/scripts/gcmc_screen.py --input ${sample_path}/mepo_qeq_charges
 
 The GCMC simulation results will be saved in `${sample_path}/gcmc/screening_results.json`.
 
+We have found that RASPA2 may occasionally have trouble reading input files as generated by python. If you encounter errors of the general form `Creating molecules for more systems than the maximum allowed` then please set the `rewrite_raspa_input` flag.
+
+```
+python mofdiff/scripts/gcmc_screen.py --input ${sample_path}/mepo_qeq_charges --rewrite_raspa_input
+```
+
 ## Acknowledgement
 
 This codebase is based on several existing repositories:
