Complete ESM2 pretraining #112

Merged: 93 commits, Aug 29, 2024
Commits
692079a
mount data
farhadrgh Aug 7, 2024
532a6fa
scale quary before rope
farhadrgh Aug 7, 2024
71f6427
scale quary before rope
farhadrgh Aug 7, 2024
0f423be
ruff
farhadrgh Aug 8, 2024
eb2926e
doc str
farhadrgh Aug 8, 2024
34ab0f3
logits test
farhadrgh Aug 8, 2024
68493d5
add embedding and logits tests
farhadrgh Aug 9, 2024
b4693d7
move tests to test_model
farhadrgh Aug 9, 2024
a317e16
ESM2 bionemo1 options
farhadrgh Aug 12, 2024
74dab95
move doc str
farhadrgh Aug 12, 2024
54c5236
ruff
farhadrgh Aug 12, 2024
c4104b0
fix typo
farhadrgh Aug 12, 2024
5dcf303
move fixtures up
farhadrgh Aug 13, 2024
f2f4bfa
test loss gv
farhadrgh Aug 13, 2024
a2cc3af
add ESM2 pretrain script
farhadrgh Aug 13, 2024
54e05d3
add doc str
farhadrgh Aug 13, 2024
bd6e264
update pretrain test
farhadrgh Aug 13, 2024
c582abb
add device
farhadrgh Aug 14, 2024
96c6ba2
rm async_save
farhadrgh Aug 14, 2024
dcb9f64
remove comments
farhadrgh Aug 14, 2024
a8a7005
change atol
farhadrgh Aug 14, 2024
0022523
skip pytest
farhadrgh Aug 14, 2024
c0a7890
change atol
farhadrgh Aug 15, 2024
97b13a2
rename to avoid ci error
farhadrgh Aug 15, 2024
12422bd
fix import
farhadrgh Aug 15, 2024
20cef07
test with dummy data
farhadrgh Aug 15, 2024
4d811b0
skip pretrain test
farhadrgh Aug 15, 2024
dc64871
compute hf reference
farhadrgh Aug 15, 2024
41e2cb0
update hf reference
farhadrgh Aug 15, 2024
e97f40d
update
farhadrgh Aug 16, 2024
909b0fd
rm skip
farhadrgh Aug 16, 2024
990de8c
tokenizer serde
farhadrgh Aug 16, 2024
b77a4dc
fix typo
farhadrgh Aug 16, 2024
b9a0528
fix tests
farhadrgh Aug 16, 2024
0b2869e
doc str
farhadrgh Aug 16, 2024
5da634d
skip pretrain tests
farhadrgh Aug 16, 2024
b27e34b
update doc str
farhadrgh Aug 16, 2024
12c6839
doc str
farhadrgh Aug 8, 2024
ee61014
Add ESM2 Dataset and Datamodule
pstjohn Aug 2, 2024
908ce03
duplicate
farhadrgh Aug 14, 2024
ffafbe2
add accumulate_grad_batches
sichu2023 Aug 19, 2024
60ba685
increase num-dataset-workers for persistent_workers
sichu2023 Aug 19, 2024
8342dd8
add esm2 hparam to argparse
sichu2023 Aug 20, 2024
2935a95
fix typo
sichu2023 Aug 20, 2024
0c52a0e
fix experiment name
sichu2023 Aug 20, 2024
9d6f5a3
fix _random_crop
sichu2023 Aug 20, 2024
0461742
fix typo
sichu2023 Aug 20, 2024
bddf4e2
disable cyclic sampler
sichu2023 Aug 20, 2024
683f1ea
add train and val dataset shuffling
sichu2023 Aug 20, 2024
abfd521
uncomment pretrain tests
sichu2023 Aug 20, 2024
2bc8bdc
support fractional value of limit_val_batches
sichu2023 Aug 21, 2024
2b44822
revert change on .devcontainer/devcontainer.json
sichu2023 Aug 22, 2024
76f6c65
add fractional limit_val_batches testing
sichu2023 Aug 22, 2024
f1d83f4
fix ruff
sichu2023 Aug 22, 2024
3db6434
update MegatronMixedPrecision
sichu2023 Aug 22, 2024
2d7a36b
update and skip fractional limit_val_batches main test
sichu2023 Aug 22, 2024
0d77dff
update to skip io.track_io
sichu2023 Aug 23, 2024
4b11b6a
parameterize limit_val_batches main testwq
sichu2023 Aug 23, 2024
eb9e64f
wrap main with model parallel state in test
sichu2023 Aug 23, 2024
a5616ce
remove assertion
sichu2023 Aug 23, 2024
3a456d6
clean up callbacks
sichu2023 Aug 23, 2024
fc32a90
move pretrain cli cwd to tempdir
sichu2023 Aug 23, 2024
6da31eb
ruff
sichu2023 Aug 23, 2024
bdb9567
clean up ppl logging callback
sichu2023 Aug 23, 2024
da2beef
share gbs calculation to geneformer
sichu2023 Aug 23, 2024
752ae55
clean up geneformer test_pretrain.py
sichu2023 Aug 23, 2024
d0ab66e
limit main testing to argparse
sichu2023 Aug 26, 2024
be0f114
sync testing changes on geneformer
sichu2023 Aug 26, 2024
18dcb67
support limit_val_batches = None
sichu2023 Aug 26, 2024
2a2ce08
add unittest to ensure consistant samples per epoch
sichu2023 Aug 26, 2024
1867704
add unittest on limit_val_batches = None
sichu2023 Aug 26, 2024
8503b78
add unittest on float_or_int_or_none
sichu2023 Aug 26, 2024
09c42e7
add infer_global_batch_size unittest
sichu2023 Aug 26, 2024
6974d28
ruff
sichu2023 Aug 26, 2024
e675f94
ruff
sichu2023 Aug 26, 2024
468697a
ruff
sichu2023 Aug 26, 2024
1cdf113
fix dict order in tensor_dict_hash
sichu2023 Aug 26, 2024
4923056
remove eval_iters from num_val_samples
sichu2023 Aug 27, 2024
72cdab8
remove comment
sichu2023 Aug 27, 2024
394582d
expand test_limit_val_batches_is_none
sichu2023 Aug 27, 2024
b70b76b
fix unittest and add limit_val_samples = 0 edge case
sichu2023 Aug 27, 2024
bb6ef40
Fix unittest and add limit_val_batches = 0 in esm
sichu2023 Aug 27, 2024
2653033
rename PRNGDatasetShuffler
sichu2023 Aug 27, 2024
ab679a9
drop total_samples in create_valid_dataset
sichu2023 Aug 27, 2024
bfc4769
drop total_samples from create_train_dataset
sichu2023 Aug 28, 2024
cd37c0f
ruff and extract infer_num_val_samples
sichu2023 Aug 28, 2024
ea7240f
transfer infer_num_samples to geneformer
sichu2023 Aug 28, 2024
ceac497
ruff
sichu2023 Aug 28, 2024
aaf3405
use hosted tokenizer files
sichu2023 Aug 28, 2024
8669a7d
add back num_samples in esm dataset
sichu2023 Aug 28, 2024
740f2b6
ruff
sichu2023 Aug 28, 2024
8c0daad
fix valid shuffling
sichu2023 Aug 28, 2024
aaabf34
num_train_samples in dataset shuffling
sichu2023 Aug 29, 2024
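
Several commits above deal with `limit_val_batches` accepting a fractional value, an integer, or `None` (2bc8bdc, 18dcb67, 8503b78) and with inferring the global batch size (09c42e7). The sketch below shows one way such helpers might look. Only the names `float_or_int_or_none` and `infer_global_batch_size` come from the commit messages. The signatures, parameter names, and bodies are assumptions for illustration and are not taken from this PR. The batch-size formula is the usual Megatron-style convention (micro batch size times data-parallel size times gradient accumulation steps).

```python
import argparse
from typing import Optional, Union


def float_or_int_or_none(value: Union[str, int, float, None]) -> Union[int, float, None]:
    """Parse a CLI value into an int, a float, or None.

    Hypothetical body: the real helper in the PR may differ.
    """
    if value is None or value == "None":
        return None
    if isinstance(value, (int, float)):
        return value
    try:
        return int(value)  # "10" -> 10
    except ValueError:
        pass
    try:
        return float(value)  # "0.5" -> 0.5
    except ValueError:
        raise argparse.ArgumentTypeError(
            f"expected an int, a float, or None, got {value!r}"
        )


def infer_global_batch_size(
    micro_batch_size: int,
    num_nodes: int,
    devices: int,
    accumulate_grad_batches: int = 1,
    tensor_model_parallel_size: int = 1,
    pipeline_model_parallel_size: int = 1,
) -> int:
    """Global batch size under the common Megatron-style convention (assumed)."""
    world_size = num_nodes * devices
    model_parallel_size = tensor_model_parallel_size * pipeline_model_parallel_size
    if world_size % model_parallel_size != 0:
        raise ValueError(
            f"world size {world_size} is not divisible by "
            f"model parallel size {model_parallel_size}"
        )
    data_parallel_size = world_size // model_parallel_size
    return micro_batch_size * data_parallel_size * accumulate_grad_batches


# Usage: the flag name below is illustrative, not the script's actual CLI.
parser = argparse.ArgumentParser()
parser.add_argument("--limit-val-batches", type=float_or_int_or_none, default=2)
args = parser.parse_args(["--limit-val-batches", "0.5"])
```

Trying `int` before `float` keeps `"10"` as a batch count while `"0.5"` is read as a fraction of the validation set, which matches how Lightning's `limit_val_batches` distinguishes the two types.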

2 changes: 1 addition & 1 deletion .devcontainer/devcontainer.json
@@ -12,7 +12,7 @@
 "target": "dev"
 },
 "mounts": [
-// Mount the local ~/.aws config to pass along AWS credentials for PBSS
+// Mount the local ~/.aws config to pass along AWS credentials for PBSS.
 "source=${localEnv:HOME}/.aws,target=/home/bionemo/.aws,type=bind,consistency=cached",
 "source=${localEnv:HOME}/.ssh,target=/home/bionemo/.ssh,readonly,type=bind,consistency=cached"
 ],