Fixed Exxx lint issues #284
Conversation
Looks good to me, although personally I am not a big fan of the line-length being increased. Feel free to address my comments at your own discretion.
pyproject.toml
Outdated
 # Wide rows
-line-length = 100
+line-length = 120
What is the rationale here for increasing the line length?
Agreed, 100 seems better
I will reset the length back to 100 and update.
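For reference, the agreed-upon setting would look like this in pyproject.toml (a sketch; ruff reads `line-length` from its `[tool.ruff]` table):

```toml
[tool.ruff]
# Wide rows: value agreed after review
line-length = 100
```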
src/weathergen/utils/train_logger.py
Outdated
@@ -172,8 +172,8 @@ def read(run_id, model_path: str, epoch=-1):
             with open(fname_log_train, "rb") as f:
                 log_train = np.loadtxt(f, delimiter=",")
             log_train = log_train.reshape((log_train.shape[0] // len(cols_train), len(cols_train)))
-        except:
+        except Exception as e:
             _logger.warning(f"Warning: no training data loaded for run_id={run_id}")
Would be great if you actually only catch relevant exceptions (probably FileNotFoundError, maybe ValueError?)
I wanted to catch specific exceptions, but the problem with enlisting only specific ones is that, for instance, a less common PermissionError or some format errors would not be caught. So instead I wanted to catch the general exception; if needed, I can print the stack trace for debugging as well.
Can catch common exceptions if that is preferred and add another general exception catch for those missed by first catch?
I would advise against a general exception catch (since usually you want to know when something breaks). To handle the more uncommon exceptions you can handle them all at once like so:
try:
    dangerous_code()
except (Exception1, Exception2, ...) as e:
    handle_exceptions()
Yes, I understand, but what I meant and suggested as an alternative is shown below: not to discard the general exception catch altogether, since it is very possible to enumerate n specific exceptions and still miss something that would then go uncaught.
try:
    dangerous_code()
except (PermissionError, OSError, UnicodeDecodeError) as e:
    handle_known_issue(e)
except Exception as e:
    log_unexpected_error(e)
    raise
I have now caught format and file-based errors in two catch blocks that give specific context, plus one general exception catch block for anything else.
src/weathergen/utils/train_logger.py
Outdated
@@ -202,8 +202,8 @@ def read(run_id, model_path: str, epoch=-1):
             with open(fname_log_val, "rb") as f:
                 log_val = np.loadtxt(f, delimiter=",")
             log_val = log_val.reshape((log_val.shape[0] // len(cols_val), len(cols_val)))
-        except:
+        except Exception as e:
             print(f"Warning: no validation data loaded for run_id={run_id}")
see above
src/weathergen/utils/train_logger.py
Outdated
@@ -215,8 +215,8 @@ def read(run_id, model_path: str, epoch=-1):
             with open(fname_perf_val, "rb") as f:
                 log_perf = np.loadtxt(f, delimiter=",")
             log_perf = log_perf.reshape((log_perf.shape[0] // len(cols_perf), len(cols_perf)))
-        except:
+        except Exception as e:
             print(f"Warning: no performance data loaded for run_id={run_id}")
See above
@Sindhu-Vasireddy thanks for the changes. Regarding figuring out the exceptions: the paid version of copilot for vscode (10 eur / month) is very helpful at autocompleting the exceptions (or you can copy/paste the try/catch block in your favorite LLM and prompt it to give the most likely exceptions)
pyproject.toml
Outdated
@@ -96,7 +93,7 @@ ignore = [
     "SIM401",
     "F811",
     # To ignore, not relevant for us
-    "E741",
+    "SIM108"  # in case additional norm layer supports are added in future
good call, it can indeed make the code more readable
src/weathergen/utils/train_logger.py
Outdated
-        except:
-            print(f"Warning: no performance data loaded for run_id={run_id}")
+        except Exception as e:
+            print(f"Warning: no validation data loaded for run_id={run_id} due to exception: {e}")
can you also change the print statement to _logger.warn(...)? Another task is also to remove all the print statements
But let's remove the print() in another PR :)
Also, as per @tjhunter's comment, for these 3 try/except blocks only, I have set the print to logger.error for the two file-processing errors and the general exception, and kept the initial format errors as logger.warn.
If you would like me to remove these I can do so as well. Kindly let me know your preference.
Since a few changes are still required, I am marking this PR with changes requested.
Will make the changes and update the PR
Addressed the comments and resolved conflicts, could you kindly re-review
Looks good to me, my main concerns are addressed. However, I noticed some commented-out code fragments (probably introduced somewhere else). @clessig Do you know when they were introduced and whether this is outdated or relevant code? It should be addressed, but for the sake of getting this merged, maybe in a separate issue?
# [
#     torch.nn.Sequential(torch.nn.Linear(dim_embed, max(dim_embed//2,4*dim_out)),
#         torch.nn.GELU(),
#         torch.nn.Linear(max(dim_embed//2,4*dim_out), dim_out)) for _ in range(num_channels)
#     torch.nn.Sequential(
#         torch.nn.Linear(dim_embed, max(dim_embed//2,4*dim_out)),
#         torch.nn.GELU(),
#         torch.nn.Linear(max(dim_embed//2,4*dim_out), dim_out)
#     ) for _ in range(num_channels)
# ]
I know it's not your code originally, but commented-out code should be removed: it is unknown whether this code is still functional, and even if so, it is remembered by git, so there is no need to comment it out just to remember it. @clessig can you maybe comment on whether this is an artifact that should be removed?
I don't buy the second argument: "it's remembered by git" doesn't help if it's not remembered by any human, and humans need hints or triggers to remember things.
The concrete case here is Linear versus MLP for the embedding, with the MLP being commented out. This has never been properly ablated; that's why it's commented out for someone to pick up. An issue would probably be a better trigger.
The issue is that this commented-out code looks like random garbage in the middle of the rest of the code, and there is no limit to it: if someone starts to put in such lines, other people will feel entitled to put in full files. My suggestion would be not to keep it as a comment. An alternative is a conditional statement, ideally driven by a config flag eventually:
unembed_layer = "linear"
if unembed_layer == "linear":
    l = [torch.nn.Linear(dim_embed, dim_out) for _ in range(num_channels)]
else:
    l = [torch.nn.Sequential(...) for _ in range(num_channels)]
self.unembed = torch.nn.ModuleList(l)
#     block,
#     tokens_global,
#     tokens,
#     q_cells_lens,
See above
The code should be kept, since the workaround below is very difficult to read and this code is what explains what is actually happening. What is missing is a comment (and perhaps also a comment about the workaround).
# x_embed = torch.cat( [embed( s_c, c_c).flatten(0,1)
#     for s_c,c_c in zip( torch.split( s.source_tokens_cells, 49152),
#                         torch.split( s.source_centroids, 49152))])
# x_embed = torch.cat(
#     [
#         embed(s_c, c_c).flatten(0, 1)
#         for s_c, c_c in zip(
#             torch.split(s.source_tokens_cells, 49152),
#             torch.split(s.source_centroids, 49152),
#         )
#     ]
# )
see above
The comment explains why it's there. This needs to be parametrized, but it would be worthwhile to also investigate what triggers the issues and whether there's a better workaround.
#     pct_start=0.0)
# self.scheduler_cooldown = torch.optim.lr_scheduler.OneCycleLR(
#     optimizer,
#     max_lr=lr_final_decay,
#     total_steps=n_steps_cooldown,
#     pct_start=0.0,
# )
See above
The comment above explains it. An issue should be opened to investigate the problem; I would keep the code in the meantime.
src/weathergen/utils/train_logger.py
Outdated
-        except:
-            print(f"Warning: no validation data loaded for run_id={run_id}")
+        except (TypeError, AttributeError, IndexError, ZeroDivisionError, ValueError) as e:
+            _logger.warning((
+                f"Warning: no validation data loaded for run_id={run_id}",
+                "Data loading or reshaping failed — possible format, dimension, or logic issue.",
+                f"Due to specific error: {e}"
+            ))
+        except (FileNotFoundError, PermissionError, OSError) as e:
+            _logger.error((
+                f"Error: no validation data loaded for run_id={run_id}",
+                "File system error occurred while handling the log file.",
+                f"Due to specific error: {e}"
+            ))
+        except Exception:
+            _logger.error((
+                f"Error: no validation data loaded for run_id={run_id}",
+                f"Due to exception with trace:\n{traceback.format_exc()}"
+            ))
:)
src/weathergen/utils/train_logger.py
Outdated
-            _logger.warning(f"Warning: no training data loaded for run_id={run_id}")
+        except (TypeError, AttributeError, IndexError, ZeroDivisionError, ValueError) as e:
+            _logger.warning((
+                f"Warning: no training data loaded for run_id={run_id}",
+                "Data loading or reshaping failed — possible format, dimension, or logic issue.",
+                f"Due to specific error: {e}"
+            ))
+        except (FileNotFoundError, PermissionError, OSError) as e:
+            _logger.error((
+                f"Error: no training data loaded for run_id={run_id}",
+                "File system error occurred while handling the log file.",
+                f"Due to specific error: {e}"
+            ))
+        except Exception:
+            _logger.error((
+                f"Error: no training data loaded for run_id={run_id}",
+                f"Due to exception with trace:\n{traceback.format_exc()}"
+            ))
:)
src/weathergen/utils/train_logger.py
Outdated
             ))
         except (FileNotFoundError, PermissionError, OSError) as e:
             _logger.error((
                 f"Error: no validation data loaded for run_id={run_id}",
:)
Force-pushed from 07b9a0e to 99dadbf (Compare)
@tjhunter I have now rebased and linted the latest changes as well. Could you kindly re-review.
@@ -93,7 +90,7 @@ ignore = [
     "SIM102",
     "SIM401",
     # To ignore, not relevant for us
-    "E741",
+    "SIM108"  # in case additional norm layer supports are added in future
 ]

 [tool.uv]
Below required-version in the same file, can you add the following:
[tool.ruff.format]
# Use Unix `\n` line endings for all files
line-ending = "lf"
Force-pushed from d54290e to 53e47ff (Compare)
@Sindhu-Vasireddy thank you very much! I am merging.
* Rebased to the latest changes and linted new changes
* addressed review comments
* addressed review comments
* Linted the latest changes.
* corrected the formating
* corrected the formating
* configured ruff to use LF line endings in pyproject.toml
* Implement mock IO (#336) * Adapt score class score class (#339) * Implement mock IO * Adapt score class * Removing unused file (#349) * remove database folder (#355) * Small change - CI - pinning the version of formatting (#361) * changes * changes * Update INSTALL.md * Update INSTALL.md * Fixed Exxx lint issues (#284) * Rebased to the latest changes and linted new changes * addressed review comments * addressed review comments * Linted the latest changes. * corrected the formating * corrected the formating * configured ruff to use LF line endings in pyproject.toml * [357] Sub-package for evaluation (#359) * working * changes * removing deps from non-core project * changes * fixes * comments * Iluise quick fix stac (#374) * remove database folder * fix database * Simplifying workflow for plot_training (#368) * Simplifying workflow for plot_training * Ruffed * Working on implementing exclude_source * Remove unused code * Fixed ruff issue * Fixing bug in lat handling (377) (#378) * Fixing bug in lat handling * Added comment --------- Co-authored-by: Seb Hickman <[email protected]> * recover num_ranks from previous run to calculate epoch_base (#317) * recover num_ranks from previous run to calculate epoch_base * set email settings for commits * addressing Tim's comment * make ruff happy * improve style * changes (#385) Linter rule so np.ndarray is not used as type * changed the script name from evaluate to inference as it simply gener… (#376) * changed the script name from evaluate to inference as it simply generate infer samples * changed evaluate to inference in the main scripts and corresponding calls in the config * update the main function for the inference script * changed evaluate to inference also in docstring, unit test scripts, and integration test scripts --------- Co-authored-by: Patnala,Ankit <[email protected]> * Introduce tuples instead for strings to avoid TypeError (#392) * Exclude channels from src / target (#363) * Exclude channels from src / 
target * Simplified code and added comment that pattern matching is used * Adding new stream config * Fixing bug that led to error when accessing self.ds when dataset is empty * Wokign on exlcude_source * work in progress * Fixing incorrect formating for logger (#388) * Ruffed * Refactored and cleaned up channel selection. Also added check that channels are not empty * Cleaned channel parsing and selection * Adjustments * Removing asserts incompatible with empty dataset --------- Co-authored-by: Christian Lessig <[email protected]> * add embed_dropout_rate to config v1 (#358) * [402] adds checks to the pull request (#403) * chanegs * mistake * mistake * mistake * changes * doc * Introduce masking class and incorporate in TokenizerMasking (#383) * creating masking class and adapting tokenizer_masking to use this class * minor changes to masking.py and tokenizer_masking * removed old tokenizer_masking * include masking_strategy in default_config * change ValueError to assert * linting formatting changes files * further linting of docstrings * create mask_source and mask_target in Masker, and update tokenizer_masking to use these, then style improvements * linted masking, tokenizer_masking * modify masker, rng and perm_sel now part of class, remove extra masking_rate, update comments, remove archived class * remove check if all masked, not masked * remove self.masking_rate from MultiStreamDS class, and masking args from batchify_source * update tokenizer utils with description of idx_ord_lens in comment * remove masking args from batchify_, perm_sel removed now internal to Masker class, remove handling special cases of masking (all masked) * adding masking_strategy: to config * remove unused mentions of masking_combination * removed comment about streams * changed assert to check self perm_sel is not None * ruff masking, tokenizer_masking * Ruffed * Added warning to capture corner case, likely due to incorrect user settings. 
* Fixed incorrect call twice * Fixed missing conditional for logger statement * Required changes for better handling of rngs * Improved handling of rngs * Improved handling of rng --------- Co-authored-by: Christian Lessig <[email protected]> * Implement per-channel logging (#283) * Fix bug with seed being divided by 0 for worker ID=0 * Fix bug causing crash when secrets aren't in private config * Implement logging losses per channel * Fix issue with empty targets * Rework loss logging * ruff * Remove computing max_channels * Change variables names * ruffed * Remove redundant enumerations * Use stages for logging * Add type hints * Apply the review * ruff * fix * Fix type hints * ruff --------- Co-authored-by: Tim Hunter <[email protected]> * [346] Passing options through the slurm script (#400) * changes * fixes * refactor `validation_io.write_validation` to make it more readable * remove legacy code `validation_io.read_validation` * encapsulate artifact path logic in config module * remove redundant attribute `Trainer.path_run` * use config to look up base_path in `write_validation` * remove unused `write_validation` args: `base_path`, `rank` * ensure correct type for pathes * remove streams initialization from `Trainer` * remove path logic from `Trainer.save_model` * simplify conditional * rename mock io module * update uv to include dask * Implement io module to support reading/writing model output * implement new validation_io routine * use new write_validation routine * remove unused code * rename output routine to `write_output` * ruffed and added comments * fixed annotation * use simple __init__ method for `OutputItem` instead of dataclasses magic * address reviewers comments * rename method * add simple docstrings * ruffed * typehint fixes * refactor names * update comments and typehints, dont import pytorch * remove `__post_init__` methods, cache properties * fixes and integration test * final fixes :) * changes * changes * changes * changes * changes * 
more work * changes * changes * changes * ruffed * ruffed * improve logging and comments * Update to score-class according to internal discussions and feedback in PR. * Add license header. * Ruffed code. * Update to score-class according to internal discussions and feedback in PR. * Add license header. * Ruffed code. * Add doc-string to call-method and provide example usage for efficient graph-construction. * Some fixes to score-class. * Some fixes to handling aggregation dimension. * Add missing import of MockIO. * changes * changes * removing the scores * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes * changes --------- Co-authored-by: Kacper Nowak <[email protected]> Co-authored-by: Christian Lessig <[email protected]> Co-authored-by: iluise <[email protected]> Co-authored-by: Sindhu-Vasireddy <[email protected]> Co-authored-by: Seb Hickman <[email protected]> Co-authored-by: Julian Kuehnert <[email protected]> Co-authored-by: ankitpatnala <[email protected]> Co-authored-by: Patnala,Ankit <[email protected]> Co-authored-by: Savvas Melidonis <[email protected]> Co-authored-by: Christian Lessig <[email protected]> Co-authored-by: Till Hauer <[email protected]> Co-authored-by: Simon Grasse <[email protected]> Co-authored-by: Michael <[email protected]>
Description
Fixed the Exxx lint issues
Type of Change
Issue Number
#275
Fixed the Exxx issues (E741, E501, E721, E722) and added an ignore for SIM108
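For context, minimal illustrative snippets of the rule classes fixed here (not the actual project code): E741 flags ambiguous single-letter names, E721 flags type comparison with `==`, and E722 flags a bare `except:` (E501 is simply a line exceeding the configured length).

```python
# E741: single-letter names like `l` are easy to misread as `1`;
# a descriptive name avoids the ambiguity.
lengths = [1, 2, 3]          # instead of: l = [1, 2, 3]

# E721: check types with isinstance(), not `type(x) == float`.
x = 3.0
assert isinstance(x, float)  # instead of: type(x) == float

# E722: never use a bare `except:`; name the exception class.
try:
    value = int("not a number")
except ValueError:           # instead of: except:
    value = 0

print(lengths, value)  # → [1, 2, 3] 0
```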
Code Compatibility
Code Performance and Testing
I have run `uv run train` and (if necessary) `uv run evaluate` on at least one GPU node and it works
`$WEATHER_GENERATOR_PRIVATE` directory
Dependencies
Documentation
Additional Notes