-
Notifications
You must be signed in to change notification settings - Fork 3k
Add CallbackGroup & Metadata factory function #13437
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: PytLab <[email protected]>
Signed-off-by: PytLab <[email protected]>
@@ -47,6 +47,7 @@ | |||
from nemo.core.classes.common import Model | |||
from nemo.core.connectors.save_restore_connector import SaveRestoreConnector | |||
from nemo.core.optim import McoreDistributedOptimizer, prepare_lr_scheduler | |||
from nemo.lightning.pytorch.callbacks.callback_group import CallbackGroup, wrap_methods_with_callbacks |
Check notice
Code scanning / CodeQL
Unused import Note
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI about 4 hours ago
To fix the issue, the unused import of CallbackGroup
should be removed from the file. This will eliminate unnecessary clutter and improve code readability. The removal should be limited to the specific import statement flagged by CodeQL, ensuring no unintended changes to other parts of the code.
-
Copy modified line R51
@@ -50,3 +50,3 @@ | ||
from nemo.core.optim import McoreDistributedOptimizer, prepare_lr_scheduler | ||
from nemo.lightning.pytorch.callbacks.callback_group import CallbackGroup, wrap_methods_with_callbacks | ||
from nemo.lightning.pytorch.callbacks.callback_group import wrap_methods_with_callbacks | ||
from nemo.utils import logging, model_utils |
Signed-off-by: sajup-oss <[email protected]>
Signed-off-by: liquor233 <[email protected]>
Signed-off-by: sajup-oss <[email protected]>
…o into zshao/add_callback_group
Signed-off-by: sajup-oss <[email protected]>
Signed-off-by: liquor233 <[email protected]>
Signed-off-by: liquor233 <[email protected]>
Signed-off-by: sajup-oss <[email protected]>
Signed-off-by: sajup-oss <[email protected]>
trainer: The PyTorch Lightning trainer | ||
pl_module: The PyTorch Lightning module | ||
""" | ||
self.one_logger.on_validation_start(val_iterations=trainer.global_step) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it doesnt seem to be compatible with on_validation_start of OneLoggerUtils
""" | ||
self.one_logger.on_validation_start(val_iterations=trainer.global_step) | ||
|
||
def on_validation_end(self, trainer: pl.Trainer, pl_module: pl.LightningModule) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this fails if the batch specific hooks for validation are not provided
def on_validation_batch_start(self, trainer, pl_module, batch, batch_idx, dataloader_idx=0) -> None:
"""Called when validation batch begins.
Args:
trainer: The PyTorch Lightning trainer
pl_module: The PyTorch Lightning module
"""
self.one_logger.on_validation_batch_start()
def on_validation_batch_end(self, trainer, pl_module, outputs, batch, batch_idx, dataloader_idx=0) -> None:
"""Called when validation batch ends.
Args:
trainer: The PyTorch Lightning trainer
pl_module: The PyTorch Lightning module
"""
self.one_logger.on_validation_batch_end()
""" | ||
|
||
import functools | ||
import logging |
Check notice
Code scanning / CodeQL
Unused import Note
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI about 3 hours ago
To fix the issue, the unused logging
import on line 8 should be removed. This will eliminate the unnecessary dependency and make the code cleaner. No other changes are required, as the removal of this import does not affect the functionality of the code.
-
Copy modified line R8
@@ -7,3 +7,3 @@ | ||
import functools | ||
import logging | ||
|
||
from typing import Any, Dict, List, Optional, Type |
import pytorch_lightning as pl | ||
import torch | ||
from pytorch_lightning.callbacks import Callback | ||
from pytorch_lightning.plugins.io import AsyncCheckpointIO |
Check notice
Code scanning / CodeQL
Unused import Note
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI about 3 hours ago
To fix the problem, we will remove the unused import statement from pytorch_lightning.plugins.io import AsyncCheckpointIO
on line 17. This will clean up the code and eliminate the unnecessary dependency without affecting the functionality of the program.
-
Copy modified line R17
@@ -16,3 +16,3 @@ | ||
from pytorch_lightning.core import LightningModule | ||
from pytorch_lightning.plugins.io import AsyncCheckpointIO | ||
|
||
from pytorch_lightning.utilities import rank_zero_only |
Signed-off-by: sajup-oss <[email protected]>
import logging | ||
from typing import Any, Dict, List, Optional, Type | ||
|
||
import nv_one_logger.training_telemetry.api.callbacks as CB |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Eric: it is better to publish to public index, if we have difficulties, make sure this won't fail with import guard, and maybe ref to the open sourced one logger repo.
nemo/utils/exp_manager.py
Outdated
trainer: Optional trainer instance for checkpoint configuration | ||
""" | ||
try: | ||
from nv_one_logger.exporter.wandb_exporter import WandbExporter |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Saju: we need to change this import to be the config based exporter import. so we could change the exporter destination by simple onelogger package upgrade.
Signed-off-by: liquor233 <[email protected]>
from typing import Any, Dict, List, Optional, Type | ||
|
||
import nv_one_logger.training_telemetry.api.callbacks as CB | ||
import pytorch_lightning as pl |
Check notice
Code scanning / CodeQL
Unused import Note
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI about 3 hours ago
To fix the issue, the unused import statement import pytorch_lightning as pl
should be removed from the file. This will eliminate the unnecessary dependency and improve code clarity. No other changes are required, as the rest of the code already uses specific imports from PyTorch Lightning directly.
-
Copy modified line R12
@@ -11,3 +11,3 @@ | ||
import nv_one_logger.training_telemetry.api.callbacks as CB | ||
import pytorch_lightning as pl | ||
|
||
import torch |
|
||
import nv_one_logger.training_telemetry.api.callbacks as CB | ||
import pytorch_lightning as pl | ||
import torch |
Check notice
Code scanning / CodeQL
Unused import Note
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI about 3 hours ago
To fix the problem, the unused torch
import on line 13 should be removed. This will eliminate the unnecessary dependency and make the code cleaner. No other changes are required since the removal of this import does not affect the functionality of the code.
-
Copy modified line R13
@@ -12,3 +12,3 @@ | ||
import pytorch_lightning as pl | ||
import torch | ||
|
||
from pytorch_lightning import Trainer |
Important
The
Update branch
button must only be pressed in very rare occassions.An outdated branch is never blocking the merge of a PR.
Please reach out to the automation team before pressing that button.
What does this PR do ?
Add global callback group & metadata factory function for NeMo
Collection: [Note which collection this PR will affect]
Will keep updating
Changelog
CallbackGroup
andCallback
ABC in NeMo/nemo/lightning/pytorch/callbacks/callback_group.pyUsage
# Add a code snippet demonstrating how to use this
GitHub Actions CI
The Jenkins CI system has been replaced by GitHub Actions self-hosted runners.
The GitHub Actions CI will run automatically when the "Run CICD" label is added to the PR.
To re-run CI remove and add the label again.
To run CI on an untrusted fork, a NeMo user with write access must first click "Approve and run".
Before your PR is "Ready for review"
Pre checks:
PR Type:
If you haven't finished some of the above items you can still open "Draft" PR.
Who can review?
Anyone in the community is free to review the PR once the checks have passed.
Contributor guidelines contains specific people who can review PRs to various areas.
Additional Information