-
Notifications
You must be signed in to change notification settings - Fork 742
🚀 Initial pipeline design #1950
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
🚀 Initial pipeline design #1950
Conversation
Signed-off-by: Ashwin Vaidya <[email protected]>
Signed-off-by: Ashwin Vaidya <[email protected]>
Signed-off-by: Ashwin Vaidya <[email protected]>
Signed-off-by: Ashwin Vaidya <[email protected]>
devices=devices, | ||
default_root_dir=temp_dir, | ||
) | ||
datamodule = get_datamodule(config) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we pass the entire config here, but just the model
part to get_model
? This seems a bit inconsistent.
n_jobs: 2 | ||
seed: | ||
grid: [42, 43] | ||
hardware: cuda |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could this field be called accelerator
to be more in line with Lightning terminology?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this file could be renamed to job.py
to better reflect the contents
"""Pool execution error should be raised when one or more jobs fail in the pool.""" | ||
|
||
|
||
class PoolExecutor(SerialExecutor): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a bit confusing that the PoolExecutor
inherits from SerialExecutor
. Maybe I'm missing something, but isn't the PoolExecutor
by nature the opposite of a SerialExecutor
? Maybe to make the code a bit easier to follow and navigate, we could rename the PoolExecutor to ParallelExecutor
, and use a common base class between this one and SerialExecutor
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For me, PoolExecutor
is a more generic name compared to ParallelExecutor
. We can find similar namings in the concurrent.futures
python module. In my opinion, following such an unspoken agreement will help people who are looking at the anomalib codebase for the first time.
return _value | ||
|
||
|
||
class _Parser(ConfigParser): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think BenchmarkParser
would be a more descriptive name
from anomalib.pipelines.jobs.base import ConfigParser | ||
|
||
|
||
class _GridSearchAction(ActionTypeHint): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Many of the methods in this class (e.g. namespace_from_dict
, dict_to_namespace
, flatten_dict
) are not specific to grid search. Maybe these could be moved to their own separate file in anomalib.pipelines.utils
so that other pipelines can use them as well.
args (Namespace): Arguments to run the pipeline. These are the args returned by ArgumentParser. | ||
""" | ||
if args is None: | ||
logger.warning("No arguments provided, parsing arguments from command line.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if this should be a warning if it's the default behaviour when running a pipeline from the CLI.
# extract all grid keys and return cross product of all grid values | ||
container = _GridSearchAction.flatten_dict(container) | ||
grid_dict = {key: value for key, value in container.items() if "grid" in key} | ||
container = {key: value for key, value in container.items() if key not in grid_dict} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
variable naming in this method could be a bit more clear. e.g. common_params
instead of container
, current_params
instead of _container
.
📝 Description
You can also play around with
Addresses: #1732
✨ Changes
Select what type of change your PR is:
✅ Checklist
Before you submit your pull request, please make sure you have completed the following steps:
For more information about code review checklists, see the Code Review Checklist.