fix_bug simplify.py #1113

Vibsteamer · 2023-01-13T15:36:56Z

expected behavior:
when "labeled": true is set in dpgen simplify, 02.fp soft-links the labeled data, and a soft-linked "task dir" is also created, for format consistency.

the two links are expected to be named data.000 and task.000.000000,
as guaranteed by data_system_fmt and fp_task_fmt, respectively.

bug:
a typo used data_system_fmt for the "task dir" instead of fp_task_fmt,
which gives task.000 instead of task.000.000000.

this makes _check_empty_iter (which checks glob.glob("task.000.*")) in generator/run.py judge this iter to be empty,
so 00.train of the next iter is always skipped.
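A minimal sketch of the naming mismatch described above. The PR text only names `data_system_fmt` and `fp_task_fmt`; their definitions below are assumptions made for illustration, as is the use of `fnmatchcase` to stand in for the glob check.

```python
from fnmatch import fnmatchcase

# Assumed definitions: the PR names these identifiers but does not quote them.
data_system_fmt = "%03d"
fp_task_fmt = data_system_fmt + ".%06d"

# Link names built the way the description implies.
data_name = "data." + data_system_fmt % 0          # system index only
buggy_task_name = "task." + data_system_fmt % 0    # typo: wrong format reused
fixed_task_name = "task." + fp_task_fmt % (0, 0)   # system index + task index

# The pattern the description says the empty-iteration check looks for.
pattern = "task.000.*"
buggy_is_seen = fnmatchcase(buggy_task_name, pattern)
fixed_is_seen = fnmatchcase(fixed_task_name, pattern)

print(data_name, buggy_task_name, fixed_task_name)
print(buggy_is_seen, fixed_is_seen)
```

Under these assumptions the buggy name lacks the trailing `.<task index>` part, so a `task.000.*` pattern cannot match it, while the fixed name does match.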

consequence:
the "simplify_labeled" process never starts correctly.

no iter0 model is produced and the randomly picked data in iter0 are never trained,
so iter1 fails with an error that it can't find the graph files from iter0 when trying to copy them, because training was skipped.

BTW
I think "simplify_labeled" is valuable in some complex or big-data scenarios, but it seems not to be loved by users yet.
pity : (

Signed-off-by: Wanrun Jiang [email protected]

when `"labeled": true` is set in `dpgen simplify`, 02.fp soft-links the labeled `data`, and a soft-linked `task` dir is also created, for format consistency.

they are expected to be `data.000` and `task.000.000000`, as guaranteed by `data_system_fmt` and `fp_task_fmt`, respectively.

a typo used `data_system_fmt` in both places and gave `data.000` and `task.000`, which makes `_check_empty_iter` (which checks `glob.glob("task.000.*")`) in `generator/run.py` judge this iter to be empty, so `00.train` of the next iter is always skipped.

as a result the "simplify_labeled" process never starts correctly: no iter0 model is produced and the randomly picked data in iter0 are never trained, so iter1 fails with an error that it can't find the graph files from iter0 when trying to copy them, because training was skipped.

I think "simplify_labeled" is valuable, particularly in some complex or big-data scenarios, but it seems not yet loved by users.

Signed-off-by: Wanrun Jiang <[email protected]>

dpgen/simplify/simplify.py

sorry, now it's right

njzjz · 2023-01-13T18:43:13Z

dpgen/dpgen/generator/run.py

Lines 118 to 121 in 355f8ed

    
    def _check_empty_iter(iter_index, max_v = 0) :
        fp_path = os.path.join(make_iter_name(iter_index), fp_name)
        # check the number of collected data
        sys_data = glob.glob(os.path.join(fp_path, "data.*"))

_check_empty_iter only checks data.*, so it should not be a problem.
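To make the difference between the two checks concrete, here is a small sketch that globs a throwaway directory. It is not dpgen code: the helper `count_matches` and the temporary directory are hypothetical, and only the patterns `data.*` and `task.000.*` come from the discussion.

```python
import glob
import os
import tempfile


def count_matches(fp_path, pattern):
    """Count entries under fp_path whose names match a glob pattern."""
    return len(glob.glob(os.path.join(fp_path, pattern)))


# Recreate the layout the buggy naming would produce: data.000 and task.000.
with tempfile.TemporaryDirectory() as fp_path:
    for name in ("data.000", "task.000"):
        os.mkdir(os.path.join(fp_path, name))

    data_hits = count_matches(fp_path, "data.*")      # pattern quoted from run.py
    task_hits = count_matches(fp_path, "task.000.*")  # pattern from the PR description

print(data_hits, task_hits)
```

In this layout a check based on `data.*` still finds the linked data, whereas a check based on `task.000.*` finds nothing, which is consistent with both comments depending on which version of the check is in use.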

Vibsteamer · 2023-01-13T18:47:27Z

OK, I will do the upgrade. Thanks.

njzjz reviewed Jan 13, 2023

dpgen/simplify/simplify.py

Update simplify.py

ea85a07

sorry, now it's right

njzjz approved these changes Jan 13, 2023

wanghan-iapcm merged commit b14063e into deepmodeling:devel Jan 14, 2023
