[bug] The newest version of dpgen (0.10.6) is not compatible with the newest version of deepmd-kit (2.1.5) #1004

Closed
gchenustc opened this issue Oct 19, 2022 · 2 comments
Labels
bug Something isn't working

Comments

gchenustc commented Oct 19, 2022

Bug summary

The default parameters in param.json are set incorrectly during training. For example, the option "set_prefix" is wrongly added directly under the "training" section, although it belongs under "training" -> "training_data". This raises the error:
dargs.dargs.ArgumentKeyError: [at location training] undefined key set_prefix is not allowed in strict mode
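
A minimal sketch of how the misplaced key could be detected before training starts. The helper name `find_misplaced_keys` and the `DATA_LEVEL_KEYS` set are hypothetical and only illustrate the check; they are not part of dpgen or DeePMD-kit, and the set contains only the key named in this report.

```python
import json
from typing import List

# Keys that, per this report, belong under "training" -> "training_data"
# rather than directly under "training". This set is an assumption made
# for illustration; it is not taken from the DeePMD-kit schema.
DATA_LEVEL_KEYS = {"set_prefix"}


def find_misplaced_keys(config: dict) -> List[str]:
    """Return data-level keys found directly under the "training" section."""
    training = config.get("training", {})
    return sorted(key for key in training if key in DATA_LEVEL_KEYS)


# A trimmed-down version of the generated input.json from this issue.
generated = json.loads("""
{
    "training": {
        "set_prefix": "set",
        "numb_steps": 2000,
        "training_data": {"systems": ["path"], "batch_size": [4]}
    }
}
""")

print(find_misplaced_keys(generated))
```

An empty list would mean that none of the listed keys sit at the "training" level.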

DeePMD-kit Version

2.1.5

TensorFlow Version

2.9.0

How did you download the software?

conda

Input Files, Running Commands, Error Log, etc.

params.json for dpgen's input

{
    "type_map": [
        "N"
    ],
    "mass_map": [
        14
    ],
    "init_data_prefix": "../init/",
    "init_data_sys": [
        "cgN.POSCAR.01x01x01/02.md/sys-0008/deepmd"
    ],
    "sys_configs_prefix": "../init/",
    "sys_configs": [
        [
            "cgN.POSCAR.01x01x01/01.scale_pert/sys-0008/scale-1.000/00000[1-5]/POSCAR"
        ],
        [
            "cgN.POSCAR.01x01x01/01.scale_pert/sys-0008/scale-1.000/00000[6-9]/POSCAR",
            "cgN.POSCAR.01x01x01/01.scale_pert/sys-0008/scale-1.000/00001*/POSCAR"
        ]
    ],

    "_comment": " ***** train ***** ",
    "numb_models": 4,
    "default_training_param": {
        "model": {
            "type_map": [
                "N"
            ],
            "descriptor": {
                "type": "se_a",
                "sel": [
                    8
                ],
                "rcut_smth": 0.5,
                "rcut": 5.0,
                "neuron": [
                    120,
                    120,
                    120
                ],
                "resnet_dt": true,
                "axis_neuron": 12,
                "seed": 1
            },
            "fitting_net": {
                "neuron": [
                    25,
                    50,
                    100
                ],
                "resnet_dt": false,
                "seed": 1
            }
        },
        "learning_rate": {
            "type": "exp",
            "start_lr": 0.001,
            "decay_steps": 100
        },
        "loss": {
            "start_pref_e": 0.02,
            "limit_pref_e": 2,
            "start_pref_f": 1000,
            "limit_pref_f": 1,
            "start_pref_v": 0.0,
            "limit_pref_v": 0.0
        },
        "training": {
            "training_data": {
                "_systems": ["system1_path", "system2_path", "..."],
                "set_prefix": "set",
                "batch_size": 4
            },
            "numb_steps": 2000,
            "disp_file": "lcurve.out",
            "disp_freq": 500,
            "_numb_test": 4,
            "save_freq": 1000,
            "_save_ckpt": "model.ckpt",
            "_disp_training": true,
            "_time_training": true,
            "_profiling": false,
            "_profiling_file": "timeline.json",
            "_comment": "that's all"
        }
    },
    "model_devi_dt": 0.002,
    "model_devi_skip": 0,
    "model_devi_f_trust_lo": 0.05,
    "model_devi_f_trust_hi": 0.15,
    "model_devi_e_trust_lo": 10000000000.0,
    "model_devi_e_trust_hi": 10000000000.0,
    "model_devi_clean_traj": true,
    "model_devi_jobs": [
        {
            "sys_idx": [
                0
            ],
            "temps": [
                100
            ],
            "press": [
                1.0
            ],
            "trj_freq": 10,
            "nsteps": 300,
            "ensemble": "nvt",
            "_idx": "00"
        },
        {
            "sys_idx": [
                1
            ],
            "temps": [
                100
            ],
            "press": [
                1.0
            ],
            "trj_freq": 10,
            "nsteps": 3000,
            "ensemble": "nvt",
            "_idx": "01"
        }
    ],
    "fp_style": "vasp",
    "shuffle_poscar": false,
    "fp_task_max": 40,
    "fp_task_min": 5,
    "fp_pp_path": "./",
    "fp_pp_files": [
        "POTCAR_N"
    ],
    "fp_incar": "./INCAR_scf"
}

input.json for training step (automatically generated by dpgen)

{
    "model": {
        "type_map": [
            "N"
        ],
        "descriptor": {
            "type": "se_a",
            "sel": [
                8
            ],
            "rcut_smth": 0.5,
            "rcut": 5.0,
            "neuron": [
                120,
                120,
                120
            ],
            "resnet_dt": true,
            "axis_neuron": 12,
            "seed": 4027317093
        },
        "fitting_net": {
            "neuron": [
                25,
                50,
                100
            ],
            "resnet_dt": false,
            "seed": 2450907653
        }
    },
    "learning_rate": {
        "type": "exp",
        "start_lr": 0.001,
        "decay_steps": 100
    },
    "loss": {
        "start_pref_e": 0.02,
        "limit_pref_e": 2,
        "start_pref_f": 1000,
        "limit_pref_f": 1,
        "start_pref_v": 0.0,
        "limit_pref_v": 0.0
    },
    "training": {
        "set_prefix": "set",    ########## error here! ############
        "numb_steps": 2000,
        "batch_size": 4,
        "disp_file": "lcurve.out",
        "disp_freq": 500,
        "numb_test": 4,
        "_save_freq": 1000,
        "save_ckpt": "model.ckpt",
        "disp_training": true,
        "time_training": true,
        "profiling": false,
        "profiling_file": "timeline.json",
        "_comment": "that's all",
        "training_data": {
            "systems": [
                "../data.init/cgN.POSCAR.01x01x01/02.md/sys-0008/deepmd"
            ],
            "batch_size": [
                4
            ]
        },
        "seed": 3242942795
    }
}

Steps to Reproduce

nothing

Further Information, Files, and Links

nothing

gchenustc added the bug label on Oct 19, 2022
gchenustc changed the title from "The newest version of dpgen (0.10.5) is not compatible with the newest version of deepmd-kit (2.1.5)" to "[bug] The newest version of dpgen (0.10.6) is not compatible with the newest version of deepmd-kit (2.1.5)" on Oct 19, 2022
njzjz transferred this issue from deepmodeling/deepmd-kit on Oct 19, 2022
@HuangJiameng
Collaborator

The strict check on run_param.json has been loosened; see #952. Still, thanks for your report. It will be fixed.

@gchenustc
Author

Sorry, the run had been interrupted earlier, so it restarted without re-reading params.json. After I cleaned the work directory and removed "set_prefix", it works now. Thanks.
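
For anyone hitting the same thing after an interrupted run, a rough sketch of how leftover generated files could be found before restarting. The function name `stale_inputs` is hypothetical, and the sketch simply assumes that generated training inputs are files named input.json somewhere below the work directory; it uses only the Python standard library and does not call dpgen.

```python
import json
import tempfile
from pathlib import Path
from typing import List


def stale_inputs(workdir: str, key: str = "set_prefix") -> List[Path]:
    """List input.json files below workdir that have `key` directly under "training"."""
    hits = []
    for path in sorted(Path(workdir).rglob("input.json")):
        try:
            config = json.loads(path.read_text())
        except (OSError, json.JSONDecodeError):
            continue  # unreadable or malformed file: skip it
        training = config.get("training", {}) if isinstance(config, dict) else {}
        if isinstance(training, dict) and key in training:
            hits.append(path)
    return hits


# Demo on a throwaway directory (the "train" folder name is made up).
with tempfile.TemporaryDirectory() as tmp:
    bad = Path(tmp) / "train" / "input.json"
    bad.parent.mkdir(parents=True)
    bad.write_text(json.dumps({"training": {"set_prefix": "set"}}))
    found = [p.name for p in stale_inputs(tmp)]

print(found)
```

Any file it reports was generated from an older parameter set and would need to be regenerated after cleaning the work directory.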


njzjz closed this as completed on Oct 20, 2022