DeepMD manual


Use DeePMD-kit

In this tutorial, "the model" refers to the deep neural network that represents the interatomic interactions (Deep Potential). The typical procedure for using DeePMD-kit is:

  1. Prepare data
  2. Train a model
  3. Freeze the model
  4. MD runs with the model (LAMMPS or ASE)

We take water as an example to practice using DeePMD-kit. The folders in /root/workshop/deepmd-kit are:

.
├── ase
│   ├── calc
│   └── ref
├── data
│   ├── deepmd
│   │   ├── set.000
│   │   └── set.001
│   ├── others
│   │   ├── Li
│   │   │   └── deepmd
│   │   │       ├── set.000
│   │   │       └── set.001
│   │   └── multi_systems
│   │       └── deepmd
│   │           ├── Li2
│   │           │   ├── set.000
│   │           │   └── set.001
│   │           └── Li40
│   │               ├── set.000
│   │               └── set.001
│   └── test
│       └── set.000
├── interface
│   ├── calc
│   └── ref
├── lmp
│   ├── calc
│   └── ref
└── train
    ├── calc
    └── ref

The ase folder contains the scripts for invoking Deep Potential from ASE, the data folder contains the raw data for training, testing and practice, the interfaces folder contains a simple example of a Python script, the lmp folder contains the input file for a LAMMPS MD run, and the train folder contains the input file for training the model. In each of these folders except data, the sub-folder ref holds the reference data and the sub-folder calc can be used for your own practice.

Prepare data

To train a model, one needs to extract the following information from the DFT calculation: the atom types, the simulation box, the atomic coordinates, the atomic forces, the system energy and the virial. A snapshot of a system that contains this information is called a frame. We use the following convention of units:

Property   Unit
Time       ps
Length     Å
Energy     eV
Force      eV/Å
Pressure   bar
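These units coincide with the `metal` unit style of LAMMPS that is used later in this tutorial. As a small illustration of how they relate to each other, the sketch below converts an energy density in eV/Å^3 into a pressure in bar. The helper name `ev_per_a3_to_bar` is hypothetical and not part of DeePMD-kit; the factor follows from 1 eV = 1.602176634e-19 J.

```python
# Relating the energy, length and pressure units of the table above.
EV_IN_JOULE = 1.602176634e-19      # exact by the SI definition of the eV
ANGSTROM_IN_METER = 1.0e-10
PASCAL_PER_BAR = 1.0e5

# 1 eV/Å^3 expressed in bar (about 1.6e6 bar)
EV_PER_A3_IN_BAR = EV_IN_JOULE / ANGSTROM_IN_METER**3 / PASCAL_PER_BAR


def ev_per_a3_to_bar(value):
    """Convert an energy density in eV/Å^3 to bar (hypothetical helper)."""
    return value * EV_PER_A3_IN_BAR


print(f"1 eV/A^3 = {EV_PER_A3_IN_BAR:.6e} bar")
```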

Once the DFT calculation has finished, you may run the following Python commands:

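from dpdata import LabeledSystem

# Read the labeled frames (energies, forces, ...) from a VASP OUTCAR file
ls = LabeledSystem('OUTCAR', fmt='outcar')
# Write the plain-text *.raw files and the NumPy set.* folders (300 frames per set)
ls.to_deepmd_raw('deepmd')
ls.to_deepmd_npy('deepmd', set_size=300)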

This converts the DFT data into the DeePMD-kit format. The resulting directory tree is similar to this:

deepmd/
├── box.raw
├── coord.raw
├── energy.raw
├── force.raw
├── set.000
│   ├── box.npy
│   ├── coord.npy
│   ├── energy.npy
│   └── force.npy
├── set.001
│   ├── box.npy
│   ├── coord.npy
│   ├── energy.npy
│   └── force.npy
├── type_map.raw
└── type.raw
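The set.000 and set.001 folders arise because the frames are written in consecutive chunks of at most set_size frames. The sketch below reproduces that bookkeeping in plain Python; the function name `split_into_sets` is hypothetical, and the 400-frame count is the size of the water data set used later in this tutorial.

```python
def split_into_sets(nframes, set_size):
    """Return how many frames land in set.000, set.001, ... (illustrative only)."""
    sizes = []
    start = 0
    while start < nframes:
        end = min(start + set_size, nframes)
        sizes.append(end - start)
        start = end
    return sizes


# 400 frames with set_size=300 give two sets
for index, size in enumerate(split_into_sets(400, 300)):
    print(f"set.{index:03d}: {size} frames")
```

With these numbers the first set holds 300 frames and the second holds the remaining 100.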

We skip the step above because the data set has already been prepared in the folder /root/workshop/deepmd-kit/data/deepmd

You may change directory to this folder:

cd /root/workshop/deepmd-kit/data/deepmd
ls

It contains the following files:

box.raw  coord.raw  energy.raw  force.raw  set.000  set.001  type_map.raw  type.raw
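The .raw files are plain text, so they can be inspected without any extra tools. The sketch below counts atoms and frames, assuming the usual layout in which type.raw lists one integer type per atom and every non-empty line of coord.raw holds one frame with 3 numbers per atom. The function `summarize_raw` is a hypothetical helper, and the tiny 3-atom system is synthetic so that the example runs anywhere.

```python
import os
import tempfile


def summarize_raw(folder):
    """Count atoms, frames and types from type.raw and coord.raw in `folder`."""
    with open(os.path.join(folder, "type.raw")) as fh:
        types = [int(tok) for tok in fh.read().split()]
    with open(os.path.join(folder, "coord.raw")) as fh:
        frames = [line.split() for line in fh if line.strip()]
    natoms = len(types)
    for i, frame in enumerate(frames):
        # Every frame must provide x, y, z for each atom
        if len(frame) != 3 * natoms:
            raise ValueError(
                f"frame {i}: expected {3 * natoms} numbers, got {len(frame)}"
            )
    return {"natoms": natoms, "nframes": len(frames), "ntypes": len(set(types))}


# Synthetic demonstration (3 atoms, 2 frames); pass a real data folder instead.
with tempfile.TemporaryDirectory() as demo:
    with open(os.path.join(demo, "type.raw"), "w") as fh:
        fh.write("0 1 1\n")
    with open(os.path.join(demo, "coord.raw"), "w") as fh:
        fh.write("0 0 0 1 0 0 0 1 0\n")
        fh.write("0 0 0 1 0 0 0 1 0.1\n")
    summary = summarize_raw(demo)
    print(summary)
```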

Alternatively, to practice the conversion yourself, change to the Li example folder:

$  cd /root/workshop/deepmd-kit/data/others/Li

then run the following command:

$ python script.py

The standard output will show:

Data Summary
Unlabeled System
-------------------
Frame Numbers     : 1
Atom Numbers      : 40
Element List      :
-------------------
Li
40
Data Summary
Labeled System
-------------------
Frame Numbers      : 10
Atom Numbers       : 40
Including Virials  : Yes
Element List       :
-------------------
Li
40

Running this script converts the VASP output file into the DeePMD-kit raw and npy data formats.

Train a model

Write the input script

Here we provide a small data set of 400 frames taken from an NVT ab initio MD trajectory of water, with 300 frames for training and 100 for validation. The training is controlled by an input file, which can be inspected by

$ cd /root/workshop/deepmd-kit/train/calc
$ cat water.json
{
    "_comment": " model parameters",
    "model": {
	"type_map":	["O", "H"],
	"descriptor" :{
	    "type":		"se_a",
	    "sel":		[46, 92],
	    "rcut_smth":	5.80,
	    "rcut":		6.00,
	    "neuron":		[25, 50, 100],
	    "resnet_dt":	false,
	    "axis_neuron":	16,
	    "seed":		1,
	    "_comment":		" that's all"
	},
	"fitting_net" : {
	    "neuron":		[240, 240, 240],
	    "resnet_dt":	true,
	    "seed":		1,
	    "_comment":		" that's all"
	},
	"_comment":	" that's all"
    },

    "learning_rate" :{
	"type":		"exp",
	"start_lr":	0.001,
	"decay_steps":	2000,
	"decay_rate":	0.95,
	"_comment":	"that's all"
    },

    "loss" :{
	"start_pref_e":	0.02,
	"limit_pref_e":	1,
	"start_pref_f":	1000,
	"limit_pref_f":	1,
	"start_pref_v":	0,
	"limit_pref_v":	0,
	"_comment":	" that's all"
    },

    "_comment": " traing controls",
    "training" : {
	"systems":	["../../data/deepmd"],
	"set_prefix":	"set",    
	"stop_batch":	400000,
	"batch_size":	1,

	"seed":		1,

	"_comment": " display and restart",
	"_comment": " frequencies counted in batch",
	"disp_file":	"lcurve.out",
	"disp_freq":	100,
	"numb_test":	10,
	"save_freq":	1000,
	"save_ckpt":	"model.ckpt",
	"load_ckpt":	"model.ckpt",
	"disp_training":true,
	"time_training":true,
	"profiling":	false,
	"profiling_file":"timeline.json",
	"_comment":	"that's all"
    },

    "_comment":		"that's all"
}

where water.json is the JSON-format parameter file that controls the training.
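To get a feeling for the learning_rate and loss blocks, the sketch below evaluates the exponentially decaying learning rate and the resulting energy and force prefactors at a few batches. It assumes a staircase decay (the rate changes once every decay_steps batches) and a prefactor that moves linearly from start_pref to limit_pref as the learning rate decays; both are our reading of the scheme rather than code taken from DeePMD-kit, and the function names are hypothetical.

```python
import json

# Excerpt of water.json: only the two blocks used below.
CONFIG_TEXT = """
{
    "learning_rate": {"type": "exp", "start_lr": 0.001,
                      "decay_steps": 2000, "decay_rate": 0.95},
    "loss": {"start_pref_e": 0.02, "limit_pref_e": 1,
             "start_pref_f": 1000, "limit_pref_f": 1}
}
"""


def learning_rate(cfg, batch):
    """Learning rate at a given batch (assumption: staircase decay)."""
    lr = cfg["learning_rate"]
    return lr["start_lr"] * lr["decay_rate"] ** (batch // lr["decay_steps"])


def prefactor(cfg, batch, name):
    """Loss prefactor for name 'e' (energy) or 'f' (force) at a given batch."""
    loss = cfg["loss"]
    start = loss[f"start_pref_{name}"]
    limit = loss[f"limit_pref_{name}"]
    ratio = learning_rate(cfg, batch) / cfg["learning_rate"]["start_lr"]
    return limit * (1.0 - ratio) + start * ratio


cfg = json.loads(CONFIG_TEXT)
for batch in (0, 2000, 100000, 400000):
    print(batch,
          f"lr={learning_rate(cfg, batch):.3e}",
          f"pref_e={prefactor(cfg, batch, 'e'):.4f}",
          f"pref_f={prefactor(cfg, batch, 'f'):.4f}")
```

Under these assumptions the force term dominates the loss at the start of the training, and the energy and force terms approach equal weight towards stop_batch.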

Training

The training can be invoked by

$  cd /root/workshop/deepmd-kit/train/calc
$  dp train water.json 1> runlog 2>err &

During the training, the error of the model is evaluated on the fly every disp_freq batches, using numb_test frames from the last set in the systems directory, and the results are written to lcurve.out. A typical lcurve.out looks like

# batch      l2_tst    l2_trn    l2_e_tst  l2_e_trn    l2_f_tst  l2_f_trn         lr
      0    3.25e+01  3.23e+01    1.03e+01  1.03e+01    8.08e-01  8.01e-01    1.0e-03
    100    2.59e+01  2.67e+01    1.71e+00  1.70e+00    8.13e-01  8.39e-01    1.0e-03
    200    2.54e+01  2.59e+01    2.25e-01  2.29e-01    8.03e-01  8.19e-01    1.0e-03
    300    2.44e+01  2.30e+01    1.55e-01  1.55e-01    7.72e-01  7.27e-01    1.0e-03
    400    2.21e+01  2.19e+01    3.00e-01  3.08e-01    6.98e-01  6.93e-01    1.0e-03
    500    2.05e+01  1.94e+01    1.71e-01  1.76e-01    6.48e-01  6.14e-01    1.0e-03
    600    1.46e+01  1.49e+01    1.42e-01  1.37e-01    4.61e-01  4.70e-01    1.0e-03
    700    1.22e+01  1.19e+01    1.31e-01  1.32e-01    3.85e-01  3.75e-01    1.0e-03
    800    1.35e+01  1.35e+01    3.74e-02  4.24e-02    4.28e-01  4.28e-01    1.0e-03
    900    9.58e+00  9.21e+00    1.09e-01  1.12e-01    3.03e-01  2.91e-01    1.0e-03
   1000    8.88e+00  8.60e+00    3.31e-02  3.46e-02    2.81e-01  2.72e-01    1.0e-03
   1100    8.47e+00  9.02e+00    6.44e-03  5.08e-03    2.68e-01  2.85e-01    1.0e-03
   1200    9.03e+00  8.85e+00    4.14e-02  4.10e-02    2.86e-01  2.80e-01    1.0e-03
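Because lcurve.out is a whitespace-separated table with a single commented header line, it is easy to post-process. The sketch below parses three of the rows shown above into dictionaries keyed by column name; `parse_lcurve` is a hypothetical helper, and for a real run the text would come from reading lcurve.out instead of the embedded sample.

```python
LCURVE_SAMPLE = """\
# batch      l2_tst    l2_trn    l2_e_tst  l2_e_trn    l2_f_tst  l2_f_trn         lr
      0    3.25e+01  3.23e+01    1.03e+01  1.03e+01    8.08e-01  8.01e-01    1.0e-03
   1000    8.88e+00  8.60e+00    3.31e-02  3.46e-02    2.81e-01  2.72e-01    1.0e-03
   1200    9.03e+00  8.85e+00    4.14e-02  4.10e-02    2.86e-01  2.80e-01    1.0e-03
"""


def parse_lcurve(text):
    """Turn the text of lcurve.out into a list of {column name: value} rows."""
    names, rows = None, []
    for line in text.splitlines():
        if not line.strip():
            continue
        if line.startswith("#"):
            # The header line carries the column names
            names = line.lstrip("#").split()
            continue
        if names is None:
            raise ValueError("data row found before the header line")
        values = [float(tok) for tok in line.split()]
        rows.append(dict(zip(names, values)))
    return rows


rows = parse_lcurve(LCURVE_SAMPLE)
first, last = rows[0], rows[-1]
print(f"force test error: {first['l2_f_tst']} -> {last['l2_f_tst']} "
      f"after {int(last['batch'])} batches")
```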

Freeze a model

After the training has finished, the trained neural network can be extracted from a checkpoint and dumped into a database file. This process is called “freezing” a model. To freeze a model, one typically runs

$ cd /root/workshop/deepmd-kit/train/calc
$ dp freeze -o graph.pb

in the folder where the model is trained. The output database is called graph.pb.

Test a model

The frozen model can be used in many ways. The most straightforward test is performed with dp test. Assuming that the test set has been prepared in the folder /root/workshop/deepmd-kit/data/test, run the following commands to test the performance of the model:

$ cd /root/workshop/deepmd-kit/train/calc
$ dp test -m graph.pb -s /root/workshop/deepmd-kit/data/test -d result

The standard output is:

# number of test data : 30 
Energy L2err        : 1.358342e-01 eV
Energy L2err/Natoms : 7.074696e-04 eV
Force  L2err        : 3.553606e-02 eV/A
Virial L2err        : 5.000555e+00 eV
Virial L2err/Natoms : 2.604456e-02 eV

At the same time, the output files result.e.out, result.f.out and result.v.out record the predicted energies, forces and virials.
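The per-atom lines of this output are simply the total errors divided by the number of atoms in a frame, so the system size can be recovered from the numbers printed above. The short check below uses only those printed values; the variable names are ours.

```python
# Values copied from the dp test output above
energy_l2 = 1.358342e-01            # "Energy L2err"
energy_l2_per_atom = 7.074696e-04   # "Energy L2err/Natoms"
virial_l2 = 5.000555e+00            # "Virial L2err"
virial_l2_per_atom = 2.604456e-02   # "Virial L2err/Natoms"

# Both ratios should give the same number of atoms per frame
natoms_from_energy = energy_l2 / energy_l2_per_atom
natoms_from_virial = virial_l2 / virial_l2_per_atom
print(round(natoms_from_energy), round(natoms_from_virial))
```

Both ratios round to 192 atoms, which would correspond to 64 water molecules per frame.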

Model inference

The Python interface of DeePMD-kit can be used for model inference. An example is given as follows

import deepmd.DeepPot as DP
from pprint import pprint
import numpy as np
dp = DP('graph.pb')
coord = np.array([[1,0,0], [0,0,1.5], [1,0,3]]).reshape([1, -1])
cell = np.diag(10 * np.ones(3)).reshape([1, -1])
atype = [1,0,1]
e, f, v = dp.eval(coord, cell, atype)
print('-'*20)
pprint(e)
print('-'*20)
pprint(f)
print('-'*20)
pprint(v)

where e, f and v are predicted energy, force and virial of the system, respectively.
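The reshape calls in the example flatten each frame into a single row. The plain-Python sketch below spells out that layout for the same three atoms, and maps the entries of atype to element names under the assumption that the model was trained with the type_map ["O", "H"] from water.json. It does not call DeePMD-kit, and the variable names are ours.

```python
type_map = ["O", "H"]                              # order assumed from water.json
positions = [[1, 0, 0], [0, 0, 1.5], [1, 0, 3]]    # one [x, y, z] per atom, in Å
atype = [1, 0, 1]                                  # index into type_map, per atom

# One frame of coordinates: [x1, y1, z1, x2, y2, z2, ...]
coord_frame = [float(c) for atom in positions for c in atom]

# A cubic 10 Å cell, flattened row by row into 9 numbers
cell_frame = [10.0 if i == j else 0.0 for i in range(3) for j in range(3)]

elements = [type_map[t] for t in atype]

# The three inputs must describe the same number of atoms
assert len(coord_frame) == 3 * len(atype)
print(elements, len(coord_frame), len(cell_frame))
```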

To run this example, just type:

$ cd /root/workshop/deepmd-kit/interfaces/calc
$ python -u run.py | tee run.log

Run MD with LAMMPS

Include deepmd in the pair style

Here we will use the previous water potential to run MD with LAMMPS. In the LAMMPS input file, one needs to specify the pair style as follows

# bulk water

units           metal
boundary        p p p
atom_style      atomic

neighbor        2.0 bin
neigh_modify    every 10 delay 0 check no

read_data	water.lmp
mass 		1 16
mass		2 2

pair_style	deepmd graph.pb
pair_coeff	

velocity        all create 330.0 23456789

fix             1 all nvt temp 330.0 330.0 0.5
timestep        0.0005
thermo_style    custom step pe ke etotal temp press vol
thermo          100
dump		1 all custom 100 water.dump id type x y z 

run             1000

where graph.pb is the file name of the frozen model. The pair_coeff line should be left blank. Note that LAMMPS counts atom types starting from 1; therefore, 1 is subtracted from every LAMMPS atom type before it is passed to the DeePMD-kit engine to compute the interactions.
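The type shift can be made explicit with a few lines of Python. The sketch below assumes that the frozen model uses the type_map ["O", "H"] from water.json and takes the masses from the input file above; `lammps_type_to_element` is a hypothetical helper, not a DeePMD-kit or LAMMPS function.

```python
type_map = ["O", "H"]                # order assumed from water.json
lammps_masses = {1: 16.0, 2: 2.0}    # from the "mass" commands above


def lammps_type_to_element(lammps_type, type_map):
    """Map a 1-based LAMMPS atom type to the element of the 0-based model type."""
    index = lammps_type - 1
    if not 0 <= index < len(type_map):
        raise ValueError(f"LAMMPS type {lammps_type} has no entry in type_map")
    return type_map[index]


for lammps_type, mass in sorted(lammps_masses.items()):
    element = lammps_type_to_element(lammps_type, type_map)
    print(f"LAMMPS type {lammps_type} (mass {mass}) -> "
          f"model type {lammps_type - 1} -> {element}")
```

The atom types in water.lmp therefore have to follow the same order as type_map, with oxygen as type 1 and hydrogen as type 2.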

To run this example, just type:

$ cd /root/workshop/deepmd-kit/lmp/calc
$ lmp -in in.lammps  | tee run.log

Use deep potential with ASE

A Deep Potential model can be set up as an ASE calculator to obtain potential energies and forces.

from ase import Atoms
from pprint import pprint
from deepmd.calculator import DP
water = Atoms('H2O',
              positions=[(0.7601, 1.9270, 1),
                         (1.9575, 1, 1),
                         (1., 1., 1.)],
              cell=[100, 100, 100],
              calculator=DP(model="graph.pb"))
pprint(water.get_potential_energy())
pprint(water.get_forces())

To run this example, just type:

$ cd /root/workshop/deepmd-kit/ase/calc
$ python run.py  | tee run.log
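The positions in this example describe a single water molecule in a large box. As a quick sanity check that needs neither ASE nor a trained model, the sketch below computes the two O-H bond lengths and the H-O-H angle from the same coordinates, using the atom order H, H, O implied by the formula 'H2O'. The helper names are ours.

```python
import math

symbols = ["H", "H", "O"]    # atom order implied by the formula 'H2O'
positions = [(0.7601, 1.9270, 1.0),
             (1.9575, 1.0, 1.0),
             (1.0, 1.0, 1.0)]


def sub(a, b):
    """Component-wise difference a - b of two 3-vectors."""
    return tuple(x - y for x, y in zip(a, b))


def norm(v):
    """Euclidean length of a 3-vector."""
    return math.sqrt(sum(x * x for x in v))


oxygen = positions[symbols.index("O")]
bonds = [sub(p, oxygen) for s, p in zip(symbols, positions) if s == "H"]
lengths = [norm(b) for b in bonds]

# Angle between the two O-H bond vectors
cos_angle = sum(x * y for x, y in zip(*bonds)) / (lengths[0] * lengths[1])
angle = math.degrees(math.acos(cos_angle))

print(f"O-H bond lengths: {lengths[0]:.4f} and {lengths[1]:.4f} Å")
print(f"H-O-H angle: {angle:.2f} degrees")
```

The result is close to the familiar gas-phase water geometry, with bonds of about 0.96 Å and an angle of about 104.5 degrees.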

Geometry optimization is also available:

from ase import Atoms
from ase.optimize import BFGS
from pprint import pprint
from deepmd.calculator import DP

water = Atoms('H2O',
              positions=[(0.7601, 1.9270, 1),
                         (1.9575, 1, 1),
                         (1., 1., 1.)],
              cell=[100, 100, 100],
              calculator=DP(model="graph.pb"))
dyn = BFGS(water)
dyn.run(fmax=1e-6)
print(water.get_positions())
print(water.get_potential_energy())
print(water.get_forces())

To run this example, just type:

$ cd /root/workshop/deepmd-kit/ase/calc
$ python opt.py  | tee opt.log
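The fmax argument sets the convergence threshold of the optimization. To our understanding, the ASE optimizers stop once the largest force on any single atom, taken as the length of its force vector in eV/Å, drops below fmax. The sketch below restates that criterion in plain Python; the function names are hypothetical and the forces are made-up numbers, not output of the model.

```python
import math


def max_force(forces):
    """Largest per-atom force norm in a list of (fx, fy, fz) tuples."""
    return max(math.sqrt(fx * fx + fy * fy + fz * fz) for fx, fy, fz in forces)


def is_converged(forces, fmax):
    """True if every atom feels a force smaller than fmax."""
    return max_force(forces) < fmax


# Made-up residual forces in eV/Å for a three-atom molecule
example_forces = [(3e-7, 0.0, -4e-7),
                  (0.0, 1e-7, 0.0),
                  (-3e-7, -1e-7, 4e-7)]

print(max_force(example_forces), is_converged(example_forces, fmax=1e-6))
```

A threshold of 1e-6 eV/Å is very tight; whether the optimization reaches it depends on how smooth the trained potential is.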

Author: haidi-ustc
Copyright notice: Unless otherwise stated, all articles on this blog are licensed under CC BY 4.0. Please credit haidi-ustc when reposting.