Use DeePMD-kit
In this tutorial, we refer to the deep neural network that represents the interatomic interactions (Deep Potential) as the model. The typical procedure for using DeePMD-kit is:
- Prepare data
- Train a model
- Freeze the model
- MD runs with the model (LAMMPS or ASE)
In this tutorial, we take water as an example for practicing with DeePMD-kit. The folders in
/root/workshop/deepmd-kit
are:
.
├── ase
│   ├── calc
│   └── ref
├── data
│   ├── deepmd
│   │   ├── set.000
│   │   └── set.001
│   ├── others
│   │   ├── Li
│   │   │   └── deepmd
│   │   │       ├── set.000
│   │   │       └── set.001
│   │   └── multi_systems
│   │       └── deepmd
│   │           ├── Li2
│   │           │   ├── set.000
│   │           │   └── set.001
│   │           └── Li40
│   │               ├── set.000
│   │               └── set.001
│   └── test
│       └── set.000
├── interface
│   ├── calc
│   └── ref
├── lmp
│   ├── calc
│   └── ref
└── train
    ├── calc
    └── ref
The ase folder contains the script for invoking Deep Potential from ASE, the data folder contains the raw data for training, testing and practice, the interfaces folder contains a simple example Python script, the lmp folder contains the input file for a LAMMPS MD run, and the train folder contains the input file for training the model. In every folder, the sub-folder ref holds the reference data, and the sub-folder calc can be used for your own practice.
Prepare data
To train a model, one needs to extract the following information from DFT calculations: the atom types, the simulation box, the atomic coordinates, the atomic forces, the system energy and the virial. A snapshot of a system that contains this information is called a frame. We use the following convention of units:
Properties | Unit |
---|---|
Time | ps |
Length | Å |
Energy | eV |
Force | eV/Å |
Pressure | Bar |
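DFT codes do not always report results in the units listed above. As a minimal sketch, assuming the raw values are in atomic units (Hartree and Bohr), the conversion to the eV and Å convention could look like this. The function names are illustrative only and are not part of DeePMD-kit or dpdata.

```python
# Conversion factors from atomic units (CODATA 2018 values).
HARTREE_TO_EV = 27.211386245988
BOHR_TO_ANGSTROM = 0.529177210903


def energy_to_ev(energy_hartree):
    """Convert an energy from Hartree to eV."""
    return energy_hartree * HARTREE_TO_EV


def length_to_angstrom(length_bohr):
    """Convert a length from Bohr to Angstrom."""
    return length_bohr * BOHR_TO_ANGSTROM


def force_to_ev_per_angstrom(force_hartree_per_bohr):
    """Convert a force from Hartree/Bohr to eV/Angstrom."""
    return force_hartree_per_bohr * HARTREE_TO_EV / BOHR_TO_ANGSTROM


print(energy_to_ev(1.0))
print(length_to_angstrom(1.0))
print(force_to_ev_per_angstrom(1.0))
```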
Once you have finished running the DFT calculation, you may run the following Python code:
from dpdata import LabeledSystem
ls=LabeledSystem('OUTCAR',fmt='outcar')
ls.to_deepmd_raw('deepmd')
ls.to_deepmd_npy('deepmd',set_size=300)
This converts the DFT data into the DeePMD-kit format. The resulting directory tree is similar to this:
deepmd/
├── box.raw
├── coord.raw
├── energy.raw
├── force.raw
├── set.000
│ ├── box.npy
│ ├── coord.npy
│ ├── energy.npy
│ └── force.npy
├── set.001
│ ├── box.npy
│ ├── coord.npy
│ ├── energy.npy
│ └── force.npy
├── type_map.raw
└── type.raw
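To sanity-check a converted data set, the arrays in one set.* directory can be loaded with NumPy and their shapes compared. The sketch below is not part of DeePMD-kit; it assumes the usual layout in which each row is one frame, coord.npy and force.npy hold 3 values per atom, and box.npy holds the 9 components of the cell. The function name summarize_set is hypothetical, and the demonstration runs on synthetic data written to a temporary directory.

```python
import tempfile
from pathlib import Path

import numpy as np


def summarize_set(set_dir, natoms):
    """Load one set.* directory, check array shapes, and return the frame count."""
    set_dir = Path(set_dir)
    box = np.load(set_dir / "box.npy")
    coord = np.load(set_dir / "coord.npy")
    energy = np.load(set_dir / "energy.npy")
    force = np.load(set_dir / "force.npy")

    nframes = coord.shape[0]
    assert box.shape == (nframes, 9), "box must hold 9 cell components per frame"
    assert coord.shape == (nframes, natoms * 3), "coord must hold 3 values per atom"
    assert force.shape == (nframes, natoms * 3), "force must hold 3 values per atom"
    assert energy.size == nframes, "energy must hold one value per frame"
    return nframes


# Demonstration on synthetic data (5 frames, 3 atoms).
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "set.000"
    demo.mkdir()
    nframes, natoms = 5, 3
    np.save(demo / "box.npy", np.tile(np.eye(3).ravel() * 10.0, (nframes, 1)))
    np.save(demo / "coord.npy", np.zeros((nframes, natoms * 3)))
    np.save(demo / "force.npy", np.zeros((nframes, natoms * 3)))
    np.save(demo / "energy.npy", np.zeros(nframes))
    demo_frames = summarize_set(demo, natoms)

print(demo_frames)
```

On real data, the call would be something like summarize_set("deepmd/set.000", natoms) with natoms set to the number of atoms in the system.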
We will skip the above step, since the data set has already been prepared in the folder:
/root/workshop/deepmd-kit/data/deepmd
You may change directory to this folder:
cd /root/workshop/deepmd-kit/data/deepmd
ls
It contains the following files:
box.raw coord.raw energy.raw force.raw set.000 set.001 type_map.raw type.raw
Or you may change directory to this folder:
$ cd /root/workshop/deepmd-kit/data/others/Li
then run the following command:
$ python script.py
The standard output will show:
Data Summary
Unlabeled System
-------------------
Frame Numbers : 1
Atom Numbers : 40
Element List :
-------------------
Li
40
Data Summary
Labeled System
-------------------
Frame Numbers : 10
Atom Numbers : 40
Including Virials : Yes
Element List :
-------------------
Li
40
Running this script converts the VASP output file into the deepmd raw and npy data formats.
Train a model
Write the input script
Here we provide a small dataset of 400 frames taken from an NVT ab initio MD trajectory of water, with 300 frames for training and 100 for validation. The training input file can be viewed and configured as follows:
$ cd /root/workshop/deepmd-kit/train/calc
$ cat water.json
{
"_comment": " model parameters",
"model": {
"type_map": ["O", "H"],
"descriptor" :{
"type": "se_a",
"sel": [46, 92],
"rcut_smth": 5.80,
"rcut": 6.00,
"neuron": [25, 50, 100],
"resnet_dt": false,
"axis_neuron": 16,
"seed": 1,
"_comment": " that's all"
},
"fitting_net" : {
"neuron": [240, 240, 240],
"resnet_dt": true,
"seed": 1,
"_comment": " that's all"
},
"_comment": " that's all"
},
"learning_rate" :{
"type": "exp",
"start_lr": 0.001,
"decay_steps": 2000,
"decay_rate": 0.95,
"_comment": "that's all"
},
"loss" :{
"start_pref_e": 0.02,
"limit_pref_e": 1,
"start_pref_f": 1000,
"limit_pref_f": 1,
"start_pref_v": 0,
"limit_pref_v": 0,
"_comment": " that's all"
},
"_comment": " traing controls",
"training" : {
"systems": ["../../data/deepmd"],
"set_prefix": "set",
"stop_batch": 400000,
"batch_size": 1,
"seed": 1,
"_comment": " display and restart",
"_comment": " frequencies counted in batch",
"disp_file": "lcurve.out",
"disp_freq": 100,
"numb_test": 10,
"save_freq": 1000,
"save_ckpt": "model.ckpt",
"load_ckpt": "model.ckpt",
"disp_training":true,
"time_training":true,
"profiling": false,
"profiling_file":"timeline.json",
"_comment": "that's all"
},
"_comment": "that's all"
}
where water.json is the JSON-format parameter file that controls the training.
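To get a feel for how the learning_rate and loss sections interact, the schedule can be evaluated by hand. The sketch below is not DeePMD-kit code. It assumes a staircase exponential decay of the learning rate and loss prefactors that move from their start value to their limit value in proportion to the current learning rate; both are assumptions about the training procedure, and the function names are hypothetical. The default numbers are taken from water.json.

```python
def learning_rate(batch, start_lr=0.001, decay_steps=2000, decay_rate=0.95):
    """Staircase exponential decay (assumed): lr drops every decay_steps batches."""
    return start_lr * decay_rate ** (batch // decay_steps)


def prefactor(batch, start_pref, limit_pref, start_lr=0.001, **lr_kwargs):
    """Loss prefactor that follows the learning rate (assumed form).

    Equals start_pref while lr == start_lr and approaches limit_pref as lr -> 0.
    """
    ratio = learning_rate(batch, start_lr=start_lr, **lr_kwargs) / start_lr
    return limit_pref * (1.0 - ratio) + start_pref * ratio


# Energy prefactor: 0.02 -> 1, force prefactor: 1000 -> 1 (values from water.json).
for batch in (0, 100000, 400000):
    print(
        batch,
        learning_rate(batch),
        prefactor(batch, start_pref=0.02, limit_pref=1),
        prefactor(batch, start_pref=1000, limit_pref=1),
    )
```

Under these assumptions the force term dominates the loss early in training, and the energy and force terms carry comparable weight by the time stop_batch is reached.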
Training
The training can be invoked by
$ cd /root/workshop/deepmd-kit/train/calc
$ dp train water.json 1> runlog 2>err &
During the training, the error of the model is tested on the fly every disp_freq batches with numb_test frames from the last set in the systems directory, and the results are written to lcurve.out. A typical lcurve.out looks like
# batch l2_tst l2_trn l2_e_tst l2_e_trn l2_f_tst l2_f_trn lr
0 3.25e+01 3.23e+01 1.03e+01 1.03e+01 8.08e-01 8.01e-01 1.0e-03
100 2.59e+01 2.67e+01 1.71e+00 1.70e+00 8.13e-01 8.39e-01 1.0e-03
200 2.54e+01 2.59e+01 2.25e-01 2.29e-01 8.03e-01 8.19e-01 1.0e-03
300 2.44e+01 2.30e+01 1.55e-01 1.55e-01 7.72e-01 7.27e-01 1.0e-03
400 2.21e+01 2.19e+01 3.00e-01 3.08e-01 6.98e-01 6.93e-01 1.0e-03
500 2.05e+01 1.94e+01 1.71e-01 1.76e-01 6.48e-01 6.14e-01 1.0e-03
600 1.46e+01 1.49e+01 1.42e-01 1.37e-01 4.61e-01 4.70e-01 1.0e-03
700 1.22e+01 1.19e+01 1.31e-01 1.32e-01 3.85e-01 3.75e-01 1.0e-03
800 1.35e+01 1.35e+01 3.74e-02 4.24e-02 4.28e-01 4.28e-01 1.0e-03
900 9.58e+00 9.21e+00 1.09e-01 1.12e-01 3.03e-01 2.91e-01 1.0e-03
1000 8.88e+00 8.60e+00 3.31e-02 3.46e-02 2.81e-01 2.72e-01 1.0e-03
1100 8.47e+00 9.02e+00 6.44e-03 5.08e-03 2.68e-01 2.85e-01 1.0e-03
1200 9.03e+00 8.85e+00 4.14e-02 4.10e-02 2.86e-01 2.80e-01 1.0e-03
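The learning curve is plain whitespace-separated text, so it can be read with NumPy for plotting or monitoring. The sketch below assumes the eight-column layout shown above. The helper read_lcurve is a hypothetical name, and the sample rows are copied from the listing so that the example runs without a training directory.

```python
import io

import numpy as np

# Column names as they appear in the header line of lcurve.out above.
COLUMNS = [
    "batch", "l2_tst", "l2_trn", "l2_e_tst",
    "l2_e_trn", "l2_f_tst", "l2_f_trn", "lr",
]


def read_lcurve(source):
    """Return a dict mapping column name to a 1-D array.

    Lines starting with '#' are skipped by np.loadtxt by default.
    """
    data = np.loadtxt(source, ndmin=2)
    return {name: data[:, i] for i, name in enumerate(COLUMNS)}


sample = io.StringIO(
    "# batch l2_tst l2_trn l2_e_tst l2_e_trn l2_f_tst l2_f_trn lr\n"
    "0 3.25e+01 3.23e+01 1.03e+01 1.03e+01 8.08e-01 8.01e-01 1.0e-03\n"
    "600 1.46e+01 1.49e+01 1.42e-01 1.37e-01 4.61e-01 4.70e-01 1.0e-03\n"
    "1200 9.03e+00 8.85e+00 4.14e-02 4.10e-02 2.86e-01 2.80e-01 1.0e-03\n"
)
curve = read_lcurve(sample)

# Find the batch with the lowest force error on the test frames.
best = int(np.argmin(curve["l2_f_tst"]))
print(curve["batch"][best], curve["l2_f_tst"][best])
```

For a real run, pass the file name instead of the in-memory sample, for example read_lcurve("lcurve.out").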
Freeze a model
After finishing training, the trained neural network can be extracted from a checkpoint and dumped into a database. This process is called “freezing” a model. To freeze a model, typically one does
$ cd /root/workshop/deepmd-kit/train/calc
$ dp freeze -o graph.pb
in the folder where the model is trained. The output database is called graph.pb.
Test a model
The frozen model can be used in many ways. The most straightforward test can be performed using dp test. Assuming that you have prepared the test set in the folder /root/workshop/deepmd-kit/data/test, run the following command to test the performance of the model:
$ cd /root/workshop/deepmd-kit/train/calc
$ dp test -m graph.pb -s /root/workshop/deepmd-kit/data/test -d result
The standard output is:
# number of test data : 30
Energy L2err : 1.358342e-01 eV
Energy L2err/Natoms : 7.074696e-04 eV
Force L2err : 3.553606e-02 eV/A
Virial L2err : 5.000555e+00 eV
Virial L2err/Natoms : 2.604456e-02 eV
At the same time, the output files result.e.out, result.f.out and result.v.out record the predicted energy, force and virial information.
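The per-atom lines in this output are the total errors divided by the number of atoms in the system. As a quick consistency check, the atom count can be recovered from the numbers printed above. The value of 192 atoms (64 water molecules) is inferred from these numbers and is not stated elsewhere in the tutorial.

```python
# Values copied from the dp test output above.
energy_l2err = 1.358342e-01            # eV
energy_l2err_per_atom = 7.074696e-04   # eV
virial_l2err = 5.000555e+00            # eV
virial_l2err_per_atom = 2.604456e-02   # eV

# total / per-atom should give the same atom count for both quantities.
natoms_from_energy = round(energy_l2err / energy_l2err_per_atom)
natoms_from_virial = round(virial_l2err / virial_l2err_per_atom)

print(natoms_from_energy, natoms_from_virial)
```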
Model inference
To use the python interface of DeePMD-kit for model inference, an example is given as follows
import deepmd.DeepPot as DP
from pprint import pprint
import numpy as np
dp = DP('graph.pb')
coord = np.array([[1,0,0], [0,0,1.5], [1,0,3]]).reshape([1, -1])
cell = np.diag(10 * np.ones(3)).reshape([1, -1])
atype = [1,0,1]
e, f, v = dp.eval(coord, cell, atype)
print('-'*20)
pprint(e)
print('-'*20)
pprint(f)
print('-'*20)
pprint(v)
where e, f and v are the predicted energy, force and virial of the system, respectively.
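The reshape calls in the example flatten each frame into one row: the coordinates become natoms × 3 values and the cell becomes 9 values. A small NumPy-only sketch of that packing step is given below. The helper pack_frames is a hypothetical name and does not call DeePMD-kit.

```python
import numpy as np


def pack_frames(positions, cells):
    """Flatten per-frame arrays into one row per frame.

    positions: array-like of shape (nframes, natoms, 3)
    cells:     array-like of shape (nframes, 3, 3)
    Returns (coord, cell) with shapes (nframes, natoms * 3) and (nframes, 9).
    """
    positions = np.asarray(positions, dtype=float)
    cells = np.asarray(cells, dtype=float)
    nframes = positions.shape[0]
    return positions.reshape(nframes, -1), cells.reshape(nframes, -1)


# One frame with three atoms in a 10 Å cubic cell, as in the example above.
positions = [[[1, 0, 0], [0, 0, 1.5], [1, 0, 3]]]
cells = [np.diag(10 * np.ones(3))]
coord, cell = pack_frames(positions, cells)

print(coord.shape, cell.shape)
```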
To run this example, just type:
$ cd /root/workshop/deepmd-kit/interfaces/calc
$ python -u run.py | tee run.log
Run MD with LAMMPS
Include deepmd in the pair style
Here we will use the previous water potential to run MD with LAMMPS. In the LAMMPS input file, one needs to specify the pair style as follows
# bulk water
units metal
boundary p p p
atom_style atomic
neighbor 2.0 bin
neigh_modify every 10 delay 0 check no
read_data water.lmp
mass 1 16
mass 2 2
pair_style deepmd graph.pb
pair_coeff
velocity all create 330.0 23456789
fix 1 all nvt temp 330.0 330.0 0.5
timestep 0.0005
thermo_style custom step pe ke etotal temp press vol
thermo 100
dump 1 all custom 100 water.dump id type x y z
run 1000
where graph.pb is the file name of the frozen model. The pair_coeff should be left blank. Note that LAMMPS counts atom types starting from 1; therefore, every LAMMPS atom type is first decremented by 1 and then passed to the DeePMD-kit engine to compute the interactions.
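The type offset can be illustrated with the type_map from water.json: LAMMPS type 1 corresponds to index 0 ("O") and LAMMPS type 2 to index 1 ("H"), which matches the masses 16 and 2 set in the input above. The helper below is a hypothetical illustration and not part of LAMMPS or DeePMD-kit.

```python
TYPE_MAP = ["O", "H"]  # "type_map" from water.json


def lammps_type_to_element(lammps_type, type_map=TYPE_MAP):
    """Map a 1-based LAMMPS atom type to the element name in type_map."""
    index = lammps_type - 1  # the model's types are 0-based
    if not 0 <= index < len(type_map):
        raise ValueError(f"LAMMPS type {lammps_type} has no entry in type_map")
    return type_map[index]


print(lammps_type_to_element(1))
print(lammps_type_to_element(2))
```

When writing water.lmp, the atom types therefore have to follow the order of type_map, shifted by one.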
To run this example, just type:
$ cd /root/workshop/deepmd-kit/lmp/calc
$ lmp -in in.lammps | tee run.log
Use deep potential with ASE
Deep potential can be set up as a calculator with ASE to obtain potential energies and forces.
from ase import Atoms
from pprint import pprint
from deepmd.calculator import DP
water = Atoms('H2O',
              positions=[(0.7601, 1.9270, 1),
                         (1.9575, 1, 1),
                         (1., 1., 1.)],
              cell=[100, 100, 100],
              calculator=DP(model="graph.pb"))
pprint(water.get_potential_energy())
pprint(water.get_forces())
To run this example, just type:
$ cd /root/workshop/deepmd-kit/ase/calc
$ python run.py | tee run.log
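The formula 'H2O' assigns the first two positions to hydrogen and the third to oxygen. As a quick check of this starting geometry, the O-H bond lengths and the H-O-H angle can be computed with NumPy alone; the sketch below needs neither ASE nor DeePMD-kit.

```python
import numpy as np

# Positions from the example above, in Å: H, H, O.
positions = np.array([
    [0.7601, 1.9270, 1.0],
    [1.9575, 1.0, 1.0],
    [1.0, 1.0, 1.0],
])
h1, h2, o = positions

# Vectors from the oxygen atom to each hydrogen atom.
v1, v2 = h1 - o, h2 - o
bond1 = np.linalg.norm(v1)
bond2 = np.linalg.norm(v2)

# H-O-H angle from the dot product, converted to degrees.
angle = np.degrees(np.arccos(np.dot(v1, v2) / (bond1 * bond2)))

print(f"O-H bonds: {bond1:.4f} Å, {bond2:.4f} Å; H-O-H angle: {angle:.2f} deg")
```

Both bonds come out close to 0.9575 Å with an angle of about 104.5 degrees, so the initial structure is already near the gas-phase geometry of a water molecule.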
Optimization is also available:
from ase import Atoms
from ase.optimize import BFGS
from pprint import pprint
from deepmd.calculator import DP
water = Atoms('H2O',
              positions=[(0.7601, 1.9270, 1),
                         (1.9575, 1, 1),
                         (1., 1., 1.)],
              cell=[100, 100, 100],
              calculator=DP(model="graph.pb"))
dyn = BFGS(water)
dyn.run(fmax=1e-6)
print(water.get_positions())
print(water.get_potential_energy())
print(water.get_forces())
To run this example, just type:
$ cd /root/workshop/deepmd-kit/ase/calc
$ python opt.py | tee opt.log