Abstract: The focus of molecular dynamics simulation is how to build a model to describe the interaction between molecules.

This article is shared from the Huawei Cloud Community post "AI Modeling-Molecular Dynamics Simulation", author: Muzi_007.

1. Background

image.png

Molecular dynamics simulation is widely used in medicine, chemistry, biology, materials science, and other fields. Studying the microstructure of a simulated substance helps us understand its macroscopic properties, and even predict them. The microstructure of a substance is determined by the interactions between its atoms, so the focus of molecular dynamics simulation is how to build a model to describe the interaction between molecules.
image.png

There are two traditional modeling approaches: DFT (density functional theory, a first-principles method) and empirical force fields.

DFT

DFT is a first-principles method. The specific modeling and calculation process is very complicated and requires deep mathematical and domain knowledge.
image.png

The model built by DFT can be regarded as a black box: given the input, it calculates the state of the atoms in the next frame.

DFT results are relatively accurate, but the computational cost is so high that it can only simulate physical systems of a few hundred atoms.

Empirical force field

An empirical force field is a high-order function that researchers construct when studying a particular physical system, based on its physical characteristics, some experimental results, and accumulated experience. One example is the potential energy function used when studying inert gases.
image.png
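As an illustration of what such an empirical potential can look like, the sketch below assumes the Lennard-Jones 12-6 form, a pair potential widely used for noble gases. Whether this is the function shown in the original figure is an assumption; the names `lennard_jones`, `epsilon`, and `sigma` and the reduced units are chosen for this example only.

```python
def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """Lennard-Jones 12-6 pair potential energy at separation r (r > 0)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

r_min = 2.0 ** (1.0 / 6.0)   # position of the energy minimum when sigma = 1
print(lennard_jones(1.0))    # the energy crosses zero at r = sigma
print(lennard_jones(r_min))  # well depth, approximately -epsilon
```

A closed-form expression like this is cheap to evaluate for millions of atom pairs, which is where the efficiency comes from; the fixed functional form is also what limits its accuracy.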

This construction method is computationally efficient, but the way the model is built limits its accuracy. Traditional molecular dynamics simulation therefore faces a dilemma: it cannot balance efficiency and accuracy. With the spread of deep learning, this dilemma can now be addressed. Next, I will briefly introduce the modeling ideas of DeepMD, a molecular dynamics simulation framework.
image.png

2. AI modeling (water H2O as an example)

The interaction between microscopic particles
image.png

This interaction is essentially a high-dimensional function of the atomic coordinates in space. If this function can be computed, a simulation can achieve both performance and accuracy. Traditional mathematical tools lack effective means for handling high-dimensional functions, while deep learning, which is essentially a mathematical tool, is a powerful way to approximate them. Next, I will introduce the modeling idea and process. First, we should clarify our purpose and the conditions we already have:

Purpose: Build a deep learning network and train it to approximate a high-dimensional function (the model) that represents this physical system, so that its results are close to those of first-principles calculations.

Model:
image.png

Training and test data:

The atomic states of different frames of the system, calculated from first principles, including the atomic coordinates (coord), the spatial dimensions of the simulation box (box), the energy of the system, and the force on each atom in the system.
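The four quantities listed above can be pictured as arrays with one row per frame. The sketch below is only an assumed layout for illustration: the array shapes, the dictionary keys, and the random placeholder values are not taken from the article, and real values would come from first-principles calculations.

```python
import numpy as np

n_frames, n_atoms = 4, 3  # toy sizes: four snapshots of a single H2O molecule
rng = np.random.default_rng(0)

# One row per frame; random numbers stand in for first-principles results.
frames = {
    "coord": rng.normal(size=(n_frames, n_atoms * 3)),        # x, y, z of every atom
    "box": np.tile(np.eye(3).ravel() * 10.0, (n_frames, 1)),  # 3x3 cell vectors, flattened
    "energy": rng.normal(size=(n_frames,)),                   # one total energy per frame
    "force": rng.normal(size=(n_frames, n_atoms * 3)),        # fx, fy, fz of every atom
}

# Hold out the last frame for testing.
train = {key: value[:-1] for key, value in frames.items()}
test = {key: value[-1:] for key, value in frames.items()}
print({key: value.shape for key, value in train.items()})
```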

Training:

The network is trained with an L2 loss.
image.png

2.1 Data processing

Based on the above analysis, we are going to build this model. The input is the coordinates of the atoms in the system, and the output is the potential energy of the system and the forces on the atoms. However, one point needs special attention. For a system such as water, if a water molecule is rotated or translated, or if its two hydrogen atoms are exchanged, the energy remains unchanged.
image.png

These invariances are intuitive to us, but from the model's perspective they require that the output stay the same after the coordinates change. A model fed raw coordinates generally cannot guarantee this, so before the data enters the deep neural network (DNN), the original three-dimensional atomic coordinates must be processed so that they satisfy translation, rotation, and exchange invariance in space.
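To make the problem concrete, the following sketch feeds raw Cartesian coordinates to a stand-in model and applies the three operations to a water molecule. Everything here is hypothetical: the geometry values, the random `weights`, and the linear `naive_energy` function merely represent any model that consumes unprocessed coordinates.

```python
import numpy as np

# Toy water geometry in angstroms: O, H, H (values chosen for illustration).
water = np.array([[0.00, 0.00, 0.0],
                  [0.96, 0.00, 0.0],
                  [-0.24, 0.93, 0.0]])

rng = np.random.default_rng(0)
weights = rng.normal(size=9)  # stand-in for a model fed raw coordinates

def naive_energy(coords):
    """A linear 'model' acting directly on the flattened Cartesian coordinates."""
    return float(weights @ coords.ravel())

theta = np.pi / 3
rot_z = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                  [np.sin(theta),  np.cos(theta), 0.0],
                  [0.0, 0.0, 1.0]])

variants = {
    "original": water,
    "translated": water + np.array([1.0, -2.0, 0.5]),
    "rotated": water @ rot_z.T,
    "hydrogens swapped": water[[0, 2, 1]],
}
for name, coords in variants.items():
    print(f"{name:18s} {naive_energy(coords):+.4f}")
```

The four printed values differ even though all four inputs describe the same physical molecule, which is why the coordinates are transformed before they reach the network.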

2.1.1 Translation invariance

To ensure translation invariance, we can convert the coordinates of the atoms in space into relative distances between atoms, building a distance matrix Ri.
image.png
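A minimal sketch of this step, assuming that the rows of Ri are the relative position vectors from a center atom i to each of its neighbors. The helper name `relative_coords` is hypothetical, and periodic boundary conditions are ignored for brevity.

```python
import numpy as np

def relative_coords(coords, i):
    """Rows are r_j - r_i for every neighbor j != i of the center atom i."""
    neighbors = np.delete(coords, i, axis=0)
    return neighbors - coords[i]

water = np.array([[0.00, 0.00, 0.0],
                  [0.96, 0.00, 0.0],
                  [-0.24, 0.93, 0.0]])
shift = np.array([5.0, -3.0, 2.0])

r_0 = relative_coords(water, 0)
r_0_shifted = relative_coords(water + shift, 0)
print(np.allclose(r_0, r_0_shifted))  # the shift cancels in r_j - r_i
```

Because every entry is a difference of two positions, adding the same vector to all atoms leaves the matrix unchanged.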

Taking each atom as the center of its own frame, the distance matrix is smoothed, and the contribution of atoms beyond the cutoff radius is set to 0.
image.png
image.png
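One commonly used smoothing of this kind weights each neighbor by 1/r at short range and switches the weight off with a cosine between an inner radius and the cutoff. The sketch below assumes that form; it may differ from the expressions in the original figures, and the names `smooth_weight`, `r_smooth`, and `r_cut` and their default values are hypothetical.

```python
import math

def smooth_weight(r, r_smooth=2.0, r_cut=6.0):
    """Weight 1/r inside r_smooth, decaying smoothly to 0 at r_cut."""
    if r >= r_cut:
        return 0.0
    if r < r_smooth:
        return 1.0 / r
    x = (r - r_smooth) / (r_cut - r_smooth)
    return (0.5 * math.cos(math.pi * x) + 0.5) / r

for r in (1.0, 2.0, 4.0, 5.999, 6.0, 8.0):
    print(r, smooth_weight(r))
```

The weight is continuous at both radii, so an atom drifting across the cutoff does not cause a jump in the model input.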

2.1.2 Exchange and rotation invariance

To satisfy exchange and rotation invariance in space, some changes need to be made to the distance matrix. Only the conclusions are given here.

Create a matrix G and express the environment descriptor D in the following form, which satisfies exchange invariance.
image.png

At the same time, because this form is unchanged when the distance matrix is rotated, it also satisfies rotation invariance.
image.png
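The two invariances can be checked numerically. The sketch below assumes a descriptor of the form D = G1^T R R^T G2, where G is an embedding computed from the radial weights only and G2 is a subset of its columns. This form, the toy two-layer embedding with random weights `W1` and `W2`, and the function name `descriptor` are assumptions for illustration; atom types and the cutoff are left out.

```python
import numpy as np

rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(1, 8)), rng.normal(size=(8, 4))  # toy embedding weights

def descriptor(rel, m2=2):
    """D = G1^T R R^T G2 built from relative coordinates rel (one row per neighbor)."""
    r = np.linalg.norm(rel, axis=1, keepdims=True)
    s = 1.0 / r                          # radial weight (cutoff omitted here)
    R = np.hstack([s, s * rel / r])      # generalized coordinates, shape (n, 4)
    G = np.tanh(np.tanh(s @ W1) @ W2)    # embedding of s only, shape (n, 4)
    return G.T @ R @ R.T @ G[:, :m2]     # shape (4, m2)

rel = rng.normal(size=(5, 3))                 # five neighbors of one center atom
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))  # a random orthogonal matrix
perm = rng.permutation(5)

D = descriptor(rel)
print(np.allclose(D, descriptor(rel[perm])))  # exchange of neighbors
print(np.allclose(D, descriptor(rel @ Q.T)))  # rotation or reflection
```

Exchange invariance holds because G^T R is a sum over neighbors, so the order of the rows does not matter. Rotation invariance holds because R R^T contains only inner products between neighbor vectors, and G depends only on distances.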

In this way, the spatial coordinates are converted into the environment descriptor D, which has actual physical meaning and can be used in the DNN training that follows.

2.2 Whole network training

At this point, the construction of the training network is complete, as shown in the following figure.
image.png

The structure of the deep potential is as follows:
image.png

Environment descriptor
image.png

The force on an atom is obtained from the smooth potential energy: in practice, it is the negative gradient of the potential energy surface with respect to that atom's position.
image.png
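The relation between energy and force, F = -dE/dr, can be sketched with a toy potential. The harmonic pair energy below is only a stand-in for the trained network, and central finite differences are used to keep the example dependency-light; deep learning frameworks would normally obtain the same gradient by automatic differentiation. The function names `energy` and `forces` are hypothetical.

```python
import numpy as np

def energy(coords):
    """Toy smooth potential: harmonic springs of rest length 1 between all pairs."""
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    iu = np.triu_indices(len(coords), k=1)
    return float(np.sum((dist[iu] - 1.0) ** 2))

def forces(coords, h=1e-5):
    """F = -dE/dr, estimated with central finite differences."""
    f = np.zeros_like(coords)
    for idx in np.ndindex(*coords.shape):
        step = np.zeros_like(coords)
        step[idx] = h
        f[idx] = -(energy(coords + step) - energy(coords - step)) / (2 * h)
    return f

water = np.array([[0.00, 0.00, 0.0],
                  [0.96, 0.00, 0.0],
                  [-0.24, 0.93, 0.0]])
F = forces(water)
print(F)
print(np.allclose(F.sum(axis=0), 0.0, atol=1e-6))  # net force vanishes
```

The forces sum to zero because the energy depends only on interatomic distances, which is the translation invariance discussed earlier showing up in the output.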

The loss function is defined as
image.png
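A loss of this kind can be sketched as a weighted sum of squared errors on the energy and the forces. The exact terms and weights in the original figure are not recoverable, so the normalization by atom count and the prefactor names `p_e` and `p_f` below are assumptions.

```python
import numpy as np

def l2_loss(e_pred, e_ref, f_pred, f_ref, p_e=1.0, p_f=1.0):
    """Weighted squared error on energy per atom plus mean squared force error."""
    n_atoms = f_ref.shape[0]
    energy_term = ((e_pred - e_ref) / n_atoms) ** 2
    force_term = np.sum((f_pred - f_ref) ** 2) / (3 * n_atoms)
    return p_e * energy_term + p_f * force_term

f_ref = np.zeros((3, 3))
f_pred = np.full((3, 3), 0.1)
print(l2_loss(e_pred=-14.7, e_ref=-15.0, f_pred=f_pred, f_ref=f_ref))
```

Adjusting the two prefactors changes how strongly training favors matching energies versus matching forces.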

With this, the modeling is complete; what remains is the concrete implementation.

3. Summary

The above shares only some of the design ideas behind the DeepMD-kit framework. There are many frameworks and ideas in the field of molecular dynamics simulation that are worth studying. For example, to improve the model's coverage of the distribution of physical scenarios, data must be randomly sampled and generated during training and given to the model for prediction. If a prediction is inaccurate, that data is added to the training set and training continues. This procedure is what DeepGen, a companion framework to DeepMD-kit, packages. When time allows, I will summarize more of these design ideas and algorithm optimizations.
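The sample, predict, and retrain loop described above can be sketched on a one-dimensional toy problem. Nothing here reflects the actual framework: `reference` stands in for an expensive first-principles calculation, a polynomial fit stands in for network training, and the error threshold of 0.05 is arbitrary.

```python
import numpy as np

def reference(x):
    """Stand-in for an expensive first-principles calculation."""
    return np.sin(2.0 * x)

def fit(x, y, degree=7):
    """Stand-in for training: a least-squares polynomial fit."""
    return np.polynomial.Polynomial.fit(x, y, degree)

rng = np.random.default_rng(0)
x_train = rng.uniform(-0.5, 0.5, size=10)  # initial data covers a narrow region
model = fit(x_train, reference(x_train))

for round_id in range(5):
    candidates = rng.uniform(-2.0, 2.0, size=50)  # randomly sampled configurations
    errors = np.abs(model(candidates) - reference(candidates))
    bad = candidates[errors > 0.05]               # inaccurate predictions
    print(f"round {round_id}: {bad.size} of 50 candidates added")
    if bad.size == 0:
        break
    x_train = np.concatenate([x_train, bad])
    model = fit(x_train, reference(x_train))
```

Each round extends the training set only where the current model fails, so labeling effort is spent on the regions the model has not yet covered.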
