Abstract: Building on MindSpore features such as automatic parallelism and graph-kernel fusion, SPONGE can efficiently run the traditional molecular simulation workflow. MindSpore's automatic differentiation also lets it combine AI methods such as neural networks with traditional molecular simulation.

This article is shared from the HUAWEI CLOUD community post "MindSpore new-generation molecular simulation library: SPONGE". Original author: Yu Fan, MindSpore algorithm scientist.

SPONGE, MindSpore's new-generation molecular simulation library, was jointly developed by Gao Yiqin's research group at Peking University and Shenzhen Bay Laboratory and by the Huawei MindSpore team. It is a high-performance, modular molecular simulation library developed entirely in-house. Building on MindSpore's automatic parallelism and graph-kernel fusion, SPONGE can efficiently run the traditional molecular simulation workflow. With MindSpore's automatic differentiation, AI methods such as neural networks can be combined with traditional molecular simulation.

Background introduction

Molecular simulation uses computers and atomic-level molecular models to simulate molecular structure and behavior, and from these the physical and chemical properties of molecular systems. It builds models and algorithms on top of experiments and basic principles in order to compute reasonable molecular structures and behaviors. In recent years, molecular simulation has developed rapidly and is now widely used across disciplines. In drug design, it is used to study the mechanisms of action of viruses and drugs. In the biological sciences, it is used to characterize the hierarchical structure and properties of proteins. In materials science, it is used to study structure and mechanical properties and to optimize material design. In chemistry, it is used to study surface catalysis and its mechanisms. In the petrochemical industry, it is used for the structural characterization, synthesis design, adsorption, and diffusion of molecular sieve catalysts; it can also build and characterize the structure of polymer chains and of crystalline or amorphous bulk polymers, and predict important properties such as blending behavior, mechanical properties, diffusion, and cohesion.

Because of the limits on the time and length scales that simulations can reach, traditional molecular dynamics software is heavily restricted in its applications. Researchers must keep developing new force fields and sampling methods and bring in new technologies (such as AI algorithms) to extend the scenarios that molecular dynamics simulation can cover. SPONGE, which has fully independent intellectual property rights, was created to meet this need. Its modular design lets scientists build the calculation modules needed for molecular dynamics simulation efficiently and conveniently, while retaining the efficiency expected of traditional simulation. SPONGE also integrates naturally with artificial intelligence algorithms and can use the high-performance computing features of the MindSpore framework itself.

Compared with traditional molecular simulation software that is combined with the SITS method for enhanced sampling of biomolecules, SPONGE supports SITS natively and optimizes the calculation process, so simulating biological systems with SITS is more efficient. For polarizable systems, traditional molecular simulation combines quantum chemistry calculations and other methods to handle problems such as charge fluctuation. Even when machine learning is used to reduce the amount of calculation, a lot of time is lost transferring data between programs. SPONGE's modular design supports direct in-memory communication with machine learning programs, which greatly reduces the overall calculation time.

Figure 1: Enhanced sampling of alanine dipeptide in explicit solvent using SITS combined with other methods

SPONGE, open-sourced alongside MindSpore 1.2, has the following advantages:

  1. Fully modular molecular simulation. Because molecular simulation algorithms are built from modules, domain developers can implement theories and algorithms quickly, and external developers get a friendly open source community in which to contribute sub-modules.
  2. A complete workflow that combines traditional molecular simulation with artificial intelligence algorithms on MindSpore. Within MindSpore, developers can conveniently apply AI methods to molecular simulation. SPONGE, fully implemented as operators, will be further integrated with MindSpore to become a new generation of end-to-end differentiable molecular simulation software, achieving a natural integration of artificial intelligence and molecular simulation.

Case Introduction

The following is a simple example of SPONGE on MindSpore: simulating an alanine tripeptide in aqueous solution.

Before starting, make sure that MindSpore is installed correctly. If it is not, install it by following the MindSpore installation page on the MindSpore official website.

1. Input file preparation

Three input files need to be loaded in the simulation system of this tutorial, namely:

· Property file (suffix .in), which declares the basic conditions of the simulation and controls the parameters of the whole simulation process.

· Topology file (suffix .parm7), which describes the topological relationships and the various parameters of the molecules in the system.

· Coordinate file (suffix .rst7), which gives the coordinates of each atom in the system at the initial moment.

The topology and coordinate files can be generated with tleap, the modeling tool that ships with AmberTools (download address: Download Amber MD).

After building the topology and coordinate files with tleap, declare the basic simulation conditions in the property file, which controls the parameters of the whole simulation process. The property file used in this tutorial is shown below:

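NVT 290k
 mode = 1, # molecular dynamics (MD) mode; 1 means the simulation uses the NVT ensemble
 dt= 0.001, # simulation time step
 step_limit = 1, # total number of simulation steps
 thermostat=1, # temperature control method; 1 means the Liujian-Langevin method
 langevin_gamma=1.0, # Gamma_ln parameter of the thermostat
 target_temperature=290, # target temperature
 write_information_interval=1000, # output frequency
 amber_irest=0, # input mode; 0 means reading an Amber-format input coordinate file that contains no velocities
 cut=10.0,  # cutoff distance for non-bonded interactions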

Once the input files for this case are ready, they are named NVT_290_10ns.in, WATER_ALA.parm7 and WATER_ALA_350_cool_290.rst7. The three files can be stored in any path in the local workspace.
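To make the property file format concrete, here is a minimal sketch of how such a file could be read into a Python dictionary. It is not SPONGE's own parser: the function names `parse_property_file` and `_to_number` are hypothetical, and the sketch assumes that every setting is a `key = value,` pair optionally followed by a `#` comment, with a free-form title on the first non-empty line, as in the example above.

```python
def _to_number(value):
    """Convert a string to int or float when possible, else return it unchanged."""
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            pass
    return value


def parse_property_file(text):
    """Parse 'key = value, # comment' lines into (title, settings dict).

    Hypothetical helper for illustration only, not part of SPONGE.
    """
    title = None
    params = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop the trailing comment
        if not line:
            continue
        if "=" not in line:
            if title is None:
                title = line  # first free-form line, e.g. "NVT 290k"
            continue
        key, value = line.split("=", 1)
        value = value.strip().rstrip(",").strip()  # drop the trailing comma
        params[key.strip()] = _to_number(value)
    return title, params
```

Calling `parse_property_file(open("NVT_290_10ns.in").read())` on the file above would give a title string plus a dictionary with keys such as `mode`, `dt` and `cut`.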

2. Load data

From the three input files, read the parameters required by the simulation system and use them for the calculation of the final system. The loading code is as follows:

import argparse

from mindspore import context

# Command-line options: the three input files plus the output locations.
parser = argparse.ArgumentParser(description='Sponge Controller')
parser.add_argument('--i', type=str, default=None, help='input file')
parser.add_argument('--amber_parm', type=str, default=None, help='parameter file in AMBER type')
parser.add_argument('--c', type=str, default=None, help='initial coordinates file')
parser.add_argument('--r', type=str, default="restrt", help='')
parser.add_argument('--x', type=str, default="mdcrd", help='')
parser.add_argument('--o', type=str, default="mdout", help="")
parser.add_argument('--box', type=str, default="mdbox", help='')
parser.add_argument('--device_id', type=int, default=0, help='')
args_opt = parser.parse_args()

# Run in graph mode on the selected GPU.
context.set_context(mode=context.GRAPH_MODE, device_target="GPU", device_id=args_opt.device_id, save_graphs=False)
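Since `--i`, `--amber_parm` and `--c` all default to `None`, it can help to check the three input paths before the simulation is constructed. The helper below is a small sketch of such a check; `missing_inputs` is a hypothetical name and is not part of SPONGE or MindSpore.

```python
import os


def missing_inputs(paths):
    """Return the labels of input files that are unset or do not exist.

    `paths` maps a label (for example the flag name) to a path or None.
    Hypothetical helper for illustration only.
    """
    missing = []
    for label, path in paths.items():
        if path is None or not os.path.isfile(path):
            missing.append(label)
    return missing
```

With the parser above, it could be called as `missing_inputs({"--i": args_opt.i, "--amber_parm": args_opt.amber_parm, "--c": args_opt.c})`, stopping early if the returned list is not empty.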

3. Build the simulation process

Using the force-calculation and energy-calculation modules defined in SPONGE, the system is evolved through many molecular dynamics iterations until it reaches the required equilibrium state, and the energy and other data obtained at each simulation step are recorded. The code that builds the simulation process is as follows:

from src.simulation_initial import Simulation
from mindspore import Tensor

if __name__ == "__main__":
    simulation = Simulation(args_opt)
    save_path = args_opt.o
    for steps in range(simulation.md_info.step_limit):
        # print_step is 0 once every simulation.ntwx steps
        print_step = steps % simulation.ntwx
        # and is forced to 0 on the last step
        if steps == simulation.md_info.step_limit - 1:
            print_step = 0
        temperature, total_potential_energy, sigma_of_bond_ene, sigma_of_angle_ene, sigma_of_dihedral_ene, \
        nb14_lj_energy_sum, nb14_cf_energy_sum, LJ_energy_sum, ee_ene, _ = simulation(Tensor(steps), Tensor(print_step))
        # compute energy and temperature
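The `print_step` bookkeeping in the loop can be checked in isolation. The sketch below reproduces only that arithmetic in plain Python, without MindSpore, and lists the steps at which `print_step` equals 0. It assumes that a value of 0 is what marks a step for output, which the code above suggests but does not state; `output_steps` is a hypothetical name.

```python
def output_steps(step_limit, ntwx):
    """Return the step indices at which print_step is 0.

    Mirrors the loop logic: every ntwx-th step, plus the final step.
    """
    selected = []
    for steps in range(step_limit):
        print_step = steps % ntwx
        if steps == step_limit - 1:
            print_step = 0  # the last step is always selected
        if print_step == 0:
            selected.append(steps)
    return selected
```

For example, `output_steps(10, 4)` selects steps 0, 4 and 8 from the modulo rule and step 9 because it is the last one.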

4. Run the script

python main.py --i /path/NVT_290_10ns.in \
 --amber_parm /path/WATER_ALA.parm7 \
 --c /path/WATER_ALA_350_cool_290.rst7 \
 --o /path/ala_NVT_290_10ns.out

Here, --i is the property file of the MD simulation, which controls the simulation process; --amber_parm is the topology file of the simulated system; --c is the initial coordinate file; and --o is the output file, which records the energy and other information for each output step. /path is the directory that holds the input files; in this tutorial, it is the sponge_in folder.

Using the input files, the simulation computes forces and energies at the specified temperature and evolves the system through molecular dynamics.

5. Running results

The results are written to the .out file, which records the energy changes of the system so that the thermodynamic information of the simulated system can be inspected. The .out file contains the following columns:

_steps_ _TEMP_ _TOT_POT_ENE_ _BOND_ENE_ _ANGLE_ENE_ _DIHEDRAL_ENE_ _14LJ_ENE_ _14CF_ENE_ _LJ_ENE_ _CF_PME_ENE_

The file records the quantities output during the simulation: the iteration number (_steps_), temperature (_TEMP_), total potential energy (_TOT_POT_ENE_), bond energy (_BOND_ENE_), bond angle energy (_ANGLE_ENE_), dihedral angle energy (_DIHEDRAL_ENE_), and the non-bonded interactions, which comprise electrostatic and Lennard-Jones terms (_14LJ_ENE_, _14CF_ENE_, _LJ_ENE_, _CF_PME_ENE_).
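For post-processing, the output can be loaded into Python. The sketch below assumes the .out file is plain text with the header line shown above followed by one whitespace-separated row of numbers per output step; the exact layout is not specified here, so treat this as an assumption. `read_mdout` is a hypothetical helper, not part of SPONGE.

```python
def read_mdout(text):
    """Parse whitespace-separated rows under a header line into dicts of floats.

    Assumes the first non-empty line is the header (e.g. "_steps_ _TEMP_ ...").
    Rows whose field count differs from the header are skipped.
    """
    header = None
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if header is None:
            header = fields  # column names taken from the header line
            continue
        if len(fields) != len(header):
            continue
        rows.append({name: float(value) for name, value in zip(header, fields)})
    return rows
```

Each returned row is a dictionary keyed by column name, so the temperature trace, for instance, is `[row["_TEMP_"] for row in rows]`.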

Tutorial document: https://gitee.com/mindspore/docs/blob/master/tutorials/training/source_zh_cn/advanced_use/hpc_sponge.md

Outlook

Future versions will add more practical molecular dynamics simulation modules to support more applications. After that, SPONGE's modules will gradually support automatic differentiation and automatic parallelism, making it easier to connect machine learning solutions. Molecular dynamics enthusiasts and researchers are welcome to join us in developing and maintaining SPONGE.
