
(Reproduced from Microsoft Research AI headlines)

Editor's note: In 2020, the COVID-19 pandemic raged around the world. To control its spread and find countermeasures, the US Centers for Disease Control and Prevention (CDC) released a large amount of pandemic-related data and sought help from the world's top research institutions, hoping that scientists could apply their technical expertise to produce forecasts of high reference value and help formulate effective control strategies. Using spatio-temporal forecasting technology, Microsoft Research Asia trained a COVID-19 forecasting model, which was adopted by the US CDC in the second half of 2020. Over the past year, this model has outperformed the models submitted by more than 40 other research institutions around the world. Building on that accumulated technology, Microsoft Research Asia recently released FOST, an open source spatio-temporal forecasting tool for use across industries.

What do "time" and "space" mean here? "Time" refers to time series, and "space" refers to mutual influence and connection across space. For example, in the logistics industry, the historical delivery volume of each site forms a time series, while the transit and distribution sites are connected to one another spatially. Similarly, in COVID-19 prevention and control, the daily number of infections in each administrative region, viewed on its own, is a temporal relationship, while the relationships between regions are spatial.
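As a rough illustration of this definition, spatio-temporal data can be pictured as one time series per node plus a graph of connections between nodes. The sketch below uses made-up toy numbers, and the names `series`, `neighbors`, `temporal_view`, and `spatial_view` are hypothetical; it is not FOST's data format.

```python
# Hypothetical toy data: daily delivery volumes for three logistics sites.
series = {
    "site_A": [120, 130, 125, 140],
    "site_B": [80, 85, 90, 95],
    "site_C": [200, 190, 210, 205],
}

# Spatial links: which sites exchange parcels with which (undirected).
neighbors = {
    "site_A": ["site_B"],
    "site_B": ["site_A", "site_C"],
    "site_C": ["site_B"],
}


def temporal_view(node):
    """The 'time' dimension: one node's own history."""
    return series[node]


def spatial_view(node, day):
    """The 'space' dimension: the neighbors' values on the same day."""
    return {n: series[n][day] for n in neighbors[node]}


print(temporal_view("site_B"))
print(spatial_view("site_B", day=3))
```

A spatio-temporal model uses both views at once: each site's own history and what is happening at the sites it is connected to.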

Because temporal and spatial factors are present in so many industries, spatio-temporal forecasting has become key to scientific decision-making and efficiency optimization. Microsoft Research Asia recently open-sourced FOST, a spatio-temporal forecasting tool for the entire industry that is highly general and easy to use. Companies and institutions with related needs can use it to build efficient spatio-temporal forecasting solutions.

2943bd0a73e3902144b6113ed53285aa.jpg

Abstracting common needs: FOST, an open source tool for spatio-temporal forecasting

In recent years, through close cooperation with industry partners, researchers at Microsoft Research Asia have found that demand for spatio-temporal forecasting is widespread in industries such as logistics, telecommunications, healthcare, and transportation. However, most spatio-temporal forecasting work is still at the research stage. In practice, teams can only borrow ideas from one another, and anyone who wants to solve a real problem has to build a solution from scratch, step by step. There has been no simple, easy-to-use, general-purpose tool.

Drawing on joint research on spatio-temporal forecasting with many companies, researchers at Microsoft Research Asia abstracted the problems common across industries, distilled years of accumulated technology and experience, and released FOST, a spatio-temporal forecasting tool designed to be highly general across industries.

df2cf86932d5714118423dddeb09885d.png
FOST architecture diagram

To make a spatio-temporal forecasting tool both general and usable, three common problems need to be solved. The first is data quality: the data must be denoised and the impact of missing information reduced. The second is the temporal dimension: the tool must cope well with trends, cycles, sudden events, and other patterns. The third is the spatial dimension: the tool must break the limitation of earlier forecasting models, which could only predict at a single point, and accurately predict and exploit the mutual influences within the spatial structure.

To this end, Microsoft Research Asia has integrated three functional modules for the spatiotemporal forecasting tool FOST to deal with forecasts under a variety of complex spatiotemporal conditions:

  • Data processing: denoising to improve data quality
    In FOST, users collect the data themselves. This ensures that models for different business scenarios can be trained on diverse, scenario-specific data, and it also protects the privacy and security of user data. FOST then cleans low-quality data affected by problems such as noise, improving data quality and safeguarding the accuracy of model training.
  • Temporal decoding: a lightweight temporal neural network
    For time series forecasting, Microsoft Research Asia uses a lightweight deep temporal neural network.
    Deep temporal neural networks are mainly used to capture the complex historical patterns found in real business scenarios. Take logistics: the data may show that certain sites ship more than usual in the summer. Can we infer that shipments will rise again next summer? The real relationship is usually not simple enough to be inferred directly. The role of the deep temporal network is to uncover these complex associations and fine-grained patterns.
    However, deep temporal neural networks often train slowly and are sensitive to noise, and when data is scarce they easily overfit the training data. Microsoft Research Asia therefore reduces the dimensionality of the time series data on top of the deep temporal neural network, making the structure lighter, which speeds up training and stabilizes the forecasts.
  • Spatial decoding: building hierarchical graphs with graph neural networks
    On the spatial side, Microsoft Research Asia uses graph neural networks to model how signal changes interact and correlate across the spatial connections between nodes. For example, in epidemic forecasting, the outcome in one area is affected by other areas, especially adjacent ones, so spatial correlation cannot be ignored. When forecasting the course of the epidemic, the graph neural network therefore also draws on information from other provinces and cities, further improving accuracy. With graph convolutional networks introduced, accuracy improved greatly both for fine-grained forecasts at the county and district level and for coarse-grained forecasts at the provincial and municipal level.
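To make the data processing step more concrete, here is a minimal sketch of two common cleaning operations: forward-filling missing values and smoothing spikes with a rolling median. The article does not say which cleaning methods FOST actually applies, so both techniques, the function names `fill_missing` and `median_smooth`, and the sample numbers are assumptions chosen only for illustration.

```python
from statistics import median


def fill_missing(values):
    """Replace None with the most recent observed value (forward fill).

    Gaps at the very start are filled with the first observed value.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("series has no observed values")
    filled, last = [], observed[0]
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


def median_smooth(values, window=3):
    """Centered rolling median; the window shrinks at the edges."""
    half = window // 2
    return [
        median(values[max(0, i - half): i + half + 1])
        for i in range(len(values))
    ]


raw = [100, None, 104, 900, 108, 110]  # one gap and one obvious spike
clean = median_smooth(fill_missing(raw))
print(clean)
```

A rolling median is used here rather than a rolling mean because a single extreme value (the 900 above) cannot drag the median of a window, whereas it would distort the mean.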

Liu Tieyan, vice president of Microsoft Research Asia, said: "FOST is not the product of top-down research, and it did not begin with a clear plan for what to develop. Rather, after working closely with industry, we discovered that many industries share common needs in spatio-temporal forecasting, including common problems, challenges, and solutions. We therefore decided to abstract these common problems into a general-purpose open source tool, to help more companies use advanced artificial intelligence technology to save energy and costs and to improve operational and innovation efficiency."

Highly general: meeting the spatio-temporal forecasting needs of many industries

In an industry where time and space both matter, how does the spatio-temporal forecasting tool FOST actually work?

Take the logistics industry, a fairly typical case, as an example again. Suppose a logistics company wants to use FOST to predict the next day's delivery volume at a large site. The company first feeds recent time series data into the deep temporal neural network module at the bottom layer, including the site's total daily outbound volume and total daily receipt volume, as well as the volume of shipments for which the site is the destination or a transfer point. The temporal module then learns features from this historical data and represents them as a set of vectors in a latent space.
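The idea of turning a window of history into a compact vector can be sketched without a neural network. The example below is a simple stand-in, not FOST's temporal module: it compresses a history window into a few segment averages plus an overall trend term, which is one basic form of dimensionality reduction for time series. The function name `encode_history`, the choice of features, and the sample values are all assumptions; in a deep temporal network the representation would be learned rather than hand-written.

```python
def encode_history(values, n_segments=4):
    """Compress a history window into n_segments segment means plus a trend.

    The result has n_segments + 1 numbers regardless of how long the
    input window is.
    """
    n = len(values)
    if n < n_segments:
        raise ValueError("history is shorter than the number of segments")
    features = []
    for k in range(n_segments):
        # Integer boundaries guarantee every segment is non-empty.
        start = k * n // n_segments
        end = (k + 1) * n // n_segments
        segment = values[start:end]
        features.append(sum(segment) / len(segment))
    features.append(values[-1] - values[0])  # overall change across the window
    return features


# Hypothetical eight days of outbound volume at one site.
history = [180, 190, 185, 195, 200, 210, 205, 215]
vector = encode_history(history)
print(vector)
```

Shorter inputs mean fewer parameters downstream, which is the intuition behind making the temporal network lighter and less prone to overfitting when data is scarce.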

Next, the temporal patterns of neighboring sites are layered on top to aggregate spatial information. For example, a site and its neighbors are often related in the following way: when the number of parcels at a neighboring site increases, some of those parcels will be forwarded to this site. So if the temporal model predicts that the site's delivery volume tomorrow will be 200 parcels, while the spatial layer shows that the number of parcels delivered to the neighboring site is expected to rise sharply, the site's delivery volume tomorrow can be estimated to be well above 200. In this way, the spatial relationships between sites are also incorporated into the model.
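The 200-parcel example can be written out as one step of neighborhood aggregation. This is a hand-written sketch with assumed numbers, not the graph neural network used in FOST: the 400-parcel surge, the 25% spill-over weight, and the names `aggregate`, `neighbor_surge`, and `spill_weight` are all hypothetical. In a graph neural network such weights would be learned from data rather than fixed by hand.

```python
def aggregate(temporal_forecast, neighbor_surge, spill_weight):
    """One step of neighborhood aggregation.

    Each site's adjusted forecast is its own temporal forecast plus a
    weighted share of the surge expected at each of its neighbors.
    """
    adjusted = {}
    for site, base in temporal_forecast.items():
        inflow = sum(
            weight * neighbor_surge.get(nb, 0.0)
            for nb, weight in spill_weight.get(site, {}).items()
        )
        adjusted[site] = base + inflow
    return adjusted


# Forecasts from the temporal model alone (parcels for the next day).
temporal_forecast = {"site": 200.0, "neighbor": 500.0}
# Assumed extra parcels expected at the neighboring site.
neighbor_surge = {"neighbor": 400.0}
# Assumed share of the neighbor's surge that is forwarded to "site".
spill_weight = {"site": {"neighbor": 0.25}}

result = aggregate(temporal_forecast, neighbor_surge, spill_weight)
print(result["site"])
```

Under these assumed numbers the adjusted forecast for the site is 200 + 0.25 × 400 = 300 parcels, which shows how spatial information can push the estimate well above the purely temporal forecast.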

320a46383527a180dfc2a64d498a479a.png

The above is only an example from the logistics industry. Many other scenarios, such as network base station traffic forecasting, road traffic flow forecasting, and power transmission forecasting, involve the same notions of time and space, and FOST plays essentially the same role in them.

However, when spatial relationships are taken into account, priority should be given to the most strongly correlated ones. If all correlation information were computed indiscriminately, the amount of computation would be prohibitive. For example, with thousands of locations, accounting for the relationship between every pair of locations would demand very powerful servers, an expense that ordinary enterprises cannot afford. Microsoft Research Asia has made many optimizations here, including giving priority to strongly correlated information when randomly sampling the graph, which improves the running efficiency of the whole forecasting tool.
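One way to picture correlation-prioritized sampling is to draw a small number of neighbors at random, with the chance of being drawn proportional to how strongly each neighbor's history correlates with the target's. The sketch below does this with Pearson correlation and weighted sampling without replacement. It is an assumption-laden illustration: the article does not describe FOST's actual sampling scheme, and `pearson`, `sample_neighbors`, and the toy series are hypothetical.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length series (0.0 if one is constant)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def sample_neighbors(target, series, k, rng):
    """Sample up to k distinct neighbors, favoring strongly correlated ones."""
    # A tiny epsilon keeps every weight positive so sampling never fails.
    weights = {
        name: abs(pearson(series[target], values)) + 1e-9
        for name, values in series.items()
        if name != target
    }
    chosen = []
    for _ in range(min(k, len(weights))):
        names = list(weights)
        pick = rng.choices(names, weights=[weights[n] for n in names])[0]
        chosen.append(pick)
        del weights[pick]  # sample without replacement
    return chosen


series = {
    "A": [1, 2, 3, 4, 5],
    "B": [2, 4, 6, 8, 10],   # moves exactly with A
    "C": [5, 3, 4, 1, 2],    # moves roughly against A
    "D": [3, 3, 3, 3, 3],    # constant, carries no correlation signal
}
picks = sample_neighbors("A", series, k=2, rng=random.Random(0))
print(picks)
```

With k neighbors per node, the aggregation cost grows with N × k rather than N × (N − 1) for N locations, which is the kind of saving the paragraph above is pointing at.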

In addition, in some industries the notion of "space" does not necessarily stop at the geospatial level. For example, when predicting the condition of diabetic patients in the medical industry, different patients with the same type of diabetes can be treated as different "spaces". The way one patient's condition has developed can serve as a historical reference to help predict how other patients' conditions will develop.

Microsoft Research Asia's open source spatio-temporal forecasting tool gives users across industries a simple, easy-to-use deep learning "weapon". With FOST, users can improve the accuracy of forecasts in their business scenarios and avoid the repetitive work of building similar platforms from scratch. Going forward, Microsoft Research Asia will continue to improve the accuracy and training efficiency of the tool's models, helping more companies and institutions create greater value by building spatio-temporal forecasting capabilities.


微软技术栈