
Editor's note: Question answering (QA) is a fundamental and important task in natural language understanding. Current approaches typically combine pre-trained language models with graph neural networks to perform reasoning, but what role the GNN module actually plays in that reasoning remains an open question for researchers. To investigate, researchers from Microsoft Research Asia and Georgia Tech analyzed the most advanced related methods and found that an extremely simple and efficient graph neural counter achieves better results on mainstream knowledge QA datasets.

Question answering (QA) has long been a fundamental and important subject in artificial intelligence and natural language processing, and a steady stream of research has tried to give QA systems human-level reasoning capabilities. However, human reasoning is extremely complex. To approximate it, current state-of-the-art methods generally use a pre-trained language model (LM) to capture and exploit implicit knowledge, supplemented by a carefully designed graph neural network (GNN) that reasons over a knowledge graph. What functions the GNN module actually performs in this reasoning, however, still needs further study.

To this end, researchers from Microsoft Research Asia and Georgia Tech analyzed the most advanced related methods and found that an extremely simple and efficient graph neural counter achieves better results on mainstream knowledge QA datasets. Their analysis also suggests that the GNN modules in current knowledge-based reasoning systems are likely performing only simple reasoning functions such as counting.

(Link to the paper: https://arxiv.org/abs/2110.03192)

Knowledge acquisition and reasoning are at the core of question answering (QA) tasks. The required knowledge is either implicitly encoded in a pre-trained language model (LM) or explicitly stored in a structured knowledge graph (KG). Because LMs are pre-trained on large-scale corpora containing extremely rich knowledge, they achieve good performance on various QA datasets with only a little fine-tuning.

However, LMs rely heavily on co-occurrence statistics, struggle with inference problems, and lack interpretability. KGs are complementary: although they require manual curation and are limited in scale, they directly represent and store specific entities and relations, and are therefore interpretable.

How to combine the two in QA so as to maximize their strengths and offset their weaknesses has been a hot topic in recent years. Most state-of-the-art work processes the knowledge graph in two steps (a toy sketch of this pipeline follows the list below):

1. Schema graph grounding: retrieve the subgraph of the knowledge graph associated with the entities mentioned in the QA text. This subgraph consists of nodes carrying concept text, edges representing relations, and an adjacency matrix.

2. Graph modeling for inference: use a carefully designed GNN module to model and reason over this subgraph.
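To make the two-step pipeline concrete, here is a minimal sketch in Python. The toy KG, the `ground_subgraph` helper, and the entity mentions are illustrative assumptions, not the paper's actual retrieval code.

```python
# Minimal sketch of the two-step pipeline: (1) ground a KG subgraph on the
# entities mentioned in the QA text, (2) hand it to a GNN for reasoning.
from typing import List, Tuple

import numpy as np

# Toy knowledge graph: (head, relation, tail) triples.
KG: List[Tuple[str, str, str]] = [
    ("bird", "capable_of", "fly"),
    ("bird", "is_a", "animal"),
    ("penguin", "is_a", "bird"),
    ("penguin", "not_capable_of", "fly"),
]

def ground_subgraph(mentions: List[str]) -> Tuple[List[str], np.ndarray]:
    """Step 1: keep only triples whose head and tail are linked to the QA text."""
    triples = [t for t in KG if t[0] in mentions and t[2] in mentions]
    nodes = sorted({n for h, _, t in triples for n in (h, t)})
    idx = {n: i for i, n in enumerate(nodes)}
    adj = np.zeros((len(nodes), len(nodes)))
    for h, _, t in triples:
        adj[idx[h], idx[t]] = 1.0  # adjacency matrix of the grounded subgraph
    return nodes, adj

# Step 2 would embed `nodes` and feed them, with `adj`, into a GNN.
nodes, adj = ground_subgraph(["penguin", "bird", "fly"])
print(nodes)
print(adj)
```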

The GNN modules used here are usually quite complicated. For example, KagNet uses GCN-LSTM-HPA, a path-based hierarchical attention mechanism (HPA) that couples a GCN with an LSTM to model path-based relational graphs. QA-GNN uses a GAT network as its backbone and uses an LM to encode the QA text into a single extra node in the graph, so that it can perform joint reasoning with the other concepts and relations (a rough sketch of this context-node idea follows).
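As a rough illustration of QA-GNN's context-node idea (not its actual implementation), the sketch below appends an LM embedding of the QA text as an extra node connected to every concept node; `lm_encode` is a toy stand-in for a real language model.

```python
import numpy as np

def lm_encode(text: str, d: int = 4) -> np.ndarray:
    """Toy stand-in for an LM encoder: a deterministic pseudo-embedding."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.standard_normal(d)

def add_context_node(node_feats: np.ndarray, adj: np.ndarray, qa_text: str):
    """Append the LM-encoded QA text as one extra node for joint reasoning."""
    z = lm_encode(qa_text, d=node_feats.shape[1])
    feats = np.vstack([node_feats, z])      # context node appended last
    n = adj.shape[0]
    new_adj = np.zeros((n + 1, n + 1))
    new_adj[:n, :n] = adj
    new_adj[n, :n] = new_adj[:n, n] = 1.0   # link context node to all concepts
    return feats, new_adj

feats, adj = add_context_node(np.zeros((3, 4)), np.eye(3), "Can penguins fly?")
print(feats.shape, adj.shape)               # (4, 4) (4, 4)
```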

As QA systems grow more and more complex, researchers have to step back and ask some basic questions: are these GNN modules not complicated enough, or too complicated? What key roles do they actually play in reasoning?

To answer these questions, the researchers first analyzed the most advanced GNN-based QA systems and their reasoning abilities. Based on the findings, they designed a graph counting network that is not only simple and efficient, but also achieves better results on two mainstream reasoning datasets, CommonsenseQA and OpenBookQA.

(Figure 1: The researchers found that the key role of current GNNs in QA is to count edges, so they designed an efficient and interpretable graph counting module for QA reasoning.)

To analyze the most advanced GNN-based QA systems, the researchers first pruned each subunit of the GNN with the SparseVD pruning method, then, while keeping accuracy unchanged, computed the sparsity rate of each layer to determine the importance and function of each submodule. As shown in Figure 2, the retrieved KG subgraph is typically preprocessed into node embeddings, edge embeddings, an adjacency matrix, and relevance scores, which together serve as the GNN's input. The analysis showed that the initial node embeddings and the relevance scores are unnecessary: the sparsity rate of the corresponding layers can be pruned to zero, meaning those layers can be removed outright. The layers related to edge embeddings are much harder to prune, which shows they are essential for reasoning in this setting. For the message passing layers in GAT, the observed sparsity rates are quite low; in particular, the query and key layers of the first few layers are close to zero, indicating that these layers are over-parameterized and that the attention mechanism nearly degenerates into a linear transformation. (A sketch of the sparsity-rate statistic follows Figure 2.)

(Figure 2: The researchers used the pruning method SparseVD as a tool to analyze the modules of GNNs in QA, and found that the edge-information layers are extremely important while many other layers are over-parameterized.)
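The following is a hedged sketch of the sparsity-rate statistic, assuming the standard variational-dropout pruning criterion (a log-alpha threshold of about 3, as in Molchanov et al.); the tensors are random stand-ins, not the paper's actual model weights.

```python
# Under SparseVD-style pruning, each weight has a variational mean `mu` and
# log-variance `log_sigma2`; weights whose log "dropout ratio" log_alpha
# exceeds a threshold are treated as pruned.
import torch

def sparsity_rate(log_sigma2: torch.Tensor, mu: torch.Tensor,
                  threshold: float = 3.0) -> float:
    """Fraction of weights kept (non-pruned). A rate near 0 means the
    whole layer can be removed without hurting accuracy."""
    log_alpha = log_sigma2 - torch.log(mu.pow(2) + 1e-8)
    return (log_alpha < threshold).float().mean().item()

mu = torch.randn(64, 64)                 # variational means of one layer
log_sigma2 = torch.randn(64, 64) - 2.0   # learned log-variances
print(f"sparsity rate: {sparsity_rate(log_sigma2, mu):.3f}")
```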

Based on this analysis, the researchers designed an extremely simple and efficient count-based Graph Soft Counter (GSC). Compared with complex modules such as GAT, GSC has only two basic components: an edge encoder and the Graph Soft Counter layer. The hidden dimension of nodes and edges is reduced to 1, meaning that only single numbers flow through the graph, and these can be interpreted as importance scores of edges and nodes. As shown in Algorithm 1 in Figure 4, GSC reduces message passing to the two most basic operations, propagation and aggregation, so that these importance scores are accumulated at the central QA-context node and output as the score of each answer option.

(Figure 3: The GSC layer alternately updates the count scores on edges and nodes)

(Figure 4: Algorithm 1. After the edge information is encoded, the GSC layers perform message passing to accumulate the scores at the central node, which yields the graph score.)
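Below is a minimal, parameter-free sketch of one GSC-style layer, following the propagate-and-aggregate description above. The variable names and toy graph are ours; see Algorithm 1 in the paper for the exact formulation.

```python
import numpy as np

def gsc_layer(node_score: np.ndarray, edge_score: np.ndarray,
              edges: np.ndarray):
    """node_score: (N,) one scalar per node; edge_score: (E,) one scalar
    per edge; edges: (E, 2) array of (src, dst) indices."""
    src, dst = edges[:, 0], edges[:, 1]
    # Propagate: each edge picks up the score of its source node.
    edge_score = edge_score + node_score[src]
    # Aggregate: each node sums the scores of its incoming edges.
    new_node = np.zeros_like(node_score)
    np.add.at(new_node, dst, edge_score)
    return new_node, edge_score

# Toy graph: 3 nodes; node 2 is the central QA-context node.
edges = np.array([[0, 2], [1, 2]])
node_score = np.array([0.5, 0.3, 0.0])  # from a (hypothetical) node encoder
edge_score = np.array([0.2, 0.1])       # from a (hypothetical) edge encoder
node_score, edge_score = gsc_layer(node_score, edge_score, edges)
print(node_score[2])                    # graph score at the central node: 1.1
```

Because every quantity is a single number, the layer amounts to soft-counting weighted edges into the central node, which is why the scores stay interpretable.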

It is worth mentioning that the GSC layer is completely parameter-free, which also makes it very efficient. As shown in Table 1, GSC's learnable parameters amount to less than one percent of those in other GNN modules, and since it needs no initial node embeddings, the GSC model's storage size is five orders of magnitude smaller. As Table 2 shows, GSC is also extremely efficient in both time and space complexity.

(Table 1: GSC uses only the adjacency matrix and edge/node type information, with very few parameters)

(Table 2: GSC is extremely efficient in terms of time and space complexity)
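To make the contrast in Table 1 concrete, here is a back-of-the-envelope parameter count under assumed dimensions; the hidden size of 200 and the type counts are illustrative guesses, not the paper's exact configuration.

```python
def gat_layer_params(d: int) -> int:
    # One GAT layer: a d-by-d linear projection plus a 2d attention vector.
    return d * d + 2 * d

def gsc_params(edge_types: int, node_types: int, hidden: int = 1) -> int:
    # GSC only maps edge/node types down to scalar scores; the counting
    # layers themselves have no parameters at all.
    return (edge_types + node_types) * hidden

print(gat_layer_params(200))  # ~40k parameters per GAT layer
print(gsc_params(38, 4))      # a few dozen for the whole encoder
```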

Beyond being simple and efficient, GSC's performance is also outstanding. The researchers ran experiments on the two mainstream reasoning datasets, CommonsenseQA and OpenBookQA. The baselines in the paper include not only the plain LM without a KG, but also other state-of-the-art methods that use a GNN to process the KG. As shown in Tables 3-5, GSC has the advantage on both datasets, and on OpenBookQA it even surpasses UnifiedQA (11B), a giant model with 11 billion parameters.

(Table 3: GSC is superior to other GNN-based methods on the CommonsenseQA dataset)

(Table 4: GSC is superior to other GNN-based methods on the OpenBookQA dataset)

(Table 5: GSC ranks first on the official OpenBookQA leaderboard, even surpassing UnifiedQA)

The analysis and method in this paper reveal that today's complex GNN-based QA systems are probably performing only basic reasoning functions such as counting. How to build a comprehensive QA system that reaches the level of human reasoning remains a grand open problem.



