Original text
Overview:
WIQA: A dataset for "What if..." reasoning over procedural text
WIQA contains three parts: a collection of paragraphs each describing a process, e.g., beach erosion; a set of crowdsourced influence graphs for each paragraph, describing how one change affects another; and a large (40k) collection of "What if...?"
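The dataset has three parts: paragraphs that each describe a process; influence graphs (human-annotated); and questions (derived from the graphs; details below).
The nodes of an influence graph fall into two kinds: those mentioned in the paragraph and those not mentioned in it.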
1. Out-of-para nodes: denoting events or changes to entities/events not mentioned in the paragraph, e.g., “during storms” in Figure 1.
2. In-para nodes: denoting events or changes to entities/events mentioned in the paragraph, e.g., “the wind is blowing harder” in Figure 1.
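As a rough illustration of the node distinction above, an influence graph could be held in a structure like the one below. The names `Node`, `InfluenceGraph`, `in_para`, and `split_by_kind` are hypothetical and are not the dataset's actual schema; the two node texts are the Figure 1 examples quoted above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    text: str
    in_para: bool  # True if the change is mentioned in the paragraph


@dataclass
class InfluenceGraph:
    nodes: list = field(default_factory=list)

    def split_by_kind(self):
        """Return (in-para nodes, out-of-para nodes)."""
        in_para = [n for n in self.nodes if n.in_para]
        out_of_para = [n for n in self.nodes if not n.in_para]
        return in_para, out_of_para


# Node texts taken from the quoted Figure 1 examples.
graph = InfluenceGraph(nodes=[
    Node("during storms", in_para=False),
    Node("the wind is blowing harder", in_para=True),
])
in_para, out_of_para = graph.split_by_kind()
```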
Introduction / Problem addressed
While recent systems for procedural text comprehension can answer questions about what events happen, e.g., (Bosselut et al., 2018; Henaff et al., 2017; Dalvi et al., 2018), the extent to which they understand the influences between those events remains unclear.
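Existing systems can answer questions about what happened, but it is unclear how well they understand the influences between those events. The dataset therefore focuses on how events influence one another, so that a system can predict what effect a perturbation to some step of a process will have.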
How the questions are generated
1). Questions were then derived from paths in the graphs, each asking how the change described in one node affects another. Each question is a templated, multiple choice (MC) question of the form Does changeX result in changeY? (A) Correct (B) Opposite (C) No effect, where Opposite indicates a negative influence between changeX and changeY. To bound the task, perturbations are typically qualitative (e.g., “the wind is blowing harder”), and possible effects are restricted to changes to entities and events mentioned in the paragraph (e.g., “the waves are bigger”). Perturbations themselves include in-paragraph, out-of-paragraph, and irrelevant (no effect) changes. The WIQA task is to answer the questions, given the paragraph (but not the IG).
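Questions come from the influence graph and ask how one node affects another (positively, negatively, or not at all). To bound the task, the perturbations are usually qualitative, e.g., "suppose the wind blows harder; what happens to ...?" Note: the influence graph is not provided when answering the questions.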
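The question template described above can be sketched roughly as follows. This is not the authors' generation code: the helper names `path_label` and `make_question` are hypothetical, and so is the assumption that each edge carries a sign (+1 or -1) and that the answer for a path is the product of its edge signs, with a missing edge treated as "No effect". The edges in the example are illustrative only.

```python
CHOICES = {"A": "Correct", "B": "Opposite", "C": "No effect"}


def path_label(path, edge_signs):
    """Label a path of nodes; edge_signs maps (src, dst) to +1 or -1.

    Assumption: the overall influence is the product of the edge signs.
    """
    if len(path) < 2:
        raise ValueError("a path needs at least two nodes")
    sign = 1
    for src, dst in zip(path, path[1:]):
        if (src, dst) not in edge_signs:
            return "No effect"  # no influence edge between these nodes
        sign *= edge_signs[(src, dst)]
    return "Correct" if sign > 0 else "Opposite"


def make_question(change_x, change_y):
    """Fill the template 'Does changeX result in changeY?' with its options."""
    options = " ".join(f"({key}) {text}" for key, text in CHOICES.items())
    return f"Does {change_x} result in {change_y}? {options}"


# Illustrative edges only; the signs are assumed, not taken from the dataset.
edge_signs = {
    ("during storms", "the wind is blowing harder"): 1,
    ("the wind is blowing harder", "the waves are bigger"): 1,
}
path = ["during storms", "the wind is blowing harder", "the waves are bigger"]
question = make_question(path[0], path[-1])
label = path_label(path, edge_signs)
```

In this sketch the paragraph would be supplied alongside `question` at answer time, while `edge_signs` stays hidden, matching the note that the influence graph is not given to the model.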
Contributions
(1) the new dataset (2) performance measures and an analysis of its challenges, to support research on counterfactual, textual reasoning over procedural text.
A new dataset, plus performance measurements on it and an analysis of its challenges, to support further research.
Data distribution
Experimental results
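BERT still performs well when it predicts directly without the paragraph, possibly because its pre-training already captures a large amount of commonsense knowledge.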