The XML file I am processing looks like this:
<incollection>
<author>Philippe Balbiani</author>
<author>Valentin Goranko</author>
<author>Ruaan Kellerman</author>
<author>Dimiter Vakarelov</author>
<booktitle>Handbook of Spatial Logics</booktitle>
</incollection>
<incollection>
<author>Jochen Renz</author>
<author>Bernhard Nebel</author>
<booktitle>Handbook of AI</booktitle>
</incollection>
...
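The file follows the format shown above. I want to extract the contents of the <author> and <booktitle> tags, both of which sit inside <incollection> tags. The idea is to iterate over every <incollection> tag and pair each of its <author> values with that element's single <booktitle> value, producing one tuple per author.
The desired result is: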
('Philippe Balbiani', 'Handbook of Spatial Logics')
('Valentin Goranko', 'Handbook of Spatial Logics')
('Ruaan Kellerman', 'Handbook of Spatial Logics')
('Dimiter Vakarelov', 'Handbook of Spatial Logics')
('Jochen Renz', 'Handbook of AI')
('Bernhard Nebel', 'Handbook of AI')
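
One possible way to build these tuples is sketched below with Python's standard-library xml.etree.ElementTree. The <root> wrapper, the SAMPLE constant, and the function name author_title_pairs are assumptions made for illustration only: the fragment shown in the question has several top-level <incollection> elements, so it is wrapped in a single root here to make it parseable. A real file that already has one root element could be loaded with ET.parse instead.

```python
import xml.etree.ElementTree as ET

# Sample data copied from the question. The <root> wrapper is an assumption:
# an XML document needs exactly one top-level element to be parsed.
SAMPLE = """<root>
<incollection>
<author>Philippe Balbiani</author>
<author>Valentin Goranko</author>
<author>Ruaan Kellerman</author>
<author>Dimiter Vakarelov</author>
<booktitle>Handbook of Spatial Logics</booktitle>
</incollection>
<incollection>
<author>Jochen Renz</author>
<author>Bernhard Nebel</author>
<booktitle>Handbook of AI</booktitle>
</incollection>
</root>"""


def author_title_pairs(xml_text):
    """Return one (author, booktitle) tuple per <author> in each <incollection>."""
    root = ET.fromstring(xml_text)
    pairs = []
    for entry in root.iter("incollection"):
        # findtext returns the text of the first <booktitle> child, or None
        title = entry.findtext("booktitle")
        for author in entry.findall("author"):
            pairs.append((author.text, title))
    return pairs


for pair in author_title_pairs(SAMPLE):
    print(pair)
```

Looking up the title once per <incollection> and reusing it for every author keeps the pairing local to each element, so authors from one entry are never matched with another entry's title.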
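Alternatively, it would also work to duplicate the <booktitle> tag inside each <incollection> tag until there are as many <booktitle> tags as <author> tags. The desired result would then be: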
<incollection>
<author>Philippe Balbiani</author>
<author>Valentin Goranko</author>
<author>Ruaan Kellerman</author>
<author>Dimiter Vakarelov</author>
<booktitle>Handbook of Spatial Logics</booktitle>
<booktitle>Handbook of Spatial Logics</booktitle>
<booktitle>Handbook of Spatial Logics</booktitle>
<booktitle>Handbook of Spatial Logics</booktitle>
</incollection>
<incollection>
<author>Jochen Renz</author>
<author>Bernhard Nebel</author>
<booktitle>Handbook of AI</booktitle>
<booktitle>Handbook of AI</booktitle>
</incollection>
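
The second variant could be sketched as follows, again with xml.etree.ElementTree. As before, the <root> wrapper, the SAMPLE constant, and the function name pad_booktitles are hypothetical and exist only to make the fragment parseable and the example self-contained. It assumes each <incollection> has at least one <booktitle>; entries without one are left unchanged.

```python
import copy
import xml.etree.ElementTree as ET

# Sample data copied from the question, wrapped in an assumed <root> element.
SAMPLE = """<root>
<incollection>
<author>Philippe Balbiani</author>
<author>Valentin Goranko</author>
<author>Ruaan Kellerman</author>
<author>Dimiter Vakarelov</author>
<booktitle>Handbook of Spatial Logics</booktitle>
</incollection>
<incollection>
<author>Jochen Renz</author>
<author>Bernhard Nebel</author>
<booktitle>Handbook of AI</booktitle>
</incollection>
</root>"""


def pad_booktitles(xml_text):
    """Duplicate <booktitle> in each <incollection> until it matches the author count."""
    root = ET.fromstring(xml_text)
    # Materialize the list first so the tree is not modified while iterating it.
    for entry in list(root.iter("incollection")):
        authors = entry.findall("author")
        titles = entry.findall("booktitle")
        if not titles:
            continue  # nothing to duplicate
        last_title = titles[-1]
        position = list(entry).index(last_title) + 1
        for _ in range(len(authors) - len(titles)):
            # deepcopy also copies the trailing newline, which keeps the layout
            entry.insert(position, copy.deepcopy(last_title))
    return ET.tostring(root, encoding="unicode")


print(pad_booktitles(SAMPLE))
```

If the tuples are the real goal, the first variant is usually simpler, since it avoids rewriting the XML at all.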