Problem description
The file has the following format:
<dblp>
<inproceedings>
<author>Zhenyu Li 0001</author>
<author>Gaogang Xie</author>
<year>2010</year>
<booktitle>GLOBECOM</booktitle>
</inproceedings>
<inproceedings>
<author>Abdulkadir Celik</author>
<author>Redha M. Radaydeh</author>
<author>Fawaz S. Al-Qahtani</author>
<author>Ahmed H. Abd El-Malek</author>
<year>2017</year>
<booktitle>GLOBECOM Workshops</booktitle>
</inproceedings>
.....
</dblp>
The idea is to iterate over every <inproceedings> element and, inside each one, add <booktitle> elements according to the number of <author> elements, so that the two counts match within every <inproceedings>.
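The per-record step described above could be sketched roughly as follows, using Python's standard `xml.etree.ElementTree`. The function name `pad_booktitles` and the shortened sample record are made up for illustration. The sketch assumes a record that has no <booktitle> at all should be left unchanged, and that the extra copies belong directly after the existing <booktitle>.

```python
import copy
import xml.etree.ElementTree as ET


def pad_booktitles(record):
    """Copy the last <booktitle> of one record until the number of
    <booktitle> children equals the number of <author> children.
    (Hypothetical helper, not part of any library.)"""
    authors = record.findall("author")
    titles = record.findall("booktitle")
    missing = len(authors) - len(titles)
    if not titles or missing <= 0:
        return record  # nothing to copy, or already enough booktitles
    last = titles[-1]
    pos = list(record).index(last) + 1  # insert right after the last one
    for _ in range(missing):
        # deepcopy also copies the trailing whitespace, so layout is kept
        record.insert(pos, copy.deepcopy(last))
    return record


# Small in-memory check on a shortened record from the question
sample = (
    "<inproceedings>"
    "<author>Zhenyu Li 0001</author>"
    "<author>Gaogang Xie</author>"
    "<year>2010</year>"
    "<booktitle>GLOBECOM</booktitle>"
    "</inproceedings>"
)
record = pad_booktitles(ET.fromstring(sample))
print(ET.tostring(record, encoding="unicode"))
```

This only covers a single record that is already in memory. A file with tens of millions of records would need to be read incrementally instead of being loaded with one `ET.parse` call.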
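Each <inproceedings> element contains many <author> elements but only one <booktitle> element. I want to add <booktitle> elements to each <inproceedings> until their number equals the number of <author> elements. The desired result looks like this: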
<dblp>
<inproceedings>
<author>Zhenyu Li 0001</author>
<author>Gaogang Xie</author>
<year>2010</year>
<booktitle>GLOBECOM</booktitle>
<booktitle>GLOBECOM</booktitle>
</inproceedings>
<inproceedings>
<author>Abdulkadir Celik</author>
<author>Redha M. Radaydeh</author>
<author>Fawaz S. Al-Qahtani</author>
<author>Ahmed H. Abd El-Malek</author>
<year>2017</year>
<booktitle>GLOBECOM Workshops</booktitle>
<booktitle>GLOBECOM Workshops</booktitle>
<booktitle>GLOBECOM Workshops</booktitle>
<booktitle>GLOBECOM Workshops</booktitle>
</inproceedings>
.....
</dblp>
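Please provide code that makes the two counts equal inside each <inproceedings>. It needs to generalize: I have to process one file with tens of millions of records, and the sample above only shows the basic format of the XML file.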
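For a file of that size, one possible approach is to stream it with `xml.etree.ElementTree.iterparse` and write each record out as soon as it is complete, so memory use stays roughly constant. The sketch below is a starting point under several assumptions, not a tested solution for the real file. The function name `equalize_file` and the file names in the usage comment are hypothetical. It assumes every record is a direct child of the root element, the root has no attributes that need to be kept, and the file contains no entities that require a DTD (the full DBLP dump does, and would likely need a DTD-aware parser such as lxml instead).

```python
import copy
import xml.etree.ElementTree as ET


def equalize_file(src_path, dst_path, record_tag="inproceedings"):
    """Stream src_path and write a copy to dst_path in which every
    record_tag element has as many <booktitle> as <author> children.
    Other top-level records are copied through unchanged."""
    context = ET.iterparse(src_path, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root
    depth = 0  # nesting depth relative to the root element
    with open(dst_path, "w", encoding="utf-8") as out:
        out.write('<?xml version="1.0" encoding="UTF-8"?>\n')
        out.write(f"<{root.tag}>\n")
        for event, elem in context:
            if event == "start":
                depth += 1
                continue
            depth -= 1
            if depth != 0:
                continue  # not a direct child of the root (or the root itself)
            if elem.tag == record_tag:
                titles = elem.findall("booktitle")
                missing = len(elem.findall("author")) - len(titles)
                if titles and missing > 0:
                    pos = list(elem).index(titles[-1]) + 1
                    for _ in range(missing):
                        elem.insert(pos, copy.deepcopy(titles[-1]))
            elem.tail = "\n"  # the tail is not yet parsed at the "end" event
            out.write(ET.tostring(elem, encoding="unicode"))
            root.clear()  # release processed records to keep memory flat
        out.write(f"</{root.tag}>\n")


# Usage (hypothetical file names):
# equalize_file("dblp.xml", "dblp_equalized.xml")
```

Writing to a second file rather than editing in place means the original stays intact if the run is interrupted. Passing a different `record_tag` would apply the same padding to another record type.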