How do I write data into separate columns in a CSV file? Can indexing do the job?

After scraping, I store the data in a CSV file, but when I open it I get the situation shown below instead of the two columns I expected. How should I handle this?

[screenshot of the CSV output]

My code is as follows:

# -*- coding:utf-8 -*-
import requests
from bs4 import BeautifulSoup
import csv

user_agent = 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36'

def get_data(url):
    response = requests.get(url)
    soup = BeautifulSoup(response.text,'lxml')
    soup = soup.find('div',{'id':'listZone'}).findAll('a')
    return soup

csvfile = open('D:/Python34/test.csv','wt')
writer = csv.writer(csvfile,delimiter=',')
header=['url','title']
csvrow1=[]
csvrow2=[]
try:
    for links in get_data('http://finance.qq.com/gdyw.htm'):
        csvrow1.append('http://finance.qq.com/'+links.get('href'))
    for title in get_data('http://finance.qq.com/gdyw.htm'):
        csvrow2.append(title.get_text())
    writer.writerow(header)
    csvfile.write('\n'.join(csvrow1))
    csvfile.write('\n'.join(csvrow2))
finally:
    csvfile.close()
3 Answers

First of all, you don't really need the csv module for this. csv separates columns with a half-width comma by default, but if a column's own content contains a comma, Excel can get confused when reading the file. I'd suggest using TAB as the delimiter instead, and writing the file directly with `with open(...) as fh`.
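A minimal sketch of that tab-delimited approach (the file name and sample rows are made up for illustration): a title containing a comma still lands in a single column.

```python
# Made-up sample data: the second field of the first row contains a comma.
rows = [
    ('http://example.com/1.htm', 'Stocks rise, bonds fall'),
    ('http://example.com/2.htm', 'Plain headline'),
]

# Write a header line, then one tab-separated line per row.
with open('demo.tsv', 'w', encoding='utf-8') as fh:
    fh.write('url\ttitle\n')
    for url, title in rows:
        fh.write('{}\t{}\n'.format(url, title))
```

Because the delimiter is a tab, the embedded comma needs no quoting and each line splits cleanly into two columns.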

Apart from that, your code has two other small problems:

  1. The function get_data only needs to be called once; there's no need to call it twice.

  2. There's an extra slash `/` in the URL (the scraped `href` values already start with one).

# -*- coding:utf-8 -*-
import requests
from bs4 import BeautifulSoup

user_agent = 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36'
URL = 'http://finance.qq.com'


def get_data(url):
    response = requests.get(url)
    soup = BeautifulSoup(response.text, 'lxml')
    soup = soup.find('div', {'id': 'listZone'}).findAll('a')
    return soup


def main():
    with open("hello.tsv", "w") as fh:
        fh.write("url\ttitile\n")
        for item in get_data(URL + "/gdyw.htm"):
            fh.write("{}\t{}\n".format(URL + item.get("href"), item.get_text()))


if __name__ == "__main__":
    main()

The result:

This looks much cleaner.

You get this result because you write all of csvrow1 first and only then csvrow2. You should iterate over csvrow1 and csvrow2 in parallel instead, like this:

for row1, row2 in zip(csvrow1, csvrow2):
    csvfile.write(row1 + ',' + row2 + '\n')
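The same pairing can also be handed to `csv.writer.writerows` directly. A self-contained sketch, with made-up lists standing in for csvrow1 and csvrow2:

```python
import csv

# Hypothetical stand-ins for the scraped csvrow1 / csvrow2 lists.
urls = ['http://finance.qq.com/a/1.htm', 'http://finance.qq.com/a/2.htm']
titles = ['First headline', 'Second headline']

# newline='' is what the csv docs recommend on Python 3,
# otherwise Excel on Windows shows blank rows between records.
with open('paired.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerow(['url', 'title'])
    # zip pairs the i-th url with the i-th title, one row per pair
    writer.writerows(zip(urls, titles))
```

Letting the csv module do the writing also means fields containing commas get quoted automatically.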
# -*- coding:utf-8 -*-
import requests
from bs4 import BeautifulSoup
import csv


user_agent = 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36'

def get_data(url):
    response = requests.get(url)
    soup = BeautifulSoup(response.text, 'lxml')
    soup = soup.find('div', {'id': 'listZone'}).findAll('a')
    return soup
urls = []
titles = []

for link in get_data('http://finance.qq.com/gdyw.htm'):
    # one pass is enough: fetch the page once and collect both fields;
    # no trailing slash, since the scraped href already starts with one
    urls.append('http://finance.qq.com' + link.get('href'))
    titles.append(link.get_text())
data = []
for url, title in zip(urls, titles):
    row = {
        'url': url,
        'title': title
    }
    data.append(row)
with open('a.csv', 'w', newline='') as csvfile:  # newline='' avoids blank rows on Windows
    fieldnames = ['url', 'title']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(data)