Why can't the images downloaded by my Python 3 crawler be opened?
# -*- coding:utf8 -*-
import requests
from bs4 import BeautifulSoup

url = 'http://www.meizitu.com/a/5582.html'
req = requests.get(url)
soup = BeautifulSoup(req.text, 'lxml')
imgs = soup.select('#picture > p > img')
mm_imgs = []
for img in imgs:
    src = img.get('src')
    mm_imgs.append(src)
for mm in mm_imgs:
    filename = '/'+(str(mm)[-20:]).replace('/','-')
    target = "./{}".format(filename)
    with open(target, "wb") as fs:
        fs.write(req.content)
    print("%s => %s" % (mm, target))
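After you collect each image's src, you never actually request it. Instead you write req.content, which is the content of the original page URL, and that content is HTML, so every saved file is an HTML document with an image extension.
You need to send a separate request for each image src and write that response's content, and you should include a User-Agent header in the request.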
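
Something like the following might work as a starting point. It is only a sketch: the User-Agent value, the timeout, and the helper names (filename_from_src, save_images, crawl) are my own assumptions and not anything the site requires, and the file naming just reuses the "last 20 characters" scheme from the question. The download step takes a fetch callable, so it can be tried without touching the network.

```python
import os
from urllib.parse import urljoin

# Assumed header value; any realistic browser User-Agent string should do.
HEADERS = {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64)"}


def filename_from_src(src):
    """Same naming scheme as the question: last 20 characters, '/' replaced by '-'."""
    return src[-20:].replace("/", "-")


def save_images(page_url, srcs, fetch, out_dir="."):
    """Request every image URL on its own and write the returned bytes to disk.

    fetch is any callable that takes a URL and returns the response body as bytes.
    """
    saved = []
    for src in srcs:
        img_url = urljoin(page_url, src)  # also resolves relative src values
        target = os.path.join(out_dir, filename_from_src(img_url))
        with open(target, "wb") as fs:
            fs.write(fetch(img_url))  # the image bytes, not the page's HTML
        saved.append(target)
    return saved


def crawl(page_url, out_dir="."):
    """Fetch the page, collect the image src values, and download each image."""
    # Third-party imports live here so the helpers above need only the standard library.
    import requests
    from bs4 import BeautifulSoup

    session = requests.Session()
    session.headers.update(HEADERS)  # sent with the page request and every image request

    def fetch(url):
        resp = session.get(url, timeout=10)
        resp.raise_for_status()  # fail loudly instead of saving an error page
        return resp.content

    soup = BeautifulSoup(fetch(page_url), "lxml")
    srcs = [img.get("src") for img in soup.select("#picture > p > img") if img.get("src")]
    return save_images(page_url, srcs, fetch, out_dir)
```

Calling crawl('http://www.meizitu.com/a/5582.html') should then return the list of saved paths. If the files still won't open, it may be worth checking whether the site also expects a Referer header, but that is a guess on my part.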