Python crawler raises an error partway through a loop

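I wrote a crawler script that fetches a given website in a loop of 200 iterations, then went out shopping. When I came back, it had failed after the 7th item. The error messages were essentially the same each time: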


Traceback (most recent call last):
  File "D:\PythonLearn\venv\lib\site-packages\urllib3\contrib\pyopenssl.py", line 472, in wrap_socket
    cnx.do_handshake()
  File "D:\PythonLearn\venv\lib\site-packages\OpenSSL\SSL.py", line 1915, in do_handshake
    self._raise_ssl_error(self._ssl, result)
  File "D:\PythonLearn\venv\lib\site-packages\OpenSSL\SSL.py", line 1639, in _raise_ssl_error
    raise SysCallError(errno, errorcode.get(errno))
OpenSSL.SSL.SysCallError: (10054, 'WSAECONNRESET')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:\PythonLearn\venv\lib\site-packages\urllib3\connectionpool.py", line 603, in urlopen
    chunked=chunked)
  File "D:\PythonLearn\venv\lib\site-packages\urllib3\connectionpool.py", line 344, in _make_request
    self._validate_conn(conn)
  File "D:\PythonLearn\venv\lib\site-packages\urllib3\connectionpool.py", line 843, in _validate_conn
    conn.connect()
  File "D:\PythonLearn\venv\lib\site-packages\urllib3\connection.py", line 370, in connect
    ssl_context=context)
  File "D:\PythonLearn\venv\lib\site-packages\urllib3\util\ssl_.py", line 355, in ssl_wrap_socket
    return context.wrap_socket(sock, server_hostname=server_hostname)
  File "D:\PythonLearn\venv\lib\site-packages\urllib3\contrib\pyopenssl.py", line 478, in wrap_socket
    raise ssl.SSLError('bad handshake: %r' % e)
ssl.SSLError: ("bad handshake: SysCallError(10054, 'WSAECONNRESET')",)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:\PythonLearn\venv\lib\site-packages\requests\adapters.py", line 449, in send
    timeout=timeout
  File "D:\PythonLearn\venv\lib\site-packages\urllib3\connectionpool.py", line 641, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "D:\PythonLearn\venv\lib\site-packages\urllib3\util\retry.py", line 399, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='www.ttbcdn.com', port=443): Max retries exceeded with url: /d/file/p/2017-01-30/urxmy3sppo1378.jpg (Caused by SSLError(SSLError("bad handshake: SysCallError(10054, 'WSAECONNRESET')")))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:/PythonLearn/practise/img_download.py", line 99, in <module>
    imgdownloader(img_info['imgs'], img_info['titles'])
  File "D:/PythonLearn/practise/img_download.py", line 84, in imgdownloader
    ig_content = requests.get(ig.attr('src')).content  # 获取每张图片的二进制数据
  File "D:\PythonLearn\venv\lib\site-packages\requests\api.py", line 75, in get
    return request('get', url, params=params, **kwargs)
  File "D:\PythonLearn\venv\lib\site-packages\requests\api.py", line 60, in request
    return session.request(method=method, url=url, **kwargs)
  File "D:\PythonLearn\venv\lib\site-packages\requests\sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "D:\PythonLearn\venv\lib\site-packages\requests\sessions.py", line 646, in send
    r = adapter.send(request, **kwargs)
  File "D:\PythonLearn\venv\lib\site-packages\requests\adapters.py", line 514, in send
    raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='www.ttbcdn.com', port=443): Max retries exceeded with url: /d/file/p/2017-01-30/urxmy3sppo1378.jpg (Caused by SSLError(SSLError("bad handshake: SysCallError(10054, 'WSAECONNRESET')")))
I read online that adding the following code to skip SSL verification should fix it: https://www.zhihu.com/questio...

I'm not sure whether it actually works, though.

So what is causing the crawl to stop?

1 answer

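I ran into this problem today, and turning off SSL verification fixed it for me. Try disabling verification and test whether it helps in your case.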
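Separately from the verification setting, the traceback shows why the whole run stopped: WSAECONNRESET (10054) is the Windows socket error for a connection reset by the remote side, and the resulting exception was never caught, so one failed image ended the 200-iteration loop. Below is a minimal sketch of a loop that retries each URL and then moves on instead of aborting. The names `download_all`, `fetch`, `attempts`, and `delay` are made up for illustration and are not from the asker's script. It catches `OSError`, which works with requests because `requests.exceptions.RequestException` (and therefore `SSLError`) is a subclass of it.

```python
import time


def download_all(urls, fetch, attempts=3, delay=2.0, sleep=time.sleep):
    """Fetch every URL, retrying failures, without aborting the whole loop.

    `fetch` is any callable that takes a URL and returns bytes.
    Returns (results, failed): a dict mapping url -> bytes, and a list of
    URLs that still failed after `attempts` tries.
    """
    results, failed = {}, []
    for url in urls:
        for attempt in range(1, attempts + 1):
            try:
                results[url] = fetch(url)
                break  # success: stop retrying this URL
            except OSError as exc:
                # requests.exceptions.RequestException subclasses OSError,
                # so SSLError and ConnectionError land here too.
                print(f"{url}: attempt {attempt}/{attempts} failed: {exc!r}")
                if attempt < attempts:
                    sleep(delay * attempt)  # wait longer after each failure
        else:
            # The inner loop never hit `break`: every attempt failed.
            failed.append(url)
    return results, failed


# Possible wiring with requests (hypothetical, not the asker's code):
#   import requests
#   fetch = lambda u: requests.get(u, timeout=10).content
#   results, failed = download_all(image_urls, fetch)
```

If you also want to try the suggestion above, `requests.get` accepts `verify=False`, which turns off certificate checking (urllib3 will emit a warning about it). Whether that helps with a reset during the handshake is not guaranteed, so the retry and the pause between attempts are worth keeping either way.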
