Question about proxy IPs for a web crawler

New here, please bear with me

# -*- coding:utf-8 -*-
import urllib.request
import re
url = "http://ip.chinaz.com/getip.aspx"  # page to fetch
proxy_ip={'HTTP':'49.85.13.8:35909'}  # proxy IP to verify
proxy_support = urllib.request.ProxyHandler(proxy_ip)
opener = urllib.request.build_opener(proxy_support)
opener.addheaders=[("User-Agent","Mozilla/5.0 (Windows NT 10.0; WOW64)")]
urllib.request.install_opener(opener)
#shuchu = re.findall('"well"><p>(.*?)GeoIP', urllib.request.urlopen(url).read().decode("utf-8"), re.S)
#print(shuchu)
print(urllib.request.urlopen(url).read().decode("utf-8"))

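I'm on Python 3.6. The code is a template I found online, so in principle it should work. I've tried many free proxy IPs from the web, but the output still shows my own IP.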

python


2 answers

东哥起飞

✓ Accepted

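The proxy setting is a dict. For the HTTP protocol the key must be "http" (lowercase), and the value should be in the format "proxy IP:port". For more, see Python爬虫学习之（二）| urllib进阶篇 (Learning Python Crawlers, Part 2: Advanced urllib).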

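import urllib.request

# The scheme key is lowercase "http"; the value is "proxy IP:port"
proxy = {'http': '115.193.101.21:61234'}
url = "http://ip.chinaz.com/getip.aspx"  # the URL from the question

# Create the proxy handler object with ProxyHandler
proxy_handler = urllib.request.ProxyHandler(proxy)

# Create an opener that uses the proxy; the argument is the proxy handler object
opener = urllib.request.build_opener(proxy_handler)

# Open the target URL through the proxy opener
html = opener.open(url)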
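
To guard against the uppercase-key mistake from the question, one option is to lowercase the scheme keys before handing the dict to ProxyHandler. The following is only a minimal sketch: normalize_proxies and build_proxy_opener are hypothetical helper names, not part of urllib, and the proxy address is the one from the question, which may no longer be live.

```python
import urllib.request


def normalize_proxies(proxies):
    """Return a copy of the proxy mapping with lowercase scheme keys."""
    return {scheme.lower(): address for scheme, address in proxies.items()}


def build_proxy_opener(proxies):
    """Build an opener that routes requests through the given proxies.

    Hypothetical helper for illustration; not part of urllib.
    """
    handler = urllib.request.ProxyHandler(normalize_proxies(proxies))
    opener = urllib.request.build_opener(handler)
    opener.addheaders = [("User-Agent", "Mozilla/5.0 (Windows NT 10.0; WOW64)")]
    return opener


# The question's dict used an uppercase key; it is lowercased before use.
opener = build_proxy_opener({'HTTP': '49.85.13.8:35909'})

# Uncomment to send a real request (needs network access and a live proxy):
# print(opener.open("http://ip.chinaz.com/getip.aspx", timeout=10).read().decode("utf-8"))
```

If the page still reports your own IP after this, the free proxy itself is probably dead or transparent, so it is worth trying a few addresses and passing a timeout so a dead proxy does not hang the script.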