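The information in this document is intended solely for academic research and learning exchange, and involves no commercial or illegal use. Potentially sensitive information, such as captured traffic, specific URLs, and data interfaces, has been desensitized where necessary to keep it secure. Unauthorized reposting, or redistribution after modification, is strictly prohibited.
The author accepts no responsibility for any adverse consequences or incidents caused by unauthorized use of the techniques described here. If you believe this article infringes your rights, please contact us through the WeChat official account 【小马哥逆向】 and we will deal with it immediately. Thank you for your understanding and cooperation.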
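Captchas are a common anti-scraping measure. Typical kinds include text captchas, slider captchas, and rotation captchas.
Today we look at a simple text captcha as a learning exercise.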
Site: aHR0cDovL29sZC5lZm5jaGluYS5jb20vaW5kZXgucGhwP209bWVtYmVyJmM9aW5kZXgmYT1sb2dpbiZmb3J3YXJkPWh0dHAlM0ElMkYlMkZvbGQuZWZuY2hpbmEuY29tJTJGJnNpdGVpZD0x
Captcha type: text captcha
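The site address above appears to be Base64-encoded (an assumption based on its `aHR0c` prefix, which is how `http` encodes). A minimal sketch for decoding it with the standard library:

```python
import base64

# Encoded site address copied from the "Site" line above
encoded = (
    "aHR0cDovL29sZC5lZm5jaGluYS5jb20vaW5kZXgucGhwP209bWVtYmVyJmM9aW5kZXgm"
    "YT1sb2dpbiZmb3J3YXJkPWh0dHAlM0ElMkYlMkZvbGQuZWZuY2hpbmEuY29tJTJGJnNp"
    "dGVpZD0x"
)

# Add "=" padding in case the length is not a multiple of 4
padded = encoded + "=" * (-len(encoded) % 4)

# Decode the bytes and interpret them as UTF-8 text
site = base64.b64decode(padded).decode("utf-8")
print(site)
```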
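Open the site and press F12 to open the developer tools and start capturing traffic. Click the captcha several times so that several requests are captured.
After repeated clicks, extra values get appended to the URL (the author describes them as time values; in the captured request they show up as decimal numbers used as bare keys), but they do not affect the response.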
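To make the shape of the captured URL concrete, here is a small sketch that builds the captcha URL with and without one of those extra keys. The helper name `build_captcha_url` is hypothetical, and using `random.random()` for the extra key is an assumption based on how the captured values look, not something confirmed from the page's script.

```python
import random
from urllib.parse import urlencode

BASE_URL = "http://old.efnchina.com/api.php"


def build_captcha_url(add_extra_key=True):
    """Build the captcha URL from the parameters seen in the capture."""
    params = {
        "op": "checkcode",
        "code_len": "5",
        "font_size": "14",
        "width": "120",
        "height": "26",
        "font_color": "",
        "background": "",
    }
    if add_extra_key:
        # Assumption: the page appends a fresh decimal number as a bare key
        params[str(random.random())] = ""
    return BASE_URL + "?" + urlencode(params)


plain_url = build_captcha_url(add_extra_key=False)
extra_url = build_captcha_url(add_extra_key=True)
print(plain_url)
print(extra_url)
```

Since the extra key has no effect according to the observation above, either URL should be usable for the request.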
Next, simulate the request and download the image.
import requests
# One shared session so that cookies persist across requests
s = requests.Session()
headers = {
    "Accept": "image/avif,image/webp,image/apng,image/svg+xml,image/*,*/*;q=0.8",
    "Accept-Language": "zh-CN,zh;q=0.9",
    "Cache-Control": "no-cache",
    "Connection": "keep-alive",
    "Pragma": "no-cache",
    "Referer": "http://old.efnchina.com/index.php?m=member&c=index&a=login&forward=http%3A%2F%2Fold.efnchina.com%2F&siteid=1",
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36"
}
# Cookies copied from the captured browser request
cookies = {
    "PsAvI__onlineid": "2f20VAEBU1JWAAQICQVWUVMEUVUFVVcHBgUCB1EFAA5UVAIHAAJTBlAJDA8JAFFVAwdSV1IIUwEDAgRRAg",
    "PHPSESSID": "pq1imibu1jmh6aomqqpnni5av1"
}
url = "http://old.efnchina.com/api.php"
params = {
    "op": "checkcode",
    "code_len": "5",
    "font_size": "14",
    "width": "120",
    "height": "26",
    "font_color": "",
    "background": "",
    # Extra keys appended by the page on repeated clicks; they do not affect the response
    "0.05253885216200893": "",
    "0.25073488177350733": "",
    "0.35129750737816634": "",
    "0.6442330764602027": ""
}
response = s.get(url, headers=headers, cookies=cookies, params=params)
print(response.status_code)
# The body is binary image data, so write it to a file instead of printing it as text.
# "captcha.png" is an illustrative file name; the actual image format was not checked.
with open("captcha.png", "wb") as f:
    f.write(response.content)
Add the recognition library
import requests
import ddddocr
# One shared session so that cookies persist across requests
s = requests.Session()
# Create the ddddocr recognizer once and reuse it
ocr = ddddocr.DdddOcr()
headers = {
    "Accept": "image/avif,image/webp,image/apng,image/svg+xml,image/*,*/*;q=0.8",
    "Accept-Language": "zh-CN,zh;q=0.9",
    "Cache-Control": "no-cache",
    "Connection": "keep-alive",
    "Pragma": "no-cache",
    "Referer": "http://old.efnchina.com/index.php?m=member&c=index&a=login&forward=http%3A%2F%2Fold.efnchina.com%2F&siteid=1",
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36"
}
# Cookies copied from the captured browser request
cookies = {
    "PsAvI__onlineid": "2f20VAEBU1JWAAQICQVWUVMEUVUFVVcHBgUCB1EFAA5UVAIHAAJTBlAJDA8JAFFVAwdSV1IIUwEDAgRRAg",
    "PHPSESSID": "pq1imibu1jmh6aomqqpnni5av1"
}
url = "http://old.efnchina.com/api.php"
params = {
    "op": "checkcode",
    "code_len": "5",
    "font_size": "14",
    "width": "120",
    "height": "26",
    "font_color": "",
    "background": "",
    # Extra keys appended by the page on repeated clicks; they do not affect the response
    "0.05253885216200893": "",
    "0.25073488177350733": "",
    "0.35129750737816634": "",
    "0.6442330764602027": ""
}
response = s.get(url, headers=headers, cookies=cookies, params=params)
# Pass the raw image bytes to ddddocr and print the recognized text
result = ocr.classification(response.content)
print(result)
Done.
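OCR can occasionally misread a character or drop one. Since the request asks for `code_len` of 5, one cheap sanity check is to retry when the recognized text is not 5 characters long. The sketch below is an illustration, not part of the original script: `recognize_captcha`, `expected_len`, and `max_tries` are hypothetical names, and it assumes a wrong length always means a misread. It takes the session and the recognizer as arguments, so it only relies on `session.get(...)` and `ocr.classification(...)` as used in the script above.

```python
def recognize_captcha(session, ocr, url, params, expected_len=5, max_tries=3):
    """Fetch and recognize the captcha, retrying when the length looks wrong.

    Returns the recognized text, or None if no attempt produced a result of
    the expected length.
    """
    for _ in range(max_tries):
        # Each request is assumed to return a fresh captcha image
        response = session.get(url, params=params)
        text = ocr.classification(response.content)
        if len(text) == expected_len:
            return text
    return None
```

With the objects from the script above, the call would look like `recognize_captcha(s, ocr, url, params)`. Headers and cookies can be set once on the session (`s.headers.update(headers)`, `s.cookies.update(cookies)`) so they do not have to be passed on every call.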
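Tip
Use requests.Session() to open one session that is shared by all requests. Captcha endpoints generally rely on a cookie to tie the captcha to the client, and a Session stores and resends cookies automatically, so no extra handling is needed.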
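As a sketch of why the shared session matters, the function below fetches the captcha and then submits a form on the same session, so whatever cookie the captcha request established is sent again with the submission. Everything about the submission step is an assumption made for illustration: `submit_with_captcha`, `submit_url`, `form`, and the `code_field` name are hypothetical, and the real endpoint and field names would have to be taken from a capture of the actual form.

```python
def submit_with_captcha(session, ocr, captcha_url, captcha_params,
                        submit_url, form, code_field="code"):
    """Fetch a captcha and submit a form using the same session.

    `session` is expected to behave like requests.Session and `ocr` like
    ddddocr.DdddOcr; `submit_url`, `form`, and `code_field` are hypothetical.
    """
    # Step 1: request the captcha; the server is assumed to remember the
    # expected answer against the cookie held by this session
    image = session.get(captcha_url, params=captcha_params).content

    # Step 2: recognize the text in the image
    code = ocr.classification(image)

    # Step 3: submit on the same session so the same cookie is sent back
    payload = dict(form)
    payload[code_field] = code
    return session.post(submit_url, data=payload)
```

If the two requests went out through separate `requests.get` and `requests.post` calls instead, the cookie from the first response would not be carried over unless it were copied by hand.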
This article was published to multiple platforms via mdnice.