Follow the WeChat official account "K brother crawler" or join QQ group 808574309 for more articles on advanced crawling, JS/Android reverse engineering, and related techniques.
Disclaimer
Everything in this article is for learning and discussion only. The captured content, sensitive URLs, and data interfaces have been redacted. Commercial or illegal use is strictly forbidden, and the author accepts no responsibility for any consequences of such use. If anything here infringes your rights, please contact me and I will delete it immediately.
Reverse engineering target
The target this time is the WB login. There are few encrypted parameters, but the login flow is fairly involved: it passes through several redirects and takes about nine steps to log in successfully.
Only one encrypted parameter appears during login: the password. The encrypted password is used when obtaining the token, which is a POST request. The sp field in the Form Data is the encrypted password and looks like: e23c5d62dbf9f8364005f331e487873c70d7ab0e8dd2057c3e66d1ae5d2837ef1dcf86......
Login process
First, let's walk through the login process. Only the parameters that need special handling are explained at each step; any parameter not mentioned is a fixed value that can be copied as-is.
The general process is as follows:
- Pre-login
- Get encrypted password
- Get token
- Get encrypted account
- Send the verification code
- Verify the verification code
- Access redirect url
- Visit crossdomain2 url
- Login via passport url
1. Pre-login
Pre-login is a GET request. Its Query String Parameters contain two important values: su, the Base64-encoded username, and _, a 13-digit timestamp. The response wraps a JSON object that can be extracted with a regular expression. The JSON contains seven values: retcode, servertime, pcid, nonce, pubkey, rsakv, and exectime. Most of them are used in later requests, and some are used to encrypt the password. Example response:
xxxxSSOController.preloginCallBack({
"retcode": 0,
"servertime": 1627461942,
"pcid": "gz-1cd535198c0efe850b96944c7945e8fd514b",
"nonce": "GWBOCL",
"pubkey": "EB2A38568661887FA180BDDB5CABD5F21C7BFD59C090CB2D245......",
"rsakv": 1330428213,
"exectime": 16
})
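The request values and the callback-wrapped response described above can be handled with the standard library alone. The following is a minimal sketch, not the site's or the author's exact code: the helper names build_su, build_timestamp, and parse_prelogin are made up for this illustration, and the sample text is the example response shown above.

```python
import base64
import json
import re
import time


def build_su(username: str) -> str:
    """Base64-encode the username for the `su` parameter."""
    return base64.b64encode(username.encode("utf-8")).decode("ascii")


def build_timestamp() -> str:
    """Return the current time as a 13-digit millisecond timestamp for `_`."""
    return str(int(time.time() * 1000))


def parse_prelogin(text: str) -> dict:
    """Pull the JSON object out of a `callback({...})` style response."""
    match = re.search(r"\((\{.*\})\)", text, re.S)
    if match is None:
        raise ValueError("no JSON payload found in pre-login response")
    return json.loads(match.group(1))


# The example response from the article (pubkey is truncated there too).
sample = '''xxxxSSOController.preloginCallBack({
"retcode": 0,
"servertime": 1627461942,
"pcid": "gz-1cd535198c0efe850b96944c7945e8fd514b",
"nonce": "GWBOCL",
"pubkey": "EB2A38568661887FA180BDDB5CABD5F21C7BFD59C090CB2D245......",
"rsakv": 1330428213,
"exectime": 16
})'''

pre_parameter = parse_prelogin(sample)
print(pre_parameter["servertime"], pre_parameter["nonce"])
```

The re.S flag lets the pattern span line breaks, so the same helper works whether the server returns the payload on one line or several.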
2. Get the encrypted password
The password is encrypted with RSA, and the encrypted value can be produced with either Python or JS. The reverse engineering of the JS encryption is analyzed in its own section below.
3. Get the token
The token is used in the later steps: getting the encrypted phone number, sending the verification code, and verifying the code. Getting the token is a POST request. Its Query String Parameters hold a single fixed value, client: ssologin.js(v1.4.19). The Form Data has more fields, but apart from the encrypted password, most of the values that change can be found in the data returned by the pre-login in step 1. The main parameters are:
- su: the Base64-encoded username
- servertime: from the JSON returned by the pre-login in step 1
- nonce: from the JSON returned by the pre-login in step 1
- rsakv: from the JSON returned by the pre-login in step 1
- sp: the encrypted password
- prelt: a random value
The response is HTML source, from which the token can be extracted. It looks like: 2NGFhARzFAFAIp_QwX70Npj8gw4lgj7RbCnByb3RlY3Rpb24.
If the returned value does not look like this, the account or password is wrong.
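As a rough illustration of the extraction, the sketch below mirrors the two operations used in the full code at the end of this article: a regular expression that captures the target of a replace("...") call, followed by a split on the URL-encoded marker token%3D. The function name extract_token and both sample page bodies are hypothetical, and the failure check here (no marker means no token) is a simplification rather than the author's exact test.

```python
import re
from typing import Optional


def extract_token(html: str) -> Optional[str]:
    """Return the token carried by the redirect target, or None if absent."""
    match = re.search(r'replace\("(.*?)"\)', html)
    if match is None:
        return None
    target_url = match.group(1)
    marker = "token%3D"  # URL-encoded form of "token="
    if marker not in target_url:
        return None  # no token: treat as a wrong account or password
    return target_url.split(marker)[-1]


# Hypothetical page bodies, made up for illustration only.
ok_html = (
    '<script>location.replace("https://login.example.com/jump?'
    'url=https%3A%2F%2Fexample.com%2Fprotection%3F'
    'token%3D2NGFhARzFAFAIp_QwX70Npj8gw4lgj7RbCnByb3RlY3Rpb24.");</script>'
)
bad_html = '<script>location.replace("https://login.example.com/jump?retcode=101");</script>'

print(extract_token(ok_html))
print(extract_token(bad_html))
```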
4. Get the encrypted account
The su value seen earlier is the Base64-encoded username. This step encrypts the account further, and the result is used when sending and verifying the verification code. It is a GET request with simple Query String Parameters: token is the token obtained in step 3, and callback_url is the site's home page. The response is HTML source, and the XPath expression //input[@name='encrypt_mobile']/@value extracts the encrypted account, which looks like: f2de0b5e333a. Note that even for the same account, the encrypted result is different every time.
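For readers without lxml installed, the same field can be read with the standard library's html.parser instead of XPath. This is a sketch under that substitution, not the approach used in the full code: the class InputValueFinder, the function extract_encrypted_mobile, and the sample form are all invented for the example.

```python
from html.parser import HTMLParser
from typing import Optional


class InputValueFinder(HTMLParser):
    """Remember the value attribute of the first <input> with a given name."""

    def __init__(self, field_name: str) -> None:
        super().__init__()
        self.field_name = field_name
        self.value: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "input" or self.value is not None:
            return
        attr_map = dict(attrs)
        if attr_map.get("name") == self.field_name:
            self.value = attr_map.get("value")


def extract_encrypted_mobile(html: str) -> Optional[str]:
    """Equivalent in spirit to //input[@name='encrypt_mobile']/@value."""
    finder = InputValueFinder("encrypt_mobile")
    finder.feed(html)
    return finder.value


# Hypothetical page fragment, made up for illustration only.
sample_html = '<form><input type="hidden" name="encrypt_mobile" value="f2de0b5e333a"></form>'
print(extract_encrypted_mobile(sample_html))
```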
5. Send verification code
Sending the verification code is a POST request with simple parameters. The token in Query String Parameters is the token obtained in step 3, and encrypt_mobile in Form Data is the encrypted account obtained in step 4. The response reports whether the code was sent, for example: {'retcode': 20000000, 'msg': 'succ', 'data': []}.
6. Verify the verification code
Verifying the code is a POST request with simple parameters. The token in Query String Parameters is the token obtained in step 3. In Form Data, encrypt_mobile is the encrypted account obtained in step 4, and code is the verification code received in step 5. The response is JSON: retcode and msg indicate the status, and redirect_url is the page to visit once verification is complete, which is used in the next step. Example response:
{
"retcode": 20000000,
"msg": "succ",
"data": {
"redirect_url": "https://login.xxxx.com.cn/sso/login.php?entry=xxxxx&returntype=META&crossdomain=1&cdult=3&alt=ALT-NTcxNjMyMTA2OA==-1630292617-yf-78B1DDE6833847576B0DC4B77A6C77C4-1&savestate=30&url=https://xxxxx.com"
}
}
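Steps 5 and 6 both return JSON with the same status fields, so one small check can cover both. The sketch below assumes, as the full code at the end does, that msg equal to "succ" means success; the function names are hypothetical, and the redirect URL is a shortened form of the example above.

```python
import json

SUCCESS_MSG = "succ"


def is_success(payload: dict) -> bool:
    """True when the response reports success in its msg field."""
    return payload.get("msg") == SUCCESS_MSG


def extract_redirect_url(payload: dict) -> str:
    """Return data.redirect_url from a successful step 6 response."""
    if not is_success(payload):
        raise RuntimeError("verification failed: %r" % (payload,))
    data = payload.get("data")
    if not isinstance(data, dict) or "redirect_url" not in data:
        raise RuntimeError("no redirect_url in response: %r" % (payload,))
    return data["redirect_url"]


# Step 5 style response: success, but data is an empty list.
send_response = {"retcode": 20000000, "msg": "succ", "data": []}
# Step 6 style response with a shortened redirect URL.
confirm_response = json.loads(
    '{"retcode": 20000000, "msg": "succ", "data": {"redirect_url": '
    '"https://login.xxxx.com.cn/sso/login.php?entry=xxxxx&returntype=META"}}'
)

print(is_success(send_response))
print(extract_redirect_url(confirm_response))
```

Checking the type of data before indexing matters because the step 5 response carries a list there, while step 6 carries an object.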
7. Visit the redirect url
This step sends a GET request to the redirect URL returned in step 6, which looks like: https://login.xxxx.com.cn/sso/login.php?entry=xxxxx&returntype=META......
The response is HTML source, from which we extract the crossdomain2 URL. The result looks like: https://login.xxxx.com.cn/crossdomain2.php?action=login&entry=xxxxx......
This URL is the page to visit next.
8. Visit crossdomain2 url
This step sends a GET request to the crossdomain2 URL extracted in step 7, which looks like: https://login.xxxx.com.cn/crossdomain2.php?action=login&entry=xxxxx......
The response is again HTML source, from which we extract the real login URL. The result looks like: https://passport.xxxxx.com/wbsso/login?ssosavestate=1661828618&url=https......
The final step only needs to request this real login URL to complete the login.
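The two extractions in steps 7 and 8 can be sketched as follows. The patterns follow the ones in the full code at the end of this article (a replace("...") target, then the first entry of arrURL inside a setCrossDomainUrlList(...) call). The function names and both page bodies are hypothetical and only shaped after those patterns.

```python
import json
import re


def extract_crossdomain2_url(html: str) -> str:
    """Step 7: take the target of the replace("...") call."""
    match = re.search(r'replace\("(.*?)"\)', html)
    if match is None:
        raise ValueError("no replace target found")
    return match.group(1)


def extract_passport_url(html: str) -> str:
    """Step 8: take the first URL in the arrURL list."""
    match = re.search(r"setCrossDomainUrlList\((\{.*?\})\)", html, re.S)
    if match is None:
        raise ValueError("no setCrossDomainUrlList payload found")
    return json.loads(match.group(1))["arrURL"][0]


# Hypothetical page bodies, made up for illustration only.
step7_html = (
    '<script>location.replace("https://login.xxxx.com.cn/'
    'crossdomain2.php?action=login&entry=xxxxx");</script>'
)
step8_html = (
    'xxxxSSOController.setCrossDomainUrlList({"retcode":0,"arrURL":'
    '["https://passport.xxxxx.com/wbsso/login?ssosavestate=1661828618"]});'
)

print(extract_crossdomain2_url(step7_html))
print(extract_passport_url(step8_html))
```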
9. Login via passport url
This is the last step and the actual login. It is a GET request to the passport URL extracted in step 8, which looks like: https://passport.xxxxx.com/wbsso/login?ssosavestate=1661828618&url=https......
The response contains the login result, user ID, and user name, for example:
({"result":true,"userinfo":{"uniqueid":"5712321368","displayname":"tomb"}});
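One way to read this result, shown here as a sketch with made-up names (LoginResult, parse_login_result), is to capture the JSON between the outer parentheses with a regular expression rather than deleting every parenthesis from the text. That keeps the parse intact even if a display name itself contains a parenthesis.

```python
import json
import re
from dataclasses import dataclass


@dataclass
class LoginResult:
    success: bool
    unique_id: str
    display_name: str


def parse_login_result(text: str) -> LoginResult:
    """Parse a `({...});` style login response into a LoginResult."""
    match = re.search(r"\((\{.*\})\)", text, re.S)
    if match is None:
        raise ValueError("no JSON payload found in login response")
    payload = json.loads(match.group(1))
    userinfo = payload.get("userinfo") or {}
    return LoginResult(
        success=bool(payload.get("result")),
        unique_id=userinfo.get("uniqueid", ""),
        display_name=userinfo.get("displayname", ""),
    )


sample = '({"result":true,"userinfo":{"uniqueid":"5712321368","displayname":"tomb"}});'
print(parse_login_result(sample))
```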
This completes the WB login process. The cookies from the successful login can now be used directly for other operations.
Reverse engineering the encrypted password
Step 2 of the login process obtains the encrypted password. In step 3, the Form Data of the token request contains the encrypted parameter sp, which is the encrypted password, so next we reverse engineer the password encryption.
Searching globally for the keyword sp returns many matches. A technique mentioned before helps here: search for sp=, sp:, or var sp to narrow the scope. In this case, searching for sp= gives only one match in index.js, so we can set a breakpoint and debug. You can see that sp is the value of b:
PS: Do not search on the page you land on after a successful login. By then the resources have been refreshed and reloaded, and the encryption JS file is no longer present. Enter a wrong account or password on the login page instead, then capture traffic, search, and set breakpoints.
Continue tracing b upward. The key code contains an if-else statement, so set a breakpoint in each branch. After debugging, you can see that the value of b is generated in the if branch:
Analyze two key lines of code:
f.setPublic(me.rsaPubkey, "10001");
b = f.encrypt([me.servertime, me.nonce].join("\t") + "\n" + b)
me.rsaPubkey, me.servertime, and me.nonce all come from the data returned by the pre-login in step 1.
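The second line joins servertime and nonce with a tab, appends a newline, and then appends the password before encrypting. A minimal Python sketch of that plaintext layout follows; the function name build_rsa_plaintext is made up for this example, and the sample values are taken from the pre-login example in step 1.

```python
def build_rsa_plaintext(servertime, nonce: str, password: str) -> str:
    """Mirror [servertime, nonce].join("\\t") + "\\n" + password from the JS."""
    return "\t".join([str(servertime), nonce]) + "\n" + password


plaintext = build_rsa_plaintext(1627461942, "GWBOCL", "12345678")
print(repr(plaintext))
```

Because servertime and nonce change with every pre-login, the plaintext (and therefore the ciphertext) differs on each attempt even for the same password.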
Hover the mouse over f.setPublic and f.encrypt to see that they are the br and bt functions, respectively:
Following these two functions shows that both live inside an anonymous function:
Copy the whole anonymous function, remove the outermost anonymous wrapper, and debug it locally. During debugging it reports that navigator is undefined. The copied source uses navigator.appName and navigator.appVersion, so define them directly (empty values also work):
navigator = {
appName: "Netscape",
appVersion: "5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36"
}
Continuing to debug, the line var c = this.doPublic(b); fails with "object does not support this property or method". Searching for doPublic finds bq.prototype.doPublic = bs;. Changing that line to doPublic = bs; fixes the problem.
Having analyzed the RSA encryption logic, we can also implement it in Python. Code example (the truncated pubkey must be replaced with the complete value):
import rsa
import binascii
pre_parameter = {
"retcode": 0,
"servertime": 1627461942,
"pcid": "gz-1cd535198c0efe850b96944c7945e8fd514b",
"nonce": "GWBOCL",
"pubkey": "EB2A38568661887FA180BDDB5CABD5F21C7BFD59C090CB2D245......",
"rsakv": 1330428213,
"exectime": 16
}
password = '12345678'
public_key = rsa.PublicKey(int(pre_parameter['pubkey'], 16), int('10001', 16))
text = '%s\t%s\n%s' % (pre_parameter['servertime'], pre_parameter['nonce'], password)
encrypted_str = rsa.encrypt(text.encode(), public_key)
encrypted_password = binascii.b2a_hex(encrypted_str).decode()
print(encrypted_password)
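To make visible what happens inside rsa.encrypt, here is a standard-library sketch of RSA with PKCS#1 v1.5 encryption padding, which is the scheme the rsa package applies. It is an illustration only: whether the site's JS pads in exactly the same way is not verified here, the function name pkcs1_v15_encrypt_hex is made up, and the modulus is a toy value built from two known Mersenne primes, not a real key.

```python
import os


def pkcs1_v15_encrypt_hex(message: bytes, n: int, e: int = 0x10001) -> str:
    """Encrypt with RSA and PKCS#1 v1.5 type 2 padding; return hex text."""
    k = (n.bit_length() + 7) // 8  # modulus length in bytes
    if len(message) > k - 11:
        raise ValueError("message too long for this modulus")
    pad_len = k - len(message) - 3
    padding = bytearray()
    while len(padding) < pad_len:
        # Padding bytes must be random and nonzero.
        chunk = os.urandom(pad_len - len(padding))
        padding.extend(b for b in chunk if b != 0)
    block = b"\x00\x02" + bytes(padding) + b"\x00" + message
    cipher_int = pow(int.from_bytes(block, "big"), e, n)
    return cipher_int.to_bytes(k, "big").hex()


# Toy modulus from two known Mersenne primes: demonstration only, not secure.
P = 2**521 - 1
Q = 2**607 - 1
N = P * Q

ciphertext_hex = pkcs1_v15_encrypt_hex(b"1627461942\tGWBOCL\n12345678", N)
print(ciphertext_hex)
```

The random padding is why encrypting the same text twice gives different hex strings, so ciphertexts cannot be compared directly when checking an implementation.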
Complete code
Follow K brother crawler on GitHub for more crawler-related code. Stars are welcome! https://github.com/kgepachong/
The following only demonstrates part of the key code and cannot be run directly. Full code repository: https://github.com/kgepachong/crawler/
Key JS encryption code structure
navigator = {
    appName: "Netscape",
    appVersion: "5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36"
}

function bt(a) {}

function bs(a) {}

function br(a, b) {}

// N functions omitted here

bl.prototype.nextBytes = bk;
doPublic = bs;
bq.prototype.setPublic = br;
bq.prototype.encrypt = bt;
this.RSAKey = bq

function getEncryptedPassword(me, b) {
    br(me.pubkey, "10001");
    b = bt([me.servertime, me.nonce].join("\t") + "\n" + b);
    return b
}

// Test sample
// var me = {
//     "retcode": 0,
//     "servertime": 1627283238,
//     "pcid": "gz-a9243276722ed6d4671f21310e2665c92ba4",
//     "nonce": "N0Y3SZ",
//     "pubkey": "EB2A38568661887FA180BDDB5CABD5F21C7BFD59C090CB2D245A87AC253062882729293E5506350508E7F9AA3BB77F4333231490F915F6D63C55FE2F08A49B353F444AD3993CACC02DB784ABBB8E42A9B1BBFFFB38BE18D78E87A0E41B9B8F73A928EE0CCEE1F6739884B9777E4FE9E88A1BBE495927AC4A799B3181D6442443",
//     "rsakv": "1330428213",
//     "exectime": 13
// }
// var b = '12312312312'  // password
// console.log(getEncryptedPassword(me, b))
Key Python login code
#!/usr/bin/env python3
# -*- coding: utf-8 -*-

import re
import json
import time
import base64
import binascii

import rsa
import execjs
import requests
from lxml import etree


# Marker used to decide whether certain requests succeeded
response_success_str = 'succ'

pre_login_url = 'Redacted; full code on GitHub: https://github.com/kgepachong/crawler'
get_token_url = 'Redacted; full code on GitHub: https://github.com/kgepachong/crawler'
protection_url = 'Redacted; full code on GitHub: https://github.com/kgepachong/crawler'
send_code_url = 'Redacted; full code on GitHub: https://github.com/kgepachong/crawler'
confirm_url = 'Redacted; full code on GitHub: https://github.com/kgepachong/crawler'

headers = {
    'Host': 'Redacted; full code on GitHub: https://github.com/kgepachong/crawler',
    'Referer': 'Redacted; full code on GitHub: https://github.com/kgepachong/crawler',
    'sec-ch-ua': '" Not;A Brand";v="99", "Google Chrome";v="91", "Chromium";v="91"',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36'
}
session = requests.session()


def get_pre_parameter(username: str) -> dict:
    su = base64.b64encode(username.encode())
    time_now = str(int(time.time() * 1000))
    params = {
        'entry': 'Redacted; full code on GitHub: https://github.com/kgepachong/crawler',
        'callback': 'Redacted; full code on GitHub: https://github.com/kgepachong/crawler',
        'su': su,
        'rsakt': 'mod',
        'checkpin': 1,
        'client': 'ssologin.js(v1.4.19)',
        '_': time_now,
    }
    response = session.get(url=pre_login_url, params=params, headers=headers).text
    # re.S lets the pattern match even if the payload spans several lines
    parameter_dict = json.loads(re.findall(r'\((.*)\)', response, re.S)[0])
    # print('1.【pre parameter】: %s' % parameter_dict)
    return parameter_dict


def get_encrypted_password(pre_parameter: dict, password: str) -> str:
    # Option 1: get the encrypted password via JS
    # with open('encrypt.js', 'r', encoding='utf-8') as f:
    #     js = f.read()
    # encrypted_password = execjs.compile(js).call('getEncryptedPassword', pre_parameter, password)
    # # print('2.【encrypted password】: %s' % encrypted_password)
    # return encrypted_password

    # Option 2: get the encrypted password via Python's rsa and binascii modules
    public_key = rsa.PublicKey(int(pre_parameter['pubkey'], 16), int('10001', 16))
    text = '%s\t%s\n%s' % (pre_parameter['servertime'], pre_parameter['nonce'], password)
    encrypted_str = rsa.encrypt(text.encode(), public_key)
    encrypted_password = binascii.b2a_hex(encrypted_str).decode()
    # print('2.【encrypted password】: %s' % encrypted_password)
    return encrypted_password


def get_token(encrypted_password: str, pre_parameter: dict, username: str) -> str:
    su = base64.b64encode(username.encode())
    data = {
        'entry': 'Redacted; full code on GitHub: https://github.com/kgepachong/crawler',
        'gateway': 1,
        'from': '',
        'savestate': 7,
        'qrcode_flag': False,
        'useticket': 1,
        'pagerefer': '',
        'vsnf': 1,
        'su': su,
        'service': 'miniblog',
        'servertime': pre_parameter['servertime'],
        'nonce': pre_parameter['nonce'],
        'pwencode': 'rsa2',
        'rsakv': pre_parameter['rsakv'],
        'sp': encrypted_password,
        'sr': '1920*1080',
        'encoding': 'UTF-8',
        'prelt': 38,
        'url': 'Redacted; full code on GitHub: https://github.com/kgepachong/crawler',
        'returntype': 'META'
    }
    response = session.post(url=get_token_url, headers=headers, data=data)
    # response.encoding = 'gbk'
    ajax_login_url = re.findall(r'replace\("(.*)"\)', response.text)[0]
    token = ajax_login_url.split('token%3D')[-1]
    if 'weibo' not in token:
        # print('3.【token】: %s' % token)
        return token
    else:
        raise Exception('Login failed! Wrong username or password!')


def get_encrypted_mobile(token: str) -> str:
    params = {
        'token': token,
        'callback_url': 'Redacted; full code on GitHub: https://github.com/kgepachong/crawler'
    }
    response = session.get(url=protection_url, params=params, headers=headers)
    tree = etree.HTML(response.text)
    encrypted_mobile = tree.xpath("//input[@name='encrypt_mobile']/@value")[0]
    # print('4.【encrypted mobile】: %s' % encrypted_mobile)
    return encrypted_mobile


def send_code(token: str, encrypt_mobile: str) -> str:
    params = {'token': token}
    data = {'encrypt_mobile': encrypt_mobile}
    response = session.post(url=send_code_url, params=params, data=data, headers=headers).json()
    if response['msg'] == response_success_str:
        code = input('Enter the verification code: ')
        # print('5.【code】: %s' % code)
        return code
    else:
        # print('5.【failed to send verification code】: %s' % response)
        raise Exception('Failed to send the verification code: %s' % response)


def confirm_code(encrypted_mobile: str, code: str, token: str) -> str:
    params = {'token': token}
    data = {
        'encrypt_mobile': encrypted_mobile,
        'code': code
    }
    response = session.post(url=confirm_url, params=params, data=data, headers=headers).json()
    if response['msg'] == response_success_str:
        redirect_url = response['data']['redirect_url']
        # print('6.【redirect url】: %s' % redirect_url)
        return redirect_url
    else:
        # print('6.【verification code check failed】: %s' % response)
        raise Exception('Verification code check failed: %s' % response)


def get_cross_domain2_url(redirect_url: str) -> str:
    response = session.get(url=redirect_url, headers=headers).text
    cross_domain2_url = re.findall(r'replace\("(.*)"\)', response)[0]
    # print('7.【cross domain2 url】: %s' % cross_domain2_url)
    return cross_domain2_url


def get_passport_url(cross_domain2_url: str) -> str:
    response = session.get(url=cross_domain2_url, headers=headers).text
    passport_url_str = re.findall(r'setCrossDomainUrlList\((.*)\)', response)[0]
    passport_url = json.loads(passport_url_str)['arrURL'][0]
    # print('8.【passport url】: %s' % passport_url)
    return passport_url


def login(passport_url: str) -> None:
    response = session.get(url=passport_url, headers=headers).text
    # Capture the JSON between the outer parentheses instead of stripping every "(",
    # so a display name containing a parenthesis does not break the parse
    login_result = json.loads(re.findall(r'\((.*)\)', response, re.S)[0])
    if login_result['result']:
        user_unique_id = login_result['userinfo']['uniqueid']
        user_display_name = login_result['userinfo']['displayname']
        print('Login succeeded! User ID: %s, user name: %s' % (user_unique_id, user_display_name))
    else:
        raise Exception('Login failed: %s' % login_result)


def main():
    username = input('Enter the login account: ')
    password = input('Enter the login password: ')

    # 1. Pre-login: get a dict with servertime, nonce, pubkey, and rsakv for later use
    pre_parameter = get_pre_parameter(username)

    # 2. Get the encrypted password via JS or Python
    encrypted_password = get_encrypted_password(pre_parameter, password)

    # 3. Get the token
    token = get_token(encrypted_password, pre_parameter, username)

    # 4. Get the encrypted phone number via the protection URL
    encrypted_mobile = get_encrypted_mobile(token)

    # 5. Send the SMS verification code
    code = send_code(token, encrypted_mobile)

    # 6. Verify the code; on success a redirect URL is returned
    redirect_url = confirm_code(encrypted_mobile, code, token)

    # 7. Visit the redirect URL and extract the crossdomain2 URL
    cross_domain2_url = get_cross_domain2_url(redirect_url)

    # 8. Visit the crossdomain2 URL and extract the passport URL
    passport_url = get_passport_url(cross_domain2_url)

    # 9. Visit the passport URL to log in
    login(passport_url)


if __name__ == '__main__':
    main()