Follow the WeChat public account K哥爬虫 (Brother K Crawler) for ongoing articles on advanced crawling, JS/Android reverse engineering, and other practical technical content!
Disclaimer
Everything in this article is for learning and discussion only. The captured content, sensitive URLs, and data interfaces have been desensitized, and using them for commercial or illegal purposes is strictly forbidden; the author bears no responsibility for any consequences of such use. If anything here infringes your rights, please contact me and I will delete it immediately!
Reverse engineering target
- Target: the encrypted login interface of an education site (name partially redacted as "Peng Education"), which uses simple JS obfuscation
- Homepage: aHR0cHM6Ly9sZWFybi5vcGVuLmNvbS5jbi8=
- Interface: aHR0cHM6Ly9sZWFybi5vcGVuLmNvbS5jbi9BY2NvdW50L1VuaXRMb2dpbg==
- Parameter to reverse (Form Data): black_box: eyJ2IjoiR01KM0VWWkVxMG0ydVh4WUd...
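As a quick sanity check, the visible prefix of the sample black_box value can be decoded as standard Base64. The sketch below uses only Python's standard library and trims the truncated sample to a multiple of four characters before decoding. It only inspects the fragment shown in this article; the rest of the value is elided and is not reconstructed here.

```python
import base64

# Truncated sample from the Form Data above (the rest is elided in the article).
sample = "eyJ2IjoiR01KM0VWWkVxMG0ydVh4WUd"

# Base64 decodes in groups of four characters, so drop the incomplete tail.
usable = sample[: len(sample) - len(sample) % 4]
prefix = base64.b64decode(usable).decode("ascii")

print(prefix)
```

The decoded text begins with `{"v":"`, which suggests that black_box is a Base64-encoded JSON string whose first key is v.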
Reverse engineering process
This time the target is again a login interface. The encryption JS uses simple obfuscation, so it makes a good entry-level exercise in restoring obfuscated code. On the login page, enter an account and password and log in. The login POST request carries an encrypted parameter, black_box, in its Form Data; that parameter is the target of this reverse engineering. The packet capture is as follows:
Search for black_box directly and you can easily find where it is encrypted, in login.js, as shown in the following figure:
Look at _fmOpt.getinfo(): this method calls the OO0O0() method in fm.js. Judging by the mix of 0s and Os in the name, the code is most likely obfuscated, as shown below:
Step into it and take a look: the whole of fm.js is obfuscated code. Selecting an expression such as OQoOo[251] shows that it is actually a string; you can also print it in the Console to see its real value. The value returned by OO0O0, oOoo0[OQoOo[448]](JSON[OQoOo[35]](O0oOo[OQoOo[460]])), is the value of black_box, as shown in the figure below:
Looking closely, OQoOo appears to be something like an array: the real values are fetched by passing in element indexes. Search for any one of those values and you will find an array at the end of the code. That array is OQoOo, which you can verify by passing in an index, as shown in the figure below:
At this point we understand the general obfuscation principle. We can download this JS and write a small local script that replaces these lookups with their values:
# ==================================
# --*-- coding: utf-8 --*--
# @Time : 2021-11-09
# @Author : WeChat public account: K哥爬虫
# @FileName: replace_js.py
# @Software: PyCharm
# @describe: small script for restoring the obfuscated JS
# ==================================
# Values to substitute (too many to list; only a few are shown)
# Use the actual list, which must match the list in fm_old.js
item = ['referrer', 'absolute', 'replace',...]
# Obfuscated JS
with open("fm_old.js", "r", encoding="utf-8") as f:
    js = f.read()
# enumerate() gives each entry its own index, even if a value appears twice
for index, value in enumerate(item):
    # Replace Qo00o with the array name used in your copy of fm_old.js
    str_old = "Qo00o[{}]".format(index)
    js = js.replace(str_old, '"' + value + '"')
# Restored JS
with open("fm_new.js", "w", encoding="utf-8") as f:
    f.write(js)
After running this script, you may find that the restored JS throws errors. The causes are line breaks, incorrectly parsed backslashes, and doubled-up double quotes; you can fix these by hand.
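The quoting and escaping errors mentioned above mostly come from pasting raw strings between double quotes. One possible alternative, sketched here under the assumption that every lookup has the literal form name[number], is to let json.dumps produce each replacement literal so that quotes, backslashes, and line breaks are escaped. The function name restore_lookups and the demo table are hypothetical and are not part of the author's script.

```python
import json
import re


def restore_lookups(js: str, array_name: str, table: list) -> str:
    """Replace every `array_name[N]` lookup with a quoted JS string literal."""
    # The lookbehind avoids matching a longer identifier that ends in array_name.
    pattern = re.compile(r"(?<![\w$])" + re.escape(array_name) + r"\[(\d+)\]")

    def substitute(match):
        index = int(match.group(1))
        if index >= len(table):
            return match.group(0)  # leave unknown indexes untouched
        # json.dumps escapes quotes, backslashes and newlines.
        return json.dumps(table[index])

    return pattern.sub(substitute, js)


# Hypothetical three-entry table and snippet, for illustration only.
demo_table = ["referrer", 'say "hi"', "line\nbreak"]
demo_js = "a = Qo00o[0]; b = Qo00o[1]; c = Qo00o[2]; d = Qo00o[10];"
restored = restore_lookups(demo_js, "Qo00o", demo_table)
print(restored)
```

Because the substitution is driven by the index captured from the source text, duplicate values in the table are harmless, and lookups with an index outside the table are left as they were so they are easy to spot afterwards.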
One thing to note: fm.js is requested with a suffix such as t=454594 or t=454570, and different suffixes return different JS. The function and variable names and the order of the list elements differ, but the methods actually called are the same, so the impact is small. When replacing, just make sure the list contents and the string to be replaced match the JS file you downloaded.
After restoring the JS, we can substitute it for the site's own obfuscated JS. There are several ways to do this: replacing the response with a packet capture tool such as Fiddler, using a plug-in such as ReRes, or using the Overrides feature built into the browser's developer tools (only available in versions after Chrome 64). Here we use Fiddler's AutoResponder feature.
In practice, the suffix of fm.js does not change over a short period, so you can simply copy its full address for the replacement. To be more rigorous, match the t value with a regular expression: select AutoResponder in Fiddler, click Add Rule, and add a replacement rule written as follows:
regex:https:\/\/static\.tongdun\.net\/v3\/fm\.js\?t=\d+
Note that the regex prefix is required. Check Enable rules (apply the rules), Accept all CONNECTs (accept all connections), and Unmatched requests passthrough (requests that match no rule are sent on to their original address). Enable Latency sets a delay before the response takes effect and does not need to be checked, as shown in the following figure:
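Before relying on the rule, the pattern can be tried out locally. The sketch below uses Python's re module rather than the engine Fiddler uses, so it is only an approximation, although a pattern this simple should behave the same in both. The full URLs are assembled from the rule and the t values mentioned earlier and are assumptions for illustration.

```python
import re

# Same pattern as the Fiddler rule, without the "regex:" prefix.
FM_JS_PATTERN = re.compile(r"https:\/\/static\.tongdun\.net\/v3\/fm\.js\?t=\d+")

candidates = [
    "https://static.tongdun.net/v3/fm.js?t=454594",
    "https://static.tongdun.net/v3/fm.js?t=454570",
    "https://static.tongdun.net/v3/fm.js",         # no t suffix
    "https://static.tongdun.net/v3/other.js?t=1",  # different file
]

# True where the whole URL is covered by the pattern.
matches = [bool(FM_JS_PATTERN.fullmatch(url)) for url in candidates]

for url, matched in zip(candidates, matches):
    print(matched, url)
```

Only the two fm.js URLs that carry a numeric t suffix match, which is the behavior the replacement rule needs.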
Once the replacement is in effect, set a breakpoint at oQOQ0["blackBox"]. The JS is now much easier to read. Look at the return statement at the end of this function: oQOQ0["blackBox"] contains the four parameters it, os, t, and v. It is converted to a string with JSON's stringify method and then passed to the QQo0 method for encryption, as shown in the following figure:
Let's look at the four parameters in oQOQ0["blackBox"]. Three of them, it, os, and v, are defined at the beginning of this function. v is Q0oQQ["version"], a constant; searching for it directly shows that the value sits in the big list mentioned earlier. os is a fixed value. it is the difference between two timestamps: O000o is a method that subtracts two values, and oQOQo is a timestamp (search for var oQOQo) generated when the JS first loads. Roughly one minute passes between the start of loading and the click on login that enters the encryption function, so here we can simply generate a random five-digit number (a difference of about one minute, measured in milliseconds, has about five digits).
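The five-digit estimate can be checked with a little arithmetic. The following sketch confirms that one minute in milliseconds has five digits and then draws a random value of that size; the helper name fake_it is hypothetical.

```python
import random

# About one minute between page load and the login click, in milliseconds.
ONE_MINUTE_MS = 60 * 1000
digits = len(str(ONE_MINUTE_MS))


def fake_it() -> int:
    """Return a random five-digit stand-in for the timestamp difference."""
    return random.randint(10_000, 99_999)


it = fake_it()
print(digits, it)
```

Unlike parseInt(Math.random() * 100000) in the extracted JS further down, randint(10_000, 99_999) never yields fewer than five digits. Whether the server cares about that difference is not something this article establishes.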
Now only t remains. Looking further down, t is actually Q0oQQ["tokens"]. There is an if-else statement in between; set breakpoints there and debug. I found that only the else branch is executed, and the t assigned there is just this value, so the rest of that code can be deleted when extracting the JS.
Repeated tests show that tokens does not change. Searching for the token keyword directly leads to where it is assigned: id is split on the | character, and the value at the first index is tokens. Looking at the value of id, there is no obvious generation logic; copying the value and searching for it shows that it is returned by an interface. You can either hard-code it or request that interface first and use the returned value, as shown in the following figure:
At this point all the parameters have been found. Back at the original return statement there is still one encryption function, ooOoO["encode"](). Step into it; this method can be extracted as is. When debugging locally, fill in whatever functions are reported missing, and that's it.
Complete code
Follow K哥爬虫 on GitHub for more crawler-related code. Stars are welcome! https://github.com/kgepachong/
The following demonstrates only part of the key code and cannot be run directly! Full code repository: https://github.com/kgepachong/crawler/
Key JavaScript encryption code (structure)
function oQ0OQ(Q0o0, o0OQ) {
    return Q0o0 < o0OQ;
}
function O000O(Q0o0, o0OQ) {
    return Q0o0 >> o0OQ;
}
function Qo0oo(Q0o0, o0OQ) {
    return Q0o0 | o0OQ;
}
function OOO0Q(Q0o0, o0OQ) {
    return Q0o0 << o0OQ;
}
function OooQo(Q0o0, o0OQ) {
    return Q0o0 & o0OQ;
}
function Oo0OO(Q0o0, o0OQ) {
    return Q0o0 + o0OQ;
}
var oQoo0 = {};
oQoo0["_keyStr"] = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/=";
oQoo0["encode"] = function QQQ0(Q0o0) {
    var o0OQ = 62;
    // The case bodies are omitted here; see the full repository
    while (o0OQ) {
        switch (o0OQ) {
            case 116 + 13 - 65: {}
            case 118 + 8 - 63: {}
            case 94 + 8 - 40: {}
            case 122 + 6 - 63: {}
        }
    }
};
oQoo0["_utf8_encode"] = function oOQ0(Q0o0) {};
function OOoO0() {
    var tokens = "e0ia+fB5zvGuTjFDgcKahQwg2UEH8b0k7EK/Ukt4KwzyCbpm11jjy8Au64MC6s7HvLRacUxd7ka4AdDidJmYAA==";
    var version = "+X+3JWoUVBc12xtmgMpwzjAone3cp6/4QuFj7oWKNk+C4tqy4un/e29cODlhRmDy";
    var Oo0O0 = {};
    Oo0O0["blackBox"] = {};
    Oo0O0["blackBox"]["v"] = version;
    Oo0O0["blackBox"]["os"] = "web";
    Oo0O0["blackBox"]["it"] = parseInt(Math.random() * 100000);
    Oo0O0["blackBox"]["t"] = tokens;
    return oQoo0["encode"](JSON.stringify(Oo0O0["blackBox"]));
}
// Test example
console.log(OOoO0())
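The _keyStr value above is the standard Base64 alphabet, and the sample black_box at the top of the article decodes as standard Base64. If encode really is a plain Base64 encoder over the UTF-8 bytes (an assumption this article does not confirm, since the function body is omitted), the same value could be produced without running any JS. The sketch below is written under that assumption; build_black_box and the placeholder arguments are hypothetical.

```python
import base64
import json
import random


def build_black_box(version: str, tokens: str) -> str:
    """Assumed equivalent of OOoO0(): Base64 of the compact JSON payload."""
    payload = {
        "v": version,
        "os": "web",
        "it": random.randint(10_000, 99_999),
        "t": tokens,
    }
    # Compact separators mirror the output format of JSON.stringify.
    text = json.dumps(payload, separators=(",", ":"))
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


# Placeholder values; substitute the real version and tokens strings.
black_box = build_black_box("VERSION_PLACEHOLDER", "TOKENS_PLACEHOLDER")
decoded = json.loads(base64.b64decode(black_box))
print(black_box[:8], list(decoded))
```

If the output of this function ever differs from what the extracted JS returns for the same inputs, the assumption is wrong and the JS version should be treated as the reference.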
Key Python login code
# ==================================
# --*-- coding: utf-8 --*--
# @Time : 2021-11-10
# @Author : WeChat public account: K哥爬虫
# @FileName: open_login.py
# @Software: PyCharm
# ==================================
import time
import execjs
import requests

# Desensitized placeholder, not a real URL
login_url = "Desensitized; for the complete code see GitHub: https://github.com/kgepachong/crawler"


def get_black_box():
    # Run the extracted JS and call its entry function OOoO0
    with open('get_black_box.js', 'r', encoding='utf-8') as f:
        exec_js = f.read()
    black_box = execjs.compile(exec_js).call('OOoO0')
    return black_box


def login(black_box, username, password):
    # bust is the current timestamp in milliseconds
    params = {"bust": str(int(time.time() * 1000))}
    data = {
        "loginName": username,
        "passWord": password,
        "validateNum": "",
        "black_box": black_box
    }
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.81 Safari/537.36"
    }
    response = requests.post(url=login_url, params=params, data=data, headers=headers)
    print(response.json())


def main():
    username = input("Enter the login account: ")
    password = input("Enter the login password: ")
    black_box = get_black_box()
    login(black_box, username, password)


if __name__ == '__main__':
    main()