Follow the WeChat official account Brother K crawler for ongoing posts on advanced crawling, JS/Android reverse engineering, and other practical technical material!
Disclaimer
Everything in this article is for learning and exchange only. Captured content, sensitive URLs, and data interfaces have been desensitized. Commercial or illegal use is strictly prohibited, and the author bears no responsibility for any consequences of such use. In case of infringement, please contact me and the content will be removed immediately!
Reverse engineering target
- Target: Netluo's anti-crawler practice platform, Question 2: JJEncode obfuscation
- Link: http://spider.wangluozhe.com/challenge/2
- Introduction: Like Question 1, this question requires collecting all the numbers across 100 pages and submitting their sum. Question 2 signs each request with a modified ("magic-modified") SHA1, and the signing code is additionally obfuscated with JJEncode.
Introduction to JJEncode
JJEncode is a web tool released by the Japanese author Yosuke HASEGAWA in 2009. It encodes arbitrary JavaScript into an obfuscated form that uses only 18 symbols: []()!+,\"$.:;_{}~=
Online demo: https://utf-8.jp/public/jjencode.html . To explore the principle in depth, reply [JJEncode] to the Brother K crawler official account to get a PDF that explains it in detail.
The author himself cautions that JJEncode is easy to decode: it is an encoder, not a practical obfuscator. Its output is highly distinctive and easy to detect, and it depends on browser behavior, so the encoded code may fail to run in some browsers. Another drawback is heavy stack usage: if the JS is large, the encoded result may overflow memory, so it is only suitable for protecting small core functions. In practice JJEncode is rarely used commercially, but it does no harm to understand it.
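Because the output is so distinctive, spotting it can be sketched in a few lines of Python. This is only a heuristic under stated assumptions: the function name looks_like_jjencode and the var_name parameter are hypothetical, and the check simply tests for the `name=~[];` prefix seen in the sample and for the 18-symbol alphabet plus the characters of the variable name.

```python
# The 18 symbols listed above: [ ] ( ) ! + , \ " $ . : ; _ { } ~ =
JJENCODE_SYMBOLS = set('[]()!+,\\"$.:;_{}~=')


def looks_like_jjencode(source: str, var_name: str = "$") -> bool:
    """Heuristic: True if source uses only JJEncode symbols plus the variable name."""
    body = "".join(source.split())  # ignore whitespace and line breaks
    # Assumption: the encoded script starts by setting the variable to ~[].
    if not body.startswith(var_name + "=~[];"):
        return False
    allowed = JJENCODE_SYMBOLS | set(var_name)
    return all(ch in allowed for ch in body)


print(looks_like_jjencode('$=~[];$={___:++$,$$$$:(![]+"")[$]};'))
print(looks_like_jjencode('alert("Hello, JavaScript")'))
```

A check like this only flags candidates; a script that passes still has to be decoded with one of the usual methods.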
A normal piece of JS code:
alert("Hello, JavaScript")
The code after JJEncode obfuscation (custom variable name is $):
$=~[];$={___:++$,$$$$:(![]+"")[$],__$:++$,$_$_:(![]+"")[$],_$_:++$,$_$$:({}+"")[$],$$_$:($[$]+"")[$],_$$:++$,$$$_:(!""+"")[$],$__:++$,$_$:++$,$$__:({}+"")[$],$$_:++$,$$$:++$,$___:++$,$__$:++$};$.$_=($.$_=$+"")[$.$_$]+($._$=$.$_[$.__$])+($.$$=($.$+"")[$.__$])+((!$)+"")[$._$$]+($.__=$.$_[$.$$_])+($.$=(!""+"")[$.__$])+($._=(!""+"")[$._$_])+$.$_[$.$_$]+$.__+$._$+$.$;$.$$=$.$+(!""+"")[$._$$]+$.__+$._+$.$+$.$$;$.$=($.___)[$.$_][$.$_];$.$($.$($.$$+"\""+$.$_$_+(![]+"")[$._$_]+$.$$$_+"\\"+$.__$+$.$$_+$._$_+$.__+"(\\\"\\"+$.__$+$.__$+$.___+$.$$$_+(![]+"")[$._$_]+(![]+"")[$._$_]+$._$+",\\"+$.$__+$.___+"\\"+$.__$+$.__$+$._$_+$.$_$_+"\\"+$.__$+$.$$_+$.$$_+$.$_$_+"\\"+$.__$+$._$_+$._$$+$.$$__+"\\"+$.__$+$.$$_+$._$_+"\\"+$.__$+$.$_$+$.__$+"\\"+$.__$+$.$$_+$.___+$.__+"\\\"\\"+$.$__+$.___+")"+"\"")())();
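Reading the first statement of the sample above, the keys of the lookup table appear to be binary numbers written with $ as 1 and _ as 0, and each key stores the matching hex digit (for example, ___ is 0 and $_$_ is "a", taken from "false"[1]). Other characters are spelled as octal escapes built from those digits. The Python sketch below only mirrors that observation; decode_key and decode_octal_escape are hypothetical helper names, not part of JJEncode.

```python
def decode_key(name: str) -> str:
    """Map a table key such as '$_$_' to the hex digit it stores ('a')."""
    value = int(name.replace("$", "1").replace("_", "0"), 2)
    return format(value, "x")


def decode_octal_escape(*keys: str) -> str:
    """Resolve a backslash followed by table lookups (an octal escape) to its character."""
    digits = "".join(decode_key(key) for key in keys)
    return chr(int(digits, 8))


# Keys taken from the sample: digits 0 and 9, letters a and f.
print([decode_key(k) for k in ("___", "$__$", "$_$_", "$$$$")])
# The sample spells the "r" of alert as backslash + $.__$ + $.$$_ + $._$_ (octal 162).
print(decode_octal_escape("__$", "$$_", "_$_"))
```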
JJEncode is very easy to de-obfuscate. Several common methods:
- Decode it directly with an online tool, such as: http://www.hiencode.com/jjencode.html
- JJEncode output is usually an immediately invoked function expression (IIFE). Remove the final () at the end of the code, then run it in the browser console to see the source code.
- Debug it in the browser: set a breakpoint on the first line of the JJEncode code, step through it, and the source code eventually appears in a virtual machine (VM) script.
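The second method can be prepared programmatically. The sketch below, with the hypothetical helper name strip_final_call, only drops the trailing () (and an optional semicolon) from a JJEncode string so that pasting the result into a console yields a function object instead of executing the payload. It assumes the script really ends with that final call and does not decode anything itself.

```python
def strip_final_call(jj_source: str) -> str:
    """Remove the trailing call parentheses so the code evaluates to a function."""
    code = jj_source.rstrip()
    if code.endswith(";"):
        code = code[:-1].rstrip()
    if not code.endswith("()"):
        raise ValueError("source does not end with a call '()'")
    return code[:-2]


# Shortened stand-in for a JJEncode tail; a real script is much longer.
print(strip_final_call('$.$($.$("x")())();\n'))
```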
Reverse engineering the parameters
The main target is the _signature parameter. It is still generated by window.get_sign(), the same as in Question 1, so the process of locating it is not repeated here. If anything is unclear, see Brother K's previous article.
Following the call into 2.js, you will find that it is obfuscated with JJEncode:
Copy the obfuscated part, remove the final (), and run it in the browser console (an incognito window is recommended, because a normal window can sometimes interfere). The console prints the source code; click it to jump to the virtual machine (VM) script, where the entire source code is displayed:
Besides removing the final () and running the code, we can also set a breakpoint on the first line of the obfuscated code, step through it, and finally reach the source code, as shown in the following figure:
The source code is easy to read: it is an anonymous function containing a modified SHA1. Copy the code and rewrite it locally, have the Python code send the resulting _signature with each request, add up the data page by page, and the final submission succeeds:
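As an alternative to running the copied JS, the modified SHA1 can be sketched in pure Python. The block below assumes that the only change from standard SHA-1 is the initial state found in core_sha1() of challenge_2.js, and that the signed string is ASCII. It also assumes that Date.parse(new Date) yields a millisecond timestamp truncated to whole seconds, because the date is converted to a string first. The names sha1_with_iv and get_sign are illustrative; the port is checked against hashlib using the standard initial state.

```python
import hashlib
import struct
import time

# Initial state copied from core_sha1() in challenge_2.js (signed 32-bit values in JS).
MODIFIED_IV = (1732584173, -271733877, -1752584194, 271733878, -1009589776)
# Standard SHA-1 initial state, used only to sanity-check this port.
STANDARD_IV = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0)
MASK = 0xFFFFFFFF


def rol(value: int, count: int) -> int:
    """Rotate a 32-bit value left by count bits."""
    return ((value << count) | (value >> (32 - count))) & MASK


def sha1_with_iv(message: bytes, iv) -> str:
    """SHA-1 compression with a caller-supplied initial state, returned as hex."""
    a, b, c, d, e = (v & MASK for v in iv)
    # Standard padding: 0x80, zero bytes, then the 64-bit big-endian bit length.
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)
    padded += struct.pack(">Q", len(message) * 8)
    for offset in range(0, len(padded), 64):
        w = list(struct.unpack(">16I", padded[offset:offset + 64]))
        for j in range(16, 80):
            w.append(rol(w[j - 3] ^ w[j - 8] ^ w[j - 14] ^ w[j - 16], 1))
        old = (a, b, c, d, e)
        for j in range(80):
            if j < 20:
                ft, kt = (b & c) | (~b & d), 0x5A827999
            elif j < 40:
                ft, kt = b ^ c ^ d, 0x6ED9EBA1
            elif j < 60:
                ft, kt = (b & c) | (b & d) | (c & d), 0x8F1BBCDC
            else:
                ft, kt = b ^ c ^ d, 0xCA62C1D6
            t = (rol(a, 5) + (ft & MASK) + e + w[j] + kt) & MASK
            a, b, c, d, e = t, a, rol(b, 30), c, d
        a, b, c, d, e = ((x + y) & MASK for x, y in zip((a, b, c, d, e), old))
    return "".join(f"{x:08x}" for x in (a, b, c, d, e))


def get_sign(now=None) -> str:
    """Mirror getSign(): hash the current time in ms, truncated to whole seconds."""
    seconds = int(time.time() if now is None else now)
    return sha1_with_iv(str(seconds * 1000).encode("ascii"), MODIFIED_IV)


# With the standard initial state the port must agree with hashlib.
assert sha1_with_iv(b"abc", STANDARD_IV) == hashlib.sha1(b"abc").hexdigest()
print(get_sign())
```

If the server rejects signatures produced this way, one of the assumptions above does not hold, and calling the copied JS from Python remains the safer route.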
Complete code
Follow Brother K crawler on GitHub for more crawler-related code. Stars are welcome! https://github.com/kgepachong/
The following only demonstrates part of the key code and cannot be run directly! Complete code repository: https://github.com/kgepachong/crawler/
JavaScript encryption code
/* ==================================
# @Time : 2021-12-10
# @Author : WeChat official account: K哥爬虫 (Brother K crawler)
# @FileName: challenge_2.js
# @Software: PyCharm
# ================================== */
var hexcase = 0;
var chrsz = 8;

function hex_sha1(s) {
    return binb2hex(core_sha1(AlignSHA1(s)));
}

function sha1_vm_test() {
    return hex_sha1("abc") == "a9993e364706816aba3e25717850c26c9cd0d89d";
}

function core_sha1(blockArray) {
    var x = blockArray;
    var w = Array(80);
    // Initial state: a, b and c differ from standard SHA-1
    // (1732584193, -271733879, -1732584194); this is the modification.
    var a = 1732584173;
    var b = -271733877;
    var c = -1752584194;
    var d = 271733878;
    var e = -1009589776;
    for (var i = 0; i < x.length; i += 16) {
        var olda = a;
        var oldb = b;
        var oldc = c;
        var oldd = d;
        var olde = e;
        for (var j = 0; j < 80; j++) {
            if (j < 16)
                w[j] = x[i + j];
            else
                w[j] = rol(w[j - 3] ^ w[j - 8] ^ w[j - 14] ^ w[j - 16], 1);
            var t = safe_add(safe_add(rol(a, 5), sha1_ft(j, b, c, d)), safe_add(safe_add(e, w[j]), sha1_kt(j)));
            e = d;
            d = c;
            c = rol(b, 30);
            b = a;
            a = t;
        }
        a = safe_add(a, olda);
        b = safe_add(b, oldb);
        c = safe_add(c, oldc);
        d = safe_add(d, oldd);
        e = safe_add(e, olde);
    }
    return new Array(a, b, c, d, e);
}

function sha1_ft(t, b, c, d) {
    if (t < 20) {
        return (b & c) | ((~b) & d);
    }
    if (t < 40) {
        return b ^ c ^ d;
    }
    if (t < 60) {
        return (b & c) | (b & d) | (c & d);
    }
    return b ^ c ^ d;
}

function sha1_kt(t) {
    return (t < 20) ? 1518500249 : (t < 40) ? 1859775393 : (t < 60) ? -1894007588 : -899497514;
}

// Add two 32-bit integers, wrapping at 2^32.
function safe_add(x, y) {
    var lsw = (x & 0xFFFF) + (y & 0xFFFF);
    var msw = (x >> 16) + (y >> 16) + (lsw >> 16);
    return (msw << 16) | (lsw & 0xFFFF);
}

// Rotate a 32-bit integer left by cnt bits.
function rol(num, cnt) {
    return (num << cnt) | (num >>> (32 - cnt));
}

// Pack the string into big-endian 32-bit words and apply SHA-1 padding.
function AlignSHA1(str) {
    var nblk = ((str.length + 8) >> 6) + 1;
    var blks = new Array(nblk * 16);
    for (var i = 0; i < nblk * 16; i++) {
        blks[i] = 0;
    }
    for (i = 0; i < str.length; i++) {
        blks[i >> 2] |= str.charCodeAt(i) << (24 - (i & 3) * 8);
    }
    blks[i >> 2] |= 0x80 << (24 - (i & 3) * 8);
    blks[nblk * 16 - 1] = str.length * 8;
    return blks;
}

// Convert the array of 32-bit words to a hex string.
function binb2hex(binarray) {
    var hex_tab = hexcase ? "0123456789ABCDEF" : "0123456789abcdef";
    var str = "";
    for (var i = 0; i < binarray.length * 4; i++) {
        str += hex_tab.charAt((binarray[i >> 2] >> ((3 - i % 4) * 8 + 4)) & 0xF) + hex_tab.charAt((binarray[i >> 2] >> ((3 - i % 4) * 8)) & 0xF);
    }
    return str;
}

// The signature is the modified SHA1 of the current timestamp string.
function getSign() {
    return hex_sha1(Date.parse(new Date).toString());
}
// Test output
// console.log(getSign())
Key Python code
# ==================================
# --*-- coding: utf-8 --*--
# @Time : 2021-12-10
# @Author : WeChat official account: K哥爬虫 (Brother K crawler)
# @FileName: challenge_2.py
# @Software: PyCharm
# ==================================
import execjs
import requests

challenge_api = "http://spider.wangluozhe.com/challenge/api/2"
headers = {
    "Content-Type": "application/x-www-form-urlencoded; charset=UTF-8",
    "Cookie": "Replace this with your own cookie value!",
    "Host": "spider.wangluozhe.com",
    "Origin": "http://spider.wangluozhe.com",
    "Referer": "http://spider.wangluozhe.com/challenge/2",
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.45 Safari/537.36",
    "X-Requested-With": "XMLHttpRequest"
}


def get_signature():
    # Compile the de-obfuscated JS and call getSign() to produce _signature.
    with open('challenge_2.js', 'r', encoding='utf-8') as f:
        ppdai_js = execjs.compile(f.read())
    signature = ppdai_js.call("getSign")
    print("signature: ", signature)
    return signature


def main():
    result = 0
    # Request pages 1 to 100 and add up every value.
    for page in range(1, 101):
        data = {
            "page": page,
            "count": 10,
            "_signature": get_signature()
        }
        response = requests.post(url=challenge_api, headers=headers, data=data).json()
        for d in response["data"]:
            result += d["value"]
    print("Result: ", result)


if __name__ == '__main__':
    main()
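The summing step can be tried without touching the site by passing in the page fetcher as a function. The sketch below is illustrative only: sum_pages and fake_fetch are hypothetical names, the response shape ({"data": [{"value": ...}]}) is taken from the code above, and the fake data is made up.

```python
from typing import Callable, Dict, Iterable


def sum_pages(fetch_page: Callable[[int], Dict], pages: Iterable[int]) -> int:
    """Add up every "value" field returned for the given pages."""
    total = 0
    for page in pages:
        payload = fetch_page(page)
        total += sum(item["value"] for item in payload["data"])
    return total


def fake_fetch(page: int) -> Dict:
    # Stand-in for the real POST request; returns two made-up values per page.
    return {"data": [{"value": page}, {"value": 10 * page}]}


print(sum_pages(fake_fetch, range(1, 4)))  # 11 + 22 + 33 = 66
```

In the real script, fetch_page would wrap the requests.post call shown above, and pages would be range(1, 101).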