[JS Reverse Engineering: 100 Cases] Wangluozhe anti-crawler practice platform, challenge 2: JJEncode obfuscation

Follow the WeChat official account K哥爬虫 (Brother K Crawler) for ongoing articles on advanced crawling, JS/Android reverse engineering, and other practical techniques!

Disclaimer

All content in this article is for learning and discussion only. Captured content, sensitive URLs, and data interfaces have been desensitized. Use for commercial or illegal purposes is strictly prohibited; the author bears no responsibility for any consequences arising from such use. In case of infringement, please contact me and the content will be removed immediately!

Reverse-engineering target

Target: Wangluozhe anti-crawler practice platform, challenge 2: JJEncode obfuscation
Link: http://spider.wangluozhe.com/challenge/2
Introduction: Like the first challenge, this one requires collecting all the numbers across 100 pages and computing their sum. The algorithm used in challenge 2 is a modified SHA1, wrapped in a layer of JJEncode obfuscation.

Introduction to JJEncode

JJEncode is a web tool originally developed by the Japanese author Yosuke HASEGAWA in 2009. It encodes arbitrary JavaScript into an obfuscated form that uses only 18 symbols: []()!+,\"$.:;_{}~= . Online demo: https://utf-8.jp/public/jjencode.html . If you want to explore the principle in depth, reply [JJEncode] to the K哥爬虫 official account to get a PDF with a detailed explanation.

The author himself gives a reminder: JJEncode is easy to decode. It is not a practical obfuscator, only an encoder. Its output is highly distinctive and easy to detect, and it is browser-dependent: the code may fail to run in some browsers. Another drawback is that it is very stack-hungry: if the JS is large, the encoded result may run out of memory, so it is only suitable for encoding core functions. In practice JJEncode is rarely used commercially, but there is no harm in understanding it.

A normal piece of JS code:

alert("Hello, JavaScript" )

The same code after JJEncode obfuscation (with the global variable name set to $):

$=~[];$={___:++$,$$$$:(![]+"")[$],__$:++$,$_$_:(![]+"")[$],_$_:++$,$_$$:({}+"")[$],$$_$:($[$]+"")[$],_$$:++$,$$$_:(!""+"")[$],$__:++$,$_$:++$,$$__:({}+"")[$],$$_:++$,$$$:++$,$___:++$,$__$:++$};$.$_=($.$_=$+"")[$.$_$]+($._$=$.$_[$.__$])+($.$$=($.$+"")[$.__$])+((!$)+"")[$._$$]+($.__=$.$_[$.$$_])+($.$=(!""+"")[$.__$])+($._=(!""+"")[$._$_])+$.$_[$.$_$]+$.__+$._$+$.$;$.$$=$.$+(!""+"")[$._$$]+$.__+$._+$.$+$.$$;$.$=($.___)[$.$_][$.$_];$.$($.$($.$$+"\""+$.$_$_+(![]+"")[$._$_]+$.$$$_+"\\"+$.__$+$.$$_+$._$_+$.__+"(\\\"\\"+$.__$+$.__$+$.___+$.$$$_+(![]+"")[$._$_]+(![]+"")[$._$_]+$._$+",\\"+$.$__+$.___+"\\"+$.__$+$.__$+$._$_+$.$_$_+"\\"+$.__$+$.$$_+$.$$_+$.$_$_+"\\"+$.__$+$._$_+$._$$+$.$$__+"\\"+$.__$+$.$$_+$._$_+"\\"+$.__$+$.$_$+$.__$+"\\"+$.__$+$.$$_+$.___+$.__+"\\\"\\"+$.$__+$.___+")"+"\"")())();
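To see where the letters in the obfuscated output come from, the following sketch evaluates a few of the same building-block expressions by hand. The variable names (`letters`, `ctorName`, and so on) are illustrative assumptions, not names used by JJEncode itself.

```javascript
// JJEncode starts its counter from ~[], which is -1, then increments it with ++$.
const minusOne = ~[];

// Coercing values to strings yields text that can be indexed character by character.
const fromFalse = (![] + "");       // "false"
const fromTrue = (!"" + "");        // "true"
const fromObject = ({} + "");       // "[object Object]"
const fromUndefined = ([][0] + ""); // "undefined"

// Picking characters by index produces letters without ever typing them.
const letters = {
  c: fromObject[5],
  o: fromObject[1],
  n: fromUndefined[1],
  s: fromFalse[3],
  t: fromTrue[0],
  r: fromTrue[1],
  u: fromTrue[2],
};

// The word "constructor" is assembled from those letters.
const ctorName = ["c", "o", "n", "s", "t", "r", "u", "c", "t", "o", "r"]
  .map((ch) => letters[ch])
  .join("");

// (0).constructor is Number, and Number.constructor is Function,
// which gives the obfuscated code a way to compile strings into code.
const FunctionCtor = (0)[ctorName][ctorName];

console.log(minusOne, ctorName, FunctionCtor === Function);
```

The sample above does the same thing with symbol-only property names: `$.$_` ends up holding "constructor" and `$.$` ends up holding `Function`.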

De-obfuscating JJEncode is straightforward. Several common methods are:

Decode it directly with an online tool, such as: http://www.hiencode.com/jjencode.html
JJEncode output is usually a self-executing function (IIFE). Remove the final () at the end of the code and run it in the browser to see the source code.
Debug it in the browser: set a breakpoint on the first line of the JJEncode code and step through it; eventually the source code appears in a VM script.
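The second method works because of how the output is layered. The following is a minimal sketch of that layering, assuming the usual `Function(Function("return\"...\"")())()` shape seen in the sample above; `payload`, `innerBody`, `decoded`, and `wrapped` are hypothetical names, and `JSON.stringify` stands in for JJEncode's octal-escaped string building.

```javascript
// The payload that the sample above ultimately hides.
const payload = 'alert("Hello, JavaScript" )';

// JJEncode assembles a body of the form: return "<escaped payload>".
// JSON.stringify is used here only to keep that step readable.
const innerBody = "return " + JSON.stringify(payload);

// Inner layer: compiling and calling this body yields the payload as plain text.
const decoded = Function(innerBody)();

// Outer layer: this is the anonymous function that the trailing () would invoke.
// Leaving the call off keeps the function around for inspection instead of running it.
const wrapped = Function(decoded);

console.log(decoded);            // the recovered source text
console.log(wrapped.toString()); // the same text inside a "function anonymous" wrapper
```

In a browser console the un-called function is printed as its source, which is why clicking it opens the decoded code as a VM script.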

Reverse-engineering the parameters

The main reverse-engineering target is the _signature parameter. The signing method called is still window.get_sign(), the same as in the first challenge, so it is not repeated here. If anything is unclear, see Brother K's previous article.

Stepping into 2.js shows that it is obfuscated with JJEncode:

Copy out the obfuscated part, remove the final () and run it in the browser console (opening a separate incognito window is recommended, because the regular browsing environment can sometimes interfere). The console prints the source code; click it to jump to the VM script, where the entire source code is displayed:

Besides removing () and running the code directly, we can also set a breakpoint on the first line of the obfuscated code and step through it until the source code appears, as shown in the following figure:

The source code is easy to read: it is an anonymous function implementing a modified SHA1. Copy the code out and adapt it locally, then use Python to send requests carrying _signature, collect the data page by page, and submit the final sum successfully.
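One quick way to see what "modified" means is to compare the initial chaining values in the recovered core_sha1 (listed later in this article) with the standard SHA-1 values from FIPS 180-4. The sketch below only compares those five constants; the array and variable names are illustrative.

```javascript
// Standard SHA-1 initial chaining values (FIPS 180-4), converted to signed 32-bit integers
// so they are directly comparable with the literals used in the recovered code.
const standardIV = [
  0x67452301 | 0,
  0xEFCDAB89 | 0,
  0x98BADCFE | 0,
  0x10325476 | 0,
  0xC3D2E1F0 | 0,
];

// Initial values of a, b, c, d, e as they appear in the recovered core_sha1.
const recoveredIV = [1732584173, -271733877, -1752584194, 271733878, -1009589776];

// Report which registers start from a non-standard value.
const names = ["a", "b", "c", "d", "e"];
const changed = names.filter((_, i) => standardIV[i] !== recoveredIV[i]);

console.log(changed); // [ 'a', 'b', 'c' ]
```

Because the starting state differs, an off-the-shelf SHA1 implementation will not reproduce the site's signatures, which is why the recovered JS is reused as-is.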

Complete code

Follow K哥爬虫 on GitHub for more crawler-related code. Stars are welcome! https://github.com/kgepachong/

The following only demonstrates part of the key code and cannot be run directly! Full code repository: https://github.com/kgepachong/crawler/

JavaScript encryption code

/* ==================================
# @Time    : 2021-12-10
# @Author  : WeChat official account: K哥爬虫
# @FileName: challenge_2.js
# @Software: PyCharm
# ================================== */


var hexcase = 0;
var chrsz = 8;

function hex_sha1(s) {
    return binb2hex(core_sha1(AlignSHA1(s)));
}

function sha1_vm_test() {
    return hex_sha1("abc") == "a9993e364706816aba3e25717850c26c9cd0d89d";
}

function core_sha1(blockArray) {
    var x = blockArray;
    var w = Array(80);
    var a = 1732584173;
    var b = -271733877;
    var c = -1752584194;
    var d = 271733878;
    var e = -1009589776;
    for (var i = 0; i < x.length; i += 16) {
        var olda = a;
        var oldb = b;
        var oldc = c;
        var oldd = d;
        var olde = e;
        for (var j = 0; j < 80; j++) {
            if (j < 16)
                w[j] = x[i + j];
            else
                w[j] = rol(w[j - 3] ^ w[j - 8] ^ w[j - 14] ^ w[j - 16], 1);
            var t = safe_add(safe_add(rol(a, 5), sha1_ft(j, b, c, d)), safe_add(safe_add(e, w[j]), sha1_kt(j)));
            e = d;
            d = c;
            c = rol(b, 30);
            b = a;
            a = t;
        }
        a = safe_add(a, olda);
        b = safe_add(b, oldb);
        c = safe_add(c, oldc);
        d = safe_add(d, oldd);
        e = safe_add(e, olde);
    }
    return new Array(a, b, c, d, e);
}

function sha1_ft(t, b, c, d) {
    if (t < 20) {
        return (b & c) | ((~b) & d);
    }
    if (t < 40) {
        return b ^ c ^ d;
    }
    if (t < 60) {
        return (b & c) | (b & d) | (c & d);
    }
    return b ^ c ^ d;
}

function sha1_kt(t) {
    return (t < 20) ? 1518500249 : (t < 40) ? 1859775393 : (t < 60) ? -1894007588 : -899497514;
}

function safe_add(x, y) {
    var lsw = (x & 0xFFFF) + (y & 0xFFFF);
    var msw = (x >> 16) + (y >> 16) + (lsw >> 16);
    return (msw << 16) | (lsw & 0xFFFF);
}

function rol(num, cnt) {
    return (num << cnt) | (num >>> (32 - cnt));
}

function AlignSHA1(str) {
    var nblk = ((str.length + 8) >> 6) + 1;
    var blks = new Array(nblk * 16);
    for (var i = 0; i < nblk * 16; i++) {
        blks[i] = 0;
    }
    for (i = 0; i < str.length; i++) {
        blks[i >> 2] |= str.charCodeAt(i) << (24 - (i & 3) * 8);
    }
    blks[i >> 2] |= 0x80 << (24 - (i & 3) * 8);
    blks[nblk * 16 - 1] = str.length * 8;
    return blks;
}

function binb2hex(binarray) {
    var hex_tab = hexcase ? "0123456789ABCDEF" : "0123456789abcdef";
    var str = "";
    for (var i = 0; i < binarray.length * 4; i++) {
        str += hex_tab.charAt((binarray[i >> 2] >> ((3 - i % 4) * 8 + 4)) & 0xF) + hex_tab.charAt((binarray[i >> 2] >> ((3 - i % 4) * 8)) & 0xF);
    }
    return str;
}

function getSign() {
    return hex_sha1(Date.parse(new Date).toString());
}

// Test output
// console.log(getSign())
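The input that getSign hashes is worth a closer look. Date.parse expects a string, so the Date object is first converted with toString(), which only has one-second resolution. The sketch below illustrates that the resulting timestamp is the millisecond value rounded down to a whole second; `now`, `signInput`, and `floored` are illustrative names.

```javascript
// Take one fixed Date so both computations below use the same instant.
const now = new Date();

// Same expression shape as in getSign(): Date -> string -> parsed milliseconds -> string.
const signInput = Date.parse(now).toString();

// Equivalent arithmetic form: drop the milliseconds from the timestamp.
const floored = (Math.floor(now.getTime() / 1000) * 1000).toString();

console.log(signInput, floored, signInput === floored);
```

This means signatures generated within the same second are identical, and a port to another language would need to zero out the milliseconds in the same way.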

Key Python calculation code

# ==================================
# --*-- coding: utf-8 --*--
# @Time    : 2021-12-10
# @Author  : WeChat official account: K哥爬虫
# @FileName: challenge_2.py
# @Software: PyCharm
# ==================================


import execjs
import requests


challenge_api = "http://spider.wangluozhe.com/challenge/api/2"
headers = {
    "Content-Type": "application/x-www-form-urlencoded; charset=UTF-8",
    "Cookie": "Replace this with your own cookie value!",
    "Host": "spider.wangluozhe.com",
    "Origin": "http://spider.wangluozhe.com",
    "Referer": "http://spider.wangluozhe.com/challenge/2",
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.45 Safari/537.36",
    "X-Requested-With": "XMLHttpRequest"
}


def get_signature():
    # Compile the recovered JS file and call getSign() to produce a fresh signature
    with open('challenge_2.js', 'r', encoding='utf-8') as f:
        js_ctx = execjs.compile(f.read())
    signature = js_ctx.call("getSign")
    print("signature: ", signature)
    return signature


def main():
    result = 0
    for page in range(1, 101):
        data = {
            "page": page,
            "count": 10,
            "_signature": get_signature()
        }
        response = requests.post(url=challenge_api, headers=headers, data=data).json()
        for d in response["data"]:
            result += d["value"]
    print("Result: ", result)


if __name__ == '__main__':
    main()
