
Statement

All content in this article is for learning and exchange only. The captured content, sensitive URLs, and data interfaces have been desensitized. Commercial and illegal use is strictly prohibited; the author bears no responsibility for any consequences arising from such use. If there is any infringement, please contact me and it will be deleted immediately!

Reverse engineering goal

  • Goal: reverse engineer the Jiasule (加速乐) encryption
  • Website: aHR0cHM6Ly93d3cubXBzLmdvdi5jbi9pbmRleC5odG1s
  • Difficulties: OB obfuscation, a hash algorithm that changes between requests, multi-layer cookie acquisition

Jiasule

Jiasule (加速乐) is a website CDN acceleration and security protection platform launched by Knownsec (知道创宇).

A site protected by Jiasule typically needs three requests before the page is served:

  1. On the first request, the server responds with status code 521, and the body is JS code obfuscated with AAEncode;
  2. On the second request, the server again responds with status code 521, and the body is JS code obfuscated with OB;
  3. On the third request, the server responds with status code 200, and the page content can be accessed normally.
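
The three stages can be told apart programmatically. The following is a minimal sketch, not part of the original code: classify_jsl_response is a hypothetical helper, and the two substring checks are assumptions based only on the sample responses shown in this article.

```python
def classify_jsl_response(status_code: int, body: str) -> str:
    """Guess which of the three Jiasule stages a response belongs to."""
    if status_code == 200:
        return "page"
    if status_code == 521:
        # Assumption: the OB-obfuscated second-stage script ends with a go({...}) call.
        if "go({" in body:
            return "ob_challenge"
        # Assumption: the first-stage script assigns document.cookie in plain sight.
        if "document.cookie=" in body:
            return "aaencode_challenge"
    return "unknown"


first = "<script>document.cookie=('_')+('_');location.href=location.pathname+location.search</script>"
second = '<script>var _0x1c58=[];go({"ha":"md5"})</script>'
print(classify_jsl_response(521, first))
print(classify_jsl_response(521, second))
print(classify_jsl_response(200, "<html></html>"))
```

Checking for go({ before document.cookie= is deliberate: the second-stage script writes the cookie through an obfuscated property name, so only the first-stage body is expected to contain the literal assignment.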

Reverse engineering approach

Given the Jiasule behavior described above, three steps are needed to get the real HTML page:

  1. On the first request, the Set-Cookie header returned by the server carries the __jsluid_s parameter, and decoding the response body yields the first __jsl_clearance_s value;
  2. Request the site again with the cookies obtained from the first request, and reverse the returned JS to obtain the second __jsl_clearance_s value;
  3. Request the site a third time with the __jsluid_s and second __jsl_clearance_s cookies to get the real HTML page, then collect the data.

Packet capture analysis

Open the website with the developer tools capturing traffic. The Network panel shows that index.html is requested three times, and the first two responses return status code 521, which matches the Jiasule behavior.

First layer cookie acquisition

Viewing the response in the developer tools shows no content, so we capture the traffic with Fiddler instead. The body returned by the first index.html is encoded with AAEncode:

 <script>
    document.cookie=('_')+('_')+('j')+('s')+('l')+('_')+('c')+('l')+('e')+('a')+('r')+('a')+('n')+('c')+('e')+('_')+('s')+('=')+(-~[]+'')+((1+[2])/[2]+'')+(([2]+0>>2)+'')+((2<<2)+'')+(-~(8)+'')+(~~{}+'')+(6+'')+(7+'')+(~~[]+'')+((1<<2)+'')+('.')+((+true)+'')+(~~{}+'')+(9+'')+('|')+('-')+(+!+[]+'')+('|')+(1+6+'')+('n')+((1<<2)+'')+('k')+('X')+((2)*[4]+'')+('R')+('w')+('z')+('c')+(1+7+'')+('w')+('T')+('j')+('r')+('b')+('H')+('m')+('W')+('H')+('j')+([3]*(3)+'')+('G')+('X')+('C')+('t')+('I')+('%')+(-~[2]+'')+('D')+(';')+('m')+('a')+('x')+('-')+('a')+('g')+('e')+('=')+(3+'')+(3+3+'')+(~~{}+'')+(~~[]+'')+(';')+('p')+('a')+('t')+('h')+('=')+('/');location.href=location.pathname+location.search
</script>

The emoticon-style expression assigned to document.cookie is in fact the first __jsl_clearance_s value. Extract the encoded expression with a regular expression, then evaluate it with execjs.eval() to get the decoded value:

 import re
import execjs


AAEncode_text = """the content above"""
content_first = re.findall(r'cookie=(.*?);location', AAEncode_text)[0]
jsl_clearance_s = execjs.eval(content_first).split(';')[0]
print(jsl_clearance_s)
# __jsl_clearance_s=1658906704.109|-1|7n4kX8Rwzc8wTjrbHmWHj9GXCtI%3D
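
The evaluated string also carries max-age and path attributes, which the split(';')[0] above discards. If those are of interest, a small parser can separate them. This is a sketch only: parse_cookie_assignment is a hypothetical helper, and the input is the string that the sample script above evaluates to.

```python
def parse_cookie_assignment(raw: str):
    """Split a document.cookie assignment string into name, value, and attributes."""
    parts = [p.strip() for p in raw.split(";") if p.strip()]
    # The first segment is always name=value; split once in case the value contains '='.
    name, value = parts[0].split("=", 1)
    attrs = {}
    for part in parts[1:]:
        key, _, val = part.partition("=")
        attrs[key.lower()] = val
    return name, value, attrs


decoded = "__jsl_clearance_s=1658906704.109|-1|7n4kX8Rwzc8wTjrbHmWHj9GXCtI%3D;max-age=3600;path=/"
name, value, attrs = parse_cookie_assignment(decoded)
print(name)
print(value)
print(attrs)
```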

Second layer cookie acquisition

The second index.html in the capture returns JS that has been obfuscated with OB. We need to debug and analyze it, but it is hard to locate this JS by searching directly in the page. Two methods for locating it are recommended here:

1. File replacement

Right-click the captured index.html whose response is the second 521 and save it locally.

The saved JS is minified, which makes it hard to read. Format it with the JS formatter at https://spidertools.cn/#/formatJS , then paste the formatted code into an editor. Some manual cleanup may be needed; for example, extra spaces may appear around the opening and closing script tags. Add debugger; right after <script>, as follows:

 <script>
debugger;
var _0x1c58 = ['wpDCsRDCuA==', 'AWc8w7E=', 'w6llwpPCqA==', 'w61/wow7',

Finally, replace the response through Fiddler: click Add Rule to add a new rule that serves the local file in place of the original response.

Once this is done, start capturing in Fiddler (press F12; Capturing is displayed in the lower-left corner), clear the browser cache, and refresh the page. Execution pauses at the debugger statement, which means the JS has been located and can now be debugged with breakpoints.

2. Hook Cookie Value

The JS we obtained generates a cookie containing the __jsluid_s and __jsl_clearance_s parameters, so we can also hook document.cookie directly to locate the JS. If you are not familiar with hooking, see Brother K's previous articles. The hook code is as follows:

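 (function () {
    'use strict';
    // Cache the current cookie string. The accessors below shadow the native ones,
    // so anything written after injection is kept only in this variable.
    var org = document.cookie;
    document.__defineSetter__('cookie', function (cookie) {
        // Pause as soon as the page script writes __jsl_clearance_s.
        if (cookie.indexOf('__jsl_clearance_s') != -1) {
            debugger;
        }
        org = cookie;
    });
    document.__defineGetter__('cookie', function () {
        return org;
    });
})();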

There are many ways to inject a hook. Here it is injected through a Fiddler plug-in, which can be obtained by sending [Fiddler plug-in] to the K哥爬虫 official account.

As before, once everything is set up, start capturing, clear the browser cache, and refresh the page; execution pauses successfully. The upper part is the code we injected through the hook, paused where the cookie contains the __jsl_clearance_s keyword. The part below it, once formatted, turns out to be the OB-obfuscated JS from before.

Debug and analyze JS files

After the hook fires, follow the call stack upward to find where the cookie is generated. In JavaScript, cookies are generally created, read, and deleted through the document.cookie property. Some values in this JS change on every request, so we use the local replacement method to pin one fixed copy. Searching the JS for document with CTRL + F gives a single hit; set a breakpoint there, at line 558. Selecting _0x2a9a('0xdb', 'WGP(') + 'ie' and hovering over it shows that this is the obfuscated form of cookie.

Select everything after the equals sign and hover over it: the value of the __jsl_clearance_s parameter in the cookie is generated here.

We now know where the cookie is generated. Next we need to understand the logic and algorithm behind it, and then reproduce it in Python. The complete document statement is as follows:

 document[_0x2a9a('0xdb', 'WGP(') + 'ie'] = _0x2228a0[_0x2a9a('0x52', '$hOV') + 'W'](_0x2228a0[_0x2a9a('0x3', '*hjw') + 'W'](_0x2228a0[_0x2a9a('0x10b', 'rV*F') + 'W'](_0x60274b['tn'] + '=' + _0x732635[0x0], _0x2228a0[_0x2a9a('0x3d', 'QRZ0') + 'q']), _0x60274b['vt']), _0x2228a0[_0x2a9a('0x112', ']A89') + 'x']);

For more on OB obfuscation, see Brother K's previous articles. The expression after the equals sign is fairly long, but what we actually need is the value of the __jsl_clearance_s parameter. Debugging shows that it is produced by _0x60274b['tn'] + '=' + _0x732635[0x0].

As shown above, _0x60274b['tn'] corresponds to __jsl_clearance_s and its value is _0x732635[0x0], so we need to trace _0x732635 further. Searching for it leads to line 538, where it is defined. Breakpoint debugging shows that _0x732635[0x0] is simply the first element of the _0x732635 array.

Let's look at what the code that produces _0x732635 does. In the call to _0x14e035, _0x60274b['ct'] is the value of the ct parameter in the dictionary passed to the go function:

 go({
    "bts": ["1658906704.293|0|YYj", "Jm5cKs%2B1v1GqTYAtpQjthM%3D"],
    "chars": "vUzQIgamgWnnFOJyKwXiGK",
    "ct": "690f55a681f304c95b35941b20538480",
    "ha": "md5",
    "tn": "__jsl_clearance_s",
    "vt": "3600",
    "wt": "1500"
})
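
To read these parameters in Python, one option is a regular expression plus json.loads. The sketch below assumes the second-layer body ends with ;go({...})</script>, the same layout the regular expression in the full code relies on. extract_go_params is a hypothetical helper, and the sample body uses made-up values.

```python
import json
import re

# Capture the JSON object passed to go(...) just before the closing script tag.
GO_CALL = re.compile(r";go\((\{.*?\})\)</script>", re.S)


def extract_go_params(html: str) -> dict:
    """Return the dictionary passed to go() in a second-layer response body."""
    match = GO_CALL.search(html)
    if match is None:
        raise ValueError("no go({...}) call found in the response body")
    return json.loads(match.group(1))


# Made-up body for illustration; a real one contains the full obfuscated script.
sample = (
    '<script>var _0x1c58=["..."];go({"bts":["a|0|b","c%3D"],"chars":"xyz",'
    '"ct":"00","ha":"md5","tn":"__jsl_clearance_s","vt":"3600","wt":"1500"})</script>'
)
params = extract_go_params(sample)
print(params["ha"], params["bts"])
```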


Analysis shows that the __jsl_clearance_s value is built by joining the elements of the _0x60274b[_0x2a9a('0xf9', 'uUBi')] array according to a certain rule, where _0x2a9a('0xf9', 'uUBi') resolves to bts in the dictionary.

Tracing _0x14e035 further shows that it is a function, and the value returned on line 533 is the value of the __jsl_clearance_s parameter.

Breakpoint debugging on line 532 shows that _0x2a7ea9, the argument passed to hash, is the value of the __jsl_clearance_s parameter.

hash(_0x2a7ea9) is the hashed result of _0x2a7ea9. In this example it is a 32-character string made up of 0-9 and a-f, which is the typical shape of an MD5 digest; checking it against an online MD5 tool confirms the match. The hash method is not always MD5, however: it changes after a few refreshes. It corresponds to the value of ha in the dictionary passed to the go function, which names the hash algorithm. There are three possibilities: md5, sha1, and sha256. A local implementation therefore needs all three and must pick one according to the value of ha.
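
In Python the dispatch on ha can be handled by the standard hashlib module. This is a minimal sketch under the assumption that ha only ever takes the three values named above; jsl_hash is a hypothetical name, not something from the site's JS.

```python
import hashlib

# The three algorithm names observed in the ha parameter.
SUPPORTED = ("md5", "sha1", "sha256")


def jsl_hash(ha: str, value: str) -> str:
    """Hash value with the algorithm named by ha and return the hex digest."""
    if ha not in SUPPORTED:
        raise ValueError(f"unsupported hash algorithm: {ha!r}")
    return hashlib.new(ha, value.encode("utf-8")).hexdigest()


# Digest lengths differ per algorithm: 32, 40, and 64 hex characters.
for algorithm in SUPPORTED:
    print(algorithm, len(jsl_hash(algorithm, "example")))
```

The digest length is a quick way to sanity-check which algorithm produced a given ct value.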

Looking further, this code sits inside a for loop, and the value of hash(_0x2a7ea9) changes on every iteration because _0x2a7ea9 itself changes. Only two letters in the middle of _0x2a7ea9 vary, which is easy to miss unless you look closely.

Analysis shows that _0x2a7ea9 is the first element of the _0x5e5712 array, plus two letters, plus the second element of the array.

The two middle letters come from the expression below, used twice. It amounts to _0x60274b['chars']['substr'](index, 1), which takes one letter from the chars parameter in the dictionary. The nested for loops keep trying pairs of letters until the hash of the candidate equals _0x56cbce (that is, ct); the matching candidate is then returned as the __jsl_clearance_s value:

 _0x60274b[_0x2a9a('0x45', 'XXkw') + 's'][_0x2a9a('0x5a', 'ZN)]') + 'tr'](_0x8164, 0x1)

_0x56cbce is the value of ct.

The preceding _0x2228a0[_0x2a9a('0x6d', 'U0Y3') + 's'] is a method. Step into it to see what logic it implements.

Its body is as follows; the method simply returns whether its two arguments are equal:

 _0x560b67[_0x2a9a('0x15', 'NwFy') + 's'] = function(_0x4573a2, _0x3855be) {
    return _0x4573a2 == _0x3855be;
};

Simulation execution

To sum up, the _0x14e035 function checks whether the hash of _0x2a7ea9 equals ct and, if so, returns _0x2a7ea9 as the value of the __jsl_clearance_s parameter. If the loop ends without a match, line 509 runs and reports a failure. The ha value in the incoming parameters varies, so the hash algorithm varies too: it is one of SHA1, SHA256, or MD5. We can either extract the hash method from the page JS or simply use the crypto-js library:

 var CryptoJS = require('crypto-js');


function hash(type, value){
    if(type == 'md5'){
        return CryptoJS.MD5(value).toString();
    }
    if(type == 'sha1'){
        return CryptoJS.SHA1(value).toString();
    }
    if(type == 'sha256'){
        return CryptoJS.SHA256(value).toString();
    }
}


var _0x2228a0 = {
    "mLZyz" : function(_0x435347, _0x8098d) {
        return _0x435347 < _0x8098d;
    },
    "SsARo" : function(_0x286fd4, _0x10b2a6) {
        return _0x286fd4 + _0x10b2a6;
    },
    "jfMAx" : function(_0x6b4da, _0x19c099) {
        return _0x6b4da + _0x19c099;
    },
    "HWzBW" : function(_0x3b9d7f, _0x232017) {
        return _0x3b9d7f + _0x232017;
    },
    "DRnYs" : function(_0x4573a2, _0x3855be) {
        return _0x4573a2 == _0x3855be;
    },
    "ZJMqu" : function(_0x3af043, _0x1dbbb7) {
        return _0x3af043 - _0x1dbbb7;
    },
};


function cookies(_0x60274b){
    var _0x34d7a8 = new Date();
    function _0x14e035(_0x56cbce, _0x5e5712) {
        var _0x2d0a43 = _0x60274b['chars']['length'];
        for (var _0x212ce4 = 0x0; _0x212ce4 < _0x2d0a43; _0x212ce4++) {
            for (var _0x8164 = 0x0; _0x2228a0["mLZyz"](_0x8164, _0x2d0a43); _0x8164++) {
                var _0x2a7ea9 = _0x5e5712[0] + _0x60274b["chars"]["substr"](_0x212ce4, 1) + _0x60274b["chars"]["substr"](_0x8164, 1) + _0x5e5712[1];
                if (_0x2228a0["DRnYs"](hash(_0x60274b['ha'], _0x2a7ea9), _0x56cbce)) {
                    return [_0x2a7ea9, _0x2228a0["ZJMqu"](new Date(), _0x34d7a8)];
                }
            }
        }
    }
    var _0x732635 = _0x14e035(_0x60274b['ct'], _0x60274b['bts']);
    return {'__jsl_clearance_s' : _0x732635[0]};
}

// console.log(cookies({
//     "bts": ["1658906704.293|0|YYj", "Jm5cKs%2B1v1GqTYAtpQjthM%3D"],
//     "chars": "vUzQIgamgWnnFOJyKwXiGK",
//     "ct": "690f55a681f304c95b35941b20538480",
//     "ha": "md5",
//     "tn": "__jsl_clearance_s",
//     "vt": "3600",
//     "wt": "1500"
// }))

// __jsl_clearance_s: '1658906704.293|0|YYjzaJm5cKs%2B1v1GqTYAtpQjthM%3D'
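
The same search can also be written directly in Python with hashlib, which would avoid the JS runtime for this step. The following is a sketch of that alternative, not the approach used in the full code below; solve_clearance is a hypothetical name, and the demo parameters are made up, with ct computed locally so that a solution is known to exist.

```python
import hashlib
from itertools import product


def solve_clearance(params: dict) -> str:
    """Find the two-letter filler that makes the hash of the candidate equal ct."""
    prefix, suffix = params["bts"]
    # product(chars, repeat=2) walks the pairs in the same order as the nested JS loops.
    for first, second in product(params["chars"], repeat=2):
        candidate = f"{prefix}{first}{second}{suffix}"
        digest = hashlib.new(params["ha"], candidate.encode("utf-8")).hexdigest()
        if digest == params["ct"]:
            return candidate
    raise ValueError("no pair of characters reproduces ct")


# Made-up parameters for illustration only.
demo = {"bts": ["1700000000.000|0|abc", "def%3D"], "chars": "qwertyuiop", "ha": "sha1"}
demo["ct"] = hashlib.sha1(b"1700000000.000|0|abctydef%3D").hexdigest()
print(solve_clearance(demo))  # 1700000000.000|0|abctydef%3D
```

Unlike the JS version, this sketch raises an explicit error when no pair matches instead of failing later on an undefined result.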

Full code

Follow K哥爬虫 on bilibili, where the assistant posts video tutorials: https://space.bilibili.com/1622879192

Follow K哥爬虫 on GitHub for ongoing crawler-related code. Stars are welcome! https://github.com/kgepachong/

Only part of the key code is shown below, and it cannot be run directly. The complete code repository is at https://github.com/kgepachong/crawler/

 # =======================
# --*-- coding: utf-8 --*--
# @Time    : 2022/7/27
# @Author  : WeChat official account: K哥爬虫
# @FileName: jsl.py
# @Software: PyCharm
# =======================


import json
import re
import requests
import execjs


cookies = {}
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.198 Safari/537.36"
}
url = "Desensitized; for the complete code see https://github.com/kgepachong/crawler/"


def get_first_cookie():
    global cookies
    resp_first = requests.get(url=url, headers=headers)
    # Store the __jsluid_s cookie set by the server
    cookies.update(resp_first.cookies)
    # First-layer response body, encoded with AAEncode
    content_first = re.findall(r'cookie=(.*?);location', resp_first.text)[0]
    jsl_clearance_s = execjs.eval(content_first).split(';')[0]
    # Store the first __jsl_clearance_s value
    cookies['__jsl_clearance_s'] = jsl_clearance_s.split("=", 1)[1]


def get_second_cookie():
    global cookies
    # Request the second-layer response with the __jsluid_s and __jsl_clearance_s cookies
    resp_second = requests.get(url=url, headers=headers, cookies=cookies)
    # Extract the dictionary passed to go()
    go_params = re.findall(r';go\((.*?)\)</script>', resp_second.text)[0]
    params = json.loads(go_params)
    return params


def get_third_cookie():
    with open('jsl.js', 'r', encoding='utf-8') as f:
        jsl_js = f.read()
    params = get_second_cookie()
    # Pass the dictionary to the cookies() function in jsl.js
    third_cookie = execjs.compile(jsl_js).call('cookies', params)
    cookies.update(third_cookie)


def main():
    get_first_cookie()
    get_third_cookie()
    resp_third = requests.get(url=url, headers=headers, cookies=cookies)
    resp_third.encoding = 'utf-8'
    print(resp_third.text)


if __name__ == '__main__':
    main()



K哥爬虫

Research and sharing on Python web crawlers, JS reverse engineering, and related techniques.