Follow the WeChat public account K brother crawler, or join QQ exchange group 808574309, for ongoing posts on advanced crawlers, JS/Android reverse engineering, and other practical technical content!

What is Hook?

Hook literally means "钩子" (a hook) in Chinese. It originated as a system mechanism that Windows provides in place of the "interrupts" of DOS. The concept is very common in Windows desktop development, especially for event handling: once a specific system event is hooked, the program that installed the hook is notified by the system whenever that event occurs, and can respond to it immediately. Here it is easier to think of a Hook as a "hijack": with Hook techniques we can intercept an object, pull out part of its code and replace it with code we wrote ourselves, and modify parameters or return values, thereby controlling how it interacts with other objects.

In layman's terms, a Hook is a highway robbery. Ma Bangde leaves the city with his wife, eating hot pot and singing songs, when he is suddenly robbed by bandits. Zhang Mazi hijacks Ma Bangde's train, passes himself off as the county magistrate, and heads to Goose City with his men to take office. The Hook process is Zhang Mazi taking the place of Ma Bangde.

01.png

Hook in JS reverse engineering

In JavaScript reverse engineering, replacing an original function with our own can be called a Hook. The following simple code illustrates the process:

function a() {
  console.log("I'm a.");
}

a = function b() {
  console.log("I'm b.");
};

a()  // I'm b.

Directly overwriting the original function is the simplest approach. In the code above the function a is overwritten, so calling a again prints I'm b. To keep the original behavior of a available, store it in an intermediate variable first:

function a() {
  console.log("I'm a.");
}

var c = a;

a = function b() {
  console.log("I'm b.");
};

a()  // I'm b.
c()  // I'm a.

Now calling a prints I'm b. and calling c prints I'm a.

Overwriting the original function directly like this is usually only used for temporary debugging and is of limited practical use, but it helps in understanding how a Hook works. In real JS reverse engineering we use more advanced methods, such as Object.defineProperty().
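
The same idea can be sketched in a reusable form. The helper hookFunction below is hypothetical (it is not part of any library), and the object target with its method sign is made up for illustration. The wrapper observes the call and then delegates to the saved original, so the program keeps behaving as before:

```javascript
// Hypothetical helper: replace obj[name] with a wrapper that still calls the original.
function hookFunction(obj, name, onCall) {
  const original = obj[name];
  obj[name] = function (...args) {
    onCall(args);                        // observe the call first
    return original.apply(this, args);   // then run the original unchanged
  };
  // Return a function that undoes the hook.
  return function unhook() {
    obj[name] = original;
  };
}

// Made-up target object for illustration.
const target = {
  salt: 'x',
  sign(text) {
    return text + this.salt;
  },
};

const seen = [];
const unhook = hookFunction(target, 'sign', (args) => seen.push(args[0]));

const hooked = target.sign('abc');   // wrapper runs, original still produces the result
unhook();
const plain = target.sign('def');    // original restored, nothing is recorded

console.log(hooked, plain, seen);
```

Returning an unhook function is a design choice of this sketch: it makes it easy to remove the Hook once the call of interest has been found.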

Object.defineProperty()

Basic syntax: Object.defineProperty(obj, prop, descriptor). It defines a new property directly on an object, or modifies an existing property of an object. The three parameters are:

obj : The object on which the property is defined;

prop : The name of the property to define or modify;

descriptor : The property descriptor, which can contain the following keys:

| Attribute name | Default | Meaning |
| --- | --- | --- |
| get | undefined | Accessor descriptor: the function used to obtain the value of the target property |
| set | undefined | Accessor descriptor: the function used to set the value of the target property |
| value | undefined | Data descriptor: the value of the property |
| writable | false | Data descriptor: whether the value of the target property can be overwritten |
| enumerable | false | Whether the target property can be enumerated |
| configurable | false | Whether the target property can be deleted and whether its descriptor can be modified again |

Normally, an object is defined and assigned like this:

var people = {}
people.name = "Bob"
people["age"] = "18"

console.log(people)
// { name: 'Bob', age: '18' }

Use the Object.defineProperty() method:

var people = {}

Object.defineProperty(people, 'name', {
   value: 'Bob',
   writable: true  // whether the value can be overwritten
})

console.log(people.name)  // 'Bob'

people.name = "Tom"
console.log(people.name)  // 'Tom'
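
To see the defaults from the table in action, here is a minimal sketch in which only value is supplied. The object name person is made up for illustration:

```javascript
const person = {};

// Only `value` is given, so writable, enumerable and configurable default to false.
Object.defineProperty(person, 'name', { value: 'Bob' });

const descriptor = Object.getOwnPropertyDescriptor(person, 'name');
console.log(descriptor.writable, descriptor.enumerable, descriptor.configurable);
// false false false

// Assignment is ignored in sloppy mode and throws a TypeError in strict mode.
try {
  person.name = 'Tom';
} catch (e) {
  console.log('assignment rejected:', e instanceof TypeError);
}

console.log(person.name);          // Bob
console.log(Object.keys(person));  // []  (the property is not enumerable)
```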

In Hooks, accessor descriptors, namely get and set, are used most often.

get: the property's getter function, or undefined if there is no getter. It is called when the property is accessed. No arguments are passed, but this is bound (because of inheritance, this is not necessarily the object on which the property was defined), and the function's return value is used as the value of the property.

set: the property's setter function, or undefined if there is no setter. It is called when the property is assigned. It receives one argument, the new value being assigned, and this is bound to the object through which the assignment was made.

Use an example to demonstrate:

var people = {
  name: 'Bob',
};
var count = 18;

// Define an age property that returns the variable count when read
Object.defineProperty(people, 'age', {
  get: function () {
    console.log('Getting value!');
    return count;
  },
  set: function (val) {
    console.log('Setting value!');
    count = val + 1;
  },
});

console.log(people.age);
people.age = 20;
console.log(people.age);

Output:

Getting value!
18
Setting value!
Getting value!
21
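
The remark about this and inheritance in the description of get can be illustrated with a small sketch. The objects base and child and the property names are made up for illustration:

```javascript
const base = {};
Object.defineProperty(base, 'label', {
  get: function () {
    // `this` is whatever object the property was read through.
    return 'label of ' + this.id;
  },
});

// `child` inherits the accessor from `base` via the prototype chain.
const child = Object.create(base);
child.id = 'child';
base.id = 'base';

console.log(base.label);   // label of base
console.log(child.label);  // label of child
```

The getter is defined once on base, but when it is reached through child, this refers to child.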

With this method we can add code that runs whenever a certain value is set, such as debugger;, to pause execution there, and then use the call stack to find where a parameter is encrypted or generated. Note that when the website loads, our Hook code must run before the website's own code, otherwise the breakpoint will not trigger. This step can be called Hook code injection. Several mainstream injection methods are introduced below.
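
When pausing with debugger; is inconvenient, a variation is to record a stack trace in the setter instead. The sketch below is only an illustration under assumptions: watchProperty is a hypothetical helper, and store, token and buildToken are made-up stand-ins for a page object and the site's own code.

```javascript
// Hypothetical helper: watch assignments to obj[prop] and record who made them.
function watchProperty(obj, prop, records) {
  let current = obj[prop];
  Object.defineProperty(obj, prop, {
    configurable: true,
    enumerable: true,
    get: function () {
      return current;
    },
    set: function (val) {
      // The stack trace shows which function performed the assignment.
      records.push({ value: val, stack: new Error().stack });
      current = val;
    },
  });
}

// Made-up stand-ins for a page object and the site's own code.
const store = { token: '' };
const records = [];
watchProperty(store, 'token', records);

function buildToken() {
  store.token = 'abc123';   // the assignment we want to locate
}
buildToken();

console.log(records[0].value);                         // abc123
console.log(records[0].stack.includes('buildToken'));  // true
```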

Several methods of Hook injection

The following uses the __dfp value in the cookie of a certain Qiyi site to demonstrate how to inject a Hook.

1. Fiddler plug-in injection

On the homepage of the Qiyi site, you can see that its cookie contains a value named __dfp:

02.png

If a direct search for it turns up nothing, we can use a Hook to find where the __dfp value is generated. Write the following self-executing function:

(function () {
  'use strict';
  var cookieTemp = '';
  Object.defineProperty(document, 'cookie', {
    set: function (val) {
      if (val.indexOf('__dfp') != -1) {
        debugger;
      }
      console.log('Hook captured cookie set ->', val);
      cookieTemp = val;
      return val;
    },
    get: function () {
      return cookieTemp;
    },
  });
})();

if (val.indexOf('__dfp') != -1) {debugger;} searches the string for __dfp. A result of -1 means the substring does not appear; any other result means it does, and in that case the debugger statement pauses execution. Note that it cannot be written as if (val == '__dfp') {debugger}, because the value passed in val looks like __dfp=xxxxxxxxxx, so that comparison would never be true and execution would never pause.
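
The difference between the two conditions can be checked directly. The cookie string below is made up for illustration:

```javascript
const val = '__dfp=xxxxxxxxxx; path=/';   // made-up cookie assignment string

const isEqual = val == '__dfp';                  // compares the whole string
const found = val.indexOf('__dfp') != -1;        // substring search
const notFound = val.indexOf('missing') != -1;   // indexOf returns -1 here
const viaIncludes = val.includes('__dfp');       // ES2015 equivalent of the indexOf test

console.log(isEqual);      // false
console.log(found);        // true
console.log(notFound);     // false
console.log(viaIncludes);  // true
```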

With the code written, how do we use it, that is, how do we inject the Hook code? One recommended way is the Fiddler packet capture tool together with the plug-in by Programming Cat. The plug-in can be obtained by entering the keyword [Fiddler plug-in]. The principle is a process of intercept -> process -> release: Fiddler replaces the response, inserting the Hook code at the first line of the page source after intercepting the data. Because the Hook code is a self-executing function, it is bound to run first as soon as the page loads. After installation, turn on packet capture and click to enable Hook injection, as shown in the figure below:

03.png
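
The "process" step of intercept -> process -> release can be sketched as a plain string transformation. This is not the plug-in's actual code; injectHook, the sample page and the file name site.js are all assumptions made for illustration:

```javascript
// Hypothetical stand-in for the "process" step: put the Hook first in the document.
function injectHook(html, hookSource) {
  const tag = '<script>' + hookSource + '</script>';
  const headOpen = /<head(\s[^>]*)?>/i;
  if (headOpen.test(html)) {
    // Insert right after <head> so it runs before any script the page declares.
    return html.replace(headOpen, (match) => match + tag);
  }
  // No <head>: fall back to prepending to the whole response.
  return tag + html;
}

const hookSource = '(function () { console.log("hook runs first"); })();';
const page = '<html><head><script src="site.js"></script></head><body></body></html>';

const modified = injectHook(page, hookSource);
console.log(modified.indexOf('hook runs first') < modified.indexOf('site.js'));  // true
```

A replacement function is used in replace() so that any $ characters in the Hook source are inserted literally.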

After clearing the browser's cookies and reopening a page of the Qiyi site, you can see that execution pauses successfully, and some of the captured cookie values appear in the console. At this point val contains the __dfp value. The Call Stack panel on the right shows the chain of function calls; follow them one by one to find where __dfp is generated.

04.png

2. TamperMonkey injection

TamperMonkey, commonly known in Chinese as the "oil monkey" plug-in, is a free browser extension and the most popular userscript manager. It supports many mainstream browsers, including Chrome, Microsoft Edge, Safari, Opera, Firefox, UC Browser, 360 Browser, and QQ Browser, so a script is essentially written once and runs on all of them; in that sense browser-based scripts are truly cross-platform. Users can install scripts published by others on platforms such as GreasyFork and OpenUserJS, which offer a wide range of powerful features such as video parsing and ad removal.

We again use the cookie of a certain Qiyi site as the example to demonstrate how to write a TamperMonkey script. First install TamperMonkey from the browser's extension store (the installation process is not covered here). Then click the icon and choose to add a new script, or open the management panel and click the plus sign to create one, and write the following code:

// ==UserScript==
// @name         Cookie Hook
// @namespace    http://tampermonkey.net/
// @version      0.1
// @description  Cookie Hook script example
// @author       K哥爬虫
// @match        *
// @icon         https://www.kuaidaili.com/img/favicon.ico
// @grant        none
// @run-at       document-start
// ==/UserScript==

(function () {
  'use strict';
  var cookieTemp = '';
  Object.defineProperty(document, 'cookie', {
    set: function (val) {
      if (val.indexOf('__dfp') != -1) {
        debugger;
      }
      console.log('Hook captured cookie set ->', val);
      cookieTemp = val;
      return val;
    },
    get: function () {
      return cookieTemp;
    },
  });
})();

05.png

The body of the self-executing JavaScript function is the same as before. What deserves attention here is the comment block at the top: every option in it is meaningful. For the complete list, refer to the TamperMonkey official documentation. The more common and important options are listed below (pay particular attention to @match, @include and @run-at):

| Option | Meaning |
| --- | --- |
| @name | The name of the script |
| @namespace | Namespace, used to distinguish scripts with the same name; usually the author's name or a URL |
| @version | Script version; TamperMonkey reads this version number when updating the script |
| @description | Describes what the script is for |
| @author | The name of the author of the script |
| @match | Matches the regular expression from the beginning of the string, and only matching URLs execute the script. For example, * matches everything and https://www.baidu.com/* matches Baidu. Compare the re.match() method in Python's re module. Multiple instances are allowed |
| @include | Similar to @match: only matching URLs execute the script, but @include does not match from the beginning of the string. For example, *://*baidu.com/* matches Baidu. For the specific differences, refer to the TamperMonkey official documentation |
| @icon | The script's icon |
| @grant | Specifies the permissions the script needs. With the corresponding permissions, the script can call the APIs provided by the TamperMonkey extension to interact with the browser. If set to none, the sandbox is not used and the script runs directly in the page's environment, where most TamperMonkey APIs are unavailable. If not specified, TamperMonkey adds several of the most commonly used APIs by default |
| @require | If the script depends on other JS libraries, use @require to import them; they are loaded before the script runs |
| @run-at | When the script is injected. This option is the key to whether the Hook succeeds. There are five values: document-start: injected when the page starts loading; document-body: injected when the body appears; document-end: executed when or after DOMContentLoaded is dispatched; document-idle: executed after loading is complete (the default); context-menu: executed when the script is clicked in the browser context menu. For Hooks it is generally set to document-start |

Clear the cookies, enable the TamperMonkey script, and open the homepage of the Qiyi site again. Execution pauses successfully, and you can likewise follow the call stack to trace where the value of __dfp comes from.

06.png

3. Browser plug-in injection

The formal name for a browser plug-in is browser extension (Extension). Extensions enhance the browser's functionality and can also help us Hook. Writing one is not complicated. Taking Chrome as an example, the project only needs a manifest.json file, which holds all extension-related configuration and must be placed in the root directory. Of its fields, manifest_version, name, and version are required. For in-depth study, refer to Xiao-Ming's blog and Google's official documentation. Note that a Firefox extension may not run on other browsers, whereas a Chrome extension runs not only in Chrome but also in the WebKit-based Chinese browsers, such as 360 Speed Browser, 360 Secure Browser, Sogou Browser, and QQ Browser. We again use the cookie of a certain Qiyi site to demonstrate how to write a Chrome Hook extension.

Create a new manifest.json file:

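{
    "name": "Cookie Hook",          // extension name
    "version": "1.0",               // extension version
    "description": "Cookie Hook",   // extension description
    "manifest_version": 2,          // manifest version, must be 2 or 3
    "permissions": ["tabs"],        // requested permissions ("tabs" means tab access); this is a top-level key
    "content_scripts": [{
        "matches": ["<all_urls>"],  // match all URLs
        "js": ["cookie_hook.js"],   // file name and path of the injected code; multiple files are injected in order
        "all_frames": true,         // also inject the content script into every frame of the page
        "run_at": "document_start"  // when the code is injected
    }]
}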

Create a new cookie_hook.js file:

var hook = function() {
    'use strict';
    var cookieTemp = '';
    Object.defineProperty(document, 'cookie', {
        set: function(val) {
            if (val.indexOf('__dfp') != -1) {
                debugger;
            }
            console.log('Hook captured cookie set ->', val);
            cookieTemp = val;
            return val;
        },
        get: function() {
            return cookieTemp;
        },
    });
}
var script = document.createElement('script');
script.textContent = '(' + hook + ')()';
(document.head || document.documentElement).appendChild(script);
script.parentNode.removeChild(script);
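
The last four lines deserve a note. Chrome content scripts run in an isolated world, separate from the page's own JavaScript, so the hook is turned into source text and handed to the page through a script element. The sketch below shows only the string-building step; the hook body and the __hookRan flag are made up, and new Function merely stands in for the injected script element so the example runs outside a browser:

```javascript
// A made-up hook body; in cookie_hook.js this is the function that redefines document.cookie.
var hook = function () {
  globalThis.__hookRan = true;
};

// Concatenating a function with a string calls hook.toString(), yielding its source text.
var source = '(' + hook + ')()';
console.log(source.startsWith('(function'));  // true

// Evaluating that text runs the function body, which is what the <script> element does in the page.
// new Function is used here only as a stand-in for the injected <script>.
new Function(source)();
console.log(globalThis.__hookRan);  // true
```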

Put the two files in the same folder, open Chrome's extensions page, turn on Developer mode, click "Load unpacked", and select the folder you created:

07.png

Open a page of the Qiyi site, clear the cookies, and reload. Execution again pauses successfully, and tracing the call stack leads to where the value is generated:

08.png

Summary of commonly used hook codes

Besides the Object.defineProperty() method above, you can also capture the relevant interface directly and rewrite it. Common Hook code is listed below. Note: only the key Hook code is given; it must be adapted to whichever injection method you use.

Hook Cookie

A Cookie Hook is used to locate where a key parameter in the Cookie is generated. The following code inserts a breakpoint when the __dfp keyword is matched in the Cookie:

(function () {
  'use strict';
  var cookieTemp = '';
  Object.defineProperty(document, 'cookie', {
    set: function (val) {
      if (val.indexOf('__dfp') != -1) {
        debugger;
      }
      console.log('Hook captured cookie set ->', val);
      cookieTemp = val;
      return val;
    },
    get: function () {
      return cookieTemp;
    },
  });
})();

// Alternative using the legacy __defineSetter__ / __defineGetter__ methods
(function () {
    'use strict';
    // Holds the last cookie string that was assigned
    var cookieTemp = '';
    document.__defineSetter__('cookie', function (cookie) {
        if (cookie.indexOf('__dfp') != -1) {
            debugger;
        }
        cookieTemp = cookie;
    });
    document.__defineGetter__('cookie', function () {
        return cookieTemp;
    });
})();
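
Both variants above keep the cookie in a local variable, so the page no longer reads or writes real cookies while the Hook is active. If that interferes with the site, one possible variation is to forward to the inherited accessor. The sketch below runs outside a browser: FakeDocument is a stand-in and hookAccessor is a hypothetical helper. In a browser you would pass document instead, on the assumption that the cookie accessor is defined on its prototype chain:

```javascript
// Stand-in for the browser: like Document.prototype, the class defines `cookie` as an accessor.
class FakeDocument {
  constructor() {
    this.jar = [];
  }
  get cookie() {
    return this.jar.join('; ');
  }
  set cookie(val) {
    this.jar.push(val.split(';')[0]);
  }
}
const doc = new FakeDocument();

// Hypothetical helper: hook obj[prop] but forward reads and writes to the inherited accessor.
function hookAccessor(obj, prop, onSet) {
  let proto = Object.getPrototypeOf(obj);
  let native;
  // Walk up the prototype chain until the accessor is found.
  while (proto && !(native = Object.getOwnPropertyDescriptor(proto, prop))) {
    proto = Object.getPrototypeOf(proto);
  }
  if (!native) throw new Error(prop + ' accessor not found');
  Object.defineProperty(obj, prop, {
    configurable: true,
    get: function () { return native.get.call(this); },
    set: function (val) {
      onSet(val);
      native.set.call(this, val);
    },
  });
}

const captured = [];
hookAccessor(doc, 'cookie', (val) => {
  if (val.indexOf('__dfp') != -1) {
    captured.push(val);   // in the browser, a debugger; statement would go here
  }
});

doc.cookie = 'a=1';
doc.cookie = '__dfp=xyz; path=/';
console.log(doc.cookie);  // a=1; __dfp=xyz
console.log(captured);    // [ '__dfp=xyz; path=/' ]
```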

Hook Header

A Header Hook is used to locate where a key parameter in the request Header is generated. The following code inserts a breakpoint when the Authorization header is set:

(function () {
    var org = window.XMLHttpRequest.prototype.setRequestHeader;
    window.XMLHttpRequest.prototype.setRequestHeader = function (key, value) {
        if (key == 'Authorization') {
            debugger;
        }
        return org.apply(this, arguments);
    };
})();
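
HTTP header names are case-insensitive, so a site may set authorization in lower case, which the strict comparison above would miss. A sketch of a case-insensitive variant follows. FakeXHR is a stand-in for window.XMLHttpRequest so the example runs outside a browser, and the token value is made up:

```javascript
// Stand-in for window.XMLHttpRequest so the sketch runs outside a browser.
class FakeXHR {
  constructor() { this.headers = {}; }
  setRequestHeader(key, value) { this.headers[key] = value; }
}

const hits = [];
(function (XHR) {
  const org = XHR.prototype.setRequestHeader;
  XHR.prototype.setRequestHeader = function (key, value) {
    // Header names are case-insensitive, so compare in lower case.
    if (String(key).toLowerCase() === 'authorization') {
      hits.push(value);   // in the browser, a debugger; statement would go here
    }
    return org.apply(this, arguments);
  };
})(FakeXHR);

const xhr = new FakeXHR();
xhr.setRequestHeader('authorization', 'Bearer made-up-token');
xhr.setRequestHeader('Accept', 'application/json');
console.log(hits);  // [ 'Bearer made-up-token' ]
```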

Hook URL

A URL Hook is used to locate where a key parameter in the request URL is generated. The following code inserts a breakpoint when the requested URL contains the login keyword:

(function () {
    var open = window.XMLHttpRequest.prototype.open;
    window.XMLHttpRequest.prototype.open = function (method, url, async) {
        if (url.indexOf("login") != -1) {
            debugger;
        }
        return open.apply(this, arguments);
    };
})();
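
The Hook above only sees requests made through XMLHttpRequest. If a site sends requests with fetch instead, the same pattern can be applied to it. In the sketch below, scope and its stub fetch are stand-ins so that no network request is made; in a browser you would presumably pass window instead. The URLs are made up:

```javascript
// Stub standing in for the browser's fetch so the sketch makes no network request.
const scope = {
  fetch: function (input, init) {
    return Promise.resolve({ ok: true, url: String(input) });
  },
};

const matched = [];
(function (root) {
  const orgFetch = root.fetch;
  root.fetch = function (input, init) {
    // input may be a string, a URL object or a Request object.
    const url = typeof input === 'string' ? input : (input && input.url) || String(input);
    if (url.indexOf('login') != -1) {
      matched.push(url);   // in the browser, a debugger; statement would go here
    }
    return orgFetch.apply(this, arguments);
  };
})(scope);

scope.fetch('https://example.com/api/login?user=1');
scope.fetch('https://example.com/api/list');
console.log(matched);  // [ 'https://example.com/api/login?user=1' ]
```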

Hook JSON.stringify

The JSON.stringify() method converts a JavaScript value into a JSON string, and may be encountered in the encryption process of some sites. The following code inserts a breakpoint whenever JSON.stringify() is called:

(function() {
    var stringify = JSON.stringify;
    JSON.stringify = function(params) {
        console.log("Hook JSON.stringify ——> ", params);
        debugger;
        return stringify.apply(this, arguments);  // forward replacer and space as well
    }
})();

Hook JSON.parse

The JSON.parse() method converts a JSON string into an object, and may be encountered in the encryption process of some sites. The following code inserts a breakpoint whenever JSON.parse() is called:

(function() {
    var parse = JSON.parse;
    JSON.parse = function(params) {
        console.log("Hook JSON.parse ——> ", params);
        debugger;
        return parse.apply(this, arguments);  // forward the reviver as well
    }
})();
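
On busy pages, JSON.stringify and JSON.parse are called very often, so an unconditional debugger; can pause many times before the interesting call. A possible refinement is to react only to payloads containing a field of interest, and to keep a way to restore the native function. The field name sign below is made up:

```javascript
const logged = [];
const restore = (function () {
  const stringify = JSON.stringify;
  JSON.stringify = function (value, replacer, space) {
    // Only react to payloads that carry the field of interest ("sign" is a made-up name).
    if (value && typeof value === 'object' && 'sign' in value) {
      logged.push(value);   // in the browser, a debugger; statement would go here
    }
    // Forward every argument so replacer and space keep working.
    return stringify.apply(JSON, arguments);
  };
  return function () { JSON.stringify = stringify; };
})();

const pretty = JSON.stringify({ sign: 'abc' }, null, 2);
const plainText = JSON.stringify({ other: 1 });
restore();   // put the native function back

console.log(pretty);
console.log(logged.length);  // 1
```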

Hook eval

eval() evaluates a string as JavaScript and executes it as script code. If the argument is an expression, eval() evaluates the expression; if it is one or more JavaScript statements, eval() executes them. It is often used to run JS dynamically. After the following code is executed, every eval() call prints the JS source about to be executed to the console:

(function() {
    // Save the original method
    window.__cr_eval = window.eval;
    // Override eval
    var myeval = function(src) {
        console.log(src);
        console.log("=============== eval end ===============");
        debugger;
        return window.__cr_eval(src);
    }
    // Defeat checks in the page's JS for the native-function signature
    var _myeval = myeval.bind(null);
    _myeval.toString = window.__cr_eval.toString;
    Object.defineProperty(window, 'eval', {
        value: _myeval
    });
})();
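
The bind and toString lines in the Hook above exist because some pages check whether a built-in has been replaced. A sketch of such a check is shown below; looksNative is a hypothetical name, and real sites may test differently:

```javascript
// A typical check a page might run: native functions stringify to "[native code]".
function looksNative(fn) {
  return Function.prototype.toString.call(fn).includes('[native code]');
}

const replacement = function (src) { return src; };
const bound = replacement.bind(null);

console.log(looksNative(eval));         // true
console.log(looksNative(replacement));  // false: the source text is exposed
console.log(looksNative(bound));        // true: bound functions hide their source
```

This is why the Hook installs the bound copy rather than the plain replacement function.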

Hook Function

After the following code is executed, every call to the Function constructor prints the JS source about to be executed to the console:

(function() {
    // Save the original method
    window.__cr_fun = window.Function;
    // Override Function
    var myfun = function() {
        var args = Array.prototype.slice.call(arguments, 0, -1).join(","),
            src = arguments[arguments.length - 1];
        console.log(src);
        console.log("=============== Function end ===============");
        debugger;
        return window.__cr_fun.apply(this, arguments);
    }
    // Defeat checks in the page's JS for the native-function signature
    myfun.toString = function() {
        return window.__cr_fun + ""
    }
    Object.defineProperty(window, 'Function', {
        value: myfun
    });
})();
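
The Hook treats the last argument as the source to print because of how the Function constructor takes its arguments. A small illustration with made-up values:

```javascript
// With the Function constructor, every argument but the last is a parameter name
// and the last argument is the function body.
const add = new Function('a', 'b', 'return a + b');
console.log(add(1, 2));  // 3

// The same split the hook performs on `arguments`, shown on a plain array.
const received = ['a', 'b', 'return a + b'];
const params = received.slice(0, -1).join(',');
const body = received[received.length - 1];
console.log(params);  // a,b
console.log(body);    // return a + b
```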


K哥爬虫

Research and sharing on Python web crawlers, JS reverse engineering, and related techniques.