How to implement packet capture in Node.js

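I want something similar to Fiddler: how can I capture HTTP requests with Node.js? For example, when I visit https://www.baidu.com in a browser, how can a running Node.js process intercept that request?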

2 answers
  • To simply send a request to Baidu, you can follow the http.request example from the official Node.js documentation:

    const http = require('http');

    const options = {
      hostname: 'www.baidu.com',
      port: 80,
      path: '/',
      method: 'GET',
    };

    const req = http.request(options, (res) => {
      console.log(`STATUS: ${res.statusCode}`);
      console.log(`HEADERS: ${JSON.stringify(res.headers)}`);
      res.setEncoding('utf8');
      // Log the size of each body chunk as it arrives.
      res.on('data', (chunk) => {
        console.log(`BODY length: ${Buffer.byteLength(chunk)}`);
      });
      res.on('end', () => {
        console.log('No more data in response.');
      });
    });

    req.on('error', (e) => {
      console.error(`problem with request: ${e.message}`);
    });

    // A GET has no body to write; end() finishes and sends the request.
    req.end();

    The console prints:

    STATUS: 200
    HEADERS: {"accept-ranges":"bytes","cache-control":"no-cache","content-length":"9508","content-type":"text/html","date":"Thu, 05 May 2022 09:23:50 GMT","p3p":"CP=\" OTI DSP COR IVA OUR IND COM \", CP=\" OTI DSP COR IVA OUR IND COM \"","pragma":"no-cache","server":"BWS/1.1","set-cookie":["BAIDUID=D9375A468F8A7724E44444A64FC1F7AC:FG=1; expires=Thu, 31-Dec-37 23:55:55 GMT; max-age=2147483647; path=/; domain=.baidu.com","BIDUPSID=D9375A468F8A7724E44444A64FC1F7AC; expires=Thu, 31-Dec-37 23:55:55 GMT; max-age=2147483647; path=/; domain=.baidu.com","PSTM=1651742630; expires=Thu, 31-Dec-37 23:55:55 GMT; max-age=2147483647; path=/; domain=.baidu.com","BAIDUID=D9375A468F8A77249C5F79F1AF5BB856:FG=1; max-age=31536000; expires=Fri, 05-May-23 09:23:50 GMT; domain=.baidu.com; path=/; version=1; comment=bd"],"traceid":"1651742630024386996210573671075705266080","vary":"Accept-Encoding","x-frame-options":"sameorigin","x-ua-compatible":"IE=Edge,chrome=1","connection":"close"}
    BODY length: 4548
    BODY length: 4960
    No more data in response.

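    req already holds the request information and res holds the response information; almost anything related to the current request can be read from these two objects.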

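  • If the goal is APM, that is, monitoring every outgoing request made by a Node service, you can patch the http module before the application starts. Example code: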

    const http = require('http');

    const originalRequest = http.request;

    // Wrap http.request so every outgoing request logs its response status.
    // Assumes callers pass an options object with a hostname, as below.
    http.request = function (options) {
      const args = Array.prototype.slice.call(arguments);
      const callback = args[args.length - 1];
      if (typeof callback === 'function') {
        args[args.length - 1] = function proxyCallback(res) {
          console.log(`request ${options.hostname}, STATUS: ${res.statusCode}`);
          // Preserve the receiver and arguments for the caller's callback.
          return callback.apply(this, arguments);
        };
      }
      return originalRequest.apply(this, args);
    };

    // Actual requests

    const hosts = ['www.baidu.com', 'www.taobao.com', 'www.tmall.com'];

    hosts.forEach((hostname) => {
      const options = {
        hostname: hostname,
        port: 80,
        path: '/',
        method: 'GET',
      };

      const req = http.request(options, (res) => {
        res.setEncoding('utf8');
        // The body is never read here, so 'end' does not fire; call
        // res.resume() or add a 'data' handler to consume the response.
        res.on('end', () => {
          console.log('request finished.');
        });
      });

      req.on('error', (e) => {
        console.error(`problem with request: ${e.message}`);
      });

      // A GET has no body to write; end() finishes and sends the request.
      req.end();
    });

    Output:

    request www.baidu.com, STATUS: 200
    request www.taobao.com, STATUS: 301
    request www.tmall.com, STATUS: 302
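  • If you want to build a proxy server, it is better to use a mature module or existing software. The principle of a proxy is simple: the client sends its request to the proxy server, the proxy server forwards it to the target server, and the response travels back to the client along the same path.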
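The forwarding flow described above can be sketched with the built-in http module alone. This is only a minimal illustration, not a replacement for a mature proxy: it handles plain HTTP requests only, copies headers through unchanged, and the names buildUpstreamOptions and createLoggingProxy are hypothetical helpers invented for this example.

```javascript
const http = require('http');

// Hypothetical helper: derive the upstream request options from a request
// received by the proxy. A client talking to a forward proxy sends an
// absolute URL in the request line, so req.url can be parsed with URL.
function buildUpstreamOptions(clientReq) {
  const target = new URL(clientReq.url);
  return {
    hostname: target.hostname,
    port: target.port || 80,
    path: target.pathname + target.search,
    method: clientReq.method,
    headers: clientReq.headers,
  };
}

// Hypothetical helper: create a proxy server that logs each request and
// response status, then relays the response back to the client.
function createLoggingProxy() {
  return http.createServer((clientReq, clientRes) => {
    let options;
    try {
      options = buildUpstreamOptions(clientReq);
    } catch (e) {
      // req.url was not an absolute URL, so this was not a proxy request.
      clientRes.writeHead(400);
      clientRes.end('expected an absolute URL');
      return;
    }

    console.log(`${clientReq.method} ${clientReq.url}`);

    const upstreamReq = http.request(options, (upstreamRes) => {
      console.log(`  -> STATUS: ${upstreamRes.statusCode}`);
      clientRes.writeHead(upstreamRes.statusCode, upstreamRes.headers);
      upstreamRes.pipe(clientRes);
    });

    upstreamReq.on('error', (e) => {
      clientRes.writeHead(502);
      clientRes.end(`proxy error: ${e.message}`);
    });

    // Stream the client's request body (if any) to the target server.
    clientReq.pipe(upstreamReq);
  });
}
```

To try it, call createLoggingProxy().listen(8888) (the port is an arbitrary choice) and point the browser's or system's HTTP proxy setting at 127.0.0.1:8888. Requests to https:// sites will not show up here, because browsers send those through the proxy with the CONNECT method instead.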

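Fiddler's packet capture is also based on a proxy. Your request goes to Fiddler, and Fiddler forwards it to the server; likewise, the server's response goes to Fiddler first and is then forwarded to the client. You can run a Node service in the same role and have the client send its traffic through that Node proxy to achieve the same thing.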
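For an https:// address such as the one in the question, the browser asks the proxy to open a tunnel with the CONNECT method. A minimal sketch of handling that in Node follows; parseConnectTarget and attachHttpsTunnel are hypothetical names made up for this example. Note the limitation: a plain tunnel only reveals the target host and port, since the traffic inside is encrypted. Tools like Fiddler read HTTPS content by having the client trust a root certificate they generate and then decrypting the traffic themselves, which this sketch does not attempt.

```javascript
const http = require('http');
const net = require('net');

// Hypothetical helper: split the "host:port" authority of a CONNECT request.
// Falls back to port 443 when no port is given.
function parseConnectTarget(authority) {
  const idx = authority.lastIndexOf(':');
  const host = idx === -1 ? authority : authority.slice(0, idx);
  const port = idx === -1 ? 443 : Number(authority.slice(idx + 1));
  return { host, port };
}

// Hypothetical helper: make an http.Server act as a tunnelling proxy by
// handling the 'connect' event and logging each tunnel target.
function attachHttpsTunnel(proxyServer) {
  proxyServer.on('connect', (req, clientSocket, head) => {
    const { host, port } = parseConnectTarget(req.url);
    console.log(`CONNECT ${host}:${port}`);

    const upstream = net.connect(port, host, () => {
      // Tell the client the tunnel is ready, then relay bytes both ways.
      clientSocket.write('HTTP/1.1 200 Connection Established\r\n\r\n');
      upstream.write(head);
      upstream.pipe(clientSocket);
      clientSocket.pipe(upstream);
    });

    // If either side fails, tear down the other side as well.
    upstream.on('error', () => clientSocket.destroy());
    clientSocket.on('error', () => upstream.destroy());
  });
  return proxyServer;
}
```

A possible way to run it is attachHttpsTunnel(http.createServer()).listen(8888), again with an arbitrary port, and then set 127.0.0.1:8888 as the HTTPS proxy in the browser. Visiting https://www.baidu.com should then log a CONNECT line for www.baidu.com:443.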
