Concepts

Active and passive modes of the FTP data channel

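Active mode: the server uses its designated data port (20 by default) to actively connect to the port announced by the client, and sends data to the client over that connection.
The client sends "PORT xxx,xxx,xxx,xxx,ppp,ppp" and waits for the server to open the data connection.
The server replies "200" to accept; at that point the data channel can be established.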

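Passive mode: at the client's request, the server opens a listening data port and passively waits for the client to connect; data is then transferred over that connection.

The client sends "PASV" to ask the server to use passive mode.
The server replies "227 Entering Passive Mode (xxx,xxx,xxx,xxx,ppp,ppp)" to accept, telling the client the IP address and port it is listening on.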
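Both the PORT command and the 227 reply carry the endpoint as six comma-separated decimal bytes: four for the IPv4 address and two for the port (port = p1 * 256 + p2). A minimal user-space sketch of decoding that form follows; the helper name `ftp_parse_addrport` is hypothetical and is not part of LVS.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: decode "h1,h2,h3,h4,p1,p2" into a host-order
 * IPv4 address and port. Returns 1 on success, 0 on malformed input. */
static int ftp_parse_addrport(const char *s, uint32_t *ip, uint16_t *port)
{
    unsigned int v[6];
    int i;

    assert(s && ip && port);
    if (sscanf(s, "%u,%u,%u,%u,%u,%u",
               &v[0], &v[1], &v[2], &v[3], &v[4], &v[5]) != 6)
        return 0;
    /* Every field is a single byte. */
    for (i = 0; i < 6; i++)
        if (v[i] > 255)
            return 0;

    *ip = (v[0] << 24) | (v[1] << 16) | (v[2] << 8) | v[3];
    *port = (uint16_t)((v[4] << 8) | v[5]);
    return 1;
}
```

For example, "192,168,1,10,4,1" decodes to 192.168.1.10, port 4 * 256 + 1 = 1025.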

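In short: if the server connects to the client, it is active mode; if the server waits for the client to connect (the client connects to the server), it is passive mode.
FTP has active and passive modes while protocols such as SSH do not, because FTP transfers data over a separate connection on a different port from the control channel.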

How LVS handles the FTP data channel

out2in

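As described above, in active mode the client uses the PORT command to send the server the IP address and port it is listening on, so the data channel information can be captured in the out2in direction. Currently only NAT mode needs FTP ALG support. The DNAT function parses the FTP payload, finds the PORT command, and adds a connection tracking entry for the data channel.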

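Suppose the IP extracted from the payload is dataip and the port is dataport, and the connection tracking entry of the control channel is cn. The new entry's 7-tuple (the protocol, TCP, plus the six fields below) is: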

caddr: dataip
cport: dataport
vaddr: cn->vaddr
vport: cn->vport - 1 (i.e. 20)
daddr: cn->daddr
dport: cn->dport - 1 (i.e. 20)
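The derivation of these fields can be sketched in user space as follows. `struct conn_tuple` is a hypothetical, simplified stand-in for the address fields of `struct ip_vs_conn`, and values are kept in host byte order for readability (the kernel stores them in network order).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the address fields of
 * struct ip_vs_conn; all values are in host byte order here. */
struct conn_tuple {
    uint32_t caddr, vaddr, daddr;
    uint16_t cport, vport, dport;
};

/* Derive the expected active-mode data connection from the control
 * connection cn and the endpoint announced in the PORT command. */
static struct conn_tuple active_data_tuple(const struct conn_tuple *cn,
                                           uint32_t dataip, uint16_t dataport)
{
    struct conn_tuple n;

    assert(cn);
    n.caddr = dataip;                      /* client side comes from PORT */
    n.cport = dataport;
    n.vaddr = cn->vaddr;                   /* same virtual service address */
    n.vport = (uint16_t)(cn->vport - 1);   /* 21 -> 20 */
    n.daddr = cn->daddr;                   /* same real server */
    n.dport = (uint16_t)(cn->dport - 1);   /* 21 -> 20 */
    return n;
}
```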

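After the entry is created, its state is set to IP_VS_TCP_S_LISTEN and its timeout timer is set to the timeout of that state. Active mode needs no seq correction, because the address in the PORT payload is not rewritten and the packet length therefore stays the same.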

in2out

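As described above, in passive mode the server sends the client the IP address and port it is listening on, so the data channel information can be captured in the in2out direction. Currently only NAT mode needs FTP ALG support. Because this is the in2out direction, the data channel's connection tracking entry is handled in the SNAT function, the reverse of DNAT.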

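Suppose the IP extracted from the payload is dataip and the port is dataport, and the connection tracking entry of the control channel is cn. The new entry's 7-tuple (the protocol, TCP, plus the six fields below) is: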

caddr: cn->caddr
cport: 0
vaddr: cn->vaddr
vport: dataport
daddr: dataip
dport: dataport

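As the entry above shows, LVS expects the client's data connection to be addressed to the VIP as well, which differs from the dataip announced by the real server. The address information (IP and port) in the FTP payload therefore has to be rewritten, so that the client's data connection hits this entry. In addition, the source port the client will use is not yet known, so cport is set to 0 and the IP_VS_CONN_F_NO_CPORT flag is set. The flag means that cport is filled in once it is known, which is when the client's SYN for the data channel hits this entry.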
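The wildcard client port can be illustrated with a small user-space sketch. Everything here is an assumption for illustration: `struct data_conn`, `F_NO_CPORT` and both function names are hypothetical, and the real IPVS lookup is hash-based rather than a direct comparison.

```c
#include <assert.h>
#include <stdint.h>

#define F_NO_CPORT 0x1u   /* illustrative flag value, not the kernel's */

/* Hypothetical, simplified data-connection entry (host byte order). */
struct data_conn {
    uint32_t caddr, vaddr;
    uint16_t cport, vport;
    unsigned int flags;
};

/* Return 1 if a packet from src:sport to dst:dport belongs to c.
 * While F_NO_CPORT is set, any source port matches. */
static int data_conn_match(const struct data_conn *c,
                           uint32_t src, uint16_t sport,
                           uint32_t dst, uint16_t dport)
{
    assert(c);
    if (c->caddr != src || c->vaddr != dst || c->vport != dport)
        return 0;
    return (c->flags & F_NO_CPORT) ? 1 : (c->cport == sport);
}

/* Pin the client port once the first SYN has matched the entry. */
static void data_conn_fill_cport(struct data_conn *c, uint16_t sport)
{
    assert(c);
    if (c->flags & F_NO_CPORT) {
        c->flags &= ~F_NO_CPORT;
        c->cport = sport;
    }
}
```

After `data_conn_fill_cport` runs, only packets from the recorded source port match, so the entry behaves like any other fully specified connection.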

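Rewriting application-layer data can change the payload length and therefore the TCP sequence numbers. LVS relies on netfilter's seqadj mechanism to handle this and sets the IP_VS_CONN_F_NFCT flag, which means the conntrack entry must not be deleted.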
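The idea behind sequence adjustment can be sketched as below. This is a simplified illustration that tracks a single rewrite, not netfilter's actual implementation; `struct seq_adj` and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical record of one payload rewrite in one direction. */
struct seq_adj {
    uint32_t pos;     /* sequence number of the rewritten segment */
    int32_t  delta;   /* new payload length minus old payload length */
};

/* Wrap-safe "a is after b" comparison for 32-bit sequence numbers. */
static int seq_after(uint32_t a, uint32_t b)
{
    return (int32_t)(b - a) < 0;
}

/* Sequence number to put on the wire for a segment from the sender
 * whose stream was rewritten: segments after pos shift by delta. */
static uint32_t adjust_seq(const struct seq_adj *adj, uint32_t seq)
{
    assert(adj);
    return seq_after(seq, adj->pos) ? seq + (uint32_t)adj->delta : seq;
}

/* ACK number to hand back to that sender: undo the shift so it
 * refers to the bytes the sender actually transmitted. */
static uint32_t adjust_ack(const struct seq_adj *adj, uint32_t ack)
{
    assert(adj);
    return seq_after(ack, adj->pos) ? ack - (uint32_t)adj->delta : ack;
}
```

If the rewritten 227 reply grows by 3 bytes, later segments from the server are shifted forward by 3 on their way to the client, and the client's ACKs are shifted back by 3 before they reach the server.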

Key function analysis

out2in

/*
 * Look at incoming ftp packets to catch the PASV/PORT command
 * (outside-to-inside).
 *
 * The incoming packet having the PORT command should be something like
 *      "PORT xxx,xxx,xxx,xxx,ppp,ppp\n".
 * xxx,xxx,xxx,xxx is the client address, ppp,ppp is the client port number.
 * In this case, we create a connection entry using the client address and
 * port, so that the active ftp data connection from the server can reach
 * the client.
 */
static int ip_vs_ftp_in(struct ip_vs_app *app, struct ip_vs_conn *cp,
            struct sk_buff *skb, int *diff)
{
    struct iphdr *iph;
    struct tcphdr *th;
    char *data, *data_start, *data_limit;
    char *start, *end;
    union nf_inet_addr to;
    __be16 port;
    struct ip_vs_conn *n_cp;

    /* no diff required for incoming packets */
    *diff = 0;

#ifdef CONFIG_IP_VS_IPV6
    /* This application helper doesn't work with IPv6 yet,
     * so turn this into a no-op for IPv6 packets
     */
    if (cp->af == AF_INET6)
        return 1;
#endif

    /* Only useful for established sessions */
    if (cp->state != IP_VS_TCP_S_ESTABLISHED)
        return 1;

    /* Linear packets are much easier to deal with. */
    if (!skb_make_writable(skb, skb->len))
        return 0;

    /*
     * Detecting whether it is passive
     */
    iph = ip_hdr(skb);
    th = (struct tcphdr *)&(((char *)iph)[iph->ihl*4]);

    /* Since there may be OPTIONS in the TCP packet and the HLEN is
       the length of the header in 32-bit multiples, it is accurate
       to calculate data address by th+HLEN*4 */
    data = data_start = (char *)th + (th->doff << 2);
    data_limit = skb_tail_pointer(skb);
    // Check for passive mode: 6 is the length of "PASV\r\n"; this is a brute-force scan
    while (data <= data_limit - 6) {
        if (strncasecmp(data, "PASV\r\n", 6) == 0) {
            /* Passive mode on */
            IP_VS_DBG(7, "got PASV at %td of %td\n",
                  data - data_start,
                  data_limit - data_start);
            cp->app_data = &ip_vs_ftp_pasv;
            return 1;
        }
        data++;
    }

    /*
     * To support virtual FTP server, the scenerio is as follows:
     *       FTP client ----> Load Balancer ----> FTP server
     * First detect the port number in the application data,
     * then create a new connection entry for the coming data
     * connection.
     * This case is active mode.
     */
    if (ip_vs_ftp_get_addrport(data_start, data_limit,
                   CLIENT_STRING, sizeof(CLIENT_STRING)-1,
                   ' ', '\r', &to.ip, &port,
                   &start, &end) != 1)
        return 1;

    IP_VS_DBG(7, "PORT %pI4:%d detected\n", &to.ip, ntohs(port));

    /* Passive mode off */
    cp->app_data = NULL;

    /*
     * Now update or create a connection entry for it
     */
    IP_VS_DBG(7, "protocol %s %pI4:%d %pI4:%d\n",
          ip_vs_proto_name(iph->protocol),
          &to.ip, ntohs(port), &cp->vaddr.ip, 0);

    {
        struct ip_vs_conn_param p;
        // Build the request-direction lookup key for the active-mode data
        // connection; its virtual port is cp->vport - 1 (normally 20).
        ip_vs_conn_fill_param(cp->ipvs, AF_INET,
                      iph->protocol, &to, port, &cp->vaddr,
                      htons(ntohs(cp->vport)-1), &p);// vport = cp->vport - 1
        n_cp = ip_vs_conn_in_get(&p);
        if (!n_cp) {
            /* This is ipv4 only; the same real server is used. */
            n_cp = ip_vs_conn_new(&p, AF_INET, &cp->daddr,
                          htons(ntohs(cp->dport)-1),// dport = cp->dport - 1
                          IP_VS_CONN_F_NFCT, cp->dest,
                          skb->mark);
            if (!n_cp)
                return 0;

            /* add its controller */
            ip_vs_control_add(n_cp, cp);
        }
    }

    /*
     *    Move tunnel to listen state
     *  Set the connection entry's state to LISTEN.
     */
    ip_vs_tcp_conn_listen(n_cp);
    ip_vs_conn_put(n_cp);

    return 1;
}
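The PASV detection loop in ip_vs_ftp_in can be tried out in user space with a sketch like the one below. It assumes a POSIX system (for `strncasecmp` from `<strings.h>`); the function name `find_pasv` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <strings.h>

/* Return the offset of the first case-insensitive "PASV\r\n" in
 * buf[0..len), or -1 if it does not occur. Like the kernel loop, this
 * compares at every byte offset rather than using a smarter search. */
static long find_pasv(const char *buf, size_t len)
{
    static const char needle[] = "PASV\r\n";
    const size_t nlen = sizeof(needle) - 1;   /* 6 */
    size_t i;

    assert(buf);
    if (len < nlen)
        return -1;
    for (i = 0; i + nlen <= len; i++)
        if (strncasecmp(buf + i, needle, nlen) == 0)
            return (long)i;
    return -1;
}
```

FTP control messages are short, so a linear scan per packet is usually cheap enough in practice.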

in2out

/*
 * Look at outgoing ftp packets to catch the response to a PASV command
 * from the server (inside-to-outside).
 * When we see one, we build a connection entry with the client address,
 * client port 0 (unknown at the moment), the server address and the
 * server port.  Mark the current connection entry as a control channel
 * of the new entry. All this work is just to make the data connection
 * can be scheduled to the right server later.
 *
 * The outgoing packet should be something like
 *   "227 Entering Passive Mode (xxx,xxx,xxx,xxx,ppp,ppp)".
 * xxx,xxx,xxx,xxx is the server address, ppp,ppp is the server port number.
 */
static int ip_vs_ftp_out(struct ip_vs_app *app, struct ip_vs_conn *cp,
             struct sk_buff *skb, int *diff)
{
    struct iphdr *iph;
    struct tcphdr *th;
    char *data, *data_limit;
    char *start, *end;
    union nf_inet_addr from;
    __be16 port;
    struct ip_vs_conn *n_cp;
    char buf[24];        /* xxx.xxx.xxx.xxx,ppp,ppp\000 */
    unsigned int buf_len;
    int ret = 0;
    enum ip_conntrack_info ctinfo;
    struct nf_conn *ct;

    *diff = 0;

#ifdef CONFIG_IP_VS_IPV6
    /* This application helper doesn't work with IPv6 yet,
     * so turn this into a no-op for IPv6 packets
     */
    if (cp->af == AF_INET6)
        return 1;
#endif

    /* Only useful for established sessions */
    if (cp->state != IP_VS_TCP_S_ESTABLISHED)
        return 1;

    /* Linear packets are much easier to deal with. */
    if (!skb_make_writable(skb, skb->len))
        return 0;
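    // Passive mode: the client initiates the data connection and the server sends its address and port
    if (cp->app_data == &ip_vs_ftp_pasv) {// the port comes from the server, so it is extracted in the out direction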
        iph = ip_hdr(skb);
        th = (struct tcphdr *)&(((char *)iph)[iph->ihl*4]);
        data = (char *)th + (th->doff << 2);
        data_limit = skb_tail_pointer(skb);

        if (ip_vs_ftp_get_addrport(data, data_limit,
                       SERVER_STRING,
                       sizeof(SERVER_STRING)-1,
                       '(', ')',
                       &from.ip, &port,
                       &start, &end) != 1)
            return 1;

        IP_VS_DBG(7, "PASV response (%pI4:%d) -> %pI4:%d detected\n",
              &from.ip, ntohs(port), &cp->caddr.ip, 0);

        /*
         * Now update or create an connection entry for it
         * (from/port hold the address and port opened by the server)
         */
        {
            struct ip_vs_conn_param p;
            ip_vs_conn_fill_param(cp->ipvs, AF_INET,
                          iph->protocol, &from, port,
                          &cp->caddr, 0, &p);// the client port is filled in as 0
            // look for an existing entry in the out direction
            n_cp = ip_vs_conn_out_get(&p);
        }
        if (!n_cp) {
            struct ip_vs_conn_param p;
            ip_vs_conn_fill_param(cp->ipvs,
                          AF_INET, IPPROTO_TCP, &cp->caddr,
                          0, &cp->vaddr, port, &p);
            /* As above, this is ipv4 only */
            /* The client port may be 0 because it is not known yet */
            n_cp = ip_vs_conn_new(&p, AF_INET, &from, port,
                          IP_VS_CONN_F_NO_CPORT |
                          IP_VS_CONN_F_NFCT,
                          cp->dest, skb->mark);
            if (!n_cp)
                return 0;

            /* add its controller */
            ip_vs_control_add(n_cp, cp);
        }

        /*
         * Replace the old passive address with the new one
         * Rewrite the payload so the client is told the new IP (the VIP)
         */
        from.ip = n_cp->vaddr.ip;
        port = n_cp->vport;
        snprintf(buf, sizeof(buf), "%u,%u,%u,%u,%u,%u",
             ((unsigned char *)&from.ip)[0],
             ((unsigned char *)&from.ip)[1],
             ((unsigned char *)&from.ip)[2],
             ((unsigned char *)&from.ip)[3],
             ntohs(port) >> 8,
             ntohs(port) & 0xFF);

        buf_len = strlen(buf);
        // use the nf_ct mechanism to perform the rewrite
        ct = nf_ct_get(skb, &ctinfo);
        if (ct) {
            bool mangled;

            /* If mangling fails this function will return 0
             * which will cause the packet to be dropped.
             * Mangling can only fail under memory pressure,
             * hopefully it will succeed on the retransmitted
             * packet.
             * This involves seqadj (sequence number adjustment).
             */
            mangled = nf_nat_mangle_tcp_packet(skb, ct, ctinfo,
                               iph->ihl * 4,
                               start - data,
                               end - start,
                               buf, buf_len);
            if (mangled) {
                ip_vs_nfct_expect_related(skb, ct, n_cp,
                              IPPROTO_TCP, 0, 0);
                if (skb->ip_summed == CHECKSUM_COMPLETE)
                    skb->ip_summed = CHECKSUM_UNNECESSARY;
                /* csum is updated */
                ret = 1;
            }
        }

        /*
         * Not setting 'diff' is intentional, otherwise the sequence
         * would be adjusted twice.
         */

        cp->app_data = NULL;
        // set the connection entry's state to LISTEN
        ip_vs_tcp_conn_listen(n_cp);
        ip_vs_conn_put(n_cp);
        return ret;
    }
    return 1;
}
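The rewrite step in ip_vs_ftp_out renders the virtual address and port back into the comma-separated form before the payload is mangled. A user-space sketch of that formatting is shown below; `ftp_format_addrport` is a hypothetical helper, and it takes host-order values, whereas the kernel code works on network-order fields byte by byte.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: render a host-order IPv4 address and port as
 * "h1,h2,h3,h4,p1,p2". Returns the string length, or -1 if buf is
 * too small to hold the result. */
static int ftp_format_addrport(uint32_t ip, uint16_t port,
                               char *buf, size_t size)
{
    int n;

    assert(buf && size > 0);
    n = snprintf(buf, size, "%u,%u,%u,%u,%u,%u",
                 (unsigned int)((ip >> 24) & 0xFF),
                 (unsigned int)((ip >> 16) & 0xFF),
                 (unsigned int)((ip >> 8) & 0xFF),
                 (unsigned int)(ip & 0xFF),
                 (unsigned int)(port >> 8),
                 (unsigned int)(port & 0xFF));
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

The difference between the returned length and the length of the string it replaces is exactly the payload length change that sequence adjustment has to compensate for.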

IP_VS_CONN_F_NO_CPORT

/*
 *    Fill a no_client_port connection with a client port number
 */
void ip_vs_conn_fill_cport(struct ip_vs_conn *cp, __be16 cport)
{
    if (ip_vs_conn_unhash(cp)) {
        spin_lock_bh(&cp->lock);
        if (cp->flags & IP_VS_CONN_F_NO_CPORT) {
            atomic_dec(&ip_vs_conn_no_cport_cnt);
            cp->flags &= ~IP_VS_CONN_F_NO_CPORT;
            cp->cport = cport;
        }
        spin_unlock_bh(&cp->lock);

        /* hash on new dport */
        ip_vs_conn_hash(cp);
    }
}

Relationship between LVS and nf_ct

As noted in the passive-mode analysis above, LVS calls ip_vs_nfct_expect_related to add an expectation to nf_ct and registers an expectfn callback on it.

/*
 * Create NF conntrack expectation with wildcard (optional) source port.
 * Then the default callback function will alter the reply and will confirm
 * the conntrack entry when the first packet comes.
 * Use port 0 to expect connection from any port.
 */
void ip_vs_nfct_expect_related(struct sk_buff *skb, struct nf_conn *ct,
                   struct ip_vs_conn *cp, u_int8_t proto,
                   const __be16 port, int from_rs)
{
    struct nf_conntrack_expect *exp;

    if (ct == NULL)
        return;

    exp = nf_ct_expect_alloc(ct);
    if (!exp)
        return;

    nf_ct_expect_init(exp, NF_CT_EXPECT_CLASS_DEFAULT, nf_ct_l3num(ct),
            from_rs ? &cp->daddr : &cp->caddr,// source IP: daddr if from the real server (active mode), otherwise caddr
            from_rs ? &cp->caddr : &cp->vaddr,// destination IP: caddr if from the real server (active mode), otherwise vaddr
            proto, port ? &port : NULL,
            from_rs ? &cp->cport : &cp->vport);
    // register the expectfn callback
    exp->expectfn = ip_vs_nfct_expect_callback;

    IP_VS_DBG(7, "%s: ct=%p, expect tuple=" FMT_TUPLE "\n",
        __func__, ct, ARG_TUPLE(&exp->tuple));
    nf_ct_expect_related(exp);
    nf_ct_expect_put(exp);
}

/*
 * Called from init_conntrack() as expectfn handler.
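 * The ct argument is the conntrack created by the first packet of the new
 * connection, so the direction of that first packet is the original direction.
 * In active mode the original direction is RS->CLIENT.
 * In passive mode the original direction is CLIENT->VS.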
 */
static void ip_vs_nfct_expect_callback(struct nf_conn *ct,
    struct nf_conntrack_expect *exp)
{
    struct nf_conntrack_tuple *orig, new_reply;
    struct ip_vs_conn *cp;
    struct ip_vs_conn_param p;
    struct net *net = nf_ct_net(ct);

    if (exp->tuple.src.l3num != PF_INET)
        return;

    /*
     * We assume that no NF locks are held before this callback.
     * ip_vs_conn_out_get and ip_vs_conn_in_get should match their
     * expectations even if they use wildcard values, now we provide the
     * actual values from the newly created original conntrack direction.
     * The conntrack is confirmed when packet reaches IPVS hooks.
     */

    /* RS->CLIENT: active mode */
    orig = &ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple;
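    // Build the LVS lookup key from the conntrack 5-tuple. For in2out, the LVS lookup
    // matches the destination IP/port against cp's client IP/port and the source IP/port against cp's daddr/dport.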
    ip_vs_conn_fill_param(net_ipvs(net), exp->tuple.src.l3num, orig->dst.protonum,
                  &orig->src.u3, orig->src.u.tcp.port,
                  &orig->dst.u3, orig->dst.u.tcp.port, &p);
    cp = ip_vs_conn_out_get(&p);
    if (cp) {
        /* Change reply CLIENT->RS to CLIENT->VS */
        new_reply = ct->tuplehash[IP_CT_DIR_REPLY].tuple;
        IP_VS_DBG(7, "%s: ct=%p, status=0x%lX, tuples=" FMT_TUPLE ", "
              FMT_TUPLE ", found inout cp=" FMT_CONN "\n",
              __func__, ct, ct->status,
              ARG_TUPLE(orig), ARG_TUPLE(&new_reply),
              ARG_CONN(cp));
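        // After the expectation is hit, nf_ct creates a conntrack whose original tuple is dip,dport -> cip,cport
        // and whose reply tuple is cip,cport -> dip,dport. What we actually need is cip,cport -> vip,vport,
        // so it is changed here. Note that this code runs at the PREROUTING hook.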
        new_reply.dst.u3 = cp->vaddr;
        new_reply.dst.u.tcp.port = cp->vport;
        IP_VS_DBG(7, "%s: ct=%p, new tuples=" FMT_TUPLE ", " FMT_TUPLE
              ", inout cp=" FMT_CONN "\n",
              __func__, ct,
              ARG_TUPLE(orig), ARG_TUPLE(&new_reply),
              ARG_CONN(cp));
        goto alter;
    }

    /* CLIENT->VS: passive mode */
    /* look up the connection entry in the request direction */
    cp = ip_vs_conn_in_get(&p);
    if (cp) {
        /* Change reply VS->CLIENT to RS->CLIENT */
        new_reply = ct->tuplehash[IP_CT_DIR_REPLY].tuple;
        IP_VS_DBG(7, "%s: ct=%p, status=0x%lX, tuples=" FMT_TUPLE ", "
              FMT_TUPLE ", found outin cp=" FMT_CONN "\n",
              __func__, ct, ct->status,
              ARG_TUPLE(orig), ARG_TUPLE(&new_reply),
              ARG_CONN(cp));
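        // After the expectation is hit, nf_ct creates a conntrack whose original tuple is cip,cport -> vip,vport
        // and whose reply tuple is vip,vport -> cip,cport. What we actually need is dip,dport -> cip,cport,
        // so it is changed here. Note that this code runs at the PREROUTING hook.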
        new_reply.src.u3 = cp->daddr;
        new_reply.src.u.tcp.port = cp->dport;
        IP_VS_DBG(7, "%s: ct=%p, new tuples=" FMT_TUPLE ", "
              FMT_TUPLE ", outin cp=" FMT_CONN "\n",
              __func__, ct,
              ARG_TUPLE(orig), ARG_TUPLE(&new_reply),
              ARG_CONN(cp));
        goto alter;
    }

    IP_VS_DBG(7, "%s: ct=%p, status=0x%lX, tuple=" FMT_TUPLE
          " - unknown expect\n",
          __func__, ct, ct->status, ARG_TUPLE(orig));
    return;

alter:
    /* Never alter conntrack for non-NAT conns */
    /* only NAT mode gets here */
    if (IP_VS_FWD_METHOD(cp) == IP_VS_CONN_F_MASQ)
        nf_conntrack_alter_reply(ct, &new_reply);
    ip_vs_conn_put(cp);
    return;
}
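What the callback does to the reply tuple can be sketched with plain structs. `struct endpoint`, `struct tuple` and the function names below are hypothetical simplifications of `struct nf_conntrack_tuple`; the sketch only shows which side of the reply tuple is replaced in each mode.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified address/port pair and connection tuple. */
struct endpoint { uint32_t ip; uint16_t port; };
struct tuple { struct endpoint src, dst; };

/* The default reply tuple is the original with src and dst swapped. */
static struct tuple invert(const struct tuple *orig)
{
    struct tuple r;

    assert(orig);
    r.src = orig->dst;
    r.dst = orig->src;
    return r;
}

/* Passive mode: original is CLIENT->VS, so replies must be expected
 * from the real server instead of the virtual service. */
static struct tuple passive_reply(const struct tuple *orig, struct endpoint rs)
{
    struct tuple r = invert(orig);

    r.src = rs;
    return r;
}

/* Active mode: original is RS->CLIENT, so the client's replies are
 * addressed to the virtual service instead of the real server. */
static struct tuple active_reply(const struct tuple *orig, struct endpoint vs)
{
    struct tuple r = invert(orig);

    r.dst = vs;
    return r;
}
```

With the reply tuple altered this way, conntrack derives the address translation between the virtual service and the real server from the two tuples.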

ouyangxibao