Read and write flow between kernel space and user space

clipboard.png

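The write call:

  • A user-space program calls write on the socket.
  • The kernel copies the data from the program's buffer into the kernel write buffer (the send buffer). write returns as soon as the copy completes, even if the data in the buffer has not been sent yet.
  • The kernel TCP stack moves the data from the send buffer to the network card.
  • The network card transmits the data at the physical layer toward the destination network card; the network path in between is omitted here.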
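The second step, write returning once the data has been copied into the kernel, can be observed with a small sketch. It assumes a local AF_UNIX socket pair in place of a TCP connection to keep the example self-contained, and the function name write_returns_before_peer_reads is hypothetical.

```c
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Writes a short message on one end of a socket pair and reports how many
 * bytes write() accepted. The other end has not read anything yet at the
 * moment write() returns. Returns -1 on failure.
 * Assumption: an AF_UNIX socket pair stands in for a TCP connection. */
ssize_t write_returns_before_peer_reads(void) {
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        return -1;
    }

    const char *msg = "hello";
    /* write() copies msg into the kernel buffer and returns right away. */
    ssize_t sent = write(sv[0], msg, strlen(msg));

    /* Only now does the peer pick the bytes up from its kernel buffer. */
    char buf[16];
    ssize_t got = read(sv[1], buf, sizeof(buf));

    close(sv[0]);
    close(sv[1]);
    return (sent == got) ? sent : -1;
}
```

Calling write_returns_before_peer_reads() should return the length of the message, which shows that a successful write says nothing about whether the peer has consumed the data.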

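The read call:

  • A user-space program calls read on the socket.
  • The kernel TCP stack moves the data that arrived at the network card from the sender into the kernel read buffer (the receive buffer).
  • The kernel copies the data from the receive buffer into the program's buffer.
  • The program can now access the data in its own buffer.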
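The receive buffer in the steps above can be made visible: data sits in the kernel before read is ever called. The sketch below is an illustration under assumptions: it uses an AF_UNIX socket pair rather than TCP, queries the pending byte count with ioctl(FIONREAD), and the function name bytes_waiting_before_read is hypothetical.

```c
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns how many bytes were waiting in the kernel receive buffer of the
 * reading end before read() was called, or -1 on failure.
 * Assumption: an AF_UNIX socket pair stands in for a TCP connection. */
int bytes_waiting_before_read(void) {
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        return -1;
    }

    int pending = -1;
    ssize_t n = -1;
    if (write(sv[0], "abc", 3) == 3) {
        /* Ask the kernel how much unread data is queued on sv[1]. */
        if (ioctl(sv[1], FIONREAD, &pending) < 0) {
            pending = -1;
        }
        /* read() then copies that data into the user buffer. */
        char buf[8];
        n = read(sv[1], buf, sizeof(buf));
    }

    close(sv[0]);
    close(sv[1]);
    return (pending >= 0 && n == pending) ? pending : -1;
}
```

The byte count reported by FIONREAD matches what the later read returns, which is the copy from the receive buffer to the program buffer described above.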

shutdown and close

int close(int sockfd)

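close decrements the socket's reference count (the number of processes holding a descriptor that refers to the socket). Once the count reaches 0, the socket is released completely: both directions of the TCP stream are closed and the connection and its resources are reclaimed. This is the so-called abrupt close: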

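  • In the read direction, the kernel marks the socket as unreadable, and any read on it fails with an error.
  • In the write direction, the kernel tries to send whatever is left in the send buffer to the peer and finally sends a FIN; any later write on the socket fails with an error.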
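The reference counting described above can be checked with a short sketch. It is an illustration under assumptions: it creates the second reference with dup inside one process instead of sharing the descriptor across processes, it uses an AF_UNIX socket pair rather than TCP, and the function name eof_only_after_last_close is hypothetical.

```c
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns 1 if the peer sees EOF only after the last descriptor referring
 * to the socket has been closed, 0 if not, -1 on setup failure.
 * Assumption: dup() stands in for a second process holding the socket. */
int eof_only_after_last_close(void) {
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        return -1;
    }
    int extra = dup(sv[0]);  /* second reference to the same socket */
    if (extra < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    char buf[8];

    /* First close: the count drops from 2 to 1, the socket stays open. */
    close(sv[0]);
    ssize_t n1 = recv(sv[1], buf, sizeof(buf), MSG_DONTWAIT);
    int still_open = (n1 < 0 && (errno == EAGAIN || errno == EWOULDBLOCK));

    /* Last close: the count reaches 0 and the peer now reads EOF. */
    close(extra);
    ssize_t n2 = recv(sv[1], buf, sizeof(buf), MSG_DONTWAIT);
    int saw_eof = (n2 == 0);

    close(sv[1]);
    return still_open && saw_eof;
}
```

This is why a close in one process does not necessarily end the connection when another descriptor still refers to the same socket.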

int shutdown(int sockfd, int howto)

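shutdown can close a connection in one direction or in both, which is why it is called a graceful close. The howto argument selects the direction: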

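  • SHUT_RD(0): closes the read direction of the connection; a read on the socket returns EOF immediately. In terms of data, anything already in the socket's receive buffer is discarded, and any new data that arrives is ACKed and then silently dropped. The peer still receives the ACKs, so it has no way of knowing the data was thrown away. (How arriving data is handled after SHUT_RD varies between operating systems.)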

    clipboard.png

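  • SHUT_WR(1): closes the write direction of the connection; this is what is usually called a half-closed connection. The write direction is closed regardless of the socket's reference count. Data already in the send buffer is still sent out, followed by a FIN to the peer; any later write on the socket by the application fails with an error.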

    clipboard.png

  • SHUT_RDWR(2): equivalent to performing SHUT_RD and SHUT_WR once each, closing both the read and write directions of the socket.

    clipboard.png
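The SHUT_WR case from the list above can be sketched in a few lines. The example assumes an AF_UNIX socket pair instead of a TCP connection (so no FIN segment is involved, only the equivalent EOF), and the function name half_close_demo is hypothetical.

```c
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns 1 if, after shutdown(SHUT_WR) on one end, the peer reads EOF and
 * the half-closed end can still read the peer's reply; 0 otherwise,
 * -1 on setup failure.
 * Assumption: an AF_UNIX socket pair stands in for a TCP connection. */
int half_close_demo(void) {
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        return -1;
    }

    char buf[16];
    int ok = 0;
    if (shutdown(sv[0], SHUT_WR) == 0) {
        /* The peer sees end-of-stream on its read side. */
        ssize_t eof = read(sv[1], buf, sizeof(buf));
        /* The other direction is still open, so the peer can reply. */
        ssize_t sent = write(sv[1], "reply", 5);
        ssize_t got = read(sv[0], buf, sizeof(buf));
        ok = (eof == 0 && sent == 5 && got == 5 &&
              memcmp(buf, "reply", 5) == 0);
    }

    close(sv[0]);
    close(sv[1]);
    return ok;
}
```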

Let's write a program to see the difference between close and shutdown.

client:

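#include <arpa/inet.h>
#include <errno.h>
#include <error.h>      // error() is a glibc extension; elsewhere supply an equivalent helper
#include <netinet/in.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

#define SERV_PORT 9527  // the port seen in the tcpdump captures
#define MAXLINE 4096    // assumed value, not given in the original

int main(int argc, char **argv) {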
    int socket_fd;
    socket_fd = socket(AF_INET, SOCK_STREAM, 0);

    struct sockaddr_in server_addr;
    bzero(&server_addr, sizeof(server_addr));
    server_addr.sin_family = AF_INET;
    server_addr.sin_port = htons(SERV_PORT);
    inet_pton(AF_INET, "127.0.0.1", &server_addr.sin_addr);

    socklen_t server_len = sizeof(server_addr);
    int connect_rt = connect(socket_fd, (struct sockaddr *) &server_addr, server_len);
    if (connect_rt < 0) {
        error(1, errno, "connect failed ");
    }

    char send_line[MAXLINE], recv_line[MAXLINE + 1];
    int n;

    fd_set readmask;
    fd_set allreads;

    FD_ZERO(&allreads);
    FD_SET(0, &allreads);
    FD_SET(socket_fd, &allreads);
    for (;;) {
        readmask = allreads;
        // select() multiplexes I/O so we can watch socket_fd and standard input at the same time
        int rc = select(socket_fd + 1, &readmask, NULL, NULL, NULL);
        if (rc <= 0)
            error(1, errno, "select failed");
        if (FD_ISSET(socket_fd, &readmask)) {
            n = read(socket_fd, recv_line, MAXLINE);
            if (n < 0) {
                error(1, errno, "read error");
            } else if (n == 0) {
                error(1, 0, "server terminated \n");
            }
            recv_line[n] = 0;
            fputs(recv_line, stdout);
            fputs("\n", stdout);
        }
        if (FD_ISSET(0, &readmask)) {
            if (fgets(send_line, MAXLINE, stdin) != NULL) {
                if (strncmp(send_line, "shutdown", 8) == 0) {
                    FD_CLR(0, &allreads);
                    if (shutdown(socket_fd, SHUT_WR)) {
                        error(1, errno, "shutdown failed");
                    }
                } else if (strncmp(send_line, "close", 5) == 0) {
                    FD_CLR(0, &allreads);
                    if (close(socket_fd)) {
                        error(1, errno, "close failed");
                    }
                    sleep(6);
                    exit(0);
                } else {
                    int i = strlen(send_line);
                    if (send_line[i - 1] == '\n') {
                        send_line[i - 1] = 0;
                    }

                    printf("now sending %s\n", send_line);
                    // write() returns ssize_t; an unsigned size_t could never be < 0
                    ssize_t rt = write(socket_fd, send_line, strlen(send_line));
                    if (rt < 0) {
                        error(1, errno, "write failed ");
                    }
                    printf("send bytes: %zd \n", rt);
                }
            }
        }
    }
}
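
server:

#include <errno.h>
#include <error.h>      // error() is a glibc extension; elsewhere supply an equivalent helper
#include <netinet/in.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>
#include <sys/socket.h>
#include <unistd.h>

#define SERV_PORT 9527  // the port seen in the tcpdump captures
#define MAXLINE 4096    // assumed value, not given in the original
#define LISTENQ 1024    // assumed value, not given in the original

// number of messages received so far
static int count;

static void sig_int(int signo) {
    printf("\nreceived %d datagrams\n", count);
    exit(0);
}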


int main(int argc, char **argv) {
    int listenfd;
    listenfd = socket(AF_INET, SOCK_STREAM, 0);

    struct sockaddr_in server_addr;
    bzero(&server_addr, sizeof(server_addr));
    server_addr.sin_family = AF_INET;
    server_addr.sin_addr.s_addr = htonl(INADDR_ANY);
    server_addr.sin_port = htons(SERV_PORT);

    int rt1 = bind(listenfd, (struct sockaddr *) &server_addr, sizeof(server_addr));
    if (rt1 < 0) {
        error(1, errno, "bind failed ");
    }

    int rt2 = listen(listenfd, LISTENQ);
    if (rt2 < 0) {
        error(1, errno, "listen failed ");
    }

    signal(SIGINT, sig_int);
    signal(SIGPIPE, SIG_IGN);

    int connfd;
    struct sockaddr_in client_addr;
    socklen_t client_len = sizeof(client_addr);

    if ((connfd = accept(listenfd, (struct sockaddr *) &client_addr, &client_len)) < 0) {
        error(1, errno, "accept failed ");
    }

    char message[MAXLINE];
    count = 0;

    for (;;) {
        // read at most MAXLINE - 1 bytes so that message[n] = 0 stays in bounds
        int n = read(connfd, message, MAXLINE - 1);
        if (n < 0) {
            error(1, errno, "error read");
        } else if (n == 0) {
            error(1, 0, "client closed \n");
        }
        message[n] = 0;
        printf("received %d bytes: %s\n", n, message);
        count++;

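        // room for the "Hi, " prefix; snprintf never writes past the buffer
        char send_line[MAXLINE + 8];
        snprintf(send_line, sizeof(send_line), "Hi, %s", message);
        // Sleep for a few seconds to simulate the server doing some work
        sleep(5);

        ssize_t write_nc = send(connfd, send_line, strlen(send_line), 0);
        if (write_nc < 0) {
            error(1, errno, "error write");
        }
        printf("send bytes: %zd \n", write_nc);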
    }
}

The effect of close

client:

aaa
now sending aaa
send bytes: 3 
close

server:

received 3 bytes: aaa
send bytes: 7 
error read: Connection reset by peer (54)

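After sending aaa, the client immediately calls close, which tears down the client's side of the TCP connection and reclaims its resources. When the server finishes processing and sends the reply, the client no longer has that connection, so the server gets a connection reset. Tracing it with tcpdump: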

> sudo tcpdump 'tcp and port 9527' -i lo0 -S
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on lo0, link-type NULL (BSD loopback), capture size 262144 bytes
11:06:23.013648 IP localhost.55463 > localhost.9527: Flags [S], seq 3739428838, win 65535, options [mss 16344,nop,wscale 6,nop,nop,TS val 904150004 ecr 0,sackOK,eol], length 0
11:06:23.013755 IP localhost.9527 > localhost.55463: Flags [S.], seq 2449498522, ack 3739428839, win 65535, options [mss 16344,nop,wscale 6,nop,nop,TS val 904150004 ecr 904150004,sackOK,eol], length 0
11:06:23.013771 IP localhost.55463 > localhost.9527: Flags [.], ack 2449498523, win 6379, options [nop,nop,TS val 904150004 ecr 904150004], length 0
11:06:23.013783 IP localhost.9527 > localhost.55463: Flags [.], ack 3739428839, win 6379, options [nop,nop,TS val 904150004 ecr 904150004], length 0
11:06:30.327692 IP localhost.55463 > localhost.9527: Flags [P.], seq 3739428839:3739428842, ack 2449498523, win 6379, options [nop,nop,TS val 904157265 ecr 904150004], length 3
11:06:30.327740 IP localhost.9527 > localhost.55463: Flags [.], ack 3739428842, win 6379, options [nop,nop,TS val 904157265 ecr 904157265], length 0
11:06:31.826987 IP localhost.55463 > localhost.9527: Flags [F.], seq 3739428842, ack 2449498523, win 6379, options [nop,nop,TS val 904158750 ecr 904157265], length 0
11:06:31.827034 IP localhost.9527 > localhost.55463: Flags [.], ack 3739428843, win 6379, options [nop,nop,TS val 904158750 ecr 904158750], length 0
11:06:35.328859 IP localhost.9527 > localhost.55463: Flags [P.], seq 2449498523:2449498530, ack 3739428843, win 6379, options [nop,nop,TS val 904162236 ecr 904158750], length 7
11:06:35.328946 IP localhost.55463 > localhost.9527: Flags [R], seq 3739428843, win 0, length 0

Analysis of the capture above:

C -> S [S]
S -> C [S.]
C -> S [.]
S -> C [.]
C -> S [P.] aaa
S -> C [.]
C -> S [F.]
S -> C [.]
S -> C [P.] Hi, aaa
C -> S [R]

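After the client sends aaa, the server acknowledges it. The client then calls close, which sends a FIN to the server; once the server has acknowledged the FIN, the client reclaims the connection and its resources. The server finishes processing and sends the result Hi, aaa to the client, but the client has already closed the socket, so its TCP stack cannot deliver the segment and answers with an RST.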

The effect of shutdown

client:

aaa
now sending aaa
send bytes: 3 
shutdown
Hi, aaa
server terminated 

server:

received 3 bytes: aaa
send bytes: 7 
client closed

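After sending aaa, the client immediately calls shutdown, which leaves the client's TCP connection half-closed: the read direction still works, but the write direction is closed. When the server finishes processing and sends the reply, the client can still read it. The server then reads EOF and exits, which closes its side; the client in turn reads EOF and exits, and the connection is fully closed. Tracing it with tcpdump again: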

> sudo tcpdump 'tcp and port 9527' -i lo0 -S
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on lo0, link-type NULL (BSD loopback), capture size 262144 bytes
11:06:53.692427 IP localhost.55594 > localhost.9527: Flags [S], seq 2938836011, win 65535, options [mss 16344,nop,wscale 6,nop,nop,TS val 904180495 ecr 0,sackOK,eol], length 0
11:06:53.692546 IP localhost.9527 > localhost.55594: Flags [S.], seq 2801533649, ack 2938836012, win 65535, options [mss 16344,nop,wscale 6,nop,nop,TS val 904180495 ecr 904180495,sackOK,eol], length 0
11:06:53.692562 IP localhost.55594 > localhost.9527: Flags [.], ack 2801533650, win 6379, options [nop,nop,TS val 904180495 ecr 904180495], length 0
11:06:53.692577 IP localhost.9527 > localhost.55594: Flags [.], ack 2938836012, win 6379, options [nop,nop,TS val 904180495 ecr 904180495], length 0
11:06:58.429387 IP localhost.55594 > localhost.9527: Flags [P.], seq 2938836012:2938836015, ack 2801533650, win 6379, options [nop,nop,TS val 904185206 ecr 904180495], length 3
11:06:58.429435 IP localhost.9527 > localhost.55594: Flags [.], ack 2938836015, win 6379, options [nop,nop,TS val 904185206 ecr 904185206], length 0
11:07:00.789790 IP localhost.55594 > localhost.9527: Flags [F.], seq 2938836015, ack 2801533650, win 6379, options [nop,nop,TS val 904187548 ecr 904185206], length 0
11:07:00.789847 IP localhost.9527 > localhost.55594: Flags [.], ack 2938836016, win 6379, options [nop,nop,TS val 904187548 ecr 904187548], length 0
11:07:03.431085 IP localhost.9527 > localhost.55594: Flags [P.], seq 2801533650:2801533657, ack 2938836016, win 6379, options [nop,nop,TS val 904190180 ecr 904187548], length 7
11:07:03.431161 IP localhost.55594 > localhost.9527: Flags [.], ack 2801533657, win 6379, options [nop,nop,TS val 904190180 ecr 904190180], length 0
11:07:03.431663 IP localhost.9527 > localhost.55594: Flags [F.], seq 2801533657, ack 2938836016, win 6379, options [nop,nop,TS val 904190180 ecr 904190180], length 0
11:07:03.431728 IP localhost.55594 > localhost.9527: Flags [.], ack 2801533658, win 6379, options [nop,nop,TS val 904190180 ecr 904190180], length 0

Analysis of the capture above:

C -> S [S]
S -> C [S.]
C -> S [.]
S -> C [.]
C -> S [P.] aaa
S -> C [.]
C -> S [F.]
S -> C [.]
S -> C [P.] Hi, aaa
C -> S [.]
S -> C [F.]
C -> S [.]

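After the client sends aaa, the server acknowledges it. The client then calls shutdown, which sends a FIN to the server; once the server has acknowledged it, the client is half-closed and can read but no longer write. The server finishes processing and sends the result Hi, aaa, which the client reads and acknowledges. Note that shutdown only closed a direction of the stream; it did not reclaim the socket's resources. The server then reads EOF and sends the final FIN, the client replies with an ACK, and the result is a complete four-way close.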
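The sequence in this capture suggests a common pattern for ending a connection without losing the peer's last reply: half-close, read until EOF, then close. The sketch below is one possible way to write it, not code from the original programs, and the function name drain_and_close is hypothetical.

```c
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

/* Half-closes fd, reads and discards data until the peer signals EOF, then
 * closes fd. Returns the number of bytes drained, or -1 on error.
 * Hypothetical helper; a real program would process the data it reads. */
long drain_and_close(int fd) {
    char buf[256];
    long total = 0;

    /* Tell the peer we are done sending; reads keep working. */
    if (shutdown(fd, SHUT_WR) < 0) {
        close(fd);
        return -1;
    }

    for (;;) {
        ssize_t n = read(fd, buf, sizeof(buf));
        if (n < 0) {
            if (errno == EINTR) {
                continue;  /* interrupted by a signal, retry */
            }
            close(fd);
            return -1;
        }
        if (n == 0) {
            break;  /* the peer has finished sending as well */
        }
        total += n;
    }

    /* Both directions are done; release the descriptor. */
    close(fd);
    return total;
}
```

Because the loop blocks until the peer closes its side, a production version would probably add a timeout; that is left out here to keep the sketch short.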

Note: what is closed is the socket, not the connection

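When I first analyzed this, I had a question: if the client is half-closed, meaning it can read but not write, why can it still send an ACK to the server? The answer is that I had not fully understood what "closed" means here. Half-closed describes the socket descriptor, not the connection itself. The connection still exists in the kernel, so the kernel TCP stack keeps communicating normally (the ACK is generated by the kernel, not by the application); only write calls on the socket from the user-space program stop working. It is clearest when you look at this figure again: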

clipboard.png

The write marked in red in user space is closed, but the path from the kernel write buffer to the network card is still open.
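The user-space side of this can be checked directly: after SHUT_WR the application's write fails, while read on the same descriptor keeps working. The sketch assumes an AF_UNIX socket pair in place of TCP, ignores SIGPIPE for the whole process as the server above does, and uses the hypothetical function name write_fails_read_works.

```c
#include <errno.h>
#include <signal.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns 1 if, after shutdown(SHUT_WR), write() fails with EPIPE while
 * read() on the same descriptor still delivers data from the peer;
 * 0 otherwise, -1 on setup failure.
 * Assumption: an AF_UNIX socket pair stands in for a TCP connection. */
int write_fails_read_works(void) {
    int sv[2];

    /* Without this, the failing write would kill the process via SIGPIPE. */
    signal(SIGPIPE, SIG_IGN);

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        return -1;
    }
    if (shutdown(sv[0], SHUT_WR) < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    /* User-space write on the half-closed socket is rejected. */
    ssize_t w = write(sv[0], "x", 1);
    int write_failed = (w < 0 && errno == EPIPE);

    /* The opposite direction is untouched. */
    char c = 0;
    ssize_t sent = write(sv[1], "y", 1);
    ssize_t r = read(sv[0], &c, 1);

    close(sv[0]);
    close(sv[1]);
    return write_failed && sent == 1 && r == 1 && c == 'y';
}
```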


JinhaoPlus