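Read and write flow across kernel space and user space
What happens during a write call
- A user-space program calls write on a socket.
- The kernel copies the data from the program's buffer into the kernel write buffer (the send buffer). write returns as soon as the copy finishes, even if the data has not yet left the send buffer.
- The kernel TCP stack moves the data from the kernel write buffer (the send buffer) to the network card.
- The network card transmits the data at the physical layer to the destination network card; the network path in between is omitted here.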
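What happens during a read call:
- A user-space program calls read on a socket.
- The kernel TCP stack moves the data that arrived at the network card from the sender into the kernel read buffer (the receive buffer). This happens whether or not read has been called yet.
- The kernel copies the data from the kernel read buffer (the receive buffer) into the program's buffer.
- The program can now access the data in its own buffer.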
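shutdown versus close
int close(int sockfd)
close decrements the socket's reference count (the number of processes holding this socket descriptor). Once the count reaches 0, the socket is released completely: both directions of the TCP stream are closed and the connection and its resources are reclaimed. This is the so-called abrupt close:
- In the read direction, the kernel marks the socket unreadable, and any read on it returns an error.
- In the write direction, the kernel tries to send whatever is left in the send buffer to the peer and then sends a FIN segment. Any later write on the socket returns an error.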
int shutdown(int sockfd, int howto)
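shutdown can close a connection in one direction or in both; this is the so-called graceful close. The howto argument selects the behavior:
- SHUT_RD(0): closes the read direction of the connection; any read on the socket returns EOF immediately. Data already in the receive buffer is discarded, and data that arrives later is ACKed and then silently dropped. The peer still receives the ACKs, so it has no way of knowing that its data was thrown away. How later arrivals are handled differs between operating systems.
- SHUT_WR(1): closes the write direction of the connection, which is what is usually called a half-closed connection. The write direction is closed regardless of the socket's reference count. Data already in the send buffer is sent out immediately, followed by a FIN segment to the peer. Any later write on the socket by the application fails.
- SHUT_RDWR(2): equivalent to doing SHUT_RD and SHUT_WR once each; closes both the read and the write direction of the socket.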
The following programs show the difference between close and shutdown.
client:
int main(int argc, char **argv) {
int socket_fd;
socket_fd = socket(AF_INET, SOCK_STREAM, 0);
struct sockaddr_in server_addr;
bzero(&server_addr, sizeof(server_addr));
server_addr.sin_family = AF_INET;
server_addr.sin_port = htons(SERV_PORT);
inet_pton(AF_INET, "127.0.0.1", &server_addr.sin_addr);
socklen_t server_len = sizeof(server_addr);
int connect_rt = connect(socket_fd, (struct sockaddr *) &server_addr, server_len);
if (connect_rt < 0) {
error(1, errno, "connect failed ");
}
char send_line[MAXLINE], recv_line[MAXLINE + 1];
int n;
fd_set readmask;
fd_set allreads;
FD_ZERO(&allreads);
FD_SET(0, &allreads);
FD_SET(socket_fd, &allreads);
for (;;) {
readmask = allreads;
// select multiplexes I/O, so socket_fd and standard input can be watched at the same time
int rc = select(socket_fd + 1, &readmask, NULL, NULL, NULL);
if (rc <= 0)
error(1, errno, "select failed");
if (FD_ISSET(socket_fd, &readmask)) {
n = read(socket_fd, recv_line, MAXLINE);
if (n < 0) {
error(1, errno, "read error");
} else if (n == 0) {
error(1, 0, "server terminated \n");
}
recv_line[n] = 0;
fputs(recv_line, stdout);
fputs("\n", stdout);
}
if (FD_ISSET(0, &readmask)) {
if (fgets(send_line, MAXLINE, stdin) != NULL) {
if (strncmp(send_line, "shutdown", 8) == 0) {
FD_CLR(0, &allreads);
if (shutdown(socket_fd, SHUT_WR)) {
error(1, errno, "shutdown failed");
}
} else if (strncmp(send_line, "close", 5) == 0) {
FD_CLR(0, &allreads);
if (close(socket_fd)) {
error(1, errno, "close failed");
}
sleep(6);
exit(0);
} else {
int i = strlen(send_line);
if (send_line[i - 1] == '\n') {
send_line[i - 1] = 0;
}
printf("now sending %s\n", send_line);
// write returns ssize_t; with an unsigned size_t the error check below could never fire
ssize_t rt = write(socket_fd, send_line, strlen(send_line));
if (rt < 0) {
error(1, errno, "write failed ");
}
printf("send bytes: %zd \n", rt);
}
}
}
}
}
server:
static int count;  // number of messages received so far
static void sig_int(int signo) {
printf("\nreceived %d datagrams\n", count);
exit(0);
}
int main(int argc, char **argv) {
int listenfd;
listenfd = socket(AF_INET, SOCK_STREAM, 0);
struct sockaddr_in server_addr;
bzero(&server_addr, sizeof(server_addr));
server_addr.sin_family = AF_INET;
server_addr.sin_addr.s_addr = htonl(INADDR_ANY);
server_addr.sin_port = htons(SERV_PORT);
int rt1 = bind(listenfd, (struct sockaddr *) &server_addr, sizeof(server_addr));
if (rt1 < 0) {
error(1, errno, "bind failed ");
}
int rt2 = listen(listenfd, LISTENQ);
if (rt2 < 0) {
error(1, errno, "listen failed ");
}
signal(SIGINT, sig_int);
signal(SIGPIPE, SIG_IGN);
int connfd;
struct sockaddr_in client_addr;
socklen_t client_len = sizeof(client_addr);
if ((connfd = accept(listenfd, (struct sockaddr *) &client_addr, &client_len)) < 0) {
error(1, errno, "accept failed ");
}
char message[MAXLINE];
count = 0;
for (;;) {
int n = read(connfd, message, MAXLINE - 1);  // leave room for the terminating NUL
if (n < 0) {
error(1, errno, "error read");
} else if (n == 0) {
error(1, 0, "client closed \n");
}
message[n] = 0;
printf("received %d bytes: %s\n", n, message);
count++;
char send_line[MAXLINE];
// snprintf truncates instead of overflowing send_line
snprintf(send_line, sizeof(send_line), "Hi, %s", message);
// Sleep a few seconds to simulate the server doing some work
sleep(5);
int write_nc = send(connfd, send_line, strlen(send_line), 0);
if (write_nc < 0) {
error(1, errno, "error write");
}
printf("send bytes: %d \n", write_nc);
}
}
Effect of close
client:
aaa
now sending aaa
send bytes: 3
close
server:
received 3 bytes: aaa
send bytes: 7
error read: Connection reset by peer (54)
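As the output shows, the client calls close right after sending aaa, which tears down the client side of the TCP connection and releases its resources. When the server finishes processing and sends its reply, the client no longer has that connection, so the server gets a connection reset. Tracing this with tcpdump: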
> sudo tcpdump 'tcp and port 9527' -i lo0 -S
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on lo0, link-type NULL (BSD loopback), capture size 262144 bytes
11:06:23.013648 IP localhost.55463 > localhost.9527: Flags [S], seq 3739428838, win 65535, options [mss 16344,nop,wscale 6,nop,nop,TS val 904150004 ecr 0,sackOK,eol], length 0
11:06:23.013755 IP localhost.9527 > localhost.55463: Flags [S.], seq 2449498522, ack 3739428839, win 65535, options [mss 16344,nop,wscale 6,nop,nop,TS val 904150004 ecr 904150004,sackOK,eol], length 0
11:06:23.013771 IP localhost.55463 > localhost.9527: Flags [.], ack 2449498523, win 6379, options [nop,nop,TS val 904150004 ecr 904150004], length 0
11:06:23.013783 IP localhost.9527 > localhost.55463: Flags [.], ack 3739428839, win 6379, options [nop,nop,TS val 904150004 ecr 904150004], length 0
11:06:30.327692 IP localhost.55463 > localhost.9527: Flags [P.], seq 3739428839:3739428842, ack 2449498523, win 6379, options [nop,nop,TS val 904157265 ecr 904150004], length 3
11:06:30.327740 IP localhost.9527 > localhost.55463: Flags [.], ack 3739428842, win 6379, options [nop,nop,TS val 904157265 ecr 904157265], length 0
11:06:31.826987 IP localhost.55463 > localhost.9527: Flags [F.], seq 3739428842, ack 2449498523, win 6379, options [nop,nop,TS val 904158750 ecr 904157265], length 0
11:06:31.827034 IP localhost.9527 > localhost.55463: Flags [.], ack 3739428843, win 6379, options [nop,nop,TS val 904158750 ecr 904158750], length 0
11:06:35.328859 IP localhost.9527 > localhost.55463: Flags [P.], seq 2449498523:2449498530, ack 3739428843, win 6379, options [nop,nop,TS val 904162236 ecr 904158750], length 7
11:06:35.328946 IP localhost.55463 > localhost.9527: Flags [R], seq 3739428843, win 0, length 0
Analysis of the capture above:
C -> S [S]
S -> C [S.]
C -> S [.]
S -> C [.]
C -> S [P.] aaa
S -> C [.]
C -> S [F.]
S -> C [.]
S -> C [P.] Hi, aaa
C -> S [R]
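After the client sends aaa, the server ACKs it. The client then calls close, which sends a FIN to the server; once the server ACKs the FIN, the client reclaims the connection and its resources. When the server finishes processing, it sends the result Hi, aaa to the client. By then the client's socket is gone, so the client's kernel cannot deliver the data to anyone and answers with an RST.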
Effect of shutdown
client:
aaa
now sending aaa
send bytes: 3
shutdown
Hi, aaa
server terminated
server:
received 3 bytes: aaa
send bytes: 7
client closed
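As the output shows, the client calls shutdown right after sending aaa, which leaves the client side of the TCP connection half-closed: the read direction still works but the write direction is closed. When the server finishes processing and sends its reply, the client can still read it. The server then reads EOF on its next read and exits, which closes its side; the client in turn reads EOF, prints server terminated, and exits, at which point the connection is fully closed. Tracing this with tcpdump as before: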
> sudo tcpdump 'tcp and port 9527' -i lo0 -S
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on lo0, link-type NULL (BSD loopback), capture size 262144 bytes
11:06:53.692427 IP localhost.55594 > localhost.9527: Flags [S], seq 2938836011, win 65535, options [mss 16344,nop,wscale 6,nop,nop,TS val 904180495 ecr 0,sackOK,eol], length 0
11:06:53.692546 IP localhost.9527 > localhost.55594: Flags [S.], seq 2801533649, ack 2938836012, win 65535, options [mss 16344,nop,wscale 6,nop,nop,TS val 904180495 ecr 904180495,sackOK,eol], length 0
11:06:53.692562 IP localhost.55594 > localhost.9527: Flags [.], ack 2801533650, win 6379, options [nop,nop,TS val 904180495 ecr 904180495], length 0
11:06:53.692577 IP localhost.9527 > localhost.55594: Flags [.], ack 2938836012, win 6379, options [nop,nop,TS val 904180495 ecr 904180495], length 0
11:06:58.429387 IP localhost.55594 > localhost.9527: Flags [P.], seq 2938836012:2938836015, ack 2801533650, win 6379, options [nop,nop,TS val 904185206 ecr 904180495], length 3
11:06:58.429435 IP localhost.9527 > localhost.55594: Flags [.], ack 2938836015, win 6379, options [nop,nop,TS val 904185206 ecr 904185206], length 0
11:07:00.789790 IP localhost.55594 > localhost.9527: Flags [F.], seq 2938836015, ack 2801533650, win 6379, options [nop,nop,TS val 904187548 ecr 904185206], length 0
11:07:00.789847 IP localhost.9527 > localhost.55594: Flags [.], ack 2938836016, win 6379, options [nop,nop,TS val 904187548 ecr 904187548], length 0
11:07:03.431085 IP localhost.9527 > localhost.55594: Flags [P.], seq 2801533650:2801533657, ack 2938836016, win 6379, options [nop,nop,TS val 904190180 ecr 904187548], length 7
11:07:03.431161 IP localhost.55594 > localhost.9527: Flags [.], ack 2801533657, win 6379, options [nop,nop,TS val 904190180 ecr 904190180], length 0
11:07:03.431663 IP localhost.9527 > localhost.55594: Flags [F.], seq 2801533657, ack 2938836016, win 6379, options [nop,nop,TS val 904190180 ecr 904190180], length 0
11:07:03.431728 IP localhost.55594 > localhost.9527: Flags [.], ack 2801533658, win 6379, options [nop,nop,TS val 904190180 ecr 904190180], length 0
Analysis of the capture above:
C -> S [S]
S -> C [S.]
C -> S [.]
S -> C [.]
C -> S [P.] aaa
S -> C [.]
C -> S [F.]
S -> C [.]
S -> C [P.] Hi, aaa
C -> S [.]
S -> C [F.]
C -> S [.]
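After the client sends aaa, the server ACKs it. The client then calls shutdown, which sends a FIN to the server; once the server ACKs the FIN, the client is half-closed and can read but no longer write. When the server finishes processing, it sends the result Hi, aaa to the client, and the client reads it. Note that at this point only the read/write directions are being closed; no resources have been reclaimed yet. The server reads EOF and sends the final FIN, the client replies with an ACK, and the result is a complete four-way teardown.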
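Note: what gets closed is the socket, not the connection
When I first analyzed this I had a question: if the client is half-closed, meaning it can read but not write, why can it still send an ACK to the server? The answer is that I had not fully understood what closing means here. Half-close applies to the socket descriptor, not to the connection itself. The connection still exists in kernel space, so the kernel TCP stack keeps communicating normally; only write calls on the socket from the user-space program are refused. The picture below makes this clearer:
The red write in user space is closed, but the path from the kernel write buffer to the network card is still open.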