
Preface

Recently I wrote a few simple Python scripts to process some data. Because the functionality was simple, I used print directly to output the logs.

Some exceptions occasionally occurred while the task was running:

Because I print logs in many different places, the error was reported at a different place each time, which led to very strange results: sometimes one piece of code did not run, and the next time a different piece of code was not triggered.

Although I had noticed the key exception, Broken pipe, I did not pay much attention to it. Some of the code sends local HTTP requests, so I kept assuming that the network IO was at fault, and it never occurred to me to suspect the basic print function 🤔.

I did not take this exception seriously until the problem kept recurring, at which point I took a closer look and asked whether print is also an IO operation. Could the built-in print function really be the problem?


However, I had run the script countless times in the local and test environments without seeing any anomaly, so I asked the operations team how the script is run in production.

It turned out that, to make the script tasks submitted by everyone easier to maintain, the operations team maintains a unified script, and that script uses:

cmd = 'python /xxx/test.py'
os.popen(cmd)

to trigger tasks. This is the only difference from my local and development environments.

popen principle

To verify this, I reproduced the exception in the development environment:

test.py:

import time
if __name__ == '__main__':
    time.sleep(20)
    print '1000'*1024

task.py:

import os
import time
if __name__ == '__main__':
    start = int(time.time())
    cmd = 'python test.py'
    os.popen(cmd)
    end = int(time.time())
    print 'end****{}s'.format(end-start)

run:

python task.py

After waiting 20 seconds, the exception is reliably reproduced:

Traceback (most recent call last):
  File "test.py", line 4, in <module>
    print '1000'*1024
IOError: [Errno 32] Broken pipe

Why does this exception occur?

First, you have to understand how the function os.popen(command[, mode[, bufsize]]) works.

According to the official documentation, this function forks a child process to execute command, and at the same time connects the standard output of the child process to the parent process through a pipe;

The file object returned by this method is the parent's end of that pipe.

A diagram makes the principle easier to understand:

In the usage scenario here, the return value of popen() is never read, so the execution of command is essentially asynchronous;

That is to say, when task.py finishes, the read end of the pipe is closed automatically.

As shown, after the read end is closed the child process still writes the output of print '1000'*1024 to the pipe; the content is large enough to fill the output buffer, so it is flushed into the pipe all at once;

So the writing end receives the SIGPIPE signal (Python ignores this signal by default, so the write fails with EPIPE instead of killing the process), which results in the Broken pipe exception.

Wikipedia also lists the conditions under which this exception occurs:

The SIGPIPE signal is mentioned there as well.
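
The mechanism can be reproduced without any child process at all. The following is a minimal sketch, written for Python 3 (the scripts in this article are Python 2), and the function name write_to_closed_pipe is only an illustrative assumption. It creates an anonymous pipe, closes the read end, and then writes to the write end:

```python
import errno
import os


def write_to_closed_pipe():
    """Write to a pipe whose read end is already closed and return the errno."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # no reader is left, like the pipe after task.py has exited
    try:
        os.write(write_fd, b"1000" * 1024)
    except BrokenPipeError as exc:
        # Python ignores SIGPIPE by default, so the failed write surfaces as an exception.
        return exc.errno
    finally:
        os.close(write_fd)
    return None


if __name__ == "__main__":
    print(write_to_closed_pipe() == errno.EPIPE)
```

EPIPE is the errno 32 that appears in the traceback above.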

Solution

Now that the cause of the problem is known, it is relatively simple to solve it. There are mainly the following solutions:

  1. Use the read() function to read the data in the pipe, and close the pipe after all the data has been read.
  2. If you do not need the output of the child process, you can redirect the standard output of command to /dev/null.
  3. You can also run the command with subprocess.Popen from the subprocess module in Python 3.

Here is a demonstration using the first option:

import os
import time
if __name__ == '__main__':
    start = int(time.time())
    cmd = 'python test.py'
    with os.popen(cmd) as p:
        print p.read()
    end = int(time.time())
    print 'end****{}s'.format(end-start)

After running task.py, no exception is thrown, and the output of command is printed normally.
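
For comparison, the second and third options from the list might look roughly like the sketch below. It is written for Python 3 and makes a few assumptions: CHILD_CODE is a stand-in for test.py so that the example is self-contained, subprocess.DEVNULL is used in place of a shell redirect to /dev/null, and the function names are hypothetical.

```python
import subprocess
import sys

# Stand-in for test.py so the sketch is self-contained (an assumption, not the original script).
CHILD_CODE = "print('1000' * 1024)"


def run_and_discard():
    """Option 2: send the child's standard output to /dev/null."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        stdout=subprocess.DEVNULL,
    )
    return result.returncode


def run_and_capture():
    """Option 3: let subprocess read the pipe until the child exits."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        stdout=subprocess.PIPE,
        universal_newlines=True,  # decode the output to str
    )
    return result.returncode, result.stdout


if __name__ == "__main__":
    print(run_and_discard())
    code, output = run_and_capture()
    print(code, len(output))
```

In both cases the parent waits for the child, so the read end of the pipe is never closed while the child is still writing.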

I did not use this solution for the production fix. To make the logs easier to view, I switched to a standard logging framework that ships the logs to Elasticsearch, so they can all be viewed in Kibana.

Since the logging framework does not write through the pipe, this problem naturally does not occur.
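
As a rough illustration of that approach, the sketch below sends log records to a file through the standard logging module instead of print. It is only an assumption about what such a setup could look like: the logger name, the file name task.log, and the format string are hypothetical, and shipping the file to Elasticsearch is left out entirely.

```python
import logging


def build_logger(log_path):
    """Return a logger that appends to log_path instead of writing to stdout."""
    logger = logging.getLogger("task_sketch")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.handlers.clear()  # avoid duplicate handlers if called more than once
    handler = logging.FileHandler(log_path)
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    logger.propagate = False  # keep records away from the root logger's stream handlers
    return logger


if __name__ == "__main__":
    log = build_logger("task.log")
    log.info("processed %d records", 1024)
```

Because the records go to a file handler, nothing depends on whether the parent process is still reading the child's standard output.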

More content

Although the problem is solved, it touches on some knowledge that we usually do not pay much attention to, so let's review it together.

The first topic is parent and child processes. They are more common in C/C++/Python, whereas Java and Go tend to use threads or coroutines directly.

For example, os.popen() in Python, discussed above, creates a child process. Since it is a child process, it must communicate with the parent process in order to cooperate with it.

It is easy to imagine that the parent and child processes can communicate through the pipes (anonymous pipes) mentioned above.
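
A minimal sketch of this kind of communication, assuming a Unix system and Python 3, is shown below. It uses os.pipe() and os.fork() directly rather than os.popen(), and the function name read_from_child is hypothetical:

```python
import os


def read_from_child(message):
    """Fork a child that writes message into a pipe; the parent reads it back."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: keep only the write end, send the message, then exit immediately.
        try:
            os.close(read_fd)
            os.write(write_fd, message)
        finally:
            os._exit(0)
    # Parent: close its copy of the write end so read() can reach end-of-file.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as reader:
        data = reader.read()
    os.waitpid(pid, 0)
    return data


if __name__ == "__main__":
    print(read_from_child(b"hello from the child"))
```

Here the parent reads until end-of-file before doing anything else, which is the step that was missing when the return value of os.popen() was ignored.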

Take the Python program from earlier as an example. When task.py is run, two processes are created:

Go into the /proc/pid/fd directory of each of the two processes to see the file descriptors that each one has opened.

Parent process:

Child process:

You can see that the standard output of the child process is connected to the parent process; this is the pipe behind the file object returned by popen().

Here 0, 1, and 2 correspond to stdin (standard input), stdout (standard output), and stderr (standard error) of a process.
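
The same information can be read from Python. The sketch below assumes a Linux system with /proc mounted and Python 3; the helper name open_file_descriptors is hypothetical. It resolves every entry in /proc/<pid>/fd to the object it points at:

```python
import os


def open_file_descriptors(pid="self"):
    """Map each open descriptor number of a process to its target (Linux only)."""
    fd_dir = "/proc/{}/fd".format(pid)
    descriptors = {}
    for name in os.listdir(fd_dir):
        try:
            descriptors[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor used to list the directory may already be closed.
            continue
    return descriptors


if __name__ == "__main__":
    for fd, target in sorted(open_file_descriptors().items()):
        print(fd, target)
```

A descriptor that belongs to an anonymous pipe shows up with a target of the form pipe:[inode], and both ends of the same pipe share the same inode number.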

One more thing to note: when we open a file descriptor in the parent process, the child process inherits it as well;

For example, add this line of code to task.py, before the os.popen() call:

x = open("1.txt", "w")

Afterwards, when you check the file descriptors, you will find that both the parent and the child process have this file open:

Conversely, the parent process will not have the files opened in the child process, which should be easy to understand.
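
Inheritance can also be checked in code. The sketch below assumes a Unix system and Python 3, and the function name is hypothetical. It deliberately uses os.fork() instead of os.popen(): in Python 3, file descriptors are non-inheritable across exec by default and subprocess closes extra descriptors in the child, so the Python 2 behaviour described here is not reproduced by simply starting a new program.

```python
import os
import tempfile


def child_writes_to_inherited_fd():
    """Open a file in the parent, fork, and let the child write through the same descriptor."""
    with tempfile.TemporaryFile() as shared:
        fd = shared.fileno()
        pid = os.fork()
        if pid == 0:
            # Child: the descriptor opened by the parent is still valid here.
            try:
                os.write(fd, b"written by child")
            finally:
                os._exit(0)
        os.waitpid(pid, 0)
        # Parent and child share one file offset, so rewind before reading.
        os.lseek(fd, 0, os.SEEK_SET)
        return os.read(fd, 64)


if __name__ == "__main__":
    print(child_writes_to_inherited_fd())
```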

Summary

Basic knowledge is especially important when troubleshooting weird problems, such as the pipe communication between parent and child processes involved this time. Finally, let's summarize:

  1. os.popen() executes the command asynchronously. If you need the output of the child process, you have to call read() yourself.
  2. The parent and child processes communicate through an anonymous pipe. Once the read end is closed, the next write from the child (which happens when its output fills the buffer and is flushed to the pipe) triggers the SIGPIPE signal, and the Broken pipe exception is thrown.
  3. The child process inherits the file descriptors of the parent process.

crossoverJie