How to develop on-device intelligent algorithms efficiently? A detailed look at Python debugging in the MNN workbench

With the rapid development of the mobile Internet, artificial intelligence is being applied more and more widely on mobile devices. Within the Group, on-device intelligence plays an important role in core scenarios such as image recognition, video detection, and data computation. Python is undoubtedly the language of choice for algorithm development. On mobile devices, however, deploying, debugging, and verifying algorithms is still in the "slash and burn" era: at present, algorithm developers mainly verify program logic and results by inserting log statements into the code.

Logging can of course verify results and locate problems, but as soon as a project becomes a little more complicated, productivity drops sharply. For this reason, on-device Python debugging is built into the MNN workbench (available from the MNN official website: www.mnn.zone). Anyone who uses Python frequently will be familiar with the pdb module, the interactive source code debugger in the Python standard library. Like the debuggers of other languages, pdb provides general debugging capabilities such as setting breakpoints at source-line level and single-stepping, and it is a very important tool for Python development.

Today, let us focus on analyzing the source code of the official pdb module and take a look at the underlying technical principles of its debugging function.

Principle

As the CPython source code shows, the pdb module is not a built-in module implemented in C but a module written in pure Python. The core file is pdb.py, whose Pdb class inherits from classes in the bdb and cmd modules:

class Pdb(bdb.Bdb, cmd.Cmd):    ...

Basic principle: pdb uses the cmd module to define and implement a set of interactive debugging commands. It uses sys.settrace instrumentation to track the stack frames of the running code, controls execution and breakpoint state according to the debugging command entered, and prints the corresponding information to the console.

The cmd module mainly provides console command interaction. It waits for input through blocking calls (raw_input or readline), then hands the command to the subclass for processing; the result decides whether the input loop continues, as the name of its main method, cmdloop, suggests.
cmd is a general-purpose module that was not designed specifically for pdb; pdb uses the cmd framework to implement its interactive, custom debugging commands.
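To make the role of cmd concrete, here is a minimal sketch of a command loop built on cmd.Cmd. The class MiniShell, its commands inc and go, and the helper run_session are made up for this illustration and are not part of pdb; the point is only that a do_ method returning a true value ends cmdloop, while one returning nothing keeps the loop waiting for input.

```python
import cmd
import io


class MiniShell(cmd.Cmd):
    """Hypothetical toy shell; it only mimics how pdb builds on cmd.Cmd."""

    prompt = "(mini) "

    def __init__(self, stdin, stdout):
        super().__init__(stdin=stdin, stdout=stdout)
        self.use_rawinput = False  # read commands from self.stdin, not input()
        self.counter = 0

    def do_inc(self, arg):
        """Information-style command: no return value, so the loop continues."""
        self.counter += 1
        self.stdout.write("counter=%d\n" % self.counter)

    def do_go(self, arg):
        """Flow-control-style command: a true return value ends cmdloop."""
        return True


def run_session(text):
    """Feed scripted input to the shell and return (counter, console output)."""
    out = io.StringIO()
    shell = MiniShell(io.StringIO(text), out)
    shell.cmdloop()
    return shell.counter, out.getvalue()


counter, output = run_session("inc\ninc\ngo\ninc\n")
print(counter)
print(output)
```

The last inc in the scripted input is never processed, because go has already ended the loop. This is the same mechanism that lets a flow control command in pdb hand control back to the traced program.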

bdb provides the core debugging framework. It relies on sys.settrace to trace the code step by step and dispatches the corresponding events (call/line/return/exception) to the subclass (pdb) for processing. It also holds the core interrupt control logic for debugging commands: for example, after a single-step "s" command it decides whether to keep tracing or to interrupt and wait for interactive input, and in which frame to interrupt.
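As a rough illustration of this division of labour, the sketch below subclasses bdb.Bdb directly and records the events it is handed instead of opening a prompt. EventRecorder and the sample functions add and func are hypothetical names for this example; user_call, user_line, and user_return are the hooks bdb calls on its subclass, and runcall is the bdb method that runs a function under the trace.

```python
import bdb


def add(a, b):
    return a + b


def func(x):
    y = add(x, 1)
    return y


class EventRecorder(bdb.Bdb):
    """Hypothetical Bdb subclass: records events instead of prompting."""

    def __init__(self):
        super().__init__()
        self.events = []

    def user_call(self, frame, argument_list):
        self.events.append(("call", frame.f_code.co_name))

    def user_line(self, frame):
        self.events.append(("line", frame.f_code.co_name))

    def user_return(self, frame, return_value):
        self.events.append(("return", frame.f_code.co_name, return_value))


recorder = EventRecorder()
result = recorder.runcall(func, 2)  # installs trace_dispatch via sys.settrace
print(result)
for event in recorder.events:
    print(event)
```

Because no set_* method is ever called here, the recorder stays in the initial single-step state, so the hooks fire for every traced line. pdb implements the same hooks but enters its interactive loop inside them.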

Basic process

When pdb starts, the trace_dispatch function is bound to the current frame:

def trace_dispatch(self, frame, event, arg):
    if self.quitting:
        return # None
    if event == 'line':
        return self.dispatch_line(frame)
    if event == 'call':
        return self.dispatch_call(frame, arg)
    if event == 'return':
        return self.dispatch_return(frame, arg)
    if event == 'exception':
        ...

The handling of each event in each frame goes through the interrupt control logic, mainly the stop_here function (line events also go through break_here). This determines whether execution is interrupted and at which line.
If an interruption is needed, the subclass method user_<event> (user_call, user_line, user_return, or user_exception) is triggered. Through its interaction method, the subclass updates the stack frame information and prints it on the console, then runs cmdloop so that the console waits for interactive input:

def interaction(self, frame, traceback):
     self.setup(frame, traceback) # current stack, frame, local vars
     self.print_stack_entry(self.stack[self.curindex])
     self.cmdloop()
     self.forget()

The user enters a debugging command such as "next" and presses Enter. The corresponding set_<command> method (for example set_next) is called first; it sets stopframe, returnframe, and stoplineno, which affect stop_here and therefore determine whether execution is interrupted at the next trace step.

def _set_stopinfo(self, stopframe, returnframe, stoplineno=0):
     self.stopframe = stopframe
     self.returnframe = returnframe
     self.quitting = 0
     # stoplineno >= 0 means: stop at line >= the stoplineno
     # stoplineno -1 means: don't stop at all
     self.stoplineno = stoplineno

For process control commands, the do_<command> method generally returns 1, so the current cmdloop ends immediately and a new cmdloop starts the next time an interrupt is triggered (the interaction step above). For information commands, the do_<command> method has no return value, so the current interrupt state is kept.
The code then runs to the next trace step, and the interrupt check described above is repeated.

Interrupt control

Interrupt control means that, after a debugging command is entered, the code runs to the correct position and stops there to wait for user input. For example, after "s" the console should stop at the next line of code that is traced, while after "c" execution should run to the next breakpoint. Interrupt control takes place at every step of the sys.settrace trace and is the core logic of the debugger.

pdb mainly tracks four frame events:

line: a line is about to be executed; lines in the same frame generate line events in execution order
call: a function is called and execution enters the next-level frame; a call event is generated at the first line of the function
return: after the function has executed its last line, the result is returned and execution leaves the current frame for the previous one; a return event is generated at the last executed line of the function
exception: an exception occurs while the function is executing; an exception event is generated at the line that raised it, followed by a return event for that frame, and exception and return events are then generated in each frame one level up until the bottom frame is reached
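The four events can be observed directly with sys.settrace, without pdb or bdb. The following is a small sketch; the functions inner, outer, and record_events are invented for the example, and the tracer simply records the event name together with the function the frame belongs to.

```python
import sys


def inner():
    raise ValueError("boom")


def outer():
    inner()


def record_events(func):
    """Run func under a trace function and collect (event, function name)."""
    events = []

    def tracer(frame, event, arg):
        events.append((event, frame.f_code.co_name))
        return tracer  # keep tracing line/return/exception events in this frame

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func()
    except ValueError:
        pass  # the exception has already passed through inner and outer
    finally:
        sys.settrace(previous)
    return events


events = record_events(outer)
for event in events:
    print(event)
```

In the printed list, the exception event for inner is followed by its return event, and the same pair then appears for outer, which matches the propagation described for the exception event above.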

These four events are the different node types encountered during code tracing. According to the debugging command entered by the user, pdb performs interrupt control at each tracing step to decide whether to interrupt next and at which line. The main interrupt control method is stop_here:

def stop_here(self, frame):
        # (CT) stopframe may now also be None, see dispatch_call.
        # (CT) the former test for None is therefore removed from here.
        if self.skip and \
               self.is_skipped_module(frame.f_globals.get('__name__')):
            return False


        # next
        if frame is self.stopframe:
            # stoplineno >= 0 means: stop at line >= the stoplineno
            # stoplineno -1 means: don't stop at all
            if self.stoplineno == -1:
                return False
            return frame.f_lineno >= self.stoplineno


        # step: as long as the frame can be traced back to botframe, stop and wait.
        while frame is not None and frame is not self.stopframe:
            if frame is self.botframe:
                return True
            frame = frame.f_back
        return False
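The effect of the stop information on stop_here can be tried out without a real debugger. The sketch below is a simplified model, not bdb itself: Frame is a hypothetical stand-in for a frame object with only f_back and f_lineno, skip handling is left out, and the set_* methods follow the settings described in this article for each command.

```python
class Frame:
    """Hypothetical stand-in for a frame: only f_back and f_lineno."""

    def __init__(self, name, back=None, lineno=1):
        self.name, self.f_back, self.f_lineno = name, back, lineno


class StopModel:
    """Simplified model of bdb's stop bookkeeping (no skip handling)."""

    def __init__(self, botframe):
        self.botframe = botframe
        self._set_stopinfo(None, None)

    def _set_stopinfo(self, stopframe, returnframe, stoplineno=0):
        self.stopframe = stopframe
        self.returnframe = returnframe
        self.stoplineno = stoplineno

    def set_step(self):
        self._set_stopinfo(None, None)

    def set_next(self, frame):
        self._set_stopinfo(frame, None)

    def set_return(self, frame):
        self._set_stopinfo(frame.f_back, frame)

    def set_until(self, frame):
        self._set_stopinfo(frame, frame, frame.f_lineno + 1)

    def set_continue(self):
        self._set_stopinfo(self.botframe, None, -1)

    def stop_here(self, frame):
        if frame is self.stopframe:
            if self.stoplineno == -1:
                return False
            return frame.f_lineno >= self.stoplineno
        while frame is not None and frame is not self.stopframe:
            if frame is self.botframe:
                return True
            frame = frame.f_back
        return False


# Three nested frames: frame0 (bottom) -> frame1 -> frame2 (top).
frame0 = Frame("frame0")
frame1 = Frame("frame1", back=frame0, lineno=13)
frame2 = Frame("frame2", back=frame1, lineno=8)
model = StopModel(botframe=frame0)

model.set_step()
print("step, in frame2:", model.stop_here(frame2))
model.set_next(frame1)
print("next from frame1, in frame2:", model.stop_here(frame2))
model.set_continue()
print("continue, in frame2:", model.stop_here(frame2))
```

Changing the command and the frame passed to stop_here is a quick way to check the per-command explanations in the following sections.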

Debug commands are roughly divided into two categories:

Process control: commands such as step, next, and continue; after they are executed, the code immediately enters its next stage of execution
Information acquisition/setting: commands such as args, p, and list, which show or set current information and do not affect the cmd state

The following focuses on how interrupt control is implemented for the most common process control commands:

s（step）

1 Command definition

Execute the next statement. If the statement is a function call, s enters the function and stops at its first line.

2 Code analysis

pdb implements this by stopping at every trace step and waiting for input, so its granularity is the same as that of sys.settrace.

def stop_here(self, frame):
        ...
        # stopframe is None
        if frame is self.stopframe:
            ...
        # the current frame can always be traced back to botframe, so return True
        while frame is not None and frame is not self.stopframe:
            if frame is self.botframe:
                return True
            frame = frame.f_back
        return False

step sets stopframe to None, so as long as the current frame can be traced back to the bottom frame (botframe), execution stops and pdb enters the interactive waiting state.

Because step has the same granularity as sys.settrace, it waits at every trace step.

n（next）

1 Command definition

Execute the next statement. If this statement is a function call, execute the function, and then execute the next statement of the currently executed statement.

2 Code analysis

pdb implements this as follows: the next trace step in the current frame causes an interruption, but entering a new frame (a function call) does not.

def stop_here(self, frame):
        ...
        # while execution stays in stopframe, this always returns True
        if frame is self.stopframe:
            if self.stoplineno == -1:
                return False
            return frame.f_lineno >= self.stoplineno


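        # if execution has left stopframe for a frame called from it, there is no interruption until it comes back to stopframe
        # one more case: if next was issued while stopped on a return event, the next trace is in the calling frame, which can be traced back to botframe, so execution is interrupted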
        while frame is not None and frame is not self.stopframe:
            if frame is self.botframe:
                return True
            frame = frame.f_back
        return False

next sets stopframe to the current frame, so execution is interrupted in the current frame but not in frames entered from it.
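The difference between step and next can be checked with a small driver built on bdb.Bdb. This is a sketch under assumptions: Stepper, stops_for, add, and func are hypothetical names, and the driver issues the same command at every stop instead of reading it from a console as pdb does.

```python
import bdb


def add(a, b):
    return a + b


def func(x):
    y = add(x, 1)
    return y


class Stepper(bdb.Bdb):
    """Hypothetical driver: issues the same command at every line stop."""

    def __init__(self, command):
        super().__init__()
        self.command = command
        self.stops = []

    def user_line(self, frame):
        self.stops.append(frame.f_code.co_name)
        if self.command == "next":
            self.set_next(frame)   # stopframe = current frame
        else:
            self.set_step()        # stopframe = None


def stops_for(command):
    """Return the names of the functions in which execution stopped."""
    debugger = Stepper(command)
    debugger.runcall(func, 2)
    return debugger.stops


print("step:", stops_for("step"))
print("next:", stops_for("next"))
```

With step, the list of stops includes the body of add; with next, add runs to completion without a stop and only lines of func are recorded.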

c

1 Command definition

Continue execution until the next breakpoint is encountered.

2 Code analysis

stopframe is set to botframe and stoplineno to -1. stop_here then always returns False, and execution is not interrupted until a breakpoint is hit (the break_here condition holds).

def stop_here(self, frame):
        ...
        # in botframe, stoplineno is -1, so return False
        if frame is self.stopframe:
            if self.stoplineno == -1:
                return False
            return frame.f_lineno >= self.stoplineno
        # in any other frame, the walk reaches stopframe first, so return False
        while frame is not None and frame is not self.stopframe:
            if frame is self.botframe:
                return True
            frame = frame.f_back
        return False

r（return）

1 Command definition

Execute the current running function to the end.

2 Code analysis

The return command only interrupts when execution reaches the end of the frame (the function call), that is, when the return event occurs.
pdb sets stopframe to the previous frame and returnframe to the current frame. For events other than return, stop_here always returns False in the current frame, so there is no interruption:

def stop_here(self, frame):
        ...
        # stopframe is the previous (calling) frame and stoplineno is 0,
        # so this branch only returns True once execution is back in the caller
        if frame is self.stopframe:
            if self.stoplineno == -1:
                return False
            return frame.f_lineno >= self.stoplineno


        # in the current frame, the walk reaches stopframe (its f_back) first, so return False, as with next
        while frame is not None and frame is not self.stopframe:
            if frame is self.botframe:
                return True
            frame = frame.f_back
        return False

On a return event, stop_here still returns False, but the current frame equals returnframe, so the condition in dispatch_return is true and execution is interrupted:

def dispatch_return(self, frame, arg):
        if self.stop_here(frame) or frame == self.returnframe:
            self.user_return(frame, arg)
            if self.quitting: raise BdbQuit
        return self.trace_dispatch
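A small bdb-based driver can show this behaviour. In this sketch, which is an illustration rather than pdb's own code, ReturnDriver, add, and func are hypothetical names: the driver single-steps until it reaches add, then calls set_return, which corresponds to typing r at that stop.

```python
import bdb


def add(a, b):
    total = a + b
    return total


def func(x):
    y = add(x, 1)
    return y


class ReturnDriver(bdb.Bdb):
    """Hypothetical driver: steps into add, then behaves like the r command."""

    def __init__(self):
        super().__init__()
        self.stops = []

    def user_line(self, frame):
        name = frame.f_code.co_name
        self.stops.append(("line", name))
        if name == "add":
            self.set_return(frame)  # stopframe = caller, returnframe = add
        else:
            self.set_step()

    def user_return(self, frame, return_value):
        self.stops.append(("return", frame.f_code.co_name, return_value))


driver = ReturnDriver()
driver.runcall(func, 2)
for stop in driver.stops:
    print(stop)
```

Only the first line of add appears as a line stop; its second line is skipped, and the next stop inside add is the return event, reported through user_return because the frame equals returnframe.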

unt（until）

1 Command definition

Execute until a line with a number greater than the current one is reached. The difference from next is that a for loop is only stepped through once.

2 Code analysis

stopframe and returnframe are both set to the current frame, and stoplineno to the current lineno + 1.

def stop_here(self, frame):
        ...
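        # when the code in the current frame runs sequentially, the lineno of the next trace == stoplineno
        # at the last line of a for loop, the next trace (the first line of the loop) has lineno < stoplineno, so there is no interruption; once the loop has finished, the line that follows has lineno == stoplineno and execution is interrupted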
        if frame is self.stopframe:
            if self.stoplineno == -1:
                return False
            return frame.f_lineno >= self.stoplineno


        # in a frame entered from stopframe, the walk reaches stopframe first, so return False, as with next
        while frame is not None and frame is not self.stopframe:
            if frame is self.botframe:
                return True
            frame = frame.f_back
        return False

If the current frame contains a for loop, the loop is only stepped through once from top to bottom. When the function returns, the lineno of the return event may be less than stoplineno, so returnframe is also set to the current frame; the return then still causes an interruption, just as with next.
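The loop behaviour of until can be compared with next using a bdb-based driver. The names LineDriver, lines_for, and loop are hypothetical, and the driver records line numbers relative to the def line of loop so that the output does not depend on where the code is placed in a file.

```python
import bdb


def loop(n):
    total = 0              # relative line 1
    for i in range(n):     # relative line 2
        total += i         # relative line 3
    return total           # relative line 4


class LineDriver(bdb.Bdb):
    """Hypothetical driver: issues until or next at every line stop."""

    def __init__(self, command):
        super().__init__()
        self.command = command
        self.lines = []

    def user_line(self, frame):
        self.lines.append(frame.f_lineno - frame.f_code.co_firstlineno)
        if self.command == "until":
            self.set_until(frame)  # stoplineno = current lineno + 1
        else:
            self.set_next(frame)   # stoplineno = 0


def lines_for(command, n=3):
    """Return the relative line numbers at which execution stopped."""
    driver = LineDriver(command)
    driver.runcall(loop, n)
    return driver.lines


print("until:", lines_for("until"))
print("next: ", lines_for("next"))
```

With until, each line of the function is visited once even though the loop body runs three times; with next, the loop lines show up again on every iteration.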

u（up）/ d（down）

1 Command definition

Switch to the previous/next stack frame

2 Code analysis

Stack frame information

The stack frame list contains the frame of each level on the call path. It is refreshed every time execution is interrupted, and the u/d commands switch up and down between its frames.
The list is obtained mainly through the get_stack method, whose first parameter is a frame and whose second is a traceback object. The traceback object is produced by the exception event, which carries an arg parameter:

exc_type, exc_value, exc_traceback = arg
(<type 'exceptions.IOError'>, (2, 'No such file or directory', 'wdwrg'), <traceback object at 0x10bd08a70>)

The traceback object has several commonly used attributes:

tb_frame: the frame at the current level of the traceback
tb_lineno: the line number in that frame where the exception occurred
tb_next: the traceback object for the next level of the stack (towards the frame where the exception was raised), or None if there is no next level

The stack frame information consists of two parts, the call stack of the frame and the exception stack (if any), in order: botframe -> frame1 -> frame2 -> tb1 -> tb2 (error tb)
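The traceback chain can be inspected with a few lines of code. In this sketch the functions add, func, and traceback_chain are invented for the example; it walks tb_next from the frame that caught the exception to the frame that raised it and collects the function name of each tb_frame.

```python
import sys


def add(a, b):
    return a + b  # raises TypeError when given an int and a str


def func():
    return add(1, "x")


def traceback_chain():
    """Return the exception type and the function names along the traceback."""
    try:
        func()
    except TypeError:
        exc_type, exc_value, tb = sys.exc_info()
        names = []
        while tb is not None:
            names.append(tb.tb_frame.f_code.co_name)
            tb = tb.tb_next  # one level closer to the frame that raised
        return exc_type, names


exc_type, names = traceback_chain()
print(exc_type.__name__, names)
```

The chain starts at the frame that handled the exception and ends at the frame where it was raised, which is the order in which get_stack appends the traceback entries after the call stack.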

def get_stack(self, f, t):
        stack = []
        if t and t.tb_frame is f:
            t = t.tb_next
        # frame call stack, from bottom to top
        while f is not None:
            stack.append((f, f.f_lineno))
            if f is self.botframe:
                break
            f = f.f_back
        stack.reverse()
        i = max(0, len(stack) - 1) 


        # exception stack, from bottom to top (the error stack)
        while t is not None:
            stack.append((t.tb_frame, t.tb_lineno))
            t = t.tb_next


        if f is None:
            i = max(0, len(stack) - 1)
        return stack, i

Each time pdb interrupts execution, it updates the stack frame list and the current stack frame; switching frames then only requires moving the index up or down.

def setup(self, f, t):
        self.forget()
        self.stack, self.curindex = self.get_stack(f, t)
        self.curframe_locals = self.curframe.f_locals
        ...
...
def do_up(self, arg):
        if self.curindex == 0:
            print >>self.stdout, '*** Oldest frame'
        else:
            self.curindex = self.curindex - 1
            self.curframe = self.stack[self.curindex][0]
            self.curframe_locals = self.curframe.f_locals
            self.print_stack_entry(self.stack[self.curindex])
            self.lineno = None
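The index-based switching can be sketched independently of pdb. Here build_stack reproduces only the call-stack part of get_stack (the traceback part is left out), and StackCursor is a hypothetical helper that keeps the stack and the current index; add, func, and main exist only to create three nested frames.

```python
import sys


def build_stack(frame, botframe):
    """Collect (frame, lineno) pairs from botframe up to the given frame."""
    stack = []
    while frame is not None:
        stack.append((frame, frame.f_lineno))
        if frame is botframe:
            break
        frame = frame.f_back
    stack.reverse()
    return stack, max(0, len(stack) - 1)


class StackCursor:
    """Hypothetical helper: the stack plus the index of the current frame."""

    def __init__(self, stack, index):
        self.stack, self.index = stack, index

    def current(self):
        return self.stack[self.index][0].f_code.co_name

    def up(self):
        if self.index > 0:  # index 0 is the oldest frame
            self.index -= 1
        return self.current()

    def down(self):
        if self.index + 1 < len(self.stack):  # last index is the newest frame
            self.index += 1
        return self.current()


def add(bottom):
    return StackCursor(*build_stack(sys._getframe(), bottom))


def func(bottom):
    return add(bottom)


def main():
    return func(sys._getframe())


cursor = main()
print(cursor.current(), cursor.up(), cursor.up(), cursor.up(), cursor.down())
```

Moving up past the oldest frame leaves the index unchanged, which is the case in which do_up prints its Oldest frame message.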

b（break）

Unlike the process control commands, the break command sets breakpoints. It does not immediately change the interruption state of the program but may affect later interruptions. When a line event occurs, the break_here condition is checked in addition to stop_here. Setting a breakpoint is relatively simple to implement, so this section mainly describes how execution is made to stop at the first line of a function when a breakpoint is set on that function.

When a breakpoint is set on a function, the lineno of the breakpoint is the first line of the function (co_firstlineno):

# Function breakpoint example: break func
def do_break(self, arg, temporary = 0):
        ...
        if hasattr(func, 'im_func'):
            func = func.im_func
        ...
        funcname = code.co_name
        lineno = code.co_firstlineno
        filename = code.co_filename

When a line event reaches the first executable line of the function, no breakpoint has been set explicitly on that line, but the function's first line (co_firstlineno) is in the breakpoint list, so break_here goes on to check whether the breakpoint is effective:

def break_here(self, frame):
        ...
        lineno = frame.f_lineno
        if not lineno in self.breaks[filename]:
            lineno = frame.f_code.co_firstlineno
            if not lineno in self.breaks[filename]:
                return False


        # flag says ok to delete temp. bp
        (bp, flag) = effective(filename, lineno, frame)

Whether the breakpoint is effective is decided by the effective function, which handles the ignore and enabled settings; for function breakpoints it relies on the checkfuncname function:

def checkfuncname(b, frame):
    """Check whether we should break here because of `b.funcname`."""
    ...


    # Breakpoint set via function name.
    ...


    # We are in the right frame.
    if not b.func_first_executable_line:
        # The function is entered for the 1st time.
        b.func_first_executable_line = frame.f_lineno


    if  b.func_first_executable_line != frame.f_lineno:
        # But we are not at the first line number: don't break.
        return False
    return True

When the first line event occurs inside the function, func_first_executable_line has not been set yet, so it is set to the current line number and the breakpoint takes effect: execution is interrupted at the first executable line of the function. On subsequent line events, func_first_executable_line already has a value that differs from the current line number, so break_here finds no effective breakpoint and execution is not interrupted.
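The function breakpoint logic can be exercised with bdb alone. The sketch below rests on a few assumptions that are specific to this example: the source is compiled under the made-up file name <break-demo> and registered in linecache so that set_break can validate the line number, and BreakDriver is a hypothetical driver that calls set_continue at every stop, as if the user always typed c.

```python
import bdb
import linecache

SOURCE = (
    "def add(a, b):\n"          # line 1: co_firstlineno of add
    "    total = a + b\n"       # line 2: first executable line of add
    "    return total\n"        # line 3
    "\n"
    "def func(x):\n"            # line 5
    "    return add(x, 1)\n"    # line 6
)
FILENAME = "<break-demo>"  # hypothetical in-memory file name

# Register the source so that set_break can look up the breakpoint line.
linecache.cache[FILENAME] = (len(SOURCE), None, SOURCE.splitlines(True), FILENAME)
namespace = {}
exec(compile(SOURCE, FILENAME, "exec"), namespace)


class BreakDriver(bdb.Bdb):
    """Hypothetical driver: behaves as if c were typed at every stop."""

    def __init__(self):
        super().__init__()
        self.stops = []

    def user_line(self, frame):
        self.stops.append((frame.f_code.co_name, frame.f_lineno))
        self.set_continue()


driver = BreakDriver()
# Function breakpoint: registered on co_firstlineno (line 1) with funcname set.
error = driver.set_break(FILENAME, 1, funcname="add")
driver.runcall(namespace["func"], 2)
print(error, driver.stops)
```

The first recorded stop comes from the initial single-step state in func. After that, the only stop is on line 2 of add, even though the breakpoint is registered on line 1, and line 3 does not stop because func_first_executable_line has already been set to 2.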

Case Analysis

The following uses a very simple Python debugging example to review how the commands above are implemented.

[Figure: console snapshot of the debugging session run from the command line]

When python test.py is executed on the command line, the Python code actually runs from the first line, but because pdb.set_trace() is called in __main__, pdb's trace function is only attached from the line after set_trace, and interrupt control of the frames starts there.

Executing this Python code goes through 3 frames:

The bottom root frame, frame0, where __main__ runs; it contains a for loop, and its back frame is None
The second layer, frame1, entered for the func function; its back frame is frame0
The top layer, frame2, entered for the add function; its back frame is frame1

Debugging process:

1. The frame where __main__ runs (root frame0) is traced, and a line event is triggered on line 20.
2. The user enters unt and presses Enter. frame0 triggers a line event on line 21; the line number equals the previously traced line number + 1, so stop_here holds and execution is interrupted to wait for input.
3. The user enters unt and presses Enter; as in step 2, execution is interrupted on line 22.
4. The user enters unt and presses Enter. The trace returns to line 20 of frame0 and triggers a line event; the line number is less than the last traced line number + 1 (23), so stop_here does not hold and execution continues.
5. A line event is triggered on line 24; the line number is greater than the last traced line number + 1 (23), so stop_here holds and execution is interrupted to wait for input.
6. The user enters s and presses Enter. The trace enters frame1 and triggers a call event on line 12. step has the same granularity as sys.settrace, so execution is interrupted on line 12.
7. The user sets a breakpoint on the add function; a breakpoint on the first line of add (line 7) is added to the breakpoint list.
8. The user enters c and presses Enter. stop_here always returns False and tracing continues until a line event is triggered on line 8. Line 8 itself is not in the breakpoint list, but co_firstlineno of the current function frame (line 7) is, and the breakpoint is effective, so execution is interrupted on line 8.
9. The user enters r and presses Enter. For the following line events stop_here returns False, until a return event is triggered on line 10. At that point returnframe is the current frame, so execution is interrupted on line 10.
10. The user enters up. The stack index moves back by one to the previous frame, frame1, at the point where func calls add on line 13.
11. The user enters down. The stack index moves forward again and returns to the current frame.
12. The user enters n and execution runs to the next trace, line 14 (line event). This trace is in frame1, which can be traced back to botframe, so execution is interrupted on line 14.
13. The user enters n and execution runs to the next trace, line 14 (return event), still in frame1; execution is interrupted.
14. The user enters n and execution runs to the next trace, line 24 (return event); this trace is in botframe (frame0), and execution is interrupted.
15. The user enters n, and the execution of frame0 ends.

Summary

The pdb implementation in the Python standard library is not complicated, and this article has explained the core logic of its source code. Once you understand the principle, you can customize or rewrite a Python debugger yourself. In fact, many general-purpose IDEs in the industry, such as PyCharm and VS Code, do not use the standard pdb; they have developed their own Python debuggers to fit the IDE better. Still, understanding how pdb works and then rewriting or customizing a debugger on top of it is a low-cost and effective way to meet debugging needs.

The on-device debugging capability of the MNN workbench is also based on native pdb. It supports various research and development scenarios of Alibaba Group's internal computing and has greatly improved the efficiency of algorithm development and deployment. Go to www.mnn.zone to download the MNN workbench and try it.

Alibaba Terminal Technology
