Python functional programming series 007: lazy evaluation

I have implemented some of the important functions, methods, and classes from this series of articles. You can find them on GitHub (click here). If your connection is too slow, I also keep a copy on Gitee (click here), but please do not open issues, leave comments, or star/fork there.

Origin

Let us return to the chapter that introduced higher-order functions. There we mentioned that one advantage of higher-order functions, especially curried ones, is that they let us separate "early evaluation" from "deferred evaluation". With these two operations we can optimize a lot of code. Take the earlier example:

def f(x):  # x stores some state that we need
    ## everything that can be computed in advance goes here
    z = x ** 2 + x + 1
    print('z is {}'.format(z))
    def helper(y):
        ## everything whose computation is deferred goes here
        return y * z
    return helper

When we call f(1), z is computed right away. If we keep the returned function around, we save that work every time we call it again:

>>> g = f(1)
z is 3
>>> g(2) + g(1)  # note that `z is xxxx` is not printed this time
9

In other words, choosing well between "early evaluation" and "deferred evaluation" can remove a lot of computational overhead. This leads to the concept of "lazy evaluation" that this article is about. The core idea of lazy evaluation is that a value is computed only when it is first needed, and it is computed only once.
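
The idea of "computed on first use, and only once" can be sketched with a small wrapper around a zero-argument function. This is only an illustration: the class name Lazy, its get method, and the calls counter are hypothetical and are not part of the series' code.

```python
class Lazy:
    """Hypothetical helper: wraps a zero-argument function and runs it at most once."""

    _UNSET = object()  # sentinel meaning "not evaluated yet"

    def __init__(self, func):
        self._func = func
        self._value = Lazy._UNSET
        self.calls = 0  # how many times func has actually run

    def get(self):
        # Evaluate on first access only, then reuse the stored result
        if self._value is Lazy._UNSET:
            self.calls += 1
            self._value = self._func()
        return self._value


z = Lazy(lambda: 1 ** 2 + 1 + 1)   # nothing is computed yet
calls_before = z.calls
total = z.get() * 2 + z.get() * 1  # first get() computes, second reuses
calls_after = z.calls
```

Here calls_before is 0 and calls_after is 1, even though get() was called twice.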

Lazy properties and lazy values

Let's consider the following example:

Define a circle class described by its center and radius. Once we know the center and radius, we can derive many things, such as:

Circumference (perimeter)
Area (area)
The coordinates of the topmost point of the circle (upper_point)
The distance from the center of the circle to the origin (distance_from_origin)
...

This list may be very long, and it may keep growing as the software gains features. There are two ways we might implement it. The first is to set the attributes of the circle at initialization:

from dataclasses import dataclass

@dataclass
class CircleInitial:
    x: float
    y: float
    r: float

    def __init__(self, x, y, r):
        self.x = x
        self.y = y
        self.r = r

        # every derived attribute is computed eagerly at construction time
        self.perimeter = 2 * 3.14 * r
        self.area = r * r * 3.14
        self.upper_point = (x, y + r)
        self.lower_point = (x, y - r)
        self.left_point = (x - r, y)
        self.right_point = (x + r, y)
        self.distance_from_origin = (x ** 2 + y ** 2) ** (1/2)

We can see the problem immediately: if there are many such attributes and each involves a lot of computation, instantiating a new object takes a long time, even though we may never use most of them.

So there is a second option: implement each of them as a method (only the area method is shown here):

@dataclass
class CircleMethod:
    x: float
    y: float
    r: float

    def area(self):
        print("area calculating...")
        return self.r * self.r * 3.14

Of course, since this value is conceptually a "constant", we can also use the property decorator so that it can be accessed without parentheses:

@dataclass
class CircleMethod:
    x: float
    y: float
    r: float

    @property
    def area(self):
        print("area calculating...")
        return self.r * self.r * 3.14

I deliberately added a print statement, so we can see that area is recomputed every time it is accessed:

>>> a = CircleMethod(1, 2, 3)
>>> a.area ** 2 + a.area + 1
area calculating...
area calculating...
827.8876000000001

This is another kind of waste. The first scheme suits attributes that are accessed repeatedly, and the second suits attributes that are rarely accessed. However, when maintaining code we often cannot predict how frequently an attribute will be accessed, so neither scheme is a long-term solution. What we actually need is an attribute with these characteristics:

It is not calculated at initialization
It is calculated only when it is accessed
It is calculated only once and is not recalculated on later accesses

This is the concept of "lazy evaluation", and we call such an attribute a "lazy property". For a long time Python had no built-in lazy property (functools.cached_property only arrived in Python 3.8), but an implementation is easy to find online (you can also find it in lazy_evaluate.py in my Python-functional-programming repository):

def lazy_property(func):
    attr_name = "_lazy_" + func.__name__

    @property
    def _lazy_property(self):
        if not hasattr(self, attr_name):
            setattr(self, attr_name, func(self))
        return getattr(self, attr_name)

    return _lazy_property

To use it, just replace the property decorator with lazy_property:

@dataclass
class Circle:
    x: float
    y: float
    r: float

    @lazy_property
    def area(self):
        print("area calculating...")
        return self.r * self.r * 3.14

Accessing it the same way as above, we can see that area is computed only once (the message is printed only once):

>>> b = Circle(1, 2, 3)
>>> b.area ** 2 + b.area + 1
area calculating...
827.8876000000001
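
For comparison, here is a minimal sketch of the same behavior using functools.cached_property from the standard library (Python 3.8 and later). The class name CachedCircle is made up for this illustration, and this is not the implementation used in the series.

```python
from dataclasses import dataclass
from functools import cached_property  # standard library, Python 3.8+


@dataclass
class CachedCircle:  # hypothetical name, mirrors the Circle class above
    x: float
    y: float
    r: float

    @cached_property
    def area(self):
        print("area calculating...")
        return self.r * self.r * 3.14


c = CachedCircle(1, 2, 3)
first = c.area   # runs the method and stores the result on the instance
second = c.area  # read back from the instance __dict__, nothing is printed
```

One design difference worth knowing: cached_property stores the result in the instance's __dict__ under the attribute name, so the cached value can be overwritten or deleted (del c.area forces a recomputation), whereas the property-based lazy_property above is read-only.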

In the same way, we can implement a lazy value. However, because Python cannot pass around a block of code as a value, we have to use zero-argument functions to achieve it:

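class NotVoidFunctionError(TypeError):
    """Assumed definition (not shown in the original): value is not a zero-argument function."""


class _LazyValue:

    def __setattr__(self, name, value):
        # only zero-argument Python functions can serve as deferred values
        if not callable(value) or value.__code__.co_argcount > 0:
            raise NotVoidFunctionError("value is not a void function")
        # store the function together with a flag meaning "not evaluated yet"
        super(_LazyValue, self).__setattr__(name, (value, False))

    def __getattribute__(self, name: str):
        try:
            _value, _have_called = super(_LazyValue, self).__getattribute__(name)
        except (AttributeError, TypeError, ValueError):
            raise AttributeError(
                "'_LazyValue' object has no attribute '{}'"
                .format(name)
            )
        if _have_called:
            return _value
        # first access: evaluate, then cache the result with the flag set
        res = _value()
        super(_LazyValue, self).__setattr__(name, (res, True))
        return res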

lazy_val = _LazyValue()

It is used as follows. If you are designing a module and the variable does not live in a class, this is very convenient:

def f():
    print("f compute")
    return 12

>>> lazy_val.a = f
>>> lazy_val.a
f compute
12
>>> lazy_val.a
12
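
As a point of comparison, a similar "evaluate once" effect for a zero-argument function can be sketched with functools.lru_cache from the standard library. This is an alternative technique, not the lazy_val implementation above, and the calls list is a hypothetical stand-in for the print statement.

```python
from functools import lru_cache

calls = []  # records each real evaluation instead of printing


@lru_cache(maxsize=None)
def a():
    calls.append("f compute")
    return 12


first = a()   # runs the body
second = a()  # returns the cached 12 without running the body again
```

The trade-off is that the value must be read with parentheses, a(), rather than as a plain attribute like lazy_val.a.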

Lazy iterator/generator

In addition, Python has some built-in lazy structures, mainly iterators and generators. We can easily verify that their values are computed and yielded only once (only a generator is verified here):

>>> a = (i for i in range(5))
>>> list(a)
[0, 1, 2, 3, 4]
>>> list(a)
[]

We can design the following two functions:

def f(x):
    print("f")
    return x + 1

def g(x):
    print("g")
    return x + 1

Now consider what the following produces:

>>> a = (g(i) for i in (f(i) for i in range(5)))
>>> next(a)

There are two possible outcomes. One possible evaluation order is:

>>> temp = [f(i) for i in range(5)]
>>> res = g(temp[0])

If this is what happens, f is printed five times and then g is printed once.

The other possibility is:

>>> res = (g(f(i)) for i in range(5))
>>> next(res)

In this case only one f and one g are printed. By the definition of lazy evaluation, the element for i=1 has not been requested yet, so it should not be evaluated. If the output matches the second case, the object is lazy, and that is indeed what happens.
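
This can be checked with a small runnable sketch. The variants of f and g below record their calls in a list (the hypothetical log) instead of printing, so the evaluation order can be inspected afterwards.

```python
log = []  # call order, recorded instead of printed


def f(x):
    log.append("f")
    return x + 1


def g(x):
    log.append("g")
    return x + 1


a = (g(i) for i in (f(i) for i in range(5)))
log_before = list(log)       # building the pipeline has called neither f nor g
first = next(a)              # pulls exactly one element through f, then g
log_after_first = list(log)
second = next(a)             # the next element is computed only now
```

After the first next, the log holds one "f" followed by one "g"; the remaining elements of range(5) are untouched until they are requested.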

This feature is already quite fancy, but it also enables a very nice application. Because an iterator does not compute all of its values when it is created, we can use one to represent an infinite sequence: each value is computed and returned only when it is requested, just as in the example above. One of the simplest examples is itertools.repeat from the standard library, with which we can generate an infinite sequence of 1s:

from itertools import repeat

repeat_1 = repeat(1)

We can then use a generator expression like the one above to do some computation and call next:

res = (g(i) for i in (i * 3 for i in repeat_1))
next(res)

We also call these linear structures "lazy lists" (repeat_1 here is an example of an "infinite lazy list"). In later articles we will use this technique in detail to do some interesting things.
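
As another small sketch of an infinite lazy list, the following uses itertools.count and itertools.islice from the standard library. The variable names are chosen only for this illustration.

```python
from itertools import count, islice

squares = (n * n for n in count(1))    # infinite: 1, 4, 9, 16, ...
first_five = list(islice(squares, 5))  # only five values are ever computed
next_one = next(squares)               # continues where islice stopped
```

Calling list(squares) directly would never terminate, so an infinite lazy list should always be consumed through next or a bounded slice such as islice.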
