Python functional programming series 007: lazy evaluation

I have implemented the important functions, methods, and classes from this series of articles. You can find them on GitHub (click here); if your connection is too slow, I also keep a copy on Gitee (click here). Issues, comments, stars, and forks are welcome.

Origin

Let's return to the chapter that introduced higher-order functions. There we mentioned that one advantage of higher-order functions, especially curried ones, is that they let us separate "early evaluation" from "deferred evaluation". These two techniques can optimize a lot of code. Take the earlier example:

def f(x): # x stores some state that we need
    ## everything that can be computed in advance goes here
    z = x ** 2 + x + 1
    print('z is {}'.format(z))
    def helper(y):
        ## everything whose computation is deferred goes here
        return y * z
    return helper

When we call f(1), z is calculated right away. If we keep the returned function around, repeated calls reuse that value instead of recomputing it:

>>> g = f(1)
z is 3
>>> g(2) + g(1) # note that `z is xxxx` is not printed this time
9

In other words, choosing well between "early evaluation" and "deferred evaluation" can cut a lot of computational overhead. This leads to the concept of "lazy evaluation" discussed in this article. The core idea is: a value is calculated only when it is first used, and it is calculated only once.
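The "calculated on first use, and only once" idea can be sketched with a small wrapper. The class name Thunk and its force method are hypothetical names chosen for this illustration; they are not taken from this series' library.

```python
class Thunk:
    """Wrap a zero-argument function and run it at most once (hypothetical helper)."""

    _UNSET = object()  # sentinel meaning "not computed yet"

    def __init__(self, func):
        self._func = func
        self._value = Thunk._UNSET
        self.calls = 0  # how many times func actually ran (for demonstration only)

    def force(self):
        # Compute on first use, then keep returning the stored result.
        if self._value is Thunk._UNSET:
            self.calls += 1
            self._value = self._func()
        return self._value


lazy_z = Thunk(lambda: 1 ** 2 + 1 + 1)
# Nothing has been computed yet; the first force() triggers the computation.
first = lazy_z.force()
second = lazy_z.force()  # reuses the stored result
print(first, second, lazy_z.calls)  # 3 3 1
```

The rest of this article builds the same behavior into attributes and module-level values, so that no explicit force call is needed.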

Lazy properties and lazy values

Let's consider the following example:

Define a circle class described by its center and radius. Once we know the center and radius, we can derive many things, such as:

  1. Circumference ( perimeter )
  2. Area ( area )
  3. The coordinates of the topmost point of the circle ( upper_point )
  4. The distance from the center of the circle to the origin ( distance_from_origin )
  5. ...

This list may be very long, and it may keep growing as the software gains features. There are two obvious ways to implement it. The first is to set all the attributes when the circle is initialized:

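from dataclasses import dataclass

@dataclass
class CircleInitial:
    x: float
    y: float
    r: float

    def __init__(self, x, y, r):
        self.x = x
        self.y = y
        self.r = r

        # Every derived attribute is computed eagerly, whether or not it is used.
        self.perimeter = 2 * 3.14 * r  # circumference is 2 * pi * r (pi approximated as 3.14)
        self.area = r * r * 3.14
        self.upper_point = (x, y + r)
        self.lower_point = (x, y - r)
        self.left_point = (x - r, y)
        self.right_point = (x + r, y)
        self.distance_from_origin = (x ** 2 + y ** 2) ** (1/2)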

The problem is immediately visible: if there are many such attributes and their calculations are expensive, instantiating a new object takes a long time, even though we may never use most of the attributes.

So there is a second approach: implement each of these as a method (only the area method is shown here):

@dataclass
class CircleMethod:
    x: float
    y: float
    r: float

    def area(self):
        print("area calculating...")
        return self.r * self.r * 3.14

Of course, because this value is conceptually a "constant" of the circle, we can also use the property decorator so that it can be accessed without parentheses:

@dataclass
class CircleMethod:
    x: float
    y: float
    r: float

    @property
    def area(self):
        print("area calculating...")
        return self.r * self.r * 3.14

I deliberately added a print statement, so we can see that area is recalculated every time we access it:

>>> a = CircleMethod(1, 2, 3)
>>> a.area ** 2 + a.area + 1
area calculating...
area calculating...
827.8876000000001

This is another kind of waste. The first approach suits attributes that are accessed repeatedly, and the second suits attributes that are rarely accessed. However, when maintaining code we often cannot predict how frequently an attribute will be accessed, so neither is a long-term solution. What we actually need is an attribute with these characteristics:

  1. It is not calculated when the object is initialized
  2. It is calculated only when it is first accessed
  3. It is calculated only once; later accesses do not recalculate it

This is "lazy evaluation" applied to attributes, and we call such an attribute a "lazy attribute". Python has no dedicated syntax for lazy attributes (although since Python 3.8 the standard library provides functools.cached_property, which serves a similar purpose), and an implementation is easy to find online (you can also find it in lazy_evaluate.py in my Python-functional-programming repository):

def lazy_property(func):
    attr_name = "_lazy_" + func.__name__

    @property
    def _lazy_property(self):
        if not hasattr(self, attr_name):
            setattr(self, attr_name, func(self))
        return getattr(self, attr_name)

    return _lazy_property

To use it, just replace the property decorator with lazy_property:

@dataclass
class Circle:
    x: float
    y: float
    r: float

    @lazy_property
    def area(self):
        print("area calculating...")
        return self.r * self.r * 3.14

Using the same call as above, we can see that area is calculated only once (it prints only once):

>>> b = Circle(1, 2, 3)
>>> b.area ** 2 + b.area + 1
area calculating...
827.8876000000001
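For comparison, here is a sketch of the same Circle written with functools.cached_property from the standard library (Python 3.8 or later). The class name CircleCached and the area_calls counter are hypothetical additions used only to make the single evaluation observable.

```python
from dataclasses import dataclass
from functools import cached_property


@dataclass
class CircleCached:  # hypothetical name, mirrors Circle above
    x: float
    y: float
    r: float

    @cached_property
    def area(self):
        # Runs on first access only; the result is stored in the instance __dict__.
        self.area_calls = getattr(self, "area_calls", 0) + 1
        return self.r * self.r * 3.14


c = CircleCached(1, 2, 3)
total = c.area ** 2 + c.area + 1  # area is accessed twice
print(c.area_calls)  # 1
```

One caveat applies to both this sketch and lazy_property: the stored result is not refreshed if r is changed afterwards. With cached_property, deleting the attribute (del c.area) clears the stored value so that the next access recalculates it.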

In the same way, we can implement a lazy value. Because Python has no concept of code blocks as values, we can only use zero-argument functions to achieve this:

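class NotVoidFunctionError(TypeError):
    # Assumed definition: the article uses this exception without showing it.
    pass


class _LazyValue:

    def __setattr__(self, name, value):
        # Accept only callables that take no positional arguments.
        if not callable(value) or value.__code__.co_argcount > 0:
            raise NotVoidFunctionError("value is not a void function")
        # Store the function with a flag meaning "not evaluated yet".
        super(_LazyValue, self).__setattr__(name, (value, False))

    def __getattribute__(self, name: str):
        try:
            _value, _have_called = super(_LazyValue, self).__getattribute__(name)
        except (AttributeError, TypeError, ValueError):
            # Unknown name, or an ordinary attribute that is not a (value, flag) pair.
            raise AttributeError(
                "type object 'Lazy' has no attribute '{}'"
                .format(name)
            )
        if _have_called:
            return _value
        # First access: evaluate outside the try block, so that errors raised
        # by the function itself are not hidden, then cache the result.
        res = _value()
        super(_LazyValue, self).__setattr__(name, (res, True))
        return res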

lazy_val = _LazyValue()

It is used as follows. If you are designing a module and the variable does not belong to a class, this is very convenient:

def f():
    print("f compute")
    return 12

>>> lazy_val.a = f
>>> lazy_val.a
f compute
12
>>> lazy_val.a
12
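A similar "compute once at module level" effect can be sketched with functools.lru_cache from the standard library applied to a zero-argument function. This is an alternative technique, not the lazy_val implementation above; the names lazy_a and calls are hypothetical.

```python
from functools import lru_cache

calls = []  # records each real execution of the function body


@lru_cache(maxsize=None)
def lazy_a():
    calls.append("f compute")
    return 12


value_1 = lazy_a()  # the body runs here
value_2 = lazy_a()  # served from the cache, the body does not run again
print(value_1, value_2, len(calls))  # 12 12 1
```

The trade-off is that the value must be read with parentheses, lazy_a(), whereas lazy_val.a reads like a plain variable.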

Lazy iterator/generator

In addition, Python has some built-in lazy structures, mainly iterators and generators. We can easily verify that their values are calculated and kept only once (only a generator is verified here):

>>> a = (i for i in range(5))
>>> list(a)
[0, 1, 2, 3, 4]
>>> list(a)
[]
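Because a generator can be consumed only once, code that needs to iterate several times usually creates a fresh generator each time. A minimal sketch, where the factory function numbers is a hypothetical helper:

```python
def numbers():
    """Return a fresh generator each time it is called (hypothetical helper)."""
    return (i for i in range(5))


a = numbers()
first_pass = list(a)          # consumes the generator
second_pass = list(a)         # already exhausted, nothing left
fresh_pass = list(numbers())  # a new generator starts from the beginning

print(first_pass)   # [0, 1, 2, 3, 4]
print(second_pass)  # []
print(fresh_pass)   # [0, 1, 2, 3, 4]
```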

We can design the following two functions:

def f(x):
    print("f")
    return x + 1

def g(x):
    print("g")
    return x + 1

Now consider what the following prints:

>>> a = (g(i) for i in (f(i) for i in range(5)))
>>> next(a)

There are two possible behaviors. One possible way of calculating it is:

>>> temp = [f(i) for i in range(5)]
>>> res = g(temp[0])

If this is how it works, it prints f five times and then prints g once.

Another possibility is:

>>> res = (g(f(i)) for i in range(5))

In that case, it prints only one f and one g. By the definition of lazy evaluation, the element for i=1 has not been requested yet, so it should not be evaluated. If the output matches the second case, the object is lazy. And this is indeed what happens.
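The claim can be checked by recording the order of calls instead of printing. In this sketch f and g append to a list named trace (a hypothetical name) so that the evaluation order can be inspected after each step:

```python
trace = []  # order in which f and g actually run


def f(x):
    trace.append("f")
    return x + 1


def g(x):
    trace.append("g")
    return x + 1


a = (g(i) for i in (f(i) for i in range(5)))
after_creation = list(trace)  # snapshot before anything is consumed

first = next(a)
after_first = list(trace)

second = next(a)
after_second = list(trace)

print(after_creation)        # []
print(first, after_first)    # 2 ['f', 'g']
print(second, after_second)  # 3 ['f', 'g', 'f', 'g']
```

Creating the nested generators calls neither function, and each next pulls exactly one element through f and then g.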

This feature is already quite fancy, but it enables an even more remarkable application. Because an iterator does not calculate all of its values when it is created, we can use it to represent an infinite sequence, where each result is calculated only when requested, as described above. One of the simplest examples is itertools.repeat from the standard library, which lets us generate an infinite sequence of 1s:

from itertools import repeat

repeat_1 = repeat(1)

We can then use a generator expression like the one above to do some calculations, and call next:

res = (g(i) for i in (i * 3 for i in repeat_1))
next(res)

We also call such linear structures "lazy lists" (repeat_1 here is an example of an "infinite lazy list"). In the following articles, we will use this technique to do some interesting things in detail.
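As one more illustration of an infinite lazy list, the following sketch uses itertools.count and itertools.islice from the standard library to take a finite prefix of an endless sequence of squares. The variable names are chosen for this example only.

```python
from itertools import count, islice

# count(1) lazily yields 1, 2, 3, ... without end.
squares = (n * n for n in count(1))

# islice pulls only as many items as requested, so this terminates.
first_five = list(islice(squares, 5))
next_three = list(islice(squares, 3))  # continues where the previous slice stopped

print(first_five)  # [1, 4, 9, 16, 25]
print(next_three)  # [36, 49, 64]
```

Calling list(squares) directly would never finish, which is why a bounded consumer such as islice or next is needed.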

