Python functional programming series 008: measurable

In the previous article, we have repeatedly emphasized many advantages of functional programming, such as expressive ability, and the benefits of delayed calculation. But in fact, a bigger advantage is actually measurability. This article is also the core of conveying the entire series. We are not going to completely exclude concepts such as procedural and side effects, but have limited use and can make improvements on the basis of existing code.

origin

Below, let’s look at an example: a company wants to design a time-based scheduler, they can provide a crontab , for example, can be based on the first three days of each month, weekly weekend, the second week of each month Such expressions on the first day. When designing this scheduler, many functions related to time are involved. For example, the following is a function that may be implemented:

from datetime import datetime, timedelta

def yesterday_str() -> str:
    """获取昨日的时间的字符串(YYYYMMDD)
    """
    return (
        datetime.now() - timedelta(days=1)
        ).strftime("%Y%m%d")

This is the most intuitive implementation, but this function we found to be untestable. You should see the reason, because datetime.now() side effects. Specifically, we can give examples of problems that may be encountered in the test as follows:

Problems in the unit test example

How would we write a unit test for this function? Obviously, most people would write this:

def test_yesterday_str():
    assert yesterday_str() == (
        datetime.now() - timedelta(days=1)
    ).strftime("%Y%m%d")

Obviously, this unit test shows several problems at a glance:

In fact, we just re-written the original code and did not really test it.
Even if we admit this way of writing, there is a certain probability that the test will fail in the early hours of the morning (23:59:59 seconds), but this is not an error caused by a problem in the implementation of the function.

Problems in integration testing

In actual testing, some integrated parts may be more difficult to test. For example, our following function that calls the above function, its function is to perform a task on the 1st of each month:

def run_at_first_day():
    if yesterday_str()[-2:] == '01':
        do_something()

This example not only passes the side effects step by step, but also in the test, if we are not testing on the 1st, we can only test the logic of do_something but not the logic of run_at_first_day It is conceivable that there will be many such examples in this system.

How to solve

Conventional solution

Conventional solutions, the first is to modify the system time. There is a Python FreezeGun , which does similar things:

from freezegun import freeze_time
import datetime

@freeze_time("2012-01-14")
def test():
    assert datetime.datetime.now() == datetime.datetime(2012, 1, 14)

Of course, this solution is aimed at time The side effects we encounter may be more than this one. It may be read configuration, database interaction, etc. This solution cannot solve these things.

The other category is the concept of the test field, such as concepts such as fake , mock , stub , we will of course use fake in the following work, but we don't need to be entangled in these complicated concepts.

Take side-effect functions as parameters

We rewrite it in the following way and find that the entire function becomes testable:

def yesterday_str(now_func = date.now) -> str:
    assert yesterday_str() == (
        now_func() - timedelta(days=1)
    ).strftime("%Y%m%d")

The specific test writing is as follows:

def fake_now(now_str):
    def helper():
        return datetime.strptime(now_str, "%Y-%m-%d")
    return helper

def test_yesterday_str():
    return yesterday_str(fake_now('2020-01-01')) == '2019-12-31'

We found that there are many advantages to writing this way:

The entire function becomes side-effect-free, and the side-effects are isolated in the parameters
Because there are no side effects, we only need to make the corresponding "false" function ourselves to simulate the required input, especially for the type of function Void -> A
We can simulate the operation of any state through a fake function, which makes the scheduling logic we mentioned above become testable.
When we specifically call, because the default value of the parameter is set, the specific method used has not changed.

With this method of writing side effects in parameters, we will encounter a similar scheme (random numbers without side effects) later, and see how Monad solves such problems in subsequent articles.

However, this article introduces the concept of "testability". Generally speaking, functions without side effects are absolutely testable and can be tested in the unit test phase. Functions/methods with side effects can make testing difficult. Therefore, through the concept of unit testing and coverage, we can expose most of the problems before going online, which is a very Fancy way. If you add type inference, the judgment of the usability of this system will be more perfect (of course, this is difficult to do in a language like Python, but you can do similar things mypy

Of course, this is also the beginning of functional programming test, we will introduce testing concepts behind another unique functional programming - based on the nature of the test ( Property-based Testing ), then introduce it based on enlightened Some good third-party modules and methods.

Python functional programming series 008: measurable

origin

Problems in the unit test example

Problems in integration testing

How to solve

Conventional solution

Take side-effect functions as parameters

三次方根

引用和评论

Python函数式编程系列012：惰性列表之生成器与迭代器

python与nodejs哪个性能高

Anaconda安装教程以及Anaconda和pip配置国内镜像

如何减少跨团队交付摩擦？——基于 DevOps 与敏捷的最佳实践

Python 描述符

科学计算编程涉及到的技术栈简介

使用 chardet 判断文件编码需要注意的坑——过大的文件会导致高耗时