Requests library

A well-known, must-have library for crawlers. It is used to send GET, POST, and other HTTP requests, and can be regarded as a friendlier replacement for the urllib module in the Python 3 standard library.
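
As a rough illustration of the GET and POST requests mentioned above, here is a minimal sketch. The URL and parameter names are made up for the example, and the requests are only prepared, not sent, so it runs without network access.

```python
import requests

# Hypothetical endpoint, used for illustration only; nothing is sent over the network.
URL = "https://example.com/search"

# Build a GET request with query parameters and a POST request with form data.
get_req = requests.Request("GET", URL, params={"q": "python", "page": 1}).prepare()
post_req = requests.Request("POST", URL, data={"user": "cat"}).prepare()

print(get_req.method, get_req.url)
print(post_req.method, post_req.body)

# To actually send a request you would normally call something like:
#   resp = requests.get(URL, params={"q": "python"}, timeout=10)
#   resp.raise_for_status()
```

In a real crawler it is usually worth passing a timeout, as in the commented lines, so that one slow page does not hang the whole run.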

BeautifulSoup library

Another essential and very well-known crawler library, used to parse HTML and extract useful data from it. Using lxml as the parser is generally recommended. If some pages run into parsing problems, you can try the built-in html.parser instead.
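
A small sketch of the parser choice described above. The HTML snippet and the helper name make_soup are invented for the example. Note that the fallback here only triggers when lxml is not installed; if lxml is installed but a page parses badly, you would pass "html.parser" explicitly instead.

```python
from bs4 import BeautifulSoup, FeatureNotFound

HTML = """
<html><body>
  <h1>Demo page</h1>
  <a href="/a">First</a>
  <a href="/b">Second</a>
</body></html>
"""

def make_soup(markup):
    """Parse with lxml when it is installed, otherwise fall back to html.parser."""
    try:
        return BeautifulSoup(markup, "lxml")
    except FeatureNotFound:
        return BeautifulSoup(markup, "html.parser")

soup = make_soup(HTML)

# Extract the heading text and every link's text and target.
title = soup.find("h1").get_text(strip=True)
links = [(a.get_text(strip=True), a["href"]) for a in soup.find_all("a")]
print(title, links)
```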

tqdm library

It displays a progress bar for a running program, for example to show how far a crawler has gotten. Note, however, that in the Windows command line the bar could not refresh in place on the same line for me. A new line was printed on every update, so I don't recommend it there. (I haven't tried it under PowerShell.)
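
A minimal sketch of wrapping a crawl loop in a progress bar. The page list is a stand-in with made-up URLs, and the sleep is a placeholder for real download work.

```python
import time
from tqdm import tqdm

# Stand-in for a list of pages to crawl (hypothetical URLs, nothing is fetched).
pages = [f"https://example.com/page/{i}" for i in range(20)]

done = []
for url in tqdm(pages, desc="Crawling", unit="page"):
    time.sleep(0.01)  # placeholder for the real download and parse work
    done.append(url)
```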

peewee library

Very handy for defining database models. Of course, you can also use SQLAlchemy directly. I personally find peewee easier to learn than SQLAlchemy, and you can use the command line to convert between model classes and database table structures in one step.

Arrow library

Personally, I think this is the best date and time conversion library, and I highly recommend it. It supports a wide range of formats, and its API is very readable, which makes it convenient to shift a time N days or N weeks forward or backward.
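
A short sketch of the shifting and formatting described above. The dates are arbitrary values picked for the example.

```python
import arrow

# Parse an ISO 8601 string, and a custom format given an explicit pattern.
base = arrow.get("2024-01-15T10:00:00+00:00")
custom = arrow.get("15/01/2024", "DD/MM/YYYY")

# Move N days back and N weeks forward.
three_days_ago = base.shift(days=-3)
two_weeks_later = base.shift(weeks=2)

print(three_days_ago.format("YYYY-MM-DD"))
print(two_weeks_later.format("YYYY-MM-DD HH:mm"))
print(custom.format("YYYY-MM-DD"))
```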

PIL library

The best image processing library for Python. I only use it for cropping, format conversion, stitching, and the like. It reportedly also supports pixel-by-pixel modification, inspection, and computation, and it has many uses in the field of image recognition.
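
A minimal sketch of cropping, converting, and stitching. It assumes the Pillow package, which provides the PIL module today, and it generates two solid-color images in memory instead of loading real files.

```python
from PIL import Image

# Two solid-color test images created in memory (no files needed).
left = Image.new("RGB", (100, 50), "red")
right = Image.new("RGB", (60, 50), "blue")

# Crop: the box is (left, upper, right, lower) in pixels.
cropped = left.crop((0, 0, 40, 50))

# Convert the crop to 8-bit grayscale.
gray = cropped.convert("L")

# Stitch the crop and the second image side by side on a new canvas.
stitched = Image.new("RGB", (cropped.width + right.width, 50))
stitched.paste(cropped, (0, 0))
stitched.paste(right, (cropped.width, 0))

print(stitched.size, stitched.getpixel((0, 0)), stitched.getpixel((99, 0)))
```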

OpenPyxl library

My preferred library for handling Office documents. It makes working with Excel files very convenient. However, it does not seem well suited to processing data in large batches. For large batches, I recommend loading CSV files directly with the pandas library.
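
A small sketch of writing and reading back a worksheet. The sheet name and rows are made up, and the workbook is saved to an in-memory buffer rather than a file so the example has no side effects.

```python
from io import BytesIO
from openpyxl import Workbook, load_workbook

# Build a workbook with one sheet and a few rows.
wb = Workbook()
ws = wb.active
ws.title = "Scores"
ws.append(["name", "score"])
ws.append(["Alice", 90])
ws.append(["Bob", 85])

# Save to memory; pass a file name such as "scores.xlsx" to write to disk instead.
buffer = BytesIO()
wb.save(buffer)
buffer.seek(0)

# Load it back and read the rows as plain tuples.
loaded = load_workbook(buffer)
rows = list(loaded["Scores"].iter_rows(values_only=True))
print(rows)
```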

Jsonlines library

Useful for WeChat Mini Program development. WeChat cloud development uses the JSON Lines format, a variant of JSON in which each line is a separate JSON value. Converting to and from it is a bit tedious if you have to write the code yourself. Using this library saves a little time and effort.
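
To show what the format looks like, here is a sketch of the hand-written conversion using only the standard library json module, not the jsonlines package itself. The helper names and records are invented for the example.

```python
import json

# Example records (made up for illustration).
records = [{"name": "猫", "score": 1}, {"name": "dog", "score": 2}]

def to_json_lines(items):
    """Serialize each item as one JSON value per line."""
    return "\n".join(json.dumps(item, ensure_ascii=False) for item in items)

def from_json_lines(text):
    """Parse JSON Lines text back into a list, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]

text = to_json_lines(records)
print(text)
print(from_json_lines(text) == records)
```

This is roughly the boilerplate that a dedicated library saves you from writing and maintaining.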

PyPinyin library

Converts Chinese characters into pinyin. You need to write a small function yourself to join the converted pinyin together. Otherwise the result is one pinyin per character, returned as a list.


敲键盘的猫