First of all, what is Liuli? It is an open source project I recently developed. Its main purpose is to let friends with reading habits quickly build a multi-source, clean, and personalized reading environment.
Why is it called Liuli?
Liuli was originally named 2C. Friends in the exchange group suggested the name Liuli, which was taken from Temple", and its meaning fits the purpose of the project very well:
Build a pure land of reading for users, like the Eastern glazed (Liuli) world.
It has been a month since the last release (I really have been procrastinating and need to reflect on that), and today I'm happy to announce that Liuli has another wave of updates! 🥳 See the v0.2.0 task board.
Next, using the latest version of Liuli, v0.1.5, I will show you how to build a clean RSS information feed for public accounts with Liuli. The final subscription result is shown below:
Start
First of all, I have three requirements:
- Aggregate the public accounts I currently subscribe to, output them as RSS, and then subscribe to the feeds
- Identify advertisements in the subscribed articles
- Quickly back up the articles
These requirements are exactly the subject of this article: building a clean RSS information feed for public accounts. The implementation details will not be discussed here; if you are interested, you can ask in the exchange group. This article only describes how to use it.
If you have read my earlier article on creating a clean and personalized public account reading environment, you may already be familiar with concepts such as collectors, processors, and distributors. To keep this article self-contained, let me introduce them again. First, look at the architecture diagram:
A brief explanation:
- Collector: monitors the custom reading sources a user follows, such as public accounts or blogs, and feeds them into Liuli as input sources in a unified standard format;
- Processor: applies custom processing to the target content, such as using machine learning to build an advertisement classifier from historical advertisement data for automatic tagging, or introducing hook functions that run at the relevant nodes;
- Distributor: relies on the interface layer for data requests and responses, gives users personalized configuration, and then distributes automatically according to that configuration, streaming clean articles to WeChat, DingTalk, TG, or even self-built websites;
- Backup: backs up the processed articles, for example by persisting them to a database or GitHub.
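To make the four roles concrete, here is a minimal Python sketch of how they could be chained. It is not Liuli's actual code: the `Article` class, the `collect`, `mark_ads`, and `run_pipeline` functions, the keyword rule, and the choice to skip distributing articles tagged as ads are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Article:
    """Hypothetical unified format that every collector is assumed to emit."""
    source: str
    title: str
    content: str
    tags: List[str] = field(default_factory=list)


def collect(raw_items: List[Dict[str, str]]) -> List[Article]:
    """Collector: normalize raw items from a reading source."""
    return [Article(i["source"], i["title"], i["content"]) for i in raw_items]


def mark_ads(article: Article) -> Article:
    """Processor: a toy keyword rule standing in for a real ad classifier."""
    if any(word in article.content for word in ("discount", "coupon")):
        article.tags.append("ad")
    return article


def run_pipeline(
    raw_items: List[Dict[str, str]],
    processors: List[Callable[[Article], Article]],
    senders: List[Callable[[Article], None]],
    backups: List[Callable[[Article], None]],
) -> List[Article]:
    """Run collect -> process -> distribute -> back up for every item."""
    articles = collect(raw_items)
    for index, article in enumerate(articles):
        for process in processors:
            article = process(article)
        articles[index] = article
        # Assumption of this sketch: only articles not tagged as ads are sent.
        if "ad" not in article.tags:
            for send in senders:
                send(article)
        # Every processed article is backed up, ads included.
        for backup in backups:
            backup(article)
    return articles


sent, stored = [], []
raw = [
    {"source": "demo", "title": "Weekly notes", "content": "Things I read."},
    {"source": "demo", "title": "Big sale", "content": "Grab your coupon now!"},
]
run_pipeline(raw, [mark_ads], [sent.append], [stored.append])
print([a.title for a in sent])  # ['Weekly notes']
print(len(stored))              # 2
```

Passing the senders and backups in as plain callables keeps the sketch small; a real distributor would call WeChat or DingTalk here, and a real backup would write to MongoDB or GitHub.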
In fact, it doesn't matter if you don't understand the process; you just need to know how to use it. Next, please follow the tutorial step by step. It is best to have a computer at hand and follow along.
Usage
All right, the show begins. Liuli is easy to deploy. Docker deployment is recommended, so before you begin, make sure Docker is installed on your machine; if it is not, install it first.
Configuration
Liuli's configuration is currently divided into two parts:
- Global configuration: the global environment variables; see Liuli environment variables
- Task configuration: a configuration written for the problem the user wants to solve. For example, this article produces a configuration that collects, processes, and outputs public accounts as RSS (you can copy my configuration when you use it)
Global configuration
First, the global configuration. The default configuration is enough to get everything running, but if you need to distribute articles to WeChat or DingTalk, you must fill in the relevant settings. Let's get started: open a terminal (or use whatever method you are familiar with) and create the following folders and files.
mkdir liuli
cd liuli
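# Stores the scheduled task configuration; the default file name is default.json
mkdir liuli_config
# Database
mkdir mongodb_data
# Download the docker-compose configuration
# If your network is poor, create the file manually; see the appendix for its content
wget https://raw.githubusercontent.com/howie6879/liuli/main/docker-compose.yaml
# Configure pro.env; for details, see the Liuli environment variables under global configuration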
vim pro.env
If you want to know the details of pro.env, it is recommended to check Liuli's global configuration. Of course, it doesn't matter if you don't want to read it; just follow this tutorial to fill it in. First, copy the following configuration into pro.env:
PYTHONPATH=${PYTHONPATH}:${PWD}
LL_M_USER="liuli"
LL_M_PASS="liuli"
LL_M_HOST="liuli_mongodb"
LL_M_PORT="27017"
LL_M_DB="admin"
LL_M_OP_DB="liuli"
LL_FLASK_DEBUG=0
LL_HOST="0.0.0.0"
LL_HTTP_PORT=8765
LL_WORKERS=1
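# None of the settings above need to change; only the ones below need to be configured
# Fill in your actual IP
LL_DOMAIN="http://{real_ip}:8765"
# Fill in the WeChat distribution settings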
LL_WECOM_ID=""
LL_WECOM_AGENT_ID="-1"
LL_WECOM_SECRET=""
Assuming that, like me, you use WeChat as the distribution terminal, you only need to obtain the following parameters through the steps below:
- LL_WECOM_ID
- LL_WECOM_AGENT_ID
- LL_WECOM_SECRET
The process is as follows. First, use your mobile phone number to register an enterprise WeChat account.
First create the application:
Get the relevant IDs:
The enterprise ID is at My Enterprise -> Enterprise Information -> Enterprise ID.
To conveniently receive messages in WeChat, remember to enable the WeChat plugin: go to the location shown below and scan the QR code to follow:
Now that you have obtained the three parameters above, fill them into the corresponding entries in pro.env.
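Before starting the containers, it can help to confirm that the three WeChat values were actually filled in. The following is a small sketch, not part of Liuli: `parse_env` and `missing_wecom_settings` are hypothetical helper names, the parser only handles simple `KEY=VALUE` lines, and treating `-1` as "not configured" is an assumption based on the template value of LL_WECOM_AGENT_ID.

```python
from typing import Dict, List

REQUIRED_WECOM_KEYS = ("LL_WECOM_ID", "LL_WECOM_AGENT_ID", "LL_WECOM_SECRET")


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop the surrounding double quotes used in pro.env.
        values[key.strip()] = value.strip().strip('"')
    return values


def missing_wecom_settings(env: Dict[str, str]) -> List[str]:
    """Return the WeChat keys that are empty or still hold the placeholder -1."""
    return [key for key in REQUIRED_WECOM_KEYS if env.get(key, "") in ("", "-1")]


template = '''
# Fill in the WeChat distribution settings
LL_WECOM_ID=""
LL_WECOM_AGENT_ID="-1"
LL_WECOM_SECRET=""
'''
print(missing_wecom_settings(parse_env(template)))
# ['LL_WECOM_ID', 'LL_WECOM_AGENT_ID', 'LL_WECOM_SECRET']
```

To check a real file, read it with `open("pro.env", encoding="utf-8").read()` and pass the text to `parse_env`; an empty list means all three values are set.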
Task configuration
The task configuration lets users use Liuli in a more personalized way, so as to meet their various needs. Currently, Liuli only supports collecting, filtering, distributing, and backing up public accounts, which is the core purpose of this article. Copy the following into liuli_config/default.json:
{
    "name": "default",
    "author": "liuli_team",
    "collector": {
        "wechat_sougou": {
            "wechat_list": [
                "老胡的储物柜"
            ],
            "delta_time": 5,
            "spider_type": "playwright"
        }
    },
    "processor": {
        "before_collect": [],
        "after_collect": [{
            "func": "ad_marker",
            "cos_value": 0.6
        }, {
            "func": "to_rss",
            "link_source": "github"
        }]
    },
    "sender": {
        "sender_list": ["wecom"],
        "query_days": 7,
        "delta_time": 3
    },
    "backup": {
        "backup_list": ["mongodb"],
        "query_days": 7,
        "delta_time": 3,
        "init_config": {},
        "after_get_content": [{
            "func": "str_replace",
            "before_str": "data-src=\"",
            "after_str": "src=\"https://images.weserv.nl/?url="
        }]
    },
    "schedule": {
        "period_list": [
            "00:10",
            "12:10",
            "21:10"
        ]
    }
}
Remember to change wechat_list to the public accounts you want to subscribe to. A configuration interface will be provided later; for now, just edit the file directly.
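As a quick sanity check of the task configuration, the sketch below reads the two fields most people will edit: the account list and the schedule. It is only an illustration. The helper names `subscribed_accounts` and `next_run` are hypothetical, and it assumes that the entries in period_list are daily local times in HH:MM form; how Liuli's own scheduler interprets them is not covered here.

```python
import json
from datetime import datetime, timedelta
from typing import List

# Trimmed copy of the task configuration; only the fields used here are kept.
CONFIG_TEXT = """
{
    "collector": {"wechat_sougou": {"wechat_list": ["老胡的储物柜"]}},
    "schedule": {"period_list": ["00:10", "12:10", "21:10"]}
}
"""


def subscribed_accounts(config: dict) -> List[str]:
    """Return the public accounts listed under the wechat_sougou collector."""
    return config["collector"]["wechat_sougou"]["wechat_list"]


def next_run(period_list: List[str], now: datetime) -> datetime:
    """Return the first HH:MM slot after `now`, rolling over to the next day."""
    candidates = []
    for slot in period_list:
        hour, minute = map(int, slot.split(":"))
        run_at = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
        if run_at <= now:
            run_at += timedelta(days=1)
        candidates.append(run_at)
    return min(candidates)


config = json.loads(CONFIG_TEXT)
print(subscribed_accounts(config))
print(next_run(config["schedule"]["period_list"], datetime(2022, 1, 26, 23, 9)))
# 2022-01-27 00:10:00
```

To inspect your own file, replace `json.loads(CONFIG_TEXT)` with `json.load(open("liuli_config/default.json", encoding="utf-8"))`.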
Startup
Thank you for reading this far. Only one command stands between you and success. First, check that the file tree in the liuli directory looks like this:
(base) [liuli] tree -L 1
├── docker-compose.yaml
├── liuli_config
│   └── default.json
├── mongodb_data
└── pro.env
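If you prefer to check the layout with a script, here is a minimal sketch. The `EXPECTED` tuple simply mirrors the tree above, and `missing_entries` is a hypothetical helper, not something Liuli ships.

```python
from pathlib import Path
from typing import List

# Files and folders this tutorial creates inside the liuli directory.
EXPECTED = (
    "docker-compose.yaml",
    "pro.env",
    "liuli_config/default.json",
    "mongodb_data",
)


def missing_entries(base: Path) -> List[str]:
    """Return the expected paths that do not exist under `base`."""
    return [name for name in EXPECTED if not (base / name).exists()]


if __name__ == "__main__":
    # Run this from inside the liuli directory.
    missing = missing_entries(Path("."))
    print("missing: " + ", ".join(missing) if missing else "layout looks complete")
```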
After confirming that there is no problem, execute:
docker-compose up -d
If all goes well, you will see Docker start these three containers:
Check the liuli_schedule logs; the output is as follows:
Loading .env environment variables...
[2022:01:26 23:09:24] INFO Liuli Schedule(v0.1.5) started successfully :)
[2022:01:26 23:09:24] INFO Liuli Schedule time:
00:10
12:10
21:10
[2022:01:26 23:09:36] INFO Liuli playwright 匹配公众号 老胡的储物柜(howie_locker) 成功! 正在提取最新文章: 我的周刊(第023期)
[2022:01:26 23:09:39] INFO Liuli 公众号文章持久化成功! 👉 老胡的储物柜
[2022:01:26 23:09:40] INFO Liuli 🤗 微信公众号文章更新完毕(1/1)
...
[2022:01:26 23:09:45] INFO Liuli 备份器执行完毕!
After execution completes, you can log in to the MongoDB database, where the following collections will appear:
- liuli_articles: article metadata
- liuli_backup: backups of all articles
- liuli_rss: generated RSS
- liuli_send_list: distribution status
- liuli_backup_list: backup status
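To look at these collections from the host machine, you need a connection string. The sketch below builds one from the values used in this tutorial: the liuli/liuli credentials from pro.env and the host port 27027 that the appendix's docker-compose.yaml maps to the container's 27017. Treating LL_M_DB ("admin") as the authentication database is my reading of the variable names, so take it as an assumption; `mongo_uri` is a hypothetical helper.

```python
from urllib.parse import quote_plus


def mongo_uri(user: str, password: str, host: str, port: int,
              op_db: str, auth_db: str) -> str:
    """Build a standard MongoDB connection URI with escaped credentials."""
    return (
        f"mongodb://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{op_db}?authSource={auth_db}"
    )


print(mongo_uri("liuli", "liuli", "127.0.0.1", 27027, "liuli", "admin"))
# mongodb://liuli:liuli@127.0.0.1:27027/liuli?authSource=admin
```

The resulting string can be handed to a MongoDB client of your choice, for example `pymongo.MongoClient(uri)` if you have pymongo installed.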
Suppose your subscribed public account is 老胡的储物柜. After startup, you can access its RSS subscription address at http://ip:8765/rss/liuli_wechat/老胡的储物柜/ , and the result looks like this:
Pay attention to the red box: because I am using the GitHub backup, the address shown is a GitHub address. If you want to use this as well, refer to the backup configuration tutorial. The effect of my GitHub backup is as follows:
Articles updated each day are automatically synchronized to the project by Liuli. If everyone uses Liuli's GitHub backup and the backup results are combined, it would be an enormous force, and something to look forward to.
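Because the account name in the RSS address is Chinese, some clients need it percent-encoded. The following sketch builds the address and pulls titles and links out of a feed. It assumes the feed is RSS 2.0 with `item` elements containing `title` and `link`, which I have not verified against Liuli's output; `rss_url` and `parse_items` are hypothetical helpers, and the sample XML and its example.com link are made up for the demonstration.

```python
from typing import List, Tuple
from urllib.parse import quote
import xml.etree.ElementTree as ET


def rss_url(domain: str, account: str) -> str:
    """Build the per-account RSS address, percent-encoding the account name."""
    return f"{domain.rstrip('/')}/rss/liuli_wechat/{quote(account)}/"


def parse_items(rss_xml: str) -> List[Tuple[str, str]]:
    """Return (title, link) pairs for every <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_xml)
    return [
        (item.findtext("title", default=""), item.findtext("link", default=""))
        for item in root.iter("item")
    ]


# Made-up sample feed, used only to show the parsing step.
SAMPLE = (
    "<rss><channel><title>demo</title>"
    "<item><title>Weekly 023</title><link>https://example.com/023</link></item>"
    "</channel></rss>"
)

print(rss_url("http://127.0.0.1:8765", "老胡的储物柜"))
print(parse_items(SAMPLE))  # [('Weekly 023', 'https://example.com/023')]
```

To read the live feed, fetch the address with `urllib.request.urlopen(url).read().decode("utf-8")` and pass the text to `parse_items`.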
Showcase
Once Liuli has started successfully, users mainly notice it at the distribution and subscription layers.
WeChat distribution terminal renderings:
The subscription effect is as follows:
Notes
This project is still at a very early stage. If you find it useful, I hope you will start using it soon and send your feedback as soon as possible, so that Liuli can grow more quickly.
If you think this project is good, please give Liuli a star on GitHub. The project address is here 👉 liuli-io/liuli .
If you have any questions or comments while building or using it, you can open an Issue directly or join the group to chat in detail (if the group invitation has expired, my WeChat is on GitHub).
Appendix
The docker-compose.yaml configuration is as follows:
version: "3"
services:
  liuli_api:
    image: liuliio/api:v0.1.1
    restart: always
    container_name: liuli_api
    ports:
      - "8765:8765"
    volumes:
      - ./pro.env:/data/code/pro.env
    links:
      - liuli_mongodb
    depends_on:
      - liuli_mongodb
    networks:
      - liuli-network
  liuli_schedule:
    image: liuliio/schedule:v0.1.5
    restart: always
    container_name: liuli_schedule
    volumes:
      - ./pro.env:/data/code/pro.env
      - ./liuli_config:/data/code/liuli_config
    links:
      - liuli_mongodb
    depends_on:
      - liuli_mongodb
    networks:
      - liuli-network
  liuli_mongodb:
    image: mongo:3.6
    restart: always
    container_name: liuli_mongodb
    environment:
      - MONGO_INITDB_ROOT_USERNAME=liuli
      - MONGO_INITDB_ROOT_PASSWORD=liuli
    ports:
      - "27027:27017"
    volumes:
      - ./mongodb_data:/data/db
    command: mongod
    networks:
      - liuli-network
networks:
  liuli-network:
    driver: bridge