Building a clean RSS feed of WeChat public accounts with Liuli

First of all, what is Liuli? It is an open source project I recently developed. Its main purpose is to let people with reading habits quickly build a multi-source, clean, and personalized reading environment.

Why is it called Liuli?

Liuli was originally named 2C. Friends in the exchange group suggested the name Liuli, which was taken from Temple", and its meaning fits the purpose of the project very well:

Build a pure land of reading for users, like the Eastern Liuli (glazed) world

It has been a month since the last release (I have really been procrastinating and need to reflect on that), and today I am happy to announce that Liuli has another wave of updates! 🥳 See the v0.2.0 task board.

Next, using the latest version of Liuli, v0.1.5, I will show you how to build a clean RSS feed of public accounts with Liuli. The final subscription result is shown below:

liuli_demo

Start

First of all, I have three requirements:

Aggregate the public accounts I currently subscribe to and output them as RSS, so that each can be subscribed to separately
Identify advertisements in subscribed articles
Make a quick backup of each article

These requirements are exactly the subject of this article: building a clean RSS feed of public accounts. The implementation details will not be discussed here; if you are interested, you can ask in the exchange group. This article only describes how to use it.

If you have read my earlier article on creating a clean and personalized public account reading environment, you may already have some understanding of concepts such as collectors, processors, and distributors. To keep this article self-contained, let's introduce them again. First, look at the architecture diagram:

liuli_process

A brief explanation:

Collector: monitors custom reading sources that the user follows, such as public accounts or blogs, and feeds them into Liuli as input in a unified standard format;
Processor: applies custom processing to the target content, such as automatically tagging advertisements with a machine-learning classifier built on historical advertisement data, or running hook functions at the relevant nodes;
Distributor: relies on the interface layer for data requests and responses, offers users personalized configuration, then distributes automatically according to that configuration, pushing clean articles to WeChat, DingTalk, TG, or even self-built websites;
Backup: backs up the processed articles, for example by persisting them to a database or GitHub.
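
To make the flow above concrete, here is a minimal Python sketch of the four stages chained together. It is not Liuli's actual code: the `Article` class, the function names, and the sample data are all hypothetical, and a trivial keyword match stands in for the machine-learning advertisement classifier described above.

```python
from dataclasses import dataclass, field


@dataclass
class Article:
    """Hypothetical unified format that every collector emits."""
    source: str
    title: str
    content: str
    tags: list = field(default_factory=list)


def collect(raw_items):
    """Collector: normalise raw (source, title, content) tuples."""
    return [Article(source=s, title=t, content=c) for s, t, c in raw_items]


def process(articles, ad_keywords=("discount", "promo")):
    """Processor: tag suspected ads (keyword match stands in for a classifier)."""
    for article in articles:
        text = (article.title + " " + article.content).lower()
        if any(word in text for word in ad_keywords):
            article.tags.append("ad")
    return articles


def distribute(articles):
    """Distributor: return the messages a sender would push for clean articles."""
    return [f"[{a.source}] {a.title}" for a in articles if "ad" not in a.tags]


def backup(articles, store):
    """Backup: persist every processed article (a dict stands in for a database)."""
    for article in articles:
        store[(article.source, article.title)] = article.content
    return store


raw = [
    ("blog", "Weekly notes", "Things I read this week."),
    ("blog", "Big promo", "Use this discount code today."),
]
articles = process(collect(raw))
messages = distribute(articles)
store = backup(articles, {})
print(messages)
```

In this sketch the article tagged as an advertisement is still backed up but is not distributed; whether Liuli itself drops or merely marks such articles depends on its configuration.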

In fact, it does not matter if you do not fully understand the process; you only need to know how to use it. Next, follow the tutorial step by step. It is best to have a computer at hand so you can follow along.

Usage

All right, the main show begins. Deploying Liuli is quite simple. Docker deployment is recommended, so before you start, make sure Docker is installed on your machine; if it is not, install it first.

Configuration

Liuli's configuration is currently divided into two parts:

Global configuration: the global environment variables; see Liuli environment variables
Task configuration: a configuration written for the specific problem the user wants to solve. For example, this article produces a configuration that collects public accounts, processes them, and outputs them as RSS (you can copy my configuration when you use it)

Global configuration

First, the global configuration. The default configuration is actually enough to get everyone running, but if you need to distribute articles to WeChat or DingTalk, you need to fill in the relevant settings. Let's get started: open a terminal, or use whatever method you are familiar with, and create the following folders and files.

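mkdir liuli
cd liuli
# Holds the scheduled task configuration; the default file name is default.json
mkdir liuli_config
# Database data
mkdir mongodb_data
# Download the docker-compose configuration
# If your network is poor, create the file by hand; its contents are in the appendix
wget https://raw.githubusercontent.com/howie6879/liuli/main/docker-compose.yaml
# Configure pro.env; see the Liuli environment variables under Global configuration for details
vim pro.env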

For details about pro.env, check the Liuli global configuration. If you would rather not read it, that is fine too; just follow this tutorial. First, copy the following configuration into pro.env:

PYTHONPATH=${PYTHONPATH}:${PWD}
LL_M_USER="liuli"
LL_M_PASS="liuli"
LL_M_HOST="liuli_mongodb"
LL_M_PORT="27017"
LL_M_DB="admin"
LL_M_OP_DB="liuli"
LL_FLASK_DEBUG=0
LL_HOST="0.0.0.0"
LL_HTTP_PORT=8765
LL_WORKERS=1
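# None of the settings above need changing; only the ones below are specific to you
# Fill in your actual IP
LL_DOMAIN="http://{real_ip}:8765"
# Fill in the WeChat (WeCom) distribution settings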
LL_WECOM_ID=""
LL_WECOM_AGENT_ID="-1"
LL_WECOM_SECRET=""

Assuming you use WeChat as the distribution terminal as I do, you only need to obtain the following three parameters:

LL_WECOM_ID
LL_WECOM_AGENT_ID
LL_WECOM_SECRET

The process is as follows. First, register an enterprise WeChat (WeCom) account with your mobile phone number.

Then create the application:

Get the related IDs:

The enterprise ID is at My Enterprise -> Enterprise Information -> Enterprise ID.

To receive messages conveniently in WeChat, remember to enable the WeChat plugin: go to the location shown below and scan the QR code to follow it:

Now that you have all three parameters, fill them into the corresponding entries in pro.env.
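
Before starting the containers, it can help to confirm that the three WeCom values were actually filled in. The following is a small sketch of such a check, not part of Liuli: the `parse_env` and `unset_wecom_keys` helpers are hypothetical, and treating `"-1"` as an unfilled placeholder is an assumption based on the default `LL_WECOM_AGENT_ID="-1"` shown above.

```python
REQUIRED_KEYS = ("LL_WECOM_ID", "LL_WECOM_AGENT_ID", "LL_WECOM_SECRET")


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"')
    return values


def unset_wecom_keys(env):
    """Return required keys that are missing, empty, or still "-1" (assumed placeholder)."""
    return [k for k in REQUIRED_KEYS if env.get(k, "") in ("", "-1")]


sample = '''
# WeCom distribution settings
LL_WECOM_ID=""
LL_WECOM_AGENT_ID="-1"
LL_WECOM_SECRET="abc123"
'''
print(unset_wecom_keys(parse_env(sample)))
```

To check a real file, pass the contents of pro.env (for example `open("pro.env", encoding="utf-8").read()`) instead of `sample`; an empty list means all three values are set.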

Task configuration

The task configuration lets users tailor Liuli to their own needs. Currently, Liuli only supports collecting, filtering, distributing, and backing up public accounts, which is the core purpose of this article. Copy the following into liuli_config/default.json:

{
    "name": "default",
    "author": "liuli_team",
    "collector": {
        "wechat_sougou": {
            "wechat_list": [
                "老胡的储物柜"
            ],
            "delta_time": 5,
            "spider_type": "playwright"
        }
    },
    "processor": {
        "before_collect": [],
        "after_collect": [{
            "func": "ad_marker",
            "cos_value": 0.6
        }, {
            "func": "to_rss",
            "link_source": "github"
        }]
    },
    "sender": {
        "sender_list": ["wecom"],
        "query_days": 7,
        "delta_time": 3
    },
    "backup": {
        "backup_list": ["mongodb"],
        "query_days": 7,
        "delta_time": 3,
        "init_config": {},
        "after_get_content": [{
            "func": "str_replace",
            "before_str": "data-src=\"",
            "after_str": "src=\"https://images.weserv.nl/?url="
        }]
    },
    "schedule": {
        "period_list": [
            "00:10",
            "12:10",
            "21:10"
        ]
    }
}
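
The `after_get_content` hook in the backup section is easier to follow with an example. Assuming `str_replace` performs a plain substring replacement (its real implementation is not shown in this article), the configured pair rewrites lazy-loaded `data-src` image attributes into `src` attributes that point at the images.weserv.nl proxy. The sample HTML and image URL below are made up for illustration.

```python
def str_replace(html, before_str, after_str):
    """Replace every occurrence of before_str with after_str."""
    return html.replace(before_str, after_str)


# Made-up article fragment with a lazy-loaded image
html = '<img data-src="https://mmbiz.qpic.cn/example.png">'
rewritten = str_replace(
    html,
    before_str='data-src="',
    after_str='src="https://images.weserv.nl/?url=',
)
print(rewritten)
# <img src="https://images.weserv.nl/?url=https://mmbiz.qpic.cn/example.png">
```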

Remember to change wechat_list to the accounts you want to subscribe to. A configuration interface for this part will come later, so make do with editing the file for now.
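
Since the file is edited by hand, a quick sanity check can catch typos before startup. This is a hedged sketch rather than anything Liuli ships: `check_task_config` is a hypothetical helper, and it assumes that `period_list` entries are 24-hour `HH:MM` strings, as the values in the configuration above suggest.

```python
import json
import re

# 24-hour HH:MM, e.g. 00:10 or 21:10
TIME_PATTERN = re.compile(r"^(?:[01]\d|2[0-3]):[0-5]\d$")


def check_task_config(config):
    """Return a list of human-readable problems found in a task config dict."""
    problems = []
    accounts = config.get("collector", {}).get("wechat_sougou", {}).get("wechat_list", [])
    if not accounts:
        problems.append("wechat_list is empty")
    for period in config.get("schedule", {}).get("period_list", []):
        if not TIME_PATTERN.match(period):
            problems.append(f"bad schedule time: {period}")
    return problems


config = json.loads("""
{
    "collector": {"wechat_sougou": {"wechat_list": ["老胡的储物柜"]}},
    "schedule": {"period_list": ["00:10", "12:10", "25:00"]}
}
""")
print(check_task_config(config))
# ['bad schedule time: 25:00']
```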

Start up

Thank you for reading this far. You are now one command away from success. First, check that the file tree in the liuli directory looks like this:

(base) [liuli] tree -L 1
├── docker-compose.yaml
├── liuli_config
│   └── default.json
├── mongodb_data
└── pro.env
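
The same check can be scripted. The sketch below is a hypothetical helper, not part of Liuli; it simply tests for the four entries created earlier in this tutorial.

```python
from pathlib import Path

# Entries created in the steps above, relative to the liuli directory
EXPECTED = (
    "docker-compose.yaml",
    "pro.env",
    "liuli_config/default.json",
    "mongodb_data",
)


def missing_entries(root):
    """Return the expected files/directories that do not exist under root."""
    root = Path(root)
    return [name for name in EXPECTED if not (root / name).exists()]


if __name__ == "__main__":
    # Run from inside the liuli directory; an empty list means everything is in place.
    print(missing_entries("."))
```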

After confirming that there is no problem, execute:

docker-compose up -d

If nothing goes wrong, you will see Docker start three containers (liuli_api, liuli_schedule, and liuli_mongodb):

Check the logs of liuli_schedule; the output looks like this:

Loading .env environment variables...
[2022:01:26 23:09:24] INFO  Liuli Schedule(v0.1.5) started successfully :)
[2022:01:26 23:09:24] INFO  Liuli Schedule time:
 00:10
 12:10
 21:10
[2022:01:26 23:09:36] INFO  Liuli playwright 匹配公众号 老胡的储物柜(howie_locker) 成功! 正在提取最新文章: 我的周刊(第023期)
[2022:01:26 23:09:39] INFO  Liuli 公众号文章持久化成功! 👉 老胡的储物柜
[2022:01:26 23:09:40] INFO  Liuli 🤗 微信公众号文章更新完毕(1/1)
...
[2022:01:26 23:09:45] INFO  Liuli 备份器执行完毕!
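
The log lists the schedule times taken from `period_list`. Assuming these are daily wall-clock times (an assumption; the scheduler's internals are not covered here), the next run after a given moment can be worked out as in this sketch. The `next_run` function is hypothetical.

```python
from datetime import datetime, timedelta


def next_run(now, period_list):
    """Return the first scheduled datetime strictly after `now`."""
    candidates = []
    for period in period_list:
        hour, minute = map(int, period.split(":"))
        run = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
        if run <= now:
            run += timedelta(days=1)  # already passed today, so use tomorrow
        candidates.append(run)
    return min(candidates)


# Start time taken from the log above
started = datetime(2022, 1, 26, 23, 9, 24)
print(next_run(started, ["00:10", "12:10", "21:10"]))
# 2022-01-27 00:10:00
```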

After the run completes, you can connect to MongoDB and find the following collections:

liuli_articles: metadata of collected articles
liuli_backup: backups of all articles
liuli_rss: generated RSS
liuli_send_list: distribution status
liuli_backup_list: backup status
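
To look at these collections from the host machine, you need a connection string. The sketch below builds one from the values used in this tutorial; it assumes that `LL_M_DB` ("admin") is the authentication database and that the host port is 27027, as mapped in the docker-compose.yaml in the appendix. The `mongo_uri` helper is hypothetical.

```python
from urllib.parse import quote_plus


def mongo_uri(user, password, host, port, auth_db):
    """Build a MongoDB connection URI, escaping the credentials."""
    return (
        f"mongodb://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/?authSource={auth_db}"
    )


# Host-side view: the compose file maps container port 27017 to host port 27027.
uri = mongo_uri("liuli", "liuli", "localhost", 27027, "admin")
print(uri)
# mongodb://liuli:liuli@localhost:27027/?authSource=admin
```

With the `pymongo` package installed, something like `MongoClient(uri)["liuli"].list_collection_names()` should then list the collections, assuming `LL_M_OP_DB` ("liuli") names the database that holds them.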

Suppose the public account 老胡的储物柜 is in your source list. After startup, you can access its RSS subscription address at http://ip:8765/rss/liuli_wechat/老胡的储物柜/ , and the result looks like this:

Note the red box: because I use the GitHub backup, the address shown is a GitHub address. If you want the same setup, see the backup configuration tutorial. With the GitHub backup, the result looks like this:

Articles are updated daily and Liuli syncs them to the project automatically. If everyone used Liuli's GitHub backup and the backup results were combined, it would be an enormous force, so that is something to look forward to.
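
Because the account name in the RSS address contains Chinese characters, some clients may need it percent-encoded. The following sketch builds the address under the assumption that feeds live at `/rss/liuli_wechat/<account name>/` on port 8765, as in the address given above; `rss_url` is a hypothetical helper and 127.0.0.1 is a placeholder for your own IP.

```python
from urllib.parse import quote, unquote


def rss_url(domain, account, feed_type="liuli_wechat"):
    """Build the feed URL, percent-encoding the account name for use in a path."""
    return f"{domain.rstrip('/')}/rss/{feed_type}/{quote(account, safe='')}/"


url = rss_url("http://127.0.0.1:8765", "老胡的储物柜")
print(url)           # percent-encoded form for feed readers
print(unquote(url))  # human-readable form
```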

Results

Once Liuli has started successfully, what users mainly notice is the distribution and subscription layer.

How it looks on the WeChat distribution side:

The subscription effect is as follows:

Notes

This project is still at a very early stage. If you find it useful, I hope you will start using it soon and give feedback as early as possible, so that Liuli can grow more quickly.

If you think this project is good, please give Liuli a star on GitHub. The project address is here 👉 liuli-io/liuli.

If you have any questions or comments while building or using it, you can open an Issue directly or join the group to chat in detail (if the group link has expired, my WeChat is on GitHub).

Appendix

The docker-compose.yaml configuration is as follows:

version: "3"
services:
  liuli_api:
    image: liuliio/api:v0.1.1
    restart: always
    container_name: liuli_api
    ports:
      - "8765:8765"
    volumes:
      - ./pro.env:/data/code/pro.env
    links:
      - liuli_mongodb
    depends_on:
      - liuli_mongodb
    networks:
      - liuli-network
  liuli_schedule:
    image: liuliio/schedule:v0.1.5
    restart: always
    container_name: liuli_schedule
    volumes:
      - ./pro.env:/data/code/pro.env
      - ./liuli_config:/data/code/liuli_config
    links:
      - liuli_mongodb
    depends_on:
      - liuli_mongodb
    networks:
      - liuli-network
  liuli_mongodb:
    image: mongo:3.6
    restart: always
    container_name: liuli_mongodb
    environment:
      - MONGO_INITDB_ROOT_USERNAME=liuli
      - MONGO_INITDB_ROOT_PASSWORD=liuli
    ports:
      - "27027:27017"
    volumes:
      - ./mongodb_data:/data/db
    command: mongod
    networks:
      - liuli-network

networks:
  liuli-network:
    driver: bridge

howie6879