Scaling Our Rate Limits to Prepare for a Billion Active Certificates

Let's Encrypt protects a large portion of the web: more than 550 million websites use its TLS certificates, a figure that grew by 42% last year, and it issues over 340,000 certificates per hour. Infrastructure at this scale relies on rate limiting. The first rate limiting system, introduced in 2015 and built on MariaDB, had limitations. A new system powered by Redis and a virtual scheduling algorithm was developed to handle future growth, reduce load on MariaDB, and adapt to subscriber usage patterns.
Rate limiting a free service is hard. In 2015, issuing certificates for free posed challenges that called for an atypical approach: Let's Encrypt limited the number of certificates issued per registered domain per week. Counting events was easy at first but became expensive once more rate limits were added in 2019. Offloading reads bought some runway and improved database health, but latency during peak hours persisted. Sliding windows were also frustrating for subscribers, who hit limits unexpectedly. A patch in 2022 improved one limit, but more was needed.
In 2023, the flagging rate limit code was putting the MariaDB databases at risk. The authorizations table was read heavily, and deletions were slow. By late 2023 a reckoning was due, and a new rate limiting system was designed.
The solution combines Redis for storage with the Generic Cell Rate Algorithm (GCRA) for enforcement. Redis was chosen for its high throughput and low latency; it reduces read and write pressure on the database and allows efficient key management. GCRA is a virtual scheduling algorithm that enforces rate limits continuously, using parameters such as the emission interval and the burst tolerance. Capacity refills automatically as time passes, and the algorithm is efficient in both storage and computation.
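To make the GCRA description concrete, here is a minimal Python sketch of one common formulation of the algorithm. It is an illustration under stated assumptions, not Let's Encrypt's actual code: the names `Limit`, `GCRALimiter`, and `spend` are hypothetical, and a plain in-memory dictionary stands in for Redis, where each key would hold a single theoretical arrival time (TAT).

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Limit:
    """Hypothetical limit description: `count` requests per `period` seconds,
    with at most `burst` requests allowed back to back."""
    burst: int
    count: int
    period: float

    @property
    def emission_interval(self) -> float:
        # Ideal spacing between requests, in seconds.
        return self.period / self.count

    @property
    def burst_offset(self) -> float:
        # How far ahead of "now" the TAT may run before requests are denied.
        return self.emission_interval * self.burst


class GCRALimiter:
    """In-memory GCRA sketch. The dict is an assumed stand-in for Redis:
    one stored value (the TAT) per rate-limited key."""

    def __init__(self, limit: Limit) -> None:
        self.limit = limit
        self._tats: Dict[str, float] = {}

    def spend(self, key: str, now: float) -> Tuple[bool, float]:
        """Try to spend one request at time `now` (seconds).

        Returns (allowed, retry_after_seconds)."""
        # A missing or stale TAT means the key has full capacity again,
        # which is how capacity "refills" without any background job.
        tat = max(self._tats.get(key, now), now)
        new_tat = tat + self.limit.emission_interval
        allow_at = new_tat - self.limit.burst_offset
        if now < allow_at:
            # Denied: state is left unchanged; report when to retry.
            return False, allow_at - now
        self._tats[key] = new_tat
        return True, 0.0
```

With `Limit(burst=2, count=2, period=10.0)`, two calls to `spend("example.com", 0.0)` are allowed, a third at the same instant is denied with a retry delay of 5 seconds, and a call at `now=5.0` succeeds again. Only one number is stored per key, which is what makes the approach cheap in both storage and computation.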
The results were immediate. Database load dropped, response times improved, and performance stayed consistent during peak traffic. Subscribers experienced smoother behavior, and the system became more permissive without sacrificing scalability or fairness. The new system also helped in tracking zombie clients and reducing the resources they consume. Redis demonstrated that it scales, even with a significant increase in the number of unique theoretical arrival times (TATs) stored.
The next step is to bring the new infrastructure to many other ACME endpoints, which will give better control over the feedback subscribers receive. The goal is to keep adapting to the evolving web while providing free certificates in a user-friendly way.
