Interviewer: Talk about the snowflake algorithm, the more detailed the better

The previous article on generating distributed unique IDs mentioned the snowflake algorithm. This article is devoted to it and explains it in detail.

SnowFlake algorithm

According to Charles Knight of the National Center for Atmospheric Research, the average snowflake consists of approximately 10^19 water molecules. As a snowflake forms, it grows different structural branches, so no two snowflakes in nature are identical; each has its own beautiful and unique shape. The name of the algorithm expresses that each generated ID is as unique as a snowflake.
Snowflake is Twitter's open-source distributed ID generation algorithm, and the result is a 64-bit long ID. The core idea is: the highest bit is a sign bit that is always 0, 41 bits hold the timestamp in milliseconds, 10 bits hold the machine ID (5 bits for the data center and 5 bits for the machine), and 12 bits hold a sequence number within the millisecond (meaning that each node can generate 4096 IDs per millisecond).

Core idea: Distributed and unique.

Algorithm specific introduction

A snowflake ID is a 64-bit binary number made up of four parts:

The first bit is the sign bit, that is, the highest bit. It is always 0 and carries no information: in two's complement, a highest bit of 1 means a negative number and 0 means a non-negative number, and IDs are expected to be positive.
The next 41 bits are the timestamp, in milliseconds. 41 bits cover about 69 years (2^41 ms is roughly 69.7 years). Because time keeps increasing, IDs can be sorted by it.
The next 10 bits are the machine ID. They can all be used as the machine ID, or they can be split into a machine room (data center) ID plus a machine ID. 10 bits can represent up to 1024 machines.
The last 12 bits are a sequence number, a counter that lets one machine generate different IDs within the same millisecond. 12 bits can distinguish 4096 IDs.
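
To make the layout concrete, here is a minimal sketch that packs the three variable fields into a long and unpacks them again. The class and method names are made up for illustration, the 10 machine bits are used unsplit, and `timestamp` is assumed to be an offset in milliseconds that already fits in 41 bits.

```java
// Illustrative sketch of the 1 + 41 + 10 + 12 bit layout described above.
// Names are hypothetical; this is not Twitter's implementation.
class SnowflakeLayout {
    static final int SEQUENCE_BITS = 12;
    static final int MACHINE_BITS = 10;
    static final int TIMESTAMP_SHIFT = SEQUENCE_BITS + MACHINE_BITS; // 22

    static final long SEQUENCE_MASK = (1L << SEQUENCE_BITS) - 1; // 4095
    static final long MACHINE_MASK = (1L << MACHINE_BITS) - 1;   // 1023

    // Assumes timestamp < 2^41, machineId <= 1023 and sequence <= 4095,
    // so the fields never overlap and the sign bit stays 0.
    static long compose(long timestamp, long machineId, long sequence) {
        return (timestamp << TIMESTAMP_SHIFT)
                | (machineId << SEQUENCE_BITS)
                | sequence;
    }

    static long timestampOf(long id) {
        return id >>> TIMESTAMP_SHIFT;
    }

    static long machineOf(long id) {
        return (id >>> SEQUENCE_BITS) & MACHINE_MASK;
    }

    static long sequenceOf(long id) {
        return id & SEQUENCE_MASK;
    }

    public static void main(String[] args) {
        long id = compose(1000L, 33L, 7L);
        System.out.println(id);
        System.out.println(timestampOf(id) + " " + machineOf(id) + " " + sequenceOf(id));
    }
}
```

Because the timestamp occupies the highest usable bits, comparing two IDs from the same machine as plain numbers compares their timestamps first, which is why the IDs sort by time.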

Optimization

The timestamp has only 41 bits. If it counts from 1970, it lasts only about 69 years, and the years that have already passed since 1970 are wasted. To avoid that waste, we can store a relative time instead, measured from the time the project started, which gives about 69 years counted from that start. A service that hands out unique IDs also has to be fast, so the implementation uses only bitwise and shift operations. The current time can be obtained with System.currentTimeMillis().
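
The arithmetic behind the 69 years can be checked with a short sketch. The class and method names are hypothetical; the project epoch is the `twepoch` value used in the code later in this article.

```java
import java.time.Instant;

// Hypothetical helper: how long does a 41-bit millisecond counter last,
// and when does it run out for a given starting epoch?
class EpochRange {
    // Largest offset that fits in 41 bits.
    static final long MAX_OFFSET_MILLIS = (1L << 41) - 1;

    // Last instant that can still be encoded when counting from epochMillis.
    static Instant lastUsableInstant(long epochMillis) {
        return Instant.ofEpochMilli(epochMillis + MAX_OFFSET_MILLIS);
    }

    // Approximate span in years, using 365-day years.
    static double yearsCovered() {
        return MAX_OFFSET_MILLIS / (1000.0 * 60 * 60 * 24 * 365);
    }

    public static void main(String[] args) {
        long projectEpoch = 1634393012000L; // twepoch from the article's code
        System.out.println(yearsCovered());                  // about 69.7
        System.out.println(lastUsableInstant(0L));           // counting from 1970
        System.out.println(lastUsableInstant(projectEpoch)); // counting from the project start
    }
}
```

Counting from 1970, the range ends in 2039; counting from the project epoch pushes the end out by exactly the time that had already passed since 1970.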

Clock rollback problem

Reading the time can run into the clock rollback problem. What is it? The time on the server suddenly jumps back to an earlier time. Possible causes:

The system time was changed manually.
Machines sometimes need to synchronize their clocks, and because the clocks on different machines drift apart, a synchronization can set the time back.

Solution

When the rollback is small, generate no ID and wait in a loop until the clock catches up with the last recorded time.
The scheme above only suits small rollbacks. If the gap is large, blocking and waiting is not advisable. So either a rollback beyond a certain size is reported as an error and service is refused, or an extension bit is used: after a rollback, add 1 to the extension bits so that the IDs stay unique. This requires reserving some bits in advance, taken from either the machine ID or the sequence number. Each time the clock is set back, the value in those bits is incremented by 1.
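
The first two options (wait out a small rollback, refuse service on a large one) might be sketched as follows. This is only an illustration: `RollbackGuard`, `awaitUsableTimestamp` and the 5 ms threshold are assumptions, and the clock is passed in as a `LongSupplier` so the behavior can be exercised without touching the system time. The extension-bit variant is not shown.

```java
import java.util.function.LongSupplier;

// Hypothetical sketch of rollback handling: spin for small rollbacks,
// fail fast for large ones.
class RollbackGuard {
    // Assumed threshold; rollbacks larger than this are rejected.
    static final long MAX_WAIT_MILLIS = 5;

    // Returns a timestamp that is not earlier than lastTimestamp,
    // or throws if the clock has gone back too far to wait for.
    static long awaitUsableTimestamp(long lastTimestamp, LongSupplier clock) {
        long now = clock.getAsLong();
        if (now >= lastTimestamp) {
            return now; // no rollback
        }
        long offset = lastTimestamp - now;
        if (offset > MAX_WAIT_MILLIS) {
            throw new IllegalStateException(
                    "Clock moved backwards by " + offset + " ms; refusing to generate IDs");
        }
        // Small rollback: busy-wait until the clock catches up.
        while (now < lastTimestamp) {
            Thread.onSpinWait(); // Java 9+
            now = clock.getAsLong();
        }
        return now;
    }

    public static void main(String[] args) {
        long last = System.currentTimeMillis();
        System.out.println(awaitUsableTimestamp(last, System::currentTimeMillis) >= last);
    }
}
```

In a generator, a call like this would replace the plain clock read at the start of ID generation, with `lastTimestamp` being the timestamp of the most recently issued ID.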

Baidu and Meituan each have their own solution to the duplicate IDs that a clock rollback can produce. If you are interested, take a look. The notes below are a brief summary rather than their official documentation:

Baidu UIDGenerator: https://github.com/baidu/uid-generator/blob/master/README.zh_cn.md
- UidGenerator is a unique ID generator implemented in Java and based on the Snowflake algorithm. It works in application projects as a component and supports custom workerId bit widths and initialization strategies, so it suits scenarios such as automatic restart and drift of instances in virtualized environments like Docker. In terms of implementation, UidGenerator borrows future time to get around the inherent concurrency limit of the sequence; it uses a RingBuffer to cache generated UIDs, which parallelizes UID production and consumption, and it pads cache lines to avoid the hardware-level "false sharing" problem that the RingBuffer would otherwise cause. The final single-machine QPS can reach 6 million.
Meituan Leaf: https://tech.meituan.com/2019/03/07/open-source-project-leaf.html
- leaf-segment plan
  - Optimization: double buffer + pre-allocation
  - Disaster tolerance: the MySQL DB runs one master and two slaves in separate machine rooms, in semi-synchronous mode
  - Disadvantages: with the segment scheme the IDs increase sequentially and are therefore predictable, so it is not suitable for generating order IDs. For example, a competitor places an order at 12 noon on two consecutive days and subtracts the two order IDs to estimate the company's daily order volume, which is unacceptable.
- leaf-snowflake solution
  - Use the feature of Zookeeper persistent sequential node to automatically configure workerID for snowflake nodes
    - 1. Start the Leaf-snowflake service, connect to Zookeeper, and check under the leaf_forever parent node whether this instance has already registered (whether its sequential child node exists).
    - 2. If it has registered, read back its own workerID (the int ID generated by the zk sequential node) and start the service.
    - 3. If it has not registered, create a persistent sequential node under the parent node, and after the creation succeeds, take the sequence number as its workerID and start the service.
  - Cache the workerID to reduce the dependence on third-party components
  - Because the algorithm depends strongly on the clock, it is sensitive to time. NTP synchronization can also cause second-level rollbacks while the machine is running, so turning NTP synchronization off is recommended. Alternatively, when the clock is set back, return ERROR_CODE directly without serving requests and wait for the clock to catch up; or add a layer of retries and then report to the alarm system; or remove the node automatically and raise an alarm.

`Code display`

public class SnowFlake {

    // Data center (machine room) ID
    private long datacenterId;
    // Machine (worker) ID
    private long workerId;
    // Sequence number within the same millisecond
    private long sequence;

    public SnowFlake(long workerId, long datacenterId) {
        this(workerId, datacenterId, 0);
    }

    public SnowFlake(long workerId, long datacenterId, long sequence) {
        // Validate the arguments
        if (workerId > maxWorkerId || workerId < 0) {
            throw new IllegalArgumentException(String.format("worker Id can't be greater than %d or less than 0", maxWorkerId));
        }
        if (datacenterId > maxDatacenterId || datacenterId < 0) {
            throw new IllegalArgumentException(String.format("datacenter Id can't be greater than %d or less than 0", maxDatacenterId));
        }
        System.out.printf("worker starting. timestamp left shift %d, datacenter id bits %d, worker id bits %d, sequence bits %d, workerid %d%n",
                timestampLeftShift, datacenterIdBits, workerIdBits, sequenceBits, workerId);

        this.workerId = workerId;
        this.datacenterId = datacenterId;
        this.sequence = sequence;
    }

    // Start epoch in milliseconds (2021-10-16 22:03:32 UTC+8)
    private long twepoch = 1634393012000L;

    // Bits used by the data center ID: 5 bits, max 11111 (binary) = 31 (decimal)
    private long datacenterIdBits = 5L;

    // Bits used by the worker ID: 5 bits, max 11111 (binary) = 31 (decimal)
    private long workerIdBits = 5L;

    // With 5 bits the largest worker ID is 31, so valid worker IDs are 0..31
    private long maxWorkerId = -1L ^ (-1L << workerIdBits);

    // With 5 bits the largest data center ID is 31, so valid IDs are 0..31
    private long maxDatacenterId = -1L ^ (-1L << datacenterIdBits);

    // Bits used by the sequence: 12 bits, 111111111111 = 4095, so at most 4096 IDs per millisecond
    private long sequenceBits = 12L;

    // Left shift of the worker ID
    private long workerIdShift = sequenceBits;

    // Left shift of the data center ID
    private long datacenterIdShift = sequenceBits + workerIdBits;

    // Left shift of the timestamp
    private long timestampLeftShift = sequenceBits + workerIdBits + datacenterIdBits;

    // Sequence mask 4095 (0b111111111111 = 0xfff = 4095)
    // ANDed with the sequence to keep it within 0..4095
    private long sequenceMask = -1L ^ (-1L << sequenceBits);

    // Timestamp of the most recently generated ID
    private long lastTimestamp = -1L;


    // Get the worker ID
    public long getWorkerId() {
        return workerId;
    }


    // Get the data center ID
    public long getDatacenterId() {
        return datacenterId;
    }


    // Get the timestamp of the most recently generated ID
    public long getLastTimestamp() {
        return lastTimestamp;
    }


    // Generate the next ID (IDs are sequential, not random)
    public synchronized long nextId() {
        // Current timestamp in milliseconds
        long timestamp = timeGen();

        // Clock moved backwards: refuse to generate an ID
        if (timestamp < lastTimestamp) {
            System.err.printf("clock is moving backwards.  Rejecting requests until %d.%n", lastTimestamp);
            throw new RuntimeException(String.format("Clock moved backwards.  Refusing to generate id for %d milliseconds",
                    lastTimestamp - timestamp));
        }

        // Same millisecond as the previous ID: use the sequence to avoid duplicates
        if (lastTimestamp == timestamp) {

            sequence = (sequence + 1) & sequenceMask;

            // The sequence wrapped around to 0: all 4096 values of this millisecond are used
            if (sequence == 0) {
                // Wait for the next millisecond
                timestamp = tilNextMillis(lastTimestamp);
            }
        } else {
            // First ID in this millisecond: reset the sequence to 0
            sequence = 0;
        }

        // Record the timestamp of this ID
        lastTimestamp = timestamp;

        // Shift each part into place and combine them
        return ((timestamp - twepoch) << timestampLeftShift) |
                (datacenterId << datacenterIdShift) |
                (workerId << workerIdShift) |
                sequence;
    }

    private long tilNextMillis(long lastTimestamp) {
        // Read the current timestamp
        long timestamp = timeGen();
        // Spin while the clock is still at or before the millisecond whose sequence is exhausted
        while (timestamp <= lastTimestamp) {
            // Not in a new millisecond yet, read again
            timestamp = timeGen();
        }
        return timestamp;
    }

    private long timeGen() {
        return System.currentTimeMillis();
    }

    public static void main(String[] args) {
        SnowFlake worker = new SnowFlake(1, 1);
        long timer = System.currentTimeMillis();
        for (int i = 0; i < 10000; i++) {
            worker.nextId();
        }
        // Current time, then the elapsed milliseconds for 10000 IDs
        System.out.println(System.currentTimeMillis());
        System.out.println(System.currentTimeMillis() - timer);
    }

}

`problem analysis`

`1. Why isn't the first bit used?`

In a computer's representation of a number, the first bit is the sign bit: 0 means a positive number and 1 means a negative number. The IDs we use are positive by default, so this bit is always 0 and carries no meaning.

`2. How to use the machine bit?`

The machine bits and machine room bits total 10 bits. If all of them represent machines, 1024 machines can be represented. If they are split so that 5 bits represent the machine room and 5 bits represent the machine within it, there can be 32 machine rooms with 32 machines each.

`3. What does twepoch mean?`

The timestamp has only 41 bits, about 69 years. If timekeeping started at 1970, part of that range would already be used up, so twepoch records the time the project started. Using the ID generation time minus twepoch as the timestamp counts the 69 years from the project start, so the IDs can be used for longer.

`4. What does -1L ^ (-1L << x) mean?`

It gives the maximum value that x binary bits can represent. Take x = 3 (shown with 8 bits for brevity):

In a computer, the first bit is the sign bit. The ones' complement of a negative number flips every bit except the sign bit (1 becomes 0, 0 becomes 1), and the two's complement is the ones' complement plus 1:

-1L sign-magnitude: 1000 0001
-1L ones' complement: 1111 1110
-1L two's complement: 1111 1111

From the above we can see that -1L is all 1s in binary. Shifting -1L left by 3 bits gives 1111 1000, that is, the last 3 bits are 0. XORing -1L with 1111 1000 gives 0000 0111, where only the last 3 bits are 1. So -1L ^ (-1L << x) is the value whose low x bits are all 1, which is the maximum value that x binary bits can represent.
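
A small sketch to try the expression out. The class name `BitMask` and the method name `maxValue` are made up for this example; the expression itself is the one used in the code above.

```java
// Illustrative check of the -1L ^ (-1L << x) idiom.
class BitMask {
    // Value whose low `bits` bits are all 1, i.e. the largest value
    // representable in that many bits. Intended for 0 < bits < 64.
    static long maxValue(int bits) {
        return -1L ^ (-1L << bits);
    }

    public static void main(String[] args) {
        // -1L << 3 in binary: 61 ones followed by 000
        System.out.println(Long.toBinaryString(-1L << 3));
        System.out.println(maxValue(3));   // 7
        System.out.println(maxValue(5));   // 31
        System.out.println(maxValue(12));  // 4095
    }
}
```

The same value can be written as `(1L << bits) - 1` or `~(-1L << bits)`; all three forms produce the same mask.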

`5. Time stamp comparison`

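When the obtained timestamp is less than the last recorded timestamp, the code above does not generate an ID: it throws an exception and refuses service until the clock catches up. The only loop is in tilNextMillis, which waits for the next millisecond when the 4096 sequence numbers of the current millisecond are used up. The extension bit is not used here to guard against clock rollback.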

`6. Precision loss when the front end uses the ID directly`

If the front end directly uses the long ID generated by the server, precision is lost: a Number in JS can represent integers exactly only up to 2^53 - 1 (9007199254740991, 16 decimal digits), while an ID produced by the snowflake algorithm can have up to 19 digits. The server should therefore convert the ID to a String before sending it to the front end.
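
The effect can be reproduced on the server side, because a JS Number is the same IEEE 754 double as Java's `double`. The sketch below is an illustration only; `PrecisionLoss`, `survivesDouble` and `toJsonField` are hypothetical names, and a real project would normally configure its JSON library to write long values as strings instead of building JSON by hand.

```java
// Illustrative sketch: why large long IDs should be sent to JS as strings.
class PrecisionLoss {
    // Same value as Number.MAX_SAFE_INTEGER in JS: 2^53 - 1
    static final long MAX_SAFE_INTEGER = (1L << 53) - 1;

    // True if the ID keeps its exact value after a round trip through double.
    static boolean survivesDouble(long id) {
        return (long) (double) id == id;
    }

    // Hand-built JSON field with the ID quoted as a string.
    static String toJsonField(String name, long id) {
        return "\"" + name + "\":\"" + id + "\"";
    }

    public static void main(String[] args) {
        long bigId = (1L << 60) | 1L; // a 19-digit value, like a real snowflake ID
        System.out.println(survivesDouble(MAX_SAFE_INTEGER)); // true
        System.out.println(survivesDouble(bigId));            // false
        System.out.println(toJsonField("id", bigId));
    }
}
```

On the front end the quoted value stays a string; if arithmetic on the ID is ever needed, `BigInt` can parse it without loss.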

`Qin Huai's Viewpoint`

The snowflake algorithm relies on the clock being consistent. If the clock is set back, problems can occur, and an extension bit is generally used to solve that. The timestamp also limits usage to about 69 years. You can give the timestamp more bits if you need to, for example 42 bits last about 139 years, but many companies have to survive that long first. Of course, the snowflake algorithm is not a silver bullet and has its shortcomings: IDs increase strictly on a single machine, while across multiple machines they only increase roughly, not strictly.

There is no best design, only suitable and unsuitable ones.

[Profile of the author]: Qin Huai, [Qinhuai Grocery Store]. The road of technology is not walked in a day; the mountains are high and the rivers are long, and even if progress is slow, it never stops. Personal writing direction: Java source code analysis, JDBC, Mybatis, Spring, redis, distributed systems, 剑指 Offer, LeetCode and so on. I write each article carefully. I cannot guarantee that what I have written is completely correct, but I guarantee that it has been practiced or researched. Corrections of any omissions or errors are welcome.
