The interviewer asked me how order IDs are generated. Isn't it just a MySQL auto-increment primary key?

A beautiful interviewer sat across from me, and even the MacBook with its glowing logo couldn't hide her round, lovely face.

Female programmers are rare, and beautiful interviewers are rarer still. What exactly did she look like? Like this:

[Image: 王冰冰-4525820.jpg]

Such a gentle and lovely interviewer surely wouldn't give me a hard time. And really, with me being this handsome, the interview was probably just a formality. Is the beautiful interviewer single? Programmers aren't good at socializing, after all, and I happen to be single too, so clearly this is where my marriage is destined to begin. I've even thought of a name for our child: Yibing! Good name.

Interviewer: Young man, why are you grinning with your head down? Let's start the interview. Do you know how order IDs are generated?

What? How order IDs are generated?
Why doesn't this beauty play by the usual script? I had memorized how HashMap is implemented, and you don't ask about that. What on earth is an order ID?

Me: How else would it be generated? Just use the database's auto-increment primary key.

Interviewer: That won't work. A database primary key increases sequentially, so competitors can see exactly how many orders we take each day, and that exposes a trade secret.
Besides, a single MySQL instance can only handle concurrency on the order of a few hundred requests. Our company takes tens of millions of orders a day, and it couldn't cope.

Me: Well, then use a database cluster. Each machine's auto-increment ID starts at its machine number, and the step equals the number of machines.
For example, with two machines, the first generates the IDs 1, 3, 5, 7 and the second generates 2, 4, 6, 8. If performance falls short, add machines, and the concurrency goes right up.
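
The offset-and-step scheme described here can be sketched in a few lines. This is only an in-memory simulation, not MySQL itself, and the class name `SteppedIdGenerator` and its parameters are made up for illustration. (In MySQL, the corresponding settings are the `auto_increment_offset` and `auto_increment_increment` system variables.)

```java
import java.util.concurrent.atomic.AtomicLong;

/**
 * Illustrative simulation of auto-increment IDs spread over several
 * machines: machine k (1-based) starts at k and steps by the machine count.
 * The class and parameter names are hypothetical.
 */
class SteppedIdGenerator {
    private final AtomicLong next;
    private final long step;

    SteppedIdGenerator(long machineNo, long machineCount) {
        if (machineNo < 1 || machineNo > machineCount) {
            throw new IllegalArgumentException("machineNo must be in [1, machineCount]");
        }
        this.next = new AtomicLong(machineNo); // starting value = machine number
        this.step = machineCount;              // step size = number of machines
    }

    long nextId() {
        return next.getAndAdd(step);
    }

    public static void main(String[] args) {
        SteppedIdGenerator first = new SteppedIdGenerator(1, 2);
        SteppedIdGenerator second = new SteppedIdGenerator(2, 2);
        for (int i = 0; i < 4; i++) {
            System.out.println(first.nextId() + " " + second.nextId());
        }
    }
}
```

With two machines, the first generator yields 1, 3, 5, 7 and the second yields 2, 4, 6, 8, so the two never collide.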

Interviewer: Young man, you've got some good ideas. But have you considered that reaching million-level concurrency would take about 2,000 machines? And those would be used only to generate order IDs. No matter how rich the company is, it can't afford that.

Me: Since MySQL's concurrency isn't high enough, could we fetch a batch of auto-increment IDs from MySQL in advance, load them into local memory, and then serve them concurrently from memory? Wouldn't the concurrency be fantastic then?

Interviewer: You catch on quickly. This is called the number segment mode. The concurrency goes up, but an auto-increment ID still can't be used as the order ID.
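
The number segment idea can be sketched roughly as follows. This is a minimal illustration under stated assumptions: an in-memory `SegmentStore` stands in for the MySQL table that tracks the current maximum ID, and all class and method names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;

/**
 * Sketch of the number segment mode: fetch a range of IDs in one
 * round trip, then hand them out from local memory.
 */
class SegmentIdGenerator {
    /**
     * Stand-in for the database. A real system would advance a max_id
     * row with an UPDATE inside a transaction instead.
     */
    static class SegmentStore {
        private final AtomicLong maxId = new AtomicLong(0);
        private final AtomicLong fetchCount = new AtomicLong(0);

        /** Reserves `size` IDs and returns the first one. */
        long allocate(int size) {
            fetchCount.incrementAndGet();
            return maxId.getAndAdd(size) + 1;
        }

        long fetches() {
            return fetchCount.get();
        }
    }

    private final SegmentStore store;
    private final int segmentSize;
    private long current; // next ID to hand out
    private long end;     // exclusive end of the cached segment

    SegmentIdGenerator(SegmentStore store, int segmentSize) {
        this.store = store;
        this.segmentSize = segmentSize;
    }

    synchronized long nextId() {
        if (current >= end) {
            // Local segment exhausted (or not loaded yet): fetch a new one.
            current = store.allocate(segmentSize);
            end = current + segmentSize;
        }
        return current++;
    }

    public static void main(String[] args) {
        SegmentStore store = new SegmentStore();
        SegmentIdGenerator generator = new SegmentIdGenerator(store, 1000);
        for (int i = 0; i < 2500; i++) {
            generator.nextId();
        }
        System.out.println("store round trips: " + store.fetches());
    }
}
```

Generating 2,500 IDs with a segment size of 1,000 touches the store only three times, which is where the concurrency gain comes from.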

Me: How about using Java's built-in UUID?

import java.util.UUID;

/**
 * @author yideng
 * @apiNote UUID example
 */
public class UUIDTest {
    public static void main(String[] args) {
        String orderId = UUID.randomUUID().toString().replace("-", "");
        System.out.println(orderId);
    }
}

Output result:

 58e93ecab9c64295b15f7f4661edcbc1

Interviewer: That won't work. A 32-character string takes up more space, and it is an unordered string used as the database primary key. On every insert, MySQL has to keep adjusting node order to maintain the B+ tree structure, which hurts performance. Besides, the string is too long and carries no business meaning. Pass.

Young man, you probably haven't worked on an e-commerce system. Let me first tell you what requirements order ID generation has to meet:

Globally unique: If an order ID is ever duplicated, we're finished.
High performance: High concurrency and low latency. If generating order IDs became the bottleneck, that would be unacceptable.
High availability: At least four nines (99.99%); it must not go down at random.
Ease of use: If meeting the above requirements takes hundreds of servers, the system is complex and hard to maintain.
Numeric and monotonically increasing: Numeric values take less space, and ordered, increasing values give better performance when inserting into MySQL.
Embedded business meaning: If business meaning can be embedded in the order ID, you can tell which business line generated it, which makes troubleshooting easier.

Damn. Just generating a little order ID comes with this many rules? Can I still pull this off? Am I going to flunk today's interview? But I've been following Yideng's articles all along, so this can't stump me. Impressing the beautiful interviewer is what really matters.

Me: I've heard of a long-established distributed, high-performance, highly available order ID generation algorithm, the Snowflake algorithm, which meets all of your requirements above. An ID generated by the Snowflake algorithm is a Long, 64 bits in length.

[Image: 雪花算法.jpeg]

Bit 1: the sign bit, currently unused.
Bits 2 to 42: 41 bits in total, a timestamp in milliseconds, enough for about 69 years.
Bits 43 to 52: 10 bits in total, the machine ID, allowing up to 1024 machines.
Bits 53 to 64: 12 bits in total, an auto-incrementing sequence number that distinguishes IDs generated within the same millisecond. A single machine can generate up to 4096 order IDs per millisecond.
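
The figures in this layout follow directly from the bit widths. As a quick check, here is a small sketch that recomputes them; the class and method names are made up for illustration.

```java
/**
 * Recomputes the capacity figures of the 1 + 41 + 10 + 12 bit layout.
 * All names here are illustrative.
 */
class SnowflakeCapacity {
    static final int SIGN_BITS = 1;
    static final int TIMESTAMP_BITS = 41;
    static final int MACHINE_BITS = 10;
    static final int SEQUENCE_BITS = 12;

    /** Years covered by a 41-bit millisecond timestamp. */
    static double timestampYears() {
        double millisPerYear = 365.25 * 24 * 60 * 60 * 1000;
        return (1L << TIMESTAMP_BITS) / millisPerYear;
    }

    /** Number of distinct machine IDs. */
    static long maxMachines() {
        return 1L << MACHINE_BITS;
    }

    /** IDs one machine can generate within a single millisecond. */
    static long idsPerMillisPerMachine() {
        return 1L << SEQUENCE_BITS;
    }

    /** IDs the whole cluster can generate within a single millisecond. */
    static long idsPerMillisCluster() {
        return maxMachines() * idsPerMillisPerMachine();
    }

    public static void main(String[] args) {
        System.out.printf("timestamp range: about %.1f years%n", timestampYears());
        System.out.println("machines: " + maxMachines());
        System.out.println("IDs per ms per machine: " + idsPerMillisPerMachine());
        System.out.println("IDs per ms across the cluster: " + idsPerMillisCluster());
    }
}
```

The four fields add up to exactly 64 bits, and a full cluster of 1024 machines works out to 4,194,304 IDs per millisecond.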

Code:

/**
 * @author 一灯架构
 * @apiNote Snowflake algorithm
 **/
public class SnowFlake {

    /**
     * Start timestamp; IDs are generated from 2021-12-01 (00:00:00 UTC+8)
     */
    private final static long START_STAMP = 1638288000000L;

    /**
     * Number of bits used by the sequence number: 12
     */
    private final static long SEQUENCE_BIT = 12;

    /**
     * Number of bits used by the machine ID
     */
    private final static long MACHINE_BIT = 10;

    /**
     * Maximum machine ID
     */
    private final static long MAX_MACHINE_NUM = ~(-1L << MACHINE_BIT);

    /**
     * Maximum sequence number
     */
    private final static long MAX_SEQUENCE = ~(-1L << SEQUENCE_BIT);

    /**
     * Left shift applied to each part
     */
    private final static long MACHINE_LEFT = SEQUENCE_BIT;
    private final static long TIMESTAMP_LEFT = SEQUENCE_BIT + MACHINE_BIT;

    /**
     * Machine ID
     */
    private final long machineId;
    /**
     * Sequence number
     */
    private long sequence = 0L;
    /**
     * Timestamp of the previous ID
     */
    private long lastStamp = -1L;

    /**
     * Constructor
     * @param machineId machine ID
     */
    public SnowFlake(long machineId) {
        if (machineId > MAX_MACHINE_NUM || machineId < 0) {
            throw new RuntimeException("Machine ID exceeds the maximum");
        }
        this.machineId = machineId;
    }

    /**
     * Generate the next ID
     */
    public synchronized long nextId() {
        long currStamp = getNewStamp();
        if (currStamp < lastStamp) {
            throw new RuntimeException("Clock moved backwards; refusing to generate an ID!");
        }

        if (currStamp == lastStamp) {
            // Same millisecond: increment the sequence number
            sequence = (sequence + 1) & MAX_SEQUENCE;
            // The sequence for this millisecond is exhausted: wait for the next millisecond
            if (sequence == 0L) {
                currStamp = getNextMill();
            }
        } else {
            // New millisecond: reset the sequence number to 0
            sequence = 0L;
        }

        lastStamp = currStamp;

        return (currStamp - START_STAMP) << TIMESTAMP_LEFT // timestamp part
                | machineId << MACHINE_LEFT             // machine ID part
                | sequence;                             // sequence number part
    }

    private long getNextMill() {
        long mill = getNewStamp();
        while (mill <= lastStamp) {
            mill = getNewStamp();
        }
        return mill;
    }

    private long getNewStamp() {
        return System.currentTimeMillis();
    }

    public static void main(String[] args) {
        // Order ID generation test, using machine ID 0
        SnowFlake snowFlake = new SnowFlake(0);
        System.out.println(snowFlake.nextId());
    }
}

Output result:

 6836348333850624
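
To see how the three parts are packed, the ID printed above can be taken apart again by reversing the shifts. The sketch below is an illustrative helper, not part of the original class; it reuses the same start timestamp and bit widths, and `SnowflakeDecoder` and its method names are assumptions.

```java
/**
 * Splits a Snowflake ID back into its timestamp, machine ID and
 * sequence number. Helper names are illustrative; the constants
 * mirror the SnowFlake class.
 */
class SnowflakeDecoder {
    static final long START_STAMP = 1638288000000L;
    static final long SEQUENCE_BIT = 12;
    static final long MACHINE_BIT = 10;

    /** Epoch milliseconds at which the ID was generated. */
    static long timestampOf(long id) {
        return (id >> (SEQUENCE_BIT + MACHINE_BIT)) + START_STAMP;
    }

    /** Machine ID: the 10 bits above the sequence number. */
    static long machineIdOf(long id) {
        return (id >> SEQUENCE_BIT) & ~(-1L << MACHINE_BIT);
    }

    /** Sequence number: the lowest 12 bits. */
    static long sequenceOf(long id) {
        return id & ~(-1L << SEQUENCE_BIT);
    }

    public static void main(String[] args) {
        long id = 6836348333850624L;
        System.out.println("timestamp (epoch ms): " + timestampOf(id));
        System.out.println("machine ID: " + machineIdOf(id));
        System.out.println("sequence: " + sequenceOf(id));
    }
}
```

For the sample ID, this gives machine ID 0 and sequence 0, with a timestamp 1,629,912,456 ms (roughly 19 days) after the start timestamp. A fresh run of the generator will print a different ID, since the value depends on the current time.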

Integration is very simple; there is no need to build a service cluster. The code logic is simple too: within the same millisecond, the sequence number of the order ID auto-increments. The synchronized lock only applies on the local machine, so machines do not affect one another. With 1024 machines each generating up to 4096 IDs per millisecond, about four million order IDs can be generated every millisecond, which is very powerful.

The generation rules are not fixed and can be adjusted to fit your own business needs. If you don't need that much concurrency, you can carve out part of the machine ID bits and use them as a business ID, identifying which business line generated the order ID.
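
One possible way to carve out a business ID is sketched below. The 4-bit business ID / 6-bit machine ID split is an arbitrary assumption for illustration, as are the class name `BusinessSnowFlake` and its helper methods; the right split depends on how many business lines and machines you actually have.

```java
/**
 * Sketch of a Snowflake variant that gives up part of the machine ID
 * for a business ID. The 4/6 bit split and all names are assumptions.
 */
class BusinessSnowFlake {
    private static final long START_STAMP = 1638288000000L;
    private static final long SEQUENCE_BIT = 12;
    private static final long MACHINE_BIT = 6;   // assumed: up to 64 machines
    private static final long BUSINESS_BIT = 4;  // assumed: up to 16 business lines

    private static final long MAX_SEQUENCE = ~(-1L << SEQUENCE_BIT);
    private static final long MAX_MACHINE = ~(-1L << MACHINE_BIT);
    private static final long MAX_BUSINESS = ~(-1L << BUSINESS_BIT);

    private static final long MACHINE_LEFT = SEQUENCE_BIT;
    private static final long BUSINESS_LEFT = SEQUENCE_BIT + MACHINE_BIT;
    private static final long TIMESTAMP_LEFT = SEQUENCE_BIT + MACHINE_BIT + BUSINESS_BIT;

    private final long businessId;
    private final long machineId;
    private long sequence = 0L;
    private long lastStamp = -1L;

    BusinessSnowFlake(long businessId, long machineId) {
        if (businessId < 0 || businessId > MAX_BUSINESS) {
            throw new IllegalArgumentException("businessId out of range");
        }
        if (machineId < 0 || machineId > MAX_MACHINE) {
            throw new IllegalArgumentException("machineId out of range");
        }
        this.businessId = businessId;
        this.machineId = machineId;
    }

    synchronized long nextId() {
        long currStamp = System.currentTimeMillis();
        if (currStamp < lastStamp) {
            throw new IllegalStateException("Clock moved backwards; refusing to generate an ID");
        }
        if (currStamp == lastStamp) {
            sequence = (sequence + 1) & MAX_SEQUENCE;
            if (sequence == 0L) {
                // Sequence exhausted for this millisecond: wait for the next one.
                while (currStamp <= lastStamp) {
                    currStamp = System.currentTimeMillis();
                }
            }
        } else {
            sequence = 0L;
        }
        lastStamp = currStamp;
        return (currStamp - START_STAMP) << TIMESTAMP_LEFT
                | businessId << BUSINESS_LEFT
                | machineId << MACHINE_LEFT
                | sequence;
    }

    /** Reads the business line back out of an ID, e.g. when troubleshooting. */
    static long businessIdOf(long id) {
        return (id >> BUSINESS_LEFT) & MAX_BUSINESS;
    }

    static long machineIdOf(long id) {
        return (id >> MACHINE_LEFT) & MAX_MACHINE;
    }

    public static void main(String[] args) {
        BusinessSnowFlake generator = new BusinessSnowFlake(3, 5);
        long id = generator.nextId();
        System.out.println(id + " business=" + businessIdOf(id) + " machine=" + machineIdOf(id));
    }
}
```

The total width stays at 64 bits, so IDs remain Long values that increase over time on a given generator, while the business line can be recovered from any ID with a shift and a mask.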

Interviewer: Young man, you've got some real skills. Let me ask something harder: do you think the Snowflake algorithm has room for improvement?

You really do insist on getting to the bottom of things; you won't stop until you've stumped me. Fortunately, I had glanced at Yideng's article before coming.

Me: Yes. The Snowflake algorithm depends heavily on the system clock. If the clock is rolled back, duplicate IDs can be generated.

Interviewer: Is there any solution?

Me: Where there's a problem, there's an answer. For example, Meituan's Leaf (Meituan's self-developed distributed ID generation system) introduced ZooKeeper to deal with clock rollback. The principle is also simple: compare the current system time with the time recorded for the generator node.
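
The comparison itself can be illustrated with a small sketch. This is not Leaf's actual code: an in-memory map stands in for ZooKeeper, and the class `ClockCheck` and its methods are hypothetical names chosen for the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Sketch of a clock rollback check: each node records its clock in a
 * shared store, and refuses to start if its current time is behind
 * the recorded one. The map stands in for a coordination service.
 */
class ClockCheck {
    private final Map<String, Long> lastReported = new ConcurrentHashMap<>();

    /** Called periodically by a running node to record its current time. */
    void report(String nodeId, long nowMillis) {
        // Keep the largest timestamp ever reported by this node.
        lastReported.merge(nodeId, nowMillis, Long::max);
    }

    /** Startup check: safe only if the clock has not gone backwards. */
    boolean isSafeToStart(String nodeId, long nowMillis) {
        Long recorded = lastReported.get(nodeId);
        return recorded == null || nowMillis >= recorded;
    }

    public static void main(String[] args) {
        ClockCheck check = new ClockCheck();
        long now = System.currentTimeMillis();
        check.report("node-1", now);
        // Simulate a restart after the clock was set back by one minute.
        System.out.println(check.isSafeToStart("node-1", now - 60_000));
        // Simulate a restart with a healthy clock.
        System.out.println(check.isSafeToStart("node-1", now + 1_000));
    }
}
```

A node whose clock is behind its own last recorded time would hand out timestamps it may already have used, so refusing to start is what prevents the duplicate IDs.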

For systems with even higher concurrency requirements, such as Double Eleven flash sales, where four million IDs per millisecond still isn't enough, you can combine the Snowflake algorithm with the number segment mode, as in Baidu's UidGenerator and Didi's TinyId. It makes sense when you think about it: pre-generating IDs, as the number segment mode does, is the ultimate solution for high-performance distributed order IDs.

Interviewer: Young man, I see from your resume that you've already resigned. Start work tomorrow, at double your salary. It's settled.


一灯架构
