Is optimizing System.currentTimeMillis() really useful for the snowflake algorithm?

I have already written about the snowflake algorithm, which uses System.currentTimeMillis() to obtain the time. There is a claim that System.currentTimeMillis() is slow because every call has to go to the operating system. Under high concurrency, a large number of concurrent system calls may hurt performance (the claim goes that a call can cost even more than creating an ordinary object with new, since a new object is only allocated on the Java heap). We can see that it is a native method:

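// Returns the current time in milliseconds. Note that while the unit of time of the return value is a millisecond, the granularity of the value depends on the underlying operating system and may be larger. For example, many operating systems measure time in units of tens of milliseconds.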
public static native long currentTimeMillis();
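
Before optimizing anything, it helps to get a rough idea of what one call actually costs on the machine at hand. The following is only a minimal sketch, not a rigorous benchmark: the class name ClockCallCost and the method averageNanosPerCall are made up for illustration, and the numbers it prints depend heavily on the operating system, the JVM, and JIT warm-up. A harness such as JMH would give more trustworthy figures.

```java
final class ClockCallCost {

    // Returns the average cost, in nanoseconds, of one System.currentTimeMillis()
    // call, measured over `calls` back-to-back invocations.
    static double averageNanosPerCall(int calls) {
        long sink = 0; // accumulate the results so the JIT cannot drop the calls
        long start = System.nanoTime();
        for (int i = 0; i < calls; i++) {
            sink += System.currentTimeMillis();
        }
        long elapsed = System.nanoTime() - start;
        if (sink == 0) {
            // Practically unreachable; it only keeps `sink` observable.
            System.out.println("unexpected zero sum");
        }
        return (double) elapsed / calls;
    }

    public static void main(String[] args) {
        averageNanosPerCall(1_000_000); // warm-up run, result discarded
        System.out.printf("about %.1f ns per call%n", averageNanosPerCall(10_000_000));
    }
}
```

If the printed cost is only a few tens of nanoseconds, the clock call is unlikely to be the bottleneck of an ID generator.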

So some people propose using a background thread that refreshes a cached clock value at a fixed interval, exposed through a singleton. The idea is to avoid going to the system on every call and to avoid frequent thread switching, which may improve efficiency.

Does this optimization hold?

Here is the optimized code first:

package snowflake;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

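public class SystemClock {

    // Refresh interval of the cached time, in milliseconds
    private final int period;

    // Most recently cached timestamp
    private final AtomicLong now;

    // Singleton whose cached time is refreshed every 1 ms
    private static final SystemClock INSTANCE = new SystemClock(1);

    private SystemClock(int period) {
        this.period = period;
        now = new AtomicLong(System.currentTimeMillis());
        scheduleClockUpdating();
    }

    // Start a daemon thread that copies the system time into `now` at a fixed rate
    private void scheduleClockUpdating() {
        ScheduledExecutorService scheduleService = Executors.newSingleThreadScheduledExecutor((r) -> {
            Thread thread = new Thread(r);
            // Daemon thread, so it does not keep the JVM alive
            thread.setDaemon(true);
            return thread;
        });
        scheduleService.scheduleAtFixedRate(() -> {
            now.set(System.currentTimeMillis());
        }, 0, period, TimeUnit.MILLISECONDS);
    }

    private long get() {
        return now.get();
    }

    // Returns the cached time; it can lag behind the real clock by up to one refresh interval or more
    public static long now() {
        return INSTANCE.get();
    }

}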

To use it, simply replace System.currentTimeMillis() with SystemClock.now().

The code of the snowflake implementation, SnowFlake, is included here as well:

package snowflake;

public class SnowFlake {

    // Data center (machine room) ID
    private long datacenterId;
    // Worker (machine) ID
    private long workerId;
    // Sequence number within the same millisecond
    private long sequence;

    public SnowFlake(long workerId, long datacenterId) {
        this(workerId, datacenterId, 0);
    }

    public SnowFlake(long workerId, long datacenterId, long sequence) {
        // Validate the arguments
        if (workerId > maxWorkerId || workerId < 0) {
            throw new IllegalArgumentException(String.format("worker Id can't be greater than %d or less than 0", maxWorkerId));
        }
        if (datacenterId > maxDatacenterId || datacenterId < 0) {
            throw new IllegalArgumentException(String.format("datacenter Id can't be greater than %d or less than 0", maxDatacenterId));
        }
        System.out.printf("worker starting. timestamp left shift %d, datacenter id bits %d, worker id bits %d, sequence bits %d, workerid %d%n",
                timestampLeftShift, datacenterIdBits, workerIdBits, sequenceBits, workerId);

        this.workerId = workerId;
        this.datacenterId = datacenterId;
        this.sequence = sequence;
    }

    // Start epoch (2021-10-16 22:03:32, UTC+8)
    private long twepoch = 1634393012000L;

    // Number of bits for the data center ID: 5 bits, max 11111 (binary) = 31 (decimal)
    private long datacenterIdBits = 5L;

    // Number of bits for the worker ID: 5 bits, max 11111 (binary) = 31 (decimal)
    private long workerIdBits = 5L;

    // With 5 bits the largest value is 31, so the worker ID must be in the range 0-31
    private long maxWorkerId = -1L ^ (-1L << workerIdBits);

    // With 5 bits the largest value is 31, so the data center ID must be in the range 0-31
    private long maxDatacenterId = -1L ^ (-1L << datacenterIdBits);

    // Number of bits for the sequence: 12 bits, 111111111111 = 4095, so at most 4096 IDs per millisecond
    private long sequenceBits = 12L;

    // Left shift for workerId
    private long workerIdShift = sequenceBits;

    // Left shift for datacenterId
    private long datacenterIdShift = sequenceBits + workerIdBits;

    // Left shift for the timestamp
    private long timestampLeftShift = sequenceBits + workerIdBits + datacenterIdBits;

    // Sequence mask 4095 (0b111111111111 = 0xfff = 4095)
    // ANDed with the sequence to keep it within 0-4095
    private long sequenceMask = -1L ^ (-1L << sequenceBits);

    // Timestamp of the most recently generated ID
    private long lastTimestamp = -1L;


    // Get the worker ID
    public long getWorkerId() {
        return workerId;
    }


    // Get the data center ID
    public long getDatacenterId() {
        return datacenterId;
    }


    // Get the timestamp of the most recently generated ID
    public long getLastTimestamp() {
        return lastTimestamp;
    }


    // Get the next ID
    public synchronized long nextId() {
        // Current timestamp in milliseconds
        long timestamp = timeGen();

        if (timestamp < lastTimestamp) {
            System.err.printf("clock is moving backwards.  Rejecting requests until %d.%n", lastTimestamp);
            throw new RuntimeException(String.format("Clock moved backwards.  Refusing to generate id for %d milliseconds",
                    lastTimestamp - timestamp));
        }

        // Same millisecond as the previous ID: advance the sequence to avoid duplicates
        if (lastTimestamp == timestamp) {

            sequence = (sequence + 1) & sequenceMask;

            // The sequence wrapped around past 4095
            if (sequence == 0) {
                // Spin until the next millisecond
                timestamp = tilNextMillis(lastTimestamp);
            }
        } else {
            // First ID in this millisecond: reset the sequence to 0
            sequence = 0;
        }

        // Remember the timestamp for the next call
        lastTimestamp = timestamp;

        // Shift each part into place and OR them together
        return ((timestamp - twepoch) << timestampLeftShift) |
                (datacenterId << datacenterIdShift) |
                (workerId << workerIdShift) |
                sequence;
    }

    private long tilNextMillis(long lastTimestamp) {
        // Read the latest timestamp
        long timestamp = timeGen();
        // Spin while the clock has not moved past the millisecond whose sequence was exhausted
        while (timestamp <= lastTimestamp) {
            // Not there yet, read the time again
            timestamp = timeGen();
        }
        return timestamp;
    }

    private long timeGen() {
        return SystemClock.now();
        // return System.currentTimeMillis();
    }

    public static void main(String[] args) {
        SnowFlake worker = new SnowFlake(1, 1);
        long timer = System.currentTimeMillis();
        for (int i = 0; i < 10000000; i++) {
            worker.nextId();
        }
        System.out.println(System.currentTimeMillis());
        System.out.println(System.currentTimeMillis() - timer);
    }
}
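
To make the bit layout used by nextId() easier to follow, here is a small sketch that packs the four parts into an ID and unpacks them again. It reuses the bit widths and the twepoch value from the class above; the class name SnowflakeLayout and its helper methods are hypothetical and exist only for illustration.

```java
final class SnowflakeLayout {

    // Same widths and epoch as in the SnowFlake class above
    static final long SEQUENCE_BITS = 12L;
    static final long WORKER_ID_BITS = 5L;
    static final long DATACENTER_ID_BITS = 5L;
    static final long TWEPOCH = 1634393012000L;

    static final long WORKER_ID_SHIFT = SEQUENCE_BITS;                          // 12
    static final long DATACENTER_ID_SHIFT = SEQUENCE_BITS + WORKER_ID_BITS;     // 17
    static final long TIMESTAMP_SHIFT = DATACENTER_ID_SHIFT + DATACENTER_ID_BITS; // 22

    // Largest value that fits in `bits` bits: mask(5) == 31, mask(12) == 4095
    static long mask(long bits) {
        return -1L ^ (-1L << bits);
    }

    // Pack the parts exactly the way nextId() does
    static long compose(long timestamp, long datacenterId, long workerId, long sequence) {
        return ((timestamp - TWEPOCH) << TIMESTAMP_SHIFT)
                | (datacenterId << DATACENTER_ID_SHIFT)
                | (workerId << WORKER_ID_SHIFT)
                | sequence;
    }

    static long timestampOf(long id) {
        return (id >>> TIMESTAMP_SHIFT) + TWEPOCH;
    }

    static long datacenterIdOf(long id) {
        return (id >>> DATACENTER_ID_SHIFT) & mask(DATACENTER_ID_BITS);
    }

    static long workerIdOf(long id) {
        return (id >>> WORKER_ID_SHIFT) & mask(WORKER_ID_BITS);
    }

    static long sequenceOf(long id) {
        return id & mask(SEQUENCE_BITS);
    }

    public static void main(String[] args) {
        long id = compose(TWEPOCH + 1000L, 3L, 7L, 42L);
        System.out.printf("id=%d timestamp=%d datacenter=%d worker=%d sequence=%d%n",
                id, timestampOf(id), datacenterIdOf(id), workerIdOf(id), sequenceOf(id));
    }
}
```

Because the sequence occupies only the lowest 12 bits, one generator can produce at most 4096 different IDs for the same timestamp value, which matters for the measurements that follow.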

Windows: i5-4590, 4 cores, 16 GB memory, 512 GB SSD

Mac: Mac pro 2020, 16 GB RAM, 512 GB SSD

Linux: deepin, running in a virtual machine, 8 GB memory, 160 GB disk

System.currentTimeMillis(), single thread (columns are the number of IDs generated, values are elapsed time in ms):

Platform/data volume	10000	1000000	10000000	100000000
mac	5	247	2444	24416
windows	3	249	2448	24426
linux(deepin)	135	598	4076	26388

SystemClock.now(), single thread (columns are the number of IDs generated, values are elapsed time in ms):

Platform/data volume	10000	1000000	10000000	100000000
mac	52	299	2501	24674
windows	56	3942	38934	389983
linux(deepin)	336	1226	4454	27639

The single-threaded tests above do not show any advantage for the background clock thread. On the contrary, on Windows it becomes abnormally slow once the data volume is large, and on Linux it is not faster either, but somewhat slower.
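
One way to read these numbers is to compare them with the ceiling built into the algorithm itself. With 12 sequence bits, a generator can hand out at most 4096 IDs per distinct timestamp value, and tilNextMillis() spins until the clock value changes, so generating N IDs needs at least N / 4096 distinct timestamps no matter how cheap the clock call is. The sketch below only does that arithmetic; the class name SnowflakeThroughputBound is made up for illustration.

```java
final class SnowflakeThroughputBound {

    // 12 sequence bits give 4096 IDs per distinct timestamp value
    static final long IDS_PER_TIMESTAMP = 1L << 12;

    // Minimum number of distinct timestamps one generator needs for `ids` IDs
    static long minDistinctTimestamps(long ids) {
        return (ids + IDS_PER_TIMESTAMP - 1) / IDS_PER_TIMESTAMP; // ceiling division
    }

    public static void main(String[] args) {
        long[] volumes = {10_000L, 1_000_000L, 10_000_000L, 100_000_000L};
        for (long n : volumes) {
            // Prints 3, 245, 2442 and 24415 for the four volumes
            System.out.printf("%d IDs need at least %d distinct timestamps%n",
                    n, minDistinctTimestamps(n));
        }
    }
}
```

When the clock advances once per millisecond, these counts are roughly the minimum elapsed time in ms. The Mac and Windows results for System.currentTimeMillis() (247, 2444 and 24416 ms on the Mac) sit almost exactly on this bound, which suggests the loop is limited by the 4096-per-millisecond ceiling rather than by the cost of reading the clock. Dividing the Windows SystemClock.now() results by the same counts (for example 38934 / 2442) gives about 16 ms per timestamp, which would be consistent with the cached value advancing only about every 16 ms on that machine. That last point is an inference from the table, not something measured here. When several threads each own their own generator, as in the multi-threaded test, the same bound applies to each thread's share of the IDs.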

Multi-threaded test code:

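    // Requires imports: java.util.ArrayList, java.util.List, java.util.concurrent.CountDownLatch
    public static void main(String[] args) throws InterruptedException {
        int threadNum = 16;
        CountDownLatch countDownLatch = new CountDownLatch(threadNum);
        // Each thread generates an equal share of the 100 million IDs
        int num = 100000000 / threadNum;
        long timer = System.currentTimeMillis();
        thread(num, countDownLatch);
        // Wait until every thread has finished
        countDownLatch.await();
        System.out.println(System.currentTimeMillis() - timer);

    }

    public static void thread(int num, CountDownLatch countDownLatch) {
        List<Thread> threadList = new ArrayList<>();
        // Create one thread per latch count; none has started yet, so the count is stable
        for (int i = 0; i < countDownLatch.getCount(); i++) {
            Thread cur = new Thread(new Runnable() {
                @Override
                public void run() {
                    // Each thread uses its own generator, so threads do not contend on nextId()'s lock
                    SnowFlake worker = new SnowFlake(1, 1);
                    for (int i = 0; i < num; i++) {
                        worker.nextId();
                    }
                    countDownLatch.countDown();
                }
            });
            threadList.add(cur);
        }
        // Start all threads only after they have all been created
        for (Thread t : threadList) {
            t.start();
        }
    }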

Below, 100000000 (one hundred million) IDs are generated with System.currentTimeMillis() using different numbers of threads (elapsed time in ms):

Platform/thread	2	4	8	16
mac	14373	6132	3410	3247
windows	12408	6862	6791	7114
linux	20753	19055	18919	19602

The same 100000000 (one hundred million) IDs generated with SystemClock.now() using different numbers of threads (elapsed time in ms):

Platform/thread	2	4	8	16
mac	12319	6275	3691	3746
windows	194763	110442	153960	174974
linux	26516	25313	25497	25544

With multiple threads, there is little difference between the two clocks on the Mac: the run gets faster as threads are added, up to 8 threads. On Windows, however, SystemClock.now() is obviously slower. The runs took so long that I started watching short videos while waiting for the results. The numbers are also related to the number of processor cores. On Windows, System.currentTimeMillis() stops getting faster once the thread count exceeds 4, because my machine has only four cores, and beyond that many context switches occur.

On Linux, because it runs in a virtual machine, adding threads has little effect, and SystemClock.now() is actually slower than System.currentTimeMillis().

One question remains: which of the two methods is more likely to return a duplicate timestamp?

    // Counts every call to timeGen(), including the retries made while spinning in tilNextMillis()
    static AtomicLong atomicLong = new AtomicLong(0);
    private long timeGen() {
        atomicLong.incrementAndGet();
        // return SystemClock.now();
        return System.currentTimeMillis();
    }

The following table shows how many times timeGen() was called while generating 10 million IDs with eight threads. The more calls, the more often the same timestamp was returned and had to be retried:

Platform/Method	SystemClock.now()	System.currentTimeMillis()
mac	23067209	12896314
windows	705460039	35164476
linux	1165552352	81422626

It can be seen that because SystemClock.now() maintains its own cached time, callers are more likely to get the same timestamp, which triggers more repeated calls and increases the number of conflicts. This is a disadvantage. There is also a harsh fact: a time value refreshed by a self-defined background thread is not that accurate. The gap on Linux is even larger, with far too many time conflicts.
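
The inaccuracy of a background-refreshed clock can be observed directly. The following sketch is an illustration under assumptions, not part of the tests above: the class CachedClockLag and the method maxLagMillis are hypothetical, and the cached clock is rebuilt inline with the same scheduleAtFixedRate approach as SystemClock so that the example is self-contained. It repeatedly reads the cached value, then the real clock, and records the largest gap it sees.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

final class CachedClockLag {

    // Returns the largest observed gap (in ms) between the real clock and a
    // cached clock that a background thread refreshes every `periodMillis`.
    static long maxLagMillis(int periodMillis, int samples) {
        AtomicLong cached = new AtomicLong(System.currentTimeMillis());
        ScheduledExecutorService updater = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r);
            t.setDaemon(true);
            return t;
        });
        updater.scheduleAtFixedRate(
                () -> cached.set(System.currentTimeMillis()),
                0, periodMillis, TimeUnit.MILLISECONDS);
        long maxLag = 0;
        try {
            for (int i = 0; i < samples; i++) {
                // Read the cached value first, so the real clock is never older than it
                long seen = cached.get();
                long lag = System.currentTimeMillis() - seen;
                maxLag = Math.max(maxLag, lag);
            }
        } finally {
            updater.shutdownNow();
        }
        return maxLag;
    }

    public static void main(String[] args) {
        System.out.println("max lag in ms: " + maxLagMillis(1, 5_000_000));
    }
}
```

A maximum lag well above the 1 ms refresh period would mean the cached value stays unchanged for longer than a millisecond, so a generator that reads it keeps seeing the same timestamp and has to retry more often.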

Conclusion

In actual testing, SystemClock.now() was not found to improve efficiency much. On the contrary, because callers compete for the same cached value, timestamp conflicts become more likely. The JDK developers are really not stupid. They have surely tested this for a long time, and their tests are much more reliable than our own. Therefore, in my personal opinion, this optimization is not that reliable.

Do not believe a conclusion lightly. If you have any doubt, run an experiment or find a sufficiently authoritative source.

[Profile of the author]:
Qin Huai, [Qinhuai Grocery Store]. The road of technology is not traveled in a day. The mountains are high and the rivers are long, and even if progress is slow, it never stops. Personal writing directions: Java source code analysis, JDBC, Mybatis, Spring, redis, distributed systems, LeetCode, and so on, mostly written as series of articles. I cannot guarantee that what I have written is completely correct, but I guarantee that it has been practiced or researched. Corrections of any omissions or errors are welcome.
