为便于查看分布在多个机器上的应用日志,常需要聚合日志. 以下是个人的实现总结

市面上的做法

  1. elk (es, loglash, konlia): loglash 做日志聚合(以appender)的形式, es 做存储, konlia 做可视化.

简单实现

思路: 消息放mq,消费端有序消费,记录日志
  • 详细描述:
  1. web入口可以通过servlet filter 生成整个调用链的唯一 TraceId (可以由 请求url,业务端标识共同构成)
  2. 通过dubbo 的filter, 实现 调用开始的时候将traceId传入到threadLocal. 消费端调用服务端时,将traceId. 作为额外属性传参,服务端filter 接收到参数时放入到ThreadLocal 中从而达到标记同一个请求的目的.
  3. 自定义 logback的 Appender, 拿到ThreadLocal中的 TraceId,如果是相同的则放入RocketMq 的同一个队列,达到顺序消费的目的.继而保证记录日志是有序的.
  4. 消费日志时,按照队列循环打印,因为每个队列里面的对应的是整个的一个调用链的日志, 所以按照队列打印是时序正常的. 更有利于查看日志
  • 注意点:
  1. 对于消息生产者而言, 记录日志时,虽然是顺序发送的, 但不能保证先发出的就先到达队列. 兼顾性能, 又不能采用rocketmq 的同步发送消息形式.采用sendOneWay的方式效率快,但不保证到达队列里是有序的.
  2. 不能用 logback 的 AsyncAppender 包装自己实现的 appender, 因为全局 traceId保存在threadLocal 中,AsyncAppender 打印日志会在新启一个线程打印日志, 之前的ThreadLocal 中的TraceId 就获取不到了.
  • 示例代码
主要类放github上了

https://github.com/normalHeFei/normal_try/tree/master/java/src/main/java/wk/clulog

logback.xml 

<?xml version="1.0" encoding="UTF-8"?>
<configuration>

    <appender name="mqAppender1" class="wk.clulog.RocketMqAppender">
        <param name="Tag" value="logTag" />
        <param name="Topic" value="logTopic" />
        <param name="ProducerGroup" value="logGroup" />
        <param name="NameServerAddress" value="192.168.103.3:9876"/>
        <layout class="ch.qos.logback.classic.PatternLayout">
            <pattern>%date %p %t - %m%n</pattern>
        </layout>
    </appender>

    <appender name="mqAsyncAppender1" class="ch.qos.logback.classic.AsyncAppender">
        <queueSize>1024</queueSize>
        <discardingThreshold>80</discardingThreshold>
        <maxFlushTime>2000</maxFlushTime>
        <neverBlock>true</neverBlock>
        <appender-ref ref="mqAppender1"/>
    </appender>

    <root level="INFO">
        <appender-ref ref="mqAppender1"/>
    </root>

</configuration>

rocketmq 相关实现代码走读

  • producer几种发送方式实现
  1. sendOneWay / sendAsync:

根据负载均衡策略选取broker,获取channel 直接发送,虽然是sendOneWay但对并发发送的数量,rocketMq其实用信号量保护了一下最大的并发数,相关代码如下

public void invokeOnewayImpl(final Channel channel, final RemotingCommand request, final long timeoutMillis)
        throws InterruptedException, RemotingTooMuchRequestException, RemotingTimeoutException, RemotingSendRequestException {
        request.markOnewayRPC();
        boolean acquired = this.semaphoreOneway.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS);
        if (acquired) {
            //将信号量 的 release 用 cas 包装了一下,避免多线程环境下多个release重复操作
            final SemaphoreReleaseOnlyOnce once = new SemaphoreReleaseOnlyOnce(this.semaphoreOneway);
            try {
                channel.writeAndFlush(request).addListener(new ChannelFutureListener() {
                    @Override
                    public void operationComplete(ChannelFuture f) throws Exception {
                        once.release();
                        if (!f.isSuccess()) {
                            log.warn("send a request command to channel <" + channel.remoteAddress() + "> failed.");
                        }
                    }
                });
            } catch (Exception e) {
                once.release();
                log.warn("write send a request command to channel <" + channel.remoteAddress() + "> failed.");
                throw new RemotingSendRequestException(RemotingHelper.parseChannelRemoteAddr(channel), e);
            }
        } else {
            if (timeoutMillis <= 0) {
                throw new RemotingTooMuchRequestException("invokeOnewayImpl invoke too fast");
            } else {
                String info = String.format(
                    "invokeOnewayImpl tryAcquire semaphore timeout, %dms, waiting thread nums: %d semaphoreAsyncValue: %d",
                    timeoutMillis,
                    this.semaphoreOneway.getQueueLength(),
                    this.semaphoreOneway.availablePermits()
                );
                log.warn(info);
                throw new RemotingTimeoutException(info);
            }
        }
    }
  1. sendMessageSync

通过countDownLatch实现同步返回. 代码如下:


 channel.writeAndFlush(request).addListener(new ChannelFutureListener() {
        @Override
        public void operationComplete(ChannelFuture f) throws Exception {
            //将结果包装成Future
            if (f.isSuccess()) {
                responseFuture.setSendRequestOK(true);
                return;
            } else {
                responseFuture.setSendRequestOK(false);
            }

            responseTable.remove(opaque);
            responseFuture.setCause(f.cause());
            responseFuture.putResponse(null);
            log.warn("send a request command to channel <" + addr + "> failed.");
        }
});
RemotingCommand responseCommand = responseFuture.waitResponse(timeoutMillis);

//栅栏等待.
public RemotingCommand waitResponse(final long timeoutMillis) throws InterruptedException {
        this.countDownLatch.await(timeoutMillis, TimeUnit.MILLISECONDS);
        return this.responseCommand;
    }
//回调返回结果时,解除栅栏
public void putResponse(final RemotingCommand responseCommand) {
    this.responseCommand = responseCommand;
    this.countDownLatch.countDown();
}
  • consumer 是如何有序消费的

直接看代码


  try {
        //processQueue 为队列消息的处理快照,记录了处理消息的偏移量等信息, 通过对处理队列加锁来实现 单个队列里面消息的顺序消费. 
        this.processQueue.getLockConsume().lock();
        if (this.processQueue.isDropped()) {
            log.warn("consumeMessage, the message queue not be able to consume, because it's dropped. {}",
                this.messageQueue);
            break;
        }

        status = messageListener.consumeMessage(Collections.unmodifiableList(msgs), context);
    } catch (Throwable e) {
        log.warn("consumeMessage exception: {} Group: {} Msgs: {} MQ: {}",
            RemotingHelper.exceptionSimpleDesc(e),
            ConsumeMessageOrderlyService.this.consumerGroup,
            msgs,
            messageQueue);
        hasException = true;
    } finally {
        this.processQueue.getLockConsume().unlock();
    }

niguanwo
1 声望0 粉丝

这里当做笔记本.