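Kafka always shuts itself down some time after startup, usually between about 2.3 and 3 hours in.
The Kafka server log shows no errors; it looks exactly like the log of a normal shutdown.
kafka-server.log: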
[2016-01-11 12:15:06,435] INFO Verifying properties (kafka.utils.VerifiableProperties)
[2016-01-11 12:15:06,489] INFO Property broker.id is overridden to 0 (kafka.utils.VerifiableProperties)
[2016-01-11 12:15:06,489] INFO Property host.name is overridden to 172.16.1.22 (kafka.utils.VerifiableProperties)
[2016-01-11 12:15:06,490] INFO Property log.cleaner.enable is overridden to false (kafka.utils.VerifiableProperties)
[2016-01-11 12:15:06,490] INFO Property log.dirs is overridden to /data/www/wifiin/logs/kafka (kafka.utils.VerifiableProperties)
[2016-01-11 12:15:06,490] INFO Property log.retention.check.interval.ms is overridden to 300000 (kafka.utils.VerifiableProperties)
[2016-01-11 12:15:06,491] INFO Property log.retention.hours is overridden to 168 (kafka.utils.VerifiableProperties)
[2016-01-11 12:15:06,491] INFO Property log.segment.bytes is overridden to 1073741824 (kafka.utils.VerifiableProperties)
[2016-01-11 12:15:06,491] INFO Property num.io.threads is overridden to 8 (kafka.utils.VerifiableProperties)
[2016-01-11 12:15:06,491] INFO Property num.network.threads is overridden to 3 (kafka.utils.VerifiableProperties)
[2016-01-11 12:15:06,492] INFO Property num.partitions is overridden to 1 (kafka.utils.VerifiableProperties)
[2016-01-11 12:15:06,492] INFO Property num.recovery.threads.per.data.dir is overridden to 1 (kafka.utils.VerifiableProperties)
[2016-01-11 12:15:06,492] INFO Property port is overridden to 9092 (kafka.utils.VerifiableProperties)
[2016-01-11 12:15:06,493] INFO Property socket.receive.buffer.bytes is overridden to 102400 (kafka.utils.VerifiableProperties)
[2016-01-11 12:15:06,493] INFO Property socket.request.max.bytes is overridden to 104857600 (kafka.utils.VerifiableProperties)
[2016-01-11 12:15:06,493] INFO Property socket.send.buffer.bytes is overridden to 102400 (kafka.utils.VerifiableProperties)
[2016-01-11 12:15:06,493] INFO Property zookeeper.connect is overridden to 127.0.0.1:2181 (kafka.utils.VerifiableProperties)
[2016-01-11 12:15:06,494] INFO Property zookeeper.connection.timeout.ms is overridden to 6000 (kafka.utils.VerifiableProperties)
[2016-01-11 12:15:06,550] INFO [Kafka Server 0], starting (kafka.server.KafkaServer)
[2016-01-11 12:15:06,553] INFO [Kafka Server 0], Connecting to zookeeper on 127.0.0.1:2181 (kafka.server.KafkaServer)
[2016-01-11 12:15:06,568] INFO Starting ZkClient event thread. (org.I0Itec.zkclient.ZkEventThread)
[2016-01-11 12:15:06,579] INFO Client environment:zookeeper.version=3.4.6-1569965, built on 02/20/2014 09:09 GMT (org.apache.zookeeper.ZooKeeper)
[2016-01-11 12:15:06,579] INFO Client environment:host.name=wifiin-analysis-22 (org.apache.zookeeper.ZooKeeper)
[2016-01-11 12:15:06,579] INFO Client environment:java.version=1.7.0_40 (org.apache.zookeeper.ZooKeeper)
[2016-01-11 12:15:06,579] INFO Client environment:java.vendor=Oracle Corporation (org.apache.zookeeper.ZooKeeper)
[2016-01-11 12:15:06,579] INFO Client environment:java.home=/usr/local/java/jdk1.7.0_40/jre (org.apache.zookeeper.ZooKeeper)
[2016-01-11 12:15:06,579] INFO Client environment:java.class.path=.:/usr/local/java/latest//lib/dt.jar:/usr/local/java/latest//lib/tools.jar:/usr/local/kafka/bin/../core/build/dependant-libs-2.10.4*/*.jar:/usr/local/kafka/bin/../examples/build/libs//kafka-examples*.jar:/usr/local/kafka/bin/../contrib/hadoop-consumer/build/libs//kafka-hadoop-consumer*.jar:/usr/local/kafka/bin/../contrib/hadoop-producer/build/libs//kafka-hadoop-producer*.jar:/usr/local/kafka/bin/../clients/build/libs/kafka-clients*.jar:/usr/local/kafka/bin/../libs/jopt-simple-3.2.jar:/usr/local/kafka/bin/../libs/kafka_2.9.1-0.8.2.1.jar:/usr/local/kafka/bin/../libs/kafka_2.9.1-0.8.2.1-javadoc.jar:/usr/local/kafka/bin/../libs/kafka_2.9.1-0.8.2.1-scaladoc.jar:/usr/local/kafka/bin/../libs/kafka_2.9.1-0.8.2.1-sources.jar:/usr/local/kafka/bin/../libs/kafka_2.9.1-0.8.2.1-test.jar:/usr/local/kafka/bin/../libs/kafka-clients-0.8.2.1.jar:/usr/local/kafka/bin/../libs/log4j-1.2.16.jar:/usr/local/kafka/bin/../libs/lz4-1.2.0.jar:/usr/local/kafka/bin/../libs/metrics-core-2.2.0.jar:/usr/local/kafka/bin/../libs/scala-library-2.9.1.jar:/usr/local/kafka/bin/../libs/slf4j-api-1.7.6.jar:/usr/local/kafka/bin/../libs/slf4j-log4j12-1.6.1.jar:/usr/local/kafka/bin/../libs/snappy-java-1.1.1.6.jar:/usr/local/kafka/bin/../libs/zkclient-0.3.jar:/usr/local/kafka/bin/../libs/zookeeper-3.4.6.jar:/usr/local/kafka/bin/../core/build/libs/kafka_2.10*.jar (org.apache.zookeeper.ZooKeeper)
[2016-01-11 12:15:06,579] INFO Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib (org.apache.zookeeper.ZooKeeper)
[2016-01-11 12:15:06,580] INFO Client environment:java.io.tmpdir=/tmp (org.apache.zookeeper.ZooKeeper)
[2016-01-11 12:15:06,580] INFO Client environment:java.compiler=<NA> (org.apache.zookeeper.ZooKeeper)
[2016-01-11 12:15:06,580] INFO Client environment:os.name=Linux (org.apache.zookeeper.ZooKeeper)
[2016-01-11 12:15:06,580] INFO Client environment:os.arch=amd64 (org.apache.zookeeper.ZooKeeper)
[2016-01-11 12:15:06,580] INFO Client environment:os.version=2.6.32-573.3.1.el6.x86_64 (org.apache.zookeeper.ZooKeeper)
[2016-01-11 12:15:06,580] INFO Client environment:user.name=root (org.apache.zookeeper.ZooKeeper)
[2016-01-11 12:15:06,580] INFO Client environment:user.home=/root (org.apache.zookeeper.ZooKeeper)
[2016-01-11 12:15:06,580] INFO Client environment:user.dir=/usr/local/kafka (org.apache.zookeeper.ZooKeeper)
[2016-01-11 12:15:06,581] INFO Initiating client connection, connectString=127.0.0.1:2181 sessionTimeout=6000 watcher=org.I0Itec.zkclient.ZkClient@52d73384 (org.apache.zookeeper.ZooKeeper)
[2016-01-11 12:15:06,609] INFO Opening socket connection to server 127.0.0.1/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
[2016-01-11 12:15:06,614] INFO Socket connection established to 127.0.0.1/127.0.0.1:2181, initiating session (org.apache.zookeeper.ClientCnxn)
[2016-01-11 12:15:06,661] INFO Session establishment complete on server 127.0.0.1/127.0.0.1:2181, sessionid = 0x15210b1e3051357, negotiated timeout = 6000 (org.apache.zookeeper.ClientCnxn)
[2016-01-11 12:15:06,664] INFO zookeeper state changed (SyncConnected) (org.I0Itec.zkclient.ZkClient)
[2016-01-11 12:15:06,915] INFO Loading logs. (kafka.log.LogManager)
[2016-01-11 12:15:07,021] INFO Completed load of log sdkLog-0 with log end offset 209376831 (kafka.log.Log)
[2016-01-11 12:15:07,039] INFO Logs loading complete. (kafka.log.LogManager)
[2016-01-11 12:15:07,040] INFO Starting log cleanup with a period of 300000 ms. (kafka.log.LogManager)
[2016-01-11 12:15:07,044] INFO Starting log flusher with a default period of 9223372036854775807 ms. (kafka.log.LogManager)
[2016-01-11 12:15:17,096] INFO Awaiting socket connections on 172.16.1.22:9092. (kafka.network.Acceptor)
[2016-01-11 12:15:17,097] INFO [Socket Server on Broker 0], Started (kafka.network.SocketServer)
[2016-01-11 12:15:17,195] INFO Will not load MX4J, mx4j-tools.jar is not in the classpath (kafka.utils.Mx4jLoader$)
[2016-01-11 12:15:17,234] INFO 0 successfully elected as leader (kafka.server.ZookeeperLeaderElector)
[2016-01-11 12:15:17,500] INFO Registered broker 0 at path /brokers/ids/0 with address 172.16.1.22:9092. (kafka.utils.ZkUtils$)
[2016-01-11 12:15:17,502] INFO New leader is 0 (kafka.server.ZookeeperLeaderElector$LeaderChangeListener)
[2016-01-11 12:15:17,515] INFO [Kafka Server 0], started (kafka.server.KafkaServer)
[2016-01-11 12:15:17,795] INFO [ReplicaFetcherManager on broker 0] Removed fetcher for partitions [sdkLog,0] (kafka.server.ReplicaFetcherManager)
[2016-01-11 12:15:17,856] INFO [ReplicaFetcherManager on broker 0] Removed fetcher for partitions [sdkLog,0] (kafka.server.ReplicaFetcherManager)
[2016-01-11 14:47:41,816] INFO [Kafka Server 0], shutting down (kafka.server.KafkaServer)
[2016-01-11 14:47:41,818] INFO [Kafka Server 0], Starting controlled shutdown (kafka.server.KafkaServer)
[2016-01-11 14:47:41,981] INFO [Kafka Server 0], Controlled shutdown succeeded (kafka.server.KafkaServer)
[2016-01-11 14:47:41,983] INFO Closing socket connection to /172.16.1.22. (kafka.network.Processor)
[2016-01-11 14:47:41,984] INFO [Socket Server on Broker 0], Shutting down (kafka.network.SocketServer)
[2016-01-11 14:47:41,989] INFO [Socket Server on Broker 0], Shutdown completed (kafka.network.SocketServer)
[2016-01-11 14:47:41,991] INFO [Kafka Request Handler on Broker 0], shutting down (kafka.server.KafkaRequestHandlerPool)
[2016-01-11 14:47:41,994] INFO [Kafka Request Handler on Broker 0], shut down completely (kafka.server.KafkaRequestHandlerPool)
[2016-01-11 14:47:42,267] INFO [Replica Manager on Broker 0]: Shut down (kafka.server.ReplicaManager)
[2016-01-11 14:47:42,267] INFO [ReplicaFetcherManager on broker 0] shutting down (kafka.server.ReplicaFetcherManager)
[2016-01-11 14:47:42,269] INFO [ReplicaFetcherManager on broker 0] shutdown completed (kafka.server.ReplicaFetcherManager)
[2016-01-11 14:47:42,314] INFO [Replica Manager on Broker 0]: Shut down completely (kafka.server.ReplicaManager)
[2016-01-11 14:47:42,315] INFO Shutting down. (kafka.log.LogManager)
[2016-01-11 14:47:42,376] INFO Shutdown complete. (kafka.log.LogManager)
[2016-01-11 14:47:42,384] INFO Terminate ZkClient event thread. (org.I0Itec.zkclient.ZkEventThread)
[2016-01-11 14:47:42,396] INFO Session: 0x15210b1e3051357 closed (org.apache.zookeeper.ZooKeeper)
[2016-01-11 14:47:42,396] INFO EventThread shut down (org.apache.zookeeper.ClientCnxn)
[2016-01-11 14:47:42,396] INFO [Kafka Server 0], shut down completed (kafka.server.KafkaServer)
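At first I suspected that CentOS's OOM Killer had killed it, but the problem persisted after I changed oom_score, and I found no related entries in /var/log/messages.
What should I do now to fix this, or to narrow down the cause further?
Many thanks!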
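One way to double-check the OOM Killer theory: the kernel logs every OOM kill, and an OOM kill arrives as SIGKILL, which would not give the broker a chance to log a controlled shutdown like the one above. A minimal sketch, assuming syslog goes to /var/log/messages as on a default CentOS install. `oom_killer_hits` is a hypothetical helper name, and the patterns are the usual kernel phrasings, not an exhaustive list.

```shell
#!/bin/sh
# Hypothetical helper: print lines in a syslog file that look like OOM Killer activity.
# The exit status is grep's: 0 if something matched, 1 if nothing matched.
oom_killer_hits() {
    log_file=${1:-/var/log/messages}
    grep -E -i 'out of memory|oom-killer|killed process' "$log_file"
}

# Only scan the default location if it is readable on this machine.
if [ -r /var/log/messages ]; then
    oom_killer_hits /var/log/messages || echo "no OOM Killer entries found"
fi
```

If this finds nothing around the shutdown time, the OOM Killer is probably not involved, and a signal such as SIGTERM or SIGHUP sent to the broker becomes the more likely candidate.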
Edited on 2016-01-18:
After reading Kafka's startup scripts, last week I tried starting it with
bin/kafka-server-start.sh -daemon ./config/server.properties
and Kafka has been running normally ever since.
The difference from starting without -daemon is in:
bin/kafka-run-class.sh
# Launch mode
if [ "x$DAEMON_MODE" = "xtrue" ]; then
# command used when -daemon is given
nohup $JAVA $KAFKA_HEAP_OPTS $KAFKA_JVM_PERFORMANCE_OPTS $KAFKA_GC_LOG_OPTS $KAFKA_JMX_OPTS $KAFKA_LOG4J_OPTS -cp $CLASSPATH $KAFKA_OPTS "$@" > "$CONSOLE_OUTPUT_FILE" 2>&1 < /dev/null &
else
# command used when -daemon is not given
exec $JAVA $KAFKA_HEAP_OPTS $KAFKA_JVM_PERFORMANCE_OPTS $KAFKA_GC_LOG_OPTS $KAFKA_JMX_OPTS $KAFKA_LOG4J_OPTS -cp $CLASSPATH $KAFKA_OPTS "$@"
fi
As for why it works when started in daemon mode, I have not found the reason yet.
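One plausible explanation, not confirmed anywhere in this thread: started without -daemon, the broker stays attached to the launching terminal or SSH session, and when that session ends the process receives SIGHUP. A HotSpot JVM (unless started with -Xrs) reacts to SIGHUP by running its shutdown hooks, which would produce a clean "shutting down" log like the one above. With -daemon, nohup makes the process ignore SIGHUP. The sketch below illustrates only that difference, using sleep as a stand-in for the broker; the variable names are made up for the demonstration.

```shell
#!/bin/sh
# Illustration only: `sleep` stands in for the broker JVM.

# Like the non-daemon path: default SIGHUP handling.
sleep 5 &
plain_pid=$!

# Like the -daemon path: nohup starts the command with SIGHUP ignored.
nohup sleep 5 >/dev/null 2>&1 </dev/null &
nohup_pid=$!

sleep 1                                  # give both processes time to start
kill -HUP "$plain_pid" "$nohup_pid"      # simulate the terminal going away

# Reap the plain process; a status above 128 means it was ended by a signal
# (128 + 1 for SIGHUP on typical systems).
wait "$plain_pid"
plain_status=$?

# kill -0 sends no signal; it only checks whether the process still exists.
if kill -0 "$nohup_pid" 2>/dev/null; then
    nohup_state=alive
else
    nohup_state=dead
fi

echo "plain exit status=$plain_status nohup=$nohup_state"

kill "$nohup_pid" 2>/dev/null            # clean up the surviving process
```

If the broker keeps dying only when started in the foreground, checking whether the shutdown times line up with a terminal or SSH session closing would be a cheap way to test this theory.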
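What are the machine specs? What does the cluster look like? There is nothing here except the log...
ZooKeeper and the broker both need a fair amount of resources; each needs about 2 GB of memory for reasonable stability.
Running the two on the same machine also reduces stability.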
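For the broker, the heap can be set through KAFKA_HEAP_OPTS, which the kafka-run-class.sh excerpt above passes to the JVM. A minimal sketch, assuming kafka-server-start.sh leaves an already-set KAFKA_HEAP_OPTS untouched and that the machine has the memory to spare; the 2G value simply mirrors the suggestion above.

```shell
#!/bin/sh
# Assumption: the start script only applies its default heap size when
# KAFKA_HEAP_OPTS is empty, so exporting it beforehand takes effect.
export KAFKA_HEAP_OPTS="-Xmx2G -Xms2G"
echo "broker heap options: $KAFKA_HEAP_OPTS"

# Then start the broker as before, for example:
#   bin/kafka-server-start.sh -daemon ./config/server.properties
```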