The Hadoop cluster installation post covered how to install a basic Hadoop cluster, but if that cluster's NameNode fails, the whole cluster becomes unusable. Here we use ZooKeeper to build a high-availability (HA) cluster: when the Active NameNode fails, the Standby NameNode takes over in time and becomes the new Active NameNode, continuing to serve clients.
For the ZooKeeper cluster setup, see the previous post. In this configuration, bigdata01, bigdata02, and bigdata03 are ZooKeeper nodes, and bigdata04 plays the observer role in the cluster.
The remaining steps are the same as in the previous cluster installation; the difference lies in modifying several configuration files.
Configuration files
hdfs-site.xml
This file configures the information for the two NameNodes.
<configuration>
<!-- Replication factor -->
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<!-- Define the nameservice name as hadoopdajun -->
<property>
<name>dfs.nameservices</name>
<value>hadoopdajun</value>
</property>
<!-- Names of the two NameNodes under hadoopdajun: nn1, nn2 -->
<property>
<name>dfs.ha.namenodes.hadoopdajun</name>
<value>nn1,nn2</value>
</property>
<!-- RPC address of nn1; RPC is used to communicate with the DataNodes -->
<property>
<name>dfs.namenode.rpc-address.hadoopdajun.nn1</name>
<value>bigdata01:9000</value>
</property>
<!-- HTTP address of nn1, used by web clients -->
<property>
<name>dfs.namenode.http-address.hadoopdajun.nn1</name>
<value>bigdata01:50070</value>
</property>
<!-- RPC address of nn2; RPC is used to communicate with the DataNodes -->
<property>
<name>dfs.namenode.rpc-address.hadoopdajun.nn2</name>
<value>bigdata02:9000</value>
</property>
<!-- HTTP address of nn2, used by web clients -->
<property>
<name>dfs.namenode.http-address.hadoopdajun.nn2</name>
<value>bigdata02:50070</value>
</property>
<!-- Shared storage location for the edits metadata (the JournalNode quorum) -->
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://bigdata01:8485;bigdata02:8485;bigdata03:8485/hadoopdajun</value>
</property>
<!-- Local disk location where the JournalNodes store their data -->
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/home/bigdata/data/journaldata</value>
</property>
<!-- Enable automatic failover when a NameNode fails -->
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<!-- Proxy class used by clients for automatic failover -->
<property>
<name>dfs.client.failover.proxy.provider.hadoopdajun</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<!-- Fencing methods; multiple methods are separated by newlines, one method per line -->
<property>
<name>dfs.ha.fencing.methods</name>
<value>
sshfence
shell(/bin/true)
</value>
</property>
<!-- The sshfence method requires passwordless SSH login -->
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/home/bigdata/.ssh/id_rsa</value>
</property>
<!-- Connect timeout for the sshfence method, in milliseconds -->
<property>
<name>dfs.ha.fencing.ssh.connect-timeout</name>
<value>30000</value>
</property>
<!-- Enable WebHDFS -->
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
</configuration>
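Since sshfence logs in over SSH to fence the failed NameNode, it is worth confirming that passwordless SSH with the key configured above works between the two NameNode hosts. A quick check, assuming the bigdata user from the paths above:
ssh -i /home/bigdata/.ssh/id_rsa bigdata@bigdata02 "echo ok"
This should print ok without prompting for a password.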
core-site.xml
This file configures the ZooKeeper quorum, and fs.defaultFS now points to the nameservice name rather than the address of a single NameNode.
<configuration>
<!-- The nameservice configured in hdfs-site.xml -->
<property>
<name>fs.defaultFS</name>
<value>hdfs://hadoopdajun/</value>
</property>
<!-- Hadoop working directory -->
<property>
<name>hadoop.tmp.dir</name>
<value>/home/bigdata/data/hadoopdata/</value>
</property>
<!-- Access addresses of the ZooKeeper cluster -->
<property>
<name>ha.zookeeper.quorum</name>
<value>bigdata01:2181,bigdata02:2181,bigdata03:2181</value>
</property>
</configuration>
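Before formatting anything, it helps to confirm that the ZooKeeper quorum listed above is reachable. One quick check uses ZooKeeper's four-letter ruok command (assuming nc is installed); each node should answer imok:
echo ruok | nc bigdata01 2181
echo ruok | nc bigdata02 2181
echo ruok | nc bigdata03 2181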
yarn-site.xml
This file configures ResourceManager HA and its ZooKeeper information for YARN.
<configuration>
<!-- Site specific YARN configuration properties -->
<!-- Enable ResourceManager high availability -->
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<!-- Cluster id of the ResourceManagers -->
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>yarndajun</value>
</property>
<!-- Logical ids of the two ResourceManager nodes -->
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<!-- Hostnames of the two nodes -->
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>bigdata03</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>bigdata04</value>
</property>
<!-- Addresses of the ZooKeeper cluster -->
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>bigdata01:2181,bigdata02:2181,bigdata03:2181</value>
</property>
<!-- Auxiliary service that must be configured to run MapReduce jobs -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<!-- Enable log aggregation for the YARN cluster -->
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
<!-- Maximum retention time for aggregated logs, in seconds (86400 = one day) -->
<property>
<name>yarn.log-aggregation.retain-seconds</name>
<value>86400</value>
</property>
<!-- Enable automatic recovery -->
<property>
<name>yarn.resourcemanager.recovery.enabled</name>
<value>true</value>
</property>
<!-- Store the ResourceManager state in the ZooKeeper cluster -->
<property>
<name>yarn.resourcemanager.store.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>
</configuration>
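With ZKRMStateStore, the ResourceManager persists its state under a znode in ZooKeeper (by default /rmstore, unless yarn.resourcemanager.zk-state-store.parent-path says otherwise). Once the cluster is running, a sanity check with the ZooKeeper CLI shows whether recovery data is being written:
zkCli.sh -server bigdata01:2181 ls /rmstore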
mapred-site.xml
Compared with the previous installation, this adds the JobHistory Server settings.
<configuration>
<!-- Set the execution framework to Hadoop YARN -->
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<!-- Address and port of the MapReduce JobHistory Server -->
<property>
<name>mapreduce.jobhistory.address</name>
<value>bigdata04:10020</value>
</property>
<!-- Web UI address of the MapReduce JobHistory Server -->
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>bigdata04:19888</value>
</property>
</configuration>
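Note that the JobHistory Server configured here is not started by start-dfs.sh or start-yarn.sh; in Hadoop 2.x it is usually started separately on bigdata04:
mr-jobhistory-daemon.sh start historyserver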
Run
Start the JournalNodes by executing on bigdata01, bigdata02, and bigdata03:
hadoop-daemon.sh start journalnode
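You can verify with jps on each of the three nodes that a JournalNode process is running:
jps
The output should include a JournalNode entry.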
Format the NameNode on bigdata01:
hadoop namenode -format
This generates /home/bigdata/data/hadoopdata; copy it to the bigdata02 server so the standby NameNode starts from the same metadata.
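A minimal sketch of the copy, assuming the same user and directory layout on bigdata02:
scp -r /home/bigdata/data/hadoopdata bigdata@bigdata02:/home/bigdata/data/
Alternatively, running hdfs namenode -bootstrapStandby on bigdata02 (with nn1 already started) pulls the formatted metadata from the active NameNode.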
Initialize the HA metadata in ZooKeeper:
hdfs zkfc -formatZK
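If this succeeds, a hadoop-ha znode is created in ZooKeeper for the nameservice; it can be confirmed with the ZooKeeper CLI:
zkCli.sh -server bigdata01:2181 ls /hadoop-ha
The output should list hadoopdajun.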
Start the HDFS and YARN clusters. start-dfs.sh can be executed on any node; start-yarn.sh needs to be executed once on each of bigdata03 and bigdata04.
start-dfs.sh
start-yarn.sh
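Once everything is up, the HA state of each NameNode can be queried directly:
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2
One should report active and the other standby.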
The http://bigdata01:50070 interface:
The http://bigdata02:50070 interface:
As you can see, bigdata01 is in the active state and bigdata02 is in the standby state. If the NameNode process on bigdata01 is killed, bigdata02 automatically becomes active.
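A simple way to exercise the failover on bigdata01: find the NameNode pid with jps, kill it, and check that nn2 has taken over (here <pid> stands for the process id printed by jps):
jps | grep NameNode
kill -9 <pid>
hdfs haadmin -getServiceState nn2
The last command should now report active.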
The YARN interfaces are at http://bigdata03:8088/ and http://bigdata04:8088/. Here bigdata04 is the active master node; if you visit bigdata03, you are automatically redirected to the bigdata04 interface.
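The active/standby roles of the two ResourceManagers can be checked the same way:
yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2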
Other
Installing this HA cluster took much more effort than the previous one, and it was reinstalled many times. The hadoopdata, journaldata, and zkdata folders under /home/bigdata/data/, and even the hadoop folders on bigdata02, bigdata03, and bigdata04, were deleted more than once. Some failures were caused by badly written configuration files, some by data left over from earlier attempts, and some by ZooKeeper service issues.