1. Preparation
This article builds on the previous one: https://segmentfault.com/a/11...
1.1 Changes and additions
# Change the hostname
hostnamectl set-hostname hadoop104
# Configure passwordless SSH login
## Remove any existing SSH data
[admin@hadoop104 ~]$ cd ~/.ssh
[admin@hadoop104 .ssh]$ rm -rf *
## Generate the key pair without a passphrase (just press Enter three times)
[admin@hadoop104 .ssh]$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/admin/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/admin/.ssh/id_rsa.
Your public key has been saved in /home/admin/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:kjL6k939tz4wgrdYIlA7/r5EgzGVJ12YlQB7BfkNMSU admin@hadoop104
The key's randomart image is:
+---[RSA 2048]----+
| o+oOE+. |
| ..o.*.oo |
| .o..o.. o |
| . o= . . . |
| oo+.S. |
| . oooo.+ o |
| . o +.* o o |
| .o ..+ o o |
| .. .o. ..ooo |
+----[SHA256]-----+
[admin@hadoop104 .ssh]$ ll
total 8
-rw------- 1 admin admin 1675 Apr  2 21:26 id_rsa      # id_rsa is the private key
-rw-r--r-- 1 admin admin  397 Apr  2 21:26 id_rsa.pub  # id_rsa.pub is the public key
## Copy the public key to hadoop104
[admin@hadoop104 .ssh]$ ssh-copy-id hadoop104
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/admin/.ssh/id_rsa.pub"
The authenticity of host 'hadoop104 (192.168.119.104)' can't be established.
ECDSA key fingerprint is SHA256:X25gXFFr2vsKVxn7LLOpQtYBb1OHOmRGj9XmJpQQ9Vs.
ECDSA key fingerprint is MD5:d6:55:be:36:9b:b6:33:f7:4d:75:5a:c5:40:89:a1:7c.
Are you sure you want to continue connecting (yes/no)? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
admin@hadoop104's password:
Number of key(s) added: 1
Now try logging into the machine, with: "ssh 'hadoop104'"
and check to make sure that only the key(s) you wanted were added.
## Passwordless login now works
[admin@hadoop104 .ssh]$ ssh hadoop104
Last login: Thu Apr 4 10:48:25 2019 from hadoop104
This configuration is for section 2.1, "Running a MapReduce job on HDFS"; once it is in place, proceed to section 2.1.
# core-site.xml (the <property> blocks below go inside the file's <configuration> root element)
vi /opt/module/hadoop-3.1.1/etc/hadoop/core-site.xml
<!-- Address of the HDFS NameNode -->
<property>
<name>fs.defaultFS</name>
<value>hdfs://hadoop104:9000</value>
</property>
<!-- Directory for files generated at Hadoop runtime -->
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/module/hadoop-3.1.1/data/tmp</value>
</property>
# hdfs-site.xml
vi /opt/module/hadoop-3.1.1/etc/hadoop/hdfs-site.xml
<!-- Number of HDFS replicas (1 is enough for a single-node setup) -->
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
This configuration is for section 2.2, "Running a MapReduce job on YARN"; once it is in place, proceed to section 2.2.
# Configure yarn-site.xml
<!-- How reducers fetch data -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<!-- Hostname of the YARN ResourceManager -->
<property>
<name>yarn.resourcemanager.hostname</name>
<value>hadoop104</value>
</property>
# Configure mapred-site.xml
<!-- Run MapReduce on YARN -->
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
2. Hands-on operation
2.1 Running a MapReduce job on HDFS
# Format the NameNode (only on first startup; do not reformat repeatedly)
[admin@hadoop104 hadoop-3.1.1]$ bin/hdfs namenode -format
# Start HDFS
[admin@hadoop104 hadoop-3.1.1]$ sbin/start-dfs.sh
# Check the daemons with jps
[admin@hadoop104 hadoop-3.1.1]$ jps
14448 NameNode
14769 SecondaryNameNode
14571 DataNode
14892 Jps
# View the HDFS web UI in a browser; in Hadoop 3.x the NameNode web port changed from 50070 to 9870
http://192.168.119.104:9870/dfshealth.html#tab-overview or
http://hadoop104:9870/dfshealth.html#tab-overview (a local Windows machine needs the hostname in its hosts file)
# Create an input directory on HDFS
[admin@hadoop104 hadoop-3.1.1]$ bin/hdfs dfs -mkdir -p /user/qianxkun/mapreduce/wordcount/input
# Upload the test file to HDFS
[admin@hadoop104 hadoop-3.1.1]$ bin/hdfs dfs -put wcinput/wc.input /user/qianxkun/mapreduce/wordcount/input/
# Check the uploaded file
[admin@hadoop104 hadoop-3.1.1]$ bin/hdfs dfs -ls /user/qianxkun/mapreduce/wordcount/input/
Found 1 items
-rw-r--r-- 1 admin supergroup 47 2019-04-04 16:07 /user/qianxkun/mapreduce/wordcount/input/wc.input
[admin@hadoop104 hadoop-3.1.1]$ bin/hdfs dfs -cat /user/qianxkun/mapreduce/wordcount/input/wc.input
hadoop yarn
hadoop mapreduce
qianxkun
qianxkun
# Run the MapReduce wordcount example on the HDFS input
[admin@hadoop104 hadoop-3.1.1]$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.1.jar wordcount /user/qianxkun/mapreduce/wordcount/input/ /user/qianxkun/mapreduce/wordcount/output
View the output:
- From the command line
[admin@hadoop104 hadoop-3.1.1]$ bin/hdfs dfs -cat /user/qianxkun/mapreduce/wordcount/output/*
hadoop 2
mapreduce 1
qianxkun 2
yarn 1
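For reference, the output above is a plain word-frequency count. A minimal Python sketch reproducing what the wordcount example computes over wc.input:

```python
from collections import Counter

# Contents of wc.input as uploaded above
wc_input = """hadoop yarn
hadoop mapreduce
qianxkun
qianxkun
"""

# The wordcount job reduced to its essence: tokenize on whitespace, then count
counts = Counter(wc_input.split())
for word in sorted(counts):
    print(word, counts[word])
# prints: hadoop 2 / mapreduce 1 / qianxkun 2 / yarn 1
```

This mirrors the map (tokenize) and reduce (sum per key) phases of the example job, just without the distribution.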
- From the browser
# Download the result file to the local filesystem
[admin@hadoop104 hadoop-3.1.1]$ hadoop fs -get /user/qianxkun/mapreduce/wordcount/output/part-r-00000 ./wcoutput/
# Delete the output directory
[admin@hadoop104 hadoop-3.1.1]$ hdfs dfs -rm -r /user/qianxkun/mapreduce/wordcount/output
2.2 Running a MapReduce job on YARN
# Start YARN
[admin@hadoop104 hadoop-3.1.1]$ sbin/start-yarn.sh
# Check the daemons with jps
[admin@hadoop104 hadoop-3.1.1]$ jps
14448 NameNode
14769 SecondaryNameNode
15939 ResourceManager
16374 Jps
14571 DataNode
16063 NodeManager
# View the YARN web UI in a browser
http://192.168.119.104:8088/cluster or
http://hadoop104:8088/cluster
# Delete the existing output directory on HDFS
[admin@hadoop104 hadoop-3.1.1]$ bin/hdfs dfs -rm -R /user/qianxkun/mapreduce/wordcount/output
Deleted /user/qianxkun/mapreduce/wordcount/output
# Run the MapReduce job
[admin@hadoop104 hadoop-3.1.1]$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.1.jar wordcount /user/qianxkun/mapreduce/wordcount/input /user/qianxkun/mapreduce/wordcount/output
## This fails: "Error: Could not find or load main class org.apache.hadoop.mapreduce.v2.app.MRAppMaster"
## Fix
### Stop YARN
[admin@hadoop104 hadoop-3.1.1]$ sbin/stop-yarn.sh
### Add the output of `hadoop classpath` to yarn-site.xml
[admin@hadoop104 hadoop-3.1.1]$ hadoop classpath
/opt/module/hadoop-3.1.1/etc/hadoop:/opt/module/hadoop-3.1.1/share/hadoop/common/lib/*:/opt/module/hadoop-3.1.1/share/hadoop/common/*:/opt/module/hadoop-3.1.1/share/hadoop/hdfs:/opt/module/hadoop-3.1.1/share/hadoop/hdfs/lib/*:/opt/module/hadoop-3.1.1/share/hadoop/hdfs/*:/opt/module/hadoop-3.1.1/share/hadoop/mapreduce/lib/*:/opt/module/hadoop-3.1.1/share/hadoop/mapreduce/*:/opt/module/hadoop-3.1.1/share/hadoop/yarn:/opt/module/hadoop-3.1.1/share/hadoop/yarn/lib/*:/opt/module/hadoop-3.1.1/share/hadoop/yarn/*
[admin@hadoop104 hadoop-3.1.1]$ vi etc/hadoop/yarn-site.xml
<property>
<name>yarn.application.classpath</name>
<value>
/opt/module/hadoop-3.1.1/etc/hadoop:/opt/module/hadoop-3.1.1/share/hadoop/common/lib/*:/opt/module/hadoop-3.1.1/share/hadoop/common/*:/opt/module/hadoop-3.1.1/share/hadoop/hdfs:/opt/module/hadoop-3.1.1/share/hadoop/hdfs/lib/*:/opt/module/hadoop-3.1.1/share/hadoop/hdfs/*:/opt/module/hadoop-3.1.1/share/hadoop/mapreduce/lib/*:/opt/module/hadoop-3.1.1/share/hadoop/mapreduce/*:/opt/module/hadoop-3.1.1/share/hadoop/yarn:/opt/module/hadoop-3.1.1/share/hadoop/yarn/lib/*:/opt/module/hadoop-3.1.1/share/hadoop/yarn/*
</value>
</property>
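Rather than pasting the long classpath by hand, the `<property>` element can be generated from the output of `hadoop classpath`. A hypothetical Python helper sketching that step (the classpath string is abbreviated here for brevity):

```python
import xml.etree.ElementTree as ET

def classpath_property(classpath: str) -> str:
    """Wrap a `hadoop classpath` string in the yarn-site.xml property element."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = "yarn.application.classpath"
    ET.SubElement(prop, "value").text = classpath
    return ET.tostring(prop, encoding="unicode")

# On the cluster node you would capture the real value, e.g.:
#   classpath = subprocess.check_output(["hadoop", "classpath"], text=True).strip()
classpath = "/opt/module/hadoop-3.1.1/etc/hadoop:/opt/module/hadoop-3.1.1/share/hadoop/common/lib/*"
print(classpath_property(classpath))
```

The emitted element is then pasted inside the `<configuration>` block of yarn-site.xml, exactly as done manually above.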
## After restarting YARN, the job runs successfully
[admin@hadoop104 hadoop-3.1.1]$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.1.jar wordcount /user/qianxkun/mapreduce/wordcount/input /user/qianxkun/mapreduce/wordcount/output
# Check the result
[admin@hadoop104 hadoop-3.1.1]$ bin/hdfs dfs -cat /user/qianxkun/mapreduce/wordcount/output/*
hadoop 2
mapreduce 1
qianxkun 2
yarn 1
3. Configuring, starting, and viewing the history server
# Configure mapred-site.xml
<property>
<name>mapreduce.jobhistory.address</name>
<value>hadoop104:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>hadoop104:19888</value>
</property>
# Locate the history-server startup script:
[admin@hadoop104 hadoop-3.1.1]$ ls sbin/ |grep mr
mr-jobhistory-daemon.sh
# Start the history server (in Hadoop 3.x this script is deprecated in favor of `mapred --daemon start historyserver`, but it still works)
[admin@hadoop104 hadoop-3.1.1]$ sbin/mr-jobhistory-daemon.sh start historyserver
# Verify the history server is running
[admin@hadoop104 hadoop-3.1.1]$ jps
19442 SecondaryNameNode
19800 NodeManager
19257 DataNode
19146 NameNode
19692 ResourceManager
20142 Jps
18959 JobHistoryServer
# View the job history in a browser
http://192.168.119.104:19888/jobhistory or
http://hadoop104:19888/jobhistory
4. Log aggregation
# Configure yarn-site.xml
<!-- Enable log aggregation -->
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
<!-- Retain logs for 7 days -->
<property>
<name>yarn.log-aggregation.retain-seconds</name>
<value>604800</value>
</property>
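The retention value is simply seven days expressed in seconds; a quick check of the arithmetic:

```python
# yarn.log-aggregation.retain-seconds: 7 days converted to seconds
retain_seconds = 7 * 24 * 60 * 60
print(retain_seconds)  # 604800
```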
# Start HDFS, YARN, and the history server
[admin@hadoop104 hadoop-3.1.1]$ sbin/start-dfs.sh
[admin@hadoop104 hadoop-3.1.1]$ sbin/start-yarn.sh
[admin@hadoop104 hadoop-3.1.1]$ sbin/mr-jobhistory-daemon.sh start historyserver
# Delete the existing output directory on HDFS
[admin@hadoop104 hadoop-3.1.1]$ bin/hdfs dfs -rm -R /user/qianxkun/mapreduce/wordcount/output
# Run the wordcount job
[admin@hadoop104 hadoop-3.1.1]$ hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.1.jar wordcount /user/qianxkun/mapreduce/wordcount/input /user/qianxkun/mapreduce/wordcount/output
2019-04-04 18:23:59,294 INFO client.RMProxy: Connecting to ResourceManager at hadoop104/192.168.119.104:8032
2019-04-04 18:24:00,525 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/admin/.staging/job_1554373388828_0001
# View the aggregated logs via the history server
http://192.168.119.104:19888/jobhistory or
http://hadoop104:19888/jobhistory