step1 VM installation and setup
Create a new VM, select the Ubuntu 16 image, and click through the defaults to finish the installation.
Since VMware Tools was installed earlier, the Hadoop and JDK archives can be copied and pasted straight into the VM.
step2 Environment setup
Java environment configuration
```bash
sudo mkdir /usr/java
sudo tar -zxvf jdk-8u221-linux-x64.tar.gz -C /usr/java/
```
```bash
sudo vim /etc/environment
```
Append at the last line:
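The exact lines did not survive in this page; a minimal sketch, assuming the JDK unpacked to /usr/java/jdk1.8.0_221 and that the Hadoop bin/sbin directories (installed to /usr/local/hadoop below) are also added so the bare hadoop and hdfs commands used later work. Note that /etc/environment is a plain key=value file read by pam_env, so it takes literal paths, not shell syntax:

```bash
# /etc/environment (assumed content): plain KEY=value lines, no 'export', no $VAR expansion
JAVA_HOME="/usr/java/jdk1.8.0_221"
PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/java/jdk1.8.0_221/bin:/usr/local/hadoop/bin:/usr/local/hadoop/sbin"
```

Log out and back in (or reboot, as this guide does later) for the new variables to take effect.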
ssh-server configuration
Install and start the ssh-server
```bash
sudo apt-get install openssh-server
sudo /etc/init.d/ssh start
```
Generate an RSA key pair and authorize it
```bash
ssh-keygen -t rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
```
Disable the firewall
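The original post does not show the command; on Ubuntu 16 the usual way is ufw (an assumption, adjust if you manage iptables directly):

```bash
# Disable Ubuntu's default firewall so the Hadoop ports are reachable
sudo ufw disable
```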
Connection test
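A quick sketch of the test (not shown in the original): SSH into the local machine; with the key authorized above it should log in without a password.

```bash
ssh localhost   # should not prompt for a password
exit
```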
Install Hadoop
Extract
```bash
sudo tar zxvf hadoop-2.7.4.tar.gz -C /usr/local
cd /usr/local
sudo mv hadoop-2.7.4 hadoop
sudo chmod -R 777 /usr/local/hadoop
```
Configuration
Edit hadoop-env.sh:

```bash
sudo vim /usr/local/hadoop/etc/hadoop/hadoop-env.sh
```

Append at the end:
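The appended line itself was lost from this page; for hadoop-env.sh it is normally the JAVA_HOME export, with the path assumed from the JDK step above:

```bash
# Point Hadoop at the JDK installed earlier (path is an assumption)
export JAVA_HOME=/usr/java/jdk1.8.0_221
```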
Edit yarn-env.sh:

```bash
sudo vim /usr/local/hadoop/etc/hadoop/yarn-env.sh
```

Append at the end:
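Again the appended line is missing from the page; yarn-env.sh typically gets the same export:

```bash
# Same assumed JDK path as in hadoop-env.sh
export JAVA_HOME=/usr/java/jdk1.8.0_221
```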
Edit core-site.xml:

```bash
sudo vim /usr/local/hadoop/etc/hadoop/core-site.xml
```
Add inside the configuration tag:
```xml
<configuration>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/usr/local/hadoop/tmp</value>
        <description>A base for other temporary directories.</description>
    </property>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>
```
Edit hdfs-site.xml:

```bash
sudo vim /usr/local/hadoop/etc/hadoop/hdfs-site.xml
```
Add inside the configuration tag:
```xml
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/usr/local/hadoop/tmp/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/usr/local/hadoop/tmp/dfs/data</value>
    </property>
</configuration>
```
Edit yarn-site.xml:

```bash
sudo vim /usr/local/hadoop/etc/hadoop/yarn-site.xml
```
Add inside the configuration tag:
```xml
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>127.0.0.1:8032</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>127.0.0.1:8030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>127.0.0.1:8031</value>
    </property>
</configuration>
```
Startup
After rebooting, run:
```bash
hdfs namenode -format
cd /usr/local/hadoop/sbin
./start-all.sh
```

Then visit ports 8088 (YARN ResourceManager web UI) and 50070 (HDFS NameNode web UI) on localhost.
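Before opening the web UIs you can check that all the daemons came up with jps (expected process names for this single-node setup; PIDs will differ):

```bash
jps
# Expect: NameNode, DataNode, SecondaryNameNode, ResourceManager, NodeManager, Jps
```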
step3 Platform operations
Create a directory
```bash
# -p creates parent directories if they do not exist yet
hadoop fs -mkdir -p /tmp/input
```
Upload a local file
```bash
hadoop fs -put exp.py /tmp/input
```
View the file
```bash
hadoop fs -cat /tmp/input/exp.py
```
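To confirm the upload you can also list the directory (standard HDFS shell usage, not shown in the original):

```bash
hadoop fs -ls /tmp/input
```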