Pseudo-Distributed Hadoop Environment Setup: Lab Report


Step 1: Virtual Machine Installation and Operation

(Figure: 5.png)
Create a new virtual machine and select the Ubuntu 16.04 image; the defaults can be accepted at every subsequent step.
(Figure: 6.png)

Since VMware Tools was installed earlier, the Hadoop and JDK archives can simply be copied and pasted into the virtual machine.
(Figure: 7.png)

Step 2: Environment Setup

Java environment configuration

sudo mkdir /usr/java
sudo tar -zxvf jdk-8u221-linux-x64.tar.gz -C /usr/java/

sudo vim ~/.bashrc

Append at the end of the file:
(Figure: 8.png)
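The screenshot's exact contents aren't recoverable here, but given the extraction path above, the appended lines are typically of this form (the versioned directory name is assumed from the jdk-8u221 archive):

```shell
export JAVA_HOME=/usr/java/jdk1.8.0_221
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH
```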

source ~/.bashrc

sudo vim /etc/profile

Append at the end of the file:
(Figure: 9.png)

sudo vim /etc/environment

Append at the end of the file:
(Figure: 10.png)
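Again, the exact lines live in the screenshot. Note that /etc/environment is not a shell script: it takes plain KEY="value" assignments without `export`. A typical addition (JDK path assumed as above) would be:

```shell
JAVA_HOME="/usr/java/jdk1.8.0_221"
CLASSPATH=".:/usr/java/jdk1.8.0_221/lib"
```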

source /etc/environment

java -version

(Figure: 11.png)

SSH server configuration

Install and start the SSH server:

sudo apt-get install openssh-server
sudo /etc/init.d/ssh start

Generate an RSA key pair and authorize it:

ssh-keygen -t rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
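One pitfall worth noting here: sshd ignores authorized keys when `~/.ssh` or `authorized_keys` is writable by others, silently falling back to password prompts. Tightening permissions right after generating the key avoids this (a precaution, not shown in the original report):

```shell
mkdir -p ~/.ssh && touch ~/.ssh/authorized_keys   # no-ops if they already exist
chmod 700 ~/.ssh                                  # only the owner may enter the directory
chmod 600 ~/.ssh/authorized_keys                  # only the owner may read/write the key list
```

After this, `ssh localhost` should log in without asking for a password.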

Disable the firewall:

sudo ufw disable

Connection test:
(Figure: 12.png)

Install Hadoop

Extract:

sudo tar zxvf hadoop-2.7.4.tar.gz -C /usr/local
cd /usr/local
sudo mv hadoop-2.7.4 hadoop
sudo chmod 777 -R /usr/local/hadoop

Configuration

sudo vim ~/.bashrc

Append at the end:
(Figure: 13.png)
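The screenshot isn't legible here; for the install path used above, the appended lines are typically of this form (paths assumed from the extraction step):

```shell
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
```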

source ~/.bashrc

sudo vim /usr/local/hadoop/etc/hadoop/hadoop-env.sh

Append at the end:
(Figure: 14.png)
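hadoop-env.sh mainly needs a hard-coded JAVA_HOME, since Hadoop's scripts don't always inherit it from the login shell. The appended line is presumably:

```shell
export JAVA_HOME=/usr/java/jdk1.8.0_221   # path assumed from the JDK install above
```

yarn-env.sh (edited next) typically receives the same line.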

sudo vim /usr/local/hadoop/etc/hadoop/yarn-env.sh

Append at the end:
(Figure: 15.png)

sudo vim /usr/local/hadoop/etc/hadoop/core-site.xml

Make the <configuration> section read:

<configuration>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/usr/local/hadoop/tmp</value>
        <description>Abase for other temporary directories.</description>
    </property>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>
sudo vim /usr/local/hadoop/etc/hadoop/hdfs-site.xml

Make the <configuration> section read:

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/usr/local/hadoop/tmp/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/usr/local/hadoop/tmp/dfs/data</value>
    </property>
</configuration>
sudo vim /usr/local/hadoop/etc/hadoop/yarn-site.xml

Make the <configuration> section read:

<configuration>
    <!-- Site specific YARN configuration properties -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>127.0.0.1:8032</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>127.0.0.1:8030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>127.0.0.1:8031</value>
    </property>
</configuration>

After rebooting, run:

hadoop version

(Figure: 16.png)

Start the cluster:

hdfs namenode -format
cd /usr/local/hadoop/sbin
./start-all.sh
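A quick way to confirm the start succeeded is `jps`: in pseudo-distributed mode, five Hadoop daemons should be listed. The sketch below scans a captured listing for each expected daemon (the sample output is illustrative, not taken from the original report):

```shell
# Sample `jps` listing from a healthy pseudo-distributed node (PIDs illustrative).
sample='12001 NameNode
12189 DataNode
12402 SecondaryNameNode
12560 ResourceManager
12703 NodeManager
12950 Jps'

# Every expected daemon must appear in the listing.
ok=1
for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    printf '%s\n' "$sample" | grep -qw "$daemon" || { echo "missing: $daemon"; ok=0; }
done
[ "$ok" -eq 1 ] && echo "all daemons running"
```

If any daemon is missing, its log under /usr/local/hadoop/logs is the place to look.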

Visit ports 8088 (YARN ResourceManager web UI) and 50070 (HDFS NameNode web UI) on localhost:
(Figure: 17.png)
(Figure: 18.png)

Step 3: Platform Operations

Create a directory:

hadoop fs -mkdir /tmp/input

Upload a local file:

hadoop fs -put exp.py /tmp/input

(Figure: 19.png)
View the file:

hadoop fs -cat /tmp/input/exp.py

(Figure: 20.png)

References

https://blog.csdn.net/kh896424665/article/details/78765175
https://www.cnblogs.com/biehongli/p/7026809.html