Hadoop Pseudo-Distributed Deployment

2016-08-30

1. Configure core-site.xml and hdfs-site.xml:

etc/hadoop/core-site.xml:

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>

etc/hadoop/hdfs-site.xml:

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
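
A quick way to confirm that the two files say what you expect is to read the values back. The helper below is only a sketch: `get_prop` is a hypothetical name, and it relies on the one-tag-per-line layout shown above rather than on real XML parsing.

```shell
# get_prop FILE KEY: print the <value> that follows <name>KEY</name> in FILE.
# Hypothetical helper; assumes each tag sits on its own line, as in the files above.
get_prop() {
  awk -v key="$2" '
    index($0, "<name>" key "</name>") { found = 1; next }
    found && /<value>/ {
      gsub(/.*<value>|<\/value>.*/, "")
      print
      exit
    }
  ' "$1"
}

# Read the settings back when run from the Hadoop installation directory.
if [ -f etc/hadoop/core-site.xml ] && [ -f etc/hadoop/hdfs-site.xml ]; then
  get_prop etc/hadoop/core-site.xml fs.defaultFS
  get_prop etc/hadoop/hdfs-site.xml dfs.replication
fi
```

Hadoop can also report the effective value itself with `bin/hdfs getconf -confKey fs.defaultFS`, which takes built-in defaults into account.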

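2. Set up passwordless SSH

Omitted here. The start scripts launch each daemon over SSH, so ssh localhost needs to work without a password prompt.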
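
For readers who have not done this step before, one common approach is to create a key with an empty passphrase and authorize it for the local account. This is a sketch, not necessarily what the author ran: `setup_passwordless_ssh` is a hypothetical helper, and the RSA key type and the `~/.ssh/id_rsa` path are assumptions.

```shell
# setup_passwordless_ssh [DIR]: create a passphrase-less key in DIR (default
# ~/.ssh) if none exists, and authorize it for logins to this account.
# Hypothetical helper; the key type and file names are assumptions.
setup_passwordless_ssh() {
  ssh_dir="${1:-$HOME/.ssh}"
  mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir"
  [ -f "$ssh_dir/id_rsa" ] || ssh-keygen -q -t rsa -N '' -f "$ssh_dir/id_rsa"
  touch "$ssh_dir/authorized_keys"
  # Append the public key only once, so re-running the helper is harmless.
  grep -qxF "$(cat "$ssh_dir/id_rsa.pub")" "$ssh_dir/authorized_keys" ||
    cat "$ssh_dir/id_rsa.pub" >> "$ssh_dir/authorized_keys"
  chmod 600 "$ssh_dir/authorized_keys"
}

# Usage:
#   setup_passwordless_ssh
#   ssh localhost        # should now log in without prompting
```

The optional directory argument exists only so the helper can be tried against a scratch directory before touching the real `~/.ssh`.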

3. Format the filesystem

Format the filesystem:

$ bin/hdfs namenode -format

Start the NameNode and DataNode:

$ sbin/start-dfs.sh
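
To check that the daemons actually came up, the JDK's `jps` tool lists running Java processes. The helper below is a sketch; `missing_daemons` is a hypothetical name, and the daemon list in the usage line assumes a default Hadoop 2.x `start-dfs.sh`, which also starts a SecondaryNameNode.

```shell
# missing_daemons NAME...: read a `jps` listing on stdin and print each
# expected daemon name that does not appear in it as a whole word.
missing_daemons() {
  listing=$(cat)
  for name in "$@"; do
    printf '%s\n' "$listing" | grep -qw "$name" || printf '%s\n' "$name"
  done
}

# Usage, assuming a JDK is on PATH:
#   jps | missing_daemons NameNode DataNode SecondaryNameNode
```

Empty output means every expected daemon was found. If a name is printed, the corresponding log under the Hadoop logs directory is the usual place to look.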

4. Access the NameNode

The NameNode web UI listens on port 50070 by default (Hadoop 2.x; Hadoop 3.x uses 9870):

NameNode - http://localhost:50070/

5. Access HDFS with the dfs shell:

$ bin/hdfs dfs -mkdir /user
$ bin/hdfs dfs -mkdir /user/<username>
$ bin/hdfs dfs -ls /
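
The `<username>` placeholder above stands for the account that runs the commands, since relative HDFS paths resolve under `/user/<username>`. As a sketch of the same steps with the name filled in automatically, plus a small upload to confirm that writes work (`hdfs_home_dir` is a hypothetical helper, and `core-site.xml` is just an arbitrary test file):

```shell
# hdfs_home_dir [USER]: print the conventional HDFS home directory for USER
# (default: the current account).
hdfs_home_dir() {
  printf '/user/%s\n' "${1:-$(id -un)}"
}

# Run only from the Hadoop installation directory, where bin/hdfs exists.
if [ -x bin/hdfs ]; then
  home=$(hdfs_home_dir)
  bin/hdfs dfs -mkdir -p "$home"                         # creates /user as well
  bin/hdfs dfs -put -f etc/hadoop/core-site.xml "$home/" # overwrite if present
  bin/hdfs dfs -ls "$home"
fi
```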

6. Configure YARN

6.1 Configure etc/hadoop/mapred-site.xml and etc/hadoop/yarn-site.xml

etc/hadoop/mapred-site.xml:

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>

etc/hadoop/yarn-site.xml:

<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>

6.2 Start the ResourceManager and NodeManager

$ sbin/start-yarn.sh

6.3 Access the web UI on port 8088

ResourceManager - http://localhost:8088/

6.4 Run a MapReduce job

To write a MapReduce program, see the tutorial on the official site: https://hadoop.apache.org/docs/r1.2.1/mapred_tutorial.html#Purpose
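
Before writing your own program, it can help to run one of the examples bundled with the distribution to confirm that YARN accepts jobs. The sketch below assumes a Hadoop 2.x binary layout, where the examples jar lives under `share/hadoop/mapreduce/`, and that the `/user/<username>` directory from step 5 exists; `find_examples_jar` is a hypothetical helper.

```shell
# find_examples_jar [HADOOP_HOME]: print the path of the bundled examples jar,
# or return non-zero if none is found. The directory layout is an assumption.
find_examples_jar() {
  for jar in "${1:-.}"/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar; do
    if [ -f "$jar" ]; then
      printf '%s\n' "$jar"
      return 0
    fi
  done
  return 1
}

# Run the bundled "grep" example from the Hadoop installation directory.
if jar=$(find_examples_jar .); then
  bin/hdfs dfs -put etc/hadoop input                 # copy sample input into HDFS
  bin/hadoop jar "$jar" grep input output 'dfs[a-z.]+'
  bin/hdfs dfs -cat 'output/*'                       # print the job's result files
fi
```

The job fails if the `output` directory already exists in HDFS, so remove it before running the example a second time.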

6.5 Stop YARN when finished

$ sbin/stop-yarn.sh

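7. Afterword

As I understand it, a pseudo-distributed deployment simply treats localhost as the slave node, so every daemon runs on the same machine.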
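
This shows up directly in the configuration: in Hadoop 2.x the start scripts read the list of worker hosts from `etc/hadoop/slaves` (renamed `workers` in 3.x), and in a default install that file contains only `localhost`. A small check, with `only_localhost` as a hypothetical helper name:

```shell
# only_localhost FILE: succeed when every non-blank, non-comment line in FILE
# is exactly "localhost".
only_localhost() {
  # Drop blank lines and comments, then look for any host other than localhost.
  if grep -v -e '^[[:space:]]*$' -e '^#' "$1" | grep -qvx 'localhost'; then
    return 1
  fi
  return 0
}

# Run from the Hadoop installation directory (the path assumes Hadoop 2.x).
if [ -f etc/hadoop/slaves ]; then
  if only_localhost etc/hadoop/slaves; then
    echo "pseudo-distributed: localhost is the only slave"
  else
    echo "other slave hosts are listed"
  fi
fi
```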

秦悦明的运维笔记 (Qin Yueming's Ops Notes)
