Installing the full CDH4 stack on CentOS

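I installed most of the stack, but writing it all up at once would get long, and verifying each step as I go is a lot of work, so for now this post covers only the core (Hadoop itself).
I simply followed the documentation.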

$ wget http://archive.cloudera.com/cdh4/cdh/4/hadoop-2.0.0-cdh4.2.1.tar.gz
$ sudo su
# cd /usr/local
# tar xvf ~bob/hadoop-2.0.0-cdh4.2.1.tar.gz
# chown -R hadoop hadoop-2.0.0-cdh4.2.1
# ln -s hadoop-2.0.0-cdh4.2.1 hadoop
# exit
$ sudo su hadoop
# cd /usr/local/hadoop
# export JAVA_HOME=/usr/local/jdk
# export HADOOP_COMMON_HOME=/usr/local/hadoop/share/hadoop/common
# export HADOOP_HDFS_HOME=/usr/local/hadoop/share/hadoop/hdfs
# export HADOOP_MAPRED_HOME=/usr/local/hadoop/share/hadoop/mapreduce
# export YARN_HOME=/usr/local/hadoop/share/hadoop/yarn
# export HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop
# emacs etc/hadoop/core-site.xml

  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:8020/</value>
    <description>
      The name of the default file system. A URI whose scheme and authority determine the FileSystem implementation.
      The uri's scheme determines the config property (fs.SCHEME.impl) naming the FileSystem implementation class.
      The uri's authority is used to determine the host, port, etc. for a filesystem.
    </description>
  </property>
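
The property above is only a fragment: in a Hadoop *-site.xml file, properties sit inside a single <configuration> root element. Below is a minimal sketch of generating a complete file. The function name write_core_site and the idea of writing into a scratch directory are my own assumptions, chosen so that an existing configuration is not overwritten by accident.

```shell
#!/bin/sh
# Hypothetical helper: write a minimal, complete core-site.xml into directory $1.
# $2 is the fs.defaultFS URI; it defaults to the value used in this post.
write_core_site() {
  dir=$1
  uri=${2:-hdfs://localhost:8020/}
  mkdir -p "$dir" || return 1
  # Unquoted heredoc delimiter so that $uri is expanded.
  cat > "$dir/core-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>$uri</value>
  </property>
</configuration>
EOF
}
```

For example, `write_core_site /tmp/conf-draft` produces /tmp/conf-draft/core-site.xml, which can be reviewed and then copied into etc/hadoop by hand.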

# emacs etc/hadoop/mapred-site.xml

  <property>
    <name>mapreduce.jobtracker.address</name>
    <value>0.0.0.0:8032</value>
    <description>
      The host and port that the MapReduce job tracker runs at.
      If "local", then jobs are run in-process as a single map and reduce task.
    </description>
  </property>

  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
    <description>
      The runtime framework for executing MapReduce jobs. Can be one of local, classic or yarn.
    </description>
  </property>
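
After editing several files by hand, it can help to read a value back and confirm what is actually in the file. The following is a rough sketch using awk. get_prop is a hypothetical helper, not a Hadoop tool, and it assumes that each <name> and <value> element fits on one line, as in the fragments in this post.

```shell
#!/bin/sh
# Hypothetical helper: print the value of property $1 from the *-site.xml file $2.
# Assumes <name>...</name> and <value>...</value> each sit on a single line
# and that the name comes before the value inside each <property>.
get_prop() {
  awk -v want="$1" '
    /<name>/  { n = $0; sub(/.*<name>/, "", n); sub(/<\/name>.*/, "", n) }
    /<value>/ { v = $0; sub(/.*<value>/, "", v); sub(/<\/value>.*/, "", v)
                if (n == want) { print v; exit } }
  ' "$2"
}
```

For example, `get_prop mapreduce.framework.name etc/hadoop/mapred-site.xml` should print the framework name configured above. Nothing is printed when the property is absent.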

# emacs etc/hadoop/yarn-site.xml

  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce.shuffle</value>
    <description>

    </description>
  </property>


  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    <description>

    </description>
  </property>

  <property>
    <name>yarn.application.classpath</name>
    <value>$HADOOP_CONF_DIR,$HADOOP_COMMON_HOME/share/hadoop/common/*,$HADOOP_COMMON_HOME/share/hadoop/common/lib/*,$HADOOP_HDFS_HOME/share/hadoop/hdfs/*,$HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*,$YARN_HOME/share/hadoop/yarn/*,$YARN_HOME/share/hadoop/yarn/lib/*,$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*,$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*,/usr/local/bob/lib/*</value>
    <description>
     CLASSPATH for YARN applications. A comma-separated list of CLASSPATH entries
    </description>
  </property>
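
Since the classpath value depends on the environment variables exported earlier, it seems worth checking which entries actually resolve to a directory. With the exports used in this post, `$HADOOP_COMMON_HOME/share/hadoop/common/*` would expand to `/usr/local/hadoop/share/hadoop/common/share/hadoop/common/*`, which may or may not be what was intended. The sketch below is a hypothetical helper (check_classpath is my own name) that expands each comma-separated entry and reports whether its directory exists. It uses eval, so it should only be given a value you trust.

```shell
#!/bin/sh
# Hypothetical helper: $1 is a comma-separated classpath that uses $VAR references.
# Prints one line per entry: "ok" if the directory exists, "missing" otherwise.
check_classpath() {
  printf '%s\n' "$1" | tr ',' '\n' | while IFS= read -r entry; do
    # Expand $VAR references; the double quotes keep a trailing /* literal.
    expanded=$(eval "printf '%s' \"$entry\"")
    # Strip a trailing /* to get the directory that must exist.
    dir=${expanded%/\*}
    if [ -d "$dir" ]; then
      printf 'ok      %s\n' "$expanded"
    else
      printf 'missing %s\n' "$expanded"
    fi
  done
}
```

Pass the value in single quotes so that the shell does not expand it before the function sees it, for example `check_classpath '$HADOOP_CONF_DIR,$YARN_HOME/share/hadoop/yarn/*'`.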

  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>2048</value>
  </property>

# mkdir -pv /tmp/hadoop-hadoop/dfs/name
# ./bin/hdfs namenode -format hadoop
# ./sbin/hadoop-daemon.sh --script hdfs start namenode
# ./sbin/hadoop-daemon.sh --script hdfs start datanode
# ./sbin/yarn-daemon.sh start resourcemanager
# ./sbin/yarn-daemon.sh start nodemanager
# ./sbin/mr-jobhistory-daemon.sh start historyserver
# jps
18844 ResourceManager
18926 JobHistoryServer
18971 Jps
18761 NameNode
18798 DataNode
18884 NodeManager
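
Checking the jps listing by eye is fine for a single node, but the same check can be scripted. This is a small sketch that reads jps-style output on standard input and prints the daemons from this post that are not running. missing_daemons is a hypothetical name, and the list of five daemons is simply the set started above.

```shell
#!/bin/sh
# Hypothetical helper: read "PID Name" lines on stdin (as printed by jps)
# and print each expected daemon that does not appear.
missing_daemons() {
  out=$(cat)
  for d in NameNode DataNode ResourceManager NodeManager JobHistoryServer; do
    # awk exits 0 only if some line has the daemon name as its second field.
    printf '%s\n' "$out" |
      awk -v d="$d" '$2 == d { found = 1 } END { exit !found }' || echo "$d"
  done
}
```

Usage would be `jps | missing_daemons`. Empty output means all five daemons were found.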
# ./bin/hadoop fs -chmod 777 /home/hadoop/mr-history
# ./bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.0.0-cdh4.2.1.jar pi 10 5

Estimated value of Pi is 3.28000000000000000000
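
The estimate is coarse because `pi 10 5` asks for 10 map tasks with 5 sample points each, so only 50 points in total. As far as I understand the example job, it reports 4 × (points inside the circle) / (total points), in which case 3.28 corresponds to 41 of the 50 points. The 41 is my own back-calculation, not something the job printed. A tiny sketch of that arithmetic, with pi_from_counts as a hypothetical name:

```shell
#!/bin/sh
# Hypothetical helper: estimate = 4 * inside / total, printed to two decimals.
# $1: number of sample points inside the circle, $2: total number of points.
pi_from_counts() {
  awk -v inside="$1" -v total="$2" 'BEGIN { printf "%.2f\n", 4 * inside / total }'
}

pi_from_counts 41 50   # prints 3.28
```

Raising either argument of the pi example increases the number of points and should bring the estimate closer to 3.14.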


# ./sbin/hadoop-daemon.sh --script hdfs stop namenode
# ./sbin/hadoop-daemon.sh --script hdfs stop datanode
# ./sbin/yarn-daemon.sh stop resourcemanager
# ./sbin/yarn-daemon.sh stop nodemanager
# ./sbin/mr-jobhistory-daemon.sh stop historyserver
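
The start and stop sequences differ only in one word, so they can be folded into one function. What follows is a rough sketch, not something from the CDH documentation. hadoop_ctl, HADOOP_DIR and DRY_RUN are names I made up. By default it only prints the commands, and it runs them when DRY_RUN=0 is set. The daemon order is the one used in this post.

```shell
#!/bin/sh
# Hypothetical wrapper around the five daemon commands used in this post.
# Usage: hadoop_ctl start|stop
#   HADOOP_DIR  install directory (assumed default: /usr/local/hadoop)
#   DRY_RUN     1 (default) prints the commands, 0 executes them
hadoop_ctl() {
  action=$1
  base=${HADOOP_DIR:-/usr/local/hadoop}
  case $action in
    start|stop) ;;
    *) echo "usage: hadoop_ctl start|stop" >&2; return 1 ;;
  esac
  _run "$base/sbin/hadoop-daemon.sh" --script hdfs "$action" namenode
  _run "$base/sbin/hadoop-daemon.sh" --script hdfs "$action" datanode
  _run "$base/sbin/yarn-daemon.sh" "$action" resourcemanager
  _run "$base/sbin/yarn-daemon.sh" "$action" nodemanager
  _run "$base/sbin/mr-jobhistory-daemon.sh" "$action" historyserver
}

# Print the command in dry-run mode, otherwise execute it.
_run() {
  if [ "${DRY_RUN:-1}" = 1 ]; then
    echo "$*"
  else
    "$@"
  fi
}
```

`hadoop_ctl stop` shows what would be run, and `DRY_RUN=0 hadoop_ctl stop` actually stops the daemons.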

I have also installed HBase, Hive, Mahout, and Sqoop. CDH4.3 is already out, so I will try that this week as well, and I also plan to try installing CDH4 on FreeBSD.