Hadoop (pseudo-distributed mode) on FreeBSD on a Let's note

http://d.hatena.ne.jp/tullio/20120915/1347724450

Pseudo-distributed mode is also easy to set up.
It just works if you follow the instructions on the official Hadoop site.

# emacs -nw conf/core-site.xml
<configuration>
     <property>
         <name>fs.default.name</name>
         <value>hdfs://localhost:9000</value>
     </property>
</configuration>

# emacs -nw conf/hdfs-site.xml 
<configuration>
     <property>
         <name>dfs.replication</name>
         <value>1</value>
     </property>
</configuration>

# emacs -nw conf/mapred-site.xml
<configuration>
     <property>
         <name>mapred.job.tracker</name>
         <value>localhost:9001</value>
     </property>
</configuration>

# cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
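The append above assumes ~/.ssh/id_rsa.pub already exists. If it doesn't, a passphrase-less key can be generated first, as in the Hadoop quickstart; a minimal sketch (using the OpenSSH default key path):

```shell
# Generate a passphrase-less RSA key only if one does not already exist
test -f ~/.ssh/id_rsa || ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
# authorized_keys must not be group/world writable or sshd will ignore it
chmod 600 ~/.ssh/authorized_keys
# Verify that ssh to localhost works without a password prompt
ssh localhost echo ok
```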

> ./bin/hadoop namenode -format
INFO common.Storage: Storage directory /tmp/hadoop-foo/dfs/name has been successfully formatted.
INFO namenode.NameNode: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at www.example.com/1.1.1.1
************************************************************/
> ./bin/start-all.sh 
localhost: starting tasktracker, logging to /usr/local/lib/hadoop/bin/../logs/hadoop......out

> ./bin/hadoop fs -put conf input
> ./bin/hadoop jar hadoop-0.20.2-examples.jar grep input output 'dfs[a-z.]+'

INFO mapred.JobClient:  map 100% reduce 0%
INFO mapred.JobClient: Task Id : attempt_201209141516_0003_m_000000_0, Status : FAILED
Too many fetch-failures
WARN mapred.JobClient: Error reading task outputNo route to host

Hm? The reduce phase seemed unusually slow, and then an error appeared.

No route to host

This was because the IP address obtained via DHCP differed from the address written in /etc/hosts.
It's surprising that no other errors had shown up before this...

The address obtained via DHCP:

# /sbin/ifconfig em0| grep inet
        inet 1.1.1.1 netmask 0xffffff00 broadcast 1.1.1.255

The address written manually in /etc/hosts:

# grep www.example.com /etc/hosts
1.1.1.2 www.example.com

Delete that line from /etc/hosts.

# emacs -nw /etc/hosts
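Before rerunning, it's worth confirming that the hostname now resolves to the same address the interface actually has (the hostname and interface name here are the ones from this example):

```shell
# Address the hostname resolves to -- should now come from DNS,
# not from the stale /etc/hosts entry
getent hosts www.example.com
# Address actually configured on the interface
/sbin/ifconfig em0 | grep inet
```

The two addresses should match; if they don't, Hadoop's daemons will advertise one address while listening on another, producing errors like "No route to host".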

Then rerun.

> ./bin/hadoop jar hadoop-0.20.2-examples.jar grep input output 'dfs[a-z.]+'
> ./bin/hadoop fs -get output output 

> cat output/part-00000 
3       dfs.class
2       dfs.period
1       dfs.file
1       dfs.replication
1       dfs.servers
1       dfsadmin
1       dfsmetrics.log

OK!
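As an aside, what the Hadoop grep example computes here can be approximated locally with plain grep over the same conf/ directory (an illustration of the job's semantics, not the actual MapReduce execution):

```shell
# Count each distinct string matching dfs[a-z.]+ across the conf files,
# most frequent first -- roughly the job's part-00000 output
grep -Eho 'dfs[a-z.]+' conf/* | sort | uniq -c | sort -rn
```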