Trying out the sample for Dempsy, real-time processing for Big Data
This is about Dempsy, whose documentation I skimmed a while back.
Streaming Processing for Big Data - なぜか数学者にはワイン好きが多い
Taking things in order, I ran it before Storm, which I had also been wanting to try.
According to the documentation:
Prerequisites
You will need Java 1.6 or higher.
There is no official Oracle Java for FreeBSD, though...
> java -version
openjdk version "1.6.0_32"
OpenJDK Runtime Environment (build 1.6.0_32-b25)
OpenJDK Client VM (build 20.0-b12, mixed mode)
I wonder whether OpenJDK will do...
To build an application against Dempsy you will need to add the Dempsy dependencies to your build. This should be as simple as including the following dependency in your maven pom.xml file (or the gradle equivalent).
Maven is needed too.
> mvn --version
Apache Maven 2.2.1 (r801777; 2009-08-07 04:16:01+0900)
Java version: 1.6.0_32
Java home: /usr/local/openjdk6/jre
Default locale: en, platform encoding: ISO8859-1
OS name: "freebsd" version: "9.0-release" arch: "i386" Family: "unix"
Next, fetch the Dempsy sample programs with git.
> git clone git://github.com/Dempsy/Dempsy-examples.git Dempsy-examples
Cloning into 'Dempsy-examples'...
remote: Counting objects: 216, done.
remote: Compressing objects: 100% (103/103), done.
remote: Total 216 (delta 47), reused 210 (delta 41)
Receiving objects: 100% (216/216), 1.56 MiB | 446 KiB/s, done.
Resolving deltas: 100% (47/47), done.
(Over https or http instead of the git protocol, the clone failed with error: Could not resolve host: github.com. I have not tracked down the cause yet.)
> cd Dempsy-examples
> mvn install
> ls -l userguide-wordcount/target/userguide-wordcount-1.0-SNAPSHOT.jar
-rw-r--r-- 1 joe wheel 1359200 Aug 5 19:26 userguide-wordcount/target/userguide-wordcount-1.0-SNAPSHOT.jar
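The build succeeds, and the dependency libraries, Spring included, should now have been downloaded under ~/.m2/.
Setting up the Java classpath against the Maven repository is a hassle, so I ran mkdir lib and placed under it only the minimum set of libraries needed, picked from what Maven fetched. They are listed below.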
> ls lib
commons-io-1.4.jar lib-dempsyimpl-0.7.jar metrics-ganglia-2.0.2.jar slf4j-log4j12-1.6.4.jar spring-context-3.0.6.RELEASE.jar
commons-logging-1.1.1.jar lib-dempsyspring-0.7.jar metrics-graphite-2.0.2.jar spring-aop-3.0.6.RELEASE.jar spring-core-3.0.6.RELEASE.jar
lib-dempsyapi-0.7.jar log4j-1.2.14.jar quartz-2.0.1.jar spring-asm-3.0.6.RELEASE.jar spring-expression-3.0.6.RELEASE.jar
lib-dempsycore-0.7.jar metrics-core-2.0.2.jar slf4j-api-1.6.4.jar spring-beans-3.0.6.RELEASE.jar
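For reference, the copy into lib could be scripted roughly as follows. This is only a sketch under a few assumptions: the local Maven repository sits at the default ~/.m2/repository, the script is run with a POSIX sh (not the shell behind the > prompt above), and collect_jars is a helper name made up for illustration. The jar names are the ones in the listing above.

```shell
#!/bin/sh
# Sketch: copy the jars listed above from the local Maven repository into ./lib.
# Assumption: the repository is at the default location ~/.m2/repository.

JARS="commons-io-1.4.jar commons-logging-1.1.1.jar
lib-dempsyapi-0.7.jar lib-dempsycore-0.7.jar
lib-dempsyimpl-0.7.jar lib-dempsyspring-0.7.jar
log4j-1.2.14.jar metrics-core-2.0.2.jar
metrics-ganglia-2.0.2.jar metrics-graphite-2.0.2.jar
quartz-2.0.1.jar slf4j-api-1.6.4.jar slf4j-log4j12-1.6.4.jar
spring-aop-3.0.6.RELEASE.jar spring-asm-3.0.6.RELEASE.jar
spring-beans-3.0.6.RELEASE.jar spring-context-3.0.6.RELEASE.jar
spring-core-3.0.6.RELEASE.jar spring-expression-3.0.6.RELEASE.jar"

# collect_jars REPO DEST (hypothetical helper):
# copy every jar named in $JARS from somewhere under REPO into DEST.
collect_jars() {
    repo=$1
    dest=$2
    mkdir -p "$dest"
    for jar in $JARS; do
        # Use the first file with this exact name found under the repository.
        found=$(find "$repo" -type f -name "$jar" | head -n 1)
        if [ -n "$found" ]; then
            cp "$found" "$dest/"
        else
            echo "not found: $jar" >&2
        fi
    done
}

# Run only when the assumed repository directory actually exists.
if [ -d "${HOME:-}/.m2/repository" ]; then
    collect_jars "$HOME/.m2/repository" lib
fi
```

Any jar reported as "not found" would have to be located by hand; the script does not try to resolve dependencies itself.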
With that in place, this much is enough to run it.
> cp userguide-wordcount/src/main/resources/WordCount.xml DempsyApplicationContext-WordCount.xml
> java -Dapplication=WordCount -cp .:./lib/\*:./userguide-wordcount/target/\* com.nokia.dempsy.spring.RunAppInVm
2012-08-08 21:12:54,356 [main] INFO ClassPathXmlApplicationContext - Refreshing org.springframework.context.support.ClassPathXmlApplicationContext@1df5a8f: startup date [Wed Aug 08 21:12:54 JST 2012]; root of context hierarchy
2012-08-08 21:12:54,434 [main] INFO XmlBeanDefinitionReader - Loading XML bean definitions from class path resource [DempsyApplicationContext-WordCount.xml]
2012-08-08 21:12:54,619 [main] INFO XmlBeanDefinitionReader - Loading XML bean definitions from class path resource [Dempsy-localVm.xml]
2012-08-08 21:12:54,723 [main] INFO DefaultListableBeanFactory - Pre-instantiating singletons in org.springframework.beans.factory.support.DefaultListableBeanFactory@1f48262: defining beans [com.nokia.dempsy.config.ApplicationDefinition#0,properties,localVMContainerClusterSessionFactory,Dempsy]; root of factory hierarchy
2012-08-08 21:12:55,006 [Adaptor - "com.nokia.dempsy.example.userguide.wordcount.WordAdaptor@17918f0" of type "com.nokia.dempsy.example.userguide.wordcount.WordAdaptor"] INFO Dempsy - Starting adaptor thread for "com.nokia.dempsy.example.userguide.wordcount.WordAdaptor@17918f0" of type "com.nokia.dempsy.example.userguide.wordcount.WordAdaptor"
2012-08-08 21:12:55,170 [main] INFO SimpleThreadPool - Job execution threads will use class loader of thread: main
2012-08-08 21:12:55,192 [main] INFO SchedulerSignalerImpl - Initialized Scheduler Signaller of type: class org.quartz.core.SchedulerSignalerImpl
2012-08-08 21:12:55,194 [main] INFO QuartzScheduler - Quartz Scheduler v.2.0.1 created.
2012-08-08 21:12:55,196 [main] INFO RAMJobStore - RAMJobStore initialized.
2012-08-08 21:12:55,202 [main] INFO QuartzScheduler - Scheduler meta-data: Quartz Scheduler (v2.0.1) 'DefaultQuartzScheduler' with instanceId 'NON_CLUSTERED'
  Scheduler class: 'org.quartz.core.QuartzScheduler' - running locally.
  NOT STARTED.
  Currently in standby mode.
  Number of jobs executed: 0
  Using thread pool 'org.quartz.simpl.SimpleThreadPool' - with 10 threads.
  Using job-store 'org.quartz.simpl.RAMJobStore' - which does not support persistence. and is not clustered.
2012-08-08 21:12:55,203 [main] INFO StdSchedulerFactory - Quartz scheduler 'DefaultQuartzScheduler' initialized from default resource file in Quartz package: 'quartz.properties'
2012-08-08 21:12:55,204 [main] INFO StdSchedulerFactory - Quartz scheduler version: 2.0.1
2012-08-08 21:12:55,219 [main] INFO QuartzScheduler - Scheduler DefaultQuartzScheduler_$_NON_CLUSTERED started.
2012-08-08 21:12:55,220 [main] INFO QuartzScheduler - Scheduler DefaultQuartzScheduler_$_NON_CLUSTERED started.
2012-08-08 21:12:57,342 [Timer-0] INFO UpdateChecker - New Quartz update(s) found: 2.1.4 [http://www.terracotta.org/kit/reflector?kitID=default&pageID=QuartzChangeLog]
2012-08-08 21:13:04,666 [Adaptor - "com.nokia.dempsy.example.userguide.wordcount.WordAdaptor@17918f0" of type "com.nokia.dempsy.example.userguide.wordcount.WordAdaptor"] INFO Dempsy - Adaptor thread for "com.nokia.dempsy.example.userguide.wordcount.WordAdaptor@17918f0" of type "com.nokia.dempsy.example.userguide.wordcount.WordAdaptor" is shutting down
And:8783
in:7738
he:5819
unto:5626
a:5147
with:3822
LORD:3319
will:2239
as:1774
him:1754
(snip)
For a valuable explanation in Japanese, I relied on the following article.
Dempsyアプリケーションを動作させてみます!(ローカル版 - Taste of Tech Topics
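The Adaptor simply reads the file at a leisurely pace and hands what it reads to the MessageProcessor. Compared with putting a file into Hadoop and then running a job over it, this made it very clear that the data is being processed as a stream.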
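To make that contrast a little more concrete, here is a pipe-based analogy in plain sh and awk. It is not Dempsy code and does not use the Dempsy API: emit_words and count_stream are names I made up to stand in for the adaptor role and the message-processor role, and DELAY is a made-up variable for slowing the reader down. The point is only that each word is counted as soon as it arrives, rather than after the whole file has been loaded.

```shell
#!/bin/sh
# Analogy only: plain sh and awk, not the Dempsy API.

# emit_words FILE (hypothetical, adaptor role):
# read FILE and emit one word per line, optionally pausing between words.
emit_words() {
    tr -s '[:space:]' '\n' < "$1" |
    while read -r word || [ -n "$word" ]; do
        if [ -n "$word" ]; then
            printf '%s\n' "$word"
        fi
        # DELAY (seconds) is a made-up knob; 0 means no pause.
        sleep "${DELAY:-0}"
    done
}

# count_stream (hypothetical, message-processor role):
# keep one counter per word and print the updated count for each word
# as it arrives on standard input.
count_stream() {
    awk '{ count[$1]++; print $1 ":" count[$1]; fflush() }'
}

# Small demonstration on a throwaway file.
sample=$(mktemp /tmp/wordsketch.XXXXXX)
printf 'the cat and the hat\n' > "$sample"
emit_words "$sample" | count_stream
rm -f "$sample"
```

Running it with DELAY=1 shows the running counts appearing one at a time while the file is still being read, which is roughly the behavior I saw from the sample, as opposed to a batch job that reports only at the end.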