|
| 1 | +--- |
| 2 | +layout: default |
| 3 | +title: Creating a Production Storm Cluster |
| 4 | +--- |
| 5 | + |
| 6 | +# Creating a Production Storm Cluster |
| 7 | + |
| 8 | +[Storm](http://storm-project.net/) is a [free and open source](http://storm-project.net/about/free-and-open-source.html) distributed realtime computation system. Storm makes it easy to reliably process unbounded streams of data, doing for realtime processing what Hadoop did for batch processing. Storm is [simple](http://storm-project.net/about/simple-api.html), can be used with [any programming language](http://storm-project.net/about/multi-language.html), and is a lot of fun to use! |
| 9 | + |
| 10 | +This tutorial will help you set up a production Storm cluster from scratch. |
| 11 | + |
| 12 | +<div |
| 13 | +markdown="1" |
| 14 | +class="tutorial" |
| 15 | +data-author-github="Whitespace" |
| 16 | +data-license="http://creativecommons.org/licenses/by/3.0/" |
| 17 | +data-facets='{"Operating System": "Centos 6", "Zookeeper Version": "3.4.3", "ZeroMQ Version": "2.1.7"}'> |
| 18 | + |
| 19 | +## Assumptions |
| 20 | +We assume you have two machines that you can ssh into as the `deploy` user, and that this user has `sudo` privileges. We'll call these machines `storm` and `zookeeper`. If you only have one machine, that's fine: this tutorial handles that case as well. |
| 21 | + |
| 22 | +## Java |
| 23 | + |
| 24 | +We need to install the JDK (which includes the JRE). Oracle requires you accept the license agreement, so I prefer to download this locally and then `scp` the file to my host. To download the JDK, go to [the jdk download page](http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1637583.html), accept the license agreement, and download the file called `jdk-7u5-linux-x64.rpm`. Then copy it to the `deploy` user's home directory using `scp`: |
| 25 | + |
| 26 | +{% highlight sh %} |
| 27 | +scp -C jdk-7u5-linux-x64.rpm deploy@zookeeper:/home/deploy |
| 28 | +scp -C jdk-7u5-linux-x64.rpm deploy@storm:/home/deploy |
| 29 | +{% endhighlight %} |
| 30 | + |
| 31 | +Then we install it on each machine (and set up our `JAVA_HOME` and `PATH` so the shell can find the Java binaries): |
| 32 | + |
| 33 | +{% highlight sh %} |
| 34 | +ssh deploy@zookeeper |
| 35 | +sudo rpm -Uvh jdk-7u5-linux-x64.rpm |
| 36 | +echo 'export JAVA_HOME=/usr/java/default' >> ~/.bash_profile |
| 37 | +echo 'export PATH=$PATH:$JAVA_HOME/bin:$HOME/bin' >> ~/.bash_profile |
| 38 | +logout |
| 39 | + |
| 40 | +ssh deploy@storm |
| 41 | +sudo rpm -Uvh jdk-7u5-linux-x64.rpm |
| 42 | +echo 'export JAVA_HOME=/usr/java/default' >> ~/.bash_profile |
| 43 | +echo 'export PATH=$PATH:$JAVA_HOME/bin:$HOME/bin' >> ~/.bash_profile |
| 44 | +logout |
| 45 | +{% endhighlight %} |
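
Before moving on, it can help to confirm that each machine actually sees the JDK. The following is a minimal sketch rather than part of the original setup: the function name `check_java_home` is hypothetical, and it assumes the RPM placed the JDK under `/usr/java/default`, the path used for `JAVA_HOME` above.

```shell
# Hypothetical helper: verify that a JDK is reachable through JAVA_HOME.
# Returns 0 if <java_home>/bin/java exists and is executable, 1 otherwise.
check_java_home() {
  # Use the first argument, else $JAVA_HOME, else the path assumed in this tutorial.
  java_home="${1:-${JAVA_HOME:-/usr/java/default}}"
  if [ -x "$java_home/bin/java" ]; then
    echo "java found at $java_home/bin/java"
    return 0
  fi
  echo "no executable java under $java_home/bin" >&2
  return 1
}
```

After logging back in so that `~/.bash_profile` is read, running `check_java_home && java -version` on each machine should report the installed JDK.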
| 46 | + |
| 47 | +## Zookeeper |
| 48 | + |
| 49 | +### Installing Dependencies |
| 50 | + |
| 51 | +First we need to install some dependencies and set up a place to keep our source files: |
| 52 | + |
| 53 | +{% highlight sh %} |
| 54 | +ssh deploy@zookeeper |
| 55 | +mkdir -p ~/src |
| 56 | +sudo yum install -y libtool libuuid-devel gcc-c++ make |
| 57 | +{% endhighlight %} |
| 58 | + |
| 59 | +### Installing Zookeeper |
| 60 | + |
| 61 | +Now we're ready to download and unpack ZooKeeper: |
| 62 | + |
| 63 | +{% highlight sh %} |
| 64 | +cd ~/src |
| 65 | +wget http://mirrors.axint.net/apache/zookeeper/zookeeper-3.4.3/zookeeper-3.4.3.tar.gz |
| 66 | +tar xzf zookeeper-3.4.3.tar.gz |
| 67 | +{% endhighlight %} |
| 68 | + |
| 69 | +### Running Zookeeper |
| 70 | + |
| 71 | +ZooKeeper reads its configuration from a file in its `conf/` directory. The distribution ships a sample, `conf/zoo_sample.cfg`, which we'll pass as our config when we start the ZooKeeper server: |
| 72 | + |
| 73 | +{% highlight sh %} |
| 74 | +~/src/zookeeper-3.4.3/bin/zkServer.sh start ~/src/zookeeper-3.4.3/conf/zoo_sample.cfg |
| 75 | +{% endhighlight %} |
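
To confirm the server came up, one option is ZooKeeper's `ruok` four-letter command, which a healthy server answers with `imok`. The sketch below is an illustration, not part of the original tutorial: the function name `zk_ruok` is hypothetical, it assumes `nc` (netcat) is installed on the machine you run it from, and it assumes the client port is 2181, the value set in `zoo_sample.cfg`.

```shell
# Hypothetical helper: ask a ZooKeeper server whether it is healthy.
# Prints "imok" and returns 0 when the server answers; returns 1 otherwise.
zk_ruok() {
  host="${1:-zookeeper}"   # hostname from this tutorial's assumptions
  port="${2:-2181}"        # clientPort from zoo_sample.cfg (assumed unchanged)
  # Send the four-letter command; treat any connection failure as an empty reply.
  reply="$(echo ruok | nc "$host" "$port" 2>/dev/null)" || reply=""
  if [ "$reply" = "imok" ]; then
    echo "imok"
    return 0
  fi
  echo "no healthy reply from $host:$port" >&2
  return 1
}
```

Calling `zk_ruok zookeeper` from the `storm` machine also checks that the two hosts can reach each other on the client port.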
| 76 | + |
| 77 | +## Storm |
| 78 | + |
| 79 | +Now that ZooKeeper is running, we can set up our Storm servers. First we need to [install native dependencies](https://github.com/nathanmarz/storm/wiki/Installing-native-dependencies): |
| 80 | + |
| 81 | +### Installing Dependencies |
| 82 | + |
| 83 | +{% highlight sh %} |
| 84 | +ssh deploy@storm |
| 85 | +sudo yum install -y git libtool libuuid-devel gcc-c++ make |
| 86 | +mkdir -p ~/src /tmp/storm |
| 87 | +{% endhighlight %} |
| 88 | + |
| 89 | +### Installing ZeroMQ |
| 90 | + |
| 91 | +{% highlight sh %} |
| 92 | +cd ~/src |
| 93 | +wget http://download.zeromq.org/zeromq-2.1.7.tar.gz |
| 94 | +tar xzf zeromq-2.1.7.tar.gz |
| 95 | +cd zeromq-2.1.7 |
| 96 | +./configure |
| 97 | +make |
| 98 | +sudo make install |
| 99 | +{% endhighlight %} |
| 100 | + |
| 101 | +### Installing JZMQ |
| 102 | + |
| 103 | +{% highlight sh %} |
| 104 | +cd ~/src |
| 105 | +git clone https://github.com/nathanmarz/jzmq.git |
| 106 | +cd jzmq |
| 107 | +./autogen.sh |
| 108 | +./configure |
| 109 | +make |
| 110 | +sudo make install |
| 111 | +{% endhighlight %} |
| 112 | + |
| 113 | +### Installing Storm |
| 114 | + |
| 115 | +{% highlight sh %} |
| 116 | +cd ~/src |
| 117 | +wget https://github.com/downloads/nathanmarz/storm/storm-0.7.0.zip |
| 118 | +unzip storm-0.7.0.zip |
| 119 | +{% endhighlight %} |
| 120 | + |
| 121 | +## Configuring Storm |
| 122 | + |
| 123 | +We need to point Storm to the correct ZooKeeper servers, as well as [set up other config options](https://github.com/nathanmarz/storm/wiki/Setting-up-a-Storm-cluster). For now, we'll just modify the `conf/storm.yaml` file to look like this: |
| 124 | + |
| 125 | +{% highlight yaml %} |
| 126 | +storm.local.dir: "/tmp/storm" |
| 127 | +storm.zookeeper.servers: |
| 128 | + - "zookeeper" |
| 129 | +nimbus.host: "localhost" |
| 130 | +{% endhighlight %} |
| 131 | + |
| 132 | +You'll want to put some real values in there depending on the size of your cluster, and you'll definitely want to change `/tmp/storm` to something more persistent, but this suffices for a demo. |
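
If you expect to change these values per environment, generating the file from a small script keeps the three settings in one place. This is a minimal sketch under stated assumptions: the function name `write_storm_yaml` is hypothetical, and it emits only the three keys shown in the config above.

```shell
# Hypothetical helper: print a storm.yaml for the given hosts.
# Usage: write_storm_yaml ZOOKEEPER_HOST NIMBUS_HOST [LOCAL_DIR]
write_storm_yaml() {
  zk_host="$1"
  nimbus_host="$2"
  local_dir="${3:-/tmp/storm}"   # the tutorial's demo default; pick a persistent path for real use
  cat <<EOF
storm.local.dir: "$local_dir"
storm.zookeeper.servers:
  - "$zk_host"
nimbus.host: "$nimbus_host"
EOF
}
```

For example, `write_storm_yaml zookeeper localhost > ~/src/storm-0.7.0/conf/storm.yaml` reproduces the demo config, and a third argument swaps in a persistent directory of your choosing.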
| 133 | + |
| 134 | +## Running Storm |
| 135 | + |
| 136 | +In order to run storm, you need both nimbus and supervisor processes running. We can run them with `nohup`: |
| 137 | + |
| 138 | +{% highlight sh %} |
| 139 | +nohup ~/src/storm-0.7.0/bin/storm nimbus & |
| 140 | +nohup ~/src/storm-0.7.0/bin/storm supervisor & |
| 141 | +{% endhighlight %} |
| 142 | + |
| 143 | +Now it should be ready to process topologies! |
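
Started this way, both daemons write to the same `nohup.out`, which can make problems hard to trace. One possible refinement, sketched here as an assumption rather than the tutorial's method, is a small wrapper that gives each daemon its own log and PID file. The function name `start_storm_daemon` and the `~/storm-logs` directory are hypothetical.

```shell
# Hypothetical helper: start one Storm daemon in the background with nohup,
# sending its output to a per-daemon log file instead of a shared nohup.out.
STORM_BIN="${STORM_BIN:-$HOME/src/storm-0.7.0/bin/storm}"
STORM_LOG_DIR="${STORM_LOG_DIR:-$HOME/storm-logs}"   # assumed location, not from the tutorial

start_storm_daemon() {
  name="$1"   # a storm subcommand such as nimbus or supervisor
  mkdir -p "$STORM_LOG_DIR"
  # Detach stdin and capture stdout and stderr in the daemon's own log file.
  nohup "$STORM_BIN" "$name" < /dev/null > "$STORM_LOG_DIR/$name.log" 2>&1 &
  # Record the background process ID so the daemon can be found later.
  echo "$!" > "$STORM_LOG_DIR/$name.pid"
  echo "started $name with pid $!"
}
```

With this in place, `start_storm_daemon nimbus` and `start_storm_daemon supervisor` replace the two `nohup` lines, and `tail -f ~/storm-logs/nimbus.log` follows a single daemon's output.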
| 144 | + |
| 145 | +## Running the Storm UI |
| 146 | + |
| 147 | +Storm comes with a nifty dashboard to view cluster stats. To run it, just do: |
| 148 | + |
| 149 | +{% highlight sh %} |
| 150 | +nohup ~/src/storm-0.7.0/bin/storm ui & |
| 151 | +{% endhighlight %} |
| 152 | +</div> |