storm
Apache Storm is a free and open source distributed realtime computation system.
GitHub repo: https://github.com/31z4/storm-docker
Library reference
This content is imported from the official Docker Library docs, and is provided by the original uploader. You can view the Docker Hub page for this image at https://hub.docker.com/images/storm
Supported tags and respective Dockerfile links
Quick reference
- Where to get help: the Docker Community Forums, the Docker Community Slack, or Stack Overflow
- Where to file issues: https://github.com/31z4/storm-docker/issues
- Maintained by: the Docker Community
- Supported architectures: (more info) amd64, arm32v6, arm64v8, i386, ppc64le, s390x
- Published image artifact details: repo-info repo’s repos/storm/ directory (history) (image metadata, transfer size, etc)
- Image updates: official-images PRs with label library/storm; official-images repo’s library/storm file (history)
- Source of this description: docs repo’s storm/ directory (history)
- Supported Docker versions: the latest release (down to 1.6 on a best-effort basis)
What is Apache Storm?
Apache Storm is a distributed computation framework written predominantly in the Clojure programming language. Originally created by Nathan Marz and team at BackType, the project was open sourced after being acquired by Twitter. It uses custom-created “spouts” and “bolts” to define information sources and manipulations, allowing batch, distributed processing of streaming data. The initial release was on 17 September 2011.
How to use this image
Running topologies in local mode
Assuming you have topology.jar in the current directory:
$ docker run -it -v $(pwd)/topology.jar:/topology.jar storm storm jar /topology.jar org.apache.storm.starter.ExclamationTopology
Setting up a minimal Storm cluster
- Apache ZooKeeper is required to run a Storm cluster, so start it first. Because ZooKeeper fails fast, it is best to run it with an automatic restart policy.
  $ docker run -d --restart always --name some-zookeeper zookeeper
- The Nimbus daemon must be able to connect to ZooKeeper. It is also a fail-fast system.
  $ docker run -d --restart always --name some-nimbus --link some-zookeeper:zookeeper storm storm nimbus
- Finally, start a single Supervisor node. It will talk to Nimbus and ZooKeeper.
  $ docker run -d --restart always --name supervisor --link some-zookeeper:zookeeper --link some-nimbus:nimbus storm storm supervisor
- Now you can submit a topology to the cluster.
  $ docker run --link some-nimbus:nimbus -it --rm -v $(pwd)/topology.jar:/topology.jar storm storm jar /topology.jar org.apache.storm.starter.WordCountTopology topology
- Optionally, you can start the Storm UI.
  $ docker run -d -p 8080:8080 --restart always --name ui --link some-nimbus:nimbus storm storm ui
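The --link flag used above is a legacy Docker feature. As a hedged alternative, the same three containers can be attached to a user-defined bridge network, where container names resolve through Docker's built-in DNS. The sketch below is not part of the image's documentation: the network name storm-net and the DRY_RUN switch are illustrative assumptions, and it relies on the default configuration looking for hosts named zookeeper and nimbus (as the --link aliases above suggest). By default it only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Sketch: start ZooKeeper, Nimbus and one Supervisor on a user-defined network.
# Assumptions (not from the image docs): network name "storm-net", DRY_RUN switch.

NETWORK="${NETWORK:-storm-net}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

start_cluster() {
  run docker network create "$NETWORK"
  # Container names double as DNS names on a user-defined network.
  run docker run -d --restart always --name zookeeper --network "$NETWORK" zookeeper
  run docker run -d --restart always --name nimbus --network "$NETWORK" storm storm nimbus
  run docker run -d --restart always --name supervisor --network "$NETWORK" storm storm supervisor
}

start_cluster
```

Naming the containers zookeeper and nimbus plays the role that the link aliases play in the commands above, so no extra -c options should be needed if the default host names apply.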
... via docker stack deploy or docker-compose
Example stack.yml for storm:
version: '3.1'

services:
  zookeeper:
    image: zookeeper
    container_name: zookeeper
    restart: always

  nimbus:
    image: storm
    container_name: nimbus
    command: storm nimbus
    depends_on:
      - zookeeper
    links:
      - zookeeper
    restart: always
    ports:
      - 6627:6627

  supervisor:
    image: storm
    container_name: supervisor
    command: storm supervisor
    depends_on:
      - nimbus
      - zookeeper
    links:
      - nimbus
      - zookeeper
    restart: always
Run docker stack deploy -c stack.yml storm (or docker-compose -f stack.yml up) and wait for it to initialize completely. The Nimbus will be available at http://swarm-ip:6627, http://localhost:6627, or http://host-ip:6627 (as appropriate).
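The example stack does not include the optional Storm UI. One hedged way to add it without editing stack.yml is a second Compose file that is merged on top of the first. The file name stack.ui.yml and the service name ui are illustrative choices; the command and port mapping mirror the docker run example for the UI earlier on this page.

```shell
# Sketch: write an override file that adds a Storm UI service to the stack.
# "stack.ui.yml" and the service name "ui" are illustrative, not from the image docs.
cat > stack.ui.yml <<'EOF'
version: '3.1'

services:
  ui:
    image: storm
    command: storm ui
    depends_on:
      - nimbus
    links:
      - nimbus
    restart: always
    ports:
      - 8080:8080
EOF

# Then pass both files so they are merged, for example:
#   docker-compose -f stack.yml -f stack.ui.yml up
```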
Configuration
This image uses the default Apache Storm configuration. There are two main ways to change it.
- Using command line arguments.
  $ docker run -d --restart always --name nimbus storm storm nimbus -c storm.zookeeper.servers='["zookeeper"]'
- Assuming you have storm.yaml in the current directory, you can mount it as a volume.
  $ docker run -it -v $(pwd)/storm.yaml:/conf/storm.yaml storm storm nimbus
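For reference, a minimal storm.yaml to mount this way might look like the sketch below. The keys are standard Storm settings, but the values are assumptions chosen to match the host names and directories used on this page (zookeeper, nimbus, /data, /logs); adjust them to your own setup.

```shell
# Sketch: create a minimal storm.yaml in the current directory.
# Values are assumptions matching the container names and paths used in this document.
cat > storm.yaml <<'EOF'
storm.zookeeper.servers:
  - "zookeeper"
nimbus.seeds:
  - "nimbus"
storm.local.dir: "/data"
storm.log.dir: "/logs"
EOF
```

A mounted file replaces the image's configuration file as a whole, so it should list every setting the cluster needs, not only the ones being changed.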
Logging
This image uses default logging configuration. All logs go to the /logs directory by default.
Data persistence
No data is persisted by default. For convenience, the image contains /data and /logs directories owned by the storm user. Mount volumes at these paths to persist data and logs.
$ docker run -it -v /logs -v /data storm storm nimbus
Please note that using paths other than these predefined ones is likely to cause permission denied errors, because for security reasons Storm runs as the non-root storm user.
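To keep data and logs across container removal, named volumes can be mounted at the predefined paths instead of anonymous ones. This is a sketch rather than part of the image's documentation: the volume names storm-data and storm-logs and the DRY_RUN switch are illustrative assumptions. By default the script only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Sketch: run Nimbus with named volumes mounted at the predefined /data and /logs paths.
# Volume names are illustrative; DRY_RUN=1 (the default) only prints the commands.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

nimbus_with_volumes() {
  run docker volume create storm-data
  run docker volume create storm-logs
  run docker run -d --name nimbus \
    -v storm-data:/data -v storm-logs:/logs \
    storm storm nimbus
}

nimbus_with_volumes
```

An empty named volume is normally initialized from the image's directory at the mount point, including its ownership, which is why mounting at /data and /logs should avoid the permission problems described above. Bind-mounting a host directory keeps the host's ownership instead.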
License
View license information for the software contained in this image.
As with all Docker images, these likely also contain other software which may be under other licenses (such as Bash, etc from the base distribution, along with any direct or indirect dependencies of the primary software being contained).
Some additional license information which was able to be auto-detected might be found in the repo-info repository’s storm/ directory.
As for any pre-built image usage, it is the image user’s responsibility to ensure that any use of this image complies with any relevant licenses for all software contained within.