elasticsearch

Milestone: 3

This output lets you store logs in Elasticsearch and is the recommended output for Logstash. If you plan to use the Kibana web interface, you’ll need to use this output.

VERSION NOTE: Your Elasticsearch cluster should be running Elasticsearch 1.0.0 or later. If you must use an older version of Elasticsearch, set protocol => http in this plugin.
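
As a minimal sketch of the fallback described in the version note, the following output block selects the HTTP protocol. The host name es.example.com is a placeholder and not a value from this documentation.

```
output {
  elasticsearch {
    # Talk to Elasticsearch over its REST/HTTP interface
    protocol => "http"
    # Hypothetical host name; replace with your own Elasticsearch host
    host => "es.example.com"
  }
}
```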

If you want to set other Elasticsearch options that are not exposed directly as configuration options, there are two methods:

  • Create an elasticsearch.yml file in the working directory ($PWD) of the Logstash process
  • Pass in es.* Java properties (java -Des.node.foo= or ruby -J-Des.node.foo=)

With the default protocol setting (“node”), this plugin will join your Elasticsearch cluster as a client node, so it will show up in Elasticsearch’s cluster status.

You can learn more about Elasticsearch at http://www.elasticsearch.org

Operational Notes

Template management requires Elasticsearch version 0.90.7 or later. If you are using a version older than this, please upgrade. You will receive more benefits than just template management!

If you use the default protocol setting (“node”), your firewalls might need to permit port 9300 in both directions (from Logstash to Elasticsearch, and from Elasticsearch to Logstash).

Synopsis

This is what it might look like in your config file:
output {
  elasticsearch {
    action => ... # string (optional), default: "index"
    bind_host => ... # string (optional)
    bind_port => ... # number (optional)
    cluster => ... # string (optional)
    codec => ... # codec (optional), default: "plain"
    document_id => ... # string (optional), default: nil
    embedded => ... # boolean (optional), default: false
    embedded_http_port => ... # string (optional), default: "9200-9300"
    flush_size => ... # number (optional), default: 5000
    host => ... # string (optional)
    idle_flush_time => ... # number (optional), default: 1
    index => ... # string (optional), default: "logstash-%{+YYYY.MM.dd}"
    index_type => ... # string (optional)
    manage_template => ... # boolean (optional), default: true
    node_name => ... # string (optional)
    port => ... # string (optional)
    protocol => ... # string, one of ["node", "transport", "http"] (optional)
    template => ... # a valid filesystem path (optional)
    template_name => ... # string (optional), default: "logstash"
    template_overwrite => ... # boolean (optional), default: false
    workers => ... # number (optional), default: 1
  }
}

Details

action

  • Value type is string
  • Default value is "index"

The Elasticsearch action to perform. Valid actions are: index, delete.

If you set action to delete, you MUST also configure the document_id setting, because every delete action requires a document ID.

What does each action do?

  • index: indexes a document (an event from Logstash).
  • delete: deletes a document by ID.

For more details on actions, check out the Elasticsearch bulk API documentation.
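
A delete configuration might look like the sketch below. It assumes that document_id accepts the same %{foo} expansion that the index setting documents, and that each event carries a field named request_id holding the ID of the document to remove; that field name is hypothetical.

```
output {
  elasticsearch {
    # Delete documents instead of indexing them
    action => "delete"
    # request_id is a hypothetical event field containing the document ID
    document_id => "%{request_id}"
  }
}
```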

bind_host

  • Value type is string
  • There is no default value for this setting.

The name or address of the host to bind to for Elasticsearch clustering.

bind_port

  • Value type is number
  • There is no default value for this setting.

This is only valid for the ‘node’ protocol.

The port for the node to listen on.

cluster

  • Value type is string
  • There is no default value for this setting.

The name of your cluster if you set it on the Elasticsearch side. Useful for discovery.

codec

  • Value type is codec
  • Default value is "plain"

The codec used for output data. Output codecs are a convenient method for encoding your data before it leaves the output, without needing a separate filter in your Logstash pipeline.

document_id

  • Value type is string
  • Default value is nil

The document ID for the index. Useful for overwriting existing entries in Elasticsearch with the same ID.

embedded

  • Value type is boolean
  • Default value is false

Run the Elasticsearch server embedded in this process. This option is useful if you want to run a single Logstash process that handles log processing and indexing; it saves you from needing to run a separate Elasticsearch process.

embedded_http_port

  • Value type is string
  • Default value is "9200-9300"

If you are running the embedded Elasticsearch server, you can set the HTTP port it listens on here. You rarely need to change this setting from its default.
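
For illustration, an embedded setup could be configured roughly as follows. Pinning the HTTP port to the single value "9200" is an assumption made for this sketch; the documented default is the range "9200-9300".

```
output {
  elasticsearch {
    # Run an Elasticsearch server inside the Logstash process
    embedded => true
    # Assumed single-port value for this sketch; omit to keep the default range
    embedded_http_port => "9200"
  }
}
```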

exclude_tags DEPRECATED

  • DEPRECATED WARNING: This config item is deprecated. It may be removed in a future version.
  • Value type is array
  • Default value is []

Only handle events without any of these tags. Note that this check applies in addition to the type and tags checks.

flush_size

  • Value type is number
  • Default value is 5000

This plugin uses the bulk index API for improved indexing performance. To make efficient bulk API calls, it buffers a certain number of events before flushing them to Elasticsearch. This setting controls how many events are buffered before a batch is sent.

host

  • Value type is string
  • There is no default value for this setting.

The hostname or IP address of the host to use for Elasticsearch unicast discovery. This is only required if the normal multicast/cluster discovery won’t work in your environment.

idle_flush_time

  • Value type is number
  • Default value is 1

The amount of time, in seconds, since the last flush before a flush is forced.

This setting helps ensure that events arriving at a slow rate don’t get stuck in Logstash. For example, if your flush_size is 100, and you have received 10 events, and it has been more than idle_flush_time seconds since the last flush, Logstash will flush those 10 events automatically.

This helps keep both fast and slow log streams moving along in near-real-time.
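
The scenario in the example above could be configured as in this sketch. The values 100 and 5 are chosen only for illustration and are not recommendations.

```
output {
  elasticsearch {
    # Send a bulk request once 100 events are buffered ...
    flush_size => 100
    # ... or once 5 seconds have passed since the last flush, whichever comes first
    idle_flush_time => 5
  }
}
```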

index

  • Value type is string
  • Default value is "logstash-%{+YYYY.MM.dd}"

The index to write events to. This can be dynamic using the %{foo} syntax. The default value partitions your indices by day so you can more easily delete old data or search only specific date ranges. Index names may not contain uppercase characters.
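
A dynamic index name might look like the following sketch. It assumes every event has a type field whose value is lowercase, since index names may not contain uppercase characters.

```
output {
  elasticsearch {
    # One index per event type per day, e.g. "apache-2014.03.01"
    # (assumes the "type" field is present and lowercase)
    index => "%{type}-%{+YYYY.MM.dd}"
  }
}
```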

index_type

  • Value type is string
  • There is no default value for this setting.

The index type to write events to. Generally you should try to write only similar events to the same ‘type’. String expansion ‘%{foo}’ works here.

manage_template

  • Value type is boolean
  • Default value is true

Starting in Logstash 1.3, a default mapping template for Elasticsearch is applied unless you set manage_template to false or already have a template that matches the index pattern. The pattern is derived from the index setting (default “logstash-%{+YYYY.MM.dd}”) minus any variables. With the default, for example, the template is applied to all indices matching logstash-*.

If you have dynamic templating (e.g. creating indices based on field names) then you should set “manage_template” to false and use the REST API to upload your templates manually.
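
As a sketch of the dynamic case, the block below builds index names from an event field and turns template management off. The field name app is hypothetical, and uploading your own template through the REST API remains a separate manual step.

```
output {
  elasticsearch {
    # "app" is a hypothetical event field used to name the index
    index => "%{app}-%{+YYYY.MM.dd}"
    # Do not install the default template; upload your own through the REST API
    manage_template => false
  }
}
```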

max_inflight_requests DEPRECATED

  • DEPRECATED WARNING: This config item is deprecated. It may be removed in a future version.
  • Value type is number
  • Default value is 50

This setting no longer does anything. It exists to keep config validation from failing. It will be removed in future versions.

node_name

  • Value type is string
  • There is no default value for this setting.

The node name Elasticsearch will use when joining a cluster.

By default, this is generated internally by the ES client.

port

  • Value type is string
  • There is no default value for this setting.

The port for Elasticsearch transport to use.

If you do not set this, the following defaults are used:

  • protocol => http - port 9200
  • protocol => transport - port 9300-9305
  • protocol => node - port 9300-9305

protocol

  • Value can be any of: "node", "transport", "http"
  • There is no default value for this setting.

Choose the protocol used to talk to Elasticsearch.

The ‘node’ protocol will connect to the cluster as a normal Elasticsearch node (but will not store data). This allows you to use things like multicast discovery. If you use the node protocol, you must permit bidirectional communication on port 9300 (or whichever port you have configured).

The ‘transport’ protocol will connect to the host you specify and will not show up as a ‘node’ in the Elasticsearch cluster. This is useful in situations where you cannot permit connections outbound from the Elasticsearch cluster to this Logstash server.

The ‘http’ protocol will use the Elasticsearch REST/HTTP interface to talk to Elasticsearch.

All protocols will use bulk requests when talking to Elasticsearch.

The default protocol setting under Java/JRuby is “node”. The default protocol on non-Java rubies is “http”.
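
For a setup where the Elasticsearch cluster cannot open connections back to Logstash, a transport configuration could look like this sketch. The host name and cluster name are placeholders, and the port is written as a string because the port setting is documented as a string type.

```
output {
  elasticsearch {
    # Connect outbound only; Logstash does not join the cluster as a node
    protocol => "transport"
    # Hypothetical host and cluster names; replace with your own
    host => "es1.example.com"
    cluster => "my-cluster"
    # Explicit transport port (defaults to the range 9300-9305 if unset)
    port => "9300"
  }
}
```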

tags DEPRECATED

  • DEPRECATED WARNING: This config item is deprecated. It may be removed in a future version.
  • Value type is array
  • Default value is []

Only handle events with all of these tags. Note that if you specify a type, the event must also match that type. Optional.

template

  • Value type is path
  • There is no default value for this setting.

You can set the path to your own template here, if you so desire. If not set, the included template will be used.

template_name

  • Value type is string
  • Default value is "logstash"

This configuration option defines how the template is named inside Elasticsearch. Note that if you have used the template management features and subsequently change this, you will need to prune the old template manually, e.g. curl -XDELETE http://localhost:9200/_template/OldTemplateName?pretty where OldTemplateName is whatever the former setting was.

template_overwrite

  • Value type is boolean
  • Default value is false

Overwrite the current template with whatever is configured in the template and template_name directives.
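
The three template settings are typically used together. In this sketch the file path and the template name are hypothetical, and the file is assumed to contain a valid Elasticsearch index template in JSON.

```
output {
  elasticsearch {
    # Hypothetical path to your own template file
    template => "/etc/logstash/templates/myapp.json"
    # Name under which the template is stored in Elasticsearch (hypothetical)
    template_name => "myapp"
    # Replace a template of the same name if one already exists
    template_overwrite => true
  }
}
```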

type DEPRECATED

  • DEPRECATED WARNING: This config item is deprecated. It may be removed in a future version.
  • Value type is string
  • Default value is ""

The type to act on. If a type is given, then this output will only act on messages with the same type. See any input plugin’s “type” attribute for more. Optional.

workers

  • Value type is number
  • Default value is 1

The number of workers to use for this output. Note that this setting may not be useful for all outputs.


This is documentation from lib/logstash/outputs/elasticsearch.rb