s3

Milestone: 1

TODO integrate aws_config in the future INFORMATION: This plugin was created for store the logstash’s events into Amazon Simple Storage Service (Amazon S3). For use it you needs authentications and an s3 bucket. Be careful to have the permission to write file on S3’s bucket and run logstash with super user for establish connection. S3 plugin allows you to do something complex, let’s explain:) S3 outputs create temporary files into “/opt/logstash/S3_temp/”. If you want, you can change the path at the start of register method. This files have a special name, for example: ls.s3.ip-10-228-27-95.2013-04-18T10.00.tag_hello.part0.txt ls.s3 : indicate logstash plugin s3 “ip-10-228-27-95” : indicate you ip machine, if you have more logstash and writing on the same bucket for example. “2013-04-18T10.00” : represents the time whenever you specify time_file. “tag_hello” : this indicate the event’s tag, you can collect events with the same tag. “part0” : this means if you indicate size_file then it will generate more parts if you file.size > size_file. When a file is full it will pushed on bucket and will be deleted in temporary directory. If a file is empty is not pushed, but deleted. This plugin have a system to restore the previous temporary files if something crash. INFORMATION ABOUT CLASS: I tried to comment the class at best i could do. I think there are much thing to improve, but if you want some points to develop here a list: TODO Integrate aws_config in the future TODO Find a method to push them all files when logtstash close the session. TODO Integrate @field on the path file TODO Permanent connection or on demand? For now on demand, but isn’t a good implementation. Use a while or a thread to try the connection before break a time_out and signal an error. TODO If you have bugs report or helpful advice contact me, but remember that this code is much mine as much as yours, try to work on it if you want :) USAGE: This is an example of logstash config: output { s3{ access_key_id => “crazy_key” (required) secret_access_key => “monkey_access_key” (required) endpoint_region => “eu-west-1” (required) bucket => “boss_please_open_your_bucket” (required)
size_file => 2048 (optional) time_file => 5 (optional) format => “plain” (optional) canned_acl => “private” (optional. Options are “private”, “public_read”, “public_read_write”, “authenticated_read”. Defaults to “private” ) } } We analize this: access_key_id => “crazy_key” Amazon will give you the key for use their service if you buy it or try it. (not very much open source anyway) secret_access_key => “monkey_access_key” Amazon will give you the secret_access_key for use their service if you buy it or try it . (not very much open source anyway). endpoint_region => “eu-west-1” When you make a contract with Amazon, you should know where the services you use. bucket => “boss_please_open_your_bucket” Be careful you have the permission to write on bucket and know the name. size_file => 2048 Means the size, in KB, of files who can store on temporary directory before you will be pushed on bucket. Is useful if you have a little server with poor space on disk and you don’t want blow up the server with unnecessary temporary log files. time_file => 5 Means, in minutes, the time before the files will be pushed on bucket. Is useful if you want to push the files every specific time. format => “plain” Means the format of events you want to store in the files canned_acl => “private” The S3 canned ACL to use when putting the file. Defaults to “private”. LET’S ROCK AND ROLL ON THE CODE!

Synopsis

This is what it might look like in your config file:
output {
  s3 {
    access_key_id => ... # string (optional)
    aws_credentials_file => ... #  (optional)
    bucket => ... # string (optional)
    canned_acl => ... # string, one of ["private", "public_read", "public_read_write", "authenticated_read"] (optional), default: "private"
    codec => ... # codec (optional), default: "plain"
    endpoint_region => ... # string, one of ["us-east-1", "us-west-1", "us-west-2", "eu-west-1", "ap-southeast-1", "ap-southeast-2", "ap-northeast-1", "sa-east-1", "us-gov-west-1"] (optional), default: "us-east-1"
    format => ... # string, one of ["json", "plain", "nil"] (optional), default: "plain"
    proxy_uri => ... #  (optional)
    region => ... #  (optional)
    restore => ... # boolean (optional), default: false
    secret_access_key => ... # string (optional)
    size_file => ... # number (optional), default: 0
    time_file => ... # number (optional), default: 0
    use_ssl => ... #  (optional)
    workers => ... # number (optional), default: 1
  }
}

Details

access_key_id

  • Value type is string
  • There is no default value for this setting.

include LogStash::PluginMixins::AwsConfig Aws access_key.

aws_credentials_file

  • Value type is string
  • There is no default value for this setting.

Path to YAML file containing a hash of AWS credentials.
This file will only be loaded if access_key_id and secret_access_key aren’t set. The contents of the file should look like this:

:access_key_id: "12345"
:secret_access_key: "54321"

bucket

  • Value type is string
  • There is no default value for this setting.

S3 bucket

canned_acl

  • Value can be any of: "private", "public_read", "public_read_write", "authenticated_read"
  • Default value is "private"

Aws canned ACL

codec

  • Value type is codec
  • Default value is "plain"

The codec used for output data. Output codecs are a convenient method for encoding your data before it leaves the output, without needing a separate filter in your Logstash pipeline.

endpoint_region

  • Value can be any of: "us-east-1", "us-west-1", "us-west-2", "eu-west-1", "ap-southeast-1", "ap-southeast-2", "ap-northeast-1", "sa-east-1", "us-gov-west-1"
  • Default value is "us-east-1"

Aws endpoint_region

exclude_tags DEPRECATED

  • DEPRECATED WARNING: This config item is deprecated. It may be removed in a further version.
  • Value type is array
  • Default value is []

Only handle events without any of these tags. Note this check is additional to type and tags.

format

  • Value can be any of: "json", "plain", "nil"
  • Default value is "plain"

The event format you want to store in files. Defaults to plain text.

proxy_uri

  • Value type is string
  • There is no default value for this setting.

URI to proxy server if required

region

  • Value type is string
  • There is no default value for this setting.

The AWS Region

restore

  • Value type is boolean
  • Default value is false

secret_access_key

  • Value type is string
  • There is no default value for this setting.

Aws secret_access_key

size_file

  • Value type is number
  • Default value is 0

Set the size of file in KB, this means that files on bucket when have dimension > file_size, they are stored in two or more file. If you have tags then it will generate a specific size file for every tags

tags DEPRECATED

  • DEPRECATED WARNING: This config item is deprecated. It may be removed in a further version.
  • Value type is array
  • Default value is []

Only handle events with all of these tags. Note that if you specify a type, the event must also match that type. Optional.

time_file

  • Value type is number
  • Default value is 0

Set the time, in minutes, to close the current sub_time_section of bucket. If you define file_size you have a number of files in consideration of the section and the current tag. 0 stay all time on listerner, beware if you specific 0 and size_file 0, because you will not put the file on bucket, for now the only thing this plugin can do is to put the file when logstash restart.

type DEPRECATED

  • DEPRECATED WARNING: This config item is deprecated. It may be removed in a further version.
  • Value type is string
  • Default value is ""

The type to act on. If a type is given, then this output will only act on messages with the same type. See any input plugin’s “type” attribute for more. Optional.

use_ssl

  • Value type is string
  • There is no default value for this setting.

Should we require (true) or disable (false) using SSL for communicating with the AWS API
The AWS SDK for Ruby defaults to SSL so we preserve that

workers

  • Value type is number
  • Default value is 1

The number of workers to use for this output. Note that this setting may not be useful for all outputs.


This is documentation from lib/logstash/outputs/s3.rb