This functionality is experimental and may be changed or removed completely in a future release. Elastic will take a best effort approach to fix any issues, but experimental features are not subject to the support SLA of official GA features.
The Rollup Search endpoint allows searching rolled-up data using the standard query DSL. The Rollup Search endpoint is needed because, internally, rolled-up documents utilize a different document structure than the original data. The Rollup Search endpoint rewrites standard query DSL into a format that matches the rollup documents, then takes the response and rewrites it back to what a client would expect given the original query.
index
Rules for the index parameter: using _all is not permitted.
The request body supports a subset of features from the regular Search API. It supports:
query param for specifying a DSL query, subject to some limitations (see Rollup Search Limitations and Rollup Aggregation Limitations)
aggregations param for specifying aggregations
Functionality that is not available:
size: because rollups work on pre-aggregated data, no search hits can be returned, so size must be set to zero or omitted entirely.
highlighter, suggestors, post_filter, profile, and explain are similarly disallowed.
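The restrictions above can be sketched as a small client-side pre-check. This is only an illustration, not part of the Rollup Search API: the function name check_rollup_search_body is hypothetical, and the key names highlight and suggest are assumed to be the request-body spellings of the highlighter and suggestors features. The server performs its own validation regardless.

```python
# Hypothetical client-side pre-check for a rollup search request body.
# Assumption: "highlight" and "suggest" are the body keys behind the
# "highlighter" and "suggestors" features named in the text.
DISALLOWED_KEYS = {"highlight", "suggest", "post_filter", "profile", "explain"}


def check_rollup_search_body(body):
    """Return a list of problems found in a rollup search request body."""
    problems = []
    # size may be omitted entirely, but if present it must be zero.
    if body.get("size", 0) != 0:
        problems.append("size must be 0 or omitted")
    # Report every disallowed top-level key in a stable order.
    for key in sorted(DISALLOWED_KEYS.intersection(body)):
        problems.append(f"{key} is not supported by rollup search")
    return problems


ok_body = {
    "size": 0,
    "aggregations": {"max_temperature": {"max": {"field": "temperature"}}},
}
bad_body = {"size": 10, "explain": True, "query": {"match_all": {}}}

print(check_rollup_search_body(ok_body))
print(check_rollup_search_body(bad_body))
```

Running the check before sending a request gives an early, readable error list instead of a round trip to the cluster.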
Imagine we have an index named sensor-1 full of raw data, and we have created a rollup job with the following configuration:
PUT _rollup/job/sensor
{
  "index_pattern": "sensor-*",
  "rollup_index": "sensor_rollup",
  "cron": "*/30 * * * * ?",
  "page_size": 1000,
  "groups": {
    "date_histogram": {
      "field": "timestamp",
      "interval": "1h",
      "delay": "7d"
    },
    "terms": {
      "fields": ["node"]
    }
  },
  "metrics": [
    {
      "field": "temperature",
      "metrics": ["min", "max", "sum"]
    },
    {
      "field": "voltage",
      "metrics": ["avg"]
    }
  ]
}
This rolls up the sensor-* pattern and stores the results in sensor_rollup. To search this rolled-up data, we need to use the _rollup_search endpoint. However, you’ll notice that we can use regular query DSL to search the rolled-up data:
GET /sensor_rollup/_rollup_search
{
  "size": 0,
  "aggregations": {
    "max_temperature": {
      "max": {
        "field": "temperature"
      }
    }
  }
}
The query targets the sensor_rollup index, since this contains the rollup data as configured in the job. A max aggregation has been used on the temperature field, yielding the following response:
{
  "took" : 102,
  "timed_out" : false,
  "terminated_early" : false,
  "_shards" : ... ,
  "hits" : {
    "total" : {
      "value": 0,
      "relation": "eq"
    },
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "max_temperature" : {
      "value" : 202.0
    }
  }
}
The response is exactly as you’d expect from a regular query + aggregation; it provides some metadata about the request (took, _shards, etc.), the search hits (which are always empty for rollup searches), and the aggregation response.
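As a rough sketch of how a client might assemble such a request, the snippet below builds the path and JSON body without sending anything. The helper name build_rollup_search is hypothetical; it only encodes rules stated on this page (size of zero, _all rejected) plus the observation that at least one index is needed to form the path.

```python
import json


def build_rollup_search(indices, aggregations, query=None):
    """Build the path and JSON body for a _rollup_search request (not sent)."""
    if not indices:
        raise ValueError("at least one index is required to form the path")
    if "_all" in indices:
        raise ValueError("_all is not permitted for rollup search")
    # Rollup searches never return hits, so size is always zero here.
    body = {"size": 0, "aggregations": aggregations}
    if query is not None:
        body["query"] = query
    path = "/" + ",".join(indices) + "/_rollup_search"
    return path, json.dumps(body)


path, body = build_rollup_search(
    ["sensor_rollup"],
    {"max_temperature": {"max": {"field": "temperature"}}},
)
print("GET", path)
print(body)
```

The returned path and body can then be handed to whichever HTTP client is already in use.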
Rollup searches are limited to functionality that was configured in the rollup job. For example, we are not able to calculate the average temperature because avg was not one of the configured metrics for the temperature field. If we try to execute that search:
GET sensor_rollup/_rollup_search
{
  "size": 0,
  "aggregations": {
    "avg_temperature": {
      "avg": {
        "field": "temperature"
      }
    }
  }
}

{
  "error" : {
    "root_cause" : [
      {
        "type" : "illegal_argument_exception",
        "reason" : "There is not a rollup job that has a [avg] agg with name [avg_temperature] which also satisfies all requirements of query.",
        "stack_trace": ...
      }
    ],
    "type" : "illegal_argument_exception",
    "reason" : "There is not a rollup job that has a [avg] agg with name [avg_temperature] which also satisfies all requirements of query.",
    "stack_trace": ...
  },
  "status": 400
}
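To make the limitation concrete, the following sketch checks a metric aggregation against the metrics section of the job configuration shown earlier. It is a simplification under stated assumptions: the function metric_is_configured is hypothetical, and the real server-side matching also takes the job's groups and the query into account.

```python
# The "metrics" section of the sensor rollup job from this page.
JOB_CONFIG = {
    "metrics": [
        {"field": "temperature", "metrics": ["min", "max", "sum"]},
        {"field": "voltage", "metrics": ["avg"]},
    ]
}


def metric_is_configured(job_config, field, agg_type):
    """Return True if the job rolled up `agg_type` for `field`."""
    for entry in job_config.get("metrics", []):
        if entry["field"] == field and agg_type in entry["metrics"]:
            return True
    return False


print(metric_is_configured(JOB_CONFIG, "temperature", "max"))
print(metric_is_configured(JOB_CONFIG, "temperature", "avg"))
```

The second call corresponds to the failing avg_temperature search: avg was configured for voltage only, so it is not available on temperature.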
The Rollup Search API can search across both "live", non-rollup data and the aggregated rollup data. This is done by simply adding the live indices to the URI:
GET sensor-1,sensor_rollup/_rollup_search
{
  "size": 0,
  "aggregations": {
    "max_temperature": {
      "max": {
        "field": "temperature"
      }
    }
  }
}
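When the search is executed, the Rollup Search endpoint will do two things: it sends the original request to the non-rollup index unaltered, and it sends a rewritten version of the original request to the rollup index.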
When the two responses are received, the endpoint will then rewrite the rollup response and merge the two together. During the merging process, if there is any overlap in buckets between the two responses, the buckets from the non-rollup index will be used.
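The overlap rule can be illustrated with a minimal sketch. This is not the Elasticsearch implementation; merge_buckets, the bucket shape, and the sample values are hypothetical and only demonstrate that a bucket from the non-rollup index replaces a rollup bucket with the same key.

```python
def merge_buckets(live_buckets, rollup_buckets):
    """Merge two bucket lists by key; live (non-rollup) buckets win on overlap."""
    # Start from the rollup buckets, then let live buckets overwrite them.
    merged = {bucket["key"]: bucket for bucket in rollup_buckets}
    merged.update({bucket["key"]: bucket for bucket in live_buckets})
    return [merged[key] for key in sorted(merged)]


# Hypothetical buckets; key 2 is present in both responses.
rollup = [{"key": 1, "max": 20.0}, {"key": 2, "max": 25.0}]
live = [{"key": 2, "max": 30.0}, {"key": 3, "max": 31.0}]

for bucket in merge_buckets(live, rollup):
    print(bucket)
```

In this example the merged result keeps the rollup bucket for key 1, and uses the live buckets for keys 2 and 3.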
The response to the above query will look as expected, despite spanning rollup and non-rollup indices:
{
  "took" : 102,
  "timed_out" : false,
  "terminated_early" : false,
  "_shards" : ... ,
  "hits" : {
    "total" : {
      "value": 0,
      "relation": "eq"
    },
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "max_temperature" : {
      "value" : 202.0
    }
  }
}