The cross-cluster search feature allows any node to act as a federated client across multiple clusters. A cross-cluster search node does not join the remote cluster; instead, it connects to the remote cluster in a lightweight fashion in order to execute federated search requests. For details on communication and compatibility between different clusters, see Remote clusters.
Cross-cluster search requires configuring remote clusters.
PUT _cluster/settings
{
  "persistent": {
    "cluster": {
      "remote": {
        "cluster_one": {
          "seeds": [
            "127.0.0.1:9300"
          ]
        },
        "cluster_two": {
          "seeds": [
            "127.0.0.1:9301"
          ]
        },
        "cluster_three": {
          "seeds": [
            "127.0.0.1:9302"
          ]
        }
      }
    }
  }
}
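To check that the remotes were registered, the remote cluster info API can be queried. This is a sketch of an optional verification step, not something the configuration above requires; the response fields vary by version, so none are shown here.

```console
# Lists the configured remote clusters (cluster_one, cluster_two and cluster_three above)
GET /_remote/info
```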
To search the twitter index on the remote cluster cluster_one, the index name must be prefixed with the alias of the remote cluster followed by the : character:
GET /cluster_one:twitter/_search
{
  "query": {
    "match": {
      "user": "kimchy"
    }
  }
}
{
  "took": 150,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "failed": 0,
    "skipped": 0
  },
  "_clusters": {
    "total": 1,
    "successful": 1,
    "skipped": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "cluster_one:twitter",
        "_type": "_doc",
        "_id": "0",
        "_score": 1,
        "_source": {
          "user": "kimchy",
          "date": "2009-11-15T14:12:12",
          "message": "trying out Elasticsearch",
          "likes": 0
        }
      }
    ]
  }
}
Indices with the same name on different clusters can also be searched:
GET /cluster_one:twitter,twitter/_search
{
  "query": {
    "match": {
      "user": "kimchy"
    }
  }
}
Search results are disambiguated the same way as the indices are disambiguated in the request. Indices with the same name on different clusters are treated as different indices when results are merged. All results retrieved from an index located in a remote cluster are prefixed with their corresponding cluster alias:
{
  "took": 150,
  "timed_out": false,
  "num_reduce_phases": 3,
  "_shards": {
    "total": 2,
    "successful": 2,
    "failed": 0,
    "skipped": 0
  },
  "_clusters": {
    "total": 2,
    "successful": 2,
    "skipped": 0
  },
  "hits": {
    "total": {
      "value": 2,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "cluster_one:twitter",
        "_type": "_doc",
        "_id": "0",
        "_score": 1,
        "_source": {
          "user": "kimchy",
          "date": "2009-11-15T14:12:12",
          "message": "trying out Elasticsearch",
          "likes": 0
        }
      },
      {
        "_index": "twitter",
        "_type": "_doc",
        "_id": "0",
        "_score": 2,
        "_source": {
          "user": "kimchy",
          "date": "2009-11-15T14:12:12",
          "message": "trying out Elasticsearch",
          "likes": 0
        }
      }
    ]
  }
}
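By default, all remote clusters that are searched via cross-cluster search need to be available when the search request is executed. Otherwise, the whole request fails; even if some of the clusters are available, no search results are returned. You can use the boolean skip_unavailable setting to make remote clusters optional. By default, it is set to false. When a remote cluster with skip_unavailable set to true cannot be reached, it is skipped instead of failing the request, and it is counted under skipped in the _clusters section of the response.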
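As a sketch, cluster_two from the remote cluster configuration could be made optional with a cluster settings update along these lines. The key follows the cluster.remote.<alias>.skip_unavailable pattern; picking cluster_two is an assumption made to match the search request that follows.

```console
# Assumes the cluster_two alias registered in the remote cluster settings shown earlier
PUT _cluster/settings
{
  "persistent": {
    "cluster.remote.cluster_two.skip_unavailable": true
  }
}
```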
GET /cluster_one:twitter,cluster_two:twitter,twitter/_search
{
  "query": {
    "match": {
      "user": "kimchy"
    }
  }
}
{
  "took": 150,
  "timed_out": false,
  "num_reduce_phases": 3,
  "_shards": {
    "total": 2,
    "successful": 2,
    "failed": 0,
    "skipped": 0
  },
  "_clusters": {
    "total": 3,
    "successful": 2,
    "skipped": 1
  },
  "hits": {
    "total": {
      "value": 2,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "cluster_one:twitter",
        "_type": "_doc",
        "_id": "0",
        "_score": 1,
        "_source": {
          "user": "kimchy",
          "date": "2009-11-15T14:12:12",
          "message": "trying out Elasticsearch",
          "likes": 0
        }
      },
      {
        "_index": "twitter",
        "_type": "_doc",
        "_id": "0",
        "_score": 2,
        "_source": {
          "user": "kimchy",
          "date": "2009-11-15T14:12:12",
          "message": "trying out Elasticsearch",
          "likes": 0
        }
      }
    ]
  }
}
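Cross-cluster search (CCS) requests can be executed in two ways:
The CCS coordinating node minimizes network round-trips by sending a single search request to each remote cluster, at the cost of retrieving from + size already fetched results from each of them. This is the default strategy, used whenever possible. If a scroll is provided, or inner hits are requested as part of field collapsing, this strategy is not supported, so network round-trips cannot be minimized and the following strategy is used instead.
The CCS coordinating node sends a search shards request to each remote cluster to find out where the relevant shards are located, and then executes the search as if all shards were part of the same cluster.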
The search API supports the ccs_minimize_roundtrips parameter, which defaults to true and can be set to false when minimizing network round-trips is not desirable.
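As a minimal sketch, a search across cluster_one and the local cluster could opt out of round-trip minimization by passing the parameter on the query string. The index names and query are carried over from the earlier examples for illustration only.

```console
# ccs_minimize_roundtrips defaults to true; false disables round-trip minimization for this request
GET /cluster_one:twitter,twitter/_search?ccs_minimize_roundtrips=false
{
  "query": {
    "match": {
      "user": "kimchy"
    }
  }
}
```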