The Index, Update, Delete, and
Bulk APIs support setting refresh to control when changes made
by this request are made visible to search. These are the allowed values:
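true (or an empty string)
Refresh the affected shards immediately after the operation, so that the change is visible to search right away. This puts more load on Elasticsearch than the other options.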
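wait_for
Wait for the changes made by the request to be made visible by a refresh before
replying. This does not force an immediate refresh; it waits for one to happen.
Elasticsearch automatically refreshes changed shards every
index.refresh_interval, which defaults to one second. That setting is
dynamic. Calling the Refresh API or
setting refresh to true on any of the APIs that support it will also
cause a refresh, which in turn causes already running requests with refresh=wait_for
to return.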
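false (the default)
Take no refresh-related action. The changes made by this request become visible at some point after the request returns.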
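Because index.refresh_interval is a dynamic setting, it can be changed on an existing index through the index settings endpoint. The following is a minimal sketch that only builds such a request; the function name refresh_interval_request is a hypothetical helper, and actually sending the request to a cluster is left out.

```python
import json


def refresh_interval_request(index, interval):
    """Return (method, path, body) for changing index.refresh_interval.

    Hypothetical helper for illustration: it builds the request but does
    not send it anywhere.
    """
    body = json.dumps({"index": {"refresh_interval": interval}})
    return "PUT", f"/{index}/_settings", body


method, path, body = refresh_interval_request("test", "200ms")
print(method, path)
print(body)
```

Passing "-1" as the interval would build the request that disables automatic refreshes.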
Unless you have a good reason to wait for the change to become visible, always
use refresh=false or, because that is the default, just leave the refresh
parameter out of the URL. That is the simplest and fastest choice.
If you absolutely must have the changes made by a request visible synchronously
with the request, then you must pick between putting more load on
Elasticsearch (true) and waiting longer for the response (wait_for). Here
are a few points that should inform that decision:
The more changes being made to the index, the more work wait_for saves
compared to true. If the index is only changed once every
index.refresh_interval, then it saves no work.
true creates less efficient index constructs (tiny segments) that must
later be merged into more efficient index constructs (larger segments). This means
that the cost of true is paid at index time to create the tiny segment, at
search time to search the tiny segment, and at merge time to make the larger
segments.
Do not issue multiple refresh=wait_for requests in a row. Instead, batch them
into a single bulk request with refresh=wait_for, and Elasticsearch will start
them all in parallel and return only when they have all finished.
If index.refresh_interval is set to -1, disabling the automatic refreshes,
then requests with refresh=wait_for will wait indefinitely until some action
causes a refresh. Conversely, setting index.refresh_interval to something
shorter than the default, like 200ms, will make refresh=wait_for come back
faster, but it will still generate inefficient segments.
refresh=wait_for only affects the request that it is on, but, by forcing a
refresh immediately, refresh=true will affect other ongoing requests. In
general, if you have a running system you don't wish to disturb, then
refresh=wait_for is a smaller modification.
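The advice about batching refresh=wait_for requests can be sketched as follows. This is an illustrative example under stated assumptions, not a client implementation: build_bulk_request is a hypothetical helper that only assembles the path and newline-delimited body for one Bulk API call, and it does not contact a cluster.

```python
import json
from urllib.parse import urlencode


def build_bulk_request(index, docs, refresh="wait_for"):
    """Return (path, body) for a single _bulk request indexing every doc.

    `docs` maps document IDs to document sources. Hypothetical helper
    for illustration only.
    """
    lines = []
    for doc_id, source in docs.items():
        # Each document needs an action line followed by its source line.
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps(source))
    # The bulk body is newline-delimited and must end with a newline.
    body = "\n".join(lines) + "\n"
    path = "/_bulk?" + urlencode({"refresh": refresh})
    return path, body


path, body = build_bulk_request(
    "test", {"1": {"test": "test"}, "2": {"test": "test"}}
)
print(path)
print(body, end="")
```

Both documents travel in one request, so only that one request waits for the refresh instead of two requests waiting one after the other.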
refresh=wait_for Can Force a Refresh
If a refresh=wait_for request comes in when there are already
index.max_refresh_listeners (defaults to 1000) requests waiting for a refresh
on that shard, then that request will behave just as though it had refresh set
to true instead: it will force a refresh. This keeps the promise that when a
refresh=wait_for request returns, its changes are visible to search,
while preventing unchecked resource usage for blocked requests. If a request
forced a refresh because it ran out of listener slots, then its response will
contain "forced_refresh": true.
Bulk requests only take up one slot on each shard that they touch no matter how many times they modify the shard.
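A client that wants to notice when a refresh=wait_for request fell back to forcing a refresh could inspect the response for the forced_refresh field. The sketch below assumes a single-document response body; the function name was_refresh_forced and the abbreviated sample response are hypothetical and serve only as illustration.

```python
import json


def was_refresh_forced(response_body):
    """Return True if a single-document response reports a forced refresh.

    Hypothetical helper: treats a missing "forced_refresh" field as False.
    """
    response = json.loads(response_body)
    return response.get("forced_refresh", False) is True


# Hypothetical, abbreviated response body used only for illustration.
sample = '{"_index": "test", "_id": "4", "result": "created", "forced_refresh": true}'
print(was_refresh_forced(sample))  # True
```

Frequent forced refreshes would suggest that more requests are waiting on a shard than index.max_refresh_listeners allows.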
These will create a document and immediately refresh the index so it is visible:
PUT /test/_doc/1?refresh
{"test": "test"}
PUT /test/_doc/2?refresh=true
{"test": "test"}
These will create a document without doing anything to make it visible for search:
PUT /test/_doc/3
{"test": "test"}
PUT /test/_doc/4?refresh=false
{"test": "test"}
This will create a document and wait for it to become visible for search:
PUT /test/_doc/4?refresh=wait_for
{"test": "test"}
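For clients that assemble request URLs themselves, the three modes shown in these examples map onto the query string in a straightforward way. The following is a minimal sketch; index_doc_path is a hypothetical helper, and omitting the parameter is used here to express the default (false).

```python
from urllib.parse import urlencode


def index_doc_path(index, doc_id, refresh=None):
    """Build the path for indexing one document with an optional refresh mode.

    Hypothetical helper: refresh=None leaves the parameter out of the URL,
    which is equivalent to the default, refresh=false.
    """
    if refresh not in (None, "true", "false", "wait_for"):
        raise ValueError(f"unsupported refresh value: {refresh!r}")
    path = f"/{index}/_doc/{doc_id}"
    if refresh is not None:
        path += "?" + urlencode({"refresh": refresh})
    return path


print(index_doc_path("test", 1, "true"))      # /test/_doc/1?refresh=true
print(index_doc_path("test", 3))              # /test/_doc/3
print(index_doc_path("test", 4, "wait_for"))  # /test/_doc/4?refresh=wait_for
```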