[[indices-forcemerge]]
=== Force merge API
++++
<titleabbrev>Force merge</titleabbrev>
++++
Forces a <<index-modules-merge,merge>> on the shards of one or more indices.
For data streams, the API forces a merge on the shards of the stream's backing
indices.
[source,console]
----
POST /my-index-000001/_forcemerge
----
// TEST[setup:my_index]
[[forcemerge-api-request]]
==== {api-request-title}
`POST /<target>/_forcemerge`

`POST /_forcemerge`
[[forcemerge-api-prereqs]]
==== {api-prereq-title}
* If the {es} {security-features} are enabled, you must have the `maintenance`
or `manage` <<privileges-list-indices,index privilege>> for the target data
stream, index, or alias.
[[forcemerge-api-desc]]
==== {api-description-title}
Use the force merge API to force a <<index-modules-merge,merge>> on the
shards of one or more indices. Merging reduces the number of segments in each
shard by merging some of them together, and also frees up the space used by
deleted documents. Merging normally happens automatically, but sometimes it is
useful to trigger a merge manually.
// tag::force-merge-read-only-warn[]
WARNING: **We recommend only force merging a read-only index (meaning the index
is no longer receiving writes).** When documents are updated or deleted, the
old version is not immediately removed, but instead soft-deleted and marked
with a "tombstone". These soft-deleted documents are automatically cleaned up
during regular segment merges. But force merge can cause very large (> 5GB)
segments to be produced, which are not eligible for regular merges. So the
number of soft-deleted documents can then grow rapidly, resulting in higher
disk usage and worse search performance. If you regularly force merge an index
receiving writes, this can also make snapshots more expensive, since the new
documents can't be backed up incrementally.
// end::force-merge-read-only-warn[]
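
One way to make sure an index no longer receives writes before you force merge
it is to add a write block first. The following is a minimal sketch that
assumes the example index `my-index-000001` and uses the add index block API.
Whether a write block fits your ingest setup is an assumption here, not a
requirement of the force merge API.

[source,console]
----
PUT /my-index-000001/_block/write

POST /my-index-000001/_forcemerge
----
// TEST[skip:illustrative sketch]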
[[forcemerge-blocks]]
===== Blocks during a force merge
Calls to this API block until the merge is complete, unless the request
contains `wait_for_completion=false` (the parameter defaults to `true`). If the
client connection is lost before completion, the force merge process continues
in the background. Any new requests to force merge the same indices also block
until the ongoing force merge is complete.
[[docs-forcemerge-task-api]]
===== Running force merge asynchronously
If the request contains `wait_for_completion=false`, {es}
performs some preflight checks, launches the request, and returns a
<<tasks,`task`>> you can use to get the status of the task. However, you
cannot cancel this task, because force merge tasks are not cancelable. {es}
creates a record of this task as a document at `_tasks/<task_id>`. When you
are done with a task, you should delete the task document so {es}
can reclaim the space.
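
As a sketch of this workflow, the first request below starts the force merge
asynchronously and the second polls the resulting task. The task ID
`oTUltX4IQMOUUVeiohTt8A:12345` is a made-up placeholder. Substitute the task ID
returned by the first request.

[source,console]
----
POST /my-index-000001/_forcemerge?wait_for_completion=false

GET /_tasks/oTUltX4IQMOUUVeiohTt8A:12345
----
// TEST[skip:task ID is a placeholder]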
[[forcemerge-multi-index]]
===== Force merging multiple indices
You can force merge multiple indices with a single request by targeting:
* One or more data streams that contain multiple backing indices
* Multiple indices
* One or more aliases
* All data streams and indices in a cluster

Each targeted shard is force-merged separately using <<modules-threadpool,the
`force_merge` threadpool>>. By default, each node has only a single
`force_merge` thread, which means that the shards on that node are force-merged
one at a time. If you expand the `force_merge` threadpool on a node, it
force merges its shards in parallel.

Force merging temporarily increases the storage used by the shard being merged.
If the `max_num_segments` parameter is set to `1`, usage can grow to up to
double the shard's size, because all segments must be rewritten into a new one.
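
To observe this behavior on a running cluster, you can inspect the
`force_merge` thread pool and the on-disk size of the shards while a merge is
in progress. The following sketch uses the cat thread pool and cat shards APIs.
The index name and the selected columns are assumptions chosen for
illustration.

[source,console]
----
GET /_cat/thread_pool/force_merge?v=true&h=node_name,name,size,active,queue

GET /_cat/shards/my-index-000001?v=true&h=index,shard,prirep,store,node
----
// TEST[skip:illustrative sketch]

In the first response, `size` shows the number of `force_merge` threads on each
node, and `queue` shows the shard-level merges waiting for a thread. In the
second, the `store` column shows the size of each shard copy on disk.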
[[forcemerge-api-path-params]]
==== {api-path-parms-title}
`<target>`::
(Optional, string) Comma-separated list of data streams, indices, and aliases
used to limit the request. Supports wildcards (`*`). To target all data streams
and indices, omit this parameter or use `*` or `_all`.
[[forcemerge-api-query-params]]
==== {api-query-parms-title}
include::{es-ref-dir}/rest-api/common-parms.asciidoc[tag=allow-no-indices]
+
Defaults to `true`.
include::{es-ref-dir}/rest-api/common-parms.asciidoc[tag=expand-wildcards]
+
Defaults to `open`.
`flush`::
(Optional, Boolean)
If `true`,
{es} performs a <<indices-flush,flush>> on the indices
after the force merge.
Defaults to `true`.
include::{es-ref-dir}/rest-api/common-parms.asciidoc[tag=index-ignore-unavailable]
`max_num_segments`::
+
--
(Optional, integer)
The number of segments to merge to.
To fully merge indices,
set it to `1`.
By default, {es} checks whether a merge needs to execute
and, if so, executes it.
You can't specify this parameter and `only_expunge_deletes` in the same request.
--
`only_expunge_deletes`::
+
--
(Optional, Boolean)
If `true`,
expunges all segments containing more than `index.merge.policy.expunge_deletes_allowed`
(defaults to `10`) percent of deleted documents.
Defaults to `false`.
In Lucene,
a document is not deleted from a segment;
it is only marked as deleted.
During a merge,
a new segment is created
that does not contain those deleted documents.
You can't specify this parameter and `max_num_segments` in the same request.
--
`wait_for_completion`::
+
--
(Optional, Boolean)
If `true`, the request blocks until the operation is complete.
Defaults to `true`.
--
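
The following requests sketch how these query parameters can be used. The first
only expunges segments that exceed the deleted-documents threshold. The second
fully merges two indices and skips either one if it is missing. The index
names are the example names used elsewhere on this page.

[source,console]
----
POST /my-index-000001/_forcemerge?only_expunge_deletes=true

POST /my-index-000001,my-index-000002/_forcemerge?max_num_segments=1&ignore_unavailable=true
----
// TEST[skip:illustrative sketch]

The two requests are kept separate because `only_expunge_deletes` and
`max_num_segments` can't be specified in the same request.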
[[forcemerge-api-example]]
==== {api-examples-title}
[[forcemerge-api-specific-ex]]
===== Force merge a specific data stream or index
[source,console]
----
POST /my-index-000001/_forcemerge
----
// TEST[continued]
[[forcemerge-api-multiple-ex]]
===== Force merge several data streams or indices
[source,console]
----
POST /my-index-000001,my-index-000002/_forcemerge
----
// TEST[s/^/PUT my-index-000001\nPUT my-index-000002\n/]
[[forcemerge-api-all-ex]]
===== Force merge all indices
[source,console]
----
POST /_forcemerge
----
[[forcemerge-api-time-based-index-ex]]
===== Data streams and time-based indices
Force-merging is useful for managing a data stream's older backing indices and
other time-based indices, particularly after a
<<indices-rollover-index,rollover>>.
In these cases,
each index only receives indexing traffic for a certain period of time.
Once an index receives no more writes,
its shards can be force-merged to a single segment.
[source,console]
--------------------------------------------------
POST /.ds-my-data-stream-2099.03.07-000001/_forcemerge?max_num_segments=1
--------------------------------------------------
// TEST[setup:my_index]
// TEST[s/.ds-my-data-stream-2099.03.07-000001/my-index-000001/]
This can be a good idea because single-segment shards can sometimes use simpler
and more efficient data structures to perform searches.
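
To check the result, you can list the segments of the index before and after
the merge, for example with the cat segments API. This is a sketch. The
selected columns are an assumption and can be adjusted.

[source,console]
----
GET /_cat/segments/.ds-my-data-stream-2099.03.07-000001?v=true&h=index,shard,prirep,segment,docs.count,docs.deleted,size
----
// TEST[skip:illustrative sketch]

After a completed merge with `max_num_segments=1`, each shard copy should
generally list a single segment.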