Elasticsearch docs

Important

The Elastic docs migration from AsciiDoc to Markdown is ongoing for 9.0. Elasticians with questions can reach out in #docs or #es-docs on Slack.

Tip

If you need to update the 8.x docs, you'll have to use the old AsciiDoc system. Refer to the 8.x README for more information.

This is a guide for writing, editing, and building the Elasticsearch docs.

Where do the docs source files live?

As of 9.0.0, the Elasticsearch docs have been redistributed as part of the Elastic docs migration, with the aim of telling a more cohesive story across Elastic products, features, and tools.

Docs live in three places:

  1. Reference content lives in this repo. This covers low-level material, such as settings and configuration information, that is tightly coupled to code.
  • 👩🏽‍💻 Engineers own the bulk of this content.
  2. API reference docs live in the Elasticsearch specification.
  • 👩🏽‍💻 Engineers own this content.
  3. Narrative, overview, and conceptual content mostly lives in the docs-content repo.
  • ✍🏼 Tech writers own the bulk of this content.

Where can I find the source files for a specific page?

Once the docs are published, you'll be able to choose the Edit this page option to find the source file in the appropriate GitHub repo.

In the meantime, grep around in the Elasticsearch or docs-content repos to identify the source file.

Build the docs

For 9.0.0+, all docs except the API reference are written in Elastic docs V3 Markdown syntax and stored in .md files.

For open PRs, a GitHub action generates a live preview of any docs changes. If the check runs successfully, you can find the preview at:

https://docs-v3-preview.elastic.dev/elastic/elasticsearch/pull/<PR-NUMBER>/reference/
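
As a small illustration of the URL pattern above, the following Python sketch fills in the PR number. The helper name `preview_url` and the sample PR number are hypothetical and not part of any Elastic tooling; only the URL template comes from this README.

```python
# Hypothetical helper: builds the docs preview URL for a pull request
# from the URL template quoted in this README.
PREVIEW_TEMPLATE = (
    "https://docs-v3-preview.elastic.dev/elastic/elasticsearch/pull/{pr}/reference/"
)


def preview_url(pr_number: int) -> str:
    """Return the preview URL for the given PR number."""
    if pr_number <= 0:
        raise ValueError("PR number must be a positive integer")
    return PREVIEW_TEMPLATE.format(pr=pr_number)


if __name__ == "__main__":
    # 12345 is a made-up PR number used only for illustration.
    print(preview_url(12345))
```

The preview only exists once the docs check on the PR has run successfully, so a well-formed URL can still return an error page.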

To re-run CI checks, an Elastic employee can select the Re-run this job option in the GitHub Actions tab.

Backporting

Tip

As of 9.0.0, we publish only from the main branch. We will continue to backport changes as usual, in case we need to revisit this approach in the future.

If you need to update the 8.x docs, you'll have to use the old AsciiDoc system. Refer to the 8.x README for more information.

Note

If you need to make changes to 9.x docs and 8.x docs, you'll need to use two different workflows:

  • For 9.x docs, create a PR using the new Markdown system against the main branch and backport as necessary.
  • For 8.x docs, create a PR using the old AsciiDoc system against the 8.x branch and backport the changes to any other 8.x branches needed.

Test code snippets

Important

Snippet testing has been temporarily disabled for Elasticsearch docs. Re-enabling it for the new Markdown docs is a work in progress and should happen soon.

Snippets in console syntax are automatically tested by the command ./gradlew -pdocs check. To test just the docs from a single page, use a command such as ./gradlew -pdocs yamlRestTest --tests "*rollover*".

By default each console snippet runs as its own isolated test. You can manipulate the test execution in the following ways:

  • % TEST: Explicitly marks a snippet as a test. Snippets marked this way are tests even if they don't have ```console, but % TEST is usually used for its modifiers:

    • % TEST[s/foo/bar/]: Replace foo with bar in the generated test. This should be used sparingly because it makes the snippet "lie". Sometimes, though, you can use it to make the snippet clearer. Keep in mind that multiple substitutions are applied in the order in which they are defined.
    • % TEST[catch:foo]: Used to expect errors in the requests. Replace foo with request to expect a 400 error, for example. If the snippet contains multiple requests then only the last request will expect the error.
    • % TEST[continued]: Continue the test started in the last snippet. Between tests the nodes are cleaned: indexes are removed, etc. This prevents that from happening between snippets because the two snippets are a single test. This is most useful when you have text and snippets that work together to tell the story of some use case because it merges the snippets (and thus the use case) into one big test.
      • You can't use % TEST[continued] immediately after % TESTSETUP or // TEARDOWN.
    • % TEST[skip:reason]: Skip this test. Replace reason with the actual reason to skip the test. Snippets without % TEST or // CONSOLE aren't considered tests anyway but this is useful for explicitly documenting the reason why the test shouldn't be run.
    • % TEST[setup:name]: Run some setup code before running the snippet. This is useful for creating and populating indexes used in the snippet. The name is split on , and looked up in the setups defined in docs/build.gradle. See % TESTSETUP below for a similar feature.
    • % TEST[teardown:name]: Run some teardown code after the snippet. This is useful for performing hidden cleanup, such as deleting index templates. The name is split on , and looked up in the teardowns defined in docs/build.gradle. See % TESTSETUP below for a similar feature.
    • % TEST[warning:some warning]: Expect the response to include a Warning header. If the response doesn't include a Warning header with the exact text then the test fails. If the response includes Warning headers that aren't expected then the test fails.
  • ```console-result: Matches this snippet against the body of the response of the last test. If the response is JSON then order is ignored. If you add % TEST[continued] to the snippet after ```console-result it will continue in the same test, allowing you to interleave requests with responses to check.

  • % TESTRESPONSE: Explicitly marks a snippet as a test response even without ```console-result. Similarly to % TEST this is mostly used for its modifiers.

    • You can't use ```console-result immediately after % TESTSETUP. Instead, consider using % TEST[continued] or rearrange your snippets.

    Note

    Previously we only used % TESTRESPONSE instead of ```console-result so you'll see that a lot in older branches but we prefer ```console-result now.

    • % TESTRESPONSE[s/foo/bar/]: Substitutions. See % TEST[s/foo/bar/] for how it works. These are much more common than % TEST[s/foo/bar/] because they are useful for eliding portions of the response that are not pertinent to the documentation.
      • One interesting difference here is that you often want to match against the response from Elasticsearch. To do that you can reference the "body" of the response like this: % TESTRESPONSE[s/"took": 25/"took": $body.took/]. Note the $body string. This says "I don't expect that 25 number in the response, just match against what is in the response." Instead of writing the path into the response after $body you can write $_path which "figures out" the path. This is especially useful for making sweeping assertions like "I made up all the numbers in this example, don't compare them" which looks like % TESTRESPONSE[s/\d+/$body.$_path/].
    • % TESTRESPONSE[non_json]: Add substitutions for testing responses in a format other than JSON. Use this after all other substitutions so it doesn't make other substitutions difficult.
    • % TESTRESPONSE[skip:reason]: Skip the assertions specified by this response.
  • % TESTSETUP: Marks this snippet as the "setup" for all other snippets in this file. Mark the first snippet in the documentation file with % TESTSETUP to show readers that it is the preparation step for every subsequent snippet and must run before the examples. Unlike the alternative convention % TEST[setup:name], which relies on a setup defined in a separate file, this convention keeps the setup in the documentation file itself, making the page more self-contained and less ambiguous.

  • // TEARDOWN: Ends and cleans up a test series started with % TESTSETUP or % TEST[setup:name]. You can use // TEARDOWN to set up multiple tests in the same file.

  • // NOTCONSOLE: Marks this snippet as neither // CONSOLE nor % TESTRESPONSE, excluding it from the list of unconverted snippets. We should only use this for snippets that are JSON but are not responses or requests.
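
The note that multiple substitutions are applied in the order they are defined can be illustrated with a minimal Python sketch. This is not the docs test framework itself: the function `apply_substitutions` and the sample request are hypothetical, and the sketch assumes each s/pattern/replacement/ behaves like a plain regular-expression replace.

```python
import re


def apply_substitutions(snippet, substitutions):
    """Apply (pattern, replacement) pairs one after another, in list order."""
    for pattern, replacement in substitutions:
        snippet = re.sub(pattern, replacement, snippet)
    return snippet


# Hypothetical snippet text, used only to show the ordering effect.
snippet = "GET /foo/_search"

# The second substitution sees the output of the first, so order matters.
forward = apply_substitutions(snippet, [("foo", "bar"), ("bar", "baz")])
reverse = apply_substitutions(snippet, [("bar", "baz"), ("foo", "bar")])

print(forward)  # GET /baz/_search
print(reverse)  # GET /bar/_search
```

Because a later substitution can rewrite what an earlier one produced, keeping the number of substitutions per snippet small makes the generated test easier to reason about.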

In addition to the standard CONSOLE syntax these snippets can contain blocks of yaml surrounded by markers like this:

startyaml
  - compare_analyzers: {index: thai_example, first: thai, second: rebuilt_thai}
endyaml

This allows slightly more expressive testing of the snippets. Since that syntax is not supported by ```console the usual way to incorporate it is with a % TEST[s//] marker like this:

% TEST[s/\n$/\nstartyaml\n  - compare_analyzers: {index: thai_example, first: thai, second: rebuilt_thai}\nendyaml\n/]
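
To make the effect of that marker concrete, here is a rough Python sketch of what the substitution does to a snippet's text: it replaces the final newline with the yaml block. The function name and the sample request are hypothetical, and the real test generator may handle regular expressions differently from Python's `re` module.

```python
import re

# The yaml block from the marker above, with the \n escapes expanded.
YAML_BLOCK = (
    "\nstartyaml\n"
    "  - compare_analyzers: {index: thai_example, first: thai, second: rebuilt_thai}\n"
    "endyaml\n"
)


def append_yaml_block(snippet):
    """Mimic s/\\n$/.../: swap the snippet's trailing newline for the yaml block."""
    # \Z anchors at the very end of the string; the lambda keeps the
    # replacement text literal instead of treating backslashes as escapes.
    return re.sub(r"\n\Z", lambda _match: YAML_BLOCK, snippet)


if __name__ == "__main__":
    # Hypothetical console request, used only for illustration.
    print(append_yaml_block("GET /thai_example/_settings\n"))
```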

Any place you can use JSON, you can use elements like $body.path.to.thing, which are replaced on the fly with the contents of the thing at path.to.thing in the last response.
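
The idea behind $body references can be sketched in a few lines of Python. This is a simplified illustration, not the framework's implementation: the names `lookup` and `resolve_body_refs` are hypothetical, the sketch substitutes text rather than matching values, and numeric path segments indexing into arrays is an assumption.

```python
import json
import re


def lookup(body, path):
    """Walk a dotted path such as 'hits.total.value' through parsed JSON."""
    current = body
    for key in path.split("."):
        if isinstance(current, list):
            current = current[int(key)]  # assumption: numeric segments index arrays
        else:
            current = current[key]
    return current


def resolve_body_refs(text, body):
    """Replace each $body.<path> with the JSON encoding of the value at that path."""
    return re.sub(
        r"\$body\.(\w+(?:\.\w+)*)",
        lambda match: json.dumps(lookup(body, match.group(1))),
        text,
    )


if __name__ == "__main__":
    # Hypothetical last response and expected-response template.
    last_response = {"took": 7, "timed_out": False}
    expected = '{"took": $body.took, "timed_out": false}'
    print(resolve_body_refs(expected, last_response))
```

With the reference resolved from the last response, a made-up number such as the 25 in the "took" example never has to match the real value.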