A docs page for the `terminate` processor was added in
https://github.com/elastic/elasticsearch/pull/114157, but the change
to include it in the outer processor reference page was omitted. This
change corrects that oversight.
This adds support for a new `index_template_substitutions` field to the
body of an ingest simulate API request. These substitutions can be used
to change the pipeline(s) used for ingest, or to change the mappings
used for validation. It is similar to the
`component_template_substitutions` added in #113276. Here is an example
that shows both of those usages working together:
```
## First, add a couple of pipelines that set a field to a boolean:
PUT /_ingest/pipeline/foo-pipeline?pretty
{
"processors": [
{
"set": {
"field": "foo",
"value": true
}
}
]
}
PUT /_ingest/pipeline/bar-pipeline?pretty
{
"processors": [
{
"set": {
"field": "bar",
"value": true
}
}
]
}
## Now, create three component templates. One provides a mapping that enforces that the only field is "foo"
## and that field is a keyword. The next is similar, but adds a `bar` field. The final one provides a setting
## that makes "foo-pipeline" the default pipeline.
## Remember that the "foo-pipeline" sets the "foo" field to a boolean, so using both of these templates
## together would cause a validation exception. These could be in the same template, but are provided
## separately just so that later we can show how multiple templates can be overridden.
PUT _component_template/mappings_template
{
"template": {
"mappings": {
"dynamic": "strict",
"properties": {
"foo": {
"type": "keyword"
}
}
}
}
}
PUT _component_template/mappings_template_with_bar
{
"template": {
"mappings": {
"dynamic": "strict",
"properties": {
"foo": {
"type": "keyword"
},
"bar": {
"type": "boolean"
}
}
}
}
}
PUT _component_template/settings_template
{
"template": {
"settings": {
"index": {
"default_pipeline": "foo-pipeline"
}
}
}
}
## Here we create an index template pulling in two of the component templates above
PUT _index_template/template_1
{
"index_patterns": ["foo*"],
"composed_of": ["mappings_template", "settings_template"]
}
## We can index a document here to create the index, or not. Either way the simulate call ought to work the same
POST foo-1/_doc
{
"foo": "FOO"
}
## This will not fail with validation exceptions because the substitute template_1 in
## "index_template_substitutions" uses `mappings_template_with_bar`, which adds the bar field.
## And the bar-pipeline is executed rather than the foo-pipeline because
## "component_template_substitutions" provides a substitute `settings_template`, so the value of "foo"
## does not get set to an invalid type.
POST _ingest/_simulate?pretty&index=foo-1
{
"docs": [
{
"_id": "asdf",
"_source": {
"foo": "foo",
"bar": "bar"
}
}
],
"component_template_substitutions": {
"settings_template": {
"template": {
"settings": {
"index": {
"default_pipeline": "bar-pipeline"
}
}
}
}
},
"index_template_substitutions": {
"template_1": {
"index_patterns": ["foo*"],
"composed_of": ["mappings_template_with_bar", "settings_template"]
}
}
}
```
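To make the interaction between the two substitution maps easier to follow, here is a minimal Python sketch of the example above. It is an illustrative model only, not how Elasticsearch resolves templates internally; the helper names `deep_merge` and `resolve` are hypothetical and exist only for this sketch.

```python
# Illustrative model only: not Elasticsearch's actual template resolution.

def deep_merge(base, override):
    """Recursively merge `override` into a copy of `base`; later values win."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def resolve(index_template, component_templates, component_subs=None):
    """Compose the named component templates, preferring substitutes if given."""
    effective = {**component_templates, **(component_subs or {})}
    resolved = {}
    for name in index_template["composed_of"]:
        resolved = deep_merge(resolved, effective[name]["template"])
    return resolved


# The three stored component templates from the example above.
component_templates = {
    "mappings_template": {"template": {"mappings": {
        "dynamic": "strict",
        "properties": {"foo": {"type": "keyword"}},
    }}},
    "mappings_template_with_bar": {"template": {"mappings": {
        "dynamic": "strict",
        "properties": {"foo": {"type": "keyword"}, "bar": {"type": "boolean"}},
    }}},
    "settings_template": {"template": {"settings": {
        "index": {"default_pipeline": "foo-pipeline"},
    }}},
}

stored_template = {
    "index_patterns": ["foo*"],
    "composed_of": ["mappings_template", "settings_template"],
}
# Stand-ins for the two substitution maps in the simulate request body.
substitute_template = {
    "index_patterns": ["foo*"],
    "composed_of": ["mappings_template_with_bar", "settings_template"],
}
component_subs = {
    "settings_template": {"template": {"settings": {
        "index": {"default_pipeline": "bar-pipeline"},
    }}},
}

stored = resolve(stored_template, component_templates)
simulated = resolve(substitute_template, component_templates, component_subs)

print(stored["settings"]["index"]["default_pipeline"])     # foo-pipeline
print(simulated["settings"]["index"]["default_pipeline"])  # bar-pipeline
print(sorted(simulated["mappings"]["properties"]))         # ['bar', 'foo']
```

In this model the stored template still resolves to foo-pipeline, while the simulated view picks up both the extra `bar` mapping and bar-pipeline.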
Adds a new `trace_redact` option to the redact processor to indicate that a document has been redacted in the ingest pipeline. If a document is processed by a redact processor and any field is redacted, the ingest metadata `_ingest._redact._is_redacted` is set to `true`.
Closes #94633
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
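As a rough sketch of the `trace_redact` change, the Python snippet below builds a pipeline body with the option enabled and checks for the resulting flag. The field name `message` and the Grok pattern are placeholders, and `is_redacted` is a hypothetical helper that takes a plain dict standing in for the `_ingest` metadata.

```python
import json

# Pipeline body with the new option enabled. `trace_redact` comes from the
# description above; the field name and Grok pattern are placeholders.
pipeline = {
    "processors": [
        {
            "redact": {
                "field": "message",
                "patterns": ["%{IP:client}"],
                "trace_redact": True,
            }
        }
    ]
}


def is_redacted(ingest_metadata):
    """Return True if the metadata carries _redact._is_redacted = true."""
    return ingest_metadata.get("_redact", {}).get("_is_redacted") is True


print(json.dumps(pipeline, indent=2))
print(is_redacted({"_redact": {"_is_redacted": True}}))  # True
print(is_redacted({}))                                   # False
```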
The addition of the logger requires several updates to tests, either to handle the possible warning or to mute the test where there is no way to specify an allowed (but not mandatory) warning.
* (Doc+) Inference Pipeline ignores Mapping Analyzers
Based on internal dev feedback (will cross-link after), this documents that inference processors within ingest pipelines run before mapping analyzers, so the analyzers are effectively ignored. If users want an analyzer's effect, they need to select the ingest processor equivalent of that analyzer and run it earlier in the pipeline than the inference processor.
---------
Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>
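The ordering advice can be sketched as a pipeline body, shown here as a Python dict. It assumes the analyzer in question only lowercases text, so the `lowercase` processor stands in for it; the field name `text_field` and the model id `my-model` are placeholders, and `runs_before` is a hypothetical helper.

```python
# Placeholder pipeline: the ingest equivalent of the analyzer (here assumed to
# be lowercasing) is listed before the inference processor.
pipeline = {
    "processors": [
        {"lowercase": {"field": "text_field"}},
        {"inference": {"model_id": "my-model"}},
    ]
}


def runs_before(processors, first, second):
    """Return True if processor type `first` appears before `second`."""
    names = [next(iter(processor)) for processor in processors]
    return names.index(first) < names.index(second)


print(runs_before(pipeline["processors"], "lowercase", "inference"))  # True
```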
The max enrich cache size setting now also supports an absolute max size in bytes (of used heap space) and a percentage of the max heap space, in addition to the existing flat document count. The default is 1% of the max heap space.
This should prevent issues where the enrich cache takes up a lot of memory when there are large documents in the cache.
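The three kinds of value described above (flat document count, absolute byte size, percentage of max heap) can be modeled roughly as follows. This is a sketch under assumptions: the accepted spellings (`1%`, `100mb`, `1000`) and the function `parse_cache_limit` are hypothetical and may not match how Elasticsearch parses the setting.

```python
import re

# Hypothetical unit table for this sketch only.
UNITS = {"b": 1, "kb": 1024, "mb": 1024**2, "gb": 1024**3}


def parse_cache_limit(value, max_heap_bytes):
    """Return ("bytes", n) or ("documents", n) for a cache-size value."""
    text = str(value).strip().lower()
    if text.endswith("%"):
        # Percentage of the maximum heap, converted to bytes.
        return ("bytes", int(max_heap_bytes * float(text[:-1]) / 100))
    match = re.fullmatch(r"(\d+)(b|kb|mb|gb)", text)
    if match:
        # Absolute size in bytes.
        return ("bytes", int(match.group(1)) * UNITS[match.group(2)])
    # Otherwise treat the value as a flat document count.
    return ("documents", int(text))


heap = 4 * 1024**3  # assume a 4 GiB max heap for the example
print(parse_cache_limit("1%", heap))
print(parse_cache_limit("100mb", heap))
print(parse_cache_limit("1000", heap))
```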
As preparation for #106081, this PR adds the `size_in_bytes`
field to the enrich cache. This field is calculated by summing
the `BytesReference` sizes of all the search hits in the cache.
It's not a perfect representation of the size of the enrich cache
on the heap, but some experimentation showed that it's quite close.
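The summation can be pictured with a small Python model in which each cached search hit is represented by its raw bytes. The cache layout (a dict of key to list of hits) and the name `size_in_bytes` as a function are assumptions for illustration, not the actual enrich cache implementation.

```python
# Hypothetical cache layout: lookup key -> list of search hits as raw bytes.
cache = {
    "key1": [b'{"a":1}', b'{"b":22}'],
    "key2": [b'{"c":333}'],
}


def size_in_bytes(cache):
    """Sum the byte lengths of every cached search hit."""
    return sum(len(hit) for hits in cache.values() for hit in hits)


print(size_in_bytes(cache))  # 24
```

As the description notes, a sum like this ignores per-object overhead, so it approximates rather than measures heap usage.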
* Remove `es-test-dir` book-scoped variable
* Remove `plugins-examples-dir` book-scoped variable
* Remove `:dependencies-dir:` and `:xes-repo-dir:` book-scoped variables
- In `index.asciidoc`, two variables (`:dependencies-dir:` and `:xes-repo-dir:`) were removed.
- In `sql/index.asciidoc`, the `:sql-tests:` path was updated to a fuller path.
- In `esql/index.asciidoc`, the `:esql-tests:` path was updated likewise.
* Replace `es-repo-dir` with `es-ref-dir`
* Move `:include-xpack: true` to the few files that use it, remove from index.asciidoc
The GeoIP endpoint does not use the xpack http client. The GeoIP downloader uses the JDK's built-in cacerts.
If a customer is using a custom HTTPS endpoint, they need to add its CA certificate to the JDK's cacerts, whether that is our bundled JDK or their own JDK. Otherwise they will see something like
```
...PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target...
```
* [DOCS] Categorize ingest processors on overview page, summarize use cases
* Add overview info, subheading, links
* Apply suggestions from review
Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>
* Insert space
---------
Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>
* added override flag for rename processor along with factory tests
* added yaml tests for rename processor using the override flag
* updated renameProcessor tests to include override flag as a parameter
* updated rename processor tests to incorporate override flag = true scenario
* updated rename processor asciidoc with override option
* updated rename processor asciidoc with override option
* removed unnecessary SuppressWarnings tag
* corrected formatting errors
* updated processor tests
* fixed yaml tests
* Prefer early throw style here
* Whitespace
* Move and rewrite this test
It's just a simple test of the primary behavior of the rename
processor, so put it first and simplify it.
* Rename this test
It doesn't actually exercise template snippets
* Tidy up this test
---------
Co-authored-by: Joe Gallo <joegallo@gmail.com>
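The behavior of the override flag added to the rename processor can be sketched in Python as below. This is a simplified model on flat dicts, assuming that without the flag an existing target field is an error and with it the target is overwritten; the function and its error messages are hypothetical and not taken from the processor's source.

```python
def rename(doc, field, target_field, override=False, ignore_missing=False):
    """Rename `field` to `target_field` in a flat dict, returning a new dict."""
    if field not in doc:
        if ignore_missing:
            return dict(doc)
        raise KeyError(f"field [{field}] doesn't exist")
    # Early throw: refuse to clobber an existing target unless override is set.
    if target_field in doc and not override:
        raise ValueError(f"field [{target_field}] already exists")
    result = dict(doc)
    result[target_field] = result.pop(field)
    return result


print(rename({"a": 1}, "a", "b"))                         # {'b': 1}
print(rename({"a": 1, "b": 2}, "a", "b", override=True))  # {'b': 1}
```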
This commit introduces a new `_ingest/_simulate` API that runs any pipelines
on the given data that would be executed for a given index, but instead of
indexing the data into the index, returns the transformed documents and
the list of pipelines that were executed.