12605 commits

Author SHA1 Message Date
Kostas Krikellas
8cf2cb35f6
Fix minor formatting issue (#114815)
The list with two options doesn't get rendered as a list, due to the
snippet in between.

https://www.elastic.co/guide/en/elasticsearch/reference/master/passthrough.html#passthrough-conflicts
2024-10-15 23:39:33 +11:00
Kostas Krikellas
4d775cba4f
Add documentation for passthrough field type (#114720)
* Guard second doc parsing pass with index setting

* add test

* updates

* updates

* merge

* Add documentation for passthrough field type

* Apply suggestions from code review

Co-authored-by: Felix Barnsteiner <felixbarny@users.noreply.github.com>

* updates

* updates

* Update docs/reference/mapping/types/passthrough.asciidoc

Co-authored-by: Felix Barnsteiner <felixbarny@users.noreply.github.com>

* address comment

* address comment

* Update docs/reference/mapping/types/passthrough.asciidoc

Co-authored-by: Felix Barnsteiner <felixbarny@users.noreply.github.com>

* address comment

---------

Co-authored-by: Felix Barnsteiner <felixbarny@users.noreply.github.com>
2024-10-15 12:05:02 +02:00
Carlos Delgado
7ad1a0c39c
Remove snapshot build restriction for match and qstr functions (#114482) 2024-10-15 08:07:07 +02:00
Benjamin Trent
6c752abc23
Adding new bbq index types behind a feature flag (#114439)
Adds new index types `bbq_hnsw` and `bbq_flat`, which use the better binary quantization formats. They give a 32x reduction in memory, with nice recall properties.
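As a rough back-of-the-envelope check of the 32x figure, the sketch below compares the raw size of a float32 vector with a one-bit-per-dimension encoding. The 1024-dimension size and the helper names are assumptions made for illustration. It ignores any per-vector correction terms the real format may store, so actual savings may be somewhat lower.

```python
def float32_bytes(dims: int) -> int:
    """Bytes needed to store one vector with 4-byte float components."""
    return dims * 4


def one_bit_bytes(dims: int) -> int:
    """Bytes needed to store one bit per dimension, rounded up to whole bytes."""
    return (dims + 7) // 8


dims = 1024  # hypothetical embedding size, not taken from the commit
raw = float32_bytes(dims)
quantized = one_bit_bytes(dims)
print(raw, quantized, raw / quantized)
```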
2024-10-14 20:13:27 -04:00
David Kyle
5efba5b43d
[ML] Default inference endpoint for the multilingual-e5-small model (#114683) 2024-10-15 01:05:40 +02:00
Kyle Thomas
ee74ce564f
[DOCS] ES|QL: Adding a tip to the WHERE documentation (#114050)
* Adding a tip to make null field behavior more apparent.

* Update docs/reference/esql/processing-commands/where.asciidoc

Co-authored-by: Andrei Stefan <astefan@users.noreply.github.com>

* Update docs/reference/esql/processing-commands/where.asciidoc

Rephrasing for clarity

Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com>

---------

Co-authored-by: Andrei Stefan <astefan@users.noreply.github.com>
Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com>
2024-10-14 13:05:12 -05:00
Kathleen DeRusso
69b4a9f8ff
Add a query rules tester API call (#114168)
* Add a query rules tester API call

* Update docs/changelog/114168.yaml

* Wrap client call in async with origin

* Remove unused param

* PR feedback

* Remove redundant test

* CI workaround - add ent-search as ml dependency so it can find node features
2024-10-14 12:55:11 -04:00
David Turner
9eab11c45b
Clarify use of special values for publish addresses (#114551)
Special values like `0.0.0.0` may resolve to multiple IP addresses just
like hostnames, so the same considerations apply when using such values
as a publish address. This commit spells this case out in the docs and
cleans up the nearby wording a little.
2024-10-15 02:39:00 +11:00
kosabogi
7bd6f2ce6a
Expands semantic_text tutorial with hybrid search (#114398)
* Creates a new page for the hybrid search tutorial

* Update docs/reference/search/search-your-data/semantic-text-hybrid-search

Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>

* Update docs/reference/search/search-your-data/semantic-text-hybrid-search

Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>

* Update docs/reference/search/search-your-data/semantic-text-hybrid-search

Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>

* Update docs/reference/search/search-your-data/semantic-text-hybrid-search

Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>

* Update docs/reference/search/search-your-data/semantic-text-hybrid-search

Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>

* Update docs/reference/search/search-your-data/semantic-text-hybrid-search

Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>

* Update docs/reference/search/search-your-data/semantic-text-hybrid-search

Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>

* Update docs/reference/search/search-your-data/semantic-text-hybrid-search

Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>

* Update docs/reference/search/search-your-data/semantic-text-hybrid-search

Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>

* Update docs/reference/search/search-your-data/semantic-text-hybrid-search

Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>

* Adds search response example

* Update docs/reference/search/search-your-data/semantic-text-hybrid-search

Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>

* Update docs/reference/search/search-your-data/semantic-text-hybrid-search

Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>

* Update docs/reference/search/search-your-data/semantic-text-hybrid-search

Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>

* Update docs/reference/search/search-your-data/semantic-text-hybrid-search

Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>

* Update docs/reference/search/search-your-data/semantic-text-hybrid-search

Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>

* Update docs/reference/search/search-your-data/semantic-text-hybrid-search

Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>

* Update docs/reference/search/search-your-data/semantic-text-hybrid-search

Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>

* Update docs/reference/search/search-your-data/semantic-text-hybrid-search

Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>

---------

Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>
2024-10-14 15:57:00 +02:00
Carlos Delgado
a262eb6dbd
Add ESQL match function (#113374) 2024-10-14 07:31:55 +02:00
Nick Tindall
bc0d1d7f3c
Avoid throw exception in SyntheticSourceIndexSettingsProvider (#114479)
Co-authored-by: Nhat Nguyen <nhat.nguyen@elastic.co>
2024-10-14 09:45:46 +11:00
Larisa Motova
2155f1bed5
[ES|QL] Add hypot function (#114382)
Adds a hypotenuse function
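For reference, a hypotenuse function computes `sqrt(a^2 + b^2)` for the two legs of a right triangle. The sketch below uses Python's `math.hypot` purely to illustrate the arithmetic. It does not show the ES|QL syntax, which the commit message does not spell out.

```python
import math

# Legs of a right triangle (illustrative values).
a, b = 3.0, 4.0

# math.hypot returns the Euclidean norm, i.e. sqrt(a*a + b*b).
h = math.hypot(a, b)
print(h)
```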
2024-10-11 09:33:45 -10:00
Michael Peterson
fd9d7335c8
CCS metadata is opt-in in ESQL JSON responses (#114437)
Since Kibana only needs CCS metadata in ESQL responses from certain well-defined locations,
we are making CCS metadata opt-in. This feature is patterned after ESQL profiling, where
you specify "profile": true in the ESQL body. If you ask for it, it will always be present in the
response (it will be written to the .async-search index and you can’t turn it off in later async-search
requests against this particular query ID). If you don’t ask for it at the beginning, it will never
be present (it will NOT be written to the .async-search index when it is persisted).

The new option is "include_ccs_metadata": true/false.
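A minimal sketch of how a client might opt in, assuming the flag sits at the top level of the ESQL request body next to `"query"`, as described above. The helper name `build_esql_body` and the `remote:logs*` index pattern are hypothetical. The sketch only builds the JSON body and does not send it.

```python
import json
from typing import Any, Dict


def build_esql_body(query: str, include_ccs_metadata: bool = False) -> Dict[str, Any]:
    """Build an ESQL request body; CCS metadata is only requested when asked for."""
    body: Dict[str, Any] = {"query": query}
    if include_ccs_metadata:
        # Opt-in flag named in the commit message.
        body["include_ccs_metadata"] = True
    return body


# Hypothetical cross-cluster query used only for illustration.
body = build_esql_body("FROM remote:logs* | LIMIT 10", include_ccs_metadata=True)
print(json.dumps(body, indent=2))
```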
2024-10-11 15:03:26 -04:00
matthewabbott
a0cd389b43
Add link to NO_COPIES allocation explain message (#113656)
* tweaked no-valid-shard-copies message

* untweaked misformatting in allocation explain asciidoc
2024-10-10 15:10:28 -07:00
Nicole Albee
3358071290
Update "Securing Clients and integrations" to include Fleet (#113731) 2024-10-10 17:01:09 -05:00
Pete Gillin
c8c6f5af53
Actually add terminate docs page (#114440)
A docs page for the `terminate` processor was added in
https://github.com/elastic/elasticsearch/pull/114157, but the change
to include it in the outer processor reference page was omitted. This
change corrects that oversight.
2024-10-10 08:34:43 +01:00
Panagiotis Bailis
4eab631e5f
Add telemetry for retrievers (#114109) 2024-10-10 09:57:42 +03:00
Stef Nestor
612ce0f996
(Doc+) Link API doc to parent object - part2 (#113541)
* (Doc+) Cross-link CAT APIs to parent object

---------

Co-authored-by: Lisa Cawley <lcawley@elastic.co>
Co-authored-by: shainaraskas <58563081+shainaraskas@users.noreply.github.com>
2024-10-09 14:21:56 -06:00
Johannes Mahne
e97aaa8c41
Update forcemerge.asciidoc (#114377)
As per request https://github.com/elastic/elasticsearch/pull/114315#issuecomment-2400521895 doing the PR on the main branch.
2024-10-09 13:47:24 +02:00
Keith Massey
fb482f863d
Adding index_template_substitutions to the simulate ingest API (#114128)
This adds support for a new `index_template_substitutions` field to the
body of an ingest simulate API request. These substitutions can be used
to change the pipeline(s) used for ingest, or to change the mappings
used for validation. It is similar to the
`component_template_substitutions` added in #113276. Here is an example
that shows both of those usages working together:

```
## First, add a couple of pipelines that set a field to a boolean:
PUT /_ingest/pipeline/foo-pipeline?pretty
{
  "processors": [
    {
      "set": {
        "field": "foo",
        "value": true
      }
    }
  ]
}

PUT /_ingest/pipeline/bar-pipeline?pretty
{
  "processors": [
    {
      "set": {
        "field": "bar",
        "value": true
      }
    }
  ]
}

## Now, create three component templates. One provides a mapping that enforces that the only field is "foo"
## and that field is a keyword. The next is similar, but adds a `bar` field. The final one provides a setting
## that makes "foo-pipeline" the default pipeline.
## Remember that the "foo-pipeline" sets the "foo" field to a boolean, so using both of these templates
## together would cause a validation exception. These could be in the same template, but are provided
## separately just so that later we can show how multiple templates can be overridden.
PUT _component_template/mappings_template
{
  "template": {
    "mappings": {
      "dynamic": "strict",
      "properties": {
        "foo": {
          "type": "keyword"
        }
      }
    }
  }
}

PUT _component_template/mappings_template_with_bar
{
    "template": {
      "mappings": {
        "dynamic": "strict",
        "properties": {
          "foo": {
            "type": "keyword"
          },
          "bar": {
            "type": "boolean"
          }
        }
      }
    }
}

PUT _component_template/settings_template
{
  "template": {
    "settings": {
      "index": {
        "default_pipeline": "foo-pipeline"
      }
    }
  }
}

## Here we create an index template pulling in the mappings_template and settings_template component templates above
PUT _index_template/template_1
{
  "index_patterns": ["foo*"],
  "composed_of": ["mappings_template", "settings_template"]
}

## We can index a document here to create the index, or not. Either way the simulate call ought to work the same
POST foo-1/_doc
{
  "foo": "FOO"
}

## This will not blow up with validation exceptions because the substitute `template_1` in
## "index_template_substitutions" uses `mappings_template_with_bar`, which adds the bar field.
## And the bar-pipeline is executed rather than the foo-pipeline because
## "component_template_substitutions" provides a substitute `settings_template`, so the value of "foo"
## does not get set to an invalid type.
POST _ingest/_simulate?pretty&index=foo-1
{
  "docs": [
    {
      "_id": "asdf",
      "_source": {
        "foo": "foo",
        "bar": "bar"
      }
    }
  ],
  "component_template_substitutions": {
    "settings_template": {
      "template": {
        "settings": {
          "index": {
            "default_pipeline": "bar-pipeline"
          }
        }
      }
    }
  },
  "index_template_substitutions": {
    "template_1": {
      "index_patterns": ["foo*"],
      "composed_of": ["mappings_template_with_bar", "settings_template"]
    }
  }
}
```
2024-10-09 10:15:37 +11:00
Nik Everett
ebe3c0f10d
ESQL: Document MV_SLICE limitations (#114162)
`MV_SLICE` is useful, but loading values from lucene frequently sorts
them so `MV_SLICE` is not as useful as you think it is. It's mostly for
after, say, a `SPLIT`. This documents that and adds a link to the
section on multivalues.

It also moves similar docs to a separate paragraph in the docs for
easier reading.
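The limitation can be modelled in plain Python. This is a simplified sketch, not ES|QL: `mv_slice` here is a hypothetical helper that assumes zero-based, inclusive bounds, and loading from Lucene is modelled as a plain sort.

```python
from typing import List


def mv_slice(values: List[str], start: int, end: int) -> List[str]:
    """Hypothetical model of a multivalue slice; both bounds assumed inclusive."""
    return values[start:end + 1]


as_written = ["zebra", "apple", "mango"]

# Modelled effect of loading from the index: values come back sorted.
as_loaded = sorted(as_written)

# Splitting a string keeps the original order, so slicing is predictable.
after_split = "zebra,apple,mango".split(",")

print(mv_slice(as_loaded, 0, 0))    # first value after sorting
print(mv_slice(after_split, 0, 0))  # first value as written
```

The first slice returns the alphabetically smallest value rather than the first value written, which is why slicing is mostly useful on values produced inside the query.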
2024-10-09 05:04:36 +11:00
Pete Gillin
43e5258b3c
Add a terminate ingest processor (#114157)
This processor simply causes any remaining processors in the pipeline
to be skipped. It will normally be executed conditionally using the
`if` option. (If this pipeline is being called from another pipeline,
the calling pipeline is *not* terminated.)

For example, this:

```
POST /_ingest/pipeline/_simulate
{
  "pipeline":
  {
    "description": "Appends just 'before' to the steps field if the error field is present, or both 'before' and 'after' if not",
    "processors": [
      {
        "append": {
          "field": "steps",
          "value": "before"
        }
      },
      {
        "terminate": {
          "if": "ctx.error != null"
        }
      },
      {
        "append": {
          "field": "steps",
          "value": "after"
        }
      }
    ]
  },
  "docs": [
    {
      "_index": "index",
      "_id": "doc1",
      "_source": {
        "name": "okay",
        "steps": []
      }
    },
    {
      "_index": "index",
      "_id": "doc2",
      "_source": {
        "name": "bad",
        "error": "oh no",
        "steps": []
      }
    }
  ]
}
```

returns something like this:

```
{
  "docs": [
    {
      "doc": {
        "_index": "index",
        "_version": "-3",
        "_id": "doc1",
        "_source": {
          "name": "okay",
          "steps": [
            "before",
            "after"
          ]
        },
        "_ingest": {
          "timestamp": "2024-10-04T16:25:20.448881Z"
        }
      }
    },
    {
      "doc": {
        "_index": "index",
        "_version": "-3",
        "_id": "doc2",
        "_source": {
          "name": "bad",
          "error": "oh no",
          "steps": [
            "before"
          ]
        },
        "_ingest": {
          "timestamp": "2024-10-04T16:25:20.448932Z"
        }
      }
    }
  ]
}
```
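The control flow shown in the request and response above can be sketched in plain Python. This is a toy model under stated assumptions, not the Elasticsearch implementation: `append`, `terminate`, and `run_pipeline` are hypothetical stand-ins, and the `if` option is modelled as a Python predicate.

```python
from typing import Any, Callable, Dict, List

Doc = Dict[str, Any]
# A processor returns True to let the pipeline continue, False to stop it.
Processor = Callable[[Doc], bool]


def append(field: str, value: Any) -> Processor:
    """Hypothetical stand-in for the `append` processor."""
    def run(doc: Doc) -> bool:
        doc.setdefault(field, []).append(value)
        return True
    return run


def terminate(condition: Callable[[Doc], bool]) -> Processor:
    """Hypothetical stand-in for `terminate` guarded by an `if` condition."""
    def run(doc: Doc) -> bool:
        # Stop only when the condition matches; otherwise this is a no-op.
        return not condition(doc)
    return run


def run_pipeline(processors: List[Processor], doc: Doc) -> Doc:
    for processor in processors:
        if not processor(doc):
            break  # remaining processors in this pipeline are skipped
    return doc


pipeline = [
    append("steps", "before"),
    terminate(lambda doc: doc.get("error") is not None),
    append("steps", "after"),
]

okay = run_pipeline(pipeline, {"name": "okay", "steps": []})
bad = run_pipeline(pipeline, {"name": "bad", "error": "oh no", "steps": []})
print(okay["steps"])  # ['before', 'after']
print(bad["steps"])   # ['before']
```

Because `run_pipeline` only breaks out of its own loop, a caller that invoked it from an outer pipeline would carry on with its own processors, matching the note that the calling pipeline is not terminated.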
2024-10-08 17:39:53 +01:00
Nik Everett
d3fa42cda0
ESQL: Entirely remove META FUNCTIONS (#113967)
This removes the undocumented `META FUNCTIONS` command that emits
descriptions for all functions. This shouldn't be used because we never
told anyone about it. I'd have preferred if we'd have explicitly
documented it as no public or if we'd have left it snapshot-only. But
sometimes you make a mistake. I'm hopeful no one is relying on it. It
was never reliable and not public.....
2024-10-08 18:37:55 +02:00
Nik Everett
f633148d10
Docs: ESQL doesn't preserve nulls in a list (#114335)
The doc values don't preserve `null`s in a list so ESQL doesn't either.
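A toy model of the effect, assuming nothing beyond what the sentence above says: `load_multivalue` is a hypothetical helper that drops `None` entries the way loading a list from doc values drops `null`s.

```python
from typing import List, Optional


def load_multivalue(source_values: List[Optional[str]]) -> List[str]:
    """Model of loading a list from doc values: null entries are not kept."""
    return [value for value in source_values if value is not None]


loaded = load_multivalue(["a", None, "b"])
print(loaded)  # ['a', 'b']
```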

Closes #114324
2024-10-09 03:17:56 +11:00
Liam Thompson
10f6f25506
[DOCS] Update re-ranking intro to remove confusion about stages (#114302) 2024-10-08 08:08:47 -04:00
Liam Thompson
b80272a1a4
[DOCS] Update URL (#114292) 2024-10-08 13:51:02 +02:00
kosabogi
4af241b5d6
Adds note on reindexing existing data for semantic_text usage (#113590)
* Adds note on reindexing existing data for semantic_text usage

* Adds note about full crawl and full sync

* Style guide related fix

* Update docs/reference/search/search-your-data/semantic-search-semantic-text.asciidoc

Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com>

---------

Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com>
2024-10-08 09:58:18 +02:00
David Turner
07c3acf1c0
Remove cluster state from /_cluster/reroute response (#114231)
Including the cluster state in responses to the `POST _cluster/reroute`
API was deprecated in #90399 (v8.6.0) requiring callers to pass
`?metric=none` to avoid the deprecation warning. This commit adjusts the
behaviour as promised in v9 so that this API never returns the cluster
state, and deprecates the `?metric` parameter itself.

Closes #88978
2024-10-08 07:59:57 +01:00
István Zoltán Szabó
40bddafd92
[DOCS] Adds DeBERTa v2 tokenization params to infer trained model API docs (#114242)
* [DOCS] Adds DeBERTa v2 tokenization params to infer trained model API docs.

* [DOCS] More edits.
2024-10-08 08:41:11 +02:00
David Turner
740cb2e0c7
Document that ?wait_for_active_shards=0 is permitted (#114091)
Today the docs for the `?wait_for_active_shards` parameter say that it
must be a positive integer, proscribing `0`, yet `0` is a legitimate
value for this parameter. This commit fixes this point and rewords the
docs slightly for clarity.
2024-10-08 07:59:30 +02:00
moxarth-elastic
e1bba9b390
[Zoom] Update existing scopes with granular scopes (#113994) 2024-10-07 15:48:32 +02:00
Liam Thompson
1292580c03
[DOCS] Lookup runtime fields are now GA (#114221) 2024-10-07 14:52:42 +02:00
Sean Story
71faa01b7f
fix typo - "english" is not a valid language code (#114166)
This example request will succeed, but follow-up requests to run a sync on a connector with this language value will fail.
2024-10-07 13:28:25 +02:00
Simon Cooper
4ef5ea6d1c
Change default locale of date mappers to ENGLISH (#112799)
English is not changing between COMPAT and CLDR locale databases, whereas ROOT is
2024-10-07 10:22:38 +01:00
David Turner
6d3abe51a3
Clarify support for bundled JDK (#113993)
Spells out that third-party EOL schedules don't affect our support. Also
reorders the information to talk about the benefits of the bundled JDK
before mentioning alternatives, and clarifies the division of
responsibilities for "supported" JDKs other than the bundled one.
2024-10-07 10:12:24 +01:00
István Zoltán Szabó
57955cb8d4
[DOCS] Adds DeBERTa v2 to the tokenizers list in API docs (#112752)
Co-authored-by: Max Hniebergall <137079448+maxhniebergall@users.noreply.github.com>
2024-10-07 10:23:46 +02:00
Drew Tate
147461f5b1
[ES|QL] add reverse function (#113297)
Adds a REVERSE string function
2024-10-04 12:57:37 -05:00
David Turner
95ea135106
Clarify integer settings for repository-s3 repos (#114093)
Today there are a handful of integer settings for `repository-s3`
repositories whose docs link to the page about numeric field types. Yet
these settings are not fields, and do not support floating-point values
either. The convention throughout the rest of the docs is to just call
these things `integer` without linking to anything. This commit aligns
the `repository-s3` docs with this convention.
2024-10-04 17:21:59 +01:00
Mark Tozzi
60ae7463a8
[ESQL] Support datetime data type in Least and Greatest functions (#113961)
While working on Date Nanos, I noticed that Least and Greatest didn't have support for datetime. This PR corrects that and adds tests for it.

It seems to me that resolveType() is doing the wrong thing for these functions, as it accepts types that then do not have evaluator mappings, but refactoring that seems out of scope right now.

---------

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2024-10-04 09:06:39 -04:00
elasticsearchmachine
9f7d7b12ce
Forward port release notes for v8.15.2 (#113634) 2024-10-04 13:05:08 +02:00
David Kyle
4d5906422b
[ML] Default inference endpoint for ELSER (#113873)
Adds a default configuration for the ELSER model. The config uses
adaptive allocations to automatically scale. Min number of allocations
is set to 1 for this PR; a follow-up will change that to 0 and enable
scale from 0.

This endpoint is always visible in the GET API.

```
GET _inference

{
  "endpoints": [
    {
      "inference_id": ".elser-2",
      "task_type": "sparse_embedding",
      "service": "elser",
      "service_settings": {
        "num_threads": 1,
        "model_id": ".elser_model_2",
        "adaptive_allocations": {
          "enabled": true,
          "min_number_of_allocations": 1,
          "max_number_of_allocations": 8
        }
      },
      "task_settings": {}
    }
  ]
}
```

The default configuration can be used without any prior setup.
If the model is not downloaded it is automatically downloaded. If it is
not deployed it is deployed.

```
POST _inference/.elser-2
{
  "input": "Automagically deploy and infer" 
}
 ... 

{
  "sparse_embedding": [
    {
      "is_truncated": false,
      "embedding": {
        "##fer": 2.2107008,
        "deployment": 2.1624098,
        "deploy": 2.144009,
        "auto": 1.9384763,
```

### Follow up tasks
- [ ] Add default config for the E5 text embedding model
- [ ] Select platform specific version
- [ ] Scale from 0
- [ ] Chunking settings
- What happens when the end point is deleted, can it be deleted?
- Can the default config be modified - chunking settings for example? Probably not
2024-10-04 19:27:10 +10:00
Mikhail Berezovskiy
fc1bee290a
Add max_multipart_parts setting to S3 repository (#113989) 2024-10-03 21:49:51 -07:00
Kostas Krikellas
dd2024881d
Add object param for keeping synthetic source (#113690)
* Add object param for keeping synthetic source

* Update docs/changelog/113690.yaml

* fix merging

* add tests

* merge

* fix randomized tests

* add documentation

* dedup id in docs

* update documentation

* update documentation

* fix bwc

* fix bwc

* fix unintended

* Revert "fix bwc"

This reverts commit 18dc913eee.

* Revert "fix bwc"

This reverts commit f4ddb0e5e5.

* add missing test

* fix transform

* fix transform

* fix transform

* fix transform

* fix transform
2024-10-03 21:19:04 +03:00
István Zoltán Szabó
b9adc701fa
[DOCS] Expands param descriptions for semantic_text (#114024)
Co-authored-by: Mike Pellegrini <mike.pellegrini@elastic.co>
2024-10-03 19:48:16 +02:00
István Zoltán Szabó
fb39147d90
[DOCS] Removes link from semantic text tutorial. (#114038) 2024-10-03 18:36:20 +02:00
Nik Everett
f4cb32991a
ESQL: change link to profile explanation (#114032)
Let's use a vendor neutral link.
2024-10-04 01:30:03 +10:00
Nik Everett
fe6f8d4b37
ESQL: Mention EXPLAIN ANALYZE in profile docs (#114025)
This mentions EXPLAIN ANALYZE and EXPLAIN PLAN in the docs for ESQL's
`profile` option. Those are things that folks from PostgreSQL and Oracle
are used to and might search for. And `profile` is the closest thing we
have to them.

EXPLAIN PLAN doesn't run the query - it just tells you what the plan is.
ESQL's `profile` always runs the query. So that's different. But it's
close!

EXPLAIN ANALYZE *does* run the query. It's pretty much the same.
2024-10-03 10:37:34 -04:00
Simon Cooper
7fd0a666cc
Copy 8.16 CLDR migration notes to main (#113954)
Copy from the 8.x branch to main
2024-10-04 00:17:25 +10:00
Panagiotis Bailis
dc8c20d3b6
Rework RRF to be evaluated during rewrite phase (#112648) 2024-10-03 12:39:13 +03:00
Mike Pellegrini
2f3bf74e6e
Revert semantic query passage ranking documentation (#113982) 2024-10-02 17:15:55 -04:00