12605 commits

Author SHA1 Message Date
István Zoltán Szabó
f4c05bdcab
[DOCS] Amends PUT inference API docs with model download info (#111278)
* [DOCS] Amends PUT inference API docs with model download info.

* [DOCS] Addresses feedback.
2024-07-26 11:32:00 +02:00
Iván Cea Fontenla
595d907f61
ESQL: SpatialCentroid aggregation tests and docs (#111236) 2024-07-26 10:41:18 +02:00
Alexander Spies
5cac9a0b7f
ESQL: Mark union types as experimental (#111297) 2024-07-26 10:20:21 +02:00
shainaraskas
50bccf5609
Round up shard allocation / recovery / relocation concepts (#109943) 2024-07-25 14:44:57 -04:00
Pooya Salehi
779f09ea87
Update get snapshot status API doc (#111240)
Make it clear that this API should be used only if the detailed shard
info is needed and only on ongoing snapshots. Remove incorrectly
mentioned `STATE` value.
2024-07-26 02:21:36 +10:00
Stef Nestor
05060f8413
(Doc+) Link Gateway Settings to Full Restart (#110902)
* (Doc+) Link Gateway Settings to Full Restart

---------

Co-authored-by: shainaraskas <58563081+shainaraskas@users.noreply.github.com>
2024-07-25 09:10:19 -06:00
Stef Nestor
a7470c05b1
(Doc+) How to resolve shards >50GB (#111254)
* (Doc+) How to resolve shards >50GB

---------

Co-authored-by: Ievgen Degtiarenko <ievgen.degtiarenko@gmail.com>
2024-07-25 08:28:46 -06:00
Valeriy Khakhutskyy
87d9a0b268
[ML] Extend lat_long documentation (#111239)
This PR adds the explanation of what "typical" means for the lat_long function.
2024-07-25 10:32:36 +02:00
Nik Everett
b5c6c2da30
ESQL: INLINESTATS (#109583)
This implements `INLINESTATS`. Most of the heavy lifting is done by
`LOOKUP`, with this change mostly adding a new abstraction to logical
plans, an interface I'm calling `Phased`. Implementing this interface
allows a logical plan node to cut the query into phases. `INLINESTATS`
implements it by asking for a "first phase" that's the same query, up to
`INLINESTATS`, but with `INLINESTATS` replaced with `STATS`. The next
phase replaces the `INLINESTATS` with a `LOOKUP` on the results of the
first phase.

So, this query:
```
FROM foo
| EVAL bar = a * b
| INLINESTATS m = MAX(bar) BY b
| WHERE m = bar
| LIMIT 1
```

gets split into
```
FROM foo
| EVAL bar = a * b
| STATS m = MAX(bar) BY b
```

followed by
```
FROM foo
| EVAL bar = a * b
| LOOKUP (results of m = MAX(bar) BY b) ON b
| WHERE m = bar
| LIMIT 1
```
2024-07-24 17:16:37 -04:00
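The two-phase split described in the INLINESTATS commit above can be mimicked outside the query engine. The following is a minimal sketch in plain Python, not the Elasticsearch implementation: the sample rows and the helper names `eval_bar`, `first_phase`, and `second_phase` are hypothetical, chosen only to mirror the example query.

```python
# Hypothetical sample rows standing in for the `foo` index.
rows = [
    {"a": 1, "b": 2},
    {"a": 3, "b": 2},
    {"a": 5, "b": 1},
]


def eval_bar(rows):
    """EVAL bar = a * b"""
    return [{**r, "bar": r["a"] * r["b"]} for r in rows]


def first_phase(rows):
    """STATS m = MAX(bar) BY b: one aggregated value per group."""
    groups = {}
    for r in eval_bar(rows):
        key = r["b"]
        groups[key] = max(groups.get(key, r["bar"]), r["bar"])
    return groups


def second_phase(rows, stats, limit=1):
    """LOOKUP the first-phase results ON b, then WHERE m = bar, then LIMIT."""
    joined = [{**r, "m": stats[r["b"]]} for r in eval_bar(rows)]
    return [r for r in joined if r["m"] == r["bar"]][:limit]


stats = first_phase(rows)
result = second_phase(rows, stats)
print(stats)
print(result)
```

The second phase re-runs the part of the query before `INLINESTATS` and joins each row with its group's aggregate, which is why every input row survives until the `WHERE` filter.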
Nhat Nguyen
f275dff609
Add Lucene segment-level fields stats (#111123)
This change returns the total number of fields at the segment level, 
allowing for a more accurate estimate of the memory used by Lucene. The
new estimate is expected to be closer to the actual memory usage than
the current estimate using the index-level field count, due to the
non-trivial overhead incurred by each Lucene segment. Two new fields are
introduced: total_segment_fields, which is the total number of fields at
the segment level, and average_fields_per_segment. The overhead per
field in segments with fewer fields is larger than in segments with many
fields.
2024-07-23 08:52:39 -07:00
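As a rough illustration of how the two fields named in the commit above might relate, the sketch below derives both from a list of per-segment field counts. The commit message does not say how `average_fields_per_segment` is computed; dividing the total by the number of segments is an assumption, and the function name is hypothetical.

```python
def segment_field_stats(fields_per_segment):
    """Summarise per-segment field counts.

    Assumption: the average is the total divided by the segment count.
    """
    total = sum(fields_per_segment)
    count = len(fields_per_segment)
    average = total / count if count else 0.0
    return {
        "total_segment_fields": total,
        "average_fields_per_segment": average,
    }


# Three hypothetical segments with 10, 10 and 40 fields.
stats = segment_field_stats([10, 10, 40])
print(stats)
```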
Fang Xing
686c96f372
docs for named and positional parameters (#111178) 2024-07-23 08:27:34 -04:00
Craig Taverner
ba3501ae29
Simple addition of ES|QL to geo overview page (#111158) 2024-07-23 12:00:05 +02:00
David Kyle
12d26b7573
[ML DOCS]Timeout only applies to ELSER and built in E5 models (#111159) 2024-07-23 09:26:40 +01:00
Tommaso Teofili
9b86fd17aa
Document how to update dense vector field type (#111038) 2024-07-23 09:55:31 +02:00
Fang Xing
66dd2687d5
[ES|QL] Generate docs for unregistered esql functions from annotations (#108749)
* render docs for operators
2024-07-22 14:58:17 -04:00
Stef Nestor
cc245c4022
Add link to Max Shards Per Node docs (#110993)
... from exception message
2024-07-22 16:48:51 +01:00
Iván Cea Fontenla
195b916e2b
ESQL: TOP aggregation IP support (#111105)
Added IP support to TOP() aggregation.

Slightly adapted the string template organization for esql/compute to
(also?) work with specific datatypes. Right now it may be a bit messy,
but we need the specific support for cases like this.
2024-07-22 22:35:48 +10:00
Iván Cea Fontenla
101775b93d
Added Sum aggregation tests and docs (#110984)
- Added SUM() agg tests (which autogenerate docs)
- Converted non-finite doubles to nulls in aggregator

The complete set of tests depends on
https://github.com/elastic/elasticsearch/issues/110437, as commented in
code. After completion, the test can be uncommented and everything
should work fine
2024-07-22 21:43:58 +10:00
David Turner
c8583cdcf8
Rework docs on logging levels (#111143)
Clarify that the default config is the recommended one, and that users
should not normally enable `DEBUG` or `TRACE` logging without looking at
the source code. Also reorders the information a bit for easier reading.
2024-07-22 20:23:06 +10:00
Iván Cea Fontenla
96e1b15b9d
ESQL: Support IP fields in MAX and MIN aggregations (#110921)
- Support IP in MAX() and MIN()
  - Used a custom IpArrayState for it, as it's quite different from the `X-ArrayState.java.st` generated ones
- Add IP test cases for aggregation tests
2024-07-19 23:23:13 +10:00
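To make concrete what MAX and MIN over IP values mean, here is a small sketch using Python's standard `ipaddress` module. It is not the `IpArrayState` code from the commit above. It assumes IPs are ordered by a 16-byte form with IPv4 addresses mapped into IPv6 (`::ffff:a.b.c.d`), and the helper names are hypothetical.

```python
import ipaddress


def to_16_bytes(text):
    """Return a 16-byte sort key; IPv4 is mapped to ::ffff:a.b.c.d (assumed)."""
    ip = ipaddress.ip_address(text)
    if ip.version == 4:
        ip = ipaddress.IPv6Address("::ffff:" + str(ip))
    return ip.packed


def ip_max(values):
    """MAX over IP strings, ignoring nulls (None)."""
    non_null = [v for v in values if v is not None]
    return max(non_null, key=to_16_bytes) if non_null else None


def ip_min(values):
    """MIN over IP strings, ignoring nulls (None)."""
    non_null = [v for v in values if v is not None]
    return min(non_null, key=to_16_bytes) if non_null else None


print(ip_max(["10.0.0.2", "10.0.0.10", None]))
print(ip_min(["192.168.1.1", "::1"]))
```

Comparing the packed bytes rather than the text avoids the string-ordering trap in which `10.0.0.2` would sort after `10.0.0.10`.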
Iván Cea Fontenla
0e68117935
Added Percentile aggregation tests and Kibana docs (#111050)
- Added Percentile aggregation tests and autogen docs
- Added a new "appendix" section to FunctionInfo. Existing Percentile docs had a final, long section with info, and we need this to keep it. We have a "detailedDescription" attribute already, but it's right after the description, and it would make it harder to read the important bits of the function (types, examples...). So I'm not reusing it.
2024-07-19 14:28:11 +02:00
Salvatore Campagna
0f584176ca
Rename logs index mode to logsdb (#111054) 2024-07-19 13:38:58 +02:00
Iraklis Psaroudakis
89c8e8e06b
Correct force merge disk space requirements (#111066)
Correct force merge disk space requirements
2024-07-19 11:31:05 +03:00
Stef Nestor
67a8e890af
(Doc+) Flush out Data Tiers (#107981)
I highly value the content on this [Data Tiers](https://www.elastic.co/guide/en/elasticsearch/reference/current/data-tiers.html) page. Thanks for writing it! In my experience, some users may become slightly confused by its golden nuggets due to its brevity. This PR attempts to flesh out common questions while remaining concise.

The main changes are in the first and second-to-last sections; however, I do attempt some heading restructuring to make the TOC idea-groupings more clear for easier scan-throughs. 

The specific clarifications I'd like to push in order of appearance:

- There's the content tier (for "data category" > "content" as we've dubbed it on the higher page) and the data temperature tiers (for time series). That the temperature tiers group together is technically not stated, so users end up asking when they'd go hot>warm vs content>warm, etc. I suspect this confusion arises only because users come straight to this page instead of starting at the hierarchy-parent page, so I have linked up to it.
- (Main) Frozen being accessed/searched "rarely" should imply, well, rarely. I wrote 1% in the PR `[TIP]` guideline section as a discussion starting point. Frequently we see users not understanding either that they actually have, or that they shouldn't have, ≥25% of all searches hitting the frozen tier. This comes up because of architecture bugs (e.g. frozen indices with future timestamps) but also just happenstance (e.g. 01605242 where of searches they hit majority hot, ~5% cold, but then again hit 75% frozen).
- There's a slew of "how do I check that?", "how do I change that (at creation/later)?", "what if I set it null?" questions we get about `_tier_preference`, so I just extended the existing section about it.

---------

Co-authored-by: shainaraskas <58563081+shainaraskas@users.noreply.github.com>
Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com>
2024-07-18 14:35:41 -06:00
Liam Thompson
3de980f8fc
[DOCS] Fix rendering bug (#111025)
Closes https://github.com/elastic/elasticsearch/issues/111023
2024-07-18 14:09:09 +02:00
David Turner
51d658e3cd
Always allow rebalancing by default (#111015)
Today `cluster.routing.allocation.allow_rebalance` defaults to
`indices_all_active` which blocks all rebalancing moves while the
cluster is in `yellow` or `red` health. This was appropriate for the
legacy allocator which might do too many rebalancing moves otherwise.
The desired-balance allocator has better support for rebalancing a
cluster that is not in `green` health, and expects to be able to
rebalance some shards away from over-full nodes to avoid allocating
shards to undesirable locations in the first place. This commit changes
the default `allow_rebalance` setting to `always`.
2024-07-18 12:35:50 +01:00
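For clusters that want to keep the previous behaviour after the default change described above, the setting can still be set explicitly. The sketch below only builds the JSON body one would send to the cluster update settings API (`PUT _cluster/settings`). The helper name is hypothetical, and the list of accepted values reflects the documented options for this setting as I understand them.

```python
import json

SETTING = "cluster.routing.allocation.allow_rebalance"
# Values accepted by the setting (assumed from the reference docs).
ALLOWED = {"always", "indices_primaries_active", "indices_all_active"}


def allow_rebalance_body(value, persistent=True):
    """Build a cluster-settings request body for the given value."""
    if value not in ALLOWED:
        raise ValueError(f"unsupported value: {value!r}")
    scope = "persistent" if persistent else "transient"
    return {scope: {SETTING: value}}


# Restore the old default explicitly.
body = allow_rebalance_body("indices_all_active")
print(json.dumps(body, indent=2))
```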
Simon Cooper
5b606b5799
Update known-issues for the features upgrade bug, and increase scope to include 8.12.x (#111014) 2024-07-18 11:22:10 +01:00
Carlos Delgado
6191fe3b16
Clarify synonyms docs (#110822) 2024-07-18 10:20:26 +02:00
Liam Thompson
b535df78df
[DOCS] Retrievers and rerankers (#110007)
Co-authored-by: Adam Demjen <demjened@gmail.com>
2024-07-18 09:41:00 +02:00
Joe Gallo
27e7601698
Directly download commercial ip geolocation databases from providers (#110844)
Co-authored-by: Keith Massey <keith.massey@elastic.co>
2024-07-17 20:55:14 -04:00
Kathleen DeRusso
d943a1fac4
Fix references to incorrect query rule criteria type (#110994) 2024-07-17 17:00:28 -04:00
Martijn van Groningen
22005952c6
Adding minimal docs around using index mode logs. (#110932)
This adds minimal docs around how to use the new logs index mode for data
streams (most common use case). This is minimal because logs index mode
is still in tech preview. Minimal docs should allow any interested users
to experiment with the new logs index mode.
2024-07-18 03:52:58 +10:00
Alexander Spies
da5392134f
ESQL: Validate unique plan attribute names (#110488)
* Enforce an invariant in our dependency checker so that logical plans never have duplicate output attribute names or ids.
* Fix ROW to not produce columns with duplicate names.
* Fix ResolveUnionTypes to not create multiple synthetic field attributes for the same union type.
* Add tests for commands using the same column name more than once.
* Update docs w.r.t. how commands behave if they are used with duplicate column names.
2024-07-17 11:39:02 +02:00
Liam Thompson
cadb3f9325
Remove typo put-lifecycle.asciidoc (#110875) (#110918) 2024-07-17 08:12:52 +01:00
Benjamin Trent
14bce355e5
Actually deprecate edge_ngram side parameter (#110829)
This parameter has been "deprecated" for a while, but no action was
actually taken. This actually deprecates the value for future removal.
2024-07-17 03:51:27 +10:00
Carlos Delgado
453b82706d
Add the EXP ES|QL function (#110879) 2024-07-16 16:36:01 +02:00
David Turner
3d39baa7c0
Add link to MAX_DOCS exception message (#110911)
Follow-up to #110449
2024-07-16 13:48:23 +01:00
Craig Taverner
1d6f1a0223
Union types documentation (#110183)
* Union types documentation

* Try remove asciidoc error

* Another attempt

* Using literal block

* Nicer formatting

* Remove partintro

* Small refinements

* Edits for clarity and style

---------

Co-authored-by: Marci W <333176+marciw@users.noreply.github.com>
2024-07-16 12:06:19 +02:00
Stef Nestor
512bca8669
(Doc+) Error "number of documents in the index can't exceed" (#110449)
* (Doc+) Error "number of documents in the index can't exceed"

👋 howdy, team! 

This adds a resolution outline for error ... which induces ongoing, lowkey support
```
Number of documents in the index can't exceed [2147483519]
```

* feedback

* feedback

Co-authored-by: David Turner <david.turner@elastic.co>

* feedback

Co-authored-by: David Turner <david.turner@elastic.co>
Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com>

* feedback

* feedback

* Test change to address docs check failure

* Revert test change

* Test docs check

---------

Co-authored-by: David Turner <david.turner@elastic.co>
Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com>
2024-07-16 11:39:30 +02:00
Nhat Nguyen
04845342f4
Fork field-caps for ES|QL (#110738)
We need to fork the field-caps API for ES|QL to allow changes to the new
internal API without risking breaking the external field-caps API.
2024-07-15 17:21:16 -07:00
Jonathan Buttner
07c7bf438f
Anthropic docs (#110850) 2024-07-15 13:43:14 -04:00
Iván Cea Fontenla
43a3af66e8
ESQL: Add boolean support to TOP aggregation (#110718)
- Added a custom implementation of BooleanBucketedSort to keep the top booleans
- Added boolean aggregator to TOP
- Added tests (Boolean aggregator tests, Top tests for boolean, and added boolean fields to CSV cases)
2024-07-16 03:14:29 +10:00
David Turner
3f9f70469e
Simplify reset-features API (#110866)
Today we return HTTP code 207 if some features successfully reset and
others failed. This is not an appropriate response code: it has a _very_
precise meaning according to the HTTP specification to which we do not
adhere. Since this API is used only in tests we can be stricter and
return a 500 unless it completely succeeds.
2024-07-15 13:23:21 +01:00
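The stricter rule from the reset-features commit above amounts to an all-or-nothing status decision. A minimal sketch follows, with a hypothetical function name and a hypothetical mapping of feature names to success flags. The real handler is Java code inside Elasticsearch and is not shown here.

```python
def reset_features_status(results):
    """Return 200 only if every feature reset succeeded, otherwise 500.

    `results` maps a feature name to True (reset succeeded) or False.
    """
    return 200 if all(results.values()) else 500


print(reset_features_status({"feature_a": True, "feature_b": True}))
print(reset_features_status({"feature_a": True, "feature_b": False}))
```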
Jedr Blaszyk
8417542edd
[Connector APIs] Add docs for sync job claim endpoint (#110412) 2024-07-15 13:01:12 +02:00
Liam Thompson
6590894c99
[DOCS] Add note about ML model 502 timeout when using Create inference API (#110835)
* [DOCS] Add note about ml model 502 timeout

* Add note to API ref
2024-07-15 12:19:21 +02:00
Nik Everett
9f001169c6
ESQL: Document the pattern to count TRUE (#110820)
This adds to the docs an example of counting the TRUE results
of an expression. You do `COUNT(a > 0 OR NULL)`. That turns the `FALSE`
into `NULL`. Which you need to do because `COUNT(false)` is `1` -
because it's a value. But `COUNT(null)` is `0` - because it's the
absence of values.

We would like to make something more intuitive for this one day. But for
now, this is what works.
2024-07-12 14:08:22 -04:00
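The reasoning in the commit above rests on three-valued logic: `FALSE OR NULL` is `NULL`, and `COUNT` skips nulls. The sketch below models this in plain Python, with `None` standing in for `NULL`. The helpers `or3`, `gt`, and `count` are hypothetical stand-ins, not ES|QL functions.

```python
def or3(x, y):
    """Three-valued OR, with None standing in for NULL."""
    if x is True or y is True:
        return True
    if x is None or y is None:
        return None
    return False


def gt(a, b):
    """a > b, yielding NULL when either side is NULL."""
    return None if a is None or b is None else a > b


def count(values):
    """COUNT: the number of non-null values."""
    return sum(1 for v in values if v is not None)


a_values = [3, -1, 0, 7, None]

# COUNT(a > 0) counts TRUE and FALSE alike, because both are values.
plain = count(gt(a, 0) for a in a_values)

# COUNT(a > 0 OR NULL) turns FALSE into NULL, so only TRUE is counted.
trues = count(or3(gt(a, 0), None) for a in a_values)

print(plain, trues)
```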
Kathleen DeRusso
7493403e5b
Remove preview from top level query rules API page (#110838) 2024-07-12 12:19:12 -04:00
Mark J. Hoy
560d4048d2
[Inference API] Add Docs for Amazon Bedrock Support for the Inference API (#110594)
* Add Amazon Bedrock Inference API to docs

* fix example errors

* update semantic search tutorial; add changelog

* fix typo

* fix error; accept suggestions
2024-07-12 10:14:54 -04:00
elasticsearchmachine
c61b9eebd5
Forward port release notes for v8.14.3 (#110787) 2024-07-12 08:09:12 +01:00
Niels Bauman
86727a8741
Add size_in_bytes to enrich cache stats (#110578)
As preparation for #106081, this PR adds the `size_in_bytes`
field to the enrich cache. This field is calculated by summing
the ByteReference sizes of all the search hits in the cache.
It's not a perfect representation of the size of the enrich cache
on the heap, but some experimentation showed that it's quite close.
2024-07-12 08:53:53 +02:00
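The `size_in_bytes` calculation described above is a sum over the serialized search hits held in the cache. A minimal sketch of that idea, assuming a hypothetical cache shaped as a dict from cache key to a list of raw hit bytes (the real enrich cache is a Java structure and is not modelled here):

```python
def cache_size_in_bytes(cache):
    """Sum the byte length of every cached search hit."""
    return sum(len(hit) for hits in cache.values() for hit in hits)


# Hypothetical cache contents: each entry holds the raw bytes of its hits.
cache = {
    "key1": [b'{"a":1}', b'{"b":22}'],
    "key2": [b"{}"],
}

size = cache_size_in_bytes(cache)
print(size)
```

As the commit notes, a sum like this only approximates heap usage, since it ignores per-object overhead.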