[AI Infra] Update NL-2-ESQL docs (#224868)

## Summary

This PR pulls the latest changes from Elasticsearch's ES|QL
documentation and updates the ES|QL docs. It also adds new ES|QL docs for:
- KQL
- TO_DATE_NANOS

Test results:

```
Model gpt-4o scored 27.700000000000003 out of 30
-------------------------------------------
-------------------------------------------
Model gpt-4o scores per category
- category: ES|QL commands and functions usage - scored 12 out of 13
- category: ES|QL query generation - scored 12.200000000000003 out of 13
- category: SPL to ESQL - scored 3.5 out of 4
-------------------------------------------

Model gpt-4o scored 25.300000000000004 out of 30
-------------------------------------------
-------------------------------------------
Model gpt-4o scores per category
- category: ES|QL commands and functions usage - scored 10.3 out of 13
- category: ES|QL query generation - scored 11.500000000000002 out of 13
- category: SPL to ESQL - scored 3.5 out of 4
-------------------------------------------
-------------------------------------------

Model gpt-4o scored 26.300000000000004 out of 30
-------------------------------------------
-------------------------------------------
Model gpt-4o scores per category
- category: ES|QL commands and functions usage - scored 10.8 out of 13
- category: ES|QL query generation - scored 11.700000000000003 out of 13
- category: SPL to ESQL - scored 3.8 out of 4


Model gpt-4o scored 27.500000000000004 out of 30
-------------------------------------------
-------------------------------------------
Model gpt-4o scores per category
- category: ES|QL commands and functions usage - scored 12 out of 13
- category: ES|QL query generation - scored 11.700000000000003 out of 13
- category: SPL to ESQL - scored 3.8 out of 4


```
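The four totals above can be condensed into a single figure. The snippet below is a small sketch (the `summarizeRuns` helper is hypothetical, not part of the evaluation framework) that averages the run totals and rounds away the floating-point noise visible in the raw output.

```typescript
// Hypothetical helper: condenses per-run totals into rounded summary figures.
function summarizeRuns(totals: number[]): { mean: string; min: string; max: string } {
  if (totals.length === 0) {
    throw new Error('At least one run total is required');
  }
  const sum = totals.reduce((acc, value) => acc + value, 0);
  return {
    // toFixed(1) hides artifacts such as 27.700000000000003
    mean: (sum / totals.length).toFixed(1),
    min: Math.min(...totals).toFixed(1),
    max: Math.max(...totals).toFixed(1),
  };
}

// Totals copied from the four gpt-4o runs listed above (each out of 30).
const runTotals = [
  27.700000000000003, 25.300000000000004, 26.300000000000004, 27.500000000000004,
];
console.log(summarizeRuns(runTotals));
```

The mean is a convenience figure only; the per-category scores above remain the more informative breakdown.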


### Checklist

Check that the PR satisfies the following conditions.

Reviewers should verify this PR satisfies this list as well.

- [ ] Any text added follows [EUI's writing
guidelines](https://elastic.github.io/eui/#/guidelines/writing), uses
sentence case text and includes [i18n
support](https://github.com/elastic/kibana/blob/main/src/platform/packages/shared/kbn-i18n/README.md)
- [ ]
[Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html)
was added for features that require explanation or tutorials
- [ ] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios
- [ ] If a plugin configuration key changed, check if it needs to be
allowlisted in the cloud and added to the [docker
list](https://github.com/elastic/kibana/blob/main/src/dev/build/tasks/os_packages/docker_generator/resources/base/bin/kibana-docker)
- [ ] This was checked for breaking HTTP API changes, and any breaking
changes have been approved by the breaking-change committee. The
`release_note:breaking` label should be applied in these situations.
- [ ] [Flaky Test
Runner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1) was
used on any tests changed
- [ ] The PR description includes the appropriate Release Notes section,
and the correct `release_note:*` label is applied per the
[guidelines](https://www.elastic.co/guide/en/kibana/master/contributing.html#kibana-release-notes-process)

### Identify risks

Does this PR introduce any risks? For example, consider risks like
hard-to-test bugs, performance regressions, or potential data loss.

Describe the risk, its severity, and mitigation for each identified
risk. Invite stakeholders and evaluate how to proceed before merging.

- [ ] [See some risk
examples](https://github.com/elastic/kibana/blob/main/RISK_MATRIX.mdx)
- [ ] ...

---------

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Quynh Nguyen (Quinn) 2025-06-24 18:17:52 -05:00 committed by GitHub
parent 535c27fb90
commit dd29b09929
145 changed files with 2110 additions and 1698 deletions

View file

@ -92,8 +92,8 @@ export function createOutputApi(chatCompleteApi: ChatCompleteAPI) {
return {
id,
output:
event.toolCalls.length && 'arguments' in event.toolCalls[0].function
? event.toolCalls[0].function.arguments
event?.toolCalls?.length && 'arguments' in event?.toolCalls[0]?.function
? event.toolCalls[0]?.function?.arguments
: undefined,
content: event.content,
type: OutputEventType.OutputComplete,
@ -107,8 +107,8 @@ export function createOutputApi(chatCompleteApi: ChatCompleteAPI) {
id,
content: chatResponse.content,
output:
chatResponse.toolCalls.length && 'arguments' in chatResponse.toolCalls[0].function
? chatResponse.toolCalls[0].function.arguments
chatResponse?.toolCalls?.length && 'arguments' in chatResponse?.toolCalls[0]?.function
? chatResponse?.toolCalls[0]?.function?.arguments
: undefined,
};
},

View file

@ -8,3 +8,20 @@ The generated documentation is validated and will emit warnings when invalid que
- checked out `built-docs` repo in the same folder as the `kibana` repository
- a running Kibana instance
- an installed Generative AI connector
### Run script to generate ES|QL docs and verify syntax
```
node x-pack/platform/plugins/shared/inference/scripts/load_esql_docs/index.js
```
The script will also generate a report of syntax errors found during the generation process, located at
`x-pack/platform/plugins/shared/inference/server/tasks/nl_to_esql/esql_docs/__tmp__/syntax-errors.json`. This file will not be checked into git.
### Checking syntax errors for generated files
After fixing syntax errors, if you only need to check whether any errors remain, you can run a script that reports them.
```
node x-pack/platform/plugins/shared/inference/scripts/report_syntax_errors/index.js
```
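As a rough illustration of how the generated `syntax-errors.json` report could be consumed, the sketch below turns its contents into one summary line per query. It assumes the file holds an array of `{ query, errors }` entries, as written by `reportSyntaxErrors`, and that each error carries either a `text` or a `message` field; the `summarizeSyntaxErrors` helper itself is hypothetical.

```typescript
// Assumed minimal shape of a reported error: validation messages are assumed
// to expose `text`, editor errors `message`. Other fields are ignored here.
interface ReportedMessage {
  text?: string;
  message?: string;
}

interface SyntaxErrorEntry {
  query: string;
  errors: ReportedMessage[];
}

// Hypothetical helper: one summary line per query that had issues.
function summarizeSyntaxErrors(reportJson: string): string[] {
  const entries = JSON.parse(reportJson) as SyntaxErrorEntry[];
  return entries.map(({ query, errors }) => {
    const firstLine = query.trim().split('\n')[0];
    const firstIssue = errors[0]?.text ?? errors[0]?.message ?? 'unknown error';
    return `${errors.length} issue(s) in "${firstLine}": ${firstIssue}`;
  });
}

// Made-up report content, for illustration only.
const exampleReport = JSON.stringify([
  { query: 'FROM logs\n| WHERE', errors: [{ text: 'unexpected end of input' }] },
]);
console.log(summarizeSyntaxErrors(exampleReport).join('\n'));
```

In practice the JSON string would come from reading the report file rather than from an inline sample.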

View file

@ -60,9 +60,10 @@ export async function extractDocEntries({
log: ToolingLog;
inferenceClient: ScriptInferenceClient;
}): Promise<ExtractionOutput> {
const files = await fastGlob(`${builtDocsDir}/html/en/elasticsearch/reference/master/esql*.html`);
const path = `${builtDocsDir}/html/en/elasticsearch/reference/current/esql*.html`;
const files = await fastGlob(path);
if (!files.length) {
throw new Error('No files found');
throw new Error(`No files found at path: ${path}`);
}
const output: ExtractionOutput = {

View file

@ -6,8 +6,6 @@
*/
import { run } from '@kbn/dev-cli-runner';
import { ESQLMessage, EditorError } from '@kbn/esql-ast';
import { validateQuery } from '@kbn/esql-validation-autocomplete';
import Fs from 'fs/promises';
import Path from 'path';
import yargs, { Argv } from 'yargs';
@ -20,7 +18,8 @@ import { KibanaClient } from '../util/kibana_client';
import { selectConnector } from '../util/select_connector';
import { syncBuiltDocs } from './sync_built_docs_repo';
import { extractDocEntries } from './extract_doc_entries';
import { generateDoc, FileToWrite } from './generate_doc';
import { generateDoc } from './generate_doc';
import { reportSyntaxErrors } from '../report_syntax_errors/report_syntax_errors';
yargs(process.argv.slice(2))
.command(
@ -128,49 +127,10 @@ yargs(process.argv.slice(2))
);
}
log.info(`Checking syntax...`);
const syntaxErrors = (
await Promise.all(docFiles.map(async (file) => await findEsqlSyntaxError(file)))
).flat();
log.warning(
`Please verify the following queries that had syntax errors\n${JSON.stringify(
syntaxErrors,
null,
2
)}`
);
await reportSyntaxErrors(outDir, log, docFiles);
},
{ log: { defaultLevel: argv.logLevel as any }, flags: { allowUnexpected: true } }
);
}
)
.parse();
interface SyntaxError {
query: string;
errors: Array<ESQLMessage | EditorError>;
}
const findEsqlSyntaxError = async (doc: FileToWrite): Promise<SyntaxError[]> => {
return Array.from(doc.content.matchAll(INLINE_ESQL_QUERY_REGEX)).reduce(
async (listP, [match, query]) => {
const list = await listP;
const { errors, warnings } = await validateQuery(query, {
// setting this to true, we don't want to validate the index / fields existence
ignoreOnMissingCallbacks: true,
});
const all = [...errors, ...warnings];
if (all.length) {
list.push({
errors: all,
query,
});
}
return list;
},
Promise.resolve([] as SyntaxError[])
);
};

View file

@ -0,0 +1,10 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
require('@kbn/babel-register').install();
require('./report_syntax_errors');

View file

@ -0,0 +1,127 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
import { ESQLMessage, EditorError } from '@kbn/esql-ast';
import { validateQuery } from '@kbn/esql-validation-autocomplete';
import Fs from 'fs/promises';
import Path from 'path';
import yargs, { Argv } from 'yargs';
import type { ToolingLog } from '@kbn/tooling-log';
import { run } from '@kbn/dev-cli-runner';
import { INLINE_ESQL_QUERY_REGEX } from '../../common/tasks/nl_to_esql/constants';
import type { FileToWrite } from '../load_esql_docs/generate_doc';
interface SyntaxError {
query: string;
errors: Array<ESQLMessage | EditorError>;
}
/**
* Logs syntax errors and writes them to {outDir}/__tmp__/syntax-errors.json.
* If docsToCheck is provided, those docs are checked instead of reading the files from outDir.
* @param outDir - The directory to read generated docs from and write the syntax errors to.
* @param log - The logger to use.
* @param docsToCheck - Optional files to check for syntax errors.
*/
export const reportSyntaxErrors = async (
outDir: string,
log: ToolingLog,
docsToCheck?: FileToWrite[]
) => {
let docFiles: FileToWrite[] | undefined = docsToCheck;
if (docsToCheck) {
log.info(`Checking syntax for ${docsToCheck.length} provided files`);
} else {
log.info(`Checking syntax for files in ${outDir}`);
docFiles = await Fs.readdir(outDir).then(async (files) => {
return await Promise.all(
files
.filter((file) => file.endsWith('.txt'))
.map(async (file) => {
const content = await Fs.readFile(Path.join(outDir, file), 'utf8');
return {
name: file,
content,
};
})
);
});
log.info(`Found ${(docFiles ?? []).length} files to check in ${outDir}`);
}
if (!docFiles) return;
const syntaxErrors = (
await Promise.all(docFiles.map(async (file) => await findEsqlSyntaxError(file)))
).flat();
log.warning(
`Please verify the following queries that had syntax errors\n${JSON.stringify(
syntaxErrors,
null,
2
)}`
);
if (syntaxErrors.length > 0) {
const tmpDir = Path.join(outDir, '__tmp__');
// `recursive: true` makes mkdir a no-op when the directory already exists
await Fs.mkdir(tmpDir, { recursive: true });
const syntaxErrorsFile = Path.join(tmpDir, 'syntax-errors.json');
await Fs.writeFile(syntaxErrorsFile, JSON.stringify(syntaxErrors, null, 2));
log.info(`Syntax errors written to ${syntaxErrorsFile}`);
}
};
const findEsqlSyntaxError = async (doc: FileToWrite): Promise<SyntaxError[]> => {
return Array.from(doc.content.matchAll(INLINE_ESQL_QUERY_REGEX)).reduce(
async (listP, [match, query]) => {
const list = await listP;
const { errors, warnings } = await validateQuery(query, {
// setting this to true, we don't want to validate the index / fields existence
ignoreOnMissingCallbacks: true,
});
const all = [...errors, ...warnings];
if (all.length) {
list.push({
errors: all,
query,
});
}
return list;
},
Promise.resolve([] as SyntaxError[])
);
};
yargs(process.argv.slice(2))
.command(
'*',
'Report ES|QL syntax errors in generated documentation',
(y: Argv) =>
y
.option('logLevel', {
describe: 'Log level',
string: true,
default: process.env.LOG_LEVEL || 'info',
choices: ['info', 'debug', 'silent', 'verbose'],
})
.option('outDir', {
describe: 'The directory to write the syntax errors to.',
default: Path.join(__dirname, '../../server/tasks/nl_to_esql/esql_docs'),
})
.parse(),
(argv) => {
run(
async ({ log }) => {
const outDir = argv.outDir as string;
await reportSyntaxErrors(outDir, log);
},
{ log: { defaultLevel: argv.logLevel as any }, flags: { allowUnexpected: true } }
);
}
)
.parse();

View file

@ -1,6 +1,6 @@
# ABS
The ABS function returns the absolute value of a given number.
The `ABS` function returns the absolute value of a numeric expression.
## Syntax
@ -8,23 +8,23 @@ The ABS function returns the absolute value of a given number.
### Parameters
#### number
#### `number`
A numeric expression. If the parameter is `null`, the function will also return `null`.
A numeric expression. If the value is `null`, the function returns `null`.
## Examples
In this example, the ABS function is used to calculate the absolute value of -1.0:
```esql
ROW number = -1.0
| EVAL abs_number = ABS(number)
```
In the following example, the ABS function is used to calculate the absolute value of the height of employees:
Calculate the absolute value of a negative number.
```esql
FROM employees
| KEEP first_name, last_name, height
| EVAL abs_height = ABS(0.0 - height)
```
```
Calculate the absolute value of the difference between `0.0` and the `height` column.

View file

@ -1,6 +1,6 @@
# ACOS
The ACOS function returns the arccosine of a given number, expressed in radians.
Returns the arccosine of a number as an angle, expressed in radians.
## Syntax
@ -8,20 +8,16 @@ The ACOS function returns the arccosine of a given number, expressed in radians.
### Parameters
#### number
#### `number`
This is a number between -1 and 1. If the parameter is `null`, the function will also return `null`.
- A number between -1 and 1.
- If `null`, the function returns `null`.
## Examples
In this example, the ACOS function calculates the arccosine of 0.9.
```esql
ROW a=.9
| EVAL acos=ACOS(a)
ROW a = .9
| EVAL acos = ACOS(a)
```
```esql
ROW b = -0.5
| EVAL acos_b = ACOS(b)
```
Calculate the arccosine of the value `0.9` and store the result in a new column named `acos`.

View file

@ -1,6 +1,6 @@
# ASIN
The ASIN function returns the arcsine of a given numeric expression as an angle, expressed in radians.
Returns the arcsine of the input numeric expression as an angle, expressed in radians.
## Syntax
@ -8,22 +8,16 @@ The ASIN function returns the arcsine of a given numeric expression as an angle,
### Parameters
#### number
#### `number`
This is a numeric value ranging between -1 and 1. If the parameter is `null`, the function will also return `null`.
- A number between -1 and 1.
- If `null`, the function returns `null`.
## Examples
In this example, the ASIN function calculates the arcsine of 0.9:
```esql
ROW a=.9
| EVAL asin=ASIN(a)
```
In this example, the ASIN function calculates the arcsine of -0.5:
```esql
ROW a = -.5
ROW a = .9
| EVAL asin = ASIN(a)
```
Calculate the arcsine of the value `0.9` and return the result in radians.

View file

@ -1,6 +1,6 @@
# ATAN
The ATAN function returns the arctangent of a given numeric expression, expressed in radians.
Returns the arctangent of the input numeric expression as an angle, expressed in radians.
## Syntax
@ -8,9 +8,9 @@ The ATAN function returns the arctangent of a given numeric expression, expresse
### Parameters
#### number
#### `number`
This is a numeric expression. If the parameter is `null`, the function will also return `null`.
Numeric expression. If `null`, the function returns `null`.
## Examples
@ -19,6 +19,8 @@ ROW a=12.9
| EVAL atan = ATAN(a)
```
Calculate the arctangent of the value `12.9` and store the result in a new column named `atan`.
```esql
ROW x=5.0, y=3.0
| EVAL atan_yx = ATAN(y / x)

View file

@ -1,6 +1,6 @@
# AVG
The AVG function calculates the average of a numeric field.
The `AVG` function calculates the average of a numeric field.
## Syntax
@ -8,22 +8,26 @@ The AVG function calculates the average of a numeric field.
### Parameters
#### number
#### `number`
The numeric field for which the average is calculated.
A numeric field to calculate the average.
## Examples
Calculate the average height of employees:
Basic Usage
```esql
FROM employees
| STATS AVG(height)
```
The AVG function can be used with inline functions. For example:
Calculate the average height of employees.
Using Inline Functions
```esql
FROM employees
| STATS avg_salary_change = ROUND(AVG(MV_AVG(salary_change)), 10)
```
Calculate the average salary change by first averaging multiple values per row using `MV_AVG`, and then applying the `AVG` function with rounding to 10 decimal places.

View file

@ -1,6 +1,8 @@
# BIT_LENGTH
This function calculates the bit length of a given string.
Returns the bit length of a string.
**Note:** All strings are in UTF-8, so a single character can use multiple bytes.
## Syntax
@ -8,11 +10,9 @@ This function calculates the bit length of a given string.
### Parameters
#### string
#### `string`
This is the string whose bit length you want to calculate. If `null` is provided, the function will return `null`.
**Note**: Strings are in UTF-8 format, which means a single character may occupy multiple bytes.
String expression. If `null`, the function returns `null`.
## Examples
@ -22,3 +22,5 @@ FROM airports
| KEEP city
| EVAL fn_length = LENGTH(city), fn_bit_length = BIT_LENGTH(city)
```
This example calculates both the character length and the bit length of city names in airports located in India.

View file

@ -1,28 +1,28 @@
# BUCKET
The BUCKET function allows you to create groups of values, known as buckets, from a datetime or numeric input. The size of the buckets can be specified directly or determined based on a recommended count and values range.
The `BUCKET` function creates groups of values—buckets—out of a datetime or numeric input. The size of the buckets can either be provided directly or chosen based on a recommended count and values range.
## Syntax
`BUCKET(field, buckets [, from, to])`
`BUCKET(field, buckets, from, to)`
### Parameters
#### field
#### `field`
A numeric or date expression from which to derive buckets.
#### buckets
#### `buckets`
The target number of buckets, or the desired bucket size if `from` and `to` parameters are omitted.
The target number of buckets or the desired bucket size if the `from` and `to` parameters are omitted.
#### from
#### `from` (optional)
(optional) The start of the range. This can be a number, a date, or a date expressed as a string.
The start of the range. Can be a number, a date, or a date expressed as a string.
#### to
#### `to` (optional)
(optional) The end of the range. This can be a number, a date, or a date expressed as a string.
The end of the range. Can be a number, a date, or a date expressed as a string.
## Important notes:
@ -36,7 +36,7 @@ When the bucket size is provided directly for time interval, it is expressed as
## Examples
For instance, asking for at most 20 buckets over a year results in monthly buckets:
Using a target number of buckets, a start of a range, and an end of a range
```esql
FROM employees
@ -45,7 +45,31 @@ FROM employees
| SORT hire_date
```
If the desired bucket size is known in advance, simply provide it as the second argument, leaving the range out:
This example creates buckets for hire dates in 1985, aiming for 20 buckets. The actual number of buckets may vary depending on the range.
Combine BUCKET with an aggregation to create a histogram
```esql
FROM employees
| WHERE hire_date >= "1985-01-01T00:00:00Z" AND hire_date < "1986-01-01T00:00:00Z"
| STATS hires_per_month = COUNT(*) BY month = BUCKET(hire_date, 20, "1985-01-01T00:00:00Z", "1986-01-01T00:00:00Z")
| SORT month
```
This example calculates the number of hires per month in 1985.
Asking for more buckets can result in a smaller range
```esql
FROM employees
| WHERE hire_date >= "1985-01-01T00:00:00Z" AND hire_date < "1986-01-01T00:00:00Z"
| STATS hires_per_week = COUNT(*) BY week = BUCKET(hire_date, 100, "1985-01-01T00:00:00Z", "1986-01-01T00:00:00Z")
| SORT week
```
This example creates weekly buckets for hire dates in 1985, aiming for 100 buckets.
Providing the bucket size directly
```esql
FROM employees
@ -54,7 +78,9 @@ FROM employees
| SORT week
```
BUCKET can also operate on numeric fields. For example, to create a salary histogram:
This example creates weekly buckets for hire dates in 1985 by directly specifying the bucket size.
Creating a salary histogram
```esql
FROM employees
@ -62,7 +88,41 @@ FROM employees
| SORT bs
```
BUCKET may be used in both the aggregating and grouping part of the STATS ... BY ... command provided that in the aggregating part the function is referenced by an alias defined in the grouping part, or that it is invoked with the exact same expression:
This example creates a histogram of salaries, dividing the range into 20 buckets.
Omitting the range when the bucket size is known
```esql
FROM employees
| WHERE hire_date >= "1985-01-01T00:00:00Z" AND hire_date < "1986-01-01T00:00:00Z"
| STATS c = COUNT(*) BY b = BUCKET(salary, 5000.)
| SORT b
```
This example creates salary buckets with a fixed size of 5000.
Create hourly buckets for the last 24 hours
```esql
FROM sample_data
| WHERE @timestamp >= NOW() - 1 day AND @timestamp < NOW()
| STATS COUNT(*) BY bucket = BUCKET(@timestamp, 25, NOW() - 1 day, NOW())
```
This example creates hourly buckets for the last 24 hours.
Create monthly buckets for the year 1985
```esql
FROM employees
| WHERE hire_date >= "1985-01-01T00:00:00Z" AND hire_date < "1986-01-01T00:00:00Z"
| STATS AVG(salary) BY bucket = BUCKET(hire_date, 20, "1985-01-01T00:00:00Z", "1986-01-01T00:00:00Z")
| SORT bucket
```
This example calculates the average salary for each month in 1985.
Using BUCKET in both aggregating and grouping parts of STATS
```esql
FROM employees
@ -71,42 +131,16 @@ FROM employees
| KEEP s1, b1, s2, b2
```
More examples:
This example demonstrates advanced usage of `BUCKET` in both aggregation and grouping.
Adjusting bucket start value with an offset
*Regrouping employees in buckets based on salary and counting them*
```esql
FROM employees
| WHERE hire_date >= "1985-01-01T00:00:00Z" AND hire_date < "1986-01-01T00:00:00Z"
| STATS c = COUNT(*) BY b = BUCKET(salary, 5000.)
| SORT b
| STATS dates = MV_SORT(VALUES(birth_date)) BY b = BUCKET(birth_date + 1 HOUR, 1 YEAR) - 1 HOUR
| EVAL d_count = MV_COUNT(dates)
| SORT d_count, b
| LIMIT 3
```
*Group data emitted over the last 24h into 25 buckets*
```esql
FROM sample_data
| WHERE @timestamp >= NOW() - 1 day and @timestamp < NOW()
| STATS COUNT(*) BY bucket = BUCKET(@timestamp, 25, NOW() - 1 day, NOW())
```
*Similar to previous example but with fixed 1 hour bucket size*
```esql
FROM sample_data
| WHERE @timestamp >= NOW() - 1 day and @timestamp < NOW()
| STATS COUNT(*) BY bucket = BUCKET(@timestamp, 1 hour)
```
*Group employees in 20 buckets based on their hire_date and then calculate the average salary for each bucket*
```esql
FROM employees
| WHERE hire_date >= "1985-01-01T00:00:00Z" AND hire_date < "1986-01-01T00:00:00Z"
| STATS AVG(salary) BY bucket = BUCKET(hire_date, 20, "1985-01-01T00:00:00Z", "1986-01-01T00:00:00Z")
| SORT bucket
```
*Similar to previous example but using fixed 1 month buckets size*
```esql
FROM employees
| WHERE hire_date >= "1985-01-01T00:00:00Z" AND hire_date < "1986-01-01T00:00:00Z"
| STATS AVG(salary) BY bucket = BUCKET(hire_date, 1 month)
| SORT bucket
```
This example adjusts the bucket start value by adding and subtracting an offset.
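
The date examples above suggest that, given a target bucket count and a range, `BUCKET` settles on a human-friendly interval that yields at most that many buckets. The TypeScript sketch below illustrates that selection idea only. The candidate interval list, the 30-day month, and the `pickBucketInterval` name are assumptions made for illustration, not the Elasticsearch implementation.

```typescript
const HOUR_MS = 60 * 60 * 1000;
const DAY_MS = 24 * HOUR_MS;

// Assumed candidate intervals, ordered from smallest to largest.
const CANDIDATE_INTERVALS: Array<[string, number]> = [
  ['1 hour', HOUR_MS],
  ['1 day', DAY_MS],
  ['1 week', 7 * DAY_MS],
  ['1 month', 30 * DAY_MS], // approximation: months vary in length
  ['1 year', 365 * DAY_MS],
];

// Returns the smallest candidate interval that needs no more than
// `targetBuckets` buckets to cover the range [from, to).
function pickBucketInterval(targetBuckets: number, from: Date, to: Date): string {
  const spanMs = to.getTime() - from.getTime();
  for (const [label, sizeMs] of CANDIDATE_INTERVALS) {
    if (Math.ceil(spanMs / sizeMs) <= targetBuckets) {
      return label;
    }
  }
  // Fall back to the largest interval when nothing fits.
  return CANDIDATE_INTERVALS[CANDIDATE_INTERVALS.length - 1][0];
}

const start1985 = new Date('1985-01-01T00:00:00Z');
const start1986 = new Date('1986-01-01T00:00:00Z');
console.log(pickBucketInterval(20, start1985, start1986));
console.log(pickBucketInterval(100, start1985, start1986));
```

Under these assumptions, a target of 20 over the year 1985 lands on monthly buckets and a target of 100 lands on weekly buckets, in line with the examples in this file.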

View file

@ -1,6 +1,6 @@
# BYTE_LENGTH
This function calculates the byte length of a given string.
Returns the byte length of a string. Since all strings are in UTF-8, a single character may use multiple bytes.
## Syntax
@ -8,9 +8,9 @@ This function calculates the byte length of a given string.
### Parameters
#### string
#### `string`
The text string for which the byte length is to be determined. If `null` is provided, the function will return `null`.
String expression. If `null`, the function returns `null`.
## Examples
@ -20,3 +20,5 @@ FROM airports
| KEEP city
| EVAL fn_length = LENGTH(city), fn_byte_length = BYTE_LENGTH(city)
```
This example calculates both the character length and the byte length of city names in airports located in India.
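
To make the difference between character length and UTF-8 byte length concrete, here is a small TypeScript sketch. It does not call ES|QL; the `utf8Lengths` helper is hypothetical and relies on the standard `TextEncoder` API to mimic what `LENGTH`, `BYTE_LENGTH`, and `BIT_LENGTH` would report for a string.

```typescript
// Hypothetical helper: character, byte, and bit counts for a UTF-8 string.
function utf8Lengths(value: string): { characters: number; bytes: number; bits: number } {
  // TextEncoder always encodes to UTF-8.
  const bytes = new TextEncoder().encode(value).length;
  return {
    // Array.from iterates by code point, so surrogate pairs count once.
    characters: Array.from(value).length,
    bytes,
    bits: bytes * 8,
  };
}

// "Zürich" written with an escape so the precomposed U+00FC is unambiguous.
console.log(utf8Lengths('Z\u00fcrich'));
console.log(utf8Lengths('Agra'));
```

For a pure ASCII name such as `Agra` the character and byte counts match, while a non-ASCII character such as `ü` takes two bytes in UTF-8.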

View file

@ -1,28 +1,30 @@
# CASE
## CASE
The CASE function accepts pairs of conditions and values. It returns the value that corresponds to the first condition that evaluates to `true`. If no condition matches, the function returns a default value or `null` if the number of arguments is even.
The `CASE` function evaluates a series of conditions and returns a value corresponding to the first condition that evaluates to `true`. If no conditions match, a default value or `null` is returned.
## Syntax
`CASE(condition, trueValue, elseValue)`
`CASE (condition, trueValue, elseValue)`
### Parameters
#### condition
#### `condition`
A condition to evaluate.
#### trueValue
#### `trueValue`
The value that is returned when the corresponding condition is the first to evaluate to `true`. If no condition matches, the default value is returned.
The value returned when the corresponding condition is the first to evaluate to `true`. If no condition matches, the default value is returned.
#### elseValue
#### `elseValue`
The value that will be returned when no condition evaluates to `true`.
The value returned when no condition evaluates to `true`.
## Examples
In this example, employees are categorized as monolingual, bilingual, or polyglot depending on how many languages they speak:
### Determine whether employees are monolingual, bilingual, or polyglot
Classify employees based on the number of languages they speak:
```esql
FROM employees
@ -33,7 +35,9 @@ FROM employees
| KEEP emp_no, languages, type
```
Calculate the total connection success rate based on log messages:
### Calculate the total connection success rate based on log messages
Determine the success rate of connections by analyzing log messages:
```esql
FROM sample_data
@ -44,7 +48,9 @@ FROM sample_data
| STATS success_rate = AVG(successful)
```
Calculate an hourly error rate as a percentage of the total number of log messages:
### Calculate an hourly error rate as a percentage of the total number of log messages
Compute the error rate for each hour based on log messages:
```esql
FROM sample_data

View file

@ -1,8 +1,6 @@
# CATEGORIZE
The `CATEGORIZE` function organizes textual data into groups of similar format.
> **Note:** The `CATEGORIZE` function is currently in technical preview and may undergo changes or be removed in future releases.
Groups text messages into categories of similarly formatted text values.
## Syntax
@ -10,13 +8,15 @@ The `CATEGORIZE` function organizes textual data into groups of similar format.
### Parameters
#### field
#### `field`
The expression that is to be categorized.
Expression to categorize.
## Examples
The following example demonstrates how to use `CATEGORIZE` to group server log messages into categories and then aggregate their counts.
Categorizing server log messages
Categorizes server log messages into categories and aggregates their counts.
```esql
FROM sample_data
@ -25,6 +25,10 @@ FROM sample_data
## Limitations
- `CATEGORIZE` can't be used within other expressions
- `CATEGORIZE` can't be used with multiple groupings
- `CATEGORIZE` can't be used or referenced within aggregate functions
- Cannot be used within other expressions.
- Cannot be used with multiple groupings.
- Cannot be used or referenced within aggregate functions.
## Additional Notes
- The `CATEGORIZE` function requires a platinum license.

View file

@ -1,6 +1,6 @@
# CBRT
The CBRT function calculates the cube root of a given number.
Returns the cube root of a number. The input can be any numeric value, and the return value is always a double. Cube roots of infinities are `null`.
## Syntax
@ -8,9 +8,9 @@ The CBRT function calculates the cube root of a given number.
### Parameters
#### number
#### `number`
This is a numeric expression. If the parameter is `null`, the function will also return `null`.
Numeric expression. If `null`, the function returns `null`.
## Examples
@ -18,3 +18,5 @@ This is a numeric expression. If the parameter is `null`, the function will also
ROW d = 1000.0
| EVAL c = cbrt(d)
```
Calculate the cube root of the value `1000.0`.

View file

@ -1,6 +1,6 @@
# CEIL
The CEIL function rounds a number up to the nearest integer.
Rounds a number up to the nearest integer.
## Syntax
@ -8,17 +8,21 @@ The CEIL function rounds a number up to the nearest integer.
### Parameters
#### number
#### `number`
This is a numeric expression. If the parameter is `null`, the function will also return `null`.
Numeric expression. If `null`, the function returns `null`.
## Examples
Rounding up a decimal number
```esql
ROW a=1.8
| EVAL a=CEIL(a)
| EVAL a = CEIL(a)
```
This example rounds the value `1.8` up to the nearest integer, resulting in `2`.
## Limitations
- the CEIL function does not perform any operation for `long` (including unsigned) and `integer` types. For `double` type, it picks the closest `double` value to the integer, similar to the Math.ceil function in other programming languages.
- This function is a no-op for `long` (including unsigned) and `integer` types. For `double`, it selects the closest `double` value to the integer, similar to `Math.ceil`.

View file

@ -1,6 +1,6 @@
# CIDR_MATCH
## CIDR_MATCH
The CIDR_MATCH function checks if a given IP address falls within one or more specified CIDR blocks.
The `CIDR_MATCH` function checks if a given IP address is contained within one or more specified CIDR blocks.
## Syntax
@ -8,28 +8,31 @@ The CIDR_MATCH function checks if a given IP address falls within one or more sp
### Parameters
#### ip
#### `ip`
The IP address to be checked. This function supports both IPv4 and IPv6 addresses.
The IP address to test. Must be of type `ip` (supports both IPv4 and IPv6).
#### blockX
#### `blockX`
The CIDR block(s) against which the IP address is to be checked.
One or more CIDR blocks to test the IP address against.
## Examples
The following example checks if the IP address 'ip1' falls within the CIDR blocks "127.0.0.2/32":
Filtering IP addresses
```esql
FROM hosts
| WHERE CIDR_MATCH(ip1, "127.0.0.2/32")
| KEEP card, host, ip0, ip1
| KEEP host, ip1
```
The function also supports passing multiple blockX:
Filtering IP addresses within specific CIDR blocks
```esql
FROM network_logs
| WHERE CIDR_MATCH(source_ip, "192.168.1.0/24", "10.0.0.0/8")
| KEEP timestamp, source_ip, destination_ip, action
FROM hosts
| WHERE CIDR_MATCH(ip1, "127.0.0.2/32", "127.0.0.3/32")
| KEEP host, ip1
```
This example filters rows where the `ip1` column contains an IP address that falls within the specified CIDR blocks (`127.0.0.2/32` or `127.0.0.3/32`). It then keeps the `card`, `host`, `ip0`, and `ip1` columns in the output.
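
For readers unfamiliar with CIDR matching, the following TypeScript sketch shows the underlying idea for IPv4 only: an address matches a block when both share the same leading prefix bits. The `ipv4ToInt` and `cidrMatch` helpers are hypothetical illustrations and not how Elasticsearch implements `CIDR_MATCH`, which also supports IPv6.

```typescript
// Converts a dotted-quad IPv4 address into an unsigned 32-bit integer.
function ipv4ToInt(ip: string): number {
  const parts = ip.split('.');
  if (parts.length !== 4 || parts.some((p) => !/^\d{1,3}$/.test(p) || Number(p) > 255)) {
    throw new Error(`Invalid IPv4 address: ${ip}`);
  }
  const [a, b, c, d] = parts.map(Number);
  return ((a << 24) | (b << 16) | (c << 8) | d) >>> 0;
}

// Returns true when `ip` falls inside at least one of the given CIDR blocks.
function cidrMatch(ip: string, ...blocks: string[]): boolean {
  const address = ipv4ToInt(ip);
  return blocks.some((block) => {
    const [base, prefixText] = block.split('/');
    const prefix = Number(prefixText);
    if (!Number.isInteger(prefix) || prefix < 0 || prefix > 32) {
      throw new Error(`Invalid CIDR block: ${block}`);
    }
    // A /0 block matches everything; shifting by 32 would be a no-op in JS.
    const mask = prefix === 0 ? 0 : (0xffffffff << (32 - prefix)) >>> 0;
    return ((address & mask) >>> 0) === ((ipv4ToInt(base) & mask) >>> 0);
  });
}

console.log(cidrMatch('127.0.0.3', '127.0.0.2/32', '127.0.0.3/32'));
console.log(cidrMatch('127.0.0.4', '127.0.0.2/32', '127.0.0.3/32'));
```

As in the query above, passing several blocks behaves like a logical OR across them.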

View file

@ -1,6 +1,6 @@
# COALESCE
The COALESCE function returns the first non-null argument from the list of provided arguments.
Returns the first argument that is not null. If all arguments are null, it returns `null`.
## Syntax
@ -8,27 +8,29 @@ The COALESCE function returns the first non-null argument from the list of provi
### Parameters
#### first
#### `first`
The first expression to evaluate.
Expression to evaluate.
#### rest
#### `rest`
The subsequent expressions to evaluate.
### Description
The COALESCE function evaluates the provided expressions in order and returns the first non-null value it encounters. If all the expressions evaluate to null, the function returns null.
Other expressions to evaluate.
## Examples
In the following example, the COALESCE function evaluates the expressions 'a' and 'b'. Since 'a' is null, the function returns the value of 'b'.
Returning the first non-null value
```esql
ROW a=null, b="b"
| EVAL COALESCE(a, b)
```
#### Result
| a | b | EVAL_COALESCE_a_b |
|------|-----|-------------------|
| null | "b" | "b" |
COALESCE supports any number of rest parameters:
```esql

View file

@ -1,32 +1,29 @@
# CONCAT
The CONCAT function combines two or more strings into one.
Concatenates two or more strings.
## Syntax
`CONCAT(string1, string2, [...stringN])`
`CONCAT(string1, string2)`
### Parameters
#### string1
#### `string1`
The first string to concatenate.
#### string2
#### `string2`
The second string to concatenate.
## Examples
The following example concatenates the `street_1` and `street_2` fields:
```esql
FROM address
| KEEP street_1, street_2
| EVAL fullstreet = CONCAT(street_1, street_2)
```
CONCAT supports any number of string parameters. The following example concatenates the `first_name` and `last_name` fields with a space in between:
```esql


@ -1,6 +1,6 @@
# COS
The COS function calculates the cosine of a given angle.
The `COS` function returns the cosine of a given angle.
## Syntax
@ -8,18 +8,15 @@ The COS function calculates the cosine of a given angle.
### Parameters
#### angle
#### `angle`
The angle for which the cosine is to be calculated, expressed in radians. If the parameter is `null`, the function will return `null`.
An angle, in radians. If `null`, the function returns `null`.
## Examples
```esql
ROW a=1.8
| EVAL cos=COS(a)
| EVAL cos = COS(a)
```
```esql
ROW angle=0.5
| EVAL cosine_value = COS(angle)
```
Calculate the cosine of the angle `1.8` radians.


@ -1,25 +1,22 @@
# COSH
The COSH function calculates the hyperbolic cosine of a given angle.
Returns the hyperbolic cosine of a number.
## Syntax
`COSH(angle)`
`COSH(number)`
### Parameters
#### angle
#### number
The angle in radians for which the hyperbolic cosine is to be calculated. If the angle is null, the function will return null.
Numeric expression. If `null`, the function returns `null`.
## Examples
```esql
ROW a=1.8
| EVAL cosh=COSH(a)
| EVAL cosh = COSH(a)
```
```esql
ROW angle=0.5
| EVAL hyperbolic_cosine = COSH(angle)
```
Calculate the hyperbolic cosine of the value `1.8`.


@ -1,6 +1,6 @@
# COUNT
## COUNT
The COUNT function returns the total number of input values.
The `COUNT` function returns the total number of input values. If no field is specified, it counts the number of rows.
## Syntax
@ -8,20 +8,22 @@ The COUNT function returns the total number of input values.
### Parameters
#### field
#### `field`
This is an expression that outputs values to be counted. If it's omitted, it's equivalent to `COUNT(*)`, which counts the number of rows.
An expression that outputs values to be counted. If omitted, the function is equivalent to `COUNT(*)`, which counts the number of rows.
## Examples
Count the number of specific field values:
### Count specific field values
```esql
FROM employees
| STATS COUNT(height)
```
Count the number of rows using `COUNT()` or `COUNT(*)`:
Count the number of non-null values in the `height` field.
### Count the number of rows
```esql
FROM employees
@ -29,14 +31,18 @@ FROM employees
| SORT languages DESC
```
The expression can use inline functions. In this example, a string is split into multiple values using the `SPLIT` function, and the values are counted:
Count the total number of rows grouped by the `languages` field and sort the results in descending order.
### Count values using inline functions
```esql
ROW words="foo;bar;baz;qux;quux;foo"
| STATS word_count = COUNT(SPLIT(words, ";"))
```
To count the number of times an expression returns `TRUE`, use a `WHERE` command to remove rows that shouldn't be included:
Count the number of elements in a string split by the `;` delimiter.
### Count values based on a condition
```esql
ROW n=1
@ -44,9 +50,13 @@ ROW n=1
| STATS COUNT(n)
```
To count the same stream of data based on two different expressions, use the pattern `COUNT(<expression> OR NULL)`:
Count the number of rows where the value of `n` is less than 0.
### Count based on two different expressions
```esql
ROW n=1
| STATS COUNT(n > 0 OR NULL), COUNT(n < 0 OR NULL)
```
Count the number of rows where `n > 0` and `n < 0` separately.
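The `COUNT(<expression> OR NULL)` pattern relies on two facts: `false OR null` evaluates to `null`, and `COUNT` only counts non-null values. The Python sketch below models that logic with `None` standing in for `null`. The helper names and the sample rows are hypothetical.

```python
def or_null(condition):
    """Model `<condition> OR NULL`: True stays True, False or None becomes None."""
    return True if condition is True else None


def count(values):
    """Model COUNT: count only the values that are not None."""
    return sum(1 for value in values if value is not None)


rows = [{"n": 1}, {"n": -2}, {"n": 3}]  # hypothetical sample rows

# One pass over the same rows, two different conditions.
positives = count(or_null(row["n"] > 0) for row in rows)
negatives = count(or_null(row["n"] < 0) for row in rows)
```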


@ -1,6 +1,6 @@
# COUNT_DISTINCT
## COUNT_DISTINCT
The COUNT_DISTINCT function calculates the approximate number of distinct values in a specified field.
The `COUNT_DISTINCT` function returns the approximate number of distinct values in a column or expression.
## Syntax
@ -8,37 +8,43 @@ The COUNT_DISTINCT function calculates the approximate number of distinct values
### Parameters
#### field
#### `field`
The column or literal for which to count the number of distinct values.
#### precision
#### `precision`
(Optional) The precision threshold. The counts are approximate. The maximum supported value is 40000. Thresholds above this number will have the same effect as a threshold of 40000. The default value is 3000.
(Optional) The precision threshold. The maximum supported value is 40,000. Thresholds above this value will behave as if set to 40,000. The default value is 3,000. Higher precision thresholds may increase memory usage and processing time.
## Examples
The following example calculates the number of distinct values in the `ip0` and `ip1` fields:
Counting distinct values in multiple columns
```esql
FROM hosts
| STATS COUNT_DISTINCT(ip0), COUNT_DISTINCT(ip1)
```
You can also specify a precision threshold. In the following example, the precision threshold for `ip0` is set to 80000 and for `ip1` to 5:
This example calculates the approximate number of distinct values in the `ip0` and `ip1` columns.
Configuring the precision threshold
```esql
FROM hosts
| STATS COUNT_DISTINCT(ip0, 80000), COUNT_DISTINCT(ip1, 5)
```
The COUNT_DISTINCT function can also be used with inline functions. This example splits a string into multiple values using the `SPLIT` function and counts the unique values:
This example demonstrates how to specify a precision threshold for each column. The `ip0` column uses a high precision threshold of 80,000, while the `ip1` column uses a low threshold of 5.
Counting distinct values from a split string
```esql
ROW words="foo;bar;baz;qux;quux;foo"
| STATS distinct_word_count = COUNT_DISTINCT(SPLIT(words, ";"))
```
This example splits the `words` string into multiple values using the `SPLIT` function and counts the unique values. The result is the number of distinct words in the string.
### Notes
- Computing exact counts requires loading values into a set and returning its size, which doesn't scale well for high-cardinality sets or large values due to memory usage and communication overhead.
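For contrast with the approximate algorithm, exact distinct counting amounts to building a set, which is what makes it expensive for high-cardinality data. The Python sketch below shows that baseline together with the threshold capping described under `precision`. The helper names are hypothetical and this is not Elasticsearch's implementation.

```python
MAX_PRECISION = 40_000     # maximum supported threshold, per the parameter description
DEFAULT_PRECISION = 3_000  # default threshold, per the parameter description


def effective_precision(threshold=None):
    """Return the threshold that takes effect after applying default and cap."""
    if threshold is None:
        return DEFAULT_PRECISION
    return min(threshold, MAX_PRECISION)


def exact_count_distinct(values):
    """Exact distinct count: every distinct value is held in memory."""
    return len(set(values))


words = "foo;bar;baz;qux;quux;foo".split(";")
distinct_words = exact_count_distinct(words)  # "foo" appears twice but counts once
```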


@ -1,6 +1,6 @@
# DATE_DIFF
## DATE_DIFF
The DATE_DIFF function calculates the difference between two timestamps and returns the difference in multiples of the specified `unit`.
Calculates the difference between two timestamps in multiples of a specified unit. If the start timestamp is later than the end timestamp, the result will be negative.
## Syntax
@ -8,32 +8,43 @@ The DATE_DIFF function calculates the difference between two timestamps and retu
### Parameters
#### unit
#### `unit`
The unit of time in which the difference will be calculated.
The unit of time for the difference calculation.
#### startTimestamp
#### `startTimestamp`
The starting timestamp for the calculation.
A string representing the starting timestamp.
#### endTimestamp
#### `endTimestamp`
The ending timestamp for the calculation.
A string representing the ending timestamp.
## Examples
The following example demonstrates how to use the DATE_DIFF function to calculate the difference between two timestamps in microseconds:
Calculate the difference in microseconds between two timestamps:
```esql
ROW date1 = TO_DATETIME("2023-12-02T11:00:00.000Z"), date2 = TO_DATETIME("2023-12-02T11:00:00.001Z")
| EVAL dd_ms = DATE_DIFF("microseconds", date1, date2)
```
Calculate the difference in calendar units (e.g., years) between timestamps. Only fully elapsed units are counted. To include remainders, switch to a smaller unit and perform additional calculations:
```esql
ROW date1 = TO_DATETIME("2023-01-01T00:00:00.000Z"), date2 = TO_DATETIME("2023-12-31T23:59:59.999Z")
| EVAL dd_days = DATE_DIFF("days", date1, date2)
ROW end_23=TO_DATETIME("2023-12-31T23:59:59.999Z"),
start_24=TO_DATETIME("2024-01-01T00:00:00.000Z"),
end_24=TO_DATETIME("2024-12-31T23:59:59.999")
| EVAL end23_to_start24 = DATE_DIFF("year", end_23, start_24)
| EVAL end23_to_end24 = DATE_DIFF("year", end_23, end_24)
| EVAL start_to_end_24 = DATE_DIFF("year", start_24, end_24)
```
## Limitations
- The function's supported units and ES|QL's time span literals are distinct and not interchangeable.
- Supported abbreviations align with other established implementations but may differ from Elasticsearch's date-time nomenclature.
## Notes
- If the `startTimestamp` is later than the `endTimestamp`, the function will return a negative value.
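The "only fully elapsed units are counted" rule for calendar units can be modeled in a few lines of Python. This is a sketch under stated assumptions: the helper `full_years_between` is hypothetical, leap-day edge cases are not considered, and it is not how Elasticsearch computes the value.

```python
from datetime import datetime, timezone


def full_years_between(start: datetime, end: datetime) -> int:
    """Count fully elapsed calendar years from start to end (negative if start > end)."""
    if start > end:
        return -full_years_between(end, start)
    years = end.year - start.year
    # If end falls earlier within its year than start does, the last year is incomplete.
    if (end.month, end.day, end.time()) < (start.month, start.day, start.time()):
        years -= 1
    return years


# Same timestamps as in the query above.
end_23 = datetime(2023, 12, 31, 23, 59, 59, 999000, tzinfo=timezone.utc)
start_24 = datetime(2024, 1, 1, tzinfo=timezone.utc)
end_24 = datetime(2024, 12, 31, 23, 59, 59, 999000, tzinfo=timezone.utc)

end23_to_start24 = full_years_between(end_23, start_24)  # one millisecond apart
end23_to_end24 = full_years_between(end_23, end_24)      # exactly one year apart
start_to_end_24 = full_years_between(start_24, end_24)   # one millisecond short of a year
```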


@ -1,6 +1,6 @@
# DATE_EXTRACT
The DATE_EXTRACT function is used to extract specific parts of a date.
Extracts specific parts of a date, such as the year, month, day, or hour.
## Syntax
@ -8,26 +8,63 @@ The DATE_EXTRACT function is used to extract specific parts of a date.
### Parameters
#### datePart
#### `datePart`
This is the part of the date you want to extract, such as "year", "month" or "hour_of_day".
The part of the date to extract. Supported values include:
#### date
- `aligned_day_of_week_in_month`
- `aligned_day_of_week_in_year`
- `aligned_week_of_month`
- `aligned_week_of_year`
- `ampm_of_day`
- `clock_hour_of_ampm`
- `clock_hour_of_day`
- `day_of_month`
- `day_of_week`
- `day_of_year`
- `epoch_day`
- `era`
- `hour_of_ampm`
- `hour_of_day`
- `instant_seconds`
- `micro_of_day`
- `micro_of_second`
- `milli_of_day`
- `milli_of_second`
- `minute_of_day`
- `minute_of_hour`
- `month_of_year`
- `nano_of_day`
- `nano_of_second`
- `offset_seconds`
- `proleptic_month`
- `second_of_day`
- `second_of_minute`
- `year`
- `year_of_era`
This is the date expression.
Refer to `java.time.temporal.ChronoField` for detailed descriptions of these values. If `null`, the function returns `null`.
#### `date`
The date expression from which to extract the specified part. If `null`, the function returns `null`.
## Examples
To extract the year from a date:
### Extracting the Year from a Date
Extract the year from a given date:
```esql
ROW date = DATE_PARSE("yyyy-MM-dd", "2022-05-06")
| EVAL year = DATE_EXTRACT("year", date)
```
To find all events that occurred outside of business hours (before 9 AM or after 5 PM), on any given date:
### Filtering Events Outside Business Hours
Retrieve all events that occurred outside of business hours (before 9 AM or after 5 PM):
```esql
FROM sample_data
| WHERE DATE_EXTRACT("hour_of_day", @timestamp) < 9 OR DATE_EXTRACT("hour_of_day", @timestamp) >= 17
```


@ -1,6 +1,6 @@
# DATE_FORMAT
The DATE_FORMAT function returns a string representation of a date, formatted according to the provided format.
The `DATE_FORMAT` function returns a string representation of a date in the specified format.
## Syntax
@ -8,21 +8,25 @@ The DATE_FORMAT function returns a string representation of a date, formatted ac
### Parameters
#### dateFormat
#### `dateFormat`
This is an optional parameter that specifies the desired date format.
If no format is provided, the function defaults to the `yyyy-MM-dd'T'HH:mm:ss.SSSZ` format.
- **Optional**
- Specifies the date format. If no format is provided, the default format `yyyy-MM-dd'T'HH:mm:ss.SSSZ` is used.
- If `null`, the function returns `null`.
#### date
#### `date`
This is the date expression that you want to format.
- A date expression.
- If `null`, the function returns `null`.
## Examples
In this example, the `hire_date` field is formatted according to the "YYYY-MM-dd" format, and the result is stored in the `hired` field:
Formatting a date to `yyyy-MM-dd`
```esql
FROM employees
| KEEP first_name, last_name, hire_date
| EVAL hired = DATE_FORMAT("YYYY-MM-dd", hire_date)
| EVAL hired = DATE_FORMAT("yyyy-MM-dd", hire_date)
```
This example formats the `hire_date` field into the `yyyy-MM-dd` format and stores the result in a new column named `hired`.
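The case of the year letters matters. Assuming the format string follows Java `DateTimeFormatter` pattern letters, lowercase `yyyy` is the calendar year while uppercase `YYYY` is the week-based year, and the two can differ around New Year. The Python sketch below uses ISO week rules as a stand-in for the week-based year; Java's week definition depends on the locale, so treat this as an illustration rather than exact ES|QL output.

```python
from datetime import date

d = date(2024, 12, 30)  # a Monday in the last days of calendar year 2024

calendar_year = d.year                # what a `yyyy`-style field reports
week_based_year = d.isocalendar()[0]  # ISO week-based year, stand-in for `YYYY`

# The date belongs to ISO week 1 of the following year, so the two values differ.
years_differ = calendar_year != week_based_year
```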


@ -1,6 +1,6 @@
# DATE_PARSE
The DATE_PARSE function is used to convert a date string into a date based on the provided format pattern.
Parses a date string into a date object using the specified format.
## Syntax
@ -8,22 +8,21 @@ The DATE_PARSE function is used to convert a date string into a date based on th
### Parameters
#### datePattern
#### `datePattern`
This is the format of the date. If `null` is provided, the function will return `null`.
The date format. Refer to the `DateTimeFormatter` documentation for the syntax. If `null`, the function returns `null`.
#### dateString
#### `dateString`
This is the date expression in string format.
A date expression as a string. If `null` or an empty string, the function returns `null`.
## Examples
Parsing a date string
```esql
ROW date_string = "2022-05-06"
| EVAL date = DATE_PARSE("yyyy-MM-dd", date_string)
```
```esql
FROM logs
| EVAL date = DATE_PARSE("yyyy-MM-dd", date_string)
```
This example parses the string `"2022-05-06"` into a date object using the format `"yyyy-MM-dd"`.
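For readers more familiar with Python, the same parse can be sketched with `datetime.strptime`, where `%Y-%m-%d` plays the role of `yyyy-MM-dd`. The wrapper `date_parse` is hypothetical and only models the null handling described in the parameters: `None` stands in for `null`.

```python
from datetime import datetime


def date_parse(pattern, date_string):
    """Parse date_string with a strptime pattern; None or "" input yields None."""
    if pattern is None or not date_string:
        return None
    return datetime.strptime(date_string, pattern)


parsed = date_parse("%Y-%m-%d", "2022-05-06")
```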


@ -1,6 +1,6 @@
# DATE_TRUNC
## DATE_TRUNC
The DATE_TRUNC function rounds down a date to the nearest specified interval.
The `DATE_TRUNC` function rounds down a date to the closest specified interval.
## Syntax
@ -8,25 +8,17 @@ The DATE_TRUNC function rounds down a date to the nearest specified interval.
### Parameters
#### interval
#### `interval`
This is the interval to which the date will be rounded down. It is expressed using the timespan literal syntax.
The interval to which the date is rounded down, expressed using the timespan literal syntax.
#### date
#### `date`
This is the date expression that will be rounded down.
## Important notes
The *interval* parameter of DATE_TRUNC is a timespan literal, NOT a string.
- GOOD: `DATE_TRUNC(1 year, date)`
- BAD: `DATE_TRUNC("year", date)`
When grouping data by time interval, it is recommended to use BUCKET instead of DATE_TRUNC.
The date expression to be truncated.
## Examples
The following example rounds down the hire_date to the nearest year:
Truncate hire dates to the year
```esql
FROM employees
@ -34,7 +26,9 @@ FROM employees
| EVAL year_hired = DATE_TRUNC(1 year, hire_date)
```
You can combine DATE_TRUNC with STATS ... BY to create date histograms. For example, the number of hires per year:
This example truncates the `hire_date` field to the beginning of the year and stores the result in a new column named `year_hired`.
Number of hires per year
```esql
FROM employees
@ -43,7 +37,9 @@ FROM employees
| SORT year
```
Or, you can calculate an hourly error rate:
This example calculates the number of hires per year by truncating the `hire_date` field to the year and grouping the results.
Hourly error rate
```esql
FROM sample_data
@ -52,3 +48,5 @@ FROM sample_data
| STATS error_rate = AVG(error) BY hour
| SORT hour
```
This example calculates the hourly error rate by truncating the `@timestamp` field to the hour and averaging the `error` values for each hour.
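Truncation followed by grouping is what turns timestamps into histogram buckets. The Python sketch below models rounding down to the year and to the hour and then counts hires per year. The helper names and the sample dates are hypothetical, and this is not ES|QL syntax.

```python
from collections import Counter
from datetime import datetime


def trunc_to_year(ts: datetime) -> datetime:
    """Round down to the first instant of the year."""
    return ts.replace(month=1, day=1, hour=0, minute=0, second=0, microsecond=0)


def trunc_to_hour(ts: datetime) -> datetime:
    """Round down to the start of the hour."""
    return ts.replace(minute=0, second=0, microsecond=0)


# Hypothetical hire dates: two in 1991, one in 1992.
hire_dates = [datetime(1991, 6, 26), datetime(1991, 10, 22), datetime(1992, 1, 3)]

# Every date in the same year maps to the same bucket key.
hires_per_year = Counter(trunc_to_year(d) for d in hire_dates)
```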


@ -1,6 +1,6 @@
# DISSECT
## DISSECT
The DISSECT command is used to extract structured data from a string. It matches the string against a delimiter-based pattern and extracts the specified keys as columns.
The `DISSECT` command is used to extract structured data from a string. It matches the string against a delimiter-based pattern and extracts the specified keys as columns.
### Use Cases
- **Log Parsing**: Extracting timestamps, log levels, and messages from log entries.
@ -13,21 +13,23 @@ The DISSECT command is used to extract structured data from a string. It matches
### Parameters
#### input
#### `input`
The column containing the string you want to structure. If the column has multiple values, DISSECT will process each value.
The column containing the string you want to structure. If the column has multiple values, `DISSECT` will process each value.
#### pattern
#### `pattern`
A dissect pattern. If a field name conflicts with an existing column, the existing column is dropped. If a field name is used more than once, only the rightmost duplicate creates a column.
#### <separator>
#### `<separator>`
A string used as the separator between appended values, when using the append modifier.
## Examples
The following example parses a string that contains a timestamp, some text, and an IP address:
Parsing a string with a timestamp, text, and IP address
Extracts the `date`, `msg`, and `ip` fields from a structured string.
```esql
ROW a = "2023-01-23T12:15:00.000Z - some text - 127.0.0.1"
@ -35,7 +37,9 @@ ROW a = "2023-01-23T12:15:00.000Z - some text - 127.0.0.1"
| KEEP date, msg, ip
```
By default, DISSECT outputs keyword string columns. To convert to another type, use Type conversion functions:
Converting output to another type
Converts the `date` field from a string to a datetime type after extracting it.
```esql
ROW a = "2023-01-23T12:15:00.000Z - some text - 127.0.0.1"
@ -43,7 +47,6 @@ ROW a = "2023-01-23T12:15:00.000Z - some text - 127.0.0.1"
| KEEP date, msg, ip
| EVAL date = TO_DATETIME(date)
```
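The delimiter-based matching can be approximated in Python. The sketch below assumes a pattern of the shape `%{date} - %{msg} - %{ip}` (the pattern line itself is not shown in this diff hunk), splits on the literal ` - ` delimiter, and then converts the extracted `date` string to a datetime. The helper `dissect_three` is hypothetical.

```python
from datetime import datetime


def dissect_three(line: str, delimiter: str = " - ") -> dict:
    """Split line into date, msg and ip; the last key takes the remainder."""
    date, msg, ip = line.split(delimiter, 2)
    return {"date": date, "msg": msg, "ip": ip}


row = dissect_three("2023-01-23T12:15:00.000Z - some text - 127.0.0.1")

# Extracted values are strings; convert explicitly when another type is needed.
parsed_date = datetime.strptime(row["date"], "%Y-%m-%dT%H:%M:%S.%fZ")
```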
In this example, we use the `APPEND_SEPARATOR` to concatenate values with a custom separator:
```esql


@ -1,6 +1,6 @@
# DROP
The DROP command is used to eliminate one or more columns from the data.
The `DROP` command removes one or more columns from the result set.
## Syntax
@ -10,31 +10,17 @@ The DROP command is used to eliminate one or more columns from the data.
#### columns
This is a list of columns, separated by commas, that you want to remove. Wildcards are supported.
A comma-separated list of columns to remove. Supports wildcards.
## Examples
In the following example, the 'height' column is removed from the data:
Remove a specific column:
```esql
FROM employees
| DROP height
```
You can also use wildcards to remove all columns that match a certain pattern. In the following example, all columns that start with 'height' are removed:
```esql
FROM employees
| DROP height*
```
This example demonstrates how to drop multiple specific columns by listing them in a comma-separated format.
```esql
FROM employees
| DROP height, weight, age
```
This example shows how to drop columns that match a more complex pattern using wildcards.
```esql


@ -1,6 +1,6 @@
# E
The E function returns Euler's number.
Returns Euler's number.
## Syntax
@ -8,7 +8,7 @@ The E function returns Euler's number.
### Parameters
This function does not require any parameters.
This function does not take any parameters.
## Examples
@ -16,8 +16,4 @@ This function does not require any parameters.
ROW E()
```
```esql
FROM employees
| EVAL euler_number = E()
| KEEP euler_number
```
This example returns Euler's number.


@ -1,6 +1,6 @@
# ENDS_WITH
The ENDS_WITH function checks if a given string ends with a specified suffix.
Determines whether a keyword string ends with a specified suffix and returns a boolean value.
## Syntax
@ -8,25 +8,20 @@ The ENDS_WITH function checks if a given string ends with a specified suffix.
### Parameters
#### str
#### `str`
This is the string expression that you want to check.
String expression. If `null`, the function returns `null`.
#### suffix
#### `suffix`
The string expression that will be checked if it is the ending of the first string.
String expression. If `null`, the function returns `null`.
## Examples
```esql
FROM employees
| KEEP last_name
| EVAL ln_E = ENDS_WITH(last_name, "d")
```
```esql
FROM employees
| KEEP first_name
| EVAL fn_E = ENDS_WITH(first_name, "a")
```
This example checks if the `last_name` column values end with the letter "d" and stores the result in a new column `ln_E`.


@ -1,6 +1,6 @@
# ENRICH
## ENRICH
The ENRICH command allows you to add data from existing indices as new columns using an enrich policy.
The `ENRICH` command allows you to add data from existing indices as new columns using an enrich policy.
## Syntax
@ -8,53 +8,71 @@ The ENRICH command allows you to add data from existing indices as new columns u
### Parameters
#### policy
#### `policy`
The name of the enrich policy. You need to create and execute the enrich policy first.
The name of the enrich policy. You must create and execute the enrich policy before using it.
#### match_field
#### `mode`
The match field. ENRICH uses its value to look for records in the enrich index. If not specified, the match will be performed on the column with the same name as the `match_field` defined in the enrich policy.
(Optional) The mode of the enrich command in cross-cluster queries. Refer to enrich across clusters for more details.
#### new_nameX
#### `match_field`
Allows you to change the name of the column that's added for each of the enrich fields. Defaults to the enrich field name. If a column has the same name as the new name, it will be discarded. If a name (new or original) occurs more than once, only the rightmost duplicate creates a new column.
(Optional) The field used to match records in the enrich index. If not specified, the match is performed on the column with the same name as the `match_field` defined in the enrich policy.
#### fieldX
#### `fieldX`
The enrich fields from the enrich index that are added to the result as new columns. If a column with the same name as the enrich field already exists, the existing column will be replaced by the new column. If not specified, each of the enrich fields defined in the policy is added. A column with the same name as the enrich field will be dropped unless the enrich field is renamed.
(Optional) The enrich fields from the enrich index to be added as new columns. If a column with the same name as the enrich field already exists, it will be replaced. If not specified, all enrich fields defined in the policy are added. Columns with the same name as the enrich fields will be dropped unless renamed.
#### `new_nameX`
(Optional) Allows you to rename the columns added for each enrich field. Defaults to the enrich field name. If a column with the same name as the new name already exists, it will be discarded. If a name (new or original) occurs more than once, only the rightmost duplicate creates a column.
## Examples
The following example uses the `languages_policy` enrich policy to add a new column for each enrich field defined in the policy. The match is performed using the `match_field` defined in the enrich policy and requires that the input table has a column with the same name (`language_code` in this example). ENRICH will look for records in the enrich index based on the match field value.
Basic usage
Add a new column for each enrich field defined in the `languages_policy` enrich policy. The match is performed using the `match_field` defined in the policy, requiring the input table to have a column with the same name (`language_code` in this case).
```esql
ROW language_code = "1"
| ENRICH languages_policy
```
To use a column with a different name than the `match_field` defined in the policy as the match field, use `ON <column-name>`:
Using a different match field
Use a column with a different name than the `match_field` defined in the policy as the match field.
```esql
ROW a = "1"
| ENRICH languages_policy ON a
```
By default, each of the enrich fields defined in the policy is added as a column. To explicitly select the enrich fields that are added, use `WITH <field1>, <field2>, ...`:
Selecting specific enrich fields
Explicitly select the enrich fields to be added as columns.
```esql
ROW a = "1"
| ENRICH languages_policy ON a WITH language_name
```
You can rename the columns that are added using `WITH new_name=<field1>`:
Renaming added columns
Rename the columns added using the `WITH` clause.
```esql
ROW a = "1"
| ENRICH languages_policy ON a WITH name = language_name
```
### Limitations
- In case of name collisions, the newly created columns will override existing columns.
- The ENRICH command only supports enrich policies of type `match`.
- ENRICH only supports enriching on a column of type `keyword`.
In case of name collisions, the newly created columns will override existing columns.
## Limitations
- The `ENRICH` command requires an existing enrich policy to be created and executed beforehand.
- The `match_field` in the `ENRICH` command must match the type defined in the enrich policy. For example:
- A `geo_match` policy requires a `match_field` of type `geo_point` or `geo_shape`.
- A `range` policy requires a `match_field` of type `integer`, `long`, `date`, or `ip`, depending on the range field type in the enrich index.
- For `range` policies, if the `match_field` is of type `KEYWORD`, field values are parsed during query execution. If parsing fails, the output values for that row are set to `null`, and a warning is produced.
- The `geo_match` enrich policy type only supports the `intersects` spatial relation.
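For a `match` policy, the `ON` and `WITH` behavior shown in the examples resembles a keyed lookup. The Python sketch below is a simplified model, not the Elasticsearch implementation: the index contents are hypothetical, the helper `enrich` is hypothetical, and `None` stands in for `null` when no record matches.

```python
# Hypothetical enrich index, keyed by the value of the policy's match field.
languages_index = {
    "1": {"language_name": "English"},
    "2": {"language_name": "French"},
}


def enrich(row, index, on, with_fields=None):
    """Add enrich fields to row; with_fields maps new column name -> enrich field."""
    matched = index.get(row.get(on), {})
    if with_fields is None:
        # Default: add every enrich field known to the index under its own name.
        with_fields = {field: field for record in index.values() for field in record}
    enriched = dict(row)
    for new_name, field in with_fields.items():
        enriched[new_name] = matched.get(field)  # overrides a same-named column
    return enriched


renamed = enrich({"a": "1"}, languages_index, on="a", with_fields={"name": "language_name"})
```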


@ -1,6 +1,6 @@
# EVAL
The EVAL command allows you to append new columns with calculated values to your data.
The `EVAL` command allows you to add new columns with calculated values to your dataset.
## Syntax
@ -8,13 +8,16 @@ The EVAL command allows you to append new columns with calculated values to your
### Parameters
#### {columnX}
#### `columnX`
This is the name of the column. If a column with the same name already exists, it will be replaced. If a column name is used more than once, only the rightmost duplicate will create a column.
- The name of the column to be added or updated.
- If a column with the same name already exists, it will be replaced by the new column.
- If a column name is used multiple times, only the rightmost definition is applied.
#### {valueX}
#### `valueX`
This is the value for the column. It can be a literal, an expression, or a function. Columns defined to the left of this one can be used.
- The value to assign to the column. This can be a literal, an expression, or a function.
- You can reference columns defined earlier in the same `EVAL` command.
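The left-to-right behavior described above can be modeled as a loop over assignments in which each expression sees the columns produced before it. This Python sketch is an illustration only: the helper `eval_columns` and the sample row are hypothetical.

```python
def eval_columns(row, assignments):
    """Apply (name, function) pairs left to right; later pairs see earlier results."""
    result = dict(row)
    for name, compute in assignments:
        result[name] = compute(result)  # replaces any existing column of that name
    return result


row = {"height": 2.0}  # hypothetical input row
out = eval_columns(row, [
    ("height_feet", lambda r: r["height"] * 3.281),
    ("height_cm", lambda r: r["height"] * 100),
    # References height_cm, which was defined earlier in the same call.
    ("height_m_again", lambda r: r["height_cm"] / 100),
])
```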
## Notes
@ -30,7 +33,9 @@ Aggregation functions are NOT supported for EVAL.
## Examples
The following example multiplies the `height` column by 3.281 and 100 to create new columns `height_feet` and `height_cm`:
### Adding new calculated columns
Add two new columns, `height_feet` and `height_cm`, by performing calculations on the `height` column:
```esql
FROM employees
@ -39,7 +44,9 @@ FROM employees
| EVAL height_feet = height * 3.281, height_cm = height * 100
```
If the specified column already exists, the existing column will be replaced, and the new column will be appended to the table:
### Overwriting an existing column
Replace the `height` column with a new value calculated by converting it to feet:
```esql
FROM employees
@ -48,7 +55,9 @@ FROM employees
| EVAL height = height * 3.281
```
Specifying the output column name is optional. If not specified, the new column name is equal to the expression. The following query adds a column named `height*3.281`:
### Adding a column without specifying a name
If no column name is provided, the new column will be named after the expression itself. For example, this query adds a column named `height*3.281`:
```esql
FROM employees
@ -57,7 +66,9 @@ FROM employees
| EVAL height * 3.281
```
Because this name contains special characters, it needs to be quoted with backticks (`) when using it in subsequent commands:
### Using a column with special characters in subsequent commands
When a column name contains special characters, enclose it in backticks (`) to reference it in later commands:
```esql
FROM employees


@ -1,6 +1,6 @@
# EXP
The EXP function calculates the value of Euler's number (e) raised to the power of a given number.
Returns the value of e raised to the power of the given number.
## Syntax
@ -8,19 +8,15 @@ The EXP function calculates the value of Euler's number (e) raised to the power
### Parameters
#### number
#### `number`
A numeric expression. If the parameter is `null`, the function will also return `null`.
Numeric expression. If `null`, the function returns `null`.
## Examples
```esql
ROW d = 5.0
| EVAL s = EXP(d)
```
```esql
FROM geo
| EVAL exp = EXP(x)
```
Calculate e raised to the power of 5.0.


@ -1,6 +1,6 @@
# FLOOR
The FLOOR function rounds a number down to the nearest integer.
Rounds a number down to the nearest integer. For `double` values, it selects the closest `double` representation of the integer, similar to `Math.floor`. For `long` (including unsigned) and `integer`, this operation has no effect.
## Syntax
@ -8,22 +8,25 @@ The FLOOR function rounds a number down to the nearest integer.
### Parameters
#### number
#### `number`
This is a numeric expression. If the parameter is `null`, the function will return `null`.
Numeric expression. If `null`, the function returns `null`.
## Examples
```esql
ROW a=1.8
| EVAL a=FLOOR(a)
| EVAL a = FLOOR(a)
```
Rounds the value `1.8` down to `1`.
```esql
FROM employees
| KEEP first_name, last_name, height
| EVAL height_floor = FLOOR(height)
```
Rounds all values in the `height` column down to the nearest integer.
## Notes


@ -1,61 +1,80 @@
# FROM
## FROM
The `FROM` command retrieves a table of data from a specified data stream, index, or alias.
The `FROM` command retrieves data from a data stream, index, or alias and returns it as a table. Each row in the table represents a document, and each column corresponds to a field that can be accessed by its name.
## Syntax
`FROM index_pattern METADATA fields`
`FROM index_pattern [METADATA fields]`
### Parameters
#### index_pattern
#### `index_pattern`
This parameter represents a list of indices, data streams, or aliases. It supports the use of wildcards and date math.
A list of indices, data streams, or aliases. Supports wildcards and date math.
#### fields
#### `fields`
This is a comma-separated list of metadata fields to be retrieved.
A comma-separated list of metadata fields to retrieve.
## Description
## Examples
The `FROM` command retrieves a table of data from a specified data stream, index, or alias. Each row in the resulting table represents a document, and each column corresponds to a field. The field can be accessed using its name.
### Basic Example
Retrieve all documents from the `employees` index:
#### Basic Data Retrieval
```esql
FROM employees
```
#### Time Series Data
Use date math to refer to indices, aliases, and data streams. This can be useful for time series data, for example, to access today's index:
### Using Date Math
Access indices, aliases, or data streams using date math. For example, retrieve today's index for time series data:
```esql
FROM <logs-{now/d}>
```
#### Multiple Indices
Use comma-separated lists or wildcards to query multiple data streams, indices, or aliases:
### Querying Multiple Data Streams, Indices, or Aliases
Query multiple data streams, indices, or aliases using a comma-separated list or wildcards:
```esql
FROM employees-00001,other-employees-*
```
#### Remote Clusters
Use the format `<remote_cluster_name>:<target>` to query data streams and indices on remote clusters:
### Querying Across Clusters
Query data streams and indices on remote clusters using the format `<remote_cluster_name>:<target>`:
```esql
FROM cluster_one:employees-00001,cluster_two:other-employees-*
```
#### Metadata Retrieval
Use the optional `METADATA` directive to enable metadata fields:
### Using the `METADATA` Directive
Enable metadata fields by using the optional `METADATA` directive:
```esql
FROM employees METADATA _id
```
#### Escaping Special Characters
Use enclosing double quotes (") or three enclosing double quotes (""") to escape index names that contain special characters:
### Escaping Index Names
Escape index names containing special characters using double quotes (`"`) or triple double quotes (`"""`):
```esql
FROM "this=that","""this[that"""
```
### Limitations
## Limitations
- By default, an ES|QL query without an explicit `LIMIT` uses an implicit limit of 1000 rows. This applies to the `FROM` command as well.
- Queries do not return more than 10,000 rows, regardless of the `LIMIT` command's value.
- By default, the `FROM` command applies an implicit limit of 1,000 rows if no explicit `LIMIT` is specified. For example:
```esql
FROM employees
```
is equivalent to:
```esql
FROM employees
| LIMIT 1000
```
- Queries cannot return more than 10,000 rows, even if a higher `LIMIT` is specified. This is a configurable upper limit. For more details, refer to the [LIMIT command](#LIMIT).

@ -1,6 +1,6 @@
# GREATEST
The GREATEST function returns the maximum value from multiple columns.
Returns the maximum value from multiple columns. This function is similar to `MV_MAX` but is designed to operate on multiple columns simultaneously.
## Syntax
@ -8,27 +8,26 @@ The GREATEST function returns the maximum value from multiple columns.
### Parameters
#### first
#### `first`
The first column to evaluate.
#### rest
#### `rest`
The remaining columns to evaluate.
## Examples
Finding the maximum value between two columns
```esql
ROW a = 10, b = 20
| EVAL g = GREATEST(a, b)
```
```esql
ROW x = "apple", y = "banana", z = "cherry"
| EVAL max_fruit = GREATEST(x, y, z)
```
This example evaluates the maximum value between columns `a` and `b`, resulting in `g = 20`.
## Notes
- When applied to `keyword` or `text` fields, the GREATEST function returns the last string in alphabetical order.
- When applied to `boolean` columns, it returns `true` if any values are `true`.
- When applied to `keyword` or `text` fields, the function returns the last string in alphabetical order.
- When applied to `boolean` columns, the function returns `true` if any of the values are `true`.
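The notes above can be modelled outside ES|QL. The following is a minimal Python sketch, not the ES|QL implementation: the helper name `greatest` is hypothetical, and `null` handling is deliberately left out because this section does not describe it.

```python
def greatest(first, *rest):
    """Rough model of GREATEST: the largest value across the columns of one row."""
    # Strings compare lexicographically, so the "last in alphabetical order" wins.
    # For booleans, True > False, so the result is True if any value is True.
    return max((first, *rest))


print(greatest(10, 20))
print(greatest("apple", "banana", "cherry"))
print(greatest(False, True, False))
```

Python's built-in ordering happens to match the documented behaviour for numbers, strings, and booleans, which is why a single `max` call is enough for this illustration.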

@ -1,6 +1,6 @@
# GROK
## GROK
The GROK command is used to extract structured data from a string. It matches the string against patterns based on regular expressions and extracts the specified patterns as columns.
The `GROK` command is used to extract structured data from a string. It matches the string against patterns based on regular expressions and extracts the specified patterns as columns.
## Syntax
@ -8,47 +8,57 @@ The GROK command is used to extract structured data from a string. It matches th
### Parameters
#### input
#### `input`
The column containing the string you want to structure. If the column has multiple values, GROK will process each value.
The column containing the string you want to structure. If the column has multiple values, `GROK` will process each value.
#### pattern
#### `pattern`
A grok pattern. If a field name conflicts with an existing column, the existing column is dropped. If a field name is used more than once, a multi-valued column is created with one value per each occurrence of the field name.
A grok pattern.
- If a field name conflicts with an existing column, the existing column is discarded.
- If a field name is used more than once, a multi-valued column will be created with one value for each occurrence of the field name.
## Examples
The following example parses a string that contains a timestamp, an IP address, an email address, and a number:
Parsing a string with multiple data types
Parse a string containing a timestamp, an IP address, an email address, and a number:
```esql
ROW a = "2023-01-23T12:15:00.000Z 127.0.0.1 some.email@foo.com 42"
| GROK a "%{TIMESTAMP_ISO8601:date} %{IP:ip} %{EMAILADDRESS:email} %{NUMBER:num}"
| GROK a """%{TIMESTAMP_ISO8601:date} %{IP:ip} %{EMAILADDRESS:email} %{NUMBER:num}"""
| KEEP date, ip, email, num
```
By default, GROK outputs keyword string columns. `int` and `float` types can be converted by appending `:type` to the semantics in the pattern. For example `{NUMBER:num:int}`:
Type conversion for numeric fields
Convert numeric fields to specific types by appending `:type` to the semantics in the pattern. For example, `{NUMBER:num:int}` converts the `num` field to an integer:
```esql
ROW a = "2023-01-23T12:15:00.000Z 127.0.0.1 some.email@foo.com 42"
| GROK a "%{TIMESTAMP_ISO8601:date} %{IP:ip} %{EMAILADDRESS:email} %{NUMBER:num:int}"
| GROK a """%{TIMESTAMP_ISO8601:date} %{IP:ip} %{EMAILADDRESS:email} %{NUMBER:num:int}"""
| KEEP date, ip, email, num
```
For other type conversions, use Type conversion functions:
Using type conversion functions
For other type conversions, use type conversion functions like `TO_DATETIME`:
```esql
ROW a = "2023-01-23T12:15:00.000Z 127.0.0.1 some.email@foo.com 42"
| GROK a "%{TIMESTAMP_ISO8601:date} %{IP:ip} %{EMAILADDRESS:email} %{NUMBER:num:int}"
| GROK a """%{TIMESTAMP_ISO8601:date} %{IP:ip} %{EMAILADDRESS:email} %{NUMBER:num:int}"""
| KEEP date, ip, email, num
| EVAL date = TO_DATETIME(date)
```
If a field name is used more than once, GROK creates a multi-valued column:
Handling multi-valued columns
When a field name is used more than once, `GROK` creates a multi-valued column:
```esql
FROM addresses
| KEEP city.name, zip_code
| GROK zip_code "%{WORD:zip_parts} %{WORD:zip_parts}"
| GROK zip_code """%{WORD:zip_parts} %{WORD:zip_parts}"""
```
### Limitations

@ -1,6 +1,6 @@
# HASH
The HASH function computes the hash of a given input using a specified algorithm.
Computes the hash of the input using various algorithms such as MD5, SHA, SHA-224, SHA-256, SHA-384, and SHA-512.
## Syntax
@ -8,23 +8,21 @@ The HASH function computes the hash of a given input using a specified algorithm
### Parameters
#### algorithm
#### `algorithm`
The hash algorithm to be used.
Hash algorithm to use.
The supported algorithms are:
- "MD5"
- "SHA-1"
- "SHA-256"
#### `input`
#### input
The value to be hashed.
Input to hash.
## Examples
```esql
FROM messages
| EVAL hashed_content = HASH("SHA-1", content)
| KEEP message_id, hashed_content
FROM sample_data
| WHERE message != "Connection error"
| EVAL md5 = hash("md5", message), sha256 = hash("sha256", message)
| KEEP message, md5, sha256
```
This example computes the MD5 and SHA-256 hashes of the `message` field for rows where the `message` is not "Connection error".
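To cross-check hash values outside Elasticsearch, the same digests can be computed with Python's standard library. This is a sketch under assumptions: the helper name `hash_hex` is hypothetical, the input is assumed to be UTF-8 encoded text, and the output is shown as lowercase hex, which may differ from how ES|QL renders the result.

```python
import hashlib


def hash_hex(algorithm: str, text: str) -> str:
    """Return the hex digest of `text` for a named algorithm such as "md5" or "sha256"."""
    # hashlib.new accepts the algorithm by name, similar in spirit to HASH(algorithm, input).
    return hashlib.new(algorithm, text.encode("utf-8")).hexdigest()


message = "abc"
print(hash_hex("md5", message))
print(hash_hex("sha256", message))
```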

@ -1,6 +1,6 @@
# HYPOT
The HYPOT function is used to calculate the hypotenuse of two numbers.
Calculates the hypotenuse of two numbers. The input can be any numeric values, and the return value is always a double. If either input is `null`, the function returns `null`. Hypotenuses of infinities are also `null`.
## Syntax
@ -10,19 +10,17 @@ The HYPOT function is used to calculate the hypotenuse of two numbers.
#### number1
This is a numeric value. If it's `null`, the function will also return `null`.
Numeric expression. If `null`, the function returns `null`.
#### number2
This is also a numeric value. If it's `null`, the function will also return `null`.
Numeric expression. If `null`, the function returns `null`.
## Examples
The following example calculates the hypotenuse of two values:
```esql
ROW a = 3.0, b = 4.0
| EVAL c = HYPOT(a, b)
```
Note that the HYPOT function always returns the hypotenuse as a double. If either number is infinity, the function returns `null`.
Calculates the hypotenuse of a right triangle with sides `a = 3.0` and `b = 4.0`.
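The `null` and infinity rules described for `HYPOT` can be sketched in Python. This is an illustrative model only: the function name `hypot_or_none` is hypothetical, and `None` stands in for ES|QL's `null`.

```python
import math


def hypot_or_none(number1, number2):
    """Rough model of HYPOT: sqrt(a^2 + b^2), with null propagation."""
    if number1 is None or number2 is None:
        return None  # a null input gives a null result
    result = math.hypot(number1, number2)
    # The docs state that hypotenuses of infinities are null.
    return None if math.isinf(result) else result


print(hypot_or_none(3.0, 4.0))
print(hypot_or_none(None, 4.0))
print(hypot_or_none(math.inf, 1.0))
```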

@ -1,6 +1,6 @@
# IP_PREFIX
The IP_PREFIX function truncates an IP address to a specified prefix length.
Truncates an IP address to a specified prefix length.
## Syntax
@ -8,27 +8,25 @@ The IP_PREFIX function truncates an IP address to a specified prefix length.
### Parameters
#### ip
#### `ip`
The IP address that you want to truncate. This function supports both IPv4 and IPv6 addresses.
The IP address to truncate. Supports both IPv4 and IPv6 addresses and must be of type `ip`.
#### prefixLengthV4
#### `prefixLengthV4`
The prefix length for IPv4 addresses.
The prefix length to apply for IPv4 addresses.
#### prefixLengthV6
#### `prefixLengthV6`
The prefix length for IPv6 addresses.
The prefix length to apply for IPv6 addresses.
## Examples
Truncating IPv4 and IPv6 addresses
```esql
ROW ip4 = TO_IP("1.2.3.4"), ip6 = TO_IP("fe80::cae2:65ff:fece:feb9")
| EVAL ip4_prefix = IP_PREFIX(ip4, 24, 0), ip6_prefix = IP_PREFIX(ip6, 0, 112)
```
```esql
FROM network_logs
| EVAL truncated_ip = IP_PREFIX(ip_address, 16, 0)
| KEEP ip_address, truncated_ip
```
This example truncates the IPv4 address `1.2.3.4` to a `/24` prefix and the IPv6 address `fe80::cae2:65ff:fece:feb9` to a `/112` prefix.
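The truncation can be reproduced with Python's `ipaddress` module as a sanity check. This is a sketch, not the ES|QL implementation: the helper name `ip_prefix` is hypothetical, and the textual form of the result (for example, zero compression in IPv6 addresses) may differ from what ES|QL prints.

```python
import ipaddress


def ip_prefix(ip: str, prefix_length_v4: int, prefix_length_v6: int) -> str:
    """Keep the leading prefix bits of an address and zero the remaining bits."""
    address = ipaddress.ip_address(ip)
    # Pick the prefix length that matches the address family.
    length = prefix_length_v4 if address.version == 4 else prefix_length_v6
    # strict=False allows host bits to be set; they are masked off in the network address.
    network = ipaddress.ip_network(f"{address}/{length}", strict=False)
    return str(network.network_address)


print(ip_prefix("1.2.3.4", 24, 0))
print(ip_prefix("fe80::cae2:65ff:fece:feb9", 0, 112))
```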

@ -1,6 +1,6 @@
# KEEP
## KEEP
The KEEP command allows you to specify which columns to return and in what order.
The `KEEP` command specifies which columns are returned and the order in which they appear in the output.
## Syntax
@ -8,9 +8,9 @@ The KEEP command allows you to specify which columns to return and in what order
### Parameters
#### columns
#### `columns`
A comma-separated list of columns to retain. Wildcards are supported. If an existing column matches multiple provided wildcards or column names, certain rules apply.
A comma-separated list of columns to retain. Supports wildcards. If a column matches multiple expressions, precedence rules determine the final output.
## Note
@ -28,46 +28,54 @@ Important: only the columns in the KEEP command can be used after a KEEP command
## Examples
Return columns in a specified order:
### Return columns in a specific order
The following query returns the `emp_no`, `first_name`, `last_name`, and `height` columns in the specified order:
```esql
FROM employees
| KEEP emp_no, first_name, last_name, height
```
If you do not want to mention each column by name, you can use wildcards to select all columns that match a certain pattern:
### Use wildcards to match column names
This query keeps all columns with names starting with `h`:
```esql
FROM employees
| KEEP h*
```
The wildcard asterisk (`*`) by itself translates to all columns that are not matched by other arguments.
### Combine specific columns and wildcards
This command will first return all columns with a name that starts with `h`, followed by all other columns:
The asterisk wildcard (`*`) matches all columns not explicitly specified. This query returns all columns starting with `h` first, followed by all other columns:
```esql
FROM employees
| KEEP h*, *
```
The following examples demonstrate how precedence rules function when a field name corresponds to multiple expressions.
### Precedence of complete field names over wildcards
A complete field name takes precedence over wildcard expressions:
When a column matches both a complete field name and a wildcard, the complete field name takes precedence:
```esql
FROM employees
| KEEP first_name, last_name, first_name*
```
Wildcard expressions have the same priority, with the last one winning (despite it being a less specific match):
### Wildcard precedence and ordering
If a column matches multiple wildcard expressions, the rightmost expression takes precedence, even if it is less specific:
```esql
FROM employees
| KEEP first_name*, last_name, first_na*
```
A simple wildcard expression `*` has the minimum precedence. The sequence of output is determined by other arguments:
### Lowest precedence for the `*` wildcard
The `*` wildcard has the lowest precedence. The order of other arguments determines the output order:
```esql
FROM employees

@ -0,0 +1,25 @@
## KQL
Performs a KQL query and returns `true` if the provided KQL query string matches the row.
## Syntax
`KQL(query)`
### Parameters
#### `query`
Query string in KQL query string format.
## Examples
```esql
FROM books
| WHERE KQL("author: Faulkner")
| KEEP book_no, author
| SORT book_no
| LIMIT 5
```
This example filters rows where the `author` field matches "Faulkner", keeps the `book_no` and `author` columns, sorts the results by `book_no`, and limits the output to 5 rows.

@ -1,6 +1,6 @@
# LEAST
## LEAST
The LEAST function returns the smallest value from multiple columns.
Returns the minimum value from multiple columns. This function is similar to `MV_MIN` but is designed to operate on multiple columns simultaneously.
## Syntax
@ -8,11 +8,11 @@ The LEAST function returns the smallest value from multiple columns.
### Parameters
#### first
#### `first`
The first column to evaluate.
#### rest
#### `rest`
The remaining columns to evaluate.
@ -23,7 +23,4 @@ ROW a = 10, b = 20
| EVAL l = LEAST(a, b)
```
```esql
ROW x = 5, y = 15, z = 10
| EVAL min_value = LEAST(x, y, z)
```
This example calculates the minimum value between columns `a` and `b`.

@ -1,6 +1,6 @@
# LEFT
The LEFT function returns a substring from the beginning of a specified string.
Returns a substring that extracts a specified number of characters from the beginning of a string.
## Syntax
@ -8,18 +8,16 @@ The LEFT function returns a substring from the beginning of a specified string.
### Parameters
#### string
#### `string`
The string from which a substring will be extracted.
The string from which to return a substring.
#### length
#### `length`
The number of characters to extract from the string.
The number of characters to return.
## Examples
The following example extracts the first three characters from the `last_name` field:
```esql
FROM employees
| KEEP last_name
@ -28,8 +26,4 @@ FROM employees
| LIMIT 5
```
```esql
ROW full_name = "John Doe"
| EVAL first_name = LEFT(full_name, 4)
| KEEP first_name
```
Extracts the first three characters from the `last_name` column, sorts the results alphabetically, and limits the output to the first five rows.

@ -1,6 +1,6 @@
# LENGTH
## LENGTH
The LENGTH function calculates the character length of a given string.
Returns the character length of a string.
## Syntax
@ -8,21 +8,17 @@ The LENGTH function calculates the character length of a given string.
### Parameters
#### string
#### `string`
The string expression for which the length is to be calculated.
String expression. If `null`, the function returns `null`.
## Examples
The following example calculates the character length of the `first_name` field:
```esql
FROM employees
| KEEP first_name, last_name
| EVAL fn_length = LENGTH(first_name)
FROM airports
| WHERE country == "India"
| KEEP city
| EVAL fn_length = LENGTH(city)
```
```esql
ROW message = "Hello, World!"
| EVAL message_length = LENGTH(message)
```
This example calculates the character length of the `city` field for airports located in India.

@ -1,6 +1,6 @@
# LIMIT
## LIMIT
The LIMIT command is used to restrict the number of rows returned by a query.
The `LIMIT` command restricts the number of rows returned by a query.
## Syntax
@ -8,13 +8,66 @@ The LIMIT command is used to restrict the number of rows returned by a query.
### Parameters
#### max_number_of_rows
#### `max_number_of_rows`
This parameter specifies the maximum number of rows to be returned.
The maximum number of rows to return.
## Description
The `LIMIT` command restricts the number of rows returned by a query. A query without an explicit `LIMIT` uses an implicit limit of 1,000 rows. For example:
```esql
FROM index
| WHERE field == "value"
```
is equivalent to:
```esql
FROM index
| WHERE field == "value"
| LIMIT 1000
```
Queries cannot return more than 10,000 rows, regardless of the value specified in the `LIMIT` command. This is a configurable upper limit.
### Overcoming the 10,000 Row Limit
To address this limitation:
- Modify the query to reduce the result set size by using `WHERE` to filter the data.
- Perform post-query processing within the query itself. For example, use the `STATS` command to aggregate data.
The 10,000-row limit applies only to the number of rows returned by the query, not to the number of documents processed. The query operates on the full dataset.
Consider these examples:
```esql
FROM index
| WHERE field0 == "value"
| LIMIT 20000
```
and
```esql
FROM index
| STATS AVG(field1) BY field2
| LIMIT 20000
```
In both cases, the filtering by `field0` in the first query or the grouping by `field2` in the second query is applied to all documents in the `index`. However, the output is capped at 10,000 rows, even if more rows are available.
### Configuring Limits
The default and maximum limits can be adjusted using the following dynamic cluster settings:
- `esql.query.result_truncation_default_size`
- `esql.query.result_truncation_max_size`
Increasing these limits may lead to higher memory usage, longer processing times, and increased internode traffic within and across clusters.
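The interaction between the implicit default and the upper cap can be summarised in a few lines of Python. This is only a sketch of the rule as described above: the function name `effective_limit` and its parameter names are hypothetical, and the default values of 1,000 and 10,000 are the ones stated in this document for the two cluster settings.

```python
def effective_limit(requested=None, default_size=1000, max_size=10000):
    """Number of rows a query can return, given an optional explicit LIMIT value."""
    # Without an explicit LIMIT, the implicit default applies.
    limit = default_size if requested is None else requested
    # The result is always capped at the configured maximum.
    return min(limit, max_size)


print(effective_limit())        # no LIMIT in the query
print(effective_limit(5))       # LIMIT 5
print(effective_limit(20000))   # LIMIT 20000 is capped
```

The cap applies only to returned rows. Filtering and aggregation still run over the full dataset.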
## Examples
This example demonstrates how to limit the number of rows returned to 5.
Limit the result to the first 5 rows, sorted by `emp_no` in ascending order:
```esql
FROM employees
@ -22,67 +75,13 @@ FROM employees
| LIMIT 5
```
This example shows how to limit the number of rows after applying a filter:
```esql
FROM employees
| WHERE department == "Engineering"
| LIMIT 10
```
This example demonstrates limiting the number of rows after performing an aggregation:
```esql
FROM employees
| STATS avg_salary = AVG(salary) BY department
| LIMIT 3
```
This example shows how to limit the number of rows after sorting the data:
```esql
FROM employees
| SORT hire_date DESC
| LIMIT 7
```
This example demonstrates the use of `LIMIT` in conjunction with multiple other commands:
```esql
FROM employees
| WHERE hire_date > "2020-01-01"
| SORT salary DESC
| KEEP first_name, last_name, salary
| LIMIT 5
```
`LIMIT` can and should be used as early as possible in the query.
For example, this query applies `SORT` and `LIMIT` as early as it can, before further computations:
```esql
FROM sets
| EVAL count = MV_COUNT(values)
| SORT count DESC
| LIMIT 5
| EVAL min = MV_MIN(values), max = MV_MAX(values), avg = MV_AVG(values)
| KEEP set_id, min, max, avg
```
## Limitations
There is no way to achieve pagination with `LIMIT` because there is no offset parameter.
- Queries cannot return more than 10,000 rows, even if the `LIMIT` value exceeds this threshold.
A query will never return more than 10,000 rows. This limitation only applies to the number of rows retrieved by the query. The query and any aggregations will still run on the full dataset.
To work around this limitation:
- Reduce the size of the result set by modifying the query to only return relevant data. This can be achieved by using the WHERE command to select a smaller subset of the data.
- Shift any post-query processing to the query itself. The ES|QL STATS ... BY command can be used to aggregate data within the query.
## Notes
The default and maximum limits can be adjusted using the following dynamic cluster settings:
- `esql.query.result_truncation_default_size`
- `esql.query.result_truncation_max_size`
- Adjusting the default or maximum limits can impact performance and resource usage.

@ -1,6 +1,6 @@
# LOCATE
The LOCATE function returns the position of a specified substring within a string.
Returns an integer indicating the position of a substring within another string. If the substring is not found, it returns `0`. Note that string positions start from `1`.
## Syntax
@ -8,29 +8,28 @@ The LOCATE function returns the position of a specified substring within a strin
### Parameters
#### string
#### `string`
The string in which you want to search for the substring.
An input string.
#### substring
#### `substring`
The substring you want to find in the string.
A substring to locate within the input string.
#### start
#### `start`
The starting index for the search.
The start index. This parameter is optional.
## Examples
Locate a substring within a string
```esql
ROW a = "hello"
| EVAL a_ll = LOCATE(a, "ll")
```
```esql
ROW phrase = "Elasticsearch is powerful"
| EVAL position = LOCATE(phrase, "powerful")
```
This example finds the position of the substring `"ll"` within the string `"hello"`. The result is `3`.
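The 1-based positions and the `0` result for a missing substring can be modelled in Python. This is a sketch with a hypothetical helper name `locate`. Treating the optional `start` argument as a 1-based index is an assumption based on the statement that string positions start from `1`.

```python
def locate(string: str, substring: str, start: int = 1) -> int:
    """Rough model of LOCATE: 1-based position of `substring`, or 0 if absent."""
    # str.find is 0-based and returns -1 when nothing is found,
    # so adding 1 maps "not found" to 0 and shifts positions to 1-based.
    return string.find(substring, max(start, 1) - 1) + 1


print(locate("hello", "ll"))
print(locate("hello", "z"))
print(locate("hello", "l", 4))
```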
## Notes

@ -1,6 +1,6 @@
# LOG
The LOG function calculates the logarithm of a given value to a specified base.
Calculates the logarithm of a numeric value to a specified base. If the base is not provided, it defaults to the natural logarithm (base e).
## Syntax
@ -8,22 +8,34 @@ The LOG function calculates the logarithm of a given value to a specified base.
### Parameters
#### base
#### `base`
The base of the logarithm. If the base is `null`, the function will return `null`. If the base is not provided, the function will return the natural logarithm (base e) of the value.
- Base of the logarithm. If `null`, the function returns `null`. If not provided, the function calculates the natural logarithm (base e).
#### number
#### `number`
The numeric value for which the logarithm is to be calculated. If the number is `null`, the function will return `null`.
- Numeric expression. If `null`, the function returns `null`.
## Examples
Logarithm with a specified base
Calculate the logarithm of 8 to base 2:
```esql
ROW base = 2.0, value = 8.0
| EVAL s = LOG(base, value)
```
Natural logarithm (base e)
Calculate the natural logarithm of 100:
```esql
ROW value = 100
| EVAL s = LOG(value)
```
## Limitations
- Logs of zero, negative numbers, and a base of one return `null` and generate a warning.
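The argument order and the `null` cases can be sketched in Python. The helper `esql_log` is hypothetical, `None` stands in for `null`, and the warning that ES|QL generates is not modelled. Returning `None` for a non-positive base is an assumption; the text above only names zero, negative numbers, and a base of one.

```python
import math


def esql_log(*args):
    """Rough model of LOG(base, number) and LOG(number)."""
    if len(args) == 1:
        base, number = math.e, args[0]   # single argument: natural logarithm
    else:
        base, number = args              # two arguments: the base comes first
    if base is None or number is None:
        return None                      # null in, null out
    if number <= 0 or base <= 0 or base == 1:
        return None                      # undefined logarithms return null
    return math.log(number) / math.log(base)


print(esql_log(2.0, 8.0))   # approximately 3
print(esql_log(100))        # natural logarithm of 100
print(esql_log(1, 8.0))     # base of one
```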

@ -1,6 +1,6 @@
# LTRIM
The LTRIM function is used to remove leading whitespaces from a string.
Removes leading whitespaces from a string.
## Syntax
@ -8,9 +8,9 @@ The LTRIM function is used to remove leading whitespaces from a string.
### Parameters
#### string
#### `string`
This is the string expression from which you want to remove leading whitespaces. If the string is `null`, the function will return `null`.
String expression. If `null`, the function returns `null`.
## Examples
@ -22,8 +22,4 @@ ROW message = " some text ", color = " red "
| EVAL color = CONCAT("'", color, "'")
```
```esql
ROW text = " example text "
| EVAL trimmed_text = LTRIM(text)
| EVAL formatted_text = CONCAT("Trimmed: '", trimmed_text, "'")
```
This example removes leading whitespaces from the `message` and `color` columns, then wraps the resulting strings in single quotes.

@ -1,36 +1,47 @@
# MATCH
`MATCH` is a function used to execute a match query on a specified field. It works on various field types including text fields, boolean, dates, and numeric types. It returns 'true' when the provided query matches the row.
The `MATCH` function performs a match query on the specified field. It is equivalent to the `match` query in the Elasticsearch Query DSL and can be used to search for values in various field types, including text, semantic_text, keyword, boolean, dates, and numeric types.
`MATCH` supports function named parameters to specify additional options for the match query. For a simplified syntax, the match operator `:` can be used instead of `MATCH`. The function returns `true` if the provided query matches the row.
## Syntax
`MATCH (field, query)`
`MATCH(field, query, options)`
### Parameters
#### `field`
This represents the field that the query will target. If the field contains multiple values,
`MATCH` will process each value.
The field that the query will target.
#### `query`
This is the value that is being searched in the provided field.
The value to find in the specified field.
#### `options`
(Optional) Additional match query options provided as function named parameters. Refer to the match query documentation for more details.
## Examples
In this example, `"Faulkner"` is matched against the `author` field in `books` data. `MATCH` returns true if it finds the provided query, in this case `"Faulkner"` in the author field. The query then keeps the columns `book_no` and `author`, sorts by `book_no` and limits the result to 5.
Match on a specific field
```esql
FROM books
| WHERE MATCH(author, "Faulkner")
| KEEP book_no, author
| SORT book_no
| LIMIT 5;
| LIMIT 5
```
## Notes
This example retrieves books where the `author` field matches "Faulkner", keeping only the `book_no` and `author` fields, sorting by `book_no`, and limiting the results to 5 rows.
- Do not use `MATCH` in production: it is in technical preview and may be changed or removed in a future release.
- `MATCH` relies on the Elasticsearch match query under the hood and should be used for full-text search only. For more traditional
text matching, use `LIKE` or `RLIKE` instead.
Match with additional options
```esql
FROM books
| WHERE MATCH(title, "Hobbit Back Again", {"operator": "AND"})
| KEEP title
```
This example searches for books where the `title` field matches "Hobbit Back Again" using the `AND` operator, and keeps only the `title` field in the results.

@ -1,6 +1,6 @@
# MAX
The MAX function calculates the maximum value of a specified field.
The `MAX` function returns the maximum value of a field.
## Syntax
@ -8,22 +8,26 @@ The MAX function calculates the maximum value of a specified field.
### Parameters
#### field
#### `field`
The field for which the maximum value is to be calculated.
The field for which the maximum value is calculated.
## Examples
Calculate the maximum number of languages known by employees:
Basic Usage
```esql
FROM employees
| STATS MAX(languages)
```
The MAX function can be used with inline functions:
Calculate the maximum value of the `languages` field.
Using Inline Functions
```esql
FROM employees
| STATS max_avg_salary_change = MAX(MV_AVG(salary_change))
```
Calculate the maximum value of the average salary change by first averaging the multiple values per row using the `MV_AVG` function and then applying the `MAX` function.
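The two-step evaluation (average per row, then maximum across rows) can be illustrated in Python. The data and the helper name `max_of_row_averages` are hypothetical and serve only to show the order of operations; `null` handling is not modelled.

```python
from statistics import mean


def max_of_row_averages(rows):
    """Average each row's multivalued field first, then take the maximum of those averages."""
    return max(mean(row) for row in rows)


# Hypothetical salary_change values, one list per employee row.
salary_changes = [
    [1.0, 3.0],
    [10.0],
    [-2.0, 4.0, 4.0],
]
print(max_of_row_averages(salary_changes))
```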

@ -1,6 +1,6 @@
# MEDIAN
The MEDIAN function calculates the median value of a numeric field. The median is the value that is greater than half of all values and less than half of all values, also known as the 50th percentile.
The `MEDIAN` function calculates the value that is greater than half of all values and less than half of all values, also known as the 50th percentile. The result is usually approximate.
## Syntax
@ -8,20 +8,22 @@ The MEDIAN function calculates the median value of a numeric field. The median i
### Parameters
#### number
#### `number`
The numeric field for which the median is calculated.
The input numeric value for which the median is calculated.
## Examples
Calculate the median salary:
Calculating the median and 50th percentile of salaries
```esql
FROM employees
| STATS MEDIAN(salary)
| STATS MEDIAN(salary), PERCENTILE(salary, 50)
```
Calculate the median of the maximum values of a multivalued column:
Calculating the median of maximum values in a multivalued column
To calculate the median of the maximum values of a multivalued column, first use `MV_MAX` to get the maximum value per row, and then use the result with the `MEDIAN` function:
```esql
FROM employees
@ -30,4 +32,5 @@ FROM employees
## Limitations
- The MEDIAN function is usually approximate and non-deterministic. This means you can get slightly different results using the same data.
- The `MEDIAN` function is non-deterministic, meaning you may get slightly different results when using the same data.
- Like the `PERCENTILE` function, the `MEDIAN` function provides approximate results.

@ -1,6 +1,8 @@
# MEDIAN_ABSOLUTE_DEVIATION
The MEDIAN_ABSOLUTE_DEVIATION function calculates the median absolute deviation, a measure of variability. It is particularly useful for describing data that may have outliers or may not follow a normal distribution. In such cases, it can be more descriptive than standard deviation. The function computes the median of each data point's deviation from the median of the entire sample.
Returns the median absolute deviation, a robust measure of variability. It is particularly useful for describing data with outliers or non-normal distributions, as it can be more descriptive than standard deviation. The median absolute deviation is calculated as the median of the absolute deviations from the median of the entire sample. For a random variable `X`, it is defined as `median(|median(X) - X|)`.
**Note:** This function is usually approximate, similar to `PERCENTILE`.
## Syntax
@ -8,27 +10,30 @@ The MEDIAN_ABSOLUTE_DEVIATION function calculates the median absolute deviation,
### Parameters
#### number
#### `number`
The numeric expression for which the median absolute deviation is to be calculated.
The input numeric field or expression.
## Examples
Calculate the median salary and the median absolute deviation of salaries:
Basic Usage
```esql
FROM employees
| STATS MEDIAN(salary), MEDIAN_ABSOLUTE_DEVIATION(salary)
```
Calculate the median absolute deviation of the maximum values of a multivalued column:
Calculate the median and the median absolute deviation of employee salaries.
Using Inline Functions
```esql
FROM employees
| STATS m_a_d_max_salary_change = MEDIAN_ABSOLUTE_DEVIATION(MV_MAX(salary_change))
```
Calculate the median absolute deviation of the maximum values of a multivalued column by first using `MV_MAX` to get the maximum value per row.
## Limitations
- The `MEDIAN_ABSOLUTE_DEVIATION` function is non-deterministic, which means you can get slightly different results using the same data.
- The `MEDIAN_ABSOLUTE_DEVIATION` function is usually approximate, which means the results may not be exact.
- `MEDIAN_ABSOLUTE_DEVIATION` is non-deterministic, meaning that slightly different results may be returned when using the same data.
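The definition `median(|median(X) - X|)` can be computed exactly in Python for small samples. This is a sketch with hypothetical data and a hypothetical helper name. Unlike the ES|QL function, which is usually approximate, this version is exact and deterministic.

```python
from statistics import median


def median_absolute_deviation(values):
    """Exact median absolute deviation: median(|median(X) - X|)."""
    center = median(values)
    return median(abs(center - value) for value in values)


# One large outlier barely affects the result, unlike standard deviation.
sample = [1, 2, 3, 4, 100]
print(median(sample))
print(median_absolute_deviation(sample))
```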

@ -1,6 +1,6 @@
# MIN
The MIN function calculates the minimum value of a specified field.
The `MIN` function calculates the minimum value of a field.
## Syntax
@ -8,22 +8,26 @@ The MIN function calculates the minimum value of a specified field.
### Parameters
#### field
#### `field`
The field for which the minimum value is to be calculated.
The field for which the minimum value is calculated.
## Examples
Calculate the minimum number of languages spoken by employees:
Basic Usage
```esql
FROM employees
| STATS MIN(languages)
```
The MIN function can be used with inline functions:
Calculate the minimum value of the `languages` field.
Using Inline Functions
```esql
FROM employees
| STATS min_avg_salary_change = MIN(MV_AVG(salary_change))
```
Calculate the minimum value of the average salary change by first averaging the multiple values per row using the `MV_AVG` function and then applying the `MIN` function.

@ -1,6 +1,6 @@
# MV_APPEND
MV_APPEND is a function that concatenates the values of two multi-value fields.
Concatenates the values of two multi-value fields into a single field.
## Syntax
@ -8,9 +8,15 @@ MV_APPEND is a function that concatenates the values of two multi-value fields.
### Parameters
#### field1
#### `field1`
The first multi-value field to be concatenated.
The first multi-value field to concatenate.
#### `field2`
The second multi-value field to concatenate.
## Examples
```esql
ROW a = ["foo", "bar"], b = ["baz", "qux"]
@ -23,3 +29,7 @@ ROW x = [1, 2, 3], y = [4, 5, 6]
| EVAL z = MV_APPEND(x, y)
| KEEP x, y, z
```
## Limitations
No specific limitations are mentioned in the source documentation.

@ -1,6 +1,6 @@
# MV_AVG
The MV_AVG function calculates the average of all values in a multivalued field and returns a single value.
Converts a multivalued field into a single-valued field containing the average of all its values.
## Syntax
@ -8,7 +8,7 @@ The MV_AVG function calculates the average of all values in a multivalued field
### Parameters
#### number
#### `number`
A multivalued expression.
@ -19,8 +19,11 @@ ROW a=[3, 5, 1, 6]
| EVAL avg_a = MV_AVG(a)
```
**Retrieving the average value from a multivalued field**
Calculate the average of the values in the multivalued column `a`.
```esql
FROM bag_of_numbers
| EVAL avg = MV_AVG(numbers)
```
Retrieve the average value from a multivalued field

@ -1,6 +1,6 @@
# MV_CONCAT
MV_CONCAT is a function that transforms a multivalued string expression into a single valued column. It concatenates all values and separates them with a specified delimiter.
Converts a multivalued string expression into a single-valued column by concatenating all values, separated by a specified delimiter.
## Syntax
@ -8,26 +8,30 @@ MV_CONCAT is a function that transforms a multivalued string expression into a s
### Parameters
#### string
#### `string`
A multivalue expression.
A multivalued expression.
#### delim
#### `delim`
This is the delimiter that separates the concatenated values.
The delimiter used to separate the concatenated values.
## Examples
The following example concatenates the values in the array ["foo", "zoo", "bar"] with a comma and a space as the delimiter:
Concatenating string values
```esql
ROW a=["foo", "zoo", "bar"]
| EVAL j = MV_CONCAT(a, ", ")
```
If you want to concatenate non-string columns, you need to convert them to strings first using the `TO_STRING` function:
Concatenates the values in the array ["foo", "zoo", "bar"] with a comma and a space as the delimiter.
Concatenating non-string values
```esql
ROW a=[10, 9, 8]
| EVAL j = MV_CONCAT(TO_STRING(a), ", ")
```
Converts the numeric values in the multivalued column `a` to strings using `TO_STRING`, then concatenates them into a single string, separated by `", "`.


@ -1,6 +1,6 @@
# MV_COUNT
The MV_COUNT function calculates the total number of values in a multivalued expression.
Converts a multivalued expression into a single-valued column containing the count of the number of values.
## Syntax
@ -8,7 +8,7 @@ The MV_COUNT function calculates the total number of values in a multivalued exp
### Parameters
#### field
#### `field`
A multivalued expression.
@ -19,8 +19,10 @@ ROW a=["foo", "zoo", "bar"]
| EVAL count_a = MV_COUNT(a)
```
**Counting the number of elements in a multivalued field**
Count the number of values in the multivalued column `a`.
```esql
FROM bag_of_numbers
| EVAL count = MV_COUNT(numbers)
```
Count the number of elements in the multivalued field `numbers`.


@ -1,6 +1,8 @@
# MV_DEDUPE
The MV_DEDUPE function is used to eliminate duplicate values from a multivalued field.
Removes duplicate values from a multivalued field.
**Note:** `MV_DEDUPE` may, but won't always, sort the values in the column.
## Syntax
@ -8,9 +10,9 @@ The MV_DEDUPE function is used to eliminate duplicate values from a multivalued
### Parameters
#### field
#### `field`
This is a multivalue expression.
A multivalue expression.
## Examples
@ -19,11 +21,4 @@ ROW a=["foo", "foo", "bar", "foo"]
| EVAL dedupe_a = MV_DEDUPE(a)
```
```esql
ROW b=["apple", "apple", "banana", "apple", "banana"]
| EVAL dedupe_b = MV_DEDUPE(b)
```
## Notes
While MV_DEDUPE may sort the values in the column, it's not guaranteed to always do so.
This example removes duplicate values from the multivalued column `a`.
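Because `MV_DEDUPE` does not guarantee any ordering, a query that needs a predictable order can wrap the result in `MV_SORT`. A minimal sketch, assuming the default ascending order is what you want:

```esql
ROW a = ["foo", "foo", "bar", "foo"]
// Deduplicate first, then sort so the output order does not depend on MV_DEDUPE
| EVAL dedupe_sorted = MV_SORT(MV_DEDUPE(a))
```

Here `dedupe_sorted` contains `["bar", "foo"]`.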


@ -1,6 +1,8 @@
# MV_EXPAND
The MV_EXPAND command is used to expand multivalued columns into individual rows, replicating the other columns for each new row.
The `MV_EXPAND` command expands multivalued columns into one row per value, duplicating other columns.
> **Note:** This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.
## Syntax
@ -10,33 +12,31 @@ The MV_EXPAND command is used to expand multivalued columns into individual rows
#### column
This is the multivalued column that you want to expand.
The multivalued column to expand.
## Notes
The output rows produced by `MV_EXPAND` can be in any order and may not respect preceding `SORT` commands. To ensure a specific ordering, place a `SORT` command after any `MV_EXPAND` commands.
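The ordering note above can be sketched as follows. The column names are illustrative only; the point is that `SORT` comes after `MV_EXPAND`:

```esql
ROW a = [3, 1, 2], b = "b"
| MV_EXPAND a
// Sort after expanding; ordering applied before MV_EXPAND may not be preserved
| SORT a
```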
## Examples
Expanding a multivalued column `a` into individual rows:
```esql
ROW a=[1,2,3], b="b", j=["a","b"]
| MV_EXPAND a
```
Expanding two multivalued columns `a` and `j` into individual rows:
Expand a multivalued column `a` into individual rows:
```esql
ROW a=[1,2,3], b="b", j=["a","b"]
| MV_EXPAND a
| MV_EXPAND j
```
Expanding a multivalued column and then filtering the results:
Expand two multivalued columns `a` and `j` into individual rows:
```esql
ROW a=[1,2,3,4,5], b="b"
| MV_EXPAND a
| WHERE a > 2
```
## Notes
This feature is currently in technical preview and may be subject to changes or removal in future releases.
Expand a multivalued column and then filter the results:


@ -1,6 +1,6 @@
# MV_FIRST
The MV_FIRST function converts a multivalued expression into a single valued column containing the first value.
Converts a multivalued expression into a single-valued column containing the first value. This is particularly useful when working with functions like `SPLIT` that produce multivalued columns in a known order.
## Syntax
@ -8,23 +8,22 @@ The MV_FIRST function converts a multivalued expression into a single valued col
### Parameters
#### field
#### `field`
A multivalue expression.
A multivalued expression.
## Examples
Extracting the first value from a multivalued column
```esql
ROW a="foo;bar;baz"
| EVAL first_a = MV_FIRST(SPLIT(a, ";"))
```
**Retrieving the first element from a multivalued field**
```esql
FROM bag_of_numbers
| EVAL first = MV_FIRST(numbers)
```
This example splits the string `a` into multiple values using the `SPLIT` function and extracts the first value, resulting in `first_a = "foo"`.
## Notes
The MV_FIRST function is particularly useful when reading from a function that emits multivalued columns in a known order, such as SPLIT. However, it's important to note that the order in which multivalued fields are read from underlying storage is not guaranteed. While it's often ascending, this should not be relied upon. If you need the minimum value, use the MV_MIN function instead of MV_FIRST. MV_MIN has optimizations for sorted values, so there isn't a performance benefit to MV_FIRST.
- The order in which multivalued fields are read from underlying storage is not guaranteed. While it is often ascending, this behavior should not be relied upon.
- If you need the minimum value, use `MV_MIN` instead of `MV_FIRST`. `MV_MIN` is optimized for sorted values, so `MV_FIRST` offers no performance benefit.
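To illustrate the difference between the first value and the minimum value, here is a small sketch that uses a literal row, where the values keep the order in which they are written:

```esql
ROW a = [3, 1, 2]
// first_a takes the value in the first position; min_a takes the smallest value
| EVAL first_a = MV_FIRST(a), min_a = MV_MIN(a)
```

With this literal input, `first_a` is `3` and `min_a` is `1`. For fields read from an index, the position of each value is not guaranteed, so only `min_a` is dependable.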


@ -1,6 +1,8 @@
# MV_LAST
The MV_LAST function converts a multivalued expression into a single valued column containing the last value.
Converts a multivalue expression into a single-valued column containing the last value. This is particularly useful when working with functions that produce multivalued columns in a known order, such as `SPLIT`.
The order in which multivalued fields are read from underlying storage is not guaranteed. While it is often ascending, this behavior should not be relied upon. If you need the maximum value, use `MV_MAX` instead of `MV_LAST`. `MV_MAX` is optimized for sorted values, so `MV_LAST` offers no performance benefit.
## Syntax
@ -8,25 +10,17 @@ The MV_LAST function converts a multivalued expression into a single valued colu
### Parameters
#### field
#### `field`
A multivalue expression.
## Examples
Extracting the last value from a multivalued column
```esql
ROW a="foo;bar;baz"
| EVAL last_a = MV_LAST(SPLIT(a, ";"))
```
**Retrieving the last element from a multivalued field**
```esql
FROM bag_of_numbers
| EVAL last = MV_LAST(numbers)
```
## Notes
The MV_LAST function is particularly useful when reading from a function that emits multivalued columns in a known order, such as SPLIT. However, the order in which multivalued fields are read from underlying storage is not guaranteed. It is often ascending, but this should not be relied upon. If you need the maximum value, use the MV_MAX function instead of MV_LAST. MV_MAX has optimizations for sorted values, so there is no performance benefit to using MV_LAST.
This example splits the string `a` into multiple values using the `SPLIT` function and then extracts the last value, resulting in `last_a = "baz"`.


@ -1,6 +1,6 @@
# MV_MAX
MV_MAX function converts a multivalued expression into a single valued column containing the maximum value.
Converts a multivalued expression into a single-valued column containing the maximum value.
## Syntax
@ -8,21 +8,26 @@ MV_MAX function converts a multivalued expression into a single valued column co
### Parameters
#### field
#### `field`
A multivalue expression.
Multivalue expression.
## Examples
The following example demonstrates the use of MV_MAX function:
```esql
ROW a=[3, 5, 1]
| EVAL max_a = MV_MAX(a)
```
**Retrieving the max value from a multivalued field**
Finds the maximum value in the multivalued column `a`, resulting in `max_a = 5`.
```esql
FROM bag_of_numbers
| EVAL max = MV_MAX(numbers)
```
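`MV_MAX` also works on `keyword` values. A minimal sketch with string input:

```esql
ROW a = ["foo", "zoo", "bar"]
| EVAL max_a = MV_MAX(a)
```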
Finds the maximum value in the column `a` by comparing the strings' UTF-8 representations, resulting in `max_a = "zoo"`.
## Supported Types
This function can be used with any column type, including `keyword` columns. For `keyword` columns, it picks the last string by comparing their UTF-8 representation byte by byte.


@ -1,6 +1,6 @@
# MV_MEDIAN
The MV_MEDIAN function converts a multivalued field into a single valued field containing the median value.
Converts a multivalued field into a single-valued field containing the median value.
## Syntax
@ -8,9 +8,9 @@ The MV_MEDIAN function converts a multivalued field into a single valued field c
### Parameters
#### number
#### `number`
A multivalue expression.
Multivalue expression.
## Examples
@ -19,9 +19,11 @@ ROW a=[3, 5, 1]
| EVAL median_a = MV_MEDIAN(a)
```
If the row has an even number of values for a column, the result will be the average of the middle two entries. If the column is not floating point, the average rounds **down**:
Calculate the median value of the multivalued column `a`.
```esql
ROW a=[3, 7, 1, 6]
| EVAL median_a = MV_MEDIAN(a)
```
For rows with an even number of values, the result is the average of the middle two entries. If the column is not of a floating-point type, the average rounds **down**.
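For contrast, the same values written as floating-point numbers are not rounded. This is a sketch of the rule above rather than an example taken from the source documentation:

```esql
ROW a = [3.0, 7.0, 1.0, 6.0]
// Sorted: [1.0, 3.0, 6.0, 7.0]; the middle two values average to 4.5
| EVAL median_a = MV_MEDIAN(a)
```

With integer input `[3, 7, 1, 6]` the same average rounds down to `4`.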


@ -1,6 +1,6 @@
# MV_MEDIAN_ABSOLUTE_DEVIATION
The MV_MEDIAN_ABSOLUTE_DEVIATION function transforms a multi-valued field into a single-valued field that retains the median absolute deviation. It computes this as a median of the deviation of each datum from the entire sample's median. In other words, for a random variable `X`, the median absolute deviation can be represented as `median(|median(X) - X|)`.
Converts a multivalued field into a single-valued field containing the median absolute deviation. The median absolute deviation is calculated as the median of each data point's deviation from the median of the entire sample. For a random variable `X`, it is defined as `median(|median(X) - X|)`.
## Syntax
@ -8,17 +8,22 @@ The MV_MEDIAN_ABSOLUTE_DEVIATION function transforms a multi-valued field into a
### Parameters
#### number
#### `number`
A multi-valued expression.
*Notice*: If the field contains an even number of values, the median is calculated as the average of the two middle values. If the values are not floating-point numbers, the average is rounded towards 0.
A multivalue expression.
## Examples
Calculating the median absolute deviation and median
```esql
ROW values = [0, 2, 5, 6]
| EVAL median_absolute_deviation = MV_MEDIAN_ABSOLUTE_DEVIATION(values), median = MV_MEDIAN(values)
```
This example illustrates the computation of the median absolute deviation and the median from a list of numerical values.
This example calculates the median absolute deviation and the median for the multivalued field `values`.
## Notes
- If the field contains an even number of values, the medians are calculated as the average of the middle two values.
- If the values are not floating-point numbers, the averages are rounded towards 0.
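The two notes above can be traced by hand for the integer input used in the example. The comments are a worked sketch of the definition `median(|median(X) - X|)`, not output copied from a cluster:

```esql
// median(values): the middle values 2 and 5 average to 3.5, rounded towards 0 => 3
// |3 - x| for each x in [0, 2, 5, 6] gives [3, 1, 2, 3]; sorted: [1, 2, 3, 3]
// median of the deviations: 2 and 3 average to 2.5, rounded towards 0 => 2
ROW values = [0, 2, 5, 6]
| EVAL mad = MV_MEDIAN_ABSOLUTE_DEVIATION(values)
```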


@ -1,6 +1,6 @@
# MV_MIN
The MV_MIN function converts a multivalued expression into a single valued column containing the minimum value.
Converts a multivalued expression into a single-valued column containing the minimum value.
## Syntax
@ -8,19 +8,26 @@ The MV_MIN function converts a multivalued expression into a single valued colum
### Parameters
#### field
#### `field`
This is a multivalue expression.
A multivalued expression.
## Supported Types
This function can be used with any column type, including `keyword` columns. For `keyword` columns, it selects the first string by comparing their UTF-8 representation byte by byte.
## Examples
```esql
ROW a=[2, 1]
| EVAL min_a = MV_MIN(a)
```
Extracts the minimum value from the multivalued column `a`, resulting in `min_a = 1`.
**Retrieving the min value from a multivalued field**
```esql
FROM bag_of_numbers
| EVAL min = MV_MIN(numbers)
```
Extracts the minimum value from the multivalued column `numbers` by comparing the values lexicographically.
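For `keyword` values, the Supported Types section says the first string by UTF-8 byte comparison is selected. A minimal sketch of that case:

```esql
ROW a = ["foo", "bar"]
// "bar" sorts before "foo" byte by byte, so it is the minimum
| EVAL min_a = MV_MIN(a)
```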


@ -1,6 +1,6 @@
# MV_PERCENTILE
This function converts a multivalued field into a single-valued field. The single-valued field it produces contains the value at which a specified percentage of observed values occur.
Converts a multivalued field into a single-valued field containing the value at which a certain percentage of observed values occur.
## Syntax
@ -8,19 +8,19 @@ This function converts a multivalued field into a single-valued field. The singl
### Parameters
#### number
#### `number`
This refers to a multivalue expression.
A multivalue expression.
#### percentile
#### `percentile`
Value for the percentile to calculate. The value should be between 0 and 100. Values outside this range return null.
The percentile to calculate. Must be a number between 0 and 100. Numbers outside this range will return `null`.
## Examples
Consider an instance where you want to calculate the 50th percentile (or median) of a set of numbers - `[5, 5, 10, 12, 5000]`. This can be done using the following statement.
```esql
ROW values = [5, 5, 10, 12, 5000]
| EVAL p50 = MV_PERCENTILE(values, 50), median = MV_MEDIAN(values)
```
This example calculates the 50th percentile (median) of the multivalued field `values` and compares it to the result of the `MV_MEDIAN` function.
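The range rule for `percentile` can be explored with the boundary values and one out-of-range value. This is a sketch; the column names are illustrative, and only the `null` result for the out-of-range call is stated by the parameter description:

```esql
ROW values = [5, 5, 10, 12, 5000]
| EVAL p0 = MV_PERCENTILE(values, 0),
       p100 = MV_PERCENTILE(values, 100),
       // 101 is outside the 0 to 100 range, so this column is expected to be null
       out_of_range = MV_PERCENTILE(values, 101)
```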


@ -1,6 +1,6 @@
# MV_PSERIES_WEIGHTED_SUM
The MV_PSERIES_WEIGHTED_SUM function transforms a multivalued expression into a single-valued column. It does this by multiplying each element in the input list by its corresponding term in a P-Series and then calculating the sum.
Converts a multivalued expression into a single-valued column by multiplying each element in the input list by its corresponding term in the P-Series and computing the sum.
## Syntax
@ -8,24 +8,22 @@ The MV_PSERIES_WEIGHTED_SUM function transforms a multivalued expression into a
### Parameters
#### number
#### `number`
This is a multivalue expression.
A multivalue expression.
#### p
#### `p`
A number that represents the *p* parameter in the P-Series. It influences the contribution of each element to the weighted sum.
A constant number representing the *p* parameter in the P-Series. It determines the impact of each element's contribution to the weighted sum.
## Examples
Calculating the weighted sum of a multivalued column
```esql
ROW a = [70.0, 45.0, 21.0, 21.0, 21.0]
| EVAL sum = MV_PSERIES_WEIGHTED_SUM(a, 1.5)
| KEEP sum
```
```esql
ROW b = [10.0, 20.0, 30.0, 40.0, 50.0]
| EVAL weighted_sum = MV_PSERIES_WEIGHTED_SUM(b, 2.0)
| KEEP weighted_sum
```
This example calculates the weighted sum of the multivalued column `a` using a P-Series parameter of `1.5`. The result is stored in the `sum` column.
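To make the weighting concrete, the query below repeats the calculation next to a hand-expanded version. It assumes that the term for the element at position `i` (counting from 1) is `1 / i^p`, which is the usual definition of a P-Series; the `manual` column name is illustrative:

```esql
ROW a = [70.0, 45.0, 21.0, 21.0, 21.0]
| EVAL sum = MV_PSERIES_WEIGHTED_SUM(a, 1.5)
// Hand-expanded form under the assumption above: each value divided by position^p
| EVAL manual = 70.0 / POW(1, 1.5) + 45.0 / POW(2, 1.5) + 21.0 / POW(3, 1.5) + 21.0 / POW(4, 1.5) + 21.0 / POW(5, 1.5)
| KEEP sum, manual
```

Both columns should come out at roughly `94.45`. The last digits may differ because of floating-point rounding.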


@ -1,6 +1,6 @@
# MV_SLICE
The MV_SLICE function is used to extract a subset of a multivalued field using specified start and end index values.
The `MV_SLICE` function extracts a subset of a multivalued field based on specified start and end index values. It is particularly useful when working with functions that produce multivalued columns in a known order, such as `SPLIT` or `MV_SORT`.
## Syntax
@ -8,26 +8,44 @@ The MV_SLICE function is used to extract a subset of a multivalued field using s
### Parameters
#### field
#### `field`
This is a multivalue expression. If `null`, the function will return `null`.
- A multivalue expression. If `null`, the function returns `null`.
#### start
#### `start`
This is the start position. If `null`, the function will return `null`. The start argument can be negative, where an index of -1 is used to specify the last value in the list.
- The starting position of the slice. If `null`, the function returns `null`.
- Can be negative, where `-1` refers to the last value in the list.
#### end
#### `end` (Optional)
This is the end position (included). This parameter is optional; if omitted, the position at `start` is returned. The end argument can be negative, where an index of -1 is used to specify the last value in the list.
- The ending position of the slice (inclusive). If omitted, only the value at the `start` position is returned.
- Can be negative, where `-1` refers to the last value in the list.
## Examples
Extracting specific slices from a multivalued field
```esql
ROW a = [1, 2, 2, 3]
| EVAL a1 = MV_SLICE(a, 1), a2 = MV_SLICE(a, 2, 3)
```
This example extracts:
- `a1` as the value starting at index `1` (second value in the list).
- `a2` as the values from index `2` to `3` (third and fourth values in the list).
Using negative indices to slice from the end of the list
```esql
ROW a = [1, 2, 2, 3]
| EVAL a1 = MV_SLICE(a, -2), a2 = MV_SLICE(a, -3, -1)
```
This example extracts:
- `a1` as the value starting at the second-to-last index (`-2`).
- `a2` as the values from the third-to-last index (`-3`) to the last index (`-1`).
## Notes
- The order in which multivalued fields are read from underlying storage is not guaranteed. While it is often ascending, this behavior should not be relied upon.
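Since storage order is not guaranteed, one way to get a well-defined slice is to sort first. A minimal sketch that keeps the two largest values, assuming zero-based indices and an inclusive `end` as described above:

```esql
ROW a = [4, 2, -3, 2]
// Sort descending, then take positions 0 through 1 (inclusive)
| EVAL top_two = MV_SLICE(MV_SORT(a, "DESC"), 0, 1)
```

Here `top_two` is `[4, 2]`.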


@ -1,6 +1,6 @@
# MV_SORT
The MV_SORT function sorts a multivalued field in lexicographical order.
Sorts a multivalued field in lexicographical order.
## Syntax
@ -8,34 +8,25 @@ The MV_SORT function sorts a multivalued field in lexicographical order.
### Parameters
#### field
#### `field`
This is a multivalue expression. If the value is `null`, the function will return `null`.
- Multivalue expression. If `null`, the function returns `null`.
#### order
#### `order`
This parameter determines the sort order. The valid options are `ASC` and `DESC`. If not specified, the default is `ASC`.
- Sort order. The valid options are `ASC` and `DESC`. The default is `ASC`.
## Examples
Without order parameter
```esql
ROW names = ["Alice", "Bob", "Charlie"]
| EVAL sorted_names = MV_SORT(names)
```
With order parameter
```esql
ROW a = [4, 2, -3, 2]
| EVAL sd = MV_SORT(a, "DESC")
| EVAL sa = mv_sort(a), sd = mv_sort(a, "DESC")
```
**Sorting a multivalued field**
This example sorts the multivalued field `a` in ascending order (`sa`) and descending order (`sd`).
```esql
FROM bag_of_numbers
| EVAL sorted = MV_SORT(numbers)
```


@ -1,6 +1,6 @@
# MV_SUM
The MV_SUM function converts a multivalued field into a single valued field containing the sum of all the values.
Converts a multivalued field into a single-valued field containing the sum of all its values.
## Syntax
@ -8,9 +8,9 @@ The MV_SUM function converts a multivalued field into a single valued field cont
### Parameters
#### number
#### `number`
This is a multivalue expression.
A multivalued expression.
## Examples
@ -19,7 +19,4 @@ ROW a=[3, 5, 6]
| EVAL sum_a = MV_SUM(a)
```
```esql
ROW numbers=[1, 2, 3, 4, 5]
| EVAL total_sum = MV_SUM(numbers)
```
This example calculates the sum of the values in the multivalued column `a`.


@ -1,6 +1,6 @@
# MV_ZIP
The MV_ZIP function combines the values from two multivalued fields with a specified delimiter.
Combines the values from two multivalued fields with a delimiter that joins them together.
## Syntax
@ -8,30 +8,32 @@ The MV_ZIP function combines the values from two multivalued fields with a speci
### Parameters
#### string1
#### `string1`
A multivalue expression.
Multivalue expression.
#### string2
#### `string2`
A multivalue expression.
Multivalue expression.
#### delim
#### `delim`
An optional parameter that specifies the delimiter used to join the values. If omitted, a comma (`,`) is used as the default delimiter.
Optional. The delimiter used to join the values. If omitted, `,` is used as the default delimiter.
## Examples
The following example demonstrates how to use the MV_ZIP function:
Combining two multivalued fields with a custom delimiter
```esql
ROW a = ["x", "y", "z"], b = ["1", "2"]
| EVAL c = MV_ZIP(a, b, "-")
| EVAL c = mv_zip(a, b, "-")
| KEEP a, b, c
```
```esql
ROW names = ["Alice", "Bob", "Charlie"], ids = ["001", "002", "003"]
| EVAL combined = MV_ZIP(names, ids, ":")
| KEEP names, ids, combined
```
This example combines the values from two multivalued fields `a` and `b` using the `-` delimiter.
#### Result
| a | b | c |
|------------------|-------------|----------------|
| ["x", "y", "z"] | ["1", "2"] | ["x-1", "y-2", "z"] |
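When `delim` is omitted, the default `,` is used. A sketch of the same input without the third argument:

```esql
ROW a = ["x", "y", "z"], b = ["1", "2"]
// No delimiter argument, so values are joined with ","
| EVAL c = MV_ZIP(a, b)
| KEEP a, b, c
```

Following the parameter description, `c` would be `["x,1", "y,2", "z"]`.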


@ -1,6 +1,6 @@
# NOW
## NOW
The NOW function returns the current date and time.
Returns the current date and time.
## Syntax
@ -8,14 +8,18 @@ The NOW function returns the current date and time.
### Parameters
This function does not require any parameters.
This function does not take any parameters.
## Examples
Retrieve the current date and time
```esql
ROW current_date = NOW()
```
Retrieve logs from the last hour
```esql
FROM sample_data
| WHERE @timestamp > NOW() - 1 hour
```


@ -1,240 +1,204 @@
```markdown
# ES|QL Operators
This document provides an overview of the operators supported by ES|QL.
This document provides an overview of the operators available in ES|QL, categorized into binary, unary, logical, and other operators. Each operator is accompanied by an example query to demonstrate its usage.
---
## Binary Operators
### Equality `==`
Binary operators are used to compare or perform arithmetic operations between two values.
The equality operator checks if the values of two operands are equal or not.
Example:
### Equality (`==`)
Checks if two values are equal.
```esql
FROM employees
| WHERE emp_no == 10001
| WHERE first_name == "John"
```
### Inequality `!=`
The inequality operator checks if the values of two operands are equal or not.
Example:
### Inequality (`!=`)
Checks if two values are not equal.
```esql
FROM employees
| WHERE emp_no != 10001
| WHERE department != "HR"
```
### Less Than `<`
The less than operator checks if the value of the left operand is less than the value of the right operand.
Example:
### Less Than (`<`)
Checks if a value is less than another.
```esql
FROM employees
| WHERE salary < 50000
```
### Less Than or Equal To `<=`
This operator checks if the value of the left operand is less than or equal to the value of the right operand.
Example:
### Less Than or Equal To (`<=`)
Checks if a value is less than or equal to another.
```esql
FROM employees
| WHERE salary <= 50000
| WHERE hire_date <= "2020-01-01"
```
### Greater Than `>`
The greater than operator checks if the value of the left operand is greater than the value of the right operand.
Example:
### Greater Than (`>`)
Checks if a value is greater than another.
```esql
FROM employees
| WHERE salary > 50000
| WHERE age > 30
```
### Greater Than or Equal To `>=`
This operator checks if the value of the left operand is greater than or equal to the value of the right operand.
Example:
### Greater Than or Equal To (`>=`)
Checks if a value is greater than or equal to another.
```esql
FROM employees
| WHERE salary >= 50000
| WHERE experience_years >= 5
```
### Add `+`
The add operator adds the values of the operands.
Example:
### Add (`+`)
Adds two values.
```esql
FROM employees
| EVAL total_compensation = salary + bonus
```
### Subtract `-`
The subtract operator subtracts the right-hand operand from the left-hand operand.
Example:
### Subtract (`-`)
Subtracts one value from another.
```esql
FROM employees
| EVAL remaining_salary = salary - tax
| EVAL remaining_vacation_days = total_vacation_days - used_vacation_days
```
### Multiply `*`
The multiply operator multiplies the values of the operands.
Example:
### Multiply (`*`)
Multiplies two values.
```esql
FROM employees
| EVAL yearly_salary = salary * 12
| EVAL annual_salary = monthly_salary * 12
```
### Divide `/`
The divide operator divides the left-hand operand by the right-hand operand.
Example:
### Divide (`/`)
Divides one value by another.
```esql
FROM employees
| EVAL monthly_salary = salary / 12
| EVAL average_salary = total_salary / employee_count
```
### Modulus `%`
The modulus operator returns the remainder of the division of the left operand by the right operand.
Example:
### Modulus (`%`)
Returns the remainder of a division.
```esql
FROM employees
| EVAL remainder = salary % 12
| EVAL remainder = employee_id % 2
```
---
## Unary Operators
### Negation (`-`)
Unary operators operate on a single operand.
Example:
### Negation (`-`)
Negates a numeric value.
```esql
FROM employees
| EVAL negative_salary = -salary
ROW value = 10
| EVAL negative_value = -value
```
---
## Logical Operators
Logical operators are used to combine or negate conditions.
### AND
Logical AND operator.
Example:
Returns `true` if both conditions are true.
```esql
FROM employees
| WHERE salary > 50000 AND bonus > 10000
| WHERE age > 30 AND department == "Engineering"
```
### OR
Logical OR operator.
Example:
Returns `true` if at least one condition is true.
```esql
FROM employees
| WHERE salary > 50000 OR bonus > 10000
| WHERE department == "HR" OR department == "Finance"
```
### NOT
Logical NOT operator.
Example:
Negates a condition.
```esql
FROM employees
| WHERE NOT (salary > 50000)
| WHERE NOT still_hired
```
---
## Other Operators
### IS NULL and IS NOT NULL
Checks if a value is `NULL` or not.
The `IS NULL` operator returns true if the value is null.
Example:
#### IS NULL
```esql
FROM employees
| WHERE manager IS NULL
| WHERE birth_date IS NULL
| KEEP first_name, last_name
```
The `IS NOT NULL` operator returns true if the value is not null.
#### IS NOT NULL
```esql
FROM employees
| WHERE is_rehired IS NOT NULL
| STATS COUNT(emp_no)
```
Example:
### Cast (`::`)
Casts a value to a specific type.
```esql
FROM employees
| WHERE manager IS NOT NULL
| EVAL salary_as_string = salary::KEYWORD
```
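Casts can also be chained inside a larger expression. The following sketch converts a string to an integer, back to a string, and finally to a `version` value; the column name `ver` is illustrative:

```esql
ROW ver = CONCAT(("0"::INT + 1)::STRING, ".2.3")::VERSION
```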
### IN
The `IN` operator checks if a value is within a set of values (literals, fields or expressions).
Example:
```esql
FROM employees
| WHERE department IN ("Sales", "Marketing", "HR")
```
Checks if a value is in a list of values.
```esql
ROW a = 1, b = 4, c = 3
| WHERE c-a IN (3, b / 2, a)
| WHERE c - a IN (3, b / 2, a)
```
### LIKE
Filters data based on string patterns using wildcards.
Use `LIKE` to filter data based on string patterns using wildcards.
The following wildcard characters are supported:
- `*` matches zero or more characters.
- `?` matches one character.
Example:
Basic usage
```esql
FROM employees
| WHERE first_name LIKE "?b*"
| KEEP first_name, last_name
| WHERE first_name LIKE "J*"
```
Escaping special characters
```esql
ROW message = "foo * bar"
| WHERE message LIKE "foo \\* bar"
```
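The doubled backslash is needed because the backslash is itself an escape character inside a regular string literal. As a sketch of an alternative, a triple-quoted string lets you write the pattern with a single backslash:

```esql
ROW message = "foo * bar"
// Triple quotes avoid having to escape the backslash itself
| WHERE message LIKE """foo \* bar"""
```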
### RLIKE
Use `RLIKE` to filter data based on string patterns using regular expressions.
Example:
Filters data based on string patterns using regular expressions.
```esql
FROM employees
| WHERE first_name RLIKE ".leja.*"
| KEEP first_name, last_name
| WHERE first_name RLIKE "J.*"
```
### Cast `::`


@ -1,103 +1,105 @@
## ES|QL Overview
```markdown
# Elasticsearch Query Language (ES|QL)
### ES|QL
The Elasticsearch Query Language (ES|QL) is a powerful and intuitive language designed to filter, transform, and analyze data stored in Elasticsearch. It is built to be user-friendly and accessible to a wide range of users, including end users, SRE teams, application developers, and administrators. ES|QL enables users to perform complex data operations such as filtering, aggregation, and time-series analysis, as well as generate visualizations and statistical insights.
The Elasticsearch Query Language (ES|QL) provides a powerful way to filter, transform, and analyze data stored in Elasticsearch. It is designed to be easy to learn and use by all types of end users.
## Key Features of ES|QL
Users can author ES|QL queries to find specific events, perform statistical analysis, and generate visualizations. It supports a wide range of commands and functions that enable users to perform various data operations, such as filtering, aggregation, time-series analysis, and more.
- **Pipe-based Syntax**: ES|QL uses a pipe (`|`) syntax to chain operations, where the output of one operation becomes the input for the next. This step-by-step approach simplifies complex data transformations and analysis.
- **Rich Command Set**: ES|QL supports a wide range of commands and functions for data manipulation, including filtering, aggregation, enrichment, and statistical analysis.
- **Ease of Use**: Designed to be easy to learn and use, ES|QL is suitable for both technical and non-technical users.
- **Integration with Elasticsearch**: ES|QL queries are executed directly within Elasticsearch, leveraging its compute engine for high performance and scalability.
ES|QL makes use of "pipes" (`|`) to manipulate and transform data in a step-by-step fashion. This approach allows users to compose a series of operations, where the output of one operation becomes the input for the next, enabling complex data transformations and analysis.
---
### Known Limitations
## Known Limitations of ES|QL
#### Result Set Size Limit
While ES|QL is a powerful tool, it has some limitations to be aware of:
By default, an ES|QL query returns up to 1,000 rows. You can increase the number of rows up to 10,000 using the `LIMIT` command. Queries do not return more than 10,000 rows, regardless of the `LIMIT` command's value. This limit only applies to the number of rows that are retrieved by the query. Queries and aggregations run on the full data set.
### Result Set Size
- By default, ES|QL queries return up to 1,000 rows. This can be increased to a maximum of 10,000 rows using the `LIMIT` command. This upper limit is configurable but comes with trade-offs such as increased memory usage and processing time.
To overcome this limitation:
- Reduce the result set size by modifying the query to only return relevant data. Use `WHERE` to select a smaller subset of the data.
- Shift any post-query processing to the query itself. You can use the ES|QL `STATS ... BY` command to aggregate data in the query.
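Both suggestions can be combined in one query. This is a sketch only: the `employees` index and the `salary` and `department` fields are assumed for illustration:

```esql
FROM employees
// Narrow the input first, then aggregate inside the query instead of afterwards
| WHERE salary > 50000
| STATS avg_salary = AVG(salary) BY department
| LIMIT 100
```

The aggregation runs over every matching document, and only one row per department is returned.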
### Field Types
- ES|QL supports a wide range of field types, including `boolean`, `date`, `keyword`, `text`, `long`, and `double`. However, some field types, such as `binary`, `nested`, and `histogram`, are not yet supported.
- When querying multiple indices, fields with conflicting types must be explicitly converted to a single type using type conversion functions.
The default and maximum limits can be changed using these dynamic cluster settings:
- `esql.query.result_truncation_default_size`
- `esql.query.result_truncation_max_size`
### Full-Text Search
- Full-text search is in technical preview and has limitations. For example, full-text search functions like `MATCH` must be used directly after the `FROM` command or close to it. Additionally, disjunctions (`OR`) in `WHERE` clauses are restricted unless all clauses use full-text functions.
#### Field Types
### Time Series Data Streams
- ES|QL does not currently support querying time series data streams (TSDS).
ES|QL currently supports the following field types:
- `alias`
- `boolean`
- `date`
- `double` (`float`, `half_float`, `scaled_float` are represented as `double`)
- `ip`
- `keyword` family including `keyword`, `constant_keyword`, and `wildcard`
- `int` (`short` and `byte` are represented as `int`)
- `long`
- `null`
- `text`
- `unsigned_long` (preview)
- `version`
### Date Math
- Date math expressions are limited. For example, subtracting two datetime values or using parentheses in date math expressions is not supported.
Spatial types:
- `geo_point`
- `geo_shape`
- `point`
- `shape`
### Multivalued Fields
- Functions generally return `null` when applied to multivalued fields unless explicitly documented otherwise. Use multivalue functions to handle such fields.
Unsupported types:
- TSDB metrics: `counter`, `position`, `aggregate_metric_double`
- Date/time: `date_nanos`, `date_range`
- Other types: `binary`, `completion`, `dense_vector`, `double_range`, `flattened`, `float_range`, `histogram`, `integer_range`, `ip_range`, `long_range`, `nested`, `rank_feature`, `rank_features`, `search_as_you_type`
### Timezone Support
- ES|QL only supports the UTC timezone.
Querying a column with an unsupported type returns an error. If a column with an unsupported type is not explicitly used in a query, it is returned with `null` values, with the exception of nested fields. Nested fields are not returned at all.
### Kibana Integration
- The Discover interface in Kibana has a 10,000-row limit for displayed results and a 50-column limit for displayed fields. These limits apply only to the UI and not to the underlying query execution.
#### _source Availability
---
ES|QL does not support configurations where the `_source` field is disabled. ES|QL's support for synthetic `_source` is currently experimental.
## Using ES|QL in Kibana
#### Full-Text Search
ES|QL is integrated into Kibana, allowing users to query and visualize data directly from the Discover interface. Key points for using ES|QL in Kibana include:
Because of the way ES|QL treats `text` values, queries on `text` fields are like queries on `keyword` fields: they are case-sensitive and need to match the full string.
To perform full-text search on `text` fields, search functions such as `MATCH` should be used.
#### Time Series Data Streams
ES|QL does not support querying time series data streams (TSDS).
#### Date Math Limitations
Date math expressions work well when the leftmost expression is a datetime. However, using parentheses or putting the datetime to the right is not always supported yet. Date math does not allow subtracting two datetimes.
#### Timezone Support
ES|QL only supports the UTC timezone.
### Cross-Cluster Querying
Using ES|QL across clusters allows you to execute a single query across multiple clusters. This feature is in technical preview and may be changed or removed in a future release.
#### Prerequisites
- Remote clusters must be configured.
- The local coordinating node must have the `remote_cluster_client` node role.
- Security privileges must be configured appropriately.
#### Querying Across Clusters
In the `FROM` command, specify data streams and indices on remote clusters using the format `<remote_cluster_name>:<target>`. For example:
- **Enablement**: ES|QL is enabled by default in Kibana but can be disabled via the `enableESQL` setting in Advanced Settings.
- **Query Bar**: The query bar in Discover supports ES|QL syntax, with features like auto-complete and query history for ease of use.
- **Visualization**: ES|QL queries can be used to create visualizations, which can be saved to dashboards or used for alerting.
- **Time Filtering**: Use the standard time filter or custom time parameters (`?_tstart` and `?_tend`) to filter data by time range.
### Example Query in Kibana
```esql
FROM cluster_one:my-index-000001
| LIMIT 10
FROM kibana_sample_data_logs
| WHERE @timestamp > NOW() - 1 day
| STATS total_bytes = SUM(bytes) BY geo.dest
| SORT total_bytes DESC
| LIMIT 5
```
### Using ES|QL in Kibana
This query retrieves the top 5 destinations by total bytes in the last 24 hours.
ES|QL can be used in Kibana to query and aggregate data, create visualizations, and set up alerts.
---
#### Important Information
## Cross-Cluster Querying with ES|QL
- ES|QL is enabled by default in Kibana.
- The query bar in Discover allows you to write and execute ES|QL queries.
- The results table shows up to 10,000 rows, and Discover shows no more than 50 columns.
- You can create visualizations and alerts based on ES|QL queries.
ES|QL supports querying across multiple clusters, enabling users to analyze data stored in different Elasticsearch clusters. To query remote clusters, use the format `<remote_cluster_name>:<index_pattern>` in the `FROM` command.
### Example Cross-Cluster Query
```esql
FROM cluster_one:employees,cluster_two:other-employees-*
| STATS avg_salary = AVG(salary) BY department
| SORT avg_salary DESC
```
This query retrieves the average salary by department across two clusters and sorts the results in descending order.
---
## Using the ES|QL REST API
The ES|QL REST API allows users to execute ES|QL queries programmatically. Queries are sent as HTTP POST requests to the `_query` endpoint.
### Example REST API Request
```json
POST /_query
{
"query": "FROM employees | WHERE salary > 50000 | SORT salary DESC | LIMIT 10"
}
```
### Key Points
- The `query` field contains the ES|QL query as a string.
- Use the `params` field to pass query parameters dynamically.
- The API returns results in JSON format, making it easy to integrate with other applications.
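As a minimal sketch of the `query` and `params` fields described above, the following Python snippet builds a request body for the `_query` endpoint without sending it. The helper name `build_esql_request` is hypothetical, and the `?` positional placeholder is an assumption based on the Elasticsearch REST documentation rather than on this section.

```python
import json


def build_esql_request(query, params=None):
    """Build the JSON body for a POST to the `_query` endpoint (hypothetical helper)."""
    body = {"query": query}
    if params:
        # Assumption: values are bound, in order, to `?` placeholders in the query.
        body["params"] = list(params)
    return json.dumps(body)


payload = build_esql_request(
    "FROM employees | WHERE salary > ? | SORT salary DESC | LIMIT 10",
    params=[50000],
)
print(payload)
```

Passing values through `params` instead of concatenating them into the query string keeps user input out of the query text.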
---
## Summary
ES|QL is a versatile and user-friendly query language for Elasticsearch, offering powerful capabilities for data analysis and transformation. While it has some limitations, its integration with Kibana and support for cross-cluster querying make it a valuable tool for a wide range of use cases. Whether you're analyzing logs, building dashboards, or creating alerts, ES|QL provides the flexibility and performance needed to work with Elasticsearch data effectively.
```

View file

@ -1,6 +1,6 @@
# PERCENTILE
The PERCENTILE function calculates the value at a specified percentile of observed values.
The `PERCENTILE` function calculates the value at which a specified percentage of observed values occur. For example, the 95th percentile is the value greater than 95% of the observed values, while the 50th percentile corresponds to the `MEDIAN`.
## Syntax
@ -8,28 +8,35 @@ The PERCENTILE function calculates the value at a specified percentile of observ
### Parameters
#### number
#### `number`
The numeric expression that represents the set of values to be analyzed.
The numeric field or expression for which the percentile is calculated.
#### percentile
#### `percentile`
The percentile to compute. The value should be between 0 and 100.
The percentile value to calculate (e.g., 0 for the minimum, 50 for the median, 100 for the maximum).
## Examples
Basic Percentile Calculation
```esql
FROM employees
| STATS p0 = PERCENTILE(salary, 0), p50 = PERCENTILE(salary, 50), p99 = PERCENTILE(salary, 99)
```
This example calculates the 0th percentile (minimum), 50th percentile (median), and 99th percentile of the `salary` field.
Using Inline Functions
```esql
FROM employees
| STATS p80_max_salary_change = PERCENTILE(MV_MAX(salary_change), 80)
```
This example calculates the 80th percentile of the maximum values in a multivalued column `salary_change`. The `MV_MAX` function is used to determine the maximum value per row before applying the `PERCENTILE` function.
## Notes
- PERCENTILE is usually approximate.
- PERCENTILE is also non-deterministic. This means you can get slightly different results using the same data.
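
To make the meaning of a percentile concrete, here is a small Python sketch that computes an exact percentile with linear interpolation between neighboring ranks. It is illustrative only: the function name `exact_percentile` and the interpolation rule are assumptions, and because `PERCENTILE` is usually approximate, Elasticsearch may return slightly different values for the same data.

```python
def exact_percentile(values, percentile):
    """Exact percentile using linear interpolation (an assumed rule, not ES|QL's algorithm)."""
    if not 0 <= percentile <= 100:
        raise ValueError("percentile must be between 0 and 100")
    ordered = sorted(values)
    if not ordered:
        return None
    # Fractional position of the requested percentile in the sorted list.
    rank = (percentile / 100) * (len(ordered) - 1)
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


salaries = [25000, 40000, 55000, 70000, 90000]
print(exact_percentile(salaries, 0))    # the minimum
print(exact_percentile(salaries, 50))   # the median
print(exact_percentile(salaries, 100))  # the maximum
```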

View file

@ -1,6 +1,6 @@
# PI
The PI function returns the mathematical constant Pi, which is the ratio of a circle's circumference to its diameter.
Returns Pi, the mathematical constant representing the ratio of a circle's circumference to its diameter.
## Syntax
@ -8,16 +8,12 @@ The PI function returns the mathematical constant Pi, which is the ratio of a ci
### Parameters
This function does not require any parameters.
This function does not take any parameters.
## Examples
Returning the value of Pi
```esql
ROW PI()
```
```esql
FROM employees
| EVAL pi_value = PI()
| KEEP pi_value
```

View file

@ -1,6 +1,6 @@
# POW
The POW function calculates the value of a base number raised to the power of an exponent number.
The `POW` function calculates the value of a base raised to the power of an exponent.
## Syntax
@ -8,22 +8,36 @@ The POW function calculates the value of a base number raised to the power of an
### Parameters
#### base
#### `base`
This is a numeric expression for the base.
Numeric expression for the base. If `null`, the function returns `null`.
#### exponent
#### `exponent`
This is a numeric expression for the exponent.
Numeric expression for the exponent. If `null`, the function returns `null`.
## Examples
Basic usage
```esql
ROW base = 2.0, exponent = 2
| EVAL result = POW(base, exponent)
```
Calculate `2.0` raised to the power of `2`.
Fractional exponent (root calculation)
The exponent can be a fraction, which is similar to performing a root. For example, an exponent of `0.5` calculates the square root of the base:
```esql
ROW base = 4, exponent = 0.5
| EVAL s = POW(base, exponent)
```
Calculate the square root of `4` using an exponent of `0.5`.
## Limitations
- It is possible to overflow a double result when using this function. In such cases, the function will return `null`.
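
The null handling and overflow behavior described above can be modeled with a short Python sketch. The helper `pow_or_none` is hypothetical and only mimics the documented rules (a `null` input or an overflowing double yields `null`); it is not how Elasticsearch implements `POW`.

```python
import math


def pow_or_none(base, exponent):
    """Model POW: None stands in for ES|QL `null` (hypothetical helper)."""
    if base is None or exponent is None:
        return None
    try:
        return math.pow(base, exponent)
    except OverflowError:
        # The result does not fit in a double, so mimic the documented `null`.
        return None


print(pow_or_none(2.0, 2))     # 4.0
print(pow_or_none(4, 0.5))     # 2.0, the square root of 4
print(pow_or_none(10.0, 400))  # None, because the result overflows a double
```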

View file

@ -1,8 +1,6 @@
# QSTR
## QSTR
The QSTR function performs a query string query, returning true if the provided query string matches a row.
Please note this functionality is currently in its technical preview stage, which means it might undergo changes or removal in future releases. Elastic commits to address any issues during this period. However, since it's a technical preview, it doesn't come under the support SLA of official GA features.
Performs a query string query and returns `true` if the provided query string matches the row.
## Syntax
@ -10,22 +8,18 @@ Please note this functionality is currently in its technical preview stage, whic
### Parameters
#### query
#### `query`
The query parameter must be a string written in the Lucene query format.
Query string in Lucene query string format.
## Examples
Conduct a query string query on a book's author:
```esql
FROM books
| WHERE QSTR("author: Faulkner")
| KEEP book_no, author
| SORT book_no
| LIMIT 5;
| LIMIT 5
```
## Notes
- Do not use `QSTR` in production - it is in technical preview and may be changed or removed in a future release
This example filters rows where the `author` field matches "Faulkner", keeps the `book_no` and `author` columns, sorts by `book_no`, and limits the output to 5 rows.

View file

@ -1,6 +1,6 @@
# RENAME
## RENAME
The RENAME command is used to change the names of one or more columns in a table.
The `RENAME` command is used to rename one or more columns in a table. If a column with the new name already exists, it will be replaced by the renamed column.
## Syntax
@ -8,28 +8,32 @@ The RENAME command is used to change the names of one or more columns in a table
### Parameters
#### old_nameX
#### `old_nameX`
This is the current name of the column that you want to rename.
The name of the column you want to rename.
#### new_nameX
#### `new_nameX`
This is the new name that you want to assign to the column. If a column with the new name already exists, the existing column will be replaced. If multiple columns are renamed to the same name, all but the rightmost column with the same new name will be dropped.
The new name for the column. If it conflicts with an existing column name, the existing column is dropped. If multiple columns are renamed to the same name, all but the rightmost column with the same new name are dropped.
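The conflict rules above can be illustrated with a small Python model in which a row is a dictionary of column names to values. The function `rename_columns` is hypothetical, and applying the renames one at a time from left to right is an assumption made for this sketch.

```python
def rename_columns(row, renames):
    """Apply (old_name, new_name) pairs left to right (assumed order)."""
    result = dict(row)
    for old_name, new_name in renames:
        value = result.pop(old_name)
        # An existing column with the new name is dropped and replaced.
        result.pop(new_name, None)
        result[new_name] = value
    return result


row = {"a": 1, "b": 2, "c": 3}
print(rename_columns(row, [("a", "c")]))              # the original `c` is dropped
print(rename_columns(row, [("a", "x"), ("b", "x")]))  # only the rightmost `x` survives
```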
## Examples
The following example renames the column "still_hired" to "employed":
### Rename a single column
Rename the `still_hired` column to `employed`:
```esql
FROM employees
| KEEP first_name, last_name, still_hired
| RENAME still_hired AS employed
| RENAME still_hired AS employed
```
You can rename multiple columns with a single RENAME command:
### Rename multiple columns
Rename `first_name` to `fn` and `last_name` to `ln` in a single command:
```esql
FROM employees
| KEEP first_name, last_name
| RENAME first_name AS fn, last_name AS ln
```
```

View file

@ -1,6 +1,6 @@
# REPEAT
The REPEAT function generates a string by repeating a specified string a certain number of times.
The `REPEAT` function constructs a string by concatenating a given string with itself a specified number of times.
## Syntax
@ -8,13 +8,13 @@ The REPEAT function generates a string by repeating a specified string a certain
### Parameters
#### string
#### `string`
The string that you want to repeat.
The string to be repeated.
#### number
#### `number`
The number of times you want to repeat the string.
The number of times the string should be repeated.
## Examples
@ -23,7 +23,4 @@ ROW a = "Hello!"
| EVAL triple_a = REPEAT(a, 3)
```
```esql
ROW greeting = "Hi"
| EVAL repeated_greeting = REPEAT(greeting, 5)
```
This example creates a new column `triple_a` by repeating the string `"Hello!"` three times.

View file

@ -1,6 +1,6 @@
# REPLACE
The REPLACE function substitutes any match of a regular expression within a string with a replacement string.
The `REPLACE` function substitutes any match of a regular expression in a string with a specified replacement string.
## Syntax
@ -8,17 +8,17 @@ The REPLACE function substitutes any match of a regular expression within a stri
### Parameters
#### string
#### `string`
The string expression where the replacement will occur.
String expression.
#### regex
#### `regex`
The regular expression that will be matched in the string.
Regular expression.
#### newString
#### `newString`
The string that will replace the matched regular expression in the string.
Replacement string.
## Examples

View file

@ -1,6 +1,6 @@
# REVERSE
The REVERSE function returns a reversed form of the input string.
The `REVERSE` function returns a new string with the characters of the input string in reverse order.
## Syntax
@ -8,22 +8,36 @@ The REVERSE function returns a reversed form of the input string.
### Parameters
#### str
#### `str`
The string you want to reverse. If the string is `null`, the function will also return `null`.
String expression. If `null`, the function returns `null`.
## Examples
Here's an example of how to reverse a string:
Reversing a simple string
```esql
ROW message = "Some Text"
| EVAL message_reversed = REVERSE(message);
```
REVERSE also works with unicode characters, keeping unicode grapheme clusters intact during reversal:
| message | message_reversed |
|-----------|------------------|
| Some Text | txeT emoS |
Reversing a string with emojis
```esql
ROW bending_arts = "💧🪨🔥💨"
| EVAL bending_arts_reversed = REVERSE(bending_arts);
```
```
| bending_arts | bending_arts_reversed |
|--------------|-----------------------|
| 💧🪨🔥💨 | 💨🔥🪨💧 |
`REVERSE` works with Unicode and preserves grapheme clusters during reversal.
## Limitations
If Elasticsearch is running with a JDK version less than 20, the function may not properly reverse grapheme clusters. For example, "👍🏽😊" might be reversed to "🏽👍😊" instead of the correct "😊👍🏽". Elastic Cloud and the JDK bundled with Elasticsearch use newer JDKs, so this issue typically arises only if an older JDK is explicitly used.

View file

@ -1,6 +1,6 @@
# RIGHT
The RIGHT function extracts a specified number of characters from the end of a string.
Returns a substring by extracting a specified number of characters from the right side of a string.
## Syntax
@ -8,18 +8,16 @@ The RIGHT function extracts a specified number of characters from the end of a s
### Parameters
#### string
#### `string`
The string from which a substring is to be returned.
The string from which to return a substring.
#### length
#### `length`
The number of characters to return from the end of the string.
The number of characters to return.
## Examples
The following example extracts the last three characters from the `last_name` field:
```esql
FROM employees
| KEEP last_name
@ -28,8 +26,4 @@ FROM employees
| LIMIT 5
```
```esql
ROW full_name = "John Doe"
| EVAL last_part = RIGHT(full_name, 4)
| KEEP last_part
```
Extracts the last three characters from the `last_name` column, sorts the results alphabetically, and limits the output to the first five rows.

View file

@ -1,6 +1,6 @@
# ROUND
The ROUND function rounds a numeric value to a specified number of decimal places.
Rounds a number to the specified number of decimal places. By default, it rounds to 0 decimal places, returning the nearest integer. If the precision is a negative number, it rounds to the specified number of digits left of the decimal point.
## Syntax
@ -8,24 +8,25 @@ The ROUND function rounds a numeric value to a specified number of decimal place
### Parameters
#### number
#### `number`
The numeric value to be rounded.
The numeric value to round. If `null`, the function returns `null`.
#### decimals
#### `decimals`
The number of decimal places to which the number should be rounded. The default value is 0.
The number of decimal places to round to. Defaults to 0. If `null`, the function returns `null`.
## Examples
The following example rounds the height of employees to one decimal place after converting it from meters to feet:
Rounding a height value to one decimal place
```esql
FROM employees
| KEEP first_name, last_name, height
| EVAL height_ft = ROUND(height * 3.281, 1)
```
This example converts the `height` column from meters to feet and rounds the result to one decimal place.
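The effect of positive, zero, and negative precision can be sketched in Python with the `decimal` module. The helper `round_to` is hypothetical; it always returns a float, and its half-up tie-breaking rule is an assumption, since this section does not state how `ROUND` resolves ties.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_to(number, decimals=0):
    """Model ROUND: None stands in for ES|QL `null` (hypothetical helper)."""
    if number is None or decimals is None:
        return None
    # 10 ** -decimals as a Decimal: 0.1 for decimals=1, 1E+2 for decimals=-2.
    quantum = Decimal(1).scaleb(-decimals)
    # Assumption: ties round half up; the source does not specify the rule.
    return float(Decimal(str(number)).quantize(quantum, rounding=ROUND_HALF_UP))


print(round_to(1.8 * 3.281, 1))  # 5.9
print(round_to(1.23))            # 1.0
print(round_to(1234.0, -2))      # 1200.0
```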
```esql
FROM sales
| KEEP product_name, revenue

View file

@ -1,6 +1,6 @@
# ROW
The ROW command is used to generate a row with one or more columns with specified values. This can be particularly useful for testing purposes.
The `ROW` command generates a single row with one or more columns, each assigned a specified value. This is particularly useful for testing purposes.
## Syntax
@ -8,37 +8,37 @@ The ROW command is used to generate a row with one or more columns with specifie
### Parameters
#### {column name}
#### `columnX`
This is the name of the column. If there are duplicate column names, only the rightmost duplicate will create a column.
The name of the column.
If duplicate column names are provided, only the rightmost duplicate creates a column.
#### {value}
#### `valueX`
This is the value for the column. It can be a literal, an expression, or a function.
The value assigned to the column. This can be a literal, an expression, or a function.
## Examples
1. Creating a row with simple literal values:
```esql
Basic usage
Create a row with three columns, each assigned a specific value:
```esql
ROW a = 1, b = "two", c = null
```
2. Creating a row with multi-value columns using square brackets:
```esql
Multi-value columns
Use square brackets to assign multiple values to a single column:
```esql
ROW a = [2, 1]
```
3. Creating a row with a function:
```esql
Using functions
Generate a row where a column's value is calculated using a function:
```esql
ROW a = ROUND(1.23, 0)
```
4. Combining literals, multi-value columns, and functions:
```esql
ROW x = 5, y = [3, 4], z = TO_STRING(123)
```
5. Using nested functions within a row:
```esql
ROW a = ABS(-10), b = CONCAT("Hello", " ", "World"), c = TO_BOOLEAN("true")
```

View file

@ -1,6 +1,6 @@
# RTRIM
The RTRIM function is used to remove trailing whitespaces from a string.
Removes trailing whitespace from a string.
## Syntax
@ -8,14 +8,12 @@ The RTRIM function is used to remove trailing whitespaces from a string.
### Parameters
#### string
#### `string`
This is the string expression from which trailing whitespaces will be removed.
String expression. If `null`, the function returns `null`.
## Examples
The following example demonstrates how to use the RTRIM function:
```esql
ROW message = " some text ", color = " red "
| EVAL message = RTRIM(message)
@ -23,3 +21,5 @@ ROW message = " some text ", color = " red "
| EVAL message = CONCAT("'", message, "'")
| EVAL color = CONCAT("'", color, "'")
```
This example removes trailing whitespace from the `message` and `color` columns, then wraps the resulting strings in single quotes.

View file

@ -1,6 +1,6 @@
# SHOW
## SHOW
The SHOW command retrieves details about the deployment and its capabilities.
The `SHOW` command provides information about the deployment and its capabilities.
## Syntax
@ -8,9 +8,9 @@ The SHOW command retrieves details about the deployment and its capabilities.
### Parameters
#### item
#### `item`
The only acceptable value is `INFO`.
This parameter can only be `INFO`.
## Examples
@ -18,4 +18,4 @@ Retrieve the deployments version, build date, and hash:
```esql
SHOW INFO
```
```

View file

@ -1,6 +1,6 @@
# SIGNUM
The SIGNUM function returns the sign of a given number. It outputs `-1` for negative numbers, `0` for `0`, and `1` for positive numbers.
Returns the sign of a given number. It outputs `-1` for negative numbers, `0` for `0`, and `1` for positive numbers.
## Syntax
@ -8,9 +8,9 @@ The SIGNUM function returns the sign of a given number. It outputs `-1` for nega
### Parameters
#### number
#### `number`
A numeric expression.
Numeric expression. If `null`, the function returns `null`.
## Examples
@ -19,7 +19,24 @@ ROW d = 100.0
| EVAL s = SIGNUM(d)
```
```esql
ROW d = -50.0
| EVAL s = SIGNUM(d)
```
This example calculates the sign of the number `100.0`.
## Notes
If `SORT` is used right after a `KEEP` command, make sure it only uses column names listed in `KEEP`, or move the `SORT` before the `KEEP`. For example:
- Incorrect: `KEEP date | SORT @timestamp`
- Correct: `SORT @timestamp | KEEP date`
By default, the sorting order is ascending. You can specify an explicit sort order by using `ASC` for ascending or `DESC` for descending.
If two rows have the same sort key, they are considered equal. You can provide additional sort expressions to act as tie breakers.
When sorting on multivalued columns, the lowest value is used when sorting in ascending order and the highest value is used when sorting in descending order.
By default, `null` values are treated as being larger than any other value. This means that with an ascending sort order, `null` values are sorted last, and with a descending sort order, `null` values are sorted first. You can change this by providing `NULLS FIRST` or `NULLS LAST`.
## Limitations
- **Multivalued Columns**: When sorting on multivalued columns, the lowest value is used for ascending order and the highest value for descending order.
- **Null Values**: By default, null values are treated as larger than any other value. This can be changed using `NULLS FIRST` or `NULLS LAST`.
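
The ordering rules for multivalued columns and `null` values can be sketched as a small Python model. The function `sort_rows` and its parameters are hypothetical: rows are dictionaries, a list models a multivalued column, and `None` models `null`. It only illustrates the documented rules and assumes multivalued entries are non-empty.

```python
def sort_rows(rows, column, descending=False, nulls_first=None):
    """Model SORT on one column (hypothetical helper)."""
    if nulls_first is None:
        # Nulls count as the largest value: last when ascending, first when descending.
        nulls_first = descending

    def sort_key(row):
        value = row.get(column)
        if isinstance(value, list):
            # Multivalued: lowest value for ASC, highest value for DESC.
            return max(value) if descending else min(value)
        return value

    present = [r for r in rows if sort_key(r) is not None]
    missing = [r for r in rows if sort_key(r) is None]
    present.sort(key=sort_key, reverse=descending)
    return missing + present if nulls_first else present + missing


rows = [
    {"name": "a", "height": [3, 1]},
    {"name": "b", "height": 2},
    {"name": "c", "height": None},
]
print([r["name"] for r in sort_rows(rows, "height")])                    # ['a', 'b', 'c']
print([r["name"] for r in sort_rows(rows, "height", descending=True)])   # ['c', 'a', 'b']
print([r["name"] for r in sort_rows(rows, "height", nulls_first=True)])  # ['c', 'a', 'b']
```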

View file

@ -1,6 +1,6 @@
# SIN
The SIN function calculates the sine of a given angle.
Returns the sine of an angle.
## Syntax
@ -8,9 +8,9 @@ The SIN function calculates the sine of a given angle.
### Parameters
#### angle
#### `angle`
The angle for which the sine value is to be calculated. The angle should be in radians.
An angle, in radians. If `null`, the function returns `null`.
## Examples
@ -19,7 +19,4 @@ ROW a=1.8
| EVAL sin = SIN(a)
```
```esql
ROW angle=0.5
| EVAL sine_value = SIN(angle)
```
Calculate the sine of the angle `1.8` radians and store the result in a new column named `sin`.

View file

@ -1,25 +1,22 @@
# SINH
The SINH function calculates the hyperbolic sine of a given angle.
Returns the hyperbolic sine of a number.
## Syntax
`SINH(angle)`
`SINH(number)`
### Parameters
#### angle
#### number
The angle in radians for which the hyperbolic sine is to be calculated. If the parameter is null, the function will return null.
A numeric expression. If `null`, the function returns `null`.
## Examples
```esql
ROW a=1.8
| EVAL sinh=SINH(a)
| EVAL sinh = SINH(a)
```
```esql
ROW angle=0.5
| EVAL hyperbolic_sine = SINH(angle)
```
Calculate the hyperbolic sine of the value `1.8`.

View file

@ -1,6 +1,6 @@
# SORT
## SORT
The SORT command is used to arrange a table based on one or more columns.
The `SORT` command organizes a table by one or more columns.
## Syntax
@ -8,13 +8,15 @@ The SORT command is used to arrange a table based on one or more columns.
### Parameters
#### columnX
#### `columnX`
The column on which the sorting is to be performed.
The column to sort on.
## Examples
Sort a table based on the 'height' column:
### Basic sorting
Sort the table by the `height` column in ascending order (default behavior):
```esql
FROM employees
@ -22,7 +24,9 @@ FROM employees
| SORT height
```
Explicitly sort in ascending order with `ASC`:
### Explicitly sorting in descending order with `DESC`
Sort the table by the `height` column in descending order:
```esql
FROM employees
@ -30,7 +34,9 @@ FROM employees
| SORT height DESC
```
Provide additional sort expressions to act as tie breakers:
### Providing additional sort expressions to act as tie breakers
Sort the table by `height` in descending order, and use `first_name` in ascending order as a tie breaker:
```esql
FROM employees
@ -38,7 +44,9 @@ FROM employees
| SORT height DESC, first_name ASC
```
Sort `null` values first using `NULLS FIRST`:
### Sorting `null` values first using `NULLS FIRST`
Sort the table by `first_name` in ascending order, placing `null` values first:
```esql
FROM employees

View file

@ -1,6 +1,6 @@
# SPACE
The SPACE function creates a string composed of a specific number of spaces.
Returns a string composed of a specified number of spaces.
## Syntax
@ -8,15 +8,14 @@ The SPACE function creates a string composed of a specific number of spaces.
### Parameters
#### number
#### `number`
The number of spaces the function should generate.
The number of spaces to include in the resulting string.
## Examples
This example demonstrates how to use the SPACE function to insert a space into a string:
```esql
ROW message = CONCAT("Hello", SPACE(1), "World!");
```
In this example, the SPACE function creates a single space, which is then used to separate the words "Hello" and "World!" in the resulting string. If desired, the `number` parameter could be adjusted in order to generate more spaces.
This example creates a string with the word "Hello", followed by a single space, and then the word "World!".

View file

@ -1,6 +1,6 @@
# SPLIT
The SPLIT function is used to divide a single string into multiple strings.
The `SPLIT` function splits a single-valued string into multiple strings based on a specified delimiter.
## Syntax
@ -8,13 +8,13 @@ The SPLIT function is used to divide a single string into multiple strings.
### Parameters
#### string
#### `string`
This is the string expression that you want to split.
String expression. If `null`, the function returns `null`.
#### delim
#### `delim`
This is the delimiter used to split the string. Currently, only single byte delimiters are supported.
Delimiter used to split the string. Only single-byte delimiters are currently supported.
## Examples
@ -23,7 +23,4 @@ ROW words="foo;bar;baz;qux;quux;corge"
| EVAL word = SPLIT(words, ";")
```
```esql
ROW sentence="hello world;this is ES|QL"
| EVAL words = SPLIT(sentence, " ")
```
This example splits the string `words` into multiple strings using the semicolon (`;`) as the delimiter.
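
As a rough Python analogue of the behavior described above, the sketch below splits a string on a delimiter and rejects delimiters longer than one byte. The helper `split_single_byte` is hypothetical; the returned list stands in for an ES|QL multivalued result, and raising an error for an unsupported delimiter is an assumption.

```python
def split_single_byte(string, delim):
    """Model SPLIT: None stands in for ES|QL `null` (hypothetical helper)."""
    if string is None:
        return None
    if len(delim.encode("utf-8")) != 1:
        # Assumption: unsupported delimiters are rejected with an error.
        raise ValueError("only single-byte delimiters are supported")
    return string.split(delim)


words = "foo;bar;baz;qux;quux;corge"
print(split_single_byte(words, ";"))
```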

View file

@ -1,6 +1,6 @@
# SQRT
The SQRT function calculates the square root of a given number.
Returns the square root of a number. The input can be any numeric value, and the return value is always a double. Square roots of negative numbers and infinities are `null`.
## Syntax
@ -8,20 +8,22 @@ The SQRT function calculates the square root of a given number.
### Parameters
#### number
#### `number`
This is a numeric expression.
Numeric expression. If `null`, the function returns `null`.
## Examples
```esql
ROW d = 100.0
| EVAL s = SQRT(d)
```
Calculate the square root of the value `100.0`.
```esql
FROM employees
| KEEP first_name, last_name, height
| EVAL sqrt_height = SQRT(height)
```
Keep only the `first_name`, `last_name`, and `height` columns, then create a new `sqrt_height` column containing the square root of each value in the `height` column.

View file

@ -1,6 +1,6 @@
# ST_CENTROID_AGG
The ST_CENTROID_AGG function calculates the spatial centroid over a field with spatial point geometry type.
Calculates the spatial centroid over a field with a spatial point geometry type.
## Syntax
@ -8,20 +8,15 @@ The ST_CENTROID_AGG function calculates the spatial centroid over a field with s
### Parameters
#### field
#### `field`
The field parameter represents the column that contains the spatial point geometry data.
The field containing spatial point geometry data.
## Examples
Here is an example of how to use the ST_CENTROID_AGG function:
```esql
FROM airports
| STATS centroid = ST_CENTROID_AGG(location)
```
```esql
FROM city_boundaries
| STATS city_centroid = ST_CENTROID_AGG(boundary)
```
Calculate the spatial centroid of the `location` field in the `airports` dataset.
