[Obs AI Assistant] Automatically execute ES|QL queries (#174081)

Automatically executes ES|QL queries when the user asks for it. This is hard
to get 100% right, but the result is close enough. It removes the need for
user interaction and also makes our tests a little easier. I also improved
query generation with the following changes:

- Remove references to other query languages, to weaken the association with
those languages and limit the chance of mixing in their features. This
reduces hallucinations such as STATS .. AS ..
- Format every query in the docs in esql backticks. This hopefully
increases the LLM's attention to actual examples in the docs.
Specifically, it seems to prevent the scenario where the LLM tries to
escape index names with double quotes.

---------

Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
Dario Gieselaar 2024-01-15 21:00:19 +01:00 committed by GitHub
parent 59179a6239
commit c75f13a831
91 changed files with 794 additions and 134 deletions

View file

@ -7,48 +7,21 @@
/// <reference types="@kbn/ambient-ftr-types"/>
import { last } from 'lodash';
import moment from 'moment';
import { apm, timerange } from '@kbn/apm-synthtrace-client';
import expect from '@kbn/expect';
import { MessageRole } from '../../../../common';
import moment from 'moment';
import { chatClient, esClient, synthtraceEsClients } from '../../services';
function extractEsqlQuery(response: string) {
return response.match(/```esql([\s\S]*?)```/)?.[1];
}
async function evaluateEsqlQuery({
question,
expected,
criteria = [],
execute = true,
}: {
question: string;
expected?: string;
criteria?: string[];
execute?: boolean;
}): Promise<void> {
let conversation = await chatClient.complete(question);
const esqlQuery = extractEsqlQuery(last(conversation.messages)?.content || '');
if (esqlQuery && execute) {
conversation = await chatClient.complete(
conversation.conversationId!,
conversation.messages.concat({
content: '',
role: MessageRole.Assistant,
function_call: {
name: 'execute_query',
arguments: JSON.stringify({
query: esqlQuery,
}),
trigger: MessageRole.User,
},
})
);
}
const conversation = await chatClient.complete(question);
const evaluation = await chatClient.evaluate(conversation, [
...(expected
@ -57,7 +30,7 @@ async function evaluateEsqlQuery({
${expected}`,
]
: []),
...(execute && expected ? [`The query successfully executed without an error`] : []),
...(expected ? [`The query successfully executed without an error`] : []),
...criteria,
]);
@ -146,7 +119,6 @@ describe('ES|QL query generation', () => {
| SORT hire_date
| KEEP emp_no, hire_date_formatted
| LIMIT 5`,
execute: false,
});
});
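The fence-based extraction used by the `extractEsqlQuery` helper in this hunk can be tried in isolation. This is a minimal sketch: the regular expression is the one shown in the diff, while the `reply` string is a hypothetical assistant response invented for illustration.

```typescript
// Same pattern as the extractEsqlQuery helper shown in the diff above.
function extractEsqlQuery(response: string): string | undefined {
  // Capture everything between the opening ```esql fence and the next closing fence.
  return response.match(/```esql([\s\S]*?)```/)?.[1];
}

// Hypothetical assistant reply, for illustration only.
const reply = ['Here is the query:', '```esql', 'FROM employees', '| LIMIT 5', '```'].join('\n');

// The captured group keeps the surrounding newlines, so trim before use.
const query = extractEsqlQuery(reply)?.trim();

// A reply without an esql fence yields undefined.
const noQuery = extractEsqlQuery('No query in this reply.');

console.log(query);
console.log(noQuery);
```

Only the first fenced block is returned, because the pattern has no `g` flag and the capture is lazy.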

View file

@ -0,0 +1,19 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/
export function formatEsqlExamples(content: string) {
// Match a query: FROM, ROW, or SHOW, plus following lines that start with a pipe or whitespace
const queryRegex = /(\s*(FROM |ROW |SHOW ).*?)(?=\n[^|\s]|$)/gs;
// Function to format a matched query
const formatQuery = (match: string) => {
return `\n\`\`\`esql\n${match.trim()}\n\`\`\`\n`;
};
// Replace all matches in the input string
return content.replace(queryRegex, formatQuery);
}
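To see what this helper does to a doc page, here is a minimal sketch that repeats the function and runs it on invented input. The `sample` and `prose` strings are hypothetical and not taken from the generated files. `[\s\S]` stands in for `.` with the `s` flag, which should behave the same while also compiling on older targets.

```typescript
// Copy of the helper from the diff above, repeated so the sketch is self-contained.
function formatEsqlExamples(content: string): string {
  // A source command, then everything up to a newline that is followed by
  // a character other than a pipe or whitespace (or up to the end of input).
  const queryRegex = /(\s*(FROM |ROW |SHOW )[\s\S]*?)(?=\n[^|\s]|$)/g;
  const formatQuery = (match: string) => `\n\`\`\`esql\n${match.trim()}\n\`\`\`\n`;
  return content.replace(queryRegex, formatQuery);
}

// Hypothetical doc excerpt: a heading, a two-line query, then trailing prose.
const sample = ['Example', 'ROW a=1.8', '| EVAL a=CEIL(a)', 'Supported types:'].join('\n');
const formatted = formatEsqlExamples(sample);

// Hypothetical prose sentence that merely mentions the FROM command.
const prose = 'The FROM source command returns rows';
const wrappedProse = formatEsqlExamples(prose);

console.log(formatted);
console.log(wrappedProse);
```

The second call illustrates a limitation of this pattern: prose that contains `FROM `, `ROW `, or `SHOW ` is wrapped as well, which appears to be what happens in the FROM doc page later in this diff.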

View file

@ -15,6 +15,7 @@ import Path from 'path';
import git, { SimpleGitProgressEvent } from 'simple-git';
import yargs, { Argv } from 'yargs';
import { extractSections } from './extract_sections';
import { formatEsqlExamples } from './format_esql_examples';
yargs(process.argv.slice(2))
.command(
@ -221,7 +222,19 @@ yargs(process.argv.slice(2))
outDir,
`esql-${doc.title.replaceAll(' ', '-').toLowerCase()}.txt`
);
await Fs.writeFile(fileName, doc.content);
// We ask the LLM to output queries wrapped in ```esql...```,
// so we try to format ES|QL examples in the docs in the same
// way. The hope is that this creates a stronger relation in the
// output.
const formattedContent = formatEsqlExamples(doc.content);
log.debug({
content: doc.content,
formattedContent,
});
await Fs.writeFile(fileName, formattedContent);
})
)
);

View file

@ -1,8 +1,18 @@
ABS
Syntax
Parameters
n
Numeric expression. If null, the function returns null.
DescriptionReturns the absolute value.Supported types
Examples
```esql
ROW number = -1.0
| EVAL abs_number = ABS(number)
```
Returns the absolute value.
```esql
FROM employees
| KEEP first_name, last_name, height
| EVAL abs_height = ABS(0.0 - height)
Supported types:
```

View file

@ -7,5 +7,7 @@ Numeric expression. If null, the function returns null.
DescriptionReturns the arccosine of n as an
angle, expressed in radians.Supported types
Example
```esql
ROW a=.9
| EVAL acos=ACOS(a)
```

View file

@ -1,7 +1,14 @@
ASIN
Inverse sine trigonometric function.
Syntax
Parameters
n
Numeric expression. If null, the function returns null.
DescriptionReturns the
arcsine
of the input numeric expression as an angle, expressed in radians.Supported types
Example
```esql
ROW a=.9
| EVAL asin=ASIN(a)
Supported types:
```

View file

@ -1,7 +1,14 @@
ATAN
Inverse tangent trigonometric function.
Syntax
Parameters
n
Numeric expression. If null, the function returns null.
DescriptionReturns the
arctangent of the
input numeric expression as an angle, expressed in radians.Supported types
Example
```esql
ROW a=12.9
| EVAL atan=ATAN(a)
Supported types:
```

View file

@ -1,8 +1,16 @@
ATAN2
The angle between the positive x-axis and the
ray from the origin to the point (x , y) in the Cartesian plane.
Syntax
Parameters
y
Numeric expression. If null, the function returns null.
x
Numeric expression. If null, the function returns null.
DescriptionThe angle between the positive x-axis and
the ray from the origin to the point (x , y) in the Cartesian plane, expressed
in radians.Supported types
Example
```esql
ROW y=12.9, x=.6
| EVAL atan2=ATAN2(y, x)
Supported types:
```

View file

@ -1,27 +1,83 @@
AUTO_BUCKET
Creates human-friendly buckets and returns a datetime value for each row that
corresponds to the resulting bucket the row falls into. Combine AUTO_BUCKET
with STATS ... BY to create a date histogram.You provide a target number of buckets, a start date, and an end date, and it
picks an appropriate bucket size to generate the target number of buckets or
fewer. For example, this asks for at most 20 buckets over a whole year, which
picks monthly buckets:
ROW date=TO_DATETIME("1985-07-09T00:00:00.000Z")
| EVAL bucket=AUTO_BUCKET(date, 20, "1985-01-01T00:00:00Z", "1986-01-01T00:00:00Z")
Syntax
AUTO_BUCKET(field, buckets, from, to)
Parameters
field
Numeric or date column from which to derive buckets.
buckets
Target number of buckets.
from
Start of the range. Can be a number or a date expressed as a string.
to
End of the range. Can be a number or a date expressed as a string.
DescriptionCreates human-friendly buckets and returns a value for each row that corresponds
to the resulting bucket the row falls into.Using a target number of buckets, a start of a range, and an end of a range,
AUTO_BUCKET picks an appropriate bucket size to generate the target number of
buckets or fewer. For example, asking for at most 20 buckets over a year results
in monthly buckets:
```esql
FROM employees
| WHERE hire_date >= "1985-01-01T00:00:00Z" AND hire_date < "1986-01-01T00:00:00Z"
| EVAL month = AUTO_BUCKET(hire_date, 20, "1985-01-01T00:00:00Z", "1986-01-01T00:00:00Z")
| KEEP hire_date, month
| SORT hire_date
```
The goal isn't to provide exactly the target number of buckets, it's to pick a
range that people are comfortable with that provides at most the target number of
buckets.If you ask for more buckets then AUTO_BUCKET can pick a smaller range. For example,
asking for at most 100 buckets in a year will get you week long buckets:
ROW date=TO_DATETIME("1985-07-09T00:00:00.000Z")
| EVAL bucket=AUTO_BUCKET(date, 100, "1985-01-01T00:00:00Z", "1986-01-01T00:00:00Z")
AUTO_BUCKET does not filter any rows. It only uses the provided time range to
pick a good bucket size. For rows with a date outside of the range, it returns a
datetime that corresponds to a bucket outside the range. Combine AUTO_BUCKET
with WHERE to filter rows.A more complete example might look like:
range that people are comfortable with that provides at most the target number
of buckets.Combine AUTO_BUCKET with
STATS ... BY to create a histogram:
```esql
FROM employees
| WHERE hire_date >= "1985-01-01T00:00:00Z" AND hire_date < "1986-01-01T00:00:00Z"
| EVAL month = AUTO_BUCKET(hire_date, 20, "1985-01-01T00:00:00Z", "1986-01-01T00:00:00Z")
| STATS hires_per_month = COUNT(*) BY month
| SORT month
```
AUTO_BUCKET does not create buckets that don't match any documents.
That's why this example is missing 1985-03-01 and other dates.
Asking for more buckets can result in a smaller range. For example, asking for
at most 100 buckets in a year results in weekly buckets:
```esql
FROM employees
| WHERE hire_date >= "1985-01-01T00:00:00Z" AND hire_date < "1986-01-01T00:00:00Z"
| EVAL week = AUTO_BUCKET(hire_date, 100, "1985-01-01T00:00:00Z", "1986-01-01T00:00:00Z")
| STATS hires_per_week = COUNT(*) BY week
| SORT week
```
AUTO_BUCKET does not filter any rows. It only uses the provided range to
pick a good bucket size. For rows with a value outside of the range, it returns
a bucket value that corresponds to a bucket outside the range. Combine
AUTO_BUCKET with WHERE to filter rows.
AUTO_BUCKET can also operate on numeric fields. For example, to create a
salary histogram:
```esql
FROM employees
| EVAL bs = AUTO_BUCKET(salary, 20, 25324, 74999)
| STATS COUNT(*) by bs
| SORT bs
```
Unlike the earlier example that intentionally filters on a date range, you
rarely want to filter on a numeric range. You have to find the min and max
separately. ES|QL doesn't yet have an easy way to do that automatically.ExamplesCreate hourly buckets for the last 24 hours, and calculate the number of events
per hour:
```esql
FROM sample_data
| WHERE @timestamp >= NOW() - 1 day and @timestamp < NOW()
| EVAL bucket = AUTO_BUCKET(@timestamp, 25, DATE_FORMAT(NOW() - 1 day), DATE_FORMAT(NOW()))
| STATS COUNT(*) BY bucket
```
Create monthly buckets for the year 1985, and calculate the average salary by
hiring month:
```esql
FROM employees
| WHERE hire_date >= "1985-01-01T00:00:00Z" AND hire_date < "1986-01-01T00:00:00Z"
| EVAL bucket = AUTO_BUCKET(hire_date, 20, "1985-01-01T00:00:00Z", "1986-01-01T00:00:00Z")
| STATS AVG(salary) BY bucket
| SORT bucket
AUTO_BUCKET does not create buckets that don't match any documents. That's
why the example above is missing 1985-03-01 and other dates.
```

View file

@ -1,6 +1,11 @@
AVG
The average of a numeric field.
Syntax
AVG(column)
column
Numeric column. If null, the function returns null.
DescriptionThe average of a numeric field.Supported typesThe result is always a double no matter the input type.Example
```esql
FROM employees
| STATS AVG(height)
The result is always a double no matter the input type.
```

View file

@ -13,10 +13,32 @@ The default value thats is returned when no condition matches.
DescriptionAccepts pairs of conditions and values. The function returns the value that
belongs to the first condition that evaluates to true.If the number of arguments is odd, the last argument is the default value which
is returned when no condition matches. If the number of arguments is even, and
no condition matches, the function returns null.Example
no condition matches, the function returns null.ExampleDetermine whether employees are monolingual, bilingual, or polyglot:
```esql
FROM employees
| EVAL type = CASE(
languages <= 1, "monolingual",
languages <= 2, "bilingual",
"polyglot")
| KEEP emp_no, languages, type
```
Calculate the total connection success rate based on log messages:
```esql
FROM sample_data
| EVAL successful = CASE(
STARTS_WITH(message, "Connected to"), 1,
message == "Connection error", 0
)
| STATS success_rate = AVG(successful)
```
Calculate an hourly error rate as a percentage of the total number of log
messages:
```esql
FROM sample_data
| EVAL error = CASE(message LIKE "*error*", 1, 0)
| EVAL hour = DATE_TRUNC(1 hour, @timestamp)
| STATS error_rate = AVG(error) by hour
| SORT hour
```

View file

@ -1,10 +1,16 @@
CEIL
Round a number up to the nearest integer.
Syntax
Parameters
n
Numeric expression. If null, the function returns null.
DescriptionRound a number up to the nearest integer.
This is a noop for long (including unsigned) and integer.
For double this picks the closest double value to the integer
similar to Math.ceil.
Supported types
Example
```esql
ROW a=1.8
| EVAL a=CEIL(a)
This is a noop for long (including unsigned) and integer.
For double this picks the closest double value to the integer ala
Math.ceil.
Supported types:
```

View file

@ -1,5 +1,13 @@
COALESCE
Returns the first non-null value.
Syntax
COALESCE(expression1 [, ..., expressionN])
Parameters
expressionX
Expression to evaluate.
DescriptionReturns the first of its arguments that is not null. If all arguments are null,
it returns null.Example
```esql
ROW a=null, b="b"
| EVAL COALESCE(a, b)
```

View file

@ -1,6 +1,13 @@
CONCAT
Concatenates two or more strings.
Syntax
CONCAT(string1, string2[, ..., stringN])
Parameters
stringX
Strings to concatenate.
DescriptionConcatenates two or more strings.Example
```esql
FROM employees
| KEEP first_name, last_name, height
| KEEP first_name, last_name
| EVAL fullname = CONCAT(first_name, " ", last_name)
```

View file

@ -1,7 +1,13 @@
COS
Cosine trigonometric function. Input expected in radians.
Syntax
Parameters
n
Numeric expression. If null, the function returns null.
DescriptionReturns the cosine of n. Input
expected in radians.Supported types
Example
```esql
ROW a=1.8
| EVAL cos=COS(a)
Supported types:
```

View file

@ -1,7 +1,13 @@
COSH
Cosine hyperbolic function.
Syntax
Parameters
n
Numeric expression. If null, the function returns null.
Supported types
DescriptionReturns the hyperbolic
cosine.Example
```esql
ROW a=1.8
| EVAL cosh=COSH(a)
Supported types:
```

View file

@ -1,10 +1,20 @@
COUNT
Counts field values.
Syntax
COUNT([input])
Parameters
input
Column or literal for which to count the number of values. If omitted, returns a
count all (the number of rows).
DescriptionReturns the total number (count) of input values.Supported typesCan take any field type as input.Examples
```esql
FROM employees
| STATS COUNT(height)
Can take any field type as input and the result is always a long no matter
the input type.To count the number of rows, use COUNT(*):
```
To count the number of rows, use COUNT() or COUNT(*):
```esql
FROM employees
| STATS count = COUNT(*) BY languages
| SORT languages DESC
```

View file

@ -1,10 +1,13 @@
COUNT_DISTINCT
The approximate number of distinct values.
FROM hosts
| STATS COUNT_DISTINCT(ip0), COUNT_DISTINCT(ip1)
Can take any field type as input and the result is always a long no matter
the input type.Counts are approximateeditComputing exact counts requires loading values into a set and returning its
Syntax
COUNT_DISTINCT(column[, precision])
Parameters
column
Column for which to count the number of distinct values.
precision
Precision. Refer to Counts are approximate.
DescriptionReturns the approximate number of distinct values.Counts are approximateeditComputing exact counts requires loading values into a set and returning its
size. This doesn't scale when working on high-cardinality sets and/or large
values as the required memory usage and the need to communicate those
per-shard sets between nodes would utilize too many resources of the cluster.This COUNT_DISTINCT function is based on the
@ -22,7 +25,15 @@ on the dataset in question. In general, most datasets show consistently good
accuracy. Also note that even with a threshold as low as 100, the error
remains very low (1-6% as seen in the above graph) even when counting millions of items.The HyperLogLog++ algorithm depends on the leading zeros of hashed
values, the exact distributions of hashes in a dataset can affect the
accuracy of the cardinality.Precision is configurableeditThe COUNT_DISTINCT function takes an optional second parameter to configure the
precision discussed previously.
accuracy of the cardinality.The COUNT_DISTINCT function takes an optional second parameter to configure the
precision.Supported typesCan take any field type as input.Examples
```esql
FROM hosts
| STATS COUNT_DISTINCT(ip0), COUNT_DISTINCT(ip1)
```
With the optional second parameter to configure the precision:
```esql
FROM hosts
| STATS COUNT_DISTINCT(ip0, 80000), COUNT_DISTINCT(ip1, 5)
```

View file

@ -1,6 +1,31 @@
DATE_EXTRACT
Extracts parts of a date, like year, month, day, hour.
The supported field types are those provided by java.time.temporal.ChronoField.
Syntax
DATE_EXTRACT(date_part, date)
Parameters
date_part
Part of the date to extract. Can be: aligned_day_of_week_in_month,
aligned_day_of_week_in_year, aligned_week_of_month, aligned_week_of_year,
ampm_of_day, clock_hour_of_ampm, clock_hour_of_day, day_of_month,
day_of_week, day_of_year, epoch_day, era, hour_of_ampm, hour_of_day,
instant_seconds, micro_of_day, micro_of_second, milli_of_day,
milli_of_second, minute_of_day, minute_of_hour, month_of_year,
nano_of_day, nano_of_second, offset_seconds, proleptic_month,
second_of_day, second_of_minute, year, or year_of_era. Refer to
java.time.temporal.ChronoField
for a description of these values.
If null, the function returns null.
date
Date expression. If null, the function returns null.
DescriptionExtracts parts of a date, like year, month, day, hour.Examples
```esql
ROW date = DATE_PARSE("yyyy-MM-dd", "2022-05-06")
| EVAL year = DATE_EXTRACT("year", date)
```
Find all events that occurred outside of business hours (before 9 AM or after 5
PM), on any given date:
```esql
FROM sample_data
| WHERE DATE_EXTRACT("hour_of_day", @timestamp) < 9 OR DATE_EXTRACT("hour_of_day", @timestamp) >= 17
```

View file

@ -1,7 +1,17 @@
DATE_FORMAT
Returns a string representation of a date in the provided format. If no format
is specified, the yyyy-MM-dd'T'HH:mm:ss.SSSZ format is used.
Syntax
DATE_FORMAT([format,] date)
Parameters
format
Date format (optional). If no format is specified, the
yyyy-MM-dd'T'HH:mm:ss.SSSZ format is used. If null, the function returns
null.
date
Date expression. If null, the function returns null.
DescriptionReturns a string representation of a date, in the provided format.Example
```esql
FROM employees
| KEEP first_name, last_name, hire_date
| EVAL hired = DATE_FORMAT("yyyy-MM-dd", hire_date)
```

View file

@ -12,5 +12,7 @@ Date expression as a string. If null or an empty string, the function returns
null.
DescriptionReturns a date by parsing the second argument using the format specified in the
first argument.Example
```esql
ROW date_string = "2022-05-06"
| EVAL date = DATE_PARSE("yyyy-MM-dd", date_string)
```

View file

@ -1,8 +1,34 @@
DATE_TRUNC
Rounds down a date to the closest interval. Intervals can be expressed using the
timespan literal syntax.
Syntax
DATE_TRUNC(interval, date)
Parameters
interval
Interval, expressed using the timespan literal
syntax. If null, the function returns null.
date
Date expression. If null, the function returns null.
DescriptionRounds down a date to the closest interval.Examples
```esql
FROM employees
| KEEP first_name, last_name, hire_date
| EVAL year_hired = DATE_TRUNC(1 year, hire_date)
| STATS COUNT(emp_no) BY year_hired
| SORT year_hired
```
Combine DATE_TRUNC with STATS ... BY to create date histograms. For
example, the number of hires per year:
```esql
FROM employees
| EVAL year = DATE_TRUNC(1 year, hire_date)
| STATS hires = COUNT(emp_no) BY year
| SORT year
```
Or an hourly error rate:
```esql
FROM sample_data
| EVAL error = CASE(message LIKE "*error*", 1, 0)
| EVAL hour = DATE_TRUNC(1 hour, @timestamp)
| STATS error_rate = AVG(error) by hour
| SORT hour
```

View file

@ -14,12 +14,17 @@ DescriptionDISSECT enables you to extract
structured data out of a string. DISSECT matches the string against a
delimiter-based pattern, and extracts the specified keys as columns.Refer to Process data with DISSECT for the syntax of dissect patterns.ExamplesThe following example parses a string that contains a timestamp, some text, and
an IP address:
```esql
ROW a = "2023-01-23T12:15:00.000Z - some text - 127.0.0.1"
| DISSECT a "%{date} - %{msg} - %{ip}"
| KEEP date, msg, ip
```
By default, DISSECT outputs keyword string columns. To convert to another
type, use Type conversion functions:
```esql
ROW a = "2023-01-23T12:15:00.000Z - some text - 127.0.0.1"
| DISSECT a "%{date} - %{msg} - %{ip}"
| KEEP date, msg, ip
| EVAL date = TO_DATETIME(date)
```

View file

@ -6,9 +6,14 @@ Parameters
columns
A comma-separated list of columns to remove. Supports wildcards.
DescriptionThe DROP processing command removes one or more columns.Examples
```esql
FROM employees
| DROP height
```
Rather than specify each column by name, you can use wildcards to drop all
columns with a name that matches a pattern:
```esql
FROM employees
| DROP height*
```

View file

@ -2,4 +2,6 @@ E
Euler's number.
```esql
ROW E()
```

View file

@ -29,19 +29,31 @@ the match_field defined in the enrich policy and
requires that the input table has a column with the same name (language_code
in this example). ENRICH will look for records in the
enrich index based on the match field value.
```esql
ROW language_code = "1"
| ENRICH languages_policy
```
To use a column with a different name than the match_field defined in the
policy as the match field, use ON <column-name>:
```esql
ROW a = "1"
| ENRICH languages_policy ON a
```
By default, each of the enrich fields defined in the policy is added as a
column. To explicitly select the enrich fields that are added, use
WITH <field1>, <field2>, ...:
```esql
ROW a = "1"
| ENRICH languages_policy ON a WITH language_name
```
You can rename the columns that are added using WITH new_name=<field1>:
```esql
ROW a = "1"
| ENRICH languages_policy ON a WITH name = language_name
```
In case of name collisions, the newly created columns will override existing
columns.

View file

@ -11,13 +11,18 @@ function.
DescriptionThe EVAL processing command enables you to append new columns with calculated
values. EVAL supports various functions for calculating values. Refer to
Functions for more information.Examples
```esql
FROM employees
| SORT emp_no
| KEEP first_name, last_name, height
| EVAL height_feet = height * 3.281, height_cm = height * 100
```
If the specified column already exists, the existing column will be dropped, and
the new column will be appended to the table:
```esql
FROM employees
| SORT emp_no
| KEEP first_name, last_name, height
| EVAL height = height * 3.281
```

View file

@ -2,8 +2,11 @@ FLOOR
Round a number down to the nearest integer.
```esql
ROW a=1.8
| EVAL a=FLOOR(a)
```
This is a noop for long (including unsigned) and integer.
For double this picks the closest double value to the integer ala
Math.floor.

View file

@ -1,29 +1,57 @@
FROM
Syntax
```esql
FROM index_pattern [METADATA fields]
```
Parameters
index_pattern
A list of indices, data streams or aliases. Supports wildcards and date math.
fields
A comma-separated list of metadata fields to retrieve.
DescriptionThe FROM source command returns a table with data from a data stream, index,
DescriptionThe
```esql
FROM source command returns a table with data from a data stream, index,
```
or alias. Each row in the resulting table represents a document. Each column
corresponds to a field, and can be accessed by the name of that field.
By default, an ES|QL query without an explicit LIMIT uses an implicit
limit of 500. This applies to FROM too. A FROM command without LIMIT:
limit of 500. This applies to
```esql
FROM too. A FROM command without LIMIT:
```
```esql
FROM employees
```
is executed as:
```esql
FROM employees
| LIMIT 500
```
Examples
```esql
FROM employees
```
You can use date math to refer to indices, aliases
and data streams. This can be useful for time series data, for example to access
todays index:
```esql
FROM <logs-{now/d}>
```
Use comma-separated lists or wildcards to query multiple data streams, indices,
or aliases:
```esql
FROM employees-00001,other-employees-*
```
Use the METADATA directive to enable metadata fields:
```esql
FROM employees [METADATA _id]
```

View file

@ -3,8 +3,11 @@ GREATEST
Returns the maximum value from many columns. This is similar to MV_MAX
except it's intended to run on multiple columns at once.
```esql
ROW a = 10, b = 20
| EVAL g = GREATEST(a, b)
```
When run on keyword or text fields, this'll return the last string
in alphabetical order. When run on boolean columns this will return
true if any values are true.

View file

@ -12,17 +12,25 @@ DescriptionGROK enables you to extract
structured data out of a string. GROK matches the string against patterns,
based on regular expressions, and extracts the specified patterns as columns.Refer to Process data with GROK for the syntax of grok patterns.ExamplesThe following example parses a string that contains a timestamp, an IP address,
an email address, and a number:
```esql
ROW a = "2023-01-23T12:15:00.000Z 127.0.0.1 some.email@foo.com 42"
| GROK a "%{TIMESTAMP_ISO8601:date} %{IP:ip} %{EMAILADDRESS:email} %{NUMBER:num}"
| KEEP date, ip, email, num
```
By default, GROK outputs keyword string columns. int and float types can
be converted by appending :type to the semantics in the pattern. For example
{NUMBER:num:int}:
```esql
ROW a = "2023-01-23T12:15:00.000Z 127.0.0.1 some.email@foo.com 42"
| GROK a "%{TIMESTAMP_ISO8601:date} %{IP:ip} %{EMAILADDRESS:email} %{NUMBER:num:int}"
| KEEP date, ip, email, num
```
For other type conversions, use Type conversion functions:
```esql
ROW a = "2023-01-23T12:15:00.000Z 127.0.0.1 some.email@foo.com 42"
| GROK a "%{TIMESTAMP_ISO8601:date} %{IP:ip} %{EMAILADDRESS:email} %{NUMBER:num:int}"
| KEEP date, ip, email, num
| EVAL date = TO_DATETIME(date)
```

View file

@ -6,14 +6,22 @@ Parameters
columns::
A comma-separated list of columns to keep. Supports wildcards.DescriptionThe KEEP processing command enables you to specify what columns are returned
and the order in which they are returned.ExamplesThe columns are returned in the specified order:
```esql
FROM employees
| KEEP emp_no, first_name, last_name, height
```
Rather than specify each column by name, you can use wildcards to return all
columns with a name that matches a pattern:
```esql
FROM employees
| KEEP h*
```
The asterisk wildcard (*) by itself translates to all columns that do not
match the other arguments. This query will first return all columns with a name
that starts with h, followed by all other columns:
```esql
FROM employees
| KEEP h*, *
```

View file

@ -3,8 +3,11 @@ LEAST
Returns the minimum value from many columns. This is similar to MV_MIN
except it's intended to run on multiple columns at once.
```esql
ROW a = 10, b = 20
| EVAL l = LEAST(a, b)
```
When run on keyword or text fields, this'll return the first string
in alphabetical order. When run on boolean columns this will return
false if any values are false.

View file

@ -2,9 +2,12 @@ LEFT
Return the substring that extracts length chars from the string starting from the left.
```esql
FROM employees
| KEEP last_name
| EVAL left = LEFT(last_name, 3)
| SORT last_name ASC
| LIMIT 5
```
Supported types:

View file

@ -1,6 +1,8 @@
LENGTH
Returns the character length of a string.
```esql
FROM employees
| KEEP first_name, last_name, height
| EVAL fn_length = LENGTH(first_name)
```

View file

@ -19,6 +19,8 @@ settings:
esql.query.result_truncation_default_size
esql.query.result_truncation_max_size
Example
```esql
FROM employees
| SORT emp_no ASC
| LIMIT 5
```

View file

@ -3,6 +3,9 @@ LOG10
Returns the log base 10. The input can be any numeric value, the return value
is always a double.Logs of negative numbers are NaN. Logs of infinites are infinite, as is the log of 0.
```esql
ROW d = 1000.0
| EVAL s = LOG10(d)
```
Supported types:

View file

@ -2,9 +2,12 @@ LTRIM
Removes leading whitespaces from strings.
```esql
ROW message = " some text ", color = " red "
| EVAL message = LTRIM(message)
| EVAL color = LTRIM(color)
| EVAL message = CONCAT("'", message, "'")
| EVAL color = CONCAT("'", color, "'")
```
Supported types:

View file

@ -1,5 +1,7 @@
MAX
The maximum value of a numeric field.
```esql
FROM employees
| STATS MAX(languages)
```

View file

@ -2,8 +2,11 @@ MEDIAN
The value that is greater than half of all values and less than half of
all values, also known as the 50% PERCENTILE.
```esql
FROM employees
| STATS MEDIAN(salary), PERCENTILE(salary, 50)
```
Like PERCENTILE, MEDIAN is usually approximate.
MEDIAN is also non-deterministic.
This means you can get slightly different results using the same data.

View file

@ -6,8 +6,11 @@ or may not be normally distributed. For such data it can be more descriptive tha
standard deviation.It is calculated as the median of each data point's deviation from the median of
the entire sample. That is, for a random variable X, the median absolute deviation
is median(|median(X) - Xi|).
```esql
FROM employees
| STATS MEDIAN(salary), MEDIAN_ABSOLUTE_DEVIATION(salary)
```
Like PERCENTILE, MEDIAN_ABSOLUTE_DEVIATION is
usually approximate.
MEDIAN_ABSOLUTE_DEVIATION is also non-deterministic.

View file

@ -1,5 +1,7 @@
MIN
The minimum value of a numeric field.
```esql
FROM employees
| STATS MIN(languages)
```

View file

@ -2,6 +2,9 @@ MV_AVG
Converts a multivalued field into a single valued field containing the average
of all of the values. For example:
```esql
ROW a=[3, 5, 1, 6]
| EVAL avg_a = MV_AVG(a)
```
The output type is always a double and the input type can be any number.

View file

@ -3,9 +3,15 @@ MV_CONCAT
Converts a multivalued string field into a single valued field containing the
concatenation of all values separated by a delimiter:
```esql
ROW a=["foo", "zoo", "bar"]
| EVAL j = MV_CONCAT(a, ", ")
```
If you want to concat non-string fields call TO_STRING on them first:
```esql
ROW a=[10, 9, 8]
| EVAL j = MV_CONCAT(TO_STRING(a), ", ")
```
Supported types:

View file

@ -3,6 +3,9 @@ MV_COUNT
Converts a multivalued field into a single valued field containing a count of the number
of values:
```esql
ROW a=["foo", "zoo", "bar"]
| EVAL count_a = MV_COUNT(a)
```
Supported types:

View file

@ -2,7 +2,10 @@ MV_DEDUPE
Removes duplicates from a multivalued field. For example:
```esql
ROW a=["foo", "foo", "bar", "foo"]
| EVAL dedupe_a = MV_DEDUPE(a)
```
Supported types:
MV_DEDUPE may, but won't always, sort the values in the field.

View file

@ -7,5 +7,7 @@ column
The multivalued column to expand.
DescriptionThe MV_EXPAND processing command expands multivalued columns into one row per
value, duplicating other columns.Example
```esql
ROW a=[1,2,3], b="b", j=["a","b"]
| MV_EXPAND a
```

View file

@ -2,10 +2,16 @@ MV_MAX
Converts a multivalued field into a single valued field containing the maximum value. For example:
```esql
ROW a=[3, 5, 1]
| EVAL max_a = MV_MAX(a)
```
It can be used by any field type, including keyword fields. In that case it picks the
last string, comparing their utf-8 representation byte by byte:
```esql
ROW a=["foo", "zoo", "bar"]
| EVAL max_a = MV_MAX(a)
```
Supported types:

View file

@ -1,10 +1,15 @@
MV_MEDIAN
Converts a multivalued field into a single valued field containing the median value. For example:
```esql
ROW a=[3, 5, 1]
| EVAL median_a = MV_MEDIAN(a)
```
It can be used by any numeric field type and returns a value of the same type. If the
row has an even number of values for a column the result will be the average of the
middle two entries. If the field is not floating point then the average rounds down:
```esql
ROW a=[3, 7, 1, 6]
| EVAL median_a = MV_MEDIAN(a)
```

View file

@ -2,10 +2,16 @@ MV_MIN
Converts a multivalued field into a single valued field containing the minimum value. For example:
```esql
ROW a=[2, 1]
| EVAL min_a = MV_MIN(a)
```
It can be used by any field type, including keyword fields. In that case it picks the
first string, comparing their utf-8 representation byte by byte:
```esql
ROW a=["foo", "bar"]
| EVAL min_a = MV_MIN(a)
```
Supported types:

View file

@ -2,6 +2,9 @@ MV_SUM
Converts a multivalued field into a single valued field containing the sum
of all of the values. For example:
```esql
ROW a=[3, 5, 6]
| EVAL sum_a = MV_SUM(a)
```
The input type can be any number and the output type is the same as the input type.

View file

@ -1,4 +1,6 @@
NOW
Returns current date and time.
```esql
ROW current_date = NOW()
```

View file

@ -1,12 +0,0 @@
Numeric fields
auto_bucket can also operate on numeric fields like this:
FROM employees
| WHERE hire_date >= "1985-01-01T00:00:00Z" AND hire_date < "1986-01-01T00:00:00Z"
| EVAL bs = AUTO_BUCKET(salary, 20, 25324, 74999)
| SORT hire_date, salary
| KEEP hire_date, salary, bs
Unlike the example above where you are intentionally filtering on a date range,
you rarely want to filter on a numeric range. So you have to find the min and max
separately. We don't yet have an easy way to do that automatically. Improvements
coming!

View file

@ -86,24 +86,37 @@ IS NULL and IS NOT NULL predicates
IS NULL and IS NOT NULL predicates
For NULL comparison, use the IS NULL and IS NOT NULL predicates:
```esql
FROM employees
| WHERE birth_date IS NULL
| KEEP first_name, last_name
| SORT first_name
| LIMIT 3
```
```esql
FROM employees
| WHERE is_rehired IS NOT NULL
| STATS COUNT(emp_no)
```
CIDR_MATCH
CIDR_MATCH
Returns true if the provided IP is contained in one of the provided CIDR
blocks. CIDR_MATCH accepts two or more arguments. The first argument is the IP
address of type ip (both IPv4 and IPv6 are supported). Subsequent arguments
are the CIDR blocks to test the IP against.
Syntax
CIDR_MATCH(ip, block1[, ..., blockN])
Parameters
ip
IP address of type ip (both IPv4 and IPv6 are supported).
blockX
CIDR block to test the IP against.
Description
Returns true if the provided IP is contained in one of the provided CIDR
blocks.
Example
```esql
FROM hosts
| WHERE CIDR_MATCH(ip, "127.0.0.2/32", "127.0.0.3/32")
| WHERE CIDR_MATCH(ip1, "127.0.0.2/32", "127.0.0.3/32")
| KEEP card, host, ip0, ip1
```
ENDS_WITH
ENDS_WITH
@ -111,9 +124,12 @@ ENDS_WITH
Returns a boolean that indicates whether a keyword string ends with another
string:
```esql
FROM employees
| KEEP last_name
| EVAL ln_E = ENDS_WITH(last_name, "d")
```
Supported types:
IN
@ -121,12 +137,16 @@ IN
The IN operator allows testing whether a field or expression equals
an element in a list of literals, fields or expressions:
```esql
ROW a = 1, b = 4, c = 3
| WHERE c-a IN (3, b / 2, a)
```
Returns a boolean that indicates whether its input is not a number.
```esql
ROW d = 1.0
| EVAL s = IS_NAN(d)
```
LIKE
LIKE
@ -137,9 +157,11 @@ also act on a constant (literal) expression. The right-hand side of the operator
represents the pattern. The following wildcard characters are supported:
* matches zero or more characters.
? matches one character.
```esql
FROM employees
| WHERE first_name LIKE "?b*"
| KEEP first_name, last_name
```
RLIKE
RLIKE
@ -148,9 +170,11 @@ Use RLIKE to filter data based on string patterns using using
regular expressions. RLIKE usually acts on a field placed on
the left-hand side of the operator, but it can also act on a constant (literal)
expression. The right-hand side of the operator represents the pattern.
```esql
FROM employees
| WHERE first_name RLIKE ".leja.*"
| KEEP first_name, last_name
```
STARTS_WITH
STARTS_WITH
@ -158,7 +182,10 @@ STARTS_WITH
Returns a boolean that indicates whether a keyword string starts with another
string:
```esql
FROM employees
| KEEP last_name
| EVAL ln_S = STARTS_WITH(last_name, "B")
```
Supported types:

View file

@ -3,10 +3,13 @@ PERCENTILE
The value at which a certain percentage of observed values occur. For example,
the 95th percentile is the value which is greater than 95% of the observed values and
the 50th percentile is the MEDIAN.
```esql
FROM employees
| STATS p0 = PERCENTILE(salary, 0)
, p50 = PERCENTILE(salary, 50)
, p99 = PERCENTILE(salary, 99)
```
PERCENTILE is (usually) approximate
There are many different algorithms to calculate percentiles. The naive
implementation simply stores all the values in a sorted array. To find the 50th
percentile, you simply find the value that is at my_array[count(my_array) * 0.5].
Clearly, the naive implementation does not scale: the sorted array grows

View file

@ -2,4 +2,6 @@ PI
The ratio of a circle's circumference to its diameter.
```esql
ROW PI()
```

View file

@ -4,10 +4,16 @@ POW
Returns the value of a base (first argument) raised to the power of an exponent (second argument).
Both arguments must be numeric. The output is always a double. Note that it is still possible to overflow
a double result here; in that case, null will be returned.
```esql
ROW base = 2.0, exponent = 2
| EVAL result = POW(base, exponent)
```
Fractional exponents
The exponent can be a fraction, which is similar to performing a root.
For example, the exponent of 0.5 will give the square root of the base:
```esql
ROW base = 4, exponent = 0.5
| EVAL s = POW(base, exponent)
```
Table of supported input and output types
For clarity, the following table describes the output result type for all combinations of numeric input types:

View file

@ -9,10 +9,15 @@ new_nameX
The new name of the column.
Description
The RENAME processing command renames one or more columns. If a column with
the new name already exists, it will be replaced by the new column.
Examples
```esql
FROM employees
| KEEP first_name, last_name, still_hired
| RENAME still_hired AS employed
```
Multiple columns can be renamed with a single RENAME command:
```esql
FROM employees
| KEEP first_name, last_name
| RENAME first_name AS fn, last_name AS ln
```

View file

@ -2,6 +2,8 @@ REPLACE
The function substitutes in the string (1st argument) any match of the regular expression (2nd argument) with the replacement string (3rd argument). If any of the arguments are NULL, the result is NULL.
This example replaces an occurrence of the word "World" with the word "Universe":
```esql
ROW str = "Hello World"
| EVAL str = REPLACE(str, "World", "Universe")
| KEEP str
```

View file

@ -2,9 +2,12 @@ RIGHT
Returns a substring of length characters taken from the right end of the string.
```esql
FROM employees
| KEEP last_name
| EVAL right = RIGHT(last_name, 3)
| SORT last_name ASC
| LIMIT 5
```
Supported types:

View file

@ -3,6 +3,8 @@ ROUND
Rounds a number to the closest number with the specified number of digits.
Defaults to 0 digits if no number of digits is provided. If the specified number
of digits is negative, rounds to the number of digits left of the decimal point.
```esql
FROM employees
| KEEP first_name, last_name, height
| EVAL height_ft = ROUND(height * 3.281, 1)
```
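The digit handling described above, including the negative case, can be sketched in TypeScript. `roundTo` is a hypothetical helper written for illustration only; it assumes JavaScript's `Math.round` tie-breaking, which may differ from what ES|QL does for values exactly halfway between two candidates.

```typescript
// Hypothetical helper illustrating the documented ROUND semantics.
// Assumption: ties follow Math.round, which may not match ES|QL exactly.
function roundTo(value: number, digits: number = 0): number {
  if (digits >= 0) {
    // Non-negative digits: keep that many digits after the decimal point.
    const factor = 10 ** digits;
    return Math.round(value * factor) / factor;
  }
  // Negative digits: round to a position left of the decimal point.
  const factor = 10 ** -digits;
  return Math.round(value / factor) * factor;
}
```

With this sketch, `roundTo(1234.5, -2)` rounds to the nearest hundred and `roundTo(2.4)` falls back to 0 digits.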

View file

@ -1,17 +1,35 @@
ROW
Syntax
```esql
ROW column1 = value1[, ..., columnN = valueN]
```
Parameters
columnX
The column name.
valueX
The value for the column. Can be a literal, an expression, or a
function.
DescriptionThe ROW source command produces a row with one or more columns with values
DescriptionThe
```esql
ROW source command produces a row with one or more columns with values
```
that you specify. This can be useful for testing.Examples
```esql
ROW a = 1, b = "two", c = null
```
Use square brackets to create multi-value columns:
```esql
ROW a = [2, 1]
```
```esql
ROW supports the use of functions:
```
```esql
ROW a = ROUND(1.23, 0)
```

View file

@ -2,9 +2,12 @@ RTRIM
Removes trailing whitespaces from strings.
```esql
ROW message = " some text ", color = " red "
| EVAL message = RTRIM(message)
| EVAL color = RTRIM(color)
| EVAL message = CONCAT("'", message, "'")
| EVAL color = CONCAT("'", color, "'")
```
Supported types:

View file

@ -1,15 +1,32 @@
SHOW
Syntax
```esql
SHOW item
```
Parameters
item
Can be INFO or FUNCTIONS.
DescriptionThe SHOW source command returns information about the deployment and
DescriptionThe
```esql
SHOW source command returns information about the deployment and
```
its capabilities:
Use SHOW INFO to return the deployment's version, build date and hash.
Use SHOW FUNCTIONS to return a list of all supported functions and a
Use
```esql
SHOW INFO to return the deployment's version, build date and hash.
```
Use
```esql
SHOW FUNCTIONS to return a list of all supported functions and a
```
synopsis of each function.
Examples
```esql
SHOW functions
| WHERE STARTS_WITH(name, "is_")
```

View file

@ -2,6 +2,9 @@ SIN
Sine trigonometric function. Input expected in radians.
```esql
ROW a=1.8
| EVAL sin=SIN(a)
```
Supported types:

View file

@ -2,6 +2,9 @@ SINH
Sine hyperbolic function.
```esql
ROW a=1.8
| EVAL sinh=SINH(a)
```
Supported types:

View file

@ -12,18 +12,29 @@ the highest value when sorting descending.By default, null values are treated as
an ascending sort order, null values are sorted last, and with a descending
sort order, null values are sorted first. You can change that by providing
NULLS FIRST or NULLS LAST.Examples
```esql
FROM employees
| KEEP first_name, last_name, height
| SORT height
```
Explicitly sorting in descending order with DESC:
```esql
FROM employees
| KEEP first_name, last_name, height
| SORT height DESC
```
Providing additional sort expressions to act as tie breakers:
```esql
FROM employees
| KEEP first_name, last_name, height
| SORT height DESC, first_name ASC
```
Sorting null values first using NULLS FIRST:
```esql
FROM employees
| KEEP first_name, last_name, height
| SORT first_name ASC NULLS FIRST
```

View file

@ -1,7 +1,10 @@
SPLIT
Split a single valued string into multiple strings. For example:
```esql
ROW words="foo;bar;baz;qux;quux;corge"
| EVAL word = SPLIT(words, ";")
```
Which splits "foo;bar;baz;qux;quux;corge" on ; and returns an array:
Only single byte delimiters are currently supported.

View file

@ -3,6 +3,9 @@ SQRT
Returns the square root of a number. The input can be any numeric value, the return value
is always a double. Square roots of negative numbers are NaN. Square roots of infinities are infinite.
```esql
ROW d = 100.0
| EVAL s = SQRT(d)
```
Supported types:

View file

@ -31,20 +31,31 @@ Grouping on a single column is currently much more optimized than grouping
something like CONCAT and then grouping - that is not going to be
faster.
ExamplesCalculating a statistic and grouping by the values of another column:
```esql
FROM employees
| STATS count = COUNT(emp_no) BY languages
| SORT languages
```
Omitting BY returns one row with the aggregations applied over the entire
dataset:
```esql
FROM employees
| STATS avg_lang = AVG(languages)
```
It's possible to calculate multiple values:
```esql
FROM employees
| STATS avg_lang = AVG(languages), max_lang = MAX(languages)
```
It's also possible to group by multiple values (only supported for long and
keyword family fields):
```esql
FROM employees
| EVAL hired = DATE_FORMAT("YYYY", hire_date)
| STATS avg_salary = AVG(salary) BY hired, languages.long
| EVAL avg_salary = ROUND(avg_salary)
| SORT hired, languages.long
```

View file

@ -2,16 +2,24 @@ SUBSTRING
Returns a substring of a string, specified by a start position and an optional
length. This example returns the first three characters of every last name:
```esql
FROM employees
| KEEP last_name
| EVAL ln_sub = SUBSTRING(last_name, 1, 3)
```
A negative start position is interpreted as being relative to the end of the
string. This example returns the last three characters of every last name:
```esql
FROM employees
| KEEP last_name
| EVAL ln_sub = SUBSTRING(last_name, -3, 3)
```
If length is omitted, substring returns the remainder of the string. This
example returns all characters except for the first:
```esql
FROM employees
| KEEP last_name
| EVAL ln_sub = SUBSTRING(last_name, 2)
```
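The position rules above (1-based start, negative start counted from the end, optional length) can be sketched in TypeScript. `substring` here is a hypothetical stand-in, not the ES|QL implementation. Two assumptions are made: a start of 0 is treated like 1, since the text does not cover that case, and positions are counted in UTF-16 code units.

```typescript
// Hypothetical illustration of the documented SUBSTRING position rules.
// Assumptions: start 0 behaves like start 1; positions are UTF-16 units.
function substring(input: string, start: number, length?: number): string {
  let from = 0;
  if (start > 0) {
    // Positive positions are 1-based.
    from = start - 1;
  } else if (start < 0) {
    // Negative positions are relative to the end of the string.
    from = Math.max(input.length + start, 0);
  }
  // Without a length, return the remainder of the string.
  const to = length === undefined ? input.length : from + length;
  return input.slice(from, to);
}
```

For a last name such as "Awdeh", the three queries above would correspond to `substring("Awdeh", 1, 3)`, `substring("Awdeh", -3, 3)` and `substring("Awdeh", 2)`.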

View file

@ -1,5 +1,7 @@
SUM
The sum of a numeric field.
```esql
FROM employees
| STATS SUM(languages)
```

View file

@ -27,24 +27,36 @@ expressions - require quoting if the identifier contains characters other than
letters, numbers and `_` and doesn't start with a letter, `_` or `@`.
For instance:
// Retain just one field
```esql
FROM index
| KEEP 1.field
```
is legal. However, if the same field is to be used with an EVAL,
it'd have to be quoted:
// Copy one field
```esql
FROM index
| EVAL my_field = `1.field`
```
Literals
ES|QL currently supports numeric and string literals.
String literals
A string literal is a sequence of unicode characters delimited by double
quotes (`"`).
// Filter by a string value
```esql
FROM index
| WHERE first_name == "Georgi"
```
If the literal string itself contains quotes, these need to be escaped (`\\"`).
ES|QL also supports the triple-quotes (`"""`) delimiter, for convenience:
```esql
ROW name = """Indiana "Indy" Jones"""
```
The special characters CR, LF and TAB can be provided with the usual escaping:
`\r`, `\n`, `\t`, respectively.
Numerical literals
@ -67,11 +79,20 @@ ES|QL uses C++ style comments:
double slash `//` for single line comments
`/*` and `*/` for block comments
// Query the employees index
```esql
FROM employees
| WHERE height > 2
```
```esql
FROM /* Query the employees index */ employees
| WHERE height > 2
```
```esql
FROM employees
```
/* Query the
* employees
* index */

View file

@ -2,6 +2,9 @@ TAN
Tangent trigonometric function. Input expected in radians.
```esql
ROW a=1.8
| EVAL tan=TAN(a)
```
Supported types:

View file

@ -2,6 +2,9 @@ TANH
Tangent hyperbolic function.
```esql
ROW a=1.8
| EVAL tanh=TANH(a)
```
Supported types:

View file

@ -2,4 +2,6 @@ TAU
The ratio of a circle's circumference to its radius.
```esql
ROW TAU()
```

View file

@ -4,7 +4,10 @@ Converts an input value to a boolean value.The input can be a single- or multi-v
type must be of a string or numeric type. A string value of "true" is converted, case-insensitively, to the Boolean
true. For anything else, including the empty string, the function will
return false. For example:
```esql
ROW str = ["true", "TRuE", "false", "", "yes", "1"]
| EVAL bool = TO_BOOLEAN(str)
```
The numerical value of 0 will be converted to false, anything else will be
converted to true. Alias: TO_BOOL
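The conversion rules above can be sketched in TypeScript. `toBoolean` is a hypothetical helper that mirrors only what the text states for single string and numeric values; it is not the ES|QL implementation and does not cover multi-valued input.

```typescript
// Hypothetical illustration of the documented TO_BOOLEAN rules.
function toBoolean(input: string | number): boolean {
  if (typeof input === 'number') {
    // Numeric input: 0 is false, anything else is true.
    return input !== 0;
  }
  // String input: only "true" (compared case-insensitively) is true.
  return input.toLowerCase() === 'true';
}

// The values from the example query above, converted one by one.
const converted = ['true', 'TRuE', 'false', '', 'yes', '1'].map(toBoolean);
```

Note that under these rules the string "1" converts to false while the number 1 converts to true.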

View file

@ -3,8 +3,11 @@ TO_DATETIME
Converts an input value to a date value. The input can be a single- or multi-valued field or an expression. The input
type must be of a string or numeric type. A string will only be successfully converted if it respects the format
yyyy-MM-dd'T'HH:mm:ss.SSS'Z' (to convert dates in other formats, use DATE_PARSE). For example:
```esql
ROW string = ["1953-09-02T00:00:00.000Z", "1964-06-02T00:00:00.000Z", "1964-06-02 00:00:00"]
| EVAL datetime = TO_DATETIME(string)
```
Note that in this example, the last value in the source multi-valued
field has not been converted. The reason being that if the date format is not
respected, the conversion will result in a null value. When this happens a
@ -12,6 +15,9 @@ Warning header is added to the response. The header will provide information
on the source of the failure: "Line 1:112: evaluation of [TO_DATETIME(string)] failed, treating result as null. Only first 20 failures recorded." A following header will contain the failure reason and the offending value: "java.lang.IllegalArgumentException: failed to parse date field [1964-06-02 00:00:00] with format [yyyy-MM-dd'T'HH:mm:ss.SSS'Z']" If the input parameter is of a numeric type, its value will be interpreted as
milliseconds since the Unix epoch.
For example:
```esql
ROW int = [0, 1]
| EVAL dt = TO_DATETIME(int)
```
Alias: TO_DT

View file

@ -3,5 +3,7 @@ TO_DEGREES
Converts a number in radians
to degrees. The input can be a single- or multi-valued field or an expression. The input
type must be of a numeric type and the result is always a double. Example:
```esql
ROW rad = [1.57, 3.14, 4.71]
| EVAL deg = TO_DEGREES(rad)
```

View file

@ -2,8 +2,11 @@ TO_DOUBLE
Converts an input value to a double value. The input can be a single- or multi-valued field or an expression. The input
type must be of a boolean, date, string or numeric type. Example:
```esql
ROW str1 = "5.20128E11", str2 = "foo"
| EVAL dbl = TO_DOUBLE("520128000000"), dbl1 = TO_DOUBLE(str1), dbl2 = TO_DOUBLE(str2)
```
Note that in this example, the last conversion of the string isn't
possible. When this happens, the result is a null value. In this case a
Warning header is added to the response. The header will provide information

View file

@ -2,8 +2,11 @@ TO_INTEGER
Converts an input value to an integer value. The input can be a single- or multi-valued field or an expression. The input
type must be of a boolean, date, string or numeric type. Example:
```esql
ROW long = [5013792, 2147483647, 501379200000]
| EVAL int = TO_INTEGER(long)
```
Note that in this example, the last value of the multi-valued field cannot
be converted as an integer. When this happens, the result is a null value.
In this case a Warning header is added to the response. The header will

View file

@ -1,9 +1,12 @@
TO_IP
Converts an input string to an IP value. The input can be a single- or multi-valued field or an expression. Example:
```esql
ROW str1 = "1.1.1.1", str2 = "foo"
| EVAL ip1 = TO_IP(str1), ip2 = TO_IP(str2)
| WHERE CIDR_MATCH(ip1, "1.0.0.0/8")
```
Note that in the example above the last conversion of the string isn't
possible. When this happens, the result is a null value. In this case a
Warning header is added to the response. The header will provide information

View file

@ -2,8 +2,11 @@ TO_LONG
Converts an input value to a long value. The input can be a single- or multi-valued field or an expression. The input
type must be of a boolean, date, string or numeric type. Example:
```esql
ROW str1 = "2147483648", str2 = "2147483648.2", str3 = "foo"
| EVAL long1 = TO_LONG(str1), long2 = TO_LONG(str2), long3 = TO_LONG(str3)
```
Note that in this example, the last conversion of the string isn't
possible. When this happens, the result is a null value. In this case a
Warning header is added to the response. The header will provide information

View file

@ -3,5 +3,7 @@ TO_RADIANS
Converts a number in degrees to
radians. The input can be a single- or multi-valued field or an expression. The input
type must be of a numeric type and the result is always a double. Example:
```esql
ROW deg = [90.0, 180.0, 270.0]
| EVAL rad = TO_RADIANS(deg)
```

View file

@ -2,9 +2,15 @@ TO_STRING
Converts a field into a string. For example:
```esql
ROW a=10
| EVAL j = TO_STRING(a)
```
It also works fine on multivalued fields:
```esql
ROW a=[10, 9, 8]
| EVAL j = TO_STRING(a)
```
Alias: TO_STR
Supported types:

View file

@ -2,8 +2,11 @@ TO_UNSIGNED_LONG
Converts an input value to an unsigned long value. The input can be a single- or multi-valued field or an expression. The input
type must be of a boolean, date, string or numeric type. Example:
```esql
ROW str1 = "2147483648", str2 = "2147483648.2", str3 = "foo"
| EVAL long1 = TO_UNSIGNED_LONG(str1), long2 = TO_ULONG(str2), long3 = TO_UL(str3)
```
Note that in this example, the last conversion of the string isn't
possible. When this happens, the result is a null value. In this case a
Warning header is added to the response. The header will provide information

View file

@ -2,5 +2,8 @@ TO_VERSION
Converts an input string to a version value. For example:
```esql
ROW v = TO_VERSION("1.2.3")
```
The input can be a single- or multi-valued field or an expression. Alias: TO_VER
Supported types:

View file

@ -2,7 +2,10 @@ TRIM
Removes leading and trailing whitespaces from strings.
```esql
ROW message = " some text ", color = " red "
| EVAL message = TRIM(message)
| EVAL color = TRIM(color)
```
Supported types:

View file

@ -7,45 +7,69 @@ expression
A boolean expression.
Description
The WHERE processing command produces a table that contains all the rows from
the input table for which the provided condition evaluates to true.
Examples
```esql
FROM employees
| KEEP first_name, last_name, still_hired
| WHERE still_hired == true
```
Which, if still_hired is a boolean field, can be simplified to:
```esql
FROM employees
| KEEP first_name, last_name, still_hired
| WHERE still_hired
```
WHERE supports various functions. For example the
LENGTH function:
```esql
FROM employees
| KEEP first_name, last_name, height
| WHERE LENGTH(first_name) < 4
```
For a complete list of all functions, refer to Functions and operators.
For NULL comparison, use the IS NULL and IS NOT NULL predicates:
```esql
FROM employees
| WHERE birth_date IS NULL
| KEEP first_name, last_name
| SORT first_name
| LIMIT 3
```
```esql
FROM employees
| WHERE is_rehired IS NOT NULL
| STATS COUNT(emp_no)
```
Use LIKE to filter data based on string patterns using wildcards. LIKE
usually acts on a field placed on the left-hand side of the operator, but it can
also act on a constant (literal) expression. The right-hand side of the operator
represents the pattern. The following wildcard characters are supported:
* matches zero or more characters.
? matches one character.
```esql
FROM employees
| WHERE first_name LIKE "?b*"
| KEEP first_name, last_name
```
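To make the two wildcards concrete, here is a minimal TypeScript sketch that translates a LIKE pattern into an anchored regular expression. `likeToRegExp` is a hypothetical helper for illustration; it assumes the whole value must match the pattern and that there is no escape syntax for literal `*` or `?`, neither of which is stated in the text above.

```typescript
// Hypothetical illustration of the LIKE wildcards, not the ES|QL matcher.
// Assumptions: the pattern must match the entire value; no escape syntax.
function likeToRegExp(pattern: string): RegExp {
  const body = Array.from(pattern)
    .map((ch) => {
      if (ch === '*') return '[\\s\\S]*'; // zero or more characters
      if (ch === '?') return '[\\s\\S]'; // exactly one character
      // Escape regex metacharacters so they match literally.
      return ch.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
    })
    .join('');
  return new RegExp(`^${body}$`);
}
```

Under these assumptions, the pattern `"?b*"` from the query above accepts any value whose second character is `b`, such as "Ebbe", and rejects "Bob".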
Use RLIKE to filter data based on string patterns using
regular expressions. RLIKE usually acts on a field placed on
the left-hand side of the operator, but it can also act on a constant (literal)
expression. The right-hand side of the operator represents the pattern.
```esql
FROM employees
| WHERE first_name RLIKE ".leja.*"
| KEEP first_name, last_name
```
The IN operator allows testing whether a field or expression equals
an element in a list of literals, fields or expressions:
```esql
ROW a = 1, b = 4, c = 3
| WHERE c-a IN (3, b / 2, a)
```
For a complete list of all operators, refer to Operators.

View file

@ -72,7 +72,7 @@ export function registerEsqlFunction({
name: 'execute_query',
contexts: ['core'],
visibility: FunctionVisibility.User,
description: 'Execute an ES|QL query',
description: 'Execute an ES|QL query.',
parameters: {
type: 'object',
additionalProperties: false,
@ -129,14 +129,36 @@ export function registerEsqlFunction({
const source$ = streamIntoObservable(
await client.chat({
connectorId,
messages: withEsqlSystemMessage(),
messages: withEsqlSystemMessage(
`Use the classify_esql function to classify the user's request
and get more information about specific functions and commands
you think are candidates for answering the question.
Examples for functions and commands:
Do you need to group data? Request \`STATS\`.
Extract data? Request \`DISSECT\` AND \`GROK\`.
Convert a column based on a set of conditionals? Request \`EVAL\` and \`CASE\`.
Examples for determining whether the user wants to execute a query:
- "Show me the avg of x"
- "Give me the results of y"
- "Display the sum of z"
Examples for determining whether the user does not want to execute a query:
- "I want a query that ..."
- "... Just show me the query"
- "Create a query that ..."`
),
signal,
stream: true,
functions: [
{
name: 'get_esql_info',
description:
'Use this function to get more information about syntax, commands and examples. Take a deep breath and reason about what commands and functions you expect to use. Do you need to group data? Request `STATS`. Extract data? Request `DISSECT` AND `GROK`. Convert a column based on a set of conditionals? Request `EVAL` and `CASE`.',
name: 'classify_esql',
description: `Use this function to determine:
- what ES|QL functions and commands are candidates for answering the user's question
- whether the user has requested a query, and if so, if they want it to be executed, or just shown.
`,
parameters: {
type: 'object',
properties: {
@ -154,12 +176,17 @@ export function registerEsqlFunction({
},
description: 'A list of functions.',
},
execute: {
type: 'boolean',
description:
'Whether the user wants to execute a query (true) or just wants the query to be displayed (false)',
},
},
required: ['commands', 'functions'],
required: ['commands', 'functions', 'execute'],
},
},
],
functionCall: 'get_esql_info',
functionCall: 'classify_esql',
})
).pipe(processOpenAiStream(), concatenateOpenAiChunks());
@ -168,6 +195,7 @@ export function registerEsqlFunction({
const args = JSON.parse(response.message.function_call.arguments) as {
commands: string[];
functions: string[];
execute: boolean;
};
const keywords = args.commands.concat(args.functions).concat('SYNTAX').concat('OVERVIEW');
@ -254,7 +282,6 @@ export function registerEsqlFunction({
},
],
connectorId,
functions: [],
signal,
stream: true,
})
@ -294,6 +321,30 @@ export function registerEsqlFunction({
],
});
}
const esqlQuery = cachedContent.match(/```esql([\s\S]*?)```/)?.[1];
if (esqlQuery && args.execute) {
subscriber.next({
created: 0,
id: '',
model: '',
object: 'chat.completion.chunk',
choices: [
{
delta: {
function_call: {
name: 'execute_query',
arguments: JSON.stringify({ query: esqlQuery }),
},
},
index: 0,
finish_reason: null,
},
],
});
}
subscriber.complete();
},
error: (error) => {
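The extract-then-execute step in the hunk above can be sketched in isolation. This is a simplified stand-in, not the code from the commit: `toExecuteQueryArguments` is a hypothetical name, the streaming `subscriber` is omitted, and the fence marker is built with `repeat` only so the snippet does not contain a literal code fence. Like the original expression, it returns the captured text untrimmed.

```typescript
// Three backticks, built indirectly so this snippet contains no literal fence.
const FENCE = '`'.repeat(3);
const ESQL_BLOCK = new RegExp(FENCE + 'esql([\\s\\S]*?)' + FENCE);

// Returns the body of the first esql code block, or undefined if none exists.
function extractEsqlQuery(content: string): string | undefined {
  return content.match(ESQL_BLOCK)?.[1];
}

// Hypothetical helper: builds the execute_query arguments only when a query
// was found AND the classification step reported execute === true.
function toExecuteQueryArguments(content: string, execute: boolean): string | undefined {
  const query = extractEsqlQuery(content);
  return query && execute ? JSON.stringify({ query }) : undefined;
}
```

Keeping both conditions in one place makes the gate explicit: a reply without an esql block, or a request classified as "just show me the query", produces no function call.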

View file

@ -11,9 +11,8 @@ processing. This source command is then followed by one or more
processing commands, which can transform the data returned by the
previous command.
ES|QL is not Elasticsearch SQL, nor is it anything like SQL. SQL
commands are not available in ES|QL. Make sure you write a query
using ONLY commands specified in this conversation.
Make sure you write a query using ONLY commands specified in this
conversation.
# Syntax
@ -168,6 +167,7 @@ one or more aggregated values over the grouped rows. This commands only
Here are some example queries:
```esql
FROM employees
| WHERE still_hired == true
| EVAL hired = DATE_FORMAT("YYYY", hire_date)
@ -179,34 +179,48 @@ FROM employees
| KEEP avg_salary, lang
| SORT avg_salary ASC
| LIMIT 3
```
```esql
FROM employees
| EVAL trunk_worked_seconds = avg_worked_seconds / 100000000 * 100000000
| STATS c = count(languages.long) BY languages.long, trunk_worked_seconds
| SORT c desc, languages.long, trunk_worked_seconds
```
```esql
ROW a = "2023-01-23T12:15:00.000Z - some text - 127.0.0.1"
| DISSECT a "%{date} - %{msg} - %{ip}"
| KEEP date, msg, ip
| EVAL date = TO_DATETIME(date)
```
```esql
FROM employees
| WHERE first_name LIKE "?b*"
| KEEP first_name, last_name
```
```esql
FROM employees
| WHERE hire_date >= "1985-01-01T00:00:00Z" AND hire_date < "1986-01-01T00:00:00Z"
| EVAL bucket = AUTO_BUCKET(hire_date, 20, "1985-01-01T00:00:00Z", "1986-01-01T00:00:00Z")
| STATS AVG(salary) BY bucket
| SORT bucket
```
```esql
ROW a = 1, b = "two", c = null
```
```esql
FROM employees
| EVAL is_recent_hire = CASE(hire_date <= "2023-01-01T00:00:00Z", 1, 0)
| STATS total_recent_hires = SUM(is_recent_hire), total_hires = COUNT(*) BY country
| EVAL recent_hiring_rate = total_recent_hires / total_hires
```
```esql
FROM logs-*
| WHERE @timestamp <= NOW() - 24 hours
// divide data in 1 hour buckets
@ -217,4 +231,4 @@ FROM logs-*
| STATS total_events = COUNT(*), total_failures = SUM(is_5xx) BY host.hostname, bucket
| EVAL failure_rate_per_host = total_failures / total_events
| DROP total_events, total_failures
```