mirror of
https://github.com/elastic/elasticsearch.git
synced 2025-06-28 17:34:17 -04:00
Restructure query-languages docs files for clarity (#124797)
In a few previous PR's we restructured the ES|QL docs to make it possible to generate them dynamically. This PR just moves a few files around to make the query languages docs easier to work with, and a little more organized like the ES|QL docs. A bit part of this was setting up redirects to the new locations, so other repo's could correctly link to the elasticsearch docs.
This commit is contained in:
parent
8c737fb235
commit
94cad286bc
188 changed files with 1021 additions and 969 deletions
892
docs/reference/query-languages/sql/sql-functions-aggs.md
Normal file
892
docs/reference/query-languages/sql/sql-functions-aggs.md
Normal file
|
@ -0,0 +1,892 @@
|
|||
---
|
||||
mapped_pages:
|
||||
- https://www.elastic.co/guide/en/elasticsearch/reference/current/sql-functions-aggs.html
|
||||
---
|
||||
|
||||
# Aggregate Functions [sql-functions-aggs]
|
||||
|
||||
Functions for computing a *single* result from a set of input values. Elasticsearch SQL supports aggregate functions only alongside [grouping](/reference/query-languages/sql/sql-syntax-select.md#sql-syntax-group-by) (implicit or explicit).
|
||||
|
||||
|
||||
## General Purpose [sql-functions-aggs-general]
|
||||
|
||||
## `AVG` [sql-functions-aggs-avg]
|
||||
|
||||
```sql
|
||||
AVG(numeric_field) <1>
|
||||
```
|
||||
|
||||
**Input**:
|
||||
|
||||
1. numeric field. If this field contains only `null` values, the function returns `null`. Otherwise, the function ignores `null` values in this field.
|
||||
|
||||
|
||||
**Output**: `double` numeric value
|
||||
|
||||
**Description**: Returns the [Average](https://en.wikipedia.org/wiki/Arithmetic_mean) (arithmetic mean) of input values.
|
||||
|
||||
```sql
|
||||
SELECT AVG(salary) AS avg FROM emp;
|
||||
|
||||
avg
|
||||
---------------
|
||||
48248.55
|
||||
```
|
||||
|
||||
```sql
|
||||
SELECT AVG(salary / 12.0) AS avg FROM emp;
|
||||
|
||||
avg
|
||||
---------------
|
||||
4020.7125
|
||||
```
|
||||
|
||||
|
||||
## `COUNT` [sql-functions-aggs-count]
|
||||
|
||||
```sql
|
||||
COUNT(expression) <1>
|
||||
```
|
||||
|
||||
**Input**:
|
||||
|
||||
1. a field name, wildcard (`*`) or any numeric value. For `COUNT(*)` or `COUNT(<literal>)`, all values are considered, including `null` or missing ones. For `COUNT(<field_name>)`, `null` values are not considered.
|
||||
|
||||
|
||||
**Output**: numeric value
|
||||
|
||||
**Description**: Returns the total number (count) of input values.
|
||||
|
||||
```sql
|
||||
SELECT COUNT(*) AS count FROM emp;
|
||||
|
||||
count
|
||||
---------------
|
||||
100
|
||||
```
|
||||
|
||||
|
||||
## `COUNT(ALL)` [sql-functions-aggs-count-all]
|
||||
|
||||
```sql
|
||||
COUNT(ALL field_name) <1>
|
||||
```
|
||||
|
||||
**Input**:
|
||||
|
||||
1. a field name. If this field contains only `null` values, the function returns `null`. Otherwise, the function ignores `null` values in this field.
|
||||
|
||||
|
||||
**Output**: numeric value
|
||||
|
||||
**Description**: Returns the total number (count) of all *non-null* input values. `COUNT(<field_name>)` and `COUNT(ALL <field_name>)` are equivalent.
|
||||
|
||||
```sql
|
||||
SELECT COUNT(ALL last_name) AS count_all, COUNT(DISTINCT last_name) count_distinct FROM emp;
|
||||
|
||||
count_all | count_distinct
|
||||
---------------+------------------
|
||||
100 |96
|
||||
```
|
||||
|
||||
```sql
|
||||
SELECT COUNT(ALL CASE WHEN languages IS NULL THEN -1 ELSE languages END) AS count_all, COUNT(DISTINCT CASE WHEN languages IS NULL THEN -1 ELSE languages END) count_distinct FROM emp;
|
||||
|
||||
count_all | count_distinct
|
||||
---------------+---------------
|
||||
100 |6
|
||||
```
|
||||
|
||||
|
||||
## `COUNT(DISTINCT)` [sql-functions-aggs-count-distinct]
|
||||
|
||||
```sql
|
||||
COUNT(DISTINCT field_name) <1>
|
||||
```
|
||||
|
||||
**Input**:
|
||||
|
||||
1. a field name
|
||||
|
||||
|
||||
**Output**: numeric value. If this field contains only `null` values, the function returns `null`. Otherwise, the function ignores `null` values in this field.
|
||||
|
||||
**Description**: Returns the total number of *distinct non-null* values in input values.
|
||||
|
||||
```sql
|
||||
SELECT COUNT(DISTINCT hire_date) unique_hires, COUNT(hire_date) AS hires FROM emp;
|
||||
|
||||
unique_hires | hires
|
||||
----------------+---------------
|
||||
99 |100
|
||||
```
|
||||
|
||||
```sql
|
||||
SELECT COUNT(DISTINCT DATE_TRUNC('YEAR', hire_date)) unique_hires, COUNT(DATE_TRUNC('YEAR', hire_date)) AS hires FROM emp;
|
||||
|
||||
unique_hires | hires
|
||||
---------------+---------------
|
||||
14 |100
|
||||
```
|
||||
|
||||
|
||||
## `FIRST/FIRST_VALUE` [sql-functions-aggs-first]
|
||||
|
||||
```sql
|
||||
FIRST(
|
||||
field_name <1>
|
||||
[, ordering_field_name]) <2>
|
||||
```
|
||||
|
||||
**Input**:
|
||||
|
||||
1. target field for the aggregation
|
||||
2. optional field used for ordering
|
||||
|
||||
|
||||
**Output**: same type as the input
|
||||
|
||||
**Description**: Returns the first non-`null` value (if such exists) of the `field_name` input column sorted by the `ordering_field_name` column. If `ordering_field_name` is not provided, only the `field_name` column is used for the sorting. E.g.:
|
||||
|
||||
| a | b |
|
||||
| --- | --- |
|
||||
| 100 | 1 |
|
||||
| 200 | 1 |
|
||||
| 1 | 2 |
|
||||
| 2 | 2 |
|
||||
| 10 | null |
|
||||
| 20 | null |
|
||||
| null | null |
|
||||
|
||||
```sql
|
||||
SELECT FIRST(a) FROM t
|
||||
```
|
||||
|
||||
will result in:
|
||||
|
||||
| |
|
||||
| --- |
|
||||
| **FIRST(a)** |
|
||||
| 1 |
|
||||
|
||||
and
|
||||
|
||||
```sql
|
||||
SELECT FIRST(a, b) FROM t
|
||||
```
|
||||
|
||||
will result in:
|
||||
|
||||
| |
|
||||
| --- |
|
||||
| **FIRST(a, b)** |
|
||||
| 100 |
|
||||
|
||||
```sql
|
||||
SELECT FIRST(first_name) FROM emp;
|
||||
|
||||
FIRST(first_name)
|
||||
--------------------
|
||||
Alejandro
|
||||
```
|
||||
|
||||
```sql
|
||||
SELECT gender, FIRST(first_name) FROM emp GROUP BY gender ORDER BY gender;
|
||||
|
||||
gender | FIRST(first_name)
|
||||
------------+--------------------
|
||||
null | Berni
|
||||
F | Alejandro
|
||||
M | Amabile
|
||||
```
|
||||
|
||||
```sql
|
||||
SELECT FIRST(first_name, birth_date) FROM emp;
|
||||
|
||||
FIRST(first_name, birth_date)
|
||||
--------------------------------
|
||||
Remzi
|
||||
```
|
||||
|
||||
```sql
|
||||
SELECT gender, FIRST(first_name, birth_date) FROM emp GROUP BY gender ORDER BY gender;
|
||||
|
||||
gender | FIRST(first_name, birth_date)
|
||||
--------------+--------------------------------
|
||||
null | Lillian
|
||||
F | Sumant
|
||||
M | Remzi
|
||||
```
|
||||
|
||||
`FIRST_VALUE` is a name alias and can be used instead of `FIRST`, e.g.:
|
||||
|
||||
```sql
|
||||
SELECT gender, FIRST_VALUE(first_name, birth_date) FROM emp GROUP BY gender ORDER BY gender;
|
||||
|
||||
gender | FIRST_VALUE(first_name, birth_date)
|
||||
--------------+--------------------------------------
|
||||
null | Lillian
|
||||
F | Sumant
|
||||
M | Remzi
|
||||
```
|
||||
|
||||
```sql
|
||||
SELECT gender, FIRST_VALUE(SUBSTRING(first_name, 2, 6), birth_date) AS "first" FROM emp GROUP BY gender ORDER BY gender;
|
||||
|
||||
gender | first
|
||||
---------------+---------------
|
||||
null |illian
|
||||
F |umant
|
||||
M |emzi
|
||||
```
|
||||
|
||||
::::{note}
|
||||
`FIRST` cannot be used in a HAVING clause.
|
||||
::::
|
||||
|
||||
|
||||
::::{note}
|
||||
`FIRST` cannot be used with columns of type [`text`](/reference/elasticsearch/mapping-reference/text.md) unless the field is also [saved as a keyword](/reference/elasticsearch/mapping-reference/text.md#before-enabling-fielddata).
|
||||
::::
|
||||
|
||||
|
||||
|
||||
## `LAST/LAST_VALUE` [sql-functions-aggs-last]
|
||||
|
||||
```sql
|
||||
LAST(
|
||||
field_name <1>
|
||||
[, ordering_field_name]) <2>
|
||||
```
|
||||
|
||||
**Input**:
|
||||
|
||||
1. target field for the aggregation
|
||||
2. optional field used for ordering
|
||||
|
||||
|
||||
**Output**: same type as the input
|
||||
|
||||
**Description**: It’s the inverse of [`FIRST/FIRST_VALUE`](#sql-functions-aggs-first). Returns the last non-`null` value (if such exists) of the `field_name` input column sorted descending by the `ordering_field_name` column. If `ordering_field_name` is not provided, only the `field_name` column is used for the sorting. E.g.:
|
||||
|
||||
| a | b |
|
||||
| --- | --- |
|
||||
| 10 | 1 |
|
||||
| 20 | 1 |
|
||||
| 1 | 2 |
|
||||
| 2 | 2 |
|
||||
| 100 | null |
|
||||
| 200 | null |
|
||||
| null | null |
|
||||
|
||||
```sql
|
||||
SELECT LAST(a) FROM t
|
||||
```
|
||||
|
||||
will result in:
|
||||
|
||||
| |
|
||||
| --- |
|
||||
| **LAST(a)** |
|
||||
| 200 |
|
||||
|
||||
and
|
||||
|
||||
```sql
|
||||
SELECT LAST(a, b) FROM t
|
||||
```
|
||||
|
||||
will result in:
|
||||
|
||||
| |
|
||||
| --- |
|
||||
| **LAST(a, b)** |
|
||||
| 2 |
|
||||
|
||||
```sql
|
||||
SELECT LAST(first_name) FROM emp;
|
||||
|
||||
LAST(first_name)
|
||||
-------------------
|
||||
Zvonko
|
||||
```
|
||||
|
||||
```sql
|
||||
SELECT gender, LAST(first_name) FROM emp GROUP BY gender ORDER BY gender;
|
||||
|
||||
gender | LAST(first_name)
|
||||
------------+-------------------
|
||||
null | Patricio
|
||||
F | Xinglin
|
||||
M | Zvonko
|
||||
```
|
||||
|
||||
```sql
|
||||
SELECT LAST(first_name, birth_date) FROM emp;
|
||||
|
||||
LAST(first_name, birth_date)
|
||||
-------------------------------
|
||||
Hilari
|
||||
```
|
||||
|
||||
```sql
|
||||
SELECT gender, LAST(first_name, birth_date) FROM emp GROUP BY gender ORDER BY gender;
|
||||
|
||||
gender | LAST(first_name, birth_date)
|
||||
-----------+-------------------------------
|
||||
null | Eberhardt
|
||||
F | Valdiodio
|
||||
M | Hilari
|
||||
```
|
||||
|
||||
`LAST_VALUE` is a name alias and can be used instead of `LAST`, e.g.:
|
||||
|
||||
```sql
|
||||
SELECT gender, LAST_VALUE(first_name, birth_date) FROM emp GROUP BY gender ORDER BY gender;
|
||||
|
||||
gender | LAST_VALUE(first_name, birth_date)
|
||||
-----------+-------------------------------------
|
||||
null | Eberhardt
|
||||
F | Valdiodio
|
||||
M | Hilari
|
||||
```
|
||||
|
||||
```sql
|
||||
SELECT gender, LAST_VALUE(SUBSTRING(first_name, 3, 8), birth_date) AS "last" FROM emp GROUP BY gender ORDER BY gender;
|
||||
|
||||
gender | last
|
||||
---------------+---------------
|
||||
null |erhardt
|
||||
F |ldiodio
|
||||
M |lari
|
||||
```
|
||||
|
||||
::::{note}
|
||||
`LAST` cannot be used in `HAVING` clause.
|
||||
::::
|
||||
|
||||
|
||||
::::{note}
|
||||
`LAST` cannot be used with columns of type [`text`](/reference/elasticsearch/mapping-reference/text.md) unless the field is also [`saved as a keyword`](/reference/elasticsearch/mapping-reference/text.md#before-enabling-fielddata).
|
||||
::::
|
||||
|
||||
|
||||
|
||||
## `MAX` [sql-functions-aggs-max]
|
||||
|
||||
```sql
|
||||
MAX(field_name) <1>
|
||||
```
|
||||
|
||||
**Input**:
|
||||
|
||||
1. a numeric field. If this field contains only `null` values, the function returns `null`. Otherwise, the function ignores `null` values in this field.
|
||||
|
||||
|
||||
**Output**: same type as the input
|
||||
|
||||
**Description**: Returns the maximum value across input values in the field `field_name`.
|
||||
|
||||
```sql
|
||||
SELECT MAX(salary) AS max FROM emp;
|
||||
|
||||
max
|
||||
---------------
|
||||
74999
|
||||
```
|
||||
|
||||
```sql
|
||||
SELECT MAX(ABS(salary / -12.0)) AS max FROM emp;
|
||||
|
||||
max
|
||||
-----------------
|
||||
6249.916666666667
|
||||
```
|
||||
|
||||
::::{note}
|
||||
`MAX` on a field of type [`text`](/reference/elasticsearch/mapping-reference/text.md) or [`keyword`](/reference/elasticsearch/mapping-reference/keyword.md) is translated into [`LAST/LAST_VALUE`](#sql-functions-aggs-last) and therefore, it cannot be used in `HAVING` clause.
|
||||
::::
|
||||
|
||||
|
||||
|
||||
## `MIN` [sql-functions-aggs-min]
|
||||
|
||||
```sql
|
||||
MIN(field_name) <1>
|
||||
```
|
||||
|
||||
**Input**:
|
||||
|
||||
1. a numeric field. If this field contains only `null` values, the function returns `null`. Otherwise, the function ignores `null` values in this field.
|
||||
|
||||
|
||||
**Output**: same type as the input
|
||||
|
||||
**Description**: Returns the minimum value across input values in the field `field_name`.
|
||||
|
||||
```sql
|
||||
SELECT MIN(salary) AS min FROM emp;
|
||||
|
||||
min
|
||||
---------------
|
||||
25324
|
||||
```
|
||||
|
||||
::::{note}
|
||||
`MIN` on a field of type [`text`](/reference/elasticsearch/mapping-reference/text.md) or [`keyword`](/reference/elasticsearch/mapping-reference/keyword.md) is translated into [`FIRST/FIRST_VALUE`](#sql-functions-aggs-first) and therefore, it cannot be used in `HAVING` clause.
|
||||
::::
|
||||
|
||||
|
||||
|
||||
## `SUM` [sql-functions-aggs-sum]
|
||||
|
||||
```sql
|
||||
SUM(field_name) <1>
|
||||
```
|
||||
|
||||
**Input**:
|
||||
|
||||
1. a numeric field. If this field contains only `null` values, the function returns `null`. Otherwise, the function ignores `null` values in this field.
|
||||
|
||||
|
||||
**Output**: `bigint` for integer input, `double` for floating points
|
||||
|
||||
**Description**: Returns the sum of input values in the field `field_name`.
|
||||
|
||||
```sql
|
||||
SELECT SUM(salary) AS sum FROM emp;
|
||||
|
||||
sum
|
||||
---------------
|
||||
4824855
|
||||
```
|
||||
|
||||
```sql
|
||||
SELECT ROUND(SUM(salary / 12.0), 1) AS sum FROM emp;
|
||||
|
||||
sum
|
||||
---------------
|
||||
402071.3
|
||||
```
|
||||
|
||||
|
||||
## Statistics [sql-functions-aggs-statistics]
|
||||
|
||||
|
||||
## `KURTOSIS` [sql-functions-aggs-kurtosis]
|
||||
|
||||
```sql
|
||||
KURTOSIS(field_name) <1>
|
||||
```
|
||||
|
||||
**Input**:
|
||||
|
||||
1. a numeric field. If this field contains only `null` values, the function returns `null`. Otherwise, the function ignores `null` values in this field.
|
||||
|
||||
|
||||
**Output**: `double` numeric value
|
||||
|
||||
**Description**:
|
||||
|
||||
[Quantify](https://en.wikipedia.org/wiki/Kurtosis) the shape of the distribution of input values in the field `field_name`.
|
||||
|
||||
```sql
|
||||
SELECT MIN(salary) AS min, MAX(salary) AS max, KURTOSIS(salary) AS k FROM emp;
|
||||
|
||||
min | max | k
|
||||
---------------+---------------+------------------
|
||||
25324 |74999 |2.0444718929142986
|
||||
```
|
||||
|
||||
::::{note}
|
||||
`KURTOSIS` cannot be used on top of scalar functions or operators but only directly on a field. So, for example, the following is not allowed and an error is returned:
|
||||
|
||||
```sql
|
||||
SELECT KURTOSIS(salary / 12.0), gender FROM emp GROUP BY gender
|
||||
```
|
||||
|
||||
::::
|
||||
|
||||
|
||||
|
||||
## `MAD` [sql-functions-aggs-mad]
|
||||
|
||||
```sql
|
||||
MAD(field_name) <1>
|
||||
```
|
||||
|
||||
**Input**:
|
||||
|
||||
1. a numeric field. If this field contains only `null` values, the function returns `null`. Otherwise, the function ignores `null` values in this field.
|
||||
|
||||
|
||||
**Output**: `double` numeric value
|
||||
|
||||
**Description**:
|
||||
|
||||
[Measure](https://en.wikipedia.org/wiki/Median_absolute_deviation) the variability of the input values in the field `field_name`.
|
||||
|
||||
```sql
|
||||
SELECT MIN(salary) AS min, MAX(salary) AS max, AVG(salary) AS avg, MAD(salary) AS mad FROM emp;
|
||||
|
||||
min | max | avg | mad
|
||||
---------------+---------------+---------------+---------------
|
||||
25324 |74999 |48248.55 |10096.5
|
||||
```
|
||||
|
||||
```sql
|
||||
SELECT MIN(salary / 12.0) AS min, MAX(salary / 12.0) AS max, AVG(salary/ 12.0) AS avg, MAD(salary / 12.0) AS mad FROM emp;
|
||||
|
||||
min | max | avg | mad
|
||||
------------------+-----------------+---------------+-----------------
|
||||
2110.3333333333335|6249.916666666667|4020.7125 |841.3750000000002
|
||||
```
|
||||
|
||||
|
||||
## `PERCENTILE` [sql-functions-aggs-percentile]
|
||||
|
||||
```sql
|
||||
PERCENTILE(
|
||||
field_name, <1>
|
||||
percentile[, <2>
|
||||
method[, <3>
|
||||
method_parameter]]) <4>
|
||||
```
|
||||
|
||||
**Input**:
|
||||
|
||||
1. a numeric field. If this field contains only `null` values, the function returns `null`. Otherwise, the function ignores `null` values in this field.
|
||||
2. a numeric expression (must be a constant and not based on a field). If `null`, the function returns `null`.
|
||||
3. optional string literal for the [percentile algorithm](/reference/data-analysis/aggregations/search-aggregations-metrics-percentile-aggregation.md#search-aggregations-metrics-percentile-aggregation-approximation). Possible values: `tdigest` or `hdr`. Defaults to `tdigest`.
|
||||
4. optional numeric literal that configures the [percentile algorithm](/reference/data-analysis/aggregations/search-aggregations-metrics-percentile-aggregation.md#search-aggregations-metrics-percentile-aggregation-approximation). Configures `compression` for `tdigest` or `number_of_significant_value_digits` for `hdr`. The default is the same as that of the backing algorithm.
|
||||
|
||||
|
||||
**Output**: `double` numeric value
|
||||
|
||||
**Description**:
|
||||
|
||||
Returns the nth [percentile](https://en.wikipedia.org/wiki/Percentile) (represented by `numeric_exp` parameter) of input values in the field `field_name`.
|
||||
|
||||
```sql
|
||||
SELECT languages, PERCENTILE(salary, 95) AS "95th" FROM emp
|
||||
GROUP BY languages;
|
||||
|
||||
languages | 95th
|
||||
---------------+-----------------
|
||||
null |74482.4
|
||||
1 |71122.8
|
||||
2 |70271.4
|
||||
3 |71926.0
|
||||
4 |69352.15
|
||||
5 |56371.0
|
||||
```
|
||||
|
||||
```sql
|
||||
SELECT languages, PERCENTILE(salary / 12.0, 95) AS "95th" FROM emp
|
||||
GROUP BY languages;
|
||||
|
||||
languages | 95th
|
||||
---------------+------------------
|
||||
null |6206.866666666667
|
||||
1 |5926.9
|
||||
2 |5855.949999999999
|
||||
3 |5993.833333333333
|
||||
4 |5779.345833333333
|
||||
5 |4697.583333333333
|
||||
```
|
||||
|
||||
```sql
|
||||
SELECT
|
||||
languages,
|
||||
PERCENTILE(salary, 97.3, 'tdigest', 100.0) AS "97.3_TDigest",
|
||||
PERCENTILE(salary, 97.3, 'hdr', 3) AS "97.3_HDR"
|
||||
FROM emp
|
||||
GROUP BY languages;
|
||||
|
||||
languages | 97.3_TDigest | 97.3_HDR
|
||||
---------------+-----------------+---------------
|
||||
null |74720.036 |74992.0
|
||||
1 |72316.132 |73712.0
|
||||
2 |71792.436 |69936.0
|
||||
3 |73326.23999999999|74992.0
|
||||
4 |71753.281 |74608.0
|
||||
5 |61176.16000000001|56368.0
|
||||
```
|
||||
|
||||
|
||||
## `PERCENTILE_RANK` [sql-functions-aggs-percentile-rank]
|
||||
|
||||
```sql
|
||||
PERCENTILE_RANK(
|
||||
field_name, <1>
|
||||
value[, <2>
|
||||
method[, <3>
|
||||
method_parameter]]) <4>
|
||||
```
|
||||
|
||||
**Input**:
|
||||
|
||||
1. a numeric field. If this field contains only `null` values, the function returns `null`. Otherwise, the function ignores `null` values in this field.
|
||||
2. a numeric expression (must be a constant and not based on a field). If `null`, the function returns `null`.
|
||||
3. optional string literal for the [percentile algorithm](/reference/data-analysis/aggregations/search-aggregations-metrics-percentile-aggregation.md#search-aggregations-metrics-percentile-aggregation-approximation). Possible values: `tdigest` or `hdr`. Defaults to `tdigest`.
|
||||
4. optional numeric literal that configures the [percentile algorithm](/reference/data-analysis/aggregations/search-aggregations-metrics-percentile-aggregation.md#search-aggregations-metrics-percentile-aggregation-approximation). Configures `compression` for `tdigest` or `number_of_significant_value_digits` for `hdr`. The default is the same as that of the backing algorithm.
|
||||
|
||||
|
||||
**Output**: `double` numeric value
|
||||
|
||||
**Description**:
|
||||
|
||||
Returns the nth [percentile rank](https://en.wikipedia.org/wiki/Percentile_rank) (represented by `numeric_exp` parameter) of input values in the field `field_name`.
|
||||
|
||||
```sql
|
||||
SELECT languages, PERCENTILE_RANK(salary, 65000) AS rank FROM emp GROUP BY languages;
|
||||
|
||||
languages | rank
|
||||
---------------+-----------------
|
||||
null |73.65766569962062
|
||||
1 |73.7291625157734
|
||||
2 |88.88005607010643
|
||||
3 |79.43662623295829
|
||||
4 |85.70446389643493
|
||||
5 |96.79075152940749
|
||||
```
|
||||
|
||||
```sql
|
||||
SELECT languages, PERCENTILE_RANK(salary/12, 5000) AS rank FROM emp GROUP BY languages;
|
||||
|
||||
languages | rank
|
||||
---------------+------------------
|
||||
null |66.91240875912409
|
||||
1 |66.70766707667076
|
||||
2 |84.13266895048271
|
||||
3 |61.052992625621684
|
||||
4 |76.55646443990001
|
||||
5 |94.00696864111498
|
||||
```
|
||||
|
||||
```sql
|
||||
SELECT
|
||||
languages,
|
||||
ROUND(PERCENTILE_RANK(salary, 65000, 'tdigest', 100.0), 2) AS "rank_TDigest",
|
||||
ROUND(PERCENTILE_RANK(salary, 65000, 'hdr', 3), 2) AS "rank_HDR"
|
||||
FROM emp
|
||||
GROUP BY languages;
|
||||
|
||||
languages | rank_TDigest | rank_HDR
|
||||
---------------+---------------+---------------
|
||||
null |73.66 |80.0
|
||||
1 |73.73 |73.33
|
||||
2 |88.88 |89.47
|
||||
3 |79.44 |76.47
|
||||
4 |85.7 |83.33
|
||||
5 |96.79 |95.24
|
||||
```
|
||||
|
||||
|
||||
## `SKEWNESS` [sql-functions-aggs-skewness]
|
||||
|
||||
```sql
|
||||
SKEWNESS(field_name) <1>
|
||||
```
|
||||
|
||||
**Input**:
|
||||
|
||||
1. a numeric field. If this field contains only `null` values, the function returns `null`. Otherwise, the function ignores `null` values in this field.
|
||||
|
||||
|
||||
**Output**: `double` numeric value
|
||||
|
||||
**Description**:
|
||||
|
||||
[Quantify](https://en.wikipedia.org/wiki/Skewness) the asymmetric distribution of input values in the field `field_name`.
|
||||
|
||||
```sql
|
||||
SELECT MIN(salary) AS min, MAX(salary) AS max, SKEWNESS(salary) AS s FROM emp;
|
||||
|
||||
min | max | s
|
||||
---------------+---------------+------------------
|
||||
25324 |74999 |0.2707722118423227
|
||||
```
|
||||
|
||||
::::{note}
|
||||
`SKEWNESS` cannot be used on top of scalar functions but only directly on a field. So, for example, the following is not allowed and an error is returned:
|
||||
|
||||
```sql
|
||||
SELECT SKEWNESS(ROUND(salary / 12.0, 2), gender FROM emp GROUP BY gender
|
||||
```
|
||||
|
||||
::::
|
||||
|
||||
|
||||
|
||||
## `STDDEV_POP` [sql-functions-aggs-stddev-pop]
|
||||
|
||||
```sql
|
||||
STDDEV_POP(field_name) <1>
|
||||
```
|
||||
|
||||
**Input**:
|
||||
|
||||
1. a numeric field. If this field contains only `null` values, the function returns `null`. Otherwise, the function ignores `null` values in this field.
|
||||
|
||||
|
||||
**Output**: `double` numeric value
|
||||
|
||||
**Description**:
|
||||
|
||||
Returns the [population standard deviation](https://en.wikipedia.org/wiki/Standard_deviations) of input values in the field `field_name`.
|
||||
|
||||
```sql
|
||||
SELECT MIN(salary) AS min, MAX(salary) AS max, STDDEV_POP(salary) AS stddev FROM emp;
|
||||
|
||||
min | max | stddev
|
||||
---------------+---------------+------------------
|
||||
25324 |74999 |13765.125502787832
|
||||
```
|
||||
|
||||
```sql
|
||||
SELECT MIN(salary / 12.0) AS min, MAX(salary / 12.0) AS max, STDDEV_POP(salary / 12.0) AS stddev FROM emp;
|
||||
|
||||
min | max | stddev
|
||||
------------------+-----------------+-----------------
|
||||
2110.3333333333335|6249.916666666667|1147.093791898986
|
||||
```
|
||||
|
||||
|
||||
## `STDDEV_SAMP` [sql-functions-aggs-stddev-samp]
|
||||
|
||||
```sql
|
||||
STDDEV_SAMP(field_name) <1>
|
||||
```
|
||||
|
||||
**Input**:
|
||||
|
||||
1. a numeric field. If this field contains only `null` values, the function returns `null`. Otherwise, the function ignores `null` values in this field.
|
||||
|
||||
|
||||
**Output**: `double` numeric value
|
||||
|
||||
**Description**:
|
||||
|
||||
Returns the [sample standard deviation](https://en.wikipedia.org/wiki/Standard_deviations) of input values in the field `field_name`.
|
||||
|
||||
```sql
|
||||
SELECT MIN(salary) AS min, MAX(salary) AS max, STDDEV_SAMP(salary) AS stddev FROM emp;
|
||||
|
||||
min | max | stddev
|
||||
---------------+---------------+------------------
|
||||
25324 |74999 |13834.471662090747
|
||||
```
|
||||
|
||||
```sql
|
||||
SELECT MIN(salary / 12.0) AS min, MAX(salary / 12.0) AS max, STDDEV_SAMP(salary / 12.0) AS stddev FROM emp;
|
||||
|
||||
min | max | stddev
|
||||
------------------+-----------------+-----------------
|
||||
2110.3333333333335|6249.916666666667|1152.872638507562
|
||||
```
|
||||
|
||||
|
||||
## `SUM_OF_SQUARES` [sql-functions-aggs-sum-squares]
|
||||
|
||||
```sql
|
||||
SUM_OF_SQUARES(field_name) <1>
|
||||
```
|
||||
|
||||
**Input**:
|
||||
|
||||
1. a numeric field. If this field contains only `null` values, the function returns `null`. Otherwise, the function ignores `null` values in this field.
|
||||
|
||||
|
||||
**Output**: `double` numeric value
|
||||
|
||||
**Description**:
|
||||
|
||||
Returns the sum of squares of input values in the field `field_name`.
|
||||
|
||||
```sql
|
||||
SELECT MIN(salary) AS min, MAX(salary) AS max, SUM_OF_SQUARES(salary) AS sumsq
|
||||
FROM emp;
|
||||
|
||||
min | max | sumsq
|
||||
---------------+---------------+----------------
|
||||
25324 |74999 |2.51740125721E11
|
||||
```
|
||||
|
||||
```sql
|
||||
SELECT MIN(salary / 24.0) AS min, MAX(salary / 24.0) AS max, SUM_OF_SQUARES(salary / 24.0) AS sumsq FROM emp;
|
||||
|
||||
min | max | sumsq
|
||||
------------------+------------------+-------------------
|
||||
1055.1666666666667|3124.9583333333335|4.370488293767361E8
|
||||
```
|
||||
|
||||
|
||||
## `VAR_POP` [sql-functions-aggs-var-pop]
|
||||
|
||||
```sql
|
||||
VAR_POP(field_name) <1>
|
||||
```
|
||||
|
||||
**Input**:
|
||||
|
||||
1. a numeric field. If this field contains only `null` values, the function returns `null`. Otherwise, the function ignores `null` values in this field.
|
||||
|
||||
|
||||
**Output**: `double` numeric value
|
||||
|
||||
**Description**:
|
||||
|
||||
Returns the [population variance](https://en.wikipedia.org/wiki/Variance) of input values in the field `field_name`.
|
||||
|
||||
```sql
|
||||
SELECT MIN(salary) AS min, MAX(salary) AS max, VAR_POP(salary) AS varpop FROM emp;
|
||||
|
||||
min | max | varpop
|
||||
---------------+---------------+----------------
|
||||
25324 |74999 |1.894786801075E8
|
||||
```
|
||||
|
||||
```sql
|
||||
SELECT MIN(salary / 24.0) AS min, MAX(salary / 24.0) AS max, VAR_POP(salary / 24.0) AS varpop FROM emp;
|
||||
|
||||
min | max | varpop
|
||||
------------------+------------------+------------------
|
||||
1055.1666666666667|3124.9583333333335|328956.04185329855
|
||||
```
|
||||
|
||||
|
||||
## `VAR_SAMP` [sql-functions-aggs-var-samp]
|
||||
|
||||
```sql
|
||||
VAR_SAMP(field_name) <1>
|
||||
```
|
||||
|
||||
**Input**:
|
||||
|
||||
1. a numeric field. If this field contains only `null` values, the function returns `null`. Otherwise, the function ignores `null` values in this field.
|
||||
|
||||
|
||||
**Output**: `double` numeric value
|
||||
|
||||
**Description**:
|
||||
|
||||
Returns the [sample variance](https://en.wikipedia.org/wiki/Variance) of input values in the field `field_name`.
|
||||
|
||||
```sql
|
||||
SELECT MIN(salary) AS min, MAX(salary) AS max, VAR_SAMP(salary) AS varsamp FROM emp;
|
||||
|
||||
min | max | varsamp
|
||||
---------------+---------------+----------------
|
||||
25324 |74999 |1.913926061691E8
|
||||
```
|
||||
|
||||
```sql
|
||||
SELECT MIN(salary / 24.0) AS min, MAX(salary / 24.0) AS max, VAR_SAMP(salary / 24.0) AS varsamp FROM emp;
|
||||
|
||||
min | max | varsamp
|
||||
------------------+------------------+----------------
|
||||
1055.1666666666667|3124.9583333333335|332278.830154847
|
||||
```
|
||||
|
||||
|
Loading…
Add table
Add a link
Reference in a new issue