elasticsearch/docs/reference/esql
Sylvain Wallez e78bdc953a
ESQL: add Arrow dataframes output format (#109873)
Initial support for Apache Arrow's streaming format as a response format for ES|QL. It is selected by the Accept header or the format request parameter.
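
As an illustration of the two ways to select the format, here is a minimal client-side sketch. It only builds the HTTP request and does not send it. The `/_query` endpoint, the `arrow` format name and the `application/vnd.apache.arrow.stream` media type are assumptions taken from the ES|QL REST documentation rather than from this commit message, and `build_esql_request` is a hypothetical helper name.

```python
import json
import urllib.parse
import urllib.request

# Assumed media type for Arrow's IPC streaming format.
ARROW_STREAM_MIME = "application/vnd.apache.arrow.stream"


def build_esql_request(base_url, query, via="header"):
    """Build (but do not send) an ES|QL request that asks for Arrow output.

    via="header" selects the format with the Accept header,
    via="param" selects it with the format request parameter.
    """
    url = base_url.rstrip("/") + "/_query"
    headers = {"Content-Type": "application/json"}
    if via == "header":
        headers["Accept"] = ARROW_STREAM_MIME
    elif via == "param":
        url += "?" + urllib.parse.urlencode({"format": "arrow"})
    else:
        raise ValueError("via must be 'header' or 'param'")
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Hypothetical cluster address and index name, for illustration only.
req = build_esql_request("http://localhost:9200", "FROM logs | LIMIT 10")
```

A client with `pyarrow` installed could then send the request and hand the response bytes to `pyarrow.ipc.open_stream` to obtain record batches.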

Arrow has implementations in every mainstream language and is a backend of the Python Pandas library, which is extremely popular among data scientists and data analysts. Arrow's streaming format has also become the de facto standard for dataframe interchange. It is an efficient binary format that allows zero-cost deserialization by adding data access wrappers on top of memory buffers received from the network.

This PR builds on the experiment made by @nik9000 in PR #104877.

Features/limitations:
- all ES|QL data types are supported
- multi-valued fields are not supported
- fields of type _source are output as JSON text in a varchar array. In a future iteration we may want to offer the choice of the more efficient CBOR and SMILE formats.

Technical details:

Arrow comes with its own memory management to handle vectors with direct memory, reference counting, etc. We don't want to use this as it conflicts with Elasticsearch's own memory management.

We therefore use the Arrow library only for the metadata objects describing the dataframe schema and the structure of the streaming format. The Arrow vector data is produced directly from ES|QL blocks.
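
To give a feel for what producing vector data directly involves, the following sketch lays out a nullable 64-bit integer column the way the Arrow columnar format describes it: a validity bitmap (least-significant bit first, 1 meaning "valid") plus a buffer of little-endian values. This is a Python illustration under those assumptions, not the Java code in this PR; `encode_int64_column` is a hypothetical name, and the 8-byte buffer padding required by the IPC format is left out for brevity.

```python
import struct


def encode_int64_column(values):
    """Return (validity_bitmap, data_buffer, null_count) for ints or None."""
    bitmap = bytearray((len(values) + 7) // 8)  # one bit per slot
    data = bytearray()
    null_count = 0
    for i, value in enumerate(values):
        if value is None:
            null_count += 1
            # The slot of a null is undefined in Arrow; zero-fill it here.
            data += struct.pack("<q", 0)
        else:
            bitmap[i // 8] |= 1 << (i % 8)  # mark slot i as valid
            data += struct.pack("<q", value)
    return bytes(bitmap), bytes(data), null_count


validity, data, nulls = encode_int64_column([1, None, 3])
```

Here slots 0 and 2 are valid, so the single bitmap byte is `0b00000101`, and the data buffer holds three 8-byte values.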

---------

Co-authored-by: Nik Everett <nik9000@gmail.com>
2024-07-03 10:29:57 +02:00
functions [ES|QL] weighted_avg (#109993) 2024-07-02 18:29:02 -04:00
processing-commands ESQL: Run LOOKUP docs test only in SNAPSHOT (#109493) 2024-06-11 23:27:22 +10:00
source-commands ESQL: change from quoting from backtick to quote (#108395) 2024-06-30 20:01:31 +03:00
esql-across-clusters.asciidoc [DOCS][ESQL][8.14] Add API key based security model info for ESQL CCS (#109155) 2024-06-03 18:44:33 +02:00
esql-apis.asciidoc Add ES|QL async query api docs (#104054) 2024-01-09 09:17:02 +00:00
esql-async-query-api.asciidoc Remove esql version from docs (#108933) 2024-05-23 10:36:15 -04:00
esql-async-query-delete-api.asciidoc Add ES|QL async query api docs (#104054) 2024-01-09 09:17:02 +00:00
esql-async-query-get-api.asciidoc Add ES|QL async query api docs (#104054) 2024-01-09 09:17:02 +00:00
esql-commands.asciidoc ESQL: Implement LOOKUP, an "inline" enrich (#107987) 2024-06-07 11:38:51 +10:00
esql-enrich-data.asciidoc [DOCS] Refactor book-scoped variables in docs/reference/index.asciidoc (#107413) 2024-04-17 14:37:07 +02:00
esql-examples.asciidoc [DOCS] Small ES|QL improvements (#101877) 2023-11-07 17:24:59 +01:00
esql-functions-operators.asciidoc Introduce an IP functions group (#108304) 2024-05-06 13:43:30 +02:00
esql-get-started.asciidoc Revert "[DOCS] Remove ESQL demo env link from 8.14+ (#109562)" (#109579) 2024-06-11 17:04:37 +02:00
esql-kibana.asciidoc [DOCS] Update Using ESQL in Kibana doc (#108715) 2024-05-17 12:36:04 +02:00
esql-language.asciidoc ESQL: Remove OPTIONS clause in FROM command (#108692) 2024-05-15 18:15:02 -04:00
esql-limitations.asciidoc ESQL: Fix error on sorting unsortable geo_point and cartesian_point (#106351) 2024-03-14 23:16:53 +01:00
esql-process-data-with-dissect-grok.asciidoc [DOCS] Empty keys for ES|QL DISSECT (#102632) 2023-12-11 11:23:27 +01:00
esql-query-api.asciidoc ESQL: Implement LOOKUP, an "inline" enrich (#107987) 2024-06-07 11:38:51 +10:00
esql-rest.asciidoc ESQL: add Arrow dataframes output format (#109873) 2024-07-03 10:29:57 +02:00
esql-security-solution.asciidoc Update esql-security-solution.asciidoc (#104531) 2024-01-18 15:48:43 +01:00
esql-syntax.asciidoc ESQL: Add more time span units (#108300) 2024-05-08 08:51:02 -04:00
esql-using.asciidoc Remove esql version from docs (#108933) 2024-05-23 10:36:15 -04:00
implicit-casting.asciidoc [DOCS] ES|QL implicit casting (#108618) 2024-05-15 09:07:09 -04:00
index.asciidoc [DOCS] ESQL goes GA (#108342) 2024-05-07 14:12:50 +02:00
metadata-fields.asciidoc Reapply "ESQL: Expose "_ignored" metadata field" (#108864) (#108871) 2024-05-22 07:06:04 -04:00
multivalued-fields.asciidoc Remove esql version from docs (#108933) 2024-05-23 10:36:15 -04:00
task-management.asciidoc [DOCS] One more round of restructuring the ES|QL documentation (#101340) 2023-10-26 10:57:05 +02:00