kibana/packages/kbn-esql-ast

ES|QL utility library

Folder structure

This library provides the foundational data structures that enable advanced editor features for ES|QL, such as validation, autocomplete, and hover. The package is structured as follows:

src
  |- antlr                      // => contains the ES|QL grammar files and various compilation assets
  | ast_factory.ts              // => binding to the Antlr that generates the AST data structure
  | ast_errors.ts               // => error translation utility from raw Antlr to something understandable (somewhat)
  | antlr_error_listener.ts     // => The ES|QL syntax error listener
  | antlr_facade.ts             // => getParser and getLexer utilities
  | ...                         // => miscellaneous utilities to work with the AST

Basic usage

Get AST from a query string

This module contains all the logic to translate a query string into the AST data structure. The getAstAndSyntaxErrors function returns the AST; if the query contains a syntax error, the errors array is populated with the corresponding syntax errors.

Usage
import { getAstAndSyntaxErrors } from '@kbn/esql-ast';

const queryString = "from index | stats 1 + avg(myColumn) ";
const { ast, errors } = await getAstAndSyntaxErrors(queryString);

// errors is an array, so check its length rather than its truthiness
if (errors.length > 0) {
  console.log({ syntaxErrors: errors });
}
// do stuff with the ast

How does it work

The general idea of this package is to provide all ES|QL features on top of a custom, compact AST definition (all data structure types are defined in ./types.ts), which is designed to be resilient to many grammar changes. The pipeline is the following:

Antlr grammar files
=> Compiled grammar files (.ts assets in the antlr folder)
=> AST Factory (Antlr Parser tree => custom AST)

Each feature function works with a combination of the AST and the definition files: the former describes the current statement in an easy-to-traverse way, while the definitions describe the expected behaviour of each node in the AST (e.g. which arguments it should accept, and how many). While updating the AST requires the grammar to be recompiled, definitions are static files that can be updated without running the ANTLR compile task.
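As a rough illustration of how a feature function might combine an AST node with a definition, here is a minimal sketch. The FunctionNode and FunctionDefinition shapes, the definitions table and the checkArity helper are all hypothetical and much simpler than the real types in ./types.ts and the definition files.

```typescript
// Hypothetical, simplified shapes: the real types live in ./types.ts
// and in the definition files of the package.
interface FunctionNode {
  type: 'function';
  name: string;
  args: unknown[];
}

interface FunctionDefinition {
  name: string;
  minArgs: number;
  maxArgs: number;
}

// Static definitions (assumed values): these can change without recompiling the grammar.
const definitions: Record<string, FunctionDefinition | undefined> = {
  avg: { name: 'avg', minArgs: 1, maxArgs: 1 },
  round: { name: 'round', minArgs: 1, maxArgs: 2 },
};

// Compares what the AST node contains with what the definition allows.
function checkArity(node: FunctionNode): string[] {
  const definition = definitions[node.name];
  if (!definition) {
    return [`Unknown function: ${node.name}`];
  }
  const count = node.args.length;
  if (count < definition.minArgs || count > definition.maxArgs) {
    return [
      `${node.name} expects ${definition.minArgs} to ${definition.maxArgs} argument(s), got ${count}`,
    ];
  }
  return [];
}
```

In this sketch the AST node only says what the query contains, while the definition says what is allowed, so changing the allowed argument count means editing the definitions table only.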

AST

The AST is generated by two files: ast_factory.ts and its companion ast_walker.ts:

  • ast_factory.ts is a binding to Antlr and accesses the Parser tree
  • The Parser tree is passed over to ast_walker.ts, which appends new AST nodes

In general, Antlr is resilient to grammar errors, in the sense that it can produce a Parser tree up to the point of the error and then stops. This makes it possible to perform partial tasks even on broken queries, because a partial AST can be produced even for an invalid query.
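The idea of a partial AST can be illustrated with a toy example. This is not the Antlr-based parser used by this package: parsePartially, ToyCommand and the KNOWN_COMMANDS list are hypothetical names, and the "grammar" is just a pipe-separated list of known command names. It only shows the contract of returning every node built before the error together with the error itself.

```typescript
// Toy illustration only: not the Antlr-based parser used by this package.
interface ToyCommand {
  type: 'command';
  name: string;
  text: string;
}

interface ToyParseResult {
  ast: ToyCommand[];
  errors: string[];
}

// Assumed command list, just for the sketch.
const KNOWN_COMMANDS = new Set(['from', 'where', 'stats', 'limit']);

function parsePartially(query: string): ToyParseResult {
  const ast: ToyCommand[] = [];
  const errors: string[] = [];
  for (const segment of query.split('|')) {
    const text = segment.trim();
    const name = (text.split(/\s+/)[0] ?? '').toLowerCase();
    if (!KNOWN_COMMANDS.has(name)) {
      // Record the error and stop, keeping every node built so far.
      errors.push(`Unexpected input: "${text}"`);
      break;
    }
    ast.push({ type: 'command', name, text });
  }
  return { ast, errors };
}

// The third command is misspelled, so only the first two end up in the AST.
const partial = parsePartially('from index | stats avg(myColumn) | wher x > 1');
console.log(partial.ast.map((command) => command.name), partial.errors);
```

A consumer such as autocomplete can still inspect the nodes in partial.ast even though partial.errors is not empty.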

Keeping ES|QL up to date

In general, when making changes here, run yarn kbn watch in a terminal window to make sure the changes are compiled correctly.

How to add new commands/options

When a new command or option is added to ES|QL, it arrives via a grammar update. Adding it here is therefore a two-step process:

  • Update the grammar to the new version
    • add or fix all AST generator bindings in case of new or changed TOKENS in the lexer grammar file
  • Update the definition files for commands/options

To update the grammar:

  1. Make sure the lexer and parser files are up to date with their ES counterparts
    • an existing Kibana CI job already updates them automatically
  2. Run the script in package.json to compile the ES|QL grammar
  3. Open the ast_factory.ts file and add a new exit<Command/Option> method
  4. Write some code in ast_walker.ts to translate the Antlr Parser tree into the custom AST (there are already a few utilities for that, but sometimes more code is required if the parser introduced a new flow)
    • pro tip: use http://lab.antlr.org/ to visualize/debug the parser tree for a given statement (copy and paste the grammar files there)
  5. If something goes wrong with a new quoted/unquoted identifier token, open ast_helpers.ts and check the ids of the new tokens in the getQuotedText and getUnquotedText functions. Please make sure to leave a comment with the token name
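The exit<Command/Option> step above can be sketched as follows. Everything here is a hypothetical stand-in: MyNewCommandContext imitates a generated Antlr rule context, SketchCommandNode imitates a custom AST node, and exitMyNewCommand is an invented method name. The real listener in ast_factory.ts works with the generated parser types and delegates to helpers in ast_walker.ts.

```typescript
// Hypothetical stand-in for a generated Antlr rule context.
interface MyNewCommandContext {
  getText(): string;
}

// Hypothetical, simplified custom AST node.
interface SketchCommandNode {
  type: 'command';
  name: string;
  text: string;
}

class AstListenerSketch {
  private readonly ast: SketchCommandNode[] = [];

  // Invoked when the parser leaves the (hypothetical) myNewCommand rule:
  // translate the parser context into a custom AST node and append it.
  exitMyNewCommand(ctx: MyNewCommandContext): void {
    this.ast.push({ type: 'command', name: 'mynewcommand', text: ctx.getText() });
  }

  getAst(): SketchCommandNode[] {
    return this.ast;
  }
}
```

Keeping the translation inside an exit method means the node is only created once the parser has finished visiting the whole rule.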

Debug and fix grammar changes (tokens, etc...)

When a TOKEN is renamed, or the lexer grammar changes in subtle ways, tests may break. This usually happens for one of two reasons:

  • A TOKEN name changed, so ast_walker.ts no longer finds it. Go there and update the TOKEN name.
  • TOKEN order changed and tests started failing. This probably reordered some TOKEN ids, and two functions in ast_helpers.ts rely on hardcoded ids: getQuotedText and getUnquotedText.
    • Note that getQuotedText and getUnquotedText are automatically updated when the Kibana CI sync job detects grammar changes.
    • To fix this, look at the commented tokens and update the ids. If there is a new token, add it and leave a comment pointing to the new token name.
    • Hardcoded ids were chosen to reduce the bundle size, as importing the esql_parser would otherwise add some hundreds of KB to the bundle.
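The hardcoded-id pattern described above might look roughly like the sketch below. The ids 27 and 68, the token names in the comments, the TokenLike shape and the findQuotedText helper are all assumptions made for illustration; the real values come from the compiled lexer and the real helpers operate on Antlr rule contexts.

```typescript
// Hypothetical token ids, for illustration only: the real values come from
// the compiled lexer and must be updated when the TOKEN order changes.
const QUOTED_TOKEN_IDS = new Set<number>([
  27, // QUOTED_STRING (assumed name and id)
  68, // QUOTED_IDENTIFIER (assumed name and id)
]);

// Minimal stand-in for a lexer token.
interface TokenLike {
  type: number;
  text: string;
}

// Returns the text of the first quoted token, or undefined if there is none.
function findQuotedText(tokens: TokenLike[]): string | undefined {
  return tokens.find((token) => QUOTED_TOKEN_IDS.has(token.type))?.text;
}
```

Because the ids are plain numbers, this code needs no import from the generated parser, which is the bundle-size trade-off mentioned above. The comment next to each id is what makes a later reorder fixable.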