[ES|QL] AST package documentation (#194296)

Updates documentation for the ES|QL AST package.
2025-04-24 01:38:56 -04:00 · 2024-09-30 18:14:58 +02:00 · 2024-09-30 18:14:58 +02:00 · 5def848d2c
commit 5def848d2c
parent 896dce358c
6 changed files with 578 additions and 111 deletions
--- a/packages/kbn-esql-ast/README.md
+++ b/packages/kbn-esql-ast/README.md
@@ -1,89 +1,38 @@
-# ES|QL utility library
+# ES|QL AST library

-## Folder structure
+The general idea of this package is to provide low-level ES|QL parsing,
+building, traversal, pretty-printing, and manipulation features on top of a
+custom compact AST representation, which is designed to be resilient to many
+grammar changes.

-This library brings all the foundation data structure to enable all advanced features within an editor for ES|QL as validation, autocomplete, hover, etc...
-The package is structure as follow:
+Contents of this package:

-```
-src
-  |- antlr                      // => contains the ES|QL grammar files and various compilation assets
-  | ast_factory.ts              // => binding to the Antlr that generates the AST data structure
-  | ast_errors.ts               // => error translation utility from raw Antlr to something understandable (somewhat)
-  | antlr_error_listener.ts     // => The ES|QL syntax error listener
-  | antlr_facade.ts             // => getParser and getLexer utilities
-  | ...                         // => miscellaneas utilities to work with AST
+- [`builder` &mdash; Contains the `Builder` class for AST node construction](./src/builder/README.md).
+- [`parser` &mdash; Contains text to ES|QL AST parsing code](./src/parser/README.md).
+- [`walker` &mdash; Contains the ES|QL AST `Walker` utility](./src/walker/README.md).
+- [`visitor` &mdash; Contains the ES|QL AST `Visitor` utility](./src/visitor/README.md).
+- [`pretty_print` &mdash; Contains code for formatting AST to text](./src/pretty_print/README.md).
+
+
+## Demo
+
+Much of the functionality of this package is demonstrated in the demo UI. You
+can run it in Storybook, using the following command:
+
+```bash
+yarn storybook esql_ast_inspector
 ```

-### Basic usage
+Alternatively, you can start Kibana with *Example Plugins* enabled, using:

-#### Get AST from a query string
-
-This module contains the entire logic to translate from a query string into the AST data structure.
-The `getAstAndSyntaxErrors` function returns the AST data structure, unless a syntax error happens in which case the `errors` array gets populated with a Syntax error.
-
-##### Usage
-
-```js
-import { getAstAndSyntaxErrors } from '@kbn/esql-ast';
-
-const queryString = "from index | stats 1 + avg(myColumn) ";
-const { ast, errors} = await astProvider(queryString);
-
-if(errors){
-  console.log({ syntaxErrors: errors });
-}
-// do stuff with the ast
+```bash
+yarn start --run-examples
 ```

-## How does it work
+Then navigate to the *ES|QL AST Inspector* plugin in the Kibana UI.

-The general idea of this package is to provide all ES|QL features on top of a custom compact AST definition (all data structure types defined in `./types.ts`) which is designed to be resilient to many grammar changes.
-The pipeline is the following:

-```
-Antlr grammar files
-=> Compiled grammar files (.ts assets in the antlr folder)
-=> AST Factory (Antlr Parser tree => custom AST)
-```
+## Keeping ES|QL AST library up to date

-Each feature function works with the combination of the AST and the definition files: the former describe the current statement in a easy to traverse way, while the definitions describe what's the expected behaviour of each node in the AST node (i.e. what arguments should it accept? How many arguments? etc...).
-While AST requires the grammar to be compiled to be updated, definitions are static files which can be dynamically updated without running the ANTLR compile task.
-
-#### AST
-
-The AST is generated by 2 files: `ast_factory.ts` and its buddy `ast_walker.ts`:
-* `ast_factory.ts` is a binding to Antlr and access the Parser tree
-* Parser tree is passed over to `ast_walker` to append new AST nodes
-
-In general Antlr is resilient to grammar errors, in the sense that it can produe a Parser tree up to the point of the error, then stops. This is useful to perform partial tasks even with broken queries and this means that a partial AST can be produced even with an invalid query.
-
-### Keeping ES|QL up to date
-
-In general when operating on changes here use the `yarn kbn watch` in a terminal window to make sure changes are correctly compiled.
-
-### How to add new commands/options
-
-When a new command/option is added to ES|QL it is done via a grammar update.
-Therefore adding them requires a two step phase:
-* Update the grammar with the new one
-    * add/fix all AST generator bindings in case of new/changed TOKENS in the `lexer` grammar file
-* Update the definition files for commands/options
-
-To update the grammar:
-1. Make sure the `lexer` and `parser` files are up to date with their ES counterparts
-  * an existing Kibana CI job is updating them already automatically
-2. Run the script into the `package.json` to compile the ES|QL grammar.
-3. open the `ast_factory.ts` file and add a new `exit<Command/Option>` method
-4. write some code in the `ast_walker/ts` to translate the Antlr Parser tree into the custom AST (there are already few utilites for that, but sometimes it is required to write some more code if the `parser` introduced a new flow)
-  * pro tip: use the `http://lab.antlr.org/` to visualize/debug the parser tree for a given statement (copy and paste the grammar files there)
-5. if something goes wrong with new quoted/unquoted identifier token, open the `ast_helpers.ts` and check the ids of the new tokens in the `getQuotedText` and `getUnquotedText` functions - please make sure to leave a comment on the token name
-
-#### Debug and fix grammar changes (tokens, etc...)
-
-On TOKEN renaming or with subtle `lexer` grammar changes it can happens that test breaks, this can be happen for two main issues:
-* A TOKEN name changed so the `ast_walker.ts` doesn't find it any more. Go there and rename the TOKEN name.
-* TOKEN order changed and tests started failing. This probably generated some TOKEN id reorder and there are two functions in `ast_helpers.ts` who rely on hardcoded ids: `getQuotedText` and `getUnquotedText`.
-  * Note that the `getQuotedText` and `getUnquotedText` are automatically updated on grammar changes detected by the Kibana CI sync job.
-  * to fix this just look at the commented tokens and update the ids. If a new token add it and leave a comment to point to the new token name.
-  * This choice was made to reduce the bundle size, as importing the `esql_parser` adds some hundreds of Kbs to the bundle otherwise.
+When making changes here, run `yarn kbn watch` in a terminal window to make
+sure the changes are compiled correctly.
--- a/packages/kbn-esql-ast/src/builder/README.md
+++ b/packages/kbn-esql-ast/src/builder/README.md
@@ -0,0 +1,39 @@
+# Builder
+
+Contains the `Builder` class for AST node construction. It provides the
+lowest-level, stateless API for constructing AST nodes.
+
+The `Builder` API can be used when constructing AST nodes from scratch manually,
+and it is also used by the parser to construct the AST nodes during the parsing
+process.
+
+Nodes produced by the parser typically carry more information, such as their
+position in the source code and other metadata. When constructing AST nodes
+manually, this information is not available, but the `Builder` API can still be
+used because it allows the metadata to be omitted.
+
+
+## Usage
+
+Construct a `literal` expression node:
+
+```typescript
+import { Builder } from '@kbn/esql-ast';
+
+const node = Builder.expression.literal.numeric({ value: 42, literalType: 'integer' });
+```
+
+Returns:
+
+```js
+{
+  type: 'literal',
+  literalType: 'integer',
+  value: 42,
+  name: '42',
+
+  location: { min: 0, max: 0 },
+  text: '',
+  incomplete: false,
+}
+```
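
The `Builder` README above says that manually constructed nodes lack parser metadata and that the builder lets callers skip it. A minimal self-contained sketch of that pattern follows. `NodeMetadata`, `NumericLiteral`, and `numericLiteral` are hypothetical stand-ins, not the real `@kbn/esql-ast` types, and the defaults simply mirror the example output shown in that README.

```typescript
// Hypothetical stand-ins for illustration; not the real @kbn/esql-ast types.
interface NodeMetadata {
  location: { min: number; max: number };
  text: string;
  incomplete: boolean;
}

interface NumericLiteral extends NodeMetadata {
  type: 'literal';
  literalType: 'integer' | 'double';
  value: number;
  name: string;
}

// Stateless factory: the caller supplies the essential fields, and any
// metadata that only a parser would know falls back to neutral defaults.
function numericLiteral(
  template: Pick<NumericLiteral, 'value' | 'literalType'>,
  fromParser: Partial<NodeMetadata> = {}
): NumericLiteral {
  return {
    type: 'literal',
    literalType: template.literalType,
    value: template.value,
    name: String(template.value),
    location: fromParser.location ?? { min: 0, max: 0 },
    text: fromParser.text ?? '',
    incomplete: fromParser.incomplete ?? false,
  };
}

// Manual construction: no source position is available, so defaults are used.
const manual = numericLiteral({ value: 42, literalType: 'integer' });

// Parser-style construction: the same factory, with metadata filled in.
const parsed = numericLiteral(
  { value: 42, literalType: 'integer' },
  { location: { min: 6, max: 7 }, text: '42' }
);
```

Keeping the factory stateless means the same function can serve both callers; only the optional metadata argument differs.
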
--- a/packages/kbn-esql-ast/src/parser/README.md
+++ b/packages/kbn-esql-ast/src/parser/README.md
@@ -1,6 +1,91 @@
+# ES|QL Parser
+
+The Kibana ES|QL parser uses the ANTLR library for lexing and parse tree (CST)
+generation. The ANTLR grammar is imported from the Elasticsearch repository in
+an automated CI job.
+
+We use two ANTLR outputs, (1) the token stream and (2) the parse tree, for four
+purposes: (1) generating the Abstract Syntax Tree (AST), (2) syntax validation,
+(3) syntax highlighting, and (4) extracting formatting (comments and
+whitespace) and assigning it to AST nodes.
+
+In general, ANTLR is resilient to grammar errors in the sense that it can
+produce a parse tree up to the point of the error and then stop. This makes it
+possible to perform partial tasks even on broken queries, because a partial
+AST can be produced even from an invalid query.
+
+
+## Folder structure
+
+The parser is structured as follows:
+
+```
+src/
+|- parser/                            Contains the logic to parse the ES|QL query and generate the AST.
+|  |- factories.ts                    Contains AST node factories.
+|  |- antlr_error_listener.ts         Contains code which traverses ANTLR CST and collects syntax errors.
+|  |- esql_ast_builder_listener.ts    Contains code which traverses ANTLR CST and builds the AST.
+|
+|- antlr/                             Contains the autogenerated ES|QL ANTLR grammar files and various compilation assets.
+   |- esql_lexer.g4                   Contains the ES|QL ANTLR lexer grammar.
+   |- esql_parser.g4                  Contains the ES|QL ANTLR parser grammar.
+```
+
+
+## Usage
+
+### Get AST from a query string
+
+The `parse` function returns the AST data structure. If a syntax error occurs,
+the `errors` array is populated with the syntax errors.
+
+```js
+import { parse } from '@kbn/esql-ast';
+
+const src = "FROM index | STATS 1 + AVG(myColumn) ";
+const { root, errors } = parse(src);
+
+if (errors.length > 0) {
+  console.log({ syntaxErrors: errors });
+}
+
+// do stuff with the ast
+```
+
+The `root` is the root node of the AST. The AST is a tree structure where each
+node represents a part of the query. Each node has a `type` property which
+indicates the type of the node.
+
+
+### Parse a query and populate the AST with comments
+
+When the `parse` function is called with the `withFormatting` flag set to
+`true`, the AST is populated with comments.
+
+```js
+import { parse } from '@kbn/esql-ast';
+
+const src = "FROM /* COMMENT */ index";
+const { root } = parse(src, { withFormatting: true });
+```
+
+
 ## Comments

-### Inter-node comment places
+By default, the parsed AST does not include any *formatting* information,
+such as comments or whitespace. This is because the AST is designed to be
+compact and to be used for syntax validation, syntax highlighting, and other
+high-level operations.
+
+However, sometimes it is useful to have comments attached to the AST nodes. The
+parser can collect all comments when the `withFormatting` flag is set to `true`
+and attach them to the AST nodes. The comments are attached to the closest node,
+while also considering the surrounding punctuation.
+
+### Inter-node comments
+
+Currently, when parsed, inter-node comments are attached to the node from the
+left side.

 Around colon in source identifier:

@@ -25,3 +110,60 @@ Time interface expressions:
 ```eslq
 STATS 1 /* asdf */ DAY
 ```
+
+
+## Internal Details
+
+
+### How does it work?
+
+The pipeline is the following:
+
+1. ANTLR grammar files are added to Kibana.
+2. ANTLR grammar files are compiled to `.ts` assets in the `antlr` folder.
+3. A query is parsed to a CST by ANTLR.
+4. The `ESQLAstBuilderListener` traverses the CST and builds the AST.
+5. Optionally:
+  1. Comments and whitespace are extracted from the ANTLR lexer's token stream.
+  2. The comments and whitespace are attached to the AST nodes.
+
+
+### How to add new commands/options?
+
+When a new command/option is added to ES|QL, it is done via a grammar update.
+Supporting it here therefore requires updating the grammar and the AST-building
+code, as described in the steps below.
+
+To update the grammar:
+
+1. Make sure the `lexer` and `parser` files are up to date with their ES
+   counterparts.
+  * An existing Kibana CI job already updates them automatically.
+2. Run the script in `package.json` to compile the ES|QL grammar.
+3. Open the `ast_factory.ts` file and add a new `exit<Command/Option>` method.
+4. Write some code in `ast_walker.ts` to translate the ANTLR parse tree
+   into the custom AST (there are already a few utilities for that, but
+   sometimes more code is required if the `parser` introduced a new flow).
+  * Pro tip: use `http://lab.antlr.org/` to visualize/debug the parse tree
+    for a given statement (copy and paste the grammar files there).
+5. If something goes wrong with a new quoted/unquoted identifier token, open
+   `ast_helpers.ts` and check the ids of the new tokens in the `getQuotedText`
+   and `getUnquotedText` functions. Please make sure to leave a comment with
+   the token name.
+
+
+#### Debug and fix grammar changes (tokens, etc...)
+
+When a token is renamed, or after subtle `lexer` grammar changes, tests can
+break. This usually happens for one of two reasons:
+
+* A token name changed, so `esql_ast_builder_listener.ts` no longer finds it.
+  Go there and update the token name.
+* The token order changed and tests started failing. This probably reordered
+  some token ids, and there are two functions in `helpers.ts` that rely on
+  hardcoded ids: `getQuotedText` and `getUnquotedText`.
+  * Note that `getQuotedText` and `getUnquotedText` are automatically
+    updated on grammar changes detected by the Kibana CI sync job.
+  * To fix this, look at the commented tokens and update the ids. If there is
+    a new token, add it and leave a comment pointing to the new token name.
+  * This choice was made to reduce the bundle size, as importing the
+    `esql_parser` would otherwise add some hundreds of KB to the bundle.
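
The `parse` contract described in the parser README (a root node is returned even when the `errors` array is non-empty) can be illustrated with a toy parser. This sketch is not the ANTLR-based implementation: `toyParse`, `ToyCommand`, and the set of known commands are hypothetical, and the function simply stops at the first unrecognized command to mimic a partial AST.

```typescript
// Toy illustration only; the real parser is generated from ANTLR grammars.
interface ToyCommand {
  type: 'command';
  name: string;
  text: string;
}

interface ToyParseResult {
  root: { type: 'query'; commands: ToyCommand[] };
  errors: Array<{ message: string; offset: number }>;
}

// Hypothetical, deliberately tiny command set.
const KNOWN_COMMANDS = new Set(['from', 'limit', 'stats', 'where']);

function toyParse(src: string): ToyParseResult {
  const result: ToyParseResult = { root: { type: 'query', commands: [] }, errors: [] };
  let offset = 0;
  for (const part of src.split('|')) {
    const text = part.trim();
    const name = (text.split(/\s+/)[0] ?? '').toLowerCase();
    if (!KNOWN_COMMANDS.has(name)) {
      // Record the error and stop: everything parsed so far is kept.
      result.errors.push({ message: `Unknown command "${name}"`, offset });
      break;
    }
    result.root.commands.push({ type: 'command', name, text });
    offset += part.length + 1; // +1 for the pipe separator
  }
  return result;
}

const ok = toyParse('FROM index | LIMIT 10');
const broken = toyParse('FROM index | BOGUS 1 | LIMIT 10');

// `errors` is always an array, so check its length rather than its truthiness.
const hasErrors = broken.errors.length > 0;
```

Here `broken.root` still contains the `FROM` command, which is the kind of partial result that lets validation or autocomplete keep working on an invalid query.
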
--- a/packages/kbn-esql-ast/src/pretty_print/README.md
+++ b/packages/kbn-esql-ast/src/pretty_print/README.md
@@ -4,20 +4,82 @@
 human-readable string. This is useful for debugging or for displaying
 the AST to the user.

-This module provides a number of pretty-printing options.
+This module provides a number of pretty-printing facilities. There are two
+main classes that provide pretty-printing:
+
+- `BasicPrettyPrinter` &mdash; provides the basic pretty-printing to a single
+  line.
+- `WrappingPrettyPrinter` &mdash; provides more advanced pretty-printing, which
+  can wrap the query to multiple lines, and can also wrap the query to a
+  specific width.


 ## `BasicPrettyPrinter`

-The `BasicPrettyPrinter` class provides the most basic pretty-printing&mdash;it
-prints a query to a single line. Or it can print a query with each command on
-a separate line, with the ability to customize the indentation before the pipe
-character.
+The `BasicPrettyPrinter` class provides the simpler pretty-printing
+functionality&mdash;it prints a query to a single line. Or, it can print a query
+with each command on a separate line, with the ability to customize the
+indentation before the pipe character.
+
+Usage:
+
+```typescript
+import { parse, BasicPrettyPrinter } from '@kbn/esql-ast';
+
+const src = 'FROM index | LIMIT 10';
+const { root } = parse(src);
+const text = BasicPrettyPrinter.print(root);
+
+console.log(text); // FROM index | LIMIT 10
+```
+
+It can print each command on a separate line, with a custom indentation before
+the pipe character:
+
+```typescript
+const text = BasicPrettyPrinter.multiline(root, { pipeTab: '  ' });
+```

 It can also print a single command to a single line; or an expression to a
-single line.
+single line. Below is the summary of the top-level functions:

 - `BasicPrettyPrinter.print()` &mdash; prints query to a single line.
 - `BasicPrettyPrinter.multiline()` &mdash; prints a query to multiple lines.
 - `BasicPrettyPrinter.command()` &mdash; prints a command to a single line.
- `BasicPrettyPrinter.expression()` &mdash; prints an expression to a single line.
+- `BasicPrettyPrinter.expression()` &mdash; prints an expression to a single
+  line.
+
+See `BasicPrettyPrinterOptions` for formatting options. For example, the
+`lowercase` option allows you to lowercase all ES|QL keywords:
+
+```typescript
+const text = BasicPrettyPrinter.print(root, { lowercase: true });
+```
+
+The `BasicPrettyPrinter` prints only *left* and *right* multi-line comments,
+which do not have line breaks, as this formatter is designed to print a query
+to a single line. If you need to print a query to multiple lines, use the
+`WrappingPrettyPrinter`.
+
+
+## `WrappingPrettyPrinter`
+
+The *wrapping pretty printer* can print a query to multiple lines, and can wrap
+the text to a new line if the line width exceeds a certain threshold. It also
+prints all comments attached to the AST (including ones that force the text
+to be wrapped).
+
+Usage:
+
+```typescript
+import { parse, WrappingPrettyPrinter } from '@kbn/esql-ast';
+
+const src = `
+  FROM index /* this is a comment */
+  | LIMIT 10`;
+const { root } = parse(src, { withFormatting: true });
+const text = WrappingPrettyPrinter.print(root);
+```
+
+See the `WrappingPrettyPrinterOptions` interface for available formatting options.
+
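
The difference between `print()` and `multiline()` described in the pretty-printing README can be sketched over a toy query model. The `ToyQuery` shape and both functions below are hypothetical and only imitate the documented behavior (a single line versus one command per line, with `pipeTab` and `lowercase` options); the real printers operate on the full AST.

```typescript
// Toy model for illustration; not the real @kbn/esql-ast printers.
interface ToyCommand {
  keyword: string; // e.g. "FROM"
  args: string; // already-formatted arguments, e.g. "index"
}
type ToyQuery = ToyCommand[];

interface ToyPrintOptions {
  lowercase?: boolean; // print keywords in lowercase
  pipeTab?: string; // indentation placed before each pipe in multiline mode
}

function printCommand(cmd: ToyCommand, opts: ToyPrintOptions): string {
  const keyword = opts.lowercase ? cmd.keyword.toLowerCase() : cmd.keyword.toUpperCase();
  return cmd.args ? `${keyword} ${cmd.args}` : keyword;
}

// Single-line form: commands joined by " | ".
function printQuery(query: ToyQuery, opts: ToyPrintOptions = {}): string {
  return query.map((cmd) => printCommand(cmd, opts)).join(' | ');
}

// Multi-line form: each command on its own line, pipes indented by `pipeTab`.
// The two-space default is an assumption made for this sketch.
function printMultiline(query: ToyQuery, opts: ToyPrintOptions = {}): string {
  const pipeTab = opts.pipeTab ?? '  ';
  return query
    .map((cmd, i) => (i === 0 ? '' : `${pipeTab}| `) + printCommand(cmd, opts))
    .join('\n');
}

const query: ToyQuery = [
  { keyword: 'FROM', args: 'index' },
  { keyword: 'LIMIT', args: '10' },
];

const oneLine = printQuery(query); // "FROM index | LIMIT 10"
const lower = printQuery(query, { lowercase: true }); // "from index | limit 10"
const lines = printMultiline(query, { pipeTab: '    ' }); // "FROM index\n    | LIMIT 10"
```
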
--- a/packages/kbn-esql-ast/src/visitor/README.md
+++ b/packages/kbn-esql-ast/src/visitor/README.md
@@ -1,4 +1,28 @@
-## High-level AST structure
+# `Visitor` Traversal API
+
+The `Visitor` traversal API provides a feature-rich way to traverse the ES|QL
+AST. It is more powerful than the [`Walker` API](../walker/README.md), as it
+allows traversing the AST in a more flexible way.
+
+The `Visitor` API can start the traversal from the root node, a command
+statement, or an expression. Unlike the `Walker` API, the `Visitor` does not
+automatically traverse the entire AST. Instead, the developer has to manually
+call the necessary *visit* methods. This makes the traversal more flexible: you
+can traverse only the parts of the AST that are needed, traverse the AST in a
+different order, or traverse it multiple times.
+
+The `Visitor` API is also more powerful than the `Walker` API, as for each
+visitor callback it provides a *context* object, which contains the information
+about the current node as well as the parent node, and the whole parent chain
+up to the root node.
+
+In addition, each visitor callback can return a value (*output*), which is then
+passed to the parent node, in the place where the visitor was called. Also, when
+a child is visited, the parent node can pass in *input* to the child visitor.
+
+
+## About ES|QL AST structure

 Broadly, there are two AST node types: (1) commands (say `FROM ...`, like
 *statements* in other languages), and (2) expressions (say `a + b`, or `fn()`).
@@ -59,7 +83,8 @@ As of this writing, the following expressions are defined:
 - Column identifier expression, `{type: "column"}`, like `@timestamp`
 - Function call expression, `{type: "function"}`, like `fn(123)`
 - Literal expression, `{type: "literal"}`, like `123`, `"hello"`
- List literal expression, `{type: "list"}`, like `[1, 2, 3]`, `["a", "b", "c"]`, `[true, false]`
+- List literal expression, `{type: "list"}`, like `[1, 2, 3]`,
+  `["a", "b", "c"]`, `[true, false]`
 - Time interval expression, `{type: "interval"}`, like `1h`, `1d`, `1w`
 - Inline cast expression, `{type: "cast"}`, like `abc::int`, `def::string`
 - Unknown node, `{type: "unknown"}`
@@ -67,3 +92,176 @@ As of this writing, the following expressions are defined:
 Each expression has a `visitExpressionX` callback, where `X` is the type of the
 expression. If a expression-specific callback is not found, the generic
 `visitExpression` callback is called.
+
+
+## `Visitor` API Usage
+
+The `Visitor` API is used to traverse the AST. The process is as follows:
+
+1. Create a new `Visitor` instance.
+2. Register callbacks for the nodes you are interested in.
+3. Call the `visitQuery`, `visitCommand`, or `visitExpression` method to start
+   the traversal.
+
+For example, the below code snippet prints the type of each expression node:
+
+```typescript
+new Visitor()
+  .on('visitExpression', (ctx) => console.log(ctx.node.type))
+  .on('visitCommand', (ctx) => [...ctx.visitArguments()])
+  .on('visitQuery', (ctx) => [...ctx.visitCommands()])
+  .visitQuery(root);
+```
+
+The `visitQuery` callback visits all commands using `visitCommands`. The
+`visitCommand` callback visits all arguments using `visitArguments`. Finally,
+the `visitExpression` callback prints the type of the expression node.
+
+Above we started the traversal from the root node, using the `.visitQuery(root)`
+method. However, one can start the traversal from any node, by calling the
+following methods:
+
+- `.visitQuery()` &mdash; Start traversal from the root node.
+- `.visitCommand()` &mdash; Start traversal from a command node.
+- `.visitExpression()` &mdash; Start traversal from an expression node.
+
+
+### Specifying Callbacks
+
+The simplest way to traverse the AST is to specify the below three callbacks:
+
+- `visitQuery` &mdash; Called for every query node. (Normally once.)
+- `visitCommand` &mdash; Called for every command node.
+- `visitExpression` &mdash; Called for every expression node.
+
+
+However, you can be more specific and specify callbacks for commands and
+expression types. This way the context `ctx` provided to the callback will have
+helpful methods specific to the node type.
+
+When a more specific callback is registered for a node, the generic
+`visitCommand` or `visitExpression` callback is not called for that node. The
+generic callback is used only when no more specific callback is found.
+
+You can specify a specific callback for each command, instead of the generic
+`visitCommand`:
+
+- `visitFromCommand` &mdash; Called for every `FROM` command node.
+- `visitLimitCommand` &mdash; Called for every `LIMIT` command node.
+- `visitExplainCommand` &mdash; Called for every `EXPLAIN` command node.
+- `visitRowCommand` &mdash; Called for every `ROW` command node.
+- `visitMetricsCommand` &mdash; Called for every `METRICS` command node.
+- `visitShowCommand` &mdash; Called for every `SHOW` command node.
+- `visitMetaCommand` &mdash; Called for every `META` command node.
+- `visitEvalCommand` &mdash; Called for every `EVAL` command node.
+- `visitStatsCommand` &mdash; Called for every `STATS` command node.
+- `visitInlineStatsCommand` &mdash; Called for every `INLINESTATS` command node.
+- `visitLookupCommand` &mdash; Called for every `LOOKUP` command node.
+- `visitKeepCommand` &mdash; Called for every `KEEP` command node.
+- `visitSortCommand` &mdash; Called for every `SORT` command node.
+- `visitWhereCommand` &mdash; Called for every `WHERE` command node.
+- `visitDropCommand` &mdash; Called for every `DROP` command node.
+- `visitRenameCommand` &mdash; Called for every `RENAME` command node.
+- `visitDissectCommand` &mdash; Called for every `DISSECT` command node.
+- `visitGrokCommand` &mdash; Called for every `GROK` command node.
+- `visitEnrichCommand` &mdash; Called for every `ENRICH` command node.
+- `visitMvExpandCommand` &mdash; Called for every `MV_EXPAND` command node.
+
+Similarly, you can specify a specific callback for each expression type, instead
+of the generic `visitExpression`:
+
+- `visitColumnExpression` &mdash; Called for every column expression node, say
+  `@timestamp`.
+- `visitSourceExpression` &mdash; Called for every source expression node, say
+  `tsdb_index`.
+- `visitFunctionCallExpression` &mdash; Called for every function call
+  expression node. Including binary expressions, such as `a + b`.
+- `visitLiteralExpression` &mdash; Called for every literal expression node, say
+  `123`, `"hello"`.
+- `visitListLiteralExpression` &mdash; Called for every list literal expression
+  node, say `[1, 2, 3]`, `["a", "b", "c"]`.
+- `visitTimeIntervalLiteralExpression` &mdash; Called for every time interval
+  literal expression node, say `1h`, `1d`, `1w`.
+- `visitInlineCastExpression` &mdash; Called for every inline cast expression
+  node, say `abc::int`, `def::string`.
+- `visitRenameExpression` &mdash; Called for every rename expression node, say
+  `a AS b`.
+- `visitOrderExpression` &mdash; Called for every order expression node, say
+  `@timestamp ASC`.
+
+
+### Using the Node Context
+
+Each visitor callback receives a `ctx` object, which contains the reference to
+the parent node's context:
+
+```typescript
+new Visitor()
+  .on('visitExpression', (ctx) => {
+    ctx.parent
+  });
+```
+
+Each context object also provides various methods to visit the children nodes,
+if needed. For example, to visit all arguments of a command node:
+
+```typescript
+const expressions = [];
+
+new Visitor()
+  .on('visitExpression', (ctx) => expressions.push(ctx.node))
+  .on('visitCommand', (ctx) => {
+    for (const output of ctx.visitArguments()) {
+      // `output` is the value returned by the expression callback.
+    }
+  });
+```
+
+The node context object may also have node specific methods. For example, the
+`LIMIT` command context has the `.numeric()` method, which returns the numeric
+value of the `LIMIT` command:
+
+```typescript
+new Visitor()
+  .on('visitLimitCommand', (ctx) => {
+    console.log(ctx.numeric());
+  })
+  .on('visitCommand', () => null)
+  .on('visitQuery', (ctx) => [...ctx.visitCommands()])
+  .visitQuery(root);
+```
+
+
+### Using the Visitor Output
+
+Each visitor callback can return an *output*, which is then passed to the parent
+callback. This allows information to be passed from the child node to the
+parent node.
+
+For example, the below code snippet collects all column names in the AST:
+
+```typescript
+const columns = new Visitor()
+  .on('visitExpression', (ctx) => null)
+  .on('visitColumnExpression', (ctx) => ctx.node.name)
+  .on('visitCommand', (ctx) => [...ctx.visitArguments()])
+  .on('visitQuery', (ctx) => [...ctx.visitCommands()])
+  .visitQuery(root);
+```
+
+
+### Using the Visitor Input
+
+Analogous to the output, each visitor callback can receive an *input* value.
+This allows information to be passed from the parent node to the child node.
+
+For example, the below code snippet prints all column names prefixed with the
+text `"prefix"`:
+
+```typescript
+new Visitor()
+  .on('visitExpression', (ctx) => null)
+  .on('visitColumnExpression', (ctx, INPUT) => console.log(INPUT + ctx.node.name))
+  .on('visitCommand', (ctx) => [...ctx.visitArguments("prefix")])
+  .on('visitQuery', (ctx) => [...ctx.visitCommands()])
+  .visitQuery(root);
+``` 
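
The input and output mechanism described in the `Visitor` README can be sketched with a toy visitor. `ToyVisitor` and its types are hypothetical and far smaller than the real API: the parent decides when to visit its children, passes an *input* down, and receives each child's return value as *output*. The sketch also assumes that a specific callback takes precedence over the generic one.

```typescript
// Toy model for illustration; not the real @kbn/esql-ast Visitor.
type ToyExpression =
  | { type: 'column'; name: string }
  | { type: 'literal'; value: number };

interface ToyCommand {
  type: 'command';
  name: string;
  args: ToyExpression[];
}

interface ExpressionContext {
  node: ToyExpression;
  parent: ToyCommand; // the command that triggered this visit
}
type ExpressionCallback = (ctx: ExpressionContext, input: string) => string | null;

class ToyVisitor {
  private callbacks = new Map<string, ExpressionCallback>();

  on(name: 'visitExpression' | 'visitColumnExpression', cb: ExpressionCallback): this {
    this.callbacks.set(name, cb);
    return this;
  }

  // Children are visited only when the caller asks for it; each child's
  // return value (its output) is collected and handed back to the caller.
  visitArguments(command: ToyCommand, input: string): Array<string | null> {
    return command.args.map((node) => {
      const specific =
        node.type === 'column' ? this.callbacks.get('visitColumnExpression') : undefined;
      // Fall back to the generic callback when no specific one is registered.
      const cb = specific ?? this.callbacks.get('visitExpression');
      return cb ? cb({ node, parent: command }, input) : null;
    });
  }
}

const command: ToyCommand = {
  type: 'command',
  name: 'keep',
  args: [
    { type: 'column', name: 'a' },
    { type: 'literal', value: 1 },
    { type: 'column', name: 'b' },
  ],
};

const outputs = new ToyVisitor()
  .on('visitExpression', () => null)
  .on('visitColumnExpression', (ctx, input) =>
    ctx.node.type === 'column' ? input + ctx.node.name : null
  )
  .visitArguments(command, 'prefix.');
// outputs: ["prefix.a", null, "prefix.b"]
```
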
--- a/packages/kbn-esql-ast/src/walker/README.md
+++ b/packages/kbn-esql-ast/src/walker/README.md
@@ -1,41 +1,118 @@
-# ES|QL AST Walker
+# `Walker` Traversal API

-The ES|QL AST Walker is a utility that traverses the ES|QL AST and provides a
-set of callbacks that can be used to perform introspection of the AST.
+The ES|QL AST `Walker` is a utility that traverses the ES|QL AST. The developer
+can provide a set of callbacks which are called when the walker visits a
+specific type of node.
+
+The `Walker` utility can traverse the AST starting from any node, not just
+the root node.
+
+
+## Low-level API

 To start a new *walk* you create a `Walker` instance and call the `walk()` method
 with the AST node to start the walk from.

 ```ts
-
-import { Walker, getAstAndSyntaxErrors } from '@kbn/esql-ast';
+import { Walker } from '@kbn/esql-ast';

 const walker = new Walker({
-  // Called every time a function node is visited.
-  visitFunction: (fn) => {
+  /**
+   * Visit commands
+   */
+  visitCommand: (node: ESQLCommand) => {
+    // Called for every command node.
+  },
+  visitCommandOption: (node: ESQLCommandOption) => {
+    // Called for every command option node.
+  },
+
+  /**
+   * Visit expressions
+   */
+  visitFunction: (fn: ESQLFunction) => {
+    // Called every time a function expression is visited.
    console.log('Function:', fn.name);
  },
-  // Called every time a source identifier node is visited.
-  visitSource: (source) => {
+  visitSource: (source: ESQLSource) => {
+    // Called every time a source identifier expression is visited.
    console.log('Source:', source.name);
  },
+  visitQuery: (node: ESQLAstQueryExpression) => {
+    // Called for every query node.
+  },
+  visitColumn: (node: ESQLColumn) => {
+    // Called for every column node.
+  },
+  visitLiteral: (node: ESQLLiteral) => {
+    // Called for every literal node.
+  },
+  visitListLiteral: (node: ESQLList) => {
+    // Called for every list literal node.
+  },
+  visitTimeIntervalLiteral: (node: ESQLTimeInterval) => {
+    // Called for every time interval literal node.
+  },
+  visitInlineCast: (node: ESQLInlineCast) => {
+    // Called for every inline cast node.
+  },
 });

-const { ast } = getAstAndSyntaxErrors('FROM source | STATS fn()');
 walker.walk(ast);
 ```

-Conceptual structure of an ES|QL AST:
+It is also possible to provide a single `visitAny` callback that is called for
+any node type that does not have a specific visitor.

- A single ES|QL query is composed of one or more source commands and zero or
-  more transformation commands.
- Each command is represented by a `command` node.
- Each command contains a list expressions named in ES|QL AST as *AST Item*.
-  - `function` &mdash; function call expression.
-  - `option` &mdash; a list of expressions with a specific role in the command.
-  - `source` &mdash; s source identifier expression.
-  - `column` &mdash; a field identifier expression.
-  - `timeInterval` &mdash; a time interval expression.
-  - `list` &mdash; a list literal expression.
-  - `literal` &mdash; a literal expression.
-  - `inlineCast` &mdash; an inline cast expression.
+```ts
+import { Walker } from '@kbn/esql-ast';
+
+const walker = new Walker({
+  visitAny: (node: ESQLProperNode) => {
+    // Called for any node type that does not have a specific visitor.
+  },
+});
+
+walker.walk(ast);
+```
+
+
+## High-level API
+
+There are a few high-level utility functions that are implemented on top of the
+low-level API, for your convenience:
+
+- `Walker.walk` &mdash; Walks the AST and calls the appropriate visitor functions.
+- `Walker.commands` &mdash; Walks the AST and extracts all command statements.
+- `Walker.params` &mdash; Walks the AST and extracts all parameter literals.
+- `Walker.find` &mdash; Finds and returns the first node that matches the search criteria.
+- `Walker.findAll` &mdash; Finds and returns all nodes that match the search criteria.
+- `Walker.match` &mdash; Matches a single node against a template object.
+- `Walker.matchAll` &mdash; Matches all nodes against a template object.
+- `Walker.findFunction` &mdash; Finds the first function that matches the predicate.
+- `Walker.hasFunction` &mdash; Searches for at least one occurrence of a function or expression in the AST.
+- `Walker.visitComments` &mdash; Visits all comments in the AST.
+
+The `Walker.walk()` method is simply syntactic sugar for the low-level
+`new Walker().walk()` method.
+
+The `Walker.commands()` method returns a list of all commands. This also
+includes nested commands, once they become supported in ES|QL.
+
+The `Walker.params()` method collects all param literals, such as unnamed `?` or
+named `?param`, or ordered `?1`.
+
+The `Walker.find()` and `Walker.findAll()` methods are used to search for nodes
+in the AST that match specific criteria. The criteria are specified using a
+predicate function.
+
+The `Walker.match()` and `Walker.matchAll()` methods are also used to search for
+nodes in the AST, but unlike `find` and `findAll`, they use a template object
+to match the nodes.
+
+`Walker.findFunction()` is a simple utility to find the first function that
+matches a predicate. `Walker.hasFunction()` returns `true` if at least one
+function or expression in the AST matches the predicate.
+
+The `Walker.visitComments()` method is used to visit all comments in the AST.
+You specify a callback that is called for each comment node.
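
The relationship between the low-level walk, the `visitAny` fallback, and a `find`-style helper described in the `Walker` README can be sketched over a toy tree. `ToyNode`, `walkTree`, and `findFirst` are hypothetical names for this illustration and do not reflect the real `Walker` implementation.

```typescript
// Toy model for illustration; not the real @kbn/esql-ast Walker.
interface ToyNode {
  type: 'query' | 'command' | 'function' | 'source' | 'column';
  name: string;
  children?: ToyNode[];
}

interface ToyWalkerOptions {
  visitFunction?: (node: ToyNode) => void;
  visitColumn?: (node: ToyNode) => void;
  visitAny?: (node: ToyNode) => void; // used when no specific visitor applies
}

// Unlike the Visitor, the walker descends into every child automatically.
function walkTree(node: ToyNode, options: ToyWalkerOptions): void {
  const specific =
    node.type === 'function'
      ? options.visitFunction
      : node.type === 'column'
      ? options.visitColumn
      : undefined;
  (specific ?? options.visitAny)?.(node);
  for (const child of node.children ?? []) walkTree(child, options);
}

// A find-style helper built on the walk: first node matching a predicate.
function findFirst(root: ToyNode, predicate: (node: ToyNode) => boolean): ToyNode | undefined {
  let found: ToyNode | undefined;
  walkTree(root, {
    visitAny: (node) => {
      if (!found && predicate(node)) found = node;
    },
  });
  return found;
}

// Roughly: FROM index | STATS avg(myColumn)
const tree: ToyNode = {
  type: 'query',
  name: '',
  children: [
    { type: 'command', name: 'from', children: [{ type: 'source', name: 'index' }] },
    {
      type: 'command',
      name: 'stats',
      children: [
        { type: 'function', name: 'avg', children: [{ type: 'column', name: 'myColumn' }] },
      ],
    },
  ],
};

const functionNames: string[] = [];
const otherTypes: string[] = [];
walkTree(tree, {
  visitFunction: (node) => functionNames.push(node.name),
  visitAny: (node) => otherTypes.push(node.type),
});

const firstColumn = findFirst(tree, (node) => node.type === 'column');
```

In this sketch `functionNames` ends up as `['avg']`, while every node without a specific visitor, including the column, is routed to `visitAny`.
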