Add introduction and examples for the model version API (#158904)

## Summary

Add a markdown file with a short introduction to model versions and
examples of the currently supported migration scenarios.
Pierre Gayvallet 2023-06-07 02:09:39 -04:00 committed by GitHub
parent 19d3343c77
commit 9f5ecaa913
5 changed files with 640 additions and 4 deletions


@@ -2,4 +2,6 @@
This package contains the public types for Core server-side savedObjects service and contracts.
Note: the types related to the savedObjects client and repository APIs can be found in the `@kbn/core-saved-objects-api-server` package.
Documentation about model versions is available [in the doc folder](./docs/model_versions.md)



@@ -0,0 +1,612 @@
# savedObjects: Model Version API
## Introduction
The modelVersion API is a new way to define transformations (*"migrations"*) for your savedObject types, and will
replace the "old" migration API after Kibana version `8.10.0` (where it will no longer be possible to register
migrations using the old system).
The main purpose of this API is to address two problems of the old migration system regarding managed ("serverless") deployments:
- savedObjects model versioning is coupled to the stack versioning (migrations are registered per stack version)
- migration functions are not safe in regard to our BWC and ZDT requirements (Kibana N and N+1 running at the same time during upgrade)
This API also intends to address minor DX issues of the old migration system, by having a more explicit definition of savedObjects "versions".
## What are model versions trying to solve?
As explained in the previous section, the API is solving issues of the previous migration system regarding managed deployments:
### 1. SO type versioning was tightly coupled to stack versioning
With the previous migration system, migrations were defined per stack version, meaning that the "granularity" for defining
migrations was the stack version. You couldn't, for example, add 2 consecutive migrations on `8.6.0` (to be executed at different points in time).
It was fine for on-prem distributions, given there is no way to upgrade Kibana to anything other than a "fixed" stack version.
For our managed offering however, where we're planning on decoupling deployments and upgrades from stack versions
(deploying more often, so more than once per stack release), it would have been an issue, as it wouldn't have been possible
to add a new migration in-between 2 stack versions.
<img src="./assets/mv_img_1.png" alt="multiple migration per stack version schema">
We needed a way to decouple SO versioning from the stack versioning to support this, and model versions do so by design.
### 2. The current migrations API is unsafe for the zero-downtime and backward-compatible requirements
On traditional deployments (on-prem/non-managed cloud), upgrading Kibana is done with downtime.
The upgrade process requires shutting down all the nodes of the prior version before deploying the new one.
That way, there is always a single version of Kibana running at a given time, which avoids all risks of data incompatibility
between versions (e.g. the new version introduces a migration that changes the shape of the document in a way that breaks compatibility
with the previous version).
For serverless however, the same process can't be used, as we need to be able to upgrade Kibana without interruption of service,
which means that the old and new versions of Kibana will have to cohabit for a time.
This leads to a lot of constraints regarding what can, or cannot, be done with data transformations (migrations) during an upgrade.
Unsurprisingly, the existing migration API (which allows registering any kind of *(doc) => doc* transformation) was way too permissive and
unsafe given our backward compatibility requirements.
## Defining model versions
As for old migrations, model versions are bound to a given [savedObject type](https://github.com/elastic/kibana/blob/9b330e493216e8dde3166451e4714966f63f5ab7/packages/core/saved-objects/core-saved-objects-server/src/saved_objects_type.ts#L22-L27).
When registering a SO type, a new [modelVersions](https://github.com/elastic/kibana/blob/9a6a2ccdff619f827b31c40dd9ed30cb27203da7/packages/core/saved-objects/core-saved-objects-server/src/saved_objects_type.ts#L138-L177)
property is available. This attribute is a map of [SavedObjectsModelVersion](https://github.com/elastic/kibana/blob/9a6a2ccdff619f827b31c40dd9ed30cb27203da7/packages/core/saved-objects/core-saved-objects-server/src/model_version/model_version.ts#L12-L20)
which is the top-level type/container to define model versions.
This map follows a similar `{ [version number] => version definition }` format as the old migration map, however
a given SO type's model version is now identified by a single integer.
The first version must be numbered as version 1, incrementing by one for each new version.
That way:
- SO type versions are decoupled from stack versioning
- SO type versions are independent between types
*a **valid** version numbering:*
```ts
const myType: SavedObjectsType = {
  name: 'test',
  switchToModelVersionAt: '8.10.0',
  modelVersions: {
    1: modelVersion1, // valid: start with version 1
    2: modelVersion2, // valid: no gap between versions
  },
  // ...other mandatory properties
};
```
*an **invalid** version numbering:*
```ts
const myType: SavedObjectsType = {
  name: 'test',
  switchToModelVersionAt: '8.10.0',
  modelVersions: {
    2: modelVersion2, // invalid: first version must be 1
    4: modelVersion3, // invalid: skipped version 3
  },
  // ...other mandatory properties
};
```
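The numbering rule above can be checked mechanically. The following is a minimal sketch and not Core's actual validation: `validateModelVersionNumbers` is a hypothetical helper, introduced here only for illustration, that reports every key breaking the "start at 1, no gaps" rule.

```ts
// Hypothetical helper (not part of Core): checks that the model version keys
// are exactly the consecutive integers 1, 2, 3, ...
function validateModelVersionNumbers(modelVersions: Record<string, unknown>): string[] {
  const errors: string[] = [];
  const versions = Object.keys(modelVersions)
    .map((key) => Number(key))
    .sort((a, b) => a - b);

  versions.forEach((version, index) => {
    // the n-th registered version (1-based) must be numbered n
    const expected = index + 1;
    if (version !== expected) {
      errors.push(`expected version ${expected}, found ${version}`);
    }
  });
  return errors;
}

// mirrors the valid and invalid examples above
const validErrors = validateModelVersionNumbers({ 1: {}, 2: {} });
const invalidErrors = validateModelVersionNumbers({ 2: {}, 4: {} });
```

With these inputs, `validErrors` is empty and `invalidErrors` contains one entry per misnumbered version.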
## Structure of a model version
[Model versions](https://github.com/elastic/kibana/blob/9b330e493216e8dde3166451e4714966f63f5ab7/packages/core/saved-objects/core-saved-objects-server/src/model_version/model_version.ts#L12-L20)
are not just functions as the previous migrations were, but structured objects describing how the version behaves and what changed since the last one.
*A base example of what a model version can look like:*
```ts
const myType: SavedObjectsType = {
  name: 'test',
  switchToModelVersionAt: '8.10.0',
  modelVersions: {
    1: {
      changes: [
        {
          type: 'mappings_addition',
          addedMappings: {
            someNewField: { type: 'text' },
          },
        },
        {
          type: 'data_backfill',
          transform: someBackfillFunction,
        },
      ],
      schemas: {
        forwardCompatibility: fcSchema,
        create: createSchema,
      },
    },
  },
  // ...other mandatory properties
};
```
**Note:** Having multiple changes of the same type for a given version is supported by design,
to allow merging changes coming from different sources (in preparation for a possible higher-level API).
*This definition would be perfectly valid:*
```ts
const version1: SavedObjectsModelVersion = {
  changes: [
    {
      type: 'mappings_addition',
      addedMappings: {
        someNewField: { type: 'text' },
      },
    },
    {
      type: 'mappings_addition',
      addedMappings: {
        anotherNewField: { type: 'text' },
      },
    },
  ],
};
```
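To illustrate why several changes of the same type can coexist, here is a rough sketch of how a consumer could fold every `mappings_addition` change of a version into a single mappings object. The types are simplified stand-ins and `collectAddedMappings` is a hypothetical helper, not something exposed by Core.

```ts
// Simplified, hypothetical shapes; the real types live in Core.
interface MappingsAdditionChange {
  type: 'mappings_addition';
  addedMappings: Record<string, { type: string }>;
}
interface DataBackfillChange {
  type: 'data_backfill';
}
type ModelChange = MappingsAdditionChange | DataBackfillChange;

// Collects the mappings added by every `mappings_addition` change of a version.
function collectAddedMappings(changes: ModelChange[]): Record<string, { type: string }> {
  const merged: Record<string, { type: string }> = {};
  for (const change of changes) {
    if (change.type === 'mappings_addition') {
      for (const field of Object.keys(change.addedMappings)) {
        merged[field] = change.addedMappings[field];
      }
    }
  }
  return merged;
}

const addedInVersion1 = collectAddedMappings([
  { type: 'mappings_addition', addedMappings: { someNewField: { type: 'text' } } },
  { type: 'mappings_addition', addedMappings: { anotherNewField: { type: 'text' } } },
]);
// addedInVersion1 now holds both `someNewField` and `anotherNewField`
```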
A model version is currently composed of two main properties:
### changes
[link to the TS doc for `changes`](https://github.com/elastic/kibana/blob/9b330e493216e8dde3166451e4714966f63f5ab7/packages/core/saved-objects/core-saved-objects-server/src/model_version/model_version.ts#L21-L51)
Describes the list of changes applied during this version.
**Important:** This is the part that replaces the old migration system, and allows defining when a version adds new mappings,
mutates the documents, or makes other type-related changes.
The current types of changes are:
#### - mappings_addition
Used to define new mappings introduced in a given version.
*Usage example:*
```ts
const change: SavedObjectsModelMappingsAdditionChange = {
  type: 'mappings_addition',
  addedMappings: {
    newField: { type: 'text' },
    existingNestedField: {
      properties: {
        newNestedProp: { type: 'keyword' },
      },
    },
  },
};
```
**note:** *When adding mappings, the root `type.mappings` must also be updated accordingly (as it was done previously).*
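Because the root mappings and the `mappings_addition` changes are maintained separately, a small consistency check in a unit test can catch a forgotten update. The sketch below uses a simplified mapping shape and a hypothetical `findMissingMappings` helper; neither is provided by Core.

```ts
// Simplified, hypothetical mapping shape for illustration.
interface FieldMapping {
  type?: string;
  properties?: Record<string, FieldMapping>;
}

// Returns the dotted paths declared in `added` that are missing from `root`.
function findMissingMappings(
  added: Record<string, FieldMapping>,
  root: Record<string, FieldMapping>,
  prefix: string = ''
): string[] {
  const missing: string[] = [];
  for (const name of Object.keys(added)) {
    const path = prefix + name;
    const rootMapping = root[name];
    if (rootMapping === undefined) {
      missing.push(path);
    } else if (added[name].properties) {
      // recurse into nested object mappings
      missing.push(
        ...findMissingMappings(added[name].properties ?? {}, rootMapping.properties ?? {}, `${path}.`)
      );
    }
  }
  return missing;
}

// root mappings where the nested property was forgotten
const rootMappings: Record<string, FieldMapping> = {
  newField: { type: 'text' },
  existingNestedField: { properties: {} },
};
const addedMappings: Record<string, FieldMapping> = {
  newField: { type: 'text' },
  existingNestedField: { properties: { newNestedProp: { type: 'keyword' } } },
};
const missing = findMissingMappings(addedMappings, rootMappings);
// missing is ['existingNestedField.newNestedProp']
```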
#### - mappings_deprecation
Used to flag mappings as no longer being used and ready to be removed.
*Usage example:*
```ts
let change: SavedObjectsModelMappingsDeprecationChange = {
  type: 'mappings_deprecation',
  deprecatedMappings: ['someDeprecatedField', 'someNested.deprecatedField'],
};
```
**note:** *It is currently not possible to remove fields from an existing index's mapping (without reindexing into another index),
so the mappings flagged with this change type won't be deleted for now, but this should still be used to allow
our system to clean the mappings once upstream (ES) unblocks us.*
#### - data_backfill
Used to populate fields (indexed or not) added in the same version.
*Usage example:*
```ts
let change: SavedObjectsModelDataBackfillChange = {
  type: 'data_backfill',
  transform: (document) => {
    document.attributes.someAddedField = 'defaultValue';
    return { document };
  },
};
```
**note:** *Even if no check is performed to ensure it, this type of model change should only be used to
backfill newly introduced fields.*
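Since nothing enforces that restriction, one defensive option is to write the transform so that it only touches the new field and never overwrites a value that is already present. This is a sketch under assumptions: the `Doc` shape is a simplified stand-in for the real document type, and the guard is a design choice, not a requirement of the API.

```ts
// Simplified, hypothetical document shape for illustration.
interface Doc {
  id: string;
  attributes: Record<string, unknown>;
}

// Backfill that only populates the field introduced in this version and leaves
// every other attribute, and any value already present, untouched.
const backfillSomeAddedField = (document: Doc): { document: Doc } => {
  if (document.attributes.someAddedField === undefined) {
    document.attributes.someAddedField = 'defaultValue';
  }
  return { document };
};

const withoutField = backfillSomeAddedField({ id: 'a', attributes: { foo: 'bar' } }).document;
const withField = backfillSomeAddedField({ id: 'b', attributes: { someAddedField: 'custom' } }).document;
// withoutField.attributes.someAddedField is 'defaultValue'
// withField.attributes.someAddedField is still 'custom'
```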
### schemas
[link to the TS doc for `schemas`](https://github.com/elastic/kibana/blob/9b330e493216e8dde3166451e4714966f63f5ab7/packages/core/saved-objects/core-saved-objects-server/src/model_version/schemas.ts#L11-L16)
The schemas associated with this version. Schemas are used to validate or convert SO documents at various
stages of their lifecycle.
The currently available schemas are:
#### forwardCompatibility
This is a new concept introduced by model versions. This schema is used for inter-version compatibility.
When retrieving a savedObject document from an index, if the version of the document is higher than the latest version
known to the Kibana instance, the document will go through the `forwardCompatibility` schema of the associated model version.
**Important:** This conversion mechanism shouldn't assert the data itself, and should only strip unknown fields to convert the document to
the **shape** of the document at the given version.
Basically, this schema should keep all the known fields of a given version, and remove all the unknown fields, without throwing.
The forward compatibility schema can be implemented in two different ways:
1. Using `config-schema`
*Example of schema for a version having two fields: someField and anotherField*
```ts
const versionSchema = schema.object(
  {
    someField: schema.maybe(schema.string()),
    anotherField: schema.maybe(schema.string()),
  },
  { unknowns: 'ignore' }
);
```
**Important:** Note the `{ unknowns: 'ignore' }` in the schema's options. This is required when using
`config-schema` based schemas, as this is what will evict the additional fields without throwing an error.
2. Using a plain JavaScript function
*Example of schema for a version having two fields: someField and anotherField*
```ts
// assuming `pick` is lodash's helper of the same name
import { pick } from 'lodash';

const versionSchema: SavedObjectModelVersionEvictionFn = (attributes) => {
  const knownFields = ['someField', 'anotherField'];
  return pick(attributes, knownFields);
};
```
**note:** *Even if highly recommended, implementing this schema is not strictly required. Type owners can manage unknown fields
and inter-version compatibility themselves in their service layer instead.*
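The read-time behavior described in this section can be sketched without any dependency. This is an illustration only: `keepKnownFields` and `readAttributes` are hypothetical names, and the real logic in Core is more involved.

```ts
type Attributes = Record<string, unknown>;
type EvictionFn = (attributes: Attributes) => Attributes;

// Builds an eviction function that keeps only the listed fields, without throwing.
function keepKnownFields(knownFields: string[]): EvictionFn {
  return (attributes) => {
    const result: Attributes = {};
    for (const field of knownFields) {
      if (field in attributes) {
        result[field] = attributes[field];
      }
    }
    return result;
  };
}

// Hypothetical read path: documents written by a newer version are narrowed to
// the shape known by this instance; other documents are returned as-is.
function readAttributes(
  docVersion: number,
  attributes: Attributes,
  latestKnownVersion: number,
  forwardCompatibility: EvictionFn
): Attributes {
  return docVersion > latestKnownVersion ? forwardCompatibility(attributes) : attributes;
}

const fcVersion1 = keepKnownFields(['someField', 'anotherField']);
const narrowed = readAttributes(
  2, // the document was written at model version 2
  { someField: 'a', anotherField: 'b', addedLater: 'c' },
  1, // this instance only knows model version 1
  fcVersion1
);
// narrowed is { someField: 'a', anotherField: 'b' }
```

Note that the eviction function never validates values: it only narrows the set of keys, which matches the "strip, don't assert" guidance above.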
#### create
This is a direct replacement for [the old SavedObjectType.schemas](https://github.com/elastic/kibana/blob/9b330e493216e8dde3166451e4714966f63f5ab7/packages/core/saved-objects/core-saved-objects-server/src/saved_objects_type.ts#L75-L82)
definition, now directly included in the model version definition.
As a refresher the `create` schema is a `@kbn/config-schema` object-type schema, and is used to validate the properties of the document
during `create` and `bulkCreate` operations.
**note:** *Implementing this schema is optional, but still recommended, as otherwise there will be no validation when
importing objects.*
## Use-case examples
These are examples of the migration scenarios currently supported (out of the box) by the system.
**note:** *more complex scenarios (e.g. field mutation by copy/sync) could already be implemented, but without
the proper tooling exposed from Core, most of the work related to sync and compatibility would have to be
implemented in the domain layer of the type owners, which is why we're not documenting them yet.*
### Adding a non-indexed field without default value
We are currently in model version 1, and our type has 2 indexed fields defined: `foo` and `bar`.
The definition of the type at version 1 would look like:
```ts
const myType: SavedObjectsType = {
  name: 'test',
  namespaceType: 'single',
  switchToModelVersionAt: '8.10.0',
  modelVersions: {
    // initial (and current) model version
    1: {
      changes: [],
      schemas: {
        // FC schema defining the known fields (indexed or not) for this version
        forwardCompatibility: schema.object(
          { foo: schema.string(), bar: schema.string() },
          { unknowns: 'ignore' } // note the `unknowns: 'ignore'` option, which is how we're evicting the unknown fields
        ),
        // schema that will be used to validate input during `create` and `bulkCreate`
        create: schema.object(
          { foo: schema.string(), bar: schema.string() },
        )
      },
    },
  },
  mappings: {
    properties: {
      foo: { type: 'text' },
      bar: { type: 'text' },
    },
  },
};
```
From here, say we want to introduce a new `dolly` field that is not indexed, and that we don't need to populate with a default value.
To achieve that, we need to introduce a new model version, where the only thing to do is to define the
associated schemas so that they include this new field.
The added model version would look like:
```ts
// the new model version adding the `dolly` field
let modelVersion2: SavedObjectsModelVersion = {
  // not an indexed field, no data backfill, so changes are actually empty
  changes: [],
  schemas: {
    // the only addition in this model version: taking the new field into account for the schemas
    forwardCompatibility: schema.object(
      { foo: schema.string(), bar: schema.string(), dolly: schema.string() },
      { unknowns: 'ignore' } // note the `unknowns: 'ignore'` option, which is how we're evicting the unknown fields
    ),
    create: schema.object(
      { foo: schema.string(), bar: schema.string(), dolly: schema.string() },
    )
  },
};
```
The full type definition after the addition of the new model version:
```ts
const myType: SavedObjectsType = {
  name: 'test',
  namespaceType: 'single',
  switchToModelVersionAt: '8.10.0',
  modelVersions: {
    1: {
      changes: [],
      schemas: {
        forwardCompatibility: schema.object(
          { foo: schema.string(), bar: schema.string() },
          { unknowns: 'ignore' }
        ),
        create: schema.object(
          { foo: schema.string(), bar: schema.string() },
        )
      },
    },
    2: {
      changes: [],
      schemas: {
        forwardCompatibility: schema.object(
          { foo: schema.string(), bar: schema.string(), dolly: schema.string() },
          { unknowns: 'ignore' }
        ),
        create: schema.object(
          { foo: schema.string(), bar: schema.string(), dolly: schema.string() },
        )
      },
    },
  },
  mappings: {
    properties: {
      foo: { type: 'text' },
      bar: { type: 'text' },
    },
  },
};
```
### Adding an indexed field without default value
This scenario is fairly close to the previous one. The difference is that working with an indexed field means
adding a `mappings_addition` change and also updating the root mappings accordingly.
To reuse the previous example, let's say the `dolly` field we want to add would need to be indexed instead.
In that case, the new version needs to do the following:
- add a `mappings_addition` type change to define the new mappings
- update the root `mappings` accordingly
- add the updated schemas as we did for the previous example
The new version definition would look like:
```ts
let modelVersion2: SavedObjectsModelVersion = {
  // add a change defining the mapping for the new field
  changes: [
    {
      type: 'mappings_addition',
      addedMappings: {
        dolly: { type: 'text' },
      },
    },
  ],
  schemas: {
    // adding the new field to the forwardCompatibility schema
    forwardCompatibility: schema.object(
      { foo: schema.string(), bar: schema.string(), dolly: schema.string() },
      { unknowns: 'ignore' }
    ),
    create: schema.object(
      { foo: schema.string(), bar: schema.string(), dolly: schema.string() },
    )
  },
};
```
As said, we will also need to update the root mappings definition:
```ts
mappings: {
  properties: {
    foo: { type: 'text' },
    bar: { type: 'text' },
    dolly: { type: 'text' },
  },
},
```
The full type definition after the addition of model version 2 would be:
```ts
const myType: SavedObjectsType = {
  name: 'test',
  namespaceType: 'single',
  switchToModelVersionAt: '8.10.0',
  modelVersions: {
    1: {
      changes: [
        {
          type: 'mappings_addition',
          addedMappings: {
            foo: { type: 'text' },
            bar: { type: 'text' },
          },
        },
      ],
      schemas: {
        forwardCompatibility: schema.object(
          { foo: schema.string(), bar: schema.string() },
          { unknowns: 'ignore' }
        ),
        create: schema.object(
          { foo: schema.string(), bar: schema.string() },
        )
      },
    },
    2: {
      changes: [
        {
          type: 'mappings_addition',
          addedMappings: {
            dolly: { type: 'text' },
          },
        },
      ],
      schemas: {
        forwardCompatibility: schema.object(
          { foo: schema.string(), bar: schema.string(), dolly: schema.string() },
          { unknowns: 'ignore' }
        ),
        create: schema.object(
          { foo: schema.string(), bar: schema.string(), dolly: schema.string() },
        )
      },
    },
  },
  mappings: {
    properties: {
      foo: { type: 'text' },
      bar: { type: 'text' },
      dolly: { type: 'text' },
    },
  },
};
```
### Adding an indexed field with a default value
Now a slightly different scenario where we'd like to populate the newly introduced field with a default value.
In that case, we'd need to add an additional `data_backfill` change to populate the new field's value
(in addition to the `mappings_addition` one):
```ts
let modelVersion2: SavedObjectsModelVersion = {
  changes: [
    // setting the `dolly` field's default value.
    {
      type: 'data_backfill',
      transform: (document) => {
        document.attributes.dolly = 'default_value';
        return { document };
      },
    },
    // define the mappings for the new field
    {
      type: 'mappings_addition',
      addedMappings: {
        dolly: { type: 'text' },
      },
    },
  ],
  schemas: {
    // define `dolly` as a known field in the schema
    forwardCompatibility: schema.object(
      { foo: schema.string(), bar: schema.string(), dolly: schema.string() },
      { unknowns: 'ignore' }
    ),
    create: schema.object(
      { foo: schema.string(), bar: schema.string(), dolly: schema.string() },
    )
  },
};
```
The full type definition would look like:
```ts
const myType: SavedObjectsType = {
  name: 'test',
  namespaceType: 'single',
  switchToModelVersionAt: '8.10.0',
  modelVersions: {
    1: {
      changes: [
        {
          type: 'mappings_addition',
          addedMappings: {
            foo: { type: 'text' },
            bar: { type: 'text' },
          },
        },
      ],
      schemas: {
        forwardCompatibility: schema.object(
          { foo: schema.string(), bar: schema.string() },
          { unknowns: 'ignore' }
        ),
        create: schema.object(
          { foo: schema.string(), bar: schema.string() },
        )
      },
    },
    2: {
      changes: [
        {
          type: 'data_backfill',
          transform: (document) => {
            document.attributes.dolly = 'default_value';
            return { document };
          },
        },
        {
          type: 'mappings_addition',
          addedMappings: {
            dolly: { type: 'text' },
          },
        },
      ],
      schemas: {
        forwardCompatibility: schema.object(
          { foo: schema.string(), bar: schema.string(), dolly: schema.string() },
          { unknowns: 'ignore' }
        ),
        create: schema.object(
          { foo: schema.string(), bar: schema.string(), dolly: schema.string() },
        )
      },
    },
  },
  mappings: {
    properties: {
      foo: { type: 'text' },
      bar: { type: 'text' },
      dolly: { type: 'text' },
    },
  },
};
```
**Note:** *If the field was non-indexed, we would simply omit the `mappings_addition` change and leave the root mappings unchanged (as done in the first example).*
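To make the effect of a `data_backfill` change more concrete, here is a rough sketch of how a document written at version 1 could be brought to version 2. Everything in it is an assumption made for illustration: the simplified types, the `modelVersion` property on the document, and the `upgradeDocument` helper are hypothetical and do not reflect Core's actual migration algorithm.

```ts
// Simplified, hypothetical shapes; Core's real types and algorithm are richer.
interface SavedDoc {
  attributes: Record<string, unknown>;
  modelVersion: number;
}
type Change =
  | { type: 'mappings_addition'; addedMappings: Record<string, { type: string }> }
  | { type: 'data_backfill'; transform: (document: SavedDoc) => { document: SavedDoc } };
interface ModelVersion {
  changes: Change[];
}

// Upgrades a document by running the `data_backfill` changes of every model
// version newer than the one the document was written with, in ascending order.
function upgradeDocument(doc: SavedDoc, modelVersions: Record<number, ModelVersion>): SavedDoc {
  const versions = Object.keys(modelVersions)
    .map((key) => Number(key))
    .sort((a, b) => a - b);
  let current = doc;
  for (const version of versions) {
    if (version <= doc.modelVersion) continue;
    for (const change of modelVersions[version].changes) {
      if (change.type === 'data_backfill') {
        current = change.transform(current).document;
      }
    }
    current = { ...current, modelVersion: version };
  }
  return current;
}

const exampleVersions: Record<number, ModelVersion> = {
  1: { changes: [] },
  2: {
    changes: [
      {
        type: 'data_backfill',
        transform: (document: SavedDoc) => {
          document.attributes.dolly = 'default_value';
          return { document };
        },
      },
      { type: 'mappings_addition', addedMappings: { dolly: { type: 'text' } } },
    ],
  },
};

const upgraded = upgradeDocument({ attributes: { foo: 'a', bar: 'b' }, modelVersion: 1 }, exampleVersions);
// upgraded.attributes.dolly is 'default_value' and upgraded.modelVersion is 2
```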


@@ -81,7 +81,7 @@ export interface SavedObjectsModelMappingsDeprecationChange {
* A {@link SavedObjectsModelChange | model change} used to backfill fields introduced in the same model version.
*
* @example
* ```ts
* let change: SavedObjectsModelDataBackfillChange = {
* type: 'data_backfill',
* transform: (document) => {


@@ -22,8 +22,10 @@ export interface SavedObjectsModelVersion {
* The list of changes associated with this version.
*
* Model version changes are defined via low-level components, allowing the use of composition
* to describe the list of changes bound to a given version.
*
* @remark Having multiple changes of the same type in a version's list of changes is supported
* by design to allow merging different sources.
*
* @example Adding a new indexed field with a default value
* ```ts
@@ -46,6 +48,26 @@ export interface SavedObjectsModelVersion {
* };
* ```
*
* @example A version with multiple mappings addition coming from different changes
* ```ts
* const version1: SavedObjectsModelVersion = {
* changes: [
* {
* type: 'mappings_addition',
* addedMappings: {
* someNewField: { type: 'text' },
* },
* },
* {
* type: 'mappings_addition',
* addedMappings: {
* anotherNewField: { type: 'text' },
* },
* },
* ],
* };
* ```
*
* See {@link SavedObjectsModelChange | changes} for more information and examples.
*/
changes: SavedObjectsModelChange[];