Updates internal dev docs for Saved Objects (#178058)

Fixes [#178060](https://github.com/elastic/kibana/issues/178060).
Updates the internal developer docs for transitions to new model
versions.
The end-user docs were updated in
https://github.com/elastic/kibana/pull/176970

---------

Co-authored-by: Jean-Louis Leysens <jloleysens@gmail.com>
Committed by Christiane (Tina) Heiligers on 2024-03-07 (parent 2c32fafc08, commit bfee4d79e8).
3 changed files with 227 additions and 297 deletions


@ -33,10 +33,11 @@ all the "children" will be automatically included. However, when a "child" is ex
## Migrations and Backward compatibility
As your plugin evolves, you may need to change your Saved Object type in a breaking way (for example, changing the type of an attribute, or removing
an attribute). If that happens, you should write a new model version to upgrade the Saved Objects that existed prior to the change.
<DocLink id="kibDevTutorialSavedObject" section="migrations" text="Defining model versions" />.
## Security


@ -19,6 +19,7 @@ import { SavedObjectsType } from 'src/core/server';
export const dashboardVisualization: SavedObjectsType = {
name: 'dashboard_visualization', [1]
hidden: true,
switchToModelVersionAt: '8.10.0', // this is the default, feel free to omit it unless you intend to switch to using model versions before 8.10.0
namespaceType: 'multiple-isolated', [2]
mappings: {
dynamic: false,
@ -31,9 +32,9 @@ export const dashboardVisualization: SavedObjectsType = {
},
},
},
  modelVersions: {
    1: dashboardVisualizationModelVersionV1,
    2: dashboardVisualizationModelVersionV2,
},
};
```
@ -91,7 +92,7 @@ export const dashboardVisualization: SavedObjectsType = {
},
},
},
  modelVersions: { ... },
};
```
@ -118,11 +119,11 @@ Will result in the following mappings being applied to the .kibana index:
}
}
```
Do not use field mappings like you would use data types for the columns of a SQL database. Instead, field mappings are analogous to a
SQL index. Only specify field mappings for the fields you wish to search on or query. By specifying `dynamic: false`
at any level of your mappings, Elasticsearch will accept and store any other fields even if they are not specified in your mappings.
Since Elasticsearch has a default limit of 1000 fields per index, plugins should carefully consider the
fields they add to the mappings. Similarly, Saved Object types should never use `dynamic: true`, as this can cause an arbitrary
number of fields to be added to the .kibana index.
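Since every mapped field counts toward that shared index-wide limit, it can help to sanity-check how many fields a type contributes. Below is a minimal plain-TypeScript sketch (a hypothetical helper, not a Kibana API) that counts the fields declared in an Elasticsearch-style mappings object:

```typescript
// Hypothetical helper (not part of Kibana): counts the fields declared in an
// Elasticsearch-style `properties` object, including nested object fields, so
// a plugin can estimate its contribution toward the index-wide field limit.
type FieldMapping = { type?: string; properties?: Record<string, FieldMapping> };

function countMappedFields(properties: Record<string, FieldMapping>): number {
  let count = 0;
  for (const mapping of Object.values(properties)) {
    count += 1; // the field itself
    if (mapping.properties) {
      count += countMappedFields(mapping.properties); // recurse into sub-fields
    }
  }
  return count;
}

// Example: two top-level fields, one of which has a nested sub-field.
const properties = {
  title: { type: 'text' },
  panel: { properties: { id: { type: 'keyword' } } },
};

console.log(countMappedFields(properties)); // → 3
```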
@ -154,146 +155,104 @@ router.get(
}
);
```
[1] Note how `dashboard.panels[0].visualization` stores the name property of the reference (not the id directly) to be able to uniquely
identify this reference. This guarantees that the id the reference points to always remains up to date. If a
visualization id were stored directly in `dashboard.panels[0].visualization`, there would be a risk that this id gets updated without
updating the reference in the references array.
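To make the name-based indirection concrete, here is a small sketch (a hypothetical helper and sample data, not Kibana code) of how a panel's stored reference *name* is resolved through the `references` array to the current id:

```typescript
// Hypothetical illustration of name-based reference lookup (not a Kibana API).
interface SavedObjectReference {
  name: string;
  type: string;
  id: string;
}

function resolvePanelVisualizationId(
  panelRefName: string,
  references: SavedObjectReference[]
): string | undefined {
  // The panel stores only the reference *name*; the id lives in `references`,
  // so updating the reference entry updates every place that points at it.
  return references.find((ref) => ref.name === panelRefName)?.id;
}

const references: SavedObjectReference[] = [
  { name: 'panel_0', type: 'visualization', id: 'viz-id-1' },
];

console.log(resolvePanelVisualizationId('panel_0', references)); // → 'viz-id-1'
```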
## Migrations
Saved Objects support schema changes between Kibana versions, which we call migrations, implemented with model versions.
Model version transitions are applied when a Kibana installation is upgraded from one version to a newer version, when exports are imported via
the Saved Objects Management UI, or when a new object is created via the HTTP API.
### Defining model versions
Model versions are bound to a given [savedObject type](https://github.com/elastic/kibana/blob/9b330e493216e8dde3166451e4714966f63f5ab7/packages/core/saved-objects/core-saved-objects-server/src/saved_objects_type.ts#L22-L27).
When registering a SO type, a [modelVersions](https://github.com/elastic/kibana/blob/9a6a2ccdff619f827b31c40dd9ed30cb27203da7/packages/core/saved-objects/core-saved-objects-server/src/saved_objects_type.ts#L138-L177)
property is available. This attribute is a map of version numbers to [SavedObjectsModelVersion](https://github.com/elastic/kibana/blob/9a6a2ccdff619f827b31c40dd9ed30cb27203da7/packages/core/saved-objects/core-saved-objects-server/src/model_version/model_version.ts#L12-L20),
which is the top-level type/container used to define model versions.
The modelVersions map is of the form `{ [version: number] => versionDefinition }`, using a single integer to identify each version definition.
The first version must be numbered as version 1, incrementing by one for each new version.
That way:
- SO type versions are decoupled from stack versioning
- SO type versions are independent between types
**src/plugins/my_plugin/server/saved_objects/dashboard_visualization.ts**
```ts
import { schema } from '@kbn/config-schema';
import { SavedObjectsType } from 'src/core/server';

const schemaV1 = schema.object({ title: schema.string({ maxLength: 50, minLength: 1 }) });
const schemaV2 = schemaV1.extends({
  description: schema.maybe(schema.string({ maxLength: 200, minLength: 1 })),
});

export const dashboardVisualization: SavedObjectsType = {
  name: 'dashboard_visualization',
  ...
  mappings: {
    dynamic: false,
    properties: {
      title: { type: 'text' }, // This mapping was added before model versions
      description: { type: 'text' }, // mappings introduced in v2
    },
  },
  modelVersions: {
    1: {
      // Sometimes no changes are needed in the initial version, but you may have
      // pre-existing mappings or data that must be transformed in some way.
      // In this case, title already has mappings defined.
      changes: [],
      schemas: {
        // The forward-compatible schema should allow any future version of
        // this SO to be converted to this version. Since we are using
        // @kbn/config-schema, we opt in to unknowns to allow the schema to
        // successfully "downgrade" future SOs to this version.
        forwardCompatibility: schemaV1.extends({}, { unknowns: 'ignore' }),
        create: schemaV1,
      },
    },
    2: {
      changes: [
        // In this second version we added new mappings for the description field.
        {
          type: 'mappings_addition',
          addedMappings: {
            description: { type: 'text' },
          },
        },
        {
          type: 'data_backfill',
          backfillFn: (doc) => {
            return {
              attributes: {
                description: 'my default description',
              },
            };
          },
        },
      ],
      schemas: {
        forwardCompatibility: schemaV2.extends({}, { unknowns: 'ignore' }),
        create: schemaV2,
      },
    },
  },
};
```
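Conceptually, a `data_backfill` change returns a partial document that the migration algorithm merges over the existing attributes. The following plain-TypeScript sketch illustrates those semantics only; it is not the actual Kibana implementation, and `applyBackfill` is a hypothetical name:

```typescript
// Simplified illustration of how a data_backfill result is merged into a
// document's attributes; the real merge logic lives in Kibana core.
interface Doc {
  attributes: Record<string, unknown>;
}
type BackfillFn = (doc: Doc) => { attributes: Record<string, unknown> };

function applyBackfill(doc: Doc, backfillFn: BackfillFn): Doc {
  const { attributes: backfilled } = backfillFn(doc);
  return {
    ...doc,
    // Backfilled values are merged over the existing attributes, which is why
    // data_backfill should only target newly introduced fields.
    attributes: { ...doc.attributes, ...backfilled },
  };
}

const doc: Doc = { attributes: { title: 'my dashboard' } };
const migrated = applyBackfill(doc, () => ({
  attributes: { description: 'my default description' },
}));

console.log(migrated.attributes);
// → { title: 'my dashboard', description: 'my default description' }
```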
### Testing model versions
Bugs in model version transitions cause downtime for our users and therefore have a very high impact. Follow the <DocLink id="kibDevTutorialTestingPlugins" section="saved-objects-model-versions" text="Saved Objects model versions"/> section in the plugin testing guide.
### How to opt-out of the global savedObjects APIs?
There are two options, depending on the amount of flexibility you need:
For complete control over your HTTP APIs and custom handling, declare your type as `hidden`, as shown in the example.
The other option, which allows you to build your own HTTP APIs while still using the client as-is, is to declare your type as hidden from the global saved objects HTTP APIs with `hiddenFromHttpApis: true`:
```ts
@ -304,24 +263,12 @@ export const foo: SavedObjectsType = {
hidden: false, [1]
hiddenFromHttpApis: true, [2]
namespaceType: 'multiple-isolated',
  mappings: { ... },
  modelVersions: { ... },
  ...
};
```
[1] Needs to be `false` to use the `hiddenFromHttpApis` option
[2] Set this to `true` to build your own HTTP API and have complete control over the route handler.
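The effect of `hiddenFromHttpApis` can be pictured as a guard in front of the generic routes: the type stays visible to the Saved Objects client, but the global HTTP handlers refuse it. The following is a hypothetical sketch of that check (invented names, not Kibana's actual route code):

```typescript
// Hypothetical sketch of the hiddenFromHttpApis gate (not Kibana's real code).
interface TypeRegistration {
  name: string;
  hidden: boolean;
  hiddenFromHttpApis?: boolean;
}

const registry = new Map<string, TypeRegistration>([
  // `hidden` must be false for hiddenFromHttpApis to apply (see [1] above).
  ['foo', { name: 'foo', hidden: false, hiddenFromHttpApis: true }],
  ['bar', { name: 'bar', hidden: false }],
]);

// A generic HTTP route handler would reject types opted out of the global
// APIs, while plugin-defined routes can still use the SO client for them.
function isAccessibleViaGlobalHttpApis(type: string): boolean {
  const reg = registry.get(type);
  return !!reg && !reg.hidden && !reg.hiddenFromHttpApis;
}

console.log(isAccessibleViaGlobalHttpApis('foo')); // → false
console.log(isAccessibleViaGlobalHttpApis('bar')); // → true
```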


@ -794,202 +794,184 @@ Kibana and esArchiver to load fixture data into Elasticsearch.
_todo: fully worked out example_
### Saved Objects model versions
_Also see <DocLink id="kibDevTutorialSavedObject" section="model-versions" text="Defining model versions"/>._
Model versions definitions are more structured than the legacy migration functions, which makes them harder
to test without the proper tooling. This is why a set of testing tools and utilities are exposed
from the `@kbn/core-test-helpers-model-versions` package, to help properly test the logic associated
with model versions and their associated transformations.
#### Tooling for unit tests
For unit tests, the package exposes utilities to easily test the impact of transforming documents
from one model version to another, either upward or backward.
##### Model version test migrator
The `createModelVersionTestMigrator` helper allows you to create a test migrator that can be used to
test model version changes between versions, by transforming documents the same way the migration
algorithm would during an upgrade.
**Example:**
```ts
import {
  createModelVersionTestMigrator,
  type ModelVersionTestMigrator
} from '@kbn/core-test-helpers-model-versions';

const mySoTypeDefinition = someSoType();

describe('mySoTypeDefinition model version transformations', () => {
  let migrator: ModelVersionTestMigrator;

  beforeEach(() => {
    migrator = createModelVersionTestMigrator({ type: mySoTypeDefinition });
  });

  describe('Model version 2', () => {
    it('properly backfills the expected fields when converting from v1 to v2', () => {
      const obj = createSomeSavedObject();

      const migrated = migrator.migrate({
        document: obj,
        fromVersion: 1,
        toVersion: 2,
      });

      expect(migrated.properties).toEqual(expectedV2Properties);
    });

    it('properly removes the expected fields when converting from v2 to v1', () => {
      const obj = createSomeSavedObject();

      const migrated = migrator.migrate({
        document: obj,
        fromVersion: 2,
        toVersion: 1,
      });

      expect(migrated.properties).toEqual(expectedV1Properties);
    });
  });
});
```
#### Tooling for integration tests
During integration tests, we can boot a real Elasticsearch cluster, allowing us to manipulate SO
documents in a way almost similar to how it would be done at production runtime. With integration
tests, we can even simulate the cohabitation of two Kibana instances with different model versions
to assert the behavior of their interactions.
##### Model version test bed
The package exposes a `createModelVersionTestBed` function that can be used to fully set up a
test bed for model version integration testing. It can be used to start and stop the ES server,
and to initiate the migration between the two versions we're testing.
**Example:**
```ts
import {
  createModelVersionTestBed,
  type ModelVersionTestKit
} from '@kbn/core-test-helpers-model-versions';

describe('myIntegrationTest', () => {
  const testbed = createModelVersionTestBed();
  let testkit: ModelVersionTestKit;

  beforeAll(async () => {
    await testbed.startES();
  });

  afterAll(async () => {
    await testbed.stopES();
  });

  beforeEach(async () => {
    // prepare the test, preparing the index and performing the SO migration
    testkit = await testbed.prepareTestKit({
      savedObjectDefinitions: [{
        definition: mySoTypeDefinition,
        // the model version that will be used for the "before" version
        modelVersionBefore: 1,
        // the model version that will be used for the "after" version
        modelVersionAfter: 2,
      }]
    });
  });

  afterEach(async () => {
    if (testkit) {
      // delete the indices between each test to perform the migration again
      await testkit.tearsDown();
    }
  });

  it('can be used to test model version cohabitation', async () => {
    // last registered version is `1` (modelVersionBefore)
    const repositoryV1 = testkit.repositoryBefore;
    // last registered version is `2` (modelVersionAfter)
    const repositoryV2 = testkit.repositoryAfter;

    // do something with the two repositories, e.g.
    await repositoryV1.create(someAttrs, { id });
    const v2docReadFromV1 = await repositoryV2.get('my-type', id);
    expect(v2docReadFromV1.attributes).toEqual(whatIExpect);
  });
});
```
**Limitations:**
Because the test bed is only creating the parts of Core required to instantiate the two SO
repositories, and because we're not able to properly load all plugins (for proper isolation), the integration
test bed currently has some limitations:
- no extensions are enabled
- no security
- no encryption
- no spaces
- all SO types will be using the same SO index
## Limitations and edge cases in serverless environments
The serverless environment, and the fact that upgrades in such environments are performed in a way
where, at some point, the old and new versions of the application are living in cohabitation, leads
to some particularities regarding the way the SO APIs work, and to some limitations / edge cases
that we need to document.
### Using the `fields` option of the `find` savedObjects API
By default, the `find` API (as any other SO API returning documents) will migrate all documents before
returning them, to ensure that documents can be used by both versions during a cohabitation (e.g. an old
node searching for documents already migrated, or a new node searching for documents not yet migrated).
However, when using the `fields` option of the `find` API, the documents can't be migrated, as some
model version changes can't be applied against a partial set of attributes. For this reason, when the
`fields` option is provided, the documents returned from `find` will **not** be migrated.
This is why, when using this option, the API consumer needs to make sure that *all* the fields passed
to the `fields` option **were already present in the prior model version**. Otherwise, it may lead to inconsistencies
during upgrades, where newly introduced or backfilled fields may not necessarily appear in the documents returned
from the `find` API when the option is used.
(*note*: both the previous and next version of Kibana must follow this rule)
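The pitfall can be illustrated with a simplified plain-TypeScript sketch (hypothetical functions, not Kibana code): a field that was only introduced and backfilled in version 2 will be missing from v1 documents fetched with `fields`, because no migration runs on partial results:

```typescript
// Simplified illustration (not Kibana code) of why `fields` must only name
// attributes that already existed in the prior model version.
interface StoredDoc {
  attributes: Record<string, unknown>;
}

// A v1 document on disk: `description` is only introduced (and backfilled) in v2.
const storedV1Doc: StoredDoc = { attributes: { title: 'my dashboard' } };

// Full `find`: the document is migrated on the fly, so the backfill applies.
function findFull(doc: StoredDoc): StoredDoc {
  return { attributes: { ...doc.attributes, description: 'my default description' } };
}

// `find` with `fields`: the partial document cannot be migrated, so the raw
// stored values are returned as-is, and missing fields stay missing.
function findWithFields(doc: StoredDoc, fields: string[]): Record<string, unknown> {
  return Object.fromEntries(
    fields.filter((f) => f in doc.attributes).map((f) => [f, doc.attributes[f]])
  );
}

console.log(findFull(storedV1Doc).attributes.description); // → 'my default description'
console.log(findWithFields(storedV1Doc, ['description'])); // → {} (field missing!)
```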
### Using `bulkUpdate` for fields with large `json` blobs
The savedObjects `bulkUpdate` API will update documents client-side and then reindex the updated documents.
These update operations are done in-memory, and cause memory constraint issues when
updating many objects with large `json` blobs stored in some fields. As such, we recommend against using
`bulkUpdate` for savedObjects that:
- use arrays (as these tend to be large objects)
- store large `json` blobs in some fields
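If you do need to update many such objects, one mitigation is to chunk the updates so that each `bulkUpdate` call holds a bounded amount of serialized JSON in memory. The helper below is a sketch under assumptions (a hypothetical utility, not part of Kibana):

```typescript
// Hypothetical batching helper (not a Kibana API): splits update payloads into
// chunks whose serialized size stays under a byte budget, bounding the JSON
// held in memory per bulkUpdate call.
interface SoUpdate {
  type: string;
  id: string;
  attributes: Record<string, unknown>;
}

function chunkBySerializedSize(updates: SoUpdate[], maxBytes: number): SoUpdate[][] {
  const chunks: SoUpdate[][] = [];
  let current: SoUpdate[] = [];
  let currentSize = 0;
  for (const update of updates) {
    const size = new TextEncoder().encode(JSON.stringify(update)).length;
    // Start a new chunk when adding this update would exceed the budget
    // (a chunk always holds at least one update, even an oversized one).
    if (current.length > 0 && currentSize + size > maxBytes) {
      chunks.push(current);
      current = [];
      currentSize = 0;
    }
    current.push(update);
    currentSize += size;
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}

const updates: SoUpdate[] = [
  { type: 'foo', id: '1', attributes: { blob: 'x'.repeat(100) } },
  { type: 'foo', id: '2', attributes: { blob: 'x'.repeat(100) } },
];

console.log(chunkBySerializedSize(updates, 200).length); // → 2
```

Each chunk can then be passed to a separate `bulkUpdate` call, trading more round trips for a flatter memory profile.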
### Elasticsearch
@ -1561,4 +1543,4 @@ it('completes without returning results if aborted$ emits before the response',
expectObservable(results).toBe('-|');
});
});
```