[8.8] [esArchiver] Automatically cleanup SO indices when SO documents are found in data.json (#159582) (#159910)

# Backport

This will backport the following commits from `main` to `8.8`:
- [[esArchiver] Automatically cleanup SO indices when SO documents are
found in data.json
(#159582)](https://github.com/elastic/kibana/pull/159582)

<!--- Backport version: 8.9.7 -->

### Questions?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)

<!--BACKPORT [{"author":{"name":"Gerard
Soldevila","email":"gerard.soldevila@elastic.co"},"sourceCommit":{"committedDate":"2023-06-19T11:08:03Z","message":"[esArchiver]
Automatically cleanup SO indices when SO documents are found in
data.json (#159582)\n\nThe ultimate goal of this PR is to lay the
groundwork to be able to\r\nremove the \"dynamic\" `mappings.json`,
which probably should have never\r\nexisted.\r\n\r\nWith the PR,
detecting SO documents in the `data.json` will\r\nautomatically trigger
a cleanup of the SO indices.\r\nThis, in turn, will allow not having to
define \"dynamic\" saved objects\r\nindices (i.e. those with the
`$KIBANA_PACKAGE_VERSION` variable in
the\r\n`mappings.json`).\r\n\r\nIIUC the idea behind the dynamic indices
was to have SO indices that are\r\naligned with the current stack
version, avoiding the extra overhead of\r\nhaving to migrate the
inserted documents, and reducing overall
test\r\ntimes.\r\n\r\nNonetheless, what is happening today is:\r\n1. FTR
starts ES and Kibana.\r\n2. Kibana creates current version SO indices at
startup (empty ones).\r\n3. `esArchiver.load()` processes the
`mappings.json`.\r\n3.1. It detects that we are defining SO indices and
**deletes** existing\r\nsaved object indices.\r\n3.2 It then re-creates
these indices according to the definitions on\r\n`mappings.json`.\r\n4.
`esArchiver.load()` processes the `data.json`. Specifically,
it\r\ninserts SO documents present in `data.json`.\r\n5.
`esArchiver.load()` calls the _KibanaMigrator_ to make sure that
the\r\ninserted documents are up-to-date, hoping they are already
aligned with\r\ncurrent stack version (which is not always the case, not
even with\r\n\"dynamic\" mappings).\r\n\r\nTwo interesting things to
note:\r\n- Steps 3 to 5 happen whilst Kibana is already started and
running. If\r\nKibana queries SO indices during `esArchiver.load()`, and
a request to\r\nES is made **right after** 3.2, the result might
be\r\nhttps://github.com/elastic/kibana/issues/158918.\r\n- Having
dynamic SO indices' definitions, deleting the \"official\"\r\nindices
created by Kibana (3.1), and recreating them hoping to be\r\naligned
with current stack version (3.2) is non-sense. We could use
the\r\nexisting SO indices instead, and simply clean them up whenever we
are\r\nabout to insert SO documents.\r\n\r\nPerforming that cleanup is
precisely the goal of this PR.\r\nThen, in subsequent PRs
like\r\nhttps://github.com/elastic/kibana/pull/159397/files, tackling
the flaky\r\ntests, we'll be able to simply remove the \"dynamic\"
`mappings.json`\r\ndefinitions, causing `esArchiver` to rely on SO
indices created by\r\nKibana.\r\n\r\nThanks to this PR, the FTR tests
won't need to explicitly cleanup saved\r\nobject indices in the `before`
hooks.","sha":"bbb5fc4abe7dd530d8248a09a9638cd3438202aa","branchLabelMapping":{"^v8.9.0$":"main","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["Team:Core","Team:Operations","technical
debt","release_note:skip","backport:prev-minor","v8.9.0","FTR","v8.8.2"],"number":159582,"url":"https://github.com/elastic/kibana/pull/159582","mergeCommit":{"message":"[esArchiver]
Automatically cleanup SO indices when SO documents are found in
data.json (#159582)\n\nThe ultimate goal of this PR is to lay the
groundwork to be able to\r\nremove the \"dynamic\" `mappings.json`,
which probably should have never\r\nexisted.\r\n\r\nWith the PR,
detecting SO documents in the `data.json` will\r\nautomatically trigger
a cleanup of the SO indices.\r\nThis, in turn, will allow not having to
define \"dynamic\" saved objects\r\nindices (i.e. those with the
`$KIBANA_PACKAGE_VERSION` variable in
the\r\n`mappings.json`).\r\n\r\nIIUC the idea behind the dynamic indices
was to have SO indices that are\r\naligned with the current stack
version, avoiding the extra overhead of\r\nhaving to migrate the
inserted documents, and reducing overall
test\r\ntimes.\r\n\r\nNonetheless, what is happening today is:\r\n1. FTR
starts ES and Kibana.\r\n2. Kibana creates current version SO indices at
startup (empty ones).\r\n3. `esArchiver.load()` processes the
`mappings.json`.\r\n3.1. It detects that we are defining SO indices and
**deletes** existing\r\nsaved object indices.\r\n3.2 It then re-creates
these indices according to the definitions on\r\n`mappings.json`.\r\n4.
`esArchiver.load()` processes the `data.json`. Specifically,
it\r\ninserts SO documents present in `data.json`.\r\n5.
`esArchiver.load()` calls the _KibanaMigrator_ to make sure that
the\r\ninserted documents are up-to-date, hoping they are already
aligned with\r\ncurrent stack version (which is not always the case, not
even with\r\n\"dynamic\" mappings).\r\n\r\nTwo interesting things to
note:\r\n- Steps 3 to 5 happen whilst Kibana is already started and
running. If\r\nKibana queries SO indices during `esArchiver.load()`, and
a request to\r\nES is made **right after** 3.2, the result might
be\r\nhttps://github.com/elastic/kibana/issues/158918.\r\n- Having
dynamic SO indices' definitions, deleting the \"official\"\r\nindices
created by Kibana (3.1), and recreating them hoping to be\r\naligned
with current stack version (3.2) is non-sense. We could use
the\r\nexisting SO indices instead, and simply clean them up whenever we
are\r\nabout to insert SO documents.\r\n\r\nPerforming that cleanup is
precisely the goal of this PR.\r\nThen, in subsequent PRs
like\r\nhttps://github.com/elastic/kibana/pull/159397/files, tackling
the flaky\r\ntests, we'll be able to simply remove the \"dynamic\"
`mappings.json`\r\ndefinitions, causing `esArchiver` to rely on SO
indices created by\r\nKibana.\r\n\r\nThanks to this PR, the FTR tests
won't need to explicitly cleanup saved\r\nobject indices in the `before`
hooks.","sha":"bbb5fc4abe7dd530d8248a09a9638cd3438202aa"}},"sourceBranch":"main","suggestedTargetBranches":["8.8"],"targetPullRequestStates":[{"branch":"main","label":"v8.9.0","labelRegex":"^v8.9.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/159582","number":159582,"mergeCommit":{"message":"[esArchiver]
Automatically cleanup SO indices when SO documents are found in
data.json (#159582)\n\nThe ultimate goal of this PR is to lay the
groundwork to be able to\r\nremove the \"dynamic\" `mappings.json`,
which probably should have never\r\nexisted.\r\n\r\nWith the PR,
detecting SO documents in the `data.json` will\r\nautomatically trigger
a cleanup of the SO indices.\r\nThis, in turn, will allow not having to
define \"dynamic\" saved objects\r\nindices (i.e. those with the
`$KIBANA_PACKAGE_VERSION` variable in
the\r\n`mappings.json`).\r\n\r\nIIUC the idea behind the dynamic indices
was to have SO indices that are\r\naligned with the current stack
version, avoiding the extra overhead of\r\nhaving to migrate the
inserted documents, and reducing overall
test\r\ntimes.\r\n\r\nNonetheless, what is happening today is:\r\n1. FTR
starts ES and Kibana.\r\n2. Kibana creates current version SO indices at
startup (empty ones).\r\n3. `esArchiver.load()` processes the
`mappings.json`.\r\n3.1. It detects that we are defining SO indices and
**deletes** existing\r\nsaved object indices.\r\n3.2 It then re-creates
these indices according to the definitions on\r\n`mappings.json`.\r\n4.
`esArchiver.load()` processes the `data.json`. Specifically,
it\r\ninserts SO documents present in `data.json`.\r\n5.
`esArchiver.load()` calls the _KibanaMigrator_ to make sure that
the\r\ninserted documents are up-to-date, hoping they are already
aligned with\r\ncurrent stack version (which is not always the case, not
even with\r\n\"dynamic\" mappings).\r\n\r\nTwo interesting things to
note:\r\n- Steps 3 to 5 happen whilst Kibana is already started and
running. If\r\nKibana queries SO indices during `esArchiver.load()`, and
a request to\r\nES is made **right after** 3.2, the result might
be\r\nhttps://github.com/elastic/kibana/issues/158918.\r\n- Having
dynamic SO indices' definitions, deleting the \"official\"\r\nindices
created by Kibana (3.1), and recreating them hoping to be\r\naligned
with current stack version (3.2) is non-sense. We could use
the\r\nexisting SO indices instead, and simply clean them up whenever we
are\r\nabout to insert SO documents.\r\n\r\nPerforming that cleanup is
precisely the goal of this PR.\r\nThen, in subsequent PRs
like\r\nhttps://github.com/elastic/kibana/pull/159397/files, tackling
the flaky\r\ntests, we'll be able to simply remove the \"dynamic\"
`mappings.json`\r\ndefinitions, causing `esArchiver` to rely on SO
indices created by\r\nKibana.\r\n\r\nThanks to this PR, the FTR tests
won't need to explicitly cleanup saved\r\nobject indices in the `before`
hooks.","sha":"bbb5fc4abe7dd530d8248a09a9638cd3438202aa"}},{"branch":"8.8","label":"v8.8.2","labelRegex":"^v(\\d+).(\\d+).\\d+$","isSourceBranch":false,"state":"NOT_CREATED"}]}]
BACKPORT-->
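
The cleanup rule described in the commit message (clean the saved object indices the first time a saved object document shows up, and do it at most once per load) can be sketched as a small planner. This is a minimal illustration, not the esArchiver implementation: `createCleanupPlanner` and the `Cleanup` type are hypothetical names, the two constant values are assumed rather than imported from `@kbn/core-saved-objects-server`, and the sketch leaves out the `skipExisting` check and the "already deleted" flags that the real `handleDoc` also consults.

```typescript
// Assumed values; the real constants come from '@kbn/core-saved-objects-server'.
const MAIN_SAVED_OBJECT_INDEX = '.kibana';
const TASK_MANAGER_SAVED_OBJECT_INDEX = '.kibana_task_manager';

// Hypothetical result type: what (if anything) should be cleaned for a document.
type Cleanup = { scope: 'all' } | { scope: 'index'; index: string } | null;

// Returns a planner that holds the "already cleaned" flags for one archive load.
function createCleanupPlanner() {
  let allCleaned = false;
  let taskManagerCleaned = false;

  return function planCleanup(index: string): Cleanup {
    // The task manager prefix is tested first because it also starts with '.kibana'.
    if (index.startsWith(TASK_MANAGER_SAVED_OBJECT_INDEX)) {
      if (taskManagerCleaned) return null;
      taskManagerCleaned = true;
      return { scope: 'index', index };
    }
    if (index.startsWith(MAIN_SAVED_OBJECT_INDEX)) {
      if (allCleaned) return null;
      // Cleaning everything also covers the task manager index.
      allCleaned = taskManagerCleaned = true;
      return { scope: 'all' };
    }
    // Not a saved object index: nothing to clean.
    return null;
  };
}

// Usage: replay the indices of the documents found in a data.json.
const planCleanup = createCleanupPlanner();
const planned = ['.kibana_task_manager', '.kibana_alerting_cases', '.kibana', '.foo'].map(
  (index) => planCleanup(index)
);
console.log(planned);
```

With the input above, the planner asks for one cleanup scoped to `.kibana_task_manager`, then one cleanup of all saved object indices when `.kibana_alerting_cases` is seen, and nothing for the remaining documents. That matches the two calls asserted in the new `saved object cleanup` test.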

Co-authored-by: Gerard Soldevila <gerard.soldevila@elastic.co>
Kibana Machine 2023-06-19 08:50:56 -04:00 committed by GitHub
parent 9e48a57755
commit d2683cba7b
6 changed files with 133 additions and 27 deletions


@@ -6,12 +6,17 @@
  * Side Public License, v 1.
  */
-import type { deleteSavedObjectIndices } from './kibana_index';
+import type { cleanSavedObjectIndices, deleteSavedObjectIndices } from './kibana_index';
-export const mockdeleteSavedObjectIndices = jest.fn() as jest.MockedFunction<
+export const mockCleanSavedObjectIndices = jest.fn() as jest.MockedFunction<
+  typeof cleanSavedObjectIndices
+>;
+export const mockDeleteSavedObjectIndices = jest.fn() as jest.MockedFunction<
   typeof deleteSavedObjectIndices
 >;
 jest.mock('./kibana_index', () => ({
-  deleteSavedObjectIndices: mockdeleteSavedObjectIndices,
+  cleanSavedObjectIndices: mockCleanSavedObjectIndices,
+  deleteSavedObjectIndices: mockDeleteSavedObjectIndices,
 }));


@@ -6,7 +6,10 @@
  * Side Public License, v 1.
  */
-import { mockdeleteSavedObjectIndices } from './create_index_stream.test.mock';
+import {
+  mockCleanSavedObjectIndices,
+  mockDeleteSavedObjectIndices,
+} from './create_index_stream.test.mock';
 import sinon from 'sinon';
 import Chance from 'chance';
@@ -28,7 +31,8 @@ const chance = new Chance();
 const log = createStubLogger();
 beforeEach(() => {
-  mockdeleteSavedObjectIndices.mockClear();
+  mockCleanSavedObjectIndices.mockClear();
+  mockDeleteSavedObjectIndices.mockClear();
 });
 describe('esArchiver: createCreateIndexStream()', () => {
@@ -199,25 +203,25 @@ describe('esArchiver: createCreateIndexStream()', () => {
     it('does not delete Kibana indices for indexes that do not start with .kibana', async () => {
       await doTest('.foo');
-      expect(mockdeleteSavedObjectIndices).not.toHaveBeenCalled();
+      expect(mockDeleteSavedObjectIndices).not.toHaveBeenCalled();
     });
     it('deletes Kibana indices at most once for indices that start with .kibana', async () => {
       // If we are loading the main Kibana index, we should delete all Kibana indices for backwards compatibility reasons.
       await doTest('.kibana_7.16.0_001', '.kibana_task_manager_7.16.0_001');
-      expect(mockdeleteSavedObjectIndices).toHaveBeenCalledTimes(1);
-      expect(mockdeleteSavedObjectIndices).toHaveBeenCalledWith(
-        expect.not.objectContaining({ onlyTaskManager: true })
+      expect(mockDeleteSavedObjectIndices).toHaveBeenCalledTimes(1);
+      expect(mockDeleteSavedObjectIndices).toHaveBeenCalledWith(
+        expect.not.objectContaining({ index: '.kibana_task_manager_7.16.0_001' })
       );
     });
-    it('deletes Kibana task manager index at most once, using onlyTaskManager: true', async () => {
+    it('deletes Kibana task manager index at most once', async () => {
       // If we are loading the Kibana task manager index, we should only delete that index, not any other Kibana indices.
       await doTest('.kibana_task_manager_7.16.0_001', '.kibana_task_manager_7.16.0_002');
-      expect(mockdeleteSavedObjectIndices).toHaveBeenCalledTimes(1);
-      expect(mockdeleteSavedObjectIndices).toHaveBeenCalledWith(
+      expect(mockDeleteSavedObjectIndices).toHaveBeenCalledTimes(1);
+      expect(mockDeleteSavedObjectIndices).toHaveBeenCalledWith(
         expect.objectContaining({ onlyTaskManager: true })
       );
     });
@@ -227,18 +231,63 @@ describe('esArchiver: createCreateIndexStream()', () => {
       // So, we first delete only the Kibana task manager indices, then we wind up deleting all Kibana indices.
       await doTest('.kibana_task_manager_7.16.0_001', '.kibana_7.16.0_001');
-      expect(mockdeleteSavedObjectIndices).toHaveBeenCalledTimes(2);
-      expect(mockdeleteSavedObjectIndices).toHaveBeenNthCalledWith(
+      expect(mockDeleteSavedObjectIndices).toHaveBeenCalledTimes(2);
+      expect(mockDeleteSavedObjectIndices).toHaveBeenNthCalledWith(
         1,
         expect.objectContaining({ onlyTaskManager: true })
       );
-      expect(mockdeleteSavedObjectIndices).toHaveBeenNthCalledWith(
+      expect(mockDeleteSavedObjectIndices).toHaveBeenNthCalledWith(
         2,
-        expect.not.objectContaining({ onlyTaskManager: true })
+        expect.not.objectContaining({ index: expect.any(String) })
       );
     });
   });
+  describe('saved object cleanup', () => {
+    describe('when saved object documents are found', () => {
+      it('cleans the corresponding saved object indices', async () => {
+        const client = createStubClient();
+        const stats = createStubStats();
+        await createPromiseFromStreams([
+          createListStream([
+            createStubDocRecord('.kibana_task_manager', 1),
+            createStubDocRecord('.kibana_alerting_cases', 2),
+            createStubDocRecord('.kibana', 3),
+          ]),
+          createCreateIndexStream({ client, stats, log }),
+        ]);
+        expect(mockCleanSavedObjectIndices).toHaveBeenCalledTimes(2);
+        expect(mockCleanSavedObjectIndices).toHaveBeenNthCalledWith(
+          1,
+          expect.objectContaining({ index: '.kibana_task_manager' })
+        );
+        expect(mockCleanSavedObjectIndices).toHaveBeenNthCalledWith(
+          2,
+          expect.not.objectContaining({ index: expect.any(String) })
+        );
+      });
+    });
+    describe('when saved object documents are not found', () => {
+      it('does not clean any indices', async () => {
+        const client = createStubClient();
+        const stats = createStubStats();
+        await createPromiseFromStreams([
+          createListStream([
+            createStubDocRecord('.foo', 1),
+            createStubDocRecord('.bar', 2),
+            createStubDocRecord('.baz', 3),
+          ]),
+          createCreateIndexStream({ client, stats, log }),
+        ]);
+        expect(mockCleanSavedObjectIndices).not.toHaveBeenCalled();
+      });
+    });
+  });
   describe('docsOnly = true', () => {
     it('passes through "hit" records without attempting to create indices', async () => {
       const client = createStubClient();


@@ -19,7 +19,7 @@ import {
   TASK_MANAGER_SAVED_OBJECT_INDEX,
 } from '@kbn/core-saved-objects-server';
 import { Stats } from '../stats';
-import { deleteSavedObjectIndices } from './kibana_index';
+import { cleanSavedObjectIndices, deleteSavedObjectIndices } from './kibana_index';
 import { deleteIndex } from './delete_index';
 import { deleteDataStream } from './delete_data_stream';
 import { ES_CLIENT_HEADERS } from '../../client_headers';
@@ -50,14 +50,36 @@ export function createCreateIndexStream({
   // If we're trying to import Kibana index docs, we need to ensure that
   // previous indices are removed so we're starting w/ a clean slate for
   // migrations. This only needs to be done once per archive load operation.
-  let kibanaIndexAlreadyDeleted = false;
+  let kibanaIndicesAlreadyDeleted = false;
   let kibanaTaskManagerIndexAlreadyDeleted = false;
+  // if we detect saved object documents defined in the data.json, we will cleanup their indices
+  let kibanaIndicesAlreadyCleaned = false;
+  let kibanaTaskManagerIndexAlreadyCleaned = false;
   async function handleDoc(stream: Readable, record: DocRecord) {
-    if (skipDocsFromIndices.has(record.value.index)) {
+    const index = record.value.index;
+    if (skipDocsFromIndices.has(index)) {
       return;
     }
+    if (!skipExisting) {
+      if (index?.startsWith(TASK_MANAGER_SAVED_OBJECT_INDEX)) {
+        if (!kibanaTaskManagerIndexAlreadyDeleted && !kibanaTaskManagerIndexAlreadyCleaned) {
+          await cleanSavedObjectIndices({ client, stats, log, index });
+          kibanaTaskManagerIndexAlreadyCleaned = true;
+          log.debug(`Cleaned saved object index [${index}]`);
+        }
+      } else if (index?.startsWith(MAIN_SAVED_OBJECT_INDEX)) {
+        if (!kibanaIndicesAlreadyDeleted && !kibanaIndicesAlreadyCleaned) {
+          await cleanSavedObjectIndices({ client, stats, log });
+          kibanaIndicesAlreadyCleaned = kibanaTaskManagerIndexAlreadyCleaned = true;
+          log.debug(`Cleaned all saved object indices`);
+        }
+      }
+    }
     stream.push(record);
   }
@@ -109,12 +131,14 @@ export function createCreateIndexStream({
     async function attemptToCreate(attemptNumber = 1) {
       try {
-        if (isKibana && !kibanaIndexAlreadyDeleted) {
+        if (isKibana && !kibanaIndicesAlreadyDeleted) {
           await deleteSavedObjectIndices({ client, stats, log }); // delete all .kibana* indices
-          kibanaIndexAlreadyDeleted = kibanaTaskManagerIndexAlreadyDeleted = true;
+          kibanaIndicesAlreadyDeleted = kibanaTaskManagerIndexAlreadyDeleted = true;
+          log.debug(`Deleted all saved object indices`);
         } else if (isKibanaTaskManager && !kibanaTaskManagerIndexAlreadyDeleted) {
           await deleteSavedObjectIndices({ client, stats, onlyTaskManager: true, log }); // delete only .kibana_task_manager* indices
           kibanaTaskManagerIndexAlreadyDeleted = true;
+          log.debug(`Deleted saved object index [${index}]`);
         }
         await client.indices.create(
@@ -137,7 +161,11 @@ export function createCreateIndexStream({
           err?.body?.error?.reason?.includes('index exists with the same name as the alias') &&
           attemptNumber < 3
         ) {
-          kibanaIndexAlreadyDeleted = false;
+          kibanaTaskManagerIndexAlreadyDeleted = false;
+          if (isKibana) {
+            kibanaIndicesAlreadyDeleted = false;
+          }
           const aliasStr = inspect(aliases);
           log.info(
             `failed to create aliases [${aliasStr}] because ES indicated an index/alias already exists, trying again`


@@ -6,6 +6,8 @@
  * Side Public License, v 1.
  */
+import { mockCleanSavedObjectIndices } from './create_index_stream.test.mock';
 import sinon from 'sinon';
 import { createListStream, createPromiseFromStreams } from '@kbn/utils';
@@ -22,6 +24,10 @@ import {
 const log = createStubLogger();
+beforeEach(() => {
+  mockCleanSavedObjectIndices.mockClear();
+});
 describe('esArchiver: createDeleteIndexStream()', () => {
   it('deletes the index without checking if it exists', async () => {
     const stats = createStubStats();


@@ -10,13 +10,20 @@ import { Transform } from 'stream';
 import type { Client } from '@elastic/elasticsearch';
 import { ToolingLog } from '@kbn/tooling-log';
-import { MAIN_SAVED_OBJECT_INDEX } from '@kbn/core-saved-objects-server';
+import {
+  MAIN_SAVED_OBJECT_INDEX,
+  TASK_MANAGER_SAVED_OBJECT_INDEX,
+} from '@kbn/core-saved-objects-server';
 import { Stats } from '../stats';
 import { deleteIndex } from './delete_index';
 import { cleanSavedObjectIndices } from './kibana_index';
 import { deleteDataStream } from './delete_data_stream';
 export function createDeleteIndexStream(client: Client, stats: Stats, log: ToolingLog) {
+  // if we detect saved object documents defined in the data.json, we will cleanup their indices
+  let kibanaIndicesAlreadyCleaned = false;
+  let kibanaTaskManagerIndexAlreadyCleaned = false;
   return new Transform({
     readableObjectMode: true,
     writableObjectMode: true,
@@ -29,8 +36,18 @@ export function createDeleteIndexStream(client: Client, stats: Stats, log: Tooli
         if (record.type === 'index') {
           const { index } = record.value;
-          if (index.startsWith(MAIN_SAVED_OBJECT_INDEX)) {
-            await cleanSavedObjectIndices({ client, stats, log });
+          if (index.startsWith(TASK_MANAGER_SAVED_OBJECT_INDEX)) {
+            if (!kibanaTaskManagerIndexAlreadyCleaned) {
+              await cleanSavedObjectIndices({ client, stats, index, log });
+              kibanaTaskManagerIndexAlreadyCleaned = true;
+              log.debug(`Cleaned saved object index [${index}]`);
+            }
+          } else if (index.startsWith(MAIN_SAVED_OBJECT_INDEX)) {
+            if (!kibanaIndicesAlreadyCleaned) {
+              await cleanSavedObjectIndices({ client, stats, log });
+              kibanaIndicesAlreadyCleaned = kibanaTaskManagerIndexAlreadyCleaned = true;
+              log.debug(`Cleaned all saved object indices`);
+            }
           } else {
             await deleteIndex({ client, stats, log, index });
           }


@@ -50,7 +50,6 @@ export async function deleteSavedObjectIndices({
       headers: ES_CLIENT_HEADERS,
     }
   );
-
   await deleteIndex({
     client,
     stats,
@@ -111,15 +110,17 @@ export async function cleanSavedObjectIndices({
   client,
   stats,
   log,
+  index = ALL_SAVED_OBJECT_INDICES,
 }: {
   client: Client;
   stats: Stats;
   log: ToolingLog;
+  index?: string | string[];
 }) {
   while (true) {
     const resp = await client.deleteByQuery(
       {
-        index: ALL_SAVED_OBJECT_INDICES,
+        index,
         body: {
           query: {
             bool: {