[8.x] [HTTP] Add a circuit breaker for the HTTP server (#190684) (#208494)

# Backport

This will backport the following commits from `main` to `8.x`:
- [[HTTP] Add a circuit breaker for the HTTP server
(#190684)](https://github.com/elastic/kibana/pull/190684)

<!--- Backport version: 9.6.4 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sorenlouv/backport)

<!--BACKPORT [{"author":{"name":"Michael
Dokolin","email":"mikhail.dokolin@elastic.co"},"sourceCommit":{"committedDate":"2025-01-27T20:29:21Z","message":"[HTTP]
Add a circuit breaker for the HTTP server (#190684)\n\nThis PR resolves
#194605 and closes #170132 and brings the following\nchanges:\n- changed
ELU metrics evaluation used for autoscaling;\n- a rate limiter to
throttle incoming requests when under a high load;\n- a configuration
option to exclude some routes from the rate
limiter.","sha":"52b7bc6f06d2651a5b8f9023e1e526147a659ab0","branchLabelMapping":{"^v9.0.0$":"main","^v8.18.0$":"8.x","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["release_note:enhancement","Feature:http","Team:Core","v9.0.0","ci:build-serverless-image","backport:version","v8.18.0"],"title":"Add
a circuit breaker for the HTTP
server","number":190684,"url":"https://github.com/elastic/kibana/pull/190684","mergeCommit":{"message":"[HTTP]
Add a circuit breaker for the HTTP server (#190684)\n\nThis PR resolves
#194605 and closes #170132 and brings the following\nchanges:\n- changed
ELU metrics evaluation used for autoscaling;\n- a rate limiter to
throttle incoming requests when under a high load;\n- a configuration
option to exclude some routes from the rate
limiter.","sha":"52b7bc6f06d2651a5b8f9023e1e526147a659ab0"}},"sourceBranch":"main","suggestedTargetBranches":["8.x"],"targetPullRequestStates":[{"branch":"main","label":"v9.0.0","branchLabelMappingKey":"^v9.0.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/190684","number":190684,"mergeCommit":{"message":"[HTTP]
Add a circuit breaker for the HTTP server (#190684)\n\nThis PR resolves
#194605 and closes #170132 and brings the following\nchanges:\n- changed
ELU metrics evaluation used for autoscaling;\n- a rate limiter to
throttle incoming requests when under a high load;\n- a configuration
option to exclude some routes from the rate
limiter.","sha":"52b7bc6f06d2651a5b8f9023e1e526147a659ab0"}},{"branch":"8.x","label":"v8.18.0","branchLabelMappingKey":"^v8.18.0$","isSourceBranch":false,"state":"NOT_CREATED"}]}]
BACKPORT-->
This commit is contained in:
Michael Dokolin 2025-01-28 12:06:30 +01:00 committed by GitHub
parent 2c52ca40b0
commit f7234d92f9
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
45 changed files with 663 additions and 179 deletions

1
.github/CODEOWNERS vendored
View file

@ -202,6 +202,7 @@ src/core/packages/http/common @elastic/kibana-core
src/core/packages/http/context-server-internal @elastic/kibana-core
packages/core/http/core-http-context-server-mocks @elastic/kibana-core
test/plugin_functional/plugins/core_http @elastic/kibana-core
src/core/packages/http/rate-limiter-internal @elastic/kibana-core
src/core/packages/http/request-handler-context-server @elastic/kibana-core
src/core/packages/http/request-handler-context-server-internal @elastic/kibana-core
src/core/packages/http/resources-server @elastic/kibana-core

View file

@ -480,6 +480,40 @@ NOTE: By default, enabling `http2` requires a valid `h2c` configuration, meaning
and <<server-ssl-supportedProtocols, `server.ssl.supportedProtocols`>>, if specified, must contain at least `TLSv1.2` or `TLSv1.3`. Strict validation of
the `h2c` setup can be disabled by adding `server.http2.allowUnsecure: true` to the configuration.
[[server-rate-limiter-enabled]] `server.rateLimiter.enabled`::
Enables rate-limiting of requests to the {kib} server based on Node.js' Event Loop Utilization.
If the average event loop utilization for the specified term exceeds the configured threshold, the server will respond with a `429 Too Many Requests` status code.
+
This functionality should be used carefully as it may impact the server's availability.
The configuration options vary per environment, so it is recommended to enable this option in a testing environment first, adjust the rate-limiter configuration, and then roll it out to production.
+
*Default: `false`*
`server.rateLimiter.elu`::
The Event Loop Utilization (ELU) threshold for rate-limiting requests to the {kib} server.
The ELU is a value between 0 and 1, representing the average event loop utilization over the specified term.
If the average ELU exceeds this threshold, the server will respond with a `429 Too Many Requests` status code.
+
In a multi-instance environment with autoscaling, this value is usually between 0.6 and 0.8 to give the autoscaler enough time to react.
This value can be higher in a single-instance environment but should not exceed 1.0. In general, the lower the value, the more aggressive the rate limiting.
And the highest possible option should be used to prevent the {kib} server from being terminated.
`server.rateLimiter.term`::
This value is one of `short`, `medium`, or `long`, representing the term over which the average event loop utilization is calculated.
It uses exponential moving averages (EMA) to smooth out the utilization values.
Each term corresponds to `15s`, `30s`, and `60s`, respectively.
+
The term value also changes the way the rate limiter sees the trend in the load:
+
- `short`: `elu.short > server.rateLimiter.term`;
- `medium`: `elu.short > server.rateLimiter.elu AND elu.medium > server.rateLimiter.elu`;
- `long`: `elu.short > server.rateLimiter.elu AND elu.medium > server.rateLimiter.elu AND elu.long > server.rateLimiter.elu`.
+
This behavior prevents requests from being throttled if the load starts decreasing.
In general, the shorter the term, the more aggressive the rate limiting.
In the multi-instance environment, the `medium` term makes the most sense as it gives the {kib} server enough time to spin up a new instance and prevents the existing instances from being terminated.
[[server-requestId-allowFromAnyIp]] `server.requestId.allowFromAnyIp`::
Sets whether or not the `X-Opaque-Id` header should be trusted from any IP address for identifying requests in logs and forwarded to Elasticsearch.

View file

@ -303,6 +303,7 @@
"@kbn/core-http-common": "link:src/core/packages/http/common",
"@kbn/core-http-context-server-internal": "link:src/core/packages/http/context-server-internal",
"@kbn/core-http-plugin": "link:test/plugin_functional/plugins/core_http",
"@kbn/core-http-rate-limiter-internal": "link:src/core/packages/http/rate-limiter-internal",
"@kbn/core-http-request-handler-context-server": "link:src/core/packages/http/request-handler-context-server",
"@kbn/core-http-request-handler-context-server-internal": "link:src/core/packages/http/request-handler-context-server-internal",
"@kbn/core-http-resources-server": "link:src/core/packages/http/resources-server",

View file

@ -10,6 +10,7 @@
import { hapiMocks } from '@kbn/hapi-mocks';
import type {
LifecycleResponseFactory,
OnPreAuthToolkit,
OnPreResponseToolkit,
OnPostAuthToolkit,
OnPreRoutingToolkit,
@ -27,7 +28,9 @@ const createLifecycleResponseFactoryMock = (): jest.Mocked<LifecycleResponseFact
customError: jest.fn(),
});
type ToolkitMock = jest.Mocked<OnPreResponseToolkit & OnPostAuthToolkit & OnPreRoutingToolkit>;
type ToolkitMock = jest.Mocked<
OnPreAuthToolkit & OnPreResponseToolkit & OnPostAuthToolkit & OnPreRoutingToolkit
>;
const createToolkitMock = (): ToolkitMock => {
return {

View file

@ -26,7 +26,7 @@ import type {
import { AuthStatus } from '@kbn/core-http-server';
import { mockRouter, RouterMock } from '@kbn/core-http-router-server-mocks';
import { CspConfig, ExternalUrlConfig } from '@kbn/core-http-server-internal';
import { CspConfig, ExternalUrlConfig, config } from '@kbn/core-http-server-internal';
import type {
HttpService,
InternalHttpServicePreboot,
@ -188,6 +188,7 @@ const createInternalSetupContractMock = () => {
authRequestHeaders: createAuthHeaderStorageMock(),
getServerInfo: jest.fn(),
registerRouterAfterListening: jest.fn(),
rateLimiter: config.schema.getSchema().extract('rateLimiter').validate({}).value,
};
mock.createCookieSessionStorageFactory.mockResolvedValue(sessionStorageMock.createFactory());
mock.createRouter.mockImplementation(() => mockRouter.create());

View file

@ -33,6 +33,7 @@ export const sampleEsClientMetrics: ElasticsearchClientsMetrics = {
const createInternalSetupContractMock = () => {
const setupContract: jest.Mocked<InternalMetricsServiceSetup> = {
collectionInterval: 30000,
getEluMetrics$: jest.fn(),
getOpsMetrics$: jest.fn(),
};

View file

@ -0,0 +1,3 @@
# @kbn/core-http-rate-limiter-internal
This package contains the rate limiter implementation for Core's internal `http` resources service.

View file

@ -0,0 +1,15 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the "Elastic License
* 2.0", the "GNU Affero General Public License v3.0 only", and the "Server Side
* Public License v 1"; you may not use this file except in compliance with, at
* your election, the "Elastic License 2.0", the "GNU Affero General Public
* License v3.0 only", or the "Server Side Public License, v 1".
*/
export {
HttpRateLimiterService,
type SetupDeps,
type InternalRateLimiterSetup,
type InternalRateLimiterStart,
} from './src/service';

View file

@ -0,0 +1,14 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the "Elastic License
* 2.0", the "GNU Affero General Public License v3.0 only", and the "Server Side
* Public License v 1"; you may not use this file except in compliance with, at
* your election, the "Elastic License 2.0", the "GNU Affero General Public
* License v3.0 only", or the "Server Side Public License, v 1".
*/
module.exports = {
preset: '@kbn/test/jest_node',
rootDir: '../../../../..',
roots: ['<rootDir>/src/core/packages/http/rate-limiter-internal'],
};

View file

@ -0,0 +1,9 @@
{
"type": "shared-server",
"id": "@kbn/core-http-rate-limiter-internal",
"owner": [
"@elastic/kibana-core"
],
"group": "platform",
"visibility": "private"
}

View file

@ -0,0 +1,7 @@
{
"name": "@kbn/core-http-rate-limiter-internal",
"private": true,
"version": "1.0.0",
"author": "Kibana Core",
"license": "Elastic License 2.0 OR AGPL-3.0-only OR SSPL-1.0"
}

View file

@ -0,0 +1,138 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the "Elastic License
* 2.0", the "GNU Affero General Public License v3.0 only", and the "Server Side
* Public License v 1"; you may not use this file except in compliance with, at
* your election, the "Elastic License 2.0", the "GNU Affero General Public
* License v3.0 only", or the "Server Side Public License, v 1".
*/
import { Subject } from 'rxjs';
import type { OnPreAuthHandler } from '@kbn/core-http-server';
import {
httpServerMock,
httpServiceMock,
type InternalHttpServiceSetupMock,
} from '@kbn/core-http-server-mocks';
import { metricsServiceMock } from '@kbn/core-metrics-server-mocks';
import type { UnwrapObservable } from '@kbn/utility-types';
import { HttpRateLimiterService } from './service';
describe('HttpRateLimiterService', () => {
let service: HttpRateLimiterService;
let http: InternalHttpServiceSetupMock;
let metrics: ReturnType<typeof metricsServiceMock.createInternalSetupContract>;
let config: typeof http.rateLimiter extends Readonly<infer T> ? T : never;
let elu$: Subject<UnwrapObservable<ReturnType<typeof metrics.getEluMetrics$>>>;
beforeEach(() => {
config = {} as typeof config;
elu$ = new Subject();
service = new HttpRateLimiterService();
http = httpServiceMock.createInternalSetupContract();
metrics = metricsServiceMock.createInternalSetupContract();
http.rateLimiter = config as typeof http.rateLimiter;
metrics.getEluMetrics$.mockReturnValue(elu$);
});
describe('setup', () => {
describe('when disabled', () => {
it('should not register a handler', () => {
config.enabled = false;
service.setup({ http, metrics });
expect(http.registerOnPreAuth).not.toHaveBeenCalled();
});
});
describe('when enabled', () => {
let handler: OnPreAuthHandler;
let request: ReturnType<typeof httpServerMock.createKibanaRequest>;
let response: ReturnType<typeof httpServerMock.createResponseFactory>;
let toolkit: ReturnType<typeof httpServerMock.createToolkit>;
const ignored = 'ignored' as unknown as ReturnType<typeof toolkit.next>;
const throttled = 'throttled' as unknown as ReturnType<typeof response.customError>;
beforeEach(() => {
config.enabled = true;
config.elu = 0.5;
config.term = 'short';
request = httpServerMock.createKibanaRequest();
response = httpServerMock.createResponseFactory();
toolkit = httpServerMock.createToolkit();
toolkit.next.mockReturnValue(ignored);
response.customError.mockReturnValue(throttled);
service.setup({ http, metrics });
[handler] = http.registerOnPreAuth.mock.lastCall!;
});
it('should register a handler if the rate limiter is enabled', () => {
expect(http.registerOnPreAuth).toHaveBeenCalledWith(expect.any(Function));
});
it('should not throttle until started', () => {
elu$.next({ short: 0.9, medium: 0.9, long: 0.9 });
expect(handler(request, response, toolkit)).toBe(ignored);
});
it('should throttle when started', () => {
service.start();
elu$.next({ short: 0.9, medium: 0.9, long: 0.9 });
expect(handler(request, response, toolkit)).toBe(throttled);
});
it('should not throttle when stopped', () => {
service.start();
service.stop();
elu$.next({ short: 0.9, medium: 0.9, long: 0.9 });
expect(handler(request, response, toolkit)).toBe(ignored);
});
it('should not throttle excluded routes', () => {
service.start();
elu$.next({ short: 0.9, medium: 0.9, long: 0.9 });
expect(
handler(
httpServerMock.createKibanaRequest({
kibanaRouteOptions: {
access: 'internal',
excludeFromRateLimiter: true,
xsrfRequired: true,
},
}),
response,
toolkit
)
).toBe(ignored);
});
it.each`
threshold | term | short | medium | long | expected
${0.6} | ${'short'} | ${0.5} | ${0.5} | ${0.5} | ${ignored}
${0.4} | ${'short'} | ${0.5} | ${0.5} | ${0.5} | ${throttled}
${0.6} | ${'medium'} | ${0.4} | ${0.5} | ${0.6} | ${ignored}
${0.5} | ${'medium'} | ${0.4} | ${0.5} | ${0.6} | ${ignored}
${0.4} | ${'medium'} | ${0.4} | ${0.5} | ${0.6} | ${throttled}
${0.7} | ${'long'} | ${0.4} | ${0.5} | ${0.6} | ${ignored}
${0.6} | ${'long'} | ${0.4} | ${0.5} | ${0.6} | ${ignored}
${0.5} | ${'long'} | ${0.4} | ${0.5} | ${0.6} | ${ignored}
${0.4} | ${'long'} | ${0.4} | ${0.5} | ${0.6} | ${throttled}
`(
'should be $expected when the threshold is $threshold for the $term-term',
({ threshold, term, short, medium, long, expected }) => {
config.elu = threshold;
config.term = term;
service.setup({ http, metrics });
[handler] = http.registerOnPreAuth.mock.lastCall!;
service.start();
elu$.next({ short, medium, long });
expect(handler(request, response, toolkit)).toBe(expected);
}
);
});
});
});

View file

@ -0,0 +1,95 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the "Elastic License
* 2.0", the "GNU Affero General Public License v3.0 only", and the "Server Side
* Public License v 1"; you may not use this file except in compliance with, at
* your election, the "Elastic License 2.0", the "GNU Affero General Public
* License v3.0 only", or the "Server Side Public License, v 1".
*/
import {
BehaviorSubject,
endWith,
map,
skipUntil,
type Observable,
Subject,
takeUntil,
} from 'rxjs';
import type { CoreService } from '@kbn/core-base-server-internal';
import type { KibanaRequest, OnPreAuthHandler } from '@kbn/core-http-server';
import type { InternalHttpServiceSetup } from '@kbn/core-http-server-internal';
import type { EluMetrics } from '@kbn/core-metrics-server';
import type { InternalMetricsServiceSetup } from '@kbn/core-metrics-server-internal';
/** @internal */
export interface SetupDeps {
http: InternalHttpServiceSetup;
metrics: InternalMetricsServiceSetup;
}
/** @internal */
export type InternalRateLimiterSetup = void;
/** @internal */
export type InternalRateLimiterStart = void;
/** @internal */
export class HttpRateLimiterService
implements CoreService<InternalRateLimiterSetup, InternalRateLimiterStart>
{
private overloaded$ = new BehaviorSubject(false);
private ready$ = new Subject<boolean>();
private stopped$ = new Subject<boolean>();
private handler: OnPreAuthHandler = (request, response, toolkit) => {
if (!this.shouldBeThrottled(request)) {
return toolkit.next();
}
return response.customError({
statusCode: 429,
body: 'Server is overloaded',
});
};
private shouldBeThrottled(request: KibanaRequest): boolean {
return !request.route.options.excludeFromRateLimiter && this.overloaded$.getValue();
}
private watch(
metrics$: Observable<EluMetrics>,
{ elu, term }: InternalHttpServiceSetup['rateLimiter']
) {
metrics$
.pipe(
skipUntil(this.ready$),
takeUntil(this.stopped$),
map(
({ short, medium, long }) =>
short >= elu && (term === 'short' || medium >= elu) && (term !== 'long' || long >= elu)
),
endWith(false)
)
.subscribe(this.overloaded$);
}
public setup({ http, metrics }: SetupDeps): InternalRateLimiterSetup {
if (!http.rateLimiter.enabled) {
return;
}
this.watch(metrics.getEluMetrics$(), http.rateLimiter);
http.registerOnPreAuth(this.handler);
}
public start(): InternalRateLimiterStart {
this.ready$.next(true);
this.ready$.complete();
}
public stop(): void {
this.stopped$.next(true);
this.stopped$.complete();
}
}

View file

@ -0,0 +1,26 @@
{
"extends": "../../../../../tsconfig.base.json",
"compilerOptions": {
"outDir": "target/types",
"types": [
"jest",
"node"
]
},
"include": [
"**/*.ts",
],
"kbn_references": [
"@kbn/core-http-server-internal",
"@kbn/core-http-server",
"@kbn/core-http-server-mocks",
"@kbn/core-metrics-server-mocks",
"@kbn/utility-types",
"@kbn/core-base-server-internal",
"@kbn/core-metrics-server",
"@kbn/core-metrics-server-internal",
],
"exclude": [
"target/**/*",
]
}

View file

@ -470,6 +470,28 @@ describe('CoreKibanaRequest', () => {
});
});
describe('route.options.excludeFromRateLimiter property', () => {
it.each`
value | expected
${true} | ${true}
${false} | ${false}
${undefined} | ${undefined}
`('handles excludeFromRateLimiter: ${value}', ({ value, expected }) => {
const request = hapiMocks.createRequest({
route: {
settings: {
app: {
excludeFromRateLimiter: value,
},
},
},
});
const kibanaRequest = CoreKibanaRequest.from(request);
expect(kibanaRequest.route.options.excludeFromRateLimiter).toBe(expected);
});
});
describe('RouteSchema type inferring', () => {
it('should work with config-schema', () => {
const body = Buffer.from('body!');

View file

@ -13,6 +13,7 @@ import { inspect } from 'util';
import type { Request, RouteOptions } from '@hapi/hapi';
import { fromEvent, NEVER } from 'rxjs';
import { shareReplay, first, filter } from 'rxjs';
import { isNil, omitBy } from 'lodash';
import { RecursiveReadonly } from '@kbn/utility-types';
import { deepFreeze } from '@kbn/std';
import {
@ -270,6 +271,7 @@ export class CoreKibanaRequest<
}
const options = {
...omitBy({ excludeFromRateLimiter: this.isExcludedFromRateLimiter(request) }, isNil),
authRequired: this.getAuthRequired(request),
// TypeScript note: Casting to `RouterOptions` to fix the following error:
//
@ -354,6 +356,11 @@ export class CoreKibanaRequest<
}${this.url.search}`
);
}
private isExcludedFromRateLimiter(request: RawRequest): boolean | undefined {
return ((request.route?.settings as RouteOptions)?.app as KibanaRouteOptions)
?.excludeFromRateLimiter;
}
}
/**

View file

@ -84,6 +84,9 @@ Object {
"payloadTimeout": 20000,
"port": 5601,
"protocol": "http1",
"rateLimiter": Object {
"enabled": false,
},
"requestId": Object {
"allowFromAnyIp": false,
"ipAllowlist": Array [],

View file

@ -25,6 +25,7 @@ import {
} from './security_response_headers_config';
import { CdnConfig } from './cdn_config';
import { PermissionsPolicyConfigType } from './permissions_policy';
import { type RateLimiterConfig, rateLimiterConfigSchema } from './rate_limiter';
const SECOND = 1000;
@ -192,6 +193,7 @@ const configSchema = schema.object(
}),
}),
}),
rateLimiter: rateLimiterConfigSchema,
requestId: schema.object(
{
allowFromAnyIp: schema.boolean({ defaultValue: false }),
@ -337,6 +339,7 @@ export class HttpConfig implements IHttpConfig {
};
public shutdownTimeout: Duration;
public restrictInternalApis: boolean;
public rateLimiter: RateLimiterConfig;
public eluMonitor: IHttpEluMonitorConfig;
@ -384,6 +387,7 @@ export class HttpConfig implements IHttpConfig {
this.xsrf = rawHttpConfig.xsrf;
this.requestId = rawHttpConfig.requestId;
this.shutdownTimeout = rawHttpConfig.shutdownTimeout;
this.rateLimiter = rawHttpConfig.rateLimiter;
// default to `false` to prevent breaking changes in current offerings
this.restrictInternalApis = rawHttpConfig.restrictInternalApis ?? false;

View file

@ -41,7 +41,7 @@ import type {
} from '@kbn/core-http-server';
import { performance } from 'perf_hooks';
import { isBoom } from '@hapi/boom';
import { identity, isObject } from 'lodash';
import { identity, isNil, isObject, omitBy } from 'lodash';
import { IHttpEluMonitorConfig } from '@kbn/core-http-server/src/elu_monitor';
import { Env } from '@kbn/config';
import { CoreContext } from '@kbn/core-base-server-internal';
@ -763,6 +763,7 @@ export class HttpServer {
access: route.options.access ?? 'internal',
deprecated,
security: route.security,
...omitBy({ excludeFromRateLimiter: route.options.excludeFromRateLimiter }, isNil),
};
// Log HTTP API target consumer.
optionsLogger.debug(

View file

@ -185,6 +185,7 @@ export class HttpService
this.internalSetup = {
...serverContract,
rateLimiter: config.rateLimiter,
registerOnPostValidation: (cb) => {
Router.on('onPostValidate', cb);
},

View file

@ -0,0 +1,30 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the "Elastic License
* 2.0", the "GNU Affero General Public License v3.0 only", and the "Server Side
* Public License v 1"; you may not use this file except in compliance with, at
* your election, the "Elastic License 2.0", the "GNU Affero General Public
* License v3.0 only", or the "Server Side Public License, v 1".
*/
import { schema, type TypeOf } from '@kbn/config-schema';
export const rateLimiterConfigSchema = schema.object({
enabled: schema.boolean({ defaultValue: false }),
elu: schema.conditional(
schema.siblingRef('enabled'),
false,
schema.never(),
schema.number({ min: 0, max: 1 })
),
term: schema.conditional(
schema.siblingRef('enabled'),
false,
schema.never(),
schema.oneOf([schema.literal('short'), schema.literal('medium'), schema.literal('long')], {
defaultValue: 'long',
})
),
});
export type RateLimiterConfig = TypeOf<typeof rateLimiterConfigSchema>;

View file

@ -0,0 +1,10 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the "Elastic License
* 2.0", the "GNU Affero General Public License v3.0 only", and the "Server Side
* Public License v 1"; you may not use this file except in compliance with, at
* your election, the "Elastic License 2.0", the "GNU Affero General Public
* License v3.0 only", or the "Server Side Public License, v 1".
*/
export { type RateLimiterConfig, rateLimiterConfigSchema } from './config';

View file

@ -23,6 +23,7 @@ import type { PostValidationMetadata } from '@kbn/core-http-server';
import type { HttpServerSetup } from './http_server';
import type { ExternalUrlConfig } from './external_url';
import type { InternalStaticAssets } from './static_assets';
import type { RateLimiterConfig } from './rate_limiter';
/** @internal */
export interface InternalHttpServicePreboot
@ -57,6 +58,7 @@ export interface InternalHttpServiceSetup
path: string,
plugin?: PluginOpaqueId
) => IRouter<Context>;
rateLimiter: RateLimiterConfig;
registerOnPostValidation(
cb: (req: CoreKibanaRequest, metadata: PostValidationMetadata) => void
): void;

View file

@ -30,6 +30,7 @@ export interface KibanaRouteOptions extends RouteOptionsApp {
xsrfRequired: boolean;
access: 'internal' | 'public';
security?: InternalRouteSecurity;
excludeFromRateLimiter?: boolean;
}
/**

View file

@ -391,6 +391,13 @@ export interface RouteConfigOptions<Method extends RouteMethod> {
*/
excludeFromOAS?: boolean;
/**
* Whether the rate limiter should never throttle this route.
*
* @default false
*/
excludeFromRateLimiter?: boolean;
/**
* Release version or date that this route will be removed
* Use with `deprecated: true`

View file

@ -0,0 +1,63 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the "Elastic License
* 2.0", the "GNU Affero General Public License v3.0 only", and the "Server Side
* Public License v 1"; you may not use this file except in compliance with, at
* your election, the "Elastic License 2.0", the "GNU Affero General Public
* License v3.0 only", or the "Server Side Public License, v 1".
*/
import { TestScheduler } from 'rxjs/testing';
import { exponentialMovingAverage } from './exponential_moving_average';
describe('exponentialMovingAverage', () => {
let testScheduler: TestScheduler;
beforeEach(() => {
testScheduler = new TestScheduler((actual, expected) => {
return expect(actual).toStrictEqual(expected);
});
});
it('should emit the initial value', () => {
testScheduler.run(({ cold, expectObservable }) => {
const observable = cold('a|', { a: 1 }).pipe(exponentialMovingAverage(15, 5));
expectObservable(observable).toBe('a|', { a: 1 });
});
});
it('should emit smoothed values', () => {
testScheduler.run(({ cold, expectObservable }) => {
const observable = cold('abc|', { a: 1, b: 1, c: 2 }).pipe(exponentialMovingAverage(15, 5));
expectObservable(observable).toBe('abc|', {
a: 1,
b: 1,
c: expect.closeTo(1.3, 1),
});
});
});
it('should fade away outdated values', () => {
testScheduler.run(({ cold, expectObservable }) => {
const observable = cold('abcdef|', {
a: 1,
b: 1,
c: 2,
d: 2,
e: 1,
f: 1,
}).pipe(exponentialMovingAverage(15, 5));
expectObservable(observable).toBe('abcdef|', {
a: 1, // https://en.wikipedia.org/wiki/Exponential_smoothing#Choosing_the_initial_smoothed_value
b: 1,
c: expect.closeTo(1.3, 1),
d: expect.closeTo(1.5, 1),
e: expect.closeTo(1.3, 1),
f: expect.closeTo(1.2, 1),
});
});
});
});

View file

@ -0,0 +1,36 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the "Elastic License
* 2.0", the "GNU Affero General Public License v3.0 only", and the "Server Side
* Public License v 1"; you may not use this file except in compliance with, at
* your election, the "Elastic License 2.0", the "GNU Affero General Public
* License v3.0 only", or the "Server Side Public License, v 1".
*/
import { type OperatorFunction, map, tap } from 'rxjs';
/**
* An RxJS operator implementing the exponential moving average function.
*
* @see https://en.wikipedia.org/wiki/Exponential_smoothing
* @param period The period of time.
* @param interval The interval between values.
* @returns An operator emitting smoothed values.
*/
export function exponentialMovingAverage(
period: number,
interval: number
): OperatorFunction<number, number> {
const alpha = 1 - Math.exp(-interval / period);
return (inner) => {
let previous: number | undefined;
return inner.pipe(
map((current) => (previous == null ? current : alpha * current + (1 - alpha) * previous)),
tap((current) => {
previous = current;
})
);
};
}

View file

@ -9,7 +9,8 @@
import moment from 'moment';
import { merge } from 'lodash';
import { take } from 'rxjs';
import { set } from '@kbn/safer-lodash-set';
import { lastValueFrom, take, toArray } from 'rxjs';
import { configServiceMock } from '@kbn/config-mocks';
import { mockCoreContext } from '@kbn/core-base-server-mocks';
import { loggingSystemMock } from '@kbn/core-logging-server-mocks';
@ -189,6 +190,39 @@ describe('MetricsService', () => {
expect(opsLogs[0][1]).not.toEqual(opsLogs[1][1]);
});
it('emits average ELU values on getEluMetrics$ call', async () => {
mockOpsCollector.collect
.mockImplementationOnce(() => set({}, 'process.event_loop_utilization.utilization', 0.1))
.mockResolvedValueOnce(set({}, 'process.event_loop_utilization.utilization', 0.9))
.mockResolvedValueOnce(set({}, 'process.event_loop_utilization.utilization', 0.9));
await metricsService.setup({ http: httpMock, elasticsearchService: esServiceMock });
const { getEluMetrics$ } = await metricsService.start();
const eluMetricsPromise = lastValueFrom(getEluMetrics$().pipe(toArray()));
jest.advanceTimersByTime(testInterval * 2);
await new Promise((resolve) => process.nextTick(resolve));
await metricsService.stop();
await expect(eluMetricsPromise).resolves.toEqual([
expect.objectContaining({
short: expect.closeTo(0.1),
medium: expect.closeTo(0.1),
long: expect.closeTo(0.1),
}),
expect.objectContaining({
short: expect.closeTo(0.11),
medium: expect.closeTo(0.1),
long: expect.closeTo(0.1),
}),
expect.objectContaining({
short: expect.closeTo(0.11),
medium: expect.closeTo(0.11),
long: expect.closeTo(0.1),
}),
]);
});
it('omits metrics from log message if they are missing or malformed', async () => {
const opsLogger = logger.get('metrics', 'ops');
mockOpsCollector.collect.mockResolvedValueOnce(

View file

@ -7,12 +7,13 @@
* License v3.0 only", or the "Server Side Public License, v 1".
*/
import { firstValueFrom, ReplaySubject } from 'rxjs';
import { BehaviorSubject, firstValueFrom, map, ReplaySubject, zip } from 'rxjs';
import type { CoreContext, CoreService } from '@kbn/core-base-server-internal';
import type { Logger } from '@kbn/logging';
import type { InternalHttpServiceSetup } from '@kbn/core-http-server-internal';
import type { InternalElasticsearchServiceSetup } from '@kbn/core-elasticsearch-server-internal';
import type {
EluMetrics,
OpsMetrics,
MetricsServiceSetup,
MetricsServiceStart,
@ -21,6 +22,11 @@ import { OpsMetricsCollector } from './ops_metrics_collector';
import { OPS_CONFIG_PATH, type OpsConfigType } from './ops_config';
import { getEcsOpsMetricsLog } from './logging';
import { registerEluHistoryRoute } from './routes/elu_history';
import { exponentialMovingAverage } from './exponential_moving_average';
const ELU_SHORT = 15000;
const ELU_MEDIUM = 30000;
const ELU_LONG = 60000;
export interface MetricsServiceSetupDeps {
http: InternalHttpServiceSetup;
@ -42,6 +48,11 @@ export class MetricsService
private metricsCollector?: OpsMetricsCollector;
private collectInterval?: NodeJS.Timeout;
private metrics$ = new ReplaySubject<OpsMetrics>(1);
private elu$ = new BehaviorSubject<EluMetrics>({
long: 0,
medium: 0,
short: 0,
});
private service?: InternalMetricsServiceSetup;
constructor(private readonly coreContext: CoreContext) {
@ -56,6 +67,7 @@ export class MetricsService
const config = await firstValueFrom(
this.coreContext.configService.atPath<OpsConfigType>(OPS_CONFIG_PATH)
);
const collectionInterval = config.interval.asMilliseconds();
this.metricsCollector = new OpsMetricsCollector(
http.server,
@ -70,15 +82,25 @@ export class MetricsService
this.collectInterval = setInterval(() => {
this.refreshMetrics();
}, config.interval.asMilliseconds());
}, collectionInterval);
const metricsObservable = this.metrics$.asObservable();
registerEluHistoryRoute(http.createRouter(''), metricsObservable);
this.metrics$
.pipe(
map((metrics) => metrics.process.event_loop_utilization.utilization),
(elu$) =>
zip(
elu$.pipe(exponentialMovingAverage(ELU_SHORT, collectionInterval)),
elu$.pipe(exponentialMovingAverage(ELU_MEDIUM, collectionInterval)),
elu$.pipe(exponentialMovingAverage(ELU_LONG, collectionInterval))
).pipe(map(([short, medium, long]) => ({ short, medium, long })))
)
.subscribe(this.elu$);
registerEluHistoryRoute(http.createRouter(''), () => this.elu$.value);
this.service = {
collectionInterval: config.interval.asMilliseconds(),
getOpsMetrics$: () => metricsObservable,
collectionInterval,
getOpsMetrics$: () => this.metrics$,
getEluMetrics$: () => this.elu$,
};
return this.service;

View file

@ -8,10 +8,8 @@
*/
import type { IRouter } from '@kbn/core-http-server';
import type { OpsMetrics } from '@kbn/core-metrics-server';
import type { Observable } from 'rxjs';
import apm from 'elastic-apm-node';
import { HistoryWindow } from './history_window';
import { EluMetrics } from '@kbn/core-metrics-server';
interface ELUHistoryResponse {
/**
@ -20,40 +18,17 @@ interface ELUHistoryResponse {
* actual time range covered is determined by our collection interval (configured via `ops.interval`, default 5s)
* and the number of samples held in each window. So by default short: 15s, medium: 30s and long 60s.
*/
history: {
/** The history for the short window */
short: number;
/** The history for the medium window */
medium: number;
/** The history for the long window */
long: number;
};
history: EluMetrics;
}
const HISTORY_WINDOW_SIZE_SHORT = 3;
const HISTORY_WINDOW_SIZE_MED = 6;
const HISTORY_WINDOW_SIZE_LONG = 12;
/**
* Intended for exposing metrics over HTTP that we do not want to include in the /api/stats endpoint, yet.
*/
export function registerEluHistoryRoute(router: IRouter, metrics$: Observable<OpsMetrics>) {
const eluHistoryWindow = new HistoryWindow(HISTORY_WINDOW_SIZE_LONG);
metrics$.subscribe((metrics) => {
eluHistoryWindow.addObservation(metrics.process.event_loop_utilization.utilization);
});
export function registerEluHistoryRoute(router: IRouter, elu: () => EluMetrics) {
// Report the same metrics to APM
apm.registerMetric('elu.history.short', () =>
eluHistoryWindow.getAverage(HISTORY_WINDOW_SIZE_SHORT)
);
apm.registerMetric('elu.history.medium', () =>
eluHistoryWindow.getAverage(HISTORY_WINDOW_SIZE_MED)
);
apm.registerMetric('elu.history.long', () =>
eluHistoryWindow.getAverage(HISTORY_WINDOW_SIZE_LONG)
);
apm.registerMetric('elu.history.short', () => elu().short);
apm.registerMetric('elu.history.medium', () => elu().medium);
apm.registerMetric('elu.history.long', () => elu().long);
router.versioned
.get({
@ -62,6 +37,7 @@ export function registerEluHistoryRoute(router: IRouter, metrics$: Observable<Op
path: '/api/_elu_history',
options: {
authRequired: false,
excludeFromRateLimiter: true,
},
})
.addVersion(
@ -71,11 +47,7 @@ export function registerEluHistoryRoute(router: IRouter, metrics$: Observable<Op
},
async (ctx, req, res) => {
const body: ELUHistoryResponse = {
history: {
short: eluHistoryWindow.getAverage(HISTORY_WINDOW_SIZE_SHORT),
medium: eluHistoryWindow.getAverage(HISTORY_WINDOW_SIZE_MED),
long: eluHistoryWindow.getAverage(HISTORY_WINDOW_SIZE_LONG),
},
history: elu(),
};
return res.ok({ body });
}

View file

@ -1,89 +0,0 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the "Elastic License
* 2.0", the "GNU Affero General Public License v3.0 only", and the "Server Side
* Public License v 1"; you may not use this file except in compliance with, at
* your election, the "Elastic License 2.0", the "GNU Affero General Public
* License v3.0 only", or the "Server Side Public License, v 1".
*/
import { HistoryWindow } from './history_window';
describe('HistoryWindow', () => {
it('#getAverage should work without any observations', () => {
const hw = new HistoryWindow(3);
expect(hw.getAverage(1)).toBe(0);
});
it('Window size remains constant', () => {
const hw = new HistoryWindow(7);
for (let i = 0; i < 100; i++) {
hw.addObservation(i);
expect(hw.size).toBe(7);
}
});
it.each([
[-1000],
[-1],
[0],
[9999],
// [NaN] assuming this is nonsense input
])('#getAverage works given bad input: %s', (badInput) => {
const hw = new HistoryWindow(3);
expect(hw.getAverage(badInput)).toBe(0);
});
const WINDOW_SIZE = 3;
it.each([
{ name: 'base case', observations: [0.44, 0.55, 0.66], averageLast: 3, expected: 0.55 },
{
name: 'reverse base case',
observations: [0.44, 0.55, 0.66].reverse(),
averageLast: 3,
expected: 0.55,
}, // should be same as above
{
name: 'include one observation',
observations: [0.44, 0.55, 0.66],
averageLast: 1,
expected: 0.44,
},
{
name: 'include excess observations',
observations: [0.201, 0.33, 0.44],
averageLast: 4,
expected: 0.33,
},
{
name: 'subset of observations',
observations: [0.44, 0.55, 0.66],
averageLast: 2,
expected: 0.5,
},
{
name: 'includes at least one observation',
observations: [0.44, 0.55, 0.66],
averageLast: -1,
expected: 0.44,
},
{
name: 'excess observations',
observations: [1, 0.99, 0.55, 0.66, 0.44, 0.55, 0.66, 0.77],
averageLast: 1000,
expected: 0.85,
},
{
name: 'bad observation data',
observations: [-1, -0.99, -0.55, -0.66, -0.44],
averageLast: 10000,
expected: 0,
},
])('$name', ({ observations, averageLast, expected }) => {
const lw = new HistoryWindow(WINDOW_SIZE);
// reverse so that our test observations are in the order they appear above
for (const observation of observations.reverse()) {
lw.addObservation(observation);
}
expect(lw.getAverage(averageLast)).toBe(expected);
});
});

View file

@ -1,41 +0,0 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the "Elastic License
* 2.0", the "GNU Affero General Public License v3.0 only", and the "Server Side
* Public License v 1"; you may not use this file except in compliance with, at
* your election, the "Elastic License 2.0", the "GNU Affero General Public
* License v3.0 only", or the "Server Side Public License, v 1".
*/
/** We .ceil to rather _slightly_ over-report usage in certain circumstances */
const twoDeci = (num: number) => Math.ceil(num * 100) / 100;
export class HistoryWindow {
readonly #window: number[];
readonly #size: number;
constructor(size: number) {
this.#size = size;
this.#window = new Array(this.#size).fill(0);
}
public get size(): number {
return this.#window.length;
}
addObservation(value: number) {
this.#window.unshift(Math.max(0, value));
this.#window.pop();
}
/**
* @param includeObservations number of observations to include in calculation. Will be normalized to be within the window size.
*/
getAverage(includeObservations: number) {
includeObservations = Math.min(Math.max(1, includeObservations), this.size);
return twoDeci(
this.#window.slice(0, includeObservations).reduce((acc, val) => acc + val, 0) /
includeObservations
);
}
}

View file

@ -27,6 +27,7 @@
"@kbn/core-logging-server-mocks",
"@kbn/core-elasticsearch-server-mocks",
"@kbn/core-http-server",
"@kbn/safer-lodash-set",
],
"exclude": [
"target/**/*",

View file

@ -10,6 +10,7 @@
export type { MetricsServiceSetup, MetricsServiceStart } from './src/contracts';
export type { MetricsCollector, IEventLoopDelaysMonitor } from './src/collectors';
export type {
EluMetrics,
OpsMetrics,
IntervalHistogram,
OpsProcessMetrics,

View file

@ -8,7 +8,8 @@
*/
import type { Observable } from 'rxjs';
import type { OpsMetrics } from './metrics';
import type { EluMetrics, OpsMetrics } from './metrics';
/**
* APIs to retrieves metrics gathered and exposed by the core platform.
*
@ -18,6 +19,11 @@ export interface MetricsServiceSetup {
/** Interval metrics are collected in milliseconds */
readonly collectionInterval: number;
/**
* Retrieve an observable emitting {@link EluMetrics}.
*/
getEluMetrics$(): Observable<EluMetrics>;
/**
* Retrieve an observable emitting the {@link OpsMetrics} gathered.
* The observable will emit an initial value during core's `start` phase, and a new value every fixed interval of time,

View file

@ -222,3 +222,20 @@ export interface OpsMetrics {
/** number of current concurrent connections to the server */
concurrent_connections: OpsServerMetrics['concurrent_connections'];
}
export interface EluMetrics {
/**
* The long-term event loop utilization history.
*/
long: number;
/**
* The medium-term event loop utilization history.
*/
medium: number;
/**
* The short-term event loop utilization history.
*/
short: number;
}

View file

@ -256,6 +256,7 @@ export function createPluginSetupContext<TPlugin, TPluginDependencies>({
},
metrics: {
collectionInterval: deps.metrics.collectionInterval,
getEluMetrics$: deps.metrics.getEluMetrics$,
getOpsMetrics$: deps.metrics.getOpsMetrics$,
},
savedObjects: {
@ -373,6 +374,7 @@ export function createPluginStartContext<TPlugin, TPluginDependencies>({
},
metrics: {
collectionInterval: deps.metrics.collectionInterval,
getEluMetrics$: deps.metrics.getEluMetrics$,
getOpsMetrics$: deps.metrics.getOpsMetrics$,
},
uiSettings: {

View file

@ -45,6 +45,7 @@ import type {
PrebootRequestHandlerContext,
} from '@kbn/core-http-request-handler-context-server';
import { RenderingService } from '@kbn/core-rendering-server-internal';
import { HttpRateLimiterService } from '@kbn/core-http-rate-limiter-internal';
import { HttpResourcesService } from '@kbn/core-http-resources-server-internal';
import type {
InternalCorePreboot,
@ -80,6 +81,7 @@ export class Server {
private readonly environment: EnvironmentService;
private readonly node: NodeService;
private readonly metrics: MetricsService;
private readonly httpRateLimiter: HttpRateLimiterService;
private readonly httpResources: HttpResourcesService;
private readonly status: StatusService;
private readonly logging: LoggingService;
@ -133,6 +135,7 @@ export class Server {
this.metrics = new MetricsService(core);
this.status = new StatusService(core);
this.coreApp = new CoreAppsService(core);
this.httpRateLimiter = new HttpRateLimiterService();
this.httpResources = new HttpResourcesService(core);
this.logging = new LoggingService(core);
this.coreUsageData = new CoreUsageDataService(core);
@ -343,6 +346,10 @@ export class Server {
i18n: i18nServiceSetup,
});
this.httpRateLimiter.setup({
http: httpSetup,
metrics: metricsSetup,
});
const httpResourcesSetup = this.httpResources.setup({
http: httpSetup,
rendering: renderingSetup,
@ -442,6 +449,7 @@ export class Server {
const featureFlagsStart = this.featureFlags.start();
this.httpRateLimiter.start();
this.status.start();
this.coreStart = {
@ -484,6 +492,7 @@ export class Server {
this.log.debug('stopping server');
this.coreApp.stop();
this.httpRateLimiter.stop();
await this.analytics.stop();
await this.http.stop(); // HTTP server has to stop before savedObjects and ES clients are closed to be able to gracefully attempt to resolve any pending requests
await this.plugins.stop();

View file

@ -77,6 +77,7 @@
"@kbn/core-user-profile-server-mocks",
"@kbn/core-user-profile-server-internal",
"@kbn/core-feature-flags-server-internal",
"@kbn/core-http-rate-limiter-internal",
],
"exclude": [
"target/**/*",

View file

@ -90,6 +90,7 @@ export const registerStatusRoute = ({
tags: ['api', 'security:acceptJWT', 'oas-tag:system'],
access: 'public', // needs to be public to allow access from "system" users like k8s readiness probes.
summary: `Get Kibana's current status`,
excludeFromRateLimiter: true,
},
validate: {
request: {

View file

@ -19,6 +19,7 @@ export const registerPrebootStatusRoute = ({ router }: { router: IRouter }) => {
authRequired: false,
tags: ['api'],
access: 'public', // needs to be public to allow access from "system" users like k8s readiness probes.
excludeFromRateLimiter: true,
},
validate: false,
},

View file

@ -100,7 +100,12 @@ describe('StatusService', () => {
expect(prebootRouterMock.get).toHaveBeenCalledWith(
{
path: '/api/status',
options: { authRequired: false, tags: ['api'], access: 'public' },
options: {
authRequired: false,
tags: ['api'],
access: 'public',
excludeFromRateLimiter: true,
},
validate: false,
},
expect.any(Function)

View file

@ -62,6 +62,7 @@ export function registerStatsRoute({
},
options: {
authRequired: !config.allowAnonymous,
excludeFromRateLimiter: true,
// The `api` tag ensures that unauthenticated calls receive a 401 rather than a 302 redirect to login page.
// The `security:acceptJWT` tag allows route to be accessed with JWT credentials. It points to
// ROUTE_TAG_ACCEPT_JWT from '@kbn/security-plugin/server' that cannot be imported here directly.

View file

@ -398,6 +398,8 @@
"@kbn/core-http-context-server-mocks/*": ["packages/core/http/core-http-context-server-mocks/*"],
"@kbn/core-http-plugin": ["test/plugin_functional/plugins/core_http"],
"@kbn/core-http-plugin/*": ["test/plugin_functional/plugins/core_http/*"],
"@kbn/core-http-rate-limiter-internal": ["src/core/packages/http/rate-limiter-internal"],
"@kbn/core-http-rate-limiter-internal/*": ["src/core/packages/http/rate-limiter-internal/*"],
"@kbn/core-http-request-handler-context-server": ["src/core/packages/http/request-handler-context-server"],
"@kbn/core-http-request-handler-context-server/*": ["src/core/packages/http/request-handler-context-server/*"],
"@kbn/core-http-request-handler-context-server-internal": ["src/core/packages/http/request-handler-context-server-internal"],

View file

@ -4641,6 +4641,10 @@
version "0.0.0"
uid ""
"@kbn/core-http-rate-limiter-internal@link:src/core/packages/http/rate-limiter-internal":
version "0.0.0"
uid ""
"@kbn/core-http-request-handler-context-server-internal@link:src/core/packages/http/request-handler-context-server-internal":
version "0.0.0"
uid ""