Infrastructure to report upon document parsing (#97961)

In serverless we would like to report on (meter and bill) document ingestion. The metering should be agnostic to the document format (the document structure should be normalised), hence we should allow creating XContentParsers that keep track of parsed fields and values.
There are two places where an ingested document is parsed:
1. in the 'raw bulk' path, when a request is sent without pipelines
2. in the 'ingest service', when a request is sent with pipelines
(Parsing can occur twice when dynamic mappings are calculated; this PR takes that into account and prevents double billing.)
We also want to make sure that the metering logic is not executed unnecessarily when a document has already been reported. That is, if a document was reported in IngestService, there is no point in wrapping the XContentParser again.

This commit introduces `DocumentReporterPlugin`, an internal plugin that will be implemented in serverless. This plugin should return a `DocumentParsingObserver` supplier, which will create a `DocumentParsingObserver`. A `DocumentParsingObserver` is used to wrap an `XContentParser` with an implementation that keeps track of parsed fields and values (performs the metering) and allows sending that information, along with an index name, to a MeteringReporter.
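
The wrapping idea described above can be illustrated with a minimal, self-contained sketch. It does not use the real Elasticsearch types: `FieldParser`, `ParsingObserver`, `CountingObserver`, `EMPTY` and the character-count "metering" are all hypothetical stand-ins invented for illustration, and the actual interfaces in this commit may look different. The sketch only shows the pattern: an observer wraps a parser, counts what flows through it, reports the total together with an index name, and a no-op observer is used when the document was already reported.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MeteringSketch {

    /** Hypothetical stand-in for XContentParser: yields field/value pairs. */
    interface FieldParser {
        boolean hasNext();
        Map.Entry<String, Object> next();
    }

    /** Hypothetical observer: wraps a parser and reports once parsing is done. */
    interface ParsingObserver {
        FieldParser wrap(FieldParser parser);
        void close(String indexName);
    }

    /** No-op observer for documents that were already reported: no wrapping, no report. */
    static final ParsingObserver EMPTY = new ParsingObserver() {
        public FieldParser wrap(FieldParser parser) { return parser; }
        public void close(String indexName) { }
    };

    /** Counts characters of field names and values (an assumed, simplistic metric). */
    static final class CountingObserver implements ParsingObserver {
        private final List<String> reports; // stand-in for a metering reporter
        private long counted;

        CountingObserver(List<String> reports) { this.reports = reports; }

        public FieldParser wrap(FieldParser delegate) {
            return new FieldParser() {
                public boolean hasNext() { return delegate.hasNext(); }
                public Map.Entry<String, Object> next() {
                    Map.Entry<String, Object> e = delegate.next();
                    // Metering happens as a side effect of normal parsing.
                    counted += e.getKey().length() + String.valueOf(e.getValue()).length();
                    return e;
                }
            };
        }

        public void close(String indexName) {
            reports.add(indexName + ":" + counted);
        }
    }

    /** Builds a trivial parser over an already-normalised document. */
    static FieldParser parserOf(Map<String, Object> doc) {
        Iterator<Map.Entry<String, Object>> it = doc.entrySet().iterator();
        return new FieldParser() {
            public boolean hasNext() { return it.hasNext(); }
            public Map.Entry<String, Object> next() { return it.next(); }
        };
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("user", "kim");
        doc.put("age", 42);

        List<String> reports = new ArrayList<>();
        ParsingObserver observer = new CountingObserver(reports);
        FieldParser parser = observer.wrap(parserOf(doc));
        while (parser.hasNext()) {
            parser.next();
        }
        observer.close("logs");
        System.out.println(reports); // prints [logs:12]
    }
}
```

In this sketch the caller that already reported a document would pass `EMPTY` downstream instead of a fresh `CountingObserver`, so the second parse is neither wrapped nor billed.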
Przemyslaw Gomulka 2023-08-01 13:55:18 +02:00 committed by GitHub
parent 42cc99f204
commit 999489ce04
42 changed files with 621 additions and 70 deletions

@@ -297,7 +297,6 @@ module org.elasticsearch.server {
 exports org.elasticsearch.plugins;
 exports org.elasticsearch.plugins.interceptor to org.elasticsearch.security;
 exports org.elasticsearch.plugins.spi;
-exports org.elasticsearch.plugins.internal to org.elasticsearch.settings.secure;
 exports org.elasticsearch.repositories;
 exports org.elasticsearch.repositories.blobstore;
 exports org.elasticsearch.repositories.fs;
@@ -372,6 +371,7 @@ module org.elasticsearch.server {
 exports org.elasticsearch.action.datastreams.lifecycle;
 exports org.elasticsearch.action.downsample;
+exports org.elasticsearch.plugins.internal to co.elastic.elasticsearch.metering, org.elasticsearch.settings.secure;
 provides java.util.spi.CalendarDataProvider with org.elasticsearch.common.time.IsoCalendarDataProvider;
 provides org.elasticsearch.xcontent.ErrorOnUnknown with org.elasticsearch.common.xcontent.SuggestingErrorOnUnknown;