kibana/packages/kbn-apm-synthtrace
Yngrid Coello 511f77c231
[Dataset quality] Failure store support (#206758)
Closes https://github.com/elastic/logs-dev/issues/183,
https://github.com/elastic/logs-dev/issues/184 and
https://github.com/elastic/logs-dev/issues/185.

## Summary
This PR adds failure store support to the dataset quality page. The
following acceptance criteria items were resolved:

### Dataset quality page
- [x] A column for Failed docs is included in the table
- [x] A tooltip is placed in the title of the column
- [x] A % of documents inside Failure store is calculated for every
dataStream
- [x] If % is less than 0.0001 but greater than 0 we should show ⚠
symbol next to the ~0 value (as we do with degraded docs)
- [x] Failed docs percentages greater than 0 should link to discover

 🎥 Demo 


https://github.com/user-attachments/assets/6d9e3f4c-02d9-43ab-88cb-ae70716b05d9

### Dataset details page
- [x] A metric, Failed docs, is included in the Overview panel under
Data set quality. This metric includes the number of documents inside
the failure store for the specific dataStream.
- [x] A tooltip is placed in the title of the Failed docs metric with
message: `The percentage of docs sent to failure store due to an issue
during ingestion.`
- [x] Degraded docs graph section is transformed to Document trends
allowing the users to switch between Degraded docs and Failed docs
trends over time.
- [x] A new chart for failed documents is created with links to
discover/Logs explorer using the right dataView

 🎥 Demo 


https://github.com/user-attachments/assets/6a3a1f09-2668-4e83-938e-ecdda798c199

### Failed docs ingestion issue flyout

- [x] Whenever documents are found in failure store we should list
Document indexing failed in Quality issues table
- [x] User should be able to expand Document indexing failed and see
more information in the flyout
- [x] The flyout will show Docs count, an aggregation of the number of
documents inside failure store for the selected timeframe
- [x] The flyout will show Last occurrence, the datetime registered for
the most recent document in the failure store.
- [x] The flyout will contain a section called Error messages where a
list of unique error messages should be shown, exposing Content (error
message) and Type (Error Type).
- [x] Type should contain a tooltip whose message (`Error message
category`) explains to users how we are categorising the errors.
- [x] Other issues inside Quality issues table will have field ignored
appended, and the field will be shown in bold.


https://github.com/user-attachments/assets/94dc81f0-9720-4596-b256-c9d289cefd94

Note: This PR was reconstructed from
https://github.com/elastic/kibana/pull/199806 which it supersedes.

## How to test

1. Execute `failed_logs` synthtrace scenario
2. Open dataset quality page

## Follow ups
- Enable in serverless
- Deployment agnostic tests cannot be added until we enable this in
serverless
- FTR tests will be added as part of
https://github.com/elastic/logs-dev/issues/182

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
2025-01-23 09:13:28 +01:00
bin Adds AGPL 3.0 license (#192025) 2024-09-06 19:02:41 -06:00
src [Dataset quality] Failure store support (#206758) 2025-01-23 09:13:28 +01:00
index.ts [Inventory] Adding initial e2e structure (#196560) 2024-10-17 08:50:11 -05:00
jest.config.js Adds AGPL 3.0 license (#192025) 2024-09-06 19:02:41 -06:00
kibana.jsonc Sustainable Kibana Architecture: Categorise straightforward packages (#199630) 2024-11-22 10:33:25 +01:00
package.json Adds AGPL 3.0 license (#192025) 2024-09-06 19:02:41 -06:00
README.md [Dataset quality] Enable page for synthetics (#191846) 2024-09-04 15:21:17 +02:00
tsconfig.json [Logs Overview] Overview component (iteration 1) (attempt 2) (#195673) 2024-10-10 12:46:25 +02:00

@kbn/apm-synthtrace

@kbn/apm-synthtrace is a tool in technical preview to generate synthetic APM data. It is intended to be used for development and testing of the Elastic APM app in Kibana.

At a high level, the module works by modeling APM events/metricsets with a fluent API. The models can then be serialized and converted to Elasticsearch documents. In the future we might support APM Server as an output as well.

Usage

This section assumes that you've installed Kibana's dependencies by running yarn kbn bootstrap in the repository's root folder.

This library can currently be used in two ways:

  • Imported as a Node.js module, for instance to be used in Kibana's functional test suite.
  • With a command line interface, to index data based on a specified scenario.

Using the Node.js module

Concepts

  • Service: a logical grouping for a monitored service. A Service object contains fields like service.name, service.environment and agent.name.
  • Instance: a single instance of a monitored service. E.g., the workload for a monitored service might be spread across multiple containers. An Instance object contains fields like service.node.name and container.id.
  • Timerange: an object that will return an array of timestamps based on an interval and a rate. These timestamps can be used to generate events/metricsets.
  • Transaction, Span, APMError and Metricset: events/metricsets that occur on an instance. For more background, see the explanation of the APM data model.
  • Log: an instance of a log-generating service, which supports additional helpers to customise fields like messages and logLevel.
  • SyntheticsMonitor: an instance of a Synthetics monitor. For more information, see Synthetic monitoring.
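
To make the Timerange concept concrete, here is a minimal, self-contained sketch of how an interval and a rate can be turned into an array of timestamps. The function generateTimestamps and its parameters are hypothetical names chosen for illustration; this is not the implementation used by @kbn/apm-synthtrace, and the real Timerange may place timestamps differently within each interval.

```typescript
// Illustrative only: a hypothetical stand-in for the Timerange concept,
// not the implementation used by @kbn/apm-synthtrace.
function generateTimestamps(
  from: number,
  to: number,
  intervalMs: number,
  rate: number
): number[] {
  const timestamps: number[] = [];
  // Walk the range one interval at a time (the end is exclusive in this sketch).
  for (let bucket = from; bucket < to; bucket += intervalMs) {
    // Emit `rate` timestamps per interval, all placed at the interval start.
    for (let i = 0; i < rate; i++) {
      timestamps.push(bucket);
    }
  }
  return timestamps;
}

const rangeStart = new Date('2021-01-01T12:00:00.000Z').getTime();
const rangeEnd = new Date('2021-01-01T12:15:00.000Z').getTime();

// 15 one-minute intervals at 10 timestamps per interval.
const sketchTimestamps = generateTimestamps(rangeStart, rangeEnd, 60 * 1000, 10);
console.log(sketchTimestamps.length); // 150
```

Each timestamp can then be mapped to one or more events, which is the role flatMap plays in the package's fluent API.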

Example

import { service, timerange, toElasticsearchOutput } from '@kbn/apm-synthtrace';

const instance = service({ name: 'synth-go', environment: 'production', agentName: 'go' }).instance(
  'instance-a'
);

const from = new Date('2021-01-01T12:00:00.000Z').getTime();
const to = new Date('2021-01-01T12:15:00.000Z').getTime();

const traceEvents = timerange(from, to)
  .interval('1m')
  .rate(10)
  .flatMap((timestamp) =>
    instance
      .transaction({ transactionName: 'GET /api/product/list' })
      .timestamp(timestamp)
      .duration(1000)
      .success()
      .children(
        instance
          .span('GET apm-*/_search', 'db', 'elasticsearch')
          .timestamp(timestamp + 50)
          .duration(900)
          .destination('elasticsearch')
          .success()
      )
      .serialize()
  );

const metricsets = timerange(from, to)
  .interval('30s')
  .rate(1)
  .flatMap((timestamp) =>
    instance
      .appMetrics({
        'system.memory.actual.free': 800,
        'system.memory.total': 1000,
        'system.cpu.total.norm.pct': 0.6,
        'system.process.cpu.total.norm.pct': 0.7,
      })
      .timestamp(timestamp)
      .serialize()
  );

const esEvents = toElasticsearchOutput(traceEvents.concat(metricsets));

Generating metricsets

@kbn/apm-synthtrace can also automatically generate transaction metrics, span destination metrics and transaction breakdown metrics based on the generated trace events. If we expand on the previous example:

import {
  getTransactionMetrics,
  getSpanDestinationMetrics,
  getBreakdownMetrics,
} from '@kbn/apm-synthtrace';

const esEvents = toElasticsearchOutput([
  ...traceEvents,
  ...getTransactionMetrics(traceEvents),
  ...getSpanDestinationMetrics(traceEvents),
  ...getBreakdownMetrics(traceEvents),
]);
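
As a rough intuition for what deriving metrics from trace events involves, the sketch below groups transaction events into one-minute buckets per transaction name and keeps a count and a summed duration. Everything in it (TransactionSketch, TransactionMetricSketch, aggregatePerMinute, the one-minute bucket size and the output fields) is an assumption made for illustration; it does not reproduce the fields or the bucketing that getTransactionMetrics actually uses.

```typescript
// Hypothetical event shape; real APM events carry many more fields.
interface TransactionSketch {
  '@timestamp': number;
  'transaction.name': string;
  'transaction.duration.us': number;
}

// Hypothetical metric shape, one document per bucket and transaction name.
interface TransactionMetricSketch {
  '@timestamp': number; // start of the one-minute bucket
  'transaction.name': string;
  count: number;
  totalDurationUs: number;
}

function aggregatePerMinute(events: TransactionSketch[]): TransactionMetricSketch[] {
  const buckets = new Map<string, TransactionMetricSketch>();
  for (const event of events) {
    // Round the timestamp down to the start of its minute.
    const bucketStart = Math.floor(event['@timestamp'] / 60000) * 60000;
    const key = `${bucketStart}:${event['transaction.name']}`;
    let metric = buckets.get(key);
    if (metric === undefined) {
      metric = {
        '@timestamp': bucketStart,
        'transaction.name': event['transaction.name'],
        count: 0,
        totalDurationUs: 0,
      };
      buckets.set(key, metric);
    }
    metric.count += 1;
    metric.totalDurationUs += event['transaction.duration.us'];
  }
  // Map preserves insertion order, so buckets come back in first-seen order.
  return Array.from(buckets.values());
}

const sampleTransactions: TransactionSketch[] = [
  { '@timestamp': 0, 'transaction.name': 'GET /api/product/list', 'transaction.duration.us': 1000 },
  { '@timestamp': 30000, 'transaction.name': 'GET /api/product/list', 'transaction.duration.us': 3000 },
  { '@timestamp': 60000, 'transaction.name': 'GET /api/product/list', 'transaction.duration.us': 2000 },
];

const sampleMetrics = aggregatePerMinute(sampleTransactions);
console.log(sampleMetrics.length); // 2
```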

CLI

Via the CLI, you can run scenarios, either using a fixed time range or continuously generating data. Scenarios are available in packages/kbn-apm-synthtrace/src/scenarios/.

For live data ingestion:

node scripts/synthtrace simple_trace.ts --target=http://admin:changeme@localhost:9200 --live

For a fixed time window:

node scripts/synthtrace simple_trace.ts --target=http://admin:changeme@localhost:9200 --from=now-24h --to=now

The script will try to automatically find bootstrapped APM indices. If these indices do not exist, the script will exit with an error. It will not bootstrap the indices itself.

Understanding Scenario Files

Scenario files accept three arguments: one mandatory and two optional.

| Argument | Type | Description |
| --- | --- | --- |
| generate | mandatory | The main function, responsible for returning the events that will be indexed |
| bootstrap | optional | If some setup needs to be done before the data is generated, this function provides access to all available ES clients |
| setClient | optional | By default, the apmEsClient is used to generate data. If any other client, such as logsEsClient, needs to be used instead, this is where it should be returned |
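
One way to picture these three arguments is as properties of a single scenario object. The sketch below uses locally defined, hypothetical types (ScenarioSketch and EsClientsSketch, with clients reduced to placeholder strings) so that it runs on its own; the real types and signatures are the ones used by the scenario files in packages/kbn-apm-synthtrace/src/scenarios/ and may differ.

```typescript
// Hypothetical types for illustration; clients are placeholder strings here.
interface EsClientsSketch {
  apmEsClient: string;
  logsEsClient: string;
}

interface ScenarioSketch<TEvent> {
  // Mandatory: returns the events that will be indexed.
  generate: (range: { from: number; to: number }) => TEvent[];
  // Optional: setup that must run before any data is generated.
  bootstrap?: (clients: EsClientsSketch) => Promise<void>;
  // Optional: returns a client other than the default apmEsClient.
  setClient?: (clients: EsClientsSketch) => string;
}

const logsScenario: ScenarioSketch<{ '@timestamp': number; message: string }> = {
  generate: ({ from, to }) => [
    { '@timestamp': from, message: 'first log line' },
    { '@timestamp': to, message: 'last log line' },
  ],
  setClient: ({ logsEsClient }) => logsEsClient,
};

const sketchClients: EsClientsSketch = { apmEsClient: 'apm', logsEsClient: 'logs' };

// Fall back to apmEsClient when the scenario does not provide setClient.
const chosenClient = logsScenario.setClient
  ? logsScenario.setClient(sketchClients)
  : sketchClients.apmEsClient;
console.log(chosenClient); // logs
```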

The following options are supported:

Connection options

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| --target | [string] | | Elasticsearch target |
| --kibana | [string] | | Kibana target, used to bootstrap datastreams/mappings/templates/settings |
| --versionOverride | [string] | | String to be used for observer.version. Defaults to the version of the installed package. |

Note:

  • If --target is not set, Synthtrace will try to detect a locally running Elasticsearch and Kibana.
  • For Elastic Cloud urls, --target will be used to infer the location of the Cloud instance of Kibana.
  • The latest version of the APM integration will automatically be installed and used for observer.version when ingesting APM data. In some cases, you'll want to use --versionOverride to set observer.version explicitly.

Scenario options

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| --from | [date] | now() | The start of the time window |
| --to | [date] | | The end of the time window |
| --live | [boolean] | | Generate and index data continuously |
| --scenarioOpts | | | Raw options specific to the scenario |

Note:

  • The default --to is 15m.
  • You can combine --from and --to with --live to back-fill some data.
  • To specify --scenarioOpts you need to use yargs Objects syntax. (e.g. --scenarioOpts.myOption=myValue)
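
To show what the dotted syntax amounts to, the sketch below collects every --scenarioOpts.key=value argument into a plain object. The helper parseScenarioOpts is hypothetical and deliberately simplified (one level of nesting, string values only); it is not yargs and does not reproduce yargs' type coercion or deeper nesting.

```typescript
// Illustrative only: a simplified, hypothetical parser that mimics how
// dot-notation flags end up as properties of one object. It is not yargs.
function parseScenarioOpts(argv: string[]): Record<string, string> {
  const prefix = '--scenarioOpts.';
  const opts: Record<string, string> = {};
  for (const arg of argv) {
    if (!arg.startsWith(prefix)) continue;
    const rest = arg.slice(prefix.length);
    const eq = rest.indexOf('=');
    if (eq === -1) continue; // this sketch only handles key=value pairs
    opts[rest.slice(0, eq)] = rest.slice(eq + 1);
  }
  return opts;
}

const parsedScenarioOpts = parseScenarioOpts([
  'simple_trace.ts',
  '--live',
  '--scenarioOpts.myOption=myValue',
]);
console.log(parsedScenarioOpts); // { myOption: 'myValue' }
```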

Setup options

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| --clean | [boolean] | false | Clean APM data before indexing new data |
| --workers | [number] | | Number of Node.js worker threads |
| --logLevel | [enum] | info | Log level |
| --type | [string] | apm | Type of data to be generated; log must be passed when generating logs |

Testing

Run the Jest tests:

node scripts/jest --config ./packages/kbn-apm-synthtrace/jest.config.js

Typescript

Run the type checker:

node scripts/type_check.js --project packages/kbn-apm-synthtrace/tsconfig.json