Report performance metrics docs followup (#139852)

* doc

* Update dev_docs/tutorials/adding_performance_metrics.mdx

* docs

* event name

Co-authored-by: Baturalp Gurdin <9674241+suchcodemuchwow@users.noreply.github.com>
date: 2022-07-07
tags: ['kibana', 'onboarding', 'setup', 'performance', 'development', 'telemetry']
---
Reporting performance events is as simple as the examples below. However, we strongly recommend following the guidelines on what to report in each field.
## Reporting performance events
### Simple performance events
Let's assume we intend to report the performance of a specific action called `APP_ACTION`.
In order to do so, we need to first measure the timing of that action.
Once we have the time measurement, we can use the `reportPerformanceMetricEvent` API to report it.
The most basic form of reporting would be:
```typescript
reportPerformanceMetricEvent(analytics, {
  eventName: APP_ACTION,
  duration, // Duration in milliseconds
});
```
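
For context, here is one way the `duration` above might be obtained. This is a minimal sketch using `performance.now()`; the `analytics` client and `doAppAction` are hypothetical stand-ins, not part of the API:

```typescript
import { reportPerformanceMetricEvent } from '@kbn/ebt-tools';

// Hypothetical stand-ins: the analytics client from core and the action under measurement
declare const analytics: Parameters<typeof reportPerformanceMetricEvent>[0];
declare function doAppAction(): Promise<void>;

const APP_ACTION = 'app_action';

async function runAndReportAppAction() {
  const start = performance.now(); // High-resolution timestamp, in milliseconds
  await doAppAction();
  const duration = Math.round(performance.now() - start); // Values should be integers

  reportPerformanceMetricEvent(analytics, { eventName: APP_ACTION, duration });
}
```
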
Once executed, the metric is delivered to the `stack-telemetry` cluster, along with the event's context.
The data is updated periodically, so you might have to wait up to 30 minutes to see your data in the index.
Once indexed, this metric will appear in the `ebt-kibana` index. It is also mapped into an additional index, dedicated to performance metrics.
We recommend using the `Kibana Performance` space on the telemetry cluster, where you get an `index pattern` to easily access this data.
Each document in the index has the following structure:
```typescript
{
  "_index": "backing-ebt-kibana-browser-performance-metrics-000001", // Performance metrics are stored in a dedicated simplified index (browser / server).
  "_source": {
    "timestamp": "2022-08-31T11:29:58.275Z",
    "event_type": "performance_metric", // All events share a common event type to simplify mapping
    "eventName": "dashboard_loaded", // Event name as specified when reporting it
    "duration": 736, // Event duration as specified when reporting it
    "context": { // Context holds information identifying the deployment, version, application and page that generated the event
      "version": "8.5.0-SNAPSHOT",
      "applicationId": "dashboards",
      "page": "app",
      "entityId": "61c58ad0-3dd3-11e8-b2b9-5d5dc1715159",
      "branch": "main",
      "labels": {
        "journeyName": "flight_dashboard",
        ...
      },
      ...
    },
    ...
  },
}
```
### Performance events with breakdowns and metadata
Let's assume we are interested in benchmarking the performance of a more complex event `COMPLEX_APP_ACTION`, which is made up of two steps:
- `INSPECT_DATA` measures the time it takes to retrieve a user's profile and check whether there is a cached version of their data.
  - If the cached data is fresh, it proceeds with the flow `use-local-data`.
  - If the data needs to be refreshed, it proceeds with the flow `load-data-from-api`.
- `PROCESS_DATA` loads and processes the data, depending on the flow chosen in the previous step.

We can use the additional options supported by the `reportPerformanceMetricEvent` API:
```typescript
import { reportPerformanceMetricEvent } from '@kbn/ebt-tools';

reportPerformanceMetricEvent(analytics, {
  eventName: COMPLEX_APP_ACTION,
  duration, // Total duration in milliseconds
  key1: INSPECT_DATA, // Claiming free key1 to be used for INSPECT_DATA
  value1: durationOfStepA, // Total duration of step INSPECT_DATA in milliseconds
  key2: PROCESS_DATA, // Claiming free key2 to be used for PROCESS_DATA
  value2: durationOfStepB, // Total duration of step PROCESS_DATA in milliseconds
  meta: {
    dataSource: 'load-data-from-api', // Providing event-specific context. This can be useful to create meaningful aggregations.
  },
});
```
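
To make this concrete, here is a minimal sketch of how the step durations and the chosen flow might be captured end to end. The `analytics` client, `fetchProfile`, `isCacheFresh` and `loadData` are hypothetical stand-ins for illustration only:

```typescript
import { reportPerformanceMetricEvent } from '@kbn/ebt-tools';

// Hypothetical stand-ins for this sketch
declare const analytics: Parameters<typeof reportPerformanceMetricEvent>[0];
declare function fetchProfile(): Promise<unknown>;
declare function isCacheFresh(profile: unknown): boolean;
declare function loadData(profile: unknown, fromCache: boolean): Promise<unknown>;

const COMPLEX_APP_ACTION = 'complex_app_action';
const INSPECT_DATA = 'inspect_data';
const PROCESS_DATA = 'process_data';

async function runComplexAppAction() {
  const start = performance.now();

  // Step 1 (INSPECT_DATA): retrieve the user's profile and decide which flow to take
  const profile = await fetchProfile();
  const fromCache = isCacheFresh(profile);
  const durationOfStepA = performance.now() - start;

  // Step 2 (PROCESS_DATA): load and process the data using the chosen flow
  const stepBStart = performance.now();
  await loadData(profile, fromCache);
  const durationOfStepB = performance.now() - stepBStart;

  reportPerformanceMetricEvent(analytics, {
    eventName: COMPLEX_APP_ACTION,
    duration: Math.round(performance.now() - start),
    key1: INSPECT_DATA,
    value1: Math.round(durationOfStepA),
    key2: PROCESS_DATA,
    value2: Math.round(durationOfStepB),
    meta: { dataSource: fromCache ? 'use-local-data' : 'load-data-from-api' },
  });
}
```
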
Normally, reporting an event requires registering its event type first. However, since we want performance metrics to be uniform across Kibana, the `performance_metric` event type is already registered with the structure summarized above. We welcome feedback based on your use cases.

This event will be indexed with the following structure:
```typescript
{
  "_index": "backing-ebt-kibana-browser-performance-metrics-000001", // Performance metrics are stored in a dedicated simplified index (browser / server).
  "_source": {
    "timestamp": "2022-08-31T11:29:58.275Z",
    "event_type": "performance_metric", // All events share a common event type to simplify mapping
    "eventName": COMPLEX_APP_ACTION, // Event name as specified when reporting it
    "duration": 736, // Event duration as specified when reporting it
    "key1": INSPECT_DATA, // The key name of INSPECT_DATA
    "value1": 250, // The duration of step INSPECT_DATA
    "key2": PROCESS_DATA, // The key name of PROCESS_DATA
    "value2": 520, // The duration of step PROCESS_DATA
    "meta": {
      "dataSource": "load-data-from-api",
    },
    "context": { // Context holds information identifying the deployment, version, application and page that generated the event
      "version": "8.5.0-SNAPSHOT",
      "cluster_name": "job-ftr_configs_2-cluster-ftr",
      "pageName": "application:dashboards:app",
      "applicationId": "dashboards",
      "page": "app",
      "entityId": "61c58ad0-3dd3-11e8-b2b9-5d5dc1715159",
      "branch": "main",
      "labels": {
        "journeyName": "flight_dashboard",
      },
      ...
    },
    ...
  },
}
```
The performance metrics API supports **5 numbered free fields** that can be used to report numeric metrics you intend to analyze.
They can hold any type of numeric information you may want to report, letting you create your own flexible schema
without having to add custom mappings.
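
For example, a `dashboard_loaded` event might claim one free field for a sub-duration and another for a count (any numeric metric works, not just durations):

```typescript
import { reportPerformanceMetricEvent } from '@kbn/ebt-tools';

reportPerformanceMetricEvent(analytics, {
  eventName: 'dashboard_loaded',
  duration: 2265, // Total load time, in milliseconds
  key1: 'time_to_data', // Claiming free key1 for the time it took the data to arrive
  value1: 1801,
  key2: 'num_of_panels', // Claiming free key2 for a count rather than a duration
  value2: 7,
});
```
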
### Guidelines on what to report in performance metric events

In order to simplify the consumption of performance telemetry, we'd like to standardize the shape of events. This requires considering the following aspects:
- `Performance` - the amount and size of events being sent
- `Discoverability` - how easy it is to find different events, dimensions and metrics
- `Queryability` - the ability to write complex queries on top of telemetry data; for example, getting all dashboard load events that took over 10 seconds for dashboards that contain less than 5 panels
- `Flexibility` - teams should be able to flexibly design their event structure without having to define custom mappings (except for edge cases), e.g. a team can decide that the first free field is always used for the page load time
- `Ease of visualization` - it should be easy to visualize the data
- `Index field explosion` - having too many field names in the index damages performance when visualizing the data

An event schema is composed of fields designed for specific purposes:
- `Event name` - we decided to report all metric events under a single `performance_metric` event type, so that we can avoid registering it multiple times. This requires a special standard field for `eventName`.
- `Standardized fields` (`duration`) - these fields represent common values that might be reported for performance, such as duration for time-based reports, status, heap size or memory size. Standardized fields are mapped as-is to the index, so they can be visualized.
- `Free fields` (`key1`, `value1`, ..., `key5`, `value5`) - a limited set of numbered free key-value pairs that can be used to report additional use-case-specific fields. Teams may use them to create their own internal sub-schema. For example, a team may choose to use `key1` to always report the loading time of a certain component that is always present on the page. Free fields are mapped, if present, to the index, so they can be visualized. If a free field becomes popular, it can be promoted to a standardized field, freeing it up for other uses.
- `Non-standard fields` (`kibana_version`, `protocol`, ...) - additional fields may be added to an event by placing them in the `meta` object. However, those won't be automatically mapped to the index. Since `meta` is a flattened field, these fields are searchable by default, but an additional field mapping would be required to visualize them.

If you want to provide event-specific context, you can add properties to the `meta` field.
The `meta` object is stored as a [flattened field](https://www.elastic.co/guide/en/elasticsearch/reference/current/flattened.html), hence it's searchable and can be used to further break down event metrics.

**Note**: it's important to keep in mind that `free field` values are integers; floating point values will be rounded.

### How to choose and measure events

Events should be meaningful and can have multiple sub-metrics which give specific information about certain actions. For example,
a page-load event can be composed of render time, data load time during the page load, and so on. It's important to understand that these
events are meaningful for performance investigations and can be used in visualizations and aggregations. With this in mind,
creating an event for something like cpuUsage brings no value, because it carries no context of its own, and reporting many such
events from different places in the code introduces too much variability into the performance analysis of your code. However, it can be a useful
attribute to track if you need to look inside a specific event, e.g. `page-load`.
- **Make sure that the event is clearly defined and consistent** (i.e. the same code flow is executed each time).
  Consider the start point and end point of the measurement and what happens between those points.
  For example: an `app-data-load` event should not include the time it takes to render the data.
- **Choose event names wisely**.
  Try to balance event name specificity. Calling an event `load` is too generic; calling an event `tsvb-data-load` is too specific (instead, the visualization
  type can be specified in a `meta` field).
- **Distinguish between flows with event context**.
  If a function that loads data is called when an app loads, when the user changes filters, and when the refresh button is clicked, you should distinguish between
  these flows by specifying a `meta` field (see the sketch after this list).
- **Avoid duplicate events**.
  Make sure that measurement and reporting happen at a point in the code that is executed only once.
  For example, make sure that refresh events are reported only once per button click.
- **Measure as close to the event as possible**.
  For example, if you're measuring the execution of a specific React effect, place the measurement code inside the effect.
  When measuring a page load, start the measurement right before the navigation is performed and stop it as soon as all resources are loaded.
- **Use the `window.performance` API** (see the sketch after this list).
  The [`performance.now()`](https://developer.mozilla.org/en-US/docs/Web/API/Performance/now) API provides accurate, high-resolution timestamps.
  The [`performance.mark()`](https://developer.mozilla.org/en-US/docs/Web/API/Performance/mark) API can be used to track performance without having to pollute the
  code.
- **Keep performance in mind**. Reporting the performance of Kibana should never harm its own performance.
Avoid sending events too frequently (`onMouseMove`) or adding serialized JSON objects (whole `SavedObjects`) into the meta object.
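
Putting several of these guidelines together, here is a minimal sketch that measures with `performance.mark()`/`performance.measure()` and distinguishes flows via `meta`. The `analytics` client, `loadDashboardData` helper and the `trigger` values are hypothetical, made up for illustration:

```typescript
import { reportPerformanceMetricEvent } from '@kbn/ebt-tools';

// Hypothetical stand-ins for this sketch
declare const analytics: Parameters<typeof reportPerformanceMetricEvent>[0];
declare function loadDashboardData(): Promise<void>;

type Trigger = 'app-load' | 'filters-change' | 'refresh';

async function loadAndReport(trigger: Trigger) {
  // performance.mark() keeps timing concerns out of the business logic
  performance.mark('app-data-load:start');
  await loadDashboardData();
  performance.mark('app-data-load:end');

  const { duration } = performance.measure(
    'app-data-load',
    'app-data-load:start',
    'app-data-load:end'
  );

  // Called once per invocation, so e.g. a refresh click yields exactly one event
  reportPerformanceMetricEvent(analytics, {
    eventName: 'app-data-load',
    duration: Math.round(duration),
    meta: { trigger }, // Distinguishes the app-load / filters-change / refresh flows
  });
}
```
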
# Analytics Client