Report performance metrics docs followup (#139852)

* doc

* Update dev_docs/tutorials/adding_performance_metrics.mdx

* docs

* event name

Co-authored-by: Baturalp Gurdin <9674241+suchcodemuchwow@users.noreply.github.com>
date: 2022-07-07
tags: ['kibana', 'onboarding', 'setup', 'performance', 'development', 'telemetry']
---
Reporting performance events is as simple as the examples below. However, we strongly recommend following the guidelines on what to report in each field.
## Reporting performance events
### Simple performance events
Let's assume we intend to report the performance of a specific action called `APP_ACTION`.
In order to do so, we need to first measure the timing of that action.
Once we have the time measurement, we can use the `reportPerformanceMetricEvent` API to report it.
The most basic form of reporting would be:
```typescript
reportPerformanceMetricEvent(analytics, {
  eventName: APP_ACTION,
  duration, // Duration in milliseconds
});
```
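
For context, here is one way the `duration` above might be obtained. This is a minimal sketch using `performance.now()`; the `analytics` client and `doAppAction` are hypothetical stand-ins, not part of the API:

```typescript
import { reportPerformanceMetricEvent } from '@kbn/ebt-tools';

// Hypothetical stand-ins: the analytics client from core and the action under measurement
declare const analytics: Parameters<typeof reportPerformanceMetricEvent>[0];
declare function doAppAction(): Promise<void>;

const APP_ACTION = 'app_action';

async function runAndReportAppAction() {
  const start = performance.now(); // High-resolution timestamp, in milliseconds
  await doAppAction();
  const duration = Math.round(performance.now() - start); // Values should be integers

  reportPerformanceMetricEvent(analytics, { eventName: APP_ACTION, duration });
}
```
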
Once executed, the metric is delivered to the `stack-telemetry` cluster, along with the event's context.
The data is updated periodically, so you might have to wait up to 30 minutes to see your data in the index.
Once indexed, this metric will appear in the `ebt-kibana` index. It is also mapped into an additional index, dedicated to performance metrics.
We recommend using the `Kibana Performance` space on the telemetry cluster, where you get an `index pattern` to easily access this data.
Each document in the index has the following structure:
```typescript
{
  "_index": "backing-ebt-kibana-browser-performance-metrics-000001", // Performance metrics are stored in a dedicated simplified index (browser / server).
  "_source": {
    "timestamp": "2022-08-31T11:29:58.275Z",
    "event_type": "performance_metric", // All events share a common event type to simplify mapping
    "eventName": "dashboard_loaded", // Event name as specified when reporting it
    "duration": 736, // Event duration as specified when reporting it
    "context": { // Context holds information identifying the deployment, version, application and page that generated the event
      "version": "8.5.0-SNAPSHOT",
      "applicationId": "dashboards",
      "page": "app",
      "entityId": "61c58ad0-3dd3-11e8-b2b9-5d5dc1715159",
      "branch": "main",
      "labels": {
        "journeyName": "flight_dashboard",
        ...
      },
      ...
    },
    ...
  },
}
```
### Performance events with breakdowns and metadata
Let's assume we are interested in benchmarking the performance of a more complex event `COMPLEX_APP_ACTION`, which is made up of two steps:
- `INSPECT_DATA` measures the time it takes to retrieve a user's profile and check whether there is a cached version of their data.
  - If the cached data is fresh, it proceeds with the flow `use-local-data`.
  - If the data needs to be refreshed, it proceeds with the flow `load-data-from-api`.
- `PROCESS_DATA` loads and processes the data, depending on the flow chosen in the previous step.

We can use the additional options supported by the `reportPerformanceMetricEvent` API:
```typescript
import { reportPerformanceMetricEvent } from '@kbn/ebt-tools';

reportPerformanceMetricEvent(analytics, {
  eventName: COMPLEX_APP_ACTION,
  duration, // Total duration in milliseconds
  key1: INSPECT_DATA, // Claiming free key1 to be used for INSPECT_DATA
  value1: durationOfStepA, // Total duration of step INSPECT_DATA in milliseconds
  key2: PROCESS_DATA, // Claiming free key2 to be used for PROCESS_DATA
  value2: durationOfStepB, // Total duration of step PROCESS_DATA in milliseconds
  meta: {
    dataSource: 'load-data-from-api', // Providing event-specific context. This can be useful to create meaningful aggregations.
  },
});
```
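
To make this concrete, here is a minimal sketch of how the step durations and the chosen flow might be captured end to end. The `analytics` client, `fetchProfile`, `isCacheFresh` and `loadData` are hypothetical stand-ins for illustration only:

```typescript
import { reportPerformanceMetricEvent } from '@kbn/ebt-tools';

// Hypothetical stand-ins for this sketch
declare const analytics: Parameters<typeof reportPerformanceMetricEvent>[0];
declare function fetchProfile(): Promise<unknown>;
declare function isCacheFresh(profile: unknown): boolean;
declare function loadData(profile: unknown, fromCache: boolean): Promise<unknown>;

const COMPLEX_APP_ACTION = 'complex_app_action';
const INSPECT_DATA = 'inspect_data';
const PROCESS_DATA = 'process_data';

async function runComplexAppAction() {
  const start = performance.now();

  // Step 1 (INSPECT_DATA): retrieve the user's profile and decide which flow to take
  const profile = await fetchProfile();
  const fromCache = isCacheFresh(profile);
  const durationOfStepA = performance.now() - start;

  // Step 2 (PROCESS_DATA): load and process the data using the chosen flow
  const stepBStart = performance.now();
  await loadData(profile, fromCache);
  const durationOfStepB = performance.now() - stepBStart;

  reportPerformanceMetricEvent(analytics, {
    eventName: COMPLEX_APP_ACTION,
    duration: Math.round(performance.now() - start),
    key1: INSPECT_DATA,
    value1: Math.round(durationOfStepA),
    key2: PROCESS_DATA,
    value2: Math.round(durationOfStepB),
    meta: { dataSource: fromCache ? 'use-local-data' : 'load-data-from-api' },
  });
}
```
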
Normally, reporting an event requires registering its event type first. However, since we want performance metrics to be uniform across Kibana, the `performance_metric` event type is already registered with the structure summarized above. We welcome feedback based on your use cases.

This event will be indexed with the following structure:
```typescript
{
  "_index": "backing-ebt-kibana-browser-performance-metrics-000001", // Performance metrics are stored in a dedicated simplified index (browser / server).
  "_source": {
    "timestamp": "2022-08-31T11:29:58.275Z",
    "event_type": "performance_metric", // All events share a common event type to simplify mapping
    "eventName": COMPLEX_APP_ACTION, // Event name as specified when reporting it
    "duration": 736, // Event duration as specified when reporting it
    "key1": INSPECT_DATA, // The key name of INSPECT_DATA
    "value1": 250, // The duration of step INSPECT_DATA
    "key2": PROCESS_DATA, // The key name of PROCESS_DATA
    "value2": 520, // The duration of step PROCESS_DATA
    "meta": {
      "dataSource": "load-data-from-api",
    },
    "context": { // Context holds information identifying the deployment, version, application and page that generated the event
      "version": "8.5.0-SNAPSHOT",
      "cluster_name": "job-ftr_configs_2-cluster-ftr",
      "pageName": "application:dashboards:app",
      "applicationId": "dashboards",
      "page": "app",
      "entityId": "61c58ad0-3dd3-11e8-b2b9-5d5dc1715159",
      "branch": "main",
      "labels": {
        "journeyName": "flight_dashboard",
      },
      ...
    },
    ...
  },
}
```
The performance metrics API supports **5 numbered free fields** that can be used to report numeric metrics you intend to analyze.
They can hold any type of numeric information you may want to report, letting you create your own flexible schema
without having to add custom mappings.
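
For example, a `dashboard_loaded` event might claim one free field for a sub-duration and another for a count (any numeric metric works, not just durations):

```typescript
import { reportPerformanceMetricEvent } from '@kbn/ebt-tools';

reportPerformanceMetricEvent(analytics, {
  eventName: 'dashboard_loaded',
  duration: 2265, // Total load time, in milliseconds
  key1: 'time_to_data', // Claiming free key1 for the time it took the data to arrive
  value1: 1801,
  key2: 'num_of_panels', // Claiming free key2 for a count rather than a duration
  value2: 7,
});
```
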
### Guidelines on what to report in performance metric events

In order to simplify the consumption of performance telemetry, we'd like to standardize the shape of events. This requires considering the following aspects:
- `Performance` - the amount and size of events being sent
- `Discoverability` - how easy it is to find different events, dimensions and metrics
- `Queryability` - the ability to write complex queries on top of telemetry data; for example, getting all dashboard load events that took over 10 seconds for dashboards that contain less than 5 panels
- `Flexibility` - teams should be able to flexibly design their event structure without having to define custom mappings (except for edge cases), e.g. a team can decide that the first free field is always used for the page load time
- `Ease of visualization` - it should be easy to visualize the data
- `Index field explosion` - having too many field names in the index damages performance when visualizing the data

An event schema is composed of fields designed for specific purposes:
- `Event name` - we decided to report all metric events under a single `performance_metric` event type, so that we can avoid registering it multiple times. This requires a special standard field for `eventName`.
- `Standardized fields` (`duration`) - these fields represent common values that might be reported for performance, such as duration for time-based reports, status, heap size or memory size. Standardized fields are mapped as-is to the index, so they can be visualized.
- `Free fields` (`key1`, `value1`, ..., `key5`, `value5`) - a limited set of numbered free key-value pairs that can be used to report additional use-case-specific fields. Teams may use them to create their own internal sub-schema. For example, a team may choose to use `key1` to always report the loading time of a certain component that is always present on the page. Free fields are mapped, if present, to the index, so they can be visualized. If a free field becomes popular, it can be promoted to a standardized field, freeing it up for other uses.
- `Non-standard fields` (`kibana_version`, `protocol`, ...) - additional fields may be added to an event by placing them in the `meta` object. However, those won't be automatically mapped to the index. Since `meta` is a flattened field, these fields are searchable by default, but an additional field mapping would be required to visualize them.

If you want to provide event-specific context, you can add properties to the `meta` field.
The `meta` object is stored as a [flattened field](https://www.elastic.co/guide/en/elasticsearch/reference/current/flattened.html), hence it's searchable and can be used to further break down event metrics.

**Note**: it's important to keep in mind that `free field` values are integers; floating point values will be rounded.

### How to choose and measure events

Events should be meaningful and can have multiple sub-metrics which give specific information about certain actions. For example,
a page-load event can be composed of render time, data load time during the page load, and so on. It's important to understand that these
events are meaningful for performance investigations and can be used in visualizations and aggregations. With this in mind,
creating an event for something like cpuUsage brings no value, because it carries no context of its own, and reporting many such
events from different places in the code introduces too much variability into the performance analysis of your code. However, it can be a useful
attribute to track if you need to look inside a specific event, e.g. `page-load`.
- **Make sure that the event is clearly defined and consistent** (i.e. the same code flow is executed each time).
  Consider the start point and end point of the measurement and what happens between those points.
  For example: an `app-data-load` event should not include the time it takes to render the data.
- **Choose event names wisely**.
  Try to balance event name specificity. Calling an event `load` is too generic; calling an event `tsvb-data-load` is too specific (instead, the visualization
  type can be specified in a `meta` field).
- **Distinguish between flows with event context**.
  If a function that loads data is called when an app loads, when the user changes filters, and when the refresh button is clicked, you should distinguish between
  these flows by specifying a `meta` field (see the sketch after this list).
- **Avoid duplicate events**.
  Make sure that measurement and reporting happen at a point in the code that is executed only once.
  For example, make sure that refresh events are reported only once per button click.
- **Measure as close to the event as possible**.
  For example, if you're measuring the execution of a specific React effect, place the measurement code inside the effect.
  When measuring a page load, start the measurement right before the navigation is performed and stop it as soon as all resources are loaded.
- **Use the `window.performance` API** (see the sketch after this list).
  The [`performance.now()`](https://developer.mozilla.org/en-US/docs/Web/API/Performance/now) API provides accurate, high-resolution timestamps.
  The [`performance.mark()`](https://developer.mozilla.org/en-US/docs/Web/API/Performance/mark) API can be used to track performance without having to pollute the
  code.
- **Keep performance in mind**. Reporting the performance of Kibana should never harm its own performance.
Avoid sending events too frequently (`onMouseMove`) or adding serialized JSON objects (whole `SavedObjects`) into the meta object.
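
Putting several of these guidelines together, here is a minimal sketch that measures with `performance.mark()`/`performance.measure()` and distinguishes flows via `meta`. The `analytics` client, `loadDashboardData` helper and the `trigger` values are hypothetical, made up for illustration:

```typescript
import { reportPerformanceMetricEvent } from '@kbn/ebt-tools';

// Hypothetical stand-ins for this sketch
declare const analytics: Parameters<typeof reportPerformanceMetricEvent>[0];
declare function loadDashboardData(): Promise<void>;

type Trigger = 'app-load' | 'filters-change' | 'refresh';

async function loadAndReport(trigger: Trigger) {
  // performance.mark() keeps timing concerns out of the business logic
  performance.mark('app-data-load:start');
  await loadDashboardData();
  performance.mark('app-data-load:end');

  const { duration } = performance.measure(
    'app-data-load',
    'app-data-load:start',
    'app-data-load:end'
  );

  // Called once per invocation, so e.g. a refresh click yields exactly one event
  reportPerformanceMetricEvent(analytics, {
    eventName: 'app-data-load',
    duration: Math.round(duration),
    meta: { trigger }, // Distinguishes the app-load / filters-change / refresh flows
  });
}
```
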
# Analytics Client