[APM] Document serverless-specific UI (#135178)

Co-authored-by: Alexander Wert <AlexanderWert@users.noreply.github.com>
Brandon Morelli 2022-07-14 15:04:57 -06:00 committed by GitHub
parent 4498161a47
commit 3044cb7ba5
8 changed files with 90 additions and 12 deletions

@ -21,7 +21,7 @@ NOTE: Queries within the {apm-app} are also applied to the correlations.
==== Find high transaction latency correlations
The correlations on the *Latency correlations* tab help you discover which
attributes are contributing to increased transaction latency.
[role="screenshot"]
image::apm/images/correlations-hover.png[Latency correlations]
@ -74,7 +74,7 @@ The table is sorted by scores, which are mapped to high, medium, or low impact
levels. Attributes with high impact levels are more likely to contribute to
failed transactions. By default, the attribute with the highest score is added
to the chart. To see a different attribute in the chart, select its row in the
table.
For example, in the screenshot below, there are attributes such as a specific
node and pod name that have medium impact on the failed transactions.
@ -86,4 +86,4 @@ Select the `+` filter to create a new query in the {apm-app} for transactions
with one or more of these attributes. If you are unfamiliar with a field, click
the icon beside its name to view its most popular values and optionally filter
on those values too. Each time that you add another attribute, it is filtering
out more and more noise and bringing you closer to a diagnosis.

@ -12,6 +12,7 @@ Learn how to perform common APM app tasks.
* <<filters>>
* <<correlations>>
* <<machine-learning-integration>>
* <<apm-lambda>>
* <<advanced-queries>>
* <<transactions-annotations>>
@ -30,6 +31,8 @@ include::correlations.asciidoc[]
include::machine-learning.asciidoc[]
include::lambda.asciidoc[]
include::advanced-queries.asciidoc[]
include::deployment-annotations.asciidoc[]

Three binary image files not shown (519 KiB, 210 KiB, 663 KiB).

docs/apm/lambda.asciidoc

@ -0,0 +1,51 @@
[role="xpack"]
[[apm-lambda]]
=== Observe Lambda functions
Elastic APM provides performance and error monitoring for AWS Lambda functions.
Get insight into function execution and runtime behavior, as well as visibility into how your Lambda functions relate to and depend on other services.
To set up Lambda monitoring, see the relevant
{apm-guide-ref}/monitoring-aws-lambda.html[quick start guide].
[float]
[[apm-lambda-cold-start-info]]
==== Cold starts
A cold start occurs when a Lambda function has not been invoked for a certain period of time: a Lambda worker receives a request to run the function and must first prepare an execution environment.
Cold starts are an unavoidable byproduct of the serverless world, but visibility into how they impact your services can help you make better decisions about factors like how much memory to allocate to a function, whether to enable provisioned concurrency, or whether it's time to consider removing a large dependency.
[float]
[[apm-lambda-cold-start-rate]]
===== Cold start rate
The cold start rate (that is, the proportion of requests that experience a cold start) is displayed per service and per transaction.
[role="screenshot"]
image::apm/images/lambda-cold-start.png[lambda cold start graph]
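
As a rough illustration of the definition above, the following sketch computes a cold start rate from a list of invocation records. It is not how the {apm-app} calculates the chart; the flat dictionary layout and the `cold_start_rate` helper are assumptions made for this example, and only the `faas.coldstart` field name comes from this documentation.

```python
from typing import Iterable, Mapping


def cold_start_rate(invocations: Iterable[Mapping[str, object]]) -> float:
    """Return the fraction of invocations flagged as cold starts (0.0 if empty)."""
    total = 0
    cold = 0
    for doc in invocations:
        total += 1
        # `faas.coldstart` is the field named in this section; storing it as a
        # flat dictionary key is an assumption made for illustration only.
        if doc.get("faas.coldstart") is True:
            cold += 1
    return cold / total if total else 0.0


# Hypothetical sample: one cold start out of four invocations.
sample = [
    {"transaction.name": "handler", "faas.coldstart": True},
    {"transaction.name": "handler", "faas.coldstart": False},
    {"transaction.name": "handler", "faas.coldstart": False},
    {"transaction.name": "handler", "faas.coldstart": False},
]
print(f"{cold_start_rate(sample):.0%}")  # prints "25%"
```

Grouping the same records by service or by transaction name before calling the helper would give the per-service and per-transaction views described above.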
Cold start is also displayed in the trace waterfall, where you can drill down into individual traces and see trace metadata like AWS request ID, trigger type, and trigger request ID.
[role="screenshot"]
image::apm/images/lambda-cold-start-trace.png[lambda cold start trace]
[float]
[[apm-lambda-cold-start-latency]]
===== Latency distribution correlation
The <<correlations-latency,latency correlations>> feature can be used to visualize the impact of Lambda cold starts on latency--just select the `faas.coldstart` field.
[role="screenshot"]
image::apm/images/lambda-correlations.png[lambda correlations example]
[float]
[[apm-lambda-service-config]]
==== AWS Lambda function grouping
The default APM agent configuration results in one APM service per AWS Lambda function,
where the Lambda function name is the service name.
In some use cases, it makes more sense to logically group multiple Lambda functions under a single
APM service. You can achieve this by setting the `ELASTIC_APM_SERVICE_NAME` environment variable
on related Lambda functions to the same value.
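
The grouping behavior described above can be modeled with a short sketch. This is a simplified illustration, not the APM agent's actual code: `resolve_service_name` and `group_by_service` are hypothetical helpers, and the function names in the sample are invented. The sketch assumes the function name is read from `AWS_LAMBDA_FUNCTION_NAME`, the environment variable that the Lambda runtime sets.

```python
from collections import defaultdict
from typing import Dict, List, Mapping


def resolve_service_name(env: Mapping[str, str]) -> str:
    """Model of the default: an explicit service name wins, else the function name."""
    # Assumption: an unset or empty ELASTIC_APM_SERVICE_NAME falls back to the
    # Lambda function name, as the paragraph above describes.
    return env.get("ELASTIC_APM_SERVICE_NAME") or env["AWS_LAMBDA_FUNCTION_NAME"]


def group_by_service(functions: List[Mapping[str, str]]) -> Dict[str, List[str]]:
    """Map each resulting APM service name to the Lambda functions reporting to it."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for env in functions:
        groups[resolve_service_name(env)].append(env["AWS_LAMBDA_FUNCTION_NAME"])
    return dict(groups)


# Hypothetical environments for three Lambda functions.
functions = [
    {"AWS_LAMBDA_FUNCTION_NAME": "checkout-create", "ELASTIC_APM_SERVICE_NAME": "checkout"},
    {"AWS_LAMBDA_FUNCTION_NAME": "checkout-refund", "ELASTIC_APM_SERVICE_NAME": "checkout"},
    {"AWS_LAMBDA_FUNCTION_NAME": "thumbnailer"},
]
print(group_by_service(functions))
```

In this sample, the two checkout functions share one APM service named `checkout`, while `thumbnailer` keeps the default of one service per function.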

@ -8,7 +8,8 @@ high-level visibility into how a service is performing across your infrastructur
* Service details like service version, runtime version, framework, and agent name and version
* Container and orchestration information
* Cloud provider, machine type, and availability zone
* Cloud provider, machine type, service name, region, and availability zone
* Serverless function names and event trigger type
* Latency, throughput, and errors over time
* Service dependencies
@ -16,10 +17,10 @@ high-level visibility into how a service is performing across your infrastructur
[[service-time-comparison]]
=== Time series and expected bounds comparison
For insight into the health of your services, you can compare how a service
performs relative to a previous time frame or to the expected bounds from the
corresponding {anomaly-job}. For example, has latency been slowly increasing
over time, did the service experience a sudden spike, is the throughput similar
to what the {ml} job expects? Enabling a comparison can provide the answer.
[role="screenshot"]
@ -42,8 +43,8 @@ The time-based comparison options are based on the selected time filter range:
|An identical amount of time immediately before the selected time range
|====
You can use the expected bounds comparison if {ml-jobs} exist in your selected
environment and you have
{ml-docs}/setup.html#kib-visibility-spaces[access to the {ml-features}].
[discrete]
@ -79,7 +80,7 @@ image::apm/images/traffic-transactions.png[Traffic and transactions]
The failed transaction rate represents the percentage of failed transactions from the perspective of the selected service.
It's useful for visualizing unexpected increases, decreases, or irregular patterns in a service's transactions.
+
[TIP]
====
HTTP **transactions** from the HTTP server perspective do not consider a `4xx` status code (client error) as a failure
@ -119,6 +120,17 @@ requires an agent version ≥ v5.6.3.
[role="screenshot"]
image::apm/images/spans-dependencies.png[Span type duration and dependencies]
[discrete]
[[service-cold-start]]
=== Cold start rate
The cold start rate chart is specific to serverless services.
It displays the percentage of requests that trigger a cold start of a serverless function.
See <<apm-lambda-cold-start-info>> for more information.
[role="screenshot"]
image::apm/images/lambda-cold-start.png[lambda cold start graph]
[discrete]
[[service-instances]]
=== Instances
@ -157,9 +169,16 @@ image::apm/images/metadata-icons.png[Service metadata]
*Cloud provider information*
* Cloud provider
* Cloud service name
* Availability zones
* Machine types
* Project ID
* Region
*Serverless information*
* Function name(s)
* Event trigger type
*Alerts*

@ -8,7 +8,7 @@ APM agents automatically collect performance metrics on HTTP requests, database
[role="screenshot"]
image::apm/images/apm-transactions-overview.png[Example view of transactions table in the APM app in Kibana]
The *Latency*, *transactions per minute*, *Failed transaction rate*, and *Average duration by span type*
The *Latency*, *Throughput*, *Failed transaction rate*, *Average duration by span type*, and *Cold start rate*
charts display information on all transactions associated with the selected service:
*Latency*::
@ -48,6 +48,10 @@ This could be a sign that the agent does not have auto-instrumentation for whate
+
It's important to note that if you have asynchronous spans, the sum of all span times may exceed the duration of the transaction.
*Cold start rate*::
Only applicable to serverless transactions, this chart displays the percentage of requests that trigger a cold start of a serverless function.
See <<apm-lambda-cold-start-info>> for more information.
[discrete]
[[transactions-table]]
=== Transactions table
@ -149,6 +153,7 @@ Learn more about a trace sample in the *Metadata* tab:
* Agent information
* URL
* User - Requires additional configuration, but allows you to see which user experienced the current transaction.
* FaaS information, like cold start, AWS request ID, trigger type, and trigger request ID
TIP: All of this data is stored in documents in Elasticsearch.
This means you can select "Actions - View transaction in Discover" to see the actual Elasticsearch document in Discover.