[skip ci][Maps] rework terms join documentation (#40005) (#41041)

* [Maps] revamp terms join docs

* clean up

* clarify how join adds properties to left source features

* move configuration to relevant area

* add sentence explaining that features without join properties are not visible

* paired with gchaps for edits

* cleanup

* suggested changes from gchaps
Nathan Reese 2019-07-12 13:08:20 -06:00 committed by GitHub
parent 5fcafbceff
commit c8343d0c94
GPG key ID: 4AEE18F83AFDEB23
3 changed files with 65 additions and 31 deletions

Binary file not shown. Size: 849 KiB

Binary file not shown. Size: 111 KiB


@ -2,13 +2,13 @@
[[maps-aggregations]]
== Plot big data without plotting too much data
The Maps application uses {ref}/search-aggregations.html[aggregations] to plot large data sets without overwhelming your network or your browser.
Use {ref}/search-aggregations.html[aggregations] to plot large data sets without overwhelming your network or your browser.
Aggregations group your documents into buckets and calculate metrics for each bucket.
Your documents stay in Elasticsearch and only the metrics for each group are returned to your computer.
[float]
[role="xpack"]
[[maps-grid-aggregation]]
=== Grid aggregation
@ -24,14 +24,14 @@ The point location is the weighted centroid for all geo-points in the gridded ce
*Heat map*:: Creates a <<heatmap-layer, heat map layer>> that clusters the weighted centroids for each gridded cell.
[float]
[role="xpack"]
[[maps-top-hits-aggregation]]
=== Most recent entities
Most recent entities uses {es} {ref}/search-aggregations-bucket-terms-aggregation.html[terms aggregation] to group your documents by entity.
Then, {ref}/search-aggregations-metrics-top-hits-aggregation.html[top hits metric aggregation] accumulates the most recent documents for each entry.
Most recent entities is only available for vector layers with *Documents* source.
Most recent entities is available for <<vector-layer, vector layers>> with *Documents* source.
To enable most recent entities, click "Show most recent documents by entity" and configure the following:
. Set *Entity* to the field that identifies entities in your documents.
@ -41,21 +41,42 @@ This field will be used to sort your documents in the top hits aggregation.
. Set *Documents per entity* to configure the maximum number of documents accumulated per entity.
[float]
[role="xpack"]
[[terms-join]]
=== Terms join
=== Term join
Terms joins use a shared key to combine the results of an {es} terms aggregation and vector features.
You can augment vector features with property values that symbolize features and provide richer tooltip content.
Use term joins to augment vector features with properties for <<maps-vector-style-data-driven, data driven styling>> and richer tooltip content.
Term joins are available for <<vector-layer, vector layers>> with the following sources:
* Custom vector shapes
* Documents
* Vector shapes
==== Example term join
The <<maps-add-choropleth-layer, choropleth layer example>> uses a term join to shade world countries by web log traffic.
Darker shades symbolize countries with more web log traffic, and lighter shades symbolize countries with less traffic.
[role="screenshot"]
image::maps/images/terms_join.png[]
image::maps/images/gs_add_cloropeth_layer.png[]
Follow the example below to understand how *Terms joins* work.
This example uses https://www.elastic.co/elastic-maps-service[Elastic Maps Service (EMS)] World Countries as the vector source and
the Kibana sample data set "Sample web logs" as the Elasticsearch index.
===== How a term join works
Example feature from World Countries:
A term join uses a shared key to combine vector features, the left source, with the results of an {es} terms aggregation, the right source.
The choropleth example uses the shared key, https://wikipedia.org/wiki/ISO_3166-1_alpha-2[ISO 3166-1 alpha-2 code], to join world countries and web log traffic.
ISO 3166-1 alpha-2 code is an international standard that identifies countries by a two-letter country code.
For example, *Sweden* has an ISO 3166-1 alpha-2 code of *SE*.
[role="screenshot"]
image::maps/images/terms_join_shared_key_config.png[]
===== Left source
The left source for the term join is the https://www.elastic.co/elastic-maps-service[Elastic Maps Service (EMS)] World Countries. Vector features for this source are provided by EMS. You can also use your own vector features.
In the following example, the *iso2* property defines the shared key for the left source.
--------------------------------------------------
{
geometry: {
@ -64,14 +85,29 @@ Example feature from World Countries:
},
properties: {
name: "Sweden",
iso2: "SE",
iso3: "SWE"
iso2: "SE"
},
type: "Feature"
}
--------------------------------------------------
Example documents from Sample web logs:
===== Right source
The right source uses the Kibana sample data set "Sample web logs".
In this data set, the *geo.src* field contains the ISO 3166-1 alpha-2 code of the country of origin.
A {ref}/search-aggregations-bucket-terms-aggregation.html[terms aggregation] groups the sample web log documents by *geo.src* and calculates metrics for each term.
The METRICS configuration defines two metric aggregations:
* The count of all documents in the terms bucket.
* The average of the field "bytes" for all documents in the terms bucket.
[role="screenshot"]
image::maps/images/terms_join_metric_config.png[]
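The grouping described above can be sketched in plain Python. This is only an in-memory illustration of what a terms aggregation with a count and an average metric computes, not how {es} or the Maps application implements it. The function name `terms_aggregation`, the flat `"geo.src"` dictionary key, and the document values are assumptions made for the example.

```python
from collections import defaultdict


def terms_aggregation(docs, term_field="geo.src", metric_field="bytes"):
    """Group documents by term_field and compute a count and an average per bucket.

    Hypothetical helper for illustration only; it mimics the shape of a
    terms aggregation bucket (key, doc_count, avg_of_bytes.value).
    """
    groups = defaultdict(list)
    for doc in docs:
        groups[doc[term_field]].append(doc[metric_field])
    return [
        {
            "key": key,
            "doc_count": len(values),
            "avg_of_bytes": {"value": sum(values) / len(values)},
        }
        for key, values in groups.items()
    ]


# Made-up documents, not taken from the "Sample web logs" data set.
docs = [
    {"geo.src": "SE", "bytes": 1000},
    {"geo.src": "SE", "bytes": 3000},
    {"geo.src": "US", "bytes": 500},
]
buckets = terms_aggregation(docs)
```

Each bucket carries only the term and its metrics, which is why individual documents never need to leave {es}.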
The right source does not provide individual documents, but instead provides the metrics from a terms aggregation.
The metrics are calculated from the following sample web logs documents.
--------------------------------------------------
{
bytes: 1837,
@ -103,18 +139,9 @@ Example documents from Sample web logs:
}
--------------------------------------------------
The JOIN configuration links the vector source "World Countries" to the Elasticsearch index "kibana_sample_data_logs"
on the shared key *iso2 = geo.src*.
[role="screenshot"]
image::maps/images/terms_join_shared_key_config.png[]
The terms aggregation creates a bucket for each unique *geo.src* value. Metrics are calculated for all documents in a bucket.
The METRICS configuration defines two metric aggregations:
the count of all documents in the terms bucket and
the average of the field "bytes" for all documents in the terms bucket.
[role="screenshot"]
image::maps/images/terms_join_metric_config.png[]
Example terms aggregation response:
The following shows an example terms aggregation response. Note the *key* property, which defines the shared key for the right source.
--------------------------------------------------
{
aggregations: {
@ -126,14 +153,21 @@ Example terms aggregation response:
avg_of_bytes: {
value: 3177.25
}
}
},
...
]
}
}
}
--------------------------------------------------
Finally, the terms aggregation response is joined with the vector features.
[role="screenshot"]
image::maps/images/terms_join_tooltip.png[]
==== Augmenting the left source with metrics from the right source
The join adds metrics for each terms aggregation bucket to the world country feature with the corresponding ISO 3166-1 alpha-2 code. Features that do not have a corresponding terms aggregation bucket are not visible on the map.
The world country features now have two additional properties:
* Count of web log traffic originating from the world country
* Average bytes of web log traffic originating from the world country
The choropleth example uses the count of web log traffic to symbolize countries by web log traffic.
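As a rough sketch of the join semantics described in this section, the following Python snippet matches left source features to right source buckets on the shared key and drops features that have no bucket. It is an illustration under stated assumptions, not the Maps implementation: the function name `term_join`, the added property names `count` and `avg_bytes`, the second feature, the `doc_count` value, and the omitted geometries are all hypothetical.

```python
def term_join(features, buckets, left_key="iso2"):
    """Add bucket metrics to each feature whose left_key matches a bucket key.

    Features without a matching bucket are left out of the result,
    mirroring the note that such features are not visible on the map.
    """
    buckets_by_key = {bucket["key"]: bucket for bucket in buckets}
    joined = []
    for feature in features:
        bucket = buckets_by_key.get(feature["properties"][left_key])
        if bucket is None:
            continue  # no corresponding terms bucket for this feature
        properties = dict(feature["properties"])
        properties["count"] = bucket["doc_count"]  # hypothetical property name
        properties["avg_bytes"] = bucket["avg_of_bytes"]["value"]  # hypothetical property name
        joined.append({**feature, "properties": properties})
    return joined


# Left source: vector features (geometry omitted for brevity).
features = [
    {"type": "Feature", "geometry": None, "properties": {"name": "Sweden", "iso2": "SE"}},
    {"type": "Feature", "geometry": None, "properties": {"name": "Norway", "iso2": "NO"}},
]
# Right source: terms aggregation buckets (doc_count is a made-up value).
buckets = [
    {"key": "SE", "doc_count": 4, "avg_of_bytes": {"value": 3177.25}},
]

joined = term_join(features, buckets)
```

In this sketch only the Sweden feature survives the join, and it gains the two metric properties, while the original features are left unmodified.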