Refactors to address feedback, new screenshots.

2025-04-23 17:28:26 -04:00 · 2015-05-19 18:49:07 -07:00 · 2015-05-19 18:49:07 -07:00 · 006ef675c8
commit 006ef675c8
parent 659729a638
15 changed files with 169 additions and 36 deletions
--- a/docs/getting-started.asciidoc
+++ b/docs/getting-started.asciidoc
@ -19,9 +19,18 @@ The material in this section assumes you have a working Kibana install connected
 The tutorials in this section rely on the following data sets:

 * The complete works of William Shakespeare, suitably parsed into fields. Download this data set by clicking here: 
-https://www.elastic.co/guide/en/kibana/3.0/snippets/shakespeare.json[shakespeare.json].
+https://www.elastic.co/guide/en/kibana/3.0/snippets/shakespeare.json[shakespeare.json.gz].
 * A set of fictitious accounts with randomly generated data. Download this data set by clicking here: 
-https://github.com/bly2k/files/blob/master/accounts.zip?raw=true[accounts.json]
+https://github.com/bly2k/files/blob/master/accounts.zip?raw=true[accounts.json.gz]
+* A set of randomly generated log files. Download this data set by clicking here: [logstash.json.gz]
+
+The data sets are compressed with the `gzip` utility. After downloading the files, decompress them with the following 
+commands:
+
+[source,shell]
+gunzip shakespeare.json.gz
+gunzip accounts.json.gz
+gunzip logstash.json.gz

 The Shakespeare data set is organized in the following schema:

@ -52,11 +61,51 @@ The accounts data set is organized in the following schema:
    "state": "String"
 }

-After downloading the data sets, load them into Elasticsearch with the following commands:
+The schema for the logs data set has 114 different fields, but the notable ones used in this tutorial are:
+
+[source,json]
+{
+    "memory": "integer",
+    "geo.coordinates": "geo_point",
+    "@timestamp": "date"
+}
+
+Before we load the Shakespeare data set, we need to set up a {ref}/mapping.html[_mapping_] for the fields. Mapping 
+divides the documents in the index into logical groups and specifies a field's characteristics, such as the field's
+searchability or whether or not it's _tokenized_, or broken up into separate words.
+
+Use the following command to set up a mapping for the Shakespeare data set:

 [source,shell]
-$ curl -XPOST 'localhost:9200/bank/account/_bulk?pretty' --data-binary @accounts.json
-$ curl -XPOST 'localhost:9200/play/shakespeare/_bulk?pretty' --data-binary @shakespeare.json
+curl -XPUT http://localhost:9200/shakespeare -d '
+{
+ "mappings" : {
+  "_default_" : {
+   "properties" : {
+    "speaker" : {"type": "string", "index" : "not_analyzed" },
+    "play_name" : {"type": "string", "index" : "not_analyzed" },
+    "line_id" : { "type" : "integer" },
+    "speech_number" : { "type" : "integer" }
+   }
+  }
+ }
+}
+';
+
+This mapping specifies the following qualities for the data set:
+
+* The _speaker_ field is a string that isn't analyzed. The string in this field is treated as a single unit, even if
+there are multiple words in the field.
+* The same applies to the _play_name_ field.
+* The _line_id_ and _speech_number_ fields are integers.
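
To see why `not_analyzed` matters, here is a rough shell sketch. It is not Elasticsearch's actual analyzer: it only imitates analysis by lowercasing a value and splitting it on spaces, then contrasts that with keeping the value whole. The function names and the sample value are illustrative assumptions.

```shell
# Rough imitation of an analyzed string field: lowercase the value and
# emit one token per line. Elasticsearch's real analyzers do more than this.
analyzed_tokens() {
  printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | tr -s ' ' '\n'
}

# A not_analyzed field keeps the whole value as a single term.
not_analyzed_tokens() {
  printf '%s\n' "$1"
}

value='Henry IV'   # sample play_name value, used here for illustration only
analyzed_tokens "$value"       # one token per word
not_analyzed_tokens "$value"   # the whole value as one term
```

With an analyzed field, a terms aggregation would bucket on the individual words; with a `not_analyzed` field it buckets on the full value.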
+
+The accounts and logstash data sets don't require any mappings, so at this point we're ready to load the data sets into 
+Elasticsearch with the following commands:
+
+[source,shell]
+curl -XPOST 'localhost:9200/bank/_bulk?pretty' --data-binary @accounts.json
+curl -XPOST 'localhost:9200/shakespeare/_bulk?pretty' --data-binary @shakespeare.json
+curl -XPOST 'localhost:9200/_bulk?pretty' --data-binary @logstash.json

 These commands may take some time to execute, depending on the computing resources available.
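
The `_bulk` endpoint reads newline-delimited JSON in which each action line is followed by the document source on the next line. As a sketch of that layout, the following writes a two-document file and checks that it has an even number of lines. The temporary file and the documents in it are made up for illustration; the downloaded data sets are assumed to be in this format already.

```shell
# Write a tiny bulk file: each document takes two lines, an action line
# and a source line. The field values are hypothetical.
bulk_file=$(mktemp)
cat > "$bulk_file" <<'EOF'
{"index":{"_id":"1"}}
{"account_number":1,"balance":100}
{"index":{"_id":"2"}}
{"account_number":2,"balance":200}
EOF

# A file made only of index actions has an even line count.
lines=$(wc -l < "$bulk_file" | tr -d ' ')
if [ $((lines % 2)) -eq 0 ]; then
  echo "$((lines / 2)) documents ready to load"
fi
rm -f "$bulk_file"
```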

@ -71,15 +120,18 @@ You should see output similar to the following:
 health status index               pri rep docs.count docs.deleted store.size pri.store.size
 yellow open   bank                  5   1       1000            0    418.2kb        418.2kb
 yellow open   shakespeare           5   1     111396            0     17.6mb         17.6mb
+yellow open   logstash-2015.05.18   5   1       4631            0     15.6mb         15.6mb
+yellow open   logstash-2015.05.19   5   1       4624            0     15.7mb         15.7mb
+yellow open   logstash-2015.05.20   5   1       4750            0     16.4mb         16.4mb

 [float]
 [[tutorial-define-index]]
 === Defining Your Index Patterns

 Each set of data loaded to Elasticsearch has an https://www.elastic.co/guide/en/kibana/current/settings.html#settings-create-pattern[index pattern]. In the previous section, the Shakespeare data set has an index named `shakespeare`, and the accounts 
-data set has an index named `bank`. An _index pattern_ is a regular expression that can 
-match multiple indices. For example, in the common logging use case, a typical index name contains the date in MM-DD-YYYY 
-format, and an index pattern for May would look something like `logstash-05-*`.
+data set has an index named `bank`. An _index pattern_ is a string with optional wildcards that can match multiple 
+indices. For example, in the common logging use case, a typical index name contains the date in YYYY.MM.DD 
+format, and an index pattern for May 2015 would look something like `logstash-2015.05*`.
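
The wildcard behavior can be approximated with shell globbing. This is only a stand-in for how Kibana resolves index patterns, and the `matches_pattern` helper is a hypothetical name, but it shows which of the indices loaded earlier a pattern such as `logstash-2015.05*` would select.

```shell
# Report whether an index name matches a wildcard pattern. The unquoted
# $pattern inside case is treated as a glob, so * matches any characters.
matches_pattern() {
  pattern=$1
  name=$2
  case "$name" in
    $pattern) echo "match" ;;
    *)        echo "no match" ;;
  esac
}

for index in bank shakespeare logstash-2015.05.18 logstash-2015.05.19 logstash-2015.05.20; do
  printf '%s: %s\n' "$index" "$(matches_pattern 'logstash-2015.05*' "$index")"
done
```

Only the three `logstash-2015.05.*` indices report a match; `bank` and `shakespeare` do not.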

 For this tutorial, any pattern that matches either of the two indices we've loaded will work. Open a browser and 
 navigate to `localhost:5601`. Click the *Settings* tab, then the *Indices* tab. Click *Add New* to define a new index 
@ -102,11 +154,17 @@ which you can save and load by clicking the buttons to the right of the search b
 Beneath the search box, the current index pattern is displayed in a drop-down. You can change the index pattern by 
 selecting a different pattern from the drop-down selector.

+You can construct searches by using the field names and the values you're interested in. With numeric fields you can 
+use comparison operators such as greater than (>), less than (<), or equals (=). You can link elements with the
+logical operators AND, OR, and NOT, all in uppercase.
+
 Try selecting the `ba*` index pattern and putting the following search into the search box:

 [source,text]
 account_number:<100 AND balance:>47500

+This search returns all account numbers between zero and 99 with balances in excess of 47,500.
+
 If you're using the linked sample data set, this search returns 5 results: Account numbers 8, 32, 78, 85, and 97.

 image::images/tutorial-discover-2.png[]
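
The logic of the search above can be sketched outside Kibana with `awk`. The rows below are hypothetical `account_number,balance` pairs, not records from the sample data set; the point is only to show how the two range conditions combine under `AND`.

```shell
# Hypothetical account_number,balance rows; not taken from the real data set.
sample='5,48000
42,12000
150,49000
77,47500'

# Keep rows where account_number < 100 AND balance > 47500.
matches=$(printf '%s\n' "$sample" | awk -F, '$1 < 100 && $2 > 47500 { print $1 }')
echo "$matches"
```

Account 150 fails the first condition, and accounts 42 and 77 fail the second, because `>` is a strict comparison.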
@ -122,17 +180,20 @@ image::images/tutorial-discover-3.png[]
 === Data Visualization: Beyond Discovery

 The visualization tools available on the *Visualize* tab enable you to display aspects of your data sets in several 
-different ways. Visualizations depend on Elasticsearch {ref}/search-aggregations.html[aggregations] in two different 
-types: _bucket_ aggregations and _metric_ aggregations. A bucket aggregation sorts your data according to criteria you 
-specify. For example, in our accounts data set, we can establish a range of account balances, then display what 
-proportions of the total fall into which range of balances.
+different ways. 

 Click on the *Visualize* tab to start:

 image::images/tutorial-visualize.png[]

-Click on *Pie chart*, then *From a new search*. Select the `ba*` index pattern. The whole pie displays, since we 
-haven't specified any buckets yet.
+Click on *Pie chart*, then *From a new search*. Select the `ba*` index pattern. 
+
+Visualizations depend on Elasticsearch {ref}/search-aggregations.html[aggregations] in two different types: _bucket_ 
+aggregations and _metric_ aggregations. A bucket aggregation sorts your data according to criteria you specify. For 
+example, in our accounts data set, we can establish a range of account balances, then display what proportions of the 
+total fall into which range of balances.
+
+The whole pie displays, since we haven't specified any buckets yet.

 image::images/tutorial-visualize-pie-1.png[]

@ -165,23 +226,67 @@ Save this chart by clicking the *Save Visualization* button to the right of the
 _Pie Example_.

 Next, we're going to make a bar chart. Click on *New Visualization*, then *Vertical bar chart*. Select *From a new 
-search* and the `ba*` index pattern, just as you did for the pie chart. You'll see a single big bar, since we haven't
-defined any buckets yet:
+search* and the `shakes*` index pattern. You'll see a single big bar, since we haven't defined any buckets yet:

 image::images/tutorial-visualize-bar-1.png[]

-For the Y-axis metrics aggregation, select *Average*, with *age* as the field. For the X-Axis buckets, select the 
-*Range* aggregation and define the same ranges as you did for the pie chart.
+For the Y-axis metrics aggregation, select *Unique Count*, with *speaker* as the field. For Shakespeare plays, it might 
+be useful to know which plays have the lowest number of distinct speaking parts, if your theater company is short on 
+actors. For the X-Axis buckets, select the *Terms* aggregation with the *play_name* field. For the *Order*, select
+*Bottom*, leaving the *Size* at 5.

-Now, click *Add sub-buckets* and *Split Bars* to refine our data. In addition to listing the average age of the 
-accounts in each balance range, we're going to split the bars by the top five states with the highest average ages.
-Select *Terms* as the sub-aggregation, with *state* as the field. Leave the other elements at their default values and 
-click the green *Apply changes* button. Your chart should now look like this:
+Leave the other elements at their default values and click the green *Apply changes* button. Your chart should now look 
+like this:

 image::images/tutorial-visualize-bar-2.png[]

+Notice how the individual play names show up as whole phrases, instead of being broken down into individual words. This 
+is the result of the mapping we did at the beginning of the tutorial, when we marked the *play_name* field as 'not 
+analyzed'.
+
+Hovering on each bar shows you the number of speaking parts for each play as a tooltip. You can turn this behavior off, 
+as well as change many other options for your visualizations, by clicking the *Options* tab in the top left.
+
+Now that you have a list of the smallest casts for Shakespeare plays, you might also be curious to see which of these 
+plays makes the greatest demands on an individual actor by showing the maximum number of speeches for a given part. Add 
+a Y-axis aggregation with the *Add metrics* button, then choose the *Max* aggregation for the *speech_number* field. In 
+the *Options* tab, change the *Bar Mode* drop-down to *grouped*, then click the green *Apply changes* button. Your 
+chart should now look like this:
+
+image::images/tutorial-visualize-bar-3.png[]
+
+As you can see, _Love's Labours Lost_ has an unusually high maximum speech number, compared to the other plays, and 
+might therefore make more demands on an actor's memory.
+
 Save this chart with the name _Bar Example_.

+Next, we're going to make a tile map chart to visualize some geographic data. Click on *New Visualization*, then 
+*Tile map*. Select *From a new search* and the `logstash-*` index pattern. Define the time window for the events we're 
+exploring by clicking the time selector at the top right of the Kibana interface. Click on *Absolute*, then set the 
+end time for the range to May 20, 2015 and the start time to May 18, 2015:
+
+image::images/tutorial-timepicker.png[]
+
+Once you've got the time range set up, click the *Go* button, then close the time picker by clicking the small up arrow 
+at the bottom. You'll see a map of the world, since we haven't defined any buckets yet:
+
+image::images/tutorial-visualize-map-1.png[]
+
+Select *Geo Coordinates* as the bucket, then click the green *Apply changes* button. Your chart should now look like 
+this:
+
+image::images/tutorial-visualize-map-2.png[]
+
+You can navigate the map by clicking and dragging, zoom with the *+/-* buttons, or hit the *Fit Data Bounds* button to
+zoom to the lowest level that includes all the points. You can also create a filter to define a rectangle on the map, 
+either to include or exclude, by clicking the *Latitude/Longitude Filter* button and drawing a bounding box on the map. 
+A green oval with the filter definition displays right under the query box:
+
+image::images/tutorial-visualize-map-3.png[]
+
+Hover on the filter to display the controls to toggle, pin, invert, or delete the filter. Save this chart with the name 
+_Map Example_.
+
 Finally, we're going to define a sample Markdown widget to display on our dashboard. Click on *New Visualization*, then 
 *Markdown widget*, to display a very simple Markdown entry field:

@ -206,10 +311,10 @@ Save this visualization with the name _Markdown Example_.

 A Kibana dashboard is a collection of visualizations that you can arrange and share. To get started, click the 
 *Dashboard* tab, then the *Add Visualization* button at the far right of the search box to display the list of saved 
-visualizations. Select _Markdown Example_, _Pie Example_, and _Bar Example_, then close the list of visualizations by
-clicking the small up-arrow at the bottom of the list. You can move the containers for each visualization by 
-clicking and dragging the title bar. Resize the containers by dragging the lower right corner of a visualization's 
-container. Your sample dashboard should end up looking roughly like this:
+visualizations. Select _Markdown Example_, _Pie Example_, _Bar Example_, and _Map Example_, then close the list of 
+visualizations by clicking the small up-arrow at the bottom of the list. You can move the containers for each 
+visualization by clicking and dragging the title bar. Resize the containers by dragging the lower right corner of a 
+visualization's container. Your sample dashboard should end up looking roughly like this:

 image::images/tutorial-dashboard.png[]

--- a/docs/images/NYCTA-Dashboard.jpg
+++ b/docs/images/NYCTA-Dashboard.jpg
--- a/docs/images/tutorial-dashboard.png
+++ b/docs/images/tutorial-dashboard.png
--- a/docs/images/tutorial-timepicker.png
+++ b/docs/images/tutorial-timepicker.png
--- a/docs/images/tutorial-visualize-bar-1.png
+++ b/docs/images/tutorial-visualize-bar-1.png
--- a/docs/images/tutorial-visualize-bar-2.png
+++ b/docs/images/tutorial-visualize-bar-2.png
--- a/docs/images/tutorial-visualize-bar-3.png
+++ b/docs/images/tutorial-visualize-bar-3.png
--- a/docs/images/tutorial-visualize-map-1.png
+++ b/docs/images/tutorial-visualize-map-1.png
--- a/docs/images/tutorial-visualize-map-2.png
+++ b/docs/images/tutorial-visualize-map-2.png
--- a/docs/images/tutorial-visualize-map-3.png
+++ b/docs/images/tutorial-visualize-map-3.png
--- a/docs/images/tutorial-visualize-pie-2.png
+++ b/docs/images/tutorial-visualize-pie-2.png
--- a/docs/images/tutorial-visualize-pie-3.png
+++ b/docs/images/tutorial-visualize-pie-3.png
--- a/docs/line.asciidoc
+++ b/docs/line.asciidoc
@ -30,10 +30,19 @@ The availability of these options varies depending on the aggregation you choose

 Select the *Options* tab to change the following aspects of the chart:

-*Y-Axis Scale*:: You can select *linear*, *log*, or *square root* scales for the chart's Y axis.
-*Smooth Lines*:: Check this box to curve the line from point to point.
+*Y-Axis Scale*:: You can select *linear*, *log*, or *square root* scales for the chart's Y axis. You can use a log 
+scale to display data that varies exponentially, such as a compounding interest chart, or a square root scale to 
+regularize the display of data sets whose variability changes over the domain being examined. Data of this kind is 
+known as _heteroscedastic_ data. For example, if 
+a data set of height versus weight has a relatively narrow range of variability at the short end of height, but a wider
+range at the taller end, the data set is heteroscedastic. 
+*Smooth Lines*:: Check this box to curve the line from point to point. Bear in mind that smoothed lines necessarily 
+affect the representation of your data and create a potential for ambiguity.
 *Show Connecting Lines*:: Check this box to draw lines between the points on the chart.
 *Show Circles*:: Check this box to draw each data point on the chart as a small circle.
+*Current time marker*:: For charts of time-series data, check this box to draw a red line at the current time.
+*Set Y-Axis Extents*:: Check this box and enter values in the *y-max* and *y-min* fields to set the Y axis to specific 
+values. 
 *Show Tooltip*:: Check this box to enable the display of tooltips.
 *Show Legend*:: Check this box to enable the display of a legend next to the chart.
 *Scale Y-Axis to Data Bounds*:: The default Y-axis bounds are zero and the maximum value returned in the data. Check 
--- a/docs/tilemap.asciidoc
+++ b/docs/tilemap.asciidoc
@ -30,6 +30,9 @@ for the {ref}/search-aggregations-bucket-geohashgrid-aggregation.html#_cell_dime
 aggregation for details on the area specified by each precision level. As of the 4.1 release, Kibana supports a maximum 
 geohash length of 7.

+NOTE: Higher precisions increase memory usage for the browser displaying Kibana as well as for the underlying 
+Elasticsearch cluster.
+
 Once you've specified a buckets aggregation, you can define sub-aggregations to refine the visualization. Tile maps 
 only support sub-aggregations as split charts. Click *+ Add Sub Aggregation*, then *Split Chart* to select a 
 sub-aggregation from the list of types:
@ -63,6 +66,8 @@ add another filter.
 *Geohash*:: The {ref}/search-aggregations-bucket-geohashgrid-aggregation.html[_geohash_] aggregation displays points 
 based on the geohash coordinates.

+NOTE: By default, the *Change precision on map zoom* box is checked. Uncheck the box to disable this behavior.
+
 You can click the *Advanced* link to display more customization options for your metrics or bucket aggregation:

 *Exclude Pattern*:: Specify a pattern in this field to exclude from the results.
@ -82,10 +87,21 @@ The availability of these options varies depending on the aggregation you choose

 Select the *Options* tab to change the following aspects of the chart:

-*Shaded Circle Markers*:: Displays the markers with different shades based on the metric aggregation's value.
-*Scaled Circle Markers*:: Scale the size of the markers based on the metric aggregation's value.
-*Shaded Geohash Grid*:: Displays the rectangular cells of the geohash grid instead of circular markers, with different 
+*Map type*:: Select one of the following options from the drop-down.
+*_Scaled Circle Markers_*:: Scale the size of the markers based on the metric aggregation's value.
+*_Shaded Circle Markers_*:: Displays the markers with different shades based on the metric aggregation's value.
+*_Shaded Geohash Grid_*:: Displays the rectangular cells of the geohash grid instead of circular markers, with different 
 shades based on the metric aggregation's value.
+*_Heatmap_*:: A heat map applies blurring to the circle markers and applies shading based on the amount of overlap. 
+Heatmaps have the following options:
+
+* *Radius*: Sets the size of the individual heatmap dots.
+* *Blur*: Sets the amount of blurring for the heatmap dots.
+* *Maximum zoom*: Tilemaps in Kibana support 18 zoom levels. This slider defines the maximum zoom level at which the 
+heatmap dots appear at full intensity.
+* *Minimum opacity*: Sets the opacity cutoff for the dots.
+* *Show Tooltip*: Check this box to display a tooltip with the values for a given dot when the cursor hovers over that dot.
+
 *Desaturate map tiles*:: Desaturate the map's color in order to make the markers stand out more clearly.

 After changing options, click the green *Apply changes* button to update your visualization, or the grey *Discard 
@ -94,7 +110,10 @@ changes* button to keep your visualization in its current state.
 [float]
 [[navigating-map]]
 ==== Navigating the Map
-Once your tilemap visualization is ready, you can explore the map in several ways. Click and hold anywhere on the map 
-and move the cursor to move the map center. Hold Shift and drag a bounding box across the map to zoom in on the 
-selection. Click the *Fit Data Bounds* button to automatically crop the map boundaries to the geohash buckets that have 
-at least one result.
+Once your tilemap visualization is ready, you can explore the map in several ways:
+* Click and hold anywhere on the map and move the cursor to move the map center. Hold Shift and drag a bounding box 
+across the map to zoom in on the selection. 
+* Click the *Fit Data Bounds* button to automatically crop the map boundaries to the geohash buckets that have at least 
+one result.
+* Click the *Latitude/Longitude Filter* button, then drag a bounding box across the map, to create a filter for the box
+coordinates.
--- a/docs/visualize.asciidoc
+++ b/docs/visualize.asciidoc
@ -91,7 +91,7 @@ Use the aggregation builder on the left of the page to configure the {ref}/searc
 visualization. Buckets are analogous to SQL `GROUP BY` statements. For more information on aggregations, see the main
 {ref}/search-aggregations.html[Elasticsearch aggregations reference].

-Bar or line chart visualizations use _metrics_ for the y-axis and _buckets_ are used for the x-axis, segment bar 
+Bar, line, or area chart visualizations use _metrics_ for the y-axis and _buckets_ for the x-axis, segment bar 
 colors, and row/column splits. For pie charts, use the metric for the slice size and the bucket for the number of 
 slices.
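
The `GROUP BY` analogy can be made concrete with a small shell sketch. It assumes made-up `play_name|speaker` rows rather than the real data set: grouping by the first column plays the role of a terms bucket, and counting distinct speakers in each group plays the role of a unique count metric.

```shell
# Hypothetical play_name|speaker rows; not taken from the real data set.
rows='Play A|Alice
Play A|Bob
Play A|Alice
Play B|Carol'

# Bucket: group rows by play_name. Metric: count distinct speakers per bucket.
# sort -u drops repeated play/speaker pairs so each speaker is counted once.
summary=$(printf '%s\n' "$rows" | sort -u |
  awk -F'|' '{ count[$1]++ } END { for (p in count) print p ": " count[p] }' | sort)
echo "$summary"
```

Each output line corresponds to one bar on the chart: the bucket supplies the x-axis label and the metric supplies the bar height.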