[DOCS] Refactor data stream setup tutorial (#71074)

James Rodewig 2021-03-31 17:28:55 -04:00 committed by GitHub
parent 67d98c6638
commit f41320616c
16 changed files with 215 additions and 197 deletions

View file

@ -2,7 +2,7 @@
[[data-streams-change-mappings-and-settings]]
== Change mappings and settings for a data stream
Each data stream has a <<create-a-data-stream-template,matching index
Each data stream has a <<create-index-template,matching index
template>>. Mappings and index settings from this template are applied to new
backing indices created for the stream. This includes the stream's first
backing index, which is auto-generated when the stream is created.
@ -417,7 +417,7 @@ mappings and settings you'd like to apply to the new data stream's backing
indices.
+
This index template must meet the
<<create-a-data-stream-template,requirements for a data stream template>>. It
<<create-index-template,requirements for a data stream template>>. It
should also contain your previously chosen name or index pattern in the
`index_patterns` property.
+
@ -471,7 +471,7 @@ PUT /_index_template/new-data-stream-template
create the new data stream. The name of the data stream must match the index
pattern defined in the new template's `index_patterns` property.
+
We do not recommend <<create-a-data-stream,indexing new data
We do not recommend <<create-data-stream,indexing new data
to create this data stream>>. Later, you will reindex older data from an
existing data stream into this new stream. This could result in one or more
backing indices that contain a mix of new and old data.

View file

@ -27,7 +27,7 @@ backing indices.
image::images/data-streams/data-streams-diagram.svg[align="center"]
Each data stream requires a matching <<index-templates,index template>>. The
A data stream requires a matching <<index-templates,index template>>. The
template contains the mappings and settings used to configure the stream's
backing indices.

View file

@ -4,48 +4,74 @@
To set up a data stream, follow these steps:
. <<configure-a-data-stream-ilm-policy>>.
. <<create-a-data-stream-template>>.
. <<create-a-data-stream>>.
. <<secure-a-data-stream>>.
* <<create-index-lifecycle-policy>>
* <<create-component-templates>>
* <<create-index-template>>
* <<create-data-stream>>
* <<secure-data-stream>>
You can also <<convert-an-index-alias-to-a-data-stream,convert an index alias to
You can also <<convert-index-alias-to-data-stream,convert an index alias to
a data stream>>.
IMPORTANT: If you use {fleet} or {agent}, skip this tutorial. {fleet} and
{agent} set up data streams for you. See {fleet-guide}/data-streams.html[Data
streams] in the {fleet} Guide.
[discrete]
[[configure-a-data-stream-ilm-policy]]
=== Optional: Configure an {ilm-init} lifecycle policy
[[create-index-lifecycle-policy]]
=== Step 1. Create an index lifecycle policy
While optional, we recommend you configure an <<set-up-lifecycle-policy,{ilm}
({ilm-init}) policy>> to automate the management of your data stream's backing
indices.
While optional, we recommend using {ilm-init} to automate the management of your
data stream's backing indices. {ilm-init} requires an index lifecycle policy.
In {kib}, open the menu and go to *Stack Management > Index Lifecycle Policies*.
Click *Create policy*.
To create an index lifecycle policy in {kib}, open the main menu and go to
*Stack Management > Index Lifecycle Policies*. Click *Create policy*.
[role="screenshot"]
image::images/ilm/create-policy.png[Create Policy page]
[%collapsible]
.API example
====
Use the <<ilm-put-lifecycle,create lifecycle policy API>> to configure a policy:
You can also use the <<ilm-put-lifecycle,create lifecycle policy API>>.
[source,console]
----
PUT /_ilm/policy/my-data-stream-policy
PUT _ilm/policy/my-lifecycle-policy
{
"policy": {
"phases": {
"hot": {
"actions": {
"rollover": {
"max_primary_shard_size": "25GB"
"max_age": "30d",
"max_primary_shard_size": "50gb"
}
}
},
"warm": {
"min_age": "30d",
"actions": {
"shrink": {
"number_of_shards": 1
},
"forcemerge": {
"max_num_segments": 1
}
}
},
"cold": {
"min_age": "60d",
"actions": {
"searchable_snapshot": {
"snapshot_repository": "my-snapshot-repo"
}
}
},
"frozen": {
"min_age": "90d",
"actions": {
"searchable_snapshot": {
"snapshot_repository": "my-snapshot-repo"
}
}
},
"delete": {
"min_age": "30d",
"min_age": "735d",
"actions": {
"delete": {}
}
@ -54,139 +80,158 @@ PUT /_ilm/policy/my-data-stream-policy
}
}
----
====
[discrete]
[[create-a-data-stream-template]]
=== Create an index template
[[create-component-templates]]
=== Step 2. Create component templates
. In {kib}, open the menu and go to *Stack Management > Index Management*.
. In the *Index Templates* tab, click *Create template*.
. In the Create template wizard, use the *Data stream* toggle to indicate the
template is used for data streams.
. Use the wizard to finish defining your template. Specify:
A data stream requires a matching index template. In most cases, you compose
this index template using one or more component templates. You typically use
separate component templates for mappings and index settings. This lets you
reuse the component templates in multiple index templates.
* One or more index patterns that match the data stream's name. +
include::{es-repo-dir}/indices/create-data-stream.asciidoc[tag=data-stream-name]
When creating your component templates, include:
* Mappings and settings for the stream's backing indices.
* A <<date,`date`>> or <<date_nanos,`date_nanos`>> mapping for the `@timestamp`
field. If you don't specify a mapping, {es} maps `@timestamp` as a `date` field
with default options.
* A priority for the index template
+
include::{es-repo-dir}/indices/index-templates.asciidoc[tag=built-in-index-templates]
* Your lifecycle policy in the `index.lifecycle.name` index setting.
[[elastic-data-stream-naming-scheme]]
.The Elastic data stream naming scheme
****
The {agent} uses the Elastic data stream naming scheme to name its data streams.
To help you organize your data consistently and avoid naming collisions, we
recommend you also use the Elastic naming scheme for your other data streams.
To create a component template in {kib}, open the main menu and go to *Stack
Management > Index Management*. In the *Index Templates* view, click *Create a
component template*.
The naming scheme splits data into different data streams based on the following
components. Each component corresponds to a
<<constant-keyword-field-type,constant keyword>> field defined in the
{ecs-ref}[Elastic Common Schema (ECS)].
`type`::
Generic type describing the data, such as `logs`, `metrics`, or `synthetics`.
Corresponds to the `data_stream.type` field.
`dataset`::
Describes the ingested data and its structure. Corresponds to the
`data_stream.dataset` field. Defaults to `generic`.
`namespace`::
User-configurable arbitrary grouping. Corresponds to the `data_stream.namespace`
field. Defaults to `default`.
The naming scheme separates these components with a `-` character:
```
<type>-<dataset>-<namespace>
```
For example, the {agent} uses the `logs-nginx.access-production` data
stream to store data with a type of `logs`, a dataset of `nginx.access`, and a
namespace of `production`. If you use the {agent} to ingest a log file, it
stores the data in the `logs-generic-default` data stream.
For more information about the naming scheme and its benefits, see our
https://www.elastic.co/blog/an-introduction-to-the-elastic-data-stream-naming-scheme[An
introduction to the Elastic data stream naming scheme] blog post.
****
include::{es-repo-dir}/data-streams/data-streams.asciidoc[tag=timestamp-reqs]
If using {ilm-init}, specify your lifecycle policy in the `index.lifecycle.name`
setting.
TIP: Carefully consider your template's mappings and settings. Later changes may
require reindexing. See <<data-streams-change-mappings-and-settings>>.
[role="screenshot"]
image::images/data-streams/create-index-template.png[Create template page]
[%collapsible]
.API example
====
Use the <<indices-put-template,create or update index template API>> to create
an index template. The template must include a `data_stream` object, indicating
it's used for data streams.
You can also use the <<indices-component-template,create component template
API>>.
[source,console]
----
PUT /_index_template/my-data-stream-template
# Creates a component template for mappings
PUT _component_template/my-mappings
{
"template": {
"mappings": {
"properties": {
"@timestamp": {
"type": "date",
"format": "date_optional_time||epoch_millis"
},
"message": {
"type": "wildcard"
}
}
}
},
"_meta": {
"description": "Mappings for @timestamp and message fields",
"my-custom-meta-field": "More arbitrary metadata"
}
}
# Creates a component template for index settings
PUT _component_template/my-settings
{
"index_patterns": [ "my-data-stream*" ],
"data_stream": { },
"priority": 500,
"template": {
"settings": {
"index.lifecycle.name": "my-data-stream-policy"
"index.lifecycle.name": "my-lifecycle-policy"
}
},
"_meta": {
"description": "Settings for ILM",
"my-custom-meta-field": "More arbitrary metadata"
}
}
----
// TEST[continued]
====
[discrete]
[[create-a-data-stream]]
=== Create the data stream
[[create-index-template]]
=== Step 3. Create an index template
To automatically create the data stream, submit an
<<add-documents-to-a-data-stream,indexing request>> to the stream. The stream's
name must match one of your template's index patterns.
Use your component templates to create an index template. Specify:
* One or more index patterns that match the data stream's name. We recommend
using our {fleet-guide}/data-streams.html#data-streams-naming-scheme[data stream
naming scheme].
* That the template is data stream enabled.
* Any component templates that contain your mappings and index settings.
* A priority higher than `200` to avoid collisions with built-in templates.
See <<avoid-index-pattern-collisions>>.
To create an index template in {kib}, open the main menu and go to *Stack
Management > Index Management*. In the *Index Templates* view, click *Create
template*.
You can also use the <<indices-put-template,create index template API>>.
Include the `data_stream` object to enable data streams.
[source,console]
----
POST /my-data-stream/_doc/
PUT _index_template/my-index-template
{
"@timestamp": "2099-03-07T11:04:05.000Z",
"user": {
"id": "vlb44hny"
},
"message": "Login attempt failed"
"index_patterns": ["my-data-stream*"],
"data_stream": { },
"composed_of": [ "my-mappings", "my-settings" ],
"priority": 500,
"_meta": {
"description": "Template for my time series data",
"my-custom-meta-field": "More arbitrary metadata"
}
}
----
// TEST[continued]
You can also use the <<indices-create-data-stream,create data stream API>> to
manually create the data stream. The stream's name must match one of your
template's index patterns.
[discrete]
[[create-data-stream]]
=== Step 4. Create the data stream
To automatically create the data stream, submit an
<<add-documents-to-a-data-stream,indexing request>> that targets the stream's
name. This name must match one of your index template's index patterns. The
request must use an `op_type` of `create`. Documents must include a `@timestamp`
field.
[source,console]
----
PUT /_data_stream/my-data-stream
PUT my-data-stream/_bulk
{ "create":{ } }
{ "@timestamp": "2099-05-06T16:21:15.000Z", "message": "192.0.2.42 - - [06/May/2099:16:21:15 +0000] \"GET /images/bg.jpg HTTP/1.0\" 200 24736" }
{ "create":{ } }
{ "@timestamp": "2099-05-06T16:25:42.000Z", "message": "192.0.2.255 - - [06/May/2099:16:25:42 +0000] \"GET /favicon.ico HTTP/1.0\" 200 3638" }
POST my-data-stream/_doc
{
"@timestamp": "2099-05-06T16:21:15.000Z",
"message": "192.0.2.42 - - [06/May/2099:16:21:15 +0000] \"GET /images/bg.jpg HTTP/1.0\" 200 24736"
}
----
// TEST[continued]
You can also manually create the stream using the
<<indices-create-data-stream,create data stream API>>. The stream's name must
still match one of your template's index patterns.
[source,console]
----
PUT _data_stream/my-data-stream
----
// TEST[continued]
// TEST[s/my-data-stream/my-data-stream-alt/]
When you create a data stream, {es} automatically creates a backing index for
the stream. This index also acts as the stream's first write index.
[discrete]
[[secure-data-stream]]
=== Step 5. Secure the data stream
include::{xes-repo-dir}/security/authorization/alias-privileges.asciidoc[tag=data-stream-security]
For an example, see <<data-stream-privileges>>.
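As a minimal sketch of granting such privileges, the following request uses the
create or update roles API to give a role read access to `my-data-stream`. The
role name `my-data-stream-reader` and the chosen privileges are illustrative
assumptions, not part of this tutorial.
[source,console]
----
# Hypothetical role name. Adjust the privileges to your own needs.
POST _security/role/my-data-stream-reader
{
  "indices": [
    {
      "names": [ "my-data-stream" ],
      "privileges": [ "read", "view_index_metadata" ]
    }
  ]
}
----
Because privileges on a data stream also apply to its backing indices, the role
does not need to list the `.ds-my-data-stream-*` indices separately.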
[discrete]
[[convert-an-index-alias-to-a-data-stream]]
[[convert-index-alias-to-data-stream]]
=== Convert an index alias to a data stream
// tag::time-series-alias-tip[]
@ -196,12 +241,11 @@ functionality, require less maintenance, and automatically integrate with
<<data-tiers,data tiers>>.
// end::time-series-alias-tip[]
To convert an index alias with a write index to a new data stream with the same
To convert an index alias with a write index to a data stream with the same
name, use the <<indices-migrate-to-data-stream,migrate to data stream API>>.
During conversion, the alias's indices become hidden backing indices for the
stream. The alias's write index becomes the stream's write index. Note the data
stream still requires a matching <<create-a-data-stream-template,index
template>>.
stream. The alias's write index becomes the stream's write index. The stream
still requires a matching index template with data stream enabled.
////
[source,console]
@ -218,7 +262,7 @@ POST idx2/_doc/
"@timestamp" : "2099-01-01"
}
POST /_aliases
POST _aliases
{
"actions": [
{
@ -237,7 +281,7 @@ POST /_aliases
]
}
PUT /_index_template/template
PUT _index_template/template
{
"index_patterns": ["my-time-series-data"],
"data_stream": { }
@ -248,79 +292,58 @@ PUT /_index_template/template
[source,console]
----
POST /_data_stream/_migrate/my-time-series-data
POST _data_stream/_migrate/my-time-series-data
----
// TEST[continued]
[discrete]
[[secure-a-data-stream]]
=== Secure the data stream
To control access to the data stream and its
data, use <<data-stream-privileges,{es}'s {security-features}>>.
[discrete]
[[get-info-about-a-data-stream]]
[[get-info-about-data-stream]]
=== Get information about a data stream
In {kib}, open the menu and go to *Stack Management > Index Management*. In the
*Data Streams* tab, click the data stream's name.
To get information about a data stream in {kib}, open the main menu and go to
*Stack Management > Index Management*. In the *Data Streams* view, click the
data stream's name.
[role="screenshot"]
image::images/data-streams/data-streams-list.png[Data Streams tab]
[%collapsible]
.API example
====
Use the <<indices-get-data-stream,get data stream API>> to retrieve information
about one or more data streams:
You can also use the <<indices-get-data-stream,get data stream API>>.
////
[source,console]
----
POST /my-data-stream/_rollover/
POST my-data-stream/_rollover/
----
// TEST[continued]
////
[source,console]
----
GET /_data_stream/my-data-stream
GET _data_stream/my-data-stream
----
// TEST[continued]
====
[discrete]
[[delete-a-data-stream]]
[[delete-data-stream]]
=== Delete a data stream
To delete a data stream and its backing indices, open the {kib} menu and go to
*Stack Management > Index Management*. In the *Data Streams* tab, click the
trash icon. The trash icon only displays if you have the `delete_index`
To delete a data stream and its backing indices in {kib}, open the main menu and
go to *Stack Management > Index Management*. In the *Data Streams* view, click
the trash icon. The icon only displays if you have the `delete_index`
<<security-privileges, security privilege>> for the data stream.
[role="screenshot"]
image::images/data-streams/data-streams-no-delete.png[Data Streams tab]
[%collapsible]
.API example
====
Use the <<indices-delete-data-stream,delete data stream API>> to delete a data
stream and its backing indices:
You can also use the <<indices-delete-data-stream,delete data stream API>>.
[source,console]
----
DELETE /_data_stream/my-data-stream
DELETE _data_stream/my-data-stream
----
// TEST[continued]
====
////
[source,console]
----
DELETE /_data_stream/*
DELETE /_index_template/*
DELETE /_ilm/policy/my-data-stream-policy
DELETE _data_stream/*
DELETE _index_template/*
DELETE _component_template/my-*
DELETE _ilm/policy/my-lifecycle-policy
----
// TEST[continued]
////

View file

@ -58,7 +58,7 @@ stream enabled. See <<set-up-a-data-stream>>.
(Required, string) Name of the data stream or index to target.
+
If the target doesn't exist and matches the name or wildcard (`*`) pattern of an
<<create-a-data-stream-template,index template with a `data_stream`
<<create-index-template,index template with a `data_stream`
definition>>, this request creates the data stream. See
<<set-up-a-data-stream>>.
+
@ -195,7 +195,7 @@ exist. To update an existing document, you must use the `_doc` resource.
===== Automatically create data streams and indices
If the request's target doesn't exist and matches an
<<create-a-data-stream-template,index template with a `data_stream`
<<create-index-template,index template with a `data_stream`
definition>>, the index operation automatically creates the data stream. See
<<set-up-a-data-stream>>.

View file

@ -37,7 +37,7 @@ events imitating a Squiblydoo attack. The data has been mapped to
To get started:
. Create an <<index-templates,index template>> with
<<create-a-data-stream-template,data stream enabled>>:
<<create-index-template,data stream enabled>>:
+
////
[source,console]

View file

@ -53,11 +53,8 @@ See <<set-up-a-data-stream>>.
`<data-stream>`::
+
--
(Required, string) Name of the data stream to create.
// tag::data-stream-name[]
We recommend using the <<elastic-data-stream-naming-scheme,Elastic data stream
naming scheme>>. Data stream names must meet the following criteria:
(Required, string) Name of the data stream to create. Data stream names must
meet the following criteria:
- Lowercase only
- Cannot include `\`, `/`, `*`, `?`, `"`, `<`, `>`, `|`, `,`, `#`, `:`, or a
@ -66,6 +63,5 @@ space character
- Cannot be `.` or `..`
- Cannot be longer than 255 bytes. Multi-byte characters
count towards this limit faster.
// end::data-stream-name[]
--

View file

@ -6,7 +6,7 @@
++++
Deletes one or more <<data-streams,data streams>> and their backing
indices. See <<delete-a-data-stream>>.
indices. See <<delete-data-stream>>.
////
[source,console]

View file

@ -6,7 +6,7 @@
++++
Retrieves information about one or more <<data-streams,data streams>>.
See <<get-info-about-a-data-stream>>.
See <<get-info-about-data-stream>>.
////
[source,console]
@ -157,7 +157,7 @@ acts as a cumulative count of the stream's rollovers, starting at `1`.
`_meta`::
(object)
Custom metadata for the stream, copied from the `_meta` object of the
stream's matching <<create-a-data-stream-template,index template>>. If empty,
stream's matching <<create-index-template,index template>>. If empty,
the response omits this property.
`status`::
@ -186,7 +186,7 @@ One or more primary shards are unassigned, so some data is unavailable.
Name of the index template used to create the data stream's backing indices.
+
The template's index pattern must match the name of this data stream. See
<<create-a-data-stream-template>>.
<<create-index-template,create an index template>>.
`ilm_policy`::
(string)

View file

@ -31,6 +31,7 @@ templates.
* If a new data stream or index matches more than one index template, the index
template with the highest priority is used.
[[avoid-index-pattern-collisions]]
.Avoid index pattern collisions
****
// tag::built-in-index-templates[]

View file

@ -98,7 +98,7 @@ If this object is included, the template is used to create data streams and
their backing indices. Supports an empty object: `data_stream: { }`
+
Data streams require a matching index template with a `data_stream` object.
See <<create-a-data-stream-template>>.
See <<create-index-template,create an index template>>.
+
.Properties of `data_stream`
[%collapsible%open]
@ -294,7 +294,7 @@ To check the `_meta`, you can use the <<indices-get-template, get index template
To use an index template for a data stream, the template must include an empty `data_stream` object.
Data stream templates are only used for a stream's backing indices,
they are not applied to regular indices.
See <<create-a-data-stream-template>>.
See <<create-index-template,create an index template>>.
[source,console]
--------------------------------------------------

View file

@ -156,7 +156,7 @@ and reopen the index.
You cannot close the write index of a data stream.
To update the analyzer for a data stream's write index and future backing
indices, update the analyzer in the <<create-a-data-stream-template,index
indices, update the analyzer in the <<create-index-template,index
template used by the stream>>. Then <<manually-roll-over-a-data-stream,roll over
the data stream>> to apply the new analyzer to the stream's write index and
future backing indices. This affects searches and any new data added to the

View file

@ -317,7 +317,7 @@ PUT _ingest/pipeline/logs-my_app-default
. Create an <<index-templates,index template>> that includes your pipeline in
the <<index-default-pipeline,`index.default_pipeline`>> or
<<index-final-pipeline,`index.final_pipeline`>> index setting. Ensure the
template is <<create-a-data-stream-template,data stream enabled>>. The
template is <<create-index-template,data stream enabled>>. The
template's index pattern should match `logs-<dataset-name>-*`.
+
--

View file

@ -156,7 +156,7 @@ pipeline**.
You're now ready to index the logs data to a <<data-streams,data stream>>.
. Create an <<index-templates,index template>> with
<<create-a-data-stream-template,data stream enabled>>.
<<create-index-template,data stream enabled>>.
+
[source,console]
----

View file

@ -38,14 +38,14 @@ by default as well.
[WARNING]
====
Each data stream requires a matching
<<create-a-data-stream-template,index template>>. The stream uses this
<<create-index-template,index template>>. The stream uses this
template to create new backing indices.
When restoring a data stream, ensure a matching template exists for the stream.
You can do this using one of the following methods:
* Check for existing templates that match the stream. If no matching template
exists, <<create-a-data-stream-template,create one>>.
exists, <<create-index-template,create one>>.
* Restore a global cluster state that includes a matching template for the
stream.
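
One way to check for existing templates, sketched here under assumptions, is to
list index templates with the get index template API and inspect each
template's `index_patterns` and `data_stream` properties. The wildcard matches
template names, not index patterns. The `my-*` name pattern is illustrative.
[source,console]
----
# Lists index templates whose names start with "my-" (hypothetical naming)
GET _index_template/my-*
----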
@ -158,7 +158,7 @@ indices.
The `index_settings` and `ignore_index_settings` parameters affect
restored backing indices only. New backing indices created for a stream use the index
settings specified in the stream's matching
<<create-a-data-stream-template,index template>>.
<<create-index-template,index template>>.
If you change index settings during a restore, we recommend you make similar
changes in the stream's matching index template. This ensures new backing

View file

@ -128,7 +128,7 @@ By setting `include_global_state` to `false` it's possible to prevent the cluste
the snapshot.
IMPORTANT: The global cluster state includes the cluster's index
templates, such as those <<create-a-data-stream-template,matching a data
templates, such as those <<create-index-template,matching a data
stream>>. If your snapshot includes data streams, we recommend storing the
global state as part of the snapshot. This lets you later restore any
templates required for a data stream.
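A minimal sketch of such a snapshot request follows. It assumes a registered
repository named `my_repository`. The repository name, the snapshot name, and
the `my-data-stream` target are illustrative.
[source,console]
----
# Repository and snapshot names are hypothetical
PUT _snapshot/my_repository/my_snapshot
{
  "indices": "my-data-stream",
  "include_global_state": true
}
----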

View file

@ -8,14 +8,12 @@
[[data-stream-privileges]]
==== Data stream privileges
A data stream consists of one or more backing indices, which store the stream's
data. Most requests sent to a data stream are routed to one or more of these
backing indices.
// tag::data-stream-security[]
Similar to an index, you can use <<privileges-list-indices,indices privileges>>
to control access to a data stream. Any role or user granted privileges to a
data stream is automatically granted the same privileges to its backing
indices.
Use <<privileges-list-indices,indices privileges>> to control access to
a data stream. Any role or user granted privileges to a data
stream is automatically granted the same privileges to its backing indices.
// end::data-stream-security[]
For example, `my-data-stream` consists of two backing indices:
`.ds-my-data-stream-2099.03.07-000001` and
@ -39,7 +37,7 @@ backing indices, the user can retrieve a document directly from
////
[source,console]
----
PUT /my-index/_doc/2
PUT my-index/_doc/2
{
"my-field": "foo"
}
@ -48,7 +46,7 @@ PUT /my-index/_doc/2
[source,console]
----
GET /.ds-my-data-stream-2099.03.08-000002/_doc/2
GET .ds-my-data-stream-2099.03.08-000002/_doc/2
----
// TEST[continued]
// TEST[s/.ds-my-data-stream-2099.03.08-000002/my-index/]
@ -60,7 +58,7 @@ documents directly from `.ds-my-data-stream-2099.03.09-000003`:
[source,console]
----
GET /.ds-my-data-stream-2099.03.09-000003/_doc/2
GET .ds-my-data-stream-2099.03.09-000003/_doc/2
----
// TEST[continued]
// TEST[s/.ds-my-data-stream-2099.03.09-000003/my-index/]