kibana/docs/concepts/data-views.asciidoc
Matthew Kime acdaf0127e
[data views] cache field caps requests (#168910)
## Summary

Closes: https://github.com/elastic/kibana/issues/169622

Field caps request caching - implement `stale-while-revalidate` cache
headers and etags with 304 responses as appropriate.

This PR accomplishes
- Adds 304 Not Modified to http server
- Adds refresh button to data view management that force refreshes field
list
- Adds `/internal/data_views/fields` endpoint which supports caching. 
- This is necessary since `fields_for_wildcard` doesn't support caching
due to POST requests
- Adds `stale-while-revalidate` header directive UNLESS field list is
empty.
- Uses Vary header with hash of user id to force requests when user has
changed.
- Unchanged field list responses won't recreate data view field list

### How to test
1. Pop open the dev tools to the network tab and make sure 'Disable
cache' is unchecked. filter for 'fields' requests
2. Load more than one data set (sample data is fine)
3. Go to discover
4. Switch selected data views. Notice status code on first load vs
subsequent loads
5. Open a new window, notice loading from cache
6. Push document that changes field list. You'll need to wait for cache
to expire for it to load. The default is 5s.
7. Notice that field list loads after cache is old properly result in
304 response.
8. Set `data_views:cache_max_age`, shift reload. No field requests will
be cached.

#### Note on Safari

Safari doesn't support the `stale-while-revalidate` directive.
Additionally, the Safari dev console shows 304 responses as 200's. I
figured this out by adding console statements to the fields endpoint. As
best I can tell, Safari won't be performant on slow clusters but its
also no slower than it was before this PR. Safari does respect the
disabling of cache headers.

## Release note

Data View field list requests are now cached with a
`stale-while-revalidate` strategy. This means that after the initial
field list request, all subsequent requests return the cached response
which is very fast. If the cache is determined to be stale then the
cache will update in the background and new data will be available on
the next request.

This behavior can be modified via the `data_views:cache_max_age` Kibana
advanced setting. Setting it to zero will disable the cache. All other
values (in seconds) will be used to determine whether the cache is
stale. The default value is 5 seconds.

The field list can be manually updated via the refresh button in data
view management or a hard refresh with your browser.

Note for Safari: The `stale-while-revalidate` cache directive is
unsupported, therefore it makes additional requests. If this is
impacting Kibana performance then try Chrome or Firefox.

Data View Management

<img width="1064" alt="Screenshot 2024-01-09 at 1 36 20 PM"
src="f272c19f-81b4-4697-9303-b1f8f150e2b9">

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
Co-authored-by: Matthias Wilhelm <matthias.wilhelm@elastic.co>
Co-authored-by: amyjtechwriter <61687663+amyjtechwriter@users.noreply.github.com>
Co-authored-by: Julia Rechkunova <julia.rechkunova@elastic.co>
Co-authored-by: Julia Rechkunova <julia.rechkunova@gmail.com>
2024-01-16 06:54:38 -06:00

185 lines
No EOL
6.8 KiB
Text
Raw Permalink Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

[[data-views]]
=== Create a {data-source}
{kib} requires a {data-source} to access the {es} data that you want to explore.
A {data-source} can point to one or more indices, {ref}/data-streams.html[data streams], or {ref}/alias.html[index aliases].
For example, a {data-source} can point to your log data from yesterday,
or all indices that contain your data.
[float]
[[data-views-read-only-access]]
=== Required permissions
* Access to *Data Views* requires the <<kibana-role-management, {kib} privilege>>
`Data View Management`.
* To create a {data-source}, you must have the <<kibana-role-management,{es} privilege>>
`view_index_metadata`.
* If a read-only indicator appears in {kib}, you have insufficient privileges
to create or save {data-sources}. In addition, the buttons to create {data-sources} or
save existing {data-sources} are not visible. For more information,
refer to <<xpack-security-authorization,Granting access to {kib}>>.
[float]
[[settings-create-pattern]]
=== Create a data view
If you collected data using one of the {kib} <<connect-to-elasticsearch,ingest options>>,
uploaded a file, or added sample data,
you get a {data-source} for free, and can start exploring your data.
If you loaded your own data, follow these steps to create a {data-source}.
. Open *Lens* or *Discover*, and then open the data view menu.
+
[role="screenshot"]
image::images/discover-data-view.png[How to set the {data-source} in Discover, width="40%"]
. Click *Create a {data-source}*.
. Give your {data-source} a name.
. Start typing in the *Index pattern* field, and {kib} looks for the names of
indices, data streams, and aliases that match your input. You can
view all available sources or only the sources that the data view targets.
+
[role="screenshot"]
image:management/index-patterns/images/create-data-view.png["Create data view"]
+
** To match multiple sources, use a wildcard (*). `filebeat-*` matches
`filebeat-apache-a`, `filebeat-apache-b`, and so on.
+
** To match multiple single sources, enter their names,
separated by a comma. Do not include a space after the comma.
`filebeat-a,filebeat-b` matches two indices.
+
** To exclude a source, use a minus sign (-), for example, `-test3`.
. Open the *Timestamp field* dropdown,
and then select the default field for filtering your data by time.
+
** If you dont set a default time field, you can't use
global time filters on your dashboards. This is useful if
you have multiple time fields and want to create dashboards that combine visualizations
based on different timestamps.
+
** If your index doesnt have time-based data, choose *I dont want to use the time filter*.
. Click *Show advanced settings* to:
** Display hidden and system indices.
** Specify your own {data-source} name. For example, enter your {es} index alias name.
. [[reload-fields]] Click *Save {data-source} to {kib}*.
+
You can manage your data view from *Stack Management*.
[float]
==== Create a temporary {data-source}
Want to explore your data or create a visualization without saving it as a data view?
Select *Use without saving* in the *Create {data-source}* form in *Discover*
or *Lens*. With a temporary {data-source}, you can add fields and create an {es}
query alert, just like you would a regular {data-source}. Your work won't be visible to others in your space.
A temporary {data-source} remains in your space until you change apps, or until you save it.
[role="screenshot"]
image::https://images.contentstack.io/v3/assets/bltefdd0b53724fa2ce/blte3a4f3994c44c0cc/637eb0c95834861044c21a25/ad-hoc-data-view.gif[how to create an ad-hoc data view]
NOTE: Temporary {data-sources} are not available in *Stack Management.*
[float]
[[rollup-data-view]]
==== Use {data-sources} with rolled up data
deprecated::[8.11.0,'Rollups are deprecated and will be removed in a future version. Use {ref}/downsampling.html[downsampling] instead.']
A {data-source} can match one rollup index. For a combination rollup
{data-source} with both raw and rolled up data, use the standard notation:
```ts
rollup_logstash,kibana_sample_data_logs
```
For an example, refer to <<rollup-data-tutorial,Create and visualize rolled up data>>.
[float]
[[management-cross-cluster-search]]
==== Use {data-sources} with {ccs}
If your {es} clusters are configured for {ref}/modules-cross-cluster-search.html[{ccs}],
you can create a {data-source} to search across the clusters of your choosing.
Specify data streams, indices, and aliases in a remote cluster using the
following syntax:
```ts
<remote_cluster_name>:<target>
```
To query {ls} indices across two {es} clusters
that you set up for {ccs}, named `cluster_one` and `cluster_two`:
```ts
cluster_one:logstash-*,cluster_two:logstash-*
```
Use wildcards in your cluster names
to match any number of clusters. To search {ls} indices across
clusters named `cluster_foo`, `cluster_bar`, and so on:
```ts
cluster_*:logstash-*
```
To query across all {es} clusters that have been configured for {ccs},
use a standalone wildcard for your cluster name:
```ts
*:logstash-*
```
To match indices starting with `logstash-`, but exclude those starting with `logstash-old`, from
all clusters having a name starting with `cluster_`:
```ts
`cluster_*:logstash-*,cluster_*:-logstash-old*`
```
Excluding a cluster avoids sending any network calls to that cluster.
To exclude a cluster with the name `cluster_one`:
```ts
`cluster_*:logstash-*,-cluster_one:*`
```
Once you configure a {data-source} to use the {ccs} syntax, all searches and
aggregations using that {data-source} in {kib} take advantage of {ccs}.
For more information, refer to
{ref}/modules-cross-cluster-search.html#exclude-problematic-clusters[Excluding
clusters or indicies from cross-cluster search].
[float]
[[delete-data-view]]
=== Delete a {data-source}
When you delete a {data-source}, you cannot recover the associated field formatters, runtime fields, source filters,
and field popularity data. Deleting a {data-source} does not remove any indices or data documents from {es}.
WARNING: Deleting a {data-source} breaks all visualizations, saved searches, and other saved objects that reference the data view.
. Open the main menu, and then click *Stack Management > Data Views*.
. Find the {data-source} that you want to delete, and then
click image:management/index-patterns/images/delete.png[Delete icon] in the *Actions* column.
[float]
[[data-view-field-cache]]
=== {data-source} field cache
The browser caches {data-source} field lists for increased performance. This is particularly impactful
for {data-sources} with a high field count that span a large number of indices and clusters. The field
list is updated every couple of minutes in typical {kib} usage. Alternatively, use the refresh button on the {data-source}
management detail page to get an updated field list. A force reload of {kib} has the same effect.
The field list may be impacted by changes in indices and user permissions.