# Backport

This will backport the following commits from `main` to `8.14`:

- [[DOCS] Add Playground docs (#182692)](https://github.com/elastic/kibana/pull/182692)

<!--- Backport version: 9.4.3 -->

### Questions?

Please refer to the [Backport tool documentation](https://github.com/sqren/backport)

Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com>
commit 81863bb5a0 (parent 96312f4017)
9 changed files with 351 additions and 6 deletions
BIN docs/playground/images/chat-interface.png (new file, 234 KiB; binary not shown)
BIN docs/playground/images/edit-query.png (new file, 116 KiB; binary not shown)
BIN docs/playground/images/select-indices.png (new file, 150 KiB; binary not shown)
docs/playground/index.asciidoc (new file, 203 lines)
@@ -0,0 +1,203 @@

[role="xpack"]
[[playground]]
= Playground

preview::[]

// Variable (attribute) definition
:x: Playground

Use {x} to combine your Elasticsearch data with the power of large language models (LLMs) for retrieval augmented generation (RAG).
The chat interface translates your natural language questions into {es} queries, retrieves the most relevant results from your {es} documents, and passes those documents to the LLM to generate tailored responses.

Once you start chatting, use the UI to view and modify the Elasticsearch queries that search your data.
You can also view the underlying Python code that powers the chat interface, and download this code to integrate into your own application.

Learn how to get started on this page.
Refer to the following for more advanced topics:

* <<playground-context>>
* <<playground-query>>
* <<playground-troubleshooting>>

[float]
[[playground-how-it-works]]
== How {x} works

Here's a simplified overview of how {x} works:

* User *creates a connection* to an LLM provider
* User *selects a model* to use for generating responses
* User *defines the model's behavior and tone* with initial instructions
** *Example*: "_You are a friendly assistant for question-answering tasks. Keep responses as clear and concise as possible._"
* User *selects {es} indices* to search
* User *enters a question* in the chat interface
* {x} *autogenerates an {es} query* to retrieve relevant documents
** User can *view and modify the underlying {es} query* in the UI
* {x} *auto-selects relevant fields* from the retrieved documents to pass to the LLM
** User can *edit which fields are targeted*
* {x} passes the *filtered documents* to the LLM
** The LLM generates a response based on the original query, initial instructions, chat history, and {es} context
* User can *view the Python code* that powers the chat interface
** User can also *download the code* to integrate into an application
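
For a concrete sense of this flow, here's a minimal, hypothetical sketch of the same RAG loop using the official {es} Python client and the OpenAI Python SDK. The index name, fields, model, and prompt are illustrative assumptions, not the exact code {x} generates (you can download that from the UI).

[source,python]
----
# Hypothetical RAG loop sketch; not the exact code Playground generates.
# Assumes a `books` index with `name` and `author` fields, and API keys in env vars.
import os

from elasticsearch import Elasticsearch
from openai import OpenAI

es = Elasticsearch(os.environ["ES_URL"], api_key=os.environ["ES_API_KEY"])
llm = OpenAI()  # reads OPENAI_API_KEY from the environment

SYSTEM_PROMPT = (
    "You are a friendly assistant for question-answering tasks. "
    "Keep responses as clear and concise as possible."
)

def answer(question: str) -> str:
    # 1. Retrieve relevant documents from Elasticsearch.
    results = es.search(
        index="books",
        query={"multi_match": {"query": question, "fields": ["name", "author"]}},
        size=3,
    )
    # 2. Pass only the selected fields to the LLM as context.
    context = "\n".join(hit["_source"]["name"] for hit in results["hits"]["hits"])
    # 3. Generate a response grounded in that context.
    response = llm.chat.completions.create(
        model="gpt-4-turbo",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return response.choices[0].message.content

print(answer("Which of these books are about dystopian societies?"))
----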
[float]
[[playground-availability-prerequisites]]
== Availability and prerequisites

For Elastic Cloud and self-managed deployments, {x} is available in the *Search* space in {kib}, under *Content* > *{x}*.

For Elastic Serverless, {x} is available in your {es} project UI.
// TODO: Confirm URL path for Serverless

To use {x}, you'll need the following:

1. An Elastic *v8.14.0+* deployment or {es} *Serverless* project. (Start a https://cloud.elastic.co/registration[free trial].)
2. At least one *{es} index* with documents to search.
** See <<playground-getting-started-ingest, ingest data>> if you'd like to ingest sample data.
3. An account with a *supported LLM provider*. {x} supports the following:
+
[cols="2a,2a,1a", options="header"]
|===
| Provider | Models | Notes

| *Amazon Bedrock*
a|
* Anthropic: Claude 3 Sonnet
* Anthropic: Claude 3 Haiku
a|
Does not currently support streaming.

| *OpenAI*
a|
* GPT-3 turbo
* GPT-4 turbo
a|

| *Azure OpenAI*
a|
* GPT-3 turbo
* GPT-4 turbo
a|

|===
[float]
[[playground-getting-started]]
== Getting started

[float]
[[playground-getting-started-connect]]
=== Connect to LLM provider

To get started with {x}, you need to create a <<action-types,connector>> for your LLM provider.
Follow these steps on the {x} landing page:

. Under *Connect to LLM*, click *Create connector*.
. Select your *LLM provider*.
. *Name* your connector.
. Select a *URL endpoint* (or use the default).
. Enter *access credentials* for your LLM provider.

[TIP]
====
If you need to update a connector, or add a new one, click the wrench button (🔧) under *Model settings*.
====
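
If you prefer to script this setup, connectors can also be created with the {kib} connectors HTTP API. Here's a minimal sketch for an OpenAI connector using Python's `requests`; the endpoint and headers follow the {kib} API, but treat the exact `config` and `secrets` keys as assumptions to verify against your version.

[source,python]
----
# Hypothetical sketch: create an OpenAI connector via the Kibana HTTP API.
# Verify the config/secrets keys against your Kibana version before relying on this.
import os

import requests

KIBANA_URL = os.environ["KIBANA_URL"]  # e.g. your deployment's Kibana endpoint

response = requests.post(
    f"{KIBANA_URL}/api/actions/connector",
    headers={"kbn-xsrf": "true"},  # required by the Kibana API
    auth=(os.environ["KIBANA_USER"], os.environ["KIBANA_PASSWORD"]),
    json={
        "name": "openai-playground",
        "connector_type_id": ".gen-ai",
        "config": {
            "apiProvider": "OpenAI",
            "apiUrl": "https://api.openai.com/v1/chat/completions",
        },
        "secrets": {"apiKey": os.environ["OPENAI_API_KEY"]},
    },
)
response.raise_for_status()
print(response.json()["id"])  # the new connector's ID
----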
[float]
[[playground-getting-started-ingest]]
=== Ingest data (optional)

_You can skip this step if you already have data in one or more {es} indices._

There are many options for ingesting data into {es}, including:

* The {enterprise-search-ref}/crawler.html[Elastic crawler] for web content (*NOTE*: Not yet available in _Serverless_)
* {enterprise-search-ref}/connectors.html[Elastic connectors] for data synced from third-party sources
* The {es} {ref}/docs-bulk.html[Bulk API] for JSON documents
+
.*Expand* for example
[%collapsible]
==============
To add a few documents to an index called `books`, run the following in Dev Tools Console:

[source,console]
----
POST /_bulk
{ "index" : { "_index" : "books" } }
{"name": "Snow Crash", "author": "Neal Stephenson", "release_date": "1992-06-01", "page_count": 470}
{ "index" : { "_index" : "books" } }
{"name": "Revelation Space", "author": "Alastair Reynolds", "release_date": "2000-03-15", "page_count": 585}
{ "index" : { "_index" : "books" } }
{"name": "1984", "author": "George Orwell", "release_date": "1985-06-01", "page_count": 328}
{ "index" : { "_index" : "books" } }
{"name": "Fahrenheit 451", "author": "Ray Bradbury", "release_date": "1953-10-15", "page_count": 227}
{ "index" : { "_index" : "books" } }
{"name": "Brave New World", "author": "Aldous Huxley", "release_date": "1932-06-01", "page_count": 268}
{ "index" : { "_index" : "books" } }
{"name": "The Handmaids Tale", "author": "Margaret Atwood", "release_date": "1985-06-01", "page_count": 311}
----
==============

We've also provided some Jupyter notebooks to make it easy to ingest sample data into {es}.
Find these in the https://github.com/elastic/elasticsearch-labs/blob/main/notebooks/ingestion-and-chunking[elasticsearch-labs] repository.
These notebooks use the official {es} Python client.
// TODO: [The above link will be broken until https://github.com/elastic/elasticsearch-labs/pull/232 is merged]
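
As a minimal sketch of what those notebooks do, here's the same `books` bulk ingest using the Python client's `bulk` helper (the connection details are assumptions):

[source,python]
----
# Minimal sketch: bulk-ingest the sample books with the official Python client.
import os

from elasticsearch import Elasticsearch, helpers

es = Elasticsearch(os.environ["ES_URL"], api_key=os.environ["ES_API_KEY"])

books = [
    {"name": "Snow Crash", "author": "Neal Stephenson", "release_date": "1992-06-01", "page_count": 470},
    {"name": "Revelation Space", "author": "Alastair Reynolds", "release_date": "2000-03-15", "page_count": 585},
]

# Each action targets the `books` index; helpers.bulk batches them into one request.
helpers.bulk(es, ({"_index": "books", "_source": doc} for doc in books))
----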
[float]
[[playground-getting-started-index]]
=== Select {es} indices

Once you've connected to your LLM provider, it's time to choose the data you want to search.
Follow the steps under *Select indices*:

. Select one or more {es} indices under *Add index*.
. Click *Start* to launch the chat interface.
+
[.screenshot]
image::select-indices.png[width=400]

Learn more about the underlying {es} queries used to search your data in <<playground-query>>.
[float]
[[playground-getting-started-setup-chat]]
=== Set up the chat interface

You can start chatting with your data immediately, but you might want to tweak some defaults first.

[.screenshot]
image::chat-interface.png[]

You can adjust the following under *Model settings*:

* *Model*. The model used for generating responses.
* *Instructions*. Also known as the _system prompt_, these initial instructions and guidelines define the behavior of the model throughout the conversation. Be *clear and specific* for best results.
* *Include citations*. A toggle to include citations from the relevant {es} documents in responses.

{x} also uses another LLM under the hood to encode all previous questions and responses, and make them available to the main model.
This ensures the model has "conversational memory".

Under *Indices*, you can edit which {es} indices will be searched.
This affects the underlying {es} query.

[TIP]
====
Click *✨ Regenerate* to resend the last query to the model for a fresh response.

Click *⟳ Clear chat* to clear the chat history and start a new conversation.
====
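
Conceptually, that "conversational memory" step condenses the chat history and the new question into a single standalone question before retrieval. Here's a hypothetical illustration of the idea; {x}'s actual prompt and model calls may differ.

[source,python]
----
# Hypothetical illustration of question condensation for conversational memory.
# Playground's actual prompt and model calls may differ.
from openai import OpenAI

llm = OpenAI()  # reads OPENAI_API_KEY from the environment

def condense(chat_history: list[str], question: str) -> str:
    transcript = "\n".join(chat_history)
    response = llm.chat.completions.create(
        model="gpt-4-turbo",
        messages=[{
            "role": "user",
            "content": (
                "Rewrite the follow-up question as a standalone question, "
                f"using the chat history for context.\n\nHistory:\n{transcript}\n\n"
                f"Follow-up: {question}"
            ),
        }],
    )
    return response.choices[0].message.content

# "Who wrote it?" becomes something like "Who wrote Snow Crash?"
history = ["User: Tell me about Snow Crash.", "Assistant: It's a 1992 novel..."]
print(condense(history, "Who wrote it?"))
----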
[float]
[[playground-next-steps]]
=== Next steps

Once you've got {x} up and running, and you've tested out the chat interface, you might want to explore some more advanced topics:

* <<playground-context>>
* <<playground-query>>
* <<playground-troubleshooting>>

include::playground-context.asciidoc[]
include::playground-query.asciidoc[]
include::playground-troubleshooting.asciidoc[]
docs/playground/playground-context.asciidoc (new file, 68 lines)
@@ -0,0 +1,68 @@

[role="xpack"]
[[playground-context]]
== Optimize model context

preview::[]

// Variable (attribute) definition
:x: Playground

Context is the information you provide to the LLM to optimize the relevance of your query results.
Without additional context, an LLM generates results based solely on its training data.
In {x}, this additional context is the information contained in your {es} indices.

There are a few ways to optimize this context for better results.
Some adjustments can be made directly in the {x} UI.
Others require refining your indexing strategy, and potentially reindexing your data.

[float]
[[playground-context-ui]]
== Edit context in UI

Use the *Edit context* button in the {x} UI to adjust the number of documents and fields sent to the LLM.

If you're hitting context length limits, try the following:

* Limit the number of documents retrieved
* Pick a field with fewer tokens, reducing the context length
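
For example, to decide which field is cheapest to send, you can roughly estimate per-field token counts before picking one in the UI. A crude heuristic sketch follows; the `books` index, the field names, and the ~4-characters-per-token ratio are all assumptions, not a real tokenizer.

[source,python]
----
# Rough sketch: estimate average tokens per field to pick a leaner context field.
# The 4-characters-per-token ratio is a crude heuristic, not a real tokenizer.
import os

from elasticsearch import Elasticsearch

es = Elasticsearch(os.environ["ES_URL"], api_key=os.environ["ES_API_KEY"])

hits = es.search(index="books", size=50)["hits"]["hits"]

for field in ["name", "author"]:
    values = [str(hit["_source"].get(field, "")) for hit in hits]
    avg_tokens = sum(len(v) for v in values) / max(len(values), 1) / 4
    print(f"{field}: ~{avg_tokens:.0f} tokens per document")
----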
[float]
[[playground-context-index]]
== Other context optimizations

This section covers additional context optimizations that you won't be able to make directly in the UI.

[float]
[[playground-context-index-chunking]]
=== Chunking large documents

If you're working with large fields, you may need to adjust your indexing strategy.
Consider breaking your documents into smaller chunks, such as sentences or paragraphs.

If you don't yet have a chunking strategy, start by chunking your documents into passages.

Otherwise, consider updating your chunking strategy, for example, from sentence-based to paragraph-based chunking.

Refer to the following Python notebooks for examples of how to chunk your documents:

* https://github.com/elastic/elasticsearch-labs/tree/main/notebooks/ingestion-and-chunking/json-chunking-ingest.ipynb[JSON documents]
* https://github.com/elastic/elasticsearch-labs/tree/main/notebooks/ingestion-and-chunking/pdf-chunking-ingest.ipynb[PDF documents]
* https://github.com/elastic/elasticsearch-labs/tree/main/notebooks/ingestion-and-chunking/website-chunking-ingest.ipynb[Website content]
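
As a minimal illustration of the idea (the notebooks above are more complete), here's a hypothetical paragraph-based chunker that indexes each chunk as its own document; the index name, file, and field layout are assumptions:

[source,python]
----
# Minimal sketch: split a large document into paragraph chunks and index each one.
import os

from elasticsearch import Elasticsearch, helpers

es = Elasticsearch(os.environ["ES_URL"], api_key=os.environ["ES_API_KEY"])

doc_id = "report-42"  # hypothetical source document
text = open("report.txt").read()

# Paragraph-based chunking: split on blank lines, drop empty chunks.
chunks = [p.strip() for p in text.split("\n\n") if p.strip()]

helpers.bulk(
    es,
    (
        {"_index": "chunked-docs",
         "_source": {"parent_id": doc_id, "chunk": i, "text": chunk}}
        for i, chunk in enumerate(chunks)
    ),
)
----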
[float]
[[playground-context-balance]]
=== Balancing cost and latency

Here are some general recommendations for balancing cost and latency with different context sizes:

Optimize context length::
Determine the optimal context length through empirical testing.
Start with a baseline and adjust incrementally to find a balance that optimizes both response quality and system performance.
Implement token pruning for ELSER model::
If you're using our ELSER model, consider implementing token pruning to reduce the number of tokens sent to the model.
Refer to these relevant blog posts:
+
* https://www.elastic.co/search-labs/blog/introducing-elser-v2-part-2[Optimizing retrieval with ELSER v2]
* https://www.elastic.co/search-labs/blog/text-expansion-pruning[Improving text expansion performance using token pruning]
Monitor and adjust::
Continuously monitor the effects of context size changes on performance and adjust as necessary.
docs/playground/playground-query.asciidoc (new file, 51 lines)
@@ -0,0 +1,51 @@

[role="xpack"]
[[playground-query]]
== View and modify queries

:x: Playground

preview::[]

Once you've set up your chat interface, you can start chatting with the model.
{x} automatically generates {es} queries based on your questions, and retrieves the most relevant documents from your {es} indices.
The {x} UI enables you to view and modify these queries.

* Click *View query* to open the visual query editor.
* Modify the query by selecting fields to query per index.
* Click *Save changes*.

[TIP]
====
The `{query}` variable represents the user's question, rewritten as an {es} query.
====

The following screenshot shows the query editor in the {x} UI.
In this simple example, the `books` index has two fields: `author` and `name`.
Selecting a field adds it to the `fields` array in the query.

[.screenshot]
image::images/edit-query.png[View and modify queries]
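
For illustration, a query of the kind the editor produces for this simple case might look like the following sketch, here written as a Python dict; the exact shape is an assumption, so check the *View query* pane for the real query.

[source,python]
----
# Illustrative only: a retriever-style query of the kind the editor produces.
# "{query}" is the placeholder Playground substitutes with the user's question.
query_body = {
    "retriever": {
        "standard": {
            "query": {
                "multi_match": {
                    "query": "{query}",
                    "fields": ["name", "author"],
                }
            }
        }
    }
}
----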
[float]
[[playground-query-relevance]]
=== Improving relevance

The fields you select in the query editor determine the relevance of the retrieved documents.

Remember that the next step in the workflow is to send the retrieved documents to the LLM to answer the question.
Context length is an important factor in ensuring the model has enough information to generate a relevant answer.
Refer to <<playground-context, Optimize context>> for more information.

<<playground-troubleshooting, Troubleshooting>> provides tips on how to diagnose and fix relevance issues.

[NOTE]
====
{x} uses the {ref}/retriever.html[`retriever`] syntax for {es} queries.
Retrievers make it easier to compose and test different retrieval strategies in your search pipelines.
====
// TODO: uncomment and add to note once following page is live
//Refer to {ref}/retrievers-overview.html[documentation] for a high level overview of retrievers.
docs/playground/playground-troubleshooting.asciidoc (new file, 26 lines)
@@ -0,0 +1,26 @@

[role="xpack"]
[[playground-troubleshooting]]
== Troubleshooting

preview::[]

:x: Playground

Dense vectors are not searchable::
Embeddings must be generated using the {ref}/inference-processor.html[inference processor] with an ML node.

Context length error::
You'll need to adjust the size of the context you're sending to the model.
Refer to <<playground-context>>.

LLM credentials not working::
Under *Model settings*, use the wrench button (🔧) to edit your GenAI connector settings.

Poor answer quality::
Check the retrieved documents to see if they are valid.
Adjust your {es} queries to improve the relevance of the documents retrieved. Refer to <<playground-query>>.
+
You can update the initial instructions to be more detailed. This is called 'prompt engineering'. Refer to this https://platform.openai.com/docs/guides/prompt-engineering[OpenAI guide] for more information.
+
You might need to click *⟳ Clear chat* to clear the chat history and start a new conversation.
If you mix topics, the model will find it harder to generate relevant responses.
@@ -432,9 +432,4 @@ This connector was renamed. Refer to <<openai-action-type>>.
 == APIs
 
 For the most up-to-date API details, refer to the
 {kib-repo}/tree/{branch}/x-pack/plugins/alerting/docs/openapi[alerting], {kib-repo}/tree/{branch}/x-pack/plugins/cases/docs/openapi[cases], {kib-repo}/tree/{branch}/x-pack/plugins/actions/docs/openapi[connectors], and {kib-repo}/tree/{branch}/x-pack/plugins/ml/common/openapi[machine learning] open API specifications.
-
-[role="exclude",id="playground"]
-== Playground
-
-Coming in 8.14.0.
|
@ -28,6 +28,8 @@ include::alerting/index.asciidoc[]
|
|||
|
||||
include::{kibana-root}/docs/observability/index.asciidoc[]
|
||||
|
||||
include::{kibana-root}/docs/playground/index.asciidoc[]
|
||||
|
||||
include::{kibana-root}/docs/apm/index.asciidoc[]
|
||||
|
||||
include::{kibana-root}/docs/siem/index.asciidoc[]
|
||||
|
|