[[ingest-pipeline-search-inference]]
=== Inference processing

When you create an index through the **Content** UI, a set of default ingest pipelines is also created, including an ML inference pipeline.
The <<ingest-pipeline-search-details-specific-ml-reference, ML inference pipeline>> uses inference processors to analyze fields and enrich documents with the output.
Inference processors use trained ML models, so to use this feature you need a built-in model or a {ml-docs}/ml-nlp-deploy-models.html[trained model deployed in your cluster^].

This guide focuses on the ML inference pipeline: what it does, how to use it, and how to manage it.

[IMPORTANT]
====
This feature is not available at all Elastic subscription levels.
Refer to the Elastic subscriptions pages for https://www.elastic.co/subscriptions/cloud[Elastic Cloud^] and https://www.elastic.co/subscriptions[self-managed^] deployments.
====

[discrete#ingest-pipeline-search-inference-nlp-use-cases]
==== NLP use cases

{ml-docs}/ml-nlp-overview.html[Natural Language Processing (NLP)^] allows developers to create rich search experiences that go beyond the standards of lexical search.
Here are a few examples of ways NLP models can improve search experiences:

[discrete#ingest-pipeline-search-inference-elser]
==== ELSER text expansion

Using Elastic's {ml-docs}/ml-nlp-elser.html[ELSER machine learning model^], you can easily incorporate text expansion into your queries.
This works by using ELSER to add semantic enrichments to your documents at ingest time, combined with <<search-application-overview, Elastic Search Application templates>> to provide automated text expansion at query time.

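For example, once ELSER enrichments are indexed, a `text_expansion` query can apply the same model at search time. The following is a minimal sketch, assuming the ELSER v2 model and a hypothetical index and output field (`my-index`, `ml.inference.body_expanded`):

[source,console]
----
GET my-index/_search
{
  "query": {
    "text_expansion": {
      "ml.inference.body_expanded.predicted_value": {
        "model_id": ".elser_model_2",
        "model_text": "How do I reset my password?"
      }
    }
  }
}
----
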
[discrete#ingest-pipeline-search-inference-ner]
==== Named entity recognition (NER)

Most commonly used to detect entities such as people, places, and organizations in text, {ml-docs}/ml-nlp-extract-info.html#ml-nlp-ner[NER^] can be used to extract key details from text and group results based on that information.
A sports news media site could use NER to automatically extract the names of professional athletes, stadiums, and sports teams in its articles and link to season stats or schedules.

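As a sketch of how such a model could be wired into a pipeline, the following adds an inference processor that runs a deployed NER model against a `body` field. The pipeline name, model ID, and field names here are illustrative, not defaults created by the Content UI:

[source,console]
----
PUT _ingest/pipeline/my-ner-pipeline
{
  "processors": [
    {
      "inference": {
        "model_id": "elastic__distilbert-base-uncased-finetuned-conll03-english",
        "target_field": "ml.inference.ner",
        "field_map": {
          "body": "text_field"
        }
      }
    }
  ]
}
----
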
[discrete#ingest-pipeline-search-inference-text-classification]
==== Text classification

{ml-docs}/ml-nlp-classify-text.html#ml-nlp-text-classification[Text classification^] is commonly used for sentiment analysis and can be used for similar tasks, such as labeling content as containing hate speech in public forums, or triaging and labeling support tickets so they reach the correct level of escalation automatically.

[discrete#ingest-pipeline-search-inference-text-embedding]
==== Text embedding

Analyzing a text field using a {ml-docs}/ml-nlp-search-compare.html#ml-nlp-text-embedding[Text embedding^] model generates a <<dense-vector, dense vector>> representation of the text.
This array of numeric values encodes the semantic _meaning_ of the text.
Using the same model on a user's search query produces a vector that can be used to search, ranking results by vector similarity (semantic similarity) rather than traditional word or text similarity.

A common use case is a user searching FAQs, or a support agent searching a knowledge base, where semantically similar content may be indexed with little similarity in phrasing.

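At query time, a kNN search can embed the query text with the same model. A minimal sketch, assuming a deployed text embedding model and hypothetical index, field, and model names:

[source,console]
----
GET my-index/_search
{
  "knn": {
    "field": "ml.inference.body_embedding.predicted_value",
    "k": 10,
    "num_candidates": 100,
    "query_vector_builder": {
      "text_embedding": {
        "model_id": "sentence-transformers__msmarco-minilm-l-12-v3",
        "model_text": "how to renew a subscription"
      }
    }
  }
}
----
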
[discrete#ingest-pipeline-search-inference-nlp-in-enterprise-search]
==== NLP in the Content UI

[discrete#ingest-pipeline-search-inference-overview]
===== Overview of ML inference pipeline

The diagram below shows how documents are processed during ingestion.

// Original diagram: https://whimsical.com/ml-in-enterprise-search-ErCetPqrcCPu2QYHvAwrgP@2bsEvpTYSt1Hiuq6UBf68tUWvFiXdzLt6ao
image::images/ingest/document-enrichment-diagram.png["ML inference pipeline diagram"]

* Documents are processed by the `my-index-0001` pipeline, which happens automatically when indexing through an Elastic connector or crawler.
* The `_run_ml_inference` field is set to `true` to ensure the ML inference pipeline (`my-index-0001@ml-inference`) is executed.
This field is removed during the ingestion process.
* The inference processor analyzes the `message` field on the document using the `my-positivity-model-id` trained model.
The inference output is stored in the `ml.inference.positivity_prediction` field.
* The resulting enriched document is then indexed into the `my-index-0001` index.
* The `ml.inference.positivity_prediction` field can now be used at query time to search for documents above or below a certain threshold.

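To make the flow concrete, here is a minimal sketch of an indexing request that would trigger this pipeline chain, using the example names from the diagram:

[source,console]
----
POST my-index-0001/_doc?pipeline=my-index-0001
{
  "message": "The new dashboard is fantastic!",
  "_run_ml_inference": true
}
----
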
[discrete#ingest-pipeline-search-inference-find-deploy-manage-trained-models]
==== Find, deploy, and manage trained models

This feature is intended to make it easier to use your trained ML models.
First, determine which model works best for your data.
Make sure to use a {ml-docs}/ml-nlp-model-ref.html[compatible third party NLP model^].
Since these are publicly available, it is not possible to fine-tune models before {ml-docs}/ml-nlp-deploy-models.html[deploying them^].

Trained models must be available in the current {kibana-ref}/xpack-spaces.html[Kibana Space^] and running in order to use them.
By default, models should be available in all Kibana Spaces that have the *Analytics* > *Machine Learning* feature enabled.
To manage your trained models, use the Kibana UI and navigate to *Stack Management -> Machine Learning -> Trained Models*.
Spaces can be controlled in the **spaces** column.
To stop or start a model, go to the *Machine Learning* tab in the *Analytics* menu of Kibana and click *Trained Models* in the *Model Management* section.

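Model deployments can also be started and stopped with the trained models APIs. A brief sketch, using a hypothetical model ID:

[source,console]
----
POST _ml/trained_models/my-positivity-model-id/deployment/_start

POST _ml/trained_models/my-positivity-model-id/deployment/_stop
----
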
[NOTE]
=========================
The `monitor_ml` <<security-privileges, Elasticsearch cluster privilege>> is required to manage ML models and ML inference pipelines that use those models.
=========================

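For example, a role granting this privilege could be sketched as follows; the role name is illustrative, and real roles will typically combine this with index and Kibana privileges:

[source,console]
----
PUT _security/role/ml_inference_manager
{
  "cluster": [ "monitor_ml" ]
}
----
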
[discrete#ingest-pipeline-search-inference-add-inference-processors]
===== Add inference processors to your ML inference pipeline

To create the index-specific ML inference pipeline, go to *Search -> Content -> Indices -> <your index> -> Pipelines* in the Kibana UI.

If you only see the `ent-search-generic-ingestion` pipeline, you will need to click *Copy and customize* to create index-specific pipelines.
This will create the `{index_name}@ml-inference` pipeline.

Once your index-specific ML inference pipeline is ready, you can add inference processors that use your trained ML models.
To add an inference processor to the ML inference pipeline, click the *Add Inference Pipeline* button in the *Machine Learning Inference Pipelines* card.

[role="screenshot"]
image::images/ingest/document-enrichment-add-inference-pipeline.png["Add Inference Pipeline"]

Here, you'll be able to:

1. Choose a name for your pipeline.
- This name needs to be unique across the whole deployment.
If you want this pipeline to be index-specific, we recommend including the name of your index in the pipeline name.
- If you do not set the pipeline name, a default unique name is provided when you select a trained model.
2. Select the trained ML model you want to use.
- The model must be deployed before you can select it.
To begin deployment of a model, click the *Deploy* button.
3. Select one or more source fields as input for the inference processor.
- If there are no source fields available, your index will need a <<mapping, field mapping>>.
4. (Optional) Choose a name for your target field(s).
This is where the output of the inference model will be stored.
Changing the default name is only possible if you have a single source field selected.
5. Add the source-target field mapping to the configuration by clicking the *Add* button.
6. Repeat steps 3-5 for each field mapping you want to add.
7. (Optional) Test the pipeline with a sample document.
8. (Optional) Review the pipeline definition before creating it with the *Create pipeline* button.

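Each field mapping you configure becomes an inference processor in an ingest pipeline that your ML inference pipeline references. As a rough sketch of what such a processor can look like, with an illustrative pipeline name, model ID, and field names:

[source,console]
----
PUT _ingest/pipeline/my-positivity-pipeline
{
  "processors": [
    {
      "inference": {
        "model_id": "my-positivity-model-id",
        "target_field": "ml.inference.positivity_prediction",
        "field_map": {
          "message": "text_field"
        }
      }
    }
  ]
}
----
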
[discrete#ingest-pipeline-search-inference-manage-inference-processors]
===== Manage and delete inference processors from your ML inference pipeline

Inference processors added to your index-specific ML inference pipelines are standard Elasticsearch ingest pipelines.
Once created, each processor will have options to *View in Stack Management* and *Delete Pipeline*.
Deleting an inference processor from within the *Content* UI deletes the pipeline and also removes its reference from your index-specific ML inference pipeline.

These pipelines can also be viewed, edited, and deleted in Kibana via *Stack Management -> Ingest Pipelines*, just like all other Elasticsearch ingest pipelines.
You may also use the <<ingest-apis,Ingest pipeline APIs>>.
If you delete any of these pipelines outside of the *Content* UI in Kibana, make sure to edit the ML inference pipelines that reference them.

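For example, you could inspect or remove one of these pipelines directly through the ingest APIs; the pipeline name here is hypothetical:

[source,console]
----
GET _ingest/pipeline/my-positivity-pipeline

DELETE _ingest/pipeline/my-positivity-pipeline
----
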
[discrete#ingest-pipeline-search-inference-test-inference-pipeline]
==== Test your ML inference pipeline

While creating the {ml} inference pipeline, you can use the *Test* tab to verify the expected structure of the inference output before indexing any documents.
Provide a sample document, click *Simulate*, and look for the `ml.inference` object in the results.

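The same check can be performed through the simulate pipeline API. A minimal sketch, with a hypothetical pipeline name and sample document:

[source,console]
----
POST _ingest/pipeline/my-index@ml-inference/_simulate
{
  "docs": [
    {
      "_source": {
        "message": "The new dashboard is fantastic!"
      }
    }
  ]
}
----
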
To ensure the ML inference pipeline runs when you ingest documents, the documents must include a field named `_run_ml_inference` set to `true`, and the request must set the pipeline to `{index_name}`.
For connector and crawler indices, this will happen automatically if you've configured the settings appropriately for the pipeline name `{index_name}`.
To manage these settings:

1. Go to *Search > Content > Indices > <your index> > Pipelines*.
2. Click on the *Settings* link in the *Ingest Pipelines* card for the `{index_name}` pipeline.
3. Ensure *ML inference pipelines* is selected.
If it is not, select it and save the changes.

[discrete#ingest-pipeline-search-inference-learn-more]
==== Learn more

* See <<ingest-pipeline-search-in-enterprise-search>> for information on the various pipelines that are created.
* Learn about {ml-docs}/ml-nlp-elser.html[ELSER^], Elastic's proprietary retrieval model for semantic search with sparse vectors.
* https://huggingface.co/models?library=pytorch&pipeline_tag=token-classification&sort=downloads[NER HuggingFace Models^]
* https://huggingface.co/models?library=pytorch&pipeline_tag=text-classification&sort=downloads[Text Classification HuggingFace Models^]
* https://huggingface.co/models?library=pytorch&pipeline_tag=sentence-similarity&sort=downloads[Text Embedding HuggingFace Models^]