mirror of https://github.com/elastic/elasticsearch.git
synced 2025-06-28 17:34:17 -04:00

Implement synthetic source support for annotated text field (#107735)

This PR adds synthetic source support for annotated_text fields. The existing implementation for text fields is reused, including test infrastructure, so the majority of the change is moving code and making it accessible. Contributes to #106460, #78744.

parent 4ef8b3825e
commit e1d902d33b

16 changed files with 824 additions and 300 deletions
docs/changelog/107735.yaml (Normal file, 5 lines)
@ -0,0 +1,5 @@
pr: 107735
summary: Implement synthetic source support for annotated text field
area: Mapping
type: feature
issues: []
@ -6,7 +6,7 @@ experimental[]
The mapper-annotated-text plugin provides the ability to index text that is a
combination of free-text and special markup that is typically used to identify
items of interest such as people or organisations (see NER or Named Entity Recognition
tools).

The elasticsearch markup allows one or more additional tokens to be injected, unchanged, into the token
@ -18,7 +18,7 @@ include::install_remove.asciidoc[]
[[mapper-annotated-text-usage]]
==== Using the `annotated-text` field

The `annotated-text` field tokenizes text content as per the more common {ref}/text.html[`text`] field (see
"limitations" below) but also injects any marked-up annotation tokens directly into
the search index:

@ -49,7 +49,7 @@ in the search index:
--------------------------
GET my-index-000001/_analyze
{
  "field": "my_field",
  "text": "Investors in [Apple](Apple+Inc.) rejoiced."
}
--------------------------
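The markup above is markdown-link-like: the square brackets hold the visible text and the parentheses hold a URL-encoded annotation value, which is why `Apple+Inc.` surfaces as the token `Apple Inc.` in the analysis below. As an illustration only (this is a sketch of the markup semantics, not the plugin's actual parser):

```python
import re
from urllib.parse import unquote_plus

# Markdown-link-like annotation: [visible text](url-encoded value)
ANNOTATION = re.compile(r"\[([^\]]+)\]\(([^)]+)\)")

def extract_annotations(text):
    """Return (plain_text, [(visible_text, decoded_value), ...])."""
    pairs = []

    def strip(match):
        visible, raw = match.group(1), match.group(2)
        pairs.append((visible, unquote_plus(raw)))  # '+' decodes to a space
        return visible  # keep only the visible text in the plain output

    return ANNOTATION.sub(strip, text), pairs

plain, anns = extract_annotations("Investors in [Apple](Apple+Inc.) rejoiced.")
print(plain)  # Investors in Apple rejoiced.
print(anns)   # [('Apple', 'Apple Inc.')]
```

The decoded value is what gets injected, untokenized, into the token stream alongside the tokens from the visible text.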
@ -76,7 +76,7 @@ Response:
      "position": 1
    },
    {
      "token": "Apple Inc.", <1>
      "start_offset": 13,
      "end_offset": 18,
      "type": "annotation",
@ -106,7 +106,7 @@ the token stream and at the same position (position 2) as the text token (`apple


We can now perform searches for annotations using regular `term` queries that don't tokenize
the provided search values. Annotations are a more precise way of matching as can be seen
in this example where a search for `Beck` will not match `Jeff Beck`:

[source,console]
@ -133,18 +133,119 @@ GET my-index-000001/_search
}
--------------------------

<1> As well as tokenising the plain text into single words e.g. `beck`, here we
inject the single token value `Beck` at the same position as `beck` in the token stream.
<2> Note annotations can inject multiple tokens at the same position - here we inject both
the very specific value `Jeff Beck` and the broader term `Guitarist`. This enables
broader positional queries e.g. finding mentions of a `Guitarist` near to `strat`.
<3> A benefit of searching with these carefully defined annotation tokens is that a query for
`Beck` will not match document 2 that contains the tokens `jeff`, `beck` and `Jeff Beck`.

WARNING: Any use of `=` signs in annotation values e.g. `[Prince](person=Prince)` will
cause the document to be rejected with a parse failure. In future we hope to have a use for
the equals signs so will actively reject documents that contain this today.

+[[annotated-text-synthetic-source]]
+===== Synthetic `_source`
+
+IMPORTANT: Synthetic `_source` is Generally Available only for TSDB indices
+(indices that have `index.mode` set to `time_series`). For other indices
+synthetic `_source` is in technical preview. Features in technical preview may
+be changed or removed in a future release. Elastic will work to fix
+any issues, but features in technical preview are not subject to the support SLA
+of official GA features.
+
+`annotated_text` fields support {ref}/mapping-source-field.html#synthetic-source[synthetic `_source`] if they have
+a {ref}/keyword.html#keyword-synthetic-source[`keyword`] sub-field that supports synthetic
+`_source` or if the `annotated_text` field sets `store` to `true`. Either way, it may
+not have {ref}/copy-to.html[`copy_to`].
+
+If using a sub-`keyword` field then the values are sorted in the same way as
+a `keyword` field's values are sorted. By default, that means sorted with
+duplicates removed. So:
+[source,console,id=synthetic-source-text-example-default]
+----
+PUT idx
+{
+  "mappings": {
+    "_source": { "mode": "synthetic" },
+    "properties": {
+      "text": {
+        "type": "annotated_text",
+        "fields": {
+          "raw": {
+            "type": "keyword"
+          }
+        }
+      }
+    }
+  }
+}
+PUT idx/_doc/1
+{
+  "text": [
+    "the quick brown fox",
+    "the quick brown fox",
+    "jumped over the lazy dog"
+  ]
+}
+----
+// TEST[s/$/\nGET idx\/_doc\/1?filter_path=_source\n/]
+
+Will become:
+[source,console-result]
+----
+{
+  "text": [
+    "jumped over the lazy dog",
+    "the quick brown fox"
+  ]
+}
+----
+// TEST[s/^/{"_source":/ s/\n$/}/]
+
+NOTE: Reordering text fields can have an effect on {ref}/query-dsl-match-query-phrase.html[phrase]
+and {ref}/span-queries.html[span] queries. See the discussion about {ref}/position-increment-gap.html[`position_increment_gap`] for more detail. You
+can avoid this by making sure the `slop` parameter on the phrase queries
+is lower than the `position_increment_gap`. This is the default.
+
+If the `annotated_text` field sets `store` to true then order and duplicates
+are preserved.
+[source,console,id=synthetic-source-text-example-stored]
+----
+PUT idx
+{
+  "mappings": {
+    "_source": { "mode": "synthetic" },
+    "properties": {
+      "text": { "type": "annotated_text", "store": true }
+    }
+  }
+}
+PUT idx/_doc/1
+{
+  "text": [
+    "the quick brown fox",
+    "the quick brown fox",
+    "jumped over the lazy dog"
+  ]
+}
+----
+// TEST[s/$/\nGET idx\/_doc\/1?filter_path=_source\n/]
+
+Will become:
+[source,console-result]
+----
+{
+  "text": [
+    "the quick brown fox",
+    "the quick brown fox",
+    "jumped over the lazy dog"
+  ]
+}
+----
+// TEST[s/^/{"_source":/ s/\n$/}/]
+

[[mapper-annotated-text-tips]]
==== Data modelling tips
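The reconstruction behaviour shown above, when a `keyword` sub-field backs synthetic `_source`, can be modelled in one line. A sketch for illustration, not the actual loader:

```python
def synthesize_from_keyword(values):
    # keyword doc values are kept sorted with duplicates removed, so a
    # synthetic _source rebuilt from them loses order and duplicates
    return sorted(set(values))

doc = ["the quick brown fox", "the quick brown fox", "jumped over the lazy dog"]
print(synthesize_from_keyword(doc))
# ['jumped over the lazy dog', 'the quick brown fox']
```

This is why the stored-field variant exists: when the exact order and duplicates of the original values matter, set `store` to `true` instead of relying on the sub-field.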
@ -153,13 +254,13 @@ the equals signs so will actively reject documents that contain this today.
Annotations are normally a way of weaving structured information into unstructured text for
higher-precision search.

`Entity resolution` is a form of document enrichment undertaken by specialist software or people
where references to entities in a document are disambiguated by attaching a canonical ID.
The ID is used to resolve any number of aliases or distinguish between people with the
same name. The hyperlinks connecting Wikipedia's articles are a good example of resolved
entity IDs woven into text.

These IDs can be embedded as annotations in an annotated_text field but it often makes
sense to include them in dedicated structured fields to support discovery via aggregations:

[source,console]
@ -214,20 +315,20 @@ GET my-index-000001/_search
--------------------------

<1> Note the `my_twitter_handles` field contains a list of the annotation values
also used in the unstructured text. (Note the annotated_text syntax requires escaping.)
By repeating the annotation values in a structured field this application has ensured that
the tokens discovered in the structured field can be used for search and highlighting
in the unstructured field.
<2> In this example we search for documents that talk about components of the elastic stack.
<3> We use the `my_twitter_handles` field here to discover people who are significantly
associated with the elastic stack.

===== Avoiding over-matching annotations
By design, the regular text tokens and the annotation tokens co-exist in the same indexed
field but in rare cases this can lead to some over-matching.

The value of an annotation often denotes a _named entity_ (a person, place or company).
The tokens for these named entities are inserted untokenized, and differ from typical text
tokens because they are normally:

* Mixed case e.g. `Madonna`
@ -235,19 +336,19 @@ tokens because they are normally:
* Can have punctuation or numbers e.g. `Apple Inc.` or `@kimchy`

This means, for the most part, a search for a named entity in the annotated text field will
not have any false positives e.g. when selecting `Apple Inc.` from an aggregation result
you can drill down to highlight uses in the text without "over matching" on any text tokens
like the word `apple` in this context:

    the apple was very juicy

However, a problem arises if your named entity happens to be a single term and lower-case e.g. the
company `elastic`. In this case, a search on the annotated text field for the token `elastic`
may match a text document such as this:

    they fired an elastic band

To avoid such false matches users should consider prefixing annotation values to ensure
they don't clash with text tokens e.g.

    [elastic](Company_elastic) released version 7.0 of the elastic stack today
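A trivial helper makes the prefixing convention concrete. The `Company_` prefix is just the example used above, not something the plugin mandates, and the `=` check mirrors the parse failure described in the earlier WARNING:

```python
def annotate(text, value):
    # The plugin rejects '=' in annotation values with a parse failure,
    # so fail fast here too
    if "=" in value:
        raise ValueError("annotation values may not contain '='")
    return f"[{text}]({value})"

def company(name):
    # Prefixing keeps the annotation token distinct from any lower-cased
    # text tokens the analyzer produces
    return annotate(name, f"Company_{name}")

print(company("elastic"))  # [elastic](Company_elastic)
```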
@ -273,7 +374,7 @@ GET my-index-000001/_search
{
  "query": {
    "query_string": {
      "query": "cats"
    }
  },
  "highlight": {
@ -291,21 +392,21 @@ GET my-index-000001/_search

The annotated highlighter is based on the `unified` highlighter and supports the same
settings but does not use the `pre_tags` or `post_tags` parameters. Rather than using
html-like markup such as `<em>cat</em>` the annotated highlighter uses the same
markdown-like syntax used for annotations and injects a key=value annotation where `_hit_term`
is the key and the matched search term is the value e.g.

    The [cat](_hit_term=cat) sat on the [mat](sku3578)

The annotated highlighter tries to be respectful of any existing markup in the original
text:

* If the search term matches exactly the location of an existing annotation then the
`_hit_term` key is merged into the url-like syntax used in the `(...)` part of the
existing annotation.
* However, if the search term overlaps the span of an existing annotation it would break
the markup formatting so the original annotation is removed in favour of a new annotation
with just the search hit information in the results.
* Any non-overlapping annotations in the original text are preserved in highlighter
selections.

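The merge rule in the first bullet can be pictured as appending one more key=value entry to an existing annotation's `(...)` parameters. The `&` separator below is an assumption for illustration, not taken from the plugin documentation above:

```python
def merge_hit_term(params, term):
    # Fold the highlighter's _hit_term entry into an existing annotation's
    # url-like parameter list ('&' separator assumed, not documented above)
    hit = f"_hit_term={term}"
    return f"{params}&{hit}" if params else hit

print(merge_hit_term("sku3578", "mat"))  # sku3578&_hit_term=mat
print(merge_hit_term("", "cat"))         # _hit_term=cat
```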
@ -41,6 +41,7 @@ There are a couple of restrictions to be aware of:
types:

** <<aggregate-metric-double-synthetic-source, `aggregate_metric_double`>>
+** {plugins}/mapper-annotated-text-usage.html#annotated-text-synthetic-source[`annotated-text`]
** <<binary-synthetic-source,`binary`>>
** <<boolean-synthetic-source,`boolean`>>
** <<numeric-synthetic-source,`byte`>>

plugins/mapper-annotated-text/src/main/java/module-info.java (Normal file, 19 lines)
@ -0,0 +1,19 @@
/*
 * Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
 * or more contributor license agreements. Licensed under the Elastic License
 * 2.0 and the Server Side Public License, v 1; you may not use this file except
 * in compliance with, at your election, the Elastic License 2.0 or the Server
 * Side Public License, v 1.
 */

module org.elasticsearch.index.mapper.annotatedtext {
    requires org.elasticsearch.base;
    requires org.elasticsearch.server;
    requires org.elasticsearch.xcontent;
    requires org.apache.lucene.core;
    requires org.apache.lucene.highlighter;

    // exports nothing

    provides org.elasticsearch.features.FeatureSpecification with org.elasticsearch.index.mapper.annotatedtext.Features;
}
@ -21,17 +21,22 @@ import org.apache.lucene.document.Field;
import org.apache.lucene.document.FieldType;
import org.apache.lucene.index.IndexOptions;
import org.elasticsearch.ElasticsearchParseException;
+import org.elasticsearch.features.NodeFeature;
import org.elasticsearch.index.IndexVersion;
import org.elasticsearch.index.analysis.AnalyzerScope;
import org.elasticsearch.index.analysis.IndexAnalyzers;
import org.elasticsearch.index.analysis.NamedAnalyzer;
import org.elasticsearch.index.mapper.DocumentParserContext;
import org.elasticsearch.index.mapper.FieldMapper;
+import org.elasticsearch.index.mapper.KeywordFieldMapper;
import org.elasticsearch.index.mapper.MapperBuilderContext;
+import org.elasticsearch.index.mapper.SourceLoader;
+import org.elasticsearch.index.mapper.StringStoredFieldFieldLoader;
import org.elasticsearch.index.mapper.TextFieldMapper;
import org.elasticsearch.index.mapper.TextParams;
import org.elasticsearch.index.mapper.TextSearchInfo;
import org.elasticsearch.index.similarity.SimilarityProvider;
+import org.elasticsearch.xcontent.XContentBuilder;

import java.io.IOException;
import java.io.Reader;
@ -41,6 +46,7 @@ import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
+import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
@ -58,6 +64,8 @@ import java.util.regex.Pattern;
 **/
public class AnnotatedTextFieldMapper extends FieldMapper {

+    public static final NodeFeature SYNTHETIC_SOURCE_SUPPORT = new NodeFeature("mapper.annotated_text.synthetic_source");
+
    public static final String CONTENT_TYPE = "annotated_text";

    private static Builder builder(FieldMapper in) {
@ -114,7 +122,7 @@ public class AnnotatedTextFieldMapper extends FieldMapper {
            meta };
        }

-        private AnnotatedTextFieldType buildFieldType(FieldType fieldType, MapperBuilderContext context) {
+        private AnnotatedTextFieldType buildFieldType(FieldType fieldType, MapperBuilderContext context, MultiFields multiFields) {
            TextSearchInfo tsi = new TextSearchInfo(
                fieldType,
                similarity.get(),
@ -126,12 +134,14 @@ public class AnnotatedTextFieldMapper extends FieldMapper {
                store.getValue(),
                tsi,
                context.isSourceSynthetic(),
+               TextFieldMapper.SyntheticSourceHelper.syntheticSourceDelegate(fieldType, multiFields),
                meta.getValue()
            );
        }

        @Override
        public AnnotatedTextFieldMapper build(MapperBuilderContext context) {
+           MultiFields multiFields = multiFieldsBuilder.build(this, context);
            FieldType fieldType = TextParams.buildFieldType(() -> true, store, indexOptions, norms, termVectors);
            if (fieldType.indexOptions() == IndexOptions.NONE) {
                throw new IllegalArgumentException("[" + CONTENT_TYPE + "] fields must be indexed");
@ -146,8 +156,8 @@ public class AnnotatedTextFieldMapper extends FieldMapper {
            return new AnnotatedTextFieldMapper(
                name(),
                fieldType,
-               buildFieldType(fieldType, context),
-               multiFieldsBuilder.build(this, context),
+               buildFieldType(fieldType, context, multiFields),
+               multiFields,
                copyTo,
                this
            );
@ -472,15 +482,15 @@ public class AnnotatedTextFieldMapper extends FieldMapper {
    }

    public static final class AnnotatedTextFieldType extends TextFieldMapper.TextFieldType {

        private AnnotatedTextFieldType(
            String name,
            boolean store,
            TextSearchInfo tsi,
            boolean isSyntheticSource,
+           KeywordFieldMapper.KeywordFieldType syntheticSourceDelegate,
            Map<String, String> meta
        ) {
-           super(name, true, store, tsi, isSyntheticSource, null, meta, false, false);
+           super(name, true, store, tsi, isSyntheticSource, syntheticSourceDelegate, meta, false, false);
        }

        public AnnotatedTextFieldType(String name, Map<String, String> meta) {
@ -544,4 +554,36 @@ public class AnnotatedTextFieldMapper extends FieldMapper {
    public FieldMapper.Builder getMergeBuilder() {
        return new Builder(simpleName(), builder.indexCreatedVersion, builder.analyzers.indexAnalyzers).init(this);
    }

+    @Override
+    public SourceLoader.SyntheticFieldLoader syntheticFieldLoader() {
+        if (copyTo.copyToFields().isEmpty() != true) {
+            throw new IllegalArgumentException(
+                "field [" + name() + "] of type [" + typeName() + "] doesn't support synthetic source because it declares copy_to"
+            );
+        }
+        if (fieldType.stored()) {
+            return new StringStoredFieldFieldLoader(name(), simpleName(), null) {
+                @Override
+                protected void write(XContentBuilder b, Object value) throws IOException {
+                    b.value((String) value);
+                }
+            };
+        }
+
+        var kwd = TextFieldMapper.SyntheticSourceHelper.getKeywordFieldMapperForSyntheticSource(this);
+        if (kwd != null) {
+            return kwd.syntheticFieldLoader(simpleName());
+        }
+
+        throw new IllegalArgumentException(
+            String.format(
+                Locale.ROOT,
+                "field [%s] of type [%s] doesn't support synthetic source unless it is stored or has a sub-field of"
+                    + " type [keyword] with doc values or stored and without a normalizer",
+                name(),
+                typeName()
+            )
+        );
+    }
}
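The selection order in `syntheticFieldLoader()` above is: reject `copy_to`, prefer the stored field, fall back to a qualifying `keyword` sub-field, otherwise fail. A Python paraphrase of that control flow (names here are illustrative, not the Java API):

```python
def pick_loader(copy_to, stored, keyword_delegate):
    # Mirrors the fallback chain in syntheticFieldLoader() (sketch only)
    if copy_to:
        raise ValueError("synthetic source doesn't support copy_to")
    if stored:
        return "stored-field-loader"      # read values back from the stored field
    if keyword_delegate is not None:
        return "keyword-delegate-loader"  # reuse the sub-field's doc values
    raise ValueError("store the field or add a keyword sub-field "
                     "with doc values and no normalizer")

print(pick_loader(copy_to=[], stored=True, keyword_delegate=None))
# stored-field-loader
```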
@ -0,0 +1,26 @@
/*
 * Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
 * or more contributor license agreements. Licensed under the Elastic License
 * 2.0 and the Server Side Public License, v 1; you may not use this file except
 * in compliance with, at your election, the Elastic License 2.0 or the Server
 * Side Public License, v 1.
 */

package org.elasticsearch.index.mapper.annotatedtext;

import org.elasticsearch.features.FeatureSpecification;
import org.elasticsearch.features.NodeFeature;

import java.util.Set;

/**
 * Provides features for annotated text mapper.
 */
public class Features implements FeatureSpecification {
    @Override
    public Set<NodeFeature> getFeatures() {
        return Set.of(
            AnnotatedTextFieldMapper.SYNTHETIC_SOURCE_SUPPORT // Added in 8.15
        );
    }
}
@ -0,0 +1,9 @@
#
# Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
# or more contributor license agreements. Licensed under the Elastic License
# 2.0 and the Server Side Public License, v 1; you may not use this file except
# in compliance with, at your election, the Elastic License 2.0 or the Server
# Side Public License, v 1.
#

org.elasticsearch.index.mapper.annotatedtext.Features
@ -14,6 +14,7 @@ import org.apache.lucene.analysis.core.KeywordAnalyzer;
import org.apache.lucene.analysis.core.WhitespaceAnalyzer;
import org.apache.lucene.analysis.en.EnglishAnalyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
+import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.DocValuesType;
import org.apache.lucene.index.IndexOptions;
import org.apache.lucene.index.IndexableField;
@ -29,6 +30,7 @@ import org.elasticsearch.index.analysis.AnalyzerScope;
import org.elasticsearch.index.analysis.CharFilterFactory;
import org.elasticsearch.index.analysis.CustomAnalyzer;
import org.elasticsearch.index.analysis.IndexAnalyzers;
+import org.elasticsearch.index.analysis.LowercaseNormalizer;
import org.elasticsearch.index.analysis.NamedAnalyzer;
import org.elasticsearch.index.analysis.StandardTokenizerFactory;
import org.elasticsearch.index.analysis.TokenFilterFactory;
@ -38,6 +40,7 @@ import org.elasticsearch.index.mapper.MapperParsingException;
import org.elasticsearch.index.mapper.MapperService;
import org.elasticsearch.index.mapper.MapperTestCase;
import org.elasticsearch.index.mapper.ParsedDocument;
+import org.elasticsearch.index.mapper.TextFieldFamilySyntheticSourceTestSetup;
import org.elasticsearch.index.mapper.TextFieldMapper;
import org.elasticsearch.plugins.Plugin;
import org.elasticsearch.xcontent.ToXContent;
@ -54,6 +57,7 @@ import java.util.HashSet;
|
||||||
import java.util.List;
|
import java.util.List;
|
||||||
import java.util.Map;
|
import java.util.Map;
|
||||||
import java.util.Set;
|
import java.util.Set;
|
||||||
|
import java.util.function.Function;
|
||||||
|
|
||||||
import static org.hamcrest.Matchers.containsString;
|
import static org.hamcrest.Matchers.containsString;
|
||||||
import static org.hamcrest.Matchers.equalTo;
|
import static org.hamcrest.Matchers.equalTo;
|
||||||
|
@ -144,7 +148,8 @@ public class AnnotatedTextFieldMapperTests extends MapperTestCase {
|
||||||
)
|
)
|
||||||
);
|
);
|
||||||
return IndexAnalyzers.of(
|
return IndexAnalyzers.of(
|
||||||
Map.of("default", dflt, "standard", standard, "keyword", keyword, "whitespace", whitespace, "my_stop_analyzer", stop)
|
Map.of("default", dflt, "standard", standard, "keyword", keyword, "whitespace", whitespace, "my_stop_analyzer", stop),
|
||||||
|
Map.of("lowercase", new NamedAnalyzer("lowercase", AnalyzerScope.INDEX, new LowercaseNormalizer()))
|
||||||
);
|
);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
@ -595,7 +600,23 @@ public class AnnotatedTextFieldMapperTests extends MapperTestCase {
|
||||||
|
|
||||||
@Override
|
@Override
|
||||||
protected SyntheticSourceSupport syntheticSourceSupport(boolean ignoreMalformed) {
|
protected SyntheticSourceSupport syntheticSourceSupport(boolean ignoreMalformed) {
|
||||||
throw new AssumptionViolatedException("not supported");
|
assumeFalse("ignore_malformed not supported", ignoreMalformed);
|
||||||
|
return TextFieldFamilySyntheticSourceTestSetup.syntheticSourceSupport("annotated_text", false);
|
||||||
|
}
|
||||||
|
|
||||||
|
@Override
|
||||||
|
protected BlockReaderSupport getSupportedReaders(MapperService mapper, String loaderFieldName) {
|
||||||
|
return TextFieldFamilySyntheticSourceTestSetup.getSupportedReaders(mapper, loaderFieldName);
|
||||||
|
}
|
||||||
|
|
||||||
|
@Override
|
||||||
|
protected Function<Object, Object> loadBlockExpected(BlockReaderSupport blockReaderSupport, boolean columnReader) {
|
||||||
|
return TextFieldFamilySyntheticSourceTestSetup.loadBlockExpected(blockReaderSupport, columnReader);
|
||||||
|
}
|
||||||
|
|
||||||
|
@Override
|
||||||
|
protected void validateRoundTripReader(String syntheticSource, DirectoryReader reader, DirectoryReader roundTripReader) {
|
||||||
|
TextFieldFamilySyntheticSourceTestSetup.validateRoundTripReader(syntheticSource, reader, roundTripReader);
|
||||||
}
|
}
|
||||||
|
|
||||||
@Override
|
@Override
|
||||||
|
|
|
@@ -0,0 +1,197 @@
+---
+setup:
+  - requires:
+      cluster_features: ["mapper.annotated_text.synthetic_source"]
+      reason: introduced in 8.15.0
+
+---
+stored annotated_text field:
+  - do:
+      indices.create:
+        index: test
+        body:
+          mappings:
+            _source:
+              mode: synthetic
+            properties:
+              annotated_text:
+                type: annotated_text
+                store: true
+
+  - do:
+      index:
+        index: test
+        id: 1
+        refresh: true
+        body:
+          annotated_text: the quick brown fox
+
+  - do:
+      search:
+        index: test
+
+  - match:
+      hits.hits.0._source:
+        annotated_text: the quick brown fox
+
+---
+annotated_text field with keyword multi-field:
+  - do:
+      indices.create:
+        index: test
+        body:
+          mappings:
+            _source:
+              mode: synthetic
+            properties:
+              annotated_text:
+                type: annotated_text
+                fields:
+                  keyword:
+                    type: keyword
+
+  - do:
+      index:
+        index: test
+        id: 1
+        refresh: true
+        body:
+          annotated_text: the quick brown fox
+
+  - do:
+      search:
+        index: test
+
+  - match:
+      hits.hits.0._source:
+        annotated_text: the quick brown fox
+
+---
+multiple values in stored annotated_text field:
+  - do:
+      indices.create:
+        index: test
+        body:
+          mappings:
+            _source:
+              mode: synthetic
+            properties:
+              annotated_text:
+                type: annotated_text
+                store: true
+
+  - do:
+      index:
+        index: test
+        id: 1
+        refresh: true
+        body:
+          annotated_text: ["world", "hello", "world"]
+
+  - do:
+      search:
+        index: test
+
+  - match:
+      hits.hits.0._source:
+        annotated_text: ["world", "hello", "world"]
+
+---
+multiple values in annotated_text field with keyword multi-field:
+  - do:
+      indices.create:
+        index: test
+        body:
+          mappings:
+            _source:
+              mode: synthetic
+            properties:
+              annotated_text:
+                type: annotated_text
+                fields:
+                  keyword:
+                    type: keyword
+
+  - do:
+      index:
+        index: test
+        id: 1
+        refresh: true
+        body:
+          annotated_text: ["world", "hello", "world"]
+
+  - do:
+      search:
+        index: test
+
+  - match:
+      hits.hits.0._source:
+        annotated_text: ["hello", "world"]
+
+
+---
+multiple values in annotated_text field with stored keyword multi-field:
+  - do:
+      indices.create:
+        index: test
+        body:
+          mappings:
+            _source:
+              mode: synthetic
+            properties:
+              annotated_text:
+                type: annotated_text
+                fields:
+                  keyword:
+                    type: keyword
+                    store: true
+                    doc_values: false
+
+  - do:
+      index:
+        index: test
+        id: 1
+        refresh: true
+        body:
+          annotated_text: ["world", "hello", "world"]
+
+  - do:
+      search:
+        index: test
+
+  - match:
+      hits.hits.0._source:
+        annotated_text: ["world", "hello", "world"]
+
+---
+multiple values in stored annotated_text field with keyword multi-field:
+  - do:
+      indices.create:
+        index: test
+        body:
+          mappings:
+            _source:
+              mode: synthetic
+            properties:
+              annotated_text:
+                type: annotated_text
+                store: true
+                fields:
+                  keyword:
+                    type: keyword
+
+  - do:
+      index:
+        index: test
+        id: 1
+        refresh: true
+        body:
+          annotated_text: ["world", "hello", "world"]
+
+  - do:
+      search:
+        index: test
+
+  - match:
+      hits.hits.0._source:
+        annotated_text: ["world", "hello", "world"]
@@ -1026,7 +1026,7 @@ public final class KeywordFieldMapper extends FieldMapper {
         return syntheticFieldLoader(simpleName());
     }
 
-    SourceLoader.SyntheticFieldLoader syntheticFieldLoader(String simpleName) {
+    public SourceLoader.SyntheticFieldLoader syntheticFieldLoader(String simpleName) {
         if (hasScript()) {
             return SourceLoader.SyntheticFieldLoader.NOTHING;
         }
@@ -390,7 +390,7 @@ public final class TextFieldMapper extends FieldMapper {
                 store.getValue(),
                 tsi,
                 context.isSourceSynthetic(),
-                syntheticSourceDelegate(fieldType, multiFields),
+                SyntheticSourceHelper.syntheticSourceDelegate(fieldType, multiFields),
                 meta.getValue(),
                 eagerGlobalOrdinals.getValue(),
                 indexPhrases.getValue()
@@ -402,17 +402,6 @@ public final class TextFieldMapper extends FieldMapper {
             return ft;
         }
 
-        private static KeywordFieldMapper.KeywordFieldType syntheticSourceDelegate(FieldType fieldType, MultiFields multiFields) {
-            if (fieldType.stored()) {
-                return null;
-            }
-            var kwd = getKeywordFieldMapperForSyntheticSource(multiFields);
-            if (kwd != null) {
-                return kwd.fieldType();
-            }
-            return null;
-        }
-
         private SubFieldInfo buildPrefixInfo(MapperBuilderContext context, FieldType fieldType, TextFieldType tft) {
             if (indexPrefixes.get() == null) {
                 return null;
@@ -1094,7 +1083,7 @@ public final class TextFieldMapper extends FieldMapper {
             return isSyntheticSource;
         }
 
-        KeywordFieldMapper.KeywordFieldType syntheticSourceDelegate() {
+        public KeywordFieldMapper.KeywordFieldType syntheticSourceDelegate() {
             return syntheticSourceDelegate;
         }
     }
@@ -1473,7 +1462,7 @@ public final class TextFieldMapper extends FieldMapper {
            };
        }
 
-        var kwd = getKeywordFieldMapperForSyntheticSource(this);
+        var kwd = SyntheticSourceHelper.getKeywordFieldMapperForSyntheticSource(this);
        if (kwd != null) {
            return kwd.syntheticFieldLoader(simpleName());
        }
@@ -1489,16 +1478,29 @@ public final class TextFieldMapper extends FieldMapper {
        );
    }
 
-    private static KeywordFieldMapper getKeywordFieldMapperForSyntheticSource(Iterable<? extends Mapper> multiFields) {
-        for (Mapper sub : multiFields) {
-            if (sub.typeName().equals(KeywordFieldMapper.CONTENT_TYPE)) {
-                KeywordFieldMapper kwd = (KeywordFieldMapper) sub;
-                if (kwd.hasNormalizer() == false && (kwd.fieldType().hasDocValues() || kwd.fieldType().isStored())) {
-                    return kwd;
-                }
-            }
-        }
-
-        return null;
+    public static class SyntheticSourceHelper {
+        public static KeywordFieldMapper.KeywordFieldType syntheticSourceDelegate(FieldType fieldType, MultiFields multiFields) {
+            if (fieldType.stored()) {
+                return null;
+            }
+            var kwd = getKeywordFieldMapperForSyntheticSource(multiFields);
+            if (kwd != null) {
+                return kwd.fieldType();
+            }
+            return null;
+        }
+
+        public static KeywordFieldMapper getKeywordFieldMapperForSyntheticSource(Iterable<? extends Mapper> multiFields) {
+            for (Mapper sub : multiFields) {
+                if (sub.typeName().equals(KeywordFieldMapper.CONTENT_TYPE)) {
+                    KeywordFieldMapper kwd = (KeywordFieldMapper) sub;
+                    if (kwd.hasNormalizer() == false && (kwd.fieldType().hasDocValues() || kwd.fieldType().isStored())) {
+                        return kwd;
+                    }
+                }
+            }
+
+            return null;
+        }
    }
 }
@@ -25,7 +25,6 @@ import org.elasticsearch.cluster.metadata.IndexMetadata;
 import org.elasticsearch.common.Strings;
 import org.elasticsearch.common.lucene.Lucene;
 import org.elasticsearch.common.settings.Settings;
-import org.elasticsearch.core.Tuple;
 import org.elasticsearch.index.IndexSettings;
 import org.elasticsearch.index.IndexVersion;
 import org.elasticsearch.index.analysis.AnalyzerScope;
@@ -45,14 +44,11 @@ import org.elasticsearch.script.StringFieldScript;
 import org.elasticsearch.xcontent.XContentBuilder;
 
 import java.io.IOException;
-import java.util.ArrayList;
 import java.util.Arrays;
 import java.util.Collection;
-import java.util.HashSet;
 import java.util.List;
 import java.util.Map;
 import java.util.function.Function;
-import java.util.stream.Collectors;
 
 import static java.util.Collections.singletonList;
 import static java.util.Collections.singletonMap;
@@ -658,7 +654,7 @@ public class KeywordFieldMapperTests extends MapperTestCase {
     @Override
     protected SyntheticSourceSupport syntheticSourceSupport(boolean ignoreMalformed) {
         assertFalse("keyword doesn't support ignore_malformed", ignoreMalformed);
-        return new KeywordSyntheticSourceSupport(
+        return new KeywordFieldSyntheticSourceSupport(
             randomBoolean() ? null : between(10, 100),
             randomBoolean(),
             usually() ? null : randomAlphaOfLength(2),
@@ -666,110 +662,6 @@ public class KeywordFieldMapperTests extends MapperTestCase {
         );
     }
 
-    static class KeywordSyntheticSourceSupport implements SyntheticSourceSupport {
-        private final Integer ignoreAbove;
-        private final boolean allIgnored;
-        private final boolean store;
-        private final boolean docValues;
-        private final String nullValue;
-        private final boolean exampleSortsUsingIgnoreAbove;
-
-        KeywordSyntheticSourceSupport(Integer ignoreAbove, boolean store, String nullValue, boolean exampleSortsUsingIgnoreAbove) {
-            this.ignoreAbove = ignoreAbove;
-            this.allIgnored = ignoreAbove != null && rarely();
-            this.store = store;
-            this.nullValue = nullValue;
-            this.exampleSortsUsingIgnoreAbove = exampleSortsUsingIgnoreAbove;
-            this.docValues = store ? randomBoolean() : true;
-        }
-
-        @Override
-        public SyntheticSourceExample example(int maxValues) {
-            return example(maxValues, false);
-        }
-
-        public SyntheticSourceExample example(int maxValues, boolean loadBlockFromSource) {
-            if (randomBoolean()) {
-                Tuple<String, String> v = generateValue();
-                Object loadBlock = v.v2();
-                if (loadBlockFromSource == false && ignoreAbove != null && v.v2().length() > ignoreAbove) {
-                    loadBlock = null;
-                }
-                return new SyntheticSourceExample(v.v1(), v.v2(), loadBlock, this::mapping);
-            }
-            List<Tuple<String, String>> values = randomList(1, maxValues, this::generateValue);
-            List<String> in = values.stream().map(Tuple::v1).toList();
-            List<String> outPrimary = new ArrayList<>();
-            List<String> outExtraValues = new ArrayList<>();
-            values.stream().map(Tuple::v2).forEach(v -> {
-                if (exampleSortsUsingIgnoreAbove && ignoreAbove != null && v.length() > ignoreAbove) {
-                    outExtraValues.add(v);
-                } else {
-                    outPrimary.add(v);
-                }
-            });
-            List<String> outList = store ? outPrimary : new HashSet<>(outPrimary).stream().sorted().collect(Collectors.toList());
-            List<String> loadBlock;
-            if (loadBlockFromSource) {
-                // The block loader infrastructure will never return nulls. Just zap them all.
-                loadBlock = in.stream().filter(m -> m != null).toList();
-            } else if (docValues) {
-                loadBlock = new HashSet<>(outPrimary).stream().sorted().collect(Collectors.toList());
-            } else {
-                loadBlock = List.copyOf(outList);
-            }
-            Object loadBlockResult = loadBlock.size() == 1 ? loadBlock.get(0) : loadBlock;
-            outList.addAll(outExtraValues);
-            Object out = outList.size() == 1 ? outList.get(0) : outList;
-            return new SyntheticSourceExample(in, out, loadBlockResult, this::mapping);
-        }
-
-        private Tuple<String, String> generateValue() {
-            if (nullValue != null && randomBoolean()) {
-                return Tuple.tuple(null, nullValue);
-            }
-            int length = 5;
-            if (ignoreAbove != null && (allIgnored || randomBoolean())) {
-                length = ignoreAbove + 5;
-            }
-            String v = randomAlphaOfLength(length);
-            return Tuple.tuple(v, v);
-        }
-
-        private void mapping(XContentBuilder b) throws IOException {
-            b.field("type", "keyword");
-            if (nullValue != null) {
-                b.field("null_value", nullValue);
-            }
-            if (ignoreAbove != null) {
-                b.field("ignore_above", ignoreAbove);
-            }
-            if (store) {
-                b.field("store", true);
-            }
-            if (docValues == false) {
-                b.field("doc_values", false);
-            }
-        }
-
-        @Override
-        public List<SyntheticSourceInvalidExample> invalidExample() throws IOException {
-            return List.of(
-                new SyntheticSourceInvalidExample(
-                    equalTo(
-                        "field [field] of type [keyword] doesn't support synthetic source because "
-                            + "it doesn't have doc values and isn't stored"
-                    ),
-                    b -> b.field("type", "keyword").field("doc_values", false)
-                ),
-                new SyntheticSourceInvalidExample(
-                    equalTo("field [field] of type [keyword] doesn't support synthetic source because it declares a normalizer"),
-                    b -> b.field("type", "keyword").field("normalizer", "lowercase")
-                )
-            );
-        }
-    }
-
     @Override
     protected IngestScriptSupport ingestScriptSupport() {
         return new IngestScriptSupport() {
@@ -75,7 +75,6 @@ import org.elasticsearch.search.lookup.SourceProvider;
 import org.elasticsearch.xcontent.ToXContent;
 import org.elasticsearch.xcontent.XContentBuilder;
 import org.elasticsearch.xcontent.XContentFactory;
-import org.hamcrest.Matcher;
 import org.junit.AssumptionViolatedException;
 
 import java.io.IOException;
@@ -1178,120 +1177,12 @@ public class TextFieldMapperTests extends MapperTestCase {
     @Override
     protected SyntheticSourceSupport syntheticSourceSupport(boolean ignoreMalformed) {
         assumeFalse("ignore_malformed not supported", ignoreMalformed);
-        boolean storeTextField = randomBoolean();
-        boolean storedKeywordField = storeTextField || randomBoolean();
-        boolean indexText = randomBoolean();
-        Integer ignoreAbove = randomBoolean() ? null : between(10, 100);
-        KeywordFieldMapperTests.KeywordSyntheticSourceSupport keywordSupport = new KeywordFieldMapperTests.KeywordSyntheticSourceSupport(
-            ignoreAbove,
-            storedKeywordField,
-            null,
-            false == storeTextField
-        );
-        return new SyntheticSourceSupport() {
-            @Override
-            public SyntheticSourceExample example(int maxValues) {
-                if (storeTextField) {
-                    SyntheticSourceExample delegate = keywordSupport.example(maxValues, true);
-                    return new SyntheticSourceExample(
-                        delegate.inputValue(),
-                        delegate.expectedForSyntheticSource(),
-                        delegate.expectedForBlockLoader(),
-                        b -> {
-                            b.field("type", "text");
-                            b.field("store", true);
-                            if (indexText == false) {
-                                b.field("index", false);
-                            }
-                        }
-                    );
-                }
-                // We'll load from _source if ignore_above is defined, otherwise we load from the keyword field.
-                boolean loadingFromSource = ignoreAbove != null;
-                SyntheticSourceExample delegate = keywordSupport.example(maxValues, loadingFromSource);
-                return new SyntheticSourceExample(
-                    delegate.inputValue(),
-                    delegate.expectedForSyntheticSource(),
-                    delegate.expectedForBlockLoader(),
-                    b -> {
-                        b.field("type", "text");
-                        if (indexText == false) {
-                            b.field("index", false);
-                        }
-                        b.startObject("fields");
-                        {
-                            b.startObject(randomAlphaOfLength(4));
-                            delegate.mapping().accept(b);
-                            b.endObject();
-                        }
-                        b.endObject();
-                    }
-                );
-            }
-
-            @Override
-            public List<SyntheticSourceInvalidExample> invalidExample() throws IOException {
-                Matcher<String> err = equalTo(
-                    "field [field] of type [text] doesn't support synthetic source unless it is stored or"
-                        + " has a sub-field of type [keyword] with doc values or stored and without a normalizer"
-                );
-                return List.of(
-                    new SyntheticSourceInvalidExample(err, TextFieldMapperTests.this::minimalMapping),
-                    new SyntheticSourceInvalidExample(err, b -> {
-                        b.field("type", "text");
-                        b.startObject("fields");
-                        {
-                            b.startObject("l");
-                            b.field("type", "long");
-                            b.endObject();
-                        }
-                        b.endObject();
-                    }),
-                    new SyntheticSourceInvalidExample(err, b -> {
-                        b.field("type", "text");
-                        b.startObject("fields");
-                        {
-                            b.startObject("kwd");
-                            b.field("type", "keyword");
-                            b.field("normalizer", "lowercase");
-                            b.endObject();
-                        }
-                        b.endObject();
-                    }),
-                    new SyntheticSourceInvalidExample(err, b -> {
-                        b.field("type", "text");
-                        b.startObject("fields");
-                        {
-                            b.startObject("kwd");
-                            b.field("type", "keyword");
-                            b.field("doc_values", "false");
-                            b.endObject();
-                        }
-                        b.endObject();
-                    })
-                );
-            }
-        };
+        return TextFieldFamilySyntheticSourceTestSetup.syntheticSourceSupport("text", true);
     }
 
     @Override
     protected Function<Object, Object> loadBlockExpected(BlockReaderSupport blockReaderSupport, boolean columnReader) {
-        if (nullLoaderExpected(blockReaderSupport.mapper(), blockReaderSupport.loaderFieldName())) {
-            return null;
-        }
-        return v -> ((BytesRef) v).utf8ToString();
-    }
-
-    private boolean nullLoaderExpected(MapperService mapper, String fieldName) {
-        MappedFieldType type = mapper.fieldType(fieldName);
-        if (type instanceof TextFieldType t) {
-            if (t.isSyntheticSource() == false || t.canUseSyntheticSourceDelegateForQuerying() || t.isStored()) {
-                return false;
-            }
-            String parentField = mapper.mappingLookup().parentField(fieldName);
-            return parentField == null || nullLoaderExpected(mapper, parentField);
-        }
-        return false;
+        return TextFieldFamilySyntheticSourceTestSetup.loadBlockExpected(blockReaderSupport, columnReader);
     }
 
     @Override
@@ -1300,9 +1191,8 @@ public class TextFieldMapperTests extends MapperTestCase {
     }
 
     @Override
-    protected void validateRoundTripReader(String syntheticSource, DirectoryReader reader, DirectoryReader roundTripReader)
-        throws IOException {
-        // Disabled because it currently fails
+    protected void validateRoundTripReader(String syntheticSource, DirectoryReader reader, DirectoryReader roundTripReader) {
+        TextFieldFamilySyntheticSourceTestSetup.validateRoundTripReader(syntheticSource, reader, roundTripReader);
     }
 
     public void testUnknownAnalyzerOnLegacyIndex() throws IOException {
@@ -1433,21 +1323,7 @@ public class TextFieldMapperTests extends MapperTestCase {
 
     @Override
     protected BlockReaderSupport getSupportedReaders(MapperService mapper, String loaderFieldName) {
-        MappedFieldType ft = mapper.fieldType(loaderFieldName);
-        String parentName = mapper.mappingLookup().parentField(ft.name());
-        if (parentName == null) {
-            TextFieldMapper.TextFieldType text = (TextFieldType) ft;
-            boolean supportsColumnAtATimeReader = text.syntheticSourceDelegate() != null
-                && text.syntheticSourceDelegate().hasDocValues()
-                && text.canUseSyntheticSourceDelegateForQuerying();
-            return new BlockReaderSupport(supportsColumnAtATimeReader, mapper, loaderFieldName);
-        }
-        MappedFieldType parent = mapper.fieldType(parentName);
-        if (false == parent.typeName().equals(KeywordFieldMapper.CONTENT_TYPE)) {
-            throw new UnsupportedOperationException();
-        }
-        KeywordFieldMapper.KeywordFieldType kwd = (KeywordFieldMapper.KeywordFieldType) parent;
-        return new BlockReaderSupport(kwd.hasDocValues(), mapper, loaderFieldName);
+        return TextFieldFamilySyntheticSourceTestSetup.getSupportedReaders(mapper, loaderFieldName);
     }
 
     public void testBlockLoaderFromParentColumnReader() throws IOException {
@@ -1460,7 +1336,7 @@ public class TextFieldMapperTests extends MapperTestCase {
 
     private void testBlockLoaderFromParent(boolean columnReader, boolean syntheticSource) throws IOException {
         boolean storeParent = randomBoolean();
-        KeywordFieldMapperTests.KeywordSyntheticSourceSupport kwdSupport = new KeywordFieldMapperTests.KeywordSyntheticSourceSupport(
+        KeywordFieldSyntheticSourceSupport kwdSupport = new KeywordFieldSyntheticSourceSupport(
            null,
            storeParent,
            null,
@@ -0,0 +1,126 @@
+/*
+ * Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
+ * or more contributor license agreements. Licensed under the Elastic License
+ * 2.0 and the Server Side Public License, v 1; you may not use this file except
+ * in compliance with, at your election, the Elastic License 2.0 or the Server
+ * Side Public License, v 1.
+ */
+
+package org.elasticsearch.index.mapper;
+
+import org.apache.lucene.tests.util.LuceneTestCase;
+import org.elasticsearch.core.Tuple;
+import org.elasticsearch.test.ESTestCase;
+import org.elasticsearch.xcontent.XContentBuilder;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.HashSet;
+import java.util.List;
+import java.util.stream.Collectors;
+
+import static org.hamcrest.Matchers.equalTo;
+
+public class KeywordFieldSyntheticSourceSupport implements MapperTestCase.SyntheticSourceSupport {
+    private final Integer ignoreAbove;
+    private final boolean allIgnored;
+    private final boolean store;
+    private final boolean docValues;
+    private final String nullValue;
+    private final boolean exampleSortsUsingIgnoreAbove;
+
+    KeywordFieldSyntheticSourceSupport(Integer ignoreAbove, boolean store, String nullValue, boolean exampleSortsUsingIgnoreAbove) {
+        this.ignoreAbove = ignoreAbove;
+        this.allIgnored = ignoreAbove != null && LuceneTestCase.rarely();
+        this.store = store;
+        this.nullValue = nullValue;
+        this.exampleSortsUsingIgnoreAbove = exampleSortsUsingIgnoreAbove;
+        this.docValues = store ? ESTestCase.randomBoolean() : true;
+    }
+
+    @Override
+    public MapperTestCase.SyntheticSourceExample example(int maxValues) {
+        return example(maxValues, false);
+    }
+
+    public MapperTestCase.SyntheticSourceExample example(int maxValues, boolean loadBlockFromSource) {
+        if (ESTestCase.randomBoolean()) {
+            Tuple<String, String> v = generateValue();
+            Object loadBlock = v.v2();
+            if (loadBlockFromSource == false && ignoreAbove != null && v.v2().length() > ignoreAbove) {
+                loadBlock = null;
+            }
+            return new MapperTestCase.SyntheticSourceExample(v.v1(), v.v2(), loadBlock, this::mapping);
+        }
+        List<Tuple<String, String>> values = ESTestCase.randomList(1, maxValues, this::generateValue);
+        List<String> in = values.stream().map(Tuple::v1).toList();
+        List<String> outPrimary = new ArrayList<>();
+        List<String> outExtraValues = new ArrayList<>();
+        values.stream().map(Tuple::v2).forEach(v -> {
+            if (exampleSortsUsingIgnoreAbove && ignoreAbove != null && v.length() > ignoreAbove) {
+                outExtraValues.add(v);
+            } else {
+                outPrimary.add(v);
+            }
+        });
+        List<String> outList = store ? outPrimary : new HashSet<>(outPrimary).stream().sorted().collect(Collectors.toList());
+        List<String> loadBlock;
+        if (loadBlockFromSource) {
+            // The block loader infrastructure will never return nulls. Just zap them all.
+            loadBlock = in.stream().filter(m -> m != null).toList();
+        } else if (docValues) {
+            loadBlock = new HashSet<>(outPrimary).stream().sorted().collect(Collectors.toList());
+        } else {
+            loadBlock = List.copyOf(outList);
+        }
+        Object loadBlockResult = loadBlock.size() == 1 ? loadBlock.get(0) : loadBlock;
+        outList.addAll(outExtraValues);
+        Object out = outList.size() == 1 ? outList.get(0) : outList;
+        return new MapperTestCase.SyntheticSourceExample(in, out, loadBlockResult, this::mapping);
+    }
+
+    private Tuple<String, String> generateValue() {
+        if (nullValue != null && ESTestCase.randomBoolean()) {
+            return Tuple.tuple(null, nullValue);
+        }
+        int length = 5;
+        if (ignoreAbove != null && (allIgnored || ESTestCase.randomBoolean())) {
+            length = ignoreAbove + 5;
+        }
+        String v = ESTestCase.randomAlphaOfLength(length);
+        return Tuple.tuple(v, v);
+    }
+
+    private void mapping(XContentBuilder b) throws IOException {
+        b.field("type", "keyword");
+        if (nullValue != null) {
+            b.field("null_value", nullValue);
+        }
+        if (ignoreAbove != null) {
+            b.field("ignore_above", ignoreAbove);
+        }
+        if (store) {
+            b.field("store", true);
+        }
+        if (docValues == false) {
+            b.field("doc_values", false);
+        }
+    }
+
+    @Override
+    public List<MapperTestCase.SyntheticSourceInvalidExample> invalidExample() throws IOException {
+        return List.of(
+            new MapperTestCase.SyntheticSourceInvalidExample(
+                equalTo(
+                    "field [field] of type [keyword] doesn't support synthetic source because "
|
||||||
|
+ "it doesn't have doc values and isn't stored"
|
||||||
|
),
|
||||||
|
b -> b.field("type", "keyword").field("doc_values", false)
|
||||||
|
),
|
||||||
|
new MapperTestCase.SyntheticSourceInvalidExample(
|
||||||
|
equalTo("field [field] of type [keyword] doesn't support synthetic source because it declares a normalizer"),
|
||||||
|
b -> b.field("type", "keyword").field("normalizer", "lowercase")
|
||||||
|
)
|
||||||
|
);
|
||||||
|
}
|
||||||
|
}
|
|
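The `example` method above partitions multi-valued input around `ignore_above`: values longer than the threshold bypass doc values and are appended after the primary values. A minimal standalone sketch of that expectation (a hypothetical helper, not part of the PR; the real support also sorts and de-duplicates the primary values when the field is not stored):

```java
import java.util.ArrayList;
import java.util.List;

public class IgnoreAboveSketch {
    // Mirrors the partitioning in example(): values over ignore_above skip
    // doc values and land after the primary (indexed) values in the
    // expected synthetic source.
    static List<String> expectedSyntheticSource(List<String> values, int ignoreAbove) {
        List<String> primary = new ArrayList<>();
        List<String> ignored = new ArrayList<>();
        for (String v : values) {
            (v.length() > ignoreAbove ? ignored : primary).add(v);
        }
        primary.addAll(ignored);
        return primary;
    }

    public static void main(String[] args) {
        // "a-very-long-value" exceeds the threshold of 10, so it trails.
        System.out.println(expectedSyntheticSource(List.of("short", "a-very-long-value", "tiny"), 10));
    }
}
```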
@@ -1286,7 +1286,7 @@ public abstract class MapperTestCase extends MapperServiceTestCase {
      * @param loaderFieldName the field name to use for loading the field
      */
     public record BlockReaderSupport(boolean columnAtATimeReader, boolean syntheticSource, MapperService mapper, String loaderFieldName) {
-        BlockReaderSupport(boolean columnAtATimeReader, MapperService mapper, String loaderFieldName) {
+        public BlockReaderSupport(boolean columnAtATimeReader, MapperService mapper, String loaderFieldName) {
             this(columnAtATimeReader, true, mapper, loaderFieldName);
         }
@@ -0,0 +1,207 @@
/*
 * Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
 * or more contributor license agreements. Licensed under the Elastic License
 * 2.0 and the Server Side Public License, v 1; you may not use this file except
 * in compliance with, at your election, the Elastic License 2.0 or the Server
 * Side Public License, v 1.
 */

package org.elasticsearch.index.mapper;

import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.util.BytesRef;
import org.hamcrest.Matcher;

import java.io.IOException;
import java.util.List;
import java.util.Locale;
import java.util.function.Function;

import static org.elasticsearch.test.ESTestCase.between;
import static org.elasticsearch.test.ESTestCase.randomAlphaOfLength;
import static org.elasticsearch.test.ESTestCase.randomBoolean;
import static org.hamcrest.Matchers.equalTo;

/**
 * Provides functionality needed to test synthetic source support in text and text-like fields (e.g. "text", "annotated_text").
 */
public final class TextFieldFamilySyntheticSourceTestSetup {
    public static MapperTestCase.SyntheticSourceSupport syntheticSourceSupport(String fieldType, boolean supportsCustomIndexConfiguration) {
        return new TextFieldFamilySyntheticSourceSupport(fieldType, supportsCustomIndexConfiguration);
    }

    public static MapperTestCase.BlockReaderSupport getSupportedReaders(MapperService mapper, String loaderFieldName) {
        MappedFieldType ft = mapper.fieldType(loaderFieldName);
        String parentName = mapper.mappingLookup().parentField(ft.name());
        if (parentName == null) {
            TextFieldMapper.TextFieldType text = (TextFieldMapper.TextFieldType) ft;
            boolean supportsColumnAtATimeReader = text.syntheticSourceDelegate() != null
                && text.syntheticSourceDelegate().hasDocValues()
                && text.canUseSyntheticSourceDelegateForQuerying();
            return new MapperTestCase.BlockReaderSupport(supportsColumnAtATimeReader, mapper, loaderFieldName);
        }
        MappedFieldType parent = mapper.fieldType(parentName);
        if (false == parent.typeName().equals(KeywordFieldMapper.CONTENT_TYPE)) {
            throw new UnsupportedOperationException();
        }
        KeywordFieldMapper.KeywordFieldType kwd = (KeywordFieldMapper.KeywordFieldType) parent;
        return new MapperTestCase.BlockReaderSupport(kwd.hasDocValues(), mapper, loaderFieldName);
    }

    public static Function<Object, Object> loadBlockExpected(MapperTestCase.BlockReaderSupport blockReaderSupport, boolean columnReader) {
        if (nullLoaderExpected(blockReaderSupport.mapper(), blockReaderSupport.loaderFieldName())) {
            return null;
        }
        return v -> ((BytesRef) v).utf8ToString();
    }

    private static boolean nullLoaderExpected(MapperService mapper, String fieldName) {
        MappedFieldType type = mapper.fieldType(fieldName);
        if (type instanceof TextFieldMapper.TextFieldType t) {
            if (t.isSyntheticSource() == false || t.canUseSyntheticSourceDelegateForQuerying() || t.isStored()) {
                return false;
            }
            String parentField = mapper.mappingLookup().parentField(fieldName);
            return parentField == null || nullLoaderExpected(mapper, parentField);
        }
        return false;
    }

    public static void validateRoundTripReader(String syntheticSource, DirectoryReader reader, DirectoryReader roundTripReader) {
        // `reader` here is reader of original document and `roundTripReader` reads document
        // created from synthetic source.
        // This check fails when synthetic source is constructed using keyword subfield
        // since in that case values are sorted (due to being read from doc values) but original document isn't.
        //
        // So it is disabled.
    }

    private static class TextFieldFamilySyntheticSourceSupport implements MapperTestCase.SyntheticSourceSupport {
        private final String fieldType;
        private final boolean storeTextField;
        private final boolean storedKeywordField;
        private final boolean indexText;
        private final Integer ignoreAbove;
        private final KeywordFieldSyntheticSourceSupport keywordSupport;

        TextFieldFamilySyntheticSourceSupport(String fieldType, boolean supportsCustomIndexConfiguration) {
            this.fieldType = fieldType;
            this.storeTextField = randomBoolean();
            this.storedKeywordField = storeTextField || randomBoolean();
            this.indexText = supportsCustomIndexConfiguration ? randomBoolean() : true;
            this.ignoreAbove = randomBoolean() ? null : between(10, 100);
            this.keywordSupport = new KeywordFieldSyntheticSourceSupport(ignoreAbove, storedKeywordField, null, false == storeTextField);
        }

        @Override
        public MapperTestCase.SyntheticSourceExample example(int maxValues) {
            if (storeTextField) {
                MapperTestCase.SyntheticSourceExample delegate = keywordSupport.example(maxValues, true);
                return new MapperTestCase.SyntheticSourceExample(
                    delegate.inputValue(),
                    delegate.expectedForSyntheticSource(),
                    delegate.expectedForBlockLoader(),
                    b -> {
                        b.field("type", fieldType);
                        b.field("store", true);
                        if (indexText == false) {
                            b.field("index", false);
                        }
                    }
                );
            }
            // We'll load from _source if ignore_above is defined, otherwise we load from the keyword field.
            boolean loadingFromSource = ignoreAbove != null;
            MapperTestCase.SyntheticSourceExample delegate = keywordSupport.example(maxValues, loadingFromSource);
            return new MapperTestCase.SyntheticSourceExample(
                delegate.inputValue(),
                delegate.expectedForSyntheticSource(),
                delegate.expectedForBlockLoader(),
                b -> {
                    b.field("type", fieldType);
                    if (indexText == false) {
                        b.field("index", false);
                    }
                    b.startObject("fields");
                    {
                        b.startObject(randomAlphaOfLength(4));
                        delegate.mapping().accept(b);
                        b.endObject();
                    }
                    b.endObject();
                }
            );
        }

        @Override
        public List<MapperTestCase.SyntheticSourceInvalidExample> invalidExample() throws IOException {
            Matcher<String> err = equalTo(
                String.format(
                    Locale.ROOT,
                    "field [field] of type [%s] doesn't support synthetic source unless it is stored or"
                        + " has a sub-field of type [keyword] with doc values or stored and without a normalizer",
                    fieldType
                )
            );
            return List.of(
                new MapperTestCase.SyntheticSourceInvalidExample(err, b -> b.field("type", fieldType)),
                new MapperTestCase.SyntheticSourceInvalidExample(err, b -> {
                    b.field("type", fieldType);
                    b.startObject("fields");
                    {
                        b.startObject("l");
                        b.field("type", "long");
                        b.endObject();
                    }
                    b.endObject();
                }),
                new MapperTestCase.SyntheticSourceInvalidExample(err, b -> {
                    b.field("type", fieldType);
                    b.startObject("fields");
                    {
                        b.startObject("kwd");
                        b.field("type", "keyword");
                        b.field("normalizer", "lowercase");
                        b.endObject();
                    }
                    b.endObject();
                }),
                new MapperTestCase.SyntheticSourceInvalidExample(err, b -> {
                    b.field("type", fieldType);
                    b.startObject("fields");
                    {
                        b.startObject("kwd");
                        b.field("type", "keyword");
                        b.field("doc_values", "false");
                        b.endObject();
                    }
                    b.endObject();
                }),
                new MapperTestCase.SyntheticSourceInvalidExample(err, b -> {
                    b.field("type", fieldType);
                    b.field("store", "false");
                    b.startObject("fields");
                    {
                        b.startObject("kwd");
                        b.field("type", "keyword");
                        b.field("doc_values", "false");
                        b.endObject();
                    }
                    b.endObject();
                }),
                new MapperTestCase.SyntheticSourceInvalidExample(err, b -> {
                    b.field("type", fieldType);
                    b.startObject("fields");
                    {
                        b.startObject("kwd");
                        b.field("type", "keyword");
                        b.field("doc_values", "false");
                        b.field("store", "false");
                        b.endObject();
                    }
                    b.endObject();
                })
            );
        }
    }
}
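When a non-stored field's synthetic source is rebuilt from a keyword sub-field's doc values, the recovered values come back sorted and de-duplicated rather than in original order; that is the very effect `validateRoundTripReader` is disabled to tolerate. A self-contained sketch of that expectation (hypothetical helper, not part of the PR):

```java
import java.util.List;
import java.util.TreeSet;

public class SyntheticSourceSketch {
    // Mimics the doc-values-backed case: synthetic source reflects the
    // sorted, de-duplicated set of values, not the original document order.
    static List<String> syntheticValues(List<String> input) {
        return List.copyOf(new TreeSet<>(input));
    }

    public static void main(String[] args) {
        // Duplicates collapse and order becomes lexicographic.
        System.out.println(syntheticValues(List.of("b", "a", "b", "c")));
    }
}
```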