Prepare tsdb doc values format for merging optimizations. (#125933)

This change contains the following:

- The numDocsWithField field moved from SortedNumericEntry to NumericEntry, making this statistic always available.
- Store the jump table after the values in ES87TSDBDocValuesConsumer#writeField(...). Currently it is stored before the values. This will later allow us to iterate over the SortedNumericDocValues only once. That matters when merging, where each iteration is expensive because a merge sort is executed on the fly.
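
To illustrate why the ordering matters, here is a minimal sketch, not the actual ES87/ES819 on-disk layout. It assumes a simplified format with one value per document, fixed-size blocks, and a jump table of block start offsets. Only `numDocsWithField` comes from the commit message. The class `JumpTableAfterValues`, the `Entry` fields `valuesOffset`, `jumpTableOffset` and `jumpTableEntries`, and `BLOCK_SIZE` are hypothetical names chosen for the example. Because the table is written after the values, the writer can build it while streaming through the values a single time.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class JumpTableAfterValues {

    // Hypothetical metadata entry. Only numDocsWithField is a name taken from the
    // commit message; the remaining fields are assumptions made for this sketch.
    static final class Entry {
        final int numDocsWithField;
        final int valuesOffset;
        final int jumpTableOffset;
        final int jumpTableEntries;

        Entry(int numDocsWithField, int valuesOffset, int jumpTableOffset, int jumpTableEntries) {
            this.numDocsWithField = numDocsWithField;
            this.valuesOffset = valuesOffset;
            this.jumpTableOffset = jumpTableOffset;
            this.jumpTableEntries = jumpTableEntries;
        }
    }

    static final int BLOCK_SIZE = 4; // values per block, chosen arbitrarily

    // Consumes the iterator exactly once: values are written as they arrive, and
    // block start offsets are buffered in memory until every value is on "disk".
    static Entry writeField(Iterator<Long> values, ByteBuffer data) {
        int valuesOffset = data.position();
        List<Integer> blockOffsets = new ArrayList<>();
        int count = 0;
        while (values.hasNext()) {
            if (count % BLOCK_SIZE == 0) {
                // Remember where this block starts, relative to the first value.
                blockOffsets.add(data.position() - valuesOffset);
            }
            data.putLong(values.next());
            count++;
        }
        // The jump table goes after the values, so no second pass is needed.
        int jumpTableOffset = data.position();
        for (int offset : blockOffsets) {
            data.putInt(offset);
        }
        // One value per document is assumed, so count equals numDocsWithField.
        return new Entry(count, valuesOffset, jumpTableOffset, blockOffsets.size());
    }

    public static void main(String[] args) {
        ByteBuffer data = ByteBuffer.allocate(1024);
        Entry entry = writeField(List.of(10L, 20L, 30L, 40L, 50L).iterator(), data);
        System.out.println("docs=" + entry.numDocsWithField
            + " jumpTableOffset=" + entry.jumpTableOffset
            + " jumpTableEntries=" + entry.jumpTableEntries);
    }
}
```

If the table had to precede the values, a writer in this sketch would need either a first pass to compute the offsets or a buffer holding all values, which is the cost the commit message describes for merge-time iteration.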

This change enables all the optimizations listed in #125403.
Martijn van Groningen 2025-04-02 13:39:41 +02:00 committed by GitHub
parent 40dd91b800
commit 52d68392d0
16 changed files with 2774 additions and 13 deletions

@@ -443,7 +443,10 @@ module org.elasticsearch.server {
             org.elasticsearch.index.codec.bloomfilter.ES85BloomFilterPostingsFormat,
             org.elasticsearch.index.codec.bloomfilter.ES87BloomFilterPostingsFormat,
             org.elasticsearch.index.codec.postings.ES812PostingsFormat;
-    provides org.apache.lucene.codecs.DocValuesFormat with org.elasticsearch.index.codec.tsdb.ES87TSDBDocValuesFormat;
+    provides org.apache.lucene.codecs.DocValuesFormat
+        with
+            org.elasticsearch.index.codec.tsdb.ES87TSDBDocValuesFormat,
+            org.elasticsearch.index.codec.tsdb.es819.ES819TSDBDocValuesFormat;
     provides org.apache.lucene.codecs.KnnVectorsFormat
         with
             org.elasticsearch.index.codec.vectors.ES813FlatVectorFormat,