Today the overloads of `XContentBuilder#timeField` do two rather
different things: one formats an object as a `String` representation of
a time (where the object is either an unambiguous time object or else a
`long`) and the other formats only a `long` as one or two fields
depending on the `?human` flag.
This is trappy in a number of ways:
- `long` means an absolute (epoch) time, but sometimes folks will
mistakenly use this for time intervals too.
- `long` means only milliseconds; there is no facility to specify a
different unit.
- the dependence on the `?human` flag in exactly one of the overloads is
kinda weird.
This commit removes the confusion by dropping support for considering a
`Long` as a valid representation of a time at all, and instead requiring
callers to either convert it into a proper time object or else call a
method that is explicitly expecting an epoch time in milliseconds.
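The split described above can be sketched roughly as follows. This is a minimal illustration, not the actual `XContentBuilder` API: the method names `timeField`/`timeFieldFromEpochMillis` and the use of `Instant` are assumptions for the sake of the example.

```java
import java.time.Instant;
import java.time.format.DateTimeFormatter;

public class TimeFields {
    private static final DateTimeFormatter ISO = DateTimeFormatter.ISO_INSTANT;

    // Accepts only an unambiguous time object; no Long-as-time overload.
    static String timeField(Instant value) {
        return ISO.format(value);
    }

    // The epoch-millis intent is explicit in the name, so callers cannot
    // accidentally pass a time interval here by mistake.
    static String timeFieldFromEpochMillis(long epochMillis) {
        return ISO.format(Instant.ofEpochMilli(epochMillis));
    }

    public static void main(String[] args) {
        System.out.println(timeFieldFromEpochMillis(0L)); // 1970-01-01T00:00:00Z
    }
}
```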
If there isn't any text to parse, this seems like an argument issue
rather than an internal one.
I simply changed the exception thrown. If we don't agree with this, I
can adjust `query` parsing directly, but this seemed like the better
choice.
closes: https://github.com/elastic/elasticsearch/issues/112296
Currently we wrap JsonEOFException from advancing the json parser into our own
XContentEOFException, but this has the drawback that it results in 500 errors on
the client side. Instead, these should be 400 errors.
This changes XContentEOFException to extend XContentParseException so we report
a 400 error instead.
Closes #111542
When xcontent is copied, the parse tree is walked and each element is
passed to the given generator. In the case of numbers, BigInteger is
currently not handled. Although arbitrary precision BigIntegers are not
supported in Elasticsearch, they appear in xcontent when using unsigned
long fields. This commit adds handling for that case, and also ensures
all token types are handled. Note that BigDecimal is not supported at
all, since double is the largest floating-point type supported by mappers.
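The dispatch described above can be sketched as below. This is a simplified stand-in for the real copy logic: `route` and its return strings are hypothetical, standing in for the generator methods a copied numeric token would be forwarded to.

```java
import java.math.BigInteger;

public class NumberCopy {
    // Returns which (hypothetical) generator method a copied numeric value
    // would be routed to; BigInteger was the previously unhandled case.
    static String route(Number n) {
        if (n instanceof Integer) return "writeNumber(int)";
        if (n instanceof Long) return "writeNumber(long)";
        if (n instanceof Float) return "writeNumber(float)";
        if (n instanceof Double) return "writeNumber(double)";
        if (n instanceof BigInteger) return "writeNumber(BigInteger)"; // the missing case
        throw new IllegalArgumentException("unsupported number type: " + n.getClass());
    }

    public static void main(String[] args) {
        // Unsigned long values beyond Long.MAX_VALUE arrive as BigInteger.
        System.out.println(route(new BigInteger("18446744073709551615")));
    }
}
```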
Closes #111812
Rename `xContent.streamSeparator()` and
`RestHandler.supportsStreamContent()` to `xContent.bulkSeparator()` and
`RestHandler.supportsBulkContent()`.
I want to reserve use of "supportsStreamContent" for current work in
HTTP layer to [support incremental content
handling](https://github.com/elastic/elasticsearch/pull/111438) besides
fully aggregated byte buffers. `supportsStreamContent` would indicate
that the handler can parse chunks of HTTP content as they arrive.
Seen this come up in some profiling, wasting some cycles. If we do a
method per type here instead of a getter + field, we potentially pay for
a megamorphic callsite. It's faster and uses less code
anyway to just use a field + getter here.
With this commit we use `writeRawValue` instead of `writeRaw` when
serializing raw strings as XContent. The latter method does not consider
context (e.g. is the value being written as part of an array and
requires a comma separator?) whereas the former does. This ensures that
pre-rendered double values as we use them in the flamegraph response are
rendered correctly as XContent.
Closes #106103
Sometimes libs have subprojects that may not be java projects. This commit adjusts the shared
configuration for libs to only affect direct subprojects of :lib.
The response of the flamegraph is quite large: A typical response can
easily reach 50MB (uncompressed). In order to reduce memory pressure and
also to start sending the response sooner, we chunk the response.
However, this leads to many chunks that are very small and lead to high
overhead. In our experiments, just the serialization takes more than
500ms.
With this commit we take the following measures:
1. We split the response into chunks only when it makes sense and
otherwise send one larger chunk.
2. Serialization of doubles is very expensive: Just the serialization of
annual CO2 tons takes around 80ms in our test setup. Therefore, we
apply a custom serialization that is both faster than the builtin
serialization as well as reduces the number of bytes sent over the wire,
because we round to four decimal places (which is more than sufficient for
our purposes).
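The rounding idea can be sketched as follows. This is only an illustration of the four-decimal-place rounding, not the commit's actual serializer; the `render` helper and its formatting choices are assumptions.

```java
public class RoundedDoubles {
    // Round to four decimal places and render in plain (non-scientific)
    // notation, so small values don't come out as "1.0E-5".
    static String render(double value) {
        double rounded = Math.round(value * 10_000d) / 10_000d;
        return new java.math.BigDecimal(Double.toString(rounded))
                .stripTrailingZeros()
                .toPlainString();
    }

    public static void main(String[] args) {
        System.out.println(render(12.3456789)); // 12.3457
    }
}
```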
* Use String.replace() instead of replaceAll() for non-regexp replacements
When arguments do not make use of regexp features replace() is a more efficient option, especially the char-variant.
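A quick sketch of the difference. The `dotsToSlashes` helper is made up for the example; the point is that `replace` treats its arguments literally (and the char variant avoids regex machinery entirely), while `replaceAll` compiles a `Pattern` on every call.

```java
public class ReplaceDemo {
    // The char variant is cheapest: no regex, no Pattern allocation.
    static String dotsToSlashes(String s) {
        return s.replace('.', '/');
    }

    public static void main(String[] args) {
        String s = "a.b.c";
        System.out.println(dotsToSlashes(s));          // a/b/c
        // Same result, but compiles "\\." into a Pattern each call and
        // requires escaping the dot as a regex metacharacter.
        System.out.println(s.replaceAll("\\.", "/"));
    }
}
```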
There's no need for this helper to take more than one argument. Almost
all the usages only passed in a single argument, and the few cases that
supplied more than one can be rewritten as a single argument to save
allocating all those extra lambdas.
x-content embeds its jackson implementation inside its jar. This commit
formalizes the setup for this embedding with a gradle plugin so that it
can be reused by other libs.
Our readEnum code instantiates/clones enum value arrays on read.
Normally, this doesn't matter much but the two spots adjusted here are
visibly hot during bulk indexing, causing GBs of allocations during e.g.
the http_logs indexing run.
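The fix pattern looks roughly like this (the `Op` enum and `readOp` method are invented for the example): `Enum.values()` clones its backing array on every call, so caching the array once avoids a fresh allocation per deserialized value.

```java
public class EnumRead {
    enum Op { INDEX, CREATE, UPDATE, DELETE }

    // values() clones the backing array on each call; cache it once so hot
    // read paths don't allocate per enum read.
    private static final Op[] OPS = Op.values();

    static Op readOp(int ordinal) {
        return OPS[ordinal]; // no clone per read
    }

    public static void main(String[] args) {
        System.out.println(readOp(2)); // UPDATE
    }
}
```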
Remove the rough limit on string length from Jackson 2.15. The limit was already relaxed for JSON in #96031, this extends that change to other XContent types.
Refs: #96031
Fixes: #104009
We're leaking quite a few of these parsers. That doesn't seem to be much
of a problem but results in some memory inefficiencies in Jackson here
and there. This PR bulk fixes a bunch of instances that I could easily
automatically fix. I'll open a follow-up for closing the parser on the
document parsing context which also suffers from this but is non-trivial
to fix.
Another round of automated fixes to this, marking things that can be
made static as static. Saves some JIT cycles but also turns some lambdas
from capturing to non-capturing and makes the "utilityness" of some
classes visible.
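The capturing vs. non-capturing distinction can be illustrated like this (a toy example, not code from the change): a lambda that reads instance state must capture `this` and allocates per evaluation, while one that touches no enclosing state can be cached by the JVM at its call site.

```java
import java.util.function.Supplier;

public class StaticLambdas {
    private final String name = "node-1";

    // Captures `this` via `name`, so each evaluation allocates a lambda.
    Supplier<String> capturing() {
        return () -> "hello " + name;
    }

    // Touches no enclosing state: non-capturing, so the JVM can reuse a
    // single cached instance for this call site.
    static Supplier<String> nonCapturing() {
        return () -> "hello world";
    }

    public static void main(String[] args) {
        System.out.println(new StaticLambdas().capturing().get());
        System.out.println(nonCapturing().get());
    }
}
```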
Jackson has a direct method for writing string arrays
that saves us some of the indirection we have when looping
over a string array. This normally doesn't gain much, but for extreme
cases like long index name lists in field caps it saves a couple percent
in CPU time.
When ingesting documents that contain nested objects while the mapping
property `subobjects` is set to `false`, instead of throwing a mapping
exception and dropping the document(s), we now map only the leaf
field(s), using their full dot-separated path as their name.
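The flattening behavior can be sketched over plain maps (the `flatten` helper is hypothetical, not the mapper's actual code): nested objects collapse into leaf fields keyed by their full dotted path.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class Flatten {
    // Recursively collapse nested objects into leaf fields whose names are
    // the full dot-separated path, mirroring subobjects:false behavior.
    static Map<String, Object> flatten(String prefix, Map<String, Object> source, Map<String, Object> out) {
        for (Map.Entry<String, Object> e : source.entrySet()) {
            String path = prefix.isEmpty() ? e.getKey() : prefix + "." + e.getKey();
            if (e.getValue() instanceof Map<?, ?> m) {
                @SuppressWarnings("unchecked")
                Map<String, Object> nested = (Map<String, Object>) m;
                flatten(path, nested, out);
            } else {
                out.put(path, e.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = Map.of("metrics", Map.of("cpu", Map.of("load", 0.7)));
        System.out.println(flatten("", doc, new LinkedHashMap<>())); // {metrics.cpu.load=0.7}
    }
}
```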
This PR introduces downsampling configuration to the data stream lifecycle. Keep in mind downsampling implementation will come in a follow up PR. Configuration looks like this:
```
{
  "lifecycle": {
    "data_retention": "90d",
    "downsampling": [
      { "after": "1d", "fixed_interval": "2h" },
      { "after": "15d", "fixed_interval": "1d" },
      { "after": "30d", "fixed_interval": "1w" }
    ]
  }
}
```
We will also support using `null` to unset downsampling configuration during template composition:
```
{
  "lifecycle": {
    "data_retention": "90d",
    "downsampling": null
  }
}
```
This commit fixes the incorrect pattern for TChar defined in RFC 7230 section 3.2.6:
`a-zA-z` was accidentally used, and the pattern `a-zA-Z` should be used instead.
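The bug is easy to demonstrate: in a character class, the range `A-z` spans ASCII 65 through 122, which accidentally includes the punctuation characters `[`, `\`, `]`, `^`, `_` and `` ` `` (91-96).

```java
import java.util.regex.Pattern;

public class TcharDemo {
    // `[a-zA-z]` covers ASCII 'A' (65) through 'z' (122), so it also matches
    // '[', '\', ']', '^', '_' and '`' (91-96). `[a-zA-Z]` does not.
    static final Pattern BUGGY = Pattern.compile("[a-zA-z]");
    static final Pattern FIXED = Pattern.compile("[a-zA-Z]");

    public static void main(String[] args) {
        System.out.println(BUGGY.matcher("^").matches()); // true: the bug
        System.out.println(FIXED.matcher("^").matches()); // false
    }
}
```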
Jackson 2.15 introduced a (rough) maximum limit on string length. This
commit relaxes that limit to its maximum size, leaving document size
constraints to other existing limits in the system. We can revisit
whether string length within a document should be independently
constrained later.
I saw this in some hot-threads. Splitting by a pattern that isn't a single char is expensive
because it instantiates a `Pattern`. Splitting the spaces and tabs away here also seems redundant,
since we trim keys and values later on in the logic.
-> let's use the single-char split fast path and not have this on the transport thread.
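A small illustration (the header string and `splitHeaderValues` helper are made up): `String.split` has a fast path for a single literal, non-metacharacter argument that never compiles a `java.util.regex.Pattern`, whereas a multi-char pattern does.

```java
public class SplitDemo {
    // Splitting on a single literal char hits String.split's fast path and
    // avoids compiling a Pattern; a pattern like "\\s*;\\s*" does not.
    // Since keys and values are trimmed later anyway, nothing is lost.
    static String[] splitHeaderValues(String s) {
        return s.split(";");
    }

    public static void main(String[] args) {
        String[] parts = splitHeaderValues("a=1; b=2");
        System.out.println(parts[1].trim()); // b=2
    }
}
```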
Pushes the chunking of `GET _nodes/stats` down to avoid creating
unboundedly large chunks. With this commit we yield one chunk per shard
(if `?level=shards`) or index (if `?level=indices`) and per HTTP client
and per transport action.
Closes #93985
Fixes #82794. Upgrade the spotless plugin, which addresses the issue
around formatting `instanceof` expressions. Formatting of statements
including lambdas seems to have improved too.
When writing generic objects to x-content the value may cause an error
if XContentBuilder does not know how to understand the concrete object
type. This commit adds a new helper method, similar to
StreamOutput.checkWriteable, which validates the type of an object (and
any inner objects if it is a collection) are writeable to x-content.
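A sketch of that kind of up-front validation, recursing into collections the same way `StreamOutput.checkWriteable` does. The `ensureWriteable` method and its accepted type set are assumptions for illustration, not the helper's real signature.

```java
import java.util.Collection;
import java.util.Map;

public class XContentCheck {
    // Verify up front that a generic value (including nested collections)
    // has a type a builder could render, instead of failing mid-serialization.
    static void ensureWriteable(Object value) {
        if (value == null || value instanceof String || value instanceof Number || value instanceof Boolean) {
            return;
        }
        if (value instanceof Map<?, ?> map) {
            for (Map.Entry<?, ?> e : map.entrySet()) {
                ensureWriteable(e.getKey());
                ensureWriteable(e.getValue());
            }
            return;
        }
        if (value instanceof Collection<?> c) {
            for (Object o : c) {
                ensureWriteable(o);
            }
            return;
        }
        throw new IllegalArgumentException("cannot write xcontent for type " + value.getClass());
    }

    public static void main(String[] args) {
        ensureWriteable(Map.of("ok", java.util.List.of(1, "two", 3.0)));
        System.out.println("valid");
    }
}
```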
String.split is very visibly slow in profiling the BeatsMapperBenchmark.
The reason is mainly its creation of an intermediate `ArrayList`
and the generality of the method.
We can do a much faster split by performing the split operations for a
single char only, and by avoiding the list entirely: we pre-count the
dots, which we get almost for free since the upfront `contains('.')`
check can be replaced by that count.
This speeds up the beats mapper benchmark from 9k ns to about 8.4k ns on my x86 workstation
and from 6k ns to about 5.5k ns on my M1 MBP, which I think is considerable enough to justify the
small amount of added complexity.
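A sketch of the approach (simplified, and unlike `String.split` it keeps trailing empty segments): count the separators once to size the result array exactly, with the dot count doubling as the `contains('.')` check.

```java
public class FastSplit {
    // Single-char split that pre-counts separators to size the result array,
    // avoiding String.split's intermediate ArrayList. A count of zero also
    // serves as the contains('.') check.
    static String[] splitOnDots(String s) {
        int dots = 0;
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) == '.') dots++;
        }
        if (dots == 0) {
            return new String[] { s };
        }
        String[] parts = new String[dots + 1];
        int start = 0;
        int idx = 0;
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) == '.') {
                parts[idx++] = s.substring(start, i);
                start = i + 1;
            }
        }
        parts[idx] = s.substring(start);
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(String.join("/", splitOnDots("agent.hostname.fqdn")));
    }
}
```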
The mapping code for stack frames uses the utility method
`ObjectPath#eval` to read nested properties. Callers need to pass a
dot-separated path which is then split internally via a regex. This is
quite slow: in a typical deployment we saw overheads of 50ms just for
mapping stack frames (total response time is ~ 1 second).
With this commit we pass the property path as an array to avoid this
overhead. In a microbenchmark the new implementation was 23 times faster
than the current one.
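A hypothetical analogue of the array-taking `eval`, over plain maps: the caller splits the path once and reuses the array for every document, rather than re-splitting a dotted string per call.

```java
import java.util.Map;

public class PathEval {
    // Walk a nested map by a pre-split path; returns null if any step is
    // missing or not an object.
    @SuppressWarnings("unchecked")
    static Object eval(String[] path, Object source) {
        Object current = source;
        for (String key : path) {
            if (!(current instanceof Map)) {
                return null;
            }
            current = ((Map<String, Object>) current).get(key);
        }
        return current;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = Map.of("Stacktrace", Map.of("frame", Map.of("ids", "abc")));
        String[] path = { "Stacktrace", "frame", "ids" }; // split once, reuse per document
        System.out.println(eval(path, doc)); // abc
    }
}
```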
This updates the Gradle wrapper to 7.6.1. This patch release contains a
fix for the incremental-compilation issue with Java modules that we
raised against Gradle 7.6;
see https://github.com/gradle/gradle/issues/23067
Before Jackson 2.14.2, Elasticsearch had to override Jackson locally to avoid a bug when filtering empty arrays (#92984).
This commit reverts the local override and upgrades Jackson to 2.14.2, which contains the fix for that bug.
In #92984 we override a file in the Jackson jar, but we rely on Gradle internals which might change at any point.
This commit fixes that by excluding an element from the jar and allowing a new class to be added.
While Jackson 2.14.2 with FasterXML/jackson-core#882 is still not released,
we want to patch the jackson-core used by x-content with the modified class that fixes the bug (#92480).
Closes #92480