235 commits

Author SHA1 Message Date
Carlos Delgado
26e8a7602a
Do not exclude empty arrays or empty objects in source filtering with Jackson streaming (#112250) (#115223)
(cherry picked from commit 6be3036c01)

Co-authored-by: mccheah <mcheah@palantir.com>
2024-10-21 16:29:16 +02:00
David Turner
98209e44de
Simplify XContent output of epoch times (#114491) (#114736)
Today the overloads of `XContentBuilder#timeField` do two rather
different things: one formats an object as a `String` representation of
a time (where the object is either an unambiguous time object or else a
`long`) and the other formats only a `long` as one or two fields
depending on the `?human` flag.

This is trappy in a number of ways:

- `long` means an absolute (epoch) time, but sometimes folks will
  mistakenly use this for time intervals too.

- `long` means only milliseconds, there is no facility to specify a
  different unit.

- the dependence on the `?human` flag in exactly one of the overloads is
  kinda weird.

This commit removes the confusion by dropping support for considering a
`Long` as a valid representation of a time at all, and instead requiring
callers to either convert it into a proper time object or else call a
method that is explicitly expecting an epoch time in milliseconds.
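
A minimal sketch of the resulting split, using hypothetical helper names rather than the actual `XContentBuilder` methods: one method takes only a real time object, and the other states in its name that the `long` is an epoch time in milliseconds.

```java
import java.time.Instant;
import java.time.format.DateTimeFormatter;

// Hypothetical helper for illustration; not the XContentBuilder API.
final class TimeFields {
    // Accepts only an unambiguous time object.
    static String formatTime(Instant time) {
        return DateTimeFormatter.ISO_INSTANT.format(time);
    }

    // The name carries both the meaning (absolute epoch time) and the unit (milliseconds),
    // so a duration cannot be passed here by accident without it looking wrong at the call site.
    static String formatEpochMillis(long epochMillis) {
        return formatTime(Instant.ofEpochMilli(epochMillis));
    }
}
```

A caller holding a `long` must now pick `formatEpochMillis` explicitly or convert to an `Instant` first.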
2024-10-15 03:46:31 +11:00
Mark Vieira
0279c0a909
Add AGPLv3 as a supported license 2024-09-13 14:30:33 -07:00
Benjamin Trent
281ee04f7a
JSON parse failures should be 4xx codes (#112703)
It seems that if there isn't any text to parse, this is not an internal
issue but an argument issue.

I simply changed the exception thrown. If we don't agree with this, I
can adjust `query` parsing directly, but this seemed like the better
choice.

closes: https://github.com/elastic/elasticsearch/issues/112296
2024-09-12 00:15:56 +10:00
Christoph Büscher
000ebaf7c2
Json parsing exceptions should not cause 500 errors (#111548)
Currently we wrap JsonEOFException from advancing the json parser into our own
XContentEOFException, but this has the drawback that it results in 500 errors on
the client side. Instead these should be 400 errors.
This changes XContentEOFException to extend XContentParseException so we report
a 400 error instead.
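
To illustrate why the subclassing matters, here is a sketch with stand-in classes (`ParseFailure`, `EofFailure` and `StatusMapper` are hypothetical names, not the real exception types or the real status mapping): a status lookup that keys on the parse exception type picks up the EOF case automatically once it is a subclass.

```java
// Stand-in classes for illustration only.
class ParseFailure extends RuntimeException {
    ParseFailure(String message) {
        super(message);
    }
}

// Because this extends ParseFailure, any handler for parse failures also covers it.
class EofFailure extends ParseFailure {
    EofFailure(String message) {
        super(message);
    }
}

final class StatusMapper {
    // Parse failures are the client's fault (400); anything else is treated as internal (500).
    static int statusFor(Throwable t) {
        return t instanceof ParseFailure ? 400 : 500;
    }
}
```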

Closes #111542
2024-09-06 09:13:30 +02:00
Nhat Nguyen
98fe686da4
Upgrade xcontent to Jackson 2.17.2 (#112320)
Avoid FasterXML/jackson-core#1256
2024-08-28 15:59:12 -07:00
Ryan Ernst
ba87a48833
Handle BigInteger in xcontent copy (#111937)
When xcontent is copied, the parse tree is walked and each element is
passed to the given generator. In the case of numbers, BigInteger is
currently not handled. Although arbitrary precision BigIntegers are not
supported in Elasticsearch, they appear in xcontent when using unsigned
long fields. This commit adds handling for that case, and also ensures
all token types are handled. Note that BigDecimal values are not supported at
all since double is the largest floating point mapper supported.

closes #111812
2024-08-22 00:12:02 +10:00
Ryan Ernst
5a10545d37
Upgrade xcontent to Jackson 2.17.0 (#111948) 2024-08-20 06:07:08 -07:00
Patrick Doyle
4516400143
More XContent long coercion cases (#111641)
* More XContent long coercion cases

* spotless
2024-08-14 08:08:58 -04:00
Mikhail Berezovskiy
1163d2e4f9
Rename streamContent/Separator to bulkContent/Separator (#111716)
Rename `xContent.streamSeparator()` and
`RestHandler.supportsStreamContent()` to `xContent.bulkSeparator()` and
`RestHandler.supportsBulkContent()`.

I want to reserve use of "supportsStreamContent" for current work in
HTTP layer to [support incremental content
handling](https://github.com/elastic/elasticsearch/pull/111438) besides
fully aggregated byte buffers. `supportsStreamContent` would indicate
that handler can parse chunks of http content as they arrive.
2024-08-09 06:32:20 +10:00
Simon Cooper
17f819269a
Check the scale before converting xcontent long values, rather than the absolute value (#111538)
Large numbers are rejected, small numbers rounded to zero (if rounding enabled)
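
A rough sketch of the idea, not the code in this commit (`LongCoercion.toLong` and its `coerce` flag are assumptions for illustration): looking at precision and scale tells us the magnitude of the number without ever expanding a huge exponent into a `BigInteger`.

```java
import java.math.BigDecimal;

final class LongCoercion {
    static long toLong(String text, boolean coerce) {
        BigDecimal value = new BigDecimal(text);
        if (value.signum() == 0) {
            return 0L;
        }
        // Digits left of the decimal point, computed from metadata only.
        long integerDigits = (long) value.precision() - value.scale();
        if (integerDigits > 19) { // Long.MAX_VALUE has 19 digits
            throw new IllegalArgumentException("Value [" + text + "] is out of range for a long");
        }
        if (integerDigits <= 0) { // |value| < 1
            if (coerce) {
                return 0L; // small numbers round to zero when rounding is enabled
            }
            throw new IllegalArgumentException("Value [" + text + "] has a decimal part");
        }
        try {
            // Safe to expand now: the number has at most 19 integer digits.
            return coerce ? value.toBigInteger().longValueExact() : value.longValueExact();
        } catch (ArithmeticException e) {
            throw new IllegalArgumentException("Value [" + text + "] cannot be converted to a long", e);
        }
    }
}
```

With this shape, an input such as `1E1000000` is rejected after a cheap integer comparison instead of after materializing a million-digit integer.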
2024-08-05 10:31:09 +01:00
Oleksandr Kolomiiets
5440f178aa
Support synthetic source for geo_point when ignore_malformed is used (#109651) 2024-06-18 08:37:27 -07:00
Armin Braun
8bc84b6e37
Make XContentType.xContent() a getter (#109264)
Seen this come up in some profiling, wasting some cycles. If we do a
method per type here instead of a getter + field, we pay for a
megamorphic callsite potentially. It's faster and uses less code
anyway to just use a field + getter here.
2024-06-05 11:14:12 +02:00
Daniel Mitterdorfer
890bd4b8a5
Consider context in raw serialization (#106163)
With this commit we use `writeRawValue` instead of `writeRaw` when
serializing raw strings as XContent. The latter method does not consider
context (e.g. is the value being written as part of an array and
requires a comma separator?) whereas the former does. This ensures that
pre-rendered double values as we use them in the flamegraph response are
rendered correctly as XContent.

Closes #106103
2024-03-11 13:48:12 +01:00
Ryan Ernst
83585315fe
Only apply build to direct libs (#106101)
Sometimes libs have subprojects that may not be java projects. This commit adjusts the shared
configuration for libs to only affect direct subprojects of :lib.
2024-03-08 13:48:26 -08:00
Daniel Mitterdorfer
7179c12b24
[Profiling] Speed up serialization of flamegraph (#105779)
The response of the flamegraph is quite large: A typical response can
easily reach 50MB (uncompressed). In order to reduce memory pressure and
also to start sending the response sooner, we chunk the response.
However, this leads to many chunks that are very small and lead to high
overhead. In our experiments, just the serialization takes more than
500ms.

With this commit we take the following measures:

1. We split the response into chunks only when it makes sense and
   otherwise send one larger chunk.
2. Serialization of doubles is very expensive: Just the serialization of
   annual CO2 tons takes around 80ms in our test setup. Therefore, we
   apply a custom serialization that is both faster than the builtin
   serialization and reduces the number of bytes sent over the wire
   because we round to four decimal places (which is more than
   sufficient for our purposes).
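
For illustration, one way to emit a double with exactly four decimal places without going through the general-purpose double-to-string path. This is a sketch under stated assumptions (finite input whose magnitude times 10,000 fits in a `long`), with a hypothetical class name; it is not the serializer added by this commit, and any speedup would need to be confirmed by a benchmark.

```java
final class FourDecimals {
    // Assumes a finite value with |value| * 10_000 well inside the long range.
    static String format(double value) {
        long scaled = Math.round(Math.abs(value) * 10_000d);
        StringBuilder sb = new StringBuilder(24);
        if (value < 0 && scaled != 0) {
            sb.append('-'); // avoid printing "-0.0000"
        }
        sb.append(scaled / 10_000).append('.');
        long fraction = scaled % 10_000;
        // Left-pad the fractional part to four digits.
        if (fraction < 1000) sb.append('0');
        if (fraction < 100) sb.append('0');
        if (fraction < 10) sb.append('0');
        return sb.append(fraction).toString();
    }
}
```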
2024-03-07 15:31:02 +01:00
Dmitry Cherniachenko
e21a4874ab
Use String.replace() instead of replaceAll() for non-regexp replacements (#105127)
* Use String.replace() instead of replaceAll() for non-regexp replacements

When arguments do not make use of regexp features replace() is a more efficient option, especially the char-variant.
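
A small example of the difference (class and method names are illustrative only): `replaceAll` interprets its first argument as a regex, while `replace` treats it literally.

```java
final class ReplaceDemo {
    // Regex semantics: "." matches any character, so every character is replaced.
    static String viaReplaceAll(String s) {
        return s.replaceAll(".", "/");
    }

    // Literal semantics, char variant: only actual dots are replaced and no regex is involved.
    static String viaReplace(String s) {
        return s.replace('.', '/');
    }
}
```

For input `a.b`, the first returns `///` and the second returns `a/b`. Getting the literal behaviour out of `replaceAll` would require escaping the dot as `"\\."`.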
2024-02-12 13:11:15 -05:00
Dmitry Cherniachenko
263ea5e987
Replace generic HashSet / HashMap with more efficient EnumSet / EnumMap (#105238) 2024-02-08 13:43:14 +00:00
David Turner
3b7b86c507
Simplify ChunkedToXContentHelper#singleChunk (#105225)
There's no need for this helper to take more than one argument. Almost
all the usages only passed in a single argument, and the few cases that
supplied more than one can be rewritten as a single argument to save
allocating all those extra lambdas.
2024-02-07 03:53:02 -05:00
Ryan Ernst
b250f06b09
Add a gradle plugin for embedded providers (#105094)
x-content embeds its jackson implementation inside its jar. This commit
formalizes the setup for this embedding with a gradle plugin so that it
can be reused by other libs.
2024-02-05 15:21:52 -05:00
Armin Braun
50bafd306c
Save allocating enum values array in two hot spots (#104952)
Our readEnum code instantiates/clones enum value arrays on read.
Normally, this doesn't matter much but the two spots adjusted here are
visibly hot during bulk indexing, causing GBs of allocations during e.g.
the http_logs indexing run.
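
The general pattern looks roughly like this (`OpType` is a made-up enum for illustration, not one of the two spots changed here): `values()` returns a fresh copy of the array on every call, so a hot path can cache it once.

```java
enum OpType {
    INDEX, CREATE, UPDATE, DELETE;

    // values() clones its backing array on each call; keep one private copy instead.
    private static final OpType[] VALUES = values();

    // Lookup by ordinal without allocating. The cached array must never be exposed or mutated.
    static OpType fromOrdinal(int ordinal) {
        return VALUES[ordinal];
    }
}
```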
2024-01-31 11:26:36 +01:00
Stuart Tettemer
a359b1f648
Relax limit on max string size in CBOR, Smile, YAML (#103930)
Remove the rough limit on string length from Jackson 2.15. The limit was already relaxed for JSON in #96031, this extends that change to other XContent types.

Refs: #96031
Fixes: #104009
2024-01-08 13:31:54 -06:00
Armin Braun
49f1b5b787
Make sure to close XContentParser in more spots (#103504)
We're leaking quite a few of these parsers. That doesn't seem to be much
of a problem but results in some memory inefficiencies in Jackson here
and there. This PR bulk fixes a bunch of instances that I could easily
automatically fix. I'll open a follow-up for closing the parser on the
document parsing context which also suffers from this but is non-trivial
to fix.
2023-12-19 10:26:06 +01:00
Rene Groeschke
f7ba5efcb0
Fix generation of xcontent provider Manifest (#101200)
Fixes #101191
2023-10-23 06:35:02 -04:00
Rene Groeschke
d9ca42bf7d
Use custom task implementation for use generate manifest (#101165)
Follow up on #101161 to make this behave better when using gradle
configuration cache
2023-10-19 17:18:21 -04:00
Armin Braun
b7eafce32c
Make some practically static methods static (#97565)
Another round of automated fixes to this, marking things that can be
made static as static. Saves some JIT cycles but also turns some lambdas
from capturing to non-capturing and makes the "utilityness" of some
classes visible.
2023-10-06 23:37:07 +02:00
Armin Braun
16dd1e69e9
Add some type specific overrides to XContentBuilder (#99110)
We can add a couple more overrides here that resolve just fine to avoid
the slow `field(Object)` path here and there.
2023-09-01 08:01:13 +02:00
Armin Braun
37d55dac1c
Speed up String array writes to XContent (#98957)
Jackson has a direct method for writing string arrays
that saves us some of the indirection we have when looping
over a string array. This normally doesn't gain much, but for extreme
cases like long index name lists in field caps it saves a couple percent
in CPU time.
2023-08-30 12:02:41 +02:00
Matteo Piergiovanni
392c497551
Automatically flatten objects when subobjects:false (#97972)
When ingesting documents that contain nested objects while the
mapping property `subobjects` is set to false, instead of throwing
a mapping exception and dropping the document(s), we map only the
leaf field(s), using their full dot-separated path as the name.
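
Conceptually the flattening behaves like the following sketch over plain maps. `Flattener` is a hypothetical helper for illustration; the real change operates on the document parser and mappings, not on `Map` instances.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class Flattener {
    // Returns only leaf values, keyed by their full dot-separated path.
    static Map<String, Object> flatten(Map<String, ?> source) {
        Map<String, Object> out = new LinkedHashMap<>();
        collect("", source, out);
        return out;
    }

    private static void collect(String prefix, Map<?, ?> node, Map<String, Object> out) {
        for (Map.Entry<?, ?> entry : node.entrySet()) {
            String path = prefix.isEmpty() ? String.valueOf(entry.getKey()) : prefix + "." + entry.getKey();
            if (entry.getValue() instanceof Map<?, ?>) {
                collect(path, (Map<?, ?>) entry.getValue(), out); // descend into the nested object
            } else {
                out.put(path, entry.getValue()); // leaf: keep it under its full path
            }
        }
    }
}
```

So `{"foo": {"bar": 1, "baz": {"qux": "x"}}}` yields the two leaf fields `foo.bar` and `foo.baz.qux`.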
2023-08-24 18:28:57 +02:00
Rene Groeschke
b8627079b4
Update Gradle Wrapper to 8.2 (#96686)
- Convention usage has been deprecated and was fixed in our build files
- Fix test dependencies and deprecation
2023-07-04 15:35:15 +02:00
Mary Gouseti
f87c2c7758
Introduce downsampling configuration for data stream lifecycle (#97041)
This PR introduces downsampling configuration to the data stream lifecycle. Keep in mind downsampling implementation will come in a follow up PR. Configuration looks like this:
```
{
  "lifecycle": {
    "data_retention": "90d",
    "downsampling": [
      {
        "after": "1d",
        "fixed_interval": "2h"
      },
      { "after": "15d", "fixed_interval": "1d" },
      { "after": "30d", "fixed_interval": "1w" }
    ]
  }
}
```
We will also support using `null` to unset downsampling configuration during template composition:
```
{
  "lifecycle": {
    "data_retention": "90d",
    "downsampling": null
  }
}
```
2023-06-29 16:41:17 +03:00
Przemyslaw Gomulka
66a951e270
Fix tchar pattern in RestRequest (#96406)
This commit fixes the incorrect pattern for TChar defined in RFC7230 section 3.2.6.
`a-zA-z` was accidentally used; the pattern `a-zA-Z` should be used instead.
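
A reduced illustration of the typo (the patterns below are cut down to just the letter range, not the full tchar pattern from `RestRequest`): the range `A-z` also spans the ASCII characters between `Z` and `a`, which include `[`, `\` and `]`, none of which are tchars.

```java
import java.util.regex.Pattern;

final class TCharRange {
    // "A-z" covers 'A'..'z' in ASCII order, which includes [ \ ] ^ _ ` in between.
    static final Pattern TYPO = Pattern.compile("[a-zA-z]+");

    // Letters only.
    static final Pattern FIXED = Pattern.compile("[a-zA-Z]+");
}
```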
2023-05-30 16:31:00 +02:00
Ryan Ernst
1208c02cee
Relax limit on max string size (#96031)
Jackson 2.15 introduced a (rough) maximum limit on string length. This
commit relaxes that limit to its maximum size, leaving document size
constraints to other existing limits in the system. We can revisit
whether string length within a document should be independently
constrained later.
2023-05-11 08:54:27 -07:00
Ryan Ernst
8b8a2be7dd
Upgrade Jackson xml to 2.15.0 (#95641)
Additionally this commit updates snakeyaml to 2.0 as that is the version
now used by Jackson.
2023-05-02 13:59:17 -07:00
Armin Braun
e61ab5a86e
Speed up ParsedMediaType.parseMediaType (#95305)
I saw this in some hot-threads. Splitting by a pattern that isn't a single char is expensive
because it instantiates a `Pattern`. Seems like it's redundant to split the spaces+tabs away anyway since
we trim values and keys later on in the logic.
-> let's use the split fast path and not have this on the transport thread.
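
Roughly what the fast-path approach looks like (a sketch with a hypothetical `MediaTypeSplit` helper, not the actual `ParsedMediaType` code): in OpenJDK, `String.split` with a single non-metacharacter delimiter avoids compiling a `Pattern`, and the whitespace is handled by trimming each piece afterwards.

```java
import java.util.ArrayList;
import java.util.List;

final class MediaTypeSplit {
    static List<String> parts(String header) {
        List<String> out = new ArrayList<>();
        // Single-character delimiter: no Pattern is compiled for this call in OpenJDK.
        for (String part : header.split(";")) {
            out.add(part.trim()); // strips the surrounding spaces and tabs
        }
        return out;
    }
}
```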
2023-04-28 11:02:51 +02:00
David Turner
c282f50f80
Deeper chunking of node stats response (#95060)
Pushes the chunking of `GET _nodes/stats` down to avoid creating
unboundedly large chunks. With this commit we yield one chunk per shard
(if `?level=shards`) or index (if `?level=indices`) and per HTTP client
and per transport action.

Closes #93985
2023-04-06 01:26:41 -04:00
Rory Hunter
fe1083f6c5
Upgrade spotless plugin to 6.17.0 (#94994)
Fixes #82794. Upgrade the spotless plugin, which addresses the issue
around formatting `instanceof` expressions. Formatting of statements
including lambdas seems to have improved too.
2023-04-04 10:03:32 +01:00
Howard
03dcad7ff3
Fix comment line issue. (#94759) 2023-03-29 12:38:57 -05:00
Ryan Ernst
8c554029d6
Add helper to assert writeability to x-content (#94847)
When writing generic objects to x-content the value may cause an error
if XContentBuilder does not know how to understand the concrete object
type. This commit adds a new helper method, similar to
StreamOutput.checkWriteable, which validates the type of an object (and
any inner objects if it is a collection) are writeable to x-content.
2023-03-28 22:34:37 -04:00
Simon Cooper
c6487f64f2
Use double wildcards for JSON filtered excludes properly (#94195) 2023-03-10 08:50:28 +00:00
Armin Braun
c807a096c0
Speed up DotExpandingXContentParser.expandDots (#94315)
String.split is very visibly slow in profiling the BeatsMapperBenchmark.
The reason for that is mainly in its creation of an intermediary `ArrayList`
and the generality of the method.
We can do a much faster split by only doing the split method's operations
for a char and avoiding the list by pre-counting the dots (which we somewhat
get for free since we do the upfront check for `contains('.')`, so we can use
counting the dots for that as well).
This speeds up the beats mapper benchmark from 9k ns to about 8.4k ns on my x86 workstation
and from 6k ns to about 5.5k ns on my M1 MBP, which I think is considerable enough to justify the
small amount of added complexity.
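
The pre-counting idea, sketched in isolation (`DotSplit.splitOnDots` is an illustrative name, not the method in `DotExpandingXContentParser`): one pass counts the dots, which doubles as the "contains a dot" check, and the result array is then sized exactly with no intermediate list.

```java
final class DotSplit {
    static String[] splitOnDots(String field) {
        int dots = 0;
        for (int i = 0; i < field.length(); i++) {
            if (field.charAt(i) == '.') {
                dots++;
            }
        }
        if (dots == 0) {
            return new String[] { field }; // nothing to expand
        }
        String[] parts = new String[dots + 1]; // exact size known up front
        int start = 0;
        for (int i = 0; i < dots; i++) {
            int end = field.indexOf('.', start);
            parts[i] = field.substring(start, end);
            start = end + 1;
        }
        parts[dots] = field.substring(start);
        return parts;
    }
}
```

Unlike `String.split`, this sketch keeps trailing empty segments (`"a."` gives `["a", ""]`), so callers that care about such inputs need to validate them separately.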
2023-03-08 06:35:49 +01:00
Daniel Mitterdorfer
299eff5496
[Profiling] Map stack frames more efficiently (#94327)
The mapping code for stack frames uses the utility method
`ObjectPath#eval` to read nested properties. Callers need to pass a
dot-separated path which is then split internally via a regex. This is
quite slow: in a typical deployment we saw overheads of 50ms just for
mapping stack frames (total response time is ~ 1 second).

With this commit we pass the property path as an array to avoid this
overhead. In a microbenchmark the new implementation was 23 times faster
than the current one.
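
In spirit, the array-based lookup is something like the sketch below (`PathEval` is a hypothetical stand-in, not the real `ObjectPath` API): the caller supplies the path segments directly, so nothing has to be split at call time.

```java
import java.util.Map;

final class PathEval {
    // Walks nested maps using pre-split path segments; returns null if the path does not resolve.
    static Object eval(Map<String, ?> source, String... path) {
        Object current = source;
        for (String key : path) {
            if (!(current instanceof Map<?, ?>)) {
                return null;
            }
            current = ((Map<?, ?>) current).get(key);
        }
        return current;
    }
}
```

Callers on a hot path can keep the segment arrays in constants so that no per-call allocation or regex work remains.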
2023-03-06 15:56:53 +01:00
Rene Groeschke
08845b78f2
Update Gradle Wrapper to 7.6.1 (#89796) (#92241) (#94122)
This updates the gradle wrapper to 7.6.1. This patch release contains a
fix for incremental compilation of java modules that we raised against
gradle 7.6

see https://github.com/gradle/gradle/issues/23067
2023-02-24 11:48:08 -05:00
Przemyslaw Gomulka
d065d4b76d
Remove jackson override and upgrade to jackson to 2.14.2 (#93342)
Before jackson 2.14.2, Elasticsearch had to override jackson locally to avoid a bug when filtering empty arrays (#92984).
This commit reverts the local override and upgrades jackson to 2.14.2, which contains the fix for the bug.
2023-01-30 16:58:09 +01:00
Luca Cavanna
edd7749164
Upgrade to lucene-9.5.0-snapshot-d19c3e2e0ed (#92957)
9.5 will include several changes related to vector search. An extensive list is available at https://github.com/apache/lucene/milestone/4 .
2023-01-19 14:07:33 +01:00
Przemyslaw Gomulka
26ccfab8bb
Exclude jackson patched class from spotlessApply (#93059) 2023-01-18 19:08:32 +01:00
Przemyslaw Gomulka
8f37934a76
Exclude the class from jackson jar (#93052)
In #92984 we override a file in the jackson jar, but we rely on gradle internals which might change at any point.
This fixes this by excluding an element from the jar and allowing a new class to be added.
2023-01-18 16:59:09 +01:00
Przemyslaw Gomulka
441e77c8cf
Patch jackson-core with locally modified class (#92984)
While jackson 2.14.2 with FasterXML/jackson-core#882 is still not released,
we want to patch the jackson-core used by x-content with a modified class that fixes bug #92480.

closes #92480
2023-01-18 14:48:14 +01:00
Przemyslaw Gomulka
d19721b701
Update jackson to 2.14.1 (#92990)
Closes #92341
2023-01-17 16:30:49 +01:00
Simon Cooper
4d37feea8c
Update bug url for failing test (#92633)
Remove references to fixed issue in jackson. Test is still failing for other reasons (#92632)
2023-01-03 14:31:05 +00:00