[backport 8.10] Expanded the description how to size the memory of the JVM (#15210) (#15350)

Expands the description of memory used by Logstash, dividing the heap and non-heap; describing in details which parts composes the non-heap, how to size it and list the JVM settings that can be used to properly define this space. Co-authored-by: Karen Metts <35154725+karenzone@users.noreply.github.com> Co-authored-by: João Duarte <jsvd@users.noreply.github.com> (cherry picked from commit b31c429d19) Co-authored-by: Andrea Selva <selva.andre@gmail.com>
2025-04-24 14:47:19 -04:00 · 2023-09-26 11:39:48 +02:00 · 2023-09-26 11:39:48 +02:00 · 69ea40cc9d
commit 69ea40cc9d
parent 872f5a177b
1 changed files with 74 additions and 5 deletions
--- a/docs/static/config-details.asciidoc
+++ b/docs/static/config-details.asciidoc
@ -49,9 +49,19 @@ JVM falls in the inclusive range of the two numbers

 * all other lines are rejected

+[[memory-size]]
+==== Setting the memory size
+
+The memory of the JVM executing {ls} can be divided in two zones: heap and off-heap memory.
+In the heap refers to Java heap, which contains all the Java objects created by {ls} during its operation, see <<heap-size>> for
+description on how to size it.
+What's not part of the heap is named off-heap and consists of memory that can be used and controlled by {ls}, generally
+thread stacks, direct memory and memory mapped pages, check <<off-heap-size>> for comprehensive descriptions.
+In off-heap space there is some space which is used by JVM and contains all the data structures functional to the execution
+of the virtual machine. This memory can't be controlled by {ls} and the settings are rarely customized.

 [[heap-size]]
-==== Setting the JVM heap size
+===== Setting the JVM heap size

 Here are some tips for adjusting the JVM heap size:

@ -77,15 +87,74 @@ process.
 info, see <<profiling-the-heap>>.
 // end::heap-size-tips[]

-
 [[off-heap-size]]
-==== Setting the off-heap size
+===== Setting the off-heap size

 The operating system, persistent queue mmap pages, direct memory, and other processes require memory in addition to memory allocated to heap size.
-Keep the overall memory requirements in mind when you allocate memory.

+Internal JVM data structures, thread stacks, memory mapped files and direct memory for input/output (IO) operations are all parts of the off-heap JVM memory.
+Memory mapped files are not part of the Logstash's process off-heap memory, but consume RAM when paging files from disk.
+These mapped files speed up the access to Persistent Queues pages, a performance improvement - or trade off - to reduce expensive disk operations such as read, write, and seek.
+Some network I/O operations also resort to in-process direct memory usage to avoid, for example, copying of buffers between network sockets. Input plugins such as Elastic Agent, Beats, TCP, and HTTP inputs, use direct memory.
+The zone for Thread stacks contains the list of stack frames for each Java thread created by the JVM; each frame keeps the local arguments passed during method calls.
+Read on <<stacks-size>> if the size needs to be adapted to the processing needs.
+
+Plugins, depending on their type (inputs, filters, and outputs), have different thread models.
+Every input plugin runs in its own thread and can potentially spawn others. For example, each JDBC input
+plugin launches a scheduler thread. Netty based plugins like TCP, Beats or HTTP input spawn a thread pool with 2 * number_of_cores threads.
+Output plugins may also start helper threads, such as a connection management thread for each
+{es} output instance.
+Every pipeline, also, has its own thread responsible to manage the pipeline lifecycle.
+
+To summarize, we have 3 categories of memory usage, where 2 can be limited by the JVM and the other relies on available, free memory:
+
+[cols="<,<,<",options="header",]
+|=====
+| Memory Type | Configured using | Used by
+| JVM Heap  |   -Xmx   | any normal object allocation
+| JVM direct memory |   -XX:MaxDirectMemorySize   | beats, tcp and http inputs
+| Native memory  |  N/A   | Persistent Queue Pages, Thread Stacks
+|=====
+
+Keep these memory requirements in mind as you calculate your ideal memory allocation.
+
+[[memory-size-calculation]]
+===== Memory sizing
+
+Total JVM memory allocation must be estimated and is controlled indirectly using Java heap and direct memory settings.
 By default, a JVM's off-heap direct memory limit is the same as the heap size. Check out <<plugins-inputs-beats-memory,beats input memory usage>>.
-Consider setting `-XX:MaxDirectMemorySize` to half of the heap size.
+Consider setting `-XX:MaxDirectMemorySize` to half of the heap size or any value that can accommodate the load you expect these plugins to handle.
+
+As you make your capacity calculations, keep in mind that the JVM can't consume the total amount of the host's memory available,
+as the Operating System and other processes will require memory too.
+
+For a {ls} instance with persistent queue (PQ) enabled on multiple pipelines, we could
+estimate memory consumption using:
+
+[source,text]
+-----
+pipelines number * (pipeline threads * stack size + 2 * PQ page size) + direct memory + Java heap
+-----
+
+NOTE: Each Persistent Queue requires that at least head and tail pages are present and accessible in memory.
+The default page size is 64 MB so each PQ requires at least 128 MB of heap memory, which can be a significant source
+of memory consumption per pipeline. Note that the size of memory mapped file can't be limited with an upper bound.
+
+NOTE: Stack size is a setting that depends on the JVM used, but could be customized with `-Xss` setting.
+
+NOTE: Direct memory space by default is big as much as Java heap, but can be customized with the `-XX:MaxDirectMemorySize` setting.
+
+**Example**
+
+Consider a {ls} instance running 10 pipelines, with simple input and output plugins that doesn't start additional threads,
+it has 1 pipelines thread, 1 input plugin thread and 12 workers, summing up to 14.
+Keep in mind that, by default, JVM allocates direct memory equal to memory allocated for Java heap.
+
+The calculation results in:
+
+* native memory: 1.4Gb  [derived from 10 * (14 * 1Mb + 128Mb)]
+* direct memory: 4Gb
+* Java heap: 4Gb


 [[stacks-size]]