Aggregations: Add Holt-Winters model to moving_avg pipeline aggregation

Closes #11043
Zachary Tong 2015-05-06 16:13:11 -04:00
parent cbb7b633f6
commit 491afbe01c
18 changed files with 1204 additions and 99 deletions


@ -180,11 +180,11 @@ The default value of `alpha` is `0.5`, and the setting accepts any float from 0-
[[single_0.2alpha]]
.Single Exponential moving average with window of size 10, alpha = 0.2
.EWMA with window of size 10, alpha = 0.2
image::images/pipeline_movavg/single_0.2alpha.png[]
[[single_0.7alpha]]
.Single Exponential moving average with window of size 10, alpha = 0.7
.EWMA with window of size 10, alpha = 0.7
image::images/pipeline_movavg/single_0.7alpha.png[]
==== Holt-Linear
@ -223,13 +223,111 @@ to see. Small values emphasize long-term trends (such as a constant linear tren
values emphasize short-term trends. This will become more apparent when you are predicting values.
[[double_0.2beta]]
.Double Exponential moving average with window of size 100, alpha = 0.5, beta = 0.2
.Holt-Linear moving average with window of size 100, alpha = 0.5, beta = 0.2
image::images/pipeline_movavg/double_0.2beta.png[]
[[double_0.7beta]]
.Double Exponential moving average with window of size 100, alpha = 0.5, beta = 0.7
.Holt-Linear moving average with window of size 100, alpha = 0.5, beta = 0.7
image::images/pipeline_movavg/double_0.7beta.png[]
==== Holt-Winters
The `holt_winters` model (aka "triple exponential") incorporates a third exponential term which
tracks the seasonal aspect of your data. This aggregation therefore smooths based on three components: "level", "trend"
and "seasonality".
The level and trend calculation is identical to `holt`. The seasonal calculation looks at the difference between
the current point and the point one period earlier.
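To make the three components concrete, here is a minimal sketch of the textbook additive Holt-Winters recurrences. It is an illustration only and not necessarily how the aggregation computes its values: the function name `holt_winters_additive` and the initialization scheme (level and seasonal offsets from the first period, trend from the change between the first two periods) are assumptions made for this example.

```python
def holt_winters_additive(values, period, alpha=0.5, beta=0.5, gamma=0.5):
    """Return one-step-ahead fitted values for values[period:].

    Textbook additive Holt-Winters; illustrative only.
    """
    m = period
    if len(values) < 2 * m:
        # Two full periods are needed to initialize level, trend and seasonality.
        raise ValueError("need at least two full periods of data")

    first_mean = sum(values[:m]) / m
    second_mean = sum(values[m:2 * m]) / m

    level = first_mean                           # initial "level"
    trend = (second_mean - first_mean) / m       # initial per-step "trend"
    seasonal = [values[i] - first_mean for i in range(m)]  # initial "seasonality"

    fitted = []
    for t in range(m, len(values)):
        s_prev = seasonal[t % m]                 # seasonal term from one period earlier
        fitted.append(level + trend + s_prev)    # prediction made before seeing values[t]

        new_level = alpha * (values[t] - s_prev) + (1 - alpha) * (level + trend)
        trend = beta * (new_level - level) + (1 - beta) * trend
        seasonal[t % m] = gamma * (values[t] - new_level) + (1 - gamma) * s_prev
        level = new_level
    return fitted


# A perfectly periodic series with no trend is reproduced exactly.
data = [1.0, 2.0, 3.0] * 4
fitted = holt_winters_additive(data, period=3)
```

On a perfectly repeating series the fitted values match the data, because the level stays constant, the trend stays at zero, and the seasonal offsets never change.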
Holt-Winters requires a little more handholding than the other moving averages. You need to specify the "periodicity"
of your data: e.g. if your data has cyclic trends every 7 days, you would set `period: 7`. Similarly if there was
a monthly trend, you would set it to `30`. There is currently no periodicity detection, although that is planned
for future enhancements.
There are two varieties of Holt-Winters: additive and multiplicative.
===== "Cold Start"
Unfortunately, due to the nature of Holt-Winters, it requires two periods of data to "bootstrap" the algorithm. This
means that your `window` must always be *at least* twice the size of your period. An exception will be thrown if it
isn't. It also means that Holt-Winters will not emit a value for the first `2 * period` buckets; the current algorithm
does not backcast.
[[holt_winters_cold_start]]
.Holt-Winters showing a "cold" start where no values are emitted
image::images/reducers_movavg/triple_untruncated.png[]
Because the "cold start" obscures what the moving average looks like, the rest of the Holt-Winters images are truncated
to not show the "cold start". Just be aware this will always be present at the beginning of your moving averages!
===== Additive Holt-Winters
Additive seasonality is the default; it can also be specified by setting `"type": "add"`. This variety is preferred
when the seasonal effect is additive to your data. For example, you could simply subtract the seasonal effect to
"de-seasonalize" your data into a flat trend.
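As a small worked illustration of what "additive" means here, the sketch below subtracts a known seasonal offset from each point. The helper `deseasonalize_additive` and the weekly offsets are hypothetical and invented for this example; they are not part of the aggregation.

```python
def deseasonalize_additive(values, seasonal_offsets):
    """Subtract the repeating seasonal offset from each value."""
    period = len(seasonal_offsets)
    return [v - seasonal_offsets[i % period] for i, v in enumerate(values)]


# Hypothetical weekly pattern: a bump of 5 on the last two days of each week.
weekly_offsets = [0, 0, 0, 0, 0, 5, 5]

# Two weeks of data: a flat baseline of 10 plus the seasonal bump.
observed = [10 + weekly_offsets[i % 7] for i in range(14)]

flat = deseasonalize_additive(observed, weekly_offsets)
```

With purely additive seasonality, removing the offsets leaves only the flat baseline.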
The default value of `alpha`, `beta` and `gamma` is `0.5`, and the settings accept any float from 0-1 inclusive.
The default value of `period` is `1`.
[source,js]
--------------------------------------------------
{
    "the_movavg":{
        "moving_avg":{
            "buckets_path": "the_sum",
            "model" : "holt_winters",
            "settings" : {
                "type" : "add",
                "alpha" : 0.5,
                "beta" : 0.5,
                "gamma" : 0.5,
                "period" : 7
            }
        }
    }
}
--------------------------------------------------
[[holt_winters_add]]
.Holt-Winters moving average with window of size 120, alpha = 0.5, beta = 0.7, gamma = 0.3, period = 30
image::images/reducers_movavg/triple.png[]
===== Multiplicative Holt-Winters
Multiplicative is specified by setting `"type": "mult"`. This variety is preferred when the seasonal effect is
multiplied against your data. For example, the seasonal effect might multiply the data by 5 rather than simply adding to it.
The default value of `alpha`, `beta` and `gamma` is `0.5`, and the settings accept any float from 0-1 inclusive.
The default value of `period` is `1`.
[WARNING]
======
Multiplicative Holt-Winters works by dividing each data point by the seasonal value. This is problematic if any of
your data is zero, or if there are gaps in the data (since this results in a divide-by-zero). To combat this, the
`mult` Holt-Winters pads all values by a very small amount (1*10^-10^) so that all values are non-zero. This affects
the result, but only minimally. If your data is non-zero, or you prefer to see `NaN` when zeros are encountered,
you can disable this behavior with `pad: false`.
======
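The effect of padding can be sketched as follows. This is an illustration under stated assumptions rather than the aggregation's actual code: the helper name `pad_values` is made up, and the only detail taken from the text above is the padding amount of 1*10^-10^.

```python
PAD = 1e-10  # padding amount quoted in the warning above


def pad_values(values, pad=True):
    """Shift every value by PAD so that zeros become tiny non-zero numbers.

    With pad=False the values are returned unchanged, so zeros remain zeros.
    """
    return [v + PAD for v in values] if pad else list(values)


raw = [0.0, 4.0, 0.0, 8.0]
padded = pad_values(raw)               # zeros become 1e-10
unpadded = pad_values(raw, pad=False)  # zeros are left as-is
```

Because the shift is so small, non-zero values are essentially unchanged, while former zeros can safely appear in a division.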
[source,js]
--------------------------------------------------
{
    "the_movavg":{
        "moving_avg":{
            "buckets_path": "the_sum",
            "model" : "holt_winters",
            "settings" : {
                "type" : "mult",
                "alpha" : 0.5,
                "beta" : 0.5,
                "gamma" : 0.5,
                "period" : 7,
                "pad" : true
            }
        }
    }
}
--------------------------------------------------
==== Prediction
All the moving average models support a "prediction" mode, which will attempt to extrapolate into the future given the
@ -263,7 +361,7 @@ value, we can extrapolate based on local constant trends (in this case the predi
of the series was heading in a downward direction):
[[double_prediction_local]]
.Double Exponential moving average with window of size 100, predict = 20, alpha = 0.5, beta = 0.8
.Holt-Linear moving average with window of size 100, predict = 20, alpha = 0.5, beta = 0.8
image::images/pipeline_movavg/double_prediction_local.png[]
In contrast, if we choose a small `beta`, the predictions are based on the global constant trend. In this series, the
@ -272,3 +370,10 @@ global trend is slightly positive, so the prediction makes a sharp u-turn and be
[[double_prediction_global]]
.Double Exponential moving average with window of size 100, predict = 20, alpha = 0.5, beta = 0.1
image::images/pipeline_movavg/double_prediction_global.png[]
The `holt_winters` model has the potential to deliver the best predictions, since it also incorporates seasonal
fluctuations into the model:
[[holt_winters_prediction_global]]
.Holt-Winters moving average with window of size 120, predict = 25, alpha = 0.8, beta = 0.2, gamma = 0.7, period = 30
image::images/pipeline_movavg/triple_prediction.png[]
