elasticsearch/qa/ccs-rolling-upgrade-remote-cluster
Nhat Nguyen 541cb02077
Fix search states of CCS requests in mixed cluster (#71265)
Previously, the search states are stored in ReaderContext on data nodes.
Since 7.10, we send them to the coordinating node in a QuerySearchResult
of a ShardSearchRequest and the coordinating node then sends them back
in ShardFetchSearchRequest. We must keep the search states in data nodes
unless they are sent back in the fetch phase. We used the channel version
to determine this guarantee. However, it's not correct in CCS requests in 
mixed clusters.

1. The coordinating node of the local cluster on the old version sends a
ShardSearchRequest to a proxy node of the remote cluster on the new
version. That proxy node delivers the request to the data node. In this
case, the channel version between the data node and the proxy node is 
>= 7.10, but we won't receive the search states in the fetch phase as they
are stripped out in the channel between the old coordinating node and
the new proxy.

```
[coordinating node v7.9] --> [proxy node v7.10] --> [data node on v7.10]
```

2. The coordinating node of the local on the new version sends a
ShardSearchRequest to a proxy node of the remote cluster on the new
version. However, the coordinating node sends a ShardFetchSearchRequest
to another proxy node of the remote cluster that is still on an old
version. The search states then are stripped out and never reach the
data node.

```
-> query phase: [coordinating node v7.10] --> [proxy node v7.10] --> [data node on v7.10]
-> fetch phase: [coordinating node v7.10] --> [proxy node v7.9] --> [data node on v7.10]
```

This commit fixes the first issue by explicitly serializing the channel version in 
a ShardSearchRequest and the second by continue storing the search states 
in ReaderContext unless all nodes are upgraded.

Relates #52741
2021-04-04 09:47:53 -04:00
..
src/test/java/org/elasticsearch/upgrades Fix search states of CCS requests in mixed cluster (#71265) 2021-04-04 09:47:53 -04:00
build.gradle Fix search states of CCS requests in mixed cluster (#71265) 2021-04-04 09:47:53 -04:00