|
1 |
| -# Distributed Area Team Internals |
| 1 | +# Distributed Area Internals |
2 | 2 |
|
3 |
| -(Summary, brief discussion of our features) |
| 3 | +The Distributed Area contains indexing and coordination systems. |
| 4 | + |
| 5 | +The index path stretches from the user REST command through shard routing down to each individual shard's translog and storage |
| 6 | +engine. Reindexing is effectively reading from a source index and writing to a destination index (perhaps on different nodes). |
| 7 | +The coordination side includes cluster coordination, shard allocation, cluster autoscaling stats, task management, and cross |
| 8 | +cluster replication. Less obvious coordination systems include networking, the discovery plugin system, the snapshot/restore |
| 9 | +logic, and shard recovery. |
| 10 | + |
| 11 | +A guide to the general Elasticsearch components can be found [here](https://github.com/elastic/elasticsearch/blob/main/docs/internal/GeneralArchitectureGuide.md). |
4 | 12 |
|
5 | 13 | # Networking
|
6 | 14 |
|
@@ -237,9 +245,101 @@ works in parallel with the storage engine.)
|
237 | 245 |
|
238 | 246 | # Autoscaling
|
239 | 247 |
|
240 |
| -(Reactive and proactive autoscaling. Explain that we surface recommendations, how control plane uses it.) |
241 |
| - |
242 |
| -(Sketch / list the different deciders that we have, and then also how we use information from each to make a recommendation.) |
| 248 | +The Autoscaling API in ES (Elasticsearch) uses cluster and node level statistics to provide a recommendation |
| 249 | +for a cluster size to support the current cluster data and active workloads. ES Autoscaling is paired |
| 250 | +with an ES Cloud service that periodically polls the ES elected master node for suggested cluster |
| 251 | +changes. The cloud service will add more resources to the cluster based on Elasticsearch's recommendation. |
| 252 | +Elasticsearch by itself cannot automatically scale. |
| 253 | + |
| 254 | +Autoscaling recommendations are tailored for the user [based on user defined policies][], composed of data |
| 255 | +roles (hot, frozen, etc) and [deciders][]. There's a public [webinar on autoscaling][], as well as the |
| 256 | +public [Autoscaling APIs] docs. |
| 257 | + |
| 258 | +Autoscaling's current implementation is based primary on storage requirements, as well as memory capacity |
| 259 | +for ML and frozen tier. It does not yet support scaling related to search load. Paired with ES Cloud, |
| 260 | +autoscaling only scales upward, not downward, except for ML nodes that do get scaled up _and_ down. |
| 261 | + |
| 262 | +[based on user defined policies]: https://www.elastic.co/guide/en/elasticsearch/reference/current/xpack-autoscaling.html |
| 263 | +[deciders]: https://www.elastic.co/guide/en/elasticsearch/reference/current/autoscaling-deciders.html |
| 264 | +[webinar on autoscaling]: https://www.elastic.co/webinars/autoscaling-from-zero-to-production-seamlessly |
| 265 | +[Autoscaling APIs]: https://www.elastic.co/guide/en/elasticsearch/reference/current/autoscaling-apis.html |
| 266 | + |
| 267 | +### Plugin REST and TransportAction entrypoints |
| 268 | + |
| 269 | +Autoscaling is a [plugin][]. All the REST APIs can be found in [autoscaling/rest/][]. |
| 270 | +`GetAutoscalingCapacityAction` is the capacity calculation operation REST endpoint, as opposed to the |
| 271 | +other rest commands that get/set/delete the policies guiding the capacity calculation. The Transport |
| 272 | +Actions can be found in [autoscaling/action/], where [TransportGetAutoscalingCapacityAction][] is the |
| 273 | +entrypoint on the master node for calculating the optimal cluster resources based on the autoscaling |
| 274 | +policies. |
| 275 | + |
| 276 | +[plugin]: https://github.com/elastic/elasticsearch/blob/v8.13.2/x-pack/plugin/autoscaling/src/main/java/org/elasticsearch/xpack/autoscaling/Autoscaling.java#L72 |
| 277 | +[autoscaling/rest/]: https://github.com/elastic/elasticsearch/tree/v8.13.2/x-pack/plugin/autoscaling/src/main/java/org/elasticsearch/xpack/autoscaling/rest |
| 278 | +[autoscaling/action/]: https://github.com/elastic/elasticsearch/tree/v8.13.2/x-pack/plugin/autoscaling/src/main/java/org/elasticsearch/xpack/autoscaling/action |
| 279 | +[TransportGetAutoscalingCapacityAction]: https://github.com/elastic/elasticsearch/blob/v8.13.2/x-pack/plugin/autoscaling/src/main/java/org/elasticsearch/xpack/autoscaling/action/TransportGetAutoscalingCapacityAction.java#L82-L98 |
| 280 | + |
| 281 | +### How cluster capacity is determined |
| 282 | + |
| 283 | +[AutoscalingMetadata][] implements [Metadata.Custom][] in order to persist autoscaling policies. Each |
| 284 | +Decider is an implementation of [AutoscalingDeciderService][]. The [AutoscalingCalculateCapacityService][] |
| 285 | +is responsible for running the calculation. |
| 286 | + |
| 287 | +[TransportGetAutoscalingCapacityAction.computeCapacity] is the entry point to [AutoscalingCalculateCapacityService.calculate], |
| 288 | +which creates a [AutoscalingDeciderResults][] for [each autoscaling policy][]. [AutoscalingDeciderResults.toXContent][] then |
| 289 | +determines the [maximum required capacity][] to return to the caller. [AutoscalingCapacity][] is the base unit of a cluster |
| 290 | +resources recommendation. |
| 291 | + |
| 292 | +The `TransportGetAutoscalingCapacityAction` response is cached to prevent concurrent callers |
| 293 | +overloading the system: the operation is expensive. `TransportGetAutoscalingCapacityAction` contains |
| 294 | +a [CapacityResponseCache][]. `TransportGetAutoscalingCapacityAction.masterOperation` |
| 295 | +calls [through the CapacityResponseCache][], into the `AutoscalingCalculateCapacityService`, to handle |
| 296 | +concurrent callers. |
| 297 | + |
| 298 | +[AutoscalingMetadata]: https://github.com/elastic/elasticsearch/blob/v8.13.2/x-pack/plugin/autoscaling/src/main/java/org/elasticsearch/xpack/autoscaling/AutoscalingMetadata.java#L38 |
| 299 | +[Metadata.Custom]: https://github.com/elastic/elasticsearch/blob/v8.13.2/server/src/main/java/org/elasticsearch/cluster/metadata/Metadata.java#L141-L145 |
| 300 | +[AutoscalingDeciderService]: https://github.com/elastic/elasticsearch/blob/v8.13.2/x-pack/plugin/autoscaling/src/main/java/org/elasticsearch/xpack/autoscaling/capacity/AutoscalingDeciderService.java#L16-L19 |
| 301 | +[AutoscalingCalculateCapacityService]: https://github.com/elastic/elasticsearch/blob/v8.13.2/x-pack/plugin/autoscaling/src/main/java/org/elasticsearch/xpack/autoscaling/capacity/AutoscalingCalculateCapacityService.java#L43 |
| 302 | + |
| 303 | +[TransportGetAutoscalingCapacityAction.computeCapacity]: https://github.com/elastic/elasticsearch/blob/v8.13.2/x-pack/plugin/autoscaling/src/main/java/org/elasticsearch/xpack/autoscaling/action/TransportGetAutoscalingCapacityAction.java#L102-L108 |
| 304 | +[AutoscalingCalculateCapacityService.calculate]: https://github.com/elastic/elasticsearch/blob/v8.13.2/x-pack/plugin/autoscaling/src/main/java/org/elasticsearch/xpack/autoscaling/capacity/AutoscalingCalculateCapacityService.java#L108-L139 |
| 305 | +[AutoscalingDeciderResults]: https://github.com/elastic/elasticsearch/blob/v8.13.2/x-pack/plugin/autoscaling/src/main/java/org/elasticsearch/xpack/autoscaling/capacity/AutoscalingDeciderResults.java#L34-L38 |
| 306 | +[each autoscaling policy]: https://github.com/elastic/elasticsearch/blob/v8.13.2/x-pack/plugin/autoscaling/src/main/java/org/elasticsearch/xpack/autoscaling/capacity/AutoscalingCalculateCapacityService.java#L124-L131 |
| 307 | +[AutoscalingDeciderResults.toXContent]: https://github.com/elastic/elasticsearch/blob/v8.13.2/x-pack/plugin/autoscaling/src/main/java/org/elasticsearch/xpack/autoscaling/capacity/AutoscalingDeciderResults.java#L78 |
| 308 | +[maximum required capacity]: https://github.com/elastic/elasticsearch/blob/v8.13.2/x-pack/plugin/autoscaling/src/main/java/org/elasticsearch/xpack/autoscaling/capacity/AutoscalingDeciderResults.java#L105-L116 |
| 309 | +[AutoscalingCapacity]: https://github.com/elastic/elasticsearch/blob/v8.13.2/x-pack/plugin/autoscaling/src/main/java/org/elasticsearch/xpack/autoscaling/capacity/AutoscalingCapacity.java#L27-L35 |
| 310 | + |
| 311 | +[CapacityResponseCache]: https://github.com/elastic/elasticsearch/blob/v8.13.2/x-pack/plugin/autoscaling/src/main/java/org/elasticsearch/xpack/autoscaling/action/TransportGetAutoscalingCapacityAction.java#L44-L47 |
| 312 | +[through the CapacityResponseCache]: https://github.com/elastic/elasticsearch/blob/v8.13.2/x-pack/plugin/autoscaling/src/main/java/org/elasticsearch/xpack/autoscaling/action/TransportGetAutoscalingCapacityAction.java#L97 |
| 313 | + |
| 314 | +### Where the data comes from |
| 315 | + |
| 316 | +The Deciders each pull data from different sources as needed to inform their decisions. The |
| 317 | +[DiskThresholdMonitor][] is one such data source. The Monitor runs on the master node and maintains |
| 318 | +lists of nodes that exceed various disk size thresholds. [DiskThresholdSettings][] contains the |
| 319 | +threshold settings with which the `DiskThresholdMonitor` runs. |
| 320 | + |
| 321 | +[DiskThresholdMonitor]: https://github.com/elastic/elasticsearch/blob/v8.13.2/server/src/main/java/org/elasticsearch/cluster/routing/allocation/DiskThresholdMonitor.java#L53-L58 |
| 322 | +[DiskThresholdSettings]: https://github.com/elastic/elasticsearch/blob/v8.13.2/server/src/main/java/org/elasticsearch/cluster/routing/allocation/DiskThresholdSettings.java#L24-L27 |
| 323 | + |
| 324 | +### Deciders |
| 325 | + |
| 326 | +The `ReactiveStorageDeciderService` tracks information that demonstrates storage limitations are causing |
| 327 | +problems in the cluster. It uses [an algorithm defined here][]. Some examples are |
| 328 | +- information from the `DiskThresholdMonitor` to find out whether nodes are exceeding their storage capacity |
| 329 | +- number of unassigned shards that failed allocation because of insufficient storage |
| 330 | +- the max shard size and minimum node size, and whether these can be satisfied with the existing infrastructure |
| 331 | + |
| 332 | +[an algorithm defined here]: https://github.com/elastic/elasticsearch/blob/v8.13.2/x-pack/plugin/autoscaling/src/main/java/org/elasticsearch/xpack/autoscaling/storage/ReactiveStorageDeciderService.java#L158-L176 |
| 333 | + |
| 334 | +The `ProactiveStorageDeciderService` maintains a forecast window that [defaults to 30 minutes][]. It only |
| 335 | +runs on data streams (ILM, rollover, etc), not regular indexes. It looks at past [index changes][] that |
| 336 | +took place within the forecast window to [predict][] resources that will be needed shortly. |
| 337 | + |
| 338 | +[defaults to 30 minutes]: https://github.com/elastic/elasticsearch/blob/v8.13.2/x-pack/plugin/autoscaling/src/main/java/org/elasticsearch/xpack/autoscaling/storage/ProactiveStorageDeciderService.java#L32 |
| 339 | +[index changes]: https://github.com/elastic/elasticsearch/blob/v8.13.2/x-pack/plugin/autoscaling/src/main/java/org/elasticsearch/xpack/autoscaling/storage/ProactiveStorageDeciderService.java#L79-L83 |
| 340 | +[predict]: https://github.com/elastic/elasticsearch/blob/v8.13.2/x-pack/plugin/autoscaling/src/main/java/org/elasticsearch/xpack/autoscaling/storage/ProactiveStorageDeciderService.java#L85-L95 |
| 341 | + |
| 342 | +There are several more Decider Services, implementing the `AutoscalingDeciderService` interface. |
243 | 343 |
|
244 | 344 | # Snapshot / Restore
|
245 | 345 |
|
|
0 commit comments