Closed
Description
The new load balancer extends LoadBalancer directly instead of CommonLoadBalancer, so it does not inherit the common metrics that were emitted there, such as:
protected def emitMetrics() = {
  MetricEmitter.emitGaugeMetric(LOADBALANCER_ACTIVATIONS_INFLIGHT(controllerInstance), totalActivations.longValue)
  MetricEmitter.emitGaugeMetric(
    LOADBALANCER_MEMORY_INFLIGHT(controllerInstance, ""),
    totalBlackBoxActivationMemory.longValue + totalManagedActivationMemory.longValue)
  MetricEmitter.emitGaugeMetric(
    LOADBALANCER_MEMORY_INFLIGHT(controllerInstance, "Blackbox"),
    totalBlackBoxActivationMemory.longValue)
  MetricEmitter.emitGaugeMetric(
    LOADBALANCER_MEMORY_INFLIGHT(controllerInstance, "Managed"),
    totalManagedActivationMemory.longValue)
}
and the completion ack metrics:
// Singletons for counter metrics related to completion acks
protected val LOADBALANCER_COMPLETION_ACK_REGULAR =
  LoggingMarkers.LOADBALANCER_COMPLETION_ACK(controllerInstance, RegularCompletionAck)
protected val LOADBALANCER_COMPLETION_ACK_FORCED =
  LoggingMarkers.LOADBALANCER_COMPLETION_ACK(controllerInstance, ForcedCompletionAck)
protected val LOADBALANCER_COMPLETION_ACK_HEALTHCHECK =
  LoggingMarkers.LOADBALANCER_COMPLETION_ACK(controllerInstance, HealthcheckCompletionAck)
protected val LOADBALANCER_COMPLETION_ACK_REGULAR_AFTER_FORCED =
  LoggingMarkers.LOADBALANCER_COMPLETION_ACK(controllerInstance, RegularAfterForcedCompletionAck)
protected val LOADBALANCER_COMPLETION_ACK_FORCED_AFTER_REGULAR =
  LoggingMarkers.LOADBALANCER_COMPLETION_ACK(controllerInstance, ForcedAfterRegularCompletionAck)
The FPC pool balancer has its own processCompletion function, so at the very least those metrics need to be ported into the FPC version of that function.
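To make the port concrete, here is a minimal, self-contained sketch of how a processCompletion-style function could pick one of the five completion ack counters. It is not the actual FPC code: the case objects are stand-ins for the real ack kinds, AckMetricsSketch and its in-memory counters are hypothetical, and the classification rule (slot still tracked or not, forced or not, health check or not) is my assumption about how the common balancer distinguishes the cases. In the real balancer the increment would presumably go through MetricEmitter with the matching LOADBALANCER_COMPLETION_ACK_* marker.

```scala
import java.util.concurrent.atomic.LongAdder

// Stand-ins for the completion ack kinds named above (hypothetical, for illustration only).
sealed trait CompletionAckKind
case object RegularCompletionAck extends CompletionAckKind
case object ForcedCompletionAck extends CompletionAckKind
case object HealthcheckCompletionAck extends CompletionAckKind
case object RegularAfterForcedCompletionAck extends CompletionAckKind
case object ForcedAfterRegularCompletionAck extends CompletionAckKind

object AckMetricsSketch {
  private val kinds: Seq[CompletionAckKind] = Seq(
    RegularCompletionAck,
    ForcedCompletionAck,
    HealthcheckCompletionAck,
    RegularAfterForcedCompletionAck,
    ForcedAfterRegularCompletionAck)

  // Hypothetical in-memory counters; the real code would emit a counter metric instead.
  private val counters: Map[CompletionAckKind, LongAdder] =
    kinds.map(k => k -> new LongAdder).toMap

  // Assumed rule: a tracked slot yields a regular or forced ack; an untracked slot
  // is either a health check or a late ack arriving after the slot was released.
  def classify(slotFound: Boolean, forced: Boolean, isHealthcheck: Boolean): CompletionAckKind =
    (slotFound, forced) match {
      case (true, false)               => RegularCompletionAck
      case (true, true)                => ForcedCompletionAck
      case (false, _) if isHealthcheck => HealthcheckCompletionAck
      case (false, false)              => RegularAfterForcedCompletionAck
      case (false, true)               => ForcedAfterRegularCompletionAck
    }

  // Sketch of the metric-related part of a processCompletion function.
  def processCompletion(slotFound: Boolean, forced: Boolean, isHealthcheck: Boolean): CompletionAckKind = {
    val kind = classify(slotFound, forced, isHealthcheck)
    counters(kind).increment()
    kind
  }

  def count(kind: CompletionAckKind): Long = counters(kind).sum()
}
```

Keeping the classification in one pure function would let the FPC balancer and the existing balancers share it, whichever class ends up owning the counters.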
Then there are the ShardingPool balancer-specific metrics, which I think are needed as well, since the controller still reports the status of the entire invoker fleet. Knowing the total memory pool is still valuable, as is the count of invokers in each state. I'm not sure whether the metrics below should still be reported in the load balancer, but they should still exist somewhere.
override protected def emitMetrics() = {
  super.emitMetrics()
  MetricEmitter.emitGaugeMetric(
    INVOKER_TOTALMEM_BLACKBOX,
    schedulingState.blackboxInvokers.foldLeft(0L) { (total, curr) =>
      if (curr.status.isUsable) {
        curr.id.userMemory.toMB + total
      } else {
        total
      }
    })
  MetricEmitter.emitGaugeMetric(
    INVOKER_TOTALMEM_MANAGED,
    schedulingState.managedInvokers.foldLeft(0L) { (total, curr) =>
      if (curr.status.isUsable) {
        curr.id.userMemory.toMB + total
      } else {
        total
      }
    })
  MetricEmitter.emitGaugeMetric(HEALTHY_INVOKER_MANAGED, schedulingState.managedInvokers.count(_.status == Healthy))
  MetricEmitter.emitGaugeMetric(
    UNHEALTHY_INVOKER_MANAGED,
    schedulingState.managedInvokers.count(_.status == Unhealthy))
  MetricEmitter.emitGaugeMetric(
    UNRESPONSIVE_INVOKER_MANAGED,
    schedulingState.managedInvokers.count(_.status == Unresponsive))
  MetricEmitter.emitGaugeMetric(OFFLINE_INVOKER_MANAGED, schedulingState.managedInvokers.count(_.status == Offline))
  MetricEmitter.emitGaugeMetric(HEALTHY_INVOKER_BLACKBOX, schedulingState.blackboxInvokers.count(_.status == Healthy))
  MetricEmitter.emitGaugeMetric(
    UNHEALTHY_INVOKER_BLACKBOX,
    schedulingState.blackboxInvokers.count(_.status == Unhealthy))
  MetricEmitter.emitGaugeMetric(
    UNRESPONSIVE_INVOKER_BLACKBOX,
    schedulingState.blackboxInvokers.count(_.status == Unresponsive))
  MetricEmitter.emitGaugeMetric(OFFLINE_INVOKER_BLACKBOX, schedulingState.blackboxInvokers.count(_.status == Offline))
}
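If these fleet gauges move out of the load balancer, the computation itself is small enough to live in a standalone helper. The sketch below is only an illustration of that idea, not a proposal for the actual API: InvokerSnapshot and FleetMetricsSketch are hypothetical names, the state objects are stand-ins for the real invoker states, and I am assuming that only Healthy counts as usable.

```scala
// Stand-ins for the invoker states used above (hypothetical; assumes only Healthy is usable).
sealed trait InvokerState { def isUsable: Boolean }
case object Healthy extends InvokerState { val isUsable = true }
case object Unhealthy extends InvokerState { val isUsable = false }
case object Unresponsive extends InvokerState { val isUsable = false }
case object Offline extends InvokerState { val isUsable = false }

// Hypothetical, simplified view of one invoker: its user memory in MB and its current state.
final case class InvokerSnapshot(userMemoryMB: Long, status: InvokerState)

object FleetMetricsSketch {
  // Total user memory of usable invokers; same result as the foldLeft in emitMetrics.
  def usableMemoryMB(invokers: Seq[InvokerSnapshot]): Long =
    invokers.iterator.filter(_.status.isUsable).map(_.userMemoryMB).sum

  // Number of invokers per state; states with no invokers report 0.
  def countByState(invokers: Seq[InvokerSnapshot]): Map[InvokerState, Int] =
    invokers.groupBy(_.status).map { case (state, group) => state -> group.size }.withDefaultValue(0)
}
```

Whichever component owns the invoker list could call these once for the managed pool and once for the blackbox pool and emit the results under the existing gauge names.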